gpt-4.1 vs gpt-4.1-nano KQL Benchmark

gpt-4.1 wins by 37.2%

Compared on 188 shared test questions

Overall Accuracy

gpt-4.1

61.7%

116 / 188 correct

gpt-4.1-nano

24.5%

46 / 188 correct

Average Cost per Query

gpt-4.1: $0.0285
gpt-4.1-nano: $0.0014
gpt-4.1 costs 1921.1% more

Average Execution Time

gpt-4.1: 9.95s
gpt-4.1-nano: 10.78s
gpt-4.1-nano takes 8.4% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 188 questions
Page 1 of 8
T1016.001
On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target.
gpt-4.1 Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
gpt-4.1 Wins
T1006
Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path.
gpt-4.1 Wins
T1003
On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated.
gpt-4.1 Wins
T1016.002
On a Windows host, someone appears to have run a built-in network shell utility to list saved wireless network profiles and their passwords in clear text. Review the process creation logs to discover the exact command that was executed.
gpt-4.1 Wins
T1003.001
Using Windows process event logs, investigate PowerShell activity around lsass.exe memory capture. What was the name of the script file invoked to perform the dump?
gpt-4.1 Wins
T1016
A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked?
gpt-4.1 Wins
T1018
Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used?
gpt-4.1 Wins
T1027
On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command?
gpt-4.1 Wins
T1027
A Windows host shows a process launch with an extremely obfuscated command line that dynamically builds and invokes code at runtime. Which process name was used to execute this payload?
gpt-4.1 Wins
T1027
On a Linux system, identify the script that was generated by decoding a base64 data file and then executed. What was the filename of that script?
gpt-4.1 Wins
T1036.004
A threat actor on a Windows system crafted and registered a service named almost identically to the standard time service, but redirecting execution to a custom script. Review the logging data to determine which native command-line tool was used to perform this action. What utility was invoked?
gpt-4.1 Wins
T1036.003
A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device.
gpt-4.1 Wins
T1053.003
Linux hosts may log events when new files are added to /var/spool/cron/crontabs. Query those logs for a creation or write action in that directory and determine the file name that was added.
gpt-4.1 Wins
T1053.005
You suspect malicious persistence via scheduled tasks on a Windows endpoint. Review the process execution logs to identify the built-in utility used to register tasks at logon or startup. What is the name of this utility?
gpt-4.1 Wins
T1053.005
Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method?
gpt-4.1 Wins
T1053.006
Examine the logs from the Linux system for events related to the systemd timer activation. Identify any records indicating that a new timer unit was started and enabled, and determine which timer name was used.
gpt-4.1 Wins
T1057
While reviewing Windows process events, you spot a PowerShell process executing a WMI enumeration cmdlet. What WMI class name did the attacker query?
gpt-4.1 Wins
T1053.005
On Windows, review recent registry changes to detect when the MSC file association was hijacked by a reg add operation. What executable file was configured as the default command under HKCU\Software\Classes\mscfile\shell\open\command?
gpt-4.1 Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
gpt-4.1 Wins
T1059.004
On a Linux host, identify the process invocation that altered a user’s login shell. What was the full command used?
gpt-4.1 Wins
T1069.001
Review recent Windows process event logs for PowerShell activity that suggests local group enumeration through WMI. What exact command was executed?
gpt-4.1 Wins
T1070.003
On a Linux system, you suspect someone erased their command history by linking the history file to /dev/null. Investigate process events and determine which utility was executed to achieve this.
gpt-4.1 Wins
T1059.007
On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript.
gpt-4.1 Wins
T1070.003
On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history?
gpt-4.1 Wins
Page 1 of 8

Explore individual model performance and detailed analysis