gpt-5-nano-medium vs o1-low KQL Benchmark

o1-low wins by 38.9%

Compared on 185 shared test questions

Overall Accuracy

gpt-5-nano-medium

23.8%

44 / 185 correct

o1-low

62.7%

116 / 185 correct

Average Cost per Query

gpt-5-nano-medium: $0.0069
o1-low: $0.4994
o1-low costs 7087.3% more

Average Execution Time

gpt-5-nano-medium: 65.07s
o1-low: 50.90s
gpt-5-nano-medium takes 27.8% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 185 questions
Page 1 of 8
T1036.003
In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process.
gpt-5-nano-medium Wins
T1039
On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action?
gpt-5-nano-medium Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
gpt-5-nano-medium Wins
T1082
On Windows systems, identify when the built-in Shadow Copy utility is used to enumerate existing snapshots. What was the full command executed?
gpt-5-nano-medium Wins
T1082
A Windows system shows a cmd.exe process spawn that appears to have been used for environment discovery. Review the process creation records to identify the exact command the adversary ran to enumerate environment variables.
gpt-5-nano-medium Wins
T1548.001
A Linux system shows a shell invocation that appears to be searching for files with elevated group permissions. Using the available process execution logs, determine exactly what command was run.
gpt-5-nano-medium Wins
T1562
Review Linux process execution logs to find where the system journal service was stopped. Which utility was invoked to disable journal logging?
gpt-5-nano-medium Wins
T1622
On the Windows device, a security check was run to detect debugger processes via PowerShell. Which tool (process) carried out this check?
gpt-5-nano-medium Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
o1-low Wins
T1016
A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked?
o1-low Wins
T1016.001
An analyst notices a PowerShell process on a Windows host that appears to be checking SMB connectivity. Which PowerShell cmdlet was executed to perform this outbound port 445 test?
o1-low Wins
T1016.001
On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target.
o1-low Wins
T1027
On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command?
o1-low Wins
T1018
A Windows host executed an ICMP-based network reconnaissance using a looping instruction in cmd.exe. Identify the exact command line that was used to perform the ping sweep.
o1-low Wins
T1018
Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used?
o1-low Wins
T1036.003
A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device.
o1-low Wins
T1003
On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated.
o1-low Wins
T1053.003
Linux hosts may log events when new files are added to /var/spool/cron/crontabs. Query those logs for a creation or write action in that directory and determine the file name that was added.
o1-low Wins
T1049
A user launched a Windows command prompt and executed a built-in utility to enumerate all active network connections. Using process creation logs, identify the exact tool that produced the list of current connections.
o1-low Wins
T1048.003
A Linux host briefly hosted an HTTP service under /tmp. Examine process creation logs to determine the exact python3 command that was used to start the server on port 9090.
o1-low Wins
T1036.004
A threat actor on a Windows system crafted and registered a service named almost identically to the standard time service, but redirecting execution to a custom script. Review the logging data to determine which native command-line tool was used to perform this action. What utility was invoked?
o1-low Wins
T1049
In a Windows log analytics workspace, search for PowerShell processes that were used to enumerate network connections. Determine which PowerShell cmdlet was executed to list active TCP connections.
o1-low Wins
T1048.003
Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration?
o1-low Wins
T1036.004
Analyze Windows process events for any schtasks.exe commands that created a new task invoking PowerShell. What is the name of the .ps1 script specified to run?
o1-low Wins
T1059.004
On a Linux system, find any process creation record where awk is used with a BEGIN rule to launch a shell. What was the exact command invoked?
o1-low Wins
Page 1 of 8

Explore individual model performance and detailed analysis