gpt-5-nano-high vs o4-mini-low KQL Benchmark

o4-mini-low wins by 12.8 percentage points

Compared on 188 shared test questions

Overall Accuracy

gpt-5-nano-high

30.3%

57 / 188 correct

o4-mini-low

43.1%

81 / 188 correct
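
The accuracy figures above follow directly from the correct-answer counts. The short sketch below recomputes them and separates the absolute gap (in percentage points) from the relative improvement; the function and variable names are illustrative and are not taken from the benchmark's own tooling.

```python
# Recompute the reported accuracies from the raw counts.
# All names here are hypothetical; only the counts come from the report.

def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage."""
    return 100.0 * correct / total

TOTAL_QUESTIONS = 188

nano_acc = accuracy_pct(57, TOTAL_QUESTIONS)   # gpt-5-nano-high
o4_acc = accuracy_pct(81, TOTAL_QUESTIONS)     # o4-mini-low

# Absolute gap, in percentage points.
gap_points = o4_acc - nano_acc

# Relative improvement of o4-mini-low over gpt-5-nano-high.
relative_gain_pct = 100.0 * gap_points / nano_acc

print(f"gpt-5-nano-high: {nano_acc:.1f}%")
print(f"o4-mini-low:     {o4_acc:.1f}%")
print(f"gap:             {gap_points:.1f} percentage points")
print(f"relative gain:   {relative_gain_pct:.1f}%")
```

Rounded to one decimal, this gives 30.3% and 43.1%, a gap of 12.8 percentage points. That gap corresponds to a relative gain of roughly 42% in correct answers, which is a different quantity from the headline figure.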

Average Cost per Query

gpt-5-nano-high: $0.0069
o4-mini-low: $0.0311
o4-mini-low costs 351.1% more

Average Execution Time

gpt-5-nano-high: 61.10s
o4-mini-low: 73.44s
o4-mini-low takes 20.2% longer
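
The "costs more" and "takes longer" figures are relative differences against the gpt-5-nano-high baseline. A minimal sketch of that arithmetic, using the rounded averages shown above (the helper name is an assumption, not part of the benchmark):

```python
# Relative difference of one value over a baseline, as a percentage.
# Inputs are the rounded averages displayed in the report.

def pct_more(value: float, baseline: float) -> float:
    """Return how much larger `value` is than `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline

cost_more = pct_more(0.0311, 0.0069)   # average cost per query, USD
time_more = pct_more(73.44, 61.10)     # average execution time, seconds

print(f"o4-mini-low costs {cost_more:.1f}% more per query")
print(f"o4-mini-low takes {time_more:.1f}% longer per query")
```

With the displayed values, the time difference reproduces as 20.2%. The cost difference comes out near 350.7% rather than 351.1%; the small discrepancy is presumably because the report computes from unrounded per-query costs.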

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 188 questions
Page 1 of 8
T1003
On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated.
gpt-5-nano-high Wins
T1027
On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command?
gpt-5-nano-high Wins
T1016.001
On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target.
gpt-5-nano-high Wins
T1036.003
In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process.
gpt-5-nano-high Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
gpt-5-nano-high Wins
T1059.004
During a Linux investigation, you notice processes spawning curl and wget commands that pull a script from a remote GitHub raw URL and pipe it into bash. Identify the name of the script that was retrieved and executed.
gpt-5-nano-high Wins
T1069.001
Review recent Windows process event logs for PowerShell activity that suggests local group enumeration through WMI. What exact command was executed?
gpt-5-nano-high Wins
T1059.004
An analyst suspects that a restricted shell escape was executed using a common Perl package manager on Linux. Review the process execution records to determine which tool was invoked to spawn the shell.
gpt-5-nano-high Wins
T1070.004
A Linux host executed a native utility to overwrite and then remove a temporary file in one step. Identify the name of the file that was securely deleted by this action.
gpt-5-nano-high Wins
T1082
A Windows system shows a cmd.exe process spawn that appears to have been used for environment discovery. Review the process creation records to identify the exact command the adversary ran to enumerate environment variables.
gpt-5-nano-high Wins
T1082
A user‐space process on a Linux device invoked a shell to capture and display the system’s environment variables and path. Which exact command was used to perform this discovery?
gpt-5-nano-high Wins
T1124
A Windows host recorded a process that simply executes the system’s native time utility. Without spelling out the query, determine which command was run based on process creation events.
gpt-5-nano-high Wins
T1134.001
A Windows host logs show PowerShell fetching and executing a remote script to gain SeDebugPrivilege token duplication. Which Empire module was invoked?
gpt-5-nano-high Wins
T1547.014
A Windows endpoint shows an Active Setup entry under Internet Explorer Core Fonts being altered with a StubPath value. Investigate the registry events and identify the payload that was set.
gpt-5-nano-high Wins
T1552.003
A Linux user’s bash history was searched for patterns like ‘pass’ and ‘ssh’, and the matching lines were redirected into a new file. Determine the name of that file.
gpt-5-nano-high Wins
T1562.003
Review Windows registry event logs for the ProcessCreationIncludeCmdLine_Enabled value being set to 0. Which PowerShell cmdlet performed this change?
gpt-5-nano-high Wins
T1555
A security investigator suspects that someone attempted to dump stored web credentials on a Windows system using an in-built command-line tool. Review process creation logs to determine which executable was called to list the Web Credentials vault.
gpt-5-nano-high Wins
T1559
Investigating a Windows device, you suspect a non-standard executable was launched to set up a named pipe for client-server messaging. Determine the name of the executable that was run.
gpt-5-nano-high Wins
T1562.004
On a Windows device, a new inbound firewall rule was created unexpectedly. Review process execution records to identify the command-line utility responsible for adding the rule.
gpt-5-nano-high Wins
T1562.012
A Linux system’s audit framework appears to have been reset unexpectedly. Search your process execution records to identify which exact invocation removed all auditd rules. What full command was executed?
gpt-5-nano-high Wins
T1562
Review Linux process execution logs to find where the system journal service was stopped. Which utility was invoked to disable journal logging?
gpt-5-nano-high Wins
T1564.002
On Windows systems, identify any user account that was hidden by setting its value to 0 under the SpecialAccounts\UserList registry key. What was the name of the hidden account?
gpt-5-nano-high Wins
T1036.003
A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device.
o4-mini-low Wins
T1003.008
In a Linux environment, an elevated process was used to execute a command that read /etc/shadow and redirected its output to a file. Identify what file name was employed to store these results.
o4-mini-low Wins
T1006
Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path.
o4-mini-low Wins
