o1-high vs o1-low KQL Benchmark

Performance Tied

Compared on 188 shared test questions

Overall Accuracy

o1-high

63.3%

119 / 188 correct

o1-low

63.3%

119 / 188 correct

Average Cost per Query

o1-high: $0.5239
o1-low: $0.4994
o1-high costs 4.9% more

Average Execution Time

o1-high: 57.03s
o1-low: 50.90s
o1-high takes 12.0% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 188 questions
Page 1 of 8
T1036.003
In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process.
o1-high Wins
T1039
On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action?
o1-high Wins
T1053.005
Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method?
o1-high Wins
T1057
On a Windows device, review the process execution logs to find instances where a built-in listing tool was piped into a string filter. Identify the process name that the attacker was searching for.
o1-high Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
o1-high Wins
T1053.005
On a Windows host, find any scheduled task that was registered using PowerShell native cmdlets instead of schtasks.exe. What was the name given to the new task?
o1-high Wins
T1057
While reviewing Windows process events, you spot a PowerShell process executing a WMI enumeration cmdlet. What WMI class name did the attacker query?
o1-high Wins
T1070.003
On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history?
o1-high Wins
T1082
Windows: Investigate PowerShell process events for instances where a web client fetched and executed an external host-survey tool. What was the name of the script file that was downloaded and run?
o1-high Wins
T1197
A suspicious BITS transfer was orchestrated via bitsadmin.exe on Windows, creating a job to download and then execute a payload. Investigate the process event logs to determine what custom job name was specified when the BITS job was created.
o1-high Wins
T1217
On a Windows system, you notice a process that recursively enumerates files named 'Bookmarks' under every user profile directory. Which Windows command-line utility was used to perform that search?
o1-high Wins
T1497.003
On a Linux host, identify any processes that used ping with a large count value to introduce a delay before launching another process. What was the command executed immediately after the ping delay?
o1-high Wins
T1531
Within Windows process event logs, identify instances where the built-in net.exe utility is used to change a user account password. What was the new password argument passed in?
o1-high Wins
T1547.002
A Windows host shows a suspicious registry change under the LSA hive. Review recent registry events to locate any new entries under Authentication Packages and determine the name of the DLL the attacker added.
o1-high Wins
T1546.004
A suspicious file modification on a Linux device targeted the ~/.bash_profile file, apparently adding a new line. What was the full command string that was appended?
o1-high Wins
T1548.001
A Linux system shows a shell invocation that appears to be searching for files with elevated group permissions. Using the available process execution logs, determine exactly what command was run.
o1-high Wins
T1547.014
A Windows endpoint shows an Active Setup entry under Internet Explorer Core Fonts being altered with a StubPath value. Investigate the registry events and identify the payload that was set.
o1-high Wins
T1555
A security investigator suspects that someone attempted to dump stored web credentials on a Windows system using an in-built command-line tool. Review process creation logs to determine which executable was called to list the Web Credentials vault.
o1-high Wins
T1557.001
On Windows devices, hunt for PowerShell activity where a remote script is fetched and executed to perform LLMNR/NBNS spoofing. Which cmdlet kicked off the listener?
o1-high Wins
T1559
Investigating a Windows device, you suspect a non-standard executable was launched to set up a named pipe for client-server messaging. Determine the name of the executable that was run.
o1-high Wins
T1562
Review Linux process execution logs to find where the system journal service was stopped. Which utility was invoked to disable journal logging?
o1-high Wins
T1562.004
On a Windows device, a new inbound firewall rule was created unexpectedly. Review process execution records to identify the command-line utility responsible for adding the rule.
o1-high Wins
T1562.004
Investigate Windows registry modification events to find the name of the registry value that was changed under the WindowsFirewall policy path when someone turned the firewall off.
o1-high Wins
T1622
On the Windows device, a security check was run to detect debugger processes via PowerShell. Which tool (process) carried out this check?
o1-high Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
o1-low Wins
Page 1 of 8

Explore individual model performance and detailed analysis