o1-high vs o1-low KQL Benchmark
Performance Tied
Compared on 188 shared test questions
Overall Accuracy
o1-high: 63.3% (119 / 188 correct)
o1-low: 63.3% (119 / 188 correct)
Average Cost per Query
o1-high: $0.5239
o1-low: $0.4994
o1-high costs 4.9% more
Average Execution Time
o1-high: 57.03s
o1-low: 50.90s
o1-high takes 12.0% longer
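The percentages above follow directly from the reported figures. As a minimal sketch for checking them (the helper name pct_more is illustrative and not part of the benchmark), the accuracy is correct answers divided by shared questions, and each "more"/"longer" figure is the relative difference measured against o1-low:

```python
def pct_more(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


# Figures copied from the summary above.
accuracy = 119 / 188 * 100            # both models: 119 of 188 correct
cost_diff = pct_more(0.5239, 0.4994)  # average cost per query, USD
time_diff = pct_more(57.03, 50.90)    # average execution time, seconds

print(f"accuracy: {accuracy:.1f}%")
print(f"o1-high costs {cost_diff:.1f}% more")
print(f"o1-high takes {time_diff:.1f}% longer")
```

Using o1-low as the baseline matches the wording of the summary ("o1-high costs ... more"); dividing by the o1-high figures instead would give slightly smaller percentages.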
Question-by-Question Comparison
Detailed comparison showing where each model succeeded or failed
Showing questions 1 to 25 of 188
Technique | Question | o1-high | o1-low | Result |
T1036.003 | In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process. | ✓ | ✗ | o1-high Wins |
T1039 | On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action? | ✓ | ✗ | o1-high Wins |
T1053.005 | Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method? | ✓ | ✗ | o1-high Wins |
T1057 | On a Windows device, review the process execution logs to find instances where a built-in listing tool was piped into a string filter. Identify the process name that the attacker was searching for. | ✓ | ✗ | o1-high Wins |
T1057 | On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed? | ✓ | ✗ | o1-high Wins |
T1053.005 | On a Windows host, find any scheduled task that was registered using PowerShell native cmdlets instead of schtasks.exe. What was the name given to the new task? | ✓ | ✗ | o1-high Wins |
T1057 | While reviewing Windows process events, you spot a PowerShell process executing a WMI enumeration cmdlet. What WMI class name did the attacker query? | ✓ | ✗ | o1-high Wins |
T1070.003 | On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history? | ✓ | ✗ | o1-high Wins |
T1082 | Windows: Investigate PowerShell process events for instances where a web client fetched and executed an external host-survey tool. What was the name of the script file that was downloaded and run? | ✓ | ✗ | o1-high Wins |
T1197 | A suspicious BITS transfer was orchestrated via bitsadmin.exe on Windows, creating a job to download and then execute a payload. Investigate the process event logs to determine what custom job name was specified when the BITS job was created. | ✓ | ✗ | o1-high Wins |
T1217 | On a Windows system, you notice a process that recursively enumerates files named 'Bookmarks' under every user profile directory. Which Windows command-line utility was used to perform that search? | ✓ | ✗ | o1-high Wins |
T1497.003 | On a Linux host, identify any processes that used ping with a large count value to introduce a delay before launching another process. What was the command executed immediately after the ping delay? | ✓ | ✗ | o1-high Wins |
T1531 | Within Windows process event logs, identify instances where the built-in net.exe utility is used to change a user account password. What was the new password argument passed in? | ✓ | ✗ | o1-high Wins |
T1547.002 | A Windows host shows a suspicious registry change under the LSA hive. Review recent registry events to locate any new entries under Authentication Packages and determine the name of the DLL the attacker added. | ✓ | ✗ | o1-high Wins |
T1546.004 | A suspicious file modification on a Linux device targeted the ~/.bash_profile file, apparently adding a new line. What was the full command string that was appended? | ✓ | ✗ | o1-high Wins |
T1548.001 | A Linux system shows a shell invocation that appears to be searching for files with elevated group permissions. Using the available process execution logs, determine exactly what command was run. | ✓ | ✗ | o1-high Wins |
T1547.014 | A Windows endpoint shows an Active Setup entry under Internet Explorer Core Fonts being altered with a StubPath value. Investigate the registry events and identify the payload that was set. | ✓ | ✗ | o1-high Wins |
T1555 | A security investigator suspects that someone attempted to dump stored web credentials on a Windows system using an in-built command-line tool. Review process creation logs to determine which executable was called to list the Web Credentials vault. | ✓ | ✗ | o1-high Wins |
T1557.001 | On Windows devices, hunt for PowerShell activity where a remote script is fetched and executed to perform LLMNR/NBNS spoofing. Which cmdlet kicked off the listener? | ✓ | ✗ | o1-high Wins |
T1559 | Investigating a Windows device, you suspect a non-standard executable was launched to set up a named pipe for client-server messaging. Determine the name of the executable that was run. | ✓ | ✗ | o1-high Wins |
T1562 | Review Linux process execution logs to find where the system journal service was stopped. Which utility was invoked to disable journal logging? | ✓ | ✗ | o1-high Wins |
T1562.004 | On a Windows device, a new inbound firewall rule was created unexpectedly. Review process execution records to identify the command-line utility responsible for adding the rule. | ✓ | ✗ | o1-high Wins |
T1562.004 | Investigate Windows registry modification events to find the name of the registry value that was changed under the WindowsFirewall policy path when someone turned the firewall off. | ✓ | ✗ | o1-high Wins |
T1622 | On the Windows device, a security check was run to detect debugger processes via PowerShell. Which tool (process) carried out this check? | ✓ | ✗ | o1-high Wins |
T1007 | An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed. | ✗ | ✓ | o1-low Wins |