gpt-4-turbo-2024-04-09 vs gpt-5-high KQL Benchmark

gpt-5-high wins by 23.9%

Compared on 188 shared test questions

Overall Accuracy

gpt-4-turbo-2024-04-09

39.4%

74 / 188 correct

gpt-5-high

63.3%

119 / 188 correct

Average Cost per Query

gpt-4-turbo-2024-04-09: $0.1737
gpt-5-high: $0.1529
gpt-4-turbo-2024-04-09 costs 13.6% more

Average Execution Time

gpt-4-turbo-2024-04-09: 16.84s
gpt-5-high: 192.47s
gpt-5-high takes 1043.3% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 188 questions
Page 1 of 8
T1036.004
A threat actor on a Windows system crafted and registered a service named almost identically to the standard time service, but redirecting execution to a custom script. Review the logging data to determine which native command-line tool was used to perform this action. What utility was invoked?
gpt-4-turbo-2024-04-09 Wins
T1059.007
On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript.
gpt-4-turbo-2024-04-09 Wins
T1059.004
On a Linux system, review process execution records for any shell process that set an environment variable containing executable code and then piped it into another shell instance. Determine which environment variable name was used to store the script content.
gpt-4-turbo-2024-04-09 Wins
T1070.003
On a Windows endpoint, commands are no longer being logged to PowerShell history, suggesting PSReadLine settings were altered. Using process execution logs, determine the exact command that was run to set the history save style to 'SaveNothing'.
gpt-4-turbo-2024-04-09 Wins
T1070.004
A Linux host executed a native utility to overwrite and then remove a temporary file in one step. Identify the name of the file that was securely deleted by this action.
gpt-4-turbo-2024-04-09 Wins
T1201
Windows systems may be probed for their password policy settings using a native command-line tool. Determine which command was executed to list the local password policy on the target hosts.
gpt-4-turbo-2024-04-09 Wins
T1201
You are reviewing Linux syslog records on a CentOS/RHEL 7.x server. You notice entries for shell commands that access system configuration files under /etc/security. Determine exactly which configuration file was being inspected by the command.
gpt-4-turbo-2024-04-09 Wins
T1218.010
An attacker has attempted to sideload code by invoking regsvr32.exe in a Windows host against a file that does not use the standard .dll extension. Investigate the process event logs to determine the name of the file that was registered.
gpt-4-turbo-2024-04-09 Wins
T1505.005
A suspicious registry change was made on a Windows system modifying the Terminal Services DLL path. Investigate registry events to find out which DLL file name was set as the ServiceDll value under TermService. What was the file name?
gpt-4-turbo-2024-04-09 Wins
T1217
On Linux, review the process execution logs to uncover when Chromium’s bookmark JSON files were being located and the results persisted. Focus on shell commands that search under .config/chromium and write output to a file. What was the filename used to save the findings?
gpt-4-turbo-2024-04-09 Wins
T1531
Within Windows process event logs, identify instances where the built-in net.exe utility is used to change a user account password. What was the new password argument passed in?
gpt-4-turbo-2024-04-09 Wins
T1552.003
A Linux user’s bash history was searched for patterns like ‘pass’ and ‘ssh’, and the matching lines were redirected into a new file. Determine the name of that file.
gpt-4-turbo-2024-04-09 Wins
T1562.004
On a Windows device, a new inbound firewall rule was created unexpectedly. Review process execution records to identify the command-line utility responsible for adding the rule.
gpt-4-turbo-2024-04-09 Wins
T1562.012
On a Linux host, auditing has been turned off. Review process execution or syslog data to determine which command was executed to disable the audit subsystem.
gpt-4-turbo-2024-04-09 Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
gpt-5-high Wins
T1006
Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path.
gpt-5-high Wins
T1003.001
Using Windows process event logs, investigate PowerShell activity around lsass.exe memory capture. What was the name of the script file invoked to perform the dump?
gpt-5-high Wins
T1003.008
In a Linux environment, an elevated process was used to execute a command that read /etc/shadow and redirected its output to a file. Identify what file name was employed to store these results.
gpt-5-high Wins
T1016.001
An analyst notices a PowerShell process on a Windows host that appears to be checking SMB connectivity. Which PowerShell cmdlet was executed to perform this outbound port 445 test?
gpt-5-high Wins
T1027
A Windows host shows a process launch with an extremely obfuscated command line that dynamically builds and invokes code at runtime. Which process name was used to execute this payload?
gpt-5-high Wins
T1027
On a Linux system, identify the script that was generated by decoding a base64 data file and then executed. What was the filename of that script?
gpt-5-high Wins
T1036.003
A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device.
gpt-5-high Wins
T1036.004
Analyze Windows process events for any schtasks.exe commands that created a new task invoking PowerShell. What is the name of the .ps1 script specified to run?
gpt-5-high Wins
T1048.003
A Linux host briefly hosted an HTTP service under /tmp. Examine process creation logs to determine the exact python3 command that was used to start the server on port 9090.
gpt-5-high Wins
T1053.003
Linux hosts may log events when new files are added to /var/spool/cron/crontabs. Query those logs for a creation or write action in that directory and determine the file name that was added.
gpt-5-high Wins
Page 1 of 8

Explore individual model performance and detailed analysis