gpt-5-high vs gpt-5-mini-high KQL Benchmark
gpt-5-high wins by 14.9 percentage points
Compared on 188 shared test questions
Overall Accuracy
gpt-5-high
63.3%
119 / 188 correct
gpt-5-mini-high
48.4%
91 / 188 correct
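The accuracy figures above can be checked from the raw counts. The sketch below recomputes both percentages and the gap between them. The variable names are illustrative and not part of the benchmark. It also shows that the 14.9 gap is a difference in percentage points, while the relative gain in correct answers is larger.

```python
# Reported counts from the summary above.
TOTAL_QUESTIONS = 188
correct = {"gpt-5-high": 119, "gpt-5-mini-high": 91}

# Accuracy as a percentage of the shared question set.
accuracy = {model: 100 * n / TOTAL_QUESTIONS for model, n in correct.items()}

# Absolute gap in percentage points, and relative gap in correct answers.
gap_points = accuracy["gpt-5-high"] - accuracy["gpt-5-mini-high"]
relative_gain = 100 * (correct["gpt-5-high"] / correct["gpt-5-mini-high"] - 1)

for model, acc in accuracy.items():
    print(f"{model}: {acc:.1f}%")
print(f"gap: {gap_points:.1f} percentage points")
print(f"relative gain in correct answers: {relative_gain:.1f}%")
```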
Average Cost per Query
gpt-5-high: $0.1529
gpt-5-mini-high: $0.0150
gpt-5-high costs 920.4% more
Average Execution Time
gpt-5-high: 192.47s
gpt-5-mini-high: 44.83s
gpt-5-high takes 329.4% longer
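The "% more" and "% longer" figures are relative increases over the gpt-5-mini-high baseline. The sketch below recomputes them from the rounded averages shown above; `percent_more` is a hypothetical helper name. Recomputing from the rounded values gives roughly 919.3% and 329.3%, slightly below the displayed 920.4% and 329.4%, presumably because the page works from unrounded averages (an assumption, since the raw values are not shown).

```python
def percent_more(larger: float, smaller: float) -> float:
    """Relative increase of `larger` over `smaller`, in percent."""
    return 100 * (larger / smaller - 1)


# Rounded averages as displayed above (USD per query, seconds per query).
cost = {"gpt-5-high": 0.1529, "gpt-5-mini-high": 0.0150}
seconds = {"gpt-5-high": 192.47, "gpt-5-mini-high": 44.83}

cost_increase = percent_more(cost["gpt-5-high"], cost["gpt-5-mini-high"])
time_increase = percent_more(seconds["gpt-5-high"], seconds["gpt-5-mini-high"])

print(f"cost: {cost_increase:.1f}% more")
print(f"time: {time_increase:.1f}% longer")
```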
Question-by-Question Comparison
Detailed comparison showing where each model succeeded or failed
Showing 1 to 25 of 188 questions
Technique | Question | gpt-5-high | gpt-5-mini-high | Result |
T1003.007 | On a Linux system, review process creation logs to spot any dd commands reading from /proc/*/mem. What was the name of the file where the dumped memory was written? | ✓ | ✗ | gpt-5-high Wins |
T1016.001 | On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target. | ✓ | ✗ | gpt-5-high Wins |
T1003.008 | In a Linux environment, an elevated process was used to execute a command that read /etc/shadow and redirected its output to a file. Identify what file name was employed to store these results. | ✓ | ✗ | gpt-5-high Wins |
T1003.001 | Using Windows process event logs, investigate PowerShell activity around lsass.exe memory capture. What was the name of the script file invoked to perform the dump? | ✓ | ✗ | gpt-5-high Wins |
T1027 | On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command? | ✓ | ✗ | gpt-5-high Wins |
T1036.003 | A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device. | ✓ | ✗ | gpt-5-high Wins |
T1027 | On a Linux system, identify the script that was generated by decoding a base64 data file and then executed. What was the filename of that script? | ✓ | ✗ | gpt-5-high Wins |
T1027 | A Windows host shows a process launch with an extremely obfuscated command line that dynamically builds and invokes code at runtime. Which process name was used to execute this payload? | ✓ | ✗ | gpt-5-high Wins |
T1048.003 | A Linux host briefly hosted an HTTP service under /tmp. Examine process creation logs to determine the exact python3 command that was used to start the server on port 9090. | ✓ | ✗ | gpt-5-high Wins |
T1048.003 | Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration? | ✓ | ✗ | gpt-5-high Wins |
T1049 | A user launched a Windows command prompt and executed a built-in utility to enumerate all active network connections. Using process creation logs, identify the exact tool that produced the list of current connections. | ✓ | ✗ | gpt-5-high Wins |
T1039 | On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action? | ✓ | ✗ | gpt-5-high Wins |
T1057 | On a Windows device, review the process execution logs to find instances where a built-in listing tool was piped into a string filter. Identify the process name that the attacker was searching for. | ✓ | ✗ | gpt-5-high Wins |
T1053.005 | Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method? | ✓ | ✗ | gpt-5-high Wins |
T1036.003 | In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process. | ✓ | ✗ | gpt-5-high Wins |
T1069.001 | Review recent Windows process event logs for PowerShell activity that suggests local group enumeration through WMI. What exact command was executed? | ✓ | ✗ | gpt-5-high Wins |
T1059.004 | An analyst suspects that a restricted shell escape was executed using a common Perl package manager on Linux. Review the process execution records to determine which tool was invoked to spawn the shell. | ✓ | ✗ | gpt-5-high Wins |
T1059.004 | On a Linux host, identify the process invocation that altered a user’s login shell. What was the full command used? | ✓ | ✗ | gpt-5-high Wins |
T1069.001 | Investigate Windows process execution logs for a PowerShell cmdlet used to list group members. Look for entries where a group name is provided after a '-Name' flag and identify which group was queried. | ✓ | ✗ | gpt-5-high Wins |
T1059.004 | On a Linux system, analyze the process logs for suspicious command line activity that includes a sequence of commands indicating a pipe-to-shell operation. Identify the tool that was used to execute this piped command, paying special attention to its use in downloading and running script content. | ✓ | ✗ | gpt-5-high Wins |
T1070.003 | On a Linux endpoint, you suspect malicious clearing of the bash history by redirecting from the null device. Explore process or file events to uncover the exact shell command that performed this action. | ✓ | ✗ | gpt-5-high Wins |
T1070.005 | On a Windows system, an attacker used the command prompt to remove one or more default administrative shares. Which share names were deleted? | ✓ | ✗ | gpt-5-high Wins |
T1078.003 | On a Linux host, review account management activity in Syslog or process event logs to pinpoint which command was executed to create a new local user. What was the name of the tool invoked? | ✓ | ✗ | gpt-5-high Wins |
T1082 | A Linux host was used to collect various system release files and kernel details, writing them into a single file under /tmp. What was the name of that output file? | ✓ | ✗ | gpt-5-high Wins |
T1070.008 | An attacker on Linux used bash to copy all files from /var/spool/mail into a newly created subdirectory before modifying them. What is the name of that subdirectory? | ✓ | ✗ | gpt-5-high Wins |
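The rows above share a simple pipe-delimited layout, so they can be tallied programmatically. The sketch below parses three rows copied from the table and counts the outcomes. `parse_row` and its field names are illustrative assumptions, not part of the benchmark tooling. It splits from the right so that a question containing a pipe character would not break the parse.

```python
from collections import Counter

# Three rows copied verbatim from the table above.
rows = [
    "T1016.001 | On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target. | ✓ | ✗ | gpt-5-high Wins |",
    "T1027 | On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command? | ✓ | ✗ | gpt-5-high Wins |",
    "T1070.005 | On a Windows system, an attacker used the command prompt to remove one or more default administrative shares. Which share names were deleted? | ✓ | ✗ | gpt-5-high Wins |",
]


def parse_row(row: str) -> dict:
    """Split one table row into its five fields (hypothetical field names)."""
    body = row.strip().rstrip("|").strip()
    # The last three fields are fixed; everything before them is ID + question.
    head, high, mini, outcome = body.rsplit(" | ", 3)
    technique, question = head.split(" | ", 1)
    return {
        "technique": technique,
        "question": question,
        "high_correct": high == "✓",
        "mini_correct": mini == "✓",
        "outcome": outcome,
    }


parsed = [parse_row(r) for r in rows]
outcomes = Counter(p["outcome"] for p in parsed)
print(outcomes)
```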