gpt-5-mini-high vs gpt-5-nano-high KQL Benchmark
gpt-5-mini-high wins by 18.1 percentage points
Compared on 188 shared test questions
Overall Accuracy
gpt-5-mini-high
48.4%
91 / 188 correct
gpt-5-nano-high
30.3%
57 / 188 correct
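The accuracy figures above follow directly from the correct-answer counts. The short sketch below recomputes them and shows that the 18.1 margin is a difference in percentage points, not a relative improvement. The variable and function names are illustrative and not part of the benchmark itself.

```python
# Counts taken from the summary above.
TOTAL_QUESTIONS = 188
correct = {"gpt-5-mini-high": 91, "gpt-5-nano-high": 57}


def accuracy_pct(n_correct: int, n_total: int) -> float:
    """Return accuracy as a percentage of questions answered correctly."""
    return 100.0 * n_correct / n_total


acc = {model: accuracy_pct(n, TOTAL_QUESTIONS) for model, n in correct.items()}

# Absolute gap between the two accuracies, in percentage points.
gap_points = acc["gpt-5-mini-high"] - acc["gpt-5-nano-high"]

# Relative gain: how many more questions mini answered correctly than nano.
relative_gain = 100.0 * (correct["gpt-5-mini-high"] / correct["gpt-5-nano-high"] - 1)

for model, value in acc.items():
    print(f"{model}: {value:.1f}%")
print(f"gap: {gap_points:.1f} percentage points")
print(f"relative gain in correct answers: {relative_gain:.1f}%")
```

In relative terms, gpt-5-mini-high answered roughly 60% more questions correctly than gpt-5-nano-high, which is a different statement from the 18.1-point gap.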
Average Cost per Query
gpt-5-mini-high: $0.0150
gpt-5-nano-high: $0.0069
gpt-5-mini-high costs 117.5% more
Average Execution Time
gpt-5-mini-high: 44.83s
gpt-5-nano-high: 61.10s
gpt-5-nano-high takes 36.3% longer
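The "costs more" and "takes longer" figures are relative differences measured against the smaller value. The sketch below recomputes both from the rounded averages shown above; the helper name `pct_more` is hypothetical. Because the displayed averages are rounded, the recomputed cost difference can differ from the reported 117.5% in the last digit. The assumption here is that the page computed it from unrounded averages.

```python
def pct_more(larger: float, smaller: float) -> float:
    """Return how much larger `larger` is than `smaller`, as a percentage."""
    return 100.0 * (larger - smaller) / smaller


# Rounded per-query averages as displayed in the summary above.
cost_mini, cost_nano = 0.0150, 0.0069   # USD per query
time_mini, time_nano = 44.83, 61.10     # seconds per query

cost_gap = pct_more(cost_mini, cost_nano)
time_gap = pct_more(time_nano, time_mini)

print(f"gpt-5-mini-high costs about {cost_gap:.1f}% more per query")
print(f"gpt-5-nano-high takes about {time_gap:.1f}% longer per query")
```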
Question-by-Question Comparison
Detailed comparison showing where each model succeeded or failed
Showing 1 to 25 of 188 questions
Page 1 of 8
Technique | Question | gpt-5-mini-high | gpt-5-nano-high | Result |
T1007 | An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed. | ✓ | ✗ | gpt-5-mini-high Wins |
T1016 | A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked? | ✓ | ✗ | gpt-5-mini-high Wins |
T1021.006 | On Windows hosts, look through recent PowerShell execution records to find any elevated session where remote management was turned on. What exact command was run to enable PSRemoting? | ✓ | ✗ | gpt-5-mini-high Wins |
T1006 | Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path. | ✓ | ✗ | gpt-5-mini-high Wins |
T1016.001 | An analyst notices a PowerShell process on a Windows host that appears to be checking SMB connectivity. Which PowerShell cmdlet was executed to perform this outbound port 445 test? | ✓ | ✗ | gpt-5-mini-high Wins |
T1049 | In a Windows log analytics workspace, search for PowerShell processes that were used to enumerate network connections. Determine which PowerShell cmdlet was executed to list active TCP connections. | ✓ | ✗ | gpt-5-mini-high Wins |
T1053.003 | Linux hosts may log events when new files are added to /var/spool/cron/crontabs. Query those logs for a creation or write action in that directory and determine the file name that was added. | ✓ | ✗ | gpt-5-mini-high Wins |
T1036.004 | Analyze Windows process events for any schtasks.exe commands that created a new task invoking PowerShell. What is the name of the .ps1 script specified to run? | ✓ | ✗ | gpt-5-mini-high Wins |
T1053.006 | Examine the logs from the Linux system for events related to the systemd timer activation. Identify any records indicating that a new timer unit was started and enabled, and determine which timer name was used. | ✓ | ✗ | gpt-5-mini-high Wins |
T1059.004 | On a Linux system, find any process creation record where awk is used with a BEGIN rule to launch a shell. What was the exact command invoked? | ✓ | ✗ | gpt-5-mini-high Wins |
T1069.001 | On a Linux endpoint, process events reveal a chain of group‐enumeration utilities executed by a single session. Which utility was used to query the system’s group database? | ✓ | ✗ | gpt-5-mini-high Wins |
T1070.003 | On a Linux system, you suspect someone erased their command history by linking the history file to /dev/null. Investigate process events and determine which utility was executed to achieve this. | ✓ | ✗ | gpt-5-mini-high Wins |
T1070.003 | On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history? | ✓ | ✗ | gpt-5-mini-high Wins |
T1059.007 | On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript. | ✓ | ✗ | gpt-5-mini-high Wins |
T1070.003 | On a Windows device, there’s evidence that PowerShell history was wiped by deleting the history file. What was the exact command used to perform this action? | ✓ | ✗ | gpt-5-mini-high Wins |
T1059.004 | An attacker on a Linux host may try to enumerate installed shells by reading the system file that lists valid shells. Using process or syslog data, determine which command was executed to perform this enumeration. | ✓ | ✗ | gpt-5-mini-high Wins |
T1070.004 | While reviewing Windows process events, you observe a command that recursively deleted a folder under the temporary directory. Use the process event data to identify which process or tool executed this recursive delete. | ✓ | ✗ | gpt-5-mini-high Wins |
T1070 | A suspicious actor appears to have removed the USN change journal on a Windows workstation. Investigate process start records to find out exactly which command was used to delete the journal. What was the full command line invoked? | ✓ | ✗ | gpt-5-mini-high Wins |
T1070.006 | On a Windows host, suspicious PowerShell activity adjusted the system clock and recorded a value. What numeric value was used to slip the system date? | ✓ | ✗ | gpt-5-mini-high Wins |
T1082 | A Linux system shows a process in the execution logs that fetched the machine’s name. Review the DeviceProcessEvents table to find out which utility was called to perform this hostname lookup. | ✓ | ✗ | gpt-5-mini-high Wins |
T1112 | A Windows host logs a change to the Terminal Server registry key disabling single-session per user. Which command-line utility executed this registry modification? | ✓ | ✗ | gpt-5-mini-high Wins |
T1082 | Review Windows process logs to find which built-in command was executed to reveal the system’s hostname. | ✓ | ✗ | gpt-5-mini-high Wins |
T1112 | Review registry event logs on the Windows host for PowerShell-driven writes to system policy and file system keys. Which registry value names were created during this BlackByte preparation simulation? | ✓ | ✗ | gpt-5-mini-high Wins |
T1112 | A Windows user’s registry was altered via a command-line tool to disable the lock workstation feature by adding a DWORD entry under the current user Policies\System key. Which registry value name was modified in this operation? | ✓ | ✗ | gpt-5-mini-high Wins |
T1112 | Investigate Windows registry events to identify any newly set ProxyServer entry under the user Internet Settings hive. What proxy server address was configured? | ✓ | ✗ | gpt-5-mini-high Wins |
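Rows in this comparison are pipe-delimited, so they can be tallied with a few lines of code. The sketch below assumes five cells per row, in the order technique ID, question, first model's result, second model's result, and outcome. It also assumes the first result column belongs to gpt-5-mini-high, which is inferred from the outcome label rather than stated in the rows. The sample questions are shortened for brevity, and a question containing a literal `|` would need a more careful parser.

```python
from collections import Counter


def parse_row(line: str) -> dict:
    """Split one pipe-delimited comparison row into named fields."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}: {line!r}")
    technique, question, first, second, outcome = cells
    return {
        "technique": technique,
        "question": question,
        "mini_correct": first == "✓",   # assumed column order
        "nano_correct": second == "✓",  # assumed column order
        "outcome": outcome,
    }


# Two rows in the same shape as the table, with shortened question text.
sample_rows = [
    "T1007 | Find the service-listing invocation on Linux. | ✓ | ✗ | gpt-5-mini-high Wins |",
    "T1016 | Which utility listed current TCP connections? | ✓ | ✗ | gpt-5-mini-high Wins |",
]

parsed = [parse_row(row) for row in sample_rows]
outcome_counts = Counter(row["outcome"] for row in parsed)
print(outcome_counts)
```

Applied to the 25 rows on this page, the same tally would show every row as a gpt-5-mini-high win. This page covers only the first 25 of 188 questions, so it does not reflect the overall split.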