gpt-4.1-nano vs gpt-5-mini-high KQL Benchmark

gpt-5-mini-high wins by 23.9%

Compared on 188 shared test questions

Overall Accuracy

gpt-4.1-nano

24.5%

46 / 188 correct

gpt-5-mini-high

48.4%

91 / 188 correct

Average Cost per Query

gpt-4.1-nano: $0.0014
gpt-5-mini-high: $0.0150
gpt-5-mini-high costs 962.7% more

Average Execution Time

gpt-4.1-nano: 10.78s
gpt-5-mini-high: 44.83s
gpt-5-mini-high takes 315.9% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 188 questions
Page 1 of 8
T1003.007
On a Linux system, review process creation logs to spot any dd commands reading from /proc/*/mem. What was the name of the file where the dumped memory was written?
gpt-4.1-nano Wins
T1039
On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action?
gpt-4.1-nano Wins
T1048.003
Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration?
gpt-4.1-nano Wins
T1048.003
A Linux host briefly hosted an HTTP service under /tmp. Examine process creation logs to determine the exact python3 command that was used to start the server on port 9090.
gpt-4.1-nano Wins
T1049
A user launched a Windows command prompt and executed a built-in utility to enumerate all active network connections. Using process creation logs, identify the exact tool that produced the list of current connections.
gpt-4.1-nano Wins
T1059.004
On a Linux system, review process execution records for any shell process that set an environment variable containing executable code and then piped it into another shell instance. Determine which environment variable name was used to store the script content.
gpt-4.1-nano Wins
T1078.003
On a Linux host, review account management activity in Syslog or process event logs to pinpoint which command was executed to create a new local user. What was the name of the tool invoked?
gpt-4.1-nano Wins
T1112
Evidence shows that the Windows Defender startup entry was tampered with via an elevated command prompt. Investigate registry events related to the Run key to discover which executable replaced the default SecurityHealth value. What is the name of the new program?
gpt-4.1-nano Wins
T1217
An attacker is suspected of using the Windows shell to enumerate a user’s Internet Explorer bookmarks via the Favorites folder. Identify the exact command they executed to perform this listing.
gpt-4.1-nano Wins
T1542.001
Investigate Windows file creation logs to uncover any new executable added directly to the System32 directory, which may indicate a UEFI persistence implant. What was the name of the file created?
gpt-4.1-nano Wins
T1562.003
On a Linux system you suspect someone altered Bash’s history settings to hide their activity. Investigate process logs for evidence of HISTCONTROL being set to ignore entries. What was the full command executed to configure HISTCONTROL?
gpt-4.1-nano Wins
T1562.003
Within Linux process execution records, locate any bash commands where the HISTFILESIZE environment variable was exported. What value was assigned to HISTFILESIZE?
gpt-4.1-nano Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
gpt-5-mini-high Wins
T1003
On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated.
gpt-5-mini-high Wins
T1006
Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path.
gpt-5-mini-high Wins
T1016
A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked?
gpt-5-mini-high Wins
T1016.002
On a Windows host, someone appears to have run a built-in network shell utility to list saved wireless network profiles and their passwords in clear text. Review the process creation logs to discover the exact command that was executed.
gpt-5-mini-high Wins
T1018
Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used?
gpt-5-mini-high Wins
T1021.006
On Windows hosts, look through recent PowerShell execution records to find any elevated session where remote management was turned on. What exact command was run to enable PSRemoting?
gpt-5-mini-high Wins
T1036.004
Analyze Windows process events for any schtasks.exe commands that created a new task invoking PowerShell. What is the name of the .ps1 script specified to run?
gpt-5-mini-high Wins
T1053.003
Linux hosts may log events when new files are added to /var/spool/cron/crontabs. Query those logs for a creation or write action in that directory and determine the file name that was added.
gpt-5-mini-high Wins
T1053.005
You suspect malicious persistence via scheduled tasks on a Windows endpoint. Review the process execution logs to identify the built-in utility used to register tasks at logon or startup. What is the name of this utility?
gpt-5-mini-high Wins
T1057
A malicious actor may attempt to list running processes on a Windows machine using a WMI-based command. Review the process creation events to find out which utility was invoked to perform this enumeration.
gpt-5-mini-high Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
gpt-5-mini-high Wins
T1053.006
Examine the logs from the Linux system for events related to the systemd timer activation. Identify any records indicating that a new timer unit was started and enabled, and determine which timer name was used.
gpt-5-mini-high Wins
Page 1 of 8

Explore individual model performance and detailed analysis