gpt-5-mini-medium vs gpt-5-nano-medium KQL Benchmark

gpt-5-mini-medium wins by 21.7%

Compared on 184 shared test questions

Overall Accuracy

gpt-5-mini-medium

45.6%

84 / 184 correct

gpt-5-nano-medium

23.9%

44 / 184 correct

Average Cost per Query

gpt-5-mini-medium: $0.0150
gpt-5-nano-medium: $0.0069
gpt-5-mini-medium costs 116.3% more

Average Execution Time

gpt-5-mini-medium: 47.16s
gpt-5-nano-medium: 65.07s
gpt-5-nano-medium takes 38.0% longer
Question-by-Question Analysis

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing 1 to 25 of 184 questions
Page 1 of 8
T1016.001
On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target.
gpt-5-mini-medium Wins
T1036.004
A threat actor on a Windows system crafted and registered a service named almost identically to the standard time service, but redirecting execution to a custom script. Review the logging data to determine which native command-line tool was used to perform this action. What utility was invoked?
gpt-5-mini-medium Wins
T1027
On a Windows endpoint, look for evidence of a base64-encoded PowerShell payload execution. Which executable launched the encoded command?
gpt-5-mini-medium Wins
T1018
A Windows host executed an ICMP-based network reconnaissance using a looping instruction in cmd.exe. Identify the exact command line that was used to perform the ping sweep.
gpt-5-mini-medium Wins
T1018
Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used?
gpt-5-mini-medium Wins
T1049
In a Windows log analytics workspace, search for PowerShell processes that were used to enumerate network connections. Determine which PowerShell cmdlet was executed to list active TCP connections.
gpt-5-mini-medium Wins
T1048.003
Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration?
gpt-5-mini-medium Wins
T1003
On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated.
gpt-5-mini-medium Wins
T1057
A malicious actor may attempt to list running processes on a Windows machine using a WMI-based command. Review the process creation events to find out which utility was invoked to perform this enumeration.
gpt-5-mini-medium Wins
T1016
A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked?
gpt-5-mini-medium Wins
T1070.003
On a Windows endpoint, commands are no longer being logged to PowerShell history, suggesting PSReadLine settings were altered. Using process execution logs, determine the exact command that was run to set the history save style to 'SaveNothing'.
gpt-5-mini-medium Wins
T1053.005
Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method?
gpt-5-mini-medium Wins
T1059.007
On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript.
gpt-5-mini-medium Wins
T1036.004
Analyze Windows process events for any schtasks.exe commands that created a new task invoking PowerShell. What is the name of the .ps1 script specified to run?
gpt-5-mini-medium Wins
T1059.004
On a Linux host, identify the process invocation that altered a user’s login shell. What was the full command used?
gpt-5-mini-medium Wins
T1070.005
On a Windows system, an attacker used the command prompt to remove one or more default administrative shares. Which share names were deleted?
gpt-5-mini-medium Wins
T1070.004
Suspiciously, the recycle bin appears empty system-wide. Determine which command was executed on Windows to clear the system's recycle bin directory, including any switches and environment variables.
gpt-5-mini-medium Wins
T1070
A suspicious actor appears to have removed the USN change journal on a Windows workstation. Investigate process start records to find out exactly which command was used to delete the journal. What was the full command line invoked?
gpt-5-mini-medium Wins
T1082
While investigating process creation logs on a Linux device, you observe a privileged hardware interrogation step used to reveal virtualization details. Which utility was invoked?
gpt-5-mini-medium Wins
T1069.001
Investigate Windows process execution logs for a PowerShell cmdlet used to list group members. Look for entries where a group name is provided after a '-Name' flag and identify which group was queried.
gpt-5-mini-medium Wins
T1070.003
On a Linux endpoint, you suspect malicious clearing of the bash history by redirecting from the null device. Explore process or file events to uncover the exact shell command that performed this action.
gpt-5-mini-medium Wins
T1082
Using Linux process execution logs, identify the specific command that was used to filter loaded kernel modules for entries containing “vmw.” What was that full command?
gpt-5-mini-medium Wins
T1057
While reviewing Windows process events, you spot a PowerShell process executing a WMI enumeration cmdlet. What WMI class name did the attacker query?
gpt-5-mini-medium Wins
T1070.004
While reviewing Windows process events, you observe a command that recursively deleted a folder under the temporary directory. Use the process event data to identify which process or tool executed this recursive delete.
gpt-5-mini-medium Wins
T1059.004
An analyst suspects that a restricted shell escape was executed using a common Perl package manager on Linux. Review the process execution records to determine which tool was invoked to spawn the shell.
gpt-5-mini-medium Wins
Page 1 of 8

Explore individual model performance and detailed analysis