gpt-35-turbo vs gpt-5-nano-high KQL Benchmark
gpt-5-nano-high wins by 13.3 percentage points
Compared on 188 shared test questions
Overall Accuracy
gpt-35-turbo: 17.0% (32 / 188 correct)
gpt-5-nano-high: 30.3% (57 / 188 correct)
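The headline margin is the difference between the two accuracy figures, which makes it a gap in percentage points rather than a relative improvement. A minimal sketch of that arithmetic, using only the counts reported above (the helper name `accuracy_pct` is illustrative and not part of the benchmark):

```python
# Counts are taken from the summary above; accuracy_pct is a hypothetical helper.
def accuracy_pct(correct: int, total: int) -> float:
    """Share of correct answers, expressed as a percentage."""
    return 100.0 * correct / total


TOTAL = 188
acc_35 = accuracy_pct(32, TOTAL)    # gpt-35-turbo
acc_nano = accuracy_pct(57, TOTAL)  # gpt-5-nano-high

# Absolute gap, in percentage points (the figure quoted in the headline).
gap_points = acc_nano - acc_35

# Relative increase in the number of correct answers, for comparison.
relative_gain = 100.0 * (57 - 32) / 32

print(f"gpt-35-turbo: {acc_35:.1f}%  gpt-5-nano-high: {acc_nano:.1f}%")
print(f"gap: {gap_points:.1f} percentage points")
print(f"relative gain in correct answers: about {relative_gain:.0f}%")
```

The two measures answer different questions: the gap in points compares the accuracy rates directly, while the relative gain (roughly 78% more correct answers) describes how much larger one count is than the other.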
Average Cost per Query
gpt-35-turbo: $0.0093
gpt-5-nano-high: $0.0069
gpt-35-turbo costs 35.0% more
Average Execution Time
gpt-35-turbo: 4.57s
gpt-5-nano-high: 61.10s
gpt-5-nano-high takes 1237.8% longer (about 13.4 times the execution time)
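The "costs more" and "takes longer" figures are relative differences measured against the smaller value. The sketch below recomputes them from the rounded averages shown above; the helper `pct_more` is illustrative. Because the displayed averages are rounded, the results land close to, but not exactly on, the page's 35.0% and 1237.8%, which were presumably computed from unrounded values.

```python
# Averages are the rounded values displayed above; pct_more is a hypothetical helper.
def pct_more(larger: float, smaller: float) -> float:
    """How much larger `larger` is than `smaller`, as a percentage of `smaller`."""
    return 100.0 * (larger - smaller) / smaller


cost_gap = pct_more(0.0093, 0.0069)  # gpt-35-turbo vs gpt-5-nano-high, USD per query
time_gap = pct_more(61.10, 4.57)     # gpt-5-nano-high vs gpt-35-turbo, seconds
slowdown = 61.10 / 4.57              # same comparison expressed as a multiple

print(f"cost: {cost_gap:.1f}% more")
print(f"time: {time_gap:.1f}% longer, i.e. {slowdown:.1f}x")
```

Expressing the time difference as a multiple is often easier to read than a four-digit percentage.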
Question-by-Question Comparison
Detailed comparison showing where each model succeeded or failed
Showing 1 to 25 of 188 questions
Page 1 of 8
Technique ID | Question | gpt-35-turbo | gpt-5-nano-high | Result |
T1021.006 | On Windows hosts, look through recent PowerShell execution records to find any elevated session where remote management was turned on. What exact command was run to enable PSRemoting? | ✓ | ✗ | gpt-35-turbo Wins |
T1048.003 | Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration? | ✓ | ✗ | gpt-35-turbo Wins |
T1049 | In a Windows log analytics workspace, search for PowerShell processes that were used to enumerate network connections. Determine which PowerShell cmdlet was executed to list active TCP connections. | ✓ | ✗ | gpt-35-turbo Wins |
T1053.005 | Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method? | ✓ | ✗ | gpt-35-turbo Wins |
T1059.004 | On a Linux system, find any process creation record where awk is used with a BEGIN rule to launch a shell. What was the exact command invoked? | ✓ | ✗ | gpt-35-turbo Wins |
T1070.003 | On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history? | ✓ | ✗ | gpt-35-turbo Wins |
T1112 | On Windows systems, disabling RDP via the registry generates registry write events. Investigate registry event logs for modifications under the Terminal Server configuration path. What is the name of the registry value that was changed to disable Remote Desktop Protocol? | ✓ | ✗ | gpt-35-turbo Wins |
T1124 | In Windows process event logs, you notice both the net time and w32tm commands being executed to display the system time and timezone. Which executor name from the test configuration was responsible for launching these utilities? | ✓ | ✗ | gpt-35-turbo Wins |
T1124 | An analyst reviewing Windows process logs wants to spot instances where a native time tool was repurposed to introduce a delay. Which full W32tm invocation, including the stripchart and period flags, appears in the logs? | ✓ | ✗ | gpt-35-turbo Wins |
T1201 | Windows systems may be probed for their password policy settings using a native command-line tool. Determine which command was executed to list the local password policy on the target hosts. | ✓ | ✗ | gpt-35-turbo Wins |
T1218.004 | Review Windows process creation events for evidence of a .NET assembly being installed. Which executable was launched with an "/action=install" argument? | ✓ | ✗ | gpt-35-turbo Wins |
T1555.003 | On a Windows system, PowerShell was used to gather multiple browser credential files into a temp folder and then archive them. What was the name of the resulting ZIP file? | ✓ | ✗ | gpt-35-turbo Wins |
T1614.001 | On a Windows device, an attacker ran a PowerShell script to collect system settings including UI language and locale. Identify which cmdlet in the command line was used to obtain the system locale. | ✓ | ✗ | gpt-35-turbo Wins |
T1614.001 | Using Linux process or syslog logs, identify the executable that was run to output the system's locale information. | ✓ | ✗ | gpt-35-turbo Wins |
T1003.005 | On Windows devices, identify the full command used when the built-in credential manager utility was invoked to list stored credentials. What was the exact command? | ✗ | ✓ | gpt-5-nano-high Wins |
T1003.007 | On a Linux system, review process creation logs to spot any dd commands reading from /proc/*/mem. What was the name of the file where the dumped memory was written? | ✗ | ✓ | gpt-5-nano-high Wins |
T1003 | On a Windows host, an attacker leveraged COMSVCS.DLL via rundll32.exe to dump the memory of svchost.exe into a file under the Temp directory. Review the file creation logs to determine the exact name of the dump file that was generated. | ✗ | ✓ | gpt-5-nano-high Wins |
T1016.001 | On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target. | ✗ | ✓ | gpt-5-nano-high Wins |
T1018 | Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used? | ✗ | ✓ | gpt-5-nano-high Wins |
T1018 | A Windows host executed an ICMP-based network reconnaissance using a looping instruction in cmd.exe. Identify the exact command line that was used to perform the ping sweep. | ✗ | ✓ | gpt-5-nano-high Wins |
T1036.003 | In a Linux environment, you observe a process labeled like the cron daemon but running from an unexpected path. Investigate creation events to uncover the actual filename used by this fake cron process. | ✗ | ✓ | gpt-5-nano-high Wins |
T1057 | On a Windows device, PowerShell was used to collect a snapshot of running processes. Identify the exact cmdlet that was executed. | ✗ | ✓ | gpt-5-nano-high Wins |
T1057 | On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed? | ✗ | ✓ | gpt-5-nano-high Wins |
T1057 | A Windows endpoint recorded a command-line activity through cmd.exe that lists all running processes. Determine which built-in tool was executed to perform this action. | ✗ | ✓ | gpt-5-nano-high Wins |
T1059.004 | An analyst suspects that a restricted shell escape was executed using a common Perl package manager on Linux. Review the process execution records to determine which tool was invoked to spawn the shell. | ✗ | ✓ | gpt-5-nano-high Wins |
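Each row above is pipe-delimited, with the outcome in the last field, so a page of results can be tallied mechanically. The following is a minimal sketch, assuming the five-field layout seen on this page and that no question text contains a `|` character; the function name `tally_winners` is illustrative. The sample rows are copied from the table above.

```python
from collections import Counter


def tally_winners(rows):
    """Count outcomes from rows shaped like 'ID | question | mark | mark | winner |'."""
    counts = Counter()
    for row in rows:
        # Split on the delimiter and drop the empty field left by the trailing '|'.
        fields = [f.strip() for f in row.split("|")]
        fields = [f for f in fields if f]
        if len(fields) != 5:
            continue  # skip headers, pagination lines, or malformed rows
        counts[fields[4]] += 1
    return counts


sample = [
    "T1016.001 | On a Linux host, a ping command was executed to test internet "
    "connectivity. Determine which IP address was used as the ping target. "
    "| ✗ | ✓ | gpt-5-nano-high Wins |",
    "T1057 | On a Windows device, PowerShell was used to collect a snapshot of "
    "running processes. Identify the exact cmdlet that was executed. "
    "| ✗ | ✓ | gpt-5-nano-high Wins |",
    "T1614.001 | Using Linux process or syslog logs, identify the executable that "
    "was run to output the system's locale information. "
    "| ✓ | ✗ | gpt-35-turbo Wins |",
]

print(tally_winners(sample))
```

Applied to the 25 rows on this page, the same count gives 14 wins for gpt-35-turbo and 11 for gpt-5-nano-high. That split covers only this page and does not reflect the overall totals of 32 and 57 correct answers.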