gpt-4-turbo-2024-04-09 vs gpt-5-nano-high KQL Benchmark
gpt-4-turbo-2024-04-09 wins by 9.0 percentage points
Compared on 188 shared test questions
Overall Accuracy
gpt-4-turbo-2024-04-09
39.4%
74 / 188 correct
gpt-5-nano-high
30.3%
57 / 188 correct
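The accuracy figures above follow directly from the correct/total counts. The sketch below recomputes them; the function name `accuracy_pct` is illustrative and not part of the benchmark tooling.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage of questions answered correctly."""
    return 100.0 * correct / total


# Counts taken from the summary above.
turbo = accuracy_pct(74, 188)
nano = accuracy_pct(57, 188)

# The headline margin is a difference in percentage points, not a relative change.
gap_points = turbo - nano

print(f"gpt-4-turbo-2024-04-09: {turbo:.1f}%")
print(f"gpt-5-nano-high: {nano:.1f}%")
print(f"gap: {gap_points:.1f} percentage points")
```

The gap is 17 more correct answers out of 188 questions, which rounds to 9.0 percentage points.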
Average Cost per Query
gpt-4-turbo-2024-04-09: $0.1737
gpt-5-nano-high: $0.0069
gpt-4-turbo-2024-04-09 costs 2421.4% more
Average Execution Time
gpt-4-turbo-2024-04-09: 16.84s
gpt-5-nano-high: 61.10s
gpt-5-nano-high takes 262.9% longer
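The "costs X% more" and "takes X% longer" lines are relative differences. A minimal sketch of that calculation, using the rounded averages shown above, is given below; the helper name `percent_more` is hypothetical. Recomputing from the rounded averages gives values slightly different from the published 2421.4% and 262.9%, presumably because the page works from unrounded averages (an assumption, since the raw per-query data is not shown here).

```python
def percent_more(larger: float, smaller: float) -> float:
    """Return how much bigger `larger` is than `smaller`, as a percentage."""
    return (larger / smaller - 1.0) * 100.0


# Rounded per-query averages copied from the summary above.
cost_turbo, cost_nano = 0.1737, 0.0069   # USD per query
time_turbo, time_nano = 16.84, 61.10     # seconds per query

cost_delta = percent_more(cost_turbo, cost_nano)
time_delta = percent_more(time_nano, time_turbo)

print(f"gpt-4-turbo-2024-04-09 costs about {cost_delta:.1f}% more per query")
print(f"gpt-5-nano-high takes about {time_delta:.1f}% longer per query")
```

Put as plain ratios, gpt-4-turbo-2024-04-09 is roughly 25 times more expensive per query, while gpt-5-nano-high is roughly 3.6 times slower.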
Question-by-Question Comparison
Detailed comparison showing where each model succeeded or failed
Showing 1 to 25 of 188 questions
Page 1 of 8
Technique | Question | gpt-4-turbo-2024-04-09 | gpt-5-nano-high | Result |
T1016 | A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1036.004 | A threat actor on a Windows system crafted and registered a service named almost identically to the standard time service, but redirecting execution to a custom script. Review the logging data to determine which native command-line tool was used to perform this action. What utility was invoked? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1039 | On a Windows system, someone ran PowerShell to copy a file from a remote machine’s C$ share to the local TEMP folder. Using process event logs, what full PowerShell command was executed to perform this action? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1048.003 | Windows process creation logs show a PowerShell-driven file transfer to an FTP service. Which executable was leveraged to perform this exfiltration? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1049 | In a Windows log analytics workspace, search for PowerShell processes that were used to enumerate network connections. Determine which PowerShell cmdlet was executed to list active TCP connections. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1059.007 | On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1059.004 | On a Linux system, review process execution records for any shell process that set an environment variable containing executable code and then piped it into another shell instance. Determine which environment variable name was used to store the script content. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1070.003 | On a Windows endpoint, review process execution logs to see if any PowerShell sessions were wiped clean. Which command was executed to clear the PowerShell history? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1059.004 | On a Linux host, identify the process invocation that altered a user’s login shell. What was the full command used? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1070.003 | On a Linux endpoint, you suspect malicious clearing of the bash history by redirecting from the null device. Explore process or file events to uncover the exact shell command that performed this action. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1059.004 | On a Linux system, find any process creation record where awk is used with a BEGIN rule to launch a shell. What was the exact command invoked? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1070.003 | On a Linux system, you suspect someone erased their command history by linking the history file to /dev/null. Investigate process events and determine which utility was executed to achieve this. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1070.004 | Suspiciously, the recycle bin appears empty system-wide. Determine which command was executed on Windows to clear the system's recycle bin directory, including any switches and environment variables. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1070 | A suspicious actor appears to have removed the USN change journal on a Windows workstation. Investigate process start records to find out exactly which command was used to delete the journal. What was the full command line invoked? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1082 | Review Windows process logs to find which built-in command was executed to reveal the system’s hostname. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1112 | A Windows host logs a change to the Terminal Server registry key disabling single-session per user. Which command-line utility executed this registry modification? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1112 | On Windows systems, disabling RDP via the registry generates registry write events. Investigate registry event logs for modifications under the Terminal Server configuration path. What is the name of the registry value that was changed to disable Remote Desktop Protocol? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1112 | On a Windows endpoint, review the registry write events to spot when the WDigest key is altered to permit plaintext credential storage. What registry value name was changed? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1112 | Investigate Windows registry events to identify any newly set ProxyServer entry under the user Internet Settings hive. What proxy server address was configured? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1124 | An analyst reviewing Windows process logs wants to spot instances where a native time tool was repurposed to introduce a delay. Which full W32tm invocation, including the stripchart and period flags, appears in the logs? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1201 | Windows systems may be probed for their password policy settings using a native command-line tool. Determine which command was executed to list the local password policy on the target hosts. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1124 | On a Linux host, an activity was recorded where the local clock and timezone were queried. Review the available process execution logs to uncover what full command was run to fetch the system time and timezone. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1217 | An attacker is suspected of using the Windows shell to enumerate a user’s Internet Explorer bookmarks via the Favorites folder. Identify the exact command they executed to perform this listing. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1201 | You are reviewing Linux syslog records on a CentOS/RHEL 7.x server. You notice entries for shell commands that access system configuration files under /etc/security. Determine exactly which configuration file was being inspected by the command. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
T1218.004 | Review Windows process creation events for evidence of a .NET assembly being installed. Which executable was launched with an "/action=install" argument? | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |
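For readers who want to tally the rows above, for example by MITRE ATT&CK technique ID or by outcome, the following is a minimal parsing sketch. It assumes the first mark in each row belongs to gpt-4-turbo-2024-04-09 and the second to gpt-5-nano-high, which matches the "Wins" label on every row of this page; the `Row` type and `parse_row` function are illustrative names, not part of the benchmark.

```python
from collections import Counter
from typing import NamedTuple


class Row(NamedTuple):
    technique: str
    question: str
    turbo_correct: bool   # assumed: first mark column is gpt-4-turbo-2024-04-09
    nano_correct: bool    # assumed: second mark column is gpt-5-nano-high
    outcome: str


def parse_row(line: str) -> Row:
    """Parse one 'ID | question | mark | mark | outcome |' line."""
    body = line.strip().rstrip("|").strip()
    # Split the ID off the front and the three fixed columns off the back,
    # so a pipe inside the question text would not break the parse.
    technique, rest = body.split("|", 1)
    question, turbo, nano, outcome = rest.rsplit("|", 3)
    return Row(
        technique.strip(),
        question.strip(),
        turbo.strip() == "✓",
        nano.strip() == "✓",
        outcome.strip(),
    )


# Two rows copied verbatim from the table above.
lines = [
    "T1082 | Review Windows process logs to find which built-in command was executed to reveal the system’s hostname. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |",
    "T1059.007 | On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript. | ✓ | ✗ | gpt-4-turbo-2024-04-09 Wins |",
]

rows = [parse_row(line) for line in lines]
by_outcome = Counter(row.outcome for row in rows)
by_technique = Counter(row.technique for row in rows)

print(by_outcome)
print(by_technique)
```

Only the first 25 of the 188 questions appear on this page, so counts taken from these rows describe the page, not the full benchmark.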