gpt-4.1 vs gpt-4.1-mini KQL Benchmark

gpt-4.1 wins by 20.2 percentage points

Compared on 188 shared test questions

Overall Accuracy

gpt-4.1

61.7%

116 / 188 correct

gpt-4.1-mini

41.5%

78 / 188 correct
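
The accuracy figures above can be reproduced from the raw counts. The sketch below is only an illustration: the function and variable names are hypothetical and not part of the benchmark tooling. It also shows that the 20.2 figure is the absolute gap in percentage points, while the relative gain over gpt-4.1-mini is larger.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Return the share of correctly answered questions as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


TOTAL_QUESTIONS = 188   # shared test questions reported on the page
GPT41_CORRECT = 116     # gpt-4.1 correct answers
MINI_CORRECT = 78       # gpt-4.1-mini correct answers

gpt41_acc = accuracy_pct(GPT41_CORRECT, TOTAL_QUESTIONS)
mini_acc = accuracy_pct(MINI_CORRECT, TOTAL_QUESTIONS)

# Absolute gap in percentage points (the headline number).
gap_points = gpt41_acc - mini_acc

# Relative gain: how many more questions gpt-4.1 solved, relative to mini.
relative_gain_pct = 100.0 * (GPT41_CORRECT - MINI_CORRECT) / MINI_CORRECT

print(f"gpt-4.1:       {gpt41_acc:.1f}%")
print(f"gpt-4.1-mini:  {mini_acc:.1f}%")
print(f"gap:           {gap_points:.1f} percentage points")
print(f"relative gain: {relative_gain_pct:.1f}%")
```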

Average Cost per Query

gpt-4.1: $0.0285
gpt-4.1-mini: $0.0057
gpt-4.1 costs 398.2% more

Average Execution Time

gpt-4.1: 9.95s
gpt-4.1-mini: 14.13s
gpt-4.1-mini takes 42.1% longer
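
The cost and execution-time comparisons above are both "percent more than the baseline" calculations. The sketch below recomputes them from the rounded averages shown on the page; the helper name is hypothetical. Because the inputs are rounded, the results (about 400% and 42.0%) differ slightly from the reported 398.2% and 42.1%, which were presumably derived from unrounded averages (an assumption, since the page does not show them).

```python
def percent_more(value: float, baseline: float) -> float:
    """Return how much larger `value` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (value - baseline) / baseline


# Rounded averages as displayed on the page.
GPT41_COST_USD = 0.0285
MINI_COST_USD = 0.0057
GPT41_TIME_S = 9.95
MINI_TIME_S = 14.13

# gpt-4.1 cost relative to gpt-4.1-mini.
cost_pct = percent_more(GPT41_COST_USD, MINI_COST_USD)

# gpt-4.1-mini execution time relative to gpt-4.1.
time_pct = percent_more(MINI_TIME_S, GPT41_TIME_S)

print(f"gpt-4.1 costs {cost_pct:.1f}% more per query")
print(f"gpt-4.1-mini takes {time_pct:.1f}% longer per query")
```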

Question-by-Question Comparison

Detailed comparison showing where each model succeeded or failed

Showing questions 1 to 25 of 188 (page 1 of 8)
T1016.001
On a Linux host, a ping command was executed to test internet connectivity. Determine which IP address was used as the ping target.
gpt-4.1 Wins
T1007
An analyst suspects a user or script ran a service enumeration command on a Linux system. Review process events to find the service-listing invocation and specify the full command that was executed.
gpt-4.1 Wins
T1006
Identify the PowerShell cmdlet used on Windows to format and display the raw volume bytes after an attacker read the boot sector via a DOS device path.
gpt-4.1 Wins
T1003.001
Using Windows process event logs, investigate PowerShell activity around lsass.exe memory capture. What was the name of the script file invoked to perform the dump?
gpt-4.1 Wins
T1016
A Linux host’s Syslog shows a shell-based network discovery script ran multiple commands. One of them listed current TCP connections. Which utility was invoked?
gpt-4.1 Wins
T1018
Review Linux process execution records for any commands that list TCP metric cache entries and filter out loopback interfaces. Which utility was used?
gpt-4.1 Wins
T1027
On a Linux system, identify the script that was generated by decoding a base64 data file and then executed. What was the filename of that script?
gpt-4.1 Wins
T1036.003
A process is running under a familiar Windows host name but originates from a user's AppData folder rather than the System32 directory. Identify the filename used to masquerade the PowerShell binary on this Windows device.
gpt-4.1 Wins
T1049
A user launched a Windows command prompt and executed a built-in utility to enumerate all active network connections. Using process creation logs, identify the exact tool that produced the list of current connections.
gpt-4.1 Wins
T1053.005
Investigate Windows process events for PowerShell activity that leverages WMI to register a scheduled task via XML import. What was the name of the XML file supplied to the RegisterByXml method?
gpt-4.1 Wins
T1053.006
Examine the logs from the Linux system for events related to the systemd timer activation. Identify any records indicating that a new timer unit was started and enabled, and determine which timer name was used.
gpt-4.1 Wins
T1057
While reviewing Windows process events, you spot a PowerShell process executing a WMI enumeration cmdlet. What WMI class name did the attacker query?
gpt-4.1 Wins
T1053.005
On Windows, review recent registry changes to detect when the MSC file association was hijacked by a reg add operation. What executable file was configured as the default command under HKCU\Software\Classes\mscfile\shell\open\command?
gpt-4.1 Wins
T1057
On a Windows host, investigate process events to find when Task Manager was launched via cmd with an unusual flag. What was the full command executed?
gpt-4.1 Wins
T1069.001
On a Linux endpoint, process events reveal a chain of group‐enumeration utilities executed by a single session. Which utility was used to query the system’s group database?
gpt-4.1 Wins
T1059.007
On a Windows endpoint, wscript.exe was used to run a JScript. Identify the exact script path passed to wscript.
gpt-4.1 Wins
T1070.003
On a Linux endpoint, you suspect malicious clearing of the bash history by redirecting from the null device. Explore process or file events to uncover the exact shell command that performed this action.
gpt-4.1 Wins
T1070.004
While reviewing Windows process events, you observe a command that recursively deleted a folder under the temporary directory. Use the process event data to identify which process or tool executed this recursive delete.
gpt-4.1 Wins
T1070.004
Suspiciously, the recycle bin appears empty system-wide. Determine which command was executed on Windows to clear the system's recycle bin directory, including any switches and environment variables.
gpt-4.1 Wins
T1082
Review Windows process logs to find which built-in command was executed to reveal the system’s hostname.
gpt-4.1 Wins
T1082
On Windows systems, identify when the built-in Shadow Copy utility is used to enumerate existing snapshots. What was the full command executed?
gpt-4.1 Wins
T1082
A Linux system shows a process in the execution logs that fetched the machine’s name. Review the DeviceProcessEvents table to find out which utility was called to perform this hostname lookup.
gpt-4.1 Wins
T1070.006
On a Linux system, attackers may use timestamp manipulation to hide malicious changes. Investigate relevant logs to identify which file’s modification timestamp was altered by such a command.
gpt-4.1 Wins
T1082
While investigating process creation logs on a Linux device, you observe a privileged hardware interrogation step used to reveal virtualization details. Which utility was invoked?
gpt-4.1 Wins
T1112
Evidence shows that the Windows Defender startup entry was tampered with via an elevated command prompt. Investigate registry events related to the Run key to discover which executable replaced the default SecurityHealth value. What is the name of the new program?
gpt-4.1 Wins
