Automated Allowlisting and Optimization of High-Volume Alerts for SecOps Efficiency
One of the top-ranked problems in SecOps alerting is dealing with the amount of noise generated by false positives (FPs) while maintaining a constant effort to reduce false negatives. To deal with FPs upstream, threat detection engineers continuously sharpen the logic of SIEM alerts and add FPs to allowlists/whitelists or exclude-lists. Automation is also introduced downstream as part of playbooks or runbooks to close out FPs via further correlation after incident tickets are created.
New SIEM alerts need to be monitored and tuned thoroughly before production to keep the number of incident tickets manageable. This time-consuming process limits both the speed and the volume of detection-alert rollouts, constraining overall detection coverage.
In this post, I share a few innovative mechanisms that can help solve the following problems:
- high levels of FPs in both new and existing alerts
- extended lead time required to monitor and tune new alerts before productionizing them
- high levels of maintenance and manual effort required for allowlisting, and its lack of scalability
- stakeholders being asked the same questions by different analysts who picked up similar alerts
Mechanism 1: Automated Allowlisting — Referencing Loopback
The typical process of allowlisting or exclude-listing in an alert is:
- reading through the alert to understand detection logic
- investigating the incident ticket to rule out malice
- determining the strictest combination of attributes to exclude/allow
- inserting the new allowlist entry and running the alert logic to ensure that only the FP in question is excluded
(A nuance between allowlists and exclude-lists: the former defines legitimate and trusted activities, whereas the latter defines activities meant to be excluded from the alert in question for other reasons, regardless of legitimacy.)
The best practice is for every FP ticket either to be accompanied by a request/Jira ticket for allowlisting or to have the new allowlist record added directly with peer review. However, this is often far from actual practice.
My proposed solution leverages a loopback of investigation details. Take the sample alert structure below, written in Splunk query language (SPL):
index=email sourcetype=msg <detection and exclusion criteria>
| <rex-extract out certain specific fields if not already parsed>
| <transform/aggregate/enrich>
| <exclude allowlist>
| <set metadata/tags>
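For reference, the <exclude allowlist> step is typically a manually maintained lookup-based exclusion. A minimal sketch is below; the lookup name (email_allowlist.csv) and its fields are hypothetical placeholders, not a prescribed schema:
| search NOT [| inputlookup email_allowlist.csv | fields sender_domain recipient]
Every new FP then requires someone to add a row to that lookup by hand, which is the maintenance burden this mechanism aims to remove.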
Instead of continuously maintaining an allowlist, we could add a subsearch against past incidents of the same type to query for prior verdicts and make a determination based on risk appetite, for instance:
- if there are 2 or more past incidents with the same (stipulated) attributes closed as FP
- if there are no TPs for that set of attributes
In the above case, the default set of attributes could be predefined as the sender domain + recipient. If there is a need to allowlist on something more specific, such as a particular recipient, the number of recipients, or the attachment file type, the analyst can specify that in the case notes when closing the ticket as FP.
The alert, on the other hand, will include a new check against past tickets:
index=email sourcetype=msg <detection and exclusion criteria>
| <rex-extract out certain specific fields if not already parsed>
| <transform/aggregate/enrich>
| <exclude allowlist>
| search NOT [search index=ticket sourcetype=incident earliest=1 type=spearphishing
| stats dc(verdict) as dc_verdict, values(verdict) as verdict, count by attributes
| where dc_verdict=1 AND verdict="FP" AND count>=2
| table attributes]
| <exclude the FP attributes>
| <set metadata/tags>
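For the NOT clause to match, the alert pipeline and the ticket aggregation must expose an identical attributes field. A minimal sketch of how that field could be composed, assuming hypothetical field names and using the default sender domain + recipient combination with an optional analyst override parsed from the case notes:
| eval attributes=coalesce(analyst_attributes, sender_domain . "|" . recipient)
The same eval would sit in both searches (before the NOT subsearch on the alert side) so that the values line up exactly.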
For efficiency and reusability, the subsearch above could be moved into a batch job that outputlookups/appends to another lookup list for exclusion.
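A minimal sketch of that batch job, assuming a destination lookup named fp_autoallowlist.csv (a placeholder name) and the same ticket index as above; scheduled periodically, it rebuilds the exclusion list so the alert no longer re-runs the verdict aggregation on every execution:
index=ticket sourcetype=incident earliest=1 type=spearphishing
| stats dc(verdict) as dc_verdict, values(verdict) as verdict, count by attributes
| where dc_verdict=1 AND verdict="FP" AND count>=2
| table attributes
| outputlookup fp_autoallowlist.csv
The alert then swaps the inline subsearch for a lightweight lookup reference:
| search NOT [| inputlookup fp_autoallowlist.csv | table attributes]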
When done right, manual allowlisting is no longer required, and analysts never need to investigate more than 2 FPs of the same vector for any given alert type.
Mechanism 2: Automated Tuning — Wisdom of the Crowd
Business context is almost always required when writing a new alert, to understand what the baseline or “normal” looks like without relying on assumptions. Alert types based on anomaly detection are especially noisy in nature, and a lot of effort is required from the responding analysts or engineers to tune and maintain them.
As a rule of thumb, a new alert going into production should not exceed (arbitrarily, depending on the organization) 10–15 triggers per week; otherwise, it may result in alert fatigue.
I propose an automated tuning mechanism that leverages the wisdom of the crowd. When new, untuned alert logic goes live, we put a throttling feature in place to limit the number of triggers per week, for instance to 10. For each of these 10 triggers in the first week, instead of having analysts reach out to employees to gather context, we hook the trigger into an email/Slack RFI in which the playbook automation gathers the business context, the categorization, and whether the activity was expected. Once an employee fills out that response, that employee no longer gets asked about the same alert type/vector.
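One way to express both the throttle and the one-request-per-employee rule directly in the alert search, assuming the alert is scheduled weekly and the playbook writes responses to a hypothetical lookup rfi_responses.csv keyed by an employee field:
<new, untuned detection logic>
| search NOT [| inputlookup rfi_responses.csv | fields employee]
| dedup employee
| head 10
Alternatively, the trigger cap can be handled by the SIEM's native alert throttling; the sketch above simply keeps everything in one place.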
The categorized responses, in turn, get aggregated and used for exclusion according to an acceptable risk threshold: for instance, if 5 employees confirm that a particular process is BAU, that process automatically gets allowlisted. With each week of running the untuned alert, the 10 hits gather responses from new stakeholders; each stakeholder receives at most one request (and never has to answer the same question from multiple analysts), and confirmed activities are excluded from new runs of the alert once the crowdsourced threshold (i.e. 5 FPs with no TPs) is hit. Within a few weeks of running this alert, the total number of triggers is reduced and normalized.
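The response aggregation can follow the same pattern as Mechanism 1. The index, field names, and the threshold of 5 below are illustrative assumptions: once enough employees independently classify an activity as expected, and nobody classifies it otherwise, it is appended to the exclusion lookup.
index=rfi sourcetype=response alert_type="<new alert>"
| stats dc(employee) as responders, dc(expected) as dc_expected, values(expected) as expected by attributes
| where responders>=5 AND dc_expected=1 AND expected="yes"
| table attributes
| outputlookup append=true crowd_allowlist.csv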
Mechanism 3: Automated Promotion of Hunt Rules into Detection Alerts
With continuous automated tuning, we can look towards an “automated” pipeline that creates detection rules. Threat hunt rules are initially crafted as part of proactive threat hunting. Once the unknowns are uncovered, the relevant hunt rules from these exercises should be scheduled for continuous execution rather than being run as a one-time event.
An inventory of these hunt rules should be maintained, and criteria for automatically converting hunt rules into detection alerts can be defined, for instance when either of the conditions below holds at any point in time (a correlation sketch follows the list):
- when two or more hunt rules trigger on the same base attributes (e.g. src_ip, host, process) within a stipulated time frame (e.g. within a day)
- when a hunt rule settles at a low trigger volume (e.g. once every couple of months)
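A sketch of the first condition, assuming hunt-rule executions log their hits to a hypothetical index=hunt_results with a rule_name field; entities on which multiple rules fire within a day become promotion candidates:
index=hunt_results earliest=-1d
| stats dc(rule_name) as rules_fired, values(rule_name) as rule_names by src_ip, host
| where rules_fired>=2
| table src_ip, host, rule_names
The second condition is the inverse check: a stats count by rule_name over a couple of months, promoting rules whose trigger count stays in the low single digits.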
Putting the 3 mechanisms together, we add the following layer of automation onto SecOps: