I would like to discuss the most significant cybersecurity architectural change in the last 15 years. It’s being driven by new products built to respond to the industry’s abysmal incident detection record.
Overview
There are two types of products driving this architectural change. The first is new “domain-level” detection countermeasures, primarily in the endpoint and network domains. They dramatically improve threat detection by (1) expanding the types of detection methods used and (2) leveraging threat intelligence to detect zero-day threats.
The second is a new type of incident detection and response productivity tool, which Gartner would categorize as SOAR (Security Operations, Analytics and Reporting). It provides SOC analysts with (1) playbooks and orchestration to improve analyst productivity, efficiency, and consistency, (2) automated correlation of alerts from SIEMs and the above-mentioned “next-generation” domain-based countermeasures, (3) the ability to query the domain-level countermeasures and other sources of information, such as CMDBs and vulnerability management solutions, via their APIs to give SOC analysts better context, and (4) a graphical user interface supported by a robust database that enables rapid SOC analyst investigations.
I am calling the traditional approach we’ve been using for the last 15 years, “Monolithic,” because a SIEM or a single log repository has been at the center of the detection process. I’m calling the new approach “Composite” because there are multiple event repositories – the SIEM/log repository and those of the domain-level countermeasures.
First I will review why this change is needed, and then I’ll go into more detail about how the Composite architecture addresses the incident detection problems we have been experiencing.
Problems with SIEM
Back around 2000, the first Security Information and Event Management (SIEM) solutions appeared. The rationale for the SIEM was the need to consolidate and analyze the events, as represented by logs, being generated by disparate domain technologies such as anti-virus, firewalls, IDSs, and servers. While SIEMs did OK with compliance reporting and forensics investigation, they were poor at incident detection. The question is, why?
First, for the most part, SIEMs are limited to log analysis. Most of the criticism of SIEMs relates to the limitations of rule-based alerting (more on that below), but limiting analysis to logs is a problem in itself for two reasons. One, capturing the details of actual activities from logs is difficult. Two, a log often represents just a summary of an event. So SIEMs were handicapped by their data sources.
Additionally, the SIEM’s rule-based analysis approach only alerts on predefined, known bad scenarios. New, creative, unknown scenarios are missed. Tuning rules for a particular organization is also very time consuming, especially when the number of rules that need to be tuned can run into the hundreds.
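To make the limitation concrete, here is a minimal sketch of rule-based alerting. The rule names, event fields, and thresholds are hypothetical and chosen only for illustration; no particular SIEM product works exactly this way.

```python
# Hypothetical rules: each one encodes a single predefined, known-bad scenario.
RULES = {
    "brute_force": lambda e: e["type"] == "login_failure" and e["count"] >= 10,
    "known_bad_port": lambda e: e["type"] == "connection" and e["dest_port"] == 4444,
}

def evaluate(event):
    """Return the names of all rules the event matches."""
    return [name for name, rule in RULES.items() if rule(event)]

# A scenario somebody anticipated and wrote a rule for.
known = {"type": "login_failure", "count": 25}

# A scenario nobody anticipated: a very large upload over an ordinary port.
novel = {"type": "connection", "dest_port": 443, "bytes_out": 5_000_000_000}

known_alerts = evaluate(known)
novel_alerts = evaluate(novel)
print(known_alerts)
print(novel_alerts)
```

The second event matches no rule, so it raises no alert, however unusual it is.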
Another issue that is often overlooked is that SIEM vendors are generalists. They know a little about a lot of domains, but don’t have the same in-depth knowledge about a specific domain as a vendor who specializes in that domain. SIEM vendors don’t know as much about endpoint security as endpoint security vendors, and they don’t know as much about network security as network security vendors.
Finally, SIEMs have not addressed two other key issues that have plagued security operations teams for years: (1) a lack of consistent, repeatable processes among different SOC analysts, and (2) mind-numbing, repetitive manual tasks that beg to be automated. These issues, plus SIEMs’ high rate of false positives, sap SOC team morale, which results in high turnover. Considering cybersecurity’s “zero unemployment” environment, this is a costly problem indeed.
Problems with Traditional Domain-level Countermeasures
But the industry’s poor incident detection track record is not just due to SIEMs. Traditional domain-level detection products also bear responsibility. Let me explain.
Traditional domain-level detection products, whether endpoint or network, must make their “benign (allow) vs suspicious (alert but allow) vs malicious (alert and block)” decisions in microseconds or milliseconds. A file appears. Is it malicious or benign? The anti-virus software on the endpoint must decide in a fraction of a second. Then it’s on to the next file. In-line IDS countermeasures face a similar problem. Analyze a packet. Good, suspicious, or malicious? Move on to the next packet. In some out-of-band cases, the countermeasure has the luxury of assembling a complete file before making the decision. File detonation/sandboxing products can take longer, but are still limited to minutes at most. Then it’s on to the next file.
So it’s really been the combination of the limitations of traditional domain-level countermeasures and SIEMs that has resulted in the poor record of incident detection, high false positive rates, and low morale and high turnover among SOC analysts. But there is hope. I am seeing a new generation of security products built around a new, Composite architecture that addresses these issues.
Next-Generation Domain-level Countermeasures
First, there are new, “next generation” domain-level security vendors that have expanded their analysis timeframe from microseconds to months. A next-gen endpoint product not only analyzes events on the endpoint itself, but also collects and stores hundreds of event types for further analysis over time. This also gives the next-gen endpoint product the ability to leverage threat intelligence, i.e. to apply new threat intelligence retrospectively over the event repository and detect zero-day threats that were unknown when the events were recorded.
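Retrospective matching is easy to picture in code. The sketch below assumes a simplified event record and a feed of file hashes; the `EndpointEvent` fields, host names, and hash values are all hypothetical, and a real product would query its own event store rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class EndpointEvent:
    """Simplified, hypothetical shape of one stored endpoint event."""
    timestamp: datetime
    host: str
    event_type: str  # e.g. "file_write", "process_start"
    file_hash: str

def retro_hunt(events, new_bad_hashes):
    """Return stored events matching newly published indicators, oldest first."""
    bad = set(new_bad_hashes)
    matches = (e for e in events if e.file_hash in bad)
    return sorted(matches, key=lambda e: e.timestamp)

# Months of history collected before anyone knew "bbb222" was malicious.
stored_events = [
    EndpointEvent(datetime(2016, 8, 1, 9, 15), "laptop-17", "file_write", "aaa111"),
    EndpointEvent(datetime(2016, 9, 12, 14, 2), "server-03", "process_start", "bbb222"),
    EndpointEvent(datetime(2016, 10, 3, 11, 40), "laptop-17", "process_start", "ccc333"),
]

# Threat intelligence that arrives today is applied to the stored history.
hits = retro_hunt(stored_events, ["bbb222"])
for e in hits:
    print(e.host, e.event_type, e.timestamp.isoformat())
```

A traditional countermeasure would have made its one-time decision on September 12 and moved on. With a stored history, the same event can be flagged weeks later.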
A “next-gen” network security vendor collects, analyzes, and stores full packets. With full packets captured, the nature of threat intelligence actually expands beyond IP addresses, URLs, and file hashes to include new signatures. In addition, combinations of “weak” signals can be correlated over time to generate high fidelity alerts.
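As a rough illustration of correlating “weak” signals over time, the sketch below adds up per-host signal weights inside a time window and alerts only when the total crosses a threshold. The signal names, weights, window, and threshold are assumptions made for this example, not values from any product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical integer weights: no single signal is worth an alert on its own.
WEIGHTS = {"rare_domain": 3, "odd_hours_login": 2, "large_upload": 4}
THRESHOLD = 8
WINDOW = timedelta(days=7)

def correlate(signals, now):
    """signals: iterable of (timestamp, host, signal_name) tuples.

    Returns {host: score} for hosts whose in-window score meets THRESHOLD.
    """
    scores = defaultdict(int)
    for ts, host, name in signals:
        if timedelta(0) <= now - ts <= WINDOW:
            scores[host] += WEIGHTS.get(name, 0)
    return {host: score for host, score in scores.items() if score >= THRESHOLD}

now = datetime(2016, 11, 2, 12, 0)
signals = [
    (datetime(2016, 10, 28, 3, 10), "host-a", "odd_hours_login"),
    (datetime(2016, 10, 30, 16, 45), "host-a", "rare_domain"),
    (datetime(2016, 11, 1, 22, 5), "host-a", "large_upload"),
    (datetime(2016, 11, 1, 9, 0), "host-b", "rare_domain"),
    (datetime(2016, 9, 1, 9, 0), "host-b", "large_upload"),  # outside the window
]

alerts = correlate(signals, now)
print(alerts)
```

Only host-a alerts: its three signals fall within the same week and together cross the threshold, while host-b has a single in-window signal.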
These domain-specific security vendors are also using machine learning and other statistical algorithms to detect malicious scenarios across combinations of multiple events that traditional rule-based analysis would miss.
Finally, these next-gen domain-level countermeasures provide APIs that (1) enable a SOAR product to pull more detailed event information to add context for the SOC analyst, and (2) enable threat hunting with third-party threat intelligence.
Architectural Issue Created by Next-gen Domain-level Countermeasures
But replacing traditional domain-specific security countermeasures with these next-gen ones actually creates an architectural problem. Instead of having a single Monolithic event repository, i.e. the SIEM or log repository, you have multiple event repositories because it no longer makes sense to add the next-gen domain-specific event data to what has been your single event repository. Why? First, the analysis of the raw domain data has already been done by the domain product. Second, if you want to access the data, you can do so via APIs. Third, as already stated, the type of analysis a SIEM does has not been effective at detecting incidents anyway. Fourth, you are already paying the domain-level vendor to store the data. Why pay the SIEM or log repository vendor to store that data again?
Having said all this, your primary log repository is not going away anytime soon because you still need it for traditional log sources such as firewalls, Active Directory, and Data Loss Prevention. But, over time, there will be fewer traditional countermeasures as these vendors expand their analysis timeframes. Some are already doing this.
By embracing these next-gen domain-specific countermeasures, we are creating multiple silos of security information that don’t talk to each other. So how do we correlate these different domains if the events they generate are not in a single repository?
Security Operations, Analytics, and Reporting (SOAR)
This issue is addressed by a new type of correlation analysis product, the second architectural component, which Gartner calls Security Operations, Analytics, and Reporting (SOAR). I believe Gartner first published research on this in the fall of 2015. Here are my SOAR solution requirements:
- Receive and correlate alerts from the next-gen domain-level security products.
- Query the next-gen domain-level products for more detailed information related to each alert to provide context.
- Access CMDBs and vulnerability repositories for additional context.
- Receive and correlate alerts from SIEMs. Rule-based alerts from SIEMs need to be correlated by entity, e.g. by user.
- Query Splunk and other types of log repositories.
- Correlate alerts and events from all these sources.
- Take threat intelligence feeds and generate queries to the various data repositories. While the next-gen domain-specific countermeasures have their own sources of threat intelligence, I fully expect organizations to still subscribe to product-independent threat intelligence.
- Use a robust database of its own, preferably a graph database, to store all this collected information, and provide fast query responses to SOC analysts pivoting on known data during investigations.
- Provide playbooks and the tools to build and customize playbooks to assure consistent incident response processes.
- Provide orchestration/automation functions to reduce repetitive manual tasks.
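Several of the requirements above (receiving alerts from multiple sources, correlating them by entity, and pulling context from a CMDB and a vulnerability repository) can be sketched in a few lines. Everything here is hypothetical: the alert fields, the `CMDB` and `VULNS` dictionaries standing in for API calls, and the placeholder vulnerability identifier.

```python
from collections import defaultdict

# Hypothetical stand-ins for API lookups against a CMDB and a vulnerability repository.
CMDB = {"laptop-17": {"owner": "jsmith", "criticality": "high"}}
VULNS = {"laptop-17": ["VULN-EXAMPLE-1"]}

def correlate_by_entity(alerts):
    """Group alerts from all sources by the entity (host or user) they concern."""
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[alert["entity"]].append(alert)
    return grouped

def enrich(entity, entity_alerts):
    """Build one case per entity, with context attached for the analyst."""
    return {
        "entity": entity,
        "sources": sorted({a["source"] for a in entity_alerts}),
        "alerts": entity_alerts,
        "asset": CMDB.get(entity, {}),
        "open_vulns": VULNS.get(entity, []),
    }

alerts = [
    {"source": "endpoint", "entity": "laptop-17", "summary": "suspicious process tree"},
    {"source": "network", "entity": "laptop-17", "summary": "beaconing to rare domain"},
    {"source": "siem", "entity": "server-03", "summary": "repeated failed logins"},
]

cases = [enrich(entity, grouped) for entity, grouped in correlate_by_entity(alerts).items()]

# Entities flagged by more than one domain are good candidates to triage first.
multi_source = [c for c in cases if len(c["sources"]) > 1]
for case in multi_source:
    print(case["entity"], case["sources"], case["asset"], case["open_vulns"])
```

In a real deployment the dictionary lookups would be API queries to the respective products, and the resulting cases would be stored in the SOAR solution’s own database so analysts can pivot on them.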
User and Entity Behavior Analytics (UEBA)
At this point, you may be wondering where User and Entity Behavior Analytics (UEBA) fits in. If you are concerned about a true insider threat, i.e. malicious user activity with no malware involved, then UEBA is a must. UEBA solutions definitely fit into the Composite architecture, querying multiple event repositories and sending alerts to the SOAR solution. They should also have APIs so they can be queried by the SOAR solution.
Future Evolution
Looking to the future, I expect the Composite architecture to evolve. Here are some possibilities:
- A UEBA solution could add SOAR functionality
- A SIEM solution could add UEBA and/or SOAR functionality
- A SOAR solution could add log repository and/or SIEM functionality
- A next-gen domain solution could add SOAR functionality
Summary
To summarize, the traditional Monolithic architecture, in which domain countermeasures limited to microsecond/millisecond analysis feed a SIEM for incident detection, has failed. It’s being replaced by the Composite model featuring (1) next-gen domain-level countermeasures that play an expanded analysis role, (2) for the near term, traditional SIEMs and/or primary log repositories that continue to serve traditional security countermeasures, and (3) a SOAR solution at the “top of the stack” that is the SOC analysts’ primary incident detection and response tool.
Originally posted on LinkedIn on November 2, 2016
https://www.linkedin.com/pulse/cybersecurity-architecture-transition-bill-frank?trk=mp-author-card