02. November 2016 · Cybersecurity Architecture in Transition to Address Incident Detection and Response · Categories: blog

I would like to discuss the most significant cybersecurity architectural change in the last 15 years. It’s being driven by new products built to respond to the industry’s abysmal incident detection record.


There are two types of products driving this architectural change. The first is new “domain-level” detection countermeasures, primarily in the endpoint and network domains. They dramatically improve threat detection by (1) expanding the types of detection methods used and (2) leveraging threat intelligence to detect zero-day threats.

The second is a new type of incident detection and response productivity tool, which Gartner would categorize as SOAR (Security Operations, Analysis and Reporting). It provides SOC analysts with:

  1. Playbooks and orchestration to improve analyst productivity, efficiency, and consistency.
  2. Automated correlation of alerts from SIEMs and the above-mentioned “next-generation” domain-based countermeasures.
  3. The ability to query the domain-level countermeasures and other sources of information, such as CMDBs and vulnerability management solutions, via their APIs to give SOC analysts improved context.
  4. A graphical user interface, supported by a robust database, that enables rapid SOC analyst investigations.
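To make the playbook and orchestration idea concrete, here is a minimal sketch of how a SOAR-style playbook might run. Every name, field, and step here is invented for illustration; it is not any vendor's actual API.

```python
# Minimal sketch of a SOAR-style playbook: an ordered list of steps that
# every analyst runs the same way, so triage is consistent and repeatable.
# All names and data here are illustrative, not a real product's API.

KNOWN_BAD_HASHES = {"44d88612fea8a8f36de82e1278abb02f"}  # stand-in TI data

def enrich_with_asset_context(alert):
    # In a real deployment this step would query a CMDB via its API.
    alert["asset_owner"] = "unknown"
    return alert

def check_threat_intel(alert):
    # In a real deployment this step would query a threat-intel platform.
    alert["ti_match"] = alert.get("file_hash") in KNOWN_BAD_HASHES
    return alert

PHISHING_PLAYBOOK = [enrich_with_asset_context, check_threat_intel]

def run_playbook(playbook, alert):
    """Apply each step in order; every analyst gets the same workflow."""
    for step in playbook:
        alert = step(alert)
    return alert

alert = {"type": "phishing", "file_hash": "44d88612fea8a8f36de82e1278abb02f"}
result = run_playbook(PHISHING_PLAYBOOK, alert)
```

The point of the design is that the workflow, not the individual analyst, decides what gets checked and in what order.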

I am calling the traditional approach we have used for the last 15 years “Monolithic,” because a SIEM or a single log repository has been at the center of the detection process. I am calling the new approach “Composite” because there are multiple event repositories – the SIEM/log repository and those of the domain-level countermeasures.

First I will review why this change is needed, and then I’ll go into more detail about how the Composite architecture addresses the incident detection problems we have been experiencing.

Problems with SIEM

Back around 2000, the first Security Information and Event Management (SIEM) solutions appeared. The rationale for the SIEM was the need to consolidate and analyze the events, as represented by logs, being generated by disparate domain technologies such as anti-virus, firewalls, IDSs, and servers. While SIEMs did OK with compliance reporting and forensics investigation, they were poor at incident detection. The question is, why?

First, for the most part, SIEMs are limited to log analysis. While most of the criticism of SIEMs relates to the limitations of rule-based alerting (more on that below), limiting analysis to logs is a problem in itself for two reasons. One, capturing the details of actual activities from logs is difficult. Two, a log often represents just a summary of an event. So SIEMs were handicapped by their data sources.

Additionally, the SIEM’s rule-based analysis approach only alerts on predefined, known-bad scenarios. New, creative, unknown scenarios are missed. Tuning rules for a particular organization is also very time-consuming, especially when the number of rules that need to be tuned can run into the hundreds.
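A toy example illustrates the limitation: a rule engine can only fire on the patterns it was given, so an attack nobody anticipated produces no alert at all. The rule and event data below are invented for illustration.

```python
# Toy illustration of rule-based SIEM alerting: rules fire only on
# predefined, known-bad patterns, so a novel scenario goes undetected.

def brute_force_rule(events):
    """Known-bad pattern: five or more failed logins in the window."""
    failures = [e for e in events if e["action"] == "login_failed"]
    return len(failures) >= 5

RULES = {"brute_force": brute_force_rule}  # hundreds of these in practice

def evaluate(events):
    """Return the names of every rule that matches the event window."""
    return [name for name, rule in RULES.items() if rule(events)]

# A noisy, known scenario trips the rule...
noisy_attack = [{"action": "login_failed"}] * 6

# ...but a quieter, creative scenario matches nothing we predefined.
stealthy_attack = [{"action": "login_failed"},
                   {"action": "login_ok"},
                   {"action": "priv_escalation"}]

noisy_alerts = evaluate(noisy_attack)
stealthy_alerts = evaluate(stealthy_attack)
```

The stealthy sequence may be far more dangerous, but because no predefined rule describes it, the engine stays silent.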

Another issue that is often overlooked is that SIEM vendors are generalists. They know a little about a lot of domains, but don’t have the same in-depth knowledge about a specific domain as a vendor who specializes in that domain. SIEM vendors don’t know as much about endpoint security as endpoint security vendors, and they don’t know as much about network security as network security vendors.

Finally, SIEMs have not addressed two other key issues that have plagued security operations teams for years – (1) a lack of consistent, repeatable processes among different SOC analysts, and (2) mind-numbing, repetitive manual tasks that beg to be automated. These issues, plus SIEMs’ high rate of false positives, sap SOC team morale, which results in high turnover. Considering cybersecurity’s “zero unemployment” environment, this is a costly problem indeed.

Problems with Traditional Domain-level Countermeasures

But the industry’s poor incident detection track record is not just due to SIEMs. Traditional domain-level detection products also bear responsibility. Let me explain.

Traditional domain-level detection products, whether endpoint or network, must make their “benign (allow) vs. suspicious (alert but allow) vs. malicious (alert and block)” decisions in microseconds or milliseconds. A file appears. Is it malicious or benign? The anti-virus software on the endpoint must decide in a fraction of a second. Then it’s on to the next file. In-line IDS countermeasures face a similar problem. Analyze a packet. Good, suspicious, or malicious? Move on to the next packet. In some out-of-band cases, the countermeasure has the luxury of assembling a complete file before making the decision. File detonation/sandboxing products can take longer, but are still limited to minutes at most. Then it’s on to the next file.

So it has really been the combination of the limitations of traditional domain-level countermeasures and SIEMs that has resulted in the poor record of incident detection, high false positive rates, and low morale and high turnover among SOC analysts. But there is hope. I am seeing a new generation of security products built around a new, Composite architecture that addresses these issues.

Next-Generation Domain-level Countermeasures

First, there are new, “next-generation” domain-level security companies that have expanded their analysis timeframe from microseconds to months. A next-gen endpoint product not only analyzes events on the endpoint itself, but also collects and stores hundreds of event types for further analysis over time. This also gives the next-gen endpoint product the ability to leverage threat intelligence, i.e., apply new threat intelligence retrospectively over the event repository to detect previously unknown zero-day threats.
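The retrospective idea can be sketched in a few lines: when a new indicator is published, re-scan months of stored endpoint events for hits that predate the indicator. The event store and indicator below are invented for illustration.

```python
# Sketch of retrospective threat-intel matching over a stored event
# repository. The events and hash values are invented sample data.
event_store = [
    {"host": "wks-01", "ts": "2016-08-14", "file_hash": "aaa111"},
    {"host": "wks-02", "ts": "2016-09-02", "file_hash": "bbb222"},
    {"host": "wks-03", "ts": "2016-10-21", "file_hash": "ccc333"},
]

def retro_hunt(event_store, new_iocs):
    """Re-scan historical events against indicators published today."""
    return [e for e in event_store if e["file_hash"] in new_iocs]

# An indicator published today can reveal a compromise from weeks ago.
hits = retro_hunt(event_store, new_iocs={"bbb222"})
```

A traditional countermeasure that discarded its events after the microsecond verdict could never surface that September hit.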

A “next-gen” network security vendor collects, analyzes, and stores full packets. With full packets captured, the nature of threat intelligence actually expands beyond IP addresses, URLs, and file hashes to include new signatures. In addition, combinations of “weak” signals can be correlated over time to generate high fidelity alerts.
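The weak-signal idea can be sketched as a simple per-host scoring model. The signal names, weights, and threshold below are illustrative assumptions, not any vendor's actual scoring scheme.

```python
from collections import defaultdict

# Sketch of weak-signal correlation: individually low-confidence network
# observations are scored per host, and only the combination over time
# crosses the alerting threshold. All weights here are invented.
SIGNAL_WEIGHTS = {
    "rare_user_agent": 0.2,
    "dns_to_young_domain": 0.3,
    "beacon_like_timing": 0.4,
}
ALERT_THRESHOLD = 0.7  # no single signal can cross this on its own

def correlate(signals):
    """signals: iterable of (host, signal_name) observations."""
    scores = defaultdict(float)
    for host, signal in signals:
        scores[host] += SIGNAL_WEIGHTS[signal]
    return {host for host, score in scores.items() if score >= ALERT_THRESHOLD}

signals = [
    ("10.0.0.5", "rare_user_agent"),
    ("10.0.0.5", "dns_to_young_domain"),
    ("10.0.0.5", "beacon_like_timing"),  # combined score: 0.9
    ("10.0.0.9", "rare_user_agent"),     # 0.2 alone stays quiet
]
alerts = correlate(signals)
```

Each observation alone would be noise; the accumulation per host is what produces the high-fidelity alert.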

These domain specific security vendors are also using machine learning and other statistical algorithms to detect malicious scenarios across combinations of multiple events that traditional rule-based analysis would miss.

Finally, these next-gen domain-level countermeasures provide APIs that (1) enable a SOAR product to pull more detailed event information to add context for the SOC analyst, and (2) enable threat hunting with third-party threat intelligence.
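As a sketch of what such an API call might look like from the SOAR side: the endpoint path, parameters, and field names below are hypothetical, since each vendor exposes its own API surface.

```python
# Sketch of a SOAR tool building a context query against a next-gen
# countermeasure's REST API. The URL path, parameter names, and fields
# are hypothetical stand-ins, not a real vendor's API.

def build_context_query(base_url, alert):
    """Construct the request a SOAR product might send for alert context."""
    return {
        "url": f"{base_url}/api/v1/events",  # hypothetical endpoint path
        "params": {
            "host": alert["host"],
            "since": alert["first_seen"],
            "fields": "process_tree,network_connections",
        },
    }

alert = {"host": "wks-01", "first_seen": "2016-10-30T08:00:00Z"}
req = build_context_query("https://edr.example.com", alert)
# The same API surface also supports hunting: swap the alert-driven
# parameters for an indicator supplied by a third-party intel feed.
```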

Architectural issue created by next-gen domain-level countermeasures

But replacing traditional domain-specific security countermeasures with these next-gen ones actually creates an architectural problem. Instead of having a single Monolithic event repository, i.e., the SIEM or log repository, you have multiple event repositories, because it no longer makes sense to add the next-gen domain-specific event data into what has been your single event repository. Why? First, the analysis of the raw domain data has already been done by the domain product. Second, if you want to access the data, you can via APIs. Third, as already stated, the type of analysis a SIEM does has not been effective at detecting incidents anyway. Fourth, you are already paying the domain-level vendor to store the data. Why pay the SIEM or log repository vendor to store that data again?

Having said all this, your primary log repository is not going away anytime soon, because you still need it for traditional log sources such as firewalls, Active Directory, and Data Loss Prevention. But, over time, there will be fewer traditional countermeasures as these vendors expand their analysis timeframes. Some are already doing this.

So by embracing these next-gen domain-specific countermeasures, we are creating multiple silos of security information that don’t talk to each other. How, then, do we correlate these different domains if the events they generate are not in a single repository?

Security Operations, Analysis, and Reporting (SOAR)

This issue is addressed by a new type of correlation analysis product, the second architectural component, which Gartner calls Security Operations, Analysis, and Reporting (SOAR). I believe Gartner first published research on this in the fall of 2015. Here are my SOAR solution requirements:

  1. Receive and correlate alerts from the next-gen domain-level security products.
  2. Query the next-gen domain-level products for more detailed information related to each alert to provide context.
  3. Access CMDBs and vulnerability repositories for additional context.
  4. Receive and correlate alerts from SIEMs. Rule-based alerts from SIEMs need to be correlated by entity, e.g., by user.
  5. Query Splunk and other types of log repositories.
  6. Correlate alerts and events from all these sources.
  7. Take threat intelligence feeds and generate queries to the various data repositories. While the next-gen domain-specific countermeasures have their own sources of threat intelligence, I fully expect organizations to continue subscribing to product-independent threat intelligence.
  8. Use a robust database of its own, preferably a graph database, to store all this collected information, and provide fast query responses to SOC analysts pivoting on known data during investigations.
  9. Provide playbooks and the tools to build and customize playbooks to assure consistent incident response processes.
  10. Provide orchestration/automation functions to reduce repetitive manual tasks.
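Requirement 6 in particular can be sketched simply: group alerts from the different repositories by the entity they concern, so that one incident shows the SIEM, endpoint, and network views together. The alert data below is invented for illustration.

```python
from collections import defaultdict

# Sketch of cross-repository correlation by entity: alerts from several
# sources about the same user become one investigable incident.
# All alert data here is invented sample data.
alerts = [
    {"source": "siem",     "entity": "jdoe",   "detail": "impossible travel"},
    {"source": "endpoint", "entity": "jdoe",   "detail": "unsigned binary ran"},
    {"source": "network",  "entity": "jdoe",   "detail": "beaconing to rare domain"},
    {"source": "siem",     "entity": "asmith", "detail": "failed logins"},
]

def correlate_by_entity(alerts):
    """Group alerts from all repositories by the entity they concern."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert["entity"]].append(alert)
    return incidents

incidents = correlate_by_entity(alerts)
# jdoe's three cross-domain alerts now sit in one incident record,
# instead of three unrelated entries in three separate consoles.
```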

User Entity Behavior Analysis (UEBA)

At this point, you may be thinking, where does User Entity Behavior Analysis (UEBA) fit in? If you are concerned about a true insider threat, i.e., malicious user activity with no malware involved, then UEBA is a must. UEBA solutions definitely fit into the Composite architecture, querying multiple event repositories and sending alerts to the SOAR solution. They should also have APIs so they can be queried by the SOAR solution.

Future Evolution

Looking to the future, I expect the Composite architecture to evolve. Here are some possibilities:

  • A UEBA solution could add SOAR functionality
  • A SIEM solution could add UEBA and/or SOAR functionality
  • A SOAR solution could add log repository and/or SIEM functionality
  • A next-gen domain solution could add SOAR functionality


To summarize, the traditional Monolithic architecture, in which domain countermeasures limited to microsecond/millisecond analysis feed a SIEM for incident detection, has failed. It’s being replaced by the Composite model featuring (1) next-gen domain-level countermeasures that play an expanded analysis role, (2) traditional SIEMs and/or primary log repositories that, for the near term, continue to serve the traditional security countermeasures, and (3) a SOAR solution at the “top of the stack” that is the SOC analysts’ primary incident detection and response tool.

Originally posted on LinkedIn on November 2, 2016


30. August 2011 · Compliance Is Not Security – Busted! « PCI Guru · Categories: blog

Compliance Is Not Security – Busted! « PCI Guru.

The PCI Guru defends the PCI standard as a good framework for security in general, arguing against the refrain that compliance is not security.

My view is that the PCI Guru is missing the point. PCI DSS is a decent enough security framework. Personally I feel the SANS 20 Critical Security Controls is more comprehensive and has a maturity model to help organizations build a prioritized plan.

The issue is the approach management teams of organizations take to mitigate the risks of information technology. COSO has called this “Tone at the Top.”

A quote that rings true to me is, “In theory, there is no difference between theory and practice. But in practice there is.”

Applying it here, I would say that in theory there should be no difference between compliance and security. But in practice there often is, when the management teams of organizations do not take an earnest approach to mitigating the risks of information technology. Rather, they take a “check-box” mentality, i.e., going for the absolute minimum on which the QSA will sign off. It is for this reason that many in our industry say that compliance does not equal security.


24. July 2011 · Lenny Zeltser on Information Security — The Use of the Modern Social Web by Malicious Software · Categories: blog

Lenny Zeltser on Information Security — The Use of the Modern Social Web by Malicious Software.

Lenny Zeltser posted his excellent presentation on The Use of the Modern Social Web by Malicious Software.

However, an increasing number of organizations are seeing real benefits to the top line by engaging with the social web. Therefore, simply blocking its usage is no longer an option. The InfoSec team must respond to the business side by mitigating the security risks of using the modern social web.


29. November 2010 · What is Information Security: New School Primer « The New School of Information Security · Categories: blog

What is Information Security: New School Primer « The New School of Information Security.

I would like to comment on each of the three components of Alex’s “primer” on Information Security.

First, InfoSec is a hypothetical construct. It is something that we can all talk about, but it’s not directly observable, and therefore not measurable in the way that, say, speed is, which we can describe in km/hr. “Directly” is to be stressed there, because there are many hypothetical constructs of subjective value that we do create measurements and measurement scales for, in order to create a state of (high) intersubjectivity between observers (I don’t like the Wikipedia definition; I use it to mean that you and I can kind of understand the same thing in the same way).

Clearly InfoSec cannot be measured like speed or acceleration or weight. Therefore I would agree with Alex’s classification.

Second, security is not an engineering discipline, per se. Our industry treats it as such because most of us come from that background, and because the easiest thing to do to try to become “more secure” is buy a new engineering solution (security product marketing). But the bankruptcy of this way of thinking is present in both our budgets and our standards. A security management approach focused solely on engineering fails primarily because of the “intelligent” or adaptable attacker.

Again, clearly InfoSec involves people and therefore is more than a purely engineering exercise like building a bridge. On the other hand, if, for example, you look at the statistics in the Verizon Business 2010 Data Breach Investigations Report, page 3, 85% of the analyzed attacks were not considered highly difficult. In other words, if “sound” security engineering practices were applied, the number of breaches would decline dramatically.

This is why we at Cymbel have embraced the SANS 20 Critical Security Controls for Effective Cyber Defense.

Finally, InfoSec is a subset of Information Risk Management (IRM). IRM takes what we know about “secure” and adds concepts like probable impacts and resource allocation strategies. This can be confusing to many because of the many definitions of the word “risk” in the English language, but that’s a post for a different day.

This is the part of Alex’s primer with which I have the most concern – “probable impacts.” The problem is that estimating probabilities with respect to exploits is almost totally subjective, and there is still far too little available data to estimate probabilities. On the other hand, there is enough information about successful exploits and threats in the wild to give infosec teams a plan to move forward, like the SANS 20 Critical Controls.

My biggest concern is Alex referencing FAIR, Factor Analysis of Information Risk, in a positive light. From my perspective, any tool that, when used by two independent groups analyzing the same environment, can generate wildly different results is simply not valid. Richard Bejtlich, in 2007, provided a thoughtful analysis of FAIR here and here.

Bejtlich shows that FAIR is just a more elaborate version of ALE, Annual Loss Expectancy. For a more detailed analysis of the shortcomings of ALE, see Security Metrics, by Andrew Jaquith, page 31. In summary, the problems with ALE are:

  • The inherent difficulty of modeling outliers
  • The lack of data for estimating probabilities of occurrence or loss expectancies
  • Sensitivity of the ALE model to small changes in assumptions
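A quick worked example shows the third problem, the model's sensitivity to assumptions. ALE is simply single loss expectancy (SLE) times annualized rate of occurrence (ARO); the dollar figure and rates below are invented to illustrate the swing.

```python
# Worked illustration of ALE's sensitivity to its subjective inputs.
# ALE = SLE (single loss expectancy) * ARO (annualized rate of occurrence).
# All numbers here are invented for illustration.

def ale(sle, aro):
    """Annual Loss Expectancy: expected loss per incident times incidents/year."""
    return sle * aro

sle = 500_000  # assumed loss per breach, in dollars

low_estimate  = ale(sle, aro=0.05)  # analyst A: "once in 20 years"
high_estimate = ale(sle, aro=0.20)  # analyst B: "once in 5 years"
# Two defensible-sounding guesses at one input move the answer 4x,
# which is exactly the two-rooms problem described above.
```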

I am surely not saying that there are no valid methods of measuring risk. It’s just that I have not seen any that work effectively. I am intrigued by Douglas Hubbard’s theories expressed in his two books, How to Measure Anything and The Failure of Risk Management. Anyone using them? I would love to hear your results.

I look forward to Alex’s post on Risk.