23. October 2009 · Comments Off on Relational databases dead for log management? · Categories: Compliance, Log Management, Security Management · Tags: , , , , ,

Larry Walsh wrote an interesting post this week, Splunk Disrupts Security Log Auditing, in which he claims that Splunk's success is due to capturing market share in the security log auditing market because of it's Google-like approach to storing log data rather than using a "relational database."

There was also a very good blog post at Securosis in response – Splunk and Unstructured Data.

While there is no doubt that Splunk has been successful as a company, I am not so sure it's due to security log auditing.

It's my understanding that the primary use case for Splunk is actually in Operations where, for example, a network administrator wants to search logs to resolve an Alert generated by an SNMP-based network management system. Most SNMP-based network management systems are good at telling you "what" is going on, but not very good at telling you "why."

So when the network management system generates an Alert, the admin goes to Splunk to find the logs that would show what actually happened so s/he can fix the root cause of the Alert. For this use case, you don't really need more than a day's worth of logs.

Splunk's brilliant move was to allow "free" usage of the software for one day's worth of logs or some limited amount of storage that generally would not exceed one day. In reality, a few hours of logs is very valuable. This freemium model has been very successful.

Security log auditing is a very different use case. It can require a year or more of data and sophisticated reporting capabilities. That is not to say that a Google-like storage approach cannot accomplish this.

In fact, security log auditing is just another online analytical processing (OLAP) application, albeit with potentially huge amounts of data. It's been at least ten years that the IT industry realized that OLAP applications require a different way to organize stored data compared to online transaction processing (OLTP) applications. OLTP applications still use traditional relational databases.

There has been much experimentation about ways to store data for OLAP applications. However, there is still a lot of value in the SQL language as a kind of open industry standard API to stored data.

So I would agree that traditional relational database products are not appropriate for log management data storage, but SQL as a language has merit as the "API layer" between the query and reporting tools and the data.

Wired Magazine reported this week that Wal-Mart kept secret a breach it discovered in November 2006 that had been ongoing for 17 months. According to the article, Walmart claimed there was no reason to disclose the exploit at the time as they believe no customer data or credit card information was breached.

They are admitting that custom developed Point-of-Sale software was breached. The California Breach Law covering breached financial information of California residents had gone into effect on July 1, 2003 and was extended to health information on January 1, 2009. I blogged about that here.

I think it would be more accurate to say that the forensics analysts hired by Wal-Mart could not "prove" that customer data was breached, i.e., could not find specific evidence that customer data was breached. One key piece of information the article revealed, "The company’s server logs recorded only unsuccessful log-in attempts, not successful ones, frustrating a detailed analysis."

Based on my background in log management, I understand the approach of only collecting "bad" events like failed log-ins. Other than this sentence the article does not discuss what types of events were and were not collected. Therefore they have very little idea of what was really going on.

The problem Wal-Mart was facing at the time was that the cost of collecting and storing all the logs in an accessible manner was prohibitive. Fortunately, log data management software has improved and hardware costs have dropped dramatically. In addition there are new tools for user activity monitoring.

However, my key reaction to this article is my disappointment that Wal-Mart chose to keep this incident a secret. It's possible that news of a Wal-Mart breach might have motivated other retailers to strengthen their security defenses and increase their vigilance, which might have reduced the number of breaches that occurred since 2006. It may also have more quickly increased the rigor QSAs applied to PCI DSS audits.

In closing, I would like to call attention to Adam Shostack's and Andrew Stewart's book, "The New School of Information Security," and quote a passage from page 78 which talks about the value of disclosing breaches aside from the need to inform people whose personal financial or health information may have been breached:

"Breach data is bringing us more and better objective data than any past information-sharing initiative in the field of information security. Breach data allows us to see more about the state of computer security than we've been able to with traditional sources of information. … Crucially, breach data allows us to understand what sorts of issues lead to real problems, and this can help us all make better security decisions."

I thought a post about Database Activity Monitoring was timely because one of the DAM vendors, Sentrigo, published a Microsoft SQLServer vulnerability today along with a utility that mitigates the risk. Also of note, Microsoft denies that this is a real vulnerability.

I generally don't like to write about a single new vulnerability because there are just so many of them. However, Adrian Lane, CTO and Analyst at Securosis, wrote a detailed post about this new vulnerability, Sentrigo's workaround, and Sentrigo's DAM product, Hedgehog. Therefore I wanted to put this in context.

Also of note, Sentrigo sponsored a SANS Report called "Understanding and Selecting a Database Activity Monitoring Solution." I found this report to be fair and balanced as I have found all of SANS activities.

Database Activity Monitoring is becoming a key component in a defense-in-depth approach to protecting "competitive advantage" information like intellectual  property, customer and financial information and meeting compliance requirements.

One of the biggest issues organizations face when selecting a Database Activity Monitoring solution is the method of activity collection, of which there are three – logging, network based monitoring, and agent based monitoring. Each has pros and cons:

  • Logging – This requires turning on the database product's native logging capability. The main advantage of this approach is that it is a standard feature included with every database. Also some database vendors like Oracle have a complete, but separately priced Database Activity Monitoring solution, which they claim will support other databases. Here are the issues with logging:
    • You need a log management or Security Information and Event Management (SIEM) system to normalize each vendor's log format into a standard format so you can correlate events across different databases and store the large volume of events that are generated. If you already committed to a SIEM product this might not be an issue assuming the SIEM vendor does a good job with database logs.
    • There can be significant performance overhead on the database associated with logging, possibly as high as 50%.
    • Database administrators can tamper with the logs. Also if an external hacker gains control of the database server, he/she is likely to turn logging off or delete the logs. 
    • Logging is not a good alternative if you want to block out of policy actions. Logging is after the fact and cannot be expected to block malicious activity. While SIEM vendors may have the ability to take actions, by the time the events are processed by the SIEM, seconds or minutes have passed which means the exploit could already be completed.
  • Network based – An appliance is connected to a tap or a span port on the switch that sits in front of the database servers. Traffic to and, in most cases, from the databases is captured and analyzed. Clearly this puts no performance burden on the database servers at all. It also provides a degree of isolation from the database administrators.Here are the issues:
    • Local database calls and stored procedures are not seen. Therefore you have an incomplete picture of database activity.
    • Your must have the network infrastructure to support these appliances.
    • It can get expensive depending on how many databases you have and how geographically dispersed they are.
  • Host based – An agent is installed directly on each database server.The overhead is much lower than with native database logging, as low as 1% to 5%, although you should test this for yourself.  Also, the agent sees everything including stored procedures. Database administrators will have a hard time interfering with the process without being noticed. Deployment is simple, i.e. neither the networking group nor the datacenter team need be involved. Finally, the installation process should  not require a database restart. As for disadvantages, this is where Adrian Lane's analysis comes in. Here are his concerns:
    • Building and maintaining the agent software is difficult and more time consuming for the vendor than the network approach. However, this is the vendor's issue not the user's.
    • The analysis is performed by the agent right on the database. This could mean additional overhead, but has the advantage of being able to block a query that is not "in policy."
    • Under heavy load, transactions could be missed. But even if this is true, it's still better than the network based approach which surely misses local actions and stored procedures.
    • IT administrators could use the agent to snoop on database transactions to which they would not normally have access.

Dan Sarel, Sentrigo's Vice President of Product, responded in the comments section of Adrian Lane's post. (Unfortunately there is no dedicated link to the response. You just have to scroll down to his response.) He addressed the "losing events under heavy load" issue by saying Sentrigo has customers processing heavy loads and not losing transactions. He addressed the IT administrator snooping issue by saying that the Sentrigo sensors doe not require database credentials. Therefore database passwords are not available to IT administrators.

Detailed
empirical data on IT Security breaches is hard to come by despite laws like
California SB1386.
So
there is much to be learned from Verizon Business’s April 2009 Data Breach
Investigations Report
.

The specific issue I would like to highlight now is the
section on methods by which the investigated breaches were discovered (Discovery
Methods, page 37). 83% were discovered by third parties or non-security employees
going about their normal business. Only 6% were found by event monitoring or
log analysis. Routine internal or external audit combined came in at a rousing
2%.

These numbers are truly shocking considering the amount
of money that has been spent on Intrusion Detection systems, Log Management
systems, and Security Information and Event Management systems. Actually, the
Verizon team concludes that many breached organizations did not invest sufficiently
in detection controls. Based on my experience, I agree.

Given a limited security budget there needs to be a balance
between prevention, detection, and response. I don’t think anyone would argue against
this in theory. But obviously, in practice, it’s not happening. Too
often I have seen too much focus on prevention to the detriment of detection
and response.

In addition, these
numbers point to the difficulties in deploying viable detection controls, as there
were a significant number of organizations that had purchased detection
controls but had not put them into production. Again, I have seen this myself
as most of the tools are too difficult to manage and it’s difficult to implement
effective processes.