Monday, September 17, 2007

Software Security Metrics and Commentary on "Metrics Framework" Paper

I was reading the paper "A Metrics Framework to Drive Application Security Improvement" recently and some thoughts started to gel about what types of web application security metrics are meaningful.

This is going to be part-1 of 2 about the paper and software security metrics. In this first installment I comment on the metrics from the paper and provide what I believe are reasonable replacement metrics for 5 of the 10 in the paper. In Part-2 I'll take on the next 5 as well as discuss some other thoughts on what metrics matter for measuring web application security.

The paper is actually a good introduction on how to think about measuring software security, but I think a few of the metrics miss the mark slightly.

In the paper they analyze software metrics in three phases of an application's lifecycle:
  1. Design
  2. Deployment
  3. Runtime
The paper uses the OWASP Top-10 as the basis for measurement and comes up with metrics that will tell us how we're doing against it.

The goal of metrics should be, where possible, to create objective measures of something. Whereas some of the metrics described in the paper are quite objective, others are more than a little fuzzy, and I don't think they represent reasonable ways to measure security.

First, the Top-10 and associated metrics from the paper (and you'll have to bear with me as I try to create tables in blogger):

OWASP Item | Metric | App Phase | Method
Unvalidated Input | PercentValidatedInput | Design | Manual review
Broken Access Control | AnomalousSessionCount | Runtime? | Audit Trail review?
Broken Authentication / Session Management | BrokenAccountCount | Runtime | Account Review
Cross-Site-Scripting | XsiteVulnCount | Deployment? | Pen Test Tool
Buffer Overflow | OverflowVulnCount | Deployment | Vuln Testing Tools?
Injection Flaws | InjectionFlawCount | Runtime | Pen Testing
Improper Error Handling | NoErrorCheckCount (?) | Design | Static Analysis
Insecure Storage | PercentServersNoDiskEncryption (?) | Runtime | Manual review
Application Denial of Service | ?? | Runtime | Pen Testing?
Insecure Configuration Management | Service Accounts with Weak Passwords | Runtime | Manual review

Unfortunately, I think this set of metrics misses the mark a little bit. I question whether pen testing for buffer overflows or XSS is really the right way to develop a sustainable metric. It is a necessary assurance component, to be sure, but not necessarily the first metric I'd focus on if I'm asking the question "How secure is my app?" I'm loath to rely on testing for the bulk of my metrics.

I think a few of the metrics above are unmeasurable or inappropriate. It's hard for me to imagine how we'd measure AnomalousSessionCount appropriately. It seems that if we had proper instrumentation for detecting these sessions as described in the paper, we probably wouldn't have any in the first place. I'm not so sure about BrokenAccountCount being representative of issues in authentication and session management either.

As I'm working on building my web application security metrics I'm trying to focus on things in the design phase. For the majority of flaws I'd like to develop a design-phase metric that captures how I'm doing against the vulnerability. This gives me the best chance to influence development rather than filing bugs after the fact. It is possible that for some flaws a meaningful design-phase metric simply doesn't exist. You can't measure configuration management in your design phase, for example.

Rather than just being destructive, here is my modified group of metrics.
  • Unvalidated Input
    • I actually like the metric from the paper. Measuring input validation schemes against the percent of input they cover is a pretty good metric for this. Don't forget that web applications can have inputs other than HTML forms. Make sure that all user input (cookies, HTTP headers, etc.) is covered.
  • Broken Access Control
    • Unfortunately this one is a tricky metric to get our hands around. Ideally we'd like to be able to say that our data model has proper object ownership and we could simply validate that we call our model appropriately for each access attempt. This is unlikely to be the case in most web applications.
    • I'd really break this metric down into Application-Feature and Data access control. For Application-Feature access control I'd make sure that I have a well-defined authorization framework that maps users and their permissions or roles to application features, and then measure coverage the same way I would for input filtering.
    • For Data access control, I unfortunately don't have a good model right now to create a design-time metric, or any metric for that matter.
  • Broken Authentication and Session Management
    • For a general application I again come back to use of frameworks to handle these common chores. I'd want to make sure that I have a proper authentication and session management scheme/framework that is resistant to all of the threats I think are important. The important metric is coverage of all application entry points against this framework. When implemented at the infrastructure level using a package such as Siteminder or Sun Access Manager, auditing configuration files for protected URLs ought to get me good coverage.
    • From a testing perspective I can also spider the application and/or review my webserver logs, compare the accessed URLs against the authentication definition, and make sure everything is covered appropriately.
  • Cross-Site-Scripting
    • From a design perspective there are two things that matter for XSS vulnerability.
      • Input Filtering
      • Output Filtering
    • The best metric for measuring XSS vulnerability is therefore a combination of the InputValidation metric and an equivalent OutputValidation metric.
  • Buffer Overflow
    • In general, buffer overflows are the result of improperly handled user input. Within a web application we ought to be able to handle most of these issues with our InputValidation metrics, but there are going to be cases where we handle the data in an unsafe way downstream. Unfortunately, our best techniques for detecting and eradicating them are going to be either dynamic languages where we don't get buffer overflows, or lots of static analysis and strict code reviews of all places we handle static-sized buffers. One partial solution is to simply use an environment that isn't itself susceptible to buffer overflows. This makes analyzing the web application for buffer overflows pretty easy.
    • For those who insist on writing web applications in languages such as C/C++, our best defense is to avoid the use of static buffers and to strictly code-review those places where we do use static buffers, analyzing inputs for proper bounds checking. One useful measure would be PercentBoundsCheckedInput, which we can theoretically gather with a static analyzer. Static analyzers are currently pretty decent at finding missing bounds checks.
      • One problem with the metric from the paper was a focus not on the web application itself but on its platform. I'm not sure that we're working at the right level when we start considering OS vulnerabilities when reviewing web applications. They are, however, certainly part of the picture and represent meaningful vulnerabilities.
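
Several of the replacement metrics above boil down to the same coverage calculation: what fraction of some inventory (inputs, entry points) is handled by the framework that is supposed to handle it. Here is a minimal sketch of that idea in Python. It is only an illustration under assumptions of mine: the function names, the input inventory, the URL list, and the glob-style protected patterns are all hypothetical and do not come from the paper or from any particular product's configuration format.

```python
from fnmatch import fnmatchcase


def coverage_percent(covered, total):
    """Return covered/total as a percentage; an empty inventory counts as 100%."""
    if total == 0:
        return 100.0
    return 100.0 * covered / total


def percent_validated_input(inputs):
    """inputs maps an input name to True if a validation rule covers it."""
    validated = sum(1 for is_covered in inputs.values() if is_covered)
    return coverage_percent(validated, len(inputs))


def unprotected_urls(accessed_urls, protected_patterns):
    """Return the distinct accessed URL paths that match no protected pattern."""
    return sorted(
        url
        for url in set(accessed_urls)
        if not any(fnmatchcase(url, pattern) for pattern in protected_patterns)
    )


def auth_coverage_percent(accessed_urls, protected_patterns):
    """Percent of distinct accessed URL paths covered by a protected pattern."""
    distinct = set(accessed_urls)
    gaps = unprotected_urls(distinct, protected_patterns)
    return coverage_percent(len(distinct) - len(gaps), len(distinct))


# Hypothetical input inventory: forms, cookies, and HTTP headers all count.
inputs = {
    "form:username": True,
    "form:password": True,
    "cookie:session_id": True,
    "header:User-Agent": False,  # no validation rule yet
}
print(percent_validated_input(inputs))  # 75.0

# Hypothetical URL paths taken from a spider run or webserver logs,
# compared against hypothetical protected-URL patterns.
accessed = ["/account/view", "/account/edit", "/admin/users", "/reports/export"]
protected = ["/account/*", "/admin/*"]
gaps = unprotected_urls(accessed, protected)
print(gaps)  # ['/reports/export']
print(auth_coverage_percent(accessed, protected))  # 75.0
```

The same coverage_percent helper could be reused for an OutputValidation or PercentBoundsCheckedInput number, provided you can build the inventory. Building that inventory is the hard part, and this sketch does not attempt it.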
In part-2 of this piece I'll try to cover the remaining 5 metrics as well as discuss a few thoughts on translating survivability/Quality-of-Protection into upstream SDL metrics.
