Security Retentive: May 2008

Thursday 5/22 I was at the IEEE Web 2.0 Security and Privacy Workshop. I figured I'd learn a few things, and also make sure that no new exploits were announced against my employer, and/or make sure we weren't the only examples people gave of problems.

I was pretty successful on goal #1, not 100% successful on goal #2.

This post is mostly a brain dump of notes about the talks followed by a few things of architectural interest that I think were discussed enough at the workshop. A quick preview - the first half of the conference was spent talking about general security holes in Web-1.0 that we still haven't solved technically/architecturally/culturally. With that in mind it's hard to see how we're going to have much success with Web-2.0 security.

I'll start by saying though that I was ever so slightly disappointed with the makeup of the attendees. Conferences and workshops held by the IEEE and ACM do generally tend towards the seriously geeky and academic side of things. You're much more likely to find papers that are suitable for journals with plenty of academic references, peer review, CS technical terms, formulas, etc. At the same time though workshops do tend towards the less academic and more practical side. It was disappointing therefore that, though the workshop focused a lot of time on things like secure mashups, social networks, and Web-2.0 security models, to the best of my knowledge very few of the players in this space were present. I didn't meet anyone from any of the really interesting mashup companies and none of the social networks were there (minus Google, which was well represented). Perhaps in the future people attending and organizing workshops like these can actually get the folks at the relevant companies interested, specifically invite them, etc.

Now, onto the papers/presentations themselves:

Quick Note: At the workshop each author presented slides and talked about their paper. I've only read a few of the papers so far, so don't take the commentary below too seriously, since I'm sure I missed some things that were covered more fully in the papers than in the presentations.

Session 1: Authentication and Authorization

Daniel Sandler and Dan S. Wallach. <input type="password"> must die!
Daniel presented some good ideas on how to move password authentication into the browser chrome to improve our defenses against JavaScript malware such as JavaScript keyloggers, etc.

While the work Daniel did was quite cool in that it doesn't require any protocol modifications, to be truly useful in implementing authentication inside browser chrome you probably need involvement from the site itself to hint, tweak, etc. Once you start doing that though, you start looking at doing stuff like CardSpace to actually get to a better architectural solution.

Ben Adida. Web Authentication by Email Address

Ben focused on usability concerns in OpenID and the idea that email addresses (or things that look like email addresses) are much better identifiers than URLs. He sketched out how to modify OpenID to use email addresses or lookalikes for authentication rather than URLs. Some of his proposals hinge on using DNS lookups for a domain to find the authentication server, much like we use MX records for email. While relying on DNS this way is potentially risky, DNSSEC could theoretically be used to mitigate some of the problems.
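To make the discovery step a bit more concrete, here is a rough sketch of how an email-style identifier might be split and its domain used to find an authentication server. The function names, the `FAKE_DNS` table, and the hostnames are all made up for illustration; the talk notes above don't say which DNS record type or naming scheme the proposal uses, so the lookup is just a dictionary standing in for DNS.

```python
def split_identifier(identifier: str) -> tuple[str, str]:
    """Split an email-style identifier into (local part, lowercased domain)."""
    local, sep, domain = identifier.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email-style identifier: {identifier!r}")
    return local, domain.lower()


# Hypothetical stand-in for a DNS query; a real system would ask the
# domain's DNS for whatever record the protocol defines.
FAKE_DNS = {"example.org": "auth.example.org"}


def find_auth_server(identifier: str, lookup=FAKE_DNS.get) -> str:
    """Return the authentication server advertised for the identifier's domain."""
    _, domain = split_identifier(identifier)
    server = lookup(domain)
    if server is None:
        raise LookupError(f"no authentication server published for {domain}")
    return server


print(find_auth_server("alice@Example.org"))
```

The point of the sketch is only that the user types something familiar and the domain part does the routing, the same way mail delivery keys off the domain.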

I must say I haven't kept up with OpenID as much as I'd like to, and so I'm 99% sure lots of the nuance of Ben's proposal was lost on me.

Session 2: Browser Security Models and Isolation

Collin Jackson and Adam Barth. Beware of Finer-Grained Origins

Collin Jackson presented some work he and Adam have done on how the browser security model, namely the same-origin policy, isn't nearly granular enough to handle most web applications and sites that host them.

For example:

http://cs.stanford.edu/~abarth
http://cs.stanford.edu/~cjackson

both have the same origin from the browser's point of view, but don't necessarily have the same security policy per user intent. Because the web browser can't really distinguish between them, we don't have a clean way of separating the security policies here.
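A quick way to see why the browser can't tell those two apart: the same-origin policy compares only scheme, host, and port, and ignores the path. The sketch below is my own illustration (the `origin` and `same_origin` helpers are not from the paper), using Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit

# Default ports for the schemes we care about in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple that same-origin checks compare."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme, -1)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when two URLs share scheme, host, and port; the path is ignored."""
    return origin(a) == origin(b)


abarth = "http://cs.stanford.edu/~abarth"
cjackson = "http://cs.stanford.edu/~cjackson"

print(same_origin(abarth, cjackson))                            # True
print(same_origin(abarth, "https://cs.stanford.edu/~abarth"))   # False
```

The path component, which is the only thing separating the two users' pages, never enters the comparison.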

Collin went on to show a multitude of problems in the same origin policy between sites, and problems in the upgrade/downgrade of security indicators in a browser. I won't rehash all of his results but suffice it to say we desperately need things like ForceHTTPS embedded in browsers in the near future to prevent some of these problems.

Kapil Singh and Wenke Lee. On the Design of a Web Browser: Lessons learned from Operating Systems

Kapil presented some research his team has been doing on modeling web browsers more like operating systems. You might have seen some related work recently as part of the OP Browser project. The idea is that the internal implementation of most browsers is pretty dicey from a security perspective. There is no clean separation between policy and mechanism. All code operates at the same privilege level. Plugins cannot be constrained in what they can do, etc.

I haven't seen any analysis yet comparing what MS did with IE7 on Vista in protected mode as compared to OP or Kapil's work. It is pretty clear that MS didn't fully segment IE7, but I wonder how close they got to ideal on the sandboxing side of things.

That said, I think our biggest problem in browser security isn't the implementation and internal segmentation. Our biggest problem is that we don't have any idea what security policies we really want to implement. Sure, having a flexible architecture under the hood makes it easier to implement flexible and finer-grained policies, but unless we have some idea what those are, perhaps we're putting the cart before the horse in terms of robust internal implementation.

Mike Ter Louw, Prithvi Bisht and V.N. Venkatakrishnan. Analysis of Hypertext Markup Isolation Techniques for XSS Prevention

My favorite presentation of the day was this one by Mike Ter Louw. Mike talked all about the multiple ideas circulating out there related to content restrictions. He showed the different failure modes for several of the proposals, showed how some of them can be rescued, and pointed towards areas that need more research.

The idea of content restrictions and server-indicated security policy that clients interpret and enforce is a really hot idea right now, and I'm hoping to catch up with Mike in the not too distant future.

Mike - if you see this, drop me a note :)

Session 3: Social Computing Privacy Issues

Adrienne Felt and David Evans. Privacy Protection for Social Networking Platforms

Adrienne presented some work she's done on weaknesses in the security model of social networks and platforms such as Facebook. She analyzed a bunch of Facebook applications to understand whether they really ought to be granted all of the rights over user data that they are. She proposed some mechanisms for limiting what types of applications get access to what data by enhancing the FBML tags to allow an application to get more data without API access. She also showed how you can solve some data sharing rules with just FBML and a few permissions extensions without resorting to full API access.

What Adrienne didn't come out and say is that in some contexts things like vetting are actually important. Most people in the social networking space and Web-2.0 space don't want to look at things like vetting, legal relationships, etc. as a model for achieving security. While a preventative model looks great on paper, solving some of the data safety/privacy concerns can really only be handled through contracts, vetting, etc. No amount of hoping developers will do the right thing and develop least-privilege applications will solve this problem.

Monica Chew, Dirk Balfanz, and Ben Laurie. (Under)mining Privacy in Social Networks

Monica presented some research on how we can inadvertently leak data from social networks by a multitude of means. While it was an interesting talk on how you can aggregate data from multiple locations to pin down more details than you ought to, since I'm not a heavy user of social networks I found myself less than interested in the general problem. If you're going to post large amounts of personal data online in multiple online sources, you're going to have people aggregating them together. There is only so much we can do to protect ourselves against that sort of aggregation.

Session 4: Mashups and Privacy

D. K. Smetters. Building Secure Mashups

D.K.'s talk was quite short on technical details and yet was one of the better talks of the day. Whereas I had a few complaints about Kapil's talk earlier in the day being a solution looking for a problem, D.K.'s talk was about the problem itself - namely - how do we actually define the security policy we're trying to achieve in the mashup space, what sorts of general rules ought to govern application behavior, security properties, etc.

This was the first talk of the day to really talk about user expectations for security, what we should generally understand to be user intent, and how to actually try and implement that in a mashup application.

Tyler Close. Web-key: Mashing with Permission

Tyler's talk may have been the most entertaining of the day, if only because of his obvious frustration with what the web has become. Tyler's main claim was that we ought to be using capability URLs to handle our authentication and authorization concerns. URLs that encode both authentication and authorization data bring us back to the original intent of the web, where the link is everything.

It was nice to see someone railing against a bit of what the web has become, but it almost felt like an original internet user lamenting the end of the end-to-end internet. A decent architectural argument, and yet one that isn't likely to yield a lot of converts. I don't think I understood a few of Tyler's points about how to prevent these URLs from leaking out and/or how to revoke access should they happen to. There are a multitude of user acceptance, behavior, and expectation questions to be answered. It was a nice twist though on how to perhaps make access-controlled content more in keeping with the spirit of the web.
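For anyone who hasn't run into capability URLs before, here is a minimal sketch of the general idea: the link itself carries an unguessable token, and holding the link is what grants access. This is my own toy illustration, not Tyler's web-key design; the `CapabilityStore` class, the `/cap/` path, and the server-side revocation table are all assumptions made for the example.

```python
import secrets


class CapabilityStore:
    """Toy in-memory store mapping unguessable tokens to (resource, permission)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self._grants: dict[str, tuple[str, str]] = {}

    def grant(self, resource: str, permission: str) -> str:
        """Mint a URL whose token names exactly one resource and one permission."""
        token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
        self._grants[token] = (resource, permission)
        return f"{self.base_url}/cap/{token}"

    def check(self, url: str, resource: str, permission: str) -> bool:
        """Possession of the URL is the only credential that gets checked."""
        token = url.rsplit("/", 1)[-1]
        return self._grants.get(token) == (resource, permission)

    def revoke(self, url: str) -> None:
        """Forget the token so a leaked URL stops working (an assumption of this sketch)."""
        self._grants.pop(url.rsplit("/", 1)[-1], None)


store = CapabilityStore("https://example.org")
link = store.grant("calendar/42", "read")
print(store.check(link, "calendar/42", "read"))    # True
print(store.check(link, "calendar/42", "write"))   # False
store.revoke(link)
print(store.check(link, "calendar/42", "read"))    # False
```

Even in this toy version the open questions are visible: anyone who sees the link has the access, and revocation only works if the issuer keeps state and knows which link leaked.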

Mihai Christodorescu. Private Use of Untrusted Web Servers via Opportunistic Encryption

Mihai's presentation was about how to take advantage of networked services/web-applications while providing them with only opaque data references created with cryptography. His main example was about how to use Google's Calendar product without ever sending them your real data, and sending them only client-side encrypted data instead.

While it seems like a nice idea, and while parts of his solution were technically elegant, I think again it was a solution looking for a problem. If you're so concerned about a networked service having your data that you're willing to reverse engineer the service to make it store your individual data elements encrypted, then perhaps a networked service isn't the one for you. The architectural challenges in achieving what he was able to with Google's calendar are nearly impossible with a more complicated service. And, in order to make it work you have to give up many of the features you'd really like from a service - full text searching, etc.

I'm guessing there are a few places where Mihai's ideas are feasible, but it's hard for me to see the value prop in building what he proposed.

Some Final Thoughts:

We haven't come close to solving the security problems in a Web-1.0 world
We don't know what the security policies really ought to look like for the web, consequently we don't know what the architecture and implementation look like either.
Browsers are lacking fundamental architecture and policy around security.
Web-2.0 only makes things worse

Apart from all of the unsolved security challenges, the biggest point that struck me from the workshop was the general belief (or I assume belief, I didn't challenge people on it) that mashups are here to stay, and that we're just going to have to back into a security model for them.

I remain unconvinced that a client-side application mashup between datasets is the only way to build new and innovative applications. I also suspect that if there were any liability concerns or even contracts that held some of these companies/services even semi-accountable, we'd have a very different architecture than the one we're seeing in the mashup space.

We're spending time and money working on specs like XDR, HTML5-access-control, and we still haven't solved some of the fundamental security problems of the web. I didn't see anything at this workshop to dissuade me from that perception either.

It's like the old saying goes - "If it ain't fixed - don't break it more". Well, ok, that isn't an old saying, but maybe if a few of the people working on mashups and social networks actually operated with that as their motto, we'd make some progress on all of this.

Eric Bidstrup of Microsoft has a blog entry up titled "How Secure is Secure?" In it he makes a number of points related, essentially, to measuring the security of software and what the appropriate metrics might be.

I'd been asking the Microsoft guys for a while whether they had any decent metrics to break down the difference between:

Architectural/Design Defects
Implementation Defects

I hadn't gotten good answers up to this point because measuring those internally during the development process is a constantly moving target. If your testing methodology is always changing, then it's hard to say whether you're seeing more or fewer defects of a given type than before, especially as a percentage. That is, if you weren't catching a certain class of issue with the previous version of a static analysis tool but now you are, it's hard to correlate the results to previous versions of the software.
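Here's a small sketch of why the raw numbers mislead when the tooling changes. Everything in it is hypothetical: the check names, the counts, and the `comparable_counts` helper are invented for illustration and aren't anyone's real data or process. The idea is simply to compare two releases only on the defect classes that both versions of the tool could detect.

```python
from collections import Counter


def comparable_counts(old_findings, new_findings, old_checks, new_checks):
    """Compare defect counts only for checks that both tool versions support."""
    shared = set(old_checks) & set(new_checks)
    old = Counter(f for f in old_findings if f in shared)
    new = Counter(f for f in new_findings if f in shared)
    return {check: (old[check], new[check]) for check in sorted(shared)}


# Hypothetical data: the newer tool gained an integer-overflow check.
old_checks = {"buffer-overflow", "format-string"}
new_checks = {"buffer-overflow", "format-string", "integer-overflow"}
old_findings = ["buffer-overflow"] * 3 + ["format-string"]
new_findings = ["buffer-overflow"] * 2 + ["format-string"] + ["integer-overflow"] * 4

print(len(old_findings), len(new_findings))  # 4 7
print(comparable_counts(old_findings, new_findings, old_checks, new_checks))
# {'buffer-overflow': (3, 2), 'format-string': (1, 1)}
```

The raw totals go from 4 to 7 and look like a regression, while the like-for-like comparison shows a small improvement. The four new findings say nothing about whether the older release had the same problem.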

Eric says:

Microsoft has been releasing security bulletins since 1999. Based on some informal analysis that members of our organization have done, we believe well over 50% of *all* security bulletins have resulted from implementation vulnerabilities and by some estimates as high as 70-80%. (Some cases are questionable and we debate if they are truly “implementation issues” vs. “design issues” – hence this metric isn’t precise, but still useful). I have also heard similar ratios described in casual discussions with other software developers.

In general I think you're likely to find this trend across the board. Part of the reason though is that in general implementation defects are easier to find and exploit. Exploiting input validation failures that result in buffer overflows is a lot easier than complicated business logic attacks, multi-step attacks against distributed systems, etc.

We haven't answered whether there are more Architectural/Design defects or Implementation defects, but from an exploitability standpoint, it's fairly clear that implementation defects are probably the first issues we want to fix.

At the same time, we do need to balance that against the damage that can be done by an architectural flaw, and just how difficult such flaws can be to fix, especially in deployed software. Take as an example Lanman authentication. Even if implemented without defects, the security design isn't nearly good enough to resist exploit. Completely removing Lanman authentication from Windows and getting everyone switched away from it has taken an extremely long time in most businesses because of legacy deployment, etc. So, as much as implementation defects are the ones generally exploited and that need patching, architectural defects can in some cases cause a lot more damage and be harder to address/remediate once discovered/exploited.

Another defect to throw into this category would be something like WEP. Standard WEP implementations aren't defect ridden. They don't suffer from buffer overflows, race conditions, etc. They suffer from fundamental design defects that can't be corrected without a fundamental rewrite. The number of attacks resulting from WEP probably isn't known. Even throwing out high profile cases such as TJ Maxx and Home Depot, I'm guessing the damage done is substantial.

So far then things aren't looking good for using implementation defects as a measuring stick of how secure a piece of software is, especially for widely deployed products that have a long lifetime and complicated architecture.

Though I suppose I can come up with counter-examples as well. SQL-Slammer after all was a worm that exploited a buffer overflow in MS-SQL Server via a function that was open by default to the world. It was one of the biggest worms ever (if not the biggest, I stopped paying attention years ago) and it exploited an implementation defect, though one that was exploitable because it was part of the unauthenticated attack surface of the application - a design defect.

All this really proves is that determining which of these types of defects to measure, prioritize, and fix is a tricky business and as always, your mileage may vary.

As Eric clearly points out, the threat landscape isn't static either, so what you think is a priority today might change tomorrow. And it's different for different types of software. The appropriate methodology for assessing and prioritizing defects for a desktop application is substantially different from that for a centrally hosted web application, with differences in exploitability, time-to-fix, etc.

More on that in a post to follow.
