Friday, December 30, 2016

thoughts on "In Search of Evidence-Based IT-Security"

Christopher Soghoian brought to my attention a video of a talk by Hanno Böck at the 33rd Chaos Communication Congress. in it Hanno puts forward the claim that IT security is largely science-free, so let's follow a staple of the scientific process - peer review.

Hanno introduces himself as a journalist and hacker and says that he prefers to avoid the term "security researcher" and that he hopes the audience will see why.  for those who are relatively well versed in the field of anti-malware it should definitely become obvious why he prefers to avoid that term and i'll return to this near the end.

Hanno is a skeptic, and far from the only one. his talk ultimately expresses the same sentiments that are now commonplace in the perennially misinformed information security community. the difference is that Hanno has found a novel way of expressing them, couched in scientific jargon and easily mistaken for insight. he spends altogether too long on, and dives too deeply into, the medical analogy after which computer viruses and, by extension, anti-virus software are named. the analogy has long been recognized as deeply imperfect and limited. that's why, in reality, there are relatively few references to this analogy in the anti-malware field other than "computer virus", "anti-virus", and "infection" (all three of which date back virtually to the beginning of the field). his call towards the end of his talk for blinded or even double blinded studies (which would be prohibitively expensive to perform) seems to cling to this medical paradigm in spite of the fact that the subject of such experimentation (ie. the computer, since we're interested in whether AV can prevent computers from becoming compromised) cannot be psychologically influenced by knowledge of which (if any) anti-virus is being used.

when he FINALLY leaves the topic of medical science to return to security products (about 14 minutes into his half hour talk) he harps on the absence of one very particular kind of experiment being performed on security products - what he calls a randomized controlled trial. it turns out this is a hold-over from his preoccupation with medical science. when Hanno says that IT security is largely science-free it is the absence of this particular kind of scientific experiment that he is referring to, but that doesn't actually make it science-free because science has a variety of different ways to study and experiment on things that aren't people.

there is in fact good scientific evidence for the efficacy of anti-virus software and it's provided by none other than Microsoft:

now it's true that this data is from an observational study and that it only shows correlation rather than causation, but that's not the end of the world. observational studies are still science. showing correlation may not be definitive evidence but it's still strong evidence, especially considering the scope of the study (hundreds of millions of computers around the world out of a total estimated population of 1.25 billion windows PCs). in this particular case A may not be causing B but B definitely can't cause A and if anyone can think of a confounding variable that might be present on hundreds of millions of systems then maybe let Microsoft know so that they can try to account for it in the future.

another source of scientific evidence (oft derided in information security circles because the results don't match experts' anecdata) are the independent testing labs like av-test.org or av-comparatives.org. they eliminate the influence of confounding variables and so are capable of showing causation rather than just correlation. unfortunately Hanno believes their methodology is "extremely flawed". let's look at his complaints:
  • "If a software detects a malware it does not mean it would've caused harm if undetected."
    • this is trivially false. anyone who actually reads the testing methodology at av-comparatives (for example) can find right at the beginning a statement about first testing the malware without the AV present and eliminating any that don't work in that scenario. therefore every sample that is detected by AV in their tests would have caused harm if it had gone undetected.
  • "Alternatives to Antivirus software are not considered." (the talk gives "regular updates" and "application whitelisting" as examples)
    • the example of "regular updates" is frankly a little bit bizarre given Hanno's earlier references to confounders. not controlling for this scenario would actually introduce a confounding variable and make it more difficult to show a causal relationship between the use of a particular AV and the prevention of malware incidents.
    • the example of "application whitelisting" underscores a serious problem in Hanno's understanding of what he's critiquing. application whitelisting isn't an alternative to AV, it's a part of AV. many products include this as a feature. Symantec's product, for example, has what they call a reputation engine which alerts when it encounters anything that doesn't have a known good reputation (which means new/unknown malware, traditionally the bane of known-malware scanning, will get alerted on because it hasn't been seen before and thus has no reputation, good or bad).
  • "Antivirus software as a security risk is not considered."
    • when malware exploiting vulnerabilities in anti-virus software is found in the wild then perhaps the test methodologies should be updated to include this possibility. until then, changing the methodology to account for malware that doesn't seem to exist outside a lab has no real benefit.
  • "None of these tests are with real users."
    • again, this would introduce a confounding variable. maybe the lack of malware incidents is because of something the user did rather than because of the AV. alternatively maybe the failure to stop malware incidents is because of something the user did rather than because of a failure of the AV. if you want to establish causation you have to control your variables (something our scientifically-minded speaker Hanno should know all too well). does the anti-virus prevent malware incidents? the tests say yes. can a user preempt or compromise that prevention? also yes. is there any prevention a user can't preempt or compromise? sadly (or perhaps thankfully) no. if you want a study that includes users and thus eliminates the ability to establish a causal link between AV use and prevention of malware incidents, see the study by Microsoft, but even with the inclusion of the users it still suggests AV prevents malware incidents.

when Hanno addressed the paucity of scientific papers dealing with security i found myself confused. using Google Scholar to find the most cited scientific papers? surely he doesn't think the realm of security is so narrowly focused that he'll find what he's looking for that way. security is in fact incredibly broad, covering many different quasi-related domains, and looking at a handful of the most popular scientific papers across all of security is in no way representative of the corpus of available works related to any one particular field (like security software). perhaps i'm biased, having previously (in the very distant past) maintained a reference library of papers related specifically to anti-virus, but it doesn't seem like Hanno showed much evidence that he knew how to find evidence-based security. is it really that hard to add the term "malware" to his search query? could he not find a few and then use them as a seed in an algorithm that crawls backwards and forwards through scientific papers by citation? did he even bother to look at Virus Bulletin? does he even know what that is?
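for the curious, crawling by citation is not complicated. what follows is only a minimal sketch in python: the `snowball` function, the `CITES` table, and the paper names in it are all made up for illustration, and a real crawl would need actual citation data from whatever index you have access to.

```python
from collections import deque


def snowball(seeds, cites, max_hops=2):
    """Breadth-first crawl of a citation graph in both directions.

    `cites` maps a paper to the list of papers it cites. Going
    "backwards" follows those references; going "forwards" follows
    the papers that cite it. Returns {paper: hops from nearest seed}.
    """
    # Build the reverse index so we can also walk forwards in time.
    cited_by = {}
    for paper, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)

    found = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        paper = queue.popleft()
        hops = found[paper]
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        neighbours = set(cites.get(paper, ())) | cited_by.get(paper, set())
        for other in sorted(neighbours):
            if other not in found:
                found[other] = hops + 1
                queue.append(other)
    return found


# Hypothetical citation data: each paper cites the one before it.
CITES = {
    "paper_b": ["paper_a"],
    "paper_c": ["paper_b"],
    "paper_d": ["paper_c"],
    "paper_e": ["paper_d"],
}

one_hop = snowball(["paper_c"], CITES, max_hops=1)
two_hops = snowball(["paper_c"], CITES, max_hops=2)
```

starting from a single seed in the middle of that chain, one hop picks up the paper it cites and the paper that cites it, and two hops picks up the whole chain.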

security isn't the only thing that is incredibly broad - so too is the practice and discipline of science itself. there are many different fields and each one does things in their own particular way. we do not perform randomized controlled trials on the cosmos. as a general rule we do not intervene in volcano formation. the work being done at the large hadron collider does not follow exactly the same methodologies that are used in medical science. are we to judge cosmology, volcanology, or particle physics poorly because of this? no of course not. a question you might well ask is what kind of science should logically be used when it comes to studying computer security and, while i suspect multiple scientific disciplines could be useful, the one that springs immediately to mind is computer science. does computer science look anything like medical science? as someone with a degree in computer science i can tell you the answer is emphatically no. we do many things in computer science but randomized controlled trials are not among them (because computers are not people). while Hanno may style himself as "scientifically minded" he doesn't seem to demonstrate an appreciation for the breadth of valid scientific research methodologies and one is left to wonder if he's familiar with any kind of science outside of medicine.

when it comes right down to it, it's this apparent lack of familiarity with the subject matter he's talking about that i found most troubling about Hanno's talk. what is anti-virus software really? what is av testing methodology really? what does science really look like? where do you look for scientific research into malware and anti-malware? these all seem to be questions Hanno struggles with, which brings us back to the subject of why he likes to avoid the term "security researcher". if i had to venture a guess i'd say it's because he doesn't do research, even the basic research necessary to understand the subject matter. as such i would say avoiding the term "security researcher" is probably appropriate (for now).

i'm not sure what one can say in a talk about a subject one hasn't done one's homework on, but hopefully that can improve in the future. Hanno referenced Tavis Ormandy during his talk (as people who criticize AV like to do). Tavis' work on AV also suffered from a lack of understanding in the beginning, but he improved over time and, while he still has room for more improvement, now has arguably done some good work in finding vulnerabilities in AV and holding vendors accountable for the quality of their software. i'm certain Hanno can also improve. i know there are real criticisms to be made of AV software and the industry behind it, but they have to be informed, they have to come from a place of real knowledge and understanding. i look forward to Hanno reaching that place.

Friday, October 21, 2016

highlights from #sector2016

i haven't posted about the sector conference in a number of years, in spite of attending. let's break that trend. here's some highlights from this year's conference.

  • edward snowden - he was surprisingly well prepared to talk about canadian policy and events
  • marketing gimmicks - hockey pucks are one thing, but give me a key and tell me that there's a lock that it might fit and you're darn right i'm going to go find out if it fits. i gather there was a prize i could have won if it did fit but i didn't pay much attention to that. a friend at work said he'd have bought their product on the strength of that gag alone.
  • ransomware - ransomware seemed to be the theme this year. i lost track of how many talks were about or mentioned ransomware. 2016 really does seem like the year of ransomware. i caught the tail end of a talk from someone at sophos where they described a feature for rolling back encrypted files using a proprietary backup mechanism. if you other vendors aren't doing something like this you're leaving money on the table. (that's right, i know how you vendors think)
  • mikko hypponen - great perspective on what protecting computers has evolved into: protecting society because it now runs on computers
  • the security problems of an 11 year old and how to solve them - this talk was given by an actual 11 year old who could probably put some professionals to shame. this is the talk i most look forward to sharing with people at work when the videos become available.
  • mikko's "virus" floppy disk that he left behind - this isn't a highlight because someone from the AV industry was careless with infectious materials but rather because when it was found people wanted to find a computer they could stick it into and see what was there. you'd think the difficulty in finding the hardware necessary to read a 5 1/4" floppy would make such a disk a relatively safe prop to use, even if there was a virus on it. leave it to infosec pros to try and find ways around such barriers. now you know where shadow IT comes from, folks. by the way, don't tell mikko.
there were other good talks and keynotes, of course, but i'm not going to detail every talk i attended and every person i met. these are the things that really stood out to me and if you want to know more you should have gone yourself.

Friday, September 02, 2016

the anti-virus harm balance

anti-virus software, like all software, has defects. sometimes those defects are functional and manifest in a failure to do something the software was supposed to do. other times the defects manifest in the software doing something it was never supposed to do, which can have security implications so we classify them as software vulnerabilities. over the years the software vulnerabilities in anti-virus software have been gaining an increasing amount of attention from the security community and industry - so much so that these days there are people in those groups expressing the opinion that, due to the presence of those vulnerabilities, anti-virus software does more harm than good.

the reasoning behind that opinion goes something like this: if anti-virus software has vulnerabilities then it can be attacked, so having anti-virus software installed increases the attack surface of the system and makes it more vulnerable. worse still, anti-virus software is everywhere, in part because of well funded marketing campaigns but also because in some situations it's mandated by law. add to that the old but still very popular opinion that anti-virus software isn't effective anymore and it starts looking like a perfect storm of badness waiting to rain on everyone's parade.

there's a certain delicious irony in the idea that software intended to close avenues of attack actually opens them instead, but as appealing as that irony is, is it really true? certainly each vulnerability does open an avenue of attack, but is it really doing that instead of closing them or is it as well as closing them?

if an anti-virus program stops a particular piece of malware, it's hard to argue that it hasn't closed the avenue of attack that piece of malware represented. it's also hard to argue that anti-virus software doesn't stop any malware - i don't think anyone in the anti-AV camp would try to argue that because it's so demonstrably false (anyone with a malware collection can demonstrate anti-virus software stopping at least one piece of malware). indeed, the people who criticize anti-virus software usually complain not about the set of malware stopped by AV being too small but rather that the set of malware stopped by AV doesn't include the malware that matters most (the new stuff).

so, since anti-virus does in fact close avenues of attack, that irony about opening avenues of attack instead of closing them isn't strictly true. but what about the idea that anti-virus software does more harm than good? well, for that to be true anti-virus software would have to open more avenues of attack than it closes. i don't know how many vulnerabilities any given anti-virus product has so i can't give an exact figure of how many avenues of attack are opened. i doubt anyone else can do so either (though i imagine there are some who could give statistical estimates based on the size of the code base). the other side of the coin, however, is one we have much better figures for. the number of pieces of malware that better known anti-virus programs stop (and therefore the number of avenues of attack closed) is in the millions if not tens of millions and that number increases by thousands each day. can the number of vulnerabilities in anti-virus software really compare with that?

it's said that windows has 50 million lines of code. if an anti-virus product were comparable (i suspect in reality it would have fewer lines of code) and if that anti-virus product only stops 5 million pieces of malware (i suspect the real number would be higher) then in order for that anti-virus product to do more harm than good it would need to have at least one vulnerability for every 10 lines of code. that would be ridiculously bad software considering such metrics are usually estimated per 1000 lines of code.
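to make that arithmetic explicit, here's a minimal sketch in python. the function name is mine, and the inputs are just the hypothetical figures from the paragraph above (50 million lines of code, 5 million pieces of malware stopped), not measurements of any real product.

```python
def break_even_lines_per_vulnerability(lines_of_code, malware_stopped):
    """Lines of code per vulnerability at which avenues opened equal avenues closed.

    Treats each vulnerability as one avenue of attack opened and each
    piece of malware stopped as one avenue of attack closed.
    """
    return lines_of_code / malware_stopped


# Hypothetical figures taken from the text, not real measurements.
LINES_OF_CODE = 50_000_000
MALWARE_STOPPED = 5_000_000

break_even = break_even_lines_per_vulnerability(LINES_OF_CODE, MALWARE_STOPPED)

# The same break-even point expressed per 1000 lines of code.
per_thousand_lines = 1000 / break_even

print(break_even)          # 10.0
print(per_thousand_lines)  # 100.0
```

in other words the product would need about 100 vulnerabilities in every 1000 lines of code just to break even under these assumptions.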

now one might argue (in fact i'm sure many will) that those millions of pieces of malware that anti-virus software stops don't really represent actual avenues of attack because for the most part they aren't actually being used anymore. they've been abandoned. counting them as closed avenues of attack isn't realistic. the counter-argument to that, however, is to examine why they were abandoned in the first place. the reason is obvious, they were abandoned because anti-virus software was updated to stop them. the only reason why malware writers continue making new malware instead of resting on their laurels and using existing malware in perpetuity is because once anti-virus software can detect that existing malware it generally stops being a viable avenue of attack. so rather than the abandonment of that malware counting against anti-virus software's record of closing avenues of attack it's actually closer to being AV's figurative body count.

there is still malware out there that anti-virus software hasn't yet stopped, and as that set is continually replenished it's unlikely that anti-virus software will ever stop all the malware. it has stopped an awful lot so far, however, so the next time someone says anti-virus software does more harm than good (due to its vulnerabilities) ask them for their figures on the number of vulnerabilities in anti-virus products and see how it compares with the number of things anti-virus software stops. i have a feeling you'll find those people are full of it.