Doing The Homework.

Every software vendor makes unfortunate mistakes from time to time. We’re human like everybody else, so we make them too. What matters in such cases is to publicly admit the error as soon as possible, correct it, notify users and make the right changes to ensure the mistake doesn’t happen again (which is exactly what we do at KL). In a nutshell, it’s rather simple: all you have to do is minimize the damage to users.

But there is a problem. Since time immemorial (or rather memorial), antivirus solutions have had a peculiarity known as false positives or false detections. As you have no doubt guessed, this is when a clean file or site is detected as infected. Alas, nobody has been able to resolve this issue completely.

Technically, the issue involves things like the much-talked-about human factor, technical flaws, and the actions of third-party software developers and web programmers. Here’s a very simple example: an analyst makes a mistake when analyzing a sample of malicious code and includes in the detection a piece of a library the malware uses. The problem is that the library is also used by some 10,000 other programs, including perfectly legitimate ones. As a result, about 20 minutes after the release of an update containing the faulty detection, technical support goes under in a deluge of messages from frightened users, the analyst has to rush out a corrected database, and angry, disparaging stories begin to surface on social networks. And this is not the worst-case scenario by far: imagine what would happen if Explorer, svchost or Whitehouse.gov were falsely detected :)
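
To make the mechanics concrete, here is a minimal Python sketch of that scenario. Everything in it (the byte strings, the names) is invented for illustration; real signatures and scanners are far more sophisticated:

    # Hypothetical file contents: both embed the same shared library code.
    MALWARE_SAMPLE = b"unique malicious payload" + b"common library code"
    CLEAN_APP = b"legitimate application logic" + b"common library code"

    # The faulty record: the analyst extracted the pattern from the library part.
    BAD_SIGNATURE = b"common library code"

    def is_detected(file_bytes, signature):
        """Simplistic signature scan: flag the file if the pattern occurs in it."""
        return signature in file_bytes

    print(is_detected(MALWARE_SAMPLE, BAD_SIGNATURE))  # True - intended detection
    print(is_detected(CLEAN_APP, BAD_SIGNATURE))       # True - false positive!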

Another example: it’s not uncommon even for respected developers of serious corporate software (let alone their Asian brethren) to include libraries previously used in malware in their own, perfectly non-malicious applications. Just like that! Such cases need to be accounted for in some way, too.

At first glance, it may seem that there is a direct relationship between the number of false positives and the antivirus’ market share. The math looks straightforward: the more users there are, the more varied the software they run, and the higher the chances of wrongly detecting something. In fact, this is not the case at all. Or, to be precise, if you do see this correlation, it’s a sure sign that the vendor is not investing in improving its anti-false-positive technologies. As likely as not, they aren’t investing in other technologies either, preferring to ‘borrow’ detections from their colleagues in the industry or simply ignoring the issue. A serious reason to reconsider your product choice criteria, if ever there was one.

There is another extreme, too. I would call it being too smart. Some technologies designed to detect unknown malware based on its behavior (behavior blockers, heuristics, metadata-based detections) do improve protection, but lead to non-linear growth in the risk of false detections, unless they are used in combination with measures to minimize false positives.
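
As a toy illustration of that trade-off (the traits, threshold and whitelist below are all invented), here is how a simple whitelist gate, one such minimization measure, keeps a behavior heuristic from flagging legitimate software that happens to act ‘suspiciously’:

    # Invented traits a behavior blocker might score as suspicious.
    SUSPICIOUS_TRAITS = {"writes_to_system_dir", "injects_into_process",
                         "disables_updates", "hides_own_files"}
    WHITELIST = {"backup_tool.exe"}  # known-clean software that does odd things

    def heuristic_verdict(name, traits, threshold):
        """Flag a program showing at least `threshold` suspicious traits,
        unless it is known to be clean (the FP-minimization measure)."""
        if name in WHITELIST:
            return False
        return len(traits & SUSPICIOUS_TRAITS) >= threshold

    # A legitimate backup tool behaves 'suspiciously' yet stays unflagged:
    print(heuristic_verdict("backup_tool.exe",
                            {"writes_to_system_dir", "hides_own_files"}, 2))  # False
    # An unknown program with the same behavior is flagged:
    print(heuristic_verdict("unknown.exe",
                            {"writes_to_system_dir", "hides_own_files"}, 2))  # True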

So what do antivirus vendors do about false positives?

Well, each uses a different approach. Some (it would seem) do nothing at all (see the table at the end of the post). Not every test takes false detections into account, so why invest in development if that’s not going to affect tomorrow’s sales? Right? Wrong.

Anyway, let me tell you how we do it.

Back in the early 1990s, when both viruses and antivirus solutions were like “alligators in the New York sewers” (© Peter Norton), we checked updates for false positives in semi-automatic mode. We had a special ‘dump’ of clean software (and later websites), against which we used to check each new version. Naturally, we also checked all updates against a malware database to avoid missing any detections.

Well, there were few viruses back then, updates were released just a few times a month, and there was less software around, to put it mildly. At some point in the late nineties, when the amount of both legitimate software and malware skyrocketed, we put the system into fully automatic mode. The robot regularly (on an hourly basis since 2004) collects antivirus updates from antivirus analysts and launches a multi-level testing procedure.
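
Conceptually, the robot’s core check can be pictured like the Python sketch below. The function names and corpus layout are my own illustration, not our actual pipeline; the principle is simply that an update must detect nothing in the clean collection and miss nothing in the malware collection:

    from pathlib import Path

    def is_detected(data, database):
        """Simplistic scan: any record (byte pattern) matching means detection."""
        return any(signature in data for signature in database)

    def test_update(database, clean_corpus, malware_corpus):
        """Release gate: block the update if it detects clean files
        (false positives) or misses known malware (lost detections)."""
        false_positives = [f for f in clean_corpus
                           if is_detected(Path(f).read_bytes(), database)]
        lost_detections = [f for f in malware_corpus
                           if not is_detected(Path(f).read_bytes(), database)]
        if false_positives:
            print("BLOCKED: %d false positive(s)" % len(false_positives))
        if lost_detections:
            print("BLOCKED: %d lost detection(s)" % len(lost_detections))
        return not false_positives and not lost_detections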

As you may have guessed, checking against the ‘dump’ is yesterday’s stuff. It’s necessary, but nowhere near enough. Today, merely detecting a false positive is not enough on its own: the corrected database needs to be delivered to users as quickly as possible. For catching false positives we have accumulated an impressive arsenal, including patented technologies. For example, we sometimes use so-called silent detections, where test records are included in database updates but users are not alerted when these records are triggered. This approach is used to verify the most sophisticated detections.
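
In rough Python terms, a silent detection amounts to something like this sketch (the record structure and the telemetry/alert helpers are hypothetical): the record fires like any other, but the hit is only reported back for verification instead of alarming the user:

    from dataclasses import dataclass

    @dataclass
    class DetectionRecord:
        record_id: int
        signature: bytes
        silent: bool = False  # test records ship in silent mode first

    def on_match(record, file_name):
        if record.silent:
            # No user-visible alert: just report the hit for verification.
            print("[telemetry] record %d hit on %s" % (record.record_id, file_name))
        else:
            print("[ALERT] %s detected by record %d" % (file_name, record.record_id))

    on_match(DetectionRecord(7, b"pattern", silent=True), "setup.exe")  # quiet
    on_match(DetectionRecord(8, b"pattern"), "dropper.exe")             # alert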

As for correcting false positives, cloud technologies are enormously helpful. In fact, KSN (video, details) is the best tool for the job: it helps us sort out many technical issues and improve protection. This includes qualitative and quantitative analysis of threats, quick detection of malicious anomalies, provision of interactive reputation services and much more. Of course, it would be really strange if we didn’t also use KSN to minimize false detections.

And we do! Every time a suspicious object is detected on a protected computer, our products send a request to a special database within KSN, which uses several algorithms to check the triggered record for false positives. If a false detection is confirmed, the system changes the record’s status on the update server and delivers the corrected record with the next database update. However, this delivery path means a delay of one to three hours, and sometimes as much as one to three days, depending on the user’s updating habits and Internet access. In the meantime, the false detection keeps coming up, making the user, and possibly others as well, nervous. This ain’t right.
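
The slow path can be pictured like this toy model (the class and method names are mine, purely for illustration): the record’s status flips on the update server at once, but a client only picks the fix up at its next scheduled database update:

    class UpdateServer:
        """Toy model of the update server holding record statuses."""
        def __init__(self):
            self.record_status = {}  # record_id -> "active" or "disabled"

        def confirm_false_positive(self, record_id):
            # The FP-checking algorithms confirmed the record is bad;
            # the next published update will carry the fix.
            self.record_status[record_id] = "disabled"

    class Client:
        """Toy model of a protected PC that updates on its own schedule."""
        def __init__(self, server):
            self.server = server
            self.local_status = {}

        def update_databases(self):
            # Runs hours or even days later; until then the client
            # keeps using (and alerting on) the faulty record.
            self.local_status = dict(self.server.record_status)

    server = UpdateServer()
    client = Client(server)
    server.confirm_false_positive(1042)
    print(client.local_status.get(1042, "active"))  # still "active" locally
    client.update_databases()
    print(client.local_status.get(1042, "active"))  # "disabled" after the update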

To address this, last year we added a little feature (one of those features you’d normally never hear about) to our personal and corporate products, making it available via KSN. We have filed patent applications for it in Russia, Europe and the US. Let me explain it with an example.

Suppose our product has detected a piece of malware on a protected PC. From the technical point of view, here’s what happened: while scanning an object, one of the protection modules (signature scanner, emulator, heuristic engine, anti-phishing, etc.) has found a match between the object’s code and one of the records in the local antivirus database.

Before displaying an alert to the user, the product sends a query to KSN with data on (a) the detected object and (b) the record that was triggered. KSN checks the object against the whitelist and the record against the false-detection list. If there is a match with either list, KSN immediately sends a false-positive signal to the protected PC. Our product then deactivates the antivirus database record, or puts it into “silent detection” mode, and makes a note for all the other protection modules to avoid repeated false detections. Conversely, if a “silent detection” record is triggered on a computer and KSN confirms that it is ready for active duty, our product immediately puts it back into normal mode.
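
In pseudo-real Python, the exchange could look something like the sketch below. The whitelist, the false-detection list, the hashes and the function names are all assumptions for illustration; the actual protocol and checks are more involved:

    import hashlib

    # Hypothetical cloud-side data: hashes of known-clean files, plus record
    # IDs already confirmed as false detections.
    WHITELIST = {hashlib.sha256(b"known clean file").hexdigest()}
    FALSE_DETECTIONS = {1042}

    def ksn_is_false_positive(object_hash, record_id):
        """Cloud side: match against the whitelist or the false-detection list."""
        return object_hash in WHITELIST or record_id in FALSE_DETECTIONS

    def handle_detection(file_bytes, record_id, muted_records):
        """Client side: query KSN before showing the user any alert."""
        object_hash = hashlib.sha256(file_bytes).hexdigest()
        if ksn_is_false_positive(object_hash, record_id):
            # Mute the record for all protection modules; no alert is shown.
            muted_records.add(record_id)
            print("record %d muted as a false positive" % record_id)
        else:
            print("ALERT: detection by record %d" % record_id)

    muted = set()
    handle_detection(b"known clean file", 1042, muted)  # muted, no alert
    handle_detection(b"some other file", 7, muted)      # alert shown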

At first glance this logic, despite being quite simple, may seem too resource-intensive. In fact, it’s far from it: all the protected PC and KSN need to exchange are very short requests; the rest is taken care of in the background by other services.

The fact is, pretty much every antivirus vendor has had false-detection problems. Well, we’ve been there too, a few times. However, it’s worth noting that on this parameter we are among the best. We recently analyzed the results from the four most respected testing labs (AV-Comparatives.org, AV-Test.org, Virus Bulletin and PCSL), a total of 24 tests conducted in 2011. The findings turned out to be pretty interesting:

[Table: 2011 false-positive results of the major antivirus vendors]

* average % of false positives in the 24 tests throughout 2011
** maximum # of false positives in a single test in 2011
True, in terms of false positives we come second, after Microsoft. However, if you look at protection quality, you will find that in the same year, 2011, at the same testing labs, our detection rates in the Real World and dynamic tests were 37% better!

We have lots of plans (as always :). We have some very interesting ideas for pretty much eliminating false detections by adding a mass of intelligence to the process. Unfortunately, for now we’re staying silent on this to avoid letting the cat out of the bag before all the patent applications have been filed. But I’ll keep you posted!

That’s all folks, see you later!

Responses to “Doing The Homework.”

  1. OK, thanks for the information

  2. An interesting read. Thanks.

  3. Long discussion ahead on this. One reason for false positives and negatives is open security holes.

    Things that people forget to think about:

    Types of viruses, categories, destination, reasoning…

    Attacks on cyberspace are more common and more frequent than ever. One of the reasons is that several governments neglected foreign policy efforts in this area, leaving each country to depend on itself and create its own policies.

    The push for globalization brought forward changes and requirements that revealed misplaced requests and illegal security protocols, hence the overload of WikiLeaks and security issues such as “Olympic Games”, the cyber breach of Iran.

    The normal inquiries from antivirus products come from a database that collects viruses from end users, businesses, and download sites.

    Remember when security alerts and numerous articles followed a virus breach, e.g. Sasser?

    Only a few years ago these breaches were very important. The mechanical recall and upload at times fails to notice changes in files with the same names. E.g. Sasser: the same virus on multiple computers might return different structures or consequences. It’s like H2O vs. OH2… see the difference?

    It’s important to understand both the new and the old composition. I have a great example… :) I’ve been sending it to everyone.

    A picture file from Paint that I opened as text: I was able to load it into the command prompt, make changes to the file and re-save it. Is that OK? The reproach here is that it met the standards, but with such a hole available, what are the consequences without due integrity? A potentially harmful person, unlike myself, might use an opportunity like that to attack multiple hosts or picture sites for ill will or monopoly gains.

    The idea here is to do more than publicize the event. Managing such a situation is not required by some standards, but the risks may mandate immediate action. There is an in-between, but where do you begin to manage such a breach, one that has no category yet meets a global, possibly national, standard? Breach? Virus risk? Piracy risk?

    BTW – I did my part and forwarded it to Trend Micro and Symantec.

  4. False positives are a problem for the small software vendor. I am a creator of shareware screen savers and related products, and I often get reports from customers that the screen savers they have bought from me and have been using for years have suddenly stopped working because their antivirus program has reported a false positive after its virus definitions were updated. That is time that I have to spend explaining to my customers why their screen savers have suddenly stopped working, and then reporting false positives to the antivirus company.

    And of course I do not know how many people download the trial version of my screen savers, only to have their AV program tell them, incorrectly, that the file poses a threat. These are potentially lost sales.

    One “reputable” AV’s Site Advisor (not Kaspersky) once blacklisted my entire website as unsafe, until I complained. Could a bricks and mortar security company go to a restaurant and board up the front door with a sign that says “Closed due to suspected health concerns” which then turns out to be entirely false? Of course not! Imagine the hue and cry. Then why are antivirus vendors allowed to do that to the small software author?

    False positives do real-world harm, and the irony is that this harm could perhaps be worse than the effect of any virus.

