Sadly, there has been little agreement among vendors or researchers on malware naming standards. At least in part, this reflects the fact that in the early days of AV development each vendor pretty much did its own thing, and many vendors felt that the choice of name did not matter much. This was ‘helped’ by the fact that relatively few computer users ran AV software, and few of those who did used more than one scanner. The attitude was further assisted by the fact that new viruses spread so slowly that it was uncommon for two (or more) vendors to each receive samples of the same new virus directly from their customers. Thus, when developers received a new (to them) virus, there was a good chance another developer’s scanner already detected it and the existing name for it could be used. And, in the cases where the competitor’s existing name was not adopted, or where a virus was independently discovered and named differently by two or more AV developers, few of their customers would notice.

This is not to suggest that naming confusion did not exist or was not seen as a ‘problem’ (it annoyed the hell out of some AV researchers), but overall it was seen as less of an issue than it is today. Of course, things change. These days, because of the cooperative sharing of malware samples common in the industry, developers are just as likely to get their first samples of most new viruses from other developers as from their own customers.

Failure of previous naming practices

It may seem that this sharing of samples should improve naming consistency, but in general it has not. This perhaps counter-intuitive result arises for several reasons. First, some vendors commonly use non-standard naming forms that other vendors (rightly) refuse to copy. Further, some ‘purists’ will not use certain names or sources of names. For example, a virus ‘discovered’ by one vendor and given the name its writer clearly wanted will be given a different name by researchers who refuse to use such ‘suggested’ names. Alternatively, the discoverer, and thus namer, of a virus may miss its resemblance to an existing family and so choose a new family name for it. A second researcher who notices the resemblance may (rightly) refuse to use that new family name. If the more thorough researcher insists on ‘correctly’ naming the virus as a variant of the existing family while other researchers refuse to rename it, different names will be in use. (Note also that in cases such as this, when the first discoverers later find what they consider to be the ‘next’ variant in the pre-existing family, they will likely assign that new and different virus the same variant name the second researcher used for the earlier one, deepening the naming confusion for that family.) As if that were not confusing enough, many issues other than independent discovery also result in the same malware getting several different names at the hands of different researchers.

Once upon a time, many of these naming differences were resolved at semi-regular meetings between several of the major AV researchers in CARO (usually held at conferences they all attended). These researchers would sit down and hammer out agreement on the ‘correct’ names for all the viruses discovered since they last met. When they returned home from the conference, they would rename many of the viruses their products detected, so there was good inter-vendor agreement among the products under these researchers’ control.

However, this approach had at least three serious shortcomings. First, it was manageable (although not necessarily ideal) only while the number of new viruses discovered between meetings was a few hundred. Nowadays, depending on how you count, there are typically 500 to 1000 new viruses and related malware discovered every month. The old, informal CARO naming sessions are long gone and would be entirely unmanageable under today’s typical new malware load. Second, some of the developers who were influential in terms of discovering, and thus being first to name, new malware did not actively support the CARO naming consolidation initiative. Third, many AV developers were not represented at these meetings and did not work closely enough with members of CARO to keep the names used in their products in sync with the CARO-agreed names. Such a developer might name new viruses based on the reports of a scanner whose developer did attend the CARO naming sessions and did update its malware names in line with the agreements reached there. Even so, if a virus was renamed at a subsequent CARO meeting, the non-CARO developer was less likely to notice. (Some of these developers also seemed to have a very negative view of renaming viruses they already detected, or even of trying to keep in step with the CARO naming scheme, which despite its many weaknesses is the only concerted effort to date to coordinate names among AV developers.)

The development of a new naming framework

Of course, the CARO naming scheme (as opposed to the CARO naming sessions) is a more generic and widely applicable device to use in naming debates. If most AV vendors stuck to the constraints of that scheme, most malware would be given names that should be broadly acceptable to most other developers. That alone should make the grander goal of standardizing names easier to achieve. Further, if a sufficiently general scheme were available, one that allowed relatively easy extension as new malware types need to be added, it might be more readily accepted both across the whole AV industry and perhaps even by other ‘security’ product developers, who are increasingly interested in malware naming that is consistent across vendors.

Within the AV industry, it seems that many researchers responsible for naming malware are not aware of many of the basic strictures of the CARO naming scheme and/or are confused by many of its particulars. No doubt this reflects the historical animosity some AV developers have expressed toward the CARO naming scheme and the often ‘slack’ attitude to naming seen in many labs (including some that pay lip-service to supporting the CARO scheme). With AV developers facing increased criticism from their customers, and with nothing else close to an ‘industry standard’ to start from, the CARO naming scheme is often suggested as the departure point for efforts geared toward fixing the current naming mess. Thus, if the CARO naming scheme is to be useful for this, we need to address the shortfall in general knowledge of the scheme. To show that some effort is being put into standardizing naming, this paper on the practice of malware naming is offered to the research community in the hope that it may be used as an educational tool in virus labs, assisting researchers to choose better-formed names. In turn, it is hoped this will lead to a more cooperative approach to naming, which should reduce the naming mess going forward.
