This is the key component of any malware name and thus, perhaps not surprisingly, the one that causes the most trouble. As the only truly required name component, it is the one that most needs to match across product detection reports if the industry is to be seen to be addressing the complaints about naming confusion mentioned earlier. Thus there should be a great deal of pressure on researchers assigning the family name of a piece of malware to make a good choice. Unfortunately, it is also the component that is hardest to specify in a guide such as this, or in the final, formal naming specification, as the rest of the name components are partly or entirely ‘programmatic’. Because the family name is open to ‘artistic licence’, questions such as ‘what is a good family name?’ will always be open to debate and will almost inevitably remain a matter of subjective judgement.
At its simplest, a malware family name is an identifier following the rules and practice guidelines given above. However, it is also much more than that. It is often the only name component the malware writer has given any thought to, and it is commonly the only name component that will be extensively used in the media should the malware in question, or any of its subsequent variants, achieve their fifteen minutes of (ill-deserved) fame. Thus, there are a great many ‘musts’, ‘shoulds’ and ‘must nots’ surrounding the choice of a suitable family name.
A technical naming specification is not really the place for such things, but as this paper is more of a practice guide, it seems appropriate to include them. As it seems many researchers need some guidance that better reflects ‘accepted industry practice’ than they get within their employers’ research labs, this may be as good a place as any to spell out these ‘rules’. Many of the following are from the original ‘naming.txt’ document, but some have been modified in light of more than a decade of subsequent experience.
Must nots
- Do not use company names, brand names, or names of living people. The original standard then adds ‘except where the virus is provably written by the person’, but the practice of naming malware after its author is now strongly discouraged for several reasons. The most compelling are that obtaining recognition is a common motivation for malware writers and distributors, and that, as many virus writers produce samples of more than one family of virus, the name of the writer is a very inconsistent label to associate with the code. Common first names are permissible, but be careful and avoid them if possible. In particular, avoid names associated with the anti-virus and computer security worlds. ‘Perverted’ forms of the virus writer’s name or handle have occasionally been used as family names. This is not a recommended practice, but it is probably better than choosing the virus writer’s name verbatim if you are otherwise lost for inspiration.
- Do not use an existing family name unless the virus belongs to the same family. Note that this applies purely to the <family_name> level of the specification. Using the same family name for a VBS virus as an existing Word VBA (W97M) virus or Win32 Trojan is wrong unless there is good reason to consider them closely related (for example, strong code similarities). Being on different platforms does not justify family name re-use.
- Do not invent a new family name if there is an existing, acceptable name. This is the inverse of the previous rule — if a new piece of malware is ‘clearly’ a new member of an existing family, put it in that family. Just because it has a different payload or trigger condition or is even attributed to a different writer does not necessarily make it the start of a new family. Family membership is decided on the basis of code similarity, particularly as regards the ‘most salient’ features of each class of malware. Thus, similarity of infection mechanism is generally most important when deciding family placement of viruses, as infection is the defining aspect of virality.
- Do not use obscene or offensive names. Choosing a name from strings in the malware that are in a language you do not understand can be especially problematic in this regard. Also remember that names and terms of significance in a religion you are unfamiliar with may be deeply offensive to members of that religion if associated with something as banal, destructive or otherwise ‘ungodly’ as a piece of malware.
- Do not assume that, just because a sample arrives with a particular name, the malware has that name. This is especially important when processing presumably sorted collections from other researchers. The publicly available VGrep tool can help you here by checking name cross-references for ‘older’ malware. With very new malware it is more difficult, but if you, or another researcher in your team, are a member of AVED, you have access to a tool that provides VGrep-like results in ‘near real time’.
- Do not use numeric family names. Names such as ‘Eight941’ have already been mentioned, but even names like ‘V845’ should be avoided. Numeric names in which the number represents the infective length of the virus, even if some of the numbers are expressed as words, should not be used as family or group names, as other members of the family/group will almost certainly have different lengths. As an exception to this general rule, when a new virus appears and a new family name must be selected for it, it is acceptable to use a temporary name of the form ‘_1234’, but this must be changed as soon as possible. Note that this is an exception to two general guidelines: not beginning an identifier with a non-alphabetic character and not using numeric family names. If this practice is adopted in your lab, the number to use for parasitic executable viruses is the infective length; all other kinds of malware should be given such temporary names using a recycling, incrementing sequence starting with ‘_1’.
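The temporary-name exception in the last rule is mechanical enough to sketch in code. The following is a minimal illustration only, not part of any naming specification: the class and method names are hypothetical, and it assumes that ‘recycling’ means a sequence number becomes available for reuse once its temporary name has been replaced by a proper family name.

```python
import heapq


class TempNameAllocator:
    """Hands out temporary family names of the form '_<number>' (hypothetical helper)."""

    def __init__(self):
        self._next = 1       # next never-used sequence number
        self._free = []      # min-heap of released numbers (assumed meaning of 'recycling')
        self._in_use = set()

    @staticmethod
    def for_parasitic(infective_length):
        """Parasitic executable viruses: the number is the infective length."""
        if infective_length <= 0:
            raise ValueError("infective length must be positive")
        return f"_{infective_length}"

    def allocate(self):
        """All other malware: lowest released number, else the next in sequence."""
        if self._free:
            number = heapq.heappop(self._free)
        else:
            number = self._next
            self._next += 1
        self._in_use.add(number)
        return f"_{number}"

    def release(self, name):
        """Call once the temporary name has been replaced by a real family name."""
        number = int(name.lstrip("_"))
        if number not in self._in_use:
            raise KeyError(name)
        self._in_use.remove(number)
        heapq.heappush(self._free, number)
```

In a lab that adopts this practice, `for_parasitic(1234)` would yield ‘_1234’, while `allocate()` would yield ‘_1’, ‘_2’ and so on, reusing a number only after `release()` has been called for it. The sketch does not try to prevent a sequence number from colliding with an infective length; a real lab would need its own convention for that.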
Shoulds
- Avoid the malware writer’s suggested or intended name. This is not as widely agreed as most other guidelines here, but as with not naming malware after its writer, naming malware as its writer intended is now strongly discouraged, as obtaining recognition is a common motivation for malware creators.
- Avoid naming malware after a file that traditionally or conventionally contains the malware. Many mass mailers have sent themselves with a fixed filename for the attachment, but a later variant may use a different filename and become much more widespread and better known. As many, many viruses are parasitic and many, many self-mailers already try to improve their chances by randomly selecting names for their email attachments, we should not falsely condition our users into expecting filenames to be indicative markers of maliciousness. This edict and the previous one are not as widely agreed as most of the other guidelines in this section. However, it is the current author’s belief that it is professionally irresponsible to knowingly increase confusion about malware, and this is one of the most obvious ways in which an ‘everyday practice’ of many malware researchers does exactly that. A whole paper could be written on just this issue, so we’ll not say more about it for now.
- Avoid family names such as ‘Friday_13th’ and ‘September_22nd’, particularly if the dates represent payload triggers. They should not be used as family names, as other members of the family may have different payload trigger dates. Similarly, it can be a good idea to avoid distinctive text from messages displayed by the malware. Remember, code similarity is the basis of family placement, so a name based on some arbitrary and easily altered feature that is unrelated to the malware’s core functionality is likely to be a poor choice.
- Avoid geographic names which are based on the discovery site — the same virus might appear simultaneously in several different places.
- If multiple acceptable names exist, select the original one, the one used by the majority of existing anti-virus programs or the more descriptive one. Generally, the name chosen by the researcher who first ‘isolates’ a piece of malware has primacy, unless it is shown to be a poor choice. In such cases agreement on a better name should be sought by discussion between researchers.
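Most of the guidelines above call for human judgement, but a few are regular enough that a lab could flag likely violations automatically before a name is published. The sketch below is one hedged way to do so. The function name, the warning texts and the patterns are all assumptions made for illustration, and the checks are rough heuristics that will both miss cases and raise false alarms.

```python
import re

_NUMBER_WORDS = "zero|one|two|three|four|five|six|seven|eight|nine"
# Digits, optionally mixed with spelled-out digits, e.g. 'Eight941'.
_NUMERIC = re.compile(rf"(?=.*\d)(?:{_NUMBER_WORDS}|\d)+", re.IGNORECASE)
# An optional single letter followed only by digits, e.g. 'V845'.
_LETTER_NUMBER = re.compile(r"[A-Za-z]?\d+")
# A weekday or month name as a whole underscore-separated token.
_DATE = re.compile(
    r"(?:^|_)(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday|"
    r"january|february|march|april|may|june|july|august|september|"
    r"october|november|december)(?:_|$)",
    re.IGNORECASE,
)
_TEMPORARY = re.compile(r"_\d+")


def lint_family_name(name, known_families=()):
    """Return a list of warnings for a proposed family name (heuristic only)."""
    warnings = []
    if _TEMPORARY.fullmatch(name):
        warnings.append("temporary name: replace it as soon as possible")
    if _NUMERIC.fullmatch(name) or _LETTER_NUMBER.fullmatch(name):
        warnings.append("numeric name: other family members will likely differ in length")
    if _DATE.search(name):
        warnings.append("date-like name: other family members may trigger on other dates")
    if name.lower() in {family.lower() for family in known_families}:
        warnings.append("existing family name: reuse only if the sample belongs to that family")
    return warnings
```

Under these assumptions, ‘Eight941’ and ‘V845’ would draw the numeric warning and ‘Friday_13th’ the date-like warning. A name that passes is not thereby a good name: the rules on offensiveness, author recognition and code similarity still need a researcher’s judgement.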
NAMING.TXT also included directions on naming ‘problem’ cases and some virus types for which good names were not necessarily obvious. They have been removed from this document as they refer solely to simple DOS file and MBR/boot viruses which are not such a concern these days. Although removed from discussion here, those guidelines should still be followed in the cases to which they still apply.