This is the last article in the four-part series “Deep Definition of the Computer Virus”. In this post, we compare the three definitions covered so far: the Cohen Model of Recursive Self-Replicating Virus, the Adleman Model of Malicious Computer Code, and the Virus defined as Violation of Code.
So let’s begin.
Which Computer Virus Definition Is Better?
Let’s compare the approaches used in the different models and definitions of a virus.
In each pair of items below, the first describes the replication-based models of Cohen and Adleman, and the second describes the definition of a virus as a violation of code.
A virus is:
- A function that meets defined criteria and maps an uninfected object to an infected object that is different from it.
- Any function that maps the uninfected object to an infected object that is different from it.
To recognize a virus in an audited object, we need to:
- Conduct a complete analytical or algorithmic study of the properties and behavior of the system consisting of the object and the machine (environment).
- Determine the location of the object in the machine (environment).
The definition of a virus is based on:
- The property of recursive replication.
- The method of comparing the sample with the standard.
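The second approach, comparing a sample with a standard, can be illustrated with a minimal sketch. It assumes that a trusted reference copy of the object exists and that any difference from it counts as a violation. The function names and the sample byte strings are hypothetical and chosen only for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an object's bytes."""
    return hashlib.sha256(data).hexdigest()


def violates_standard(sample: bytes, reference_digest: str) -> bool:
    """Return True when the sample differs from the trusted reference."""
    return digest(sample) != reference_digest


# Hypothetical reference object and two audited samples.
reference = b"print('hello')\n"
reference_digest = digest(reference)

clean_sample = b"print('hello')\n"
modified_sample = b"print('hello')\nextra_code()\n"

clean_result = violates_standard(clean_sample, reference_digest)
modified_result = violates_standard(modified_sample, reference_digest)
print(clean_result, modified_result)
```

Note that the check says nothing about how the object was changed or whether the change replicates. It only needs the object and its standard, not a model of the whole environment.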
Once again, let’s emphasize the essential role of the environment in replication. For example, the character sequences SENDMAIL (8 bytes), COPY (4 bytes), or CP (2 bytes), which even the simplest virus may contain and operate on, carry nothing in themselves that is specific to replication, let alone recursive replication. The same is true of a short byte sequence in machine code that calls a copy function from the operating system API. Considered outside the environment, such a sequence contains neither a replication mechanism, nor its description, nor even a hint of one.
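A toy model may make this point concrete. In the sketch below, the same two-byte object "CP" is interpreted in two hypothetical environments. Each environment is just a dictionary from names to functions, which is an assumption made for illustration and not a model of any real operating system.

```python
from typing import Callable, Dict, List

# A hypothetical environment maps command names to behaviour.
Environment = Dict[str, Callable[[List[str]], List[str]]]


def duplicate_last(store: List[str]) -> List[str]:
    """Return the store with its last item duplicated."""
    return store + store[-1:]


def leave_unchanged(store: List[str]) -> List[str]:
    """Return an unchanged copy of the store."""
    return list(store)


env_a: Environment = {"CP": duplicate_last}
env_b: Environment = {"CP": leave_unchanged}


def run(obj: str, env: Environment, store: List[str]) -> List[str]:
    """Interpret the object in the given environment; unknown names do nothing."""
    action = env.get(obj, leave_unchanged)
    return action(store)


obj = "CP"  # the same two bytes in both runs
result_a = run(obj, env_a, ["item"])
result_b = run(obj, env_b, ["item"])
print(result_a)
print(result_b)
```

The object is identical in both runs. Whether anything is copied is decided entirely by the environment, so inspecting the two bytes alone cannot settle the question.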
Moreover, a real virus may not contain even this information, in the literal sense. Instead, the virus can include, for example, arbitrarily complex cryptographic algorithms whose key is an arbitrary combination of environmental objects that may appear at some indefinite point in the future.
This means that until some unknown moment, when the environment satisfies a condition that the analysis cannot know in advance, the complete absence of any connection between the virus code and the concepts of replication and recursion can be guaranteed with the rigor of a mathematical proof. Yet even then the virus, whose properties merge with those of the environment and which forms with its elements a complex system distributed in space and time, retains the ability to multiply its code.
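The argument can be sketched on harmless data. In the toy example below, the text label COPY is masked with a keystream derived from a value that only the environment supplies. The keystream construction (SHA-256 over the key plus a counter) and the environment values are assumptions chosen for illustration, and nothing is executed: the sketch only shows that the stored bytes reveal nothing until the right value appears.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` bytes from a key by hashing the key plus a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def mask(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the data."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


label = b"COPY"
env_value = b"hostname=example-42"  # hypothetical environment value

blob = mask(label, env_value)  # what an analyst would see in the object
recovered_right = mask(blob, env_value)
recovered_wrong = mask(blob, b"hostname=other")

print(blob.hex())
print(recovered_right, recovered_wrong == label)
```

An analysis that sees only `blob` has no way to connect it with copying. The connection exists only in the combination of the object and a particular state of the environment.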
Thus, in modern large information systems, formalizing and searching for signs of recursive replication in individual objects seems useless even from a theoretical point of view. And considering each object together with the entire environment is impossible. If analyzing a single object requires substituting into an abstract formula complete information about the state of an almost infinite environment (because of its huge size and constant variability), then a dead-end path has been chosen. The Internet is a single self-programming environment, a single computing system, in which events can change the code in any individual system unit. Therefore, a formal definition of a virus that is based on the properties of code would have to take all these events into account, both in the abstract model and in practice. Both are ruled out.
In a global network, where unlimited combinations of direct and feedback links between elements are possible, methods of mass code replication that are unrelated to the classical recursive ones are obvious and have long been in use. If dangerous code infects an arbitrary system unit of this network, it does not matter whether the infection is recursive or whether the child code is generated by the parent through a much more complex algorithm. The algorithm can be anything, and no exact model that is applicable in practice can be built on an unknown algorithm.
It should not be forgotten that in real computing systems, viruses are programmed (that is, controlled) by humans, and whatever humans control hardly lends itself to formalization.
It follows from the above that a formal definition of a computer virus based not on recursive replication but on fundamentally different features may in many cases turn out to be the better one.
This does not mean that the concept of a recursive function has ceased to matter in the context of viruses. It remains extremely important, although it cannot be regarded as the defining one. It does not allow the goal, the phenomenon, the subject of the threat, and its semantic essence to be pinned down precisely with an acceptable mathematical apparatus. It is therefore better to always keep the alternatives in mind. An information security specialist who clearly understands several different definitions of a virus, the corresponding theoretical models, and the related aspects of antivirus protection has the greatest opportunities for solving practical problems.
P.S. Of course, other definitions are possible besides those given in this article. They can be found on the Web or devised independently. The main point to remember is this: according to the norms of formal logic, a sound definition must, at a minimum, make it possible to distinguish the defined object from all others.
Alex Bod is a cybersecurity expert and the CEO of Bod Security, the company that provides Bod Intelligent Antivirus.