Today we will consider three different definitions of the computer virus concept. The first two were formulated by Frederick Cohen and Leonard Adleman in fundamental works of 1984-1992 and are considered classical in modern computer science. Unfortunately, it follows from their works that the task of recognizing an arbitrary virus is unsolvable, and these scientists proved it.
Therefore, the third definition is formulated in such a way that it makes the problem completely solvable. To do this, we will reject the poorly formalized concepts that underlie the classical definitions, and we will classify a wider range of information objects as viruses.
This approach – the solution of a complex problem by replacing the categories in which it is formulated – is often convenient and useful in the practical work of an information security specialist.
The problem of computer viruses clearly manifested itself approximately 35 years after the concept of the “stored-program computer” appeared. The first authors and researchers of viruses were strongly impressed by the previously unseen phenomenon of self-replication (in the biological sense of the term) of man-made objects.
Later, specialists realized that this phenomenon is a property not so much of the objects themselves as of the execution environment – even one as relatively simple as a Turing machine or a von Neumann machine with a typical operating system, not to mention more complex architectures and network environments. It became clear that mass replication of objects is not specific to any one class of programs, but is quite typical of all levels of a complex auto-programmable system. For example, code blocks in the OOP paradigm, or application programs that users download from a central server, are replicated on a massive scale. But despite the discovery of the strategic role of the environment, most people still associate replication with viruses – and it remains the basis of their classical definitions.
Definition of a Computer Virus by Cohen
Fred Cohen, a graduate student at the University of Southern California, became interested in the problem of computer viruses in the early 1980s. In 1984, with the support of his scientific adviser, Leonard Adleman, he wrote the first article on this subject, in which he gave a far from perfect formulation: “We define a computer virus as a program that can infect other programs by modifying them to include a possibly evolved copy of itself”. At first, everything seemed simple, but as he developed the topic further, Cohen discovered its extreme complexity and the key importance of the environment – both for the replication process and for deciding whether an arbitrary information object is a virus.
In his 1985 book and his 1986 thesis, Cohen gave a strict definition of a virus in which he concentrated on one feature – recursive replication. The definition was given only for an abstract model based on a Turing machine (note that a real computer is usually less predictable than its ideal model). The Cohen model considers no features other than recursive replication. It is a good model of one particular case, recursively reproducing algorithms, but a poor model of real computer viruses – especially given the observed diversity of their types and the fact that strict recursion is not required for propagation.
But the Cohen model of a virus became a classic because it is simple and clear, and because it takes into account the role of the execution environment, which is necessary for understanding the essence of the problem.
In this model, to determine whether an information object (a finite sequence of symbols, a program, a piece of code) is a virus, it must be considered exclusively in the context of the machine environment, paired with it and interacting with it – for as many cycles ahead as the machine needs to execute the code. A virus is defined as code whose execution causes a copy of that code to be written elsewhere on the tape of the Turing machine at a later point in the execution.
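The definition above can be illustrated with a minimal sketch. It assumes a toy tape machine with only three hypothetical instructions (`COPY`, `JMP`, `HALT`) rather than a real Turing machine, and the names `run_and_watch` and `has_copy_elsewhere` are invented for this illustration. The code is placed at the start of the tape, executed step by step, and the tape is inspected after every step for a second copy of the code.

```python
from typing import List, Optional

BLANK = "_"  # symbol that fills the unused part of the tape


def has_copy_elsewhere(tape: List, code: List) -> bool:
    """True if `code` occurs on the tape anywhere except its start position 0."""
    n = len(code)
    return any(tape[i:i + n] == code for i in range(1, len(tape) - n + 1))


def run_and_watch(code: List, max_steps: int = 100, tape_size: int = 32) -> Optional[bool]:
    """Run `code` on the toy machine and watch the tape.

    Returns True if a copy of the code appeared, False if the machine halted
    without producing one, and None if the step budget ran out first.
    Operands are assumed to be valid integer tape positions.
    """
    tape = list(code) + [BLANK] * (tape_size - len(code))
    pc = 0  # position of the instruction being executed
    for _ in range(max_steps):
        op = tape[pc] if pc < len(tape) else "HALT"
        if op == "COPY" and pc + 3 < len(tape):
            # COPY src length dst: copy `length` cells from `src` to `dst`
            src, length, dst = tape[pc + 1:pc + 4]
            tape[dst:dst + length] = tape[src:src + length]
            pc += 4
        elif op == "JMP" and pc + 1 < len(tape):
            # JMP target: continue execution at `target`
            pc = tape[pc + 1]
        else:
            # HALT, or any symbol the toy machine does not understand
            return False
        if has_copy_elsewhere(tape, code):
            return True
    return None  # undecided: the code may or may not replicate later


replicator = ["COPY", 0, 5, 8, "HALT"]  # copies its own 5 cells to position 8
inert = ["HALT"]                        # stops immediately
looper = ["JMP", 0]                     # never halts, never copies

print(run_and_watch(replicator), run_and_watch(inert), run_and_watch(looper))
```

The third result is the interesting one: within any fixed step budget, the observer can only answer "not yet" for `looper`, which is the bounded version of the waiting problem.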
But for a Turing machine (as its author proved in 1936), it is impossible in general to predict the future. For arbitrary code, the result of its execution – the sequence of machine tape (memory) states for an unlimited number of cycles ahead – is unpredictable. The only way to find out how this code will end is to test it in practice. In other words, to find out whether the code is a virus, we must run it and see what happens. Given the uncertainty of the result, doing this on a real system is definitely unsafe. In addition, the time we have to wait for the code to finish can be arbitrarily large (even infinite), and without knowing what launching the code will lead to, we cannot judge whether it is a virus.
Note that analyzing the code without running it is impossible even from the most convenient position – that of an external observer with unlimited means for studying the machine and the code in their initial static states. Moreover, it is impossible to analyze the code without executing it in the machine environment, or to analyze it while the machine is executing other code.
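The reason no static detector can work in general is usually shown by a diagonal construction, sketched below under simplifying assumptions. Here `detector` stands for any hypothetical procedure that always terminates and answers "virus" or "not a virus", and replication is modelled by a plain boolean return value; the names `make_contrary` and `detector_is_wrong` are invented for this illustration.

```python
from typing import Callable

Program = Callable[[], bool]           # returns True if it replicated in this run
Detector = Callable[[Program], bool]   # returns True if it declares the program a virus


def make_contrary(detector: Detector) -> Program:
    """Build a program that does the opposite of what `detector` predicts for it."""
    def contrary() -> bool:
        if detector(contrary):
            return False  # declared a virus: stay inert
        return True       # declared clean: replicate
    return contrary


def detector_is_wrong(detector: Detector) -> bool:
    """Compare the detector's verdict on the contrary program with its actual behaviour."""
    program = make_contrary(detector)
    verdict = detector(program)   # what the detector claims
    actual = program()            # what the program really does
    return verdict != actual


# Whatever rule the detector follows, it misjudges the program built against it.
print(detector_is_wrong(lambda program: True))
print(detector_is_wrong(lambda program: False))
```

A detector that tried to escape the trap by actually running the program would call itself again through `contrary` and never return, which corresponds to the unbounded waiting time described above.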
In addition to the disappointing conclusion about the impossibility of reliable detection of viruses in his model, Cohen proved the following:
- For an arbitrary code there is a machine which interprets it as a virus; for example, for some machine, a code consisting of a single byte will be a virus;
- Some machines interpret any code as a virus;
- Some machines do not interpret any code as a virus.
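These three statements can be made concrete with toy machines. The sketch below is only an illustration, not Cohen's formal construction: a "machine" is modelled as a hypothetical Python function that maps the initial code to the final tape, and code counts as a virus on that machine if a second copy of it appears on the tape.

```python
from typing import Callable, List

Machine = Callable[[List[int]], List[int]]  # maps the initial code to the final tape


def is_virus_on(machine: Machine, code: List[int]) -> bool:
    """True if running `code` on `machine` leaves a copy of it beyond position 0."""
    code = list(code)
    tape = machine(list(code))
    n = len(code)
    return n > 0 and any(tape[i:i + n] == code for i in range(1, len(tape) - n + 1))


def machine_for(target: List[int]) -> Machine:
    """A machine built so that exactly `target` replicates on it."""
    target = list(target)

    def machine(code: List[int]) -> List[int]:
        return code + code if code == target else code

    return machine


def duplicator(code: List[int]) -> List[int]:
    """Copies whatever it is given: every non-empty code is a virus here."""
    return code + code


def inert(code: List[int]) -> List[int]:
    """Never writes anything: no code is a virus here."""
    return code


one_byte = [0x90]
samples = [one_byte, [1, 2, 3], [7, 7]]

print(is_virus_on(machine_for(one_byte), one_byte))   # the one-byte code on its own machine
print([is_virus_on(duplicator, s) for s in samples])  # every sample replicates
print([is_virus_on(inert, s) for s in samples])       # no sample replicates
```

The same byte sequence is a virus on one machine and harmless on another, which is why the question cannot be answered by looking at the code alone.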
Subsequently, Cohen repeatedly returned to this topic. In 1992, for example, he published an article that gave a strict definition of the term “computer worm” (a special case of a virus for some environments, in particular multiprocessing ones). The article confirmed the earlier conclusion: recognition of viruses and similar objects is impossible in the general case.
To be Continued…
Alex Bod is a cybersecurity expert and the CEO of Bod Security, Bod Intelligent Antivirus provider company.