══════════════════════════════
SCAN STRINGS, HOW THEY WORK,
AND HOW TO AVOID THEM
══════════════════════════════
By Dark Angel
══════════════════════════════

Scan strings are the scourge of the virus author and the friend of anti-
virus wanna-bes. The virus author must find encryption techniques which
can successfully evade easy detection. This article will show you several
such techniques.

Scan strings, as you are well aware, are a collection of bytes which an
anti-viral product uses to identify a virus. The important thing to keep
in mind is that these scan strings represent actual code and can NEVER
contain code which could occur in a "normal" program. The trick is to use
this to your advantage.

When a scanner checks a file for a virus, it searches for the scan string
which could be located ANYWHERE IN THE FILE. The scanner doesn't care
where it is. Thus, a file which consists solely of the scan string and
nothing else would be detected as infected by a virus. A scanner is
basically an overblown "hex searcher" looking for 1000 signatures.
Interesting, but there's not much you can do to exploit this. The only
thing you can do is to write code so generic that it could be located in
any program (by chance). Try creating a file with the following debug
script and scanning it. This demonstrates the fact that the scan string
may be located at any position in the file.

---------------------------------------------------------------------------

n marauder.com
e 0100 E8 00 00 5E 81 EE 0E 01 E8 05 00 E9

rcx
000C
w
q

---------------------------------------------------------------------------

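The "hex searcher" view of a scanner can be sketched in a few lines of
Python. This is only a rough illustration, not how any particular product
works: the function name find_signature is made up for the example, and the
signature is simply the 12 bytes from the debug script above (whether a
given scanner flags them is the article's claim, not something the code
checks).

```python
# Illustrative sketch: a scanner reduced to a plain byte search.
# SIGNATURE is the 12-byte string taken from the debug script above.
SIGNATURE = bytes.fromhex("E800005E81EE0E01E80500E9")


def find_signature(data: bytes, signature: bytes = SIGNATURE) -> int:
    """Return the offset of the first match, or -1 if there is none."""
    return data.find(signature)


# The match does not depend on where the bytes sit in the file.
alone = SIGNATURE                                   # file is only the string
padded = b"\x90" * 100 + SIGNATURE + b"\x00" * 20   # string buried in filler
clean = b"\x90" * 132                               # no signature at all

print(find_signature(alone))   # 0
print(find_signature(padded))  # 100
print(find_signature(clean))   # -1
```

A real scanner repeats this search for every signature in its database,
which is all the "1000 signatures" remark amounts to.
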
Although scanners normally search for decryption/encryption routines, in
Marauder's case, SCAN looks for the "setup" portion of the code, i.e.
setting up the "delta offset" register (SI in the bytes above), calling the
decryption routine, and finally jumping to program code.

What you CAN do is to either minimise the scannable code or to have the
code constantly mutate into something different. The reasons are readily
apparent.

The simplest technique is having multiple encryption engines. A virus
utilising this technique has a database of encryption/decryption engines
and uses a random one each time it infects. For example, there could be
various forms of XOR encryption or perhaps another form of mathematical
encryption. The trick is to simply replace the code for the encryption
routine each time with the new encryption routine.

Mark Washburn used this in his V2PX series of virii. In it, he used six
different encryption/decryption algorithms, and some mutations are
impossible to detect with a mere scan string. More on those later.

Recently, there has been talk of the so-called MTE, or Mutation Engine,
from Bulgaria (where else?). It utilises the multiple encryption engine
technique. Pogue Mahone used the MTE and it took McAfee several days to
find a scan string. Vesselin Bontchev, the McAfee-wanna-be of Bulgaria,
marvelled at the engineering of this engine. It is distributed as an OBJ
file designed to be linked into any virus. Supposedly, SCANV89 will be
able to detect any virus using the encryption engine, so it is worthless
except for those who have an academic interest in such matters (such as
virus authors).

However, the multiple encryption technique has a serious limitation: scan
strings may still be found. On the other hand, a separate scan string must
be isolated for each encryption mechanism, and there is the added
possibility that the antivirus software developers will miss some of the
encryption mechanisms, so not all the strains of the virus will be caught
by the scanner.

Now we get to a much better (and sort of obvious) method: minimising scan
code length. There are several viable techniques which may be used, but I
shall discuss but three of them.

The one mentioned before, which Mark Washburn used in V2P6, was interesting.
He first filled the space reserved for the encryption mechanism with dummy
one-byte op-codes such as CLC, STC, etc. As you can see, the flag
manipulation op-codes were exploited. Next, he randomly placed the parts
of his encryption mechanism in parts of this buffer, i.e. the gaps between
the "real" instructions were filled in with random dummy op-codes. In this
manner, no generic scan string could be located for this encryption
mechanism of this virus. However, the disadvantage of this method is the
sheer size of the code necessary to perform the encryption.

A second method is much simpler than this and possibly just as effective.
To minimise scan code length, all you have to do is change certain bytes at
various intervals. The best way to do this can be explained with the
following code fragment:

        mov     si, 1234h                  ; Starting location of encryption
        mov     cx, 1234h                  ; Virus size / 2 + variable number
loop_thing:
        xor     word ptr cs:[si], 1234h    ; Decrypt the value
        add     si, 2
        loop    loop_thing

In this code fragment, all the values which can be changed are set to 1234h
for the sake of clarity. Upon infection, all you have to do is to set
these variable values to whatever is appropriate for the file. For
example, mov si, 1234h would have to be changed to have the encryption
start wherever the virus would be loaded into memory (huh?). Ponder this
for a few moments and all shall become clear. To substitute new values
into the code, all you have to do is something akin to:

        mov     [bp+scratch+1], cx

Where scratch is the label of an instruction. The exact value to add to
scratch depends on the coding of the op-code. Some op-codes take their
argument as the second byte, others take the third. Regardless, it will
take some tinkering before it is perfect. In the above case, the
"permanent" code is limited to under five or six bytes. Additionally,
these five or six bytes could theoretically occur in ANY PROGRAM
WHATSOEVER, so it would not be prudent for scanners to search for these
strings. However, scanners often use scan strings containing wild-card
characters, so it is still possible for a scan string to be found.

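The wild-card point is worth seeing from the scanner's side. The sketch
below is an assumption-laden illustration, not the matching logic of any
real product: find_wildcard and PATTERN are hypothetical names, and the
byte values are placeholders rather than a real signature. A None entry
stands for "any byte", so changing the variable bytes between the fixed
ones does not stop the match.

```python
from typing import Optional, Sequence

# None marks a wild-card position; every other entry must match exactly.
Pattern = Sequence[Optional[int]]


def find_wildcard(data: bytes, pattern: Pattern) -> int:
    """Return the offset of the first match of pattern in data, or -1."""
    n = len(pattern)
    for start in range(len(data) - n + 1):
        window = data[start:start + n]
        if all(p is None or p == b for p, b in zip(pattern, window)):
            return start
    return -1


# Hypothetical pattern: one fixed byte, two variable bytes, two fixed bytes.
PATTERN = [0xAA, None, None, 0xBB, 0xCC]

sample_a = bytes([0x00, 0xAA, 0x12, 0x34, 0xBB, 0xCC])  # variable part 12 34
sample_b = bytes([0xAA, 0xFF, 0xEE, 0xBB, 0xCC, 0x00])  # variable part FF EE
sample_c = bytes([0xAA, 0x12, 0x34, 0xBB, 0xCD])        # fixed byte differs

print(find_wildcard(sample_a, PATTERN))  # 1
print(find_wildcard(sample_b, PATTERN))  # 0
print(find_wildcard(sample_c, PATTERN))  # -1
```

The fewer fixed bytes a pattern has, the more likely it is to match
innocent programs as well, which is the trade-off the paragraph above
describes.
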
The important thing to keep in mind when using this method is that it is
best for the virus to use separate encryption and decryption engines. In
this manner, shorter decryption routines may be found and thus shorter scan
strings will be needed. In any case, using separate encryption and
decryption engines increases the size of the code by at most 50 bytes.

The last method detailed is theft of decryption engines. Several shareware
products utilise decryption engines in their programs to prevent simple
"cracks" of their products. This is, of course, not a deterrent to any
programmer worth his salt, but it is useful for virus authors. If you
combine the method above with this technique, the scan string would
identify the product as being infected with the virus, which is a) bad PR
for the company and b) unsuitable for use as a scan string. This technique
requires virtually no effort, as the decryption engine is already written
for you by some unsuspecting PD programmer.

All the methods described are viable scan string avoidance techniques
suitable for use in any virus. After a few practice tries, scan string
avoidance should become second nature and will help tremendously in
prolonging the effective life of your virus in the wild.