Hi, minions:
This is another of those pieces that got stuck in a corner, waiting to be
published, and whose beginnings go back to the first months of 2018. In fact,
the original name of the project was 'OP Carrot', in honor of the 'Follow the
White Rabbit' family, a band of 'dangerous hackers' to whom I owe a lot. A big
hug from this side of the screen, friends!
I tried to find the right moment to present it, but circumstances kept
pushing it back, until one day I decided to submit it to the CFP of the first
C1b3rwall Congress, together with Eduardo Sánchez Toril, in a proposal that
combined a 'red team' scenario with a 'blue team' one. We were among those
selected and had the opportunity to present this work on stage.
A good friend says, "If you can write about something, you have
understood it. And, if you've understood it, you're able to tell it. Therefore,
if you are able to tell it, you can make others understand it. That's how
you make the chain of knowledge."
This paper is about the forensic analysis of virtual machines running under
VirtualBox.
Increasingly, those of us on the side of the law are using virtualization. It
is a technology with many advantages when it comes to conducting
investigations. There is no doubt about it. But these advantages can also
become major drawbacks when we are the ones confronted with them, because the
'bad guys' are also increasingly using virtualization to carry out their
misdeeds.
First of all, and I promise to keep this explanation short, we must know what
virtualization is, what it is for and what it does: the definitions we need in
order to follow the rest. There is a lot of material published on the subject,
including some great articles, but I have chosen a couple of short definitions
published by Microsoft for this purpose:
What is virtualization?
"Virtualization creates a simulated,
or virtual, computing environment instead of a physical one. It often
includes computer-generated versions of hardware, operating systems, storage
devices, etc."
What is a virtual machine?
"A virtual machine is a PC file, usually called an
image, that behaves just like a real computer. In other words, it's creating a
computer within a computer."
Getting back to the work in question... I originally worked on a Windows 7
host, on which I ran, with VirtualBox, a series of virtual machines, also on
Windows 7. When I picked the project up again, I worked on a Windows 10 host,
on which I ran, with VirtualBox, a series of Windows 7 virtual machines. The
tests carried out on the two systems produced results with some slight
differences. All of this on an 80 GB mechanical SATA hard disk. The objective I
set when I started this project was to extract, or locate, the content of the
virtual machines under a series of scenarios, which I describe below.
I have to clarify that the following story is purely fictional. Call it a
short piece of literature.
In this story, a police operation, which I have called 'OP Tanjawi', is carried
out against the glorification of Salafist jihadist terrorism, indoctrination
and self-indoctrination, resulting in the arrest of an individual and the
seizure of assorted digital material, including multiple hard disks with
virtual machines and a wide variety of USB storage devices.
A word of advice: Never, ever, ever discard any object, however ridiculous it may
seem. (You'd be surprised).
Introduction
We are proceeding with the operational phase of an operation called 'OP
Tanjawi'. The main person under investigation is glorifying Salafist jihadist
terrorism and carrying out both self-indoctrination and the indoctrination of
third parties. We know, through trusted external collaborators (yes, a snitch),
that he carries out these tasks by means of USB devices, which he shares with
these third parties and inserts in his computer, inside virtual machines, for
later viewing, both alone and in the company of others. We know that the person
under investigation has knowledge of computer security, so we must act quickly
and cautiously (as always).
We gain access to the house by breaking down the door with the appropriate
operational team... and we observe that the person under investigation is
using his system...
- "Boss, the computer's on, what do we do?"
- "Assess the situation!!"
Recreated scenarios
Several possible scenarios have been recreated using a virtualized Windows 7
system under VirtualBox, on a host
computer with a Windows 10 system. As mentioned above, the computer has a
mechanical 80 GB SATA hard disk.
1. Virtual machine started
   a. Non-encrypted
   b. Encrypted
2. Virtual machine stopped
   a. Non-encrypted
   b. Encrypted
3. Virtual machine deleted
   a. Non-encrypted
   b. Encrypted
4. Virtual machine deleted and fragmented
   a. Non-encrypted
   b. Encrypted
As if that were not enough, within these scenarios I considered others, related
to how to proceed with the computer:
1. Collect the volatile evidence with the physical computer on.
2. Shut down the physical computer normally.
3. Disconnect the physical computer from the mains.
(For each scenario, not counting the original, a corresponding forensic image
of the hard drive was created. Do the math and you can imagine how many hours
that hard drive had to suffer... and so did I.)
With regard to the actions carried out in the system, they were the following:
1. Inserting a USB device in the virtual machine.
2. Copying two files from the USB device to a folder in the virtualized system.
3. Removing the USB device from the system.
4. Displaying an image file, named 'Crushing the enemy.jpg'.
5. Playing a video file, named 'Inside Khilfah.mp4'.
6. The rest of the content on the USB device was left unexplored.
Given the volume of data I have worked with, I am not going to present all the
results obtained, only those that I think are of interest. Otherwise this would
be a much longer article than usual.
To know...
Windows has its anti-forensic side. Windows writes files to the hard drive
constantly. On Windows 7 this happens less than on Windows 10, because Windows
10 works with more than twice as many processes as Windows 7.
By default, Windows has a number of automatic maintenance tasks scheduled. Two
of them are very dangerous: 'ScheduledDefrag', which takes care of
defragmenting the hard disk (mechanical disks only), and 'SilentCleanup', which
silently cleans up the disk when free space is scarce. If the device has a
battery, automatic maintenance can be avoided by disconnecting it from mains
power.
If the computer is off, it should not be turned on. There should be no doubt in
that case. If the computer is on, you must proceed to capture the RAM, search
for encrypted drives, ... Always?
Memory is a mine of information, very volatile, but always?
I'll leave that question in the air for now.
In this case, we are going to use the Volatility tool to analyze the RAM. As
always, each case will call for one set of plugins or another. We are going to
use the following ones:
● mftparser
● dumpfiles
● vaddump
We are also going to perform a file
recovery over the RAM, using the Foremost tool.
Once the RAM has been dumped, we have to create the forensic images of the hard
disk (images, plural: not just one).
From those forensic images of the hard disk we are going to extract files of
interest with the SleuthKit tool, a framework made up of a set of small
utilities that can process disk images. SleuthKit can extract both existing and
deleted files.
Since we are going to encounter encrypted virtual machines, we are going to
brute-force them, thanks to the invaluable help of Guillermo Román Ferrero, in
order to recover their encryption key.
We're going to look for virtual hard drives hosted within the hard drive
itself, within the forensic image itself.
We're going to search for specific, raw content by scanning the entire forensic
image of the disk for a text string that we specify.
And the purpose of this whole procedure is to detect illicit content,
indications of it, or possible signs that evidence has been destroyed.
Memory analysis with Volatility, (mftparser)
Before starting the actual analysis of the RAM dump, it must be correctly identified. This process is
carried out by means of the kdbgscan
plugin, paying attention to the values returned by the 'Build string (NtBuildLab)' and 'KdCopyDataBlock' fields, and the 'PsActiveProcessHead' field must contain a non-zero value.
We could use the timeliner
plugin to make a timeline, where we could see the execution of applications,
the hours of device connections to the system and some other data. But
timeliner only shows information about
the host computer.
Consider that everything in the NTFS file system is a file, and that every file
in the NTFS file system is recorded in the MFT. If we add the fact that the
MFTs of every volume mounted in the system are loaded in memory, it follows
that we can extract content from any connected device, even if it has not been
explored.
In the following image we can see 3 different MFTs. We find the MFTs
corresponding to volume 1 and volume 2, and we can also see an MFT that
corresponds to 'HarddiskVolume209' and belongs to a virtual machine.
Therefore we are going to build a timeline from the MFT(s).
What is the MFT? The MFT is the Master File Table. In a very summarized way, it
is an index that indicates in which part, or parts, of the disk a given file is
stored, and that holds, among many other data, all the metadata of that file.
All this information is stored in the records of the MFT, one record for each
file.
If we run:
● mftparser --output=body --output-file MFTParser.txt -D MFT
● mactime -b MFTParser.txt >> MFTParser.csv
We will get a timeline of the MFT(s), as a tabbed '.csv' file, and we will also
dump all the content resident in them. Because it is possible to find resident
data in the MFT: files small enough to be stored inside their own MFT record,
which therefore occupy no clusters of their own on the disk.
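The body file that feeds 'mactime' is plain text, so it can also be searched directly. A minimal sketch, assuming the pipe-separated body format used by The Sleuth Kit 3.x (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime); the function names and the sample line in the usage note are made up for illustration:

```python
from datetime import datetime, timezone

# Field order of the body format consumed by mactime (assumed: TSK 3.x layout).
FIELDS = ("md5", "name", "inode", "mode", "uid", "gid", "size",
          "atime", "mtime", "ctime", "crtime")
TIME_FIELDS = ("atime", "mtime", "ctime", "crtime")


def parse_body_line(line):
    """Parse one body-file line into a dict, or return None if malformed."""
    line = line.rstrip("\r\n")
    if line.count("|") < len(FIELDS) - 1:
        return None
    # The name may itself contain '|', so peel fields from both ends.
    md5, rest = line.split("|", 1)
    name, *tail = rest.rsplit("|", len(FIELDS) - 2)
    record = dict(zip(FIELDS, [md5, name, *tail]))
    for key in TIME_FIELDS:
        try:
            seconds = int(record[key])
        except ValueError:
            return None
        # Body-file timestamps are seconds since the Unix epoch.
        record[key] = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return record


def search_body(lines, keywords):
    """Yield parsed records whose name contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    for line in lines:
        record = parse_body_line(line)
        if record and any(k in record["name"].lower() for k in wanted):
            yield record
```

For instance, `search_body(open("MFTParser.txt", errors="replace"), ["khilfah"])` would yield only the records whose path mentions that keyword, each with its four timestamps already converted to UTC.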
If we run:
● mftparser --output-file=/mnt/c/Results/mftparser.txt
We will get a detailed view of all the content in all the MFTs: running
applications, user names, paths of the different virtual machines, the names of
the virtual machines themselves, the timestamps of the virtual machines,
content stored in the virtual machines, ...
In this way, and as an example, we can see that the user 'MC' has opened an
image file, because we can get the recent documents.
Or we can see that a video named 'Inside Khilfah.mp4' is stored inside the
'Downloads' folder of the user 'MC'.
We can even get the unexplored content of the USB device that we used to
recreate the scenarios, obtaining folder names and file names with their
corresponding timestamps.
All this content belongs to a virtual machine. In short, we can see any content
of any device, of any volume, that is or has been present in the system,
whether it has been opened and explored or not.
Memory analysis with Volatility, (dumpfiles)
RAM caches (stores) all the files that the user opens, but it also caches files
that the user has not opened. This content is stored temporarily, so we can
find out which files are loaded in memory using 'filescan'.
Just because a file is loaded into memory does not mean that it can be
extracted. Remember that memory is very volatile and is constantly changing.
We can use 'dumpfiles' to dump files that are cached in memory and extract, for
example, virtual machine logs and virtual machine configuration files.
If we run:
● dumpfiles -n -S Summary.txt -D Dumpfiles
We're going to dump all the files that are cached in memory. In this case we
can find, for example, the configuration file of the virtual machine's hard
disk, 'Win7x64.vmdk'.
Memory analysis with Volatility, (vaddump)
We can list the system processes using 'pslist' to see the running
applications, with their position in memory, their name, their ID, their parent
process, ...
That list also gives us the start date and time of each process and, for
processes no longer active at the time of acquisition, their exit date and
time.
If the processes related to VirtualBox are running, we dump the memory pages of
those processes using 'vaddump', where <PID> is the process ID:
● vaddump -p <PID> -D Vaddump/
And if the processes related to VirtualBox are not running, we dump the memory
pages of the other processes anyway. Why? Because each process is assigned a
private memory space, but also a memory space shared with other processes, so
the same content can be found in either of those two memory spaces.
In this case we will extract, from the memory pages, the content of the virtual
machine, knowing its position in memory and the process to which it belongs.
In this way, for example, we can see, several times over, the two files hosted
in the virtual machine, and we can see that they have been opened, even with
the date on which they were displayed.
We can see the contents of a folder on the USB device.
We can see that a video has been played, and we can determine the system name
of the virtual machine and the user who played it.
We can even see and extract the '.vbox' configuration file of a virtual
machine, with its corresponding encryption key.
This file can be reconstructed and can be decrypted.
Recovery of files in memory, (Foremost)
File recovery is something "easy" to do and does not require "too much time".
With carving we recover files by means of their headers, footers and data
structures.
Both complete files and file fragments can be recovered and extracted. We can
obtain videos, or frames from videos, images, documents, ... any file type we
specify.
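To make the idea of header and footer carving concrete, here is a deliberately simplified sketch for JPEG only. It is not how Foremost is implemented; it only assumes the well-known JPEG start marker (FF D8 FF) and end marker (FF D9), and the function name and size limit are my own:

```python
JPEG_HEADER = b"\xff\xd8\xff"   # start-of-image marker plus first marker byte
JPEG_FOOTER = b"\xff\xd9"       # end-of-image marker


def carve_jpegs(data, max_size=10 * 1024 * 1024):
    """Return (offset, bytes) pairs for JPEG-looking spans in a bytes buffer."""
    results = []
    pos = 0
    while True:
        start = data.find(JPEG_HEADER, pos)
        if start == -1:
            break
        # Look for the footer only within max_size bytes of the header.
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            pos = start + len(JPEG_HEADER)   # header without footer: a fragment
            continue
        results.append((start, data[start:end + len(JPEG_FOOTER)]))
        pos = end + len(JPEG_FOOTER)
    return results
```

Each hit keeps the offset where the header was found, which is what later lets us tie a carved file back to the memory page, and therefore to the process, it came from. A real carver also has to deal with embedded thumbnails and fragmented data, which this sketch ignores.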
As a general rule, carving is done on the forensic images themselves, whether
they are of memory or of disk. But this time we are going to do it on the
directory of memory pages that we dumped earlier. Why? Two reasons.
Firstly, because if we carve the image of the memory dump, we can obtain
results that disconcert us, so to speak. For example, if we inserted a USB
device in the computer to capture the RAM and that device holds other tools
besides the memory capture tool itself, we are going to obtain, for example,
images or fragments belonging to those other tools. In other words, we are
contaminating the evidence.
If we were 'twisted', we could have placed another kind of illicit content on
the USB device we use for the memory capture so that, when the memory dump is
carved, that content appears in the results. That is, we could incriminate a
person without their being aware of it. And we are not twisted. We seek the
truth.
Secondly, we carve the memory pages because, if we carve the memory dump, we
may not be able to determine where a file came from, while if we carve the
memory pages we can determine that it belongs to a certain process, a virtual
machine, and that it was opened and viewed within a time window, between the
start and the exit of the process related to the virtual machine.
That is, if we carve the directory of memory pages, we can determine the source
of each file.
In this case, we could compare the results obtained with a
database, if available, with other original content. Or we could pay special
attention to symbolism. A little OSINT never hurt anyone.
Hard disk image analysis, (Virtual machine not deleted)
It has already been mentioned that RAM is a gold mine as far as the information
it holds is concerned. It has also been mentioned that Windows has its
anti-forensic side, such as the maintenance tasks or the very fact that Windows
constantly writes files to the hard disk. The latter is very harmful for
forensic analysis when it comes to recovering and extracting deleted content
from the system.
You should know that a virtual machine has logs, has configuration files, has
at least one hard disk file and can be encrypted.
You should also know that a hard disk file is a file: it has a header and it
can be fragmented. Likewise, a hard disk file behaves like a hard disk: it has
a header and contains information.
For all these reasons, time is a very critical factor and we must know how to
act at all times.
So, if the virtual machine has not been deleted and is not encrypted, we can
mount the forensic image of the hard disk, then mount the virtual machine's
hard disk, and browse and extract the content hosted in that virtual machine.
If the virtual machine has not been deleted and is encrypted, we can mount the
forensic image of the hard disk, extract the configuration files and
brute-force, with the tool 'vboxDieCracker-py', the '.vbox' file, which is the
one that stores the encryption key. Once the encryption key of the virtual
machine has been recovered, we can make a copy of the virtual machine,
configure it in our laboratory and remove its encryption in order to analyze it
like any other system.
Hard disk image analysis, (Virtual Machine deleted)
If the virtual machine has been deleted, the record for that file is marked as 'Deleted' in the MFT, but its contents, its clusters, remain intact until its information is overwritten.
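In practice, 'marked as deleted' comes down to a flag in the header of the MFT record. A minimal sketch, assuming the commonly documented NTFS record layout (signature 'FILE' at offset 0, a 16-bit flags field at offset 22 where bit 0x0001 means 'in use' and bit 0x0002 means 'directory'); the function name is hypothetical:

```python
import struct

FLAG_IN_USE = 0x0001      # cleared when the file is deleted
FLAG_DIRECTORY = 0x0002   # set when the record describes a directory
FLAGS_OFFSET = 22         # assumed offset of the flags field in the header


def mft_record_state(record):
    """Describe an MFT record from its raw bytes, or return None if invalid."""
    if len(record) < FLAGS_OFFSET + 2 or record[:4] != b"FILE":
        return None
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    return {
        "deleted": not flags & FLAG_IN_USE,
        "directory": bool(flags & FLAG_DIRECTORY),
    }
```

A record whose flags no longer carry the 'in use' bit still holds the file name, timestamps and cluster runs, which is exactly why a deleted file can be recovered as long as those clusters have not been reused.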
When Windows creates a file, it assigns it the lowest free record number in the
MFT, and likewise the first clusters that are not associated with any MFT
record.
When does Windows create a file? Simply hovering the mouse cursor over an icon
is enough. For this reason, you should touch the system only for basic,
essential tasks.
The virtual machine's hard disk file may remain intact in the forensic image
and could be seen when browsing that image, provided some conditions are met.
Namely, that the system has not started maintenance tasks, that the system is
disconnected quickly once the virtual machine has been deleted, and that no
clusters assigned to that virtual machine's hard disk file have been
overwritten.
There are several options for recovering that content, that virtual machine.
We're going to use a very complete and powerful framework called The SleuthKit.
We can use 'fls' to list the MFT entries and then, using 'icat', extract the
content we are interested in.
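When the 'fls' listing is long, it helps to filter it for deleted virtual machine files before calling 'icat'. A rough sketch, assuming the usual 'fls' text output, in which a deleted entry carries a '*' between the type and the metadata address (for example 'r/r * 41234-128-1:' followed by the name); the address in that example, the function name and the default list of extensions are made up for illustration:

```python
import re

FLS_LINE = re.compile(
    r"^(?P<depth>\+*)\s*"            # '+' markers printed when listing recursively
    r"(?P<type>\S/\S)\s+"            # e.g. r/r (file) or d/d (directory)
    r"(?P<deleted>\*\s+)?"           # '*' marks a deleted entry
    r"(?P<meta>\d+(?:-\d+-\d+)?)"    # MFT record, optionally with -type-id
    r"(?:\(realloc\))?:\s*"
    r"(?P<name>.*)$"
)


def deleted_entries(fls_output, suffixes=(".vdi", ".vmdk", ".vbox")):
    """Return (metadata address, name) for deleted entries with given suffixes."""
    found = []
    for line in fls_output.splitlines():
        match = FLS_LINE.match(line)
        if not match or not match.group("deleted"):
            continue
        name = match.group("name")
        if name.lower().endswith(suffixes):
            found.append((match.group("meta"), name))
    return found
```

The first element of each pair is the address that 'icat' expects, so every hit can be handed to it directly.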
Hard disk image analysis, (Virtual Machine deleted and fragmented)
If the virtual machine has been deleted and its clusters have been
overwritten... and the virtual machine was not encrypted, we can search for
hard drives within the forensic image of the hard drive, using:
● xxd -c 256 imagen.dd | grep -i -P "00(([0-9A-F]8)|([1-9A-F]0))\s0000\s([0-9A-F]{4}\s){26}55aa\s\s"
That line shows us the offset of the boot sector of each of the hard disks,
both physical and virtual, that are in the forensic image. In this case, we
have the physical disk in first position (at offset 100, in hexadecimal as
printed by xxd) and 3 virtual disks.
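Spelled out, the pattern looks for 512-byte sectors that end with the 55 AA signature and whose first partition entry starts at a 1 MiB-aligned sector below 65536 (the four bytes at offset 454 of the sector). The same test can be sketched as follows; this is my reading of the expression, not a replacement for it, and the function name is hypothetical:

```python
import struct

SECTOR = 512              # bytes per sector
SIGNATURE = b"\x55\xaa"   # last two bytes of a boot sector
LBA_FIELD = 454           # start LBA of the first partition entry (446 + 8)


def find_boot_sectors(buf):
    """Return byte offsets of sectors that look like an MBR in a bytes-like buffer."""
    hits = []
    for offset in range(0, len(buf) - SECTOR + 1, SECTOR):
        if buf[offset + 510:offset + 512] != SIGNATURE:
            continue
        (first_lba,) = struct.unpack_from("<I", buf, offset + LBA_FIELD)
        # Same constraint as the regular expression: non-zero, below 0x10000
        # and a multiple of 0x800 sectors (2048 sectors = 1 MiB).
        if 0 < first_lba < 0x10000 and first_lba % 0x800 == 0:
            hits.append(offset)
    return hits
```

Note that this returns the offset where the sector starts, while xxd prints the offset of the 256-byte line that holds the signature, which is 0x100 further on. For a real image the buffer could be a read-only `mmap` of the file rather than the whole file loaded in memory.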
With this data we can manually extract the content starting at the offset that
interests us, from a position on the hard disk, and proceed to study it.
Similarly, if the virtual machine has been deleted and its clusters have been
overwritten... and the machine was not encrypted, we can search for the content
we want and sometimes extract it.
I like to think that we all have a list of keywords for our investigations.
● blkls -e -f ntfs -o 1126400 imagen.dd | egrep -abi 'Aplastando|Khilfah|Enemigo'
That line searches the forensic image of the hard disk for the keywords in the
pattern ('Crushing', 'Khilfah' and 'Enemy', written in Spanish), and what we
see in this case are file names, paths, and computer and user names containing
the words I searched for.
The search can target both file names and the content itself.
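The same kind of raw keyword search can be sketched without grep. This is only an illustration of the idea, assuming the keywords are stored as plain ASCII/UTF-8 text; the function name is mine. One difference worth knowing: 'grep -b' prints the offset of the start of the matching line, whereas this sketch reports the offset of the match itself.

```python
import re


def find_keywords(buf, keywords):
    """Return (byte offset, matched text) for each keyword hit in a bytes buffer."""
    pattern = re.compile(
        b"|".join(re.escape(k.encode("utf-8")) for k in keywords),
        re.IGNORECASE,   # on bytes patterns this only folds ASCII letters
    )
    return [(m.start(), m.group().decode("utf-8", "replace"))
            for m in pattern.finditer(buf)]
```

Whatever tool produces it, the offset is relative to the start of the data that was searched, in our case the file system selected with '-o', and that is what the cluster arithmetic relies on.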
In any case, the result takes us to a location: the exact byte where the
content we searched for is located. What do we do now?
We take paper, pencil and calculator... because first we are going to determine
the start offset of the partitions in the forensic image of the hard disk. To
do this we run 'mmls' on the image of the hard disk.
Then we have to determine the sector size and the cluster size of the file
system, which we can do with 'fsstat'.
In this case, the sector size is 512 bytes and the cluster size is 4096 bytes,
that is, 8 sectors. We write all this down, because we have content of interest
at byte number 764,020,177.
If a cluster is 8 sectors and a sector is 512 bytes, we can find the cluster
corresponding to that content with a small formula: divide the byte number by
4096 (512 bytes per sector times 8 sectors per cluster).
● echo "764020177/(8*512)" | bc
This small formula tells us that the cluster containing the information we are
interested in is number 186528.
Note that 'bc' performs integer division here and does not return decimals. Be
careful with the decimals! With a calculator, 764,020,177 divided by 4096
equals 186,528.36. The integer part is the cluster number, because clusters are
numbered from 0: cluster 186,528 covers bytes 764,018,688 to 764,022,783. The
decimal part only tells us how far into that cluster the content starts, so the
result must never be rounded up: even if the decimals were 0.56 instead of
0.36, the cluster would still be 186,528, not 186,529.
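The arithmetic can be written down once and reused. A minimal sketch, assuming the byte offset is counted from the start of the file system (as it is when the search runs over the output of 'blkls -o'), that clusters are numbered from 0, and using the sizes reported above; the function name is hypothetical:

```python
SECTOR_SIZE = 512           # bytes per sector, as reported by fsstat
SECTORS_PER_CLUSTER = 8     # 4096-byte clusters


def locate(byte_offset, partition_start_sector=0):
    """Return (cluster, offset inside the cluster, byte offset in the whole image)."""
    cluster_size = SECTOR_SIZE * SECTORS_PER_CLUSTER
    # Integer quotient = cluster index; remainder = position inside that cluster.
    cluster, within = divmod(byte_offset, cluster_size)
    image_offset = partition_start_sector * SECTOR_SIZE + byte_offset
    return cluster, within, image_offset


print(locate(764020177, partition_start_sector=1126400))
# (186528, 1489, 1340736977)
```

The integer quotient is the cluster that holds the byte and the remainder (1,489 bytes here) is how far into that cluster the content starts; the third value is where the same byte sits in the full disk image, counting the 1,126,400 sectors that precede the partition.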
With 'blkstat' we can determine whether the content of that cluster is
associated with a file or not, as it returns either 'Allocated' or 'Not
Allocated'.
If the content of a cluster is associated with a file, we can search for that
file using the cluster number.
We can use 'ifind' to see whether that content is tied to the corresponding
metadata in the MFT. 'ifind' can produce two results: 'Inode not found', which
means that the cluster is not associated with any record in the MFT, or a
number, which corresponds to a file record in the MFT.
If we get that record number from the MFT, we can use 'ffind' to determine the
name of that file. And with the MFT record number we can use 'icat' to extract
the file.
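Put together, the chain from cluster to file looks like this. The sketch only builds the command lines, it does not run them; it assumes the standard Sleuth Kit argument order, and the record number 12345 in the usage note is a placeholder, not a value from this case:

```python
def tsk_commands(image, partition_sector, cluster, inode=None, fstype="ntfs"):
    """Build the blkstat / ifind / ffind / icat command lines as argument lists."""
    common = ["-f", fstype, "-o", str(partition_sector)]
    commands = [
        # Is the cluster allocated?
        ["blkstat", *common, image, str(cluster)],
        # Which MFT record, if any, owns that cluster?
        ["ifind", *common, "-d", str(cluster), image],
    ]
    if inode is not None:
        # Name of the file behind the MFT record, then its content.
        commands.append(["ffind", *common, image, str(inode)])
        commands.append(["icat", *common, image, str(inode)])
    return commands
```

Calling `tsk_commands("imagen.dd", 1126400, 186528, inode=12345)` returns four lists that can be joined with spaces and pasted into a terminal, or passed to `subprocess.run` with the output of 'icat' redirected to a file.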
Conclusions
- Think fast and act smart!!
- Time is a critical factor. One should not act without a plan. You have to
work with foresight, anticipating the scenarios we may encounter.
- Nothing must be connected that is not absolutely necessary for the correct
extraction of evidence.
- A tool for every device. We don't want to contaminate anything and ruin all
the work of an investigation.
- If we come across a computer that has a battery, we can disconnect it from
mains power to prevent the system from entering the maintenance phase.
- The acquisition of RAM is vital. Always?
- Any action leaves a trace, and the absence of evidence is itself evidence
that something has happened, so we should look for what has happened.
- It is essential, in any case, to know what to look for, because if we don't
know what we are looking for, we can't know where to look or how to look.
- Use your brain and reason.
That's all.