Visualization of timelines with Gource

Hi, minions:

This past November 2018 I had the immense good fortune to participate as a speaker at the computer security congress #HoneyCON, organized by Honey Sec, in its 2018 edition. The presentation I gave, which you can see on my Slideshare page, was titled "Generation and visualization of timelines".



Introduction


Why timelines? Because there are lots of tools to generate timelines and there are lots of ways to visualize those timelines.

A timeline is a list of events built and displayed in chronological order. That is to say, to make a timeline you have to order chronologically all the activity of a system, or of the artifact the timeline targets. That timeline will tell us that something has happened, at a certain moment, on a particular element. That's where time stamps come in: they are the elements that tell us what activity has been carried out.
Timelines are very useful to quickly get an overview of all activity.
In my humble opinion, a timeline should be built in every analysis, without exception, regardless of the case. It's true that we can analyze any artifact individually, an artifact in which an 'X' event has occurred, but we may not be able to see what caused that event. It's useless to analyze a particular artifact if we're not able to associate it with others, within a context. And how do you analyze everything within a context? With a timeline.

Regarding timelines, I think I did a good job with two articles I published at @fwhibbit_blog:
In the presentation I gave, as an example, I showed some non-commercial tools capable of generating timelines:
There are many more tools, projects and scripts.

All these tools are capable of generating timelines, either of a file system or of a particular artifact, with a multitude of output options and formats. And all the timelines we generate with these tools can be visualized in different ways (graphics, text, ...).

Implementing timelines in Gource


I got to know 'Gource' thanks to Ryan Benson. Specifically, through his website, 'Obsidian Forensics', in an entry where he presented a script (mbdbls) that parses the 'Manifest.mbdb' file and formats its output to make it compatible with 'Gource'. Since then I've been very interested in this tool because I think it has a lot of forensic potential.

Why not take its use to the next level, to a timeline of a file system?

What's 'Gource'? It is a tool that was created to show software projects as an animated tree.

What does 'Gource' need to work? A custom log format with four pipe-delimited (|) fields:
  • A timestamp in Unix format.
  • A username.
  • The type of activity that has been carried out: A (added), M (modified) or D (deleted).
  • The element upon which the action has taken place.
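
As a minimal sketch of that format, the following Python helper builds one log line from the four fields. The function name `format_gource_line` and the sample values are hypothetical, chosen only for illustration.

```python
def format_gource_line(timestamp: int, user: str, action: str, path: str) -> str:
    """Build one line of the pipe-delimited custom log format."""
    if action not in ("A", "M", "D"):
        raise ValueError("action must be A (added), M (modified) or D (deleted)")
    if "|" in user:
        raise ValueError("the username must not contain the '|' delimiter")
    # Field order: Unix timestamp, username, activity type, affected element
    return f"{int(timestamp)}|{user}|{action}|{path}"


if __name__ == "__main__":
    # Hypothetical sample entry
    print(format_gource_line(1541066400, "USER", "A", "/home/user/notes.txt"))
```

The username is a free-form label; the script shown later in this post simply writes the fixed value USER in that field.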


'Gource' has many display options. It lets you move around all its nodes, point to an element with the mouse, zoom with the mouse scroll wheel, start and end at specified dates, use time scales, ...
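
Several of these options can also be passed on the command line. The sketch below only assembles a command; it does not run 'Gource'. The helper name `build_gource_command` and the file name `timeline.log` are hypothetical, and the option names (`--log-format`, `--start-date`, `--stop-date`, `--seconds-per-day`) are assumed to be supported by your build, which `gource --help` will confirm.

```python
import shlex


def build_gource_command(log_path, start_date=None, stop_date=None, seconds_per_day=None):
    """Assemble a Gource command line for a custom log file (not executed here)."""
    cmd = ["gource", "--log-format", "custom"]
    if start_date:
        # Assumed date format: 'YYYY-MM-DD hh:mm:ss'
        cmd += ["--start-date", start_date]
    if stop_date:
        cmd += ["--stop-date", stop_date]
    if seconds_per_day is not None:
        # Time scale: how many seconds of animation represent one day
        cmd += ["--seconds-per-day", str(seconds_per_day)]
    cmd.append(log_path)
    return cmd


if __name__ == "__main__":
    command = build_gource_command(
        "timeline.log", start_date="2018-11-01 00:00:00", seconds_per_day=5
    )
    # Print a shell-safe version that can be copied into a terminal
    print(" ".join(shlex.quote(part) for part in command))
```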

Given the many possibilities we have for generating timelines, it's easy to find a way to produce a suitable format for viewing with 'Gource', because I know what I have and I know what format I want the timeline in (lateral thinking).
So, for example, if we have generated a dynamic timeline with 'Plaso', where we are the ones who choose the columns that are going to be shown, we can set the appropriate format with 'Timeline Explorer', export it and use a small formula to change the format of the dates column.
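
The date conversion can also be done outside the spreadsheet. The following is a minimal sketch, assuming the exported column holds UTC dates written as `YYYY-MM-DD HH:MM:SS`; the actual layout depends on the columns and format chosen during the export, and the helper name `to_unix_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def to_unix_timestamp(value: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> int:
    """Convert a date string, interpreted as UTC, to a Unix timestamp."""
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


if __name__ == "__main__":
    print(to_unix_timestamp("2018-11-01 10:00:00"))  # prints 1541066400
```

Setting the time zone explicitly avoids depending on the local time zone of the machine doing the conversion.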


And after that, we can use a simple command line with 'sed' to finish generating the 'Custom Log Format' we need.
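
This is not the 'sed' command itself, only a Python sketch of the same kind of rewrite. It assumes a comma-separated export whose header row names four columns, `timestamp`, `user`, `action` and `path`; those column names and the helper `csv_to_custom_log` are hypothetical.

```python
import csv
import io


def csv_to_custom_log(csv_text: str) -> str:
    """Rewrite CSV rows as pipe-delimited custom log lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # The csv module unquotes fields, so commas inside a path are preserved
        lines.append("|".join([row["timestamp"], row["user"], row["action"], row["path"]]))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = 'timestamp,user,action,path\n1541066400,USER,A,"/docs/a,b.txt"\n'
    print(csv_to_custom_log(sample))
```

Parsing the file as CSV, rather than blindly replacing every comma with a pipe, keeps paths that contain commas intact.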


We can carry out these operations with any timeline we have generated, according to our interests.


It's true that it's a manual job, but it doesn't take more than a few minutes and I think the result is really good. In the following video you can see an example of the visualization of a timeline made with 'Plaso' and formatted for processing with 'Gource'. The video plays at real-time speed, jumping ahead when there is no activity.


In my Google Drive folder you will find a 'Custom Log Format' file from a Windows system and another from a Linux system, manually generated from a 'Plaso' timeline, so you can run whatever tests you want.

However, we can automate the process.

Mactime2Gource


To automate the process of generating a 'Custom Log Format' I chose to start from a tabulated timeline generated by 'mactime', from TSK (The Sleuth Kit), because I think it is a very good tool, with the right balance of depth, time investment and file size.

And since I have never written code, I asked for help from Guillermo Román, a good friend and co-founder of the blog 'Follow the White Rabbit', to create a small script capable of processing a timeline generated with 'mactime' and formatting it into a 'Custom Log Format'.

The script, which you can find on Guillermo's GitHub site, is compatible with timelines generated from Linux and Windows systems, and can be executed on Linux and Windows systems (because you should know that they are not interpreted the same on one system as on the other).

#!/usr/bin/python3

import sys, datetime, re

def main():
    # Regexes to match the lines: one for lines that start with a date,
    # one for sub-entries that share the date of the previous line
    regex_date = r"^(\D+\s\D+\s\d+\s\d+\s\d\d:\d\d:\d\d)\s+\d*\s+(....)\s+.+\s\d*\s+\d*\s+\d*.\d*.\d*.\s+(.*)$"
    regex_nodate = r"^\s+\d*\s+(....)\s+.+\s\d*\s+\d*\s+\d*.\d*.\d*.\s+(.*)$"

    # Input file: the tabulated timeline generated with mactime
    filename = sys.argv[1]

    # Last date detected, to keep control of multiple entries
    date_last = ""

    with open(filename, "r") as lines:
        for line in lines:
            line = line.rstrip()

            # Check if date line or sub-entry, and parse
            matches = re.search(regex_date, line)
            if matches:
                date_last = parse(matches, 1, date_last)
                continue
            matches = re.search(regex_nodate, line)
            if matches:
                date_last = parse(matches, 0, date_last)
                continue

            # Neither pattern matched: report the line and stop
            print("Error in line: " + line, file=sys.stderr)
            sys.exit(1)

# Parses a line. Offset set to 1 if line with date, 0 if sub-entry.
def parse(matches, offset, date_last): 
    # Parse date
    if offset == 1:
        date_match = matches.group(1)
        date = datetime.datetime.strptime(date_match, "%a %b %d %Y %H:%M:%S")
        date_seconds = (date - datetime.datetime(1970,1,1)).total_seconds()
    else: 
        date_seconds = date_last

    # Parse M/A/C/B flag to A/M/D
    macb_match = matches.group(1+offset)
    if re.match("..c.",macb_match):
        macb = 'M'
    elif re.match("m...",macb_match):
        macb = 'M'
    elif re.match("...b",macb_match):
        macb = 'A'
    elif re.match(".a..",macb_match):
        macb = 'M'
    elif re.match("....",macb_match):
        macb = 'D'

    # Parse path
    path = matches.group(2+offset)
    # Was it deleted? Cut and set MACB to D(eleted)
    if path.endswith("(deleted)"): 
        path = path[:-10]
        macb = 'D'
        #print(path)
    elif path.endswith("(deleted-realloc)"):
        path = path[:-18]
        macb = 'D'
    # Trim ($FILE_NAME)
    if path.endswith("($FILE_NAME)"):
        path = path[:-13]

    # Print results
    print(str(int(date_seconds)) + "|USER|" + macb + "|" + path)
   
    # Return last date detected
    return date_seconds


if __name__ == "__main__":
    main()

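Its operation is very simple. You run the script, pass it the tabulated timeline generated with 'mactime' as its argument, and indicate an output file. The script as listed prints its results to standard output, so the output file can be given by redirecting that output.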


This way, we go from having a tabulated timeline to having a 'Custom Log Format' file compatible with 'Gource'.


You can see the result of its processing with 'Gource' in the following video. The video plays at real-time speed, jumping ahead when there is no activity.


In my Google Drive folder you will find a 'Custom Log Format' file from a Windows system and another from a Linux system, generated with 'Mactime2Gource', so that you can run whatever tests you want.
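
Before loading any 'Custom Log Format' file into 'Gource', a quick sanity check can save time. The sketch below verifies the four fields described earlier and, on the assumption that 'Gource' expects entries in chronological order, flags timestamps that go backwards. The function name `check_custom_log` is hypothetical.

```python
def check_custom_log(lines):
    """Return a list of (line_number, problem) tuples for a custom log."""
    problems = []
    previous = None
    for number, line in enumerate(lines, start=1):
        # Split on the first three pipes only, so the path keeps any '|' it contains
        fields = line.rstrip("\n").split("|", 3)
        if len(fields) != 4:
            problems.append((number, "expected four pipe-delimited fields"))
            continue
        timestamp, _user, action, path = fields
        if not timestamp.isdigit():
            problems.append((number, "timestamp is not a Unix timestamp"))
            continue
        if action not in ("A", "M", "D"):
            problems.append((number, "activity type must be A, M or D"))
        if not path:
            problems.append((number, "empty path"))
        if previous is not None and int(timestamp) < previous:
            problems.append((number, "timestamp goes back in time"))
        previous = int(timestamp)
    return problems


if __name__ == "__main__":
    sample = ["1541066400|USER|A|/etc/passwd", "1541066300|USER|X|/etc/shadow"]
    for number, problem in check_custom_log(sample):
        print(f"line {number}: {problem}")
```

An empty result means every line has the expected shape; it says nothing about whether the events themselves are correct.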

Conclusions


I think a timeline should be done in any case. I consider it something basic because it will always help us to understand what has happened, whether we know what we are looking for or not. It will always be a good reference to consult.

Timelines are very useful to quickly get an overview of all file activity and that will help us to get clues. Almost everything contains time stamps, (Registry, Logs, Events, File System, ...).

A timeline should be intuitive, understandable and easy to study. For that reason we will have to process and format the timelines we generate with the different tools, to make them more legible. Some will take longer than others.

A timeline will not be enough to complete an analysis, but it is the best way to see that something has happened, at a given moment, with a particular element. You have to study the elements within a context.

Why not see a timeline as an animated tree? I think it's a very good way to quickly see those clues, that kind of activity that has been carried out. I think it can be very helpful to understand what happened.

That's all.
