Python Digital Network Forensics-I

Neha Kumawat

a year ago

This chapter will explain the fundamentals involved in performing network forensics using Python.

Understanding Network Forensics

Network forensics is a branch of digital forensics that deals with the monitoring and analysis of computer network traffic, both local and WAN (wide area network), for the purposes of information gathering, evidence collection, or intrusion detection. Network forensics plays a critical role in investigating digital crimes such as theft of intellectual property or leakage of information. A picture of network communications helps an investigator answer crucial questions such as the following −
  • What websites have been accessed?
  • What kind of content has been uploaded on our network?
  • What kind of content has been downloaded from our network?
  • What servers are being accessed?
  • Is somebody sending sensitive information outside of company firewalls?

Internet Evidence Finder (IEF)

IEF is a digital forensic tool used to find, analyze and present digital evidence found on different digital media such as computers, smartphones and tablets. It is very popular and used by thousands of forensics professionals.

Use of IEF

Due to its popularity, IEF is used by forensics professionals to a great extent. Some of the uses of IEF are as follows −
  • Due to its powerful search capabilities, it is used to search multiple files or data media simultaneously.
  • It is also used to recover deleted data from the unallocated space of RAM through new carving techniques.
  • If investigators want to rebuild web pages in their original format on the date they were opened, then they can use IEF.
  • It is also used to search logical or physical disk volumes.

Dumping Reports from IEF to CSV using Python

IEF stores data in a SQLite database, and the following Python script will dynamically identify the result tables within the IEF database and dump each of them to its own CSV file.
This process is done in the steps shown below −
  • First, generate the IEF result database, which will be a SQLite database file ending with the .db extension.
  • Then, query that database to identify all the tables.
  • Lastly, write each result table to an individual CSV file.

Python Code

Let us see how to use Python code for this purpose −
For Python script, import the necessary libraries as follows −

from __future__ import print_function

import argparse
import csv
import os
import sqlite3
import sys
Now, we need to provide the path to IEF database file −

if __name__ == '__main__':
   parser = argparse.ArgumentParser('IEF to CSV')
   parser.add_argument("IEF_DATABASE", help="Input IEF database")
   parser.add_argument("OUTPUT_DIR", help="Output DIR")
   args = parser.parse_args()
Now, we will confirm the existence of IEF database as follows −

   if not os.path.exists(args.OUTPUT_DIR):
      os.makedirs(args.OUTPUT_DIR)
   if os.path.exists(args.IEF_DATABASE) and \
         os.path.isfile(args.IEF_DATABASE):
      main(args.IEF_DATABASE, args.OUTPUT_DIR)
   else:
      print("[-] Supplied input file {} does not exist or is not a "
            "file".format(args.IEF_DATABASE))
      sys.exit(1)
Now, as we did in earlier scripts, make the connection with SQLite database as follows to execute the queries through cursor −

def main(database, out_directory):
   print("[+] Connecting to SQLite database")
   conn = sqlite3.connect(database)
   c = conn.cursor()
The following lines of code will fetch the names of the tables from the database −

   print("List of all tables to extract")
   c.execute("select * from sqlite_master where type = 'table'")
   # x[2] is the tbl_name column of sqlite_master
   tables = [x[2] for x in c.fetchall()
             if not x[2].startswith('_') and not x[2].endswith('_DATA')]
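To see what the sqlite_master query and the name filter above do, here is a minimal sketch against an in-memory database. The table names are made up for illustration and are not taken from a real IEF case file, and the sketch selects the name column directly instead of indexing x[2].

```python
import sqlite3


def list_result_tables(conn):
    """Return table names, skipping '_'-prefixed and '_DATA'-suffixed ones."""
    cur = conn.cursor()
    cur.execute("select name from sqlite_master where type = 'table'")
    return sorted(
        name for (name,) in cur.fetchall()
        if not name.startswith('_') and not name.endswith('_DATA')
    )


# Hypothetical tables standing in for an IEF result database.
conn = sqlite3.connect(":memory:")
conn.execute('create table "Chrome Cache Records" (URL text)')
conn.execute('create table "_Settings" (k text)')
conn.execute('create table "Pictures_DATA" (blob_data blob)')

tables = list_result_tables(conn)
print(tables)
```

Only the first table survives the filter; the other two are treated as internal or raw-data tables and skipped.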
Now, for each table, we will read its column names, select all of its data, and use the fetchall() method on the cursor object to store the resulting list of tuples, containing the table's data in its entirety, in a variable −

   print("Dumping {} tables to CSV files in {}".format(len(tables), out_directory))

   for table in tables:
      c.execute("pragma table_info('{}')".format(table))
      table_columns = [x[1] for x in c.fetchall()]

      c.execute("select * from '{}'".format(table))
      table_data = c.fetchall()
Now, still inside the loop, we will use the csv.writer() function to write the content to a CSV file −

      csv_name = table + '.csv'
      csv_path = os.path.join(out_directory, csv_name)
      print('[+] Writing {} table to {} CSV file'.format(table, csv_name))

      with open(csv_path, "w", newline="") as csvfile:
         csv_writer = csv.writer(csvfile)
         csv_writer.writerow(table_columns)
         csv_writer.writerows(table_data)
The above script will fetch all the data from the tables of the IEF database and write each table to its own CSV file in the output directory of our choice.
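The per-table dump can be tried without an IEF database. The following is a minimal sketch, assuming a throwaway in-memory database with one made-up table (Visits) and a temporary output directory; dump_table() is a hypothetical helper name, and it quotes the table name with double quotes, the standard SQL identifier quoting.

```python
import csv
import os
import sqlite3
import tempfile


def dump_table(conn, table, out_directory):
    """Write one table (header row plus all rows) to <table>.csv."""
    cur = conn.cursor()
    cur.execute('pragma table_info("{}")'.format(table))
    columns = [row[1] for row in cur.fetchall()]  # row[1] is the column name
    cur.execute('select * from "{}"'.format(table))
    csv_path = os.path.join(out_directory, table + ".csv")
    with open(csv_path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(columns)
        writer.writerows(cur.fetchall())
    return csv_path


# Throwaway database with one made-up table.
out_dir = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("create table Visits (URL text, Hits integer)")
conn.execute("insert into Visits values ('http://example.com', 3)")

path = dump_table(conn, "Visits", out_dir)
with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Reading the file back shows the header row followed by the data rows; note that the csv module returns every value as a string.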

Working with Cached Data

From the IEF result database, we can fetch more information that is not necessarily supported by IEF itself. For example, we can fetch cached data, an informational by-product of using email service providers like Yahoo, Google etc., from the IEF result database.
The following is the Python script for accessing the cached contact data from Yahoo Mail, accessed on Google Chrome, by using the IEF database. Note that the steps are more or less the same as those followed in the last Python script.
First, import the necessary libraries for Python as follows −

from __future__ import print_function
import argparse
import csv
import os
import sqlite3
import sys
import json
Now, provide the path to the IEF database file along with the output path, the two positional arguments accepted by the command-line handler, as done in the last script −

if __name__ == '__main__':
   parser = argparse.ArgumentParser('IEF to CSV')
   parser.add_argument("IEF_DATABASE", help="Input IEF database")
   parser.add_argument("OUTPUT_CSV", help="Output CSV file")
   args = parser.parse_args()
Now, confirm the existence of IEF database as follows −

   directory = os.path.dirname(args.OUTPUT_CSV)

   # dirname is empty when only a file name is given, so skip makedirs then
   if directory and not os.path.exists(directory):
      os.makedirs(directory)
   if os.path.exists(args.IEF_DATABASE) and \
         os.path.isfile(args.IEF_DATABASE):
      main(args.IEF_DATABASE, args.OUTPUT_CSV)
   else:
      print("Supplied input file {} does not exist or is not a "
            "file".format(args.IEF_DATABASE))
      sys.exit(1)
Now, make the connection with SQLite database as follows to execute the queries through cursor −

def main(database, out_csv):
   print("[+] Connecting to SQLite database")
   conn = sqlite3.connect(database)
   c = conn.cursor()
You can use the following lines of code to fetch the instances of Yahoo Mail contact cache record −

   print("Querying IEF database for Yahoo Contact Fragments from "
         "the Chrome Cache Records Table")
   try:
      c.execute("select * from 'Chrome Cache Records' where URL like "
                "'https://data.mail.yahoo.com"
                "/classicab/v2/contacts/?format=json%'")
   except sqlite3.OperationalError:
      print("Received an error querying the database -- database may be "
            "corrupt or not have a Chrome Cache Records table")
      sys.exit(2)
Now, the list of tuples returned by the above query is saved into a variable and passed on for processing as follows −

   contact_cache = c.fetchall()
   contact_data = process_contacts(contact_cache)
   write_csv(contact_data, out_csv)
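The except clause in main() relies on sqlite3 raising OperationalError when the queried table does not exist. A minimal sketch on an empty in-memory database (not a real IEF file) illustrates that behaviour:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # empty database: no tables at all
c = conn.cursor()

try:
    c.execute('select * from "Chrome Cache Records"')
    table_found = True
except sqlite3.OperationalError as err:
    # sqlite3 reports a missing table as an OperationalError
    table_found = False
    print("Query failed: {}".format(err))
```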
Note that here we define process_contacts(), which sets up the result list and iterates through each contact cache record, and use json.loads() to store the JSON data extracted from the table in a variable for further manipulation −

def process_contacts(contact_cache):
   print("[+] Processing {} cache files matching Yahoo contact cache "
         "data".format(len(contact_cache)))
   results = []

   for contact in contact_cache:
      # Column positions follow the Chrome Cache Records table layout
      url = contact[0]
      first_visit = contact[1]
      last_visit = contact[2]
      last_sync = contact[3]
      loc = contact[8]
      contact_json = json.loads(contact[7].decode())
      total_contacts = contact_json["total"]
      total_count = contact_json["count"]

      if "contacts" not in contact_json:
         continue
      for c in contact_json["contacts"]:
         name, anni, bday, emails, phones, links = ("", "", "", "", "", "")
         if "name" in c:
            name = c["name"]["givenName"] + " " + \
               c["name"]["middleName"] + " " + c["name"]["familyName"]

         if "anniversary" in c:
            anni = c["anniversary"]["month"] + "/" + \
               c["anniversary"]["day"] + "/" + c["anniversary"]["year"]

         if "birthday" in c:
            bday = c["birthday"]["month"] + "/" + \
               c["birthday"]["day"] + "/" + c["birthday"]["year"]

         if "emails" in c:
            emails = ', '.join([x["ep"] for x in c["emails"]])

         if "phones" in c:
            phones = ', '.join([x["ep"] for x in c["phones"]])

         if "links" in c:
            links = ', '.join([x["ep"] for x in c["links"]])
Now for company, title and notes, the get method is used as shown below −

         company = c.get("company", "")
         title = c.get("jobTitle", "")
         notes = c.get("notes", "")
Now, let us append the list of metadata and extracted data elements to the result list as follows −

         results.append([
            url, first_visit, last_visit, last_sync, loc, name, bday,
            anni, emails, phones, links, company, title, notes,
            total_contacts, total_count])
   return results
Now, by using the csv.writer() function, we will write the content to a CSV file −

def write_csv(data, output):
   print("[+] Writing {} contacts to {}".format(len(data), output))
   with open(output, "w", newline="") as csvfile:
      csv_writer = csv.writer(csvfile)
      csv_writer.writerow([
         "URL", "First Visit (UTC)", "Last Visit (UTC)",
         "Last Sync (UTC)", "Location", "Contact Name", "Bday",
         "Anniversary", "Emails", "Phones", "Links", "Company", "Title",
         "Notes", "Total Contacts", "Count of Contacts in Cache"])
      csv_writer.writerows(data)  
With the help of the above script, we can process the cached data from Yahoo Mail by using the IEF database.
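The contact-flattening logic can be exercised on its own with sample JSON. The sketch below is an illustration only: the payload is hypothetical (real Yahoo responses may differ in shape), summarize() is a made-up helper, and it uses dict.get() so that a missing sub-field such as middleName yields an empty string instead of a KeyError.

```python
import json

# Hypothetical cache payload mirroring the keys read by process_contacts().
sample = json.loads("""
{
  "total": 2, "count": 2,
  "contacts": [
    {"name": {"givenName": "Ada", "familyName": "Lovelace"},
     "emails": [{"ep": "ada@example.com"}],
     "company": "Analytical Engines"},
    {"birthday": {"month": "12", "day": "10", "year": "1815"}}
  ]
}
""")


def summarize(contact):
    """Flatten one contact dict; absent fields become empty strings."""
    name_parts = contact.get("name", {})
    name = " ".join(
        name_parts[key]
        for key in ("givenName", "middleName", "familyName")
        if name_parts.get(key)
    )
    bday = contact.get("birthday")
    bday = "/".join([bday["month"], bday["day"], bday["year"]]) if bday else ""
    emails = ", ".join(x["ep"] for x in contact.get("emails", []))
    return [name, bday, emails, contact.get("company", "")]


rows = [summarize(c) for c in sample.get("contacts", [])]
print(rows)
```

Each contact becomes one flat row, which is the shape csv_writer.writerows() expects.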
