Websites store information on visitors' local machines using cookies. On subsequent visits, the browser sends the cookie data stored on the visitor's machine back to the web server, which can use it as a record of the user's activity on the site. At a minimum, the browser keeps three timestamps for each cookie: when it was created, when it is set to expire, and when it was last accessed, which is effectively the last time the user visited the site. Sites also use cookies to ‘remember’ user activity, such as shopping cart items or login/session information, to work around the stateless nature of HTTP.
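To make that exchange a bit more concrete, here is a minimal Python 3 sketch using the standard library's http.cookies module. The cookie name session_id and its value abc123 are made up for illustration, and the browser side is only simulated with a plain string.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for the response.
# The name "session_id" and the value "abc123" are hypothetical.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["max-age"] = 3600  # ask the browser to keep it for an hour
set_cookie_header = response_cookie.output(header="Set-Cookie:")

# Browser side (simulated): on the next request the browser sends the
# stored name=value pair back in a Cookie header.
cookie_header = "session_id=abc123"

# Server side again: parse the Cookie header to recover the saved state.
request_cookie = SimpleCookie()
request_cookie.load(cookie_header)
session_id = request_cookie["session_id"].value

print(set_cookie_header)
print(session_id)
```

The server never keeps the connection open between the two requests; the cookie is what ties them together.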

Most users assume that only the sites they visit directly store cookies on their computers. In reality, the number is much higher. A single site you visit usually embeds many links, especially ads, that store cookies of their own on your computer. In this post, I will demonstrate how to list all the sites that left cookies on your computer and how to extract additional information from those cookies. When I ran the script and counted the 10 sites with the most entries in the cookies SQLite DB, all but one or two were sites I had never visited directly!

This Python script was written to extract cookie information on a Linux box running Firefox. Firefox stores its cookies in an SQLite file, so the script uses Python's sqlite3 module, which ships with the standard library, to read it.
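Before relying on column positions, it can help to look at the table layout and to select fields by name. The following Python 3 sketch shows one way to do that with sqlite3. It runs against an in-memory stand-in table with made-up data and only a subset of the columns, so the helper names list_columns and read_cookie_dates, the table contents, and the reduced schema are all assumptions for illustration; point sqlite3.connect at a copy of your own cookies.sqlite to try it for real.

```python
import sqlite3

def list_columns(conn, table="moz_cookies"):
    """Return the column names of `table` in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    return [row[1] for row in conn.execute("PRAGMA table_info({0})".format(table))]

def read_cookie_dates(conn):
    """Fetch the domain and the three timestamps by column name."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    cur = conn.execute(
        "SELECT baseDomain, expiry, lastAccessed, creationTime FROM moz_cookies")
    return [dict(row) for row in cur]

# Stand-in database with hypothetical data (not a real Firefox profile).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moz_cookies (id INTEGER PRIMARY KEY, baseDomain TEXT, "
    "expiry INTEGER, lastAccessed INTEGER, creationTime INTEGER)")
conn.execute(
    "INSERT INTO moz_cookies (baseDomain, expiry, lastAccessed, creationTime) "
    "VALUES (?, ?, ?, ?)",
    ("example.com", 1455213361, 1429822018000000, 1392141361000000))

columns = list_columns(conn)
rows = read_cookie_dates(conn)
print(columns)
print(rows[0]["baseDomain"])
conn.close()
```

Selecting by name keeps the code working if the column order differs between Firefox versions, which a positional `select *` does not guarantee.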

The script takes two arguments: the path to the cookies file and the path to an output file. It writes the results to the output file and also dumps them to the screen.

root@dnetbook:/home/daniel/python# python cookie_viewer.py 
cookie_viewer.py cookie-fullpath output-file

root@dnetbook:/home/daniel/python# python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt
doubleclick.net,Thu Feb 11 17:56:01 2016,Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014
twitter.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014
imrworldwide.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014
quantserve.com,Thu Aug 13 19:32:02 2015,Thu Apr 23 20:46:57 2015,Tue Feb 11 18:32:0

Each output line contains the domain name of the site, followed by the cookie's expiry date, last access time and creation time.
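Since every line follows the same comma-separated layout, the output is easy to read back for further analysis. Here is a small Python 3 sketch that parses one line and computes how long the cookie is meant to live. The helper name parse_line is my own, and the format string assumes the dates were written by ctime() in an English/C locale, as in the sample above.

```python
from datetime import datetime

# Layout produced by datetime.ctime(), e.g. "Thu Feb 11 17:56:01 2016"
CTIME_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_line(line):
    """Split one output line into (domain, expiry, accessed, created)."""
    domain, expiry, accessed, created = line.strip().split(",")
    dates = tuple(datetime.strptime(field, CTIME_FORMAT)
                  for field in (expiry, accessed, created))
    return (domain,) + dates

# First line of the sample output shown above.
line = ("doubleclick.net,Thu Feb 11 17:56:01 2016,"
        "Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014")
domain, expiry, accessed, created = parse_line(line)

# Intended lifetime of the cookie: expiry minus creation time.
lifetime = expiry - created
print(domain, lifetime.days)
```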

Code follows –

#!/usr/bin/env python

''' Given the location of a Firefox cookie sqlite file,
    write each cookie's date params - expiry, last accessed,
    creation time - to a file in plain text.
    Columns of the moz_cookies table, in order:
    id
    baseDomain
    appId
    inBrowserElement
    name
    value
    host
    path
    expiry
    lastAccessed
    creationTime
    isSecure
    isHttpOnly
    python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt 
'''

import sys
import os
from datetime import datetime
import sqlite3

def Usage():
    # parenthesized print works on both Python 2 and Python 3
    print("{0} cookie-fullpath output-file".format(sys.argv[0]))
    sys.exit(1)

if len(sys.argv)<3:
    Usage()

sqldb=sys.argv[1]
destfile=sys.argv[2]
# Some dates in the cookies file might not be valid, or too big
MAXDATE=2049840000

# cookies file must be there, most often file name is cookies.sqlite
if not os.path.isfile(sqldb):
    Usage()

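# a hack - to convert the epoch times to human readable format.
# In the schema above, expiry is in seconds (10 digits), while
# lastAccessed and creationTime are in microseconds (16 digits).
def convert(epoch):
    # the first 10 digits are whole seconds
    mydate=epoch[:10]
    if int(mydate)>MAXDATE:
        mydate=str(MAXDATE)
    # any remaining digits are the fractional part of the second
    if len(epoch)>10:
        mytime=epoch[10:]
    else:
        mytime='0'
    fulldate=float(mydate+'.'+mytime)
    x=datetime.fromtimestamp(fulldate)
    return x.ctime()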

# Bind to the sqlite db and execute sql statements
conn=sqlite3.connect(sqldb)
cur=conn.cursor()
try:
    data=cur.execute('select * from moz_cookies')
except sqlite3.Error as e:
    print('Error: {0}'.format(e.args[0]))
    sys.exit(1)
mydata=data.fetchall()

# Dump results to a file
with open(destfile, 'w') as fp:
    for item in mydata:
        # column positions follow the moz_cookies layout in the docstring
        urlname=item[1]
        expiry=convert(str(item[8]))
        accessed=convert(str(item[9]))
        created=convert(str(item[10]))
        fp.write(urlname + ',' + expiry + ',' + accessed + ',' + created + '\n')

# Dump to stdout as well
with open(destfile) as fp:
    for line in fp:
        # each line already ends with a newline, so write it unchanged
        sys.stdout.write(line)
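The string slicing in convert() works, but the same conversion can be done with plain arithmetic. Below is a rough Python 3 sketch of that alternative, not a drop-in replacement for the script. It assumes the caller knows the unit of each column (seconds for expiry, microseconds for lastAccessed and creationTime), and it formats in UTC so the result does not depend on the machine's time zone, whereas the script above uses local time. The function name to_ctime and the unit parameter are my own inventions.

```python
from datetime import datetime, timezone

MAXDATE = 2049840000  # same cap as in the script, in seconds

def to_ctime(value, unit="s"):
    """Convert a cookie timestamp to a ctime-style string in UTC.

    unit is "s" for seconds (expiry) or "us" for microseconds
    (lastAccessed, creationTime). Values beyond MAXDATE are capped.
    """
    seconds = float(value)
    if unit == "us":
        seconds = seconds / 1000000.0
    seconds = min(seconds, MAXDATE)  # guard against invalid or huge dates
    return datetime.fromtimestamp(seconds, tz=timezone.utc).ctime()

# Hypothetical values in the two units
print(to_ctime(1455213361))              # expiry, in seconds
print(to_ctime(1392141361000000, "us"))  # creationTime, in microseconds
```

Passing the unit explicitly avoids guessing it from the number of digits, which would misread an oversized expiry value as microseconds.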

Top 10 sites with the highest number of entries in the cookies file –

root@dnetbook:/home/daniel/python# awk -F, '{print $1}' /tmp/test.txt  | sort | uniq -c | sort -nr | head -10
     73 taboola.com
     59 techrepublic.com
     43 insightexpressai.com
     34 pubmatic.com
     33 2o7.net
     31 rubiconproject.com
     28 demdex.net
     27 chango.com
     26 yahoo.com
     26 optimizely.com
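If you would rather stay in Python than pipe through awk, the same count can be sketched with collections.Counter. This is Python 3; the function name top_domains and the sample lines are made up for illustration, and the function only assumes that the domain is the first comma-separated field, as in the output file above.

```python
from collections import Counter

def top_domains(lines, n=10):
    """Count the first comma-separated field of each line, most common first."""
    counts = Counter(line.split(",", 1)[0] for line in lines if line.strip())
    return counts.most_common(n)

# Hypothetical lines in the same layout as the script's output file
sample = [
    "taboola.com,Thu Feb 11 17:56:01 2016,Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014",
    "yahoo.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014",
    "taboola.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014",
    "taboola.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014",
    "yahoo.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014",
    "twitter.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014",
]

for domain, count in top_domains(sample, 3):
    print(count, domain)
```

To run it on the real output, pass an open file instead of the sample list, for example `with open('/tmp/test.txt') as fp: top_domains(fp)`.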