curl – get only numeric HTTP response code

Most browsers have developer plugins where you can see the HTTP status code and other request/response headers. For automation purposes though, you are most likely to use tools such as curl, httpie or the Python requests module. In this post, we will see how to use curl to get only the HTTP response code.

1. First attempt – use the ‘-I’ option to fetch the HTTP header only.

The first line will show the response code.


daniel@linubuvma:~$ curl -I http://www.google.com
HTTP/1.1 200 OK
Date: Sun, 09 Apr 2017 06:45:00 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding

But does this work all the time? No – some web services have problems with the HEAD HTTP request. Let us try amazon.com, for instance –


daniel@linubuvma:~$ curl -I https://www.amazon.com
HTTP/1.1 503 Service Unavailable
Content-Type: text/html
Content-Length: 6450
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:50:02 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Vary: Content-Type,Host,Cookie,Accept-Encoding,User-Agent
X-Cache: Error from cloudfront
Via: 1.1 a8dc63f9c2d878908bcd53ddc78da27f.cloudfront.net (CloudFront)


daniel@linubuvma:~$ curl -I -A "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
HTTP/1.1 405 MethodNotAllowed
Content-Type: text/html; charset=ISO-8859-1
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:49:47 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
x-amz-id-1: N2RDV79SBB791BTYG2K8
allow: POST, GET
Vary: Accept-Encoding,User-Agent
X-Frame-Options: SAMEORIGIN
X-Cache: Error from cloudfront
Via: 1.1 f3459bfce7b7b7b8e8bfb19301f39bef.cloudfront.net (CloudFront)

In the first attempt, amazon.com was actually blocking automated checks by looking at the user-agent header, so I had to trick it by changing the user-agent; that request got a 503 response code. Once I changed the user-agent, I got a 405 – the web server does not allow our HEAD HTTP request (the ‘-I’ option).

2. Second attempt – use the ‘-w’ option to write out a specific variable.

curl has a ‘-w’ option for writing out a specific variable to the screen or stdout. Some of the available variables are content_type, size_header and http_code. In our case, we are interested in http_code, which dumps the numeric response code from the last HTTP transfer. Let us try it –

daniel@linubuvma:~$ curl -I -s -w "%{http_code}\n" -o /dev/null http://www.google.com
200

We use ‘-I’ to fetch only the header, redirect it to /dev/null and print only http_code to stdout. This is by far the most efficient way of doing it, as we are not transferring the whole page. If the ‘-I’ option does not work though, for sites such as amazon.com, we can drop ‘-I’ as follows –

daniel@linubuvma:~$ curl -s -w "%{http_code}\n" -o /dev/null -A "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
200

This is very useful when you are writing scripts that only need the HTTP status code.
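
For instance, here is a minimal sketch of how a monitoring script might use it – the URL, timeout and expected code are placeholders to adjust to your needs –

#!/bin/bash
# check_url.sh - warn if a site does not return HTTP 200
url="${1:-http://www.google.com}"

# fetch only the numeric status code; give up after 10 seconds
status=$(curl -s --max-time 10 -o /dev/null -w "%{http_code}" "$url")

if [ "$status" -eq 200 ]; then
    echo "OK: $url returned $status"
else
    echo "WARNING: $url returned $status"
    exit 1
fi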

References –

https://curl.haxx.se/docs/manpage.html
https://superuser.com/questions/272265/getting-curl-to-output-http-status-code

How to share your terminal session with another user in real time.

Linux has a script command which is mainly used for making a typescript of everything printed on a terminal. Commands typed on a terminal and the resulting output can be written to a file for later retrieval.
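
As a quick illustration of that primary use, a session can be captured to a file and reviewed later (the file name below is arbitrary) –

# record everything typed and printed in this shell to /tmp/mysession.log
script /tmp/mysession.log
# ... run whatever commands you want captured ...
exit            # or CTRL+D, stops the recording
# review the captured session later; -R handles the raw control characters
less -R /tmp/mysession.log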

One little known use of the script command is sharing your terminal session with another user, which can be particularly useful for remote collaboration, say between an instructor and a student. The instructor leads the session by executing commands on the shell while the student observes. Here is one way of doing this –

1. Instructor creates a named pipe using mkfifo

instructor@linubuvma:/$ mkfifo /tmp/shared-screen

instructor@linubuvma:/$ ls -al /tmp/shared-screen 
prw-rw-r-- 1 instructor instructor 0 Mar 31 00:08 /tmp/shared-screen

instructor@linubuvma:/$ script -f /tmp/shared-screen 

2. Student views the session in real time by reading the shared-screen file –

student@linubuvma:/tmp$ cat shared-screen
Script started on Fri 31 Mar 2017 12:09:42 AM EDT

As soon as the student runs the ‘cat shared-screen’ command, the script command also gets started on the instructor’s session – opening a named pipe for writing blocks until a reader opens the other end.

Whatever is typed on the instructor’s terminal will show up on the student’s screen and the student’s terminal will be restored as soon as the instructor exits or terminates the script session –

instructor@linubuvma:/$ free -m
             total       used       free     shared    buffers     cached
Mem:          3946       3572        374         40        288        996
-/+ buffers/cache:       2288       1658
Swap:         4092        195       3897
instructor@linubuvma:/$ exit
exit

Script done on Fri 31 Mar 2017 12:12:02 AM EDT
student@linubuvma:/tmp$

Note – the student’s screen will show the user id of the instructor at the bash prompt, as it is a replica of the instructor’s session. Once the instructor terminates the session, the student gets back their original bash prompt.

References

http://man7.org/linux/man-pages/man1/script.1.html

Linux – Sort IPv4 addresses numerically

A novice user’s first attempt to sort a list of IP addresses would be to use ‘sort -n’, the numeric-sort option of the sort command. Unfortunately, this sorts only by the first octet of the IP address, the part preceding the initial dot (‘.’). The GNU sort command definitely does support sorting IPv4 addresses in numeric order – we just have to specify the right options.

Questions to answer –

1. What is our delimiter for IPv4? dot.
2. What type of sorting? numeric.
3. How many fields? four.

Reading the man page for sort provides an option for each – 1) -t. 2) -n 3) -k
The third part might need clarification – since we have the dot as a separator, the IP address has four fields. We need to give sort a key specification (-k) with start and stop positions, i.e. sort by the first octet (-k1,1), then the second (-k2,2), then the third (-k3,3) and finally the fourth (-k4,4).

The full command looks like this –

sort -t. -n -k1,1 -k2,2 -k3,3 -k4,4 /tmp/ipv4_file.txt
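
To see why the key specification matters, here is a small sketch comparing plain ‘sort -n’ with the field-keyed version on made-up addresses that share the first octet –

# three addresses that differ only after the first octet
printf '%s\n' 192.168.10.1 192.168.2.1 192.168.1.100 > /tmp/sample_ips.txt

# plain numeric sort only compares the leading "192.168" value,
# so 192.168.10.1 may end up before 192.168.2.1
sort -n /tmp/sample_ips.txt

# keyed numeric sort compares octet by octet and prints:
# 192.168.1.100
# 192.168.2.1
# 192.168.10.1
sort -t. -n -k1,1 -k2,2 -k3,3 -k4,4 /tmp/sample_ips.txt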

Let us use ForgeryPy to generate random IPv4 addresses – we will write a simple Python script that writes these random IPs to a file.

First install ForgeryPy –

pip install ForgeryPy

Script to generate IPv4 addresses –

$cat ipv4_generator.py

#!/usr/bin/env python

import forgery_py

# collect up to 50 unique random IPv4 addresses
uniq_ipv4 = set()
for i in range(50):
    uniq_ipv4.add(forgery_py.internet.ip_v4())

# write one address per line
with open('/tmp/ipv4_addresses.txt', 'w') as fp:
    for line in uniq_ipv4:
        fp.writelines(line + '\n')

Output –

daniel@linubuvma:/tmp$ cat /tmp/ipv4_addresses.txt
cat: /tmp/ipv4_addresses.txt: No such file or directory
daniel@linubuvma:/tmp$ python ipv4_generator.py
daniel@linubuvma:/tmp$ cat /tmp/ipv4_addresses.txt
222.21.147.97
187.234.9.45
144.101.36.131
31.192.196.59
24.16.131.84
8.52.22.181
17.40.228.224
58.164.169.156
234.78.147.45
254.150.145.225
167.111.243.3
168.168.248.227
68.104.225.196
55.138.152.3
223.30.151.183
235.245.57.76
226.122.222.107
176.199.0.130
13.68.133.125
14.157.155.254
11.155.170.92
249.0.112.141
228.209.60.62
246.130.20.235
113.17.65.20
120.76.166.133
81.191.49.37
17.226.209.151
81.184.136.140
9.172.35.65
129.205.96.54
181.130.8.142
21.78.73.162
5.216.102.88
91.140.115.96
134.140.243.193
177.148.152.60
175.37.63.212
60.175.123.112
176.250.114.170
54.62.22.255
182.78.64.216
238.92.143.140
181.206.65.80
11.139.192.62
38.158.146.36
241.236.161.184
30.223.32.242
233.107.53.70
36.222.68.164
daniel@linubuvma:/tmp$

Let us sort it –

daniel@linubuvma:/tmp$ sort -n -t. -k1,1 -k2,2 -k3,3 -k4,4 /tmp/ipv4_addresses.txt
5.216.102.88
8.52.22.181
9.172.35.65
11.139.192.62
11.155.170.92
13.68.133.125
14.157.155.254
17.40.228.224
17.226.209.151
21.78.73.162
24.16.131.84
30.223.32.242
31.192.196.59
36.222.68.164
38.158.146.36
54.62.22.255
55.138.152.3
58.164.169.156
60.175.123.112
68.104.225.196
81.184.136.140
81.191.49.37
91.140.115.96
113.17.65.20
120.76.166.133
129.205.96.54
134.140.243.193
144.101.36.131
167.111.243.3
168.168.248.227
175.37.63.212
176.199.0.130
176.250.114.170
177.148.152.60
181.130.8.142
181.206.65.80
182.78.64.216
187.234.9.45
222.21.147.97
223.30.151.183
226.122.222.107
228.209.60.62
233.107.53.70
234.78.147.45
235.245.57.76
238.92.143.140
241.236.161.184
246.130.20.235
249.0.112.141
254.150.145.225

Hope this helps.

References –

http://man7.org/linux/man-pages/man1/sort.1.html
https://pypi.python.org/pypi/ForgeryPy

Linux – run a scheduled command once

When we think of running scheduled tasks in Linux, the first tool that comes to mind for most Linux users and admins is cron. Cron is very popular and useful when you want to run a task regularly – say after a given interval, hourly, weekly or even every time the system reboots. The scheduled tasks are faithfully executed by the crond daemon based on the schedule we set; if crond missed a task because the machine was not running 24/7, then anacron takes care of it. My topic today though is at, which executes a scheduled task only once at a later time.

1. Adding future commands interactively

Let us schedule a specific command to run 10 minutes from now; press CTRL+D once you have entered the command –

daniel@lindell:~$ at now +10 minutes
at> ps aux &> /tmp/at.log
[[PRESS CTRL+D HERE]]
job 4 at Wed Mar  1 21:24:00 2017

Now the above command ‘ps aux’ is scheduled to run 10 minutes from now, only once. We can check the pending jobs using the atq command –

daniel@lindell:~$ atq
4	Wed Mar  1 21:24:00 2017 a daniel

2. Remove scheduled jobs from queue using atrm or at -r

daniel@lindell:~$ at now +1 minutes
at> ps aux > /tmp/atps.logs
at> <EOT>
job 8 at Wed Mar  1 21:25:00 2017
daniel@lindell:~$ atq
8	Wed Mar  1 21:25:00 2017 a daniel
daniel@lindell:~$ atrm 8
daniel@lindell:~$ atq
daniel@lindell:~$ 

3. Run jobs from a script or file.

In some cases the job you want to run is a script –

daniel@lindell:~$ at -f /tmp/myscript.sh 8:00 AM tomorrow
daniel@lindell:~$ atq
11	Thu Mar  2 08:00:00 2017 a daniel

4. Embed shell commands inline –

at now +10 minutes <<-EOF
if [ -d ~/pythonscripts ]; then
 find ~/pythonscripts/ -type f -iname '*.pyc' -delete
fi
EOF

5. View contents of scheduled task using ‘at -c JOBNUMBER’ :

daniel@lindell:~$ at now +10 minutes <<-EOF
> if [ -d ~/pythonscripts ]; then
>  find ~/pythonscripts/ -type f -iname '*.pyc' -delete
> fi
> EOF
job 13 at Wed Mar  1 21:51:00 2017

daniel@lindell:~$ atq
11	Thu Mar  2 08:00:00 2017 a daniel
12	Wed Mar  1 21:45:00 2017 a daniel
13	Wed Mar  1 21:51:00 2017 a daniel


daniel@lindell:~$ at -c 13
 [[ TRUNCATED ENVIRONMENTAL STUFF ]]
cd /home/daniel || {
	 echo 'Execution directory inaccessible' >&2
	 exit 1
}
if [ -d ~/pythonscripts ]; then
 find ~/pythonscripts/ -type f -iname '*.pyc' -delete
fi

In this small tutorial about the at utility, we saw some of its use cases – especially executing a scheduled task only once. The time specification it uses is human friendly; for example, it supports time specs such as midnight, noon, teatime or today. Feel free to read the man pages for details.
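
A few of those human-friendly time specs in action – the job bodies below are just placeholders –

# run a script at teatime (4pm); if that time has already passed today, it runs tomorrow
at -f /tmp/myscript.sh teatime

# clear a scratch directory at midnight
echo 'rm -rf /tmp/appcache/*' | at midnight

# print a reminder to all logged-in terminals at noon tomorrow
echo 'wall "standup notes due"' | at noon tomorrow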

References –

https://linux.die.net/man/1/at

How to run playbooks against a host running ssh on a port other than port 22.

Ansible is a simple automation and configuration management tool which allows you to execute a command or script on remote hosts, either ad hoc or using playbooks. It is push based, and uses ssh to run the playbooks against remote hosts. The steps below show how to run Ansible playbooks against a host running ssh on a port other than 22 – in this case, port 2222.

One of the hosts managed by Ansible is running ssh on a non-default port. It is a Docker container listening on port 2222 – ssh in the container actually listens on port 22, but the host redirects port 2222 to port 22 on the container.

1. Pass the port as an extra variable –


 ansible-playbook tasks/app-deployment.yml --check -e ansible_ssh_port=2222

2. Specify the port in the inventory or hosts file –

In the hosts file, set the hostname to the format ‘server:port’ –

[docker-hosts]
docker1:2222
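
The same thing can also be expressed as a per-host inventory variable, using the same ansible_ssh_port variable from the first method –

[docker-hosts]
docker1 ansible_ssh_port=2222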

Let us run the playbook now –

root@linubuvma:/tmp/ansible# cat tasks/app-deployment.yml
- hosts: docker-hosts
  vars:
    app_version: 1.1.0
  tasks:
  - name: install git
    apt: name=git state=latest
  - name: Checkout the application from git
    git: repo=https://github.com/docker/docker-py.git dest=/srv/www/myapp version={{ app_version }}
    register: app_checkout_result


root@linubuvma:/tmp/ansible# ansible-playbook tasks/app-deployment.yml

PLAY [docker-hosts] ************************************************************

TASK: [install git] ***********************************************************
changed: [docker1]

TASK: [Checkout the application from git] *************************************
changed: [docker1]

PLAY RECAP ********************************************************************
docker1                    : ok=2    changed=2    unreachable=0    failed=0

References –

http://docs.ansible.com/
http://docs.ansible.com/ansible/intro_inventory.html

Randomly ordering files in a directory with python

I have a playlist file which lists audio files to play. The audio player unfortunately plays the music files in sequential order, in whatever order they are listed in the playlist file. So occasionally I have to regenerate the playlist file to randomize the order of the audio files. Here is a simple script I wrote for this purpose; the core component is the random.shuffle(list) Python function –

Create script file as shuffle_files.py –

#!/usr/bin/env python

import os
import random
import sys

music_files=[]

if len(sys.argv) != 2:
  print "Usage:", sys.argv[0], "/path/directory"
  sys.exit(1)  # nothing to do without a directory argument
else:
  dir_name=sys.argv[1]
  if os.path.isdir(dir_name):
    for file_name in os.listdir(dir_name):
      music_files.append(file_name)
  else:
    print "Directory", dir_name, "does not exist"
    sys.exit(1)
# shuffle list
random.shuffle(music_files)
for item in music_files:
  print os.path.join(dir_name,item)

Run the script by providing the path to a directory with files. Each run should list the files in the directory in a different order.
Note – the script does not recurse into subdirectories; it can easily be modified with os.walk if necessary.

root@svm1010:/home/daniel/scripts# python shuffle_files.py /opt/iotop/iotop
/opt/iotop/iotop/setup.py
/opt/iotop/iotop/README
/opt/iotop/iotop/iotop
/opt/iotop/iotop/iotop.8
/opt/iotop/iotop/NEWS
/opt/iotop/iotop/iotop.py
/opt/iotop/iotop/PKG-INFO
/opt/iotop/iotop/THANKS
/opt/iotop/iotop/sbin
/opt/iotop/iotop/setup.cfg
/opt/iotop/iotop/ChangeLog
/opt/iotop/iotop/.gitignore
/opt/iotop/iotop/COPYING


root@svm1010:/home/daniel/scripts# python shuffle_files.py /opt/iotop/iotop
/opt/iotop/iotop/PKG-INFO
/opt/iotop/iotop/COPYING
/opt/iotop/iotop/iotop
/opt/iotop/iotop/setup.cfg
/opt/iotop/iotop/NEWS
/opt/iotop/iotop/README
/opt/iotop/iotop/.gitignore
/opt/iotop/iotop/setup.py
/opt/iotop/iotop/THANKS
/opt/iotop/iotop/iotop.py
/opt/iotop/iotop/ChangeLog
/opt/iotop/iotop/iotop.8
/opt/iotop/iotop/sbin


root@svm1010:/home/daniel/scripts# python shuffle_files.py /opt/iotop/iotop
/opt/iotop/iotop/THANKS
/opt/iotop/iotop/setup.py
/opt/iotop/iotop/NEWS
/opt/iotop/iotop/README
/opt/iotop/iotop/iotop.8
/opt/iotop/iotop/.gitignore
/opt/iotop/iotop/ChangeLog
/opt/iotop/iotop/sbin
/opt/iotop/iotop/PKG-INFO
/opt/iotop/iotop/iotop
/opt/iotop/iotop/COPYING
/opt/iotop/iotop/iotop.py
/opt/iotop/iotop/setup.cfg

Reference – https://docs.python.org/2/library/random.html?highlight=shuffle#random.shuffle

How to interact with web services.

Curl is the de facto CLI tool for interacting with web services as well as non-HTTP services such as FTP or LDAP. Linux and Unix system administrators as well as developers love it for its ease of use and debugging capabilities. When you want to interact with web services from within scripts, curl is the number one choice. For downloading files from the web, wget is commonly used as well, but curl can do way more.

Since enough has been written about curl, this post is about a tool which makes interaction with web services a lot more human friendly, with nicely formatted and colored output – httpie. It is written in Python.

Installation

apt-get  install httpie     #(Debian/Ubuntu)
yum install httpie          #(Redhat/CentOS)

Note – although the package name is httpie, the binary file is installed as http.

When troubleshooting web services, the first things we check are usually the HTTP request and response headers –

daniel@lindell:/$ http -p hH  httpbin.org
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: httpbin.org
User-Agent: HTTPie/0.9.2

HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 12150
Content-Type: text/html; charset=utf-8
Date: Thu, 22 Dec 2016 01:32:13 GMT
Server: nginx

Here -H is for request headers and -h is for response headers. Similarly, -B is for the request body and -b is for the response body.
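
For example, to print only the response body, which is handy when piping JSON to other tools – httpbin.org/ip is just a convenient test endpoint –

daniel@lindell:/$ http -p b httpbin.org/ip
{
    "origin": "192.1.1.2"
}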

We can also pass more complex HTTP headers, in this case “If-Modified-Since” – the web server will return a 304 if the static content I am requesting has not been modified. Moving the date a few years back, it responds with a 200 status code.

daniel@lindell:/$ http -p hH http://linuxfreelancer.com/wp-content/themes/soulvision/images/texture.jpg "If-Modified-Since: Wed, 21 Dec 2016 20:51:14 GMT"
GET /wp-content/themes/soulvision/images/texture.jpg HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: linuxfreelancer.com
If-Modified-Since:  Wed, 21 Dec 2016 20:51:14 GMT
User-Agent: HTTPie/0.9.2

HTTP/1.1 304 Not Modified
Connection: Keep-Alive
Date: Thu, 22 Dec 2016 01:39:28 GMT
ETag: "34441c-f04-4858fcd6af900"
Keep-Alive: timeout=15, max=100
Server: Apache/2.2.14 (Ubuntu)

daniel@lindell:/$ http -p hH http://linuxfreelancer.com/wp-content/themes/soulvision/images/texture.jpg "If-Modified-Since: Wed, 21 Dec 2008 20:51:14 GMT"
GET /wp-content/themes/soulvision/images/texture.jpg HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: linuxfreelancer.com
If-Modified-Since:  Wed, 21 Dec 2008 20:51:14 GMT
User-Agent: HTTPie/0.9.2

HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: Keep-Alive
Content-Length: 3844
Content-Type: image/jpeg
Date: Thu, 22 Dec 2016 01:39:37 GMT
ETag: "34441c-f04-4858fcd6af900"
Keep-Alive: timeout=15, max=100
Last-Modified: Sat, 01 May 2010 22:23:00 GMT
Server: Apache/2.2.14 (Ubuntu)

httpie also makes sending JSON payloads as well as POST/PUT methods a lot easier. There is no need to format your payload as JSON – it defaults to JSON. Debugging is easier too with the -v option, which shows the raw wire data –

daniel@lindell:/$ http -v PUT httpbin.org/put name=JoeDoe email=joedoe@gatech.edu
PUT /put HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 48
Content-Type: application/json
Host: httpbin.org
User-Agent: HTTPie/0.9.2

{
    "email": "joedoe@gatech.edu", 
    "name": "JoeDoe"
}

HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 487
Content-Type: application/json
Date: Thu, 22 Dec 2016 01:44:20 GMT
Server: nginx

{
    "args": {}, 
    "data": "{\"name\": \"JoeDoe\", \"email\": \"joedoe@gatech.edu\"}", 
    "files": {}, 
    "form": {}, 
    "headers": {
        "Accept": "application/json", 
        "Accept-Encoding": "gzip, deflate", 
        "Content-Length": "48", 
        "Content-Type": "application/json", 
        "Host": "httpbin.org", 
        "User-Agent": "HTTPie/0.9.2"
    }, 
    "json": {
        "email": "joedoe@gatech.edu", 
        "name": "JoeDoe"
    }, 
    "origin": "192.1.1.2", 
    "url": "http://httpbin.org/put"
}

I have touched just the surface of httpie here; please feel free to get more detailed information from the GitHub repo. It has built-in JSON support, form/file uploads, HTTPS, proxies and authentication, custom headers, persistent sessions and more.

Article on wget and curl from previous post.

The date command on Linux boxes is one of the most powerful open source utilities. It is not just for setting the clock on your PC or server, or showing you what the current time is – it can do amazingly more. It can answer virtually all of your chronological questions.

The simplest use case of the date command is to view the current time, possibly in different time formats –

$ date
Sat Dec 17 00:45:35 EST 2016

$ date '+%Y-%m-%d'
2016-12-17

$ date '+%c'
Sat 17 Dec 2016 12:45:51 AM EST

It is useful in converting time to/from epoch as well –

$ date '+%s'
1481953669

$ date --date='@1481953669'
Sat Dec 17 00:47:49 EST 2016
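
Since epoch times are plain seconds, they make quick date arithmetic easy – for example, the number of days between two dates using shell arithmetic –

$ echo $(( ( $(date -d '2017-01-01' +%s) - $(date -d '2016-12-17' +%s) ) / 86400 ))
15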

The most user friendly feature of the date command is the ‘-d’ or ‘–date’ option, which accepts a mostly free-format, human-readable date string such as “yesterday”, “last week”, “next year”, “3 min ago”, “last friday + 2 hours” etc. Here is an excerpt from the man page of the GNU date command –

DATE STRING
The --date=STRING is a mostly free format human readable date string such as "Sun, 29 Feb 2004 16:21:42 -0800" or "2004-02-29 16:21:42" or even "next Thursday". A date string may contain items indicating calendar date, time of day, time zone, day of week, relative time, relative date, and numbers. An empty string indicates the beginning of the day. The date string format is more complex than is easily documented here but is fully described in the info documentation.

Let us play with it –

$ date -d '2 hours ago'
Fri Dec 16 22:51:25 EST 2016

$ date -d '2 hours ago' '+%c'
Fri 16 Dec 2016 10:51:30 PM EST

$ env TZ=America/Los_Angeles date -d '2 hours ago' '+%c'
Fri 16 Dec 2016 07:52:33 PM PST

$ date -d 'jan 2 1990'
Tue Jan  2 00:00:00 EST 1990

$ date -d 'yesterday'
Fri Dec 16 00:53:04 EST 2016

$ date -d 'next year + 2 weeks'
Sun Dec 31 00:53:27 EST 2017

To give a practical example, let us use the date command to find out on which day of the week all of someone’s birthdays fall, given their date of birth. This works for past birthdays as well as future ones; for this example, we will go from the date of birth to the present. Let us pick someone who was born on Feb 29, 1988 – an edge case. The date command should be smart enough to figure out the leap years.

for year in {1988..2016}; do 
  date -d "feb 29 $year" &>/dev/null
  if [ $? -eq 0 ]; then
    echo -n "Year: $year   " ; date -d "feb 29 $year" '+%c'
  fi
done

Year: 1988   Mon 29 Feb 1988 12:00:00 AM EST
Year: 1992   Sat 29 Feb 1992 12:00:00 AM EST
Year: 1996   Thu 29 Feb 1996 12:00:00 AM EST
Year: 2000   Tue 29 Feb 2000 12:00:00 AM EST
Year: 2004   Sun 29 Feb 2004 12:00:00 AM EST
Year: 2008   Fri 29 Feb 2008 12:00:00 AM EST
Year: 2012   Wed 29 Feb 2012 12:00:00 AM EST
Year: 2016   Mon 29 Feb 2016 12:00:00 AM EST

A more typical case would be, say, someone born on Jan 8, 1990 –

age=0
for year in {1990..2016}; do 
  echo -n "Age: $age  "; date -d "Jan 8 $year" '+%A %d %B %Y'
  age=$((age+1))
done

Age: 0  Monday 08 January 1990
Age: 1  Tuesday 08 January 1991
Age: 2  Wednesday 08 January 1992
Age: 3  Friday 08 January 1993
Age: 4  Saturday 08 January 1994
Age: 5  Sunday 08 January 1995
Age: 6  Monday 08 January 1996
Age: 7  Wednesday 08 January 1997
Age: 8  Thursday 08 January 1998
Age: 9  Friday 08 January 1999
Age: 10  Saturday 08 January 2000
Age: 11  Monday 08 January 2001
Age: 12  Tuesday 08 January 2002
Age: 13  Wednesday 08 January 2003
Age: 14  Thursday 08 January 2004
Age: 15  Saturday 08 January 2005
Age: 16  Sunday 08 January 2006
Age: 17  Monday 08 January 2007
Age: 18  Tuesday 08 January 2008
Age: 19  Thursday 08 January 2009
Age: 20  Friday 08 January 2010
Age: 21  Saturday 08 January 2011
Age: 22  Sunday 08 January 2012
Age: 23  Tuesday 08 January 2013
Age: 24  Wednesday 08 January 2014
Age: 25  Thursday 08 January 2015
Age: 26  Friday 08 January 2016

Sooner or later, you will find yourself adding sensitive data into Ansible playbooks, host or group vars files. Such information might include MySQL DB credentials, AWS secret keys, API credentials etc. Including such sensitive information in plain text might not be acceptable for security compliance reasons, or could even lead to your systems being owned when your company hires a third party to do pen testing, or worse yet by outside hackers. In addition, sharing such playbooks to public repositories such as GitHub won’t be easy, as you would have to manually search for and redact all the sensitive information from all your playbooks – and as we know, manual procedures are error prone. You might ‘forget’ to remove some of the passwords.

One solution for this is a password vault to hold all your sensitive data, and Ansible provides a utility called ansible-vault to create an encrypted file from which the data can be extracted when running your playbooks with a single option. This is equivalent to Chef’s data bags.

In this blog post, I will share with you how to use a secret key file to protect sensitive data in Ansible with the ansible-vault utility. The simplest use case is to protect the encrypted file with a password or passphrase, but that is not convenient as you have to type the password every time you run a playbook, and it is not as strong as a key file with hundreds or thousands of random characters. Thus the steps below describe only the procedure for setting up a secret key file rather than a password-protected encrypted file. Let us get started.

The first step is to generate a key file containing a random list of characters –

#openssl rand -base64 512 |xargs > /opt/ansible/vaultkey

Create or initialize the vault with the key file generated above –

#ansible-vault create --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml

Populate your vault, refer to Ansible documentation on the format of the vault file –

#ansible-vault edit --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml

You can view the contents by replacing ‘edit’ with ‘view’ –

#ansible-vault view --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml

That is it, you have a secret key file to protect and encrypt a YAML file containing all your sensitive variables to be used in your ansible playbooks.

There comes a time though when you have to change the secret key file – say an admin leaves the company after winning the Mega jackpot lottery 🙂 We have to generate a new key file and rekey the encrypted file as soon as possible –

Generate a new key file –

#openssl rand -base64 512 |xargs > /opt/ansible/vaultkey.new

Rekey to new key file –

#ansible-vault rekey --new-vault-password-file=/opt/ansible/vaultkey.new --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml
Rekey successful

Verify –

#ansible-vault view --vault-password-file=/opt/ansible/vaultkey.new /opt/ansible/lamp/group_vars/dbservers.yml

Last but not least, make sure the secret key file is well protected and is readable only by the owner.

#chmod 600 /opt/ansible/vaultkey.new

Finally, you can use the vault with ansible-playbook. In this case, I am running it against site.yml, which is a master playbook to set up a LAMP cluster in AWS (pulling the AWS instances using the ec2.py dynamic inventory script) –

#ansible-playbook -i /usr/local/bin/ec2.py site.yml --vault-password-file /opt/ansible/vaultkey.new
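
To avoid typing the --vault-password-file option on every run, the key file location can also be set once under the [defaults] section of ansible.cfg via the vault_password_file setting – a minimal sketch, assuming ansible.cfg sits in your project directory –

# append to the project's ansible.cfg (or ~/.ansible.cfg)
cat >> ansible.cfg <<'EOF'
[defaults]
vault_password_file = /opt/ansible/vaultkey.new
EOF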

Web sites store information on the local machines of site visitors using cookies. On subsequent visits, the browser sends the data from the cookies on the visitor’s machine back to the web server, which might then use that information as a historical record of the user’s activity on the site – at a minimum the time the cookie was created, when it is set to expire and the last time the user visited the site. Cookies are also used by sites to ‘remember’ user activity, say shopping cart items or login/session information, to address the shortcomings of the stateless HTTP protocol.

Most users think that only the sites they have directly visited store cookies on their computers; in reality the number is way higher than that. A single site you visit usually has lots of links in it, especially ads, that store cookies on your computer. In this post, I will demonstrate how to list all of the sites that left cookies on your computer, as well as how to extract additional information from those cookies. When I ran the script and counted the top 10 sites with the largest number of entries in the cookies sqlite DB, all but one or two of them were sites I had never directly visited!

This Python script was written to extract cookie information on a Linux box running Firefox. The cookie data is stored as a sqlite file, so you will need the sqlite3 Python module to read it.

The script takes the path to the cookies file as well as the path to an output file; it will write the results to that file and also dump them to the screen.

root@dnetbook:/home/daniel/python# python cookie_viewer.py 
cookie_viewer.py cookie-fullpath output-file

root@dnetbook:/home/daniel/python# python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt
doubleclick.net,Thu Feb 11 17:56:01 2016,Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014
twitter.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014
imrworldwide.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014
quantserve.com,Thu Aug 13 19:32:02 2015,Thu Apr 23 20:46:57 2015,Tue Feb 11 18:32:0

The output will be the domain name of the site, cookie expiry date, access time and creation time.

Code follows –

#!/usr/bin/env python

''' Given a location to firefox cookie sqlite file
    Write its date param - expiry, last accessed,
    Creation time to a file in plain text.
    id
    baseDomain
    appId
    inBrowserElement
    name
    value
    host
    path
    expiry
    lastAccessed
    creationTime
    isSecure
    isHttpOnly
    python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt 
'''

import sys
import os
from datetime import datetime
import sqlite3

def Usage():
    print "{0} cookie-fullpath output-file".format(sys.argv[0])
    sys.exit(1)

if len(sys.argv)<3:
    Usage()

sqldb=sys.argv[1]
destfile=sys.argv[2]
# Some dates in the cookies file might not be valid, or too big
MAXDATE=2049840000

# cookies file must be there, most often file name is cookies.sqlite
if not os.path.isfile(sqldb):
    Usage()

# a hack - to convert the epoch times to human readable format
def convert(epoch):
    mydate=epoch[:10]
    if int(mydate)>MAXDATE:
        mydate=str(MAXDATE)
    if len(epoch)>10:
        mytime=epoch[11:]
    else:
        mytime='0'
    fulldate=float(mydate+'.'+mytime)
    x=datetime.fromtimestamp(fulldate)
    return x.ctime()

# Bind to the sqlite db and execute sql statements
conn=sqlite3.connect(sqldb)
cur=conn.cursor()
try:
    data=cur.execute('select * from moz_cookies')
except sqlite3.Error, e:
    print 'Error {0}:'.format(e.args[0])
    sys.exit(1)
mydata=data.fetchall()

# Dump results to a file
with open(destfile, 'w') as fp:
    for item in mydata:
        urlname=item[1]
        expiry=convert(str(item[8]))
        accessed=convert(str(item[9]))
        created=convert(str(item[10]))
        fp.writelines(urlname + ',' + expiry + ',' + accessed + ',' + created)
        fp.writelines('\n')

# Dump to stdout as well
with open(destfile) as fp:
    for line in fp:
        print line

Top 10 sites with the highest number of entries in the cookies file –

root@dnetbook:/home/daniel/python# awk -F, '{print $1}' /tmp/test.txt  | sort | uniq -c | sort -nr | head -10
     73 taboola.com
     59 techrepublic.com
     43 insightexpressai.com
     34 pubmatic.com
     33 2o7.net
     31 rubiconproject.com
     28 demdex.net
     27 chango.com
     26 yahoo.com
     26 optimizely.com