Re-posted from http://awkaslanguage.blogspot.in. Please visit it for more tailor-made awk training material.
Now that we have learnt some basics of AWK, let's move on to applications
such as log processing. In this session we will see how to process web server
log files. Below is a sample log file that we will use for our
discussion.
Sample log file:
213.60.233.243 - - [25/May/2004:00:17:09 +1200] "GET /internet/index.html HTTP/1.1" 200 6792 "http://www.mediacollege.com/video/streaming/http.html" "Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.6) Gecko/20040413 Debian/1.6-5"
151.44.15.252 - - [25/May/2004:00:17:20 +1200] "GET /cgi-bin/forum/commentary.pl/noframes/read/209 HTTP/1.1" 200 6863 "http://search.virgilio.it/search/cgi/search.cgi?qs=download+video+illegal+Berg&lr=&dom=s&offset=0&hits=10&switch=0&f=us" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /js/common.js HTTP/1.1" 200 2263 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /css/common.css HTTP/1.1" 200 6123 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /images/navigation/home1.gif HTTP/1.1" 200 2735 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /data/zookeeper/ico-100.gif HTTP/1.1" 200 196 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:22 +1200] "GET /adsense-alternate.html HTTP/1.1" 200 887 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:39 +1200] "GET /data/zookeeper/status.html HTTP/1.1" 200 4195 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
1) To get all requests from a particular webpage
151.44.15.252 - - [25/May/2004:00:17:20 +1200] "GET /cgi-bin/forum/commentary.pl/noframes/read/209 HTTP/1.1" 200 6863 "http://search.virgilio.it/search/cgi/search.cgi?qs=download+video+illegal+Berg&lr=&dom=s&offset=0&hits=10&switch=0&f=us" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
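One way to get this output is sketched below. It assumes each log entry sits on a single line in the combined log format, so that awk's default whitespace splitting puts the quoted referrer in field 11. It also reads "from a particular webpage" as "referred by that page". The function name `requests_from_referrer` and the file name `access.log` are illustrative only.

```shell
# Hypothetical helper: print every request whose referrer contains the given text.
#   $1 = text to look for in the referrer, $2 = log file
# Assumes the quoted referrer is field 11 under default whitespace splitting.
requests_from_referrer() {
    awk -v ref="$1" 'index($11, ref) > 0' "$2"
}
```

It could be called as `requests_from_referrer "search.virgilio.it" access.log`. Using `index()` rather than a regular expression avoids having to escape the dots and slashes in the URL.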
2) To find the number of hits from each IP address
213.60.233.243 has hit webserver 1 times
151.44.15.252 has hit webserver 7 times
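A minimal sketch of how such a count could be produced, assuming one log entry per line with the client IP address in field 1. The function name `hits_per_ip` and the file name `access.log` are illustrative only.

```shell
# Hypothetical helper: count requests per client IP address.
#   $1 = log file
hits_per_ip() {
    awk '{ hits[$1]++ }                       # field 1 is the client IP
         END {
             for (ip in hits)
                 print ip, "has hit webserver", hits[ip], "times"
         }' "$1"
}
```

It could be called as `hits_per_ip access.log`. The order in which `for (ip in hits)` visits the keys is unspecified in awk, so the lines may come out in a different order from the one shown above; pipe the result through `sort` if a stable order matters.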
3) To find all the requests within a time period (assuming requests are ordered sequentially by time)
213.60.233.243 - - [25/May/2004:00:17:09 +1200] "GET /internet/index.html HTTP/1.1" 200 6792 "http://www.mediacollege.com/video/streaming/http.html" "Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.6) Gecko/20040413 Debian/1.6-5"
151.44.15.252 - - [25/May/2004:00:17:20 +1200] "GET /cgi-bin/forum/commentary.pl/noframes/read/209 HTTP/1.1" 200 6863 "http://search.virgilio.it/search/cgi/search.cgi?qs=download+video+illegal+Berg&lr=&dom=s&offset=0&hits=10&switch=0&f=us" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
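Because the entries are in time order, an awk range pattern is a natural fit here. The sketch below assumes one log entry per line with the timestamp in field 4, and that both timestamps actually occur in the log. The function name `requests_between` and the file name `access.log` are illustrative only.

```shell
# Hypothetical helper: print the entries from the first line carrying the
# start timestamp through the first line carrying the end timestamp.
#   $1 = start timestamp, $2 = end timestamp, $3 = log file
# Field 4 looks like "[25/May/2004:00:17:09", so index() is used to search
# for the timestamp text inside it.
requests_between() {
    awk -v start="$1" -v end="$2" 'index($4, start), index($4, end)' "$3"
}
```

It could be called as `requests_between "25/May/2004:00:17:09" "25/May/2004:00:17:20" access.log`. A range pattern closes on the first line that matches the end condition, so if several entries share the end timestamp only the first of them is printed.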
4) To find the number of hits each URL received
URL /adsense-alternate.html got hit 1 times
URL /internet/index.html got hit 1 times
URL /cgi-bin/forum/commentary.pl/noframes/read/209 got hit 1 times
URL /data/zookeeper/status.html got hit 1 times
URL /css/common.css got hit 1 times
URL /images/navigation/home1.gif got hit 1 times
URL /data/zookeeper/ico-100.gif got hit 1 times
URL /js/common.js got hit 1 times
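This is the same counting idiom as for IP addresses, keyed on the requested path instead. The sketch below assumes one log entry per line with the request path in field 7 (after `"GET` in field 6). The function name `hits_per_url` and the file name `access.log` are illustrative only.

```shell
# Hypothetical helper: count requests per requested URL path.
#   $1 = log file
hits_per_url() {
    awk '{ hits[$7]++ }                       # field 7 is the request path
         END {
             for (url in hits)
                 print "URL", url, "got hit", hits[url], "times"
         }' "$1"
}
```

It could be called as `hits_per_url access.log`. As with any `for (key in array)` loop in awk, the output order is unspecified, which is consistent with the listing above not following the order of the log.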
5) To group all the requests from a particular IP address
213.60.233.243=>
213.60.233.243 - - [25/May/2004:00:17:09 +1200] "GET /internet/index.html HTTP/1.1" 200 6792 "http://www.mediacollege.com/video/streaming/http.html" "Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.6) Gecko/20040413 Debian/1.6-5"
151.44.15.252=>
151.44.15.252 - - [25/May/2004:00:17:20 +1200] "GET /cgi-bin/forum/commentary.pl/noframes/read/209 HTTP/1.1" 200 6863 "http://search.virgilio.it/search/cgi/search.cgi?qs=download+video+illegal+Berg&lr=&dom=s&offset=0&hits=10&switch=0&f=us" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /js/common.js HTTP/1.1" 200 2263 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /css/common.css HTTP/1.1" 200 6123 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /images/navigation/home1.gif HTTP/1.1" 200 2735 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:21 +1200] "GET /data/zookeeper/ico-100.gif HTTP/1.1" 200 196 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:22 +1200] "GET /adsense-alternate.html HTTP/1.1" 200 887 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
151.44.15.252 - - [25/May/2004:00:17:39 +1200] "GET /data/zookeeper/status.html HTTP/1.1" 200 4195 "http://www.mediacollege.com/cgi-bin/forum/commentary.pl/noframes/read/209" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.4.7.0)"
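Grouping can be sketched by collecting each IP address's lines in an associative array and printing them at the end. This assumes one log entry per line with the client IP in field 1, and a log small enough to hold in memory. The function name `group_by_ip` and the file name `access.log` are illustrative only.

```shell
# Hypothetical helper: print each client IP followed by all of its requests.
#   $1 = log file
group_by_ip() {
    awk '{
             if (!($1 in reqs)) order[++n] = $1   # remember IPs in first-seen order
             reqs[$1] = reqs[$1] $0 "\n"          # append the whole line to that IP
         }
         END {
             for (i = 1; i <= n; i++)
                 printf "%s=>\n%s", order[i], reqs[order[i]]
         }' "$1"
}
```

It could be called as `group_by_ip access.log`. The separate `order` array is a design choice: it keeps the groups in the order the IP addresses first appear, which a plain `for (ip in reqs)` loop would not guarantee.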