
Analyze and count httpd access_log

Because the access log of a Web server grows very large in a short
time, the file is usually renewed (rotated) at a fixed interval, for
example every week or every month. Here we assume that the log file is
renewed automatically every month, which is a very common setup for
WWW servers.

The default log file of Apache (access_log) contains lines in the
following format.

 host.domain - - [01/Jan/2000:01:23:45 +0900] "GET /index.html HTTP/1.1" 200 1548
 host.domain - - [01/Jan/2000:01:23:50 +0900] "GET /icons/mail.png HTTP/1.1" 200 229

Now let’s count the number of accesses per day (24 hours). What
we need is the “date” part (01) in [01/Jan/2000: . We count
the number of lines that have the same “date”, for every day of the
month from the 1st to the 31st.

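 #!/usr/bin/perl
 # Count the accesses to HTML files for each day of the month.
 while (<>) {
     if (/\.html/) {                     # only requests that mention .html
         @item = split;                  # split the line on white-space
         $day = substr($item[3], 1, 2);  # "[01/Jan/2000:..." -> "01"
         $count[$day]++;
     }
 }
 for ($i = 1; $i <= $#count; $i++) {
     printf("%10d %10d\n", $i, $count[$i]);
 }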

The access_log file records every kind of Web access, including
requests for image files, so log lines other than accesses to HTML
files have to be excluded. You can also count the accesses to one
particular file by changing the if line in the Perl program.
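As an illustration of changing the if line, the following is a minimal sketch that counts only the GET requests for one file. The subroutine name count_target and the sample target /index.html are our own assumptions for this example; the two log lines are the sample lines shown earlier, so that the sketch runs without a real access_log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count, per day of the month, the GET requests for one file.
# count_target is a hypothetical helper name used only in this sketch.
sub count_target {
    my ($target, @lines) = @_;
    my @count;
    for my $line (@lines) {
        # The changed "if" test: match the request for $target exactly.
        next unless $line =~ /"GET \Q$target\E /;
        my @item = split ' ', $line;
        $count[ substr($item[3], 1, 2) ]++;   # "[01/Jan/..." -> index 1
    }
    return @count;
}

# Sample lines in the default Apache format (taken from the text above).
my @log = (
    'host.domain - - [01/Jan/2000:01:23:45 +0900] "GET /index.html HTTP/1.1" 200 1548',
    'host.domain - - [01/Jan/2000:01:23:50 +0900] "GET /icons/mail.png HTTP/1.1" 200 229',
);

my @count = count_target('/index.html', @log);
printf("%10d %10d\n", $_, $count[$_] || 0) for 1 .. $#count;
```

With a real log you would pass the lines read from the file instead of @log. The \Q...\E pair keeps the dot in the file name from being treated as a regular-expression wildcard.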

First, each line is split into items (the delimiter is
white-space), and then the “date” part is extracted with
substr. It is assigned to the variable $day,
and the counter for that day is incremented. We named the Perl script
above “webplot.pl”.
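To make the split and substr step concrete, here is a small sketch that applies it to a single sample line. The subroutine name day_of is an assumption made for this example only; it is not part of webplot.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the day of the month as a number from one access_log line.
# day_of is a hypothetical helper name used only in this sketch.
sub day_of {
    my ($line) = @_;
    my @item = split ' ', $line;         # items separated by white-space
    # $item[3] looks like "[01/Jan/2000:01:23:45"; skip the "[" and
    # take the two characters of the day, then convert to a number.
    return substr($item[3], 1, 2) + 0;
}

my $sample = 'host.domain - - [01/Jan/2000:01:23:45 +0900] '
           . '"GET /index.html HTTP/1.1" 200 1548';
print day_of($sample), "\n";             # prints 1
```

The fourth item (index 3) is the one that begins with the bracket, because the host name and the two "-" fields come before it.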

The following is an example of the access statistics for a Web
server in January 2000, produced by “webplot.pl”.

  1       172
  2       321
  3       208
  4       279
  5       327
 ....      ....
 25       588
 26      1038
 27       848
 28       772
 29       570
 30       495
 31       548

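The following graph was drawn by gnuplot with the dumb terminal.
The < at the beginning of “< webplot.pl access_log” tells gnuplot
to run the command and read its output as the data. For this to work,
webplot.pl must be executable and found on the command search path.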

 gnuplot> set term dumb
 Terminal type set to 'dumb'
 Options are 'feed 79 24'
 gnuplot> plot "< webplot.pl access_log" with step

    1600 ++--------+---------+---------+---------+---------+---------+--------++
         +         +         +         +      "< webplot.pl access_log" ****** +
    1400 ++                                            ***                    ++
         |                                             * *                     |
         |                                             * *                     |
    1200 ++                                            * *                    ++
         |                                             * *                     |
    1000 ++                                  ***       * *   ***              ++
         |                                   * *       * *   * *               |
         |                                   * *       * *   * ***             |
     800 ++                                  * *       * *   *   ***          ++
         |                                   * * ***   * *   *     *           |
     600 ++                    ***           * *** *   * *   *     *          ++
         |             ***     * *           *     *** * *****     *** *       |
         |             * *     * *******     *       * *             ***       |
     400 ++            * *     *       *** ***       ***                      ++
         |   ***   *** * * *****         * *                                   |
     200 ++  * ***** *** ***             ***                                  ++
         | ***                                                                 |
         +         +         +         +         +         +         +         +
       0 ++--------+---------+---------+---------+---------+---------+--------++
         0         5        10        15        20        25        30        35