gnuplot / webplot / access_log (E)

- not so Frequently Asked Questions -

update 2004/11/29

Analyze and count httpd access_log

Because the access log of a Web server grows very large in a short time, it is usually rotated at a fixed interval, for example every week or every month. Here we assume that the log file is rotated automatically every month, which is a common setup for WWW servers.

The default Apache log file (access_log) contains lines in the following format.

host.domain - - [01/Jan/2000:01:23:45 +0900] "GET /index.html HTTP/1.1" 200 1548
host.domain - - [01/Jan/2000:01:23:50 +0900] "GET /icons/mail.png HTTP/1.1" 200 229
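
To see which white-space separated field holds what, the first sample line above can be split in Perl. This is only an illustrative sketch; the variable names $line and @field are ours and do not come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First sample line from the log excerpt above.
my $line = 'host.domain - - [01/Jan/2000:01:23:45 +0900] "GET /index.html HTTP/1.1" 200 1548';

# Split on white space; leading white space, if any, is ignored.
my @field = split ' ', $line;

# Print each field together with its index.
for my $i (0 .. $#field) {
    printf "%d  %s\n", $i, $field[$i];
}
```

For this line, field 3 is the time stamp with its leading bracket ([01/Jan/2000:01:23:45), field 6 is the requested path (/index.html), and field 8 is the status code (200).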

Now let's count how many accesses there are on each day. What we need is the day-of-month part (01) of [01/Jan/2000: . Count the number of lines that have the same day, and do this for every day from the 1st to the 31st of the month.

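$day = substr($_[3],1,2);               # day of month taken from "[01/Jan/2000:01:23:45"
printf("%10d %10d\n",$i,$count[$i]);    # one output line per day: day, number of accesses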

The access_log file records every kind of Web access, including requests for image files, so you need to exclude all log lines other than accesses to HTML files. You can also count accesses to one particular file by changing the if line in the Perl program.
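
The condition used in the original if line is not shown here, so the following is only a sketch of what such a test could look like. The helper names is_html_request and is_request_for are hypothetical, and the rule "a path ending in .html, .htm or / is an HTML page" is an assumption that may not fit every site.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the log line requests an HTML page.
# Assumes the requested path is field 6 of the white-space split.
sub is_html_request {
    my ($line) = @_;
    my @field = split ' ', $line;
    return 0 unless defined $field[6];
    my $path = $field[6];
    $path =~ s/\?.*//;                      # drop a query string, if any
    return $path =~ m{(?:\.html?|/)\z} ? 1 : 0;
}

# Hypothetical variant: true only for one particular file.
sub is_request_for {
    my ($line, $wanted) = @_;
    my @field = split ' ', $line;
    return (defined $field[6] && $field[6] eq $wanted) ? 1 : 0;
}

# Try both sample lines from the log excerpt.
my @sample = (
    'host.domain - - [01/Jan/2000:01:23:45 +0900] "GET /index.html HTTP/1.1" 200 1548',
    'host.domain - - [01/Jan/2000:01:23:50 +0900] "GET /icons/mail.png HTTP/1.1" 200 229',
);
for my $line (@sample) {
    my $path = (split ' ', $line)[6];
    my $mark = is_html_request($line) ? 'count' : 'skip';
    print "$mark  $path\n";
}
```

With the two sample lines, /index.html is counted and /icons/mail.png is skipped. Swapping is_html_request for is_request_for with a fixed path restricts the count to a single file.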

First, each line is split into fields (the delimiter is white space), and then the day part is cut out with substr. The result is assigned to the variable $day, and the counter for that day is incremented. We named the Perl script above "".
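
Putting these steps together, a minimal sketch of such a counting script might look like the following. It is not the author's original program: the sub name html_request_day, the .html/.htm filter, and the guard that runs the count only when log files are named on the command line are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the day of month if the line is a request for an HTML file,
# otherwise return nothing. The .html/.htm test is an assumed filter.
sub html_request_day {
    my ($line) = @_;
    my @field = split ' ', $line;             # delimiter is white space
    return unless @field >= 7;                # malformed line
    return unless $field[6] =~ /\.html?\z/;   # not an HTML file
    my $day = substr($field[3], 1, 2);        # "[01/Jan/2000:..." -> "01"
    return unless $day =~ /\A\d\d\z/;
    return $day + 0;                          # "01" -> 1
}

# Count only when log files are named on the command line.
if (@ARGV) {
    my @count = (0) x 32;                     # index 0 is unused
    while (my $line = <>) {
        my $day = html_request_day($line);
        $count[$day]++ if defined $day && $day >= 1 && $day <= 31;
    }
    # Same output format as the printf line shown earlier.
    for my $i (1 .. 31) {
        printf("%10d %10d\n", $i, $count[$i]);
    }
}
```

Days without any matching access are printed with a count of 0, so the output always has 31 lines and can be plotted directly.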

The following are example access statistics for a Web server in January 2000, processed by "webplot.plt". The first column is the day of the month and the second is the number of accesses on that day.

1       172
2       321
3       208
4       279
5       327
....      ....
25       588
26      1038
27       848
28       772
29       570
30       495
31       548

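The following shows a graph drawn by gnuplot with the dumb terminal. The < character at the start of "< access_log" tells gnuplot to read the output of a command (here the Perl program) through a pipe instead of reading a data file.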

gnuplot> set term dumb
Terminal type set to 'dumb'
Options are 'feed 79 24'
gnuplot> plot "< access_log" with step

1600 ++--------+---------+---------+---------+---------+---------+--------++
     +         +         +         +      "< access_log" ****** +
1400 ++                                            ***                    ++
     |                                             * *                     |
     |                                             * *                     |
1200 ++                                            * *                    ++
     |                                             * *                     |
1000 ++                                  ***       * *   ***              ++
     |                                   * *       * *   * *               |
     |                                   * *       * *   * ***             |
 800 ++                                  * *       * *   *   ***          ++
     |                                   * * ***   * *   *     *           |
 600 ++                    ***           * *** *   * *   *     *          ++
     |             ***     * *           *     *** * *****     *** *       |
     |             * *     * *******     *       * *             ***       |
 400 ++            * *     *       *** ***       ***                      ++
     |   ***   *** * * *****         * *                                   |
 200 ++  * ***** *** ***             ***                                  ++
     | ***                                                                 |
     +         +         +         +         +         +         +         +
   0 ++--------+---------+---------+---------+---------+---------+--------++
     0         5        10        15        20        25        30        35