February 28, 2006
Simple query_string based http accelerator for a dynamic web page
A particular web site serves fairly elaborate statistics using SQL queries generated on the fly by an ASP page, which takes its input as GET parameters. As the site gets more popular, the load on the SQL server is rising, and one big SQL query now takes around 30 seconds to execute.
After analyzing the requests, I found that 80% of them were for the same three stats -- another example of the principle of locality. So I thought of using a caching reverse proxy (HTTP accelerator) to reduce the load on the server. I tried squid and mod_proxy. However, both programs understandably don't cache responses for URLs that carry GET parameters. So I had to make my own solution: Perl to the rescue!
This script sits on a web server and receives queries with GET parameters. It computes a SHA1 checksum of the query string and uses this checksum as the filename for the cache file. If a cache file for these parameters exists and is no older than the expiration time, the script sends that cache file to the client. Otherwise, it queries the dynamic page for the data and stores the result in the cache.
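To make the keying step concrete, here is a minimal sketch of how a query string maps to a cache filename. It assumes Digest::SHA, the core-module counterpart of the Digest::SHA1 module used in the script, which exports the same sha1_base64 function. The helper names cache_file_for and canonical_query are hypothetical and not part of the script. Because the digest is computed over the raw query string, the same parameters in a different order produce a different cache file; sorting the pairs first is one possible way around that, if your application does not care about parameter order.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_base64);

# Hypothetical helper: map a raw query string to a cache filename.
sub cache_file_for {
    my ($query) = @_;
    my $digest = sha1_base64($query);
    $digest =~ tr{/}{-};    # '/' cannot appear in a filename
    return "$digest.html";
}

# Hypothetical helper: sort the name=value pairs so that parameter
# order does not affect the cache key.
sub canonical_query {
    my ($query) = @_;
    return join '&', sort grep { length } split /&/, $query;
}

# The parameter names here are made up for illustration.
print cache_file_for('stat=top10&year=2005'), "\n";
print cache_file_for('year=2005&stat=top10'), "\n";    # a different file
print cache_file_for(canonical_query('year=2005&stat=top10')), "\n";    # same as the first
```

Whether to canonicalize is a judgment call: it raises the hit rate when clients build their URLs inconsistently, but it is unnecessary if every link to the stats page is generated by the same code.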
It is meant to be simple and specialized for my own application. You will certainly have to tweak it, but it should be easy to do since it's so simple.
The stats are generated by the ASP page using real-time data, but the data does not change very fast -- the most active users usually check the site 2 or 3 times a day. So I decided to use an expiration time of 1 hour. You should adjust this to suit your needs.
This script can query any kind of dynamic web page, be it ASP, PHP or anything else.
#
# Simple query_string based http accelerator for a dynamic web page
#
# Copyright (C) 2005 Guillaume Filion <guillaume@filion.org>
#
# Version 1.0 2006-02-28 Initial release
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License version 2
# as published by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
use strict;
use Digest::SHA1 qw(sha1_base64);
use LWP::UserAgent;
# What dynamic web page will provide the responses.
# May not be a good idea to make this publicly accessible.
use constant SRC => 'http://localhost/stats.asp';
# Directory where the cached versions of the pages will be stored.
# There's no mechanism for deleting the old cached pages, so I have a
# cron job empty this directory every night.
use constant DIR => 'D:/modstats/cache/';
# How long, in seconds, do we keep the cached versions of the pages.
use constant EXPIRATION => 3600; # 3600 secs = 1 hour
chdir(DIR) or die "Cannot chdir to " . DIR . ": $!";
my $param = $ENV{'QUERY_STRING'};
# Accept only name=value pairs (word characters and %-escapes) separated by '&'.
die "Tainted query_string: ($param)" unless $param =~ /^(?:\w+=[\w%]+&?)*\z/;
my $digest = sha1_base64($param);
$digest =~ s/\//-/g; # Base64 is not your friend if you're using it for filenames...
# Check if we have a cached response for this query
my $load=1;
my $file = "$digest.html";
if (-s $file) {
    my $cdate = (stat $file)[9];    # last modification time of the cache file
    if (time - $cdate > EXPIRATION) {
        unlink($file);
        $load=1;
    } else {
        $load=0;
    }
}
my $content;
if ($load) {
    # Load the page and save it into the cache
    my $ua = LWP::UserAgent->new;
    my $response = $ua->get(SRC . '?' . $param);
    $content = $response->content;
    if ($response->is_success()) {
        # Cache only successful responses; a failed write is logged but not fatal.
        if (open(my $cache, '>', $file)) {
            print $cache $content;
            close($cache);
        } else {
            warn "Cannot write to file $file: $!";
        }
    } else {
        # Put the upstream status line in the body so it is sent after the header.
        $content = $response->status_line . "\n" . $content;
    }
} else {
    # Load the response from the cache
    if (open(my $cache, '<', $file)) {
        local $/;    # slurp mode: read the whole file at once
        $content = <$cache>;
        close($cache);
    } else {
        $content = "Cannot open file $file: $!";
    }
}
# Send it back to the client.
print "Content-type: text/html\n\n$content";
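The script only removes a stale cache file when the same query is requested again, which is why the cache directory is emptied by a nightly cron job. As an alternative, here is a minimal sketch of a pruning routine that deletes only the files older than a given age and leaves fresh entries in place. The function name prune_cache is hypothetical, and the directory and maximum age are passed in as arguments rather than taken from the constants above.

```perl
use strict;
use warnings;

# Hypothetical helper: delete cached pages in $dir whose modification
# time is more than $max_age seconds in the past. Returns the number
# of files removed.
sub prune_cache {
    my ($dir, $max_age) = @_;
    my $removed = 0;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    for my $name (readdir $dh) {
        next unless $name =~ /\.html\z/;    # only touch cache files
        my $path = "$dir/$name";
        next unless -f $path;
        my $mtime = (stat $path)[9];
        if (time - $mtime > $max_age) {
            unlink($path) and $removed++;
        }
    }
    closedir($dh);
    return $removed;
}
```

Assuming the same settings as the script, a scheduled job could call prune_cache('D:/modstats/cache', 3600) instead of wiping the whole directory.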
Posted by gfk at 4:42 PM
February 3, 2006
Cox & Forkum
Cartoonists are in fashion these days. As it happens, I just discovered two rather scathing cartoonists, Cox & Forkum. I don't agree with a good part of their content, but that makes the discovery all the more interesting.
Also, here is a link to the first Ygreck cartoon to have been censored by the Journal de Québec. [Ygreck's blog]
Other cartoons worth mentioning:
Jockeying
Bill of Goods
Civil Obedience
Annan Threat
That Day
Fig Leaf Diplomacy
Posted by gfk at 7:29 PM






