As a webmaster, I share a methodology with the folks who produce radio or television: I am working in what is essentially a one-way medium. Just like the local radio or television station that broadcasts its signal and hopes everyone tunes in, I publish content that may be viewed by everyone or by no one. Obviously I am hoping for the former, but reality falls somewhere in the middle. And just where in the middle is what I need to know: which content is generating traffic, and which is just taking up space? Fortunately, this problem is well addressed by a ‘Statistics Package’ that helps me understand what is popular and what isn’t.
The next question is ‘Which Package?’ First, despite every stats package author’s hyperbole about how special or unique his or her package is, they all function in one of two ways.
- Parse The Logs vs Push the Data
The most accurate method is to put a bit of code in every page. As the user requests the page, data is logged to a database. This data can include a session ID, revealing how a user moves through the site. One downside is overhead: every page needs to be a little bit bigger, every page needs a little more processing, and every page needs to write a little information to the database. Another negative is maintenance. If a site wasn’t created with this plan in mind, retrofitting a large number of pages with this code can be troublesome.
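To make the push-the-data approach concrete, here is a minimal sketch as a Python WSGI middleware. The class name, database schema, and file paths are hypothetical, chosen only for illustration; a real deployment would also need session handling and error recovery.

```python
# Hypothetical sketch: log every page request to a SQLite database
# from inside the application, the "push the data" approach.
import sqlite3
import time

class HitLogger:
    def __init__(self, app, db_path="hits.db"):
        self.app = app
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS hits (ts REAL, path TEXT, session TEXT)")

    def __call__(self, environ, start_response):
        # Each request costs one extra INSERT -- the overhead noted above.
        self.db.execute(
            "INSERT INTO hits VALUES (?, ?, ?)",
            (time.time(),
             environ.get("PATH_INFO", "/"),
             environ.get("HTTP_COOKIE", "")))
        self.db.commit()
        return self.app(environ, start_response)
```

Wrapping an application in this middleware is one line, but every existing page now pays the logging cost on every hit, which is exactly the overhead described above.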
The less accurate but more commonly used method is to parse the webserver’s log files. The popular web servers can all log the HTTP requests they receive, along with a variety of pertinent data. By parsing and analyzing this data, a stats program can answer those questions about which pages are drawing attention and, specifically, how much attention. This logging of requests is not computationally expensive and adds nothing to bandwidth requirements.
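The core of the log-parsing approach can be sketched in a few lines of Python. This assumes Apache’s standard ‘combined’ log format; the sample lines in the usage below are made up for illustration.

```python
# Sketch: tally successful page requests from Apache "combined" log lines.
import re
from collections import Counter

LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"')

def top_pages(lines, n=10):
    """Return the n most-requested paths with 2xx status codes."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("status").startswith("2"):
            counts[m.group("path")] += 1
    return counts.most_common(n)
```

Real packages layer sessionizing, referrer analysis, and reporting on top of exactly this kind of per-line extraction, which is why a fast parser matters for large logs.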
Because of the popularity of packages that parse log files and the negatives associated with adding code to every page, I will focus only on packages that parse log files to generate their reports.
Before my stats package can generate these reports, however, I need to make sure the appropriate data is being logged. Which data is logged is defined in the server’s configuration. My experience has been that Apache is, by default, configured to meet my needs. IIS often isn’t; I typically need to instruct IIS to ‘log visits’, which turns on logging for all HTTP requests. As for specific fields, I almost always want the ‘referrer’ field added to the logged data.
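On Apache, the stock ‘combined’ log format already captures the referrer and user agent; these are the standard directives (the log file path shown is just an example and varies by install):

```apache
# "combined" adds the Referer and User-Agent headers to the common format.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog logs/access_log combined
```

If a site is still logging with the shorter ‘common’ format, switching the CustomLog directive to ‘combined’ is usually all that is needed before the stats package can report on referrers.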
So now my webserver has a bunch of data squirreled away in a nearly unintelligible log file deep in the file system. The final step is to choose and install a statistics package. I have chosen what I believe to be the top two stats packages for most situations. I have also included an ‘honorable mention’ that, while I wouldn’t recommend it for everyone, does fill a niche. Here are the highlights of what they offer.
- First Timers enjoy Livestats
Livestats from Deepmetrix fills a niche. It isn’t my top choice, but it does fill a need. Depending on which edition is chosen, it can be quite costly; however, the support they provide may offset its cost. It is offered for a variety of hosting environments, and installation seemed quite easy for a package of this complexity. Livestats’ special skill, however, is how detailed its reporting is. This is the first package that introduced me to search-engine-specific keyword results, which let a webmaster know which keywords are working on which search engines. This powerful feature is why this package has earned my ‘Honorable Mention’. On the downside, at last check, Deepmetrix was being acquired by Microsoft. Time will tell how this affects the software.
- Minimalists Unite with Webalizer
Webalizer is my emotional favorite. It is simple, straightforward, and doesn’t make any promises it can’t keep. The author of Webalizer is quite clear that many of the metrics commonly reported by stats packages are not 100% reliable. Webalizer is written in ‘C’, which keeps it fast even on large log files. Windows and Unix/Linux versions are available.
The Webalizer ‘MO’ seems to be to report what can be ascertained with near certainty and to omit reports that are more speculative. This may be good or bad, but it definitely keeps things simple. That simple approach is what makes Webalizer my first choice for ‘Hobby Sites’. Its lack of power, however, doesn’t let it rise above the runner-up position.
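Running Webalizer reflects that same simplicity; a hedged example, with made-up paths that will differ on any given install:

```shell
# Parse the access log and write HTML reports to the output directory.
# (-o sets the output directory; paths here are examples only.)
webalizer -o /var/www/stats /var/log/apache2/access.log
```

In practice, most people put the options in a webalizer.conf file and run the command from cron so the reports stay current.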
- More Power from AWStats
AWStats is my ‘Heavyweight Champion’. This package seems to embody the ‘American Way’: more reports, more detail, and plenty of power. And its popularity shows this may not be a bad way to go. It is written in Perl and seems to have nearly every bell and whistle a webmaster might desire. And the price is right: free.
My initial experience with this package hit a sour note: I didn’t see a report showing the search-engine-to-keyword association. Not a problem, it turns out. This package supports ‘Plugins’ that let the webmaster create his or her own reports. I wasn’t wild about having to create a plugin for this report, but a quick search later I had located and downloaded just such a plugin.
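Day-to-day use of AWStats boils down to two invocations of its Perl script; a sketch, assuming a config file for the hypothetical site www.example.com has already been set up:

```shell
# Update the AWStats database from the latest log entries.
perl awstats.pl -config=www.example.com -update

# Generate a static HTML report from the accumulated data.
perl awstats.pl -config=www.example.com -output > report.html
```

The update step is typically scheduled via cron so the log data is folded in continuously rather than in one giant batch.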
In conclusion, Livestats might be right for the newbies, and Webalizer keeps the ‘Keep It Simple’ folks tickled, but for pure stats Nirvana, AWStats is my ‘Holy Grail’.