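We all know we should be using web analytics to analyse web site visitor behaviour and online marketing channel performance. However, what type of web analysis should you use? Should you go for log file analysis, page tagging, or a bit of both? First of all, let’s define what we mean by these terms. Log file analysis reads the log files that your web server already writes for every request it receives. Page tagging places a small script in each page, which runs in the visitor’s browser and sends data to a collection server.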
The bad news is that both strategies have their advantages and disadvantages, so here goes.
Page Tagging Advantages
o Because data is collected client side, this gets around any proxy and caching problems
o Gives you information on web design parameters such as browser versions, platform versions, screen resolution and connection speed
Page Tagging Disadvantages
o Firewalls can prevent or interfere with script processing
o Set-up costs associated with insertion of code
o Insertion of code can lead to errors
o Will not pick up page errors such as 404s
o Cannot track search engine spiders, because robots ignore scripts
o Unable to directly track non-HTML pages
o Vendor specific
Logfile Analysis Advantages
o Historical data can be analysed
o Little set-up cost
o No firewall issues
o Easily track page errors
o Can track search engine spiders
o Vendor independent
o Can track non-HTML pages such as PDFs
Logfile Analysis Disadvantages
o Proxy/caching inaccuracies. If a page is cached, no record is logged on your web server
o No web design parameters
o No event tracking
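To make the logfile side of the comparison concrete, here is a minimal Python sketch that counts page errors, spider requests and PDF requests. It assumes the server writes the Combined Log Format; the sample lines, the `summarise` function and the short `SPIDER_HINTS` list are all hypothetical and only there for illustration.

```python
import re
from collections import Counter

# Assumption: the server writes the Combined Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short list of spider signatures.
SPIDER_HINTS = ("googlebot", "bingbot", "slurp")


def summarise(lines):
    """Count 404 errors, spider requests and PDF requests in log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            counts["unparsed"] += 1
            continue
        if match["status"] == "404":
            counts["not_found"] += 1
        if any(hint in match["agent"].lower() for hint in SPIDER_HINTS):
            counts["spider"] += 1
        # Ignore any query string before checking the file extension.
        if match["path"].lower().split("?")[0].endswith(".pdf"):
            counts["pdf"] += 1
    return counts


# Made-up sample lines using documentation IP addresses.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /missing.html HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /guide.pdf HTTP/1.1" 200 48213 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
print(summarise(sample))
```

None of these three counts is available from a page tag alone, because an error page, a spider and a PDF download never run the tagging script.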
If you are used to looking at web statistics using Web Trends, for instance, you may see significant differences in visitor numbers. When moving to logfile analysis, visitor numbers may increase by 20-30%. If your site is not using persistent cookies, your web analytics programme cannot identify unique visitors, so all visitors are lumped together as a total. Typically unique visitors represent about 20-30% of total web site visits, so this metric will be inflated by this amount. Sometimes you’ll see a dramatic reduction in site visits. This is usually because web analytics programmes strip out the loading of graphics, which other programs erroneously count as visits.
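As a rough illustration of how stripping out graphics changes the numbers, the Python sketch below separates raw hits from page requests. The `GRAPHIC_EXTENSIONS` list, the `is_page_request` helper and the sample paths are assumptions made for this example, not the rules of any particular analytics product.

```python
from urllib.parse import urlsplit

# Assumed list of graphic file extensions; extend it to suit the site.
GRAPHIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico")


def is_page_request(path):
    """Return True when the requested path is not a graphic file."""
    return not urlsplit(path).path.lower().endswith(GRAPHIC_EXTENSIONS)


# Made-up request paths: two pages and two graphics.
requests = [
    "/index.html",
    "/images/logo.gif",
    "/images/banner.jpg",
    "/contact.html?ref=home",
]
page_views = [path for path in requests if is_page_request(path)]
print(len(requests), "hits,", len(page_views), "page views")
```

A program that counts every hit reports twice as much activity for this sample as one that counts only page requests.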
Other differences in visitor numbers are usually due to how programs define a visit. A visit duration of 30 minutes means that multiple visits from the same IP address within this time period will be counted as a single visit. Change this parameter to 15 minutes and these visits could be counted several times, so your total visits will increase. Finally, when a web browser loads a PDF file it brings down different parts of the file at different times, and some programs can count this as multiple requests for the same file. A good web analytics programme will collapse these multiple downloads into a single request.
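The effect of the visit duration setting can be shown with a small Python sketch. It assumes one common definition of a visit: requests from the same IP address belong to one visit until the gap between two consecutive requests exceeds the timeout. The `count_visits` function and the sample data are hypothetical, and real programs may define a visit differently.

```python
from datetime import datetime, timedelta


def count_visits(requests, timeout_minutes):
    """Count visits, starting a new one after a gap longer than the timeout.

    `requests` is an iterable of (ip_address, datetime) pairs.
    """
    timeout = timedelta(minutes=timeout_minutes)
    last_seen = {}  # most recent request time for each IP address
    visits = 0
    for ip, when in sorted(requests, key=lambda request: request[1]):
        previous = last_seen.get(ip)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits


# Made-up data: one IP address requesting a page every 20 minutes.
start = datetime(2023, 10, 10, 9, 0)
requests = [
    ("203.0.113.5", start),
    ("203.0.113.5", start + timedelta(minutes=20)),
    ("203.0.113.5", start + timedelta(minutes=40)),
]
print(count_visits(requests, 30))  # 1 visit with a 30 minute timeout
print(count_visits(requests, 15))  # 3 visits with a 15 minute timeout
```

The same three requests produce one visit or three depending only on the timeout, which is why two programs reading identical data can report different totals.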
It is important to understand these differences and manage the expectations of your colleagues, as surprise drops in web site metrics can sometimes lead to disenchantment with measuring web site performance altogether.