Coded Vision Consulting


 

Web Analytics Data - Log Analysis Versus Page Tagging


Data is the central asset of your online performance management. Without sound analytics data, you cannot place a commercial level of confidence in your business decisions.

Both logfile analysis programs and page tagging solutions are readily available for web analytics. There are advantages and disadvantages to each approach.

 

Advantages of Logfile Analysis

The main advantages of logfile analysis over page tagging are:

| Log File Analysis | Page Tagging |
|---|---|
| The web server already produces logfiles, so the raw data is already available and historical data can be processed easily. | Collecting data via page tagging requires changes to the website. |
| The web server reliably records every transaction it makes. Can monitor page delivery performance, abandoned page views, and incomplete downloads. | Relies on visitors' browsers having JavaScript enabled. |
| Data is stored on the company's own servers in a standard format, so it is easy to switch programs later, use several different programs, and analyze historical data with a new program. | Vendor lock-in: data is stored on the vendor's site and may be in a proprietary format, making it more difficult to change programs later. |
| Logfiles contain information on visits from search engine spiders. These are excluded from visitor activity but are important data for search engine optimization. | |
| Logfiles contain information on failed requests. Can also track bandwidth and differentiate between completed and partial downloads. | Only records an event if the page is successfully viewed. |
| No firewall issues. | |
| Monitor paths and drop-off points of search engine robots, which helps with SEO. | |
| Capture click fraud activity that does not execute JavaScript and so remains invisible to page tags. | JavaScript page tags may be blocked by client settings. |
| Securely capture HTTP user names. | |
| No need to insert tags on web pages and scripted pages, and no acceptance testing of the modifications. | Need to insert tracking tags on pages and page elements. |
| Avoid the effort of monitoring your site for pages that are missing page tags. | |
| Measure views of downloaded files, such as PDFs, that were directly accessed. | |
| Measure mobile browsers/visitors, which may not fire JavaScript page tags. | JavaScript page tags may be blocked by client settings. |
| Measure page views even if the viewer clicked on to the next page before the page tag fired. | |
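
As a rough illustration of what the raw logfile data makes available, the sketch below parses a single log line and flags spider visits and failed requests. It assumes the Apache/NCSA Combined Log Format, which the article does not specify; the `SPIDER_MARKERS` list and the sample line are hypothetical and far from complete.

```python
import re

# Assumption: logs are in Apache/NCSA Combined Log Format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short list of crawler markers.
SPIDER_MARKERS = ("bot", "spider", "crawl")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    agent = record["agent"].lower()
    # Spider visits are excluded from visitor activity but kept for SEO analysis.
    record["is_spider"] = any(marker in agent for marker in SPIDER_MARKERS)
    # Failed requests (4xx/5xx) are recorded by the server but never seen by page tags.
    record["is_failed"] = record["status"] >= 400
    return record


sample = (
    '203.0.113.7 - - [10/Oct/2008:13:55:36 +0000] '
    '"GET /missing.html HTTP/1.1" 404 512 '
    '"-" "ExampleBot/1.0 (+http://example.com/bot)"'
)
record = parse_line(sample)
print(record["path"], record["status"], record["is_spider"], record["is_failed"])
```

A real log analysis program would use a maintained list of robot user agents rather than substring matching; the point here is only that failed requests and robot traffic are present in the server's own records.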

 

Disadvantages of Logfile Analysis

  • Proxy/caching inaccuracies: if a page is served from a cache, no record is logged on your web server
  • No event tracking: no JavaScript, Flash or AJAX tracking
  • Time-consuming web server log file management and log file transfer from disparate web server farms

 

Advantages of Page Tagging

The main advantages of page tagging over logfile analysis are:

| Page Tagging | Log File Analysis |
|---|---|
| Measure traffic on portions of your site embedded in other web sites where you don’t have access to logs. | Caching can corrupt true data. |
| The JavaScript is automatically run every time the page is loaded, so there are fewer issues with caching. It is easier to add additional information to the JavaScript, which can then be collected by the remote server, such as visitors' screen sizes or the price of goods purchased. | Information not normally collected by the web server can only be recorded by modifying the URL. |
| Page tagging can report on events which do not involve a request to the web server, such as interactions within Flash movies. | The server has to be configured to do this. |
| The page tagging service manages the process of assigning cookies to visitors. | |
| Page tagging is available to companies who do not run their own web servers. | |
| Measure events in Web 2.0 rich Internet applications built with Ajax or Flash. | |
| Track page views even if they were cached in ISP proxy servers. | |
| Track page views following a click on the browser’s back button. | |
| Measure behavior within web pages, such as scrolling down or changing form fields. | |
| Measure shopping cart activity. | |
| Measure client-side information, such as the browser’s screen size. | |
| Capture additional information items, such as user login names or form field data, that are passed through customized tags. | |
| Breaks through proxy and caching servers, which provides more accurate session tracking. | |
| Client-side capture of ecommerce data. | Server-side access can be problematic. |
| Visitor data can be collected in near real time. | Requires a batch run of the log file. |
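
To make the idea of "additional information collected by the remote server" more concrete, here is a minimal sketch of how a collection server might decode the request sent by a page tag. The endpoint URL and the parameter names (`page`, `sw`, `sh`, `price`) are hypothetical; every vendor defines its own.

```python
from urllib.parse import urlsplit, parse_qs


def parse_beacon(url):
    """Extract page-tag fields from a beacon request URL.

    The parameter names used here are illustrative assumptions,
    not those of any particular analytics vendor.
    """
    query = parse_qs(urlsplit(url).query)

    def first(name, default=None):
        # parse_qs returns a list of values per key; take the first one.
        values = query.get(name)
        return values[0] if values else default

    price = first("price")
    return {
        "page": first("page"),
        "screen_width": int(first("sw", 0)),
        "screen_height": int(first("sh", 0)),
        "price": float(price) if price is not None else None,
    }


beacon = "http://stats.example.com/collect?page=%2Fcheckout&sw=1280&sh=800&price=19.99"
print(parse_beacon(beacon))
```

Because the tag builds this request in the visitor's browser, values such as screen size are available here even though the web server's own logfile would never contain them.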

 

Disadvantages of Page Tagging

  • Setup errors lead to data loss, and you can’t go back in time: if you make a mistake with your tagging, you have a hole in your data
  • Firewalls can mangle or restrict tags
  • Cannot track bandwidth or completed downloads: tags fire when the page/file/event is requested, not when the download is completed (although you can tag different stages of an event/file)

 

Economic Considerations

The most cost effective solution depends on:

  • Available technical expertise within the company
  • Vendor chosen
  • Amount of activity on the web sites
  • Depth and type of information sought
  • Number of distinct web sites needing statistics

| Log File Analysis | Page Tagging |
|---|---|
| Almost always performed in-house. | Can be performed in-house, but is more often provided as a third-party service. |
| Typically involves a one-off software purchase; however, some vendors are introducing maximum annual page views, with additional costs to process additional information. | Most often involves a monthly fee, although some vendors offer installable page tagging solutions with no additional page view costs. |

 

Hybrid Methods

Programs are available which collect data through both logfiles and page tagging. By using a hybrid method, they produce more accurate statistics than either method on its own.


Other Methods

Other methods of data collection not currently widely deployed include:

Packet Sniffing - Integrating the web analytics program into the web server and collecting data by sniffing the network traffic passing between the web server and the outside world. This method can be used by large e-commerce sites because it involves no changes to the site or servers and cannot compromise operation. It provides better data in real time or in log file format, and makes it easy to feed data warehouses and join the data with CRM and enterprise data.

Server-side Page Tagging Analysis - Instead of collecting the information on the user side when the visitor opens the page, the script works on the server side. Right before a page is sent to the user, the script sends the tracking data.

 


Key Web Analytics Data Definitions

Definitions of web analytics terms are not globally agreed; however, some terms are used by most, if not all, analytics tools.

Standards Bodies

  • JICWEBS (Joint Industry Committee for Web Standards) / ABCe (Audit Bureau of Circulations Electronic, UK and Europe)
  • The WAA (Web Analytics Association, US)
  • IAB (Interactive Advertising Bureau)

Both the WAA and the ABCe provide more definitive lists for those who are declaring their statistics using the metrics defined by either.

Commonly Used Data

Hit - A request for a file from the web server. Available only in log analysis. Sometimes used to measure popularity, but it is misleading and greatly overestimates it, because a single page can generate many hits (one for each image, stylesheet or script it loads). The total number of visitors or page views provides a more realistic and accurate assessment of popularity.

Page View - In log analysis, a request for a file whose type is defined as a page. In page tagging, an occurrence of the script being run.
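
The difference between hits and page views in log analysis can be sketched as follows. Which file types count as a "page" is a configuration choice in most tools; the `PAGE_EXTENSIONS` set below is an illustrative assumption, as are the sample request paths.

```python
import posixpath

# Assumption: file types treated as "pages". Real tools make this configurable.
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php"}


def count_hits_and_page_views(paths):
    """Return (hits, page_views) for an iterable of requested URL paths."""
    hits = 0
    page_views = 0
    for path in paths:
        hits += 1  # every request to the server is a hit
        path_only = path.split("?", 1)[0]  # ignore any query string
        extension = posixpath.splitext(path_only)[1].lower()
        if extension in PAGE_EXTENSIONS:
            page_views += 1  # only page-type files count as page views
    return hits, page_views


sample_paths = ["/index.html", "/style.css", "/logo.png", "/script.js", "/about"]
print(count_hits_and_page_views(sample_paths))  # (5, 2)
```

Five requests produce five hits but only two page views, which is why hit counts overstate popularity.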

Impression - Each time an advertisement loads on a user's screen. Any time you see a banner, that is an impression.

Visit / Session - A series of requests from the same uniquely identified client with a set timeout. A visit is expected to contain multiple hits (in log analysis) and page views.
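
A minimal sketch of how requests might be grouped into visits using a timeout is shown below. The 30-minute timeout is an assumed, commonly used value rather than something the definition mandates, and the `(client_id, timestamp, page)` tuple layout is hypothetical.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; 30 minutes is an assumption


def group_into_visits(requests, timeout=SESSION_TIMEOUT):
    """Group (client_id, timestamp, page) tuples into visits per client.

    A new visit starts when the gap since the client's previous request
    exceeds `timeout` seconds.
    """
    visits = defaultdict(list)  # client_id -> list of visits (each a list of pages)
    last_seen = {}              # client_id -> timestamp of the previous request
    for client_id, timestamp, page in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(client_id)
        if previous is None or timestamp - previous > timeout:
            visits[client_id].append([])  # start a new visit
        visits[client_id][-1].append(page)
        last_seen[client_id] = timestamp
    return dict(visits)


sample_requests = [
    ("a", 0, "/home"),
    ("a", 600, "/products"),
    ("b", 700, "/home"),
    ("a", 4000, "/home"),  # more than 30 minutes after a's previous request
]
print(group_into_visits(sample_requests))
```

Here client "a" makes two visits and client "b" makes one; changing the timeout changes the visit count, which is one reason tools report different numbers for the same traffic.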

First Visit / First Session - A visit from a visitor who has not made any previous visits.

Visitor / Unique Visitor / Unique User - The uniquely identified client generating requests on the web server (log analysis) or viewing pages (page tagging) within a defined time period (e.g. day, week or month). A Unique Visitor counts once within the timescale. A visitor can make multiple visits. The Unique User is now the only mandatory metric for an ABCe audit.
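
The "counts once within the timescale" rule can be illustrated with a short sketch that counts unique visitors per day. It assumes each request is a hypothetical `(client_id, unix_timestamp)` pair and that days are measured in UTC; how the client is identified (cookie, IP address plus user agent, login) is left open.

```python
from collections import defaultdict
from datetime import datetime, timezone


def unique_visitors_per_day(requests):
    """Return {date: number of distinct client ids seen on that UTC day}."""
    clients_by_day = defaultdict(set)
    for client_id, timestamp in requests:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        clients_by_day[day].add(client_id)  # a set counts each client once per day
    return {day: len(clients) for day, clients in clients_by_day.items()}


sample_requests = [
    ("a", 0),           # day 1
    ("b", 3600),        # day 1
    ("a", 7200),        # day 1, same visitor again: still counted once
    ("a", 86400 + 10),  # day 2
]
for day, count in sorted(unique_visitors_per_day(sample_requests).items()):
    print(day, count)
```

Note that daily unique visitor counts cannot simply be summed to get a weekly or monthly figure, because the same visitor may appear on several days.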

Repeat Visitor - A visitor that has made at least one previous visit. The period between the last and current visit is called visitor recency and is measured in days.

New Visitor - A visitor that has not made any previous visits. This definition can be confusing and is sometimes substituted with analysis of first visits.

Singletons - The number of visits where only a single page is viewed. Not a useful metric on its own, but it is indicative of various forms of "Click Fraud", is used to calculate bounce rate, and in some cases helps to identify bots.

Bounce Rate / % Exit - The percentage of visits where the visitor enters and exits at the same page without visiting any other pages on the site.
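
As a worked example of this definition, the sketch below computes bounce rate as singleton visits divided by total visits. Representing each visit as a list of viewed pages is an assumption made for illustration, and the sample visits are made up.

```python
def bounce_rate(visits):
    """Return the percentage of visits that contain exactly one page view."""
    if not visits:
        return 0.0
    singletons = sum(1 for pages in visits if len(pages) == 1)
    return 100.0 * singletons / len(visits)


sample_visits = [
    ["/home"],
    ["/home", "/products", "/checkout"],
    ["/blog/post-1"],
    ["/home", "/about"],
]
print(bounce_rate(sample_visits))  # 2 of 4 visits are single-page: 50.0
```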

Conclusion

There are advantages and disadvantages to all methods of web data collection. My preference is for a hybrid model: a web analytics solution that uses both JavaScript page tagging and log files, which allows website owners to keep control of their web data.

 

Storing Data

Using an online analytics tool is fine for all but large enterprises; however, one shortfall is that the data is also held online. A key part of your web analytics data strategy should be to pull all data required to support your business decisions into a central data store. This data store becomes the foundation of your business decision support system.

All analytics dashboards access data from this central repository and NOT your operational systems.

Most small online businesses start with a MySQL database, a free database software package typically provided by your web host. As the business grows and you move to a dedicated server, transitioning to MS SQL Server will provide more flexibility for integration with reporting services and online performance portals.
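
The idea of a central data store that dashboards query, instead of the operational systems, can be sketched in a few lines. This example uses Python's built-in `sqlite3` module purely as a stand-in for MySQL or SQL Server; the `daily_metrics` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the central data store.
connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE daily_metrics (
        day        TEXT    NOT NULL,
        source     TEXT    NOT NULL,  -- e.g. 'logfile' or 'page_tag'
        visits     INTEGER NOT NULL,
        page_views INTEGER NOT NULL,
        PRIMARY KEY (day, source)
    )
    """
)

# Made-up sample rows, as might be loaded from each collection method.
rows = [
    ("2008-10-10", "logfile", 1200, 5400),
    ("2008-10-10", "page_tag", 1100, 4900),
    ("2008-10-11", "page_tag", 1300, 5600),
]
connection.executemany("INSERT INTO daily_metrics VALUES (?, ?, ?, ?)", rows)
connection.commit()


def page_views_per_visit(conn, source):
    """Return [(day, page views per visit)] for one data source, ordered by day."""
    cursor = conn.execute(
        "SELECT day, 1.0 * page_views / visits FROM daily_metrics "
        "WHERE source = ? ORDER BY day",
        (source,),
    )
    return cursor.fetchall()


print(page_views_per_visit(connection, "page_tag"))
```

Keeping the collection source as a column lets a hybrid setup store logfile and page tag figures side by side and compare them in one query.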
