Web Analytics Data - Log Analysis Versus Page Tagging
Data is the central asset of your online performance management.
Without sound analytics data, you cannot make business decisions
with a commercial level of confidence.
Both logfile analysis programs and page tagging solutions are readily
available for web analytics. There are advantages and disadvantages
to each approach.
Advantages of Logfile Analysis
The main advantages of logfile analysis over page tagging are:
| Log File Analysis | Page Tagging |
| The web server already produces logfiles - raw data is already available. Historical data can be processed easily. | Collecting data via page tagging requires changes to the website. |
| The web server reliably records every transaction it makes. Monitor page delivery performance, abandoned page views, and incomplete downloads | Relies on the visitors' browsers having JavaScript enabled. |
| Data is stored on the company's own servers - in a standard format - easy for a company to switch programs later, use several different programs, and analyze historical data with a new program. | Vendor lock-in - data stored on vendor site, and may be in proprietary format. More difficult to change programs later. |
| Logfiles contain information on visits from search engine spiders - excluded from visitor activity, but important data for search engine optimization. | |
| Logfiles contain information on failed requests. Can also track bandwidth and completed downloads, and differentiate between completed and partial downloads | Only records an event if the page is successfully viewed. |
| No firewall issues. | |
| Monitor paths and drop-off points of search engine robots - helps with SEO | |
| Capture click fraud activity that does not execute JavaScript and remains invisible to page tags | JavaScript page tags may be blocked by client settings |
| Securely capture HTTP user names | |
| No need to insert tags on web pages and scripted pages. No acceptance testing of the modifications | Need to insert tracking tags on pages and page elements |
| Avoid the effort of monitoring your site for pages that are missing page tags | |
| Measure views of downloaded files, such as PDFs, that were directly accessed. | |
| Measure mobile browsers/visitors which may not fire JavaScript page tags | JavaScript page tags may be blocked by client settings |
| Measure page views even if the viewer clicked on to the next page before the page tag fired | |
Log File Disadvantages
- Proxy/caching inaccuracies: if a page is cached, no record is
logged on your web server
- No event tracking: no JavaScript, Flash or AJAX tracking
- Time-consuming web server log file management, and log file transfer
from disparate web server farms
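To make the logfile side of the comparison concrete, here is a minimal sketch of reading one line of a web server log and flagging failed requests and search engine spiders. It assumes the common/combined log format; the list of spider markers, the sample line and the function name are illustrative assumptions, not part of any particular analytics product.

```python
import re

# Common Log Format, optionally followed by the "combined" referrer and
# user-agent fields.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Hypothetical, deliberately short list of crawler markers.
SPIDER_MARKERS = ("bot", "spider", "crawl")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    agent = (record["agent"] or "").lower()
    record["is_spider"] = any(marker in agent for marker in SPIDER_MARKERS)
    record["is_failed"] = record["status"] >= 400
    return record


# Made-up sample line: a crawler requesting a file that does not exist.
sample = (
    '203.0.113.7 - - [10/Oct/2008:13:55:36 +0000] '
    '"GET /missing.pdf HTTP/1.1" 404 512 "-" "ExampleBot/1.0"'
)
print(parse_line(sample))
```

Neither the failed request nor the spider visit in this sample would normally be seen by a JavaScript page tag, which is why both flags are derived from the log record itself.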
Advantages of Page Tagging
The main advantages of page tagging over logfile analysis are:
| Page Tagging | Log File Analysis |
| Measure traffic on portions of your site embedded in other web sites where you don’t have access to logs | Caching can corrupt true data |
| The JavaScript is automatically run every time the page is loaded. Fewer issues with caching. | With logfile analysis, information not normally collected by the web server can only be recorded by modifying the URL. |
| It is easier to add additional information to the JavaScript, which can then be collected by the remote server - visitors' screen sizes, price of goods purchased. | |
| Page tagging can report on events which do not involve a request to the web server, such as interactions within Flash movies. | The server has to be configured to do this. |
| The page tagging service manages the process of assigning cookies to visitors | |
| Page tagging is available to companies who do not run their own web servers. | |
| Measure events in Web 2.0 rich Internet applications built with Ajax or Flash | |
| Track page views even if they were cached in ISP proxy servers | |
| Track page views following a click on the browser’s back button | |
| Measure behavior within web pages, such as scrolling down or changing form fields | |
| Measure shopping cart activity | |
| Measure client side information, such as the browser’s screen size, etc. | |
| Capture additional information items such as user login names or form field data that are passed through customized tags | |
| Breaks through proxy and caching servers - provides more accurate session tracking | |
| Client side capture of ecommerce data | Server side access can be problematic |
| Visitor data can be collected in near real time | Requires batch run of log file |
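As a rough sketch of how the extra information added to a page tag reaches the remote server, the example below decodes the query string of a tracking request into named values. The collector URL and the parameter names (page, screen, price) are hypothetical; real page tagging services define their own.

```python
from urllib.parse import urlsplit, parse_qs


def decode_beacon(url):
    """Turn a page-tag request URL into a flat dict of reported values."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical request a page tag might send to the collecting server.
beacon = (
    "https://stats.example.com/collect"
    "?page=%2Fcheckout&screen=1280x800&price=19.99"
)
hit = decode_beacon(beacon)
print(hit)
```

Adding another item, such as a login name, only means adding another parameter in the tag; with logfile analysis the same information would have to be worked into the page URL.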
Page Tagging Disadvantages
- Setup errors lead to data loss, and you can’t go back in time:
if you make a mistake with your tagging, you have a hole
in your data
- Firewalls can mangle or restrict tags
- Cannot track bandwidth or completed downloads: tags fire
when the page, file or event is requested, not when the download is
completed (although you can tag different stages of an event or file)
Economic Considerations
The most cost effective solution depends on:
- Available technical expertise within the company
- Vendor chosen
- Amount of activity on the web sites
- Depth and type of information sought
- Number of distinct web sites needing statistics
| Log File Analysis | Page Tagging |
| Almost always performed in-house. | Can be performed in-house, but it is more often provided as a third-party service. |
| Typically involves a one-off software purchase; however, some vendors are introducing maximum annual page views with additional costs to process additional information. | Page tagging most often involves a monthly fee, although some vendors offer installable page tagging solutions with no additional page view costs. |
Hybrid Methods
Programs are available which collect data through both logfiles
and page tagging. By using a hybrid method, they produce more accurate
statistics than either method on its own.
Other Methods
Other methods of data collection not currently widely deployed
include:
Packet Sniffing - Integrating the web analytics
program into the web server, and collecting data by sniffing the
network traffic passing between the web server and the outside world.
This method can be used by large e-commerce sites as it involves
no changes to the site or servers and cannot compromise operation.
It provides better data in real time or in log file format, and it
is easy to feed data warehouses and join the data with CRM and
enterprise data.
Server-side Page Tagging Analysis - Instead of
collecting the information on the user side when the visitor opens
the page, the script runs on the server side. Right before a page
is sent to the user, the script sends the data.
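One way to picture server-side page tagging is as a thin wrapper around the web application that records each request just before the page is returned. The sketch below uses Python's WSGI calling convention; the class name, the recorded fields and the in-memory list used as a destination are assumptions made for illustration, not a description of any vendor's implementation.

```python
import time


class ServerSideTag:
    """WSGI middleware that records each page request before the page is sent."""

    def __init__(self, app, sink):
        self.app = app
        self.sink = sink  # any callable that accepts one dict per request

    def __call__(self, environ, start_response):
        # Record the request on the server, with no script in the browser.
        self.sink({
            "time": time.time(),
            "path": environ.get("PATH_INFO", ""),
            "client": environ.get("REMOTE_ADDR", ""),
            "agent": environ.get("HTTP_USER_AGENT", ""),
        })
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the real site: always serves the same short page."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


collected = []
tagged_app = ServerSideTag(demo_app, collected.append)

# Simulate one request without starting a server.
body = tagged_app(
    {"PATH_INFO": "/pricing", "REMOTE_ADDR": "203.0.113.7",
     "HTTP_USER_AGENT": "demo"},
    lambda status, headers: None,
)
print(collected[0]["path"], body)
```

In practice the sink would forward each record to the analytics system rather than append it to a list.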
Key Web Analytics Data Definitions
Definitions of web analytics terms are not globally agreed; however,
there are some commonly used terms employed by most, if not all,
analytics tools:
Standards Bodies
- JICWEBS (Joint Industry Committee for Web Standards)/ABCe (Audit
Bureau of Circulations Electronic, UK and Europe)
- The WAA (Web Analytics Association, US)
- IAB (Interactive Advertising Bureau)
Both the WAA and the ABCe provide more definitive lists for those
who are declaring their statistics using the metrics defined by
either.
Commonly Used Data
Hit - A request for a file from the web server.
Available only in log analysis. Used to measure popularity, but
is extremely misleading and greatly overestimates popularity. The
total number of visitors or page views provides a more realistic
and accurate assessment of popularity.
Page View - A request for a file whose type is
defined as a page in log analysis. An occurrence of the script being
run in page tagging.
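As a rough illustration of how far hits and page views can diverge in log analysis, the sketch below counts both for a handful of requests. The set of file extensions treated as pages and the sample request paths are assumptions for the example; each analytics tool has its own setting for this.

```python
import posixpath

# Assumption: which file types count as "pages" is a per-tool setting.
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php"}


def count_hits_and_page_views(paths):
    """Every request is a hit; only requests for page-type files are page views."""
    hits = len(paths)
    page_views = sum(
        1 for path in paths
        # Drop any query string, then look at the file extension.
        if posixpath.splitext(path.split("?", 1)[0])[1].lower() in PAGE_EXTENSIONS
    )
    return hits, page_views


# One page plus its stylesheet, image and script, then a second page.
requests = ["/index.html", "/style.css", "/logo.png", "/app.js", "/about"]
print(count_hits_and_page_views(requests))  # (5, 2)
```

Five hits for two page views shows why the hit count overstates popularity: every image, stylesheet and script on a page adds to it.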
Impression - Each time an advertisement
loads on a user's screen. Any time you see a banner, that is an impression.
Visit / Session - A series of requests from the
same uniquely identified client with a set timeout. A visit is expected
to contain multiple hits (in log analysis) and page views.
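The "set timeout" in this definition can be sketched as follows: requests from the same client are grouped into one visit until the gap between two consecutive requests exceeds the timeout. The 30-minute value, the tuple layout and the function name are assumptions for illustration; tools differ on the timeout they apply.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; a frequently used value, not a standard


def build_visits(requests, timeout=SESSION_TIMEOUT):
    """Group (client_id, unix_time, path) requests into visits per client."""
    by_client = defaultdict(list)
    for client_id, timestamp, path in sorted(requests, key=lambda r: r[1]):
        by_client[client_id].append((timestamp, path))

    visits = []
    for client_id, events in by_client.items():
        current = [events[0]]
        for previous, event in zip(events, events[1:]):
            if event[0] - previous[0] > timeout:
                # Gap too long: close the current visit and start a new one.
                visits.append((client_id, current))
                current = []
            current.append(event)
        visits.append((client_id, current))
    return visits


requests = [
    ("a", 0, "/"),
    ("a", 600, "/products"),
    ("b", 700, "/"),
    ("a", 5000, "/"),  # more than 30 minutes after a's previous request
]
for client_id, events in build_visits(requests):
    print(client_id, [path for _, path in events])
```

Client "a" ends up with two visits and client "b" with one, so three visits are reported for two visitors.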
First Visit / First Session - A visit from a
visitor who has not made any previous visits.
Visitor / Unique Visitor / Unique User - The
uniquely identified client generating requests on the web server
(log analysis) or viewing pages (page tagging) within a defined
time period (e.g. day, week or month). A Unique Visitor counts once
within the timescale. A visitor can make multiple visits. The Unique
User is now the only mandatory metric for an ABCe audit.
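Counting each client once within the timescale can be sketched as below, using a day as the period. The client identifiers and timestamps are made up, and how a client is "uniquely identified" (cookie, IP address plus user agent, login) is left open here because it varies between tools.

```python
from collections import defaultdict
from datetime import datetime, timezone


def unique_visitors_per_day(requests):
    """Count distinct client ids per UTC day from (client_id, unix_time) pairs."""
    days = defaultdict(set)
    for client_id, timestamp in requests:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        days[day].add(client_id)  # a set keeps each client once per day
    return {day.isoformat(): len(clients) for day, clients in days.items()}


# Client "a" returns twice on the first day and once on the second.
requests = [("a", 0), ("a", 3600), ("b", 7200), ("a", 90000)]
print(unique_visitors_per_day(requests))
```

Client "a" is counted once on each day despite making several requests, which is the sense in which a Unique Visitor counts once within the timescale.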
Repeat Visitor - A visitor that has made at least
one previous visit. The period between the last and current visit
is called visitor recency and is measured in days.
New Visitor - A visitor that has not made any
previous visits. A confusing definition, sometimes substituted with
analysis of first visits.
Singletons - The number of visits where only a
single page is viewed. Not a useful metric on its own, but it is
indicative of various forms of "Click Fraud", is used
to calculate bounce rate, and in some cases helps to identify bots.
Bounce Rate / % Exit - The percentage of visits
where the visitor enters and exits at the same page without visiting
any other pages on the site.
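Taken together, the singleton and bounce rate definitions amount to a simple ratio: single-page visits divided by all visits. A minimal sketch, assuming the number of pages viewed in each visit is already known (the sample figures are made up):

```python
def bounce_rate(pages_per_visit):
    """Percentage of visits in which exactly one page was viewed."""
    if not pages_per_visit:
        return 0.0
    singletons = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * singletons / len(pages_per_visit)


# Hypothetical page counts for five visits; two of them are singletons.
print(bounce_rate([1, 4, 1, 2, 7]))  # 40.0
```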
Conclusion
There are advantages and disadvantages to all methods of web data
collection. My preference is for a hybrid model: a web analytics
solution that uses both JavaScript page tagging and log files, which
allows website owners to keep control of their web data.
Storing Data
Using an online analytics tool is fine for all but large enterprises;
however, one shortfall is that the data is also online. A key part
of your web analytics data strategy should be to pull all data required
to support your business decisions into a central data store. This
data store becomes the foundation of your business decision support
system.
All analytics dashboards access data from this central repository
and NOT your operational systems.
Most small online businesses start with a MySQL database - a free
database software package provided by your web host. As the business
grows, and once you are on a dedicated server, transitioning to MS SQL
Server will provide more flexibility for integration with reporting
services and online performance portals.