Data is the central asset in managing your online performance.
Without sound analytics data, you cannot bring a commercial
level of confidence to your business decision making.
This section on data should be read BEFORE attempting to
install any analytics programs or create any reporting structures.
The key to data management is: plan long, deliver short.
Setting up a complete data management system is a complex process,
so a new online business is better off concentrating on building a customer
base and driving revenue than spending all its time setting
up the ideal IT infrastructure. However, by understanding that data
management is a journey of technology, people, process and governance,
you will be well positioned to manage the rapid growth common to
many successful online enterprises.
The web is changing at the speed of light, so it is the most difficult
place to get data. If you aim for perfect data quality you will
waste your time. Even so, the web provides more valuable data than any
other market on the planet.
Ads in magazines are a "faith-based initiative" - a hunch
that they will drive sales, followed by surveys to see if anyone saw them.
Web data gives you a level of comfort that something will work.
Data confidence transfers into decision confidence. You only need
10% confidence in data to make a decision. Then work to improve
the data quality over time. In time, your confidence in data will
go up, but will never get to 100%.
In summary, the key to successful use of analytics is the human
element: people who can see patterns and opportunities to make the
improvements that lead to user satisfaction. Make decisions rather
than chasing data quality. A poor decision is better than no decision
at all.
Data Sources
There are five key sources of data used in online business decision
making:
Data should be imported daily into a central repository upon which
specific custom business reporting can be managed.
Data Process
You need structure and process around collecting data, analysing data
and using data to make decisions.
What you need to do to make a decision
What you need to do to complete a test
What you need to do to improve the merchandising of your website
If you have eight great ideas on how a page should look, or which products
you should sell - TEST THEM! Testing is faster and cheaper, and lets
your customers tell you what you should be doing.
80% of the time you are wrong about what your customers want.
Collecting Web Data
There are two main ways to collect web analytics data.
Logfile analysis - reads the logfiles in which
the web server records all its transactions.
Page tagging - uses JavaScript on each page
to notify a third-party server when a page is rendered by a web
browser.
Common Data Collected
Data collected almost always includes:
web traffic reports
e-mail response rates
direct mail campaign data
sales and lead information
user performance data
other custom metrics as needed
This data is typically compared against key performance indicators
and used to improve the audience response to a web site or marketing
campaign.
Web Server Logfile Analysis
Web servers record all site transactions in a logfile. Using a
web log analysis program, these logfiles can be read to provide
data on the website.
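As a rough illustration of what a web log analysis program does first, the sketch below parses one logfile line into named fields. It assumes the server writes the Apache "combined" log format; the sample line, the function name parse_log_line and the field names are made up for this example, and other servers or custom log settings would need a different pattern.

```python
import re
from datetime import datetime

# Assumed layout: Apache "combined" log format. Other servers or custom
# log settings will need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp and status code into typed values.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


# Hypothetical sample line, not taken from a real server.
sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /products.html HTTP/1.1" 200 2326 '
          '"http://example.com/" "Mozilla/5.0"')
record = parse_log_line(sample)
print(record["host"], record["path"], record["status"])
# prints: 203.0.113.7 /products.html 200
```

Lines that do not match the pattern return None, so a real import would want to count and inspect those rather than silently drop them.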
Most web hosts provide free web log statistics in various formats:
Online logs
Latest Visitor - Shows the last 300 visitors to your site x
domain
Raw Access Logs
Error Logs
Online Web Log Analysis - 'Choose Log Program'
Downloadable csv files
Online graphical statistics - Awstats
About Website Statistics
Early web sites mostly consisted of a single HTML file, with web
site statistics including:
Hits - number of client requests made to the web server.
With the introduction of images in HTML, and web sites that spanned
multiple HTML files, the hit count lost its value. Every file requested
for a page [the HTML itself, each image, stylesheet, script etc] counts
as a hit, so it was impossible to gauge true visitor count.
To overcome this, Log Analyzer was released and two new units of
measure were introduced to gauge more accurately the amount of
human activity on web servers.
Page views - a request made to the web server
for a page, as opposed to a graphic
Visits - a sequence of requests from a uniquely
identified client that expires after a certain amount of inactivity,
usually 30 minutes.
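The difference between hits, page views and visits can be sketched in a few lines. This is a simplified illustration, not how any particular log analyzer works: the list of asset extensions, the function name summarise and the sample requests are all assumptions, and each request is reduced to a (client, time, path) tuple.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)
# Assumed list of file types that count as hits but not as page views.
ASSET_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def summarise(requests):
    """Count hits, page views and visits.

    requests: iterable of (client_id, timestamp, path) tuples.
    """
    hits = 0
    page_views = 0
    visits = 0
    last_seen = {}  # client_id -> time of that client's previous request
    for client_id, when, path in sorted(requests, key=lambda r: r[1]):
        hits += 1  # every request is a hit
        if not path.lower().endswith(ASSET_EXTENSIONS):
            page_views += 1  # only requests for pages count here
        previous = last_seen.get(client_id)
        # A new visit starts on first sight or after 30 minutes of inactivity.
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[client_id] = when
    return {"hits": hits, "page_views": page_views, "visits": visits}


start = datetime(2023, 10, 10, 9, 0)
requests = [
    ("a", start, "/index.html"),
    ("a", start + timedelta(seconds=1), "/logo.png"),
    ("a", start + timedelta(minutes=5), "/products.html"),
    ("a", start + timedelta(hours=2), "/index.html"),
    ("b", start + timedelta(minutes=10), "/index.html"),
]
print(summarise(requests))
# prints: {'hits': 5, 'page_views': 4, 'visits': 3}
```

Client "a" accounts for two visits because the last request arrives well over 30 minutes after the previous one.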
Whilst page views and visits are still commonly used, they are
now regarded as less valuable measurements. Search engine spiders
and robots, along with web proxies and dynamically assigned IP addresses
for large companies and ISPs, make it difficult to identify
unique human visitors to a website.
Log analyzers responded by tracking visits by cookies, and by ignoring
requests from known spiders.
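Ignoring known spiders usually comes down to checking the user agent string of each request. The sketch below shows the idea with a short, deliberately incomplete list of markers; the list, the function names and the sample records are assumptions for illustration, and real tools maintain much longer lists.

```python
# Illustrative, incomplete list of substrings found in crawler user agents.
KNOWN_SPIDER_MARKERS = ("googlebot", "bingbot", "slurp", "spider", "crawler")


def is_spider(user_agent):
    """Return True if the user agent looks like a known spider or robot."""
    agent = user_agent.lower()
    return any(marker in agent for marker in KNOWN_SPIDER_MARKERS)


def human_requests(records):
    """Keep only the records whose user agent is not a known spider."""
    return [r for r in records if not is_spider(r["agent"])]


# Hypothetical parsed log records.
records = [
    {"path": "/index.html",
     "agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"path": "/index.html",
     "agent": "Mozilla/5.0 (Windows NT 10.0; rv:118.0) Gecko/20100101 Firefox/118.0"},
]
print(len(human_requests(records)))
# prints: 1
```

A spider that sends a browser-like user agent will slip through, so this filter reduces the noise rather than removing it.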
The use of web caches also presents a problem for logfile analysis.
If a person revisits a page, the second request will often be retrieved
from the browser's cache, and so no request will be received by
the web server. The person's path through the site is lost. Caching
can be defeated by configuring the web server, but this can result
in degraded performance for the visitor to the website. To overcome
this caching issue and improve on the accuracy of logfile analysis,
page tagging became the accepted norm for tracking page visits.
The earliest form of page tagging was a 'web counter' - images
included in a web page that showed the number of times the image
had been requested, thereby providing an estimate of the number
of visits to that page.
Web Counter
Web counters evolved to employ a small invisible image rather
than a visible one, and to use JavaScript to pass certain information
about the page and the visitor along with the image request.
This information is then processed remotely by a web analytics program,
and extensive statistics are generated.
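The information travels as query parameters on the image request. The sketch below shows both ends of that exchange under stated assumptions: the collection address stats.example.com/pixel.gif and the parameter names page, ref and res are hypothetical, and the URL building that would normally happen in the page's JavaScript is written in Python here purely to keep the example in one language.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_pixel_url(base, page, referrer, screen):
    """Assemble the image URL the page tag would request (hypothetical parameters)."""
    query = urlencode({"page": page, "ref": referrer, "res": screen})
    return base + "?" + query


def read_pixel_request(url):
    """On the collecting server: recover the values sent with the image request."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in query.items()}


url = build_pixel_url(
    "http://stats.example.com/pixel.gif",  # hypothetical collection address
    page="/products.html",
    referrer="http://example.com/",
    screen="1024x768",
)
print(read_pixel_request(url))
# prints: {'page': '/products.html', 'ref': 'http://example.com/', 'res': '1024x768'}
```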
Cookies
A web analytics service also manages the process of assigning a
cookie to the user, which can uniquely identify them during their
visit and in subsequent visits.
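A minimal sketch of that cookie handling, using Python's standard http.cookies module, might look like the following. The cookie name visitor_id, the two-year lifetime and the function name identify_visitor are assumptions for illustration, not the behaviour of any particular analytics service.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"           # hypothetical cookie name
TWO_YEARS = 60 * 60 * 24 * 365 * 2   # assumed lifetime, in seconds


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the visitor already carries the cookie,
    otherwise it is the value to send back in a Set-Cookie header.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if COOKIE_NAME in cookies:
        # Returning visitor: reuse the identifier they sent us.
        return cookies[COOKIE_NAME].value, None
    # New visitor: issue a fresh random identifier.
    visitor_id = uuid.uuid4().hex
    fresh = SimpleCookie()
    fresh[COOKIE_NAME] = visitor_id
    fresh[COOKIE_NAME]["max-age"] = TWO_YEARS
    fresh[COOKIE_NAME]["path"] = "/"
    return visitor_id, fresh[COOKIE_NAME].OutputString()


new_id, set_cookie = identify_visitor("")
returning_id, nothing_to_set = identify_visitor("visitor_id=abc123")
print(returning_id, nothing_to_set)
# prints: abc123 None
```

Visitors who clear or block cookies receive a new identifier each time, which is one reason cookie-based visitor counts are estimates.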
Callback
With Ajax-based solutions, instead of using an invisible image,
a callback to the server is made from the rendered page.
When the page is rendered on the web browser, a piece of Ajax code
'calls back' to the server and passes information about the client
that can then be aggregated by a web analytics program. This process
can be thwarted by browser restrictions on the servers which can
be contacted with XMLHttpRequest objects.
There are advantages and disadvantages to all methods of web data
collection. My preference is for a hybrid model with a web analytics
solution that uses both JavaScript page tagging and log files, which
allows website owners to keep control of their web data.
Storing Data
Using an online analytics tool is fine for all but large enterprises;
however, one shortfall is that the data is also kept online. A key part
of your web analytics data strategy should be to pull all data required
to support your business decisions into a central data store. This
data store becomes the foundation of your business decision support
system.
All analytics dashboards access data from this central repository
and NOT your operational systems.
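As a minimal sketch of such a central repository, the example below loads daily figures from different sources into one table and answers a dashboard-style question from it. It uses Python's built-in sqlite3 module so that it runs anywhere; the table name daily_traffic, its columns and the sample figures are all invented for illustration, and a production store would more likely sit on a server database.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the central data store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_traffic ("
        " day TEXT NOT NULL,"
        " source TEXT NOT NULL,"
        " page_views INTEGER NOT NULL,"
        " visits INTEGER NOT NULL,"
        " PRIMARY KEY (day, source))"
    )
    return conn


def import_daily(conn, rows):
    """Load one day's figures; re-running the same import replaces the rows."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO daily_traffic VALUES (?, ?, ?, ?)", rows
        )


def visits_by_day(conn):
    """Dashboard-style query that reads only from the central store."""
    return conn.execute(
        "SELECT day, SUM(visits) FROM daily_traffic GROUP BY day ORDER BY day"
    ).fetchall()


conn = open_store()
# Made-up figures from two hypothetical sources.
import_daily(conn, [
    ("2023-10-10", "weblog", 1200, 400),
    ("2023-10-10", "email", 90, 60),
    ("2023-10-11", "weblog", 1500, 450),
])
print(visits_by_day(conn))
# prints: [('2023-10-10', 460), ('2023-10-11', 450)]
```

Keying the table on day and source means a daily import can be repeated after a failure without double counting.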
Most small online businesses start with a MySQL database - a free
database software package provided by your web host. As the business
grows, and you move to a dedicated server, transitioning to MS SQL
Server will provide more flexibility for integration with reporting
services and online performance portals.