Server Log File Analysis or Page Tagging? How Web Server Administrators Can Optimize Their Web Sites with Web Analytics

Question:

How do you conduct web analytics for your web servers, and what are the pros and cons of your approach?

I am interested in learning from other web server administrators about their web analytics practices and preferences. I work for a non-profit organization that runs web servers both internally and on the internet. We use WebLog Expert to analyze our IIS logs (W3C Extended format) and get insights into our site visitors’ behavior, location, and browser/OS choices. We do not need to optimize our web sites for marketing purposes, but we want to provide a good user experience and support the most common browsers/OS. WebLog Expert is a server log file analyzer that works well for our needs, but I wonder if there are other options or methods that could offer more benefits or features. For example, I have heard of page tagging, but I do not know much about it or how it compares to server log file analysis. I would appreciate any feedback or suggestions from other web server administrators who have experience with different web analytics tools or techniques. Thank you.

Answer:

Web analytics is the process of collecting, analyzing, and reporting data about the traffic to and usage of web sites. It can help web server administrators understand their site visitors’ behavior, preferences, needs, and satisfaction, and optimize their web sites for performance, usability, accessibility, and security.

There are two main methods of web analytics: server log file analysis and page tagging. Server log file analysis involves reading and processing the log files that the web server writes every time a request is made to the web site. Page tagging involves inserting a small piece of code (usually JavaScript) into the web pages; the code sends information to a collection server, usually one run by a third-party provider, every time a page is loaded or an event occurs on the web site.

Both methods have their advantages and disadvantages, and web server administrators should choose the one that best suits their needs and goals. In this article, we will compare the two methods based on the following criteria: data accuracy, data completeness, data ownership, data privacy, data latency, data analysis, and data cost.

Data Accuracy

Data accuracy refers to how well the data reflects the actual web traffic and usage of the web site. Data accuracy can be affected by various factors, such as caching, bots, proxies, cookies, browsers, and network errors.

Server log file analysis offers high data accuracy for the traffic that actually reaches the web server: every request is recorded, whether or not it succeeds. Requests from well-behaved bots, spiders, and crawlers can be filtered out, because they usually identify themselves with a distinctive user agent string; bots that imitate a browser’s user agent are harder to detect. However, server log file analysis cannot see page views that are served from a browser cache or from an intermediate caching proxy, because those requests never reach the web server. Requests that do arrive through a proxy are logged, but the log may show the proxy’s IP address rather than the site visitor’s, so several visitors can appear as one.
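
As a rough illustration of filtering by user agent string, the following Python sketch flags requests whose user agent contains a typical crawler keyword. The keyword list and the function names are assumptions made for this example, not part of WebLog Expert or IIS, and a simple match like this will miss bots that pretend to be browsers.

```python
import re

# Illustrative, incomplete list of substrings that self-identifying
# crawlers tend to include in their user agent strings (an assumption).
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_probable_bot(user_agent: str) -> bool:
    """Return True when the user agent looks like a self-identified crawler."""
    # W3C Extended logs write "-" for a missing field; a request without
    # a user agent is treated as suspicious here.
    if not user_agent or user_agent == "-":
        return True
    return bool(BOT_PATTERN.search(user_agent))


def human_requests(user_agents):
    """Keep only the user agents that do not look like crawlers."""
    return [ua for ua in user_agents if not is_probable_bot(ua)]
```

A heuristic like this is only a first pass: it can also produce false positives for legitimate user agents that happen to contain one of the keywords.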

Page tagging has lower data accuracy in this respect, because it relies on the tag’s code running in the visitor’s browser, which does not always happen. Page tagging misses page views when ad blockers, firewalls, or browser settings stop the code from running or from reaching the collection server, and when the browser does not support JavaScript or has it disabled. Page tagging can also overcount unique visitors: a visitor who deletes or blocks the identifying cookies is counted as a new visitor on each return. On the other hand, because the tag runs each time a page is displayed, page tagging can count page views that are served from a cache, which server logs miss.
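
To make the cookie effect concrete, here is a small Python sketch that counts unique visitors the way a cookie-based tag would, by distinct cookie ID. The hit records, the field names, and the visitor names are invented for illustration.

```python
def count_unique_visitors(hits):
    """Count distinct cookie IDs, as a cookie-based page tag would."""
    return len({hit["visitor_id"] for hit in hits})


# Invented sample: "alice" clears her cookies between two visits,
# so the tag issues her a second ID.
hits = [
    {"person": "alice", "visitor_id": "c1"},
    {"person": "alice", "visitor_id": "c2"},  # new cookie after deletion
    {"person": "bob", "visitor_id": "c3"},
]

reported = count_unique_visitors(hits)          # what the tool reports
actual = len({hit["person"] for hit in hits})   # real number of people
```

Here the reported figure is 3 although only 2 people visited, which is the kind of inflation described above.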

Data Completeness

Data completeness refers to how much data is collected and available for analysis. Data completeness can be affected by the type and amount of information that is recorded and stored by the web analytics method.

Server log file analysis offers high data completeness, as the log can record a lot of information about each request, such as the IP address, user agent, referrer, date, time, request method, request URL, response status, response size, and response time. It can also capture custom information, such as a session ID, user ID, or other parameters, if they are included in the request URL or the cookies and the server is configured to log those fields. The log files can be kept indefinitely, as long as there is enough disk space on the web server.
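
For readers who want to see what these fields look like in practice, the sketch below reads W3C Extended log lines into dictionaries keyed by the names in the `#Fields:` directive. The function name and the sample lines are made up for this example (the IP addresses come from a range reserved for documentation); the field names follow the usual IIS W3C naming.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines; only #Fields changes how entries are read.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()  # entries are whitespace-separated
        # Skip entries that do not match the current field list.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


# Invented sample log, for illustration only.
sample = [
    "#Software: Microsoft Internet Information Services 10.0",
    "#Version: 1.0",
    "#Fields: date time cs-method cs-uri-stem c-ip cs(User-Agent) sc-status time-taken",
    "2024-01-15 10:32:01 GET /index.html 203.0.113.7 Mozilla/5.0+(Windows+NT+10.0) 200 46",
    "2024-01-15 10:32:05 GET /missing.html 203.0.113.9 Mozilla/5.0+(X11;+Linux) 404 12",
]
records = list(parse_w3c_log(sample))
```

Reading the field list from the directive, rather than hard-coding column positions, keeps the sketch working when the set of logged fields is changed in the server configuration.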

Page tagging has lower data completeness on the server side. It records client-side details, such as the page title, page URL, screen resolution, color depth, language, and browser name, and it can also record custom information, such as a session ID, user ID, or other parameters, if they are included in the code or the cookies. However, page tagging cannot record information that is not available to a script running in the page, such as the request method, response status, response size, and response time. How long the data is kept also depends on the storage capacity and retention policy of the collection server, which for a third-party service is outside the administrator’s control.

Data Ownership

Data ownership refers to who owns and controls the data that is collected and stored by the web analytics method. Data ownership can affect the access, security, and usage of the data.

Server log file analysis gives the web server administrator full ownership of the data, because the log files stay on servers that the administrator controls. The administrator can access, secure, and use the data as they wish, without any interference or limitation from a third-party provider, and can share or export the data to other platforms or tools.

Page tagging usually gives the web server administrator much less ownership, because the data is held and controlled by the third-party provider. The administrator has to rely on the provider to access, secure, and use the data, and may have to comply with the provider’s terms and conditions, policies, and fees. The ability to share or export the data to other platforms or tools may also be limited or absent, depending on the provider’s features and compatibility.

Data Privacy

Data privacy refers to how the data is protected from unauthorized or unethical use or disclosure. Data privacy can affect the trust, reputation, and compliance of the web server administrator and the web site.

Server log file analysis offers strong data privacy, as the data stays on the web server, which the web server administrator can secure and encrypt. The administrator can also anonymize or delete data that contains personal or sensitive information, such as the IP address, user ID, or other parameters, to protect the privacy of the site visitors. The administrator can also inform site visitors about the data collection and usage, obtain their consent, and give them the option to opt out.
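
One common way to anonymize IP addresses is to zero out the host part before the logs are analyzed or archived. The Python sketch below does this with the standard `ipaddress` module. The function name and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are assumptions chosen for this example; the right amount of truncation depends on your own privacy requirements.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host bits of an IP address, keeping only a network prefix."""
    ip = ipaddress.ip_address(address)
    # Assumed prefix lengths: keep the first 24 bits of an IPv4 address
    # and the first 48 bits of an IPv6 address.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)
```

With these settings, `anonymize_ip("203.0.113.77")` returns `"203.0.113.0"`, so location reports at a coarse level remain possible while individual hosts are no longer distinguishable.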

Page tagging offers weaker data privacy, as the data is stored on the third-party server, whose security and encryption the web server administrator cannot verify or control. The third-party provider may also collect, use, or share data that contains personal or sensitive information, such as the IP address, user ID, or other parameters, without the knowledge or consent of the web server administrator or the site visitors. Depending on the provider’s features and compliance, the administrator may also find it difficult or impossible to inform site visitors about the data collection and usage, obtain their consent, and give them the option to opt out.

Data Latency

Data latency refers to how long it takes for collected data to become available for analysis. Data latency can affect the timeliness, relevance, and accuracy of the data.

Server log file analysis tends to have high data latency, as the data sits in log files on the web server until it is processed, which is usually not done in real time. The web server administrator may have to wait for a certain period of time, such as an hour, a day, or a week, before they can access and analyze the data. The administrator may also have to run the web analytics tool manually or on a schedule to process and update the data.
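
If the log is processed on a schedule, each run does not have to re-read the whole file. The following Python sketch reads only the lines appended since the previous run by remembering a byte offset. The function name is hypothetical, and where the offset is stored between runs is left to the caller.

```python
def read_new_lines(path, offset=0):
    """Return the complete lines appended after `offset`, and the next offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so an entry the server is
    # still writing is left for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Running a function like this every few minutes shortens the delay between a request and its appearance in a report, at the cost of more frequent processing. It assumes the log file is only appended to; after a log rotation the stored offset would have to be reset.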

Page tagging tends to have low data latency, as the data is sent to the collection server as each page is viewed and may be processed or updated in real time. The web server administrator can access and analyze the data almost instantly, or within a few minutes, after it is collected. The administrator can also use the web analytics tool provided by the third-party provider, which may be updated automatically or continuously.

Data Analysis

Data analysis refers to how the data is processed, presented, and interpreted by the web analytics method. Data analysis can affect the usefulness, usability, and reliability of the data.

Server log file analysis offers less built-in analysis, as the log data is raw, one line per request, and may require a lot of processing and manipulation to extract meaningful and actionable insights. The web server administrator may have to use a web analytics tool that can read and process the log files and generate reports and charts that visualize and summarize the data. The administrator may also have to rely on their own skills and knowledge to interpret the data and apply it to their web site optimization goals.

Page tagging offers more built-in analysis, as the data arrives already structured and organized, and may require less processing and manipulation to extract meaningful and actionable insights. The web server administrator can use the web analytics tool provided by the third-party provider, which processes and presents the data automatically. The administrator can also use the features of that tool, such as filters, segments, goals, funnels, events, conversions, and recommendations, to analyze and understand the data and apply it to their web site optimization goals.
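
As an example of the kind of processing involved, the Python sketch below turns request records into a small report of the most requested pages and the response status counts. It assumes the log has already been parsed into dictionaries keyed by W3C field names; the function name and the sample records are invented for illustration.

```python
from collections import Counter


def summarize(records, top_n=3):
    """Summarize request records into top pages and status code counts."""
    pages = Counter()
    statuses = Counter()
    for rec in records:
        statuses[rec["sc-status"]] += 1
        # Count a page as viewed only when the response was successful (2xx).
        if rec["sc-status"].startswith("2"):
            pages[rec["cs-uri-stem"]] += 1
    return {"top_pages": pages.most_common(top_n), "status_counts": dict(statuses)}


# Invented sample records, for illustration only.
records = [
    {"cs-uri-stem": "/index.html", "sc-status": "200"},
    {"cs-uri-stem": "/index.html", "sc-status": "200"},
    {"cs-uri-stem": "/about.html", "sc-status": "200"},
    {"cs-uri-stem": "/missing.html", "sc-status": "404"},
]
report = summarize(records)
```

A dedicated log analyzer produces far richer reports than this, but the sketch shows why raw logs need an aggregation step before they tell you anything useful.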

Data Cost

Data cost refers to how much money or resources are required to collect, store, and analyze the data by the web analytics method. Data cost can affect the affordability, scalability, and sustainability of the data.

Server log file analysis has a low data cost, as the data is collected and stored on the web server that is already in place, and may not require any additional or external money or resources. The web server administrator may only have to pay for the web server hosting and maintenance, and for the web analytics tool, if they choose to use one. The administrator also has the flexibility to collect and store as much data as they want or need, as long as there is enough disk space on the web server.

Page tagging can have a higher data cost, as the data is collected and stored on the third-party server, which may require additional or external money or resources. The web server administrator may have to pay for the third-party provider’s service and features, and the price may vary with the amount and type of data that is collected and stored and with the web analytics tool that is used. The web server administrator may also have limited or no flexibility and scalability
