Help for statistics version 5

<H1>statistics module v5, as of 06/10/2003</H1>
<font size=-2>Copyright 2003 by Craig R. Saunders.  All rights reserved.</font>

<p>
The statistics module replaces the current Stats and Referer modules with a faster, more flexible, more extensible implementation. 
<H2>Features</H2>

<ol>
	<li><b>Replaces Referer and Stats module functionality.</b>  Historically, PostNuke has collected referer data separately from other visitor statistics.  That separation was an arbitrary design decision, so the statistics module collects all visitor statistics together.  We believe that this results in a simpler design that is easier for administrators to use.

	<li><b>Referer and Summary Stats reports available from a user menu.</b>  By separating out the reporting functions from the administrative functions, we made it easier to manage access to Statistics reporting (via the PN Permissions System.)

	<li><b>Stats collection can be turned on and off from the admin menu.</b>  The old Stats module collects statistics whether you want them or not.  The only way to turn them off is to change the PHP code.  Furthermore, the administrator can't control how and when statistics are collected by the old Stats module.

	<li><b>Detailed statistics collected.</b>  The old Stats module summarizes data at the time it is collected.  This means that if the administrator wants to summarize data in a different way, it may not be possible.  In the statistics module all of the details of a visit are stored in the database so that later they can be analyzed in any manner.  

	<li><b>Stats can be filtered by user ID, IP address and hostname of the client.</b>  The statistics module uses regular expressions to filter any or all of these three fields.  As a result, the administrator has full control over when statistics are or are not collected.

	<li><b>Stats can be consolidated by time interval, referer and other parameters.</b>  Many sites do not need second-by-second recording of visitor statistics.  The statistics module gives the administrator the tools to consolidate statistics to save disk space without sacrificing detail.

	<li><b>Implemented using new pnAPI.</b>

	<li><b>Install, uninstall and admin fully implemented.</b>

	<li><b>Faster collection of statistics.</b>  The most significant usage of time is database access.  The statistics module collects data with a minimum of 3 SQL queries where the Stats and Referer modules need 8-10 SQL queries.

	<li><b>Easier to extend and customize.</b>  The statistics module is structured so that additional reports can be included by adding one or two functions and inserting a menu option.
</ol>
<strong>Note:</strong> In this release two features have not been completed or tested.  One of those features is the use of archive and summary tables.  Since there is no way to copy or move statistics data from the details table to the archive or summary tables, setting consolidation or filtering for those tables is pointless.
<p>The consolidation of statistics has not been fully tested.  You should test this feature before depending on it.
<H2>Quality Statement</H2>
This module was tested with PostNuke 0.713 and 0.714.  Other users have installed it on more recent versions, including PostNuke 0.723.  Some users on PN 0.723 have reported installation problems while others have installed and run fine.  We believe that the users who experienced problems did not precisely follow the installation instructions and ran afoul of a bug in PostNuke.
<p>THIS IS A BETA QUALITY RELEASE. USE AT YOUR OWN RISK!  Although this is a Beta Quality release, it already includes more functionality than the existing Stats and Referers modules.  So, what we have is something that is better than the existing modules but does not yet include all of its own intended functionality.  In other words, it's better than what you have now but not as good as it will eventually be.
<p>There are references to the Summary and Archive tables included in this release, but their use is not yet fully implemented.  For the time being, only the Details table is accessible.
<p>Consolidation has not been tested and may have bugs.

<H2>Installation</H2>
<strong><i>WARNING!</i> It is very important that you follow the order of the installation instructions precisely!  There is a bug in PN 0.723 that will cause problems if you do not Initialize and Activate the statistics module first!</strong>
<ol>
	<li>Download the statitics.zip file into your modules directory and extract everything using the appropriate utility.  This should create a statistics directory under modules.
	<li>Go to Admin->Modules and click on Regenerate.
	<li>Find the statistics module in the resulting list and click on Initialize.
	<li>Again find the statistics module in the module list and click on Activate.
	<li>ENABLING STATS COLLECTION REQUIRES A CHANGE TO includes/pnAPI.php file.  BE VERY CAREFUL!  Look for this code at the bottom of the pnInit() function:
	<blockquote><pre>
  // Other other includes
    include 'includes/advblocks.php';
    include 'includes/counter.php';
    include 'includes/pnHTML.php';
    include 'includes/pnMod.php';
    include 'includes/queryutil.php';
    include 'includes/xhtml.php';
    include 'includes/oldfuncs.php';

    // Handle referer
    if (pnConfigGetVar('httpref') == 1) {
        include 'referer.php';
        httpreferer();
	}
</pre></blockquote>
and after that code block, insert:
<blockquote><pre>

    include 'modules/statistics/collect.php';
    </pre></blockquote>
    <li>If you want to migrate existing stats data into the new statistics module, go to Admin->statistics->Maintain Data Tables. Click on the "Submit" button for the Migrate Data option.
    <li>Go to Admin->statistics->Parameter Configuration.  Set desired Filtering and Consolidation options.  Click on "Collect Stats?" Click on Submit. (<i><b>Note:</b> Some users on PN 0.723 have reported that the Collect Stats? box won't stay checked.  This is a result of a problem during installation.  If this happens to you, perform the steps in the section <b>Manual Installation</b> and then return here to complete installation.</i>)
    <li>Add statistics to Main Menu.  Use {statistics} as the URL for this module in menu blocks.  (The statistics module is a pnAPI-compliant module, so it uses {} instead of the [] that old-style module calls use.)
    <li>Set permissions on who can access statistics user menu from Main Menu. (See <b>Permissions</b> section below for more info.)
    </ol>
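<p>Because the edit to includes/pnAPI.php is easy to get wrong, it can help to check the result before moving on.  The following is a minimal Python sketch, not part of the module: the function name and the sample text are hypothetical, and it only confirms that the collect.php include appears somewhere after the call to httpreferer().

```python
def include_is_after_referer_block(source: str) -> bool:
    """Return True if the collect.php include appears after the referer block."""
    referer_pos = source.find("httpreferer();")
    collect_pos = source.find("include 'modules/statistics/collect.php';")
    # find() returns -1 when the text is missing, so both checks are needed.
    return referer_pos != -1 and collect_pos > referer_pos


# Hypothetical excerpt of a patched pnAPI.php, used only for illustration.
SAMPLE_PATCHED = """
    // Handle referer
    if (pnConfigGetVar('httpref') == 1) {
        include 'referer.php';
        httpreferer();
    }

    include 'modules/statistics/collect.php';
"""

print(include_is_after_referer_block(SAMPLE_PATCHED))
```

<p>To check a real installation, read the contents of includes/pnAPI.php into a string and pass it to the same function.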
    <H2>Upgrade from Previous Releases</H2>
<ol>
	<li>Backup/Copy your existing stats_details, stats_archive and stats_summary tables.
	<li>Download the statitics.zip or statistics_tar.gz file into your modules directory and extract everything using the appropriate utility.
	<li>Go to Admin->Modules and click on Regenerate.
	<li>Find the statistics module in the resulting list.  The status should say "New version found". Click on Upgrade.
	<li>Again find the statistics module in the module list and click on Activate.
	<li>If you want to migrate existing stats data into the new statistics module, go to Admin->statistics->Maintain Data Tables. Click on the "Submit" button for the Migrate Data option.  If you've been collecting stats with both the old Stats module and the statistics module, this will create duplicate entries.
    </ol>
<H2>Manual Installation</H2>
Some users have reported problems installing the statistics module on PN 0.723.  If it appears that you have an installation problem, you can complete the installation using the following steps.  But before you do, please send an email to <a href="mailto:crs@mtrad.com">crs@mtrad.com</a> describing your problem.  We are building a profile of symptoms to improve installation in future releases.
<p>Using phpMyAdmin or similar tool, execute the following queries on your PN database:
<ol>
	<li><strong>Create the stats_details table:</strong>
	<blockquote><pre>
CREATE TABLE nuke_stats_details (
            pn_id int(8) NOT NULL auto_increment,
            pn_key varchar(255) default '',
            pn_host varchar(255) default '',
            pn_requrl varchar(255) default '',
            pn_http_referer varchar(255) default '',
            pn_ip varchar(15) default '',
            pn_hostip varchar(99) default '',
            pn_os varchar(15) default '',
            pn_browser varchar(15) default '',
            pn_uname varchar(15) default '',
            pn_uid int(4) default '0',
            pn_sessionid varchar(64) default '',
            pn_count smallint unsigned default '0',
            pn_timestamp datetime NOT NULL,
            pn_summary_timestamp datetime NOT NULL default '0000-00-00 00:00:00',
            PRIMARY KEY(pn_id));
    </pre></blockquote>
    <li><strong>Create the stats_summary table:</strong>
    <blockquote><pre>
CREATE TABLE nuke_stats_summary (
            pn_id int(8) NOT NULL auto_increment,
            pn_key varchar(255) default '',
            pn_host varchar(255) default '',
            pn_requrl varchar(255) default '',
            pn_http_referer varchar(255) default '',
            pn_ip varchar(15) default '',
            pn_hostip varchar(99) default '',
            pn_os varchar(15) default '',
            pn_browser varchar(15) default '',
            pn_uname varchar(15) default '',
            pn_uid int(4) default '0',
            pn_sessionid varchar(64) default '',
            pn_count smallint unsigned default '0',
            pn_timestamp datetime NOT NULL,
            pn_summary_timestamp datetime NOT NULL default '0000-00-00 00:00:00',
	    PRIMARY KEY(pn_id));
    </pre></blockquote>
    <li><strong>Create the stats_archive table:</strong>
    <blockquote><pre>
CREATE TABLE nuke_stats_archive (
            pn_id int(8) NOT NULL auto_increment,
            pn_key varchar(255) default '',
            pn_host varchar(255) default '',
            pn_requrl varchar(255) default '',
            pn_http_referer varchar(255) default '',
            pn_ip varchar(15) default '',
            pn_hostip varchar(99) default '',
            pn_os varchar(15) default '',
            pn_browser varchar(15) default '',
            pn_uname varchar(15) default '',
            pn_uid int(4) default '0',
            pn_sessionid varchar(64) default '',
            pn_count smallint unsigned default '0',
            pn_timestamp datetime NOT NULL,
            pn_summary_timestamp datetime NOT NULL default '0000-00-00 00:00:00',
            PRIMARY KEY(pn_id));
    </pre></blockquote>
    <li><strong>Create module variables:</strong>
    <blockquote><pre>
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='collect', pn_value=0;
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='filter', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='filter_ip', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='filter_hostname', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='filter_user', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_consolidate', pn_value=0;
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_interval', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_referer', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_browser', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_os', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_request', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='details_visit', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_consolidate', pn_value=0;
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_interval', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_referer', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_browser', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_os', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_request', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='summary_visit', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_consolidate', pn_value=0;
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_interval', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_referer', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_browser', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_os', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_request', pn_value='';
INSERT  INTO `nuke_module_vars` SET pn_modname='statistics', pn_name='archive_visit', pn_value='';
    </pre></blockquote>
    </ol>
    PostNuke enforces uniqueness of the names of tables and columns by prefixing a "nuke_" onto every table name and "pn_" on every column name.  If you have changed either of these strings to something else, please be sure to make the same adjustment on these SQL queries.
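<p>If your site uses different prefixes, the adjustment can be made mechanically instead of by hand.  The sketch below is a hypothetical Python helper (the name apply_prefixes and the example prefixes "site_" and "col_" are assumptions, not anything PostNuke provides) that rewrites the default prefixes in a query string before you paste it into phpMyAdmin.

```python
import re


def apply_prefixes(sql: str, table_prefix: str, column_prefix: str) -> str:
    """Swap the default 'nuke_' and 'pn_' prefixes for site-specific ones."""
    replacements = {"nuke_": table_prefix, "pn_": column_prefix}
    # \b ensures only prefixes at the start of an identifier are rewritten.
    return re.sub(r"\b(nuke_|pn_)", lambda m: replacements[m.group(1)], sql)


query = ("INSERT INTO `nuke_module_vars` SET pn_modname='statistics', "
         "pn_name='collect', pn_value=0;")
print(apply_prefixes(query, "site_", "col_"))
```

<p>Review the rewritten queries before running them; the helper does a plain text substitution and knows nothing about SQL.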
    <H2>Default Start Date for Reports</H2>
Some users have reported that the default start date for statistics reports is some time in the future.  This is because the Site Start Date is not a complete, valid date. We usually see something like "June 2002" when the user is experiencing the problem.  If you go to Admin -> Settings -> Site Start Date and enter a complete date like "June 1, 2002" the problem will go away. 
<p>We will see about including a work-around for this in a future release of the statistics module.  

<H2>Permissions</H2>
To control whether users see the statistics module in the main menu, I put these lines in:
<blockquote><pre>
1st permission:

Group: Statisticians
Component: Menublock::
Instance: Main Menu:Statistics:
Permissions Level: Admin

2nd permission:

Group: All groups
Component: Menublock::
Instance: Main Menu:Statistics:
Permissions Level: None
</pre></blockquote>
Note that this assumes that you have a group, "Statisticians", defined and that you've added the statistics module as "Statistics" in the menu block.
<p>
This is all I use for controlling access to the statistics module on my sites.  That means that folks can still run a statistics report if they know the proper URL.  There are pros and cons to this.  Personally, I don't think it's a big deal if someone gets access to my site statistics reports.  On the other hand, others might want to keep them a secret.
<p>
If you want to completely restrict access to statistics reporting, add these permissions:
<blockquote><pre>
1st permission:

Group: Statisticians
Component: statistics::
Instance: Main .*
Permissions Level: Read

2nd permission:

Group: All groups
Component: statistics::
Instance: .*
Permissions Level: None
</pre></blockquote>
This also assumes that you have a group "Statisticians".  But you could just as easily use "Admins" or "Users" (i.e., registered and logged in).
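<p>The order of the two permissions matters.  The sketch below illustrates why, under two assumptions that are not stated above: that rules are checked from top to bottom with the first matching rule deciding, and that Component and Instance are matched as whole-string regular expressions.  It is written in Python for illustration only; the function name, the simplified rule tuples and the instance string "Main menu" are hypothetical.

```python
import re

# The two rules from the example above, in order: (group, component, instance, level).
RULES = [
    ("Statisticians", "statistics::", "Main .*", "Read"),
    ("All groups", "statistics::", ".*", "None"),
]


def allowed_level(rules, group, component, instance):
    """Return the level of the first rule that matches, or 'None' if no rule does."""
    for rule_group, rule_component, rule_instance, level in rules:
        group_ok = rule_group in ("All groups", group)
        if (group_ok
                and re.fullmatch(rule_component, component)
                and re.fullmatch(rule_instance, instance)):
            return level
    return "None"


print(allowed_level(RULES, "Statisticians", "statistics::", "Main menu"))
print(allowed_level(RULES, "Users", "statistics::", "Main menu"))
```

<p>Under these assumptions, placing the "All groups" rule first would hide the reports from the Statisticians group as well, because the catch-all rule would match before the specific one.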

<H2>Languages</H2>
The only language supplied is English.  If other language definitions become available, I will include them.  (If you would like to send me one, please send it to <a href="mailto:crs@mtrad.com">crs@mtrad.com</a>.)
<H2>Administration</H2>
When you click on the "statistics" option on the Administration menu, you will get this menu to display:
<p><img src="pnimages/screenshot_adminmenu.jpg" >
<p><strong>Parameter Configuration:</strong>  This page enables the administrator to make changes to the parameters that determine how the statistics module functions.
<p><strong>Maintain Data Tables:</strong>  This page provides the administrator with functions to dump or empty the data tables and to import data from the old Stats module.
<p><strong>Site Statistics Reports:</strong>  This is a link to the statistics reports menu.
<H3>Parameter Configuration</H3>
When you click on the "Parameter Configuration" option on the main statistics admin menu, you will get this page to display:
<p><img src="pnimages/screenshot_parameters.jpg" >
<H4>Collect Statistics</H4>
Check this box when you want to enable collection of statistics data from visitors.  Uncheck it if, for some reason, you need to stop data collection.  After changing the collect statistics box, remember to click on the "Submit" button to save the changes.
<H4>Filtering</H4>
The statistics module can filter out data that the administrator does not want to collect.  This is often used to filter out the site administrator and others who are working on the site and not visitors who we want to track.  You can also use filtering to ignore search engine bots and spiders.
<p>Filters are specified as Regular Expression patterns.  This allows the administrator to specify multiple targets to filter or even a range.  The simplest case of a single target can be entered exactly as you would find it.  To enter ranges or multiple targets, consult the articles on Regular Expressions found in the Bibliography section below.
<blockquote>
<p><strong>Where to filter:</strong> Pick the data table where filtering should be used: Details, Summary, Archive, All Tables or No Filtering.
<p><strong>IP Address Filter Pattern:</strong>  Specify the IP address(es) that should be filtered out.
<p><strong>Hostname Filter Pattern:</strong>  Specify the client hostname(s) that should be filtered out.
<p><strong>Username Filter Pattern:</strong>  Specify the username(s) that should be filtered out.
</blockquote>
<h5>Filtering Examples</h5>
<ol>
	<li>To filter a specific user, just enter the user's login.  For example, "flora".  (Note that the quotes are <i>not</i> included.)
	<li>To filter two or more users, enter the user logins separated with the vertical bar.  For example, "flora|joe|sam".
	<li>To filter an IP address, just enter the IP address with "\." for each of the three dots.  For example, "10\.0\.0\.237".
	<li>To filter a class B or C subnet of IP addresses, enter the network portion of the address and substitute "[0-9]{1,3}" for each remaining octet.  For example, "10\.10\.10\.[0-9]{1,3}" matches the class C subnet 10.10.10.0, and "10\.10\.[0-9]{1,3}\.[0-9]{1,3}" matches the class B subnet 10.10.0.0.
</ol>
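<p>It is worth trying a pattern against a few sample values before saving it.  The sketch below uses Python's re module as a stand-in; the regular expression flavor used by the statistics module, and whether it anchors patterns to the whole value, are not stated here, so treat the unanchored search as an assumption.  The function name is_filtered is hypothetical.

```python
import re


def is_filtered(pattern: str, value: str) -> bool:
    """Return True if a non-empty filter pattern matches anywhere in the value."""
    return bool(pattern) and re.search(pattern, value) is not None


# Single user and multiple users.
print(is_filtered("flora", "flora"))
print(is_filtered("flora|joe|sam", "joe"))

# A single IP address, with the dots escaped.
print(is_filtered(r"10\.0\.0\.237", "10.0.0.237"))

# An unescaped dot matches any character, which is why "\." is needed.
print(is_filtered("10.0.0.237", "10a0b0c237"))

# A class C subnet: three fixed octets and one variable octet.
print(is_filtered(r"10\.10\.10\.[0-9]{1,3}", "10.10.10.42"))
print(is_filtered(r"10\.10\.10\.[0-9]{1,3}", "10.10.11.42"))

# With an unanchored search, "flora" also matches "floraine";
# adding ^ and $ restricts the match to the exact login.
print(is_filtered("flora", "floraine"))
print(is_filtered("^flora$", "floraine"))
```

<p>If the module does match patterns without anchoring, adding "^" and "$" around a pattern is a simple way to avoid filtering more visitors than intended.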
<H4>Consolidation</H4>
The statistics module can consolidate data so that unnecessary data is not kept.  For example, if the administrator only cares how often a page is viewed broken down by the hour, the statistics module doesn't need to keep track of each and every identical hit during that hour.  Instead, all of the same views (e.g., for the index.php page) during the same hour can be consolidated into a single record.
<p>Consolidation should be used carefully because once data is consolidated, the detail is lost and cannot be retrieved at a later date.  (That's one of the problems with the old Stats module.  All data is consolidated based on the original author's expectations of what the administrator wants.  If you want something else, you are out of luck.)  Our suggestion is that you collect all statistics (with no consolidation) into the Details table.  You can move data to the Summary table and consolidate it there, and use the Archive table to store the old details.  That way, you can run most of your big reports from the Summary table but still have the details in the Archive table, if you ever want them later.
<blockquote>
<p><strong>Consolidate? :</strong>  Specify whether data for each table should be consolidated.
<p><strong>Time Interval:</strong> Specify the time interval for which statistics should be consolidated:
<ul>
	<li><strong>All:</strong> Statistics are collected and consolidated by the second.  (Default behavior)
	<li><strong>Minute:</strong> Statistics collected during the same minute will be consolidated together.
	<li><strong>Hour:</strong> Statistics collected during the same hour will be consolidated together.
	<li><strong>Day:</strong> Statistics collected during the same day will be consolidated together.
	<li><strong>Month:</strong> Statistics collected during the same month will be consolidated together.
</ul>
<p><strong>Referers:</strong> Specify if referers will be consolidated:
<ul>
	<li><strong>All:</strong> Statistics are collected for all referers.  (Default behavior)
	<li><strong>External:</strong> Statistics collected only for referrals from other web sites.
	<li><strong>None:</strong> No referer statistics are collected.
</ul>
<p><strong>Browsers:</strong>  Specify how browser statistics should be collected:
<ul>
	<li><strong>All:</strong> Statistics are collected on all browsers.  (Default behavior)
	<li><strong>None:</strong> No browser statistics are collected.
</ul>
<p><strong>O/S:</strong>  Specify how Operating System statistics should be collected:
<ul>
	<li><strong>All:</strong> Statistics are collected on all Operating Systems.  (Default behavior)
	<li><strong>None:</strong> No Operating Systems statistics are collected.
</ul>
<p><strong>Request Level:</strong> Specify how module statistics will be consolidated:
<ul>
	<li><strong>All:</strong> Statistics are collected for all module references.  (Default behavior)
	<li><strong>Function:</strong> All statistics for a specific function call within a module are consolidated.
	<li><strong>Module:</strong> Statistics for all calls to a specific module are consolidated.
	<li><strong>None:</strong> No request statistics are collected.
</ul>
<p><strong>Visit Type:</strong> Specify how user visit statistics will be consolidated:
<ul>
	<li><strong>All:</strong> Statistics are collected for all requests within a visit.  (Default behavior)
	<li><strong>Visit:</strong> All statistics for a specific visit are consolidated.
	<li><strong>Session:</strong> Statistics for a specific session are consolidated.  (In some bizarre cases, a session may be made up of multiple visits, but that assumes that you leave sessions active for long periods.)
	<li><strong>User:</strong> Statistics are consolidated by user.
</ul>
</blockquote>
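<p>The effect of the Time Interval setting can be pictured as truncating each timestamp to the chosen interval and counting identical records.  The Python sketch below illustrates that idea only; it is not the module's implementation, and the names INTERVAL_FORMATS and consolidate, as well as the choice to group by timestamp and requested page alone, are assumptions.

```python
from collections import Counter
from datetime import datetime

# Format strings that truncate a timestamp to each interval (hypothetical mapping).
INTERVAL_FORMATS = {
    "All": "%Y-%m-%d %H:%M:%S",
    "Minute": "%Y-%m-%d %H:%M",
    "Hour": "%Y-%m-%d %H",
    "Day": "%Y-%m-%d",
    "Month": "%Y-%m",
}


def consolidate(hits, interval):
    """Count hits per (truncated timestamp, requested page)."""
    fmt = INTERVAL_FORMATS[interval]
    counts = Counter()
    for timestamp, requrl in hits:
        counts[(timestamp.strftime(fmt), requrl)] += 1
    return counts


hits = [
    (datetime(2003, 6, 10, 10, 5, 0), "index.php"),
    (datetime(2003, 6, 10, 10, 40, 12), "index.php"),
    (datetime(2003, 6, 10, 11, 1, 0), "index.php"),
    (datetime(2003, 6, 10, 10, 59, 59), "modules.php"),
]

for key, count in sorted(consolidate(hits, "Hour").items()):
    print(key, count)
```

<p>With "Hour", the two index.php hits between 10:00 and 10:59 collapse into one record with a count of 2.  The individual times of those hits are no longer recoverable, which is the trade-off described above.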
<H3>Maintain Data Tables</H3>
When you click on the "Maintain Data Tables" option on the main statistics admin menu, you will get this page to display:
<p><img src="pnimages/screenshot_maintaintables.jpg" >
<H4>Dump or Delete a type of record</H4>
This form allows you to erase or dump a set of records based on whether they are new or old (or all of them.)
<blockquote>
	<p><strong>Action:</strong>  Choosing <strong>Dump</strong> will cause a flat file to be created on the server with the selected statistics records.  The file will be named "dump_stats_{tablename}_{datetimestamp}" and is put in the root mysql directory.  (On my Redhat server, it goes into /var/lib/mysql.)
	<p>Choosing <strong>Delete</strong> will cause the selected statistics records to be erased from the specified statistics table.
	<p><strong>Type:</strong> Specifies which records should be dumped or deleted.  Choosing <strong>All</strong> will cause all records to be dumped or deleted.  Choosing <strong>Old</strong> will cause records that have been dumped, copied or moved previously to be deleted or dumped.  Choosing <strong>New</strong> will cause records that have <i>NOT</i> been previously dumped, copied or moved to now be deleted or dumped.
	<p><strong>Table:</strong> Specify the table to update, <strong>Details</strong>, <strong>Archive</strong>, or <strong>Summary</strong>.
	<p><strong>Submit Button:</strong> Click on the <strong>Submit</strong> to perform the Dump or Delete action.
</blockquote>
<H4>Dump or Delete a date range of records</H4>
This form allows you to erase or dump a set of records based on when they were created.
<blockquote>
<p><strong>Action:</strong>  Choosing <strong>Dump</strong> will cause a flat file to be created on the server with the selected statistics records.  The file will be named "dump_stats_{tablename}_{datetimestamp}" and is put in the root mysql directory.  (On my Redhat server, it goes into /var/lib/mysql.)
	<p>Choosing <strong>Delete</strong> will cause the selected statistics records to be erased from the specified statistics table.
<p><strong>Table:</strong> Specify the table to update, <strong>Details</strong>, <strong>Archive</strong>, or <strong>Summary</strong>.
<p><strong>Start and End Date:</strong> Specify the range of creation dates for the statistics that you want to dump or delete.  These dates are inclusive, meaning that all statistics on both the start and end dates will be included in the action.
<p><strong>Submit Button:</strong> Click on the <strong>Submit</strong> to perform the Dump or Delete action.
</blockquote>
<H4>Migrate Data:</H4>
It is possible to move the count of page views (only) from the old Stats tables to the new statistics tables.  No other statistics (browser, os, referer, userid, etc) are moved.  Press the <strong>Submit</strong> button to migrate the data.  Pressing the <strong>Submit</strong> button more than once will cause duplicate data to be copied.
<H2>Reporting</H2>
When you either click "statistics" on the main user menu or click on the "Site Statistics Report" option on the main statistics admin menu, you will get this page to display:
<p><img src="pnimages/screenshot_reportmenu_v5.jpg" >
<p><strong>Start Date:</strong>  Specify the start of the range to report on.  The format is "YYYY-MM-DD".  (No validation is done, so get it right.)  The reporting range is inclusive, so all statistics on the start date are included.  The default start date is Site Start Date specified in Admin -> Settings.
<p><strong>End Date:</strong>  Specify the end of the range to report on.  The format is "YYYY-MM-DD".  (No validation is done, so get it right.)  The reporting range is inclusive, so all statistics on the end date are included. The default end date is today.
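<p>Since the form does not validate the Start Date and End Date fields, you may want to check your values first.  The following Python sketch shows one way to check a "YYYY-MM-DD" string and what an inclusive range means; the function names are hypothetical and nothing here is part of the module.

```python
import re
from datetime import date, datetime


def parse_report_date(text: str) -> date:
    """Parse a strict YYYY-MM-DD string, raising ValueError if it is malformed."""
    # strptime alone would accept "2003-2-6", so check the digit layout first.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        raise ValueError(f"expected YYYY-MM-DD, got {text!r}")
    return datetime.strptime(text, "%Y-%m-%d").date()


def in_report_range(day: date, start: date, end: date) -> bool:
    """Inclusive range check: the start and end dates themselves count."""
    return start <= day <= end


start = parse_report_date("2002-07-04")
end = parse_report_date("2003-02-26")
print(in_report_range(date(2003, 2, 26), start, end))
```
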
<p><strong>Prose Summary:</strong> The prose summary displays a textual summary of the reporting period.  An example:
<blockquote>
	<p>We served 73734 page views from 2002-07-04 to 2003-02-26, 262 today, and 293 yesterday.
	<p>The best day of all was Thursday, January 16th, 2003 (5128 pageviews), while Monday, August 19th, 2002 (1 pageviews), was a really poor day.
	<p>Most people visit us on Thursday with a total of 14997 pageviews, while Friday is not really our best day with a total of 8505 pageviews. On average, our best hour (with 37820 pageviews ) is at 00 o'clock , while only our hardcore-fans seem to show up at 22 o'clock (with only 1086 pageviews).
</blockquote>
<p><strong>Site Content Summary:</strong>  Displays a table of the miscellaneous content that was added during each sub-period listed.  This includes users, stories, comments, downloads, and web links.
<p><strong>Browsers:</strong>  Displays a graph of the browsers used to access the site that also includes the raw count and percentage for each browser.
<p><strong>Operating Systems:</strong>  Displays a graph of the Operating Systems used to access the site that also includes the raw count and percentage for each Operating System.
<p><strong>Hours of the Day Distribution:</strong>  Displays a graph of the visitors broken down by the hour of the day that also includes the raw count and percentage of visits for each hour.
<p><strong>Days of the Week Distribution:</strong>  Displays a graph of the visitors broken down by the day of the week that also includes the raw count and percentage of visits for each day.
<p><strong>Graph Page Views:</strong>  Displays a graph of the visitors broken down by sub-period that also includes the raw count and percentage of visits for each sub-period.
<p><strong>Details:</strong>  Displays a table of the views, visits, unique pages and unique users broken down for each sub-period of the reporting period.
<p><strong>List of Top Hosts:</strong> Displays a table of host names that visited the site, with visits, views and unique pages listed for each host.  The hosts are sorted (ranked) by the number of visits made during the reporting period.  The drop-down list allows the user to choose how many hosts to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of Top Pages Requested:</strong> Displays a table of pages requested, with the number of requests listed for each page.  The pages are sorted (ranked) by the number of views during the reporting period.  The drop-down list allows the user to choose how many pages to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of Downloads Requested:</strong> Displays a table of downloads requested, with the number of requests listed for each file.  The downloads are sorted (ranked) by the number of requests during the reporting period.  The drop-down list allows the user to choose how many download requests to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of Web Links Requested:</strong> Displays a table of links requested, with the number of requests listed for each link.  The links are sorted (ranked) by the number of requests during the reporting period.  The drop-down list allows the user to choose how many links to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of External Referers:</strong> Displays a table of referer links used to access your site.  The number of visits and percentage are also listed. This report displays only referers that are external to the site being reported on.  Internal referers and bookmarks are <strong>not</strong> included in this report.  The referers are sorted (ranked) by the number of views during the reporting period.  The drop-down list allows the user to choose how many referers to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of Referers:</strong> Displays a table of all referers to pages on your site. This includes 'bookmarks' and internal links within the site.  The number of visits and percentage are also listed.  The referers are sorted (ranked) by the number of views during the reporting period.  The drop-down list allows the user to choose how many referers to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>List of Entry Pages:</strong> Displays a table of pages on the site that are used to "enter" the site.  In other words, each of these pages was accessed directly from another site or from a bookmark saved by the user.  The number of "entry" visits and percentage are also listed. The pages are sorted (ranked) by the number of views during the reporting period.  The drop-down list allows the user to choose how many entry pages to display: <strong>Top 10</strong> (default), <strong>Top 20</strong>, <strong>Top 25</strong>, <strong>Top 50</strong>, <strong>Top 100</strong>, or <strong>All</strong>.
<p><strong>Submit Button:</strong> Once you have selected the sub-reports that you want to include, click on the <strong>Submit</strong> button to create the report.
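<p>The ranked "Top N" lists all follow the same pattern: count, sort by count, and cut off at the chosen limit.  As an illustration of the External Referers report, the Python sketch below counts only referers whose host differs from the site's own host.  It is a sketch of the idea, not the module's query; the function name, the host names and the treatment of an empty referer as a bookmark are assumptions.

```python
from collections import Counter
from urllib.parse import urlparse


def top_external_referers(referers, site_host, limit=10):
    """Rank referers from other hosts; limit=None returns all of them."""
    counts = Counter(
        ref for ref in referers
        # Empty referers (e.g. bookmarks) and internal links are skipped.
        if ref and urlparse(ref).hostname not in (None, site_host)
    )
    return counts.most_common(limit)


referers = [
    "http://a.example/links",
    "http://www.mysite.example/index.php",  # internal link
    "",                                     # bookmark or typed URL
    "http://a.example/links",
    "http://b.example/",
]

for referer, visits in top_external_referers(referers, "www.mysite.example"):
    print(visits, referer)
```
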
<H2>Disconnecting the old Stats and Referer modules</H2>
Once you have the statistics module working, you will probably want to save visitors' time and your disk space by stopping collection of data by the old Stats and Referer modules.  To do that, you once again need to modify the includes/pnAPI.php file. The same code at the bottom of the pnInit() function needs to be modified to look something like this:
	<blockquote><pre>
  // Other other includes
    include 'includes/advblocks.php';
    // include 'includes/counter.php';
    include 'includes/pnHTML.php';
    include 'includes/pnMod.php';
    include 'includes/queryutil.php';
    include 'includes/xhtml.php';
    include 'includes/oldfuncs.php';

    // Handle referer
    // if (pnConfigGetVar('httpref') == 1) {
    //     include 'referer.php';
    //     httpreferer();
    // 	}
    include 'modules/statistics/collect.php';
    </pre></blockquote> 
Note how the lines for "includes/counter.php" and "referer.php" have now been commented out.

<H2>The statistics data table architecture</H2>
The statistics module uses three tables: stats_details, stats_summary and stats_archive.  All three have the same layout so reports and other utilities will work with all three tables.  The only difference between the tables is the contents.  Data collection always goes into the stats_details table.  The contents of the other two tables are controlled by the administrator.

<p><table align="center" cellspacing=0 cellpadding=0 width=600>
	<tr><td width=150><strong><u>Column</u></strong></td><td><strong><u>Description</u></strong></td></tr>
	<tr><td width=150><strong>id</strong></td><td>Auto-generated by MySQL.</td></tr>
	<tr><td><strong>key</strong></td><td>Constructed to consolidate statistics.  (See next section.)</td></tr>
	<tr><td><strong>host</strong></td><td>The name of the site host.  Included so that multi-site implementations can be handled.</td></tr>
	<tr><td><strong>requrl</strong></td><td>The page requested by the client.</td></tr>
	<tr><td><strong>http_referer</strong></td><td>The page that referred this request (i.e., the originating link).</td></tr>
	<tr><td><strong>ip</strong></td><td>IP address of the client computer.</td></tr>
	<tr><td><strong>hostip</strong></td><td>The host name of the client computer.  If the client's IP address has no reverse-DNS entry, then the IP address is used.</td></tr>
	<tr><td><strong>os</strong></td><td>Client's Operating System platform.</td></tr>
	<tr><td><strong>browser</strong></td><td>Client's browser.</td></tr>
	<tr><td><strong>uid</strong></td><td>User ID.  If it's zero, the user is not logged in.</td></tr>
	<tr><td><strong>sessionid</strong></td><td>PHP session id for the client.</td></tr>
	<tr><td><strong>count</strong></td><td>Number of visits.  This is usually one, unless consolidation is turned on and then it may be any number.</td></tr>
	<tr><td><strong>timestamp</strong></td><td>Datetimestamp of when this record was created (or the count was last updated.)</td></tr>
	<tr><td><strong>summary_timestamp</strong></td><td>Datetimestamp of when this record was copied, moved or dumped.</td></tr>
</table>

<H2>Notes on the intended use of the statistics module</H2>

Here is an edited version of the original design as discussed in 2002 in a thread on the developer.hostnuke.com forums:
<p>(<i>Note: Not all functionality is available in this release.</i>)
<ol>
	<li>Each page view by a visitor writes a new record to STATS_DETAILS.
	<li>The administrator can run reports on STATS_DETAILS to look at current activity on the site.
	<li>The admin summarizes data from STATS_DETAILS into STATS_SUMMARY and clears STATS_DETAILS. This happens as often as necessary to keep STATS_DETAILS at a reasonable size. For active sites that collect many stats, this would be once or twice a day. For sites with heavy traffic, it could be even more often, say once an hour. (<i>Note:</i> I have run a site for more than 8 months that is now getting about 900 views a day and there is no discernable performance degradation with using just the STATS_DETAILS table.)
	<li>At the end of the reporting period, the admin runs whatever reports need to be run against the STATS_SUMMARY table.
	<li>The admin archives data from STATS_SUMMARY to STATS_ARCHIVE and clears STATS_SUMMARY. This happens as often as necessary, usually once per normal reporting period. 
	<li>The admin dumps and deletes the oldest data from STATS_ARCHIVE. This is usually done once per reporting period, for the oldest reporting period still on file, which is often more than a year old.
</ol>
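<p>The summarize and archive steps in the list above both come down to folding rows that share a key into one row with a larger count.  Here is a minimal in-memory sketch of that idea.  The function name <strong>consolidate_rows()</strong> is hypothetical, and the sketch assumes each row carries the <strong>key</strong>, <strong>count</strong> and <strong>timestamp</strong> columns, with the timestamp stored as a 'YYYY-MM-DD HH:MM:SS' string.  It is not the module's own implementation, which would work against the database tables.

```php
<?php
// Hypothetical sketch, not part of the statistics module.
// $summary: existing summary rows, indexed by their 'key' value.
// $details: list of detail rows to fold into the summary.
// Returns the updated summary, still indexed by key.
function consolidate_rows(array $summary, array $details)
{
    foreach ($details as $row) {
        $key = $row['key'];
        if (!isset($summary[$key])) {
            // First row seen for this key: keep it as the summary row.
            $summary[$key] = $row;
            continue;
        }
        // Same key again: add the visits and keep the latest timestamp.
        $summary[$key]['count'] += $row['count'];
        if ($row['timestamp'] > $summary[$key]['timestamp']) {
            $summary[$key]['timestamp'] = $row['timestamp'];
        }
    }
    return $summary;
}
```

<p>Because the result is indexed by key, the same function can be called again later with a fresh batch of detail rows and the earlier summary as its first argument.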
<p>Note that STATS_DETAILS, STATS_SUMMARY and STATS_ARCHIVE are the ONLY data tables necessary, replacing the current set of REFERER, STATS_HOUR, STATS_DATE, STATS_MONTH, and STATS_WEEK tables. Actually, the only table absolutely needed is the STATS_DETAILS table.  The others just give the administrator options for reporting and archiving.

<p>There are actually many ways of using STATS_SUMMARY and STATS_ARCHIVE.  Instead of the usage described above, an administrator could use STATS_SUMMARY to hold consolidated (summarized) statistics data for all reporting periods and use STATS_ARCHIVE to store detailed statistics data for all periods.  Most reporting would be done against STATS_SUMMARY unless some details that were consolidated are now needed.  Then the STATS_SUMMARY could be regenerated from STATS_ARCHIVE with the newly required details included.

<p>The key column on the statistics tables provides consolidation.  It is configured by the administrator and has several fields which determine the granularity of statistics. The first field is the minimum time unit. The Admin usually gets to pick ALL, MINUTE, HOUR, DAY, or MONTH. The field is constructed to generate an appropriate timedatestamp. For example, for HOUR, the timestamp would be something like HOUR: DATE: YEAR so that all stats within that hour would have the same timestamp.

<p>The second field in the key would be the view unit. The Admin would select something like PAGE, MODULE, VISIT. Thus, this field would determine the granularity of the visit to be tracked.

<p>Other fields in the key perform consolidation for other statistics. This would include the type of browser, O/S, visitor or referer.
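<p>To make the key idea concrete, here is a minimal sketch of how the first two fields might be put together.  Everything about the format is an assumption: the function name <strong>build_stats_key()</strong>, the '|' separator, the use of UTC and the date layouts are all illustrative, and the actual key layout used by the module may differ.  The ALL time unit and the MODULE view unit are left out because the text above does not say how they map onto a key.

```php
<?php
// Hypothetical sketch, not part of the statistics module.
// $timeUnit: MINUTE, HOUR, DAY or MONTH.
// $viewUnit: PAGE (consolidate per requested URL) or VISIT (per session).
// $visit: array('time' => Unix timestamp, 'requrl' => string, 'sessionid' => string).
function build_stats_key($timeUnit, $viewUnit, array $visit)
{
    // Truncate the time to the chosen unit so that every view inside the
    // same minute/hour/day/month produces the same first field.
    $formats = array(
        'MINUTE' => 'Y-m-d H:i',
        'HOUR'   => 'Y-m-d H',
        'DAY'    => 'Y-m-d',
        'MONTH'  => 'Y-m',
    );
    if (!isset($formats[$timeUnit])) {
        throw new InvalidArgumentException("Unsupported time unit: $timeUnit");
    }
    $bucket = gmdate($formats[$timeUnit], $visit['time']);

    if ($viewUnit === 'PAGE') {
        $view = $visit['requrl'];
    } elseif ($viewUnit === 'VISIT') {
        $view = $visit['sessionid'];
    } else {
        throw new InvalidArgumentException("Unsupported view unit: $viewUnit");
    }

    return $bucket . '|' . $view;
}
```

<p>With HOUR and PAGE selected, two views of the same page ten minutes apart within the same hour produce the same key, so they can be stored as one row with a count of two.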

<H2>Extending Statistics by adding sub-reports</H2>

To add a new sub-report, edit the pnuser.php file and follow these steps:

<ol>
	<li>Add code similar to this to the statistics_user_main() function. Be sure to change the URI Query Parameter from "ps" to something unique and the display text from "Prose Summary" to something meaningful:
	<blockquote><pre> 
    $output->FormCheckbox("ps");
    $output->Text("Prose Summary&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;");
    </pre></blockquote>
    <li>Near the top of statistics_user_report() function, add this code.  Be sure to change the variable name and URL parameter from "ps" to whatever you picked in the previous step:
    <blockquote><pre>
    $ps = pnVarCleanFromInput('ps');
    </pre></blockquote>
    <li>Near the bottom of statistics_user_report() function, add this code.  Be sure to change the variable name from "ps" to whatever you picked earlier and change the function name from "sub_summary_prose()" to the name of your new routine:
    <blockquote><pre>
    if ($ps == 1) {sub_summary_prose();}
    </pre></blockquote>
    <li>Add a function with the name chosen in the previous step that generates the output that you want.  I often copy an existing function that does something similar and then modify it to do what I want.
    </ol>
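<p>For the last step, a new sub-report function might look roughly like the sketch below.  It is only an illustration: <strong>sub_top_browsers()</strong> is a hypothetical name, and to stay self-contained it takes the rows as an argument and returns an HTML string.  A real sub-report in this module would presumably read from the statistics tables and write through the same $output object used in the steps above.

```php
<?php
// Hypothetical sub-report, not part of the statistics module.
// $rows: list of array('browser' => string, 'count' => int).
// Returns an HTML table of browsers ranked by total views.
function sub_top_browsers(array $rows)
{
    $views = array();
    foreach ($rows as $row) {
        $browser = $row['browser'];
        if (!isset($views[$browser])) {
            $views[$browser] = 0;
        }
        $views[$browser] += $row['count'];
    }
    arsort($views); // most-used browser first

    $html = "<table>\n<tr><th>Browser</th><th>Views</th></tr>\n";
    foreach ($views as $browser => $total) {
        // Escape the browser name, since it comes from the client.
        $html .= '<tr><td>' . htmlspecialchars((string) $browser) . '</td><td>'
               . (int) $total . "</td></tr>\n";
    }
    return $html . "</table>\n";
}
```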

<H2>Why not use web server logs?</H2>

Yes, most web servers (IIS and Apache, for example) will generate access logs.  But the logs have different formats and are put in different places.  I have considered using webserver access log files.  See some of the discussion at the forum thread mentioned in the Bibliography.

<p>Basically, catching accesses from within PostNuke allows us to catch PostNuke info (user id, session id) that we can't get by using webserver logs.  This also makes it easier to identify single page views and single sessions.  I also didn't want to have to write a different statistics module for each type of webserver.

<p>But, if you want to use the logs, then you can write a script that loads them into the stats_details table.  From there we can use our PN stuff to manipulate and display the results.
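<p>As a starting point for such a script, here is a minimal sketch that turns one line of an Apache "combined" format access log into an array shaped like a stats_details row.  The function name <strong>log_line_to_stats_row()</strong> is hypothetical, storing the timestamp in UTC is an assumption, and only the columns that a log line can supply are filled in.  The uid and sessionid columns are left out because, as noted above, the web server log does not know them.  Inserting the row into the database is left to the reader.

```php
<?php
// Hypothetical sketch, not part of the statistics module.
// $line: one line in Apache "combined" log format.
// $siteHost: value for the 'host' column (the log line does not carry it).
// Returns an array keyed by stats_details column names, or null if the
// line does not look like a combined-format entry.
function log_line_to_stats_row($line, $siteHost)
{
    // client ident user [time] "METHOD url PROTOCOL" status bytes "referer" "agent"
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    // Apache writes times like 10/Oct/2000:13:55:36 -0700.
    $when = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[2]);
    if ($when === false) {
        return null;
    }
    $when->setTimezone(new DateTimeZone('UTC')); // assumption: store UTC

    return array(
        'host'         => $siteHost,
        'requrl'       => $m[3],
        'http_referer' => $m[4] === '-' ? '' : $m[4], // "-" means no referer
        'ip'           => $m[1],
        'hostip'       => $m[1], // no reverse-DNS lookup here, so fall back to the IP
        'count'        => 1,
        'timestamp'    => $when->format('Y-m-d H:i:s'),
    );
}
```

<p>The os and browser columns would still have to be derived from the user agent string at the end of the log line, which this sketch matches but does not interpret.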

<!--
<H2>Interesting SQL queries</H2>
I use the statistics module on my own web sites and in the normal course of running them, I often need to check out various aspects of how the sites are doing.  Being able to query the statistics data will often give me useful answers.  I try to keep track of these queries so that I can turn them into sub-reports for the statistics module.  But I don't always have time to churn out new sub-reports.  So here is a list of SQL queries that you can use in phpMyAdmin or some equivilent tool to display various statistics:
<ol>
	<li><strong>Count of user registrations by month since the start:</strong> SELECT COUNT(*),  EXTRACT(YEAR_MONTH FROM FROM_UNIXTIME(pn_user_regdate)) yearmonth FROM `nuke_users` GROUP BY yearmonth ORDER BY yearmonth
	<li><strong>Count of stories added by month since the start:</strong> SELECT COUNT(*),  EXTRACT(YEAR_MONTH FROM pn_time) yearmonth FROM `nuke_stories` GROUP BY yearmonth ORDER BY yearmonth
	<li><strong>Count of comments entered by month since the start:</strong> SELECT COUNT(*),  EXTRACT(YEAR_MONTH FROM pn_date) yearmonth FROM `nuke_comments` GROUP BY yearmonth ORDER BY yearmonth
	<li><strong>Count of links entered by month since the start:</strong> SELECT COUNT(*),  EXTRACT(YEAR_MONTH FROM pn_date) yearmonth FROM `nuke_links_links` GROUP BY yearmonth ORDER BY yearmonth
</ol>
-->

<H2>Bibliography</H2>
<p>Here's the forum thread that preceded my work on this module: <a href="http://developer.hostnuke.com/modules.php?op=modload&name=XForum&file=viewthread&tid=105">http://developer.hostnuke.com/modules.php?op=modload&name=XForum&file=viewthread&tid=105</a>.  It includes discussion of several concepts that relate to statistics in general and this module specifically.
<p>Learning Regular Expressions by Example: <a href="http://www.phpbuilder.com/columns/dario19990616.php3">http://www.phpbuilder.com/columns/dario19990616.php3</a>.
Another tutorial on regular expressions: <a href="http://www.devshed.com/Server_Side/Administration/RegExp/page1.html">http://www.devshed.com/Server_Side/Administration/RegExp/page1.html</a>.
<p>If you're looking for a book to read about Regular Expressions, you want Mastering Regular Expressions by Jeffrey E. F. Friedl (published by O'Reilly & Associates, Inc.). Friedl's book serves both as an extremely detailed tutorial and as an extremely detailed reference work on regular expression syntax. Get through this book, and you can consider yourself a serious expert on text manipulation in Unix.
<p>There are a number of other statistics applications:
<ul>
	<li>awstats: <a href="http://awstats.sourceforge.net">awstats.sourceforge.net</a>.
	<li>advstats: <a href="http://www.henner.eu.org">http://www.henner.eu.org</a>.  I have heard reports that this module hasn't been actively developed or supported for a while. Someone says that it's also downloadable from <a href="http://www.freeloaderstudios.com">www.freeloaderstudios.com</a>.
	<li>xcodes: <a href="http://www.xcodes.org/modules.php?name=ks_analys">http://www.xcodes.org/modules.php?name=ks_analys</a>, a module for PHPNuke.
	<li>analog: <a href="http://www.analog.cx/">http://www.analog.cx/ </a>.  I haven't looked at it yet.
	<li>Tracking modul for PHPNuke 5.0: <a href="http://www.antarcon.com/download.php?op=viewdownloaddetails&lid=26">http://www.antarcon.com/download.php?op=viewdownloaddetails&lid=26</a>
</ul>
<p>I had a discussion with Liam about being able to identify the geographic location of a client more accurately than just the root domain of their hostname (i.e., .us, .au, .de, .fr, etc.).  There wasn't any bullet-proof solution, but we ran across a number of interesting attempts, which I've included here: <a href="http://www.private.org.il/IP2geo.html">http://www.private.org.il/IP2geo.html</a>, <a href="http://www.geobytes.com/IpLocator.htm">http://www.geobytes.com/IpLocator.htm</a>, <a href="http://www.caida.org/tools/utilities/netgeo/">http://www.caida.org/tools/utilities/netgeo/</a>, and a project that uses the netgeo data: <a href="http://www.xpenguin.com/ip-atlas.php">http://www.xpenguin.com/ip-atlas.php</a>.
<p>Mikespub has an interesting traffic analysis tool implemented in perl scripts that I'd like to see about porting to PHP and incorporating into the statistics module.  See <a href="http://mikespub.net/tools/aWebVisit/">http://mikespub.net/tools/aWebVisit/</a> for aWebVisit.
<p>Links about PHP and graphics: <a href="http://www.phpbuilder.com/columns/tim19990812.php3">http://www.phpbuilder.com/columns/tim19990812.php3</a> and <a href="http://www.webguys.com/pdavis/Programs/html_graphs/HTML_Graphs.zip">http://www.webguys.com/pdavis/Programs/html_graphs/HTML_Graphs.zip</a>.
<H2>Change Log</H2>
<TABLE width="100%" border=1 cellpadding=2 cellspacing=0>
	<TR>
		<TD width=50><strong>Version</strong></TD>
		<TD width=100><strong>Date</strong></TD>
		<TD><strong>Changes</strong></TD>
	</TR>
	<TR>
		<TD align="right" width=50>5</TD>
		<TD width=100>06-10-03</TD>
		<TD>Added "Yearly" reporting period.
		<br>Fixed Authorization/permissions to work.
		<br>Added Content, External Referers and Entry Pages Sub Reports.
		<br>Set E_ALL error reporting and removed all error messages.
		<br>Moved admin display strings into pnlang files.
		<br>Added Permissions section to README.
		<br>Changed "URL" to "Refering URL" in Column Heading of Referers sub-report.
		<br>Added Top Links Requested sub-report.
		</TD>
	</TR><TR>
		<TD align="right" width=50>4</TD>
		<TD width=100>03-30-03</TD>
		<TD>Added <strong>Manual Installation</strong> and <strong>Default Start Date</strong> sections to README.
			<br>Added warnings about following the installation instructions in order.
			<br>Combined the monthly and daily instantiations of the Graph Views and Display Details sub-reports.  Use a drop-down list to choose whether you want monthly or daily.
			<br>Added hourly and weekly choices to the Graph Views and Display Details sub-reports.
			<br>Added Top Downloads sub-report.
			<br>Moved all display strings into pnlang files so that the user reporting can be translated.
		</TD>
	</TR><TR>
		<TD align="right" width=50>3</TD>
		<TD width=100>03-02-03</TD>
		<TD>Added Top Hosts sub-reports.
			<br>Completely rewrote the README, with good install instructions and a bunch of other improvements.
			<br>Added Migration to Admin->statistics->Manage Data Tables to copy data from the Stats module to the statistics module.
			<br>Added Dump/Delete data to Admin->statistics->Manage Data Tables
			<br>Changed the Reporting menu so that you can pick which sub-reports you want to include and the time period to be reported on.
			<br>Added sub-reports: Top Pages requested, Details (visits, views, pages, users) totaled by Day or by Month
		</TD>
	</TR><TR>
		<TD align="right" width=50>2</TD>
		<TD width=100>08-30-02</TD>
		<TD>Added host column to support multi-sites<br>Added referers & stats summary reports<br>Added admin config, filtering & consolidation</TD>
	</TR><TR>
		<TD align="right" width=50>1</TD>
		<TD width=100>07-25-02</TD>
		<TD>Created module</TD>
	</TR>
</TABLE>
		
<H2>To Do List</H2>

<ul>
	<li>Extend installation to compensate for PN bug that corrupts the module variable storage.
	<li>Handle Site Start Dates which are not complete, valid dates.
	<li>Add Percentages to Page Requests sub-report.
	<li>Add rank to Referers sub-report.
	<li>Add logic to summarize and archive stats in summary and archive tables.
	<li>Real reporting functionality. (Neat design that will make it easy to generate, memorize and print reports.)  
	<li>Sub-reports: Top users visiting, Recent Visitors, Countries of Origin
	<li>Block to display recent users
	<li>ZippSLC requests "Specifically, I'd like to be able to look by username, see when they were last on, what IP they came from, where they went & what they looked at."
	<li>"Jon Machtynger" &lt;jon@mythumb.com&gt;: Create groups of users and search on their activity.  For example: a) typical 'guest' activity, b) typical 'staff' activity etc.
	<li>Extend consolidation to suppress all tracking of users.  (Required for German law.)
	<li>Incorporate the browser sniffing class from http://phpsniff.sourceforge.net.
	<li>Cross-reference IP address to country of origin.
	<li>Add indexes to improve performance of reports.
	<li>Implement referer consolidation for search engine sites (see www.jotajota.org/google.tar)
	<li>Implement hostip consolidation for search engine spiders, bots and crawlers
	<li>Match functionality from www.phpinfo.net/visiteurs
	<li>Match functionality from advstats module and awstats
	<li>Finish support of multi-sites.
	<li>Incorporate hitcount utility for modules.
	<li> Upgrade to PN 0.8.
	<li> Upgrade to Xaraya 1.0
</ul>


<H2>Credits</H2>
This module was created by Craig R. Saunders (<a href="mailto:crs@mtrad.com">crs@mtrad.com</a>). I used the Encyclopedia module as a starting point since it was recently ported to pnAPI compliance.  (<a href="http://orodruin.sourceforge.net/">Rebecca Smallwood</a> did a great job on Encyclopedia!)
<p><font size=-2>Copyright 2003 by Craig R. Saunders.  All rights reserved.</font>