Posted by shyamala on April 1, 2009 at 4:11am
If your server is under high load, you may consider the following approach:
1. Use the server logs (e.g. Apache) + awstats updated by cron.
2. Run awstats to parse the logs on another server, or nightly when there is no load.
3. Apache works slower with logging enabled. I prefer to log only content requests (PHP, HTML) and not jpg, css, js, etc.
4. Why Apache? nginx works much faster. nginx + php-fpm + eAccelerator is 2-3 times faster and doesn't use as much memory as Apache.
eaccelerator.shm_size="16" - you may want to give it more memory here.
eaccelerator.compress="1" - not needed if you already use gzip/deflate in Apache.
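To illustrate points 2 and 3, here is a minimal sketch of counting page hits from an access log offline while skipping static files. It is not how awstats works internally, and it filters at parse time rather than in the Apache config. The extension list, the `page_hits` name and the sample log lines are assumptions made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Extensions treated as static assets; this list is an assumption.
STATIC_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico")

# Matches the request part of a common/combined log line: "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def page_hits(lines):
    """Count hits per path, ignoring static assets and unparsable lines."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # malformed line or unsupported method
        # Drop the query string so "/a.js?v=1" is still seen as a .js file.
        path = urlsplit(match.group(1)).path
        if path.lower().endswith(STATIC_EXTENSIONS):
            continue
        hits[path] += 1
    return hits


# Hypothetical sample lines, only for demonstration.
sample = [
    '1.2.3.4 - - [01/Apr/2009:04:11:00 +0000] "GET /node/1 HTTP/1.1" 200 512',
    '1.2.3.4 - - [01/Apr/2009:04:11:01 +0000] "GET /misc/drupal.js?v=1 HTTP/1.1" 200 90',
    '1.2.3.4 - - [01/Apr/2009:04:11:02 +0000] "GET /node/1 HTTP/1.1" 200 512',
    '1.2.3.4 - - [01/Apr/2009:04:11:03 +0000] "GET /index.php?q=node/2 HTTP/1.1" 200 700',
]
hits = page_hits(sample)
print(hits.most_common())
```

Running something like this from cron on a copy of the log keeps the work off the web server during busy hours.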

Comments
why use statistics module
Most high-traffic sites turn off the statistics module (including drupal.org and groups.drupal.org). Also, if you are after usage data from real users, JavaScript-based loggers (such as Google Analytics) give a much more reliable picture, since they do not count search bot visits. So what do you want from the statistics module that is not available elsewhere?
Piwik
I have yet to try it, but Piwik is sort of a self-hosted Google Analytics. However, since it is an external solution, it seems that things such as a "Popular Pages" block or Views integration would not be available.
Can I use Views for popular content without statistics?
On my site I am using Views and the statistics module to populate popular content. Is there any way to turn off statistics and still get popular content using Views? I would like to avoid extra modules for popular content. Is there any way to use the Piwik module in place of the core statistics module?
Healthy-ojas | Diabetes | Cholesterol
GA has its not so
GA has its not so well-known drawbacks (besides concerns about data security) hidden in its TOS. E.g. GA turns commercial once your account registers more than 5 million page views a month, unless you have an AdWords campaign running that Google feels gives them enough money in return.
Beginning in February I switched to Piwik for several projects. In high-traffic environments I would probably set it up on its very own server and tweak the hell out of its configuration for reasonable speed in DB updating. As I understand it, Piwik collects data in local storage and batch updates the DB via cron (see manual / FAQ) or when you connect to its web client (this behaviour is configurable). Depending on how much data is collected, how powerful the system is and how frequently those updates run, it may have bad effects on system speed if you host Piwik on a production site's web / DB server.
Alex
Collecting data then
Collecting data then updating the DB via cron is a great system. We do that to record video downloads. The data is stored in memcache, then every 5 minutes that memcache server gets read in and the database updated. I can record 500,000 clicks in under 30 seconds. Of course that was just in testing. That many downloads in 5 minutes would kill us LOL.
But this could possibly be an idea for a new module, or even an extension to something like Cacherouter. Store the statistics updates in a memory cache, then send them to the DB at cron.
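A rough sketch of that idea, not our actual code: hits are counted in memory and written to the database in one batch when cron fires. To keep it self-contained, a plain `Counter` stands in for memcache and an in-memory SQLite database stands in for the real DB. The `HitBuffer` class and the `hit_counts` table are hypothetical names.

```python
import sqlite3
from collections import Counter


class HitBuffer:
    """Collects hit counts in memory and writes them to the DB in one batch."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = Counter()  # stand-in for a memcache-backed counter
        conn.execute(
            "CREATE TABLE IF NOT EXISTS hit_counts ("
            "path TEXT PRIMARY KEY, total INTEGER NOT NULL)"
        )

    def record(self, path):
        # Per-request work stays cheap: no database access here.
        self.pending[path] += 1

    def flush(self):
        """Meant to be called from cron: apply all pending counts in one transaction."""
        batch, self.pending = self.pending, Counter()
        with self.conn:  # commits on success, rolls back on error
            for path, count in batch.items():
                self.conn.execute(
                    "INSERT OR IGNORE INTO hit_counts (path, total) VALUES (?, 0)",
                    (path,),
                )
                self.conn.execute(
                    "UPDATE hit_counts SET total = total + ? WHERE path = ?",
                    (count, path),
                )
        return len(batch)  # number of distinct paths written


conn = sqlite3.connect(":memory:")
hit_buffer = HitBuffer(conn)
for path in ["/video/1", "/video/2", "/video/1"]:
    hit_buffer.record(path)
flushed = hit_buffer.flush()
rows = conn.execute("SELECT path, total FROM hit_counts ORDER BY path").fetchall()
print(flushed, rows)
```

The point of the design is that three recorded hits turn into two database writes, and the ratio gets better as traffic grows. With a real memcache server the counters would also survive across PHP requests, which a process-local `Counter` does not.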
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
While Piwik is a very nice
While Piwik is a very nice system, you are talking about installing an entirely separate platform, which would also mean additional requests on the server to count the hits. If you are worried about server load from the statistics module, then this surely won't help that.
My suggestion is to go with Google Analytics. We get over 10 million visitors a month and have no problems with it. Google has never threatened us with going "commercial" or anything for the two years we have used them.
The only drawback of Google Analytics is that loading their tracking code can sometimes be slow. For that, get the module:
http://drupal.org/project/google_analytics
There's an option in the module to locally cache that javascript file. Turn that on and things work great.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
awstats
Do not forget awstats. It works by processing the Apache access log at a preset interval (we set it up for 15 to 60 minutes depending on the site). It produces nice reports and can keep history for years.
Used in conjunction with Google Analytics or Piwik it is an invaluable tool.
GA and Piwik only measure humans with real browsers that have JavaScript enabled. We found that out by comparing their figures to awstats. In some cases Analytics reported about 20% less than awstats, and the difference was due to RSS news readers and various other crawlers.
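If you want a quick feel for that gap on your own site, a sketch like the following estimates what share of hits in a combined-format access log come from crawlers and feed readers, judged by user agent. This is not what awstats does internally. The marker list is illustrative and incomplete, and the function name and sample lines are made up for the example.

```python
import re

# Substrings that suggest a crawler or feed reader; illustrative and incomplete.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp", "feed", "rss")

# In the combined log format the user agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def non_browser_share(lines):
    """Return the fraction of parsable log lines whose user agent looks automated."""
    total = 0
    non_browser = 0
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match is None:
            continue  # no trailing quoted field, skip the line
        total += 1
        agent = match.group(1).lower()
        if any(marker in agent for marker in BOT_MARKERS):
            non_browser += 1
    return non_browser / total if total else 0.0


# Hypothetical sample: four browser hits and one crawler hit.
prefix = '1.2.3.4 - - [01/Apr/2009:04:11:00 +0000] "GET /node/1 HTTP/1.1" 200 512 "-" '
sample = [
    prefix + '"Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/2009 Firefox/3.0.8"',
    prefix + '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    prefix + '"Opera/9.64 (Windows NT 5.1; U; en)"',
    prefix + '"Mozilla/5.0 (Macintosh; U; Intel Mac OS X) Safari/528.16"',
    prefix + '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
share = non_browser_share(sample)
print(f"{share:.0%} of hits look automated")
```

User agent matching is only a heuristic: some crawlers identify themselves as browsers, so treat the result as a lower bound.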
Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.
any practical use case
any practical use case here?
I use OpenPublish, and statistics is used by the Most Viewed OP module. The awstats idea seems to be the best option from a performance point of view. But how can I use it? Is there any module for that?
I wish statistics had an option for that.
Hadi Farnoud
Online store builder | Paket email marketing
unitrack in development
My company has taken the existing unitrack module (http://drupal.org/project/unitrack) and is currently making significant changes to it to meet high-load requirements.
Unfortunately, I can't publish many more details now, but the module will be published under the GPL on drupal.org (expected late January).
looking forward to that. keep
looking forward to that. keep us posted
Hadi Farnoud
Online store builder | Paket email marketing