Piwik Performance Optimization


Chapter 24

First, make sure your server meets the minimum requirements for running Piwik. Then consider optimizing your Piwik setup's performance.

Piwik's Minimum Requirements

Check that your web server meets the minimum requirements for running Piwik.

  • A web server such as Apache or Nginx is running.
  • PHP version 5.3.3 or greater has been installed.
  • MySQL version 4.1 or greater, or MariaDB has been installed.
  • The PHP extensions pdo and pdo_mysql, or the mysqli extension, have been installed.

Note: the minimum requirements listed above may change as new versions of Piwik are released.
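
The PHP-related requirements above can be checked from a shell. The following is a minimal sketch, not an official Piwik tool: the helper names version_ge and check_php are made up for illustration, the version comparison assumes a sort command that supports -V (as in GNU coreutils), and the php command on your PATH may not be the same PHP your web server uses.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort), an assumption that holds for GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_php: reports whether the PHP CLI meets the 5.3.3 minimum and
# lists the MySQL drivers (pdo_mysql and/or mysqli) that are loaded.
check_php() {
    php_version=$(php -r 'echo PHP_VERSION;') || return 1
    if version_ge "$php_version" "5.3.3"; then
        echo "PHP $php_version meets the 5.3.3 minimum"
    else
        echo "PHP $php_version is older than 5.3.3"
    fi
    php -m | grep -i -E '^(pdo_mysql|mysqli)$' || echo "no MySQL driver found"
}
```

After loading these functions into your shell, run check_php on the server that hosts Piwik. The MySQL or MariaDB version has to be checked separately on the database server.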

Optimize Piwik's Performance

To optimize your Piwik setup's performance, consider the items below.

  • Server & RAM (hardware)
  • Load Balancers
  • Report Processing Interval
  • Number of Unique URLs
  • PHP Caching
  • Crontab

Server & RAM

This option requires investing additional budget in your Piwik setup: you either move to a better server or upgrade your current one (i.e. your machine, or main hardware).

The best option is to run Piwik on a completely dedicated server that is not shared with any software your Piwik setup does not require.

Most cloud solutions are a good option because they give you the flexibility to upgrade your hardware (e.g. RAM, CPU, or hard disk space).

Load Balancers

Using a load balancer means your Piwik setup is using multiple servers. A load balancer can evenly distribute your Piwik setup's workload across multiple servers.

  • When using load balancers, you may end up having multiple config.ini.php files across multiple servers. Use rsync to synchronize all the config.ini.php files.
  • In Piwik's config.ini.php file, enable SSL by adding “force_ssl=1” under the [General] section.
  • Enable database session storage in Piwik's config.ini.php file by assigning “dbtable” to the variable “session_save_handler”.
  • Each time you upgrade the Piwik software (or any plugins), delete the contents of the tmp/ folder.
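
As a rough sketch of the first and last items above, the helper functions below copy config.ini.php to the other servers and clear the tmp/ folder. The function names, the PIWIK_DIR path, and any host names you pass in are hypothetical placeholders. The sketch assumes that rsync can reach the other servers over SSH and that the config file lives at config/config.ini.php under the Piwik directory. With DRY_RUN=1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical install path; adjust to your own setup.
PIWIK_DIR="${PIWIK_DIR:-/path/to/piwik}"

# Settings from the list above, as they would appear in config.ini.php:
#   [General]
#   force_ssl = 1
#   session_save_handler = dbtable

# run CMD...: prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_config HOST...: copies this server's config.ini.php to each host.
sync_config() {
    for host in "$@"; do
        run rsync -az "$PIWIK_DIR/config/config.ini.php" "$host:$PIWIK_DIR/config/"
    done
}

# clear_tmp: deletes the contents of tmp/ after a Piwik or plugin upgrade.
clear_tmp() {
    run rm -rf "${PIWIK_DIR:?}"/tmp/*
}
```

For example, running sync_config with the names of your other web servers as arguments would push the current config file to each of them. Try it with DRY_RUN=1 first to review the commands.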

Adding load balancers and/or additional servers also requires investing additional budget in your Piwik setup.

Report Processing Interval

By default, Piwik's report processing interval may be set to 150 seconds. To make Piwik run more smoothly, increase the interval to 3600 seconds (1 hour) or even 7200 seconds (2 hours). To set it, go to:

Administration -> General -> Archive reports at most every X seconds

Enter a larger value (such as 3600 or 7200) in the “seconds” field.

Number of Unique URLs

The more unique URLs Piwik has to track, the more data it has to store, and the faster its database grows. Keeping the database small by not storing redundant URL variations helps keep Piwik's performance from degrading quickly.

The URLs below all refer to pages with highly similar (or sometimes even identical) content.

www.example.com/hotels/list-shanghai/
www.example.com/hotels/list-shanghai/?checkin-date=2016-03-01&checkout-date=2016-03-02
www.example.com/hotels/list-shanghai/?checkin-date=2016-05-15&checkout-date=2016-05-18

They differ only in their URL query parameters.

checkin-date=2016-03-01&checkout-date=2016-03-02
checkin-date=2016-05-15&checkout-date=2016-05-18

You may exclude these parameters when Piwik stores the URLs in its database. Go to:

Administration -> Websites -> Global list of Query URL parameters to exclude

Enter the list of URL Query Parameters, one per line.

Note: when one or more URL parameters are specified for exclusion, the change only affects data going forward. URL parameters are not removed retroactively from your historical data and reports.
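
For the hotel URLs above, the entries would be checkin-date and checkout-date. The sketch below only illustrates the effect of the exclusion and is not how Piwik implements it internally. The function name strip_params is hypothetical, and the sed -E patterns are hard-coded to those two parameters.

```shell
#!/bin/sh
# strip_params URL: removes the checkin-date and checkout-date query
# parameters, then tidies up any leftover "&&", "?&", or trailing "?" / "&".
strip_params() {
    printf '%s\n' "$1" | sed -E \
        -e 's/([?&])(checkin-date|checkout-date)=[^&]*/\1/g' \
        -e 's/([?&])&+/\1/g' \
        -e 's/[?&]$//'
}

# All three example URLs collapse to the same page URL.
strip_params 'www.example.com/hotels/list-shanghai/'
strip_params 'www.example.com/hotels/list-shanghai/?checkin-date=2016-03-01&checkout-date=2016-03-02'
strip_params 'www.example.com/hotels/list-shanghai/?checkin-date=2016-05-15&checkout-date=2016-05-18'
```

Because the three variations become one URL, Piwik has a single page URL to store instead of one per date combination.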

PHP Caching

Speed up the execution of Piwik's PHP code by using a PHP opcode cache (e.g. XCache).

Crontab

By default, Piwik triggers the report archiving process whenever you log in to Piwik's account interface through a web browser. If your website has a relatively high number of user sessions per day, waiting for Piwik to archive your data may take several minutes or more. To avoid the wait, set up a cron job on your Linux server to have your data processed automatically every hour.

To do this, you will add an entry to a crontab on the Linux machine that runs Piwik so that the archive script executes every hour. Cron is the time-based scheduling service on a Linux server, and a crontab is the file that lists its jobs. The archive script requires php-cli or php-cgi to be installed. You will also need SSH access to your server in order to set it up.

Create a new crontab with the text editor nano:

nano /etc/cron.d/piwik-archive

Add the following lines to the crontab:

MAILTO="youremail@example.com"
5 * * * * www-data /usr/bin/php5 /path/to/piwik/console core:archive --url=http://example.org/piwik/ > /home/example/piwik-archive.log

The Piwik archive script will run every hour (at 5 minutes past). Normally the script should complete in between 1 and 30 minutes, depending on the amount of traffic your website receives.

Let's examine the content of the crontab. If there is an error during the script execution, the script output and error messages will be sent to the youremail@example.com address.

MAILTO="youremail@example.com"

This is the user that the cron job will be executed by, and it should generally be your web server user.

www-data

This is the path to your PHP executable. The path may vary depending on your server configuration and operating system. To find the path of your PHP5 executable, run the command “which php5” or “which php” in a Linux shell.

/usr/bin/php5

Below is the only required parameter in the script, which must be set to your Piwik base URL, e.g. http://analytics.example.org/ or http://example.org/piwik/

--url=http://example.org/piwik/

Below is the path where the script will write its output. You can replace this path with /dev/null if you prefer not to log the output of the last Piwik cron run. The script output contains useful information such as which websites are archived and how long processing takes for each date and website.

/home/example/piwik-archive.log

Below is the optional path where the script will write the error messages. To use it, append 2> followed by this path to the end of the cron command. If you omit this from the crontab, errors will be emailed to your MAILTO address. If you include it, errors will be logged in the specified error log file.

/home/example/piwik-archive-errors.log

The cron utility uses two different types of configuration files:

  • System crontab
  • User crontab

The only difference between the two different types of configuration files is the sixth field.

In the system crontab, the sixth field is the name of a user for the command to run as. This gives the system crontab the ability to run commands as any user.

In a user crontab, the sixth field is the command to run, and all commands run as the user who created the crontab. This is a security feature.
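
To make the difference concrete, the sketch below prints the sixth whitespace-separated field of a crontab line using awk. The function name sixth_field is hypothetical, and the two sample lines are shortened versions of the Piwik job used in this chapter.

```shell
#!/bin/sh
# sixth_field LINE: prints the sixth whitespace-separated field of a crontab line.
sixth_field() {
    printf '%s\n' "$1" | awk '{ print $6 }'
}

# Shortened sample entries: one in system crontab format, one in user crontab format.
system_line='5 * * * * www-data /usr/bin/php5 /path/to/piwik/console core:archive'
user_line='5 * * * * /usr/bin/php5 /path/to/piwik/console core:archive'

sixth_field "$system_line"   # prints the user name: www-data
sixth_field "$user_line"     # prints the start of the command: /usr/bin/php5
```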

If you set up your crontab as a user crontab, you will have to write:

5 * * * * /usr/bin/php5 /path/to/piwik/console core:archive --url=http://example.org/piwik/ > /dev/null

This cron job will trigger the day/week/month/year archiving process at 5 minutes past every hour. This will make sure that when you visit your Piwik's account interface, the data has already been processed. Your Piwik reports should load as normal.

You can test the cron command. Make sure the crontab will actually work by running the script as the crontab user in the shell:

su www-data -s /bin/bash -c "/usr/bin/php5 /path/to/piwik/console core:archive --url=http://example.org/piwik/"

You should see the script output with the list of websites being archived, and a summary at the end stating that there was no error.



Gordon Choi's Analytics Book has been available since August 2016 and is proudly powered by Folks Analytics.

Content on Gordon Choi's Analytics Book is licensed under the CC Attribution-Noncommercial 4.0 International license.