Drupal is not known as the most performant application, and neither is PHP, the language it is written in, but there is a lot you can do to improve the performance of your Drupal site. This article touches on many of those methods, covering key modules, how to configure them, and how to set up other applications to support your Drupal site.
One of the main things that will affect the performance of your Drupal site is bloat. There are many modules available, and you may want to install them all, but don’t. Plan ahead: look at what each module does, see how well it works and whether it offers what you want. Most modules offer an ‘uninstall’ function, but not all do, so test new modules on a dev environment that you can delete and rebuild if needed. This keeps leftover data that is no longer needed out of your live database.
Updates to Drupal core and contributed modules are released very often, and they sometimes include performance improvements, so keeping up to date is vital. Updates still call for some caution: always test them before pushing them live, as you never know which features may have changed, been removed or broken.
Drupal 6 had many issues, some of which prevented the use of third-party performance tools such as the Varnish reverse proxy cache. These issues were all resolved in a distribution of Drupal called “Pressflow”, so I would recommend that anyone still using Drupal 6 looks at moving to Pressflow. All of the changes made to Drupal 6 for Pressflow have since been worked on by the Drupal community as a whole and added to Drupal 7. If you are looking to build a new site using Drupal, then Drupal 7 should be your version of choice.
APC (Alternative PHP Cache) is a PHP opcode cache. It is a very quick win when working with PHP and can offer a great performance boost when using Drupal. It is very much a “set it and forget it” type of application, which can just be installed, enabled and left to do its thing. Many Drupal-specific hosting companies will already have APC set up and running, so you may even be using it without noticing.
Drupal’s support for Memcache is really good across Drupal 6 and Drupal 7, so even if you have an older site this can still offer you a boost. Drupal has a fantastic hookable caching system, where any module can write to a standard cache table, or create its own cache table, and then use a specific API to write to these cache tables. Using these cache tables can save you from repeating large, complex PHP tasks or MySQL queries, but reading and writing the cache can itself generate slow queries. Memcache relieves that problem by storing all of these cache tables in memory. For many sites this reduces load on the server and increases the performance of the site.
When you have a lot of anonymous users, a reverse proxy cache can save you a lot of server load. Varnish is one of the more popular solutions within the Drupal world. Varnish sits in front of your web server application, for example Apache, nginx or lighttpd, and can run on the same server or a remote server. It is often run on a load balancer in front of multiple web servers. Varnish will cache pages for anonymous users for as long as the “max-age” value in the Cache-Control header allows. Varnish can be quite complex to set up, but there are many Drupal-focused tutorials. It’s advised to configure it to bypass the cache only for users with a cookie starting with “SESS”, as these are given to authenticated Drupal users. Be aware that any module that sets “$_SESSION” in its code will also cause Drupal to set one of these cookies, which will cause Varnish to be bypassed and extra load to be added to the web server. Also note that when a cached page is served from Varnish, no PHP code gets executed within Drupal, so things such as mobile detection or geoip detection will not function.
When you are on an environment that won’t allow you to use Varnish, such as shared hosting, Boost works as a great alternative. Boost is a Drupal module that caches all of the pages, for anonymous users, to flat files. When a page is then requested it loads a lot quicker, because it comes straight from the disk and no PHP or MySQL processing is needed. Boost does not work as well on distributed or cloud environments that use network file systems, as reads and writes on these can be a lot slower and cause issues.
The Views module is one of the best modules ever written for Drupal, but it can often end up generating very slow database queries. When optimising Views, many of the same rules apply as when optimising database queries. When you “Preview” a view in the Views interface, it shows you the query it is generating, and from there it may be clearer what is going on under the hood. Firstly, I would advise using InnoDB in MySQL instead of MyISAM, as this offers a great performance boost. I would then look at ways to avoid using “distinct” and “count” in your queries. When sorting by date, make sure the granularity is set to seconds. Try a few different settings within Views, and a few different versions of the queries, to see which ones load faster. You may be able to get much better performance by only slightly compromising on functionality.
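To make the “SESS” cookie rule concrete, here is a minimal sketch of the decision in Python. It is purely illustrative: a real Varnish setup expresses this rule in its own configuration language, and the function name `should_bypass_cache` is a hypothetical one chosen for this example.

```python
def should_bypass_cache(cookie_header: str) -> bool:
    """Return True when any cookie name in the header starts with "SESS".

    Illustrative only: this mirrors the rule described in the text,
    it is not Varnish configuration.
    """
    for part in cookie_header.split(";"):
        # Each part looks like "name=value"; keep only the name.
        name = part.strip().split("=", 1)[0]
        if name.startswith("SESS"):
            return True
    return False


# An anonymous visitor with only a JavaScript helper cookie can be served from cache.
print(should_bypass_cache("has_js=1"))
# A visitor holding a session cookie goes through to the web server.
print(should_bypass_cache("has_js=1; SESSabc123=xyz"))
```

Note how a single session cookie is enough to send the request to the web server, which is why a module that touches “$_SESSION” for anonymous visitors can quietly defeat the cache.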
Views lite pager
The Views lite pager module is one of the biggest performance boosts I have seen for the Views module. The standard pagers within Views add first, previous, page number, next and last links. To generate these they need another database query, using the MySQL count function, to find out how many pages there are. When using InnoDB the count function is so slow that it can easily take down large sites. Views lite pager removes the need for a count function in the query by adding just a previous and a next link for the pager. You therefore lose a small amount of functionality, but the performance boost is incredible.
Views has its own caching system, which allows you to set how long each view should be cached for, for both anonymous and authenticated users. This is stored in the Views cache tables unless you are using Memcache, in which case it will be stored in memory. Normally, when a page containing a view (or several views) is requested, a database query is run to load the data for each view. If you have one thousand people requesting that page over a few minutes, those views will execute thousands of database queries, which could cause quite a performance hit. If you set the Views cache to 5 minutes on each view, then the database query will only be run once every 5 minutes, no matter how many times the page is requested. The downside is that if you update the site with new content, it may not be displayed for up to 5 minutes, but that’s a small price to pay for the performance. If you have views with content that doesn’t change very often, you could set the cache time to much longer.
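One common way to offer previous and next links without a count query is to fetch one row more than the page needs and check whether that extra row exists. The Python sketch below shows the idea against an in-memory SQLite table. It is an assumption about the general technique, not a description of how the module is implemented, and the `node` table layout and the `fetch_page` helper are hypothetical.

```python
import sqlite3


def fetch_page(conn, page, per_page=10):
    """Return (rows, has_previous, has_next) without running COUNT."""
    offset = page * per_page
    # Ask for one extra row; its presence tells us another page exists.
    rows = conn.execute(
        "SELECT id, title FROM node ORDER BY id LIMIT ? OFFSET ?",
        (per_page + 1, offset),
    ).fetchall()
    has_next = len(rows) > per_page
    return rows[:per_page], page > 0, has_next


# Hypothetical sample data: 25 posts in a simplified "node" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO node (title) VALUES (?)",
    [(f"Post {i}",) for i in range(25)],
)

rows, has_previous, has_next = fetch_page(conn, page=0)
print(len(rows), has_previous, has_next)
```

The trade-off is the one described above: without a total, the pager cannot show page numbers or a last link.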
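The effect of a time-based cache can be sketched in a few lines of Python. This is a simplified illustration rather than the Views implementation: `TimedCache`, `run_view_query` and the injected fake clock are all hypothetical names used only to show that a thousand requests inside the cache lifetime trigger a single query.

```python
import time


class TimedCache:
    """Cache one computed value for a fixed lifetime (in seconds)."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self.lifetime = lifetime_seconds
        self.clock = clock
        self.value = None
        self.expires_at = None

    def get(self, compute):
        now = self.clock()
        # Recompute only when nothing is cached yet or the entry has expired.
        if self.expires_at is None or now >= self.expires_at:
            self.value = compute()
            self.expires_at = now + self.lifetime
        return self.value


stats = {"queries": 0}


def run_view_query():
    """Stand-in for the expensive database query behind a view."""
    stats["queries"] += 1
    return ["node/1", "node/2"]


# A fake clock lets the example simulate time passing instantly.
fake_now = [0.0]
cache = TimedCache(300, clock=lambda: fake_now[0])  # 5 minute lifetime

for i in range(1000):
    fake_now[0] = i * 0.2  # 1000 requests spread over about 200 seconds
    cache.get(run_view_query)

print(stats["queries"])
```

Once the lifetime has passed, the next request pays for the query again and refreshes the cached result, which is where the delay in showing new content comes from.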
10. Block cache
Drupal’s block cache can offer a great performance boost for anonymous and authenticated users, especially when used with Memcache. When generating a block in Views, it is possible to select whether the block should make use of the block cache. Make sure you enable this and select a setting that is sensible for the type of data you’re displaying. For example, if you are listing all posts by the currently logged-in user, you would want to cache the block per user, so people don’t end up seeing others’ content. Or if you are listing articles related to the current page, you would need to cache the block per page, to prevent unrelated articles being displayed.
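The reason the cache setting matters is easiest to see in terms of cache keys: two requests share a cached block only when they produce the same key. The Python sketch below is a hypothetical illustration of that idea. The function `block_cache_key` and the granularity names are assumptions made for this example, not Drupal’s API.

```python
def block_cache_key(block_id, granularity, user_id=None, path=None):
    """Build a cache key whose scope matches the chosen granularity."""
    if granularity == "global":
        # Everyone shares one cached copy.
        return block_id
    if granularity == "per_user":
        # Each user gets a separate cached copy.
        return f"{block_id}:user:{user_id}"
    if granularity == "per_page":
        # Each page path gets a separate cached copy.
        return f"{block_id}:page:{path}"
    raise ValueError(f"Unknown granularity: {granularity}")


# "My posts" must differ between users, so the key includes the user.
print(block_cache_key("my_posts", "per_user", user_id=7))
print(block_cache_key("my_posts", "per_user", user_id=8))
# "Related articles" must differ between pages, so the key includes the path.
print(block_cache_key("related", "per_page", path="node/12"))
```

Choosing too broad a scope is what leads to one user seeing another user’s block, while choosing too narrow a scope simply means fewer cache hits.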
11. File system optimisations
12. Fast 404
All sites get “404 page not found” errors, although they are more common when you are launching a new site and paths to pages and images have changed. To serve a 404 page, Drupal has to do a full “bootstrap”: load all modules, load settings, and so on. If a few images are missing on a page, each missing file triggers its own bootstrap, and across many visitors this could end up with hundreds of megabytes of memory being used on the server unnecessarily. The Fast 404 module allows a very simple 404 page to be served, which uses very little memory. Drupal 7 has a little of this functionality in core, but the Fast 404 module offers a lot more. Missing images and 404 errors should not be ignored, because I have seen this issue cause sites to fail.
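The general idea behind a fast 404 can be sketched as a cheap check that runs before the expensive bootstrap. The Python below is a hedged illustration only: the extension list, the `handle_request` function and the `full_bootstrap` callback are hypothetical and do not reflect the module’s actual code or settings.

```python
import re

# Assumed list of static file types that should never need a full bootstrap.
STATIC_EXTENSIONS = re.compile(r"\.(?:png|gif|jpe?g|css|js|ico)$", re.IGNORECASE)


def handle_request(path, existing_files, full_bootstrap):
    """Answer missing static files cheaply; bootstrap for everything else."""
    if STATIC_EXTENSIONS.search(path) and path not in existing_files:
        # Minimal response: no modules loaded, no database touched.
        return 404, "<h1>Not Found</h1>"
    return full_bootstrap(path)


calls = {"bootstraps": 0}


def full_bootstrap(path):
    """Stand-in for the expensive full page build."""
    calls["bootstraps"] += 1
    return 200, f"Rendered {path}"


existing = {"/files/logo.png"}
print(handle_request("/files/missing.jpg", existing, full_bootstrap))
print(handle_request("/node/1", existing, full_bootstrap))
print(calls["bootstraps"])
```

Only the real page request reaches the expensive code path, so a page with several broken image links no longer multiplies the cost of each visit.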
13. Bad modules
Drupal core ships with some great modules, but it also ships with some nasty ones. Here are three of the worst:
Database logging (dblog)
The Database logging module writes all log messages to the database. When you have many errors, debugging information or modules that write other log messages, this can end up being many database inserts per page load. This puts extra strain on your database server and causes performance issues. The recommendation here is to disable the Database logging module and use the Syslog module instead. Syslog also ships with Drupal core, but writes to the server log file, so it offers similar functionality at a fraction of the resources.
Statistics
The Statistics module is used to count how many times content has been viewed, as well as collecting other data about users’ activity on the site, much like Google Analytics. This can cause multiple database writes per page load for both anonymous and authenticated users, which adds unwanted load on the database. Also, if you are using reverse proxy caching such as Varnish, Statistics will not return accurate data. As the maintainer of the Statistics module in Drupal core, I am working to resolve these issues and hope to have them solved in Drupal 8, and possibly backported to Drupal 7. Until then I would suggest using Google Analytics. The Google Analytics Reports module uses the Google Analytics API to fetch information from Google so that you can make use of it in your site.
PHP filter
The PHP filter module allows PHP code to be added to content (nodes) and to blocks. This PHP code is stored in the database, so Drupal first has to load the code from the database before executing it. As you can imagine, this is slower than having the code in a file as part of a Drupal module. What makes it worse is that when the PHP filter module is used, none of the code executed gets cached. So please, put all of your code into custom modules.
14. Performance monitoring
Different sites have different problem areas, and there are a few ways to monitor site performance during development, during load testing and during live usage. The Devel module for Drupal offers a few useful features, such as listing the database queries run on a page and the time they took, as well as reporting the memory used to load the page. These will tell you which areas of the site should be optimised, and can easily be used during the development process.
New Relic, on the other hand, is a third-party service that works well with Drupal. It runs on the server and logs the speed of queries and functions on the site. When running a load test, you can monitor New Relic to get vital information that will help improve the site. New Relic also works well on live sites, so even after launch you can continue to monitor for issues and get tips on improving the site.
Drupal is also not well known for producing great HTML; it is better known for its flexibility. Altering the HTML so that it only has the structure you need for your site will therefore reduce the page size and improve the load time.
Make sure you are using imagecache in Drupal 6 or image styles in Drupal 7 to reduce the size of user-uploaded images.
- Words: Tim Millwood
A client advisor at Acquia and freelance web developer, Tim is an active member of the Drupal community.