Home

Tuning Search In Drupal 5


In previous search benchmarks, I utilized random content generated with Drupal's devel module. In these latest benchmarks, I used an actual sanitized copy of the Drupal.org community website database, with email addresses and passwords removed. The first tests were intended to confirm that Xapian continues to perform well with large amounts of actual data. Additional tests were performed to measure the effect of various MySQL tunings and configurations. The following data was derived from several hundred benchmarks run on an Amazon AWS instance over the past week using the SearchBench module.

These tests confirm that Xapian continues to offer better search performance than Drupal's core search module. Contrary to popular belief, the data also shows that using the InnoDB storage engine for search tables significantly outperforms using the MyISAM storage engine for search tables, especially when your database server has sufficient RAM. The data also confirms that allocating additional RAM for MySQL's temporary tables can also improve search performance.


When reviewing the top searches on Drupal.org, we found that they were generally only 1 or 2 words long, and that they rarely included phrases or term negation. Thus, my first tests compared Drupal 5's core search with Xapian search using similarly simple queries. Each benchmark was repeated three times, and the three tests were validated for consistency. I occasionally returned to a previous test and ran it again to verify that the generated results continued to be consistent each time.

The version of the Drupal.org database that I used was from early March of 2008, and contained 742,885 comments, 223,272 nodes, and 252,189 users. I utilized the core search indexes that were already in the database, but I still had to generate new indexes for Xapian.

Quick Summary

  • Xapian consistently outperformed Drupal 5's core search using the Drupal.org database.
  • Xapian did not increase load on the server as compared to using Drupal 5's core search, though it did increase Apache's memory usage.
  • Drupal 5's core search performance benefited from increasing max_heap_table_size to 32M (a 13% speedup). Further increasing max_heap_table_size and tmp_table_size to 64M did not offer another significant performance improvement.
  • We saw as much as a 29% performance improvement by converting the search tables from using the MyISAM storage engine to the InnoDB storage engine. InnoDB performance suffered when MySQL ran out of RAM, but it still significantly outperformed MyISAM. This is counter to the commonly accepted belief that Drupal's search module does not perform well when using InnoDB, and needs additional testing.
  • Compacting Xapian's database did not offer any perceivable performance improvement.

Drupal 5 Core Search vs. Xapian Search


    It took an average of 3.68107 seconds for Drupal 5's core search to run each query in the above benchmark. SearchBench randomly generated a list of 100 search queries derived from content within the database, and then used this list to benchmark the same search queries multiple times. The list of queries was executed three times in a row in this configuration, and each time the group of 100 queries completed in nearly the same exact amount of time, showing less than 1 second of total variation. (The first test averaged 3.676 seconds per query, the second 3.681 seconds per query, and the third 3.686 seconds per query.) This consistency helps to confirm the SearchBench is generating valid data for useful comparisons.

      Summary:
      Average time per test: 368.1071 seconds
      Average time per query: 3.68107 seconds
       Longest query: 25.63190 seconds
       Shortest query: 0.16111 seconds

      Raw data.


    The same 100 search queries used to test Drupal 5's core search were used again to test Xapian. It took an average of 2.23243 seconds for Xapian search to run each query in the above benchmark, a 65% speed up compared to Drupal 5's core search. The same list of queries was again run three times in a row to confirm the reliability of these numbers. The first test averaged 2.177 seconds per query, the second 2.263 seconds per query, and the third 2.257 seconds per query. The second and third tests completed within one second of each other, while the first pass completed nearly ten seconds quicker.

      Summary:
      Average time per test: 223.2431 seconds
      Average time per query: 2.23243 seconds
       Longest query: 11.61143 seconds
       Shortest query: 0.16880 seconds

      Raw data.


    This test was intended to understand the effect of including phrases and negation in some of the search queries. A phrase is when multiple words are joined together with quotes, such as "this is a phrase". Negation is when you search for documents that don't include a specific word by preceding it with a "-", such as "-word". The overall benchmark was completed faster than when not using phrasing and negation, however this is because many of the queries didn't return any results. More telling is that the slowest query jumped up from 25 seconds to 41 seconds! The randomly generated slow search query was quite ugly:

        "software site/all/modules chould node/add/xxx/parent/xxx illegal" alter (not "complete?2)"
      Summary:
      Average time per test: 325.2230 seconds
      Average time per query: 3.25223 seconds
       Longest query: 41.26123 seconds
       Shortest query: 0.16368 seconds

      Raw data.


    The same queries used in the above test were repeated with Xapian search, where again there was a perceived speedup caused by many of the queries returning no results.

    Note that while this benchmark and the previous Core search benchmark used the exact same list of queries, they do not use the same queries as were used in the earlier benchmarks. Thus, the fact that the average query speed is less could be simply caused by the fact that different queries were used in this test. To obtain a better understanding of how phrasing and negation affects overall performance, similar tests should be run with far more than 100 queries to obtain a more accurate average. In any case, in this test Xapian's slowest query took 12.5 seconds. The query that took Drupal 5's core search 41 seconds took Xapian only 12.4 seconds, a 300% speedup, while neither search actually returned any results.

      Summary:
      Average time per test: 204.3366 seconds
      Average time per query: 2.04337 seconds
       Longest query: 12.52517 seconds
       Shortest query: 0.17208 seconds

      Raw data.

Server Performance

Beyond raw performance numbers, these benchmarks were also focused on understanding how Drupal 5's core search module and Xapian search each utilized server resources. To collect the following data, I used SearchBench to create a list of tens of thousands of unique search queries, and collected data with Sar while performing back to back searches with each solution for well over an hour.

    Drupal 5 Core Search Server Performance:

    Visible in these tests, for the first half an hour there was a noticeable amount of disk activity showing up in Sar output as IO wait. Watching the test, I assumed that this disk activity was caused by Drupal 5's known tendency to create temporary tables while finding search results. However, I was perplexed to discover that this abnormally high disk activity stopped after the first thirty minutes. It's possible that after 30 minutes the SearchBench queries were similar enough for MySQL to start caching data in such a way that Disk IO was minimized. This is possible, as all search queries were built from a relatively small list of only 5,000 words extracted from site content. It would be interesting to repeat this test using a much larger list of unique words.

    CPU Utilization

    03:00:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
    03:00:01 PM       all     14.63      0.00      2.12     24.85      0.33     58.07
    03:10:01 PM       all     22.81      0.00      3.93     23.41      0.78     49.07
    03:20:01 PM       all     30.54      0.00      5.00     16.94      1.60     45.92
    03:30:01 PM       all     39.30      0.00      5.98      1.68      1.60     51.43
    03:40:01 PM       all     40.52      0.00      5.55      1.43      1.53     50.97
    03:50:01 PM       all     39.59      0.00      6.34      1.43      1.69     50.95
    

    Context switches per second

    03:00:01 PM   cswch/s
    03:00:01 PM    709.88
    03:10:01 PM    808.38
    03:20:01 PM    995.64
    03:30:01 PM    866.86
    03:40:01 PM    736.75
    03:50:01 PM    716.50
    

    Paging and faults

    03:00:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s
    03:00:01 PM   2049.73    264.24    749.68      0.23
    03:10:01 PM    928.91   3176.36    423.70      0.02
    03:20:01 PM    246.67   5967.00    252.97      0.03
    03:30:01 PM     86.23   1396.19    270.22      0.00
    03:40:01 PM     56.27   1224.52    198.56      0.00
    03:50:01 PM     66.05    931.83    229.10      0.00
    

    Memory usage

    03:00:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached 
    03:00:01 PM    373432   7491028     95.25     62612   2048496 
    03:10:01 PM     15392   7849068     99.80     63216   2553356 
    03:20:01 PM     71968   7792492     99.08     67424   2639488 
    03:30:01 PM     71788   7792672     99.09     68844   2608772 
    03:40:01 PM     50140   7814320     99.36     71272   2485512 
    03:50:01 PM     72624   7791836     99.08     71576   2315728 
    

    Xapian Search Server Performance:

    As you can see from the timestamps, the Xapian benchmarks were run before the MySQL benchmarks. We can see that Xapian also used caused fairly high disk IO, though not as much as Drupal 5's core search module. It is also interesting to note that Xapian search made much more consistent use of server resources, with less variation between the different time snapshots.

    CPU Utilization

    01:30:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
    01:30:01 PM       all     31.55      0.00      6.11      8.42      2.67     51.25
    01:40:01 PM       all     32.58      0.00      6.69      6.93      2.79     51.01
    01:50:02 PM       all     34.82      0.00      6.85      4.03      3.11     51.20
    02:00:01 PM       all     35.16      0.00      6.71      4.06      2.92     51.14
    02:10:01 PM       all     34.97      0.00      6.89      4.20      2.98     50.96
    02:20:01 PM       all     35.74      0.00      6.74      2.98      3.18     51.35
    

    Context switches per second

    01:30:01 PM   cswch/s
    01:30:01 PM    878.01
    01:40:01 PM    895.38
    01:50:02 PM    940.72
    02:00:01 PM    875.44
    02:10:01 PM    936.37
    02:20:01 PM    942.42
    

    Paging and faults

    01:30:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s
    01:30:01 PM    283.33   4726.11    114.07      0.00
    01:40:01 PM    320.27   4067.61    104.76      0.00
    01:50:02 PM    136.67   4598.89    144.10      0.00
    02:00:01 PM    126.53   4066.34    131.40      0.00
    02:10:01 PM    117.59   5798.11    119.74      0.01
    02:20:01 PM     83.78   4413.75    128.38      0.00
    

    Memory usage

    Reviewing the following table you can see that Xapian placed additional memory requirements on the server. This is logically explained, as Xapian is running within PHP, adding additional memory overhead to the Apache process, while Drupal's core search makes use of MySQL's existing memory allocations. (Later in this document, we find that giving MySQL more memory can also improve Drupal 5's core search performance.)

    01:30:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached
    01:30:01 PM     41552   7822908     99.47     58172   1593224
    01:40:01 PM     72748   7791712     99.07     40460   1439720
    01:50:02 PM     74536   7789924     99.05     43808    892112
    02:00:01 PM     29564   7834896     99.62     47884    898920
    02:10:01 PM     73792   7790668     99.06     48420    867100
    02:20:01 PM     71728   7792732     99.09     51416    870336
    

Drupal 5 Core Search, MySQL Temporary Table Size

A significant performance improvement found in Drupal 6's core search module is a change to table structure causing search to generate fewer temporary tables. With the knowledge that temporary tables are a known performance bottleneck in Drupal 5, I ran some benchmarks to measure the affect of allocating additional RAM to MySQL for creating temporary tables.

    Default MySQL Tuning

    The first benchmark was with MySQL running with default settings for temporary tables, so tmp_table_size was 32M and max_heap_table_size was 16M. It should be noted that MySQL essentially uses the lower of these twp values, so in this configuration MySQL is allowed to use a maximum of 16M per process for temporary table allocations before it switches to using file-based temporary tables.

    • Summary:
      Average time per test: 413.3598 seconds
      Average time per query: 4.13360 seconds
       Longest query: 37.72950 seconds
       Shortest query: 0.16072 seconds

      Raw data.

    MySQL with 32M temporary tables

    For this second benchmark, max_heap_table_size was increased to 32M, matching the default setting for tmp_table_size. This configuration change brought us a 13% performance improvement, again comparing the same exact list of search queries. (MySQL was restarted to flush all caches, and I repeated each of these benchmarks a second time to verify this speedup). The improved performance makes sense, as MySQL is now able to create larger temporary tables in memory, without suffering a slowdown by writing the temporary tables to disk. However, be aware that this configuration change results in an per-connection increase of 16M. In other words, with the default MySQL setting of 100 max_connections, we are potentially allocating another 1.6G to the MySQL process. Keep this in mind so that you don't accidentally send your MySQL process into swap, which would result in terrible performance.

    • Summary:
      Average time per test: 365.7413 seconds
      Average time per query: 3.65741 seconds
       Longest query: 27.67167 seconds
       Shortest query: 0.15986 seconds

      Raw data.

    MySQL with 64M temporary tables

    For the third benchmark, I increased both tmp_table_size and max_heap_table_size to 64M. This gained us another 1.5% performance boost, or a 14.7% improvement compared to MySQL's default tuning. At this point we've already reached diminishing returns. Thus, with the drupal.org data set, I would leave these two settings at 32M. To be clear, this second configuration change increased the potential per-process memory usage by another 32M, or a potential total of 3.2G with the default of 100 max_connections! Due to the minimal gains this caused, it is very likely that there are better uses for this memory.

    • Summary:
      Average time per test: 360.4615 seconds
      Average time per query: 3.60461 seconds
       Longest query: 27.73822 seconds
       Shortest query: 0.16011 seconds

      Raw data.

Drupal 5 Core Search, MyISAM versus InnoDB
It has been commonly accepted by Drupal performance experts that Drupal's core search module does not perform well if the search tables are using the InnoDB storage engine. These benchmarks were intended to both verify and quantify this claim, but instead they disproved the claim. Instead, it was found that you can achieve a significant performance gain by converting your search tables to InnoDB, if your server has sufficient RAM. However, search tables can grow to a very large size, it may often be the case that you do not have sufficient RAM.

    MyISAM

    For this first benchmark, MySQL was restored to its default settings for temporary tables, and all three search tables were confirmed to be using the MyISAM storage engine. This set the baseline for comparing MyISAM performance to InnoDB performance.

    • Summary:
      Average time per test: 413.3598 seconds
      Average time per query: 4.13360 seconds
       Longest query: 37.72950 seconds
       Shortest query: 0.16072 seconds

      Raw data.

    search_dataset InnoDB

    Next, I converted all 221,655 rows of the search_dataset table to the InnoDB storage engine, leaving the other two search tables using the MyISAM storage engine. I ran the same identical benchmarks again, and average performance improved by 10%.

    • Summary:
      Average time per test: 375.5651 seconds
      Average time per query: 3.75565 seconds
       Longest query: 28.37350 seconds
       Shortest query: 0.16093 seconds

      Raw data.

    search_total InnoDB

    Next I converted all 452,370 rows of the search_total table to InnoDB, and ran the same benchmarks again. Converting this second search table brought us an even more impressive 17% performance improvement, for a total of a 29% improvement over purely MyISAM tables.

    • Summary:
      Average time per test: 321.9647 seconds
      Average time per query: 3.21965 seconds
       Longest query: 25.69365 seconds
       Shortest query: 0.16256 seconds

      Raw data.

    All Search Tables InnoDB

    Finally I converted all 20,017,829 rows of the search_index table to the InnoDB storage engine. This conversion did not result in a performance improvement as the other conversions had, actually resulting in an 11% performance penalty as compared to the previous test. This was still a 15.7% improvement over running with MyISAM search tables.

    While writing up this report, I reviewed the data closer and discovered that this slowdown from converting the search_index table was not due to a limitation of InnoDB, but was in fact due to InnoDB running out of memory. This is not surprising when one considers the very large size of the search_index table, and InnoDB's greater memory requirements than MyISAM. Furthermore, in many cases search query speed actually improved, and it was only some specific queries where performance suffered due to InnoDB running out of memory.

    Reviewing this data, I'm left to assume that the reason it has wrongly been believed that InnoDB tables do not offer good search performance is because of InnoDB running out of memory. In reality, if your server has sufficient memory you will likely benefit from using purely InnoDB tables. Even in our benchmarks where we ran out of memory, search tables using the InnoDB storage engine managed to outperform search tables using the MyISAM storage engine.

    I plan to run additional benchmarks with a new dataset to confirm these findings at a future time.

    • Summary:
      Average time per test: 357.4054 seconds
      Average time per query: 3.57405 seconds
       Longest query: 51.53070 seconds
       Shortest query: 0.15946 seconds

      Raw data.

Server Performance

My final benchmark looked at server performance from running a sustained search load on a database using the InnoDB storage engine for search tables. I used the exsact same query list described earlier in this document, and collected data with Sar.

    Drupal 5 Core Search With InnoDB Server Performance:

    CPU utilization

    08:20:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
    08:20:01 PM       all     42.61      0.00      1.76      3.00      1.77     50.86
    08:30:02 PM       all     42.69      0.00      1.84      2.60      1.85     51.03
    08:40:01 PM       all     41.91      0.00      1.77      3.00      1.96     51.36
    08:50:01 PM       all     41.82      0.00      1.82      3.19      1.96     51.22
    09:00:01 PM       all     41.92      0.00      1.92      3.11      1.94     51.11
    09:10:01 PM       all     41.48      0.00      1.76      3.43      1.86     51.46
    09:20:01 PM       all     41.49      0.00      1.98      3.63      1.63     51.26
    09:30:01 PM       all     40.92      0.00      1.76      3.87      1.79     51.66
    

    Reviewing the above data, it's clear that using InnoDB for search tables does not result in an increase in CPU utilization, as these numbers are very much in line with what we saw earlier when using the MyISAM storage engine. Once difference is that we did not experience a Disk IO storm as we did during our MyISAM tests. This could be an anomaly, or it could be the result of InnoDB making better use of RAM, having already loaded much of the search tables into memory.

Compressing Xapian's Database

A comment posted to an earlier article suggested that we try compressing Xapian's database files to improve search performance.

    Uncompacted Xapian Search Performance

    First I set a baseline by running some benchmarks on an uncompressed Xapian database.

    • Summary:
      Average time per test: 216.1539 seconds
      Average time per query: 2.16154 seconds
       Longest query: 5.51119 seconds
       Shortest query: 0.16965 seconds

      Raw data.

      Compacted Xapian Search Performance

      Next, I used the xapian-compact command line utility to compact the Xapian databases. This resulted in the following output:

        postlist: Reduced by 56.8097% 350552K (617064K -> 266512K)
        record: Reduced by 1.05708% 40K (3784K -> 3744K)
        termlist: Reduced by 5.88804% 13152K (223368K -> 210216K)
        position: Reduced by 0.0104678% 64K (611400K -> 611336K)
        value: Size unchanged (0K)
        spelling: Size unchanged (0K)
        synonym: Size unchanged (0K)
        

      The reduced size was also visible at the filesystem level:

        du -shr xap*
        1.4G	xapian.orig
        1.1G	xapian.compact
        

      Strangely enough, this did not offer any improvement in search performance, instead showing up as a slight performance penalty. Reviewing the data closer, I realized that the first of our three benchmarks suffered a penalty as this was the first time Xapian had used the specified database files. The second and third benchmarks had very similar results compared to the uncompressed files, as summarized below. In conclusion, with the Drupal.org dataset we did not measure any performance improvement by compacting the Xapian database files.

      It seems likely that the reason compacting the database did not improve performance is because we were using a static dataset. My theory is that if content was being updated, deleted, and added as it would be on a live website, and the Xapian database was being updated each time, then compacting the database may actually have a beneficial effect.

      • Summary:
        Average time per test: 218.5537 seconds
        Average time per query: 2.1855 seconds
         Longest query: 5.58208 seconds
         Shortest query: 0.16979 seconds

        Raw data.

    Where do we go from here?

    The most important thing I learned from running these benchmarks is to stop recommending that the MyISAM storage engine be used for Drupal's search tables. I will be running further benchmarks to validate these findings, and to explore configuration settings that can help InnoDB's search performance.

    This data was also quite useful to confirm that Xapian can help scale search on a large Drupal website with lots of content. I would be interested in learning more about how Xapian search performance can be further optimized.

    In future benchmarking I will be comparing Drupal 5's core search module with Drupal 6's core search module. I would like to validate the claim that Drupal 6 offers improved search performance, and to review MySQL configurations that can further optimize its performance, as research to my online performance and scalability book.

    Finally, I soon plan to broaden my research to include other third party search solutions, including both Sphinx and Solr.

    AttachmentSize
    graphs.gnumeric147.9 KB

Compacted databases


Hmm, there seem to be some odd regular gaps in the barchart for the compacted timings - there don't seem to be corresponding almost-zero figures in the raw data, and the graphs in the gnumeric file don't seem to have them. Some sort of image rendering issue?

Anyway, your theory is probably roughly right. If you index data into a fresh database by simply appending documents, then a linear insert mode gets used a lot and the database will be pretty compact anyway.

This is less true for the postlist table as this holds the term -> document mappings which produce more scattered writes, and the output from xapian-compact shows a lot of space being reclaimed there. Partly that will be blocks which aren't currently used at all, the rest will be blocks which aren't completely full. The former probably matter less, especially if they are grouped, as we won't ever try to read them. The latter should make a difference as it should mean we're more likely to have already read in an entry we want along with another in the same block.

I'm not sure why it seems slightly slower though. Perhaps just chance (which blocks happen to be near others so disk seeking times happen to be more). Or perhaps the "faster" code we use for a fully compact database isn't really faster and we'd be better off using the same code all the time! I'll have to check that.

Incidentally, "compact" is more correct than "compress" for describing what's going on - the operation essentially just minimises unused space in the .DB files rather than running any data through a compression algorithm.

Xapian performance


"Hmm, there seem to be some odd regular gaps in the barchart for the compacted timings"

Yeah, I seem to have hit upon some graph generation bug, but didn't have time to dig into it too deeply. Perhaps it would have been better to simply not display that graph on the web page at all as it could cause confusion.

"I'm not sure why it seems slightly slower though. Perhaps just chance (which blocks happen to be near others so disk seeking times happen to be more). Or perhaps the 'faster' code we use for a fully compact database isn't really faster and we'd be better off using the same code all the time! I'll have to check that."

I'd be fascinated to hear what you decide! Fortunately when running xapian-compact you end up with a complete copy of the database, so it's possible to test the before and after and to chose the best one.

"Incidentally, 'compact' is more correct than "compress" for describing what's going on - the operation essentially just minimises unused space in the .DB files rather than running any data through a compression algorithm."

Thanks for clarifying. I suppose it should have been obvious to me, as you didn't call the utility "xapian-compress"... ;)

While we're talking about Xapian performance, if I wanted to share a read-only copy of xapian indexes across multiple servers to allow multiple servers to handle search queries, is it safe to run rsync to keep them all in sync? Or would this risk pushing a corrupt copy for example if rsync runs while the xapian database files are being written to?

Also, is there likely to be a performance boost from keeping these read-only copies of the database on a tmpfs mount in RAM?

Thanks again for all your feedback.

Replication


if I wanted to share a read-only copy of xapian indexes across multiple servers to allow multiple servers to handle search queries, is it safe to run rsync to keep them all in sync? Or would this risk pushing a corrupt copy for example if rsync runs while the xapian database files are being written to?

It's not safe if the database is written to during the sync operation. There's a new replication feature on trunk (but not in 1.0.x) which is aimed at this sort of situation.
The documentation for it discusses other approaches, including using rsync:

http://trac.xapian.org/browser/trunk/xapian-core/docs/replication.rst


Also, is there likely to be a performance boost from keeping these read-only copies of the database on a tmpfs mount in RAM?

If there's sufficient RAM to do that, then if you put the DB on disk the VM system will tend to end up caching the DB in RAM anyway, so probably the only benefit of using tmpfs would be the elimination of the cache "warm up" time just after copying. If there's VM pressure, the VM system may even decide to page out unaccessed blocks from the tmpfs to swap - in some ways the distinction between "in RAM" and "on disk" is a bit artificial as the data will tend to end up where it's most useful either way.

thanks


I'd not found the replication document before, that was very helpful. Thanks!