I ran some initial Drupal search benchmarks with SearchBench on Amazon's EC2 cloud service. These first tests were primarily focused on confirming that SearchBench and EC2 are a good match. They utilized a single server instance, and did not include any server tuning.

I used the devel module to create 5,000 random nodes and 10,000 random comments. I indexed this content both with Drupal's core search module and with the contributed Xapian module. I then used SearchBench to create 1,000 random search queries with one to ten words in each query, with phrasing and negation set to random. Finally, I ran the identical search test three times in a row, comparing Xapian's performance to Drupal's core search performance. I was impressed by how well Drupal's core search performed in these tests, and plan many more tests to better understand the strengths and weaknesses of each search technology.
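SearchBench's query builder isn't reproduced here, but the general approach can be sketched roughly as follows. This is an illustrative sketch only, not SearchBench's actual code; the function name and the probability values are my own assumptions.

```python
import random

def random_query(wordlist, max_words=10, phrase_odds=0.3, negate_odds=0.3):
    """Build one random search query: one to max_words words, with
    phrasing and negation each applied at random (illustrative only)."""
    n = random.randint(1, min(max_words, len(wordlist)))
    words = random.sample(wordlist, n)
    if len(words) >= 2 and random.random() < phrase_odds:
        # Quote the first two words as an exact phrase.
        words = ['"%s %s"' % (words[0], words[1])] + words[2:]
    # Negate some bare words with a leading '-', leaving phrases alone.
    return ' '.join(
        '-' + w if random.random() < negate_odds and not w.startswith('"') else w
        for w in words
    )

# Hypothetical wordlist; SearchBench pulls its words from wordlists found online.
queries = [random_query(['apple', 'pear', 'plum', 'kiwi', 'fig'])
           for _ in range(1000)]
```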

In this test, Drupal's core search functionality outperformed Xapian search by about 11%. The three Xapian tests took a total of ~559 seconds, with the first taking ~192 seconds, the second taking ~188 seconds, and the third taking ~180 seconds. The three core search tests took a total of ~505 seconds, with the first taking ~163 seconds, the second taking ~163 seconds, and the third taking ~179 seconds. On average, Xapian queries took 0.1864 seconds to return, while core Drupal search queries took 0.1685 seconds to return.
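As a sanity check on the arithmetic, the per-query averages and the relative difference fall straight out of the per-run totals. The numbers below are the rounded per-run times from this test, so small rounding differences against the quoted totals are expected.

```python
# Per-run wall-clock times in seconds, 1,000 queries per run.
xapian_runs = [192, 188, 180]
core_runs = [163, 163, 179]
queries = 3 * 1000

xapian_avg = sum(xapian_runs) / queries   # ~0.187 seconds per query
core_avg = sum(core_runs) / queries       # ~0.168 seconds per query

# Relative slowdown of Xapian versus core search in this test.
slowdown = (sum(xapian_runs) - sum(core_runs)) / sum(core_runs) * 100
print('Xapian slower by %.1f%%' % slowdown)
```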

I repeated these same tests a second time, both to confirm that Xapian search improved its performance each time the same query was run (while Drupal's core search did not seem to), and to confirm that Drupal's core search outperformed Xapian in this test. I plan many more tests in the future to better understand these results and their implications for scaling Drupal search, as detailed below.

I did not monitor the server instance closely, so I do not know at this time whether either of the tests was hitting server limitations. I did not see any obvious problems with quick glances at 'top' and 'vmstat' while the tests ran, and as this was Amazon EC2's largest server instance, I doubt there were any server bottlenecks.

Setting up the benchmarks
I used 'ami-c8ac48a1' (RightScale's CentOS5_0 x86_64 V3_0_0 image) on an m1.xlarge instance to begin. I then ran this script to install all the software and get the server into a usable state. I did not make any server changes beyond what is contained in this script.

Conclusions

  • SearchBench needs better reporting. It currently generates a lot of raw data, but it's difficult to analyze the data in any great detail. Exporting the data to a spreadsheet and generating charts is necessary at a minimum.
  • Drupal's core search functionality clearly outperformed Xapian search in this test.
  • Both search options performed quite well.
  • Update: After creating charts with the data, it has become apparent that Xapian's lesser performance is mostly due to frequent rogue queries taking longer than average. Additional testing is necessary to understand what is causing this bottleneck, though from my understanding of Xapian I am first suspicious of disk I/O contention.
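The "rogue query" pattern is easy to flag once the raw timings are available. A minimal sketch of the idea: the sample timings below are made up, and the threshold of twice the mean is an arbitrary choice of mine, not a SearchBench feature.

```python
# Hypothetical per-query timings in seconds for one run.
timings = [0.15, 0.16, 0.14, 0.92, 0.15, 0.17, 0.88, 0.16]

mean = sum(timings) / len(timings)
# Flag queries that took more than twice the mean as "rogue".
rogues = [(i, t) for i, t in enumerate(timings) if t > 2 * mean]
print('mean %.3f s, %d rogue queries' % (mean, len(rogues)))
```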

Future plans

  • Repeat tests: These tests need to be repeated multiple times to confirm that they are consistent, and that abnormalities aren't being introduced by using a server in the cloud.
  • Generate graphs: I intend to make the search data exportable, so it can be imported into a spreadsheet program to generate graphs, offering visual comparisons of the results. This will be helpful in detecting any abnormalities that could be introduced by the cloud, or something else affecting the server. I won't run any more tests until this functionality is working, as it is difficult to analyze just the raw numbers.
  • Add advanced search features: The SearchBench module currently builds relatively simple search queries. It still needs to be improved to perform more advanced search queries, and then additional benchmarks need to be run to see how the different search options perform with different types of queries.
  • Compare valid searches to invalid searches: SearchBench is currently building queries from some wordlists found online. The vast majority of the search queries it creates contain nonsense words that do not return any results. Furthermore, the content generated by the devel module is also nonsense. It would be good to generate a wordlist from the actual content, build queries with that wordlist, and see how this affects performance.
  • Compare untuned MySQL with tuned MySQL performance: I did not tune MySQL at all for these tests, potentially handicapping Drupal's core search module. It will be interesting to run a test comparing the performance of Drupal's core search when MySQL is not tuned versus when it is. It would also be interesting to compare MyISAM search performance to InnoDB search performance. During these tests, it would also be insightful to review server health, including CPU utilization, load average, disk I/O, and memory/swap utilization.
  • Compare Core Search In Drupal 5, 6, 7, ...: These current tests were run against Drupal 5's core search. There have been some significant improvements made to search in Drupal 6, and it would be very interesting to compare the two versions. It will also be useful to benchmark Drupal 7 as it evolves.
  • Optimize Xapian performance: I need to look into what Xapian tuning can be done. Is it possible to tell Xapian to use more RAM? Can disk I/O performance on EC2 be improved by striping the drives? Can the way the Xapian module calls the Xapian PHP binding API's be optimized?
  • Compare Xapian, Sphinx, and Solr: It will be interesting to see how each of these solutions compare. There currently is not a Sphinx module available, but I believe there's a Solr module.
  • Utilize multiple servers: Build a cluster of servers in EC2, with a central database and multiple web nodes. Kick off a search from all of the web nodes at the same time, and see how search performance scales with the different solutions. Compare the performance of searching from one remote webserver, two remote webservers, three servers, etc, understanding the impact of scaling out with many webservers.
  • Create more content: See how the different search solutions scale as more content is added to the website. Compare node-heavy content (i.e., 5 times as many nodes as comments) to comment-heavy content (i.e., 5 times as many comments as nodes) and see if this affects search performance.
  • Perform more searches: Due to time constraints, I only ran 3,000 searches against each search solution (the same 1,000 queries repeated 3 times). Does running a larger number of unique queries affect performance?
  • 64-bit versus 32-bit: It would be interesting to compare the above numbers with a low-end 32-bit Amazon instance, and see how this affects the search performance of the various search solutions. Do any of the solutions offer an advantage on less powerful hardware?
  • Use real content: I started exploring pulling in a dump of Wikipedia to search on, however there's quite a bit more effort left for me to parse their XML dumps. I plan to explore other sources of real data that are smaller and thus easier to work with.
  • Error detection: Detect whether or not actual search results are being returned, or if there was an error.
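One of the ideas above, generating a wordlist from the actual content, is simple enough to sketch now. This is plain Python run over exported body text, not actual SearchBench or Drupal code; the helper name and parameters are my own.

```python
import re
from collections import Counter

def build_wordlist(texts, min_len=3, top_n=1000):
    """Count words across node/comment bodies and keep the most common
    ones, so generated queries actually match indexed content."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in re.findall(r'[a-z]+', text.lower())
                      if len(w) >= min_len)
    return [w for w, _ in counts.most_common(top_n)]

# Hypothetical body text standing in for exported node/comment content.
bodies = ['The quick brown fox', 'a quick fix for the search index']
wordlist = build_wordlist(bodies)
```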