Home

All Tag1 Features

MySQL Replication Tips And Tricks


Until recently, I was a student employee at the Oregon State University Open Source Lab. My career there ended, like many, with that painful process known as graduation. I got invaluable experience at the lab, not the least of which being the knowledge gained as their main (only) database administrator. One of my great pleasures in that position, was learning how to configure MySQL replication and manage clusters of replicating database servers. Even the simple case of a single master and a single slave has its edge cases.

Chapter 1 Rough Draft Complete


I have completed a rough draft of the first chapter of "Drupal Performance and Scalability". The first chapter of this online book is divided into four sections, the first of which focuses on the importance of fully defining your performance and scalability goals, helping you to identify what you need to accomplish and how to set concrete and attainable goals. The second section discusses monitoring and measuring your ongoing progress, helping you decide what you need to monitor, and how to monitor it. The third section stresses the importance of making regular backups, discussing what needs to be backed up, and offering example scripts for backing up your entire website, including the database. Finally, the fourth section takes an in depth look at using revision control tools to manage your website, providing useful recipes showing how Git can track changes to your website, helping you update to new releases and push those updates into production.

It is important to realize that this is a rough draft, and as such it may contain spelling or grammatical errors, it may be missing key points, and the writing style may not be very polished. However, the book has to start somewhere, and this is the first step toward the end goal of publishing a useful and freely available online resource. I welcome all criticisms, suggestions and feedback. If you find errors in the text or have specific comments, you can help with this writing project by posting your feedback on the appropriate page. The current status of this project is tracked here.

Tuning Search In Drupal 5


In previous search benchmarks, I utilized random content generated with Drupal's devel module. In these latest benchmarks, I used an actual sanitized copy of the Drupal.org community website database, with email addresses and passwords removed. The first tests were intended to confirm that Xapian continues to perform well with large amounts of actual data. Additional tests were performed to measure the effect of various MySQL tunings and configurations. The following data was derived from several hundred benchmarks run on an Amazon AWS instance over the past week using the SearchBench module.

These tests confirm that Xapian continues to offer better search performance than Drupal's core search module. Contrary to popular belief, the data also shows that using the InnoDB storage engine for search tables significantly outperforms using the MyISAM storage engine for search tables, especially when your database server has sufficient RAM. The data also confirms that allocating additional RAM for MySQL's temporary tables can also improve search performance.

Online Performance and Scalability Book


Tag1 Consulting is focused on improving Drupal's performance and scalability. We also believe that when information is freely shared, everyone wins. Toward these ends, we are working on an online book titled, "Drupal Performance and Scalability". The book is divided into five main sections, Drupal Performance, Front End Performance, Improved Caching and Searching, Optimizing the Database Layer, and Drupal In The Cloud. The book is primarily aimed toward users running Drupal on the LAMP stack, with chapters applicable to everything from low-end shared hosts to large-scale multi-server installations.

By publishing on-line, we aim to encourage you to participate in the book writing process as an editor and a technical reviewer. You will currently find the book's complete outline online, along with descriptions of each planned section and chapter. As the book evolves, it will continue to be updated online in real time. We encourage you to post comments with suggestions, critical feedback, grammatical corrections, or anything else relevant to our ongoing effort.

Comparing Xapian and Drupal 5's Core Search


SearchBench has received a couple of useful updates since yesterday's initial cloud tests. It can generate search queries based on actual content, and it can export search benchmark results. In gaining these features, it is now possible to use SearchBench to perform some actual performance comparisons.

Once again I set up these tests on an extra large EC2 instance. I still have not performed any tuning, and I continue to test Drupal 5 core search with Xapian search. My initial benchmarks show that Xapian offers a very significant 6x+ performance advantage over Drupal's core search when a given search query actually returns results. In addition, Xapian is able to index a large site in about a 3rd the time of Drupal 5's built in search. Read on for actual benchmark results and graphs.

SearchBench In The Cloud


I ran some initial Drupal search benchmarks with SearchBench on Amazon's EC2 cloud service. These first tests were primarily focused on confirming that SearchBench and EC2 are a good match. They utilized a single server instance, and did not include any server tuning.

I used the devel module to create 5,000 random nodes and 10,000 random comments. I indexed this content both with Drupal's core search module, and with the contributed Xapian module. I then used SearchBench to create 1,000 random search queries with one to ten ten words in each query, with phrasing and negation set to random. Finally, I ran the same identical search test three times in a row, comparing Xapian's performance to Drupal's core search performance. I was impressed to see how well Drupal's core search performed in these tests, and plan many more tests to better understand the strengths and weaknesses of each search technology.

Introducing SearchBench


There have been some ongoing scalability issues affecting Drupal.org's built in search functionality for some time now. Less interested in outsourcing search to a big black box such as Google, I spent some time helping clean up the Xapian module, making it possible to completely replace Drupal's built in SQL-powered search functionality with a Xapian powered engine. With the basic search functionality complete, there was still a need to actually compare the performance of the two solutions.

Toward this goal, over the weekend I launched a new project called SearchBench, a Drupal module for benchmarking Drupal's search performance. As the module evolves, I hope it will prove extremely useful for comparing the performance and scalability of the many free and open source search options available to Drupal powered websites.

New Site Design

Tagged with

This week the Tag1 website got a new face. Notable is the new logo in the upper-left, as well the matching theme. Feel free to post your feedback as comments to this post.

This was my first go at Drupal theming (I started from an existing theme---not from scratch!) and it was fairly intuitive. However, I ran into a some issues that my web design colleagues constantly gripe about, as well as ones I wasn't expecting.

Scalable Linux Clusters with LVS, Part II


The second of the two-part Scalable Linux Clusters article. This second part features the addition of Heartbeat and ldirectord to the network started in Part I.

Scalable Linux Clusters with LVS, Part I


Whether you are perusing mailing lists or reading the large LVS-HOWTO, much of the available information on creating a high-availability Linux cluster can be difficult to sift through. Relevant information shares space with implementation details dating back to version 2.0 of the Linux kernel and comments that are often ambiguous at best.

This two-part article attempts to present an up-to-date tutorial on setting up LVS for your network, with an emphasis on designing a system that matches your infrastructure’s needs.