Until recently, I was a student employee at the Oregon State University Open Source Lab. My career there ended, like many, with that painful process known as graduation. I got invaluable experience at the lab, not the least of which was the knowledge gained as their main (only) database administrator. One of my great pleasures in that position was learning how to configure MySQL replication and manage clusters of replicating database servers. Even the simple case of a single master and a single slave has its edge cases.
Blogs
Chapter 1 Rough Draft Complete
Submitted by jeremy on Sun, 07/27/2008 - 16:27.
I have completed a rough draft of the first chapter of "Drupal Performance and Scalability". The first chapter of this online book is divided into four sections. The first focuses on the importance of fully defining your performance and scalability goals, helping you to identify what you need to accomplish and how to set concrete and attainable goals. The second section discusses monitoring and measuring your ongoing progress, helping you decide what you need to monitor, and how to monitor it. The third section stresses the importance of making regular backups, discussing what needs to be backed up, and offering example scripts for backing up your entire website, including the database. Finally, the fourth section takes an in-depth look at using revision control tools to manage your website, providing useful recipes showing how Git can track changes to your website, helping you update to new releases and push those updates into production.
It is important to realize that this is a rough draft, and as such it may contain spelling or grammatical errors, it may be missing key points, and the writing style may not be very polished. However, the book has to start somewhere, and this is the first step toward the end goal of publishing a useful and freely available online resource. I welcome all criticisms, suggestions and feedback. If you find errors in the text or have specific comments, you can help with this writing project by posting your feedback on the appropriate page. The current status of this project is tracked here.
Online Performance and Scalability Book
Submitted by jeremy on Fri, 07/18/2008 - 09:32.
Tag1 Consulting is focused on improving Drupal's performance and scalability. We also believe that when information is freely shared, everyone wins. Toward these ends, we are working on an online book titled "Drupal Performance and Scalability". The book is divided into five main sections: Drupal Performance, Front End Performance, Improved Caching and Searching, Optimizing the Database Layer, and Drupal In The Cloud. The book is primarily aimed toward users running Drupal on the LAMP stack, with chapters applicable to everything from low-end shared hosts to large-scale multi-server installations.
By publishing online, we aim to encourage you to participate in the book writing process as an editor and a technical reviewer. You will currently find the book's complete outline online, along with descriptions of each planned section and chapter. As the book evolves, it will continue to be updated online in real time. We encourage you to post comments with suggestions, critical feedback, grammatical corrections, or anything else relevant to our ongoing effort.
Comparing Xapian and Drupal 5's Core Search
Submitted by jeremy on Wed, 07/09/2008 - 15:09.
SearchBench has received a couple of useful updates since yesterday's initial cloud tests. It can generate search queries based on actual content, and it can export search benchmark results. With these features, it is now possible to use SearchBench to perform some actual performance comparisons.
Once again I set up these tests on an extra large EC2 instance. I still have not performed any tuning, and I continue to compare Drupal 5 core search with Xapian search. My initial benchmarks show that Xapian offers a very significant 6x+ performance advantage over Drupal's core search when a given search query actually returns results. In addition, Xapian is able to index a large site in about a third of the time taken by Drupal 5's built-in search. Read on for actual benchmark results and graphs.
SearchBench In The Cloud
Submitted by jeremy on Tue, 07/08/2008 - 19:56.
I ran some initial Drupal search benchmarks with SearchBench on Amazon's EC2 cloud service. These first tests were primarily focused on confirming that SearchBench and EC2 are a good match. They utilized a single server instance, and did not include any server tuning.
I used the devel module to create 5,000 random nodes and 10,000 random comments. I indexed this content both with Drupal's core search module, and with the contributed Xapian module. I then used SearchBench to create 1,000 random search queries with one to ten words in each query, with phrasing and negation set to random. Finally, I ran the identical search test three times in a row, comparing Xapian's performance to Drupal's core search performance. I was impressed to see how well Drupal's core search performed in these tests, and plan many more tests to better understand the strengths and weaknesses of each search technology.
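To give a feel for the kind of query set described above, here is a minimal Python sketch. It is not SearchBench's code (SearchBench is a Drupal module); the function name `make_queries`, the word list, and the 20% phrase and 10% negation probabilities are all assumptions chosen for illustration. It builds queries of one to ten words, randomly quoting adjacent words as a phrase and randomly negating terms with a leading `-`.

```python
import random


def make_queries(vocabulary, count=1000, min_words=1, max_words=10, seed=None):
    """Build `count` random search queries from `vocabulary` (hypothetical helper).

    Each query contains between min_words and max_words words. Phrasing
    (quoting two adjacent words) and negation (a leading '-') are applied
    at random; the probabilities below are illustrative assumptions.
    """
    rng = random.Random(seed)
    queries = []
    for _ in range(count):
        n = rng.randint(min_words, max_words)
        words = [rng.choice(vocabulary) for _ in range(n)]
        terms = []
        i = 0
        while i < len(words):
            # Randomly join two adjacent words into a quoted phrase.
            if i + 1 < len(words) and rng.random() < 0.2:
                term = '"%s %s"' % (words[i], words[i + 1])
                i += 2
            else:
                term = words[i]
                i += 1
            # Randomly negate a term, but never the first one, so every
            # query keeps at least one positive term.
            if terms and rng.random() < 0.1:
                term = "-" + term
            terms.append(term)
        queries.append(" ".join(terms))
    return queries
```

Passing a fixed `seed` makes the generated set reproducible, which matters when the same queries must be replayed against two different search backends.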
Introducing SearchBench
Submitted by jeremy on Sun, 06/29/2008 - 18:26.
There have been some ongoing scalability issues affecting Drupal.org's built-in search functionality for some time now. Less interested in outsourcing search to a big black box such as Google, I spent some time helping clean up the Xapian module, making it possible to completely replace Drupal's built-in SQL-powered search functionality with a Xapian-powered engine. With the basic search functionality complete, there was still a need to actually compare the performance of the two solutions.
Toward this goal, over the weekend I launched a new project called SearchBench, a Drupal module for benchmarking Drupal's search performance. As the module evolves, I hope it will prove extremely useful for comparing the performance and scalability of the many free and open source search options available to Drupal powered websites.
New Site Design
Submitted by emsearcy on Wed, 06/25/2008 - 13:33.
This week the Tag1 website got a new face. Notable is the new logo in the upper-left, as well as the matching theme. Feel free to post your feedback as comments to this post.
This was my first go at Drupal theming (I started from an existing theme, not from scratch!) and it was fairly intuitive. However, I ran into some issues that my web design colleagues constantly gripe about, as well as ones I wasn't expecting.
Additional kernel modules on EC2
Submitted by emsearcy on Wed, 05/28/2008 - 01:41.
Continuing my plans to set up an IPVS high-availability LAMP stack on EC2, I needed to add the kernel modules for IPVS. I have been using the CentOS machine images provided by RightScale, which have unneeded services disabled and, although they are set up to work with RightScale's software, work very well for general use. Unfortunately, the IPVS kernel modules are not among those pre-installed on the AMI.
Why Drupal.org Should Join the Ad Bard Network
Submitted by jeremy on Mon, 05/26/2008 - 11:49.
The Ad Bard Network was conceived because I have a need for relevant, non-obnoxious advertisements on my website, KernelTrap.org. I have maintained KernelTrap for many years, as a hobby in my spare time, and as a way to stay involved in the open source world. I enjoy this hobby, but it requires a lot of time and commitment keeping the website updated every day. I've long dreamed of finding a way to make a little income to help justify the time I invest into my hobby.
Displaying advertisements on KernelTrap has a lot of potential for earning income, but I failed to find an advertising network that was compatible with my beliefs and requirements. I need an ad network that won't flood my website with animated GIFs, Flash videos and pop-ups. I want to know exactly what information is being collected about my readers. I want to earn a fair share of the profits, and to know how much the advertising network is making off my website. I want to be fully in control of what types of ads and what specific ads appear on my website. And the ads need to load extremely quickly, not slowing down my web pages or loading scripts within scripts within scripts.
The Ad Bard Network has grown out of these needs, already exceeding my own requirements and becoming a viable and useful fundraising mechanism for all free and open source projects and websites.
Achieving high availability on EC2
Submitted by emsearcy on Thu, 05/22/2008 - 03:16.
This past week I've had the good fortune to have some spare time to play around with Amazon's Elastic Compute Cloud (EC2). I'm pretty interested in the potential for scaling the LAMP stack by having a programmable cluster at the service of your box.
