Complicated Linux Environment #2 -
By Erik Rodriguez
Tags: complex Linux servers, GFS clustering, poor Linux design, bad Linux administration,
overly-complicated environments, Source builds vs. RPM, Dell iDRAC, GFS Fencing Script, GFS vs. OCFS, Dell M1000e
This article contains information about my actual experiences with overly-complicated Linux environments. The previous articles in the series cover Complicated Linux Environment #1 and is followed by Complicated Linux Environment #3.
As a consultant, I have had the unfortunate experience of dealing with overly-complicated Linux environments. The previous complicated Linux environment I discussed was a 25 employee company with Linux servers running in their office. This instance covers a complete migration of an existing web environment from one data center to another. I worked as a consultant on this for about 8 months in 2008 to 2009.
Like environment #1, this was a local company in the Orlando, Florida area. Their web environment was running elsewhere. Texas if I recall correctly. For whatever reason, they wanted to move it to a local data center. They signed a contract with a small hosting provider that had a chop-shop style data center. They ended up purchasing a total of 7 servers and a SAN from this company. Come to find out, the servers were actually blades in a Dell m1000e chassis. The "SAN" was eventually exposed to be only a Dell Poweredge 2900 server with the Linux iSCSI project loaded on it. It had other LUNs that were shared among other customers in the same data center.
So right off the bat, we had a hairy hardware setup. The CTO requested these servers be configured as a "cluster." It was a cluster alright... It wasn't until some months later we discovered the make-shift SAN was getting beat to death by database requests and LUNs from other customers. It eventually caused a lot of problems. Problems included everything from database read/write errors to apache timeout.
The problem started with the CTO's strange requests. As an "acting" CTO, this guy came from another profession. He had some Linux experience, but seemed to be very dated. He insisted everything from Apache to mySQL be custom compiled from source. Upon moving from the old environment to the new one, numerous things did not work because the new environment was not complied with the same flags. A few years ago, I had an article on my blog about source build vs. RPMs. I basically came to the conclusion that unless you have a VERY specific need, the RPM method would work just fine. This was not one of those cases. The RPM versions of everything would have worked just fine, but the CTO insisted on making things more complicated.
Going back to the SAN issues, we decided that OCFS was not performing as we expected, so we got a real SAN (Dell Equallogic). I mounted the new SAN volume as an additional partition. I then formatted the newly install partitions as GFS (red hat). You can see this comparison between OCFS and GFS. OCFS and GFS were needed as more than 1 machine would be reading from the same data source. An intermediary (like NFS) is needed, but for the performance required, NFS doesn't cut it. In order to get GFS working via best practices. It required what is called a "fencing script." A fencing script is basically a script that runs when GFS detects a conflict within the GFS cluster (comprised of all servers reading from the same volume) and performs an action. In my case, I needed to "boot" the server causing a conflict. I was using a fencing script from Dell that integrated with Dell severs running DRACs.
Here is where the fun started. I was unable to access the DRACs for these servers because they were part of a blade server. The provider would not give me DRAC access because the chassis was shared with other customers. First, I have never heard of a hosting company using blade servers. Second, why would they not use VLANs to segregate customers? In the end, they ended up giving the DRAC access using an alternate username/password for the blades I needed access to. When the fencing script is configured with the DRAC cards, it effectively power cycles the server once it is "fenced" from the cluster. The server reboots automatically and rejoins the cluster… This concept was shaky at best in theory. In reality, it was even worse. After each server was rebooted several times per week, the local disk which contained the OS started forcing FSCKs, and so it should. Since when is power cycling a server part of normal operation?
Problem After Problem
As the SAN began to fill, I started moving content to Amazon's EC3. The providers supposed 1Gbps uplink ended up being 100 Mbps. Once I started offloading content, it must have completely saturated their pipe, effectively choking all ingress/egress traffic. The provider shutdown the VLAN (presumably to bring the rest of their network back online) and said they would have to make some "adjustments" if we were going to copy that much data. Once the adjustments were made, I was only able to move data at a mere 44 Mbps. After doing some investigating, I realized what they did was route all our traffic through a DS-3 that was presumably the backup for their primary 100 Mbps link. Basically this provider disabled their redundant connection for almost 2 weeks to give us the ability to copy all the data to the cloud and still operate the rest of their network. Shame, shame shame.
In the end, the blade servers caused numerous problems. Combine that with the fact the php code for the website was processing PDFs and using (at times) up to 32 GB of physical memory. The poor design of the site and code caused load averages on some of these servers to reach over 32. Yes, I said 32, and these servers were dual, quad-core Xeon CPUs with 32 GB of RAM each.
This is yet another example of poor management from the start. A "CTO" who had no business calling the shots and a provider that was as proficient as a chimp taking a multiple choice test. Every task with this project was a huge ordeal, and it got to the point where I simply did not want to answer the guys phone calls. At this point, it seems as if your time would be better spent shopping for someone to take the client off your hands.
NOTE: this form DOES NOT e-mail this article, it sends feedback to the author.