12-24-2011 04:39 PM
I know that data centers sometimes have an outage. Can anyone who has been through one firsthand tell me what happened?
12-25-2011 10:34 PM
I have. Most extended outages are due to power. I did have a box at a data center in Florida that was down for over 12 hours. It ended up being an OSPF misconfiguration or something. Poor network administration. Power outages are a nightmare. There are so many devices in a data center that will need to be rebooted. Some of them haven't been rebooted in years, and because of that they usually don't come back up without some type of intervention by an admin.
Let's not forget about the hardware problems that result from power being cut off abruptly. Hard drives die, and some servers may not have been set in the BIOS to power back on after a power loss. It's a mess, especially if you are on the NOC staff. I'm sure Goldenrabbit will have something to say on this topic...
12-27-2011 02:49 PM
Outages are terrible. Most data centers and their staff do not have a good plan, or any plan at all, for serious outages. Network outages are usually isolated to an equipment failure. If an upstream provider goes down, a properly multi-homed data center will have BGP set up to route traffic through another provider. That is why it is important to choose a real data center and not a small provider with a single-homed network.
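To make the multi-homing point concrete, here is a minimal Python sketch. It is only a toy model and not real BGP: the provider names and the `pick_upstream` helper are made up for illustration, and actual failover happens on the routers through route withdrawal and path selection.

```python
from typing import Dict, Optional


def pick_upstream(upstreams: Dict[str, bool]) -> Optional[str]:
    """Return the first upstream whose link is up, or None if all are down.

    `upstreams` maps a hypothetical provider name to True when its link is up.
    Real BGP chooses paths by policy and route attributes; this only models
    the question "is there another way out?".
    """
    for name, is_up in upstreams.items():
        if is_up:
            return name
    return None


# Hypothetical scenario: provider_a has failed.
multi_homed = {"provider_a": False, "provider_b": True}
single_homed = {"provider_a": False}

print(pick_upstream(multi_homed))   # traffic can move to provider_b
print(pick_upstream(single_homed))  # None: nowhere left to send traffic
```

The single-homed case has no fallback at all, which is the whole argument for checking how many upstreams a provider really has.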
On the power side it's just bad news all around. Once a power failure happens you have no options besides reboots and repairs. Some UNIX/Linux boxes will boot into fsck... There is not much you can do once that process starts. Another bad kind of outage, sometimes worse than network or power, is temperature.
Once air handlers start to fail, the temperature skyrockets in a matter of minutes. A data center at 120 degrees is a recipe for disaster. If you do not lose hard drives immediately, you will notice a steady failure of drives over the next 6 months. Some devices will shut down automatically, but things like core network devices still need to operate.
I have seen a Cisco 6509 reach operating temperatures of 96 degrees Celsius!
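For a sense of scale, here is a small Python sketch that converts both figures so they can be compared. It assumes the 120 degree room temperature is in Fahrenheit, which the post does not actually state; the helper names are just for illustration.

```python
def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def c_to_f(deg_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9.0 / 5.0 + 32.0


room_f = 120.0    # room temperature from the post, ASSUMED to be Fahrenheit
chassis_c = 96.0  # Cisco 6509 reading from the post, stated in Celsius

print(f"Room:    {room_f:.0f} F = {f_to_c(room_f):.1f} C")
print(f"Chassis: {chassis_c:.0f} C = {c_to_f(chassis_c):.1f} F")
```

Under that assumption the room works out to roughly 49 C, and the chassis reading to a little over 200 F.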