SB/SV data center fire outage
#1
To stop derailing the updates and DWF web site threads, news and commentary about the situation can go here. There is a bit of good news to start things off: according to the Twitter feeds, new hardware for a temporary hosting solution has been lined up, and both sites should be back in some form in another few hours. Xon be praised!
--
‎noli esse culus
RE: SB/SV data center fire outage
#2
Replying to Rob Kelk's reply to me... 

I'm not an IT professional, but I would have thought that having a bit of separation between your potentially flammable power source and your servers would be a logical thing to do from a risk mitigation point of view. However, if you must have them in the same structure, at least invest in proper physical firewalls and zoned fire suppression systems. I suspect there are established best practices out there for the design of backup power supplies and fire suppression systems in data centers.

Not to mention the fact that they were using sprinklers instead of a halon system to deal with a potential fire. Water and unprotected electronics do not mix.
RE: SB/SV data center fire outage
#3
Both sites are back up, but there aren't any forums, just a blank page with drop-downs that lead to nothing.
RE: SB/SV data center fire outage
#4
I imagine reading the database out of whatever media the backups are on will take some time, even with the basic forum engine online. Especially if it's the traditional tape drive used for relatively long-term, linear-access bulk storage.

edit: Though SV is now fully up and running! Their DB is quite a bit smaller though, I expect.
--
‎noli esse culus
RE: SB/SV data center fire outage
#5
Who knew those secret fascist mod threads took up so much room!

Back on topic: I read somebody saying it wasn't so much the fire suppression system that caused the water damage as the fire department. Having lived through a house fire that left rooms all around the house ruined by water damage, I can confirm firefighters get real generous with the water spraying.
RE: SB/SV data center fire outage
#6
Well, their job is to make sure that fires are very definitely out, and not going to flare back up. Pretty much everything else is incidental, and it's hard to argue about that - it's a lot harder to restore data from a pile of aluminum oxide and soot than it is from a waterlogged hard drive.
--
‎noli esse culus
RE: SB/SV data center fire outage
#7
Yeah, the forum Pinside has also been down for a few days, and is still down, because of a datacenter outage. Would not be surprised if it's the same one.
"You know how parents tell you everything's going to fine, but you know they're lying to make you feel better? Everything's going to be fine." - The Doctor
RE: SB/SV data center fire outage
#8
And Spacebattles is back up. I didn't check the whole site, but Creative Writing seems to be ok.
RE: SB/SV data center fire outage
#9
Yeah, assuming the business in question survives having to make good on the outage and the resulting exodus of customers to other pastures. On top of that, getting new insurance is probably going to require a full audit of their rebuilt data center, and the backup generator stack for said data center will probably be REQUIRED to be in a different building. Any customer who demands that there be no external power lines that could be tampered with will likely have to be shown the door.

Also, they apparently will need to revisit their generator testing procedure in an effort to ensure that such a failure WILL happen during the test and not when people are off doing other things.
"You know how parents tell you everything's going to fine, but you know they're lying to make you feel better? Everything's going to be fine." - The Doctor
RE: SB/SV data center fire outage
#10
(04-07-2021, 06:07 PM)LynnInDenver Wrote: Also, they apparently will need to revisit their generator testing procedure in an effort to ensure that such a failure WILL happen during the test and not when people are off doing other things.

It was a "Catastrophic" failure.

Typically, these backup generators use huge diesel engines - like the ones you find on larger construction equipment - because they need power measured in masses of kilowatts to keep things going.

Let me show you what it looks like when a high-spec diesel engine fails "catastrophically". I've set the link to start where things get "interesting" (that being where the engine shows the first signs that not all is well).
https://youtu.be/lVALwRO3LvY?t=144

Granted, these generators probably weren't cranking THAT hard. But they can still be pretty wild. And all it would take is something that a normal load test would never in a million years indicate, such as a growing micro-fracture in a connecting rod. The only way you catch these things is if you do a tear-down for some other reason, and then happen to discover the crack.
RE: SB/SV data center fire outage
#11
Yeah, I get that it was a catastrophic failure, possibly involving a serious mechanical failure resulting in a rapid uncontrolled disassembly of the generator, and that the testing would likely never have caught it unless it was literally "cut main power and let stuff run on the UPS and then the generator for 30 minutes"... I was more talking about the fact that the fire department had to flood part of the data center itself to put out the fire, and that the fire had actually gotten into the data center from the generator room. Going forward, if the business survives, I expect they will have requirements from customers and insurance to have the generator room very much in a separate building from the data center, possibly with a mandated buffer zone.

Pinside has their software restored on a temporary server, but that's about it right now, and they confirm they were caught up in that disaster.
"You know how parents tell you everything's going to fine, but you know they're lying to make you feel better? Everything's going to be fine." - The Doctor
RE: SB/SV data center fire outage
#12
Article on The Register about the outage, with some interesting details about the entire incident.

https://www.theregister.com/2021/04/06/webnx_data_fire/