Forum:Server issues today

I'm putting this here temporarily because it's not in the mailing list archive yet...

Hi everyone,

As most of you know, we just had a site outage that lasted about 90 minutes. A particular query from one site was being repeated in massive volume... We had just made some changes to our own cache servers at the exact same time, so we lost a little debugging time figuring out that the cause wasn't something we changed but rather something new. When we found the query in question, we tried clearing out the backlog of query requests to the database server and restarting our Apache and database servers... but the queries would flood right back in and fill everything up.

So I've asked that the domain for that particular site be turned off and redirected to www.wikia.com ... it's a drastic measure... and one that I wouldn't take without extremely good cause. As soon as we did this, everything went back to normal on all wikia sites. We'll be working with the admins from that wiki to see what happened and will turn them on again as soon as we have this figured out.

Stepping back a bit, Wikia has been growing a lot over the last few months and we're seeing the need for architectural changes as well as more equipment. A few weeks ago, I ordered about $50k of equipment to beef up both our main site and our back-up colo. Those machines (cache servers, apache servers, and more databases) will be coming on-line shortly. We started rotating in a fast new cache server and apache server yesterday and this morning and will bring those in permanently within a few days. Same with the new database servers. That will help to address short-term speed issues. Architectural changes are needed to address this longer term and we're working on those, too.

As always, thanks for your patience, John Q.


Hope it's OK to respond here. Recently there was a proposal on a certain Wikia wiki to change hosts, seemingly just so they could have a different URL and guarantee maintainers who were "fans". The last paragraph above IS WHY ONE DOESN'T DO THAT. :>

As always, thanks for your industry and dedication. Ryan W 03:15, 21 September 2007 (UTC)

I guess that means Uncyclopedia is down?
I think when someone posted that UnNews story about Islam being fake or something, it ticked off the wrong people and somehow got an attack launched at the server? Any word on when Uncyclopedia.org will be back online? Orion Blastar 03:47, 21 September 2007 (UTC)


 * Dunno, apparently there's a backup on a mirror site but it's out of date? The last backups downloadable from Wikia were made on July 2 for all of these projects, including Uncyclopedia. --


 * We don't think this is an attack. We have full backups and nothing will be lost. Angela (talk) 04:11, 21 September 2007 (UTC)
 * According to what I have discerned through IRC (from Wikia staff and Uncyc admins) and on the wiki itself, the problem was related to a change in a template on the Main Page. Hilariously, the article that caused so much trouble has been found to be none other than the (in)famous Chuck Norris. I've been told that the problem is being tracked down, but it could be a while. Until then, the template has been disabled. --EugeneKay 15:30, 21 September 2007 (UTC)
 * In that case, just roundhouse-kick the servers and it should be fine. :) --66.102.80.239 17:28, 22 September 2007 (UTC)
 * Apparently there is a vote to undelete the English-language Uncyclopedia here, no idea if it will get much support though. --205.150.76.42 17:53, 21 September 2007 (UTC)

Servers down infrastructure
It took quite a while to get a real server working that gave a server-down message. I have been in such situations, and no one wants to mess with inconsequential matters when you have 60 sites down. I know this kind of thing will be rare and is part of growing pains, but it would be nice if your support techs had a script that could switch over to the server-down message quickly and give a check-back time ("Looks really confusing, check back in 20 minutes"; "Someone tripped over a power cord, check back in 5"). Just a thought.

Nice recovery though. Amazing that no data was lost. ~  Ph l o x  17:20, 1 October 2007 (UTC)
 * Case in point: as of the time of this writing, the wikias have been down for 20 minutes and I'm still getting 401 errors.

Sure is nice that the wikia.com site doesn't go down when the others do. ~  Ph l o x  19:16, 1 October 2007 (UTC)
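
For what it's worth, the "server down message with a check-back time" idea above could be sketched roughly like this. It is only an illustration, not anything Wikia actually runs: the function name `build_outage_response` and its parameters are made up for the example. The one standard piece is the HTTP 503 status with a `Retry-After` header, which tells browsers and crawlers how many seconds to wait before trying again.

```python
from http import HTTPStatus


def build_outage_response(reason: str, check_back_minutes: int) -> tuple[int, dict[str, str], str]:
    """Build a (status, headers, body) triple for a temporary outage page.

    Hypothetical helper for illustration only; the name and signature
    are assumptions, not part of any real Wikia tooling.
    """
    if check_back_minutes <= 0:
        raise ValueError("check_back_minutes must be positive")

    # Human-readable message with the estimated check-back time.
    body = (
        f"Sorry, the site is down: {reason}. "
        f"Please check back in {check_back_minutes} minutes.\n"
    )
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        # Retry-After takes a delay in seconds.
        "Retry-After": str(check_back_minutes * 60),
    }
    # 503 Service Unavailable signals a temporary condition.
    return HTTPStatus.SERVICE_UNAVAILABLE.value, headers, body


if __name__ == "__main__":
    status, headers, body = build_outage_response("someone tripped over a power cord", 5)
    print(status, headers["Retry-After"])
    print(body, end="")
```

A support tech would only need to supply the reason and the estimate; whatever serves the page would send the status, headers, and body as-is.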

Down wikis
Anyone have an idea how long some of the down wikis will be down? I was trying to update an article on the Hogan's Heroes Wiki when it went down again, about a half hour to 45 minutes ago. It isn't nice being made to redo work because of a problem someone else is causing. Leoni2 19:37, 1 October 2007 (UTC)