I have this graph up on my screen all the time. It should be flat. This week has been rough. We've gone through our various databases, caches, web servers, daemons, and despite some increased traffic activity across the board, all systems are running nominally. The truth is we're not sure what's happening. It seems to be occurring in between these parts.
We're busy working on instrumenting and adding meters to provide visibility into what's slowing Twitter down. We'll use this data both to alleviate the current woes and to help inform our long-term architecture work to make Twitter a utility service people can count on. We've definitely failed that aim this week.
Thanks for your patience during these current frustrations (and those to come) as we figure out how to work the kinks out. Thanks also for speaking up: we're listening. In addition to providing visibility into our systems, we're working to give everyone greater visibility into our roadmap to solve these ongoing problems. More to come.
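For readers wondering what "instrumenting and adding meters" can look like in practice, here is a minimal sketch. It is not Twitter's actual instrumentation: the names `meter`, `timings`, and `summary` are hypothetical, and the idea is only to time each hop between components so that delays occurring between the parts become visible.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# label -> list of elapsed seconds, one entry per metered call
timings = defaultdict(list)

@contextmanager
def meter(label, clock=time.perf_counter):
    """Record how long the wrapped block takes under the given label."""
    start = clock()
    try:
        yield
    finally:
        # Runs even if the block raises, so failures are timed too.
        timings[label].append(clock() - start)

def summary():
    """Return {label: (call count, slowest call in seconds)}."""
    return {label: (len(values), max(values)) for label, values in timings.items()}
```

Wrapping each cross-component call, for example `with meter("web -> cache"): ...`, and comparing the slowest entries per label is one way to narrow down where the time goes.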

77 Comments:
Best of luck to you guys. :)
Awesome! Sincere thanks! :)
Is there any particular reason why it's so 'spiked'? Could a DoS attack be a possible reason?
Any news is good news in my book. Thanks for the transparency.
Rock on guys!
Suggestion: Spend some time on Twitter explaining what is happening and what you are doing to correct it. Almost no one comes here to read this blog. They stay on Twitter wondering why Twitter is down with no explanation for what the problem is.
Go to where your community is, don't expect us to come to you. Simply taking 5 mins to put that on Twitter would have quelled a LOT of the 'I've had it with Twitter, what else is out there?' complaining that's running rampant right now.
Just take a few minutes to show your community that you respect their thoughts and concerns.
Sorry, I don't see any real visibility.
I've found some strange things with the API and whether I report them on the google forums or directly through your contact page all I ever get is silence.
Now, I'm willing to assume I'm doing something wrong but if your documentation is at fault or the API isn't working correctly then shut off the defective parts and tell the developers! Don't let us waste a day of frustration trying to figure out what we are doing wrong!
Yes, I realize you're busy trying to fix things, but a lot of developers and users are getting really tired of the problems that never seem to go away.
I like twitter a great deal and use it many times a day, but right now after wasting a day on the sense parameter of the friends User Method I'd jump at the chance to use another micro messaging service if they had an API that was better documented for me and was much more stable for the user!
I know of 4 groups that see a window for a twitter replacement and are jumping in. And if I know of 4 that means there could be 10's or 100's of teams waiting for the chance!
You just got a load of money; please hire more and better developers before it's too late and you're left on the side of the road wondering what happened to your user base.
Thank you for the communication! That helps a lot. :)
Hang in there, we will!
I had the same initial thought as Timothy Neilsen -- the chart made me wonder if it was a DoS attack.
Dump ROR, use something that is more scalable.
Very good post, Jack. Thanks to @jowyang for pointing this out. we complain a bit, but we keep coming back to twit some more. You've built an awesome service - what great service has not experienced growing pains? I'd be worried if there were _no_ bugs. :-)
@alex_landefeld
As much as I'd like to scream and shout, today has been full of unexplained weirdness on the infrastructure front for me as well. Our Wordpress instance flatlined at 100% CPU usage (taking the rest of the virtual servers with it), during a huge strategic launch, our Google search appliance just bonked, and who knows what's next. Spent hours poring over logs and analyzing data, and we're still scratching our heads.
So hey folks, give the Twitter team a break, sh*t happens, it's often not apparent where the core fault resides, and dealing with pissy customers destroys the zen necessary to diagnose hugely complex problems.
Good Luck !
Please continue *:-) I luv Twitter, n I appreciate that you understand the "adaptive perseverance" model.
Guys, thanks for the update. I know it's been a rough couple of weeks for ya. Keep up the good work and thanks for providing a fantabulously fun (and FREE) service.
Kyle
@chownage
Many thanks, and good luck!
thanks for sharing. it's not easy beta-testing in public..
just wondering, am i following @twitter_status in vain?
I think you guys are fantastic, even with the downtime. Thank you for the update.
Like Bartlett said... "Thanks for the transparency"
It's helpful for a community to stay informed, and considering this community of early adopters... the best possible stance you could take. Kudos!
you guys/gals all rock! thanks for the great service and best of luck
Now that's what I like to hear! Great job communicating with the twittersphere. People give you a lot of flak (myself included) for the down times, but communication and honesty are what is going to keep us coming back to twitter.
excellent post!
I love twitter, but I don't see how much longer people are going to stay around with outages or near outages this frequently. i sincerely hope things get worked out as I don't want twitter to die.
Thanks for the updates guys, keep em coming.
Rome wasn't built in a day, as they say.
I have every confidence you'll get Twitter stable and rocking.
Keep at it, we appreciate your efforts.
And thanks for the updates here.
good luck!
Hey guys,
even though maybe the tech execution hasn't been, your transparency and humble attitude in this blog post is right on target.
Thank you for that.
It's what works in Web2.0, and everybody else should take note. False posturing or spin doesn't get anyone very far anymore. Ironically, it ends up being just as transparent.
Never be too proud to ask for help or understanding, as long as there is a clear path to resolution.
That said, is there one? A lot of people have started to depend on Twitter and you must know that you have garnered considerable mind-share.
But the Theory of Constraints implies that many/most entrepreneurs tend to stand in their own way by insisting that they do more by themselves than they really should. Realize that the skills that got you this far are not the same that will sustain you from here.
I hear that you just got more funding which is a good start. Or is it time to go big time and be bought or partner up? I would love for Twitter to be a permanent fixture of the Internet infrastructure. Maybe it's time to put this thing on a serious cloud footing. Wink, wink, nudge, nudge :)
I'm not sure why congratulations are in order. This is a bad sign of things yet to come.
In truth, I appreciate the honesty. However, Release cycles can be your friend. Think about it.
Thanks for sharing that! We love Twitter--and I'm confident you all will make it work just right. :-)
Hang in there guys... and THANK YOU for opening up and letting us know that you are working on it, even if there's no readily apparent solution.
What does that graph represent tho? Queues to the database server? Packet response times? Bandwidth?
I mean, it does seem as though you've got spikes that tie directly to hardware changes...
Well, whatever the case - hoping it resolves quickly - we're behind you.
Thanks for the communication. Spreading the word.
You guys have a network of smart people around you who don't work on twitter. You've got personal networks to help you get people in who might be able to help.
If your devs don't have a clue what's causing the bottlenecks - then go outside and just ask for help.
:)
Hey, I know you guys take a lot of heat when this stuff happens but don't worry about it, people here in America are spoiled by free services. Best of luck, you guys have a great thing going here.
You guys might consider www.gomez.com
they have a pretty good suite of tools to diagnose issues in terms of source and performance.
This is great and all we ask: keep us apprised of what's happening and keep it frank. Furthermore, there are tons of really smart people on twitter with combined centuries of experience upon which to tap (myself included). Let us know if we can help. - @maslowbeer
Thank you for the post... I appreciate the honesty.
From the comments here it looks like the community at large is ready to test any combination of notices of downtime that will work for your team...
Note on this blog
custom 404 page
warning messages when outages are coming (the ones over the last week have been great!)
One emergency tweet to all as the service goes dark :-)
Ya'll dream it... we're here to tweet back at ya!!
Keep on truckin
You guys rock so much! Keep the Twittery goodness coming!
Three things to try:
1) Turn off autonegotiation on your internal network. Some cards/routers/drivers have flaky implementations that at high loads cause the available bandwidth to drop out.
2) Get a managed switch which will show you network load.
3) Run a packet sniffer to see what's floating around on your net and feed the output through some kind of filter to measure packet load over time for various types.
If it's happening between the gaps it's most likely a net thing, as it has been in the times I've had similar problems when pushing interconnected systems.
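The filtering step in the last suggestion could be sketched roughly like this. It assumes the sniffer output has already been parsed into `(timestamp, type, size)` tuples; the function name `packet_load`, the record layout, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

def packet_load(records, bucket_seconds=60):
    """Sum bytes per (time bucket, packet type).

    records: iterable of (timestamp_seconds, packet_type, size_bytes).
    Returns {bucket_start_seconds: Counter({packet_type: total_bytes})}.
    """
    load = defaultdict(Counter)
    for timestamp, packet_type, size in records:
        # Round the timestamp down to the start of its bucket.
        bucket_start = int(timestamp // bucket_seconds) * bucket_seconds
        load[bucket_start][packet_type] += size
    return dict(load)

# Made-up capture: (seconds since start, packet type, bytes)
sample = [
    (1.0, "tcp", 1500),
    (12.5, "udp", 200),
    (59.9, "tcp", 400),
    (61.0, "arp", 60),
    (75.2, "tcp", 1500),
]

for bucket_start, by_type in sorted(packet_load(sample).items()):
    print(bucket_start, dict(by_type))
```

Plotting the per-type totals against time would show whether one kind of traffic lines up with the spikes.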
I agree with what's been said before about following twitter_status. Guys, you got to eat your own dog-food! There just has to be a tweet when a blog is posted! And there are already services out providing that functionality, so go + adopt :)
Thanks for the transparency. It would help you if you tweet the updates, so we know. But thanks for letting us know about your growing pains. I like the system and will continue to hang in there.
Just looking at some of these comments makes me angry. Just the absolute arrogance of some of these people. These people are doing this and we pay nothing. NOTHING. The moment they start charging and it continues to fail, you can all get on your high horses. I like Anon's post below. There are people already thinking about making their own? Hah, good luck with that.
i find it funny some people had this "twit out" on Friend-feed, only to have friend feed play up on them. That was funny.
Good luck guys. We know you're working on it. Twitter rocks and I'd pay for it no question - and have been asking other people if they would too . . .
Would you pay £12/$24 a year to use Twitter? Vote here
Been using it only for a few days and you guys have already got me hooked... keeping me up till 3 in the morning, just to watch! Despite the glitches and all, I'm still in and waiting for you to sort them out.
Mercury goes retrograde any day and that is the worst time for technology. throw in the awesome full moon and you have a graph like that.
Someone having the guts to openly say that they don't know what's happening. No bullshit, no false explanations. Guys, you rock!
it are ok.happens.. keep patience. we
Just curious: what's the graph of?
Even this "we don't know" post is better than silence. Most appreciated!
Good luck figuring this one out. But it sounds more like a fundamental flaw with the architecture and/or technology than something you can fix. This may be proof that RoR isn't such a great platform after all...
I am a Twitter fan. I've invested in a website devoted to blogging about Twitter. Of course, many developers and users have invested hugely as well. The openness is appreciated, and necessary. The bottom line of course is to sort out the problems, if Twitter is to realize its true potential and to beat out the competition.
Thanks for this insight into what might be going on.
When I joined Twitter, I wondered what form (the unfortunately inevitable) Twitter spam would take. It seems to be what exador23
[http://tinyurl.com/54l37g]
calls mass stalker bots - yet another form of Massively Antisocial Behaviour - which results in DoS, whether intended or not.
I wish you success in detecting and blocking this kind of useless traffic on your system.
The Twitter API is restricted to < 70 polls/minute: perhaps some more fine tuning of this nature might help?
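As a rough illustration of per-client poll limiting of this kind, here is a minimal sliding-window sketch. The class name `SlidingWindowLimiter` and its parameters are hypothetical, the default of 70 per 60 seconds is just the figure quoted above, and nothing here reflects how Twitter actually enforces its limit.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_polls accepted polls per client in any window."""

    def __init__(self, max_polls=70, window_seconds=60.0):
        self.max_polls = max_polls
        self.window_seconds = window_seconds
        self._recent = {}  # client id -> deque of accepted poll timestamps

    def allow(self, client_id, now):
        """Return True and record the poll if the client is under its limit."""
        recent = self._recent.setdefault(client_id, deque())
        # Forget polls that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_polls:
            return False
        recent.append(now)
        return True
```

In real use `now` would come from something like `time.monotonic()`; passing it in explicitly keeps the sketch easy to test. Tuning `max_polls` per client type is one place the "fine tuning" could happen.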
I second all suggestions that you compile some appropriate error messages that appear on twitter.com whenever it's in trouble. This would save frustrated users having to guess for themselves if the problem is Twitter, or their Internet connection, or just Windows in general.
And please, please, please: As much transparency as you can afford!
Thanks for the info. You guys are being transparent and asking for help.
That's amazing.
Hope you crack it!
Twitter Executives:
Example, surprised me, of my constituents (staff) actually *praised* my team *in spite* of the problems.
>>Fast Facts<<
Would be great for us out in Twitterland to gain an appreciation of the VOLUME, TRAFFIC, etc. I know I would appreciate your efforts even more than I already do -- just a few key facts.
>>Staff Response<<
Your hard work is appreciated. I can't imagine life without you guys and having to plow through all the garbage everyday.
I also appreciate your giving us the spam figures.
-----Original Message-----
From: Jeff Bundy
Sent: Friday, May 16, 2008 11:25 AM
To: Jeff Bundy
Subject: SPAM/"Undeliverable" email received messages...
RE: SPAM/”Undeliverable” email received messages…
We are aware.
Please simply delete/ignore them.
Every protection is in place and active.
Unfortunately, it's an on-going battle that ebbs and flows. Clearly some of the "bad guys" are succeeding in breaking through with some harmless, albeit annoying, messages. The good news is security is screening as much as it can, but it's never 100% perfect.
Thank you for bringing this to our attention as it did prompt us to double-check the protection mechanisms in place.
Fast Facts:
In the last 60 seconds: 244 SPAM messages quarantined. (1 every ¼ second!)
In the last 60 minutes: 11,720 SPAM messages quarantined.
Strange load pattern. It peaks in the evening time on the 19th, then in the morning on the 20th, then the entire evening again. On the 21st, it peaks in the - morning again? - nope, flat in the morning, peak at mid-afternoon, slammed in the evening again.
My guess is that it's some third-party website. I would just check the API logs.
OTOH, php|tek conference in Chicago started on the 20th.
Cool, keep up the great work guys.
You didn't have metering and loading metrics before... what exactly has your SI team been occupied with all these months?
I'm sorry that so many people have been so mean to you over this. It's wrong and immature. Sure I've been frustrated but not at you, at the whole wonky system. I hope it gets better and I hope y'all can find some peace in it. Good luck with it.
As for the mean folks - the service is free. No one promised that it would always be perfect and no one said that the admins have to do anything to either inform anyone or to fix it. They do both because they are trying and they do give a crap. Think on that before y'all howl with indignation.
P.S.
It would be nice if y'all did tell us a bit more though, really. We're as frustrated as you and there are many techs in our "family" here, maybe someone could even help...
It's really nice to hear an honest answer from a service, even when the answer is "we don't know the answer."
I do agree with Mack Collier's comment. I wouldn't have checked the post if Dwight Silverman hadn't Twittered it!
I hope to heck you guys are using something like Splunk to look at all that log data, which invariably, is where the first signs of problems start.
dtrace, dtrace, dtrace dudes.
Why not hire Mike Arrington? He seems to have it all figured out, if you can pull him out of his bipolar funk that's directly tied to twitter downtime.
wow! you are amazing. thank you for admitting your problems, it makes me so warm and happy. good luck!
Good Zeus! I know someone who knows. It's like living with Tiresias. And he's got the road map to your roller coaster.
Good luck.
I know you are taking a lot of flak this week, but communicating with us users as to what is going on helps us be more sympathetic towards you. We love twitter, and we enjoy being part of the process, even if it is just knowing what's up! Keep up the good work! Twitter really is an amazing thing!
If only you still had blaine around to track down these problems and fix them.
Thanks for the response, Jack. I know how much it sucks when you're trying to fix a problem you can't source while your user base is screaming at you to fix it. That being said, here are some suggestions to improve the user experience in the case of a failure:
1) "Something is technically wrong?" Thanks, I figured that out by myself when my home page didn't load. As soon as I figure out that there's an issue with the site, all I care about is knowing when the site will be working again, and maybe (to indulge my own nerdiness) the cause of the issue. That's it. I'd suggest eliminating the crappy copy and replacing it with something that makes you guys look like you're doing something other than spinning your wheels.
2) A blog post explaining the situation as early as possible. Even if you have no idea what the issue is, and are just investigating possible causes, post it so that I know that the Twitter team is aware that the site is down. Keep it up to date as the status of the problem changes ("under investigation" -> "problem with XXX switch or YYY database" -> "ETA 1 hour" -> "fixed").
3) Write up a post-mortem in the event of extended downtime ( > 2 hrs, say) describing the circumstances of the downtime.
There will always be downtime. It cannot be written off, or ignored, or swept under the rug. So, I say embrace it and keep your users armed with information.
thanks for keeping us posted. i hope you find and fix the problem soon
I've moved on to brightkite. In Web2.0 there's always an alternative and very loose loyalty. Sorry, but not too sorry.
Pretty Cool man! Best of luck to you guys!
JT
http://www.iurlz.com/demtools
Thanks, for a great product.
It seems weird that for the last week or so at about 4 or 5 PM EDT (U.S.), that Twitter has been starting to slow down to the point I can't get to see my messages until late into the evening.
Could it have to do with the fact that all the Japanese twitter users are getting up about then (5 or 6 AM)? Plus you still have the British finishing up their days and the U.S. is awake.
It is like max user and peak time.
jfc iii
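The time-zone overlap described in the comment above can be checked with a short sketch. The fixed offsets are assumptions that hold for late May 2008, when daylight saving time was in effect in the U.S. and the U.K.; the function name `local_times` and the choice of cities are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets, assumed valid for late May 2008 (DST in effect).
EDT = timezone(timedelta(hours=-4))
ZONES = {
    "Tokyo (JST)": timezone(timedelta(hours=9)),
    "London (BST)": timezone(timedelta(hours=1)),
    "San Francisco (PDT)": timezone(timedelta(hours=-7)),
}

def local_times(hour_edt, day=datetime(2008, 5, 21)):
    """Map an hour of the day in EDT to the local clock time in each zone."""
    moment = day.replace(hour=hour_edt, tzinfo=EDT)
    return {
        name: moment.astimezone(zone).strftime("%H:%M")
        for name, zone in ZONES.items()
    }

for hour in (16, 17):  # 4 PM and 5 PM EDT
    print(hour, local_times(hour))
```

With these offsets, 4 PM EDT is 05:00 in Tokyo, 21:00 in London, and 13:00 in San Francisco, which is consistent with the overlap the comment describes.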
Ok, JFC said something that sort of makes sense to me. As I read the whole post and looked at your graphics, I wondered: What if twitter was broken up by location? This could minimize the load as a whole, and be more manageable, but I guess all this has probably already been done..
Wish any of us could come up with a good solution.
Just don't stress out, take it easy and enjoy, troubles are always interesting to solve.
I love Twitter. Warts and all.
Part of the Twitter scaling problem is very perceptual, and can be dealt with fairly simply, IMO. Part of the reason I get frustrated is because when Twitter is down, the whole thing is down. I can be logged out and the home page won't load: There's no reason that page should ever be slow, it's entirely static content.
When twitter has been down this week, you couldn't load individual tweets from their perma-urls. Besides a privacy check, that page should be entirely cacheable.
The asset servers have been bogging down this week.
I tried to load the API documentation at work today and couldn't get it to load.
I didn't even try to view my timeline much today, but the application found several ways to prevent me from using its peripheral services. This makes the outage that much more apparent.
All of these things can be dealt with through a little bit of segregation of static from dynamic content, and the perma-tweet case is so crazy-simple to scale even I could do it.
By dealing with this low hanging fruit you can reduce and constrain where your site is failing: some twitter is better than no twitter. These sorts of simple improvements could be your first steps into implementing a proper limp-mode. One day I would hope that twitter would be able to respond to every one of my requests in a timely manner, even if the messaging system is inoperable. Having to wait for 30 seconds and have the API documentation time out on me is just frustrating.
xoxo twitter!
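A minimal sketch of the perma-tweet caching and "limp mode" idea from the comment above, under stated assumptions: `PageCache` and its `render` callback are hypothetical names, the privacy check the commenter mentions is left out, and a real deployment would more likely sit in a separate caching layer than in application code.

```python
class PageCache:
    """Cache rendered pages and fall back to a stale copy if rendering fails."""

    def __init__(self, render, ttl_seconds=300.0):
        self._render = render          # callable: key -> page text; may raise
        self.ttl_seconds = ttl_seconds
        self._pages = {}               # key -> (time stored, page text)

    def get(self, key, now):
        """Return (page, status), where status is 'cached', 'fresh' or 'stale'."""
        entry = self._pages.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1], "cached"  # young enough: skip the backend entirely
        try:
            page = self._render(key)
        except Exception:
            if entry is not None:
                return entry[1], "stale"  # limp mode: old copy beats an error
            raise                         # nothing cached yet, so nothing to serve
        self._pages[key] = (now, page)
        return page, "fresh"
```

The design choice is that a backend failure only surfaces to the user when no copy of the page has ever been rendered, which matches the "some twitter is better than no twitter" point.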
Sorry to hear about the spikes. But I'm really happy that you guys are finally communicating about the various issues. I was getting tired of all the 'Twitter doesn't care' posts.
I think if people can expect to find information about what's going on at Twitter on your blog, it takes some strain off Twitter, and still provides information. But one thing, it means you guys have become necessary in people's lives.
So I wish you the best; and here's hoping for increased stability.
In my opinion Twitter works in a very strange way. Can I update scripts, videos and photos like in other blogs?
Just another thank you for letting us know what's going on.