Google Apps Cloud Outage Lessons Learned

Written By: Chris Talbot
Jul 20, 2011

Google has confirmed that Google App Engine experienced a Java outage on the evening of July 14, 2011, causing chaos among various Java-based applications on the platform for about four and a half hours.

The outage began at 7 pm PT, at which point applications affected by the downtime experienced high latency and error rates. According to Google, approximately 1.9 percent of App Engine traffic was affected at peak. On the Google Developer Blog, the App Engine team noted that the outage began not too long after a scheduled maintenance period, but Google assured developers that the scheduled maintenance and unexpected outage were unrelated.

The service outage “gradually increased in magnitude over time” before Google engineers were dispatched to deal with the problem. It took Google’s engineers 2.5 hours, until 9:30 pm PT, to begin repairs to the Java App Engine, at first with the intention of reducing the impact of the outage.

The Java element of Google App Engine wasn’t fully back online until 11:30 pm PT, at which point all Java App Engine applications had been restored to normal operations. Google apologized for the outage and promised to look at its procedures to improve performance in the future.

“Overall reliability, quick return to service, and fast, accurate communication to our customers are some of the core goals of Google App Engine’s service offering. While we restored service relatively quickly, it’s clear to us that we fell short in prompt communication of status updates,” posted Wesley Chun, a member of the Google App Engine team, to the blog.

The team is still investigating the causes of the outage, but the blog post noted that it has a preliminary understanding of what caused the Java outage. More information is promised once the investigation has been completed.

This isn’t the first time that Google’s platform for developing cloud applications in its managed data centers has experienced an outage (an unfortunate reality in a business environment where cloud computing service providers are dodging accusations of unreliability).

Here’s a quick (and incomplete) history lesson in Google App Engine’s failures in recent years:

On February 24, 2010, Google App Engine applications experienced degraded operational states for varying amounts of time (from 20 minutes to two hours) between 7:48 am and 10:09 pm PT. The cause? A power failure in the primary data center. Engineers said it was a scenario that had been planned for, but not everyone on staff was aware of the processes.

On July 2, 2009, an outage that occurred between 6:45 am PT and 12:35 pm PT caused varying degrees of chaos for Google App Engine applications, from partial to complete application outages. The cause of the outage was a bug on the GFS Master server that Google stated was triggered by another client in the data center. An improperly formed file handle hadn’t been sanitized by the systems on the server side and caused a stack overflow when it was processed. Google later discovered the bug had been live for at least a year.

On June 17, 2008, Google App Engine was hit by a datastore outage at 6:30 am PT. According to Google, only a small number of requests were returned as errors, but the number of errors continued to increase throughout the morning until engineers isolated the incident at 1:40 pm PT. Problem solved, and another bug (this one affecting datastore servers) was found and dealt with.

In other areas of Google cloud computing, the company has often had to deal with surly customers complaining about Gmail or Google Apps outages, but it’s hardly a new tale in the realm of cloud. Google certainly isn’t the only cloud computing service provider that experiences its share of outages, and fingers can easily be pointed at unreliability in a variety of directions.

