Saturday, February 26, 2011

Where does my data live?



Have you ever wondered what happens when you upload a photo to Picasa, or where all your Gmail messages and YouTube videos are stored? How is it that you can read or watch them from anywhere, at any time?

If you stored your data on a single hard disk, like the one in your personal computer, the disk would eventually fail and your data would be lost forever. To protect your data from such a failure, you can store copies across many different disks, so that if any one of them fails you can simply read the data from another.
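
To make that concrete, here is a minimal sketch in Python with an invented failure rate, showing why extra copies help when failures are independent; the next paragraphs explain why real systems cannot rely on that independence.

```python
# Back-of-the-envelope sketch: if disk failures were independent, the chance
# of losing every copy would shrink exponentially with the number of replicas.
# The 4% annual failure rate below is an invented, illustrative number.

def p_all_replicas_lost(p_disk_fail: float, replicas: int) -> float:
    """Probability that every replica is lost, assuming independent failures."""
    return p_disk_fail ** replicas

for r in (1, 2, 3):
    print(f"{r} replica(s): P(all copies lost) = {p_all_replicas_lost(0.04, r):.6f}")
# 1 replica(s): P(all copies lost) = 0.040000
# 2 replica(s): P(all copies lost) = 0.001600
# 3 replica(s): P(all copies lost) = 0.000064
```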

However, once storage systems get large enough, anything and everything can and does go wrong. You have to plan not just for disk failures but for server, network, and entire datacenter failures. Add to this software bugs and maintenance operations and you have a whole lot more failures.

Using measurements from dozens of Google data centers, we found that almost-simultaneous failure of many servers in a data center has the greatest impact on availability. On the other hand, disk failures have relatively little impact because our systems are specifically designed to cope with these failures.

Once you have a model of failures, you can also look at the impact of various design choices. Where exactly should you place your data replicas? How fast do you need to recover from losing a disk or server? What encoding scheme or number of replicas is enough, given a desired level of availability? For example, we found that storing data across multiple data centers reduces data unavailability by many orders of magnitude compared to keeping the same number of replicas in a single data center. The added complexity and potentially slower recovery are worth it to get better availability, use less storage space, or even both at the same time.
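
As a toy illustration of why placement matters, the Monte Carlo sketch below compares three replicas in one data center against three replicas spread across data centers. All of the failure rates are invented, not the measurements from the paper; only the qualitative gap is the point.

```python
import random

# Toy Monte Carlo comparing replica placement. All rates are made up,
# purely to illustrate the effect of correlated failures.

P_SERVER = 0.001      # chance an individual server is down in some window
P_DC_EVENT = 0.0005   # chance a whole-datacenter event takes out every replica there
REPLICAS = 3
TRIALS = 1_000_000

def unavailable_same_dc() -> bool:
    # All replicas share one datacenter: a single event loses them together.
    if random.random() < P_DC_EVENT:
        return True
    return all(random.random() < P_SERVER for _ in range(REPLICAS))

def unavailable_multi_dc() -> bool:
    # Each replica sits in its own datacenter, so it is lost only if its
    # own datacenter has an event or its own server fails.
    return all(random.random() < P_DC_EVENT + P_SERVER for _ in range(REPLICAS))

random.seed(0)
same = sum(unavailable_same_dc() for _ in range(TRIALS)) / TRIALS
multi = sum(unavailable_multi_dc() for _ in range(TRIALS)) / TRIALS
print(f"same DC:  ~{same:.6f}")   # dominated by the whole-datacenter event rate
print(f"multi DC: ~{multi:.6f}")  # roughly (P_DC_EVENT + P_SERVER)**3, far smaller
```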

As you can see, something as simple as storing your photos, mail, or videos becomes a lot more involved when you want to be sure it's always available.

In our paper, Availability in Globally Distributed Storage Systems, we characterize the availability of cloud storage systems and the sources of failure that affect it, based on extensive monitoring of Google's main storage infrastructure. We also present statistical models for reasoning about the impact of design choices such as data placement, recovery speed, and replication strategies, including replication across multiple data centers.

Friday, February 25, 2011

A Runtime Solution for Online Contention Detection and Response



In our recent paper, Contention Aware Execution: Online Contention Detection and Response, we take a significant step toward addressing a pressing problem in computing today: contention for shared resources on multicore processors. This work appears in the Proceedings of the 2010 International Symposium on Code Generation and Optimization (CGO) and received the CGO 2010 Best Presentation Award at the conference.

One of the greatest challenges in using multicore processors arises when critical resources, such as the on-chip caches, are shared by multiple executing programs. If these programs simultaneously place heavy demands on a shared resource, they may be forced to "take turns," and as a result unpredictable and abrupt slowdowns may occur. This unexpected "cross-core interference" is especially problematic for the latency-sensitive applications found in Google's datacenters, such as web search. The commonly used solution is to dedicate separate machines to each application; however, this leaves the processing capabilities of multicore processors underutilized.

In our work, we present the Contention Aware Execution Runtime (CAER), a lightweight runtime environment that minimizes cross-core interference while maximizing utilization. CAER leverages the performance monitoring capabilities ubiquitous in current multicore processors to infer and respond to cross-core interference, and requires no added hardware support. Our experiments show that CAER increases the utilization of the multicore CPU by 58% on average, while bringing the performance penalty due to co-location from 17% down to just 4% on average.
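
To give a feel for the general shape of such a runtime (this is not the paper's algorithm; the counter hook and the fixed threshold below are simplified, hypothetical stand-ins), a detect-and-respond loop might poll a cache-miss counter and pause the co-located batch job while interference is high:

```python
import os
import signal
import time

# Sketch of a contention-aware control loop in the spirit of CAER.
# NOT the paper's algorithm: the detection heuristic is a stand-in.

MISS_RATE_THRESHOLD = 0.10  # invented: fraction of cache accesses that miss

def read_llc_miss_rate() -> float:
    """Hypothetical hook: a real runtime would read hardware performance
    counters (e.g. last-level cache misses) via perf or a kernel interface."""
    raise NotImplementedError("wire this up to your platform's counters")

def contention_loop(batch_pid: int, poll_seconds: float = 0.01) -> None:
    """Pause the low-priority batch process while the latency-sensitive
    workload appears to suffer cache interference; resume it otherwise."""
    paused = False
    while True:
        contended = read_llc_miss_rate() > MISS_RATE_THRESHOLD
        if contended and not paused:
            os.kill(batch_pid, signal.SIGSTOP)  # back off the batch job
            paused = True
        elif not contended and paused:
            os.kill(batch_pid, signal.SIGCONT)  # contention subsided; resume
            paused = False
        time.sleep(poll_seconds)
```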

Wednesday, February 23, 2011

Congratulations to Ken Thompson



I’m happy to share that Ken Thompson has been chosen as the recipient of the prestigious Japan Prize. The Japan Prize is bestowed for achievements in science and technology that promote the peace and prosperity of mankind.

Ken was awarded the prize along with Dennis Ritchie for their development of the UNIX operating system in 1969 while at Bell Labs. UNIX changed the direction of computing as a whole and paved the way for the development of personal computers and the server systems that power the Internet.

It’s an enormous source of pride for us to have such amazing talent working here and Ken continues to serve as an inspiration to the rest of us. We’re excited to see what Ken will come up with next.

You can read the full press release here.

Friday, February 18, 2011

Query Language Modeling for Voice Search



About three years ago we set a goal to enable speaking to the Google Search engine on smartphones. On the language modeling side, the motivation was that we had access to large amounts of typed text data from our users. At the same time, that meant users also had clear expectations for how they would interact with a speech-enabled version of the Google Search application.

The challenge lay in the scale of the problem and the perceived sparsity of the query data. Our paper, Query Language Modeling for Voice Search, describes the approach we took, and the empirical findings along the way.
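
As a toy illustration of query language modeling (the paper's models are trained on vastly more data, with proper smoothing and pruning; this sketch only shows the basic n-gram idea):

```python
from collections import Counter

# Toy word-bigram language model over typed queries, with a crude backoff
# to unigrams. Real voice-search models are enormously larger; this only
# illustrates the core estimation idea.

queries = [
    "weather in new york",
    "weather in paris",
    "pizza near me",
    "pizza in new york",
]

unigrams, bigrams = Counter(), Counter()
for q in queries:
    words = ["<s>"] + q.split() + ["</s>"]
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

total_words = sum(unigrams.values())

def p_next(prev: str, word: str, alpha: float = 0.4) -> float:
    """P(word | prev): relative frequency if the bigram was seen,
    otherwise back off to a discounted unigram probability."""
    if bigrams[(prev, word)] > 0:
        return bigrams[(prev, word)] / unigrams[prev]
    return alpha * unigrams[word] / total_words

print(p_next("weather", "in"))  # 1.0: both "weather" queries continue with "in"
print(p_next("pizza", "near"))  # 0.5: seen in one of two "pizza" contexts
```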

Besides data availability, the project succeeded due to our excellent computational platform, a culture of teams that wholeheartedly tackle such challenges with the conviction that they will set a new bar, and a collaborative mindset that leverages resources across the company. In this case we used training data made available by colleagues working on query spelling correction, query-stream sampling procedures devised for search quality evaluation, the open-source finite-state tools, and distributed language modeling infrastructure built for machine translation.

Perhaps the most satisfying part of this research project was its impact on end users: when presenting the poster at SLT 2010 in Berkeley, I offered to demo Google Voice Search and often got the answer, “Thanks, I already use it!”

Tuesday, February 1, 2011

Julia meets HTML 5



Today, we launched Julia Map on Google Labs, a fractal renderer in HTML 5. Julia sets are fractals that were studied by the French mathematician Gaston Julia in the early 1920s. Fifty years later, Benoît Mandelbrot studied the set z² − c and popularized it by generating the first computer visualisation. Generating these images requires heavy computational resources. Modern browsers have optimized JavaScript execution to the point where it is now possible to render fractals like Julia sets almost instantly in the browser.

Julia Map uses the Google Maps API to zoom and pan into the fractals. The images are computed with the HTML 5 canvas, and each image generally requires millions of floating point operations. Web workers spread the heavy calculations across all the cores of the machine.
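
The per-pixel computation is easy to sketch. Below is the classic escape-time algorithm, in Python rather than the JavaScript Julia Map actually runs in the browser; the parameter c, the iteration limit, and the viewport are illustrative choices, and sign conventions for the quadratic map vary:

```python
# Escape-time rendering of a Julia set, sketched in Python for clarity
# (Julia Map itself does this in JavaScript on an HTML 5 canvas).

def escape_count(z: complex, c: complex, max_iter: int = 50) -> int:
    """Iterate z <- z^2 + c; return how many steps until |z| exceeds 2."""
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter

# Crude ASCII rendering for one well-known parameter choice.
c = complex(-0.8, 0.156)
chars = " .:-=+*#%@"  # slower escape -> denser character
for row in range(24):
    y = 1.2 - row * 0.1
    line = ""
    for col in range(78):
        x = -1.95 + col * 0.05
        n = escape_count(complex(x, y), c)
        line += chars[min(n * len(chars) // 50, len(chars) - 1)]
    print(line)
```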

We hope you will enjoy exploring the different Julia sets, and share the URLs of the most artistic images you discover. See what others have posted on Twitter under the hashtag #juliamap. Click on the images below to dive into infinity!