My Private Collections: September 2011

Thursday, September 29, 2011

Fresh Perspectives about People and the Web from Think Quarterly

Posted by Allison Mooney, Christina Park, and Caroline McCarthy, The Think Quarterly Team

There’s a lot of research, analysis and insights—from inside and outside Google—that we use in building our products and making decisions. To share what we’ve learned with our partners, we created Think Quarterly. It’s intended to be a snapshot of what Google and other industry leaders are talking about and inspired by right now.

Today we’re launching our second edition, the “People” issue, exploring the latest technologies connecting us and the big ideas driving society forward. It also includes some of the research and analysis that helps us shape our strategies.

For those who love data as much as we do, here are a few articles worth reading:

“Following Generation Z,” in which Google research scientist Ed Chi details what he’s learned from monitoring the course of digital innovation and mapping patterns of digital technology use in the future

“Predicting the Present,” by chief economist Hal Varian, about how publicly available search tools can help anyone gain valuable insights into the behavior of web users and predict what they might do next

“Power to the People,” by Meg Pickard, anthropologist turned head of digital engagement at Guardian News and Media, about tracking the influence and power of online communities

“From Cash to Contentment,” about the use of happiness as a measurable metric of success, with insights coming from Nobel Prize winner Joseph Stiglitz

Read all the articles in the new issue, and if you have a suggestion for our next issue, please tell us. We hope you enjoy (and +1) it!

Wednesday, September 28, 2011

Trying on the new Dynamic Views from Blogger

Posted by Alison Powell, Google Research Team

As you may have noticed, the Google Research blog looks a lot different today. That’s because we—along with a few other Google blogs—are trying out a new set of Blogger templates called Dynamic Views.

Launched today, Dynamic Views is a unique browsing experience that makes it easier and faster for readers to explore blogs in interactive ways. We’re using the Magazine view, but you can also preview this blog in any of the other six new views by using the view selection bar at the top left of the screen.

We’re eager to hear what you think about the new Dynamic Views. You can submit feedback using the “Send feedback” link on the bottom right of this page.

If you like what you see here, and we hope you do, we encourage you to try out the new look(s) on your own blog—read the Blogger Buzz post for more info.

Thursday, September 8, 2011

Sorting Petabytes with MapReduce - The Next Episode

Posted by Grzegorz Czajkowski, Marián Dvorský, Jerry Zhao, and Michael Conley, Systems Infrastructure

Almost three years ago we announced the results of the first-ever "petasort" (sorting a petabyte's worth of 100-byte records, following the Sort Benchmark rules). It completed in just over six hours on 4000 computers. Recently we repeated the experiment using 8000 computers. The execution time was 33 minutes, an order-of-magnitude improvement.
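To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: a petabyte is taken as 10^15 bytes, and "just over six hours" is rounded down to exactly six hours, so the speedup shown is a slight underestimate. The variable and function names are illustrative and not part of the original experiment.

```python
# Back-of-the-envelope arithmetic for the two petasort runs.
# Assumptions (not stated in the post): 1 PB = 10**15 bytes, and the
# first run is rounded down to exactly 6 hours ("just over six hours").

PETABYTE = 10**15          # bytes, decimal convention (assumption)
RECORD_SIZE = 100          # bytes per record, from the post

first_run_minutes = 6 * 60     # approximate lower bound
second_run_minutes = 33
first_run_machines = 4000
second_run_machines = 8000


def throughput_gb_per_s(total_bytes, minutes):
    """Aggregate sort throughput in gigabytes (10**9 bytes) per second."""
    return total_bytes / (minutes * 60) / 10**9


records = PETABYTE // RECORD_SIZE
speedup = first_run_minutes / second_run_minutes
# Part of the speedup comes from doubling the machine count; the rest is
# improvement per machine.
per_machine_speedup = speedup / (second_run_machines / first_run_machines)
second_run_throughput = throughput_gb_per_s(PETABYTE, second_run_minutes)

print(f"records sorted:        {records:.0e}")
print(f"overall speedup:       {speedup:.1f}x")
print(f"per-machine speedup:   {per_machine_speedup:.1f}x")
print(f"aggregate throughput:  {second_run_throughput:.0f} GB/s")
```

Under these assumptions, roughly half of the order-of-magnitude gain is explained by the doubled machine count, and the remainder by each machine doing more work per unit of time.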

Our sorting code is based on MapReduce, a key framework at Google for running computations in parallel across many machines. Thousands of applications, supporting most services offered by Google, have been expressed in MapReduce. While not many MapReduce applications operate at petabyte scale, some do, and their scale is likely to keep growing quickly. The need to help such applications scale motivated us to experiment with data sets larger than one petabyte. In particular, sorting a ten-petabyte input set took 6 hours and 27 minutes on 8000 computers. We are not aware of any other sorting experiment successfully completed at this scale.
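For readers unfamiliar with how a sort can be expressed in MapReduce, the following is a minimal single-process sketch of the general idea, not Google's implementation. It assumes the common sort-benchmark layout in which the first 10 bytes of each 100-byte record are the key; the function names (`map_phase`, `choose_splitters`, `mapreduce_sort`) are hypothetical. The map step extracts keys, a range partitioner sends each key to one of several ordered partitions, and each "reducer" sorts its own partition, so concatenating the partitions in order yields a globally sorted output.

```python
import bisect
import os
from collections import defaultdict

KEY_LEN = 10  # assumption: the first 10 bytes of each record form the key


def map_phase(records, key_len=KEY_LEN):
    """Map step: emit a (key, record) pair for every input record."""
    for record in records:
        yield record[:key_len], record


def choose_splitters(keys, num_partitions):
    """Pick num_partitions - 1 boundary keys so partitions are roughly even.

    A real system would sample the input; this sketch sorts all keys.
    """
    ordered = sorted(keys)
    if not ordered:
        return []
    step = len(ordered) / num_partitions
    return [ordered[int(i * step)] for i in range(1, num_partitions)]


def mapreduce_sort(records, num_partitions=4, key_len=KEY_LEN):
    pairs = list(map_phase(records, key_len))
    splitters = choose_splitters([key for key, _ in pairs], num_partitions)

    # Shuffle step: route each pair to the partition owning its key range.
    partitions = defaultdict(list)
    for key, record in pairs:
        partitions[bisect.bisect_right(splitters, key)].append((key, record))

    # Reduce step: each partition is sorted independently (in a real
    # cluster, on a different machine), then the outputs are concatenated
    # in partition order.
    output = []
    for index in range(num_partitions):
        bucket = sorted(partitions[index], key=lambda pair: pair[0])
        output.extend(record for _, record in bucket)
    return output


if __name__ == "__main__":
    sample = [os.urandom(100) for _ in range(1000)]
    result = mapreduce_sort(sample)
    print(result == sorted(sample, key=lambda r: r[:KEY_LEN]))
```

Because partitions cover disjoint, ordered key ranges, no final merge is needed. That property is what lets this style of sort spread across thousands of machines.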

We are excited by these results. While internal improvements to the MapReduce framework contributed significantly, a large part of the credit goes to numerous advances in Google's hardware, cluster management system, and storage stack.

What would it take to scale MapReduce by further orders of magnitude and make processing such large data sets efficient and easy? One way to find out is to join Google’s systems infrastructure team. If you have a passion for distributed computing, are an expert or plan to become one, and feel excited about the challenges of exascale, then definitely consider applying for a software engineering position with Google.