Tuesday, June 21, 2011

Google Translate welcomes you to the Indic web



(Cross-posted on the Translate Blog and the Official Google Blog)



 

Beginning today, you can explore the linguistic diversity of the Indian sub-continent with Google Translate, which now supports five new experimental alpha languages: Bengali, Gujarati, Kannada, Tamil and Telugu. In India and Bangladesh alone, more than 500 million people speak these five languages. Since 2009, we’ve launched a total of 11 alpha languages, bringing the current number of languages supported by Google Translate to 63.

Indic languages differ from English in many ways, presenting several exciting challenges when developing their respective translation systems. Indian languages often use the Subject Object Verb (SOV) ordering to form sentences, unlike English, which uses Subject Verb Object (SVO) ordering. This difference in sentence structure makes it harder to produce fluent translations; the more words that need to be reordered, the more chance there is to make mistakes when moving them. Tamil, Telugu and Kannada are also highly agglutinative, meaning a single word often includes affixes that represent additional meaning, like tense or number. Fortunately, our research to improve Japanese (an SOV language) translation helped us with the word order challenge, while our work translating languages like German, Turkish and Russian provided insight into the agglutination problem.
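To make the word-order challenge concrete, here is a toy Python sketch contrasting a naive word-for-word translation of an SOV sentence with one reordered into English SVO. The glossed Tamil example and role tags are purely illustrative; the actual system learns reordering statistically from parallel text rather than from hardcoded rules.

```python
# Toy illustration of the SOV -> SVO reordering problem.
# Real statistical MT learns reordering from parallel corpora;
# this hardcoded gloss only shows why reordering is needed.

# A Tamil sentence glossed word by word (illustrative):
# "naan pazham saapitten" ~ "I fruit ate"
sov_gloss = [("naan", "I", "SUBJ"),
             ("pazham", "fruit", "OBJ"),
             ("saapitten", "ate", "VERB")]

def naive_translation(glossed):
    """Translate word for word, keeping the source (SOV) order."""
    return " ".join(english for _, english, _ in glossed)

def reordered_translation(glossed):
    """Reorder the constituents into English SVO order."""
    by_role = {role: english for _, english, role in glossed}
    return " ".join(by_role[r] for r in ("SUBJ", "VERB", "OBJ"))

print(naive_translation(sov_gloss))      # I fruit ate  (disfluent)
print(reordered_translation(sov_gloss))  # I ate fruit  (fluent)
```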

You can expect translations for these new alpha languages to be less fluent and include many more untranslated words than some of our more mature languages—like Spanish or Chinese—which have much more of the web content that powers our statistical machine translation approach. Despite these challenges, we release alpha languages when we believe that they help people better access the multilingual web. If you notice incorrect or missing translations for any of our languages, please correct us; we enjoy learning from our mistakes and your feedback helps us graduate new languages from alpha status. If you’re a translator, you’ll also be able to take advantage of our machine translated output when using the Google Translator Toolkit.

Since these languages each have their own unique scripts, we’ve enabled a transliterated input method for those of you without Indian language keyboards. For example, if you type in the word “nandri,” it will generate the Tamil word நன்றி (see what it means). To see all these beautiful scripts in action, you’ll need to install fonts* for each language.
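To give a flavor of how such an input method can work, here is a minimal longest-match transliteration sketch in Python. The rule table is a tiny subset invented for this one example, not the actual mapping Google Translate uses:

```python
# Minimal longest-match Latin-to-Tamil transliteration sketch.
# RULES is a hypothetical subset for this example only; the real
# input method uses a much richer, data-driven mapping.
RULES = {"na": "ந", "ndri": "ன்றி", "n": "ன்", "ri": "றி"}

def transliterate(latin, rules=RULES):
    out, i = [], 0
    while i < len(latin):
        # Greedily take the longest rule that matches at position i.
        for length in range(len(latin) - i, 0, -1):
            chunk = latin[i:i + length]
            if chunk in rules:
                out.append(rules[chunk])
                i += length
                break
        else:
            out.append(latin[i])  # pass unknown characters through
            i += 1
    return "".join(out)

print(transliterate("nandri"))  # -> நன்றி ("thank you")
```

Real input methods also handle ambiguity, typically by offering several candidate words rather than a single deterministic mapping.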

We hope that the launch of these new alpha languages will help you better understand the Indic web and encourage the publication of new content in Indic languages, taking us five alpha steps closer to a web without language barriers.

*Download the fonts for each language: Tamil, Telugu, Bengali, Gujarati and Kannada.

Monday, June 20, 2011

Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths



Earlier this year, we announced the launch of new features on the YouTube Video Editor, including stabilization for shaky videos, with the ability to preview them in real-time. The core technology behind this feature is detailed in this paper, which will be presented at the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011).

Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. Professionally shot videos, on the other hand, are usually filmed with carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our goal was to devise a completely automatic method for converting casual shaky footage into more pleasant and professional-looking videos.



Our technique mimics the cinematographic principles outlined above by automatically determining the best camera path using a robust optimization technique. The original, shaky camera path is divided into a set of segments, each approximated by either a constant, linear or parabolic motion. Our optimization finds the best of all possible partitions using a computationally efficient and stable algorithm.
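For readers who want a feel for the optimization, below is a minimal one-dimensional sketch of the L1 path-smoothing idea using the cvxpy convex-optimization package. Penalizing the L1 norms of the first, second and third derivatives of the new path favors piecewise constant, linear and parabolic segments; the weights, window size and synthetic input are illustrative placeholders, not the values from our paper.

```python
# Minimal 1-D sketch of L1-optimal camera-path smoothing.
# Weights and window size are illustrative, not the paper's values.
import numpy as np
import cvxpy as cp

def smooth_path(shaky, w1=10.0, w2=1.0, w3=100.0, window=30.0):
    n = len(shaky)
    p = cp.Variable(n)  # the new, stabilized camera path
    objective = cp.Minimize(
        w1 * cp.norm1(cp.diff(p, 1)) +  # favor constant segments
        w2 * cp.norm1(cp.diff(p, 2)) +  # favor linear segments
        w3 * cp.norm1(cp.diff(p, 3)))   # favor parabolic segments
    # Keep the new path close to the original so that the
    # stabilizing crop window stays inside the frame.
    constraints = [cp.abs(p - shaky) <= window]
    cp.Problem(objective, constraints).solve()
    return p.value

shaky = np.cumsum(np.random.randn(200) * 4.0)  # synthetic jittery path
smoothed = smooth_path(shaky)
```

In the full method the path is of course not one-dimensional: it models 2D frame-to-frame transformations, and the crop-window constraints are expressed over the transformed frame corners.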

To achieve real-time performance on the web, we distribute the computation across multiple machines in the cloud. This enables us to provide users with a real-time preview and interactive control of the stabilized result. Above we provide a video demonstration of how to use this feature on the YouTube Editor. We will also demo this live at Google’s exhibition booth in CVPR 2011.

For further details, please read our paper.

Friday, June 17, 2011

Google at CVPR 2011



The computer vision community will get together in Colorado Springs the week of June 20th for the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011). This year will see a record number of attendees and 27 co-located workshops and tutorials; registration was capped at 1,500 and closed even before the conference started.

Computer Vision is at the core of many Google products, such as Image Search, YouTube, Street View, Picasa, and Goggles, and as always, Google is involved in several ways with CVPR. Andrew Senior is serving as an area chair of CVPR 2011, and many Googlers are reviewers. Googlers also co-authored a number of the papers being presented at the conference.


If you are attending the conference, stop by Google’s exhibition booth. In addition to talking with Google researchers, you will get to see examples of exciting computer vision research that has made it into Google products, including, among others, the following:

  • Google Earth Facade Shadow Removal by Mei Han, Vivek Kwatra, and Shengyang Dai
    We will demonstrate our technique for removing shadows and other lighting/texture artifacts from building facades in Google Earth. We obtain cleaner, clearer, and more uniform textures which provide users with an improved visual experience.
  • Video Stabilization on YouTube Editor by Matthias Grundmann, Vivek Kwatra, and Irfan Essa
    Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. In contrast, professionally shot video usually employs stabilization equipment such as tripods or camera dollies, and uses ease-in and ease-out for transitions. Our technique mimics these cinematographic principles by optimally dividing the original, shaky camera path into a set of segments and approximating each with either constant, linear or parabolic motion, using a computationally efficient and stable algorithm. We will showcase a live version of our algorithm, featuring real-time performance and interactive control, which is publicly available at youtube.com/editor.
  • Tag Suggest for YouTube by George Toderici and Mehmet Emre Sargin
    YouTube offers millions of users the opportunity to upload videos and share them with their friends. Many users would love to have their videos discoverable but don't annotate them properly. One new feature on YouTube that seeks to address this problem is tag prediction based on video content and independently based on text metadata.


Thursday, June 9, 2011

Our first round of Google Research Awards for 2011



We’ve just finished awarding the latest round of Google Research Awards, which provide funding to full-time faculty working on research in areas of mutual interest with Google. A record number of submissions came in this round, and we are delighted to be funding 112 awards across 21 different focus areas for a total of more than $6.75 million. The subject areas that received the highest level of support were systems and infrastructure, human-computer interaction, Geo/maps and machine learning. Thanks to strong international collaborations, 23% of the funding in this round was awarded to universities outside the U.S.

In prior years, we’ve used this blog post to highlight some of our top-ranked projects, but this year, we’d like to give you an inside look into how we determine the award recipients.

Designating the awards involves a careful and detailed review process. First, we have a set of internal research leads, each a well-known expert in their field, review all the proposals in their area. They assess the proposals on merit, innovation, connection to Google’s products and services, and fit with our overall research agenda. The research leads then assign several volunteer reviewers—culled from experts on their team or other Google engineers holding PhDs—to evaluate each proposal.

All these reviews are recorded in an internal grant administration system, and the research leads make their funding recommendations. These recommendations are aggregated, and we then run a series of committee meetings, one for each research area. Each meeting is attended by the research lead, members of the university relations team and executives in research. This committee reviews each proposal that the research lead has recommended for funding, using the same criteria mentioned above. This additional review process may change the proposal rankings and sometimes brings other proposals back for reconsideration.

Once the committee meetings are complete, we make the final funding decisions, which are based on the available budget and balancing the funding across research areas and geographic regions. The final decisions are reviewed one last time by research management, and then we distribute the awards to the selected faculty.

As the number of submissions for these research awards continues to grow, we remain committed to a merit-based review process with effective checks and balances. Congratulations to the well-deserving recipients of this round’s awards, and if you are interested in applying for the next round (deadline is August 1), please visit our website for more information.

Instant Mix for Music Beta by Google



Music Beta by Google was announced at the Day One Keynote of Google I/O 2011. This service allows users to stream their music collections from the cloud to any supported device, including a web browser. It’s a first step in creating a platform that gives users a range of compelling music experiences. One key component of the product, Instant Mix, is a playlist generator developed by Google Research. Instant Mix uses machine hearing to extract attributes from audio which can be used to answer questions such as “Is there a Hammond B-3 organ?” (instrumentation / timbre), “Is it angry?” (mood), “Can I jog to it?” (tempo / meter) and so on. Machine learning algorithms relate these audio features to what we know about music on the web, such as the fact that Jimmy Smith is a jazz organist or that Arcade Fire and Wolf Parade are similar artists. From this we can predict similar tracks for a seed track and, with some additional sequencing logic, generate Instant Mix playlists from songs in a user’s locker.
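As a rough sketch of how these two sources of evidence can combine (the feature vectors, artist-similarity scores and blend weight below are invented for illustration, and the production system is far richer), one could blend audio similarity with web-derived artist similarity to rank candidate tracks against a seed:

```python
# Rough sketch: rank candidate tracks against a seed by blending
# audio-feature similarity with artist co-occurrence similarity.
# All features, scores and weights here are made up for illustration.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_candidates(seed, candidates, artist_sim, alpha=0.6):
    """score = alpha * audio similarity + (1 - alpha) * artist similarity"""
    scored = []
    for track in candidates:
        audio = cosine(seed["features"], track["features"])
        artist = artist_sim.get((seed["artist"], track["artist"]), 0.0)
        scored.append((alpha * audio + (1 - alpha) * artist, track["title"]))
    return sorted(scored, reverse=True)

# Hypothetical tracks; features might encode timbre, mood and tempo.
seed = {"artist": "U2", "title": "Mysterious Ways",
        "features": np.array([0.9, 0.7, 0.8])}
candidates = [
    {"artist": "David Bowie", "title": "Fame",
     "features": np.array([0.8, 0.6, 0.9])},
    {"artist": "Jewel", "title": "Don't",
     "features": np.array([0.1, 0.2, 0.1])},
]
artist_sim = {("U2", "David Bowie"): 0.7, ("U2", "Jewel"): 0.2}
for score, title in rank_candidates(seed, candidates, artist_sim):
    print(f"{score:.2f}  {title}")
```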

Because we combine audio analysis with information about which artists and albums go well together, we can use both dimensions of similarity to compare songs. If you pick a mellow track from an album, we will make a mellower playlist than if you pick a high energy track from the same album. For example, here we compare short Instant Mixes made from two very different tracks by U2. The first Instant Mix comes from "Mysterious Ways," an upbeat, danceable track from Achtung Baby with electric guitar and heavy percussion.


  1. U2 "Mysterious Ways"
  2. David Bowie "Fame"
  3. Oingo Boingo "Gratitude"
  4. Infectious Grooves “Spreck”
  5. Red Hot Chili Peppers “Special Secret Song Inside”
Compare this to a short Instant Mix made from a much more laid-back U2 cut, "MLK" from the album The Unforgettable Fire. This track has delicate vocals on top of a sparse synthesizer background and no percussion.


  1. U2 "MLK"
  2. Jewel “Don’t”
  3. Antony and the Johnsons “What Can I Do?”
  4. The Beatles “And I Love Her”
  5. Van Morrison “Crazy Love”
As you can hear, the “Mysterious Ways” Instant Mix is funky, with strong percussion and high-energy vocals while the “MLK” mix carries on with that track's laid-back lullaby feeling.

Our approach also allows us to create mixes from music in the long tail. Are you the lead singer in an unknown Dylan cover band? Even if your group is new or otherwise unknown, Instant Mix can still use audio similarity to match your tracks to real Dylan tracks (provided, of course, that you sing like Bob and your band sounds like The Band).

Our goal with Instant Mix is to build awesome playlists from your music collection. We achieve this by using machine learning to blend a wide range of information sources, including features derived from the music audio itself. Though we’re still in beta, and still have a lot of work to do, we believe Instant Mix is a great tool for music discovery that stands out from the crowd. Give it a try!

Further reading by Google Researchers:
Machine Hearing: An Emerging Field
Richard F. Lyon.

Sound Ranking Using Auditory Sparse-Code Representations
Martin Rehn, Richard F. Lyon, Samy Bengio, Thomas C. Walters, Gal Chechik.

Large-Scale Music Annotation and Retrieval: Learning to Rank in Joint Semantic Spaces
Jason Weston, Samy Bengio, Philippe Hamel.

Tuesday, June 7, 2011

After the award: students and mentors



For the past two years, we’ve looked forward to honoring the best and the brightest graduate students pursuing doctoral degrees around the globe through Google’s fellowship program. We’re thrilled to be supporting these students with a monetary gift, but what happens after the awards are given out?

An important component of our fellowship program is the Google research mentor. Each fellowship student is paired with a full-time Googler based on mutual research interests. The idea is that the mentors provide a different point of view from the students’ day-to-day academic world, introduce them to a professional network that will last their entire career, and provide meaningful context and feedback based on their own experiences as the students work their way through graduate school. In return, the Googler has the unique opportunity to mentor one of the top students in the field and foster a future leader in technology.

Jason Mars of the University of Virginia was awarded the 2010 Google U.S./Canada Fellowship in Compiler Technology. Robert Hundt, a compiler and datacenter researcher at Google, was Jason’s research mentor. Here, they share how a fellowship turned out to be much more than just an award:

In Jason’s words
When I first met Robert at a research conference, I didn’t realize he would become one of the most important mentors I’ve had. Beyond our match in personality and thinking styles, Robert took an interest in shaping and sharpening me as a researcher and engineer, and I’ve benefited greatly from his guidance. I realized after my first internship that Google faces some of the most compelling research problems in computer science today. Robert’s mentorship, combined with my Google Fellowship, has prompted me to delve deeper into these open problems. In fact, I ended up returning to Google for two subsequent internships. Together, Robert and I have published a number of research papers (with more on the way) and filed two Google patent applications. Our relationship is greater than just mentor and mentee—we are colleagues and friends.

In Robert’s words
Given the high expectations we have for our interns, it’s no surprise to me that one of our most successful interns, Jason Mars, is a recipient of an esteemed Google Fellowship. A three-time returning intern, Jason brought great levels of enthusiasm, creativity, problem-solving and problem-finding skills to our team, and kept us all on our toes by challenging assumptions and the status quo. He has written half a dozen conference and workshop papers and has built relationships with many people, not just at Google, but throughout Silicon Valley. Jason is well on his way to becoming a renowned expert in datacenter performance and contention issues. I am very proud of him and grateful to be part of his journey. I believe I may have learned as much from him, with his limitless energy and technical creativity, as he has learned during his time at Google. Lastly, and most importantly, I have won a friend.

2011 Google Fellowships
This year, we will be awarding fellowships to 34 promising young research students around the globe. These awards support their tuition and stipend, as well as providing a Google research mentor. Click here for a PDF of all of our Google Fellowship recipients. Congratulations to all the fellows; we look forward to seeing you move technology forward.