Thursday, September 30, 2010

Veni, Vidi, Verba Verti (I Came, I Saw, I Translated the Words)



To tear down the walls between languages and make the world's knowledge accessible and useful, we have created tools for translating the languages of many nations. Today we announce our first tool for translating a language that no one speaks natively anymore: Latin. Although few people speak Latin on a daily basis, more than a hundred thousand American students take the National Latin Exam every year. In addition, people all over the world study Latin.

We understand that this Latin translation tool will rarely be used to translate emails or the captions of YouTube videos. But many old books on philosophy, physics, and mathematics were written in Latin, and indeed there are many thousands of books in Google Books containing notable Latin passages.

Machine translation from Latin is difficult, and we recognize that our grammar is not without fault. Latin is unusual, however, in that most Latin books were written long ago and few new ones will be written in the future. Many have been translated into other languages, and we use those translations to train our translation system. Since this system most easily translates books similar to the ones it learned from, our ability to translate famous works (such as Caesar's Commentaries on the Gallic War) is already good.
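As a toy illustration of how existing translations can teach a system to translate, here is a minimal sketch of IBM Model 1 word-alignment training, the classic starting point for statistical machine translation; the three-sentence Latin-English corpus is invented for this example and has nothing to do with Google Translate's actual models or data.

from collections import defaultdict

# Tiny illustrative parallel corpus (Latin, English).
corpus = [
    ("gallia est divisa".split(), "gaul is divided".split()),
    ("gallia est".split(), "gaul is".split()),
    ("gallia divisa".split(), "gaul divided".split()),
]

t = defaultdict(lambda: 1.0)  # t[(e, f)]: P(English word e | Latin word f)

for _ in range(10):  # EM iterations
    counts = defaultdict(float)
    totals = defaultdict(float)
    for latin, english in corpus:  # E-step: expected alignment counts
        for e in english:
            norm = sum(t[(e, f)] for f in latin)
            for f in latin:
                frac = t[(e, f)] / norm
                counts[(e, f)] += frac
                totals[f] += frac
    for (e, f), c in counts.items():  # M-step: re-estimate probabilities
        t[(e, f)] = c / totals[f]

print(max({"gaul", "is", "divided"}, key=lambda e: t[(e, "gallia")]))  # gaul

After a few iterations the model discovers, from the translations alone, that "gallia" corresponds to "gaul"; scale this idea up by many orders of magnitude and you have the statistical core of a translation system.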

The next time you come across a Latin passage or need help with a Latin text, give it a try.

Saturday, September 18, 2010

Remembering Fred Jelinek



It is with great sadness that we note the passing of Fred Jelinek, teacher and colleague to many of us here at Google. His seminal contributions to statistical modeling of speech and language influenced not only us, but many more members of the research community.

Several of us at Google remember Fred:

Ciprian Chelba:
Fred was my thesis advisor at CLSP. My ten years of work in the field after graduation led me to increasingly appreciate the values that Fred instilled by personal example: work on the hard problem because it simply cannot be avoided, bring fundamental and original contributions that steer clear of incrementalism, exercise your creativity despite the risks entailed, and pursue your ideas with determination.

I recently heard a comment from a colleague, “A natural born leader is someone you follow even if only out of curiosity.” I immediately thought of Fred. Working with him marked a turning point in my life, and his influential role will be remembered.

Bob Moore:
I first met Fred Jelinek in 1984 at an IBM-sponsored workshop on natural-language processing. Fred's talk was my first exposure to the application of statistical ideas to language, and about the only thing I understood was the basic idea of N-gram language modeling: estimate the probability of the next word in a sequence based on a small fixed number of immediately preceding words. At the time, I was so steeped in the tradition of linguistically-based formal grammars that I was sure Fred's approach could not possibly be useful.
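(For readers new to the idea, here is a toy sketch of the N = 2 case, a bigram model estimated with maximum-likelihood counts; the corpus is invented for illustration.)

from collections import Counter

tokens = "<s> the dog barks </s> <s> the cat sleeps </s>".split()
bigrams = Counter(zip(tokens, tokens[1:]))
unigrams = Counter(tokens)

def p_next(word, prev):
    # Maximum-likelihood estimate of P(word | prev); real systems add
    # smoothing (e.g., Good-Turing or Kneser-Ney) for unseen bigrams.
    return bigrams[(prev, word)] / unigrams[prev]

print(p_next("dog", "the"))  # 0.5: "the" is followed by "dog" half the time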

Starting about five years later, however, I began to interact with Fred often at speech and language technology meetings organized by DARPA, as well as events affiliated with the Association for Computational Linguistics. Gradually, I (along with much of the computational linguistics community) began to understand and appreciate the statistical approach to language technology that Fred and his colleagues were developing, to the point that it now dominates the field of computational linguistics, including my own research. The importance of Fred's technical contributions and visionary leadership in bringing about this revolution in language technology cannot be overstated. The field is greatly diminished by his passing.

Fernando Pereira:
I first met Fred at a DARPA-organized workshop where one of the main topics was how to put natural language processing research on a more empirical, data-driven path. Fred was leading the charge for the move, drawing on his successes in speech recognition. Although I had already started exploring those ideas, I was not fully convinced by Fred's vision. Nevertheless, Fred's program raised many interesting research questions, and I could not resist some of them.

Working on search for speech recognition at AT&T, I was part of the small team that invented the finite-state transducer representation of recognition models. I gave what I think was the first public talk on the approach at a workshop session that Fred chaired. It was Fred's turn to be skeptical, and we had a spirited exchange in the discussion period. At the time, I was disappointed that I had failed to interest Fred in the work, but I was later delighted when he became a strong supporter of it after a JHU summer workshop where Michael Riley led the use of our software tools in successful experiments with a team of JHU researchers and students. Indeed, in hindsight, Fred was right to be skeptical before we had empirical validation for the approach, and his strong support once the results started coming in was all the more meaningful and gratifying.

Through these experiences and many more, I came to immensely respect Fred's pioneering spirit, vision, and sharp mind. Many of my most successful projects benefited directly or indirectly from his ideas, his criticism, and his building of thriving institutions, from CLSP to links with the research team at Charles University in Prague. I last saw Fred at ACL in Uppsala. He was in great form, and we had a good discussion about funding for the summer workshops. I am very sad that he will not be with us to continue these conversations.

Shankar Kumar:
Fred was my academic advisor at CLSP/JHU, and I interacted with him throughout my Ph.D. program. I had the privilege of having him on my thesis committee. My very first exposure to research in speech and NLP was through an independent study that I did under him. A few years later, I was his teaching assistant for the speech recognition class. Fred's energy and passion for research made a strong impression on me back then and continue to influence my work to this day.

I remember Fred carefully writing up his ideas and sending them out as a starting point for our discussions. While I found this curiously amusing at the time, I now think it was his unique approach to ensuring clarity of thought and steering the discussion without distractions. Fred's enthusiasm for learning new concepts was infectious! I attended several classes and guest lectures with him - graphical models, NLP, and many more. His insightful questions and active participation made each of these classes memorable for me. He epitomized what a lifelong learner should be.

I will always recall Fred's advice on sharing credit generously. In his own words, "The contribution of a research paper does not get divided by the number of authors". With his passing, we have lost a role model who dedicated his life to research and whose contributions will continue to impact and shape the field for years to come.

Michael Riley:
I got to know Fred pretty well, having attended two of the CLSP six-week summer workshops, worked on a few joint grants, and visited CLSP in between. If there is a ‘father of speech recognition’, it’s got to be Fred Jelinek - he led the IBM team that invented and popularized many of the key methods used today. His intellect, wide knowledge, and force of will served him well later as the leader of the JHU Center for Language and Speech Processing - a sort of academic hearth where countless speech/NLP researchers and students have interacted over the years in seminars and workshops. I was impressed that, at an age when many have retired and after most of his IBM colleagues had gone into (very lucrative) financial engineering, he remained a vigorous, leading academic. Fernando mentioned the initial skepticism Fred had for our work on weighted FSTs for ASR. Some years later, though, I heard that he praised the work to my lab director, Larry Rabiner, on a plane ride, which likely helped my promotion shortly thereafter. And no discussion of Fred would be complete without a mention of his inimitable humor, delivered in that loud Czech-accented voice:
Riley [at workshop planning meeting]: “Could they hold the summer workshop in some nicer place than Baltimore to help attract people?”
Fred: “Riley, we’ll hold it in Rome next year and get better people than you!”

Seminar presenter: [fumbling with Windows configuration for minutes].
Fred [very loud]: “How long do we have to endure this high-tech torture?”

The website of The Johns Hopkins University’s Center for Language and Speech Processing links to Fred’s own descriptions of his life and technical achievements.

Friday, September 17, 2010

Frowns, Sighs, and Advanced Queries -- How does search behavior change as search becomes more difficult?




At Google, we strive to make finding information easy, efficient, and even fun. However, we know that once in a while, finding a specific piece of information turns out to be tricky. Based on dozens of user studies over the years, we know that it’s relatively easy for a human observer to notice that a user is having trouble finding information, by watching for changes in language, body language, and facial expressions.



Computers, however, don’t have the luxury of observing a user the way another person would. But would it be possible for a computer to somehow tell that the user is struggling to find information?

We decided to find out. We first ran a study in the usability lab where we gave users search tasks, some of which we knew to be difficult. The first couple of searches always looked pretty much the same regardless of task difficulty: users formulated a query, quickly scanned the results, and either clicked on a result or refined the query. After a couple of unsuccessful searches, however, we started noticing interesting changes in behavior. In addition to sighing or starting to bite their nails, users sometimes began typing their searches as natural-language questions, sometimes spent a very long time simply staring at the results page, and sometimes completely changed their approach to the task.

We were fascinated by these findings as they seemed to be signals that the computer could potentially detect while the user is searching. We formulated the initial findings from the usability lab study as hypotheses which we then tested in a larger web-based user study.

The overall findings were promising: we found five signals that seemed to indicate that users were struggling with a search task:
  • use of question queries
  • use of advanced operators
  • spending more time on the search results page
  • formulating the longest query in the middle of the session
  • spending a larger proportion of the session time on the search results page
None of these signals alone is a strong enough predictor that a user is having problems. Used together, however, we believe they can form the basis of a model that will one day make it possible for computers to detect frustration in real time, as sketched below.
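To make this concrete, here is a hedged sketch of how such signals might be combined with logistic regression into a single frustration score; the feature encoding, weights, and bias are hypothetical illustrations, not the model from our paper.

import math

# Per-session features derived from the five signals above
# (all values are made up for illustration).
features = {
    "question_query":     1.0,  # searched with a natural-language question
    "advanced_operators": 0.0,  # used operators such as quotes or site:
    "long_serp_dwell":    1.0,  # unusually long stay on a results page
    "longest_query_mid":  1.0,  # longest query came mid-session
    "serp_time_fraction": 0.7,  # share of session spent on results pages
}

weights = {  # hypothetical weights, as if fit on labeled sessions
    "question_query": 0.9, "advanced_operators": 0.6,
    "long_serp_dwell": 0.8, "longest_query_mid": 0.5,
    "serp_time_fraction": 1.2,
}
bias = -2.0  # keeps the default prediction at "not struggling"

z = bias + sum(weights[k] * v for k, v in features.items())
p_struggling = 1.0 / (1.0 + math.exp(-z))  # logistic link
print(f"P(user is struggling) = {p_struggling:.2f}")

Each signal contributes only weakly on its own, which matches our finding; the model's strength comes from summing the evidence before the logistic squashing.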

You can read the full text of the paper here.

Wednesday, September 15, 2010

Focusing on Our Users: The Google Health Redesign



When I relocated to New York City a few years ago, some of the most important health information for me to have on hand was my immunization history. At the time, though, my health records were scattered, and organizing them felt like a daunting task -- a common problem that many people face. For me, the solution came when Google Health became available in May of 2008, and I started using it to organize my health information and keep it more manageable. I also saw the potential to do much more within Google Health, such as tracking my overall fitness goals. When I joined the Google Health team as the lead user experience researcher, I was curious about the potential for Google Health to impact people’s lives beyond things like immunization tracking, and about how we could make the product much easier to use. So I set out to explore how to expand and improve Google Health.

Here at Google, we focus on the user throughout the entire product development process. So before Google Health was first launched, we interviewed many people about how they managed their medical records and other health information to better understand their needs. We then iteratively created and tested multiple concepts and designs. After our initial launch, we followed up with actual Google Health users through surveys, interviews, and usability studies to understand how well we were meeting their needs.



From this user research, we learned what was working in the product and what needed to be improved. Here are some of the things our users found especially useful:
  • Organizing and tracking health-related information in a single place that is accessible from anywhere at any time
  • Sharing medical records easily with loved ones and health care providers, either by allowing online access or by printing out health summaries
  • Referencing rich information about health topics, aggregated from trusted sources and Google search results

Our users also described to us the benefits they saw from using Google Health:
“Google Health gives me many tools to research my prescriptions and symptoms, and to track all of the many tests I keep having. Google Health made several necessary and cumbersome tasks easy and worry free.”

“For years now, I've tried to remember my son’s allergies and medications, but the list has grown so long, that I kept forgetting one or two when a doctor asked me about them. That can't happen again because I now have a single place to keep up with them. And I love the fact that I can print off information for situations when I really need it.”

“I really like that I can share my profile with others. I want my mom to know my medical information, just in case anything ever happens to me.”

While we learned that our users were clearly getting positive results from using Google Health, our research also taught us that more was needed. We learned that we needed to make fundamental changes to fully meet the needs of all of our current and prospective users, such as those who are chronically ill, those who care for family members, and especially those looking to track and improve their wellness and fitness.

On this last point, our user surveys had already indicated that there was more we could do to help our users track and manage their wellness, not just their sickness. So we conducted further research into how people collect, monitor, track, and analyze their wellness data. We interviewed several people in their homes and invited others into our usability labs. As a result, we identified several areas where we could improve Google Health to make it a more useful wellness tool, including:
  • Dedicated wellness tracking including pre-built and custom trackers
  • Efficient manual data entry as well as automatic data collection through devices
  • A customizable summary dashboard of wellness and other health topics
  • Goal setting and progress tracking using interactive charts
  • Personalized pages for each topic with rich charts, journaling, and related information

These insights led us to a whole new set of design proposals. We gathered feedback on the resulting sketches, wireframes, and screenshots from active and new Google Health users. The results throughout this process were eye-opening. While we were on the right track for some parts of the design, other parts had to be corrected or even redesigned. We went through several iterations until we had a design that tested well and that we felt met the user needs our research had uncovered. Throughout the product development process, we also conducted several usability studies with a functioning prototype to continuously improve the product’s usability and function.



In the end, the collaboration between the user experience, engineering, and product management teams resulted in an entirely new user experience for Google Health, along with a set of new functionality, now available for you to try out at www.google.com/health. See for yourself how the old and new versions compare. Here is a screenshot of a health profile in the new version:



And this is how the same account and profile looked in the old user interface:



As a Google Health user, I am excited to take advantage of the new design and have already started using it for my own exercise and weight tracking. And on behalf of the user experience team and the entire Google Health team, we’re excited about being able to bring you a new design and more powerful tool that we think will meet more of your health and wellness needs.

We look forward to continuing to explore how we can make Google Health even more useful and easier to use for people like you. As you use Google Health, you may see a link to a feedback survey at the top of the application. If you do, please take the time to fill it out - we will be listening to your input!

Tuesday, September 14, 2010

Discontinuous Seam Carving for Video Retargeting



Videos come in different sizes, resolutions, and aspect ratios, but the device used for playback, be it your TV, mobile phone, or laptop, has only a fixed resolution and form factor. As a result, you cannot watch your favorite old show that came in 4:3 on your new 16:9 HDTV without black bars on the sides, referred to as letterboxing. Likewise, widescreen movies and user videos uploaded to YouTube are shot with various cameras in wide-ranging formats, so they do not fit the screen exactly. As an alternative to letterboxing, some devices either scale the content non-uniformly to fill the screen, which changes the aspect ratio and makes everything look stretched, or simply crop the frame, discarding any content that cannot fit the screen after scaling.
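To make the trade-off concrete, here is a back-of-the-envelope sketch (mine, not from the paper) of how much screen area letterboxing wastes versus how much content cropping discards when fitting 4:3 video to a 16:9 display:

def fit(src_w, src_h, dst_w, dst_h):
    # Letterboxing: scale to fit entirely, leaving unused screen area.
    scale_fit = min(dst_w / src_w, dst_h / src_h)
    shown = (src_w * scale_fit) * (src_h * scale_fit)
    wasted_screen = 1 - shown / (dst_w * dst_h)
    # Cropping: scale to fill the screen, discarding source content.
    scale_fill = max(dst_w / src_w, dst_h / src_h)
    kept = (dst_w / scale_fill) * (dst_h / scale_fill)
    lost_content = 1 - kept / (src_w * src_h)
    return wasted_screen, lost_content

wasted, lost = fit(640, 480, 1920, 1080)  # 4:3 source, 16:9 screen
print(f"letterboxing leaves {wasted:.0%} of the screen black")  # 25%
print(f"cropping discards {lost:.0%} of the content")           # 25%

Either way, a quarter of something is lost, which is exactly the dilemma retargeting aims to resolve.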

At Google Research, together with collaborators from Georgia Tech, we have developed an algorithm that resizes (or retargets) videos to fit the form factor of a given device without cropping, stretching, or letterboxing. Our approach uses all of the screen’s precious pixels while striving to deliver as much of the original video content as possible. The result is a video that adapts to your needs, so you don’t have to adapt to the video.


Six frames from the result of our retargeting algorithm applied to a sub-clip of “Apologize”, © 2006 One Republic. Original frame is shown on the left, our resized result on the right. The original content is fit to a new aspect ratio.

The key insight is that we can separate the video into salient and non-salient content, which are then treated differently. Think of salient content as actors, faces, or structured objects, where the viewer anticipates specific, important details to perceive it as being correct and unaltered. We cannot change this content beyond uniform scaling without it being noticeable. On the other hand, non-salient content, such as sky, water or a blurry out-of-focus background can be squished or stretched without changing the overall appearance or the viewer noticing a dramatic change.

Our technique, which we call discontinuous seam carving -- so named because it modifies the video by adding or removing disconnected seams (or chains) of pixels -- allows greater freedom in the resizing process than previous approaches. By optimizing the retargeted video to be consistent with the original, we carefully preserve the shape and motion of the salient content while being less restrictive with non-salient content. The key innovations of our research include: (a) a solution that maintains temporal continuity of the video in addition to preserving its spatial structure, (b) space-time smoothing for automatic as well as interactive (user-guided) salient content selection, and (c) sequential frame-by-frame processing conducive to arbitrary-length and streaming video. The outcome is a scalable system capable of retargeting videos featuring complex motions of actors and cameras, highly dynamic content, and camera shake. For more details, please refer to our paper or visit the project website.
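For intuition, here is a simplified sketch of classic connected seam carving on a single grayscale frame, in the spirit of Avidan and Shamir's original image technique; our discontinuous variant relaxes the seam-connectivity constraint and adds the temporal coherence described above, both of which this sketch omits.

import numpy as np

def remove_vertical_seam(img):
    """Remove one low-energy, 8-connected vertical seam from a 2D image."""
    h, w = img.shape
    # Energy: gradient magnitude; low energy roughly means non-salient.
    energy = np.abs(np.gradient(img, axis=0)) + np.abs(np.gradient(img, axis=1))
    # Dynamic programming: cheapest seam cost ending at each pixel.
    cost = energy.copy()
    for y in range(1, h):
        left = np.roll(cost[y - 1], 1);   left[0] = np.inf
        right = np.roll(cost[y - 1], -1); right[-1] = np.inf
        cost[y] += np.minimum(np.minimum(left, cost[y - 1]), right)
    # Backtrack the minimal seam from the bottom row up.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    # Drop one pixel per row, narrowing the frame by one column.
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return img[keep].reshape(h, w - 1)

frame = np.random.rand(90, 160)           # stand-in for a video frame
print(remove_vertical_seam(frame).shape)  # (90, 159)

Repeating this per frame independently would cause flicker; the point of the paper is to choose seams that stay consistent over time while allowing them to be spatially disconnected.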

Friday, September 10, 2010

Google Search by Voice: A Case Study



Wind the clock back two years with your smartphone in hand. Try to recall doing a search for a restaurant or the latest scores of your favorite sports team. If you’re like me, you probably won’t even bother, or you’ll suffer through tiny keys or fat fingers on a touch screen. With Google Search by Voice, all that has changed. Now you just tap the microphone, speak, and within seconds you see the result. No more fat fingers.

Google Search by Voice is the result of many years of investment in speech at Google. We started by building our own recognizer (aka GReco) from the ground up. Our first foray into search by voice was doing local searches with GOOG-411. Then, in November 2008, we launched Google Search by Voice. Now you can search the entire Web using your voice.

What makes search by voice really interesting is that it requires much more than just a good speech recognizer. You also need a good user interface and a good phone, like an Android device, in the hands of millions of people. Beyond the excellent computational platform and data availability, the project succeeded thanks to Google’s culture, built around teams that wholeheartedly tackle such challenges with the conviction that they will set a new bar.

In our book chapter, “Google Search by Voice: A Case Study”, we describe the basic technology, the supporting technologies, and the user interface design behind Google Search by Voice. We describe how we built it and what lessons we have learned. As the product required many helping hands to build, this chapter required many helping hands to write. We believe it provides a valuable contribution to the academic community.

The book, Advances in Speech Recognition, is available for purchase from Springer.

Thursday, September 2, 2010

Towards Energy-Proportional Datacenters



This is part of a series highlighting notable publications by Googlers.

At Google, we operate large datacenters containing clusters of servers, networking switches, and more. While this gear costs a lot of money, an increasingly important cost -- both in dollars and in environmental impact -- is the electricity that drives the computing clusters and the cooling infrastructure. Since our clusters often do not run at full utilization, Google recently put forth a call to industry and researchers to develop energy-proportional computer systems. With such systems, the power consumed by our clusters would be directly proportional to utilization. Servers consume the most electricity, so researchers have responded to Google’s call by focusing their attention on servers. As servers become increasingly energy proportional, however, the “always on” network fabric that connects them will consume an increasing fraction of datacenter power unless it too becomes energy proportional.
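To see why proportionality matters, consider a hedged back-of-the-envelope comparison; the wattages and utilization below are illustrative, not measurements from our datacenters.

P_PEAK = 300.0  # watts per server at 100% utilization (illustrative)
P_IDLE = 150.0  # watts per server at idle: a typical non-proportional floor

def power_conventional(u):
    # Linear model with a large idle floor: P = P_idle + (P_peak - P_idle) * u
    return P_IDLE + (P_PEAK - P_IDLE) * u

def power_proportional(u):
    # Ideal energy-proportional server: P = P_peak * u
    return P_PEAK * u

u = 0.3  # clusters often run well below peak utilization
print(power_conventional(u))  # 195.0 W: 65% of peak power for 30% of the work
print(power_proportional(u))  #  90.0 W: proportionality cuts this by more than half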

In a paper recently published at the International Symposium on Computer Architecture (ISCA), we push further towards the goal of energy-proportional computing by focusing on the energy usage of high-bandwidth, highly scalable cluster networking fabrics. This research considers a broad set of architectural and technological solutions to optimize energy usage without sacrificing performance. First, we show how the Flattened Butterfly network topology uses less power because it uses fewer switching chips and fewer links than a comparable-performance network built using the more conventional Fat Tree topology. Second, our approach takes advantage of the observation that when network demand is low, we can reduce the speed at which links transmit data. We show via simulation that, by tuning link speeds very rapidly, we can reduce power consumption with little impact on performance. Finally, our research is a further call to action for the academic and industry research communities to make energy efficiency, and energy proportionality in particular, a first-class citizen in networking research. Put together, our proposed techniques can reduce energy costs for typical Google workloads in our production datacenters by millions of dollars!
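As a rough sketch of the second idea, each link can run at the slowest data rate that still satisfies its current demand; the rate and power steps below are hypothetical, and the paper evaluates such policies in simulation rather than prescribing these numbers.

# (rate in Gb/s, power in watts): slower link modes draw less power.
LINK_MODES = [(40, 5.0), (20, 3.0), (10, 2.0), (5, 1.2), (2.5, 0.8)]

def pick_mode(demand_gbps):
    # Choose the cheapest mode that still meets demand; if demand
    # exceeds every step, fall back to the fastest mode.
    for rate, power in reversed(LINK_MODES):  # slowest first
        if rate >= demand_gbps:
            return rate, power
    return LINK_MODES[0]

for demand in (1.0, 8.0, 35.0):
    rate, power = pick_mode(demand)
    print(f"demand {demand:>4} Gb/s -> link at {rate} Gb/s, {power} W")

The catch, and the focus of our simulations, is switching rates quickly enough that bursts of traffic do not suffer added latency.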