Tuesday, November 11, 2008

plop: Probabilistic Learning of Programs



Cross-posted with Open Source at Google blog

Traditional machine learning systems work with relatively flat, uniform data representations, such as feature vectors, time-series, and probabilistic context-free grammars. However, reality often presents us with data which are best understood in terms of relations, types, hierarchies, and complex functional forms. The best representational scheme we computer scientists have for coping with this sort of complexity is computer programs. Yet there are comparatively few machine learning methods that operate directly on programmatic representations, due to the extreme combinatorial explosions involved and the semantic complexities of programs.

The plop project is part a new approach to learning programs being developed at Google and elsewhere that takes on the challenges of learning programs through a unified approach based on reducing programs to a hierarchical normal form, building sequences of specialized representations for programs as search progresses, maintaining alternative representations, and managing uncertainty probabilistically by applying estimation-of-distribution algorithms over program spaces, and exploiting probabilistic background knowledge.

For more information on this approach to learning programs, see my doctoral dissertation. For more on the overall philosophy and where things are going, see the plop wiki on Google Code.

No comments:

Post a Comment