Alon Halevy, Google
Peter Norvig, Google
Fernando Pereira, Google
Problems that involve interacting with humans, such as natural language understanding, have not proven to be solvable by concise, neat formulas like F = ma. Instead, the best approach appears to be to embrace the complexity of the domain and address it by harnessing the power of data: if other humans engage in the tasks and generate large amounts of unlabeled, noisy data, new algorithms can be used to build high-quality models from the data.
I will need some time to make sense of this. Any thoughts out there?
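One way to make the quote concrete: a minimal sketch in the spirit of Norvig's well-known data-driven spelling corrector. The idea is that you don't hand-write rules for every misspelling; you just count words in a big (noisy) corpus and pick the most frequent nearby word. The tiny corpus below is a toy stand-in, not real web-scale data.

```python
from collections import Counter

# Toy "corpus": imagine this scraped, noisily, from the web at scale.
corpus = "the quick brown fox jumps over the lazy dog the dog barks".split()
counts = Counter(corpus)

def edits1(word):
    """All strings one edit (delete, swap, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)

def correct(word):
    """Pick the most frequent known word within one edit, else keep the word."""
    candidates = [w for w in edits1(word) if w in counts] or [word]
    return max(candidates, key=counts.get)

print(correct("teh"))  # "the" -- the most common nearby word in the corpus
print(correct("dgo"))  # "dog"
```

No grammar, no dictionary of misspellings: just frequencies. The claim in the quote is that, with enough data, this kind of dumb counting beats cleverer hand-built rules.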

Comments (4)
> I will need some time to make sense of this.
seems like the straightforward google approach to me --
getting the essence of things from large amounts of data.
-bowerbird
Sounds like they want to invent something like the library catalog, or the Dewey Decimal System.
"...embrace the complexity of the domain and address it by harnessing the power of data..."
translates: seems random, so we'll overpower it with data...(?)
I think what they are saying is that computers can be used as a substitute for peer review. Or should be.
But can they? Isn't there a better way to go through that 'aggregate of data' or whatever in a meaningful way? How about just trusting a good blogger who has lots of time on their hands and knows how to do good string searches?
I always get the impression that the people who write this kind of jargon have never done much academic research in a real library with lots of books. Most scholars just read their way to knowledge, little by little, and they use annotated bibliographies and notes that were written by others who have read even more than they have. It's worked well so far...