zanna
01-31-09 16:52
gibts keine hier
That's pretty cool! How sophisticated was the word predictor -- how extensive was its vocabulary, and what kinds of rules governed the predictions? Did grammar figure in very much? (How would you code a computer to know grammar?!? That seems really crazy-smart to me!)
david *
01-31-09 17:36
Ross Is. Br.
The vocabulary is amazingly vast! We processed all of English Wikipedia. It doesn't really know grammar; it just knows how frequently a word is used, how frequently it follows the previous word, and how frequently it follows the two previous words. It does give the appearance of knowing grammar, though.
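
For anyone curious what that kind of counting looks like in code, here is a minimal sketch in Python. It is not the actual word predictor: the function names (`train`, `predict`), the tiny toy corpus, and the simple "use the longest context that has been seen before" fallback are all assumptions made for illustration. The post only says that the three kinds of frequencies are used, not how they are combined.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Count single words, words after one word, and words after two words."""
    uni = Counter(tokens)                 # how often each word is used
    bi = defaultdict(Counter)             # bi[prev][word] -> count
    tri = defaultdict(Counter)            # tri[(prev2, prev1)][word] -> count
    for prev, word in zip(tokens, tokens[1:]):
        bi[prev][word] += 1
    for p2, p1, word in zip(tokens, tokens[1:], tokens[2:]):
        tri[(p2, p1)][word] += 1
    return uni, bi, tri


def predict(model, context, k=3):
    """Suggest up to k next words, preferring the longest context seen in training.

    Assumed fallback order (hypothetical): two previous words, then one
    previous word, then overall word frequency.
    """
    uni, bi, tri = model
    context = [w.lower() for w in context]
    if len(context) >= 2 and tuple(context[-2:]) in tri:
        counts = tri[tuple(context[-2:])]
    elif context and context[-1] in bi:
        counts = bi[context[-1]]
    else:
        counts = uni
    return [word for word, _ in counts.most_common(k)]


# Toy corpus standing in for "all of English Wikipedia".
corpus = "the cat sat on the mat the cat sat on the sofa the cat ran".split()
model = train(corpus)

print(predict(model, ["the", "cat"], k=1))   # ['sat']  (two-word context)
print(predict(model, ["zebra", "on"], k=1))  # ['the']  (falls back to one word)
print(predict(model, [], k=1))               # ['the']  (falls back to overall frequency)
```

Nothing in the sketch knows any grammar. "sat" comes up after "the cat" only because that sequence is the most frequent one in the counts, which is why the output can look grammatical anyway.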
david *
01-31-09 17:38
Ross Is. Br.
Also, kudos for asking me about the word predictor, which is my baby. I have had little involvement with the other projects.
zanna
01-31-09 18:46
gibts keine hier
Heh, well, that's the clip you figured into, so I thought you would know about it. The two-word and one-word stats seem like a pretty effective way of predicting. Does it ever give you bad stuff, just because certain word combos are always followed by a curse word, etc.? =/
chucho *
01-31-09 19:29
Breathe deep
Oh, come on. You just stole it from the macbook wheel, didn't you?
unfathomablej
01-31-09 20:56
scholar of China
bwhaha Chucho.
david *
02-01-09 02:34
Ross Is. Br.
@chucho: Too bad it is two years older than the MBW.

@zanna: Luckily, because Wikipedia is a pretty tame source, there are not a lot of phrases that accidentally include curse words. This is both a good thing and a bad thing. Obviously it would be nice to have more conversational English, rather than trying to get people to type like an encyclopedia. We considered reading forum posts or all of LiveJournal, but decided that might be a scary glimpse into human nature...
