Blanked
Blanked is a game where you are given an excerpt of a short story with one word missing, and you have to guess the word from five options. The app was built with TensorFlow, Flask, and MongoDB.
I scraped the data from Classic Short Stories using Python and BeautifulSoup, choosing that site for its straightforward layout. I then trained a character-level LSTM on the whole collection of stories. The idea was that, since LSTMs take a sequence in and produce a sequence out, the model could generate sequences of words as if they came from the short stories. To create the questions, I picked a random sentence and removed its lowest-frequency word, which avoids answers like "the" or "a". These questions were saved to a database for faster access. Once the model was trained, I used the part of the sentence leading up to the removed word as a "seed" to predict the immediate next word. I repeated this process until I had four different predicted words and used these as the other multiple-choice "answers" alongside the real word.
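The question-generation step could look roughly like the sketch below. This is not the original code: the names count_words and make_question, the regex tokenizer, the "_____" placeholder, and the dictionary layout are all assumptions made for illustration. It picks the word with the lowest corpus frequency in a sentence, blanks it out, and keeps the text before the blank as the seed.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters and apostrophes count as words.
WORD_RE = re.compile(r"[A-Za-z']+")


def count_words(corpus):
    """Count lowercase word frequencies over the whole corpus."""
    return Counter(m.group().lower() for m in WORD_RE.finditer(corpus))


def make_question(sentence, word_counts):
    """Blank out the lowest-frequency word in a sentence.

    Returns a dict with the blanked text, the removed word, and the seed
    (the text leading up to the blank), or None if the sentence has no words.
    """
    matches = list(WORD_RE.finditer(sentence))
    if not matches:
        return None
    # min() keeps the first match when several words tie for rarest.
    rarest = min(matches, key=lambda m: word_counts[m.group().lower()])
    seed = sentence[:rarest.start()]
    return {
        "text": seed + "_____" + sentence[rarest.end():],
        "answer": rarest.group(),
        "seed": seed,
    }
```

For example, with a corpus of "The cat sat on the mat. The dog sat on the rug. The cat chased the dog.", calling make_question("The cat sat on the mat.", count_words(corpus)) blanks out "mat", since it appears only once, and never picks "the". A random sentence can be chosen with random.choice before calling make_question.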
In hindsight, word2vec would have been a much better model, as it can actually predict similar words from a given context. The LSTM produces interesting sequences, but because its predictions are seeded only with the text before the blank, they often do not fit the rest of the sentence, and it is often very easy to figure out which word is the actual answer.