OCR From Photographs

A quick preview of a longer process: running Optical Character Recognition (OCR, essentially converting images of text into actual text) on photographs, many of which do not include any text at all. Below is the result, run on about 25 images:

Source code to come shortly, though it’s a pretty simple automation written in Python using Tesseract.
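
Until the real source is posted, here is a minimal sketch of what such an automation might look like. It is not the actual script: the function names, the list of image extensions, and the choice to shell out to the `tesseract` command-line tool (rather than use a Python wrapper) are all assumptions made for illustration. It expects Tesseract to be installed and on the PATH.

```python
import subprocess
from pathlib import Path

# Assumption: which file extensions count as photographs.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def find_images(folder):
    """Return the image files in `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def tesseract_command(image_path):
    """Build the CLI call; using 'stdout' as the output base prints the text."""
    return ["tesseract", str(image_path), "stdout"]


def ocr_image(image_path):
    """Run Tesseract on one image and return whatever text it reports."""
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=False,  # photos without text should not stop the batch
    )
    return result.stdout.strip()


def ocr_folder(folder):
    """Map each image path to its recognized text (often junk for photos)."""
    return {path: ocr_image(path) for path in find_images(folder)}
```

Calling `ocr_folder("photos")` on a hypothetical `photos` directory would return a dictionary of path-to-text pairs; for photographs with no text in them, the values will mostly be empty strings or stray characters.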

Random Walk Through A Novel


A test for #NaNoGenMo, or National Novel Generation Month, initiated by Darius Kazemi.

An existing text is loaded word by word and arranged into a 2D grid. Starting from a random position in the grid, a “cursor” moves one cell up, down, left, or right, and the word it lands on is appended to the output. The process repeats until the output reaches up to 50,000 words. Commas, periods, and paragraph breaks are inserted at random along the way.
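
The process can be sketched roughly as follows. This is an illustration and not the code from the repository linked at the end of this post: the grid width, the punctuation probabilities, the choice to skip moves that would leave the grid, and all function names are assumptions.

```python
import random


def build_grid(words, width):
    """Arrange the words into rows of `width` (the last row may be shorter)."""
    return [words[i:i + width] for i in range(0, len(words), width)]


def random_walk(grid, steps, rng):
    """Walk the grid from a random start, collecting up to `steps` words."""
    row = rng.randrange(len(grid))
    col = rng.randrange(len(grid[row]))
    output = [grid[row][col]]
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    for _ in range(steps - 1):
        d_row, d_col = rng.choice(moves)
        new_row, new_col = row + d_row, col + d_col
        # Assumption: a move that would leave the grid is simply skipped.
        if 0 <= new_row < len(grid) and 0 <= new_col < len(grid[new_row]):
            row, col = new_row, new_col
            output.append(grid[row][col])
    return output


def punctuate(words, rng, p_comma=0.05, p_period=0.08, p_break=0.2):
    """Add random commas and periods; some periods also end the paragraph."""
    pieces = []
    capitalize = True  # capitalize the first word and each word after a period
    for word in words:
        token = word[:1].upper() + word[1:] if capitalize else word
        capitalize = False
        separator = " "
        roll = rng.random()
        if roll < p_period:
            token += "."
            capitalize = True
            if rng.random() < p_break:
                separator = "\n\n"
        elif roll < p_period + p_comma:
            token += ","
        pieces.append(token + separator)
    return "".join(pieces).strip()


if __name__ == "__main__":
    rng = random.Random()
    source = "it was the best of times it was the worst of times".split()
    grid = build_grid(source, width=4)
    print(punctuate(random_walk(grid, steps=40, rng=rng), rng))
```

Because neighbouring cells in the grid are either adjacent words in the source or words exactly one row apart, the walk tends to bounce back and forth between the same few words, which is where the stuttering rhythm of the excerpts below comes from.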

An excerpt, using “A Tale of Two Cities” as a source text (the random walk is visualized above):

Jurys upon pay thousand one and one. Thousand were I than tried He tried than I were thousand seven, now were and than. I, were and than and were, thousand one. November before and were now turned now were now turned done to explain to. Tried than tried than better. Than and, before better before and were I to the to explain to when time.

More. Remarkable more remarkable of remarkable of lawwork no lawwork to do Dont. Do too Lord too Lord inquired of living in London in London in London, arisen to arisen and do to how. Known not known where up did seventyfive, stood seventyfive stood seventyfive did seventyfive stood up did up where business did seventyfive. Stood seventyfive stood had, not that not had stood again and Was, all unless the prisoners that had it had passing arts and thought of powers of thought been more knew was knew they knew. More slowly the slowly. The slowly nothing about the and the about nothing taken off about off taken, prisoner in off taken off That the. That. The about nothing about nothing taken off about the about nothing slowly. Nothing knew they been. They. That had it Some it had passing thought and powers of the slowly the infamy the of selfdeceit infamy selfdeceit of.

Another example, with repetition allowed (“The Lord of the Rings” as source):

Help would help would help would judgement For help for help help for them, for who would judgement For For help For even even even even, the even it at looked and and, and again death death friends who who who in death out pocket out death out death death friends friends friends task easy easy an first an fine, deal his pocket his of his of of chosen and his deal his deal fine fine his first first his were him him were There There tracked tracked that find is. Begins to grip But as as far far as as But too too clear Making far far far as to as as out as task friends task friends who who for who who for them them for for Gollums out as.

Source code and texts here: https://github.com/jeffThompson/NaNoGenMo