English Language File Extensions

Having just wrapped up a long project, I’ve wasted much of this morning on a dumb little idea: compiling all file extensions that are also valid words in the English language. I used a Processing sketch to scrape the website filext.com, then a Python script running the Natural Language Toolkit to check each extension against the dictionary.

The results are not perfect (some acronyms made their way through) and could be better (separate files for parts of speech would make it easier to build texts).
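For anyone curious what the dictionary check might look like, here is a minimal sketch, not the actual script. It assumes the extensions have already been scraped into a list, and it guesses at NLTK's `words` corpus as the dictionary (the corpus actually used may differ). The function names `load_nltk_vocabulary` and `english_extensions` are made up for illustration.

```python
def load_nltk_vocabulary():
    """Return NLTK's English word list as a lowercase set.

    Assumption: nltk is installed and the 'words' corpus has been
    fetched beforehand with nltk.download('words').
    """
    # Imported lazily so the filter below also works without NLTK.
    from nltk.corpus import words
    return {w.lower() for w in words.words()}


def english_extensions(extensions, vocabulary):
    """Keep only the extensions that appear in the given vocabulary."""
    matches = set()
    for ext in extensions:
        # Normalize ".DOC" and "doc " to "doc" before the lookup.
        candidate = ext.strip().lstrip(".").lower()
        if candidate and candidate in vocabulary:
            matches.add(candidate)
    return sorted(matches)
```

With a tiny stand-in vocabulary, `english_extensions([".DOC", "milk", ".xyzzy", "code"], {"milk", "code", "tree"})` returns `['code', 'milk']`. Passing `load_nltk_vocabulary()` as the second argument would run the same check against the full word list, and a plain set lookup like this is also why acronyms that happen to be dictionary entries slip through.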

Also included is a random poem builder – here’s a sample:

al vat 100 works tb nob aim name press beacon xes sod code atm four arm
tao play hairy mob whiz medical ipod exs or
ews bh lxs session poem wax serial locked primer
ybs erasure rummy ascii tis hiv sparse driver spiff pic video 98 amos first
arp tree ad watch
rummy colors
wus ebs mo
clearance pip pro english ph idea messenger monday wmo ism
milk sequence
caps fat correct pub three blocks 110 more blue hdl saw value m start holly
fez tnf male chorus kvs kick vac frame nrc
night lsd resource arcane arch bks
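
A poem builder along these lines can be very small. The sketch below is a guess at the general approach rather than the code in the repository: it picks a random number of words for each line from the filtered list. The name `build_poem` and its parameters (line count, words per line) are assumptions chosen to roughly match the sample above.

```python
import random


def build_poem(words, num_lines=12, min_words=2, max_words=16, rng=None):
    """Assemble a poem from randomly chosen words.

    Each line gets a random word count between min_words and max_words.
    Pass a seeded random.Random as rng for repeatable output.
    """
    rng = rng or random.Random()
    lines = []
    for _ in range(num_lines):
        count = rng.randint(min_words, max_words)
        lines.append(" ".join(rng.choice(words) for _ in range(count)))
    return "\n".join(lines)
```

Calling `print(build_poem(["milk", "sequence", "rummy", "colors"]))` produces a different twelve-line poem each run.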

The code and resulting data are available on GitHub; full CSV results after the break.
