English Language File Extensions

Having just wrapped up a long project, I’ve wasted much of this morning on a dumb little idea: compiling all file extensions that are also valid words in the English language. I used a Processing sketch to scrape the website filext.com, then a Python script running the Natural Language Toolkit to check the results against the dictionary.
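
For the curious, the dictionary check can be sketched roughly like this. This is a minimal illustration rather than the actual script: the function name, the made-up `.xyzq` extension, and the tiny stand-in word list are all assumptions for the example. With NLTK installed and its word-list corpus downloaded, `nltk.corpus.words.words()` would be one possible source for the vocabulary.

```python
def english_extensions(extensions, vocabulary):
    """Return the extensions that also appear in the vocabulary.

    Leading dots and letter case are ignored; the result is sorted
    and free of duplicates.
    """
    vocab = {word.lower() for word in vocabulary}
    cleaned = {ext.strip().lstrip(".").lower() for ext in extensions}
    return sorted(ext for ext in cleaned if ext and ext in vocab)


# Stand-in word list for illustration only (an assumption, not the real data).
# With NLTK available, nltk.corpus.words.words() could be passed in instead.
sample_vocabulary = ["dream", "milk", "poem", "setup", "tree"]

# A few scraped-looking extensions; ".xyzq" is invented so something gets rejected.
sample_extensions = [".DREAM", "poem", ".xyzq", "milk", ".poem"]

matches = english_extensions(sample_extensions, sample_vocabulary)
print(matches)
```

Passing the vocabulary in as an argument keeps the filter independent of any particular dictionary, so swapping word lists is a one-line change.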

It's not perfect (some acronyms made their way through) and could be better (separate files for each part of speech would make it easier to build texts).

Also included is a random poem builder – here’s a sample:

BD SETUP DREAM
al vat 100 works tb nob aim name press beacon xes sod code atm four arm
tao play hairy mob whiz medical ipod exs or
ews bh lxs session poem wax serial locked primer
ybs erasure rummy ascii tis hiv sparse driver spiff pic video 98 amos first
rip
arp tree ad watch
rummy colors
wus ebs mo
clearance pip pro english ph idea messenger monday wmo ism
milk sequence
caps fat correct pub three blocks 110 more blue hdl saw value m start holly
fez tnf male chorus kvs kick vac frame nrc
night lsd resource arcane arch bks
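
A poem builder along these lines only takes a few lines of Python. The sketch below is an approximation, not the code from the repository: the function name, the three-word uppercase title, the line count, and the maximum line length are guesses based on the sample above, and the short word list is a stand-in for the full results.

```python
import random


def build_poem(words, num_lines=14, max_words_per_line=16, seed=None):
    """Build a random poem: an uppercase three-word title, then lines of random length."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    title = " ".join(rng.choice(words) for _ in range(3)).upper()
    lines = []
    for _ in range(num_lines):
        count = rng.randint(1, max_words_per_line)  # inclusive on both ends
        lines.append(" ".join(rng.choice(words) for _ in range(count)))
    return title + "\n" + "\n".join(lines)


# Stand-in word list drawn from the sample poem (assumption: the real builder
# would read from the full CSV of results instead).
sample_words = ["dream", "milk", "poem", "setup", "tree", "rummy", "night", "arcane"]

poem = build_poem(sample_words, num_lines=5, seed=1)
print(poem)
```

Each word is drawn independently, so repeats within a poem are possible, just as "rummy" turns up twice in the sample.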

Code and resulting data are available on GitHub; full CSV results after the break.

 
