Update! For El Capitan and users of newer versions of OS X, you may run into issues installing Torch or Lua packages. A fix is included below.
Update number two! Zach in the comments offers a really helpful fix if you’re on Sierra.
Update three! A lot has changed since 2016, so I’ll be posting a new version of this tutorial soon. In the meantime, please see the comments for common sticking points and troubleshooting.
There have been many recent examples of neural networks making interesting content after the algorithm has been fed input data and “learned” about it. Many of these, Google’s Deep Dream being the most well-covered, use and generate images, but what about text? This tutorial will show you how to install Torch-rnn, a set of recurrent neural network tools for character-based (ie: single letter) learning and output – it’s written by Justin Johnson, who deserves a huge “thanks!” for this tool.
The details about how all this works are complex and quite technical, but in short we train our neural network character-by-character, instead of with words like a Markov chain might. It learns what letters are most likely to come after others, and the text is generated the same way. One might think this would output random character soup, but the results are startlingly coherent, even more so than more traditional Markov output.
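As a toy illustration of the character-by-character idea (this is just a letter-frequency table, not Torch-rnn's actual recurrent network, and the sample text is made up), a minimal sketch might look like:

```python
import random
from collections import defaultdict

# Toy character-level model: count which character tends to follow each
# character, then sample from those counts. A real RNN learns far richer
# context than a single preceding letter.
def train_counts(text):
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, seed, length, rng=random.Random(42)):
    out = [seed]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break
        chars, weights = zip(*followers.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

counts = train_counts("to be or not to be")
print(generate(counts, "t", 10))
```

Even this crude version produces vaguely word-shaped output; the RNN's longer memory is what pushes the results toward real coherence.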
Torch-rnn is built on Torch, a set of scientific computing tools for the programming language Lua, which lets us take advantage of the GPU, using CUDA or OpenCL to accelerate the training process. Training can take a very long time, especially with large data sets, so the GPU acceleration is a big plus.
You can read way more info on how this all works here:
http://karpathy.github.io/2015/05/21/rnn-effectiveness
STEP 1: Install Torch
First, we have to install Torch for our system. (This section via this Torch install tutorial.)
A few notes before we start:
- Installing Torch will also install Lua and luarocks (the Lua package manager) so no need to do that separately.
- If Lua is already installed, you may run into some problems (I’m not sure how to fix that, sorry!)
- We’ll be doing everything in Terminal – if you’ve never used the command-line, it would be good to learn a bit more about how that works before attempting this install.
- If you’re running a newer OS such as El Capitan, you may run into problems installing Torch, or installing packages afterwards. If that’s the case, you can follow these instructions.
In Terminal, go to your user’s home directory* and run the following commands one at a time:
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
bash install-deps
./install.sh
This downloads the Torch repository and installs it with Lua and some core packages that are required. This may take a few minutes.
We need to add Torch to the PATH variable so it can be found by our system. Open your .bash_profile file (which is normally hidden) in a text editor using this command:
touch ~/.bash_profile; open ~/.bash_profile
And add these two lines at very bottom:
# TORCH
export PATH=$PATH:/Users/<your user name>/torch/install/bin
…replacing <your user name> in the path with your actual username. Save and close, then restart Terminal. When done, test it with the command:
th
Which should give you the Torch prompt. Use Control-c twice to exit, or type os.exit().
* You can install Torch anywhere you like, but you’ll have to update all the paths in this tutorial to your install location.
STEP 2: Install CUDA Support
Note: this step is only possible if your computer has an NVIDIA graphics card!
We can leverage the GPU of our computer to make the training process much faster. This step is optional, but suggested.
Download the CUDA tools with the network install – this is way faster, since it’s a 300kb download instead of 1GB: https://developer.nvidia.com/cuda-downloads.
Run the installer; when done, we have to update the PATH variable in the .bash_profile file like we did in the last step. Open the file and add these three lines (you may need to change CUDA-<version number> depending on which you install – Kevin points out that CUDA 8 may cause errors):
# CUDA
export PATH=/Developer/NVIDIA/CUDA-7.5/bin:$PATH
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-7.5/lib:$DYLD_LIBRARY_PATH
You may also need to modify your System Preferences under Energy Saver:
- Uncheck Automatic Graphics Switch.
- Set Computer Sleep to “Never”.
Restart Terminal and test the install by running this command:
kextstat | grep -i cuda
You should get something like:
286 0 0xffffff7f8356e000 0x2000 0x2000 com.nvidia.CUDA (1.1.0) 5AFE550D-6361-3897-912D-897C13FF6983 <4 1>
There are further tests in the NVIDIA docs, if you want to try them, but they’re not necessary for our purposes. If you want to go deeper into this process, you can follow these instructions from NVIDIA.
STEP 3: Install HDF5 Library for Lua
Torch-rnn comes with a preprocessor script, written in Python, that prepares our text for training. It will save our sample as an h5 and a json file, but requires the HDF5 library to be installed.
First, install HDF5 using Homebrew:
brew tap homebrew/science
brew install hdf5
(If you have issues with the install or in the next step, Joshua suggests adding the flag --with-mpi to the Homebrew command above, which may help. If that doesn’t work, Charles has a suggested fix. If you get an error that says Unsupported HDF5 version: 1.10.0, you can try Tom’s suggestion.)
Move to the Torch folder inside your user home directory (ie: /Users/<your user name>/torch/). The following commands download the Torch-specific HDF5 implementation and install it:
git clone git@github.com:deepmind/torch-hdf5.git
cd torch-hdf5
luarocks make hdf5-0-0.rockspec
If you haven’t used git or GitHub before, as Luke points out in the comments, you might get an SSH key error. You can set up an SSH key, or just download the repository manually from here.
STEP 4: Install HDF5 Library for Python
We also need to install HDF5 support for Python. You can do this using Pip:
sudo pip install h5py
You may get a bunch of warnings, but that’s ok. Test that it works by importing the library:
python
import h5py
If it imports without error, you’re good!
STEP 5: Install Torch-rnn
Now that we’ve prepared our computer with all the required libraries, it’s time to finally install Torch-rnn!
- Download the ZIP file from the project’s GitHub repository.
- Unzip it and rename to torch-rnn.
- Move the Torch-rnn folder to your Torch install folder inside your user home directory (ie: /Users/<your user name>/torch/torch-rnn )
- (You can also do this by cloning the repo, but if you know how to do that, you probably don’t need the instructions in this step 😄)
STEP 6: Prepare Your Data
We’re ready to prepare some data! Torch-rnn comes with a sample input file (all the writings of Shakespeare) that you can use to test everything. Of course, you can also use your own data; just combine everything into a single text file.
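If your own writing is spread across several files, a short script can do the combining. This is just a sketch; the pattern and output path are placeholders for wherever your data actually lives:

```python
import glob

def combine_corpus(pattern, out_path):
    """Concatenate every file matching `pattern` into one corpus file."""
    with open(out_path, "w") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                out.write(f.read())
                out.write("\n")

# e.g. combine_corpus("my_writing/*.txt", "data/my_corpus.txt")
```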
In the Terminal, go to your Torch-rnn folder and run the preprocessor script:
python scripts/preprocess.py --input_txt data/tiny-shakespeare.txt --output_h5 data/tiny_shakespeare.h5 --output_json data/tiny_shakespeare.json
You should get a response that looks something like this:
Total vocabulary size: 65
Total tokens in file: 1115394
Training size: 892316
Val size: 111539
Test size: 111539
Using dtype <type 'numpy.uint8'>
This will save two files to the data directory (though you can save them anywhere): an h5 and json file that we’ll use to train our system.
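Conceptually, the preprocessor just assigns every unique character an integer id (roughly what the json file holds) and stores the encoded text split into training, validation, and test portions (roughly what goes into the h5 file). A stdlib-only sketch of the idea, with a made-up sample text and an assumed 80/10/10 split:

```python
import json

def build_vocab(text):
    # One integer id per unique character
    chars = sorted(set(text))
    return {c: i + 1 for i, c in enumerate(chars)}

def encode(text, vocab):
    return [vocab[c] for c in text]

text = "to be or not to be"
vocab = build_vocab(text)
ids = encode(text, vocab)

# Split the encoded text 80/10/10 into train/val/test
n = len(ids)
train = ids[: int(0.8 * n)]
val = ids[int(0.8 * n): int(0.9 * n)]
test = ids[int(0.9 * n):]
print(json.dumps({"vocab_size": len(vocab), "train": len(train),
                  "val": len(val), "test": len(test)}))
```

The "Total vocabulary size" in the real script's output is exactly this count of unique characters in your corpus.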
STEP 7: Train
The next step will take at least an hour, perhaps considerably longer, depending on your computer and your data set. But if you’re ready, let’s train our network! Go to the Torch-rnn folder and run the training script (changing the arguments if you’ve used a different data source or saved them elsewhere):
th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json
The train.lua script uses CUDA by default, so if you don’t have that installed or available, you’ll need to disable it and run CPU-only using the flag -gpu -1. Lots more training and output options are available here.
It should spit out something like:
Running with CUDA on GPU 0
Epoch 1.00 / 50, i = 1 / 17800, loss = 4.163219
Epoch 1.01 / 50, i = 2 / 17800, loss = 4.078401
Epoch 1.01 / 50, i = 3 / 17800, loss = 3.937344
...
Your computer will get really hot and it will take a long time – the default is 50 epochs. You can see how long it took by adding time in front of the training command:
time th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json
If you have a really small corpus (under 2MB of text) you may want to try adding the following flags:
-batch_size 1 -seq_length 50
Setting -batch_size somewhere between 1 and 10 should give better results with the output.
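To see why these flags matter for small corpora: training cuts the encoded text into seq_length-character chunks and groups those into batches, so a tiny file yields very few batches at the defaults. The flag names are Torch-rnn's; the arithmetic below is a back-of-the-envelope illustration, not the trainer's exact bookkeeping:

```python
def count_chunks(num_chars, seq_length, batch_size):
    # Roughly how many training sequences and batches a corpus yields:
    # the text is cut into seq_length-character chunks, which are then
    # grouped batch_size at a time.
    sequences = num_chars // seq_length
    batches = sequences // batch_size
    return sequences, batches

# A ~2 MB corpus cut into 50-character sequences, 50 per batch
print(count_chunks(2_000_000, 50, 50))  # (40000, 800)

# A ~100 KB corpus with -batch_size 1 -seq_length 50 still yields
# a usable number of batches per epoch
print(count_chunks(100_000, 50, 1))  # (2000, 2000)
```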
STEP 8: Generate Some Output
Getting output from our neural network is considerably easier than the previous steps (whew!). Just run the following command:
th sample.lua -checkpoint cv/checkpoint_10000.t7 -length 2000
A few notes:
- The -checkpoint argument points to a t7 checkpoint file created during training. You should use the one with the largest number, since that will be the latest one created. Note: running training on another data set will overwrite this file!
- The -length argument is the number of characters to output.
- This command also runs with CUDA by default, and can be disabled the same way as the training command.
- Results are printed to the console, though it would be easy to pipe it to a file instead:
th sample.lua -checkpoint cv/checkpoint_10000.t7 -length 2000 > my_new_shakespeare.txt
- Lots of other options here.
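If you'd rather not eyeball the cv folder for the highest-numbered checkpoint, a tiny helper (hypothetical, not part of Torch-rnn) can pick it out, sorting numerically so checkpoint_10000 beats checkpoint_2000:

```python
import glob
import re

def latest_checkpoint(pattern="cv/checkpoint_*.t7"):
    """Return the checkpoint path with the highest iteration number, or None."""
    def iteration(path):
        m = re.search(r"checkpoint_(\d+)\.t7$", path)
        return int(m.group(1)) if m else -1
    paths = glob.glob(pattern)
    return max(paths, key=iteration) if paths else None

print(latest_checkpoint())  # None until you've trained at least once
```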
STEP 8A: “Temperature”
Changing the temperature flag will make the most difference in your network’s output. It changes the novelty and noise in the system, creating dramatically different output. The -temperature argument expects a number between 0 and 1.
Higher temperature
Gives a better chance of interesting/novel output, but more noise (ie: more likely to have nonsense, misspelled words, etc). For example, -temperature 0.9 results in some weird (though still surprisingly Shakespeare-like) output:
“Now, to accursed on the illow me paory; And had weal be on anorembs on the galless under.”
Lower temperature
Less noise, but less novel results. Using -temperature 0.2 gives clear English, but includes a lot of repeated words:
“So have my soul the sentence and the sentence/To be the stander the sentence to my death.”
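Under the hood, temperature simply divides the network's raw scores before they're turned into probabilities, so low values sharpen the distribution toward the single most likely character and high values let more surprising characters through. A stdlib-only sketch of the math (the scores here are invented):

```python
import math

def softmax_with_temperature(scores, temperature):
    # Divide raw scores by temperature, then apply softmax.
    # Lower temperature -> sharper (more repetitive) distribution.
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]
print(softmax_with_temperature(scores, 1.0))  # probability spread across all three
print(softmax_with_temperature(scores, 0.2))  # nearly all mass on the top character
```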
In other words, everything is a trade-off and experimentation is likely called for with all the settings.
All Done!
That’s it! If you make something cool with this tutorial, please tweet it to me @jeffthompson_.
Doing this in April 2018 —
‘brew tap homebrew/science’ gives the error ‘homebrew/science was deprecated. This tap is now empty as all its formulae were migrated.’
So I just ran ‘brew install hdf5’ directly and that worked
@Julia – yeah, I think the science tap is no longer used. Thanks for the update.
Hi Jeff,
I was able to generate some output after training the network on papers related to finance/economics. However, my output makes pretty much no sense — aside from some words belonging to the training corpus, there’s no grammatical structure at all, almost no punctuation and it’s overall painful to read. One caveat is that because I was taking texts from PDFs, my training corpus had lines of different lengths (for example, one line had 10 words in it and the next line had 5 in the text editor). Would this be a problem though?
Some overall stats:
Total tokens in file: 5074949
Training size: 4059961
batch size: 5
seq length: 50
dropout: 0.5
epochs: 50
I’ve experimented with different batch sizes and seq lengths, but haven’t noticed significant improvements.
Thank you!
@Julia – line lengths shouldn’t make a difference, it will just learn those line-breaks too! How many MBs is your overall training text? It should be at least 1MB of plain text to get anything good. You could also try playing with the “temperature” setting mentioned in the tutorial.
Hi Jeff,
Thanks for your reply! My corpus is 5.1 MB. I do have some special characters too (which were footnotes or symbols in the PDF), but I’m surprised that there isn’t even any consistency in the punctuation. I’ll keep tweaking and see what I can get… just takes a long time to train but I can start with a smaller set.
Sincere thanks to Jeff — and Tom & Zach in the comments — for helping me navigate torch-rnn, and its resultant errors, on MacOS (I’m running High Sierra, 10.13.4).
Following a bit of debugging, I too am now stuck on Step 7. I’m seeing an error that I can’t find a record of in the comments or elsewhere:
/Users//torch/install/bin/luajit: ./util/utils.lua:43: attempt to index local ‘f’ (a nil value)
stack traceback:
./util/utils.lua:43: in function ‘read_json’
train.lua:77: in main chunk
[C]: in function ‘dofile’
…lato/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0104bd0cc0
@Jeff or @Zach, do you have any thoughts on this?? Would very much appreciate any and all help! (am new to Lua, torch, etc)
I am stuck at this step:
“We need to add Torch to the PATH variable so it can be found by our system. Easily open your .bash_profile file (which is normally hidden) in a text editor using this command:
touch ~/.bash_profile; open ~/.bash_profile
And add these two lines at very bottom:
# TORCH
export PATH=$PATH:/Users//torch/install/bin
…replacing your username in the path. Save and close, then restart Terminal. When done, test it with the command:
th”
Where to I write this part? In the Text Edit file or in Terminal?
“# TORCH
export PATH=$PATH:/Users//torch/install/bin”
Basically when I type the command “th” Terminal responds with “command not found”
@Tyler – you add this to the .bash_profile file using a text editor. Your computer doesn’t know where to find the th command until you do that step.
Jeffery,
Thanks for the response on my last question. I have another.
In Step 3 when it comes to installing HDF5 Library for Lua, I entered the “brew tap homebrew/science” command and got this response:
“Updating Homebrew…
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/cask, homebrew/core).
==> Updated Formulae
abcm2ps heroku solr
apache-zeppelin i2p sratoolkit
armadillo kubernetes-cli tkdiff
brotli libyaml vagrant-completion
container-diff openapi-generator webpack
dcm2niix openh264 wildfly-as
frugal phoronix-test-suite wtf
gdbm presto yq
grib-api progress znapzend
gwyddion pyenv
Error: homebrew/science was deprecated. This tap is now empty as all its formulae were migrated.”
A) What does it mean that it has been “deprecated”?
B) Where did the formulae migrate to?
C) How does a novice like me move past this?
Thanks for your patience and I appreciate any help you or the community can offer.
T
@Tyler – I believe it should still work? That part of Homebrew has been removed, so I believe it’s just a warning. Were you able to continue with the install?
Great stuff, thanks Jeff!
MacBook (not pro, just macbook!) El Capitan 10.11.1, got it installed and working fine.
Had to use Tom Schofield’s fix in comment 27 Feb 2017, about installing an older version of hdf5.
Make sure the path in ~/torch/install/share/lua/5.1/hdf5 only says
HDF5_INCLUDE_PATH = “/usr/local/Cellar/hdf5/1.8.18/include”
and doesn’t have “;/usr/local/include/hdf5.h” which seems to be written there on install. Copying that file to that folder doesn’t work, as was suggested in another comment, but just removing that path is what made it work for me.
@julia, regarding the punctuation, I found that the ‘temperature’ setting really affects the level of punctuation. At temperature of 1.0 my samples were generating realistic looking punctuation, mainly commas, full stops and paragraphs (which is what the training data had most of, with little other kinds of punctuation), but at temperatures of 0.6ish and below, there was no punctuation.
For error:
— Unsupported HDF5 version: 1.10.0
There is a cloned branch in github: https://github.com/anibali/torch-hdf5
This branch support HDF5 1.10, just run:
git clone https://github.com/anibali/torch-hdf5.git
cd torch-hdf5
git checkout hdf5-1.10
luarocks make hdf5-0-0.rockspec
Hello, I am running terminal on Mac Os High Sierra and after running command of:
th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json -gpu -1
It spits out issue of this:
/Users/matthewclark/torch/install/bin/lua: …/matthewclark/torch/install/share/lua/5.2/hdf5/group.lua:88: NYI
stack traceback:
[C]: in function ‘__concat’
…/matthewclark/torch/install/share/lua/5.2/hdf5/group.lua:88: in function
[C]: in function ‘tostring’
…/matthewclark/torch/install/share/lua/5.2/hdf5/group.lua:39: in function ‘__init’
…/matthewclark/torch/install/share/lua/5.2/torch/init.lua:91: in function
[C]: in function ‘HDF5Group’
…s/matthewclark/torch/install/share/lua/5.2/hdf5/init.lua:74: in function ‘_loadObject’
…s/matthewclark/torch/install/share/lua/5.2/hdf5/file.lua:19: in function ‘__init’
…/matthewclark/torch/install/share/lua/5.2/torch/init.lua:91: in function
[C]: in function ‘HDF5File’
…s/matthewclark/torch/install/share/lua/5.2/hdf5/file.lua:148: in function
(…tail calls…)
./util/DataLoader.lua:17: in function ‘__init’
…/matthewclark/torch/install/share/lua/5.2/torch/init.lua:91: in function
[C]: in function ‘DataLoader’
train.lua:76: in main chunk
[C]: in function ‘dofile’
…lark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?
can anyone help? would be much appreciated
I wrote my own version of this tutorial for my girlfriend, who is not a programmer at all, culling together tips from the comments here and my own googling. Hopefully this will be helpful to some folks! https://ranieri.neocities.org/blog/?p=installing-torch-rnn-on-macos
Hi, thanks for this guide. I’ve made my way through most of the errors, but I’ve been stumped on this:
/Users/finlaybarris/torch/install/bin/luajit: cannot open train.lua: No such file or directory
stack traceback:
[C]: in function ‘dofile’
…rris/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0100223350
Any help would be greatly appreciated.
Well, I worked through that, but now I have different errors. I had a look through the comments and tried many of the suggested fixes, but I still have these errors:
/Users/finlaybarris/torch/install/bin/luajit: …/finlaybarris/torch/install/share/lua/5.1/trepl/init.lua:389: …/finlaybarris/torch/install/share/lua/5.1/trepl/init.lua:389: …rs/finlaybarris/torch/install/share/lua/5.1/hdf5/ffi.lua:42: Error: unable to locate HDF5 header file at /usr/local/Cellar/hdf5/1.10.4/include;/usr/local/opt/szip/include/hdf5.h
stack traceback:
[C]: in function ‘error’
…/finlaybarris/torch/install/share/lua/5.1/trepl/init.lua:389: in function ‘require’
train.lua:6: in main chunk
[C]: in function ‘dofile’
…rris/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x010aba2350
If anyone could help me that would be great
@Finn — did you install the HDF5 library? I know that one is a bit of a challenge, so you might have to look through the comments here for tips.
@Finn it looks like you are using version 1.10.x of HDF5, which is not compatible with torch-hdf5 (unless you are using a third-party modification of torch-hdf5 that has been updated to work with HDF5 1.10). You will need to specify version 1.8 of HDF5 when you install it with homebrew.
I also ended up needing to edit the lua configuration file (at ~/torch/install/share/lua/5.1/hdf5/config.lua) and take out the second path in HDF5_INCLUDE_PATH (everything from and including the semicolon up to and NOT including the ” at the end of the line). I have posted more detail about these fixes in the blog post I linked above but my blog seems to be down at the moment hopefully it will be back up by the time you read this though!
Hi, I’m completely new to this so I don’t know whether it’s related to my using High Sierra but up until the step 7 everything worked.
So when I try the training command it says this:
/Users/vonlanthenm/torch/install/bin/luajit: cannot open in mode r at /tmp/luarocks_torch-scm-1-8329/torch7/lib/TH/THDiskFile.c:673
stack traceback:
[C]: at 0x00854900
[C]: in function ‘DiskFile’
…s/vonlanthenm/torch/install/share/lua/5.1/torch/File.lua:405: in function ‘load’
sample.lua:19: in main chunk
[C]: in function ‘dofile’
…henm/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0100753350
Can anyone help?
Hello, I followed this guide which was helpful for the most part. But I had a lot of trouble in some areas. Firstly, as of now, Mojave isn’t even supported by CUDA (as far as I can tell).
With that out of the way, I did a lot of things to get CUDA working after I downgraded to High Sierra.
If you have Cuda installed already and its not working for this, you’ll have to google how to completely uninstall all of it.
First, you need the “Nvidia Web driver for mac”. The one with the Nvidia driver manager. Make sure you download it for the right version. For me thats 10.13.6.
Then you need to install the 7.5 CUDA Toolkit like mentioned in the article BUT don’t check the driver option. Download that separately from here: https://www.nvidia.com/object/mac-driver-archive.html
For compiling something (can’t remember exactly which thing, I think its CUTorch) you need an older version of the command line tools, specifically Command_Line_Tools_OS_X_10.11_for_Xcode_7.2
If you already have the command line tools installed, install that over it and then run sudo xcode-select --switch /Library/Developer/CommandLineTools
You may have to use the command “luarocks install cutorch” at the very last step to get it all working.
Those tips should help you overcome some of the errors I ran into. The rest of the steps listed above in the article are pretty solid. I ran into the clang problem which was solved by the developer tools, and then the CUDA version mismatch or whatever that I solved by the web driver + 7.5 + manual driver. I may have forgotten some other things I had to do, but I think that’s it. Hopefully that helps someone else out.
Hey, thanks for your great article.
Sadly I can’t get it to work, seems to be a problem with cjson.
Do you have an idea?
Greetings
/Users/ruben/torch/install/share/lua/5.2/trepl/init.lua:389: …rs/ruben/torch/install/share/lua/5.2/luarocks/loader.lua:117: error loading module ‘cjson’ from file ‘/Users/ruben/torch/install/lib/lua/5.2/cjson.so’:
dlopen(/Users/ruben/torch/install/lib/lua/5.2/cjson.so, 6): Symbol not found: _lua_objlen
Referenced from: /Users/ruben/torch/install/lib/lua/5.2/cjson.so
Expected in: flat namespace
in /Users/ruben/torch/install/lib/lua/5.2/cjson.so
stack traceback:
[C]: in function ‘error’
/Users/ruben/torch/install/share/lua/5.2/trepl/init.lua:389: in function ‘require’
train.lua:5: in main chunk
[C]: in function ‘dofile’
…uben/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?