Update! For El Capitan and users of newer versions of OS X, you may run into issues installing Torch or Lua packages. A fix is included now.
Update number two! Zach in the comments offers a really helpful fix if you’re on Sierra.
Update three! A lot has changed since 2016, so I’ll be posting a new version of this tutorial soon. In the meantime, please see the comments for common sticking points and troubleshooting.
There have been many recent examples of neural networks making interesting content after the algorithm has been fed input data and “learned” about it. Many of these, Google’s Deep Dream being the most well-covered, use and generate images, but what about text? This tutorial will show you how to install Torch-rnn, a set of recurrent neural network tools for character-based (ie: single letter) learning and output – it’s written by Justin Johnson, who deserves a huge “thanks!” for this tool.
The details about how all this works are complex and quite technical, but in short we train our neural network character-by-character, instead of with words like a Markov chain might. It learns what letters are most likely to come after others, and the text is generated the same way. One might think this would output random character soup, but the results are startlingly coherent, even more so than more traditional Markov output.
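To make that concrete, here's a toy sketch (plain Python, nothing to do with Torch) of the character-level idea: count which character tends to follow which, then generate by sampling from those counts. A real recurrent network learns much longer-range structure than this one-character lookback, but the unit of learning and output is the same: single characters.

```python
import random
from collections import Counter, defaultdict

def train_char_model(text):
    """Count, for each character, which characters follow it and how often."""
    follows = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        follows[cur][nxt] += 1
    return follows

def generate(follows, start, length, rng=random):
    """Generate text one character at a time, weighted by the counts."""
    out = [start]
    for _ in range(length - 1):
        counts = follows.get(out[-1])
        if not counts:
            break
        chars, weights = zip(*counts.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = train_char_model("to be or not to be")
```

A neural network replaces the raw counts with learned weights and a memory of what came earlier in the sequence, which is why its output is so much more coherent than this toy's.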
Torch-rnn is built on Torch, a set of scientific computing tools for the programming language Lua, which lets us take advantage of the GPU, using CUDA or OpenCL to accelerate the training process. Training can take a very long time, especially with large data sets, so the GPU acceleration is a big plus.
You can read way more info on how this all works here:
http://karpathy.github.io/2015/05/21/rnn-effectiveness
STEP 1: Install Torch
First, we have to install Torch for our system. (This section via this Torch install tutorial.)
A few notes before we start:
- Installing Torch will also install Lua and luarocks (the Lua package manager) so no need to do that separately.
- If Lua is already installed, you may run into some problems (I’m not sure how to fix those, sorry!)
- We’ll be doing everything in Terminal – if you’ve never used the command-line, it would be good to learn a bit more about how that works before attempting this install.
- If you’re running a newer OS such as El Capitan, you may run into problems installing Torch, or installing packages afterwards. If that’s the case, you can follow these instructions.
In Terminal, go to your user’s home directory* and run the following commands one at a time:
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
bash install-deps
./install.sh
This downloads the Torch repository and installs it with Lua and some core packages that are required. This may take a few minutes.
We need to add Torch to the PATH variable so it can be found by our system. Easily open your .bash_profile file (which is normally hidden) in a text editor using this command:
touch ~/.bash_profile; open ~/.bash_profile
And add these two lines at the very bottom:

# TORCH
export PATH=$PATH:/Users/<your user name>/torch/install/bin
…replacing your username in the path. Save and close, then restart Terminal. When done, test it with the command:
th
Which should give you the Torch prompt. Use Control-c twice to exit, or type os.exit().
* You can install Torch anywhere you like, but you’ll have to update all the paths in this tutorial to your install location.
STEP 2: Install CUDA Support
Note: this step is only possible if your computer has an NVIDIA graphics card!
We can leverage the GPU of our computer to make the training process much faster. This step is optional, but suggested.
Download the CUDA tools with the network install – this is way faster, since it’s a 300kb download instead of 1GB: https://developer.nvidia.com/cuda-downloads.
Run the installer; when done, we have to update the PATH variable in the .bash_profile file like we did in the last step. Open the file and add these three lines (you may need to change CUDA-<version number> depending on which version you install – Kevin points out that CUDA 8 may cause errors):

# CUDA
export PATH=/Developer/NVIDIA/CUDA-7.5/bin:$PATH
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-7.5/lib:$DYLD_LIBRARY_PATH
You may also need to modify your System Preferences under Energy Saver:
- Uncheck Automatic Graphics Switching.
- Set Computer Sleep to “Never”.
Restart Terminal and test the install by running this command:
kextstat | grep -i cuda
You should get something like:
286 0 0xffffff7f8356e000 0x2000 0x2000 com.nvidia.CUDA (1.1.0) 5AFE550D-6361-3897-912D-897C13FF6983 <4 1>
There are further tests in the NVIDIA docs, if you want to try them, but they’re not necessary for our purposes. If you want to go deeper into this process, you can follow these instructions from NVIDIA.
STEP 3: Install HDF5 Library for Lua
Torch-rnn comes with a preprocessor script, written in Python, that prepares our text for training. It will save our sample as an h5 and a json file, but requires the HDF5 library to be installed.
First, install HDF5 using Homebrew:
brew tap homebrew/science
brew install hdf5
(If you have issues with the install or in the next step, Joshua suggests adding the flag --with-mpi to the Homebrew command above, which may help. If that doesn’t work, Charles has a suggested fix. If you get an error that says Unsupported HDF5 version: 1.10.0, you can try Tom’s suggestion.)
Move to the Torch folder inside your user home directory (ie: /Users/<your user name>/torch/). The following commands download the Torch-specific HDF5 implementation and install it:

git clone git@github.com:deepmind/torch-hdf5.git
cd torch-hdf5
luarocks make hdf5-0-0.rockspec
If you haven’t used git or Github before, as Luke points out in the comments, you might get an SSH key error. You can get a key, or just download the repository manually from here.
STEP 4: Install HDF5 Library for Python
We also need to install HDF5 support for Python. You can do this using Pip:
sudo pip install h5py
You may get a bunch of warnings, but that’s ok. Test that it works by importing the library:
python
import h5py
If it imports without error, you’re good!
STEP 5: Install Torch-rnn
Now that we’ve prepared our computer with all the required libraries, it’s time to finally install Torch-rnn!
- Download the ZIP file from the project’s GitHub repository.
- Unzip it and rename to torch-rnn.
- Move the Torch-rnn folder to your Torch install folder inside your user home directory (ie: /Users/<your user name>/torch/torch-rnn )
- (You can also do this by cloning the repo, but if you know how to do that, you probably don’t need the instructions in this step 😄)
STEP 6: Prepare Your Data
We’re ready to prepare some data! Torch-rnn comes with a sample input file (all the writings of Shakespeare) that you can use to test everything. Of course, you can also use your own data; just combine everything into a single text file.
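If your own data is spread across several files, combining them is just a matter of concatenation. Here's a minimal Python sketch (the file names are placeholders for your own texts):

```python
from pathlib import Path

def combine(sources, dest):
    """Concatenate several text files into one training corpus."""
    with open(dest, "w") as out:
        for name in sources:
            out.write(Path(name).read_text())
            out.write("\n")  # keep a line break between documents
```

For example, combine(["album1.txt", "album2.txt"], "data/lyrics.txt") would merge two (hypothetical) lyrics files into a single corpus ready for the preprocessor.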
In the Terminal, go to your Torch-rnn folder and run the preprocessor script:
python scripts/preprocess.py --input_txt data/tiny-shakespeare.txt --output_h5 data/tiny_shakespeare.h5 --output_json data/tiny_shakespeare.json
You should get a response that looks something like this:
Total vocabulary size: 65
Total tokens in file: 1115394
Training size: 892316
Val size: 111539
Test size: 111539
Using dtype <type 'numpy.uint8'>
This will save two files to the data directory (though you can save them anywhere): an h5 and json file that we’ll use to train our system.
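For intuition, the preprocessor is doing roughly the following (a simplified sketch, not the actual scripts/preprocess.py code): build a character-to-index vocabulary, encode the full text as integers, and split the stream roughly 80/10/10 into train/validation/test. The real script stores the arrays in the h5 file and the vocabulary in the json file.

```python
def preprocess(text, val_frac=0.1, test_frac=0.1):
    # Map each distinct character to an integer index
    vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
    encoded = [vocab[ch] for ch in text]

    # Carve the encoded stream into train / validation / test chunks
    n = len(encoded)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    n_train = n - n_val - n_test
    return {
        "vocab": vocab,
        "train": encoded[:n_train],
        "val": encoded[n_train:n_train + n_val],
        "test": encoded[n_train + n_val:],
    }
```

This is where the "Total vocabulary size" and "Training size" numbers in the output above come from: distinct characters, and characters assigned to the training split.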
STEP 7: Train
The next step will take at least an hour, perhaps considerably longer, depending on your computer and your data set. But if you’re ready, let’s train our network! In the Torch-rnn folder, run the training script (changing the arguments if you’ve used a different data source or saved the files elsewhere):
th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json
The train.lua script uses CUDA by default, so if you don’t have that installed or available, you’ll need to disable it and run CPU-only using the flag -gpu -1. Lots more training and output options are available here.
It should spit out something like:
Running with CUDA on GPU 0
Epoch 1.00 / 50, i = 1 / 17800, loss = 4.163219
Epoch 1.01 / 50, i = 2 / 17800, loss = 4.078401
Epoch 1.01 / 50, i = 3 / 17800, loss = 3.937344
...
Your computer will get really hot and it will take a long time – the default is 50 epochs. You can see how long it took by adding time in front of the training command:
time th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json
If you have a really small corpus (under 2MB of text) you may want to try adding the following flags:
-batch_size 1 -seq_length 50
Setting -batch_size somewhere between 1 and 10 should give better results with the output.
STEP 8: Generate Some Output
Getting output from our neural network is considerably easier than the previous steps (whew!). Just run the following command:
th sample.lua -checkpoint cv/checkpoint_10000.t7 -length 2000
A few notes:
- The -checkpoint argument is to a t7 checkpoint file created during training. You should use the one with the largest number, since that will be the latest one created. Note: running training on another data set will overwrite this file!
- The -length argument is the number of characters to output.
- This command also runs with CUDA by default, and can be disabled the same way as the training command.
- Results are printed to the console, though it would be easy to pipe it to a file instead:
th sample.lua -checkpoint cv/checkpoint_10000.t7 -length 2000 > my_new_shakespeare.txt
- Lots of other options here.
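If you've been training for a while and have a folder full of checkpoints, a small script can pick the newest one for you. This is just a convenience sketch, assuming torch-rnn's default cv/ output directory and checkpoint_<N>.t7 naming:

```python
import re
from pathlib import Path

def latest_checkpoint(cv_dir="cv"):
    """Return the checkpoint_<N>.t7 path with the largest N, or None."""
    best, best_n = None, -1
    for path in Path(cv_dir).glob("checkpoint_*.t7"):
        m = re.match(r"checkpoint_(\d+)\.t7$", path.name)
        if m and int(m.group(1)) > best_n:
            best, best_n = path, int(m.group(1))
    return best
```

You could then paste the printed path into the -checkpoint argument rather than hunting through the folder by hand.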
STEP 8A: “Temperature”
Changing the temperature flag will make the most difference in your network’s output. It changes the novelty and noise in the system, creating dramatically different output. The -temperature argument expects a number between 0 and 1.
Higher temperature
Gives a better chance of interesting/novel output, but more noise (ie: more likely to have nonsense, misspelled words, etc). For example, -temperature 0.9 results in some weird (though still surprisingly Shakespeare-like) output:
“Now, to accursed on the illow me paory; And had weal be on anorembs on the galless under.”
Lower temperature
Less noise, but less novel results. Using -temperature 0.2 gives clear English, but includes a lot of repeated words:
“So have my soul the sentence and the sentence/To be the stander the sentence to my death.”
In other words, everything is a trade-off and experimentation is likely called for with all the settings.
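The trade-off is easy to see in a few lines: sampling divides the network's scores by the temperature before turning them into probabilities, so low temperatures exaggerate the gap between likely and unlikely characters while high temperatures flatten it. A sketch (the scores below are made up for illustration):

```python
import math

def softmax_with_temperature(scores, temperature):
    """Scale scores by 1/temperature, then normalize into probabilities."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical scores for three characters
cold = softmax_with_temperature(scores, 0.2)  # confident, repetitive
hot = softmax_with_temperature(scores, 0.9)   # flatter, noisier
```

At temperature 0.2 the top character soaks up nearly all the probability (hence the repeated words), while at 0.9 the unlikely characters keep a real chance of being picked (hence the occasional nonsense).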
All Done!
That’s it! If you make something cool with this tutorial, please tweet it to me @jeffthompson_.
Hi Jeff,
Thank you for the detailed and easy to follow instruction. I have a question about creating my own data set. does the text have to be in one line? does it matter when I hit enter/start new line?
Thanks.
Do you mean the input text file? It can be formatted any way you like – Torch-rnn will actually mimic that formatting too! (For example, if your input is a screenplay, it will generate character headers and dialog.)
@Jeff Yes, I meant the input text! Thank you. I was thinking of testing it on song lyrics and they’re usually formatted one sentence per line.
So just to be sure, I can just copy/paste the text from this page (and other ones like it) and it will mimic it?
http://www.azlyrics.com/lyrics/hozier/jackieandwilson.html
That’s right! Try it on the built-in Shakespeare file (as shown in the tutorial) and you’ll get something that looks like a play.
@Jeff I’m having difficulties when trying to run train.lua
“Error: unable to locate HDF5 header file at hdf5.h”
“[C]: in function ‘error’
…/fahadalneama/torch/install/share/lua/5.1/trepl/init.lua:384: in function ‘require’
train.lua:6: in main chunk
[C]: in function ‘dofile’
…eama/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x01090d1cf0
”
I already had anaconda installed. Do you think this might be it?
No idea, sorry – maybe something here will help?
Hi Jeff, Yoosan solution fixed the problem. I can get it to run in cpu mode but not opencl (new MacPro). I installed the opencl distro from here: https://github.com/hughperkins/distro-cl
I get this error:
” no field package.preload[‘cltorch’]
no file ‘/Users/fahadalneama/.luarocks/share/lua/5.1/cltorch.lua’
no file ‘/Users/fahadalneama/.luarocks/share/lua/5.1/cltorch/init.lua’
no file ‘/Users/fahadalneama/torch/install/share/lua/5.1/cltorch.lua’
no file ‘/Users/fahadalneama/torch/install/share/lua/5.1/cltorch/init.lua’
no file ‘./cltorch.lua’
no file ‘/Users/fahadalneama/torch/install/share/luajit-2.1.0-beta1/cltorch.lua’
no file ‘/usr/local/share/lua/5.1/cltorch.lua’
no file ‘/usr/local/share/lua/5.1/cltorch/init.lua’
no file ‘/Users/fahadalneama/.luarocks/lib/lua/5.1/cltorch.so’
no file ‘/Users/fahadalneama/torch/install/lib/lua/5.1/cltorch.so’
no file ‘./cltorch.so’
no file ‘/usr/local/lib/lua/5.1/cltorch.so’
no file ‘/usr/local/lib/lua/5.1/loadall.so’
”
I tested the opencl install and it passed all tests.
I think the problem is that the way Justin uses to install cltorch is no longer supported:
“IMPORTANT! THIS HAS CHANGED. Please install a specific Torch distro, as described below. Simply doing luarocks install cltorch is no longer supported”
I’m guessing the new method saves files in different place than what torch-rnn is looking for?
Hi @Jeff,
few comments:
1) finally got it to work on nvidia based mac (give up on OpenCL)
2) torch7 is not compatible with OS X Sierra
3) I tried changing the -rnn_size on a relatively small input (100KB) here’s what I noticed:
a) if I set it to 512, the model just copies the original input and the output is basically a rearranged version.
b) if I reduce it to 256, half of the document seems to be generated and the other half is copied.
Why do you think this happens? is it because my input is too small for the size of a network?
Thanks!
Hi There, I’m having some issues here in macOs Sierra, I’m not sure if it’s a OS version problem, but I tested it in two machines and followed all the help tips around there. Nothing worked for now.
When I train the system, this error shows up:
“/Users/manucho/torch/install/bin/luajit: /Users/manucho/torch/install/share/lua/5.1/trepl/init.lua:384: /Users/manucho/torch/install/share/lua/5.1/trepl/init.lua:384: /Users/manucho/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ‘)’ expected near ‘_close’ at line 1435
stack traceback:
[C]: in function ‘error’
/Users/manucho/torch/install/share/lua/5.1/trepl/init.lua:384: in function ‘require’
train.lua:6: in main chunk
[C]: in function ‘dofile’
…ucho/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x010e75d350”
The error shows something like a syntax error but I couldn’t find where is it and makes no sense since I’m using the files that you’ve uploaded.
I appreciate further help :D
Can you post the error itself? It’s hard to judge from what you’ve shown here. Also, can you try just running the th command to see if you get the Torch prompt? If it doesn’t work, you probably installed Torch incorrectly.

Hi Jeff, what do you mean by the “error itself”? What I’ve posted is the error that I have.
Torch is working well, I tested and everything is fine. Also hdf5 in python is imported as well. This is where I’m confused because I don’t know where the error is coming from. I will check in others forums to see if Sierra is causing the problem anyway.
Ok, sorry, I found the solution in this comment: https://github.com/deepmind/torch-hdf5/issues/83#issuecomment-254427843
It was a Sierra caused error. Now I have new errors… but, step by step.
I needed two extra hacks on Mac OS Sierra.
1) Error: unable to locate HDF5 header file at hdf5.h
change config.lua at /Users/your_name/torch/install/share/lua/5.1/hdf5.
Replace the include path: HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/your_version_number/include",
2) Error: ffi.lua:56: ‘)’ expected near ‘_close’
in /Users/your_name/torch/install/share/lua/5.1/hdf5/ffi.luaffi.lua
change line 45
local process = io.popen("gcc -D '_Nullable=' -E " .. headerPath) -- TODO pass -I
Hi Jeff,
First off- thank you so much for creating this and sharing it with the world.
I’m trying to recreate social media posts, and using a file about .5MB I’m getting this error code:
train.lua:77: Expected value but found invalid unicode escape code at character 1039
stack traceback:
[C]: in function ‘read_json’
train.lua:77: in main chunk
[C]: in function ‘dofile’
…urch/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x010572b1b0
Any idea what is going on? I’ve tried other txt files and the program has worked wonderfully, so I’m not sure how to diagnose what is going on with this txt file in particular.
Thanks!
@Britton – this is probably because there are weird unicode characters in your input text file. Doing a find-and-replace for them will be tricky, but you could try one of these solutions in the Terminal.
Alternatively, you can add this to the scripts/preprocess.py file after line 29, though it will skip any non-ASCII character, so you’ll want to disable it for emoji support, etc:

# HACK - strip non-ascii characters
try:
    char.decode('ascii')
except UnicodeDecodeError:
    continue
Let me know if that works!
Hi Jeff! Thanks so much for your quick response. Took me a bit to implement in into the Python code correctly, but your code fix actually caused my errors to give up a few of the specific unicode characters that were causing problems, which I could go in and replace. The only thing I had to change was the exception to be UnicodeEncodeError, instead of UnicodeDecodeError.
Thanks again, Jeff!
@Britton, sorry for all the work! I fixed the error you caught, and added an arg to use in the command line. https://gist.github.com/jeffThompson/506a06253d81ebcf54ed28cc69195cf4
Hi Jeff, just thought I would let you know that I ran into a similar unicode issue farther down in the preprocess.py:
Traceback (most recent call last):
File “scripts/preprocess.py”, line 71, in
splits[split_idx][cur_idx] = token_to_idx[char]
KeyError: u’\u2028′
I tried adding the same bit of code before that splits line (after it says “for char in line:”) and it seems to have gotten through preprocessing alright! Haven’t trained yet, but thought I’d give you a heads up. Thanks!
Hey jeff, i’m mostly new to using the terminal, could you please look into my problem?
When trying out the test data (tiny_shakespeare) i ran into an error when launching preprocess.py:
File “scripts/preprocess.py”, line 39
print ‘Total vocabulary size: %d’ % len(token_to_idx)
^
SyntaxError: invalid syntax
thanks in advance,
Rien
Hmm, my guess is maybe you have Python 3 installed, instead of 2.7? The print command has changed in that update. You can check by running python in the Terminal – it will print the version on startup. (And you can exit Python by typing exit().)

Thanks, that helped me out :), I applaud your knowledge of these things, so psyched to get started.
i’m stuck again :( after preprocessing I tried to commence with training the rnn with the supplied shakespeare data and got back this:
Error: unable to locate HDF5 header file at hdf5.h
stack traceback:
[C]: in function ‘error’
…riencoorevits/torch/install/share/lua/5.1/trepl/init.lua:389: in function ‘require’
train.lua:6: in main chunk
[C]: in function ‘dofile’
…vits/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x010d929bb0
thanks in advance,
Rien
@Rien – did you look through the possible fixes in the tutorial and the comments? There are quite a few things you can try to get the HDF5 library working.
Yes, i’ve used Adam Loving’s hack to fix the hdf5 error, also needed his second hack but I think I messed something up because when I changed code at line 45 I now get this:
/ffi.lua:44: ‘)’ expected near ‘‘_Nullable’
Thanks for creating this tutorial. Really appreciated it! A couple of things I had to remember to do to make it work that slipped me up a while was running the tutorial. It’s probably obvious to many who are good at following instructions :)
1) for preprocessing, use your gist which includes your updated source file
2) for preprocessing, for my own training set make sure i call with new argument -skip_non_ascii=true
3) for training step, add -gpu -1 as I have a mac book pro 2015 with NVIDIA GPU
4) for generating output, again add -gpu -1
Hi Jeff,
I’ve had a lot of fun playing around with your tool, and I’ve tried it on a few different styles of writing. I’m curious- if I want to train the system on a different txt file, do I need to “reset” anything else to start with a clean slate? Or is new training only using the data/txt file?
The reason I ask is that the output from one experiment looks weirdly unlike the txt I trained on, but very similar to a previous txt file I trained it with. It’s certainly possible that I’ve made a coding error, though.
Thank you!
@Britton – you can run again without any resetting, but it will overwrite any old training files, so you’ll want to move them before running on another text. You’re probably getting similar output because the new training file ended before the old one, and you’re just calling the old training t7 file. Try moving them and re-training.
Hi,
great great tool!
I have a question about the output of train.lua : I’m not sure to understand what the loss and val_loss variable are…
also I was bit confused to see val_loss keep growing up wile the loss do the opposite (as expected).. As my training sample is quite small maybe it mean I’m overfitting?
If one knows how to avoid thad it would be great.
@Morthan – not really sure, that’s a deeper machine learning question than I can help with :)
Hi Jeff
Thanks so much for the tutorial.
I encountered the problem documented here https://github.com/deepmind/torch-hdf5/issues/83#issuecomment-254427843
but overcame it as per the instructions.
I’ve run into a version problem with hdf5. It seems that the current version 1.10.1 isn’t supported by torch-hdf5 . This problem seems to be documented here
https://github.com/deepmind/torch-hdf5/issues/76
I can find an older homebrew (version 1.8) but looking at the python library hdf5py I think this depends on 1.8.4 or above. Eek. I’m also not sure how to make home-brew recognise the older version as it’s default. Any thoughts? Has anyone else had the same problem? Thanks again for your help putting this together.
I’ve run into the same problem as Tom, “Unsupported HDF5 version: 1.10.0”
Thanks for ideas!
Sascha
Update: I overcame this by tapping the older brew version
I ran:
brew install hdf5@1.8
then dumped the folder contents from /usr/local/Cellar/hdf5@1.8 into /usr/local/Cellar/hdf5
I then ran
brew switch hdf5 1.8.18
to choose the right hdf5 version
I made my include path in config.lua
HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/1.8.18/include",
I’m sure there’s a better way of teaching brew which version to default to without dragging folders around but at least it works!
@Jeff Thompson, you rock. I have implemented Adam Loving’s hack. FYI multiple subtle bugs are introduced by pasting the string directly; not just are the quotes converted to unicode “smart” quotes, but the double hyphen prior to “TODO -pass -I” is converted to an em-dash as well (at least, I’m assuming that was a double hyphen?) .
Anyway, I’ve asciiized the line which resolves the syntax error @Rien was having, but in its place is a new, more interesting error: “/Users/kz/torch/install/bin/luajit: /Users/kz/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/kz/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/kz/torch/install/share/lua/5.1/pl/stringx.lua:27: argument 1 expected a ‘string’, got a ‘nil'”
Not sure what to do with that… I’d be grateful for suggestions.
@Jeff Thompson I don’t know if this is related to the issue I was having earlier or not, but you might wanna edit the CUDA link in Step 2 to point to a specific version of the CUDA driver. The generic “CUDA download” page is now serving CUDA 8.0, but the next step reads
Open the file and add these three lines:
# CUDA
export PATH=/Developer/NVIDIA/CUDA-7.5/bin:$PATH
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-7.5/lib:$DYLD_LIBRARY_PATH
resulting in the following prolog when running ~/torch/test.sh:
Completed 197256 asserts in 192 tests with 0 failures and 0 errors
sundown loaded succesfully
/Users/kz/torch/install/bin/luajit: /Users/kz/torch/install/share/lua/5.1/cutorch/init.lua:2: cannot load ‘/Users/kz/torch/install/lib/lua/5.1/libcutorch.so’
stack traceback:
[C]: in function ‘require’
/Users/kz/torch/install/share/lua/5.1/cutorch/init.lua:2: in main chunk
[C]: at 0x010ecb5e50
[C]: at 0x010ec39370
CUDA not found
I’m a little ambivalent whether to proceed with the 8.0 driver or revert to 7.5, as the 7.5 driver page doesn’t include 10.12 among the supported OSX versions.
I had all these problems, redirected things using config.lua to an old version of hdf5 1.8.17 already in the anaconda installation, but then couldn’t get past the problem with _Nullable. Gave up.
Going back to the Cristal/torch-rnn version for CPU only running in docker it worked perfectly first time all the way through. Training took about an hour. Sample worked too.
Now stuck on how to use different text with script/preprocessor.py in docker container.
@Kevin – running a Lua script via the training shouldn’t do any ASCII conversion, not sure why you’re getting that error. For your string/nil problem, I really can’t help you, sounds like either a) you broke something :) or b) you’ll need to post that to the Torch-rnn repo as an issue. Re driver support, I’ll add a note in the tutorial but can’t offer any (useful) advice on whether to keep 7.5 or upgrade.
@pudepiedj – I’ve not used Docker, but you should just be able to pass a different text file as an argument for training, no?
Hey Jeff, based on the helpful comments around here and your tutorial, I made an in-depth guide to try to help other users take care of the errors they keep getting when it comes to the HDF5 library.
The common issue I’ve seen is users having conflicts with the “th” commands posted near the end of your tutorial.
For any users out there, if you have some issue like this:
Completed 197256 asserts in 192 tests with 0 failures and 0 errors
sundown loaded succesfully
/Users/kz/torch/install/bin/luajit: /Users/kz/torch/install/share/lua/5.1/cutorch/init.lua:2: cannot load ‘/Users/kz/torch/install/lib/lua/5.1/libcutorch.so’
stack traceback:
[C]: in function ‘require’
/Users/kz/torch/install/share/lua/5.1/cutorch/init.lua:2: in main chunk
[C]: at 0x010ecb5e50
[C]: at 0x010ec39370
CUDA not found
…Then there is a problem with your HDF5 installation configuration. As another user pointed out, you’ll need Terminal to use an older version of HDF5 to get this tool working right.
Please refer to this guide I made, which is modeled off of Jeff’s work: http://www.asaduddin.com/2017/03/torch-rnn-macos-installation-guide-2017-average-joe-edition/
I have included some additional steps in the process to ensure that you can properly make use of Torch-rnn without problems.
Hi Jeff,
Thank you so much for this detailed tutorial. I’ve followed all the steps and preprocessed the text to h5 and json, and I’m trying to train my first torch-rnn. But I’m getting errors like this every time I’ve tried to run th train.lua
th train.lua -input_h5 data/sad.h5 -input_json data/sad.json -gpu -1
/Users/Apple/torch/install/bin/luajit: /Users/Apple/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/Apple/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/Apple/torch/install/share/lua/5.1/hdf5/config.lua:2: unexpected symbol near ‘local’
stack traceback:
[C]: in function ‘error’
/Users/Apple/torch/install/share/lua/5.1/trepl/init.lua:389: in function ‘require’
train.lua:6: in main chunk
[C]: in function ‘dofile’
…pple/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0103a24bd0
Any ideas how I can fix this? Thank you.
@Grishma – not totally sure, but I’m guessing there’s a non-ASCII symbol in your training set? You can clean your training files of non-ASCII characters, or modify the scripts as described here.
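As a minimal sketch of that kind of cleanup (a standalone example, not part of torch-rnn; note it drops every non-ASCII character, so emoji and accented letters will disappear from your corpus):

```python
def strip_non_ascii(text):
    """Keep only plain ASCII characters."""
    return "".join(ch for ch in text if ord(ch) < 128)

def clean_file(src, dest):
    """Read a UTF-8 text file and write an ASCII-only copy."""
    with open(src, encoding="utf-8", errors="ignore") as f:
        text = f.read()
    with open(dest, "w") as f:
        f.write(strip_non_ascii(text))
```

Running your training text through something like clean_file("data/raw.txt", "data/clean.txt") before preprocessing should avoid the unicode errors.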
Thanks for this tutorial! On step 4 running
python
import h5py
I get an error:
Traceback (most recent call last):
File “”, line 1, in
ImportError: No module named h5py
What should I do?
I tried going to step 3 and doing
brew tap homebrew/science
brew install hdf5 –with-mpi
but it took forever to build bootstrap or whatever, so I Ctrl+c out of it and installed hdf5 without mpi option, do you think that has anything to do with the above problem? (I’m on Mac OS X 10.12.4)
@Erik – yikes, not sure. It’s clearly that the h5py library didn’t get installed. I think you’ll have to try again using pip as suggested, or post it as an issue on the h5py repo.
@Jeff Thompson – I think maybe I didn’t make myself clear earlier. Just FYI the problem I had was at least in part due to copying and pasting the line
local process = io.popen(“gcc -D ‘_Nullable=’ -E ” .. headerPath) — TODO pass -I.
from Adam Loving’s comment into the suggested location, which as you can see your blog converts to non-ascii “smart” punctuation. :)
I did eventually get everything working though. Hooray! Thanks again for this.
hi there, when it comes to step three and the SSH key error. if i decide to download manually, where should i put the file to get things right?
@Crafted – that’s a Github issue that first-time users get, which can be fixed with this step.
@Jeff Thompson
so, the option to download manually is just there to confuse me? the ordeal of getting myself a public key proved a giant pain for me. so i wanted to know how the “try and download manually” option worked