Bash script for extractor.lua and remove old links #1

Open
shubhamagarwal92 wants to merge 2 commits into ratishsp:master from shubhamagarwal92:master

@shubhamagarwal92

  1. Updated the README, removing old links
  2. Removed hard-coded paths in extractor.lua
  3. Added a shell script

extractor.lua Outdated
cmd:option('-convens_paths1', 'conv1-ep10-94-73' , [[path to conv net files]])
cmd:option('-convens_paths2', 'conv2-ep10-95-71' , [[path to conv net files]])
cmd:option('-convens_paths3', 'conv3-ep10-94-71' , [[path to conv net files]])
cmd:option('-lstmens_paths1', 'lstm1-ep5-92-76' , [[path to conv net files]])
Owner

Change the hint for -lstmens_paths1: it should read "path to lstm files", not "path to conv net files".

torch.manualSeed(opt.seed)
cutorch.manualSeed(opt.seed)
cutorch.setDevice(opt.gpuid)
device_id = cutorch.getDevice()
Owner

I am unsure about this change. Setting gpuid from input through opt.gpuid seems better.

Author

Actually, I was getting this error when I used opt.gpuid:

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-2141/cutorch/init.c line=734 error=10 : invalid device ordinal
/home/sagarwal/torch/install/bin/luajit: /home/sagarwal/projects/d2t/d2t/data2text-1/extractor.lua:574: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-2141/cutorch/init.c:734
stack traceback:
        [C]: in function 'setDevice'
        /home/sagarwal/projects/d2t/d2t/data2text-1/extractor.lua:574: in function 'main'
        /home/sagarwal/projects/d2t/d2t/data2text-1/extractor.lua:677: in main chunk
        [C]: in function 'dofile'
        ...rwal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406470
srun: error: gpu05: task 0: Exited with exit code 1

I referenced this issue here

Maybe the best solution would be a try-except statement instead?

Owner

Note that the -gpuid parameter is 1-indexed.
If you set it to a value like 0, it will give a similar error.
In a 4-GPU setup, the valid values are 1, 2, 3, and 4.
I normally run with -gpuid 1 and it works.
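
The 1-indexed rule above could also be checked in the shell script before `th` is launched, so that a bad value fails early with a readable message instead of a CUDA "invalid device ordinal" error. The following is only a sketch: `validate_gpuid` and `GPU_COUNT` are hypothetical names that do not appear in this PR, and the GPU count of 4 is assumed to match the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): check a 1-indexed GPU id
# against the number of GPUs. Returns 0 if 1 <= id <= count, else 1.
validate_gpuid() {
  local id="$1" count="$2"
  # Reject anything that is not a plain non-negative integer.
  case "$id" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$id" -ge 1 ] && [ "$id" -le "$count" ]
}

# Assumed 4-GPU machine; on a real node the count might instead be
# taken from something like `nvidia-smi -L | wc -l`.
GPU_COUNT=4
for id in 0 1 4 5; do
  if validate_gpuid "$id" "$GPU_COUNT"; then
    echo "gpuid $id: ok"
  else
    echo "gpuid $id: invalid (valid range is 1..$GPU_COUNT)"
  fi
done
```

A check like this keeps `-gpuid` configurable from the command line, as suggested earlier in the thread, while still catching the 0-indexed mistake.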

th $LUA_FILE \
-datafile $OUTPUT_H5 \
-preddata $MODEL_DIR/roto_stage2_$IDENTIFIER-beam5_gens.h5 \
-savefile $MODEL_DIR/roto_stage2_$IDENTIFIER-beam5_gens.h5-tuples.txt \
Owner

-savefile is not applicable when -just_eval is set.
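
One way the script could respect this is to assemble the argument list conditionally and leave `-savefile` out of evaluation-only runs. This is a rough sketch, not the PR's script: `build_extractor_args` is a hypothetical helper, the file names in the usage line are placeholders, and it assumes `-just_eval` is a bare flag that takes no value.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): print extractor.lua
# arguments, one per line.
# Usage: build_extractor_args <datafile> <preddata> <just_eval: 0|1> [savefile]
build_extractor_args() {
  local datafile="$1" preddata="$2" just_eval="$3" savefile="${4:-}"
  printf '%s\n' -datafile "$datafile" -preddata "$preddata"
  if [ "$just_eval" = "1" ]; then
    # Evaluation-only run: -savefile does not apply, so it is dropped
    # even if a save path was passed in.
    printf '%s\n' -just_eval
  elif [ -n "$savefile" ]; then
    printf '%s\n' -savefile "$savefile"
  fi
}

# Example with placeholder file names; joins the arguments on one line.
build_extractor_args data.h5 gens.h5 1 tuples.txt | tr '\n' ' '
echo
```

Printing one argument per line keeps the helper easy to inspect; the simple joining shown here would not survive paths containing spaces or newlines, so a real script would more likely build a bash array instead.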

2 participants