Lots of environment problems... #9

@benijohn

Hi, I am trying to train my own Manydepth network following your instructions, and I am getting a lot of errors before training even starts. I am not sure if I have done something wrong, but I am hoping you might be able to provide some guidance here.

  1. When trying to run export_gt_depths, I kept getting an error that manydepth2/splits/eigen/test_files.txt did not exist. I was running the export from the top level of the repository. Changing the path to an absolute one allowed the file to be found.

  2. There was an issue with the NumPy data type np.int, which was deprecated in 1.20 and removed in 1.24, the version the environment asks for. I simply changed it to np.int32 and got past that error.

  3. Next, once it had all the ground-truth depth images, it failed on the line:
    np.savez_compressed(output_path, data=np.array(gt_depths))
    because the images are inconsistently sized. They aren't significantly different, but I saw shapes of { (375, 1242), (370, 1224), (374, 1238), (370, 1226), (376, 1241) }, which of course can't be stacked into a single array. I changed it to save each image under its own key using:

    np.savez_compressed(
        output_path,
        **{f"depth_{i}": d.astype(np.float32) for i, d in enumerate(gt_depths)},
        shapes=np.array([d.shape for d in gt_depths], dtype=np.int32),
    )
    which allowed it to save, but I have yet to see how this will affect downstream training.

  4. The instructions say to download the weights for GMFLOW and place them in /pretrained, but the pretrained directory doesn't exist. I created it at the top level. Is this correct?
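
For reference, here is a minimal sketch of the workarounds for items 1 and 3, not the repository's own code. It assumes that downstream code reads the archive's `data` key (as the original `savez_compressed` line writes it), so the depth maps are stored as a 1-D object array under that key instead of under per-image keys. The helper names `split_file`, `save_ragged`, and `load_ragged` are hypothetical.

```python
from pathlib import Path

import numpy as np


def split_file(base_dir, split="eigen"):
    """Build the path to test_files.txt from an explicit base directory,
    so the result does not depend on the current working directory."""
    return Path(base_dir).resolve() / "splits" / split / "test_files.txt"


def save_ragged(output_path, gt_depths):
    """Save differently sized depth maps under a single 'data' key."""
    # Fill a 1-D object array element by element; this works whether or
    # not the depth maps happen to share a shape.
    data = np.empty(len(gt_depths), dtype=object)
    for i, depth in enumerate(gt_depths):
        data[i] = depth.astype(np.float32)
    np.savez_compressed(output_path, data=data)


def load_ragged(path):
    """Load the depth maps back as a list of float32 arrays."""
    # Object arrays are pickled inside the archive, so allow_pickle is required.
    with np.load(path, allow_pickle=True) as archive:
        return list(archive["data"])
```

With the per-key layout shown in item 3, any reader would instead need to look up `depth_0`, `depth_1`, and so on, which is why keeping a single `data` key may be less disruptive downstream.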

I tried running train_many2.sh, but I got the following error:

$ sh train_many.sh 
/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
[ Using Seed :  1  ]
Traceback (most recent call last):
  File "/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/benjamin-johnson/workspace/projects/depth_estimation/Manydepth2/manydepth2/train.py", line 28, in <module>
    handlers=[logging.FileHandler(f'logs/{opts.model_name}.txt'), 
  File "/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/logging/__init__.py", line 1147, in __init__
    StreamHandler.__init__(self, self._open())
  File "/home/benjamin-johnson/miniforge3/envs/manydepth2/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
FileNotFoundError: [Errno 2] No such file or directory: '/home/benjamin-johnson/workspace/projects/depth_estimation/Manydepth2/logs/models_many.txt'

Which is true, that file does not exist.
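
Looking at the traceback, `logging.FileHandler` opens the log file immediately and does not create missing parent directories, so the `logs` folder has to exist first. Creating it by hand would work; a minimal sketch of doing it in code is below. `make_file_handler` is a hypothetical helper, not something from the repository.

```python
import logging
from pathlib import Path


def make_file_handler(log_dir, model_name):
    """Create log_dir if needed, then open <log_dir>/<model_name>.txt."""
    log_dir = Path(log_dir)
    # parents=True creates intermediate folders; exist_ok=True makes
    # repeated runs a no-op instead of an error.
    log_dir.mkdir(parents=True, exist_ok=True)
    return logging.FileHandler(log_dir / f"{model_name}.txt")
```

The same `mkdir` call would presumably also cover the missing pretrained directory from item 4.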

So am I doing something wrong? I feel like I followed the instructions pretty well. I haven't incorporated CityScapes yet, but it wasn't clear if that was necessary or optional.

Any ideas on why I am seeing so many errors just getting the model trained?

Thanks!
Benjamin
