DDP degrades the performance #8

Thank you for sharing this code!

I am testing your code for multitask video with BART on 24GB GPUs.
To fit on 24GB GPUs, I used the command below to enable DDP and reduced the batch size from 50 to 25:

bash scripts/video/single_adapter.sh 2

However, the results were worse than on a single 48GB GPU.
As I increased the number of GPUs, the performance got progressively worse.
Because the model has no BatchNorm layers, I expected the performance to be similar.
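
To make that expectation concrete, here is a minimal sketch of the reasoning in plain Python, not the repository's code. It assumes a toy linear model, a mean-reduced loss, and equal-sized shards per GPU; the names `grad_mse` and `ddp_grad` are hypothetical. DDP averages gradients across processes, so two GPUs with batch size 25 each should give the same gradient as one GPU with batch size 50:

```python
# Toy model y = w * x with a mean-squared-error loss (illustrative only).
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def ddp_grad(w, xs, ys, world_size):
    """Simulate DDP: each rank sees an equal shard, then gradients are averaged."""
    per_rank = len(xs) // world_size
    grads = []
    for rank in range(world_size):
        lo, hi = rank * per_rank, (rank + 1) * per_rank
        grads.append(grad_mse(w, xs[lo:hi], ys[lo:hi]))
    return sum(grads) / world_size


# Synthetic batch of 50 examples (made-up data, not from the video tasks).
xs = [float(i % 7) - 3.0 for i in range(50)]
ys = [2.0 * x + 0.5 for x in xs]
w = 0.1

single_gpu = grad_mse(w, xs, ys)               # one GPU, batch size 50
two_gpus = ddp_grad(w, xs, ys, world_size=2)   # two GPUs, batch size 25 each

print(abs(single_gpu - two_gpus) < 1e-9)
```

Under these assumptions the two gradients agree up to floating-point rounding, so the gap would have to come from something else, for example how the script sets the batch size or learning rate as the GPU count changes. That has not been verified here.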

Have you tried DDP, or do you have any intuition about what might cause this?



