
Merge pytorch/fairseq#4

Open
imonlius wants to merge 707 commits into imonlius:wip from facebookresearch:master

Conversation

@imonlius
Owner

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes # (issue).

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

Weiyi Zheng and others added 30 commits February 12, 2021 14:06
Summary:
OSS removed the 'partition' key in their state dict to accommodate changing partition sizes. This requires an update on the fairseq side to not look into the parameter partition: just broadcast everything and let the optimizer on each rank decide which parameters are relevant.
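A minimal sketch of the idea (the function and key names here are illustrative, not the fairseq/fairscale API; the real code uses torch.distributed broadcasts): every rank receives the full optimizer state, and each rank keeps only the entries for parameters it actually owns.

```python
# Illustrative sketch only; names are made up for this example.
def keep_relevant(full_state, owned_param_ids):
    """Each rank filters the broadcast state down to its own parameters."""
    return {pid: state for pid, state in full_state.items()
            if pid in owned_param_ids}

# e.g. rank owning parameters {0, 2} keeps only those shards:
full = {0: "state0", 1: "state1", 2: "state2"}
local = keep_relevant(full, {0, 2})
```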

This diff also needs D26419095 to function completely, and blefaudeux has made fixes upstream in facebookresearch/fairscale#383

Reviewed By: myleott

Differential Revision: D26382917

fbshipit-source-id: 95af1022be59e88814748acaee36a1a350f7dc5b
…s. (#3237)

Summary:
…ith BLEU scores

# Before submitting

- [no] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [yes] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [no need] Did you make sure to update the docs?
- [no need] Did you write any new necessary tests?

## What does this PR do?
Fixes bugs in evaluation with BLEU score when training with multiple GPUs. No error occurs if there is no distributed training.

When `--eval-bleu` is set to `True` (by default it is `False` and the best checkpoint is selected according to loss) and training uses multiple GPUs (i.e., more than one GPU participates in distributed training), the following error occurs:

```bash
Traceback (most recent call last):
  File "/data/cordercorder/anaconda3/envs/nmt/bin/fairseq-train", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
  File "/data1/cordercorder/fairseq/fairseq_cli/train.py", line 450, in cli_main
    distributed_utils.call_main(cfg, main)
  File "/data1/cordercorder/fairseq/fairseq/distributed/utils.py", line 349, in call_main
    distributed_main(cfg.distributed_training.device_id, main, cfg, kwargs)
  File "/data1/cordercorder/fairseq/fairseq/distributed/utils.py", line 326, in distributed_main
    main(cfg, **kwargs)
  File "/data1/cordercorder/fairseq/fairseq_cli/train.py", line 143, in main
    valid_losses, should_stop = train(cfg, trainer, task, epoch_itr)
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/data1/cordercorder/fairseq/fairseq_cli/train.py", line 259, in train
    cfg, trainer, task, epoch_itr, valid_subsets, end_of_epoch
  File "/data1/cordercorder/fairseq/fairseq_cli/train.py", line 345, in validate_and_save
    valid_losses = validate(cfg, trainer, task, epoch_itr, valid_subsets)
  File "/data1/cordercorder/fairseq/fairseq_cli/train.py", line 413, in validate
    trainer.valid_step(sample)
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/data1/cordercorder/fairseq/fairseq/trainer.py", line 834, in valid_step
    logging_output = self._reduce_and_log_stats(logging_outputs, sample_size)
  File "/data1/cordercorder/fairseq/fairseq/trainer.py", line 1157, in _reduce_and_log_stats
    self.task.reduce_metrics(logging_outputs, self.get_criterion())
  File "/data1/cordercorder/fairseq/fairseq/tasks/translation.py", line 410, in reduce_metrics
    metrics.log_scalar("_bleu_counts", np.array(counts))
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/site-packages/torch/tensor.py", line 480, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
(The same traceback is raised on every rank, cuda:0 through cuda:3; the interleaved copies are omitted here.)
Traceback (most recent call last):
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/data/cordercorder/anaconda3/envs/nmt/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/data/cordercorder/anaconda3/envs/nmt/bin/python', '-u', '/data/cordercorder/anaconda3/envs/nmt/bin/fairseq-train', '--local_rank=3', 'tiny_data_bin', '--distributed-world-size', '4', '--arch', 'transformer', '--share-decoder-input-output-embed', '--optimizer', 'adam', '--adam-betas', '(0.9, 0.98)', '--clip-norm', '0.0', '--lr-scheduler', 'inverse_sqrt', '--warmup-init-lr', '1e-07', '--warmup-updates', '3000', '--lr', '0.0005', '--stop-min-lr', '1e-09', '--dropout', '0.25', '--weight-decay', '0.0001', '--criterion', 'label_smoothed_cross_entropy', '--label-smoothing', '0.1', '--max-tokens', '5000', '--batch-size', '64', '--update-freq', '4', '--max-epoch', '30', '--save-dir', 'checkpoint', '--skip-invalid-size-inputs-valid-test', '--eval-bleu', '--eval-bleu-args', '{"beam": 5}', '--eval-bleu-remove-bpe', 'sentencepiece', '--eval-bleu-print-samples', '--eval-tokenized-bleu', '--best-checkpoint-metric', 'bleu', '--maximize-best-checkpoint-metric', '--validate-interval-updates', '1']' returned non-zero exit status 1.

```

The error is caused by the fact that NumPy 1.20.1 does not support code like the following:
```python
import torch
import numpy as np
a = torch.tensor(0, device="cuda:0")
b = np.array([a])
```
The above code raises "TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.", but it runs fine if the NumPy version is 1.18.1 or 1.17.0 (any version below 1.20.0 is probably fine). However, the latest version of fairseq appears to require NumPy 1.20.0 or higher (issue #3203).
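The fix is to move the tensor to host memory before handing it to NumPy. A minimal sketch (here `counts` stands in for the BLEU count tensor from the traceback, placed on the CPU so the snippet is self-contained; on a multi-GPU run it would live on a device such as `cuda:0`):

```python
import torch
import numpy as np

# `counts` stands in for the BLEU count tensor built during validation;
# in the failing run it lives on a CUDA device.
counts = torch.tensor([3, 2, 1])

# np.array(counts) fails on NumPy >= 1.20 when `counts` is a CUDA tensor.
# Copying to host memory first works on every device and NumPy version:
bleu_counts = np.array(counts.cpu())
```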

### Reproduce the error
Download the fairseq source code (commit ID: 7061a0f) and run the following commands:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
data_bin_dir=tiny_data_bin

python -m torch.distributed.launch --nproc_per_node=4 \
    --master_addr="127.0.0.1" \
    --master_port=12345 \
    $(which fairseq-train) ${data_bin_dir} \
    --distributed-world-size 4 \
    --arch transformer \
    --share-decoder-input-output-embed \
    --optimizer adam \
    --adam-betas '(0.9, 0.98)' \
    --clip-norm 0.0 \
    --lr-scheduler inverse_sqrt \
    --warmup-init-lr 1e-07 \
    --warmup-updates 3000 \
    --lr 0.0005 \
    --stop-min-lr 1e-09 \
    --dropout 0.25 \
    --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy \
    --label-smoothing 0.1 \
    --max-tokens 5000 \
    --batch-size 64 \
    --update-freq 4 \
    --max-epoch 30 \
    --save-dir checkpoint \
    --skip-invalid-size-inputs-valid-test \
    --eval-bleu \
    --eval-bleu-args '{"beam": 5}' \
    --eval-bleu-remove-bpe sentencepiece \
    --eval-bleu-print-samples \
    --eval-tokenized-bleu \
    --best-checkpoint-metric bleu \
    --maximize-best-checkpoint-metric \
    --validate-interval-updates 1
```

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3237

Reviewed By: myleott

Differential Revision: D26429732

Pulled By: alexeib

fbshipit-source-id: bc887ce952d28541cb07dbbdc7e80e99428a6b34
Summary:
fixes previous change that changes state/dataset/etc to class variables instead of instance variables

Pull Request resolved: fairinternal/fairseq-py#1623

Reviewed By: michaelauli

Differential Revision: D26439560

Pulled By: alexeib

fbshipit-source-id: ab9e75a425a47ac7ace006419259e254770e560e
…coder (#1559)

Summary:
Pull Request resolved: fairinternal/fairseq-py#1559

This matches the behavior of RobertaEncoder.

Test Plan: Imported from OSS

Reviewed By: gwenzek

Differential Revision: D25936937

Pulled By: myleott

fbshipit-source-id: 795ec8d50298a41d9e9638101436faa01cdf1586
Summary:
This is long overdue, but finally deprecating the RobertaEncoder components and just using TransformerEncoder directly. This will make it easier for some upcoming online backtranslation changes, and will eventually make migrating it to dataclasses/Hydra easier too. It also fixes some longstanding inconsistencies in layernorm placement in the model parallel roberta code.

Pull Request resolved: fairinternal/fairseq-py#1560

Test Plan:
- confirmed that training gives identical losses as before:
https://gist.github.com/myleott/9a4d213fb88a02b00094ea074f5a2e2d
- confirmed that old roberta models can be loaded and produce identical results
- confirmed that old linformer models can be loaded and produce identical results (reran commands from D25938236 (bf54551))
- confirmed that old model parallel models can be loaded and produce identical results:
```
python -m fairseq_cli.validate --path checkpoint.mp1/checkpoint_last.pt --task dummy_masked_lm --criterion masked_lm --max-sentences 8 --dataset-size 100 --model-parallel-size 2 --distributed-world-size 2

before:
2021-01-19 19:04:14 | INFO | valid |  | valid on 'valid' subset | loss 14.62 | ppl 25174.3 | wps 0 | wpb 53248 | bsz 104

after:
2021-01-19 19:06:59 | INFO | valid |  | valid on 'valid' subset | loss 14.62 | ppl 25174.3 | wps 0 | wpb 53248 | bsz 104
```

Reviewed By: gwenzek, ngoyal2707

Differential Revision: D25937145

Pulled By: myleott

fbshipit-source-id: 1ce0bc93e28e03fb926534ea4134684a49232599
Summary: Pull Request resolved: fairinternal/fairseq-py#1570

Test Plan: Imported from OSS

Reviewed By: gwenzek, ngoyal2707

Differential Revision: D25967675

Pulled By: myleott

fbshipit-source-id: 7c7f8d25b87ef9b4f0a85331548bb3a2886a1e92
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: fairinternal/fairseq-py#1629

Reviewed By: myleott

Differential Revision: D26484942

Pulled By: sshleifer

fbshipit-source-id: 9dcbab5c404c14d8f35628d823102ad9ce59dffd
Summary:
Integrating LASER (Language-Agnostic SEntence Representations) training code

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [Y] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [N/A] Did you make sure to update the docs?
- [Y] Did you write any new necessary tests? => an additional test in `test_iterators.py`

## What does this PR do?

This diff introduces the training code for LASER.
It includes a specific `laser` task in `laser_task.py` which reads a
json configuration file describing the binarized datasets of language
pairs.

`multitask_data_utils.py` defines dataset wrappers and iterators used by
`laser` task.
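The PR text does not show the configuration format, but a hypothetical example of what such a json file might look like (the field names here are guesses for illustration only, not the actual `laser` task schema) could be parsed as follows:

```python
import json

# Hypothetical shape of a LASER task config: a list of binarized
# language-pair datasets. Field names are illustrative assumptions.
cfg = json.loads("""
{
  "train": [
    {"src": "en", "tgt": "fr", "data": "data-bin/en-fr"},
    {"src": "en", "tgt": "de", "data": "data-bin/en-de"}
  ]
}
""")
pairs = [(d["src"], d["tgt"]) for d in cfg["train"]]
```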

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Yes. 🙃

Pull Request resolved: fairinternal/fairseq-py#1207

Reviewed By: myleott

Differential Revision: D26454296

Pulled By: Celebio

fbshipit-source-id: c987672aa66abf31b039ee11867b06912d3486e5
Summary:
Add back a couple speed optimizations in the original roberta code that got lost in the refactor

Pull Request resolved: fairinternal/fairseq-py#1626

Reviewed By: gwenzek

Differential Revision: D26478534

Pulled By: myleott

fbshipit-source-id: b945de5e9bffd51cd63630cc3aa1f0078a41cca8
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
- updates audio_utils to handle multi-channel audio as well as mono, with no change needed for existing recipes
- adds speech-to-text example for Multilingual TEDx (http://openslr.org/100) data

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3253

Reviewed By: yuntang

Differential Revision: D26514419

Pulled By: kahne

fbshipit-source-id: 699e428affda5b1347f96a8310691ab152dd6769
Summary: after D26382917 (02803a1) shipped, self._device was somehow removed in the optimizer (or maybe I didn't test it the right way in the previous diff?). Fortunately, OSS doesn't need it anyway.

Reviewed By: myleott

Differential Revision: D26523538

fbshipit-source-id: 637c1e344670340ae40b32635ef51f5501966b0c
Summary:
This is the pull request for the code for the paper
[SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation](https://www.aclweb.org/anthology/2020.aacl-main.58/)

The model will also be used for [IWSLT 2021 shared task on simultaneous translation
](https://iwslt.org/2021/simultaneous)
This pull request includes

- Convtransformer offline model
- Convtransformer simultaneous translation model with fixed pre-decision module
- The agent files for inference for the convtransformer simultaneous translation model

jmp84: The README is still missing; just curious, where should I place it?

Pull Request resolved: fairinternal/fairseq-py#1607

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

**********
One of the failing landing integration tests
```
buck test mode/dev //multimo/fb/models/test:multimo_fb_model_test
https://fburl.com/testinfra/oxq2cn5n
```

Reviewed By: jmp84

Differential Revision: D26439663

Pulled By: sravyapopuri388

fbshipit-source-id: b127cb4962756af221b65e3ccb6598a42fc75f7f
Summary:
This diff integrates simul ST training into pyspeech with very minor modifications to the open sourced code. Specific changes made are
- In fixed_pre_decision.py remove self as argument to p_choose function as it is already called with super in line 101
- In monotonic_multihead_attention.py remove pdb.set_trace()
- Move label_smoothed_cross_entropy_latency_augmented.py to fairseq/criterions folder and add missing arguments to parser
- In fairseq/data/data_utils.py type cast max_tokens to int to avoid type error.
- Update fairseq/convtransformer.py to pyspeech/convtransformer.py

# Next steps:
- Verify decoding using the model trained
- Support everstore handle based decoding in simuleval and integrate it into pyspeech.

Reviewed By: jmp84

Differential Revision: D26478861

fbshipit-source-id: 3b02b2aee757e5464b71dbdd7ebdba42659faee5
Summary:
Fix LibriSpeech data prep script
* Lowercasing transcript to be consistent with the pre-trained models

Reviewed By: jmp84

Differential Revision: D26538845

fbshipit-source-id: 0885f99e2c85f0e722a24f3cb83f2635ce9429bc
Summary:
# Before submitting

- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes KeyError mentioned in  # (3211).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3212

Reviewed By: alexeib

Differential Revision: D26513255

Pulled By: myleott

fbshipit-source-id: 5a11cb369c9d4202fab6998d269e7da5f3d3e534
Summary:
# Before submitting

- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes #3178 (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃 (I did ;)

Pull Request resolved: #3249

Reviewed By: alexeib

Differential Revision: D26513275

Pulled By: myleott

fbshipit-source-id: 2785098a945404c07eb72c079177654b1739a7a2
Summary:
I tried resuming a run from a checkpoint in f250883864, but ran into:

AssertionError: Criterion does not match; please reset the optimizer (--reset-optimizer). DistributedTimeoutWrapper vs ContrastiveLabelsCriterion

Based on this, I believe that since D25836853 (d68a353) we are no longer saving the actual criterion's name in the checkpoint, but DistributedTimeoutWrapper instead.

This is kind of weird though, as I would expect more people to run into this issue. Not sure if I am doing something wrong, let me know if so, thanks!
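A toy illustration of what seems to be going on (the classes and the `module` attribute below are stand-ins, not the actual fairseq internals): reading `type(criterion).__name__` after wrapping records the wrapper's class name, so the checkpoint comparison fails; unwrapping before reading the name restores the expected behavior.

```python
class DistributedTimeoutWrapper:
    """Stand-in for fairseq's wrapper around the criterion."""
    def __init__(self, module):
        self.module = module

class ContrastiveLabelsCriterion:
    """Stand-in for the real criterion class."""

def criterion_name(criterion):
    # Unwrap before reading the class name, so the checkpoint stores the
    # real criterion's name rather than the wrapper's.
    while isinstance(criterion, DistributedTimeoutWrapper):
        criterion = criterion.module
    return type(criterion).__name__
```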

Reviewed By: myleott

Differential Revision: D26478656

fbshipit-source-id: bc3c7c925f5505140d9df4438af3a73d65d4f531
Summary: Pull Request resolved: fairinternal/fairseq-py#1636

Reviewed By: xutaima

Differential Revision: D26562816

Pulled By: jmp84

fbshipit-source-id: 4e6efd0b4236d7187bd365d790f260bd5297aed5
…9) (#3235)

Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [N/A] Did you make sure to update the docs?
- [N/A] Did you write any new necessary tests?

## What does this PR do?

Currently when installing the newest source package from PyPI I get an error like so:

```
Collecting fairseq
  Using cached fairseq-0.10.2.tar.gz (938 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  ERROR: Command errored out with exit status 1:
   command: /home/frankier/sources/datasets/.venv/bin/python3 /tmp/tmp_ujftsgi_in_process.py get_requires_for_build_wheel /tmp/tmpmn0eumq2
       cwd: /tmp/pip-install-dg5d6q9y/fairseq
  Complete output (31 lines):
  Traceback (most recent call last):
    File "setup.py", line 214, in <module>
      do_setup(package_data)
    File "setup.py", line 136, in do_setup
      setup(
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 152, in setup
      _install_setup_requires(attrs)
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 147, in _install_setup_requires
      dist.fetch_build_eggs(dist.setup_requires)
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 60, in fetch_build_eggs
      raise SetupRequirementsError(specifier_list)
  setuptools.build_meta.SetupRequirementsError: ['cython', 'numpy', 'setuptools>=18.0']

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/tmp/tmp_ujftsgi_in_process.py", line 280, in <module>
      main()
    File "/tmp/tmp_ujftsgi_in_process.py", line 263, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/tmp/tmp_ujftsgi_in_process.py", line 114, in get_requires_for_build_wheel
      return hook(config_settings)
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 149, in get_requires_for_build_wheel
      return self._get_build_requires(
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 130, in _get_build_requires
      self.run_setup()
    File "/tmp/pip-build-env-hag0sxvp/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 145, in run_setup
      exec(compile(code, __file__, 'exec'), locals())
    File "setup.py", line 217, in <module>
      os.unlink(fairseq_examples)
  IsADirectoryError: [Errno 21] Is a directory: 'fairseq/examples'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /home/frankier/sources/datasets/.venv/bin/python3 /tmp/tmp_ujftsgi_in_process.py get_requires_for_build_wheel /tmp/tmpmn0eumq2 Check the logs for full command output.
```

I believe the reason for this is that the source package contains the examples directory because it was put there during package creation (the symlink appears to have become a real directory). When setup.py runs again, it attempts to unlink the directory, which fails because a directory cannot be removed with os.unlink. This PR therefore only attempts to unlink it if it is a symlink. I have not thoroughly tested whether my proposed cause is the true cause, but this should fix the error in any case.

Note that the source package is fetched because there is no wheel for Python 3.9, so most users will not see this because they will use the wheel.
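A minimal sketch of the proposed guard (the function name and default path are illustrative; the actual change lives inline in fairseq's setup.py): only unlink `fairseq/examples` when it is actually a symlink, so an sdist build where it has become a real directory no longer crashes.

```python
import os

def remove_examples_link(path="fairseq/examples"):
    # os.unlink can remove a symlink but raises IsADirectoryError on a
    # real directory, so guard with islink before attempting it.
    if os.path.islink(path):
        os.unlink(path)
```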

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3235

Reviewed By: alexeib

Differential Revision: D26513259

Pulled By: myleott

fbshipit-source-id: 775d6c636a5867b9983bb6419829f13ee414e2fd
Summary:
# Before submitting

- [NO] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [YES] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [YES] Did you make sure to update the docs?
- [NO] Did you write any new necessary tests?

## What does this PR do?

This is a typo fix to the Hydra Integration doc: the example with a dataclass config should use `FairseqTask`, not `LegacyFairseqTask`.

Didn't make an issue for this as it's a trivial doc change for the example to match the actual doc.

Pull Request resolved: fairinternal/fairseq-py#1619

Reviewed By: huihuifan

Differential Revision: D26448855

Pulled By: Mortimerp9

fbshipit-source-id: 467323101b8425370f6bd7c0532e70abb319b337
Summary:
This diff
1. Updates FairseqSimulSTAgent to make it generic and reusable internally [Touches OSS]
2. Adds FBFairseqSimulSTAgent inheriting FairseqSimulSTAgent
3. Add TARGETS file in examples/speech_to_text
4. Update simuleval TARGETS and add a bento kernel for easy testing

Reviewed By: jmp84

Differential Revision: D26573214

fbshipit-source-id: f4b71f90693cc878cc771b46a006bcbc83a50124
…3247)

Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?

Fixes #3246
Fixes #3248

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3247

Reviewed By: myleott

Differential Revision: D26513267

Pulled By: lematt1991

fbshipit-source-id: 958de0b3a58a0dd2a56bd6c6d7fb2644a89f6746
Summary:
# Before submitting

- [N] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [Y] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [Y] Did you make sure to update the docs?
- [N] Did you write any new necessary tests?

## What does this PR do?
Small fixes in the script and documentation for correctly reproducing the results in the corresponding paper.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3264

Reviewed By: lematt1991

Differential Revision: D26587397

Pulled By: myleott

fbshipit-source-id: 3675ec4d4388cafa224d395e08b53667f142cb27
Summary:
- Use `PathManager.ls` instead of `os.listdir`
- Add version.txt to fairseq TARGETS

Reviewed By: vishrav

Differential Revision: D26579091

fbshipit-source-id: 20d57dc19335a3006cd5fa6d1a3d5e878b105874
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
fixes a circular import reported by Python

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3257

Reviewed By: jmp84

Differential Revision: D26587382

Pulled By: myleott

fbshipit-source-id: a8a6e7bee4dcfa6baf934c257958b7d7592205c8
Summary:
Batch level sampling (each batch comes from a dataset sampled from some distribution) is useful in cases where we have a criterion that makes this assumption or a unique collator per dataset. However, the current implementation in fairseq `MultiCorpusSampledDataset` is inefficient, because it packs batches by assuming the size of item i is `max(dataset.size(i % len(dataset)) for dataset in datasets)`, which often significantly overestimates the actual sampled item's size, especially with many datasets.

We can make this more efficient by modifying `MultiCorpusDataset`, which can do efficient batch sampling by:

1. Every epoch, sampling the indices/dataset to train on.
2. When creating batches, create per-dataset batches and merge them together
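The two steps above can be sketched as follows. This is a minimal illustration only — the function names, the weighted-sampling scheme, and the fixed batch size are assumptions for the sketch, not the actual `MultiCorpusDataset` API:

```python
import random

def sample_epoch_indices(dataset_sizes, weights, num_samples, seed=0):
    """Step 1: at the start of each epoch, sample which dataset each index comes from."""
    rng = random.Random(seed)
    picks = rng.choices(range(len(dataset_sizes)), weights=weights, k=num_samples)
    per_dataset = {i: [] for i in range(len(dataset_sizes))}
    for ds_idx in picks:
        per_dataset[ds_idx].append(rng.randrange(dataset_sizes[ds_idx]))
    return per_dataset

def make_batches(per_dataset, batch_size, seed=0):
    """Step 2: batch within each dataset, then merge and shuffle the batch list.

    Each batch draws from a single dataset, so a per-dataset collator applies
    cleanly and item sizes are never overestimated across datasets.
    """
    batches = []
    for ds_idx, indices in per_dataset.items():
        for start in range(0, len(indices), batch_size):
            batches.append((ds_idx, indices[start:start + batch_size]))
    random.Random(seed).shuffle(batches)  # interleave datasets across the epoch
    return batches
```

Because every batch references exactly one dataset, the size of each sampled item is known exactly, avoiding the `max(...)` overestimate described above.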

Reviewed By: jay-mahadeokar

Differential Revision: D26601515

fbshipit-source-id: a3273f88d86d7922f9ba004e7324e909ecc6ecf7
Summary:
### Measurements
TLDR: This saves ~8% CPU RAM when training a tiny model on a medium-sized dataset (11GB on disk)

Command below:

```
+---------------------+----------------+---------+--------+
| fname               |   cpu_mem_used |     wps |    ppl |
+=====================+================+=========+========+
| branch_nw8_2gpu.log |          25.41 | 54721   | 429.1  |
+---------------------+----------------+---------+--------+
| master_nw8_2gpu.log |          27.53 | 52833.1 | 429.1  |
+---------------------+----------------+---------+--------+
```

### Command

```
base_cmd () {
  dd=$1
  shift
  fairseq-train --fp16 $dd \
            --task language_modeling \
            --arch transformer_lm_gpt2_tiny \
            --sample-break-mode complete --tokens-per-sample 512 \
            --optimizer adam --clip-norm 0.0 --lr 0.0005 \
            --batch-size 1 \
            --max-update 200 --max-epoch 1 \
            --log-format simple --log-interval 100 \
            --restore-file x.pt --no-save \
            --skip-invalid-size-inputs-valid-test --disable-validation $@
}
CUDA_VISIBLE_DEVICES=0,1 base_cmd /private/home/sshleifer/data-bin/stories_mmap --num-workers 8
```

Pull Request resolved: fairinternal/fairseq-py#1647

Reviewed By: myleott

Differential Revision: D26628861

Pulled By: sshleifer

fbshipit-source-id: 142afe0358d1c4cae448828ba811b211406509d7
Summary: Pull Request resolved: fairinternal/fairseq-py#1641

Reviewed By: myleott

Differential Revision: D26607648

Pulled By: sshleifer

fbshipit-source-id: 9d7f9d7a0825e3124c181b651a126842e5de6109
Summary: Pull Request resolved: fairinternal/fairseq-py#1637

Test Plan:
```bash
python examples/bart/summarize.py --model-dir pytorch/fairseq --model-file bart.large.cnn --src $HOME/data-bin/cnn_dm/test.source --n 12 --out hub_hypo.txt

python examples/bart/summarize.py \
  --model-dir pytorch/fairseq \
  --model-file bart.large.cnn \
  --src cnn_dm/test.source \
  --out cnn_dm/test.hypo --xsum-kwargs
```

Reviewed By: ngoyal2707

Differential Revision: D26581703

Pulled By: sshleifer

fbshipit-source-id: 80eb28012f7770eee01ed50a1163c5a2c5cc6d37
Summary: Pull Request resolved: fairinternal/fairseq-py#1649

Reviewed By: stephenroller

Differential Revision: D26639303

Pulled By: myleott

fbshipit-source-id: 7def925cd7885cfe85d542464316cbc0f2ba6d2c
EdanSneh and others added 29 commits August 2, 2021 18:40
Summary: Adding the fairseq entrypoint section of the e2e pipeline so that passing FairseqConfig to hydra_main runs smoothly

Reviewed By: jieru-hu

Differential Revision: D29714729

fbshipit-source-id: e3694e0037bb4c4f69208c1d6ec7df91d42fb588
Summary: Implemented fixed-bit scalar quantization with quant noise for pytext models

Reviewed By: AkshatSh

Differential Revision: D29662977

fbshipit-source-id: ebab68a4a5ff1583a0c6dfadcf2671663e232c18
Summary:
- stores exp_avg and exp_sq_avg in fp16, with `scale` variables to avoid overflow.
- myleott added this to gshard, following github.com/openai/jukebox/blob/master/jukebox/utils/fp16.py
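A minimal sketch of the fp16-with-scale trick: store the state tensor in fp16 together with a scalar `scale` chosen so the stored values cannot overflow the fp16 range. Illustrated here with NumPy rather than torch; the headroom constant and function names are assumptions, not the actual implementation:

```python
import numpy as np

FP16_HEADROOM = 32000.0  # keep stored magnitudes well under the fp16 max (~65504)

def to_fp16_with_scale(t):
    """Store a tensor in fp16 plus a scalar scale, so large optimizer-state
    values (exp_avg, exp_sq_avg) don't overflow the fp16 range."""
    scale = max(float(np.abs(t).max()) / FP16_HEADROOM, 1e-8)
    return (t / scale).astype(np.float16), scale

def from_fp16_with_scale(t16, scale):
    """Recover an fp32 view before the optimizer update reads the state."""
    return t16.astype(np.float32) * scale
```

The roundtrip loses only fp16 precision (~0.05% relative error) while halving the memory of the optimizer state tensors.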

Pull Request resolved: fairinternal/fairseq-py#2139

Reviewed By: myleott

Differential Revision: D30113175

Pulled By: sshleifer

fbshipit-source-id: 03995c8eb096629675eadec4e7b8e7f18fc2730e
Summary:
## What does this PR do?
Adds GSLM directory with README.

Pull Request resolved: fairinternal/fairseq-py#2151

Reviewed By: wnhsu

Differential Revision: D30147672

Pulled By: hikushalhere

fbshipit-source-id: bcc7cbbde3626ea3d91917707a91aff85d715baa
Summary:
adds finetuned robust w2v models and updates readme

fixes #3721

Pull Request resolved: fairinternal/fairseq-py#2196

Reviewed By: wnhsu

Differential Revision: D30367999

Pulled By: alexeib

fbshipit-source-id: 616b373bf31265c89f694fba7dccce2961d394f3
Summary:
## What does this PR do?
Fixes an OOM that happens on TPUs when dynamic batching exceeds the maximum a single core can work with.

Pull Request resolved: #3781

Reviewed By: wnhsu

Differential Revision: D30327091

Pulled By: alexeib

fbshipit-source-id: 0ebe6b18329fa05d359083fa8ac54aba7b48bc53
Summary:
Fix fairinternal/fairseq-py#2177 for the transformer conversion to Hydra.

The way the defaults are dealt with now is different, so when you use the legacy Namespace configuration, you end up with a default encoder_embed_dim, which in the VGG case sets up an encoder attention in the TransformerDecoderLayer with the wrong dimensions.
The easiest solution is to erase the default value for encoder_embed_dim (by forcing it to None) when converting the VGG config to the raw Namespace for the decoder layer.

Tested with:
`pytest tests/speech_recognition/test_vggtransformer.py -k Transformer`

Pull Request resolved: fairinternal/fairseq-py#2213

Test Plan: pytest tests/speech_recognition/test_vggtransformer.py -k Transformer

Reviewed By: sshleifer

Differential Revision: D30425143

Pulled By: Mortimerp9

fbshipit-source-id: 92f6dea2ffbb68e441700bcc55274b3167a587b3
Summary:
## What does this PR do?
Open sourcing code for Generative Spoken Language Modeling

Pull Request resolved: fairinternal/fairseq-py#2201

Reviewed By: wnhsu, eugene-kharitonov

Differential Revision: D30563114

Pulled By: hikushalhere

fbshipit-source-id: 6c1ee3b29038fd2c9fb5939bddcc70af0794dab4
Summary: Pull Request resolved: fairinternal/fairseq-py#2239

Reviewed By: sshleifer, ngoyal2707

Differential Revision: D30574791

Pulled By: myleott

fbshipit-source-id: 0f83e6ffe53d608292545884df269a604a57448d
Summary:
1. added a test for generating pad tokens during beam search with prefix tokens
2. modified lprobs for the pad token and prefix tokens to avoid generating pad
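The lprobs modification can be sketched as follows — a simplified single-step illustration with hypothetical names, not the actual fairseq sequence generator code:

```python
import math

def apply_prefix_constraint(lprobs, prefix_token, pad_idx):
    """Mask one step's log-probabilities so beam search must emit the given
    prefix token and can never emit pad while prefix tokens remain."""
    out = [-math.inf] * len(lprobs)
    if prefix_token != pad_idx:
        # only the forced prefix token keeps its score; pad stays at -inf
        out[prefix_token] = lprobs[prefix_token]
    return out
```

Setting everything else to `-inf` (rather than to the minimum observed lprob) guarantees no other token, pad included, can outscore the forced prefix token.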

# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: fairinternal/fairseq-py#2227

Reviewed By: xianxl

Differential Revision: D30649356

Pulled By: jingfeidu

fbshipit-source-id: d94903a912e767391c8fca61f98f65b5cea3b56e
Summary:
Pull Request resolved: fairinternal/fairseq-py#2236

The test_eval_bleu unittest in TestTranslation in tests/test_binaries.py failed after the sacrebleu version was updated to 2.0.0 in the OSS testing tool. Added a fix so that the test passes with both sacrebleu 1.x and 2.0.0.

Reviewed By: myleott, sravyapopuri388

Differential Revision: D30525920

fbshipit-source-id: 8ef27509cec45422a8d22003c87c2a7acb55225d
Summary:
## What does this PR do?

Currently, binarized datasets are stored as a bin representation of int tensors. At best, each int is coded as uint16 on disk.

When coding a fixed-size-vocabulary dataset where we know the frequency of each symbol and where some symbols are more common than others, we can do better. This happens in particular when binarizing a dataset split into subword units, as the most common "tokenizers" like bpe and spm will choose subwords with high frequencies over subwords with low frequencies.

In practice, if we know the frequency of all symbols (or a good estimate), we can use entropy encoding methods to compress the data. The idea is to assign a compressed representation where frequent symbols have shorter representations than infrequent symbols.

In this PR, we build a Huffman code from a frequency table and use this code to encode a dataset. The PR provides the huffman coder implementation (using the single queue approach as we usually start with a sorted set of symbols) as well as a memory map implementation of a dataset that stores the data compressed with a huffman code and can return indexed tensors from it.

Over a whole dataset, depending on how many symbols we sample to evaluate the frequency, we can save between 25% and 30% of storage space.
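As a rough illustration of the idea, a Huffman code over a frequency table can be built in a few lines. This sketch uses a heap rather than the single-queue construction mentioned above, and the function name is hypothetical — it is not the fairseq `HuffmanCodeBuilder`:

```python
import heapq
from collections import Counter

def build_huffman_code(freqs):
    """Build a prefix-free binary code from a {symbol: count} table.

    Frequent symbols get shorter codewords, which is where the 25-30%
    storage savings on subword-tokenized data comes from.
    """
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # heap entries: (total_count, tiebreak, {symbol: partial codeword})
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        # merge the two rarest subtrees, prefixing their codewords with 0/1
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (c1 + c2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]
```

The resulting code is prefix-free, so an encoded stream can be decoded unambiguously without separators.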

## Follow Ups

currently the binarizer/preprocess scripts make too many assumptions about the dataset writers, so the huffman dataset writer cannot be used straight out of the box with them. I will make follow-up PRs to provide easy-to-use scripts to build such datasets. But it's as simple as doing:
```
# first pass: estimate symbol frequencies to build the code
code_builder = HuffmanCodeBuilder()
with open(sample_file, 'r', encoding="utf-8") as f:
    for line in f:
        code_builder.add(*line.strip().split(" "))

coder = code_builder.build_code()

# second pass: encode the dataset with the resulting Huffman code
with HuffmanMMapIndexedDatasetBuilder('/tmp/testing_huffman', coder) as builder:
    with open(dataset_file, 'r', encoding="utf-8") as f:
        for line in f:
            builder.add_item(line.strip().split(' '))
```

a lot of the `HuffmanMMapIndexedDataset` code comes from the normal `MMapIndexedDataset`, and we could probably extract the commonalities into a base class

the `HuffmanCoder` is also really a special kind of `Dictionary` and again, a common base class could be abstracted out of them.

Pull Request resolved: fairinternal/fairseq-py#2029

Reviewed By: dianaml0

Differential Revision: D29557468

Pulled By: Mortimerp9

fbshipit-source-id: a01b6d98f38f937934cadebb3786133e257adefe
Summary:
Adds Exponential moving average (EMA) model for Kaizen semi-supervised training https://arxiv.org/abs/2106.07759

1. Add `ema.store_ema` to enable storing EMA. EMA will be written to extra_state in the state dict while saving checkpoint.
2. `ema.ema_start_update` to control when the EMA starts accumulating
3. Tasks can use `uses_ema` property to decide if the EMA should be passed to the task. (Default is False)
4. `load_ema_from_checkpoint` can be used to load the EMA model in place of the model to be used for evaluation. Pyspeech has an eval-ema option for this.

```
This module has the EMA class used to store a copy of the exponentially decayed
model params.

Typical usage of EMA class involves initializing an object using an existing
model (random or from a seed model) and setting the config like ema_decay,
ema_start_update which determine how the EMA model is updated. After every
update of the model i.e. at the end of the train_step, the EMA should be updated
by passing the new model to the EMA.step function. The EMA model state dict
can be stored in the extra state under the key of "ema" and dumped
into a checkpoint and loaded. The EMA object can be passed to tasks
by setting task.uses_ema property.
EMA is a smoothed/ensemble model which might have better performance
when used for inference or further fine-tuning. EMA class has a
reverse function to load the EMA params into a model and use it
like a regular model.
```
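The update rule described above can be shown on plain per-parameter floats. The function signature is hypothetical — the real EMA class operates on model state dicts and torch tensors:

```python
def ema_step(ema_state, model_state, decay=0.999, num_updates=0, ema_start_update=0):
    """One EMA update after a train step: ema = decay * ema + (1 - decay) * model.

    Before ema_start_update the decay is forced to 0, so the EMA simply
    copies the model until accumulation begins.
    """
    d = decay if num_updates >= ema_start_update else 0.0
    return {k: d * ema_state[k] + (1.0 - d) * v for k, v in model_state.items()}
```

Called at the end of every train_step, this yields the smoothed/ensemble parameters that are then stored under the "ema" key in extra_state.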

Reviewed By: cruvadom

Differential Revision: D24238379

fbshipit-source-id: 879d3ba5070a614b7d365f9503af357001e875b2
…ributional Hypothesis" (#1930)

Summary:
Paper submitted to EMNLP: https://arxiv.org/abs/2104.06644

Pull Request resolved: fairinternal/fairseq-py#1930

Reviewed By: lematt1991

Differential Revision: D28885634

Pulled By: shruti-bh

fbshipit-source-id: d433c87cff3603b3e676a129029a827c510a72c7
Summary:
# Before submitting
the default score was set as the min score of all lprobs, which would let us select tokens other than prefix tokens during beam search; this change uses a (pretty hacky) way to make it smaller than any lprob.
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: fairinternal/fairseq-py#2267

Reviewed By: myleott

Differential Revision: D30730475

Pulled By: jingfeidu

fbshipit-source-id: 7dab4e9ed2fc094910467bad776155230987e21a
Summary: As title

Reviewed By: zhengwy888, xiaoxiao26

Differential Revision: D30621478

fbshipit-source-id: d79aba3f98d39a5c46a53bf206522c5f7d05e02a
Summary:
## TL;DR
Fairseq checkpoint saving and loading should mirror torch's checkpoint by saving and loading "state_dict()._metadata".

## Long Story:

#### What happened:
During model loading and saving, quantization-aware-training models in PyTorch encounter a weird bug that says state_dict "fake_weight_quant.weight.min_val" is mismatched to "min_vals".

#### What was the reason:
- We found that torch uses state_dict()._metadata to store module._version, but the metadata was never stored in the checkpoint, nor is it loaded during checkpoint loading in fairseq.
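A sketch of the mirrored save/load. The checkpoint key `model_metadata` is a hypothetical name for illustration; only the idea — persisting `_metadata` alongside the weights so per-module versions survive the roundtrip — reflects the bug described above:

```python
from collections import OrderedDict

def save_with_metadata(state_dict):
    """Build a checkpoint that keeps state_dict()._metadata.

    torch stores per-module versions in the `_metadata` attribute of the
    state dict; copying into a plain dict silently drops that attribute.
    """
    checkpoint = {"model": dict(state_dict)}
    metadata = getattr(state_dict, "_metadata", None)
    if metadata is not None:
        checkpoint["model_metadata"] = metadata
    return checkpoint

def load_with_metadata(checkpoint):
    """Restore the state dict and re-attach _metadata before load_state_dict."""
    state = OrderedDict(checkpoint["model"])
    if "model_metadata" in checkpoint:
        state._metadata = checkpoint["model_metadata"]
    return state
```

With `_metadata` re-attached, version-dependent key remapping (such as min_val vs. min_vals in quantization observers) resolves correctly during loading.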

Reviewed By: frankseide

Differential Revision: D30649933

fbshipit-source-id: ce262486b9b95fbcece463fa05c4e1903d4232d7
Summary:
1) add annotation for encoder_out
2) force dropout to be float for jitable purpose.

Reviewed By: cndn

Differential Revision: D30826657

fbshipit-source-id: aca79845d7ae48d450b602a7be8f56404f4c7bab
Summary: [fairseq-py] add speech synthesis preprocessing and evaluation scripts

Reviewed By: wnhsu

Differential Revision: D30720282

fbshipit-source-id: 6e4b098b6f56fff41b82af4347518d7f7905c801
Summary: [fairseq-py] update S2T

Reviewed By: wnhsu

Differential Revision: D30720434

fbshipit-source-id: dc4e46b0cc3dec24943baeabe59424dabd5be38f
…ict" (#3861)

Summary:
Pull Request resolved: #3861

Back out the fairseq changes; fix with the suggested, more optimal changes in checkpoint utils.

Reviewed By: zhengwy888

Differential Revision: D30886481

fbshipit-source-id: 12b6dd4d5107ab4371b73a58d9a044a17c733260
Summary:
Pull Request resolved: #3862

We resolved a bug for the missing "_metadata" attribute for pytorch models during checkpoint saving and loading by forcing state["model"]["_metadata"], but it's not an efficient solution due to the expensive model.state_dict() invocation. This diff offers an alternative solution.

Reviewed By: zhengwy888

Differential Revision: D30857147

fbshipit-source-id: 5daa978e2a558ad4159e2da55470253950151bde
Summary: [fairseq-py] add TTS

Reviewed By: wnhsu

Differential Revision: D30720666

fbshipit-source-id: b5288acec72bea1d3a9f3884a4ed51b616c7a403
Summary: Aligned training was not using batch_by_size in the dataset. Due to this, it was not possible to use batch sampling in MultiCorpusDataset with different transforms and collators for different datasets.
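For context, `batch_by_size` greedily groups indices into batches whose padded token count stays under a cap. A simplified sketch — not fairseq's actual implementation, which also handles max-sentences limits and size multiples:

```python
def batch_by_size(indices, num_tokens_fn, max_tokens):
    """Greedily group indices into batches whose padded size
    (batch length x longest item in the batch) stays under max_tokens."""
    batches, cur, cur_max = [], [], 0
    for idx in indices:
        n = num_tokens_fn(idx)
        new_max = max(cur_max, n)
        # adding idx would pad every item in the batch up to new_max tokens
        if cur and new_max * (len(cur) + 1) > max_tokens:
            batches.append(cur)
            cur, new_max = [], n
        cur.append(idx)
        cur_max = new_max
    if cur:
        batches.append(cur)
    return batches
```

Because each emitted batch is a contiguous index group, MultiCorpusDataset can route every batch through the transform and collator of the dataset it was drawn from.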

Reviewed By: xiaoxiao26

Differential Revision: D30889985

fbshipit-source-id: 224ad55d2337681a06a82caf19900e5a241a3d6a
Summary:
Fixing issues ([3546](#3546)) with latency-augmented training for MMA due to changes in the fairseq APIs

Pull Request resolved: fairinternal/fairseq-py#2087

Reviewed By: hygong-fb

Differential Revision: D29851286

Pulled By: xutaima

fbshipit-source-id: 6c3077db06b89c23b312b28527d7395a725f3b3a
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3879

Reviewed By: myleott

Differential Revision: D30969142

Pulled By: dianaml0

fbshipit-source-id: 902154c03fd68ae6645d3e0ac07b7d729dfc7934
…change (#2297)

Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: fairinternal/fairseq-py#2297

Reviewed By: alexeib

Differential Revision: D30906090

Pulled By: dianaml0

fbshipit-source-id: 941d30db7f766c9077a1b5bb2a04680f57e2e070
#3773)

Summary:
…verride the defaults

# Before submitting

- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes #3761.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: #3773

Reviewed By: yuntang

Differential Revision: D30310383

Pulled By: kahne

fbshipit-source-id: cbfcbc032dbf53490a25ffdebe57f65c42d52e71
Summary: Update reference from master to main elsewhere in fbcode

Reviewed By: alexeib

Differential Revision: D30938472

fbshipit-source-id: 243b98550207f241c9d3265bf3d4060350aaf0a8
@facebook-github-bot facebook-github-bot deleted the master branch September 20, 2021 21:40