During training, labelmodel consumes a large amount of memory and causes an out-of-memory error; reducing the batch size doesn't help. In addition, labelmodel appears to keep holding memory even at the start of a new checkpoint. For now, the master branch replaces labelmodel with a majority vote. The memory-debug branch still uses labelmodel and includes several memory profilers to help track down this bug, so debugging should probably happen on that branch.