fix(megatron-fp16): avoid assert crash & double grad scaling under fp16 #44

Open
jamesruio wants to merge 1 commit into redai-infra:main from jamesruio:fix/fix-fp16-assert


@jamesruio

Bug Fix

  • Fix a crash in training with --fp16 + dynamic loss scaling, which failed immediately on the first gradient overflow:
  • Replace the two-step flow with a single optimizer.step() call, eliminating both the double unscale and the double scaler update.
  • Detect overflow via the (False, None, None) return signature; set valid_step=False, skip the LR scheduler step, and log a warning with step_id and the current loss scale so the dynamic scaling behavior is observable.
  • Move the CI MTP grad check before optimizer.step() to match the original comment intent (gradients may be modified during step).
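
The flow described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeOptimizer`, `FakeScheduler`, and this `train_one_step` are hypothetical stand-ins, and the layout of the success tuple is an assumption. The only behavior taken from the description is that `optimizer.step()` returns `(False, None, None)` on overflow, that the step is then marked invalid, and that the LR scheduler is not advanced.

```python
import logging

logger = logging.getLogger("fp16_step_sketch")


class FakeOptimizer:
    """Hypothetical stand-in that overflows on the step ids in `overflow_steps`."""

    def __init__(self, overflow_steps, loss_scale=16.0):
        self.overflow_steps = set(overflow_steps)
        self.loss_scale = loss_scale
        self.calls = 0

    def step(self):
        step_id = self.calls
        self.calls += 1
        if step_id in self.overflow_steps:
            # Overflow: back off the loss scale and skip the parameter update.
            self.loss_scale /= 2.0
            return False, None, None
        # Assumed success layout: (update_successful, grad_norm, num_zeros_in_grad).
        return True, 1.0, 0


class FakeScheduler:
    """Hypothetical LR scheduler that only counts how often it was stepped."""

    def __init__(self):
        self.num_steps = 0

    def step(self):
        self.num_steps += 1


def train_one_step(step_id, optimizer, scheduler):
    # Single call: no external prepare_grads() before step().
    update_successful, grad_norm, _ = optimizer.step()
    valid_step = bool(update_successful)
    if not valid_step:
        # Overflow is an expected event under dynamic loss scaling, not a fatal error.
        logger.warning(
            "step %d: gradient overflow, update skipped; loss scale is now %s",
            step_id,
            optimizer.loss_scale,
        )
        return valid_step, grad_norm
    scheduler.step()
    return valid_step, grad_norm


optimizer = FakeOptimizer(overflow_steps=[1])
scheduler = FakeScheduler()
valid_flags = [train_one_step(i, optimizer, scheduler)[0] for i in range(3)]
```

In this toy run the second step overflows, so the scheduler advances only twice across three steps and the loss scale is halved once.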

Aligns with THUDM/slime#1842.

Detail

Before this fix, training with --fp16 + dynamic loss scaling crashed immediately on the first gradient overflow:

File "relax/backends/megatron/model.py", line 529, in train_one_step
    assert update_successful
AssertionError

Root cause: in fp16 with dynamic loss scaling, optimizer.step() returns (False, None, None) on overflow as the documented signal for the loss scaler to reduce the scale and retry next step. The previous code path treated this as a fatal error.
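
For context, the back-off-and-retry behavior of a dynamic loss scaler can be sketched as below. `ToyDynamicScaler`, its parameter names, and its default values are illustrative assumptions rather than the scaler used by Megatron or Relax; real implementations may apply a different policy.

```python
class ToyDynamicScaler:
    """Illustrative dynamic loss scaler: shrink on overflow, grow after a clean streak."""

    def __init__(self, initial_scale=16.0, backoff_factor=0.5,
                 growth_factor=2.0, growth_interval=3, min_scale=1.0):
        self.scale = initial_scale
        self.backoff_factor = backoff_factor
        self.growth_factor = growth_factor
        self.growth_interval = growth_interval
        self.min_scale = min_scale
        self.good_steps = 0  # consecutive steps without inf/nan gradients

    def update(self, found_inf):
        if found_inf:
            # Overflow: reduce the scale and restart the clean-step counter.
            self.scale = max(self.scale * self.backoff_factor, self.min_scale)
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps == self.growth_interval:
                # Enough clean steps in a row: try a larger scale again.
                self.scale *= self.growth_factor
                self.good_steps = 0


scaler = ToyDynamicScaler()
scaler.update(found_inf=True)       # overflow: scale drops from 16.0 to 8.0
scale_after_overflow = scaler.scale
for _ in range(3):
    scaler.update(found_inf=False)  # three clean steps: scale grows back
scale_after_recovery = scaler.scale
```

The point of the sketch is that an overflow is part of normal operation: the step is skipped, the scale is reduced, and training continues.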

Additionally, the previous flow called optimizer.prepare_grads() externally and then optimizer.step(), which internally calls prepare_grads() again. This caused two correctness issues:

  • _unscale_main_grads_and_check_for_nan ran twice, multiplying main grads by inv_scale (that is, dividing by the loss scale) a second time
  • grad_scaler.update() ran twice, double-advancing the loss-scale growth interval
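
The arithmetic behind these two effects can be illustrated with a small toy example. The helpers `unscale` and `steps_until_growth`, and the numbers used, are hypothetical and chosen only to make the effect visible; they do not reproduce the optimizer's internals.

```python
def unscale(grads, loss_scale):
    """Multiply gradients by inv_scale = 1 / loss_scale."""
    inv_scale = 1.0 / loss_scale
    return [g * inv_scale for g in grads]


def steps_until_growth(growth_interval, updates_per_step):
    """Count training steps until a clean-step tracker reaches growth_interval."""
    tracker = 0
    steps = 0
    while tracker < growth_interval:
        steps += 1
        tracker += updates_per_step
    return steps


loss_scale = 16.0
true_grads = [0.5, -2.0]
scaled_grads = [g * loss_scale for g in true_grads]  # what backward produces

unscaled_once = unscale(scaled_grads, loss_scale)    # recovers the true gradients
unscaled_twice = unscale(unscaled_once, loss_scale)  # too small by a factor of loss_scale

# With an illustrative growth interval of 1000 clean updates:
single_update = steps_until_growth(1000, updates_per_step=1)
double_update = steps_until_growth(1000, updates_per_step=2)
```

Here `unscaled_twice` is `[0.03125, -0.125]` instead of `[0.5, -2.0]`, and the scale would grow after 500 training steps instead of 1000.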

@yxyOo

yxyOo commented Jun 11, 2026

Member

Thanks for the fix. The fp16 overflow handling is headed in the right direction. I have a few comments below:

  1. Could you share the reproduction configuration? I was unable to reproduce the update_successful assertion crash with Qwen3-30B-A3B. Having the exact config will help us verify the same code path.
  2. Regarding the double unscale and duplicate grad_scaler.update() calls: consolidating them into a single step() is a great refactor. To clarify: this change does not resolve an active functional bug. In the original code, prepare_grads() only performs unscaling and scaler updates when self.grad_scaler exists. For fp16 workflows, the external prepare_grads() logic was never executed; for standard bf16 workflows, grad_scaler is null. Double unscaling would only occur in edge cases using bf16/fp32 with explicit --loss-scale and check_for_nan=False. This simplification is still valuable, so I suggest adjusting the "correctness bug" description accordingly.
  3. Moving the MTP CI gradient check before step() looks correct to me.

We’ll merge this PR into our internal branch, run full CI & CE tests, and sync to main after all tests pass.
Thank you again for your work!

@jamesruio

jamesruio commented Jun 15, 2026

Author

@yxyOo Thanks for your reply! I could reproduce the update_successful assertion crash with Qwen3-4B; the script is provided below.

`
set -ex
set -o pipefail

now=$(date "+%Y-%m-%d-%H:%M:%S")
echo "Current time: $now"

export WORKDIR=/workspace/wurui04
export MODEL_DIR=/workspace/wurui04 # common parent directory for model weights and training samples
export PROJECT_NAME=Relax-Qwen3-4B-P800 # arbitrary
export MODEL_CONFIG_DIR=${MODEL_DIR}/Relax/scripts/models # path to the Relax model scripts
export MEGATRON=${MODEL_DIR}/Megatron-LM
export NUM_GPUS=8

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"

if [ -z "${RELAX_ENTRYPOINT_MODE:-}" ]; then
source "${SCRIPT_DIR}/../../entrypoint/local.sh"
fi
source "${MODEL_CONFIG_DIR}/qwen3-4B.sh"

PROJECT_NAME="${PROJECT_NAME:=Relax/dev/dapo-math}"
EXP_DIR="${MODEL_DIR:=${SCRIPT_DIR}/../../../../exps}"
NUM_ROLLOUT="${NUM_ROLLOUT:=400}"

CKPT_ARGS=(
--hf-checkpoint ${EXP_DIR}/Qwen3-4B/
--ref-load ${EXP_DIR}/Qwen3-4B/
--megatron-to-hf-mode bridge
--save ${EXP_DIR}/Qwen3-4B_mcore_8xgpu/
--save-interval 200
)

PROMPT_SET=${EXP_DIR}/dapo-math-17k/dapo-math-17k.jsonl

ROLLOUT_ARGS=(
--prompt-data ${PROMPT_SET}
--input-key prompt
--label-key label
--apply-chat-template
--rollout-shuffle

--rm-type dapo
--reward-key score

--num-rollout ${NUM_ROLLOUT}
--rollout-batch-size 32
--n-samples-per-prompt 8
--rollout-max-response-len 8192
--rollout-temperature 1

--global-batch-size 256
--balance-data
--use-fault-tolerance
)

EVAL_ARGS=(
--skip-eval-before-train
--log-passrate
--eval-interval 100
--eval-prompt-data aime ${EXP_DIR}/aime-2024/aime-2024.jsonl
--n-samples-per-eval-prompt 8
--eval-max-response-len 16384
--eval-top-p 0.7
)

PERF_ARGS=(
--tensor-model-parallel-size 2
--sequence-parallel
--pipeline-model-parallel-size 1
--context-parallel-size 1
--expert-model-parallel-size 1
--expert-tensor-parallel-size 1

--recompute-granularity full
--recompute-method uniform
--recompute-num-layers 1

--calculate-per-token-loss

--use-dynamic-batch-size
--max-tokens-per-gpu 9216
--initial-loss-scale 16
)

GRPO_ARGS=(
--advantage-estimator grpo
--use-kl-loss
--kl-loss-coef 0.00
--kl-loss-type low_var_kl
--entropy-coef 0.00
--eps-clip 0.2
--eps-clip-high 0.28

--use-tis
)

OPTIMIZER_ARGS=(
--optimizer adam
--lr 1e-6
--lr-decay-style constant
--weight-decay 0.1
--adam-beta1 0.9
--adam-beta2 0.98
)

SGLANG_ARGS=(
--rollout-num-gpus-per-engine 1
--sglang-mem-fraction-static 0.6
)

WANDB_ARGS=(
--use-clearml
--use-metrics-service
--tb-project-name ${PROJECT_NAME}
--tb-experiment-name qwen3-4b-GRPO-gpu8-${now}
)

MISC_ARGS=(
--attention-dropout 0.0
--hidden-dropout 0.0
--accumulate-allreduce-grads-in-fp32
--attention-softmax-in-fp32
--attention-backend flash
)

mkdir -p log
ray job submit ${RAY_NO_WAIT:+--no-wait} --address="http://127.0.0.1:8265/" \
    ${WORKING_DIR:+--working-dir "${WORKING_DIR}"} \
    -- python3 -m relax.entrypoints.train \
    --resource '{"actor": [1, 8], "rollout": [1, 8]}' \
    --max-staleness 0 \
    --num-data-storage-units 1 \
    --colocate \
    --fp16 \
    --use-health-check \
    "${MODEL_ARGS[@]}" \
    "${CKPT_ARGS[@]}" \
    "${ROLLOUT_ARGS[@]}" \
    "${OPTIMIZER_ARGS[@]}" \
    "${GRPO_ARGS[@]}" \
    "${WANDB_ARGS[@]}" \
    "${PERF_ARGS[@]}" \
    "${EVAL_ARGS[@]}" \
    "${SGLANG_ARGS[@]}" \
    "${MISC_ARGS[@]}" 2>&1 | tee log/qwen3-4b-GRPO-gpu8-${now}.log`

@jamesruio

Author

2. Regarding the double unscale and duplicate grad_scaler.update() calls: consolidating them into a single step() is a great refactor. To clarify: this change does not resolve an active functional bug. In the original code, prepare_grads() only performs unscaling and scaler updates when self.grad_scaler exists. For fp16 workflows, the external prepare_grads() logic was never executed; for standard bf16 workflows, grad_scaler is null. Double unscaling would only occur in edge cases using bf16/fp32 with explicit --loss-scale and check_for_nan=False. This simplification is still valuable, so I suggest adjusting the "correctness bug" description accordingly.

@yxyOo Thanks for the clarification! To be precise, the double unscale and duplicate grad_scaler.update() issue exists only in the slime framework. The original Relax code (line 508) already avoids it.

@jamesruio

Author

@yxyOo Hi yxyOo, were you able to reproduce the update_successful assertion crash with Qwen3-4B using my script? I would greatly appreciate it if you could let me know of any progress.

@yxyOo

yxyOo commented Jun 23, 2026

Member

I couldn't reproduce the issue after training 110+ steps on 8×H800 (80GB) with your config. Have you tested this on NVIDIA GPUs?


@jamesruio

Author

Sure! I tested it on a single 8xH20 machine using the config above, based on the latest Relax image. Could you kindly review our configurations to see if there are any remaining differences? I would greatly appreciate it if this PR could be merged.
