Qwen3.6 35B-A3B fails to start in intel/llm-scaler-vllm:0.14.0-b8.2.1 after downgrading transformers #423

@noobHappylife

Description

I'm trying to run Qwen/Qwen3.6-35B-A3B on a machine with 4x B70 and noticed the following. Perhaps the release notes need to be updated?

The same command gives different results depending on whether the fix is applied.

Scenario 1) vLLM server started properly

Scenario 2) vLLM server failed to start

  • Image: intel/llm-scaler-vllm:0.14.0-b8.2.1
  • Downgraded to transformers==5.3.0
  • Error: RuntimeError: Worker failed with error 'oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type', please check the stack trace above for the root cause
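
For anyone reproducing this, it may help to confirm which package versions are actually active inside the container before launching the server, since Scenario 2 involves a downgrade to transformers==5.3.0. Below is a minimal sketch using only the Python standard library. The helper names `installed_version` and `report` are made up for illustration, and the list of packages to check (`transformers`, `vllm`, `torch`) is an assumption about what is relevant here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(packages: List[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Assumed list of packages worth comparing between the two scenarios.
    for name, ver in report(["transformers", "vllm", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both setups and attaching the output would make it easier to see exactly what differs between the working and the failing container.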

error log:

(Worker_TP0 pid=5399)
(Worker_TP0 pid=5399) INFO 05-20 13:32:53 [default_loader.py:291] Loading weights took 25.37 seconds
(Worker_TP0 pid=5399) INFO 05-20 13:33:00 [gpu_model_runner.py:3908] Model loading took 8.67 GiB memory and 33.560575 seconds
(Worker_TP0 pid=5399) INFO 05-20 13:33:00 [gpu_model_runner.py:4721] Encoder cache will be initialized with a budget of 16384 tokens, and profiled with 1 image items of the maximum feature size.
(Worker_TP3 pid=5402) INFO 05-20 13:33:00 [gpu_model_runner.py:4721] Encoder cache will be initialized with a budget of 16384 tokens, and profiled with 1 image items of the maximum feature size.
(Worker_TP2 pid=5401) INFO 05-20 13:33:00 [gpu_model_runner.py:4721] Encoder cache will be initialized with a budget of 16384 tokens, and profiled with 1 image items of the maximum feature size.
(Worker_TP1 pid=5400) INFO 05-20 13:33:00 [gpu_model_runner.py:4721] Encoder cache will be initialized with a budget of 16384 tokens, and profiled with 1 image items of the maximum feature size.
2026:05:20-13:33:06:( 5400) |CCL_ERROR| atl_ofi.cpp:1103 prov_ep_handle_cq_err: fi_cq_readerr: err: 265, prov_err: Success(0)
2026:05:20-13:33:06:( 5401) |CCL_ERROR| atl_ofi.cpp:1103 prov_ep_handle_cq_err: fi_cq_readerr: err: 265, prov_err: Success(0)
2026:05:20-13:33:06:( 5402) |CCL_ERROR| atl_ofi.cpp:1103 prov_ep_handle_cq_err: fi_cq_readerr: err: 265, prov_err: Success(0)
2026:05:20-13:33:06:( 5400) |CCL_ERROR| atl_ofi_comm.cpp:203 allgatherv: condition check(ep_idx, recv_reqs[i]) != ATL_STATUS_FAILURE failed
check recv failed
2026:05:20-13:33:06:( 5401) |CCL_ERROR| atl_ofi_comm.cpp:203 allgatherv: condition check(ep_idx, recv_reqs[i]) != ATL_STATUS_FAILURE failed
check recv failed
2026:05:20-13:33:06:( 5402) |CCL_ERROR| atl_ofi_comm.cpp:203 allgatherv: condition check(ep_idx, recv_reqs[i]) != ATL_STATUS_FAILURE failed
check recv failed


(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822] WorkerProc hit an exception.
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822] WorkerProc hit an exception.
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP3 pid=5402) ERROR 05-20 13:33:06 [multiproc_executor.py:822]
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP1 pid=5400) ERROR 05-20 13:33:06 [multiproc_executor.py:822]
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] WorkerProc hit an exception.
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 467, in run_dp_sharded_mrope_vision_model
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     gathered_embeds = tensor_model_parallel_all_gather(image_embeds_local_padded, dim=0)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 21, in tensor_model_parallel_all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_gather(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 523, in all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_gather_out_place(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 528, in _all_gather_out_place
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_gather(input_, dim)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/base_device_communicator.py", line 153, in all_gather
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_gather_into_tensor(output_tensor, input_, group=self.device_group)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 4125, in all_gather_into_tensor
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group._allgather_base(output_tensor, input_tensor, opts)
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822]
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] WorkerProc hit an exception.
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 432, in run_dp_sharded_mrope_vision_model
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeds_local = vision_model(pixel_values_local, local_grid_thw_list)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 577, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     hidden_states = blk(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                     ^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 249, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     x = x + self.attn(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]             ^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_5_vl.py", line 415, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output, _ = self.proj(context_layer)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                 ^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/linear.py", line 1464, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = tensor_model_parallel_all_reduce(output_parallel)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 14, in tensor_model_parallel_all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_reduce(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 502, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_reduce_out_place(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 507, in _all_reduce_out_place
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_reduce(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/xpu_communicator.py", line 60, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_reduce(input_, group=self.device_group)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 3007, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group.allreduce([tensor], opts)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] Traceback (most recent call last):
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 817, in worker_busy_loop
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 84, in determine_available_memory
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._determine_available_memory_default()
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/xpu_worker.py", line 134, in _determine_available_memory_default
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     self.model_runner.profile_run()
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4737, in profile_run
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dummy_encoder_outputs = self.model.embed_multimodal(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1913, in embed_multimodal
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeddings = self._process_image_input(multimodal_input)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 1424, in _process_image_input
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return run_dp_sharded_mrope_vision_model(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/vision.py", line 432, in run_dp_sharded_mrope_vision_model
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     image_embeds_local = vision_model(pixel_values_local, local_grid_thw_list)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 577, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     hidden_states = blk(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                     ^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_vl.py", line 249, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     x = x + self.attn(
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]             ^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_5_vl.py", line 415, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output, _ = self.proj(context_layer)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]                 ^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1787, in _call_impl
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/linear.py", line 1464, in forward
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     output = tensor_model_parallel_all_reduce(output_parallel)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/communication_op.py", line 14, in tensor_model_parallel_all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return get_tp_group().all_reduce(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 502, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self._all_reduce_out_place(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/parallel_state.py", line 507, in _all_reduce_out_place
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return self.device_communicator.all_reduce(input_)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/vllm/distributed/device_communicators/xpu_communicator.py", line 60, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     dist.all_reduce(input_, group=self.device_group)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     return func(*args, **kwargs)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]   File "/usr/local/lib/python3.12/dist-packages/torch/distributed/distributed_c10d.py", line 3007, in all_reduce
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]     work = group.allreduce([tensor], opts)
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822]
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936] EngineCore failed to start.
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936] Traceback (most recent call last):
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 927, in run_engine_core
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 692, in __init__
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     super().__init__(
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 113, in __init__
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 243, in _initialize_kv_caches
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     available_gpu_memory = self.model_executor.determine_available_memory()
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 126, in determine_available_memory
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     return self.collective_rpc("determine_available_memory")
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 359, in collective_rpc
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     return aggregate(get_response())
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]                      ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 342, in get_response
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936]     raise RuntimeError(
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936] RuntimeError: Worker failed with error 'oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type', please check the stack trace above for the root cause
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:09 [multiproc_executor.py:231] Worker proc VllmWorker-3 died unexpectedly, shutting down executor.
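
For triage, the distinct failures in the log above can be pulled out with a short script. This is a minimal sketch, not part of vLLM or llm-scaler: `summarize_errors` and `LINE_RE` are hypothetical helpers, and the regex assumes the `(process pid=N) ERROR date time [file:line] message` prefix seen in this log. The sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Assumed prefix format, taken from the log in this issue:
# "(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: ..."
LINE_RE = re.compile(
    r"^\((?P<proc>\w+) pid=(?P<pid>\d+)\)\s+ERROR\s+\S+\s+\S+\s+\[[^\]]+\]\s+"
    r"(?P<msg>RuntimeError: .*)$"
)


def summarize_errors(log_text: str) -> dict[str, list[str]]:
    """Return the distinct RuntimeError messages per process, in first-seen order."""
    seen: dict[str, list[str]] = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        # Skip non-matching lines and messages already recorded for this process.
        if m and m["msg"] not in seen[m["proc"]]:
            seen[m["proc"]].append(m["msg"])
    return dict(seen)


# Lines copied from the log above (the worker tracebacks are logged twice).
sample = """
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP2 pid=5401) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: atl_ofi_comm.cpp:203 allgatherv: EXCEPTION: check recv failed
(Worker_TP0 pid=5399) ERROR 05-20 13:33:06 [multiproc_executor.py:822] RuntimeError: oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type
(EngineCore_DP0 pid=5273) ERROR 05-20 13:33:06 [core.py:936] RuntimeError: Worker failed with error 'oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type', please check the stack trace above for the root cause
"""

result = summarize_errors(sample)
for proc, messages in result.items():
    for msg in messages:
        print(f"{proc}: {msg}")
```

On this log the summary separates two different oneCCL failures: Worker_TP2 fails in `allgatherv` ("check recv failed"), while Worker_TP0 fails in `all_reduce` ("unknown memory type"). Both occur during `profile_run` inside `embed_multimodal`.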
