Missing model_config argument in VLM visual ONNX export #107

Description

@StefanoS90

Describe the bug

It does not seem possible to serve Qwen3.5-0.8B (a natively vision-language model) out of the box via the Python high-level API.

The Python ONNX export of the model's visual part appears to be missing a parameter: the call to _export_visual in experimental/server/engine.py does not pass model_config.

I was able to fix this and serve Qwen3.5-0.8B via the high-level API with the following change:

diff --git a/experimental/server/engine.py b/experimental/server/engine.py
index e768751..df7eb07 100644
--- a/experimental/server/engine.py
+++ b/experimental/server/engine.py
@@ -522,12 +522,14 @@ class LLM:
         import torch
 
         _ensure_export_package()
+        from tensorrt_edgellm.config import ModelConfig
         from tensorrt_edgellm.scripts.export import (_export_visual,
                                                      _load_all_weights,
                                                      _load_config)
 
         config = _load_config(self._model_dir)
         weights = _load_all_weights(self._model_dir)
+        model_config = ModelConfig.from_pretrained(self._model_dir)
         _export_visual(
             self._model_dir,
             self._visual_onnx_dir,
@@ -535,6 +537,7 @@ class LLM:
             config,
             self._model_type,
             torch.float16,
+            model_config=model_config,
         )
         logger.info(
             "Visual ONNX export complete: %s",

Let me know whether you agree this is a bug, or whether I am missing something.
I can also open a PR with the fix afterwards.
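
If the server needs to keep working against both older and newer versions of the export package, one alternative to hard-coding the extra argument is to forward model_config only when the export function declares it. The following is a minimal sketch of that idea, not the project's code: `call_with_supported_kwargs`, `export_visual_old` and `export_visual_new` are hypothetical names used for illustration, and the stand-in functions have simplified signatures that do not mirror the real `_export_visual`.

```python
import inspect
from typing import Any, Callable


def call_with_supported_kwargs(
    func: Callable[..., Any], *args: Any, **optional_kwargs: Any
) -> Any:
    """Call func, forwarding only the optional keyword arguments it declares."""
    params = inspect.signature(func).parameters
    # A function that takes **kwargs accepts any keyword argument.
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    supported = {
        name: value
        for name, value in optional_kwargs.items()
        if accepts_var_kwargs or name in params
    }
    return func(*args, **supported)


# Hypothetical stand-ins for two versions of an export function.
def export_visual_old(model_dir, output_dir, dtype):
    return ("old", model_dir, output_dir, dtype)


def export_visual_new(model_dir, output_dir, dtype, model_config):
    return ("new", model_dir, output_dir, dtype, model_config)


if __name__ == "__main__":
    cfg = {"model_type": "qwen3_5"}  # placeholder for a real config object
    print(call_with_supported_kwargs(export_visual_old, "m", "o", "fp16", model_config=cfg))
    print(call_with_supported_kwargs(export_visual_new, "m", "o", "fp16", model_config=cfg))
```

The diff above is simpler and is probably the right fix if engine.py and the export package are always released together; the signature check only pays off when the two can drift apart.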

Steps/Code to reproduce bug

(venv) root@217da7dcd3f2:/workspace# python -m experimental.server --model Qwen/Qwen3.5-0.8B --port 8000
10:20:56 INFO edgellm.server: Resolving model: Qwen/Qwen3.5-0.8B
10:20:56 INFO edgellm.server: Downloading Qwen/Qwen3.5-0.8B from Hugging Face Hub ...
10:20:57 INFO httpx: HTTP Request: GET https://huggingface.co/api/models/Qwen/Qwen3.5-0.8B/revision/main "HTTP/1.1 200 OK"
Fetching 13 files: 100%|████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 14540.25it/s]
Download complete: : 0.00B [00:00, ?B/s] 10:20:57 INFO edgellm.server: Detected VLM model (type=qwen3_5)/13 [00:00<?, ?it/s]
10:20:57 INFO edgellm.server: Using cached ONNX: /root/.cache/huggingface/hub/models--Qwen--Qwen3.5-0.8B/snapshots/2fc06364715b967f1860aea9cf38778875588b17/.edgellm/onnx/llm
10:20:57 INFO edgellm.server: Exporting visual ONNX to /root/.cache/huggingface/hub/models--Qwen--Qwen3.5-0.8B/snapshots/2fc06364715b967f1860aea9cf38778875588b17/.edgellm/onnx/visual ...
Download complete: : 0.00B [00:00, ?B/s]
10:20:58 INFO tensorrt_edgellm.scripts.export: Loading shard: model.safetensors-00001-of-00001.safetensors
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/workspace/experimental/server/__main__.py", line 19, in <module>
    main()
  File "/workspace/experimental/server/api_server.py", line 479, in main
    llm = LLM(
          ^^^^
  File "/workspace/experimental/server/engine.py", line 317, in __init__
    self._init_from_model(
  File "/workspace/experimental/server/engine.py", line 432, in _init_from_model
    self._export_visual_onnx()
  File "/workspace/experimental/server/engine.py", line 531, in _export_visual_onnx
    _export_visual(
TypeError: _export_visual() missing 1 required positional argument: 'model_config'
(venv) root@217da7dcd3f2:/workspace#
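
The failure mode can be shown in isolation, without the model or the export package. This is a sketch under stated assumptions: `_export_visual_standin` is a hypothetical function with a simplified signature, chosen only so that `model_config` is a required parameter, as the traceback indicates for the real `_export_visual`.

```python
def _export_visual_standin(model_dir, output_dir, config, model_type, dtype, model_config):
    """Hypothetical stand-in: model_config has no default, so it is required."""
    return {"output_dir": output_dir, "model_config": model_config}


def call_without_model_config():
    """Mimic the call before the fix: model_config is not passed."""
    try:
        _export_visual_standin("model", "onnx/visual", {}, "qwen3_5", "float16")
    except TypeError as exc:
        return str(exc)
    return None


def call_with_model_config():
    """Mimic the call after the fix: model_config is passed by keyword."""
    return _export_visual_standin(
        "model", "onnx/visual", {}, "qwen3_5", "float16",
        model_config={"placeholder": True},
    )


if __name__ == "__main__":
    print(call_without_model_config())
    print(call_with_model_config())
```

Passing model_config by keyword, as in the diff, keeps the call correct even if the positional order of the other parameters changes later.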

Installation method:

Following the instructions here, installing into a venv via pip: https://nvidia.github.io/TensorRT-Edge-LLM/latest/user_guide/getting_started/installation.html#installation

Export command used:

python -m experimental.server --model Qwen/Qwen3.5-0.8B --port 8000

Expected behavior

Serving a VLM via the high-level API should work out of the box.

System information (x86 Host with GPU)

This is independent of the system.
