fix: expose inference server kwargs#2028
Conversation
Signed-off-by: nightcityblade <nightcityblade@gmail.com>
Greptile SummaryThis PR adds a
Confidence Score: 5/5Safe to merge; changes are additive and well-tested. The change is small and additive — a new optional field with a default-empty dict and a clean extraction of build-args logic into a helper. The guard against the reserved No files require special attention beyond the noted Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant InferenceServer
participant RayServeBackend
participant build_openai_app
User->>InferenceServer: "InferenceServer(models, server_kwargs={...})"
User->>InferenceServer: start()
InferenceServer->>RayServeBackend: _deploy()
RayServeBackend->>RayServeBackend: _build_openai_app_args(llm_configs, quiet_env)
Note over RayServeBackend: deepcopy(server_kwargs)<br/>guard llm_configs key<br/>merge quiet_runtime_env
RayServeBackend->>build_openai_app: build_args (server_kwargs + llm_configs)
build_openai_app-->>RayServeBackend: app
RayServeBackend->>RayServeBackend: serve.run(app)
Reviews (3): Last reviewed commit: "Remove redundant Ray Serve deepcopy call..." | Re-trigger Greptile |
| build_args = deepcopy(self._server.server_kwargs) | ||
| build_args["llm_configs"] = llm_configs |
There was a problem hiding this comment.
llm_configs key silently overwritten from server_kwargs — if a caller includes "llm_configs" in server_kwargs (e.g. trying to pre-seed the list), line 105 clobbers it without any warning. Adding an explicit guard makes the contract clear and surfaces the mistake instead of silently dropping user data.
| build_args = deepcopy(self._server.server_kwargs) | |
| build_args["llm_configs"] = llm_configs | |
| build_args = deepcopy(self._server.server_kwargs) | |
| if "llm_configs" in build_args: | |
| msg = "'llm_configs' is a reserved key and must not be set in server_kwargs; use InferenceServer.models instead." | |
| raise ValueError(msg) | |
| build_args["llm_configs"] = llm_configs |
| ingress_config = deepcopy(build_args.get("ingress_deployment_config", {})) | ||
| actor_options = deepcopy(ingress_config.get("ray_actor_options", {})) |
There was a problem hiding this comment.
The
deepcopy calls on lines 109–110 are redundant. build_args is already a deep copy of server_kwargs (line 104), so any nested dict extracted from it is already an independent object. The extra copies add allocation overhead without protecting anything.
| ingress_config = deepcopy(build_args.get("ingress_deployment_config", {})) | |
| actor_options = deepcopy(ingress_config.get("ray_actor_options", {})) | |
| ingress_config = build_args.get("ingress_deployment_config", {}) | |
| actor_options = ingress_config.get("ray_actor_options", {}) |
There was a problem hiding this comment.
Thanks, addressed in 0e954ce by reusing the nested dictionaries from the already-deep-copied build_args instead of deepcopying them again. Verification: python3 -m ruff check nemo_curator/core/serve/ray_serve/backend.py tests/core/serve/ray_serve/test_backend.py passed; python3 -m compileall -q nemo_curator/core/serve/ray_serve/backend.py tests/core/serve/ray_serve/test_backend.py passed. Targeted pytest is blocked locally because ray is not installed in this environment.
|
Addressed the Greptile feedback in the latest push:
Validation run locally:
|
Description
Expose a generic
server_kwargsparameter onInferenceServerand forward it to Ray Serve'sbuild_openai_app, allowing users to pass options such asingress_deployment_configfor tuning ingress replicas.Closes #1621
Usage
Checklist
Testing
uv run ruff check nemo_curator/core/serve/server.py nemo_curator/core/serve/ray_serve/backend.py tests/core/serve/ray_serve/test_backend.pyuv run python -m py_compile nemo_curator/core/serve/server.py nemo_curator/core/serve/ray_serve/backend.py tests/core/serve/ray_serve/test_backend.pyuv run pytest tests/core/serve/ray_serve/test_backend.py tests/core/serve/test_server.py(blocked locally: package intentionally raises on Darwin; NeMo-Curator currently only supports Linux)