Skip to content

feat(aitools): add FastAPI router and config hot reload#1000

Open
zijian2 wants to merge 1 commit into
iflytek:mainfrom
zijian2:feature/hot-reload
Open

feat(aitools): add FastAPI router and config hot reload#1000
zijian2 wants to merge 1 commit into
iflytek:mainfrom
zijian2:feature/hot-reload

Conversation

@zijian2
Copy link
Copy Markdown
Contributor

@zijian2 zijian2 commented Mar 13, 2026

Summary

Add FastAPI router hot reload and config hot reload support for AITools, and refactor application startup so runtime watchers are managed explicitly by the server lifecycle instead of the old start module.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Refactoring

Related Issue

N/A

Changes

1) Add dynamic route hot reload for service changes

  • Added: core/plugin/aitools/utils/route_utils.py
  • Added route file watcher based on watchdog to monitor the AITools service directory.
  • Implemented debounced route re-registration when Python service files change.
  • Added dependency install handling when requirements.txt changes under the watched service directory.
  • Added cross-process singleton locking so only one worker performs route watching at a time.

2) Add config hot reload lifecycle integration

  • Updated: core/plugin/aitools/utils/config_utils.py
  • Updated: core/plugin/aitools/app/aitools_server.py
  • Wired config watcher startup and shutdown into the server lifecycle.
  • Registered hot reload callbacks for service manager refresh and aiohttp session reset.
  • Kept watcher behavior guarded by config flags and process-level lock ownership.

3) Refactor app startup structure around server-owned lifecycle

  • Added: core/plugin/aitools/app/aitools_server.py
  • Added: core/plugin/aitools/app/app_factory.py
  • Deleted: core/plugin/aitools/app/start_server.py
  • Simplified entrypoint behavior in core/plugin/aitools/main.py.
  • Moved environment preparation, watcher startup, and uvicorn startup under AIToolsServer.
  • Introduced app factory + lifespan composition so startup and shutdown resources are managed in one place.

4) Update runtime configuration and dependencies

  • Updated: core/plugin/aitools/pyproject.toml
  • Updated: core/plugin/aitools/config.env
  • Updated: docker/astronAgent/config/aitools/config.env
  • Updated: helm/astron-agent/files/config/aitools/config.env
  • Added watchdog dependency required for route watching.
  • Updated SERVICE_APP to point at plugin.aitools.app.app_factory:aitools_app.
  • Added hot reload related configuration items such as HOT_RELOAD_ENABLE and CONFIG_WATCH_INTERVAL.

5) Add and update test coverage for reload behavior and startup flow

  • Added: core/plugin/aitools/tests/api/routes/test_register_reload.py
  • Added: core/plugin/aitools/tests/utils/test_route_utils.py
  • Updated tests across app, config, route registration, and client utilities to match the new lifecycle and watcher behavior.
  • Covered route watcher start/stop, lock contention, reload debounce, requirements install triggers, and route re-registration behavior.

Testing

  • New tests added (unit)
  • Existing full test suite executed
  • Manual testing completed

Test scope included in this work:

  • make test
  • pytest -q tests/utils/test_route_utils.py tests/utils/test_config_utils.py

Compatibility / Risk

  • No external API path contract is intentionally changed, but runtime behavior now supports hot reloading for routes and config.
  • Multi-worker deployments still require per-process startup, so watcher execution is constrained with local file locks to avoid duplicate watchers on the same host.
  • Route changes and dependency installs are now triggered automatically in development/runtime scenarios where hot reload is enabled.
  • Risk is mainly around watcher timing, lock ownership, and startup ordering, which is covered by focused tests and the lifecycle refactor.

Rollback Plan

If regressions are found:

  1. Revert the app lifecycle refactor in aitools_server.py, app_factory.py, and main.py.
  2. Revert route watcher introduction in route_utils.py and related registration changes.
  3. Restore the previous SERVICE_APP entry and dependency set.

Checklist

  • Code follows project coding standards
  • Self-review completed
  • Documentation/comments updated where needed
  • No breaking changes introduced

Signed-off-by: zijian2 <tangzj0911@gmail.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and well-executed refactoring of the application startup and lifecycle management, centralizing it into the AIToolsServer class. The addition of hot-reloading for both configuration and routes using watchdog and process locks is a great feature for development and dynamic environments. The code is generally of high quality with good test coverage for the new features. I've found one critical issue regarding an incorrect value in the uvicorn configuration that could prevent the server from starting, and a high-severity maintainability issue related to duplicated code for process locking.

workers=20,
reload=False,
ws_ping_interval=None,
ws_ping_timeout=NotImplemented,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The ws_ping_timeout parameter in uvicorn.Config is set to NotImplemented. According to uvicorn's documentation and type hints, this parameter expects a float or None. Using the NotImplemented singleton will likely cause a TypeError at runtime when uvicorn processes this configuration, which could prevent the server from starting or cause unexpected behavior with WebSocket connections.

Suggested change
ws_ping_timeout=NotImplemented,
ws_ping_timeout=None,

Comment on lines +192 to +226
def _acquire_process_lock(self) -> bool:
"""Acquire cross-process lock so only one worker watches polaris."""
if not self.env_loader:
log.error("Env loader is not initialized; skip config watcher lock.")
return False

lock_path = (
Path(self.env_loader.env_file_path).resolve().parent
/ ".config_watcher.pid.lock"
)
# Keep file descriptor open for lock lifetime.
# pylint: disable=consider-using-with
lock_file = open(lock_path, "w", encoding="utf-8")
try:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
self._lock_file = lock_file
self._owns_lock = True
return True
except BlockingIOError:
lock_file.close()
return False

def _release_process_lock(self) -> None:
"""Release process lock if held by current worker."""
if not self._lock_file:
return

if self._owns_lock:
try:
fcntl.flock(self._lock_file.fileno(), fcntl.LOCK_UN)
except OSError:
pass
self._lock_file.close()
self._lock_file = None
self._owns_lock = False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The process locking logic implemented in _acquire_process_lock and _release_process_lock is duplicated from core/plugin/aitools/utils/route_utils.py. While the pylint: disable=duplicate-code comment shows awareness, this duplication presents a maintainability challenge. Any future bug fixes or enhancements to the locking mechanism will need to be applied in both locations. To improve maintainability, this logic should be extracted into a reusable utility class or context manager that can be shared between ConfigWatcher and RouteReloadWatcher.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant