Warning
Experimental Status: Zenith is currently in an active transition from 2D to 3D. The 3D environment (Ursina) and Skill Forge integration are functional but still in the early stages of spatial optimization.
Zenith is a self-evolving Reinforcement Learning (RL) ecosystem. The name represents the "peak" of automated mastery—where an agent doesn't just learn from an environment, but actively refactors its own logic to overcome obstacles.
Zenith uses a unique Skill Forge mechanism. When training bottlenecks are detected, the system calls an LLM (NVIDIA NIM / DeepSeek-R1) to dynamically inject new logic into the agent's reward functions and movement capabilities.
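A minimal sketch of what such injection could look like, assuming the LLM returns a Python function definition as text (the `forge_reward` helper and the reward formula below are hypothetical, not Zenith's actual code):

```python
# Hypothetical Skill Forge sketch. In Zenith the source string would come
# from an LLM (NVIDIA NIM / DeepSeek-R1); here it is hard-coded to keep
# the example self-contained.
llm_generated_source = """
def reward(state):
    # Reward forward progress, penalize standing still.
    return state["dx"] * 10.0 - (0.5 if state["dx"] == 0 else 0.0)
"""

def forge_reward(source: str):
    """Compile LLM-generated code and return the new reward function."""
    namespace = {}
    exec(source, namespace)  # inject the new logic at runtime
    return namespace["reward"]

reward_fn = forge_reward(llm_generated_source)
print(reward_fn({"dx": 1.0}))  # a forward step earns a positive reward
```

In practice this kind of hot-swapping needs sandboxing and validation of the generated code before it touches a live training loop.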
To ensure the agent develops generalized intelligence rather than just memorizing paths, Zenith implements a Variable Environment Strategy:
- Physics Shifting: The environment dynamically mutates gravity, friction, and movement constants during training, forcing the agent to adapt to changing physical laws in real time.
- Procedural Chaos: Hazard density, platform layouts, and enemy patrol paths are procedurally generated and scaled via a curriculum.
- Dimensional Bridge: The agent must survive a "Singularity Event" at Level 1500, transitioning from 2D discrete logic to a 3D continuous vector space (X, Y, Z), ensuring its logic is robust enough to survive a literal change in dimensions.
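The physics-shifting idea can be sketched as an environment whose constants are re-sampled on every reset, so the policy can never overfit to one set of physical laws. The class, ranges, and timestep below are illustrative assumptions, not Zenith's actual values:

```python
import random

class ShiftingPhysicsEnv:
    """Toy environment that mutates its physical laws on every reset."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Re-sample the constants: the agent cannot rely on fixed physics.
        self.gravity = self.rng.uniform(-12.0, -6.0)
        self.friction = self.rng.uniform(0.80, 0.99)
        self.y, self.vx, self.vy = 0.0, 0.0, 0.0
        return (self.y, self.vx, self.vy)

    def step(self, thrust):
        self.vy += self.gravity * 0.02 + thrust  # gravity plus action
        self.vx *= self.friction                 # friction decays momentum
        self.y = max(0.0, self.y + self.vy * 0.02)
        return (self.y, self.vx, self.vy)

env = ShiftingPhysicsEnv(seed=0)
obs = env.reset()
```

A real Gymnasium version would wrap the same re-sampling logic in `gymnasium.Env.reset` with proper observation and action spaces.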
- Dynamic Skill Forge: Real-time code refactoring and logic injection via NVIDIA NIM.
- Curriculum Learning: Automated level scaling and hazard density adjustment.
- Weights & Biases Integration: Full telemetry for tracking the evolution of the agent.
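Curriculum-driven hazard scaling of this kind can be sketched as a simple schedule that grows density with level up to a cap; the function name, base rate, and cap here are illustrative assumptions:

```python
def hazard_density(level: int, base: float = 0.05, cap: float = 0.60) -> float:
    """Illustrative curriculum: hazard density grows linearly with
    level, then saturates at a cap so late levels stay survivable."""
    return min(cap, base + 0.01 * level)
```

Zenith's actual schedule may be nonlinear or adaptive to the agent's success rate; the point is only that difficulty is a function of curriculum progress rather than a fixed constant.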
- RL Framework: Stable-Baselines3 (PPO)
- Environment: Gymnasium, Pygame (2D), Ursina (3D)
- Intelligence: NVIDIA NIM (DeepSeek-R1)
- Telemetry: Weights & Biases