1BIMU/1BIMU

🧑‍💻 About Me

  • 🎓 Undergraduate student at Beijing University of Posts and Telecommunications (BUPT), School of Computer Science
  • 🔬 Research interests: RLVR · RLHF · Optimization Algorithms
  • 🌱 Currently exploring the intersection of reinforcement learning and large language model alignment
  • 📍 Beijing, China

🔭 Research Interests

Area | Description
RLVR | Reinforcement Learning from Verifiable Rewards: scalable reward signals beyond human feedback
RLHF | Reinforcement Learning from Human Feedback: aligning LLMs with human preferences
Optimizer | Adaptive optimization methods (AdamW, Muon, Shampoo, etc.) for deep learning

📌 Pinned Repositories

APO_OFFICAL: [ICML 2026] Official repository for Anchored Policy Optimization: Mitigating Exploration Collapse via Support-Constrained Rectification · Python · ⭐ 14 · 🍴 1

SPPO: [ACL 2026 Oral] Official repository for SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks · Python · ⭐ 3 · 🍴 3


🛠️ Tech Stack

Python PyTorch C++ Linux Git LaTeX


📫 Contact

GitHub Email Zhihu


"The pursuit of intelligence — from theory to practice." · Last updated: auto-refreshed every 3 hours
