Add pluggable PCOptimizer supporting Adam, AdamW, SGD, and SGD-momentum for local PC weight and bias updates #133

Open
Betty987 wants to merge 18 commits into development from optimizer

Betty987 (Collaborator) commented Apr 30, 2026

Summary

This PR introduces a pluggable optimizer system for PC weight and bias updates. Previously, all layers used a hardcoded SGD-equivalent rule:
ΔW = clamp(lr × gradient, -0.01, 0.01)
Now any layer can use Adam, AdamW, plain SGD, or SGD-momentum, configured via a single optimizer_name key in the hyperparameter config.
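
For readers unfamiliar with the pattern, here is a minimal sketch of how a name-selected optimizer with per-parameter state could work. It is not the PCOptimizer code from this PR: the class name `SketchOptimizer`, the accepted name strings ("sgd", "sgd_momentum", "adam", "adamw"), the default hyperparameters, the `key` argument, and the descent convention `w - delta` are all assumptions made for illustration, and it uses plain Python lists instead of tensors so it runs without dependencies.

```python
import math


def clamped_sgd_delta(grad, lr, bound=0.01):
    """Previous rule from the summary: delta = clamp(lr * grad, -bound, bound)."""
    return [max(-bound, min(bound, lr * g)) for g in grad]


class SketchOptimizer:
    """Hypothetical stand-in for a pluggable optimizer (not the PR's class)."""

    NAMES = ("sgd", "sgd_momentum", "adam", "adamw")  # assumed name strings

    def __init__(self, name="sgd", lr=0.01, momentum=0.9,
                 betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01):
        if name not in self.NAMES:
            raise ValueError(f"unknown optimizer: {name}")
        self.name, self.lr = name, lr
        self.momentum, self.betas, self.eps = momentum, betas, eps
        self.weight_decay = weight_decay
        self.state = {}  # per-parameter state, keyed by a caller-chosen name

    def step(self, key, w, grad):
        """Return new weights; assumes a descent update (w - delta)."""
        n = len(w)
        st = self.state.setdefault(key, {"t": 0, "m": [0.0] * n, "v": [0.0] * n})
        st["t"] += 1
        if self.name == "sgd":
            delta = [self.lr * g for g in grad]
        elif self.name == "sgd_momentum":
            # Velocity accumulates past gradients.
            st["m"] = [self.momentum * m + g for m, g in zip(st["m"], grad)]
            delta = [self.lr * m for m in st["m"]]
        else:  # adam / adamw
            b1, b2 = self.betas
            st["m"] = [b1 * m + (1 - b1) * g for m, g in zip(st["m"], grad)]
            st["v"] = [b2 * v + (1 - b2) * g * g for v, g in zip(st["v"], grad)]
            bc1 = 1 - b1 ** st["t"]  # bias corrections
            bc2 = 1 - b2 ** st["t"]
            delta = [self.lr * (m / bc1) / (math.sqrt(v / bc2) + self.eps)
                     for m, v in zip(st["m"], st["v"])]
            if self.name == "adamw":
                # Decoupled weight decay, applied to the weights directly.
                delta = [d + self.lr * self.weight_decay * x
                         for d, x in zip(delta, w)]
        return [x - d for x, d in zip(w, delta)]
```

Usage under these assumptions would look like `opt = SketchOptimizer("adam", lr=0.1)` followed by `w = opt.step("layer0.weight", w, grad)` on each update. Keying the state by parameter name is one way to let a single optimizer object serve several weight and bias tensors.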

Main Changes

  • Implement SGD, SGD with momentum, Adam, and AdamW optimizers and the PCOptimizer class - af7f895
  • Instantiate PCOptimizer in PCLayer - 64b3bed
  • Integrate PCOptimizer into step_embed, step_linear, and step_attn - 08abe8
  • Add optimizer config at all entry points in GPTConfig - 3c3a9aa
  • Pass optimizer config to the PCLayer instance in:

Betty987 added 18 commits April 28, 2026 17:24
…SGD-momentum and maintains per-parameter state
…_decay, weight_bound) to GPTConfig with defaults
…ttn; refactor step_attn Q/K/V updates to accumulate full weight matrices before applying optimizer; fix step_embed double-update bug and restore missing else branch
…t_decay) to pc_qkv and pc_output PCLayer instances
Betty987 changed the title from "Optimizer" to "Add pluggable PCOptimizer supporting Adam, AdamW, SGD, and SGD-momentum for local PC weight and bias updates" Apr 30, 2026
Betty987 added the enhancement (New feature or request) label Apr 30, 2026
Betty987 self-assigned this May 3, 2026
Betty987 requested a review from Nardos24 May 4, 2026 07:59