Conversation

@AdityaPandey2612

This PR extends the existing autoencoder implementation to support arbitrary-depth encoder–decoder architectures, building directly on the original 2-hidden-layer design.
The current autoencoder code works well for the fixed 2-layer case, but it relies on hard-coded assumptions about layer count and parameter layout. This makes it difficult to experiment with deeper architectures without duplicating large parts of the training logic. The goal of this change is to generalize the training pipeline while preserving the original behavior and APIs.
In short: the existing 2-layer autoencoder remains unchanged, and a new generalized path is added for N-layer configurations.
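As a rough illustration of what an N-layer configuration means here (hypothetical names, not the actual SystemDS DML API), the generalized path can derive a full mirrored encoder/decoder layout from just a list of hidden sizes:

```python
# Hypothetical sketch: derive a symmetric encoder/decoder layout from
# a list of hidden layer sizes. Function name and shape conventions
# are assumptions for illustration, not the actual SystemDS API.
def mirrored_layout(n_features, hidden_sizes):
    """E.g. 784 input features with hidden sizes [500, 2] yield the
    layer layout [784, 500, 2, 500, 784]."""
    return [n_features] + hidden_sizes + hidden_sizes[-2::-1] + [n_features]

print(mirrored_layout(784, [500, 2]))  # [784, 500, 2, 500, 784]
```

With a single hidden size this reduces to the familiar bottleneck shape, e.g. `mirrored_layout(10, [4])` gives `[10, 4, 10]`.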

This work is motivated by practical experimentation needs:

  • Enable deeper autoencoders (e.g., multi-layer encoders with mirrored decoders)
  • Avoid duplicating training logic for each new depth
  • Reuse the existing SGD, momentum, batching, and decay logic
  • Keep backward compatibility with all existing scripts and tests

Changes:

  • Added a generalized autoencoder entry point that accepts a list of hidden layer sizes
  • Refactored training logic to operate over lists of weights and biases instead of fixed positional arguments
  • Introduced helper functions to:
    1. build symmetric encoder/decoder layer layouts
    2. flatten and unflatten model state
    3. run forward and backward passes across arbitrary depth
  • Left the original 2-layer autoencoder path intact and unchanged
    No existing public APIs were removed or modified.
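The list-based refactoring described above can be sketched as follows. This is an illustrative NumPy analogue of the design (the actual implementation is SystemDS DML); every name below is an assumption for exposition:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(layout, rng):
    """One (W, b) pair per layer transition in e.g. [784, 500, 2, 500, 784]."""
    weights = [rng.standard_normal((m, n)) * 0.01
               for m, n in zip(layout[:-1], layout[1:])]
    biases = [np.zeros((1, n)) for n in layout[1:]]
    return weights, biases

def forward(X, weights, biases):
    """Forward pass over arbitrary depth; keeps every intermediate
    activation so a backward pass can reuse them."""
    acts = [X]
    for W, b in zip(weights, biases):
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def flatten(weights, biases):
    """Pack the whole model state into one vector, e.g. for aggregation."""
    return np.concatenate([p.ravel() for p in weights + biases])

def unflatten(vec, layout):
    """Inverse of flatten, given the same layer layout."""
    weights, biases, i = [], [], 0
    for m, n in zip(layout[:-1], layout[1:]):
        weights.append(vec[i:i + m * n].reshape(m, n))
        i += m * n
    for n in layout[1:]:
        biases.append(vec[i:i + n].reshape(1, n))
        i += n
    return weights, biases
```

The key design point is that training code only ever iterates over the weight and bias lists, so a deeper architecture changes the data, not the code.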

Testing

  • Existing autoencoder tests continue to pass without modification
  • New tests validate:
    1. correct output shapes for multi-layer configurations
    2. consistency of behavior when reducing to the 2-layer case
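The 2-layer consistency check can be sketched like this (an illustrative NumPy analogue, not the actual test suite): a hand-written fixed-depth forward pass standing in for the original code must agree exactly with the generalized list-based path when given the same two layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in for the original fixed 2-layer path (positional arguments).
def forward_fixed(X, W1, b1, W2, b2):
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

# Generalized list-based path.
def forward_general(X, weights, biases):
    H = X
    for W, b in zip(weights, biases):
        H = sigmoid(H @ W + b)
    return H

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((6, 3)), np.zeros((1, 3))
W2, b2 = rng.standard_normal((3, 6)), np.zeros((1, 6))
X = rng.standard_normal((4, 6))

# Reducing the generalized path to two layers must reproduce the
# fixed-depth result exactly.
assert np.allclose(forward_fixed(X, W1, b1, W2, b2),
                   forward_general(X, [W1, W2], [b1, b2]))
```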

Known limitations
Parameter server training for the generalized path exposes limitations in the current list-based gradient aggregation logic: training breaks when switching the execution mode from the default to the parameter server, and fixing this would require changes to SystemDS's internal parameter server utilities. Follow-up work may be needed to fully support paramserv for arbitrary-depth models. This PR does not attempt to change paramserv internals; the generalized training is stable under the default (non-paramserv) execution path.

@github-project-automation github-project-automation bot moved this to In Progress in SystemDS PR Queue Jan 19, 2026
@AdityaPandey2612 AdityaPandey2612 changed the title Generalize autoencoder training to support arbitrary-depth (N-layer) architectures [SYSTEMDS-2850]Generalize autoencoder training to support arbitrary-depth (N-layer) architectures Jan 19, 2026
@AdityaPandey2612 AdityaPandey2612 changed the title [SYSTEMDS-2850]Generalize autoencoder training to support arbitrary-depth (N-layer) architectures [SYSTEMDS-2850]Generalize autoencoder training with parameter server Jan 19, 2026
@AdityaPandey2612 AdityaPandey2612 changed the title [SYSTEMDS-2850]Generalize autoencoder training with parameter server [SYSTEMDS-2850]Generalize autoencoder training Jan 19, 2026