[SYSTEMDS-2850] Generalize autoencoder training #2403
This PR extends the existing autoencoder implementation to support arbitrary-depth encoder–decoder architectures, building directly on the original 2-hidden-layer design.
The current autoencoder code works well for the fixed 2-layer case, but it relies on hard-coded assumptions about layer count and parameter layout. This makes it difficult to experiment with deeper architectures without duplicating large parts of the training logic. The goal of this change is to generalize the training pipeline while preserving the original behavior and APIs.
In short: the existing 2-layer autoencoder remains unchanged, and a new generalized path is added for N-layer configurations.
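To make the idea concrete, here is a minimal NumPy sketch of what such a generalized path can look like. It is an illustration of the approach, not the actual SystemDS code: all names (`init_params`, `forward`, `backward`, `sgd_step`, `layer_sizes`) are hypothetical. The per-layer weights and biases live in lists derived from a `layer_sizes` spec, so one forward/backward loop handles any depth.

```python
import numpy as np

def init_params(layer_sizes, rng):
    """One (W, b) pair per layer transition, e.g. layer_sizes = [784, 256, 64, 256, 784]."""
    Ws = [rng.uniform(-0.1, 0.1, (m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
    bs = [np.zeros(n) for n in layer_sizes[1:]]
    return Ws, bs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, Ws, bs):
    # keep every layer's activation; the last entry is the reconstruction
    acts = [X]
    for W, b in zip(Ws, bs):
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def backward(acts, Ws):
    # gradients of mean squared reconstruction error, one (dW, db) per layer
    X, Xhat = acts[0], acts[-1]
    delta = (Xhat - X) * Xhat * (1 - Xhat) * (2.0 / X.shape[0])
    dWs, dbs = [None] * len(Ws), [None] * len(Ws)
    for i in reversed(range(len(Ws))):
        dWs[i] = acts[i].T @ delta
        dbs[i] = delta.sum(axis=0)
        if i > 0:  # propagate through the sigmoid of the previous layer
            delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
    return dWs, dbs

def sgd_step(Ws, bs, dWs, dbs, lr=0.1):
    for W, b, dW, db in zip(Ws, bs, dWs, dbs):
        W -= lr * dW
        b -= lr * db
```

The fixed 2-layer case is then just one particular `layer_sizes` value, which is how a generalized path can preserve the original behavior.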
This work is motivated by practical experimentation needs, in particular trying deeper architectures without duplicating the training logic.
Changes:
No existing public APIs were removed or modified.
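As an illustration of that compatibility claim (reusing the hypothetical sketch above, not the real entry point): a spec with the original two hidden sizes reproduces the fixed architecture, while deeper specs go through the same code path.

```python
rng = np.random.default_rng(0)
X = rng.random((64, 784))

# original fixed design: two hidden layers mirrored in the decoder
# (the concrete sizes here are made up for the example)
Ws, bs = init_params([784, 500, 2, 500, 784], rng)
assert len(Ws) == 4  # same four weight matrices as the 2-hidden-layer version

# deeper variant, trained by the exact same functions
Ws_deep, bs_deep = init_params([784, 512, 128, 32, 128, 512, 784], rng)
loss = ((forward(X, Ws_deep, bs_deep)[-1] - X) ** 2).mean()
```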
Testing
Known limitations
Parameter server training for the generalized path exposes limitations in the current list-based gradient aggregation logic: training breaks when the execution mode is switched from the default to paramserv, and fixing this would require changes to SystemDS's internal parameter server utilities. Follow-up work may be needed to fully support paramserv for arbitrary-depth models. This PR does not attempt to change paramserv internals; the generalized training is stable under the default (non-paramserv) execution path.
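For context, a rough sketch of the aggregation shape involved (an assumption about the failure mode, not SystemDS's actual paramserv code): with arbitrary depth, each worker reports a variable-length list of per-layer gradients, and the aggregator has to combine them element-wise rather than rely on a fixed parameter layout.

```python
def aggregate(worker_grads):
    """Average per-layer gradients across workers.

    worker_grads: one list per worker, each holding (dW_i, db_i) tuples.
    The depth is taken from the gradients themselves; a hard-coded layer
    count here is the kind of assumption that breaks for N-layer models.
    """
    n = len(worker_grads)
    depth = len(worker_grads[0])
    return [
        (
            sum(g[layer][0] for g in worker_grads) / n,
            sum(g[layer][1] for g in worker_grads) / n,
        )
        for layer in range(depth)
    ]
```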