[SYSTEMDS-2850] Generalize autoencoder training #2403
This PR extends the existing autoencoder implementation to support arbitrary-depth encoder–decoder architectures, building directly on the original 2-hidden-layer design.
The current autoencoder code works well for the fixed 2-layer case, but it relies on hard-coded assumptions about layer count and parameter layout. This makes it difficult to experiment with deeper architectures without duplicating large parts of the training logic. The goal of this change is to generalize the training pipeline while preserving the original behavior and APIs.
In short: the existing 2-layer autoencoder remains unchanged, and a new generalized path is added for N-layer configurations.
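To make the idea concrete, here is a minimal NumPy sketch of what such a generalized path can look like. It is an illustration of the approach, not the actual SystemDS code: all names (`init_params`, `forward`, `backward`, `sgd_step`, `layer_sizes`) are hypothetical. The per-layer weights and biases live in lists derived from a `layer_sizes` spec, so one forward/backward loop handles any depth.

```python
import numpy as np

def init_params(layer_sizes, rng):
    """One (W, b) pair per layer transition, e.g. layer_sizes = [784, 256, 64, 256, 784]."""
    Ws = [rng.uniform(-0.1, 0.1, (m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
    bs = [np.zeros(n) for n in layer_sizes[1:]]
    return Ws, bs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, Ws, bs):
    # keep every layer's activation; the last entry is the reconstruction
    acts = [X]
    for W, b in zip(Ws, bs):
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def backward(acts, Ws):
    # gradients of mean squared reconstruction error, one (dW, db) per layer
    X, Xhat = acts[0], acts[-1]
    delta = (Xhat - X) * Xhat * (1 - Xhat) * (2.0 / X.shape[0])
    dWs, dbs = [None] * len(Ws), [None] * len(Ws)
    for i in reversed(range(len(Ws))):
        dWs[i] = acts[i].T @ delta
        dbs[i] = delta.sum(axis=0)
        if i > 0:  # propagate through the sigmoid of the previous layer
            delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
    return dWs, dbs

def sgd_step(Ws, bs, dWs, dbs, lr=0.1):
    for W, b, dW, db in zip(Ws, bs, dWs, dbs):
        W -= lr * dW
        b -= lr * db
```

The fixed 2-layer case is then just one particular `layer_sizes` value, which is how a generalized path can preserve the original behavior.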
This work is motivated by practical experimentation needs, in particular trying deeper architectures without duplicating the training logic.
Changes:
No existing public APIs were removed or modified.
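As an illustration of that compatibility claim (reusing the hypothetical sketch above, not the real entry point): a spec with the original two hidden sizes reproduces the fixed architecture, while deeper specs go through the same code path.

```python
rng = np.random.default_rng(0)
X = rng.random((64, 784))

# original fixed design: two hidden layers mirrored in the decoder
# (the concrete sizes here are made up for the example)
Ws, bs = init_params([784, 500, 2, 500, 784], rng)
assert len(Ws) == 4  # same four weight matrices as the 2-hidden-layer version

# deeper variant, trained by the exact same functions
Ws_deep, bs_deep = init_params([784, 512, 128, 32, 128, 512, 784], rng)
loss = ((forward(X, Ws_deep, bs_deep)[-1] - X) ** 2).mean()
```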
Testing
Known limitations
Parameter server training for the generalized path exposes limitations in the current list-based gradient aggregation logic: training breaks when the execution mode is switched from the default to paramserv, and fixing this would require changes to SystemDS's internal parameter server utilities. Follow-up work may be needed to fully support paramserv for arbitrary-depth models. This PR does not attempt to change paramserv internals; the generalized training is stable under the default (non-paramserv) execution path.
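For context, a rough sketch of the aggregation shape involved (an assumption about the failure mode, not SystemDS's actual paramserv code): with arbitrary depth, each worker reports a variable-length list of per-layer gradients, and the aggregator has to combine them element-wise rather than rely on a fixed parameter layout.

```python
def aggregate(worker_grads):
    """Average per-layer gradients across workers.

    worker_grads: one list per worker, each holding (dW_i, db_i) tuples.
    The depth is taken from the gradients themselves; a hard-coded layer
    count here is the kind of assumption that breaks for N-layer models.
    """
    n = len(worker_grads)
    depth = len(worker_grads[0])
    return [
        (
            sum(g[layer][0] for g in worker_grads) / n,
            sum(g[layer][1] for g in worker_grads) / n,
        )
        for layer in range(depth)
    ]
```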