V2 Encoder Architecture by jakehlee · Pull Request #92 · emit-sds/SpecTf

jakehlee · 2026-05-23T04:58:58Z

This is a draft PR for an updated SpecTfEncoder architecture. I am calling it V2 to avoid breaking backwards compatibility.

The banddef, i.e. the wavelength grid of the spectra used for the learned positional embedding, was previously a static parameter of the model architecture, because we expected it not to change at the instrument level (e.g. for EMIT). However, we now know that it does change over time, especially for other instruments. In the V2 architecture, the banddef is an input feature, not a model parameter. This means that a different banddef can be passed per spectrum, or a single banddef can be passed for the entire batch.
A new optional mask argument is added to support variable-length input sequences in the future. This would allow spectra with entirely different spectral grids (e.g. AVIRIS and EMIT) to be fed within the same batch for specific inter-instrument training.
The "flat" aggregation method, which locks the sequence length to a fixed value, has to be removed. Without prior knowledge of the sequence length, "flat" aggregation cannot guarantee a static input feature dimension for the classification layer. Further work/research has shown that "max" and "average" aggregation do fine, with some preliminary evidence that one of the tokens starts acting as an aggregation token.
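
To make the three points above concrete, here is a minimal NumPy sketch. It is not the code in this PR. The helper names (`broadcast_banddef`, `masked_pool`), the mask convention (`True` marks a padded position, as in a key padding mask), and the wavelength values are all assumptions for illustration. It shows a banddef passed either per batch or per spectrum, pooling that ignores padded positions, and why "flat" aggregation ties the classifier input width to the sequence length.

```python
import numpy as np


def broadcast_banddef(banddef, batch_size):
    """Hypothetical helper: accept a shared (L,) grid or a per-spectrum
    (B, L) grid and always return an array of shape (B, L)."""
    banddef = np.asarray(banddef, dtype=np.float64)
    if banddef.ndim == 1:
        # One grid for the whole batch: repeat it along the batch axis.
        banddef = np.broadcast_to(banddef, (batch_size, banddef.shape[0]))
    if banddef.ndim != 2 or banddef.shape[0] != batch_size:
        raise ValueError("banddef must have shape (L,) or (B, L)")
    return banddef


def masked_pool(tokens, pad_mask=None, method="max"):
    """Hypothetical helper: pool (B, L, D) tokens over the sequence axis.

    pad_mask has shape (B, L); True marks a padded position (assumed
    convention). A fully padded row yields -inf for "max" and 0 for "mean".
    """
    if pad_mask is None:
        pad_mask = np.zeros(tokens.shape[:2], dtype=bool)
    valid = ~pad_mask[..., None]  # (B, L, 1), broadcasts over D
    if method == "max":
        # Padded positions become -inf so they never win the max.
        return np.where(valid, tokens, -np.inf).max(axis=1)
    if method == "mean":
        counts = np.maximum(valid.sum(axis=1), 1)  # (B, 1), avoid divide by 0
        return np.where(valid, tokens, 0.0).sum(axis=1) / counts
    raise ValueError(f"unknown aggregation method: {method}")


# Illustrative batch: 2 spectra, padded to 5 bands, 4-dim tokens.
rng = np.random.default_rng(0)
B, L, D = 2, 5, 4
tokens = rng.normal(size=(B, L, D))

# The second spectrum only has 3 real bands; the last 2 are padding.
pad_mask = np.array([
    [False, False, False, False, False],
    [False, False, False, True, True],
])

# A single grid shared by the batch (values in nm are illustrative).
shared_grid = broadcast_banddef(np.linspace(380.0, 2500.0, L), B)

pooled_max = masked_pool(tokens, pad_mask, "max")    # (B, D), independent of L
pooled_mean = masked_pool(tokens, pad_mask, "mean")  # (B, D), independent of L
flat = tokens.reshape(B, -1)                         # (B, L * D), depends on L
```

The pooled outputs always have width `D`, so the classification layer never sees the sequence length. The flattened output has width `L * D`, which is why it cannot coexist with variable-length inputs.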

Topics for discussion/comment:

It could be argued that, instead of branching to a V2 architecture, we should update the default SpecTfEncoder implementation and port over the old model weights to the new architecture. This is a bigger lift but it's technically possible with some work.

I am currently using this implementation on a research task. If that task goes smoothly and we resolve the point above, we can open the PR to be merged.

jakehlee added 4 commits May 22, 2026 21:35

V2 Encoder moves banddef to input. Draft impl.

d2a55c0

Key padding mask implementation for var length inputs

c380fbb

Add new dataset version for AV3

5d715de

v2 jit fix

6697508

jakehlee wants to merge 4 commits into main from banddef_lee
