V2 Encoder Architecture#92

Draft
jakehlee wants to merge 4 commits into main from banddef_lee

@jakehlee

Collaborator

This is a draft PR for an updated SpecTfEncoder architecture. I am calling it V2 to avoid breaking backwards compatibility.

  • The banddef (the wavelength grid of the spectra, used for the learned positional embedding) was previously a static parameter of the model architecture, because we expected it not to change at the instrument level (e.g. for EMIT). However, we now know that it does change over time, especially for other instruments. In the V2 architecture, the banddef is an input feature, not a model parameter, so a different banddef can be passed per spectrum, or a single banddef can be passed for the entire batch.
  • A new optional mask argument is added to support variable-length input sequences in the future. This would allow spectra with entirely different spectral grids (e.g. AVIRIS and EMIT) to be fed within the same batch for specific inter-instrument training.
  • The "flat" aggregation method, which locks the sequence length to a fixed value, has to be removed. Without prior knowledge of the sequence length, "flat" aggregation cannot guarantee a static input feature dimension for the classification layer. Further work has shown that "max" and "average" aggregation do fine, with some preliminary evidence that one of the tokens starts acting as an aggregation token.
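
To make the three points above concrete, here is a minimal sketch in NumPy, not the code in this PR. The helper names `pad_batch` and `masked_aggregate`, the zero-padding scheme, and the convention that the mask is True for real bands are all assumptions for illustration. It shows a per-spectrum banddef travelling with the input, a mask marking padded positions, and "max"/"average" pooling that yields a fixed-size feature regardless of sequence length (which "flat" cannot do, since its output size is sequence length times embedding dimension).

```python
import numpy as np


def pad_batch(spectra, banddefs):
    """Pad variable-length (spectrum, banddef) pairs to a common length.

    Hypothetical helper. Returns spectra (B, L), banddefs (B, L), and a
    boolean mask (B, L) that is True for real bands and False for padding.
    """
    max_len = max(len(s) for s in spectra)
    batch = len(spectra)
    x = np.zeros((batch, max_len), dtype=np.float32)
    wl = np.zeros((batch, max_len), dtype=np.float32)
    mask = np.zeros((batch, max_len), dtype=bool)
    for i, (s, b) in enumerate(zip(spectra, banddefs)):
        n = len(s)
        x[i, :n] = s
        wl[i, :n] = b
        mask[i, :n] = True
    return x, wl, mask


def masked_aggregate(tokens, mask, method="max"):
    """Pool (B, L, D) tokens over the sequence axis, ignoring padding.

    The output is always (B, D), independent of L.
    """
    m = mask[:, :, None]  # broadcast the mask over the embedding axis
    if method == "max":
        # Padded positions become -inf so they can never win the max.
        return np.where(m, tokens, -np.inf).max(axis=1)
    if method == "average":
        # Divide by the number of real bands, not the padded length.
        counts = np.maximum(mask.sum(axis=1, keepdims=True), 1)
        return np.where(m, tokens, 0.0).sum(axis=1) / counts
    raise ValueError(f"unsupported aggregation method: {method}")


# Two spectra on different wavelength grids (values are made up).
spectra = [np.array([1.0, 2.0, 3.0]), np.array([1.0, 1.0, 1.0, 1.0, 5.0])]
banddefs = [
    np.array([400.0, 500.0, 600.0]),
    np.array([400.0, 450.0, 500.0, 550.0, 600.0]),
]
x, wl, mask = pad_batch(spectra, banddefs)

# Stand-in for encoder output tokens: here just (value, wavelength) pairs.
tokens = np.stack([x, wl], axis=-1)  # shape (2, 5, 2)
pooled_max = masked_aggregate(tokens, mask, "max")      # shape (2, 2)
pooled_avg = masked_aggregate(tokens, mask, "average")  # shape (2, 2)
```

Note that in this sketch a spectrum with no valid bands would pool to `-inf` under "max"; a real implementation would need to decide how to handle that case.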

Topics for discussion/comment:

  • It could be argued that, instead of branching to a V2 architecture, we should update the default SpecTfEncoder implementation and port over the old model weights to the new architecture. This is a bigger lift but it's technically possible with some work.

I am currently using this implementation on a research task. If that task goes smoothly and we resolve the point above, we can open the PR to be merged.
