Add MoLFormer molecular transformer contrib model #62
Open
jimburtoft wants to merge 1 commit into aws-neuron:main from
Conversation
IBM MoLFormer-XL-both-10pct (~47M params, encoder-only) compiled with torch_neuronx.trace() for inf2. Includes notebook, 17-test pytest suite, and README with benchmark results.

Key results (inf2.xlarge, SDK 2.28, matmult bf16):

- Peak: 1,660 inf/s (BS=4, DP=2)
- Latency: 1.50 ms P50 (BS=1, single core)
- Accuracy: cosine similarity 0.999995 vs CPU
- Model size: 74 MB (54% smaller than FP32)
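For context, throughput and P50 latency numbers like those above come from a timing loop of this general shape. A minimal, hardware-free sketch (the lambda is a stand-in for a call to the traced Neuron model; this is not the PR's actual benchmark code):

```python
import time
import statistics

def benchmark(fn, batch_size, iters=100, warmup=10):
    """Run fn() repeatedly; return (throughput in inferences/s, P50 latency in ms)."""
    for _ in range(warmup):
        fn()  # warm-up iterations, excluded from timing
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    p50_ms = statistics.median(latencies) * 1000.0
    throughput = batch_size * iters / sum(latencies)
    return throughput, p50_ms

# Stand-in workload; in the notebook, fn would invoke the compiled model.
tput, p50 = benchmark(lambda: sum(range(10_000)), batch_size=1)
print(f"{tput:.0f} inf/s, P50 {p50:.3f} ms")
```

Note that DataParallel peak throughput (BS=4, DP=2) exceeds single-core throughput because batches are spread across both NeuronCores.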
Description
Add IBM MoLFormer-XL-both-10pct (~47M params, encoder-only molecular transformer) compiled with torch_neuronx.trace() for AWS Inferentia2. MoLFormer generates 768-dim embeddings from SMILES molecular strings, used in drug discovery and cheminformatics. Includes Jupyter notebook (compile, verify, benchmark), 17-test pytest suite, and README with measured results.

Model Information
- Model Name: MoLFormer-XL-both-10pct
- Model Architecture: Encoder-only transformer (custom, trust_remote_code=True)
- Purpose: Molecular embedding generation from SMILES strings
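For readers unfamiliar with the model, generating an embedding looks roughly like this. This is a sketch, not code taken from the PR's notebook; the ibm/MoLFormer-XL-both-10pct checkpoint id and pooler_output attribute follow the HuggingFace model card, and l2_normalize is a hypothetical helper (the download-and-run section needs network access):

```python
import numpy as np

def l2_normalize(emb, eps=1e-12):
    """L2-normalize embeddings along the last axis (handy before cosine comparisons)."""
    emb = np.asarray(emb, dtype=np.float64)
    return emb / np.maximum(np.linalg.norm(emb, axis=-1, keepdims=True), eps)

if __name__ == "__main__":
    # Requires network access and trust_remote_code=True (the model ships custom code).
    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "ibm/MoLFormer-XL-both-10pct"
    model = AutoModel.from_pretrained(
        MODEL_ID, deterministic_eval=True, trust_remote_code=True
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

    smiles = ["Cn1c(=O)c2c(ncn2C)n(C)c1=O"]  # caffeine
    inputs = tokenizer(smiles, padding=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    emb = l2_normalize(out.pooler_output.numpy())  # one 768-dim vector per molecule
    print(emb.shape)
```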
Checklist
Please ensure your PR includes the following items. Refer to the contrib/CONTRIBUTING.md for detailed guidelines.
Required Components
- Integration tests (test/integration/test_model.py)
- No src/: the model is compiled with torch_neuronx.trace() directly (no NxDI modeling code). Inference logic is in the notebook and test file, following the same pattern as contrib/models/LaughterSegmentation/.

Optional Components

- Unit tests in the test/unit/ directory

Folder Structure
Confirm your contribution follows this structure:
```
/contrib/models/MoLFormer/
    README.md
    molformer_neuron_inf2.ipynb
    /test
        __init__.py
        /integration
            __init__.py
            test_model.py
```
Note: No src/ directory -- this is an encoder-only traced model (same pattern as LaughterSegmentation). The modeling/inference code lives in the notebook and test file.

Testing
How did you test this change?
Deployed an inf2.xlarge (2 NeuronCores) in sa-east-1 with the SDK 2.28 DLAMI (Ubuntu 24.04). Compiled the model with torch_neuronx.trace() using --auto-cast matmult --auto-cast-type bf16. Ran the full pytest suite and the standalone runner. Tested accuracy against a CPU reference across 5 SMILES molecules (caffeine, aspirin, acetaminophen, cyclohexane, benzene). Benchmarked single-core, DataParallel, and various batch sizes.

Test Results:
```
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0
collected 17 items

test_model.py::TestModelLoads::test_neuron_model_loads PASSED
test_model.py::TestModelLoads::test_output_shape PASSED
test_model.py::TestAccuracy::test_cosine_similarity[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[c1ccccc1] PASSED
test_model.py::TestAccuracy::test_max_diff[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_max_diff[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_max_diff[c1ccccc1] PASSED
test_model.py::TestDataParallel::test_dp_runs PASSED
test_model.py::TestDataParallel::test_dp_speedup PASSED
test_model.py::TestPerformance::test_single_core_throughput PASSED
test_model.py::TestPerformance::test_single_core_latency PASSED
test_model.py::TestPerformance::test_dp_throughput PASSED

============================= 17 passed in 41.20s ==============================
```
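The accuracy tests compare Neuron output against a CPU reference via cosine similarity and max element-wise difference. A self-contained sketch of that style of comparison, with stand-in embeddings and hypothetical thresholds (the PR's actual thresholds live in test_model.py):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened embedding vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def max_abs_diff(a, b):
    """Largest element-wise deviation between reference and device output."""
    return float(np.max(np.abs(np.asarray(a, dtype=np.float64)
                               - np.asarray(b, dtype=np.float64))))

# Stand-ins: a CPU-reference embedding and a "Neuron" embedding with small noise,
# mimicking bf16-matmul rounding on a 768-dim vector.
rng = np.random.default_rng(0)
cpu_emb = rng.standard_normal(768)
neuron_emb = cpu_emb + 1e-3 * rng.standard_normal(768)

assert cosine_similarity(cpu_emb, neuron_emb) > 0.999  # hypothetical threshold
assert max_abs_diff(cpu_emb, neuron_emb) < 1e-2        # hypothetical threshold
```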
Compatibility
Tested with:
Additional Information
- trust_remote_code=True is required (custom modeling code from the HuggingFace hub)
- The deterministic_eval=True flag is MoLFormer-specific (not a standard HF arg)
- --auto-cast matmult is critical for FP32 models: 65% single-core throughput gain (400 -> 661 inf/s) and a 54% smaller compiled model (160 MB -> 74 MB)
- Model output is a dict (without torchscript=True) or a tuple (with it); the test helpers handle both formats
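Tying the compile flags and the dict-vs-tuple note together, here is a hedged sketch of the compile/save/load flow. The extract_embedding helper, input shapes, and file name are assumptions for illustration (not the PR's exact code), and the Neuron-specific calls only run on an inf2 instance with the Neuron SDK installed:

```python
def extract_embedding(output):
    """Handle both output formats: dict without torchscript=True, tuple with it.
    Assumes the HF convention (last_hidden_state, pooler_output) for tuples."""
    if isinstance(output, dict):
        return output["pooler_output"]
    return output[1] if len(output) > 1 else output[0]

if __name__ == "__main__":
    # Neuron-only section: requires an inf2 instance with torch_neuronx installed.
    import torch
    import torch_neuronx
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "ibm/MoLFormer-XL-both-10pct"
    model = AutoModel.from_pretrained(
        MODEL_ID, deterministic_eval=True, trust_remote_code=True, torchscript=True
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

    # trace() needs fixed input shapes; pad to an assumed max length.
    example = tokenizer(
        ["c1ccccc1"], padding="max_length", max_length=128, return_tensors="pt"
    )
    neuron_model = torch_neuronx.trace(
        model,
        (example["input_ids"], example["attention_mask"]),
        compiler_args=["--auto-cast", "matmult", "--auto-cast-type", "bf16"],
    )
    torch.jit.save(neuron_model, "molformer_neuron.pt")

    # Reload and spread batches across both NeuronCores of an inf2.xlarge.
    loaded = torch.jit.load("molformer_neuron.pt")
    dp_model = torch_neuronx.DataParallel(loaded)
    out = dp_model(example["input_ids"], example["attention_mask"])
    print(extract_embedding(out).shape)
```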
None
vLLM Integration
By submitting this PR, I confirm that: