Add MoLFormer molecular transformer contrib model #62
Open
jimburtoft wants to merge 1 commit into aws-neuron:main from
Conversation
IBM MoLFormer-XL-both-10pct (~47M params, encoder-only) compiled with torch_neuronx.trace() for inf2. Includes notebook, 17-test pytest suite, and README with benchmark results.

Key results (inf2.xlarge, SDK 2.28, matmult bf16):

- Peak: 1,660 inf/s (BS=4, DP=2)
- Latency: 1.50 ms P50 (BS=1, single core)
- Accuracy: cosine similarity 0.999995 vs CPU
- Model size: 74 MB (54% smaller than FP32)
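For context, throughput and P50 latency numbers like those above come from a timing loop of this general shape. A minimal, hardware-free sketch (the lambda is a stand-in for a call to the traced Neuron model; this is not the PR's actual benchmark code):

```python
import time
import statistics

def benchmark(fn, batch_size, iters=100, warmup=10):
    """Run fn() repeatedly; return (throughput in inferences/s, P50 latency in ms)."""
    for _ in range(warmup):
        fn()  # warm-up iterations, excluded from timing
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    p50_ms = statistics.median(latencies) * 1000.0
    throughput = batch_size * iters / sum(latencies)
    return throughput, p50_ms

# Stand-in workload; in the notebook, fn would invoke the compiled model.
tput, p50 = benchmark(lambda: sum(range(10_000)), batch_size=1)
print(f"{tput:.0f} inf/s, P50 {p50:.3f} ms")
```

Note that DataParallel peak throughput (BS=4, DP=2) exceeds single-core throughput because batches are spread across both NeuronCores.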
Description
Add IBM MoLFormer-XL-both-10pct (~47M params, encoder-only molecular transformer) compiled with torch_neuronx.trace() for AWS Inferentia2. MoLFormer generates 768-dim embeddings from SMILES molecular strings, used in drug discovery and cheminformatics. Includes Jupyter notebook (compile, verify, benchmark), 17-test pytest suite, and README with measured results.

Model Information
- Model Name: MoLFormer-XL-both-10pct
- Model Architecture: Encoder-only transformer (custom, trust_remote_code=True)
- Purpose: Molecular embedding generation from SMILES strings
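For readers unfamiliar with the model, generating an embedding looks roughly like this. This is a sketch, not code taken from the PR's notebook; the ibm/MoLFormer-XL-both-10pct checkpoint id and pooler_output attribute follow the HuggingFace model card, and l2_normalize is a hypothetical helper (the download-and-run section needs network access):

```python
import numpy as np

def l2_normalize(emb, eps=1e-12):
    """L2-normalize embeddings along the last axis (handy before cosine comparisons)."""
    emb = np.asarray(emb, dtype=np.float64)
    return emb / np.maximum(np.linalg.norm(emb, axis=-1, keepdims=True), eps)

if __name__ == "__main__":
    # Requires network access and trust_remote_code=True (the model ships custom code).
    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "ibm/MoLFormer-XL-both-10pct"
    model = AutoModel.from_pretrained(
        MODEL_ID, deterministic_eval=True, trust_remote_code=True
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

    smiles = ["Cn1c(=O)c2c(ncn2C)n(C)c1=O"]  # caffeine
    inputs = tokenizer(smiles, padding=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    emb = l2_normalize(out.pooler_output.numpy())  # one 768-dim vector per molecule
    print(emb.shape)
```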
Checklist
Please ensure your PR includes the following items. Refer to the contrib/CONTRIBUTING.md for detailed guidelines.
Required Components
- Integration tests (test/integration/test_model.py)
- No src/: the model is compiled with torch_neuronx.trace() directly (no NxDI modeling code). Inference logic is in the notebook and test file, following the same pattern as contrib/models/LaughterSegmentation/.

Optional Components

- Unit tests in the test/unit/ directory

Folder Structure
Confirm your contribution follows this structure:
```
/contrib/models/MoLFormer/
    README.md
    molformer_neuron_inf2.ipynb
    /test
        __init__.py
        /integration
            __init__.py
            test_model.py
```
Note: No src/ directory -- this is an encoder-only traced model (same pattern as LaughterSegmentation). The modeling/inference code lives in the notebook and test file.

Testing
How did you test this change?
Deployed an inf2.xlarge (2 NeuronCores) in sa-east-1 with the SDK 2.28 DLAMI (Ubuntu 24.04). Compiled the model with torch_neuronx.trace() using --auto-cast matmult --auto-cast-type bf16. Ran the full pytest suite and the standalone runner. Tested accuracy against a CPU reference across 5 SMILES molecules (caffeine, aspirin, acetaminophen, cyclohexane, benzene). Benchmarked single-core, DataParallel, and various batch sizes.

Test Results:
```
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0
collected 17 items

test_model.py::TestModelLoads::test_neuron_model_loads PASSED
test_model.py::TestModelLoads::test_output_shape PASSED
test_model.py::TestAccuracy::test_cosine_similarity[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[c1ccccc1] PASSED
test_model.py::TestAccuracy::test_max_diff[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_max_diff[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_max_diff[c1ccccc1] PASSED
test_model.py::TestDataParallel::test_dp_runs PASSED
test_model.py::TestDataParallel::test_dp_speedup PASSED
test_model.py::TestPerformance::test_single_core_throughput PASSED
test_model.py::TestPerformance::test_single_core_latency PASSED
test_model.py::TestPerformance::test_dp_throughput PASSED

============================= 17 passed in 41.20s ==============================
```
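The accuracy tests compare Neuron output against a CPU reference via cosine similarity and max element-wise difference. A self-contained sketch of that style of comparison, with stand-in embeddings and hypothetical thresholds (the PR's actual thresholds live in test_model.py):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened embedding vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def max_abs_diff(a, b):
    """Largest element-wise deviation between reference and device output."""
    return float(np.max(np.abs(np.asarray(a, dtype=np.float64)
                               - np.asarray(b, dtype=np.float64))))

# Stand-ins: a CPU-reference embedding and a "Neuron" embedding with small noise,
# mimicking bf16-matmul rounding on a 768-dim vector.
rng = np.random.default_rng(0)
cpu_emb = rng.standard_normal(768)
neuron_emb = cpu_emb + 1e-3 * rng.standard_normal(768)

assert cosine_similarity(cpu_emb, neuron_emb) > 0.999  # hypothetical threshold
assert max_abs_diff(cpu_emb, neuron_emb) < 1e-2        # hypothetical threshold
```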
Compatibility
Tested with:
Additional Information
- trust_remote_code=True is required (custom modeling code from the HuggingFace hub)
- The deterministic_eval=True flag is MoLFormer-specific (not a standard HF arg)
- --auto-cast matmult is critical for FP32 models: 65% single-core throughput gain (400 -> 661 inf/s) and a 54% smaller compiled model (160 MB -> 74 MB)
- Model output is a dict (without torchscript=True) or a tuple (with it); the test helpers handle both formats
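Tying the compile flags and the dict-vs-tuple note together, here is a hedged sketch of the compile/save/load flow. The extract_embedding helper, input shapes, and file name are assumptions for illustration (not the PR's exact code), and the Neuron-specific calls only run on an inf2 instance with the Neuron SDK installed:

```python
def extract_embedding(output):
    """Handle both output formats: dict without torchscript=True, tuple with it.
    Assumes the HF convention (last_hidden_state, pooler_output) for tuples."""
    if isinstance(output, dict):
        return output["pooler_output"]
    return output[1] if len(output) > 1 else output[0]

if __name__ == "__main__":
    # Neuron-only section: requires an inf2 instance with torch_neuronx installed.
    import torch
    import torch_neuronx
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "ibm/MoLFormer-XL-both-10pct"
    model = AutoModel.from_pretrained(
        MODEL_ID, deterministic_eval=True, trust_remote_code=True, torchscript=True
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

    # trace() needs fixed input shapes; pad to an assumed max length.
    example = tokenizer(
        ["c1ccccc1"], padding="max_length", max_length=128, return_tensors="pt"
    )
    neuron_model = torch_neuronx.trace(
        model,
        (example["input_ids"], example["attention_mask"]),
        compiler_args=["--auto-cast", "matmult", "--auto-cast-type", "bf16"],
    )
    torch.jit.save(neuron_model, "molformer_neuron.pt")

    # Reload and spread batches across both NeuronCores of an inf2.xlarge.
    loaded = torch.jit.load("molformer_neuron.pt")
    dp_model = torch_neuronx.DataParallel(loaded)
    out = dp_model(example["input_ids"], example["attention_mask"])
    print(extract_embedding(out).shape)
```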
None
vLLM Integration
By submitting this PR, I confirm that: