
Add MoLFormer molecular transformer contrib model#62

Open
jimburtoft wants to merge 1 commit into aws-neuron:main from jimburtoft:contrib/molformer

Conversation

@jimburtoft

Description

Add IBM MoLFormer-XL-both-10pct (~47M params, encoder-only molecular transformer) compiled with torch_neuronx.trace() for AWS Inferentia2. MoLFormer generates 768-dim embeddings from SMILES molecular strings, used in drug discovery and cheminformatics. Includes Jupyter notebook (compile, verify, benchmark), 17-test pytest suite, and README with measured results.

Model Information

Model Name: MoLFormer-XL-both-10pct
Model Architecture: Encoder-only transformer (custom, trust_remote_code=True)
Purpose: Molecular embedding generation from SMILES strings
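As a quick orientation, embedding generation from SMILES strings looks roughly like the following on CPU. This is a hedged sketch, not code from the PR: it assumes the public `ibm/MoLFormer-XL-both-10pct` checkpoint on the HuggingFace Hub and the `deterministic_eval=True` flag mentioned below; imports are deferred so the function only needs `transformers`/`torch` when actually called.

```python
def embed_smiles(smiles_list):
    """Sketch: generate 768-dim MoLFormer embeddings for SMILES strings.

    Assumes transformers is installed and the checkpoint
    ibm/MoLFormer-XL-both-10pct is reachable; runs on CPU here.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    name = "ibm/MoLFormer-XL-both-10pct"
    # MoLFormer ships custom modeling code, hence trust_remote_code=True;
    # deterministic_eval is a MoLFormer-specific (non-standard HF) argument.
    tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        name, deterministic_eval=True, trust_remote_code=True
    )
    model.eval()
    inputs = tokenizer(smiles_list, padding=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.pooler_output  # shape (batch, 768)
```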

Checklist

Please ensure your PR includes the following items. Refer to the contrib/CONTRIBUTING.md for detailed guidelines.

Required Components

  • Accuracy Test (ex. test/integration/test_model.py)
    • At least one integration test that validates model accuracy
    • Uses logit validation or equivalent accuracy verification
    • Test can compile and run the model on Neuron
  • README.md with the following sections:
    • Usage Example: Clear code example showing how to use the model
    • Compatibility Matrix: Table showing tested Neuron SDK versions and instance types (Trn1/Trn2/Inf2)
    • Example Checkpoints: Links to compatible model checkpoints (e.g., HuggingFace Hub)
    • Testing Instructions: Command to run the test suite for the model
  • Source Code (src/)
    • N/A -- encoder-only model uses torch_neuronx.trace() directly (no NxDI modeling code). Inference logic is in the notebook and test file, following the same pattern as contrib/models/LaughterSegmentation/.

Optional Components

  • Unit Tests (CPU or Neuron-based)
    • Tests for individual modeling components
    • Located in test/unit/ directory

Folder Structure

Confirm your contribution follows this structure:
/contrib/models/MoLFormer/
  README.md
  molformer_neuron_inf2.ipynb
  /test
    __init__.py
    /integration
      __init__.py
      test_model.py
Note: No src/ directory -- this is an encoder-only traced model (same pattern as LaughterSegmentation). The modeling/inference code lives in the notebook and test file.

Testing

How did you test this change?
Deployed an inf2.xlarge instance (2 NeuronCores) in sa-east-1 with the SDK 2.28 DLAMI (Ubuntu 24.04). Compiled the model with torch_neuronx.trace() using --auto-cast matmult --auto-cast-type bf16. Ran the full pytest suite and the standalone runner. Verified accuracy against a CPU reference across 5 SMILES molecules (caffeine, aspirin, acetaminophen, cyclohexane, benzene). Benchmarked single-core, DataParallel, and various batch sizes.
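The compile step described above can be sketched as follows. This is an illustrative sketch, not the notebook's exact code: `compile_for_inf2` and its `out_path` default are hypothetical names, and running it requires the AWS Neuron SDK (`torch_neuronx`) on an inf2/trn1 instance, so the imports are deferred into the function body.

```python
def compile_for_inf2(model, example_inputs, out_path="molformer_neuron.pt"):
    """Sketch of the torch_neuronx.trace() compile step used in this PR.

    compiler_args mirrors the flags reported above:
    --auto-cast matmult --auto-cast-type bf16.
    """
    import torch
    import torch_neuronx  # Neuron SDK; present only on Neuron instances

    compiler_args = ["--auto-cast", "matmult", "--auto-cast-type", "bf16"]
    traced = torch_neuronx.trace(
        model, example_inputs, compiler_args=compiler_args
    )
    torch.jit.save(traced, out_path)  # reloadable with torch.jit.load()
    return traced
```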
Test Results:
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0
collected 17 items
test_model.py::TestModelLoads::test_neuron_model_loads PASSED
test_model.py::TestModelLoads::test_output_shape PASSED
test_model.py::TestAccuracy::test_cosine_similarity[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_cosine_similarity[c1ccccc1] PASSED
test_model.py::TestAccuracy::test_max_diff[Cn1c(=O)c2c(ncn2C)n(C)c1=O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)Oc1ccccc1C(=O)O] PASSED
test_model.py::TestAccuracy::test_max_diff[CC(=O)NC1=CC=C(O)C=C1] PASSED
test_model.py::TestAccuracy::test_max_diff[C1CCCCC1] PASSED
test_model.py::TestAccuracy::test_max_diff[c1ccccc1] PASSED
test_model.py::TestDataParallel::test_dp_runs PASSED
test_model.py::TestDataParallel::test_dp_speedup PASSED
test_model.py::TestPerformance::test_single_core_throughput PASSED
test_model.py::TestPerformance::test_single_core_latency PASSED
test_model.py::TestPerformance::test_dp_throughput PASSED
============================= 17 passed in 41.20s ==============================
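The two accuracy metrics exercised above (cosine similarity and max absolute difference against a CPU reference) reduce to simple math. A minimal, dependency-free sketch, with thresholds chosen here for illustration rather than taken from the test suite:

```python
import math

def cosine_similarity(a, b):
    # a, b: equal-length sequences of floats (e.g. 768-dim embeddings)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def max_abs_diff(a, b):
    # largest element-wise deviation between the two embeddings
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical acceptance thresholds; the PR reports cosine similarity
# of 0.999995 vs the CPU reference with matmult bf16.
COSINE_THRESHOLD = 0.999
MAX_DIFF_THRESHOLD = 1e-2
```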

Compatibility

Tested with:

  • Neuron SDK Version(s): 2.28
  • Instance Type(s): inf2.xlarge
  • PyTorch Version: 2.9.0
  • Python Version: 3.12.3

Additional Information

  • MoLFormer requires trust_remote_code=True (custom modeling code from HuggingFace hub)
  • The deterministic_eval=True flag is MoLFormer-specific (not a standard HF arg)
  • --auto-cast matmult is critical for FP32 models: 65% single-core throughput gain (400 -> 661 inf/s), 54% smaller compiled model (160 MB -> 74 MB)
  • Cosine similarity 0.999995 vs CPU reference (matmult bf16); 1.000000 for FP32
  • Peak throughput: 1,660 inf/s (BS=4, DP=2); best low-latency: 1.50ms P50 (BS=1, single core)
  • DataParallel scales 1.97x with 2 NeuronCores
  • Neuron traced models return dict (without torchscript=True) or tuple (with it); test helpers handle both formats
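The dict-vs-tuple point in the last bullet can be handled with a small normalization helper like the one sketched below. `first_output` is a hypothetical name for illustration; the PR's actual test helpers may differ.

```python
def first_output(outputs):
    """Normalize a traced-model return value to its first tensor.

    Neuron traced models return a dict (traced without torchscript=True)
    or a tuple (with it); this accepts either, plus a bare tensor.
    """
    if isinstance(outputs, dict):
        # take the first value, e.g. last_hidden_state
        return next(iter(outputs.values()))
    if isinstance(outputs, (tuple, list)):
        return outputs[0]
    return outputs  # already a bare tensor
```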

Related Issues

None

vLLM Integration

  • This model/feature is intended for use with vLLM
  • Documentation includes vLLM registration instructions

By submitting this PR, I confirm that:

  • I have read and followed the contributing guidelines
  • This is a community contribution and may have limited testing compared to officially-supported models
  • The code follows best practices and is well-documented
  • All required components listed above are included

IBM MoLFormer-XL-both-10pct (~47M params, encoder-only) compiled with
torch_neuronx.trace() for inf2. Includes notebook, 17-test pytest suite,
and README with benchmark results.

Key results (inf2.xlarge, SDK 2.28, matmult bf16):
- Peak: 1,660 inf/s (BS=4, DP=2)
- Latency: 1.50ms P50 (BS=1, single core)
- Accuracy: cosine similarity 0.999995 vs CPU
- Model size: 74 MB (54% smaller than FP32)