Skip to content

malithjayasinghe/slm-test-summarization

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

1 Commit
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

SLM Test Summarization

A Python framework for testing Small Language Models (SLMs) locally for automated test case and test result summarization. This project enables evaluation and comparison of different language models for their ability to generate concise, accurate summaries of software testing outcomes.

πŸš€ Features

  • Multi-Backend Support: Ollama (local models) and Hugging Face (cloud models)
  • Comprehensive Evaluation: ROUGE, BLEU, and custom metrics for summary quality
  • Batch Processing: Efficient processing of multiple test cases
  • Model Comparison: Side-by-side evaluation of different SLMs
  • Extensible Architecture: Easy to add new model providers and evaluation metrics

πŸ“Š Supported Models

Hugging Face Models

  • BART-Large-CNN: Facebook's state-of-the-art summarization model βœ… Working
  • T5-Small/Base/Large: Google's text-to-text transformer models
  • Any Hugging Face seq2seq model

Ollama Models (Local)

  • Llama 3.2: Meta's latest language model (3B, 7B, 70B variants)
  • Phi 3.5: Microsoft's efficient small language model
  • Gemma: Google's lightweight model family
  • Any model available through Ollama

πŸ› οΈ Installation

Prerequisites

  • Python 3.8+
  • 4GB+ RAM (8GB+ recommended for larger models)
  • Git

Quick Setup

  1. Clone the repository

    git clone https://github.com/yourusername/slm-test-summarization.git
    cd slm-test-summarization
  2. Create virtual environment

    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Run the example

    python examples/basic_summarization.py

Optional: Ollama Setup for Local Models

  1. Install Ollama from ollama.ai

  2. Pull a model

    ollama pull llama3.2:3b
  3. Start Ollama service

    ollama serve

🎯 Usage

Basic Example

from src.models.huggingface_model import HuggingFaceModel
from src.models.base import SummarizationRequest
from src.evaluation import SummarizationEvaluator

# Initialize model
model = HuggingFaceModel('facebook/bart-large-cnn')

# Create test case
request = SummarizationRequest(
    test_content="""
    Unit Test Results - User Authentication
    Test: test_valid_login()
    Status: PASSED
    User provided valid credentials (user@example.com, correct_password)
    Expected: Login successful, session token generated
    Actual: Login successful, session token: abc123xyz
    Assertions: 3/3 passed
    Duration: 0.25s
    """,
    test_type="UNIT",
    max_length=50,
    style="concise"
)

# Generate summary
response = model.summarize(request)
print(f"Summary: {response.summary}")
print(f"Processing time: {response.processing_time:.2f}s")

# Evaluate quality
evaluator = SummarizationEvaluator()
reference = "User authentication test passed with valid credentials"
evaluation = evaluator.evaluate_single(reference, response.summary, "BART")
print(f"ROUGE-1 Score: {evaluation['rouge']['rouge1_fmeasure']:.3f}")

Batch Processing

from src.models.ollama_model import OllamaModel

# Initialize Ollama model
model = OllamaModel('llama3.2:3b')

# Process multiple test cases
test_cases = [
    SummarizationRequest(test_content="...", test_type="UNIT"),
    SummarizationRequest(test_content="...", test_type="E2E"),
    SummarizationRequest(test_content="...", test_type="INTEGRATION")
]

responses = model.summarize_batch(test_cases)
for i, response in enumerate(responses):
    print(f"Test {i+1}: {response.summary}")

Model Comparison

from src.evaluation.comparison import ModelComparator

# Compare multiple models
models = {
    'BART': HuggingFaceModel('facebook/bart-large-cnn'),
    'Llama3.2': OllamaModel('llama3.2:3b')
}

comparator = ModelComparator()
results = comparator.compare_models(models, test_cases, references)

# Generate report
from src.evaluation.reports import ReportGenerator
generator = ReportGenerator()
report = generator.generate_comparison_report(results)
print(report)

πŸ“Š Evaluation Metrics

Standard Metrics

  • ROUGE-1/2/L: Content overlap and recall
  • BLEU: Translation quality adapted for summarization

Custom Metrics

  • Length Ratio: Summary conciseness (target vs actual length)
  • Keyword Coverage: Important terms preservation
  • Readability Score: Text complexity analysis
  • Completeness: Information retention assessment

πŸ—οΈ Project Structure

slm-test-summarization/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ models/              # SLM integrations
β”‚   β”‚   β”œβ”€β”€ base.py         # Abstract base classes
β”‚   β”‚   β”œβ”€β”€ huggingface_model.py
β”‚   β”‚   └── ollama_model.py
β”‚   β”œβ”€β”€ evaluation/          # Metrics and comparison
β”‚   β”‚   β”œβ”€β”€ metrics.py      # ROUGE, BLEU, custom metrics
β”‚   β”‚   β”œβ”€β”€ comparison.py   # Model comparison tools
β”‚   β”‚   └── reports.py      # Report generation
β”‚   β”œβ”€β”€ data/               # Test data processing
β”‚   └── utils/              # Configuration and helpers
β”œβ”€β”€ examples/               # Usage examples
β”œβ”€β”€ requirements.txt        # Python dependencies
└── README.md              # This file

πŸ“ˆ Performance Benchmarks

Model Avg Time/Test ROUGE-1 Score Memory Usage
BART-Large-CNN 4.2s 0.354 2.1GB
Llama 3.2:3B 2.8s* 0.331* 3.2GB*
T5-Base 3.1s* 0.298* 1.8GB*

Estimated performance based on model specifications

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

🎯 Use Cases

  • CI/CD Integration: Automatic test result summarization in build pipelines
  • QA Reporting: Generate executive summaries of test suite outcomes
  • Model Research: Compare different SLMs for domain-specific summarization
  • Test Analysis: Quickly understand large test suite results
  • Documentation: Auto-generate test case descriptions

πŸ™ Acknowledgments

  • Hugging Face for transformer models and libraries
  • Ollama for local model serving
  • ROUGE for evaluation metrics
  • Open source community for inspiration and contributions

⭐ Star this repository if it helps your testing workflow!

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors