feat: Add comprehensive Python testing infrastructure with Poetry #18

Open
llbbl wants to merge 1 commit into WenmuZhou:master from UnitSeeker:add-testing-infrastructure

llbbl commented Jun 27, 2025

Add Python Testing Infrastructure

Summary

This PR sets up a comprehensive testing infrastructure for the OCR dataset tools project using Poetry as the package manager and pytest as the testing framework.

Changes Made

Package Management

  • Poetry Setup: Created pyproject.toml with Poetry configuration as the project's package manager
  • Dependencies: Migrated all necessary dependencies including PyTorch, OpenCV, and other OCR-related packages
  • Dev Dependencies: Added pytest, pytest-cov, and pytest-mock for testing

Testing Configuration

  • pytest Configuration: Set up pytest with:

    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage reporting with HTML and XML output formats
    • Custom markers for unit, integration, and slow tests
    • Strict mode with verbose output
  • Coverage Settings: Configured coverage to:

    • Track convert and dataset packages
    • Exclude test files and __init__.py from coverage
    • Generate reports in multiple formats
    • Enforce a 0% minimum threshold for now (to be raised to 80% once real tests are added)
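
As an illustration of how the custom markers are meant to be used, a test module might look like the sketch below. The test names and the `normalize_label` helper are hypothetical and not part of this PR; only the marker names `unit`, `integration`, and `slow` come from the configuration described above.

```python
import pytest


def normalize_label(text: str) -> str:
    """Hypothetical helper: collapse runs of whitespace in a recognition label."""
    return " ".join(text.split())


@pytest.mark.unit
def test_normalize_label_collapses_whitespace():
    assert normalize_label("  hello   world ") == "hello world"


@pytest.mark.slow
@pytest.mark.integration
def test_normalize_many_labels():
    # Marked slow so it can be deselected with: -m "not slow"
    labels = [f" word{i}  suffix " for i in range(10_000)]
    assert all("  " not in normalize_label(label) for label in labels)
```

If the strict mode mentioned above includes pytest's `--strict-markers` option, a misspelled marker fails at collection time instead of being silently ignored.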

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared pytest fixtures
├── test_setup_validation.py  # Infrastructure validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Fixtures (in conftest.py)

  • temp_dir: Creates temporary directory for test files
  • sample_image: Generates test images
  • sample_detection_json: Creates detection JSON test data
  • sample_recognition_txt: Creates recognition text test data
  • mock_dataset_config: Provides mock configuration
  • sample_points: Creates polygon point arrays
  • mock_lmdb_env: Mocks LMDB database environment
  • reset_modules: Cleans module imports between tests
  • capture_stdout: Captures print output for testing
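
For readers unfamiliar with pytest fixtures, here is a rough sketch of how two of them could be written. It is not the code from this PR: the JSON layout in `build_detection_record` is an assumed shape for illustration, and the real `sample_detection_json` fixture may return different data or a different type.

```python
import json
import tempfile
from pathlib import Path

import pytest


def build_detection_record() -> dict:
    """Assumed layout for one detection annotation; the real schema may differ."""
    return {
        "img_name": "sample.jpg",
        "annotations": [
            {"polygon": [[0, 0], [10, 0], [10, 5], [0, 5]], "text": "hello"},
        ],
    }


@pytest.fixture
def temp_dir():
    # Yield a directory that is removed automatically after the test.
    with tempfile.TemporaryDirectory() as path:
        yield Path(path)


@pytest.fixture
def sample_detection_json(temp_dir):
    # Write the record to disk and hand the test its path.
    json_path = temp_dir / "detection.json"
    json_path.write_text(json.dumps(build_detection_record()), encoding="utf-8")
    return json_path


def test_detection_json_round_trips(sample_detection_json):
    # pytest injects the fixture by matching the argument name.
    data = json.loads(sample_detection_json.read_text(encoding="utf-8"))
    assert data["annotations"][0]["text"] == "hello"
```

Fixtures defined in conftest.py are discovered automatically, so test modules under tests/unit/ and tests/integration/ can request them by name without importing anything.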

Additional Changes

  • Updated .gitignore: Added comprehensive entries for:
    • Testing artifacts (.pytest_cache/, coverage.xml, htmlcov/)
    • Claude settings (.claude/*)
    • Python build artifacts and virtual environments
    • Note: poetry.lock is intentionally NOT ignored

How to Use

Installing Dependencies

# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -

# Install project dependencies
poetry install

Running Tests

Either of the following commands runs the same test suite:

poetry run test
# or
poetry run tests
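
`poetry run` can execute a script declared under `[tool.poetry.scripts]` in pyproject.toml, so `test` and `tests` are presumably two script names that point at the same Python function. A minimal sketch of such an entry point follows; the `run_tests` name is hypothetical, since the actual module and function used by this PR are not shown in this description.

```python
import sys
from typing import Optional, Sequence

import pytest


def run_tests(argv: Optional[Sequence[str]] = None) -> int:
    """Hypothetical entry point: forward command-line arguments to pytest."""
    args = list(sys.argv[1:] if argv is None else argv)
    # pytest.main returns an exit code (0 means every test passed).
    return int(pytest.main(args))
```

Because the arguments are passed through unchanged, anything typed after `poetry run test` reaches pytest as if it had been invoked directly.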

Test Options

All standard pytest options are available:

# Run only unit tests
poetry run test -m unit

# Run with specific verbosity
poetry run test -v

# Run a specific test file
poetry run test tests/test_setup_validation.py

# Run the tests and write an HTML coverage report
poetry run test --cov-report=html

Notes

  1. Coverage Threshold: The threshold is currently 0% so that the infrastructure can land before any real tests exist. Raise it to 80% in pyproject.toml once actual tests are written.

  2. Validation Tests: The included test_setup_validation.py verifies that the testing infrastructure is properly configured and all dependencies are correctly installed.

  3. Next Steps: Developers can now immediately start writing unit and integration tests for the convert and dataset packages using the provided infrastructure.

  4. Poetry Lock File: The poetry.lock file will be generated on first install and should be committed to ensure reproducible builds.

- Set up Poetry as package manager with pyproject.toml configuration
- Add pytest, pytest-cov, and pytest-mock as dev dependencies
- Configure pytest with coverage reporting and custom markers
- Create test directory structure with fixtures in conftest.py
- Add validation tests to verify infrastructure setup
- Update .gitignore with testing and Claude-related entries
- Configure test commands accessible via `poetry run test` and `poetry run tests`