name: Good First Issue
about: A beginner-friendly task perfect for first-time contributors
title: '[GOOD FIRST ISSUE] Add Docstrings to Processing Utility Functions'
labels: 'good first issue, documentation, enhancement'
assignees: ''
Welcome! 👋
This is a beginner-friendly issue perfect for first-time contributors to the Intugle project. We've designed this task to help you get familiar with our codebase while making a meaningful contribution.
Task Description
Add comprehensive docstrings to the utility functions in src/intugle/core/utilities/processing.py. Several important functions, including string_standardization, compute_stats, and adjust_sample, need better documentation.
Why This Matters
These utility functions are used throughout the codebase for:
- Data cleaning and standardization
- Statistical computations
- Sample data processing
Good documentation helps developers understand:
- What each function does
- What parameters it expects
- What it returns
- When to use each function
What You'll Learn
- Writing clear documentation for utility functions
- Explaining mathematical operations in plain language
- Documenting data transformation functions
- Understanding statistical concepts (mean, variance, skewness, kurtosis)
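For orientation, the four statistics named above can be computed in a few lines. This is only a minimal sketch using population formulas; the name `describe_moments` is hypothetical, and it is not the project's compute_stats(), whose exact formulas, return order, and zero-variance handling you should read from the source.

```python
import math


def describe_moments(values):
    """Return (mean, variance, skewness, kurtosis) for a non-empty sequence.

    Hypothetical helper for illustration; uses population formulas.
    Skewness and kurtosis divide by the standard deviation, so they are
    undefined when the variance is 0. This sketch returns NaN for both
    in that case.
    """
    n = len(values)
    mean = sum(values) / n
    # Variance: average squared distance from the mean.
    variance = sum((x - mean) ** 2 for x in values) / n
    if variance == 0:
        return mean, 0.0, math.nan, math.nan
    std = math.sqrt(variance)
    # Skewness: average cubed z-score (asymmetry of the distribution).
    skewness = sum(((x - mean) / std) ** 3 for x in values) / n
    # Kurtosis: average fourth-power z-score (weight of the tails).
    kurtosis = sum(((x - mean) / std) ** 4 for x in values) / n
    return mean, variance, skewness, kurtosis
```

Writing the comments for a function like this is good practice for the "plain language" explanations the docstrings need.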
Step-by-Step Guide
Prerequisites
Setup Instructions
- Fork and clone the repository
  git clone https://github.com/YOUR_USERNAME/data-tools.git
  cd data-tools
- Create a virtual environment
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies
  pip install -e ".[dev]"
- Create a new branch
  git checkout -b docs/add-docstrings-processing-utils
Implementation Steps
- Open the file src/intugle/core/utilities/processing.py
- Add docstring to remove_ascii() function (line 18):
  - Explain what it does (removes non-ASCII characters)
  - Document parameters and return type
  - Explain use case (data cleaning)
- Add docstring to string_standardization() function (line 22):
  - Currently has no docstring
  - Explain the cleaning steps: remove non-ASCII characters, remove special characters, standardize whitespace, etc.
  - Add an example showing before/after
- Add docstring to compute_stats() function (line 31):
  - Explain what statistics are computed
  - Document the return tuple order
  - Explain the special case when variance is 0
- Add docstring to adjust_sample() function (line 54):
  - Explain the sampling strategy
  - Document all parameters and their defaults
  - Explain when samples are augmented vs truncated
- Add docstring to character_length_based_stratified_sampling() function (line 175):
  - Explain stratified sampling approach
  - Explain why character length is used for stratification
  - Document parameters
- Review the to_high_precision_array() function (line 246):
  - It already has a good docstring! Use it as a reference for your other docstrings
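To illustrate the "cleaning steps plus before/after example" pattern asked for above, here is a sketch of a docstring on a small text-cleaning function. The function `clean_text` and its four steps are hypothetical stand-ins; the real string_standardization() may apply different steps in a different order, so describe what the source actually does.

```python
import re


def clean_text(text: str) -> str:
    """Standardize a raw string so that similar values compare equal.

    Hypothetical stand-in for illustration only.

    Steps, in order:
      1. Drop non-ASCII characters.
      2. Replace anything that is not a letter, digit, or space with a space.
      3. Collapse runs of whitespace and strip both ends.
      4. Lowercase the result.

    Args:
        text: The raw input string.

    Returns:
        The cleaned string.

    Example:
        >>> clean_text("  Café_Name!!  2024 ")
        'caf name 2024'
    """
    # Step 1: encoding to ASCII with errors="ignore" silently drops other characters.
    ascii_only = text.encode("ascii", errors="ignore").decode("ascii")
    # Step 2: special characters become spaces so words stay separated.
    no_special = re.sub(r"[^A-Za-z0-9 ]", " ", ascii_only)
    # Step 3: collapse repeated whitespace.
    collapsed = re.sub(r"\s+", " ", no_special).strip()
    # Step 4: lowercase.
    return collapsed.lower()
```

Because the example is written in doctest format, saving the sketch to a file and running `python -m doctest` on it would confirm that the documented output matches the actual behavior.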
Files to Modify
- File: src/intugle/core/utilities/processing.py
- Change: Add comprehensive docstrings to utility functions
- Line(s): 18, 22, 31, 54, 175 (and others as you see fit)
Testing Your Changes
- Verify docstrings render correctly:
  from intugle.core.utilities.processing import (
      string_standardization,
      compute_stats,
      adjust_sample,
  )
  help(string_standardization)
  help(compute_stats)
  help(adjust_sample)
- Check linting:
  ruff check src/intugle/core/utilities/processing.py
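Beyond reading the help() output by eye, a short script can list any functions that still lack a docstring. The helper below is a sketch built on the standard library's inspect module; the name `functions_missing_docstrings` is hypothetical and not part of the project.

```python
import inspect


def functions_missing_docstrings(module):
    """Return sorted names of functions defined in `module` without a docstring."""
    missing = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        # Skip functions that were imported into the module from elsewhere.
        if obj.__module__ != module.__name__:
            continue
        # inspect.getdoc returns None (or "") when there is no docstring.
        if not inspect.getdoc(obj):
            missing.append(name)
    return missing


# Usage (assumes the package is installed in your environment):
# from intugle.core.utilities import processing
# print(functions_missing_docstrings(processing))
```

An empty list means every function defined in the module has a docstring; it says nothing about their quality, so still review the rendered text.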
Submitting Your Work
Please run the following command to automatically fix linting issues before committing: ruff check --fix .
- Commit your changes
  git add src/intugle/core/utilities/processing.py
  git commit -m "Add comprehensive docstrings to processing utilities"
- Push to your fork
  git push origin docs/add-docstrings-processing-utils
- Create a Pull Request
  - Go to the original repository
  - Click "Pull Requests" → "New Pull Request"
  - Select your branch
  - Fill out the PR template
  - Reference this issue with "Fixes #ISSUE_NUMBER"
Expected Outcome
All utility functions should have clear docstrings with:
- Brief description of what the function does
- Parameter descriptions
- Return value documentation
- Practical examples
- Notes about edge cases or special behavior
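As one possible shape for a docstring that covers all five points, consider the sketch below. The function `resize_sample` is hypothetical and does not reproduce the behavior of adjust_sample(); the section layout (Args, Returns, Example, Notes) is an assumption as well, so prefer whatever style to_high_precision_array() already uses.

```python
import random


def resize_sample(values, target_size, seed=0):
    """Return a list of exactly `target_size` items drawn from `values`.

    Hypothetical example for illustration only.

    Args:
        values: Non-empty list of items to sample from.
        target_size: Number of items in the result. Must be >= 0.
        seed: Seed for the random generator, so results are repeatable.

    Returns:
        A new list of length `target_size`.

    Example:
        >>> resize_sample(["a", "b", "c"], 2)
        ['a', 'b']
        >>> len(resize_sample(["a", "b", "c"], 5))
        5

    Notes:
        If `values` has at least `target_size` items, it is truncated.
        If it has fewer, it is padded with random picks (with
        replacement) from `values`.
    """
    if not values:
        raise ValueError("values must not be empty")
    if target_size < 0:
        raise ValueError("target_size must be >= 0")
    if len(values) >= target_size:
        # Truncate: keep the first target_size items.
        return list(values[:target_size])
    # Augment: append random picks until the target size is reached.
    rng = random.Random(seed)
    extra = [rng.choice(values) for _ in range(target_size - len(values))]
    return list(values) + extra
```

The Notes section is where special behavior belongs, such as the truncate-versus-augment rule here or the zero-variance case in a statistics function.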
Definition of Done
- remove_ascii() function
- string_standardization() function
- compute_stats() function
- adjust_sample() function
- character_length_based_stratified_sampling() function
Resources
Need Help?
Don't hesitate to ask questions! We're here to help you succeed.
- Comment below with your questions
- Join our Discord for real-time support
- Tag maintainers: @raphael-intugle (if specific help needed)
Skills You'll Use
Thank you for contributing to Intugle!
Tips for Success:
- Read each function carefully to understand what it does
- Test the functions in a Python shell to see their behavior
- Include concrete examples that show real use cases
- Have fun! 🎉