llm-feat

Automatically generate feature engineering code for pandas DataFrames using LLMs. The generated features are context-aware and target-specific, informed by the column descriptions and label definition you provide.

Installation

pip install llm-feat

Quick Start

import pandas as pd
import llm_feat

llm_feat.set_api_key("your-openai-api-key")  # or set OPENAI_API_KEY env var

# Your data
df = pd.DataFrame({
    'income': [50000, 60000, 70000],
    'expenses': [30000, 35000, 40000],
    'target': [1, 0, 1]
})

# Metadata describing your columns
metadata_df = pd.DataFrame({
    'column_name': ['income', 'expenses', 'target'],
    'description': ['Annual income', 'Annual expenses', 'Binary target'],
    'data_type': ['numeric', 'numeric', 'numeric'],
    'label_definition': [None, None, '1 if positive, 0 if negative']
})

# Generate features
code = llm_feat.generate_features(df, metadata_df, mode='code')
print(code)

Generated Code:

import numpy as np

df['income_to_expense_ratio'] = np.where(df['expenses'] != 0, df['income'] / df['expenses'], np.nan)
df['savings'] = df['income'] - df['expenses']
df['savings_to_income_ratio'] = np.where(df['income'] != 0, df['savings'] / df['income'], np.nan)
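In code mode the result is printed rather than applied, so the features still have to be run against the DataFrame. The sketch below shows one cautious way to do that, assuming the returned value is a plain string of Python statements that operate on a variable named `df` (as in the example above). The helper `apply_generated_code` is hypothetical and not part of llm-feat, and the `generated_code` string is a stand-in so the example runs without an API call. Because the code comes from an LLM, review it before executing it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "income": [50000, 60000, 70000],
    "expenses": [30000, 35000, 40000],
    "target": [1, 0, 1],
})

# Stand-in for the string returned by generate_features(..., mode='code').
generated_code = """
df['income_to_expense_ratio'] = np.where(df['expenses'] != 0, df['income'] / df['expenses'], np.nan)
df['savings'] = df['income'] - df['expenses']
"""


def apply_generated_code(frame, code):
    """Run generated code against a copy of `frame` and return the copy.

    Hypothetical helper: the original DataFrame is left untouched, so a bad
    snippet cannot corrupt it.
    """
    namespace = {"df": frame.copy(), "np": np, "pd": pd}
    exec(code, namespace)  # only run code you have read and trust
    return namespace["df"]


result = apply_generated_code(df, generated_code)
new_columns = [c for c in result.columns if c not in df.columns]
print(new_columns)
```

Working on a copy keeps the original `df` available for comparison, which is useful when deciding which generated features to keep.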

Feature Reports

Get detailed explanations of why each feature was generated:

code, report = llm_feat.generate_features(
    df, metadata_df, mode='code', return_report=True
)
print(report)

Example Report:

FEATURE REPORT
==============

1. DOMAIN UNDERSTANDING:
   - Problem: Predicting binary target based on income and expenses
   - Key relationships: Income-to-expense ratios indicate financial health

2. GENERATED FEATURES EXPLANATION:
   - Feature: income_to_expense_ratio
     Rationale: Higher ratios indicate better financial stability
     Domain Relevance: Directly related to predicting positive outcomes
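
If the report needs to be consumed programmatically (for example, to store each feature's rationale next to the feature in a catalogue), it can be parsed with a few lines of standard-library code. This is only a sketch: it assumes the report is a plain-text string laid out exactly like the example above, which may not hold for every model or version. `parse_feature_rationales` is a hypothetical helper, not part of llm-feat.

```python
import re

# Copy of the example report shown above, used as sample input.
report = """\
FEATURE REPORT
==============

1. DOMAIN UNDERSTANDING:
   - Problem: Predicting binary target based on income and expenses
   - Key relationships: Income-to-expense ratios indicate financial health

2. GENERATED FEATURES EXPLANATION:
   - Feature: income_to_expense_ratio
     Rationale: Higher ratios indicate better financial stability
     Domain Relevance: Directly related to predicting positive outcomes
"""


def parse_feature_rationales(text):
    """Map each feature name to its rationale line (hypothetical helper)."""
    rationales = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        feature = re.match(r"-\s*Feature:\s*(.+)", stripped)
        if feature:
            current = feature.group(1).strip()
            rationales[current] = ""
            continue
        rationale = re.match(r"Rationale:\s*(.+)", stripped)
        if rationale and current is not None:
            rationales[current] = rationale.group(1).strip()
    return rationales


rationales = parse_feature_rationales(report)
print(rationales)
```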

Direct Mode

Add features directly to your DataFrame:

df_with_features = llm_feat.generate_features(
    df, metadata_df, mode='direct', model='gpt-4o-mini'
)
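
After a direct-mode call it is worth checking which columns were actually added and whether they contain missing values. The sketch below does this with plain pandas. The `df_with_features` frame is built by hand as a stand-in for the real return value, and `summarize_new_features` is a hypothetical helper. Whether direct mode also modifies the input `df` in place is not stated here, so the sketch records the original column names before the call instead of relying on `df` afterwards.

```python
import pandas as pd

df = pd.DataFrame({
    "income": [50000, 60000, 70000],
    "expenses": [30000, 35000, 40000],
    "target": [1, 0, 1],
})

# Record the column names before calling generate_features(..., mode='direct').
original_columns = list(df.columns)

# Stand-in for the DataFrame returned by direct mode (assumption for this sketch).
df_with_features = df.copy()
df_with_features["savings"] = df["income"] - df["expenses"]
df_with_features["savings_to_income_ratio"] = df_with_features["savings"] / df["income"]


def summarize_new_features(original_cols, augmented):
    """List added columns with their dtype and missing-value count."""
    added = [c for c in augmented.columns if c not in original_cols]
    return pd.DataFrame({
        "dtype": augmented[added].dtypes.astype(str),
        "n_missing": augmented[added].isna().sum(),
    })


summary = summarize_new_features(original_columns, df_with_features)
print(summary)
```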

Model Performance

See the impact of automated feature engineering on model accuracy:

Example (Diabetes Dataset): compare results with and without generated features in the model performance comparison notebook.
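
A comparison of this kind can be sketched as follows. It does not reproduce the notebook: the data is synthetic, the engineered column is written by hand instead of coming from llm-feat, and the scikit-learn model and 5-fold cross-validation are assumptions chosen for illustration. The point is the procedure (score the same model on the raw columns, then on raw plus engineered columns), not the numbers it prints.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only so the sketch is self-contained.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "income": rng.uniform(30000, 100000, n),
    "expenses": rng.uniform(20000, 90000, n),
})
noise = rng.normal(0, 0.1, n)
data["target"] = ((data["income"] / data["expenses"] + noise) > 1.2).astype(int)

# Hand-written stand-in for a generated feature.
data["income_to_expense_ratio"] = data["income"] / data["expenses"]


def mean_cv_accuracy(frame, feature_columns, target_column="target"):
    """Mean 5-fold cross-validated accuracy for the given feature set."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(
        model, frame[feature_columns], frame[target_column],
        cv=5, scoring="accuracy",
    )
    return scores.mean()


baseline = mean_cv_accuracy(data, ["income", "expenses"])
engineered = mean_cv_accuracy(
    data, ["income", "expenses", "income_to_expense_ratio"]
)
print(f"baseline accuracy:   {baseline:.3f}")
print(f"engineered accuracy: {engineered:.3f}")
```

Using the same model, folds, and metric for both runs keeps the comparison fair. Any difference on your own data should be checked on a held-out set before drawing conclusions.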

Screenshots

Auto Feature Generation in Jupyter

Feature code is generated right where you need it.

Feature Report Example

Explains domain insights and feature logic clearly.

Key Features

Context-aware: Uses column descriptions to generate relevant features
Target-aware: Generates features specific to your prediction task
Categorical support: Automatic encoding for categorical columns
Jupyter integration: Code auto-injected into next cell
Feature reports: Understand the reasoning behind each feature
Performance boost: Domain-relevant features can improve model accuracy (see the model performance comparison notebook)
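
Since the context-aware and target-aware behaviour depends on the metadata table, it can help to sanity-check that table before calling `generate_features`. The sketch below is a hypothetical pre-flight check, not part of llm-feat. It assumes the four metadata columns from the Quick Start are all expected, and that exactly one row carries a `label_definition`. Neither rule is confirmed by the library's documentation here.

```python
import pandas as pd

# Assumed schema, taken from the Quick Start example.
REQUIRED_METADATA_COLUMNS = [
    "column_name", "description", "data_type", "label_definition",
]


def check_metadata(df, metadata_df):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    missing_fields = [
        c for c in REQUIRED_METADATA_COLUMNS if c not in metadata_df.columns
    ]
    if missing_fields:
        return [f"metadata is missing fields: {missing_fields}"]

    described = set(metadata_df["column_name"])
    undescribed = [c for c in df.columns if c not in described]
    if undescribed:
        problems.append(f"columns without metadata: {undescribed}")

    unknown = sorted(described - set(df.columns))
    if unknown:
        problems.append(f"metadata rows for unknown columns: {unknown}")

    # Assumption: exactly one row (the target) has a label definition.
    n_targets = int(metadata_df["label_definition"].notna().sum())
    if n_targets != 1:
        problems.append(
            f"expected exactly one row with a label_definition, found {n_targets}"
        )
    return problems


df = pd.DataFrame({
    "income": [50000, 60000, 70000],
    "expenses": [30000, 35000, 40000],
    "target": [1, 0, 1],
})
metadata_df = pd.DataFrame({
    "column_name": ["income", "expenses", "target"],
    "description": ["Annual income", "Annual expenses", "Binary target"],
    "data_type": ["numeric", "numeric", "numeric"],
    "label_definition": [None, None, "1 if positive, 0 if negative"],
})

ok_problems = check_metadata(df, metadata_df)
bad_problems = check_metadata(df, metadata_df.iloc[:2])  # target row dropped
print(ok_problems)
print(bad_problems)
```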

Documentation

API Reference - Complete parameter documentation and examples
Basic Examples - Jupyter notebook examples
Model Performance Comparison - See how features improve model accuracy
Changelog - Version history

Development

git clone https://github.com/codeastra2/llm-feat.git
cd llm-feat
conda create -n llm_feat_310 python=3.10 -y
conda activate llm_feat_310
poetry install
poetry run pytest
