Skip to content

NashC/asset-price-database

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Asset Price Database

A PostgreSQL data warehouse for centralizing daily OHLCV (Open, High, Low, Close, Volume) data across equities, ETFs, and cryptocurrencies. Currently holds 10,920+ symbols with 99.9% coverage across NYSE, NASDAQ, AMEX, and TSX.

Architecture

CSV / API sources → Staging tables → QC scoring → Raw tables → Gold views → REST API

Raw data can't be trusted — CSV files from different providers have inconsistent formats, missing fields, and duplicate rows. The staging layer ingests everything without validation, then the QC step scores each record (0-100) for completeness and consistency. Records passing QC are promoted to raw tables with full data lineage (source file, load timestamp, quality score). Materialized gold views then deduplicate across sources and serve as the query layer for analytics and the REST API.

Exchange Symbols Coverage
NASDAQ 4,540 41.6%
AMEX 3,586 32.8%
NYSE 2,422 22.2%
TSX 371 3.4%

Quick Start

git clone https://github.com/NashC/asset-price-database.git
cd asset-price-database

# Start PostgreSQL
docker-compose up -d

# Setup
uv venv --python 3.11 && source .venv/bin/activate
uv pip install -e .
alembic upgrade head
psql -h localhost -U assetuser -d assetpricedb -f db/seeds/data_source_seed.sql

Loading data

# Single file
asset-price-db load data/samples/AAPL.csv --symbol AAPL --asset-type STOCK --exchange NASDAQ

# Bulk load (production — processes 8,400+ CSV files)
python bulk_load_yahoo_data_optimized.py

# Check status
asset-price-db status --view price_gold

Querying

from app.db_client import StockDB
from datetime import date

db = StockDB()
prices = db.prices(['AAPL', 'MSFT'], start=date(2024, 1, 1), end=date(2024, 1, 31))
nasdaq = db.get_available_symbols(asset_type='STOCK', exchange='NASDAQ')
# REST API
uvicorn app.fastapi_server:app --reload
curl "http://localhost:8000/prices?symbols=AAPL&start_date=2024-01-01&end_date=2024-01-31"
curl "http://localhost:8000/symbols?asset_type=STOCK&exchange=NASDAQ"

Project Structure

etl/           # Pipeline: CLI, staging, QC scoring, loaders, gold view refresh
app/           # Python DB client, FastAPI server, utilities
db/ddl/        # Table, view, and index DDL scripts
db/seeds/      # Seed data (exchanges, data sources)
tests/         # Unit and integration tests (testcontainers for Postgres)

Tech Stack

Python 3.11+, PostgreSQL 15, SQLAlchemy 2.0, Alembic, FastAPI, pandas, Click (CLI). Docker Compose for local development with PGAdmin.

License

MIT

About

PostgreSQL data warehouse for daily OHLCV data across 10,900+ equity, ETF, and crypto symbols

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors