DanielLin9406/worker-financialReportScreenr

Financial Report Screener & Analyzer

A data pipeline that automates the extraction, transformation, and analysis of SEC financial reports. It uses established Python design patterns to process financial data and writes investment summaries directly to Google Sheets.

Dashboard Preview

🚀 Features

  • Automated Data Pipeline: Scrapes and processes SEC financial reports without manual steps.
  • Company Screener:
    • Ingests financial report files (Income Statement, Balance Sheet, Cash Flow) from local storage.
    • Generates custom analysis reports on Google Sheets.
  • Advanced Financial Analysis:
    • Calculates key valuation metrics including DCF, Graham Number, and DDM.
    • Provides Buy/Sell decision support based on configurable strategies.
  • Google Sheets Integration:
    • Automatically updates a central "Stock" spreadsheet with new analysis data.
    • Fetches existing portfolio data for context-aware recommendations.
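
The valuation metrics named above follow well-known textbook formulas. The sketch below shows one plausible form of each. The function names and signatures are hypothetical and are not taken from this repository, and the DCF variant omits a terminal value for brevity.

```python
import math
from typing import Optional, Sequence


def graham_number(eps: float, book_value_per_share: float) -> Optional[float]:
    """Graham Number: sqrt(22.5 * EPS * book value per share).

    Returns None when either input is not positive, since the
    square root is then undefined or meaningless.
    """
    if eps <= 0 or book_value_per_share <= 0:
        return None
    return math.sqrt(22.5 * eps * book_value_per_share)


def gordon_growth_value(next_dividend: float, discount_rate: float,
                        growth_rate: float) -> float:
    """Constant-growth DDM: next year's dividend / (discount rate - growth rate)."""
    if discount_rate <= growth_rate:
        raise ValueError("discount_rate must exceed growth_rate")
    return next_dividend / (discount_rate - growth_rate)


def discounted_cash_flow(cash_flows: Sequence[float], discount_rate: float) -> float:
    """Present value of yearly cash flows, the first one arriving after one year.

    Simplified: no terminal value is added.
    """
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))
```

For example, an EPS of 4.0 and a book value per share of 10.0 give a Graham Number of 30.0, because 22.5 × 4 × 10 = 900.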

🏗 Technical Architecture & Design Highlights

This project implements an ETL (Extract, Transform, Load) architecture and applies established design patterns so that each stage stays modular and can be extended independently.

ETL Architecture

ETL Process Overview

  1. Extract (Data Ingestion)

    • Responsibility: Ingests raw financial data from hybrid sources (local Excel files & external APIs).
    • Design Patterns:
      • Template Method: Defines the standard data loading skeleton in InputTemplate.
      • Mediator: APIMediator coordinates complex interactions between different API services.
      • Command: Encapsulates API requests into objects, decoupling execution logic.
  2. Transform (Data Processing & Analysis)

    • Responsibility: Cleans raw data, calculates financial indicators, and executes valuation models.
    • Design Patterns:
      • Chain of Responsibility: PipelineHandler creates a processing pipeline for data cleaning (e.g., stripping whitespace -> type conversion).
      • Abstract Factory: TableAbstractFactory provides an interface to create families of related financial tables (Price, Score, Decision).
      • Builder: Constructs complex analysis objects step-by-step (e.g., ParsTableBuilder).
      • Strategy: Encapsulates interchangeable algorithms for scoring and buy/sell decisions (ScoreTableStrategy, BuyDecisionTableStrategy).
  3. Load (Data Output)

    • Responsibility: Dispatches analysis results to destination systems.
    • Design Patterns:
      • Observer: OutputSubject notifies subscribed observers (Google Sheets, Databases) when new analysis data is ready, decoupling the analysis engine from the reporting layer.
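
To make the Transform and Load stages concrete, here is a minimal sketch of a Chain of Responsibility cleaning pipeline (strip whitespace, then convert types) feeding an Observer-style output. Only the name OutputSubject comes from the list above. Every other class, method, and field name is an assumption made for illustration, and the repository's real interfaces may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Row = Dict[str, object]


class CleaningHandler(ABC):
    """One link in the cleaning chain; hands its result to the next link."""

    def __init__(self) -> None:
        self._next: Optional["CleaningHandler"] = None

    def set_next(self, handler: "CleaningHandler") -> "CleaningHandler":
        self._next = handler
        return handler  # lets callers write a.set_next(b).set_next(c)

    def handle(self, row: Row) -> Row:
        row = self.process(row)
        return self._next.handle(row) if self._next else row

    @abstractmethod
    def process(self, row: Row) -> Row: ...


class StripWhitespace(CleaningHandler):
    def process(self, row: Row) -> Row:
        return {k.strip(): (v.strip() if isinstance(v, str) else v)
                for k, v in row.items()}


class ConvertNumbers(CleaningHandler):
    """Turns numeric-looking strings into floats.

    A real pipeline would likely convert only known numeric columns.
    """

    def process(self, row: Row) -> Row:
        converted: Row = {}
        for key, value in row.items():
            converted[key] = value
            if isinstance(value, str):
                try:
                    converted[key] = float(value.replace(",", ""))
                except ValueError:
                    pass  # not a number; keep the original string
        return converted


class OutputObserver(ABC):
    @abstractmethod
    def update(self, table: List[Row]) -> None: ...


class OutputSubject:
    """Notifies every attached observer when a finished table is ready."""

    def __init__(self) -> None:
        self._observers: List[OutputObserver] = []

    def attach(self, observer: OutputObserver) -> None:
        self._observers.append(observer)

    def notify(self, table: List[Row]) -> None:
        for observer in self._observers:
            observer.update(table)


class InMemoryObserver(OutputObserver):
    """Stand-in for a Google Sheets or database writer."""

    def __init__(self) -> None:
        self.received: List[List[Row]] = []

    def update(self, table: List[Row]) -> None:
        self.received.append(table)


# Usage: build the chain, clean one row, then publish the result.
chain = StripWhitespace()
chain.set_next(ConvertNumbers())
cleaned = [chain.handle({" Revenue ": " 1,200 ", "Ticker": " ABC "})]

subject = OutputSubject()
sink = InMemoryObserver()
subject.attach(sink)
subject.notify(cleaned)
```

Because the subject only knows the OutputObserver interface, a new destination can be added by attaching another observer without touching the cleaning or analysis code.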

Technical Highlights

  • Robust Data Handling: Uses pandas and NumPy for vectorized manipulation of large financial datasets.
  • Environment Management: Uses python-dotenv to load API keys from a .env file and venv to isolate the development environment.
  • Extensible Architecture: Abstract Base Classes (ABCs) define the interfaces, so a new indicator or data source can be added by implementing the relevant interface.
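
As an illustration of the ABC-based extension point, the sketch below defines a hypothetical Indicator interface and two implementations. The class names, the compute method, and the dictionary keys are assumptions for this example, not the repository's actual API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class Indicator(ABC):
    """Hypothetical interface: each indicator has a name and a compute step."""

    name: str

    @abstractmethod
    def compute(self, financials: Dict[str, float]) -> float: ...


class DebtToEquity(Indicator):
    name = "debt_to_equity"

    def compute(self, financials: Dict[str, float]) -> float:
        return financials["total_liabilities"] / financials["shareholder_equity"]


class NetMargin(Indicator):
    name = "net_margin"

    def compute(self, financials: Dict[str, float]) -> float:
        return financials["net_income"] / financials["revenue"]


def run_indicators(indicators: Iterable[Indicator],
                   financials: Dict[str, float]) -> Dict[str, float]:
    """Evaluate every indicator against the same set of figures."""
    return {ind.name: ind.compute(financials) for ind in indicators}
```

Adding a third indicator means writing one more subclass and passing an instance to run_indicators. Instantiating Indicator directly raises TypeError, which catches incomplete implementations early.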

🛠 Getting Started

Prerequisites

  • Python 3.8 or higher
  • Google Cloud Platform Service Account (creds.json)
  • API Keys for data sources (Quandl, AlphaVantage)

Configuration

  1. Google Sheets Credentials: Place your Google Service Account JSON key file in the project root and rename it to creds.json.

    • Ensure the service account has access to a Google Sheet named "Stock".
  2. Environment Variables: Create a .env file in the project root with your API keys:

    QUANDL_API_KEY=your_quandl_key
    ALPHA_API_KEY=your_alphavantage_key
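
A minimal sketch of how these keys might be read and validated at startup. The helper names are hypothetical. It assumes the .env file has already been loaded into the process environment, and it falls back to plain environment variables if python-dotenv is not installed.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_KEYS = ("QUANDL_API_KEY", "ALPHA_API_KEY")


def load_dotenv_if_available() -> None:
    """Load .env from the working directory when python-dotenv is installed."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return  # rely on variables already set in the environment
    load_dotenv()


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required API keys, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Calling load_dotenv_if_available() and then load_api_keys() at the start of a run surfaces a missing key immediately instead of partway through an API request.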

Installation

  1. Create and Activate Virtual Environment (Recommended)

    # Create virtual environment
    python3 -m venv venv
    
    # Activate virtual environment (macOS/Linux)
    source venv/bin/activate
    
    # Activate virtual environment (Windows)
    # venv\Scripts\activate

  2. Install Project

    # Install project and dependencies in editable mode
    pip install -e .

💻 Usage

Once installed, run the main analysis pipeline from the command line:

    financial-report

The system will:

  1. Load local financial reports from ~/FinancialData/{ticker}.
  2. Fetch supplementary market data via APIs.
  3. Execute the analysis pipeline.
  4. Upload the results to the configured Google Sheet.
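
Step 1 implies a per-ticker directory layout under ~/FinancialData. The sketch below shows one way such a directory could be resolved and scanned. The function names and the .xlsx extension are assumptions, and the repository's own loader may work differently.

```python
from pathlib import Path
from typing import List, Optional


def report_dir(ticker: str, base: Optional[Path] = None) -> Path:
    """Return the directory holding one ticker's reports.

    Defaults to ~/FinancialData/{ticker}.
    """
    base = base if base is not None else Path.home() / "FinancialData"
    return base / ticker


def find_report_files(ticker: str, base: Optional[Path] = None) -> List[Path]:
    """List the Excel files for a ticker (the .xlsx extension is assumed)."""
    directory = report_dir(ticker, base)
    if not directory.is_dir():
        raise FileNotFoundError(f"No report directory for {ticker}: {directory}")
    return sorted(directory.glob("*.xlsx"))
```

Passing base explicitly keeps the lookup testable without touching the real home directory.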

📂 Project Structure

.
├── src/
│   └── company_screener/     # Main package
│       ├── API/              # Data fetching layer (Command Pattern)
│       ├── Config/           # Configuration management
│       ├── CreateTables/     # Table generation logic (Factory Pattern)
│       ├── Input/            # Data ingestion strategies
│       ├── Output/           # Data export handlers (Observer Pattern)
│       ├── Worker/           # Utility workers and loggers
│       ├── main.py           # Application entry point
│       └── mainFactory.py    # Dependency Injection root
├── pyproject.toml            # Project configuration & dependencies
├── requirements.txt          # Legacy dependency file
└── README.md                 # Documentation

📄 License

This project is licensed under the MIT License - see the pyproject.toml file for details.
