Merged
16 changes: 10 additions & 6 deletions .github/workflows/main.yaml
@@ -19,10 +19,10 @@ env:
   POETRY_VERSION: "1.8.5"
   POETRY_VIRTUALENVS_IN_PROJECT: "true"
   POETRY_CACHE_DIR: "/home/runner/.cache/poetry"
-  DBT_PROJECT_DIR: "${{ github.workspace }}/airflow_lappis/dags/dbt/ipea"
+  DBT_PROJECT_DIR: "${{ github.workspace }}/dbt/minc"

-  IMAGE_REGISTRY_OWNER: govhub-br
-  IMAGE_NAME: ghcr.io/govhub-br/airflow-ipea
+  IMAGE_REGISTRY: ghcr.io
+  IMAGE_NAME: ghcr.io/govhub-br/data-application-minc-airflow
   IMAGE_TAG_SHA: ${{ github.sha }}

 jobs:
@@ -74,11 +74,13 @@ jobs:
       - uses: actions/checkout@v4
       - uses: docker/setup-buildx-action@v3

-      - name: Build
+      - name: Build Airflow
         uses: docker/build-push-action@v5
         with:
           push: false
           context: .
+          file: ./infra/docker/airflow/Dockerfile
+          target: airflow-prod
           cache-from: type=gha
           cache-to: type=gha,mode=max
           tags: |
@@ -99,15 +101,17 @@ jobs:
       - name: Login GHCR
         uses: docker/login-action@v3
         with:
-          registry: ghcr.io
+          registry: ${{ env.IMAGE_REGISTRY }}
           username: ${{ github.actor }}
           password: ${{ secrets.GITHUB_TOKEN }}

-      - name: Build & Push
+      - name: Build & Push Airflow
         uses: docker/build-push-action@v5
         with:
           push: true
           context: .
+          file: ./infra/docker/airflow/Dockerfile
+          target: airflow-prod
           cache-from: type=gha
           cache-to: type=gha,mode=max
           tags: |
1 change: 0 additions & 1 deletion .gitignore
@@ -56,5 +56,4 @@ Thumbs.db

# Airflow local files
airflow.db
airflow.cfg
webserver_config.py
40 changes: 0 additions & 40 deletions Dockerfile

This file was deleted.

30 changes: 23 additions & 7 deletions Makefile
@@ -1,5 +1,9 @@
-export PYTHONPATH := $(CURDIR)/airflow_lappis
-export MYPYPATH := $(CURDIR):$(CURDIR)/airflow_lappis/dags:$(CURDIR)/airflow_lappis/helpers:$(CURDIR)/airflow_lappis/plugins
+COMPOSE_FILE := infra/docker-compose.yml
+
+export PYTHONPATH := $(CURDIR)/dags:$(CURDIR)/plugins:$(CURDIR)/helpers
+export MYPYPATH := $(CURDIR):$(CURDIR)/dags:$(CURDIR)/helpers:$(CURDIR)/plugins
+
+.PHONY: setup format lint lint-ci test compose-config up down logs-airflow

 setup:
 	pip install poetry==1.8.5
@@ -13,18 +17,30 @@ setup:
 format:
 	poetry run black .
 	poetry run ruff check --fix .
-	poetry run sqlfmt ./airflow_lappis/dags/dbt
+	poetry run sqlfmt ./dbt

 lint:
 	poetry run black . --check
 	poetry run ruff check .
 	poetry run mypy . --explicit-package-bases --install-types --non-interactive
-	poetry run sqlfmt ./airflow_lappis/dags/dbt --check
-	[ "${GITLAB_CI}" ] || poetry run sqlfluff lint ./airflow_lappis/dags/dbt
+	poetry run sqlfmt ./dbt --check
+	[ "${GITLAB_CI}" ] || poetry run sqlfluff lint ./dbt

 lint-ci:
-	poetry run sqlfmt ./airflow_lappis/dags/dbt --check
-	poetry run sqlfluff lint ./airflow_lappis/dags/dbt --config .sqlfluff.ci --ignore templating
+	poetry run sqlfmt ./dbt --check
+	poetry run sqlfluff lint ./dbt --config .sqlfluff.ci --ignore templating

 test:
 	poetry run pytest tests
+
+compose-config:
+	docker compose -f $(COMPOSE_FILE) config
+
+up:
+	docker compose -f $(COMPOSE_FILE) up postgres airflow airflow-mcp
+
+down:
+	docker compose -f $(COMPOSE_FILE) down
+
+logs-airflow:
+	docker compose -f $(COMPOSE_FILE) logs airflow --tail=200
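
The new `PYTHONPATH` export in the Makefile puts `dags/`, `plugins/` and `helpers/` on the import path, in that order. For code that runs outside `make` (a test bootstrap, for instance), the same arrangement could be sketched in Python roughly as follows. `add_source_dirs` is a hypothetical helper used only for illustration; it is not part of this PR.

```python
import sys
from pathlib import Path

# Directory names taken from the PYTHONPATH line in the Makefile.
SOURCE_DIRS = ("dags", "plugins", "helpers")


def add_source_dirs(root: Path) -> list[str]:
    """Prepend root/dags, root/plugins and root/helpers to sys.path.

    Hypothetical helper: returns only the entries that were actually added,
    so calling it twice does not duplicate paths.
    """
    paths = [str(root / name) for name in SOURCE_DIRS]
    new_paths = [p for p in paths if p not in sys.path]
    # Slice assignment keeps the Makefile's ordering at the front of sys.path.
    sys.path[:0] = new_paths
    return new_paths
```

Calling `add_source_dirs(Path.cwd())` from the repository root would make modules under those three folders importable by their top-level names, which is what the Makefile export appears to aim for.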
124 changes: 49 additions & 75 deletions README.md
@@ -32,104 +32,79 @@ This work is maintained by [Lab Livre](https://www.instagram.com/lab.livre/)
 For questions, suggestions, or to contribute to the project, contact us: [lablivreunb@gmail.com](mailto:lablivreunb@gmail.com)


-# Data Pipeline Project
+# Data Application MinC

-This project implements a modern data stack using Airflow, dbt, Jupyter, and Superset for data orchestration, transformation, analysis, and visualization.
+This repository organizes the data application around Airflow and dbt. The
+repository root contains the code executed by Airflow; the `infra/` folder holds
+Docker, Compose, and supporting files for the local environment.

-## 🚀 Stack Components
+## Stack

-- **Apache Airflow**: Workflow orchestration
-- **dbt**: Data transformation
-- **Jupyter**: Interactive data analysis
-- **Apache Superset**: Data visualization and exploration
-- **Docker**: Containerization and local development
-- **Make**: Build automation and setup
+- **Apache Airflow**: pipeline orchestration
+- **dbt**: data transformation
+- **PostgreSQL**: local database for development
+- **Docker Compose**: local execution of the services
+- **Make**: automation of development commands

-## 📋 Prerequisites
+## Structure

-- Docker and Docker Compose
-- Make
-- Python 3.x
-- Git
-
-## 🔧 Setup
-
-1. Clone the repository:
-```bash
-git clone git@gitlab.com:lappis-unb/gest-odadosipea/app-lappis-ipea.git
-cd app-lappis-ipea
+```text
+.
+├── dags/               # DAGs loaded by Airflow
+│   ├── data_ingest/
+│   ├── dashboards/
+│   └── dbt/            # Cosmos DAGs that run the dbt projects
+├── dbt/                # dbt projects kept outside the DAG parser
+│   ├── ipea/
+│   └── mir/
+├── helpers/            # Utilities imported by the DAGs
+├── plugins/            # Clients and extensions used by Airflow
+├── templates/          # Jinja/XML templates used by the clients
+├── infra/              # Docker, Compose, Airflow config, and database init
+├── tests/
+├── Makefile
+├── pyproject.toml
+└── requirements.txt
 ```

-2. Run the setup using Make:
+## Setup
+
 ```bash
 make setup
 ```

-This will:
-- Create necessary virtual environments
-- Install dependencies
-- Set up pre-commit hooks
-- Configure development environment
+To use Docker Compose, keep a `.env` file in the project root. An example of the
+expected variables is in `infra/env/.env.example`.

-## 🏃‍♂️ Running Locally
-
-Start all services using Docker Compose:
+## Running Locally

 ```bash
-docker-compose up -d
+make up
 ```

-Access the different components:
-- Airflow: http://localhost:8080
-- Jupyter: http://localhost:8888
-- Superset: http://localhost:8088
-
-## 💻 Development
+Main services:

-### Code Quality
+- Airflow: http://localhost:8080
+- Airflow MCP: http://localhost:8000
+- PostgreSQL: localhost:5432

-This project uses several tools to maintain code quality:
-- Pre-commit hooks
-- Linting configurations
-- Automated testing
+Useful commands:

-Run linting checks:
 ```bash
-make lint
+make compose-config
+make logs-airflow
+make down
 ```

-Run tests:
+## Development
+
 ```bash
+make format
+make lint
 make test
 ```

-### Project Structure
-
-```
-.
-├── airflow/
-│   ├── dags/
-│   └── plugins/
-├── dbt/
-│   └── models/
-├── jupyter/
-│   └── notebooks/
-├── superset/
-│   └── dashboards/
-├── docker-compose.yml
-├── Makefile
-└── README.md
-```
-
-### Makefile Commands
-
-- `make setup`: Initial project setup
-- `make lint`: Run linting checks
-- `make tests`: Run test suite
-- `make clean`: Clean up generated files
-- `make build`: Build Docker images
-
-## 🔐 Git Workflow
+## Git Workflow

 This project requires signed commits. To set up GPG signing:

@@ -146,13 +121,12 @@ git config --global commit.gpgsign true

 3. Add your GPG key to your GitLab account

-## 📚 Documentation
+## Documentation

 - [Airflow Documentation](https://airflow.apache.org/docs/)
 - [dbt Documentation](https://docs.getdbt.com/)
-- [Superset Documentation](https://superset.apache.org/docs/intro)

-## 🤝 Contributing
+## Contributing

 1. Create a new branch for your feature
 2. Make changes and ensure all tests pass
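
The README change in this PR asks for a `.env` file in the project root and points to `infra/env/.env.example` for the expected variables. One way to check that a local `.env` covers everything the example lists is sketched below. The functions `env_keys` and `missing_keys` are hypothetical names chosen for this illustration, and the parser assumes plain `KEY=VALUE` lines (optionally prefixed with `export`), which may be narrower than what Docker Compose itself accepts.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Return the variable names defined in simple KEY=VALUE env-file text."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example_text: str, actual_text: str) -> list[str]:
    """List variables present in the example but absent from the real file."""
    return sorted(env_keys(example_text) - env_keys(actual_text))


if __name__ == "__main__":
    # Paths as named in the README; the check is skipped if either is absent.
    example, actual = Path("infra/env/.env.example"), Path(".env")
    if example.exists() and actual.exists():
        print(missing_keys(example.read_text(), actual.read_text()))
```

Run from the repository root, the script would print the names that still need to be added to `.env`, without revealing any values.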
3 changes: 0 additions & 3 deletions airflow_lappis/dags/dashboards/__init__.py

This file was deleted.
