EmpatDevelopment/microservices-template

Microservices R&D — Project Documentation

A production-grade NestJS microservice architecture featuring 4 services communicating over RabbitMQ, with PostgreSQL per-service schemas, resilience patterns (timeouts, retries with exponential backoff, circuit breakers), idempotency keys, distributed tracing via OpenTelemetry + Jaeger, internationalization, and comprehensive integration tests. Deployed on AWS ECS Fargate with full Terraform IaC, Bitbucket Pipelines CI/CD, K6 load testing, and Grafana Cloud observability.


System Architecture

graph TD
    Client["Client / Browser"]

    subgraph Application Services
        GW["API Gateway<br/>:3000<br/>(HTTP REST + Swagger)"]
        OS["Order Service<br/>(RMQ Consumer)"]
        IS["Inventory Service<br/>(RMQ Consumer)"]
        PS["Payment Service<br/>(RMQ Consumer)"]
    end

    subgraph Infrastructure
        RMQ["RabbitMQ<br/>:5672 / :15672"]
        PG["PostgreSQL 16<br/>:5432"]
        JG["Jaeger v2<br/>:16686 / :4318"]
        PGA["pgAdmin<br/>:5050"]
    end

    subgraph Database Schemas
        OS_DB["order_schema"]
        IS_DB["inventory_schema"]
        PS_DB["payment_schema"]
    end

    Client -->|HTTP| GW
    GW -->|"RPC (order_queue)"| RMQ
    GW -->|"RPC (inventory_queue)"| RMQ
    GW -->|"RPC (payment_queue)"| RMQ
    RMQ --> OS
    RMQ --> IS
    RMQ --> PS
    OS --> OS_DB
    IS --> IS_DB
    PS --> PS_DB
    OS_DB --> PG
    IS_DB --> PG
    PS_DB --> PG
    PGA -->|Admin| PG

    GW -.->|OTLP traces| JG
    OS -.->|OTLP traces| JG
    IS -.->|OTLP traces| JG
    PS -.->|OTLP traces| JG

Project Structure

microservice-r-d/
├── apps/
│   ├── api-gateway/
│   │   └── src/
│   │       ├── main.ts                        # HTTP app bootstrap + tracing init
│   │       ├── gateway.module.ts              # Imports RMQ clients + ResilienceModule
│   │       └── controllers/
│   │           ├── health.controller.ts       # GET /health, GET /health/circuits
│   │           ├── order.controller.ts        # /api/orders endpoints
│   │           ├── inventory.controller.ts    # /api/products endpoints
│   │           └── payment.controller.ts      # /api/payments endpoints
│   │
│   ├── order-service/
│   │   └── src/
│   │       ├── main.ts                        # Microservice bootstrap + tracing
│   │       ├── order.module.ts                # TypeORM + I18n config
│   │       ├── data-source.ts                 # TypeORM DataSource for migrations
│   │       ├── order/
│   │       │   ├── order.entity.ts            # Order entity (order_schema)
│   │       │   ├── order.service.ts           # CRUD + idempotency logic
│   │       │   └── order.controller.ts        # @MessagePattern handlers
│   │       └── migrations/
│   │
│   ├── inventory-service/
│   │   └── src/
│   │       ├── main.ts
│   │       ├── inventory.module.ts
│   │       ├── data-source.ts
│   │       ├── inventory/
│   │       │   ├── product.entity.ts          # Product entity (inventory_schema)
│   │       │   ├── stock-reservation.entity.ts # StockReservation entity
│   │       │   ├── inventory.service.ts       # CRUD + SKU dedup logic
│   │       │   └── inventory.controller.ts
│   │       └── migrations/
│   │
│   └── payment-service/
│       └── src/
│           ├── main.ts
│           ├── payment.module.ts
│           ├── data-source.ts
│           ├── payment/
│           │   ├── payment.entity.ts          # Payment entity (payment_schema)
│           │   ├── payment.service.ts         # CRUD + idempotency logic
│           │   └── payment.controller.ts
│           └── migrations/
│
├── libs/
│   └── shared/
│       └── src/
│           ├── index.ts                       # Barrel exports for @shared
│           ├── config/
│           │   ├── env.config.ts              # Environment variable loader
│           │   ├── database.config.ts         # getDatabaseConfig(schema)
│           │   └── rmq.config.ts              # RMQ client/server config factories
│           ├── constants/
│           │   └── rmq.constants.ts           # Service names, queues, message patterns
│           ├── dto/
│           │   ├── create-order.dto.ts        # CreateOrderDto + OrderItemDto
│           │   ├── create-product.dto.ts      # CreateProductDto
│           │   └── create-payment.dto.ts      # CreatePaymentDto
│           ├── filters/
│           │   ├── all-exceptions.filter.ts   # Global exception handler
│           │   └── i18n-validation.filter.ts  # i18n validation error formatter
│           ├── interceptors/
│           │   └── success-response.interceptor.ts  # {success: true, data} wrapper
│           ├── decorators/
│           │   ├── skip-interceptor.decorator.ts    # @SkipInterceptor()
│           │   └── api-page-response.decorator.ts   # Swagger pagination decorator
│           ├── pagination/
│           │   └── pagination.dto.ts          # PageDto<T> + PageMetaDto
│           ├── resilience/
│           │   ├── circuit-breaker.service.ts # Opossum CB wrapper
│           │   ├── resilience.module.ts       # Global NestJS module
│           │   └── index.ts
│           ├── tracing/
│           │   ├── tracing.ts                 # OpenTelemetry SDK init
│           │   └── index.ts
│           ├── utils/
│           │   ├── rpc-error.util.ts          # handleRpcResponse() with timeout/retry/CB
│           │   └── rpc-options.ts             # RPC_READ_OPTIONS / RPC_WRITE_OPTIONS
│           └── i18n/
│               ├── en/
│               │   ├── translations.json
│               │   └── inventory.json
│               └── de/
│                   ├── translations.json
│                   └── inventory.json
│
├── test/
│   └── integration/
│       ├── setup.ts                           # Test app bootstrap
│       ├── order.spec.ts
│       ├── inventory.spec.ts
│       ├── payment.spec.ts
│       └── resilience.spec.ts
│
├── docker-compose.yml                         # Dev mode (hot-reload)
├── Dockerfile                                 # Production multi-stage build
├── Dockerfile.dev                             # Dev image (node + nest CLI)
├── init-schemas.sql                           # Creates 3 PostgreSQL schemas
├── nest-cli.json                              # NestJS monorepo config
├── package.json
├── tsconfig.json
├── jest.integration.config.ts
├── .env.example
└── .env

Technology Stack

| Category | Technology | Version | Purpose |
| --- | --- | --- | --- |
| Runtime | Node.js | 22 (Alpine) | JavaScript runtime |
| Language | TypeScript | ^5.1 | Type-safe development |
| Framework | NestJS | ^10.0 | Application framework (monorepo mode) |
| ORM | TypeORM | ^0.3.20 | Database access + migrations |
| Database | PostgreSQL | 16 | Relational data storage |
| Message Broker | RabbitMQ | 3 (management) | Inter-service RPC communication |
| Tracing | OpenTelemetry + Jaeger | SDK 0.212 / Jaeger v2 | Distributed tracing |
| Circuit Breaker | Opossum | ^8.5.0 | Failure protection |
| Validation | class-validator + class-transformer | ^0.14 / ^0.5 | DTO validation |
| API Docs | @nestjs/swagger | ^7.4.0 | Swagger / OpenAPI |
| i18n | nestjs-i18n | ^10.5.1 | Internationalization (en, de) |
| Testing | Jest + Supertest | ^30.2 / ^7.2 | Integration testing |
| Container | Docker Compose | - | Orchestration |
| DB Admin | pgAdmin 4 | latest | PostgreSQL web UI |

Services Overview

API Gateway

  • Role: HTTP entry point, routes requests to microservices via RabbitMQ RPC
  • Port: 3000 (configurable)
  • Database: None (stateless proxy)
  • Key features: Swagger docs (/api/docs), CORS, global validation, response wrapping, circuit breaker stats (/health/circuits)

Order Service

  • Role: Manages order lifecycle
  • Schema: order_schema
  • Entity: Order (id, status, items, total, idempotencyKey, timestamps)
  • Patterns: create_order, get_order_by_id, get_orders
  • Idempotency: Client-provided X-Idempotency-Key header stored in unique column

Inventory Service

  • Role: Manages products and stock reservations
  • Schema: inventory_schema
  • Entities: Product (id, name, sku, stock, reservedStock), StockReservation (id, orderId, quantity, status, productId)
  • Patterns: create_product, get_product_by_id, get_products
  • Idempotency: Natural deduplication via unique sku column

Payment Service

  • Role: Manages payment processing
  • Schema: payment_schema
  • Entity: Payment (id, orderId, amount, status, idempotencyKey, timestamps)
  • Patterns: create_payment, get_payment_by_id, get_payments_by_order_id
  • Idempotency: Client-provided X-Idempotency-Key header stored in unique column

Shared Library (@shared)

Imported via @shared path alias. Contains: environment config, database/RMQ config factories, DTOs, validation filters, response interceptor, pagination utilities, resilience module (circuit breaker), tracing bootstrap, RPC error handling with timeout/retry, i18n translations, and RMQ constants.


Request Flow

sequenceDiagram
    participant C as Client
    participant GW as API Gateway
    participant RPC as handleRpcResponse()
    participant RMQ as RabbitMQ
    participant OS as Order Service
    participant DB as PostgreSQL

    C->>GW: POST /api/orders<br/>{items, total}<br/>X-Idempotency-Key: abc-123
    GW->>GW: ValidationPipe validates CreateOrderDto
    GW->>RPC: send({cmd: 'create_order'}, {dto, idempotencyKey, lang})

    rect rgb(255, 245, 230)
        Note over RPC: Resilience Pipeline
        RPC->>RPC: timeout(10000ms)
        RPC->>RPC: retry(1x, 500ms backoff)
        RPC->>RPC: CircuitBreaker.exec('ORDER_SERVICE')
    end

    RPC->>RMQ: Publish to order_queue
    RMQ->>OS: Consume message
    OS->>DB: Check idempotencyKey exists?
    alt Key exists
        DB-->>OS: Return existing order
    else New order
        OS->>DB: INSERT INTO order_schema.order
        DB-->>OS: Return new order
    end
    OS-->>RMQ: Reply with order data
    RMQ-->>RPC: Response
    RPC-->>GW: Order object
    GW->>GW: SuccessResponseInterceptor wraps response
    GW-->>C: 201 {success: true, data: {id, status, items, total, ...}}

API Reference

Health

| Method | Endpoint | Description | Response |
| --- | --- | --- | --- |
| GET | /health | Health check | {status: "ok"} |
| GET | /health/circuits | Circuit breaker stats per service | {ORDER_SERVICE: {state, stats}, ...} |

Orders (/api/orders)

| Method | Endpoint | Description | Headers | Body | Codes |
| --- | --- | --- | --- | --- | --- |
| POST | /api/orders | Create order | X-Idempotency-Key (opt), X-Lang (opt) | CreateOrderDto | 201, 400 |
| GET | /api/orders/:id | Get order by UUID | X-Lang (opt) | - | 200, 404 |
| GET | /api/orders?page=1&perPage=10 | List orders (paginated) | - | - | 200 |

Example — Create Order:

// POST /api/orders
// X-Idempotency-Key: order-abc-123
{
  "items": [
    { "productId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "quantity": 2, "price": 29.99 }
  ],
  "total": 59.98
}

// Response 201
{
  "success": true,
  "data": {
    "id": "7c69ae16-a49a-4c3b-b507-88e40093a637",
    "status": "PENDING",
    "items": [{ "productId": "a1b2c3d4-...", "quantity": 2, "price": 29.99 }],
    "total": 59.98,
    "idempotencyKey": "order-abc-123",
    "createdAt": "2026-02-25T12:57:27.664Z",
    "updatedAt": "2026-02-25T12:57:27.664Z"
  }
}

Products (/api/products)

| Method | Endpoint | Description | Headers | Body | Codes |
| --- | --- | --- | --- | --- | --- |
| POST | /api/products | Create product | - | CreateProductDto | 201, 400 |
| GET | /api/products/:id | Get product by UUID | X-Lang (opt) | - | 200, 404 |
| GET | /api/products?page=1&perPage=10 | List products (paginated) | - | - | 200 |

Example — Create Product:

// POST /api/products
{ "name": "Widget Pro", "sku": "WDG-PRO-001", "stock": 100 }

// Response 201
{
  "success": true,
  "data": {
    "id": "1a0aab4c-49f6-4023-92e4-2545984ee011",
    "name": "Widget Pro",
    "sku": "WDG-PRO-001",
    "stock": 100,
    "reservedStock": 0,
    "createdAt": "2026-02-25T12:57:34.735Z"
  }
}

Payments (/api/payments)

| Method | Endpoint | Description | Headers | Body | Codes |
| --- | --- | --- | --- | --- | --- |
| POST | /api/payments | Create payment | X-Idempotency-Key (opt), X-Lang (opt) | CreatePaymentDto | 201, 400 |
| GET | /api/payments/:id | Get payment by UUID | X-Lang (opt) | - | 200, 404 |
| GET | /api/payments/order/:orderId | Get payments by order | - | - | 200 |

Example — Create Payment:

// POST /api/payments
// X-Idempotency-Key: pay-xyz-789
{ "orderId": "7c69ae16-a49a-4c3b-b507-88e40093a637", "amount": 59.98 }

// Response 201
{
  "success": true,
  "data": {
    "id": "b0364d00-6097-4a43-ae78-07f3fb9c5823",
    "orderId": "7c69ae16-a49a-4c3b-b507-88e40093a637",
    "amount": 59.98,
    "status": "PENDING",
    "idempotencyKey": "pay-xyz-789",
    "createdAt": "2026-02-25T12:57:47.087Z",
    "updatedAt": "2026-02-25T12:57:47.087Z"
  }
}

Swagger UI with interactive API explorer: http://localhost:3000/api/docs


RabbitMQ Message Patterns

| Queue | Command | Handler Service | Payload |
| --- | --- | --- | --- |
| order_queue | create_order | Order Service | {dto: CreateOrderDto, idempotencyKey?, lang} |
| order_queue | get_order_by_id | Order Service | {id: string, lang} |
| order_queue | get_orders | Order Service | {page: number, perPage: number} |
| inventory_queue | create_product | Inventory Service | {dto: CreateProductDto} |
| inventory_queue | get_product_by_id | Inventory Service | {id: string, lang} |
| inventory_queue | get_products | Inventory Service | {page: number, perPage: number} |
| payment_queue | create_payment | Payment Service | {dto: CreatePaymentDto, idempotencyKey?, lang} |
| payment_queue | get_payment_by_id | Payment Service | {id: string, lang} |
| payment_queue | get_payments_by_order_id | Payment Service | {orderId: string} |

All queues are durable and configured via environment variables. Communication uses NestJS's ClientProxy.send() (RPC pattern with reply queue).
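
The command-to-queue mapping in the table can be sketched as a typed lookup, so that a misspelled command fails at compile time. The names `QUEUE_BY_COMMAND`, `toPattern`, and `queueFor` are hypothetical and chosen for illustration; the real contents of rmq.constants.ts are not shown in this document.

```typescript
// Hypothetical mirror of the message-pattern table; not the project's actual constants file.
const QUEUE_BY_COMMAND = {
  create_order: 'order_queue',
  get_order_by_id: 'order_queue',
  get_orders: 'order_queue',
  create_product: 'inventory_queue',
  get_product_by_id: 'inventory_queue',
  get_products: 'inventory_queue',
  create_payment: 'payment_queue',
  get_payment_by_id: 'payment_queue',
  get_payments_by_order_id: 'payment_queue',
} as const;

type Command = keyof typeof QUEUE_BY_COMMAND;

// Builds the pattern object used as the first argument of ClientProxy.send(),
// matching the {cmd: 'create_order'} shape shown in the request flow.
function toPattern(cmd: Command): { cmd: Command } {
  return { cmd };
}

// Looks up the default queue name a command is routed to.
function queueFor(cmd: Command): string {
  return QUEUE_BY_COMMAND[cmd];
}
```

Because `Command` is derived from the object keys, adding a row to the lookup is enough to make the new command available to callers.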


Resilience Patterns

Resilience Pipeline

Every RPC call from the API Gateway passes through handleRpcResponse(), which applies three layers of protection (timeout, retry with exponential backoff, circuit breaker):

graph LR
    A["RPC Observable<br/>(ClientProxy.send)"] --> B["timeout()"]
    B --> C{"isTransientError?"}
    C -->|"Yes (5xx, timeout,<br/>connection error)"| D["retry() with<br/>exponential backoff"]
    C -->|"No (4xx business error)"| E["throw immediately"]
    D --> F["catchError()<br/>map to HttpException"]
    E --> F
    F --> G["CircuitBreaker.exec()"]
    G --> H["Response or Error"]

Preset configurations (libs/shared/src/utils/rpc-options.ts):

| Preset | Timeout | Max Retries | Initial Delay | Backoff Sequence | Used For |
| --- | --- | --- | --- | --- | --- |
| RPC_READ_OPTIONS | 5,000ms | 3 | 300ms | 300ms, 600ms, 1200ms | GET endpoints |
| RPC_WRITE_OPTIONS | 10,000ms | 1 | 500ms | 500ms | POST endpoints |

Retry logic: Only transient errors are retried (timeouts, ECONNREFUSED, ECONNRESET, ETIMEDOUT, 5xx). Business errors (4xx) are thrown immediately without retry. Backoff formula: initialDelay * 2^(retryCount - 1).
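
As a minimal sketch of the retry rules above, the following computes the backoff sequence from the stated formula and classifies errors as transient or not. The `RetryPreset` and `RpcErrorLike` shapes and the function names are assumptions made for illustration; the actual `handleRpcResponse()` implementation is not reproduced here.

```typescript
// Hypothetical shapes; the real option objects in rpc-options.ts may differ.
interface RetryPreset {
  timeoutMs: number;
  maxRetries: number;
  initialDelayMs: number;
}

const READ_PRESET: RetryPreset = { timeoutMs: 5000, maxRetries: 3, initialDelayMs: 300 };
const WRITE_PRESET: RetryPreset = { timeoutMs: 10000, maxRetries: 1, initialDelayMs: 500 };

// Backoff formula from the text: initialDelay * 2^(retryCount - 1), with retryCount starting at 1.
function backoffDelayMs(initialDelayMs: number, retryCount: number): number {
  return initialDelayMs * Math.pow(2, retryCount - 1);
}

// Delays before each retry attempt allowed by a preset.
function backoffSequence(preset: RetryPreset): number[] {
  return Array.from({ length: preset.maxRetries }, (_, i) =>
    backoffDelayMs(preset.initialDelayMs, i + 1),
  );
}

const TRANSIENT_CODES: string[] = ['ECONNREFUSED', 'ECONNRESET', 'ETIMEDOUT'];

interface RpcErrorLike {
  name?: string;   // RxJS timeout() raises an error named 'TimeoutError'
  code?: string;   // Node.js socket error code
  status?: number; // HTTP-style status carried by the error
}

// Timeouts, connection errors, and 5xx are retried; 4xx business errors are not.
function isTransientError(err: RpcErrorLike): boolean {
  if (err.name === 'TimeoutError') return true;
  if (err.code !== undefined && TRANSIENT_CODES.includes(err.code)) return true;
  return err.status !== undefined && err.status >= 500;
}
```

With these presets, `backoffSequence(READ_PRESET)` reproduces the 300ms, 600ms, 1200ms row of the table, and `backoffSequence(WRITE_PRESET)` yields the single 500ms delay.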

Circuit Breaker

Each downstream service has its own circuit breaker (managed by CircuitBreakerService using Opossum):

stateDiagram-v2
    [*] --> CLOSED
    CLOSED --> OPEN : Error rate >= 50%<br/>over 5+ requests<br/>in 10s window
    OPEN --> HALF_OPEN : After 30s cooldown
    HALF_OPEN --> CLOSED : Probe request succeeds
    HALF_OPEN --> OPEN : Probe request fails

    CLOSED : All calls pass through
    OPEN : All calls fail with 503<br/>"circuit open"
    HALF_OPEN : One probe request allowed

Configuration (libs/shared/src/resilience/circuit-breaker.service.ts):

| Parameter | Default | Description |
| --- | --- | --- |
| errorThresholdPercentage | 50% | Failure rate to trip the breaker |
| volumeThreshold | 5 | Minimum requests before threshold applies |
| resetTimeout | 30,000ms | Time before OPEN transitions to HALF_OPEN |
| rollingCountTimeout | 10,000ms | Stats rolling window |

Circuit breaker stats are exposed at GET /health/circuits and return the state (CLOSED/OPEN/HALF_OPEN) plus Opossum stats for each service.
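
The state transitions can be illustrated with a small, dependency-free state machine. This is a simplified sketch and not how Opossum is implemented: it keeps individual samples instead of rolling buckets, does not limit HALF_OPEN to a single in-flight probe, and takes the current time as an argument so the behaviour is easy to follow. The class name `SketchBreaker` is hypothetical; the option names mirror the table above.

```typescript
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

interface BreakerOptions {
  errorThresholdPercentage: number; // failure rate that trips the breaker
  volumeThreshold: number;          // minimum samples before the rate is evaluated
  resetTimeout: number;             // ms spent OPEN before moving to HALF_OPEN
  rollingCountTimeout: number;      // ms length of the stats window
}

const DEFAULT_OPTIONS: BreakerOptions = {
  errorThresholdPercentage: 50,
  volumeThreshold: 5,
  resetTimeout: 30000,
  rollingCountTimeout: 10000,
};

interface Sample {
  at: number;
  ok: boolean;
}

class SketchBreaker {
  private state: CircuitState = 'CLOSED';
  private openedAt = 0;
  private samples: Sample[] = [];
  private readonly opts: BreakerOptions;

  constructor(opts: BreakerOptions = DEFAULT_OPTIONS) {
    this.opts = opts;
  }

  // State observed at time `now` (ms); OPEN decays to HALF_OPEN after the cooldown.
  currentState(now: number): CircuitState {
    if (this.state === 'OPEN' && now - this.openedAt >= this.opts.resetTimeout) {
      this.state = 'HALF_OPEN';
    }
    return this.state;
  }

  // Whether a call would be let through at time `now`.
  allows(now: number): boolean {
    return this.currentState(now) !== 'OPEN';
  }

  // Records the outcome of a call that was let through.
  record(ok: boolean, now: number): void {
    const state = this.currentState(now);
    if (state === 'OPEN') return; // rejected calls are not sampled
    if (state === 'HALF_OPEN') {
      if (ok) {
        this.state = 'CLOSED';
        this.samples = [];
      } else {
        this.trip(now);
      }
      return;
    }
    this.samples.push({ at: now, ok });
    this.samples = this.samples.filter((s) => now - s.at < this.opts.rollingCountTimeout);
    const total = this.samples.length;
    const failures = this.samples.filter((s) => !s.ok).length;
    const failureRate = (failures / total) * 100;
    if (total >= this.opts.volumeThreshold && failureRate >= this.opts.errorThresholdPercentage) {
      this.trip(now);
    }
  }

  private trip(now: number): void {
    this.state = 'OPEN';
    this.openedAt = now;
    this.samples = [];
  }
}
```

Passing `now` explicitly keeps the sketch deterministic; a real wrapper would read the clock itself and reject calls with a 503 while the circuit is open.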

Idempotency

| Service | Strategy | Mechanism |
| --- | --- | --- |
| Order | Client-provided key | X-Idempotency-Key header stored in idempotencyKey column (UNIQUE). Duplicate key returns existing order. |
| Payment | Client-provided key | Same pattern as Order. |
| Inventory | Natural dedup | Unique sku column. Creating a product with an existing SKU returns the existing product. |
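
The client-provided-key strategy can be sketched with an in-memory store standing in for the database. `InMemoryOrderStore` and its method are hypothetical and only illustrate the lookup-then-insert behaviour; the real services rely on a UNIQUE column in PostgreSQL, which this sketch does not model (for example, it ignores concurrent duplicate requests).

```typescript
interface Order {
  id: string;
  total: number;
  idempotencyKey: string | null; // nullable, as in the schema
}

// Stand-in for the order table; keyed by idempotency key for duplicate detection.
class InMemoryOrderStore {
  private byKey = new Map<string, Order>();
  private nextId = 1;

  // Returns the existing order when the key was seen before, otherwise creates one.
  create(total: number, idempotencyKey: string | null = null): Order {
    if (idempotencyKey !== null) {
      const existing = this.byKey.get(idempotencyKey);
      if (existing !== undefined) return existing;
    }
    const order: Order = { id: `order-${this.nextId++}`, total, idempotencyKey };
    if (idempotencyKey !== null) this.byKey.set(idempotencyKey, order);
    return order;
  }
}

// Usage: a retried request with the same key gets the original order back.
const store = new InMemoryOrderStore();
const first = store.create(59.98, 'order-abc-123');
const retried = store.create(59.98, 'order-abc-123');
const sameOrder = first.id === retried.id;
```

Requests without a key are never deduplicated, which matches the header being optional in the API reference.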

Distributed Tracing

OpenTelemetry (OTel) auto-instruments the entire request chain. Each service initializes the SDK before NestJS bootstrap via initTracing() in main.ts.

How it works:

graph LR
    subgraph "api-gateway"
        A["HTTP Span<br/>(HttpInstrumentation)"] --> B["Express Span<br/>(ExpressInstrumentation)"]
        B --> C["AMQP Publish Span<br/>(AmqplibInstrumentation)<br/>Injects W3C traceparent"]
    end

    C -->|"RabbitMQ<br/>(trace context in headers)"| D

    subgraph "order-service"
        D["AMQP Consume Span<br/>(AmqplibInstrumentation)<br/>Extracts traceparent"] --> E["PG Query Span<br/>(PgInstrumentation)"]
    end

    E -.->|"OTLP HTTP :4318"| F["Jaeger"]
    A -.->|"OTLP HTTP :4318"| F

Auto-instrumentations:

| Instrumentation | What it captures |
| --- | --- |
| HttpInstrumentation | Incoming/outgoing HTTP requests |
| ExpressInstrumentation | Express route handling |
| PgInstrumentation | PostgreSQL queries with SQL details |
| AmqplibInstrumentation | RabbitMQ publish/consume + trace context propagation |

Key points:

  • initTracing() must be called before NestJS imports to monkey-patch libraries correctly
  • Trace context propagates through RabbitMQ automatically via W3C Traceparent headers in AMQP message properties
  • BatchSpanProcessor batches spans for efficient export
  • Graceful shutdown on SIGTERM/SIGINT flushes pending spans
  • Jaeger UI: http://localhost:16686 (System Architecture tab shows service dependency graph)
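
To make the propagated header concrete, here is a minimal parser for the W3C `traceparent` value that the instrumentation injects into AMQP message headers. It is only a sketch for inspection and debugging: the OpenTelemetry SDK handles this automatically, and `parseTraceparent` is a hypothetical helper that accepts only the version `00` layout.

```typescript
interface TraceContext {
  version: string;
  traceId: string;  // 32 lowercase hex characters
  parentId: string; // 16 lowercase hex characters (the parent span ID)
  sampled: boolean; // lowest bit of the trace-flags byte
}

// Layout: version-traceId-parentId-flags, all lowercase hex.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (match === null) return null;
  const [, version, traceId, parentId, flags] = match;
  // This sketch only understands version 00; all-zero IDs are invalid per the spec.
  if (version !== '00') return null;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Usage with the example value from the W3C Trace Context specification.
const ctx = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
```

The same `traceId` appearing on both the gateway's publish span and the consumer's spans is what lets Jaeger stitch them into one trace.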

Database Schema

erDiagram
    order_schema_order {
        uuid id PK
        enum status "PENDING | CONFIRMED | FAILED | CANCELLED"
        jsonb items "Array of {productId, quantity, price}"
        decimal total "decimal(12,2)"
        varchar idempotencyKey UK "nullable"
        timestamptz createdAt
        timestamptz updatedAt
    }

    inventory_schema_product {
        uuid id PK
        varchar name
        varchar sku UK
        int stock "default 0"
        int reservedStock "default 0"
        timestamptz createdAt
    }

    inventory_schema_stock_reservation {
        uuid id PK
        uuid orderId
        int quantity
        enum status "RESERVED | RELEASED"
        timestamptz createdAt
        uuid productId FK
    }

    payment_schema_payment {
        uuid id PK
        uuid orderId
        decimal amount "decimal(12,2)"
        enum status "PENDING | COMPLETED | FAILED | REFUNDED"
        varchar idempotencyKey UK "nullable"
        timestamptz createdAt
        timestamptz updatedAt
    }

    inventory_schema_product ||--o{ inventory_schema_stock_reservation : "has many"

Each service has its own PostgreSQL schema. There are no cross-schema foreign keys — services are isolated by design. The init-schemas.sql file creates all three schemas on first database startup.


Request/Response Pipeline

The API Gateway applies a global middleware chain to every request:

graph LR
    A["HTTP Request"] --> B["CORS<br/>(origin: *)"]
    B --> C["AllExceptionsFilter<br/>(wraps entire chain)"]
    C --> D["ValidationPipe<br/>(whitelist + transform)"]
    D --> E["Controller<br/>(route handler)"]
    E --> F["SuccessResponseInterceptor"]
    F --> G["HTTP Response"]

    D -->|"Validation fails"| H["400 Bad Request"]
    E -->|"Exception thrown"| I["AllExceptionsFilter"]
    I --> J["Error Response"]

Response formats:

// Success (SuccessResponseInterceptor)
{ "success": true, "data": { ... } }

// Error (AllExceptionsFilter)
{ "success": false, "data": { "message": "...", "statusCode": 404 } }

// Validation Error
{ "success": false, "data": { "message": ["field must be..."], "error": "Bad Request", "statusCode": 400 } }
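
The envelope shapes above can be written down as types, which is useful for API clients that need to tell success from failure. The type and function names below are hypothetical; they describe the JSON shown above rather than the actual SuccessResponseInterceptor or AllExceptionsFilter code.

```typescript
interface SuccessEnvelope<T> {
  success: true;
  data: T;
}

interface ErrorEnvelope {
  success: false;
  data: {
    message: string | string[]; // validation errors carry an array of messages
    statusCode: number;
    error?: string;             // e.g. "Bad Request" on validation errors
  };
}

type ApiResponse<T> = SuccessEnvelope<T> | ErrorEnvelope;

function wrapSuccess<T>(data: T): SuccessEnvelope<T> {
  return { success: true, data };
}

function wrapError(statusCode: number, message: string | string[], error?: string): ErrorEnvelope {
  const data = error === undefined ? { message, statusCode } : { message, error, statusCode };
  return { success: false, data };
}

// Narrowing helper for clients: true when the response carries a payload.
function isSuccess<T>(res: ApiResponse<T>): res is SuccessEnvelope<T> {
  return res.success;
}
```

Because `success` is a literal `true` or `false`, checking it narrows `data` to the right shape without casts.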

Environment Variables

| Variable | Default | Used By | Description |
| --- | --- | --- | --- |
| DB_HOST | localhost | Backend services | PostgreSQL host |
| DB_PORT | 5432 | Backend services | PostgreSQL port |
| DB_NAME | microservices_learn | All + postgres container | Database name |
| DB_USER | postgres | All + postgres container | Database user |
| DB_PASSWORD | postgres | All + postgres container | Database password |
| TYPEORM_SYNCHRONIZE | true | Backend services | Auto-sync schema (set false in production!) |
| RABBITMQ_URL | amqp://guest:guest@localhost:5672 | All services | RabbitMQ connection URL |
| RABBITMQ_USER | guest | RabbitMQ container | RabbitMQ admin user |
| RABBITMQ_PASSWORD | guest | RabbitMQ container | RabbitMQ admin password |
| API_GATEWAY_PORT | 3000 | API Gateway | Gateway HTTP port |
| ORDER_QUEUE | order_queue | Gateway + Order | Order service queue name |
| INVENTORY_QUEUE | inventory_queue | Gateway + Inventory | Inventory service queue name |
| PAYMENT_QUEUE | payment_queue | Gateway + Payment | Payment service queue name |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4318/v1/traces | All services | Jaeger OTLP endpoint |
| PGADMIN_EMAIL | admin@baupay.com | pgAdmin container | pgAdmin login email |
| PGADMIN_PASSWORD | admin | pgAdmin container | pgAdmin login password |
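
A loader that applies these defaults might look like the sketch below. It covers only a subset of the variables, and the `AppConfig` shape and `loadConfig` name are assumptions for illustration; the real env.config.ts is not shown in this document. The environment is passed in as a plain object (in Node.js, `process.env` would be the usual argument) so the function stays easy to test.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  db: { host: string; port: number; name: string; synchronize: boolean };
  rabbitmqUrl: string;
  gatewayPort: number;
  queues: { order: string; inventory: string; payment: string };
}

function loadConfig(env: Env): AppConfig {
  // Falls back to the documented default when the variable is unset.
  const str = (key: string, fallback: string): string => env[key] ?? fallback;

  // Parses integer variables and fails fast on malformed values.
  const int = (key: string, fallback: number): number => {
    const raw = env[key];
    if (raw === undefined) return fallback;
    const parsed = Number.parseInt(raw, 10);
    if (Number.isNaN(parsed)) throw new Error(`${key} must be an integer, got "${raw}"`);
    return parsed;
  };

  return {
    db: {
      host: str('DB_HOST', 'localhost'),
      port: int('DB_PORT', 5432),
      name: str('DB_NAME', 'microservices_learn'),
      // Defaults to true for development; set to false in production.
      synchronize: str('TYPEORM_SYNCHRONIZE', 'true') === 'true',
    },
    rabbitmqUrl: str('RABBITMQ_URL', 'amqp://guest:guest@localhost:5672'),
    gatewayPort: int('API_GATEWAY_PORT', 3000),
    queues: {
      order: str('ORDER_QUEUE', 'order_queue'),
      inventory: str('INVENTORY_QUEUE', 'inventory_queue'),
      payment: str('PAYMENT_QUEUE', 'payment_queue'),
    },
  };
}

// Usage: an empty environment yields the documented defaults.
const defaults = loadConfig({});
```

Failing on a malformed port at startup is a design choice of this sketch; it surfaces configuration mistakes before the first connection attempt.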

Getting Started

Prerequisites

  • Docker and Docker Compose
  • Node.js 22+ and npm (for local development)

Quick Start

# 1. Clone and configure
cp .env.example .env

# 2. Start infrastructure (PostgreSQL, RabbitMQ, Jaeger, pgAdmin)
npm run docker:infra

# 3. Install dependencies (for local dev)
npm install

# 4. Run database migrations
npm run migration:run:all

# 5a. Local development (hot-reload, 4 processes)
npm run start:dev

# 5b. OR Docker development (hot-reload inside containers)
npm run docker

Verify

# Health check
curl http://localhost:3000/health
# {"status":"ok"}

# Create a test order
curl -X POST http://localhost:3000/api/orders \
  -H 'Content-Type: application/json' \
  -d '{"items":[{"productId":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","quantity":1,"price":10}],"total":10}'

UI Access

| Service | URL | Credentials |
| --- | --- | --- |
| Swagger API Docs | http://localhost:3000/api/docs | - |
| Jaeger Tracing UI | http://localhost:16686 | - |
| RabbitMQ Management | http://localhost:15672 | guest / guest |
| pgAdmin | http://localhost:5050 | From .env |

Docker Architecture

Development Mode (npm run docker)

Uses Dockerfile.dev with volume mounts for hot-reload:

FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
RUN npm install -g @nestjs/cli
EXPOSE 3000

Each service mounts the source code and uses nest start --watch:

# docker-compose.yml (per service)
build:
  context: .
  dockerfile: Dockerfile.dev
volumes:
  - .:/app              # Source code mount
  - /app/node_modules   # Preserve container's node_modules
command: npx nest start <service-name> --watch

Code changes on the host trigger automatic recompilation and a restart inside the container.

Production Mode

Uses multi-stage Dockerfile:

  1. deps: npm ci (install dependencies)
  2. build: npx nest build <service> (compile TypeScript)
  3. production: node dist/apps/<service>/... (run compiled JS)

Docker Profiles

Infrastructure services (postgres, rabbitmq, jaeger, pgAdmin) start with docker compose up -d. Application services require --profile app:

npm run docker:infra      # Infrastructure only
npm run docker            # Infrastructure + all app services (--profile app --build)

Healthchecks

| Service | Strategy | Interval |
| --- | --- | --- |
| PostgreSQL | pg_isready | 5s |
| RabbitMQ | rabbitmq-diagnostics check_port_connectivity | 10s |
| API Gateway | wget /health | 10s |
| Backend Services | node -e 'process.exit(0)' | 30s |

Services use depends_on with condition: service_healthy to ensure proper startup ordering.


Database Migrations

Migrations use TypeORM CLI with per-service data sources:

# Generate a new migration (auto-detects entity changes)
npm run migration:generate:order
npm run migration:generate:inventory
npm run migration:generate:payment

# Run pending migrations
npm run migration:run:all          # All services
npm run migration:run:order        # Single service

# Revert last migration
npm run migration:revert:all       # All services
npm run migration:revert:order     # Single service

init-schemas.sql creates the three PostgreSQL schemas (order_schema, inventory_schema, payment_schema) on first database startup via Docker's docker-entrypoint-initdb.d mechanism.


npm Scripts Reference

Build

| Script | Description |
| --- | --- |
| build | Build all 4 services |
| build:gateway | Build API Gateway only |
| build:order | Build Order Service only |
| build:inventory | Build Inventory Service only |
| build:payment | Build Payment Service only |

Development

| Script | Description |
| --- | --- |
| start:dev | Start all 4 services with hot-reload (concurrently) |
| start:dev:gateway | Start API Gateway with --watch |
| start:dev:order | Start Order Service with --watch |
| start:dev:inventory | Start Inventory Service with --watch |
| start:dev:payment | Start Payment Service with --watch |

Production

| Script | Description |
| --- | --- |
| start:prod:gateway | Run compiled API Gateway |
| start:prod:order | Run compiled Order Service |
| start:prod:inventory | Run compiled Inventory Service |
| start:prod:payment | Run compiled Payment Service |

Docker

| Script | Description |
| --- | --- |
| docker:infra | Start infrastructure (postgres, rabbitmq, jaeger, pgadmin) |
| docker:infra:down | Stop infrastructure |
| docker | Build and start everything (infra + app services) |
| docker:down | Stop everything |
| docker:logs | Tail logs for all services |
| docker:sh:* | Shell into a specific container |

Migrations

| Script | Description |
| --- | --- |
| migration:generate:{service} | Generate migration for a service |
| migration:run:{service} | Run migrations for a service |
| migration:run:all | Run pending migrations for all services |
| migration:revert:{service} | Revert last migration for a service |
| migration:revert:all | Revert the last migration for every service |

Testing

| Script | Description |
| --- | --- |
| test:integration | Run integration tests (--runInBand) |

AWS Infrastructure

All infrastructure is defined as code using Terraform with an S3 + DynamoDB state backend.

graph TD
    Internet["Internet"]

    subgraph "AWS — eu-north-1"
        ALB["Application Load Balancer<br/>rnd-ms-alb<br/>:80 → Gateway<br/>:15672 → RabbitMQ UI"]

        subgraph "ECS Fargate Cluster (rnd-ms-cluster)"
            GW_ECS["API Gateway<br/>256 CPU / 512 MB"]
            OS_ECS["Order Service<br/>256 CPU / 512 MB"]
            IS_ECS["Inventory Service<br/>256 CPU / 512 MB"]
            PS_ECS["Payment Service<br/>256 CPU / 512 MB"]
            RMQ_ECS["RabbitMQ<br/>256 CPU / 512 MB<br/>(EFS persistence)"]
            MIG_ECS["Migration Task<br/>(one-off)"]
            K6_ECS["K6 Load Test<br/>(one-off)"]
        end

        subgraph "Data Layer"
            RDS["RDS PostgreSQL 16<br/>db.t3.micro<br/>(private subnets)"]
            EFS["EFS<br/>RabbitMQ data"]
            SSM["SSM Parameter Store<br/>DB creds, RMQ URL"]
        end

        subgraph "Networking"
            VPC["VPC 10.0.0.0/16"]
            PUB_A["Public Subnet A<br/>10.0.1.0/24"]
            PUB_B["Public Subnet B<br/>10.0.2.0/24"]
            PRIV_A["Private Subnet A<br/>10.0.10.0/24"]
            PRIV_B["Private Subnet B<br/>10.0.11.0/24"]
        end

        subgraph "Observability"
            CW["CloudWatch Logs<br/>(7-day retention)"]
            GF["Grafana Cloud<br/>(empattech.grafana.net)"]
        end

        SD["Service Discovery<br/>rabbitmq.local:5672"]
    end

    Internet --> ALB
    ALB --> GW_ECS
    GW_ECS --> RMQ_ECS
    RMQ_ECS --> OS_ECS
    RMQ_ECS --> IS_ECS
    RMQ_ECS --> PS_ECS
    OS_ECS --> RDS
    IS_ECS --> RDS
    PS_ECS --> RDS
    MIG_ECS --> RDS
    RMQ_ECS --> EFS
    RMQ_ECS --> SD
    GW_ECS --> CW
    OS_ECS --> CW
    IS_ECS --> CW
    PS_ECS --> CW
    K6_ECS --> CW
    CW --> GF

Terraform Resources

| Component | Resource | Details |
| --- | --- | --- |
| Compute | ECS Fargate | Cluster with 4 services + RabbitMQ + migration task + K6 task |
| Networking | VPC | 2 public subnets (ECS) + 2 private subnets (RDS), IGW, route tables |
| Load Balancing | ALB | Gateway target group (:3000), RabbitMQ management (:15672) |
| Database | RDS PostgreSQL 16 | db.t3.micro, encrypted (gp3), 7-day backups, private subnets |
| Storage | EFS | RabbitMQ data persistence with IAM auth + access points |
| Secrets | SSM Parameter Store | DB credentials, RabbitMQ URL, Grafana keys (SecureString) |
| DNS | Service Discovery | Private namespace (*.local) for internal RabbitMQ routing |
| Logging | CloudWatch | Log group per service (/ecs/rnd-ms-*), 7-day retention |
| Security | 4 Security Groups | ALB (HTTP), ECS tasks (3000, 5672, 15672), EFS (NFS 2049), RDS (5432) |
| IAM | 3 Roles + 1 User | ECS execution, ECS task, Grafana CloudWatch reader |
| Registry | ECR | 6 repositories (gateway, order, inventory, payment, migrations, k6) |

infra/
├── backend.tf              # S3 state backend
├── providers.tf            # AWS provider
├── variables.tf            # All configurable variables
├── vpc.tf                  # VPC, subnets, IGW, route tables
├── security-groups.tf      # ALB, ECS, EFS, RDS security groups
├── alb.tf                  # Load balancer, target groups, listeners
├── ecs.tf                  # ECS cluster + RabbitMQ task/service
├── ecs-services.tf         # Gateway, Order, Inventory, Payment tasks/services
├── ecs-migrations.tf       # Migration ECR repo + task definition
├── ecs-k6.tf               # K6 ECR repo + task definition
├── rds.tf                  # PostgreSQL RDS instance + subnet group
├── efs.tf                  # EFS file system + mount targets + access point
├── iam.tf                  # Execution role, task role, Grafana IAM user
├── ssm.tf                  # SSM parameters (DB, RabbitMQ, Grafana)
├── service-discovery.tf    # Private DNS namespace + RabbitMQ service
├── cloudwatch.tf           # Log groups for all services
└── outputs.tf              # Gateway URL, RDS endpoint, ECR repos, etc.

CI/CD Pipeline (Bitbucket Pipelines)

Fully automated deployment pipeline with smart change detection — only builds, migrates, and deploys services that were modified.

graph LR
    subgraph "PR Pipeline"
        PR_I["Install"] --> PR_P["Parallel"]
        PR_P --> PR_L["Lint + Type Check"]
        PR_P --> PR_T["Integration Tests"]
        PR_P --> PR_D["Detect Changes"]
    end

    subgraph "Main Branch Pipeline"
        I["Install"] --> P["Parallel"]
        P --> L["Lint + Type Check"]
        P --> D["Detect Changes"]
        D --> M["Run Migrations<br/>(ECS Task)"]
        M --> B["Build & Push<br/>(Docker → ECR)"]
        B --> DEP["Deploy to ECS<br/>(wait for stable)"]
        DEP --> S["Smoke Test<br/>(health + circuits)"]
        S --> P2["Parallel"]
        P2 --> R["Rollback<br/>(manual)"]
        P2 --> K6B["K6 Build & Push"]
        K6B --> K6R["K6 Load Test<br/>(manual)"]
    end

Pipeline Features

| Feature | Description |
| --- | --- |
| Change detection | Compares git diff against origin/main, detects changes in apps/, libs/, package* |
| Per-service migrations | Only runs TypeORM migrations for services with actual code changes |
| ECS migration task | Migrations execute as a one-off Fargate task, not at service boot time |
| Deployment stability | Waits for services-stable after each ECS update with event logging on failure |
| Smoke testing | Retries health check up to 10 times (10s interval) with circuit breaker validation |
| Manual rollback | Reverts all services to previous task definition revision |
| K6 integration | Builds K6 image to ECR, runs load test as ECS task (manual trigger) |
| Custom pipeline | k6-load-test can be triggered from Bitbucket UI with configurable profile |
| Step timeouts | max-time on all steps prevents infinite hangs |
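
The change-detection idea can be sketched as a function from a list of changed paths (for example, the output of a `git diff --name-only` against origin/main) to the services that need rebuilding. This is an illustrative assumption about the rules, not the pipeline's actual script: here any change under libs/ or to a root-level package* file is treated as affecting every service.

```typescript
type Service = 'gateway' | 'order' | 'inventory' | 'payment';

const ALL_SERVICES: Service[] = ['gateway', 'order', 'inventory', 'payment'];

// Directory names under apps/ as listed in the project structure.
const SERVICE_BY_APP_DIR: Record<string, Service> = {
  'api-gateway': 'gateway',
  'order-service': 'order',
  'inventory-service': 'inventory',
  'payment-service': 'payment',
};

function detectChangedServices(changedFiles: string[]): Service[] {
  const hit = new Set<Service>();
  for (const file of changedFiles) {
    // Assumption: shared code and dependency manifests affect every service.
    if (file.startsWith('libs/') || /^package[^/]*$/.test(file)) {
      return [...ALL_SERVICES];
    }
    const match = /^apps\/([^/]+)\//.exec(file);
    const service = match ? SERVICE_BY_APP_DIR[match[1]] : undefined;
    if (service !== undefined) hit.add(service);
  }
  // Preserve a stable order regardless of the order of changed files.
  return ALL_SERVICES.filter((s) => hit.has(s));
}
```

Files outside apps/, libs/, and package* (such as documentation) produce an empty list, which lets the pipeline skip the build, migration, and deploy steps entirely.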

ECS Migrations

Migrations run as a dedicated Fargate task using Dockerfile.migrations:

Dockerfile.migrations → builds all 3 services → runs scripts/run-migrations.js [service-list]

The pipeline determines which services need migrations based on git diff, then:

  1. Builds and pushes the migration Docker image to ECR
  2. Runs aws ecs run-task with the service names as command override
  3. Waits for task completion and checks exit code
  4. Streams migration logs from CloudWatch

Load Testing (K6)

K6 load tests run as ECS Fargate tasks, hitting the production ALB. Results stream to CloudWatch Logs and are viewable in Grafana.

Test Profiles

| Profile | Duration | Max VUs | Stages | Purpose |
| --- | --- | --- | --- | --- |
| Smoke | 2 min | 2 | 30s ramp → 1m steady → 30s down | Post-deploy sanity check |
| Load | 16 min | 50 | Ramp to 20 → hold → ramp to 50 → hold → down | Steady-state baseline |
| Stress | 22 min | 200 | 50 → 100 → 200 → hold → down | Find breaking points |
| Spike | 5 min | 300 | 10 → 300 burst → hold → recover → down | Sudden traffic resilience |
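
As an illustration, the Smoke row maps onto k6's `stages` option roughly as shown below. Only the Smoke profile is sketched because it is the only one whose per-stage durations are listed. The helper functions are hypothetical and exist just to check the stages against the table; the contents of load-test.js may differ.

```typescript
// Shape of a k6 ramping stage: a duration string and a target VU count.
interface Stage {
  duration: string;
  target: number;
}

// Smoke profile from the table: 30s ramp, 1m steady, 30s ramp-down, max 2 VUs.
const smokeStages: Stage[] = [
  { duration: "30s", target: 2 },
  { duration: "1m", target: 2 },
  { duration: "30s", target: 0 },
];

// Converts a simple duration ("30s", "1m", "2h") to seconds.
// Compound forms such as "1m30s" are not handled by this helper.
function toSeconds(duration: string): number {
  const match = /^(\d+)([smh])$/.exec(duration);
  if (!match) throw new Error(`Unsupported duration: ${duration}`);
  const unit = { s: 1, m: 60, h: 3600 }[match[2] as "s" | "m" | "h"];
  return Number(match[1]) * unit;
}

// Total run time of a profile in seconds.
function totalSeconds(stages: Stage[]): number {
  return stages.reduce((sum, s) => sum + toSeconds(s.duration), 0);
}

// Highest VU target reached by a profile.
function maxVus(stages: Stage[]): number {
  return Math.max(...stages.map((s) => s.target));
}
```

In a k6 script the array would be exported as `export const options = { stages: smokeStages }`. Here `totalSeconds(smokeStages)` is 120 and `maxVus(smokeStages)` is 2, matching the 2 min and 2 VUs in the table.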

Tested Endpoints

  • GET /health — health check + circuit breaker status
  • GET /api/products — inventory listing (measures inventory latency)
  • POST /api/orders — order creation with random items and idempotency keys (measures order latency)
  • GET /api/orders/:id — order retrieval
  • GET /api/payments — payment listing (measures payment latency)
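
A request builder for the order-creation scenario might look like the sketch below. The `Idempotency-Key` header name, the `items` body shape, and the quantity range are assumptions made for illustration. They are not taken from load-test.js or from the gateway's DTOs.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical request shape: field and header names are assumptions.
interface OrderItem {
  productId: string;
  quantity: number;
}

interface OrderRequest {
  headers: Record<string, string>;
  body: { items: OrderItem[] };
}

// Builds a POST /api/orders request with 1..maxItems random items.
// Passing the same idempotencyKey again models a client retry.
function buildOrderRequest(
  productIds: string[],
  maxItems = 3,
  idempotencyKey: string = randomUUID(),
): OrderRequest {
  if (productIds.length === 0) throw new Error("productIds must not be empty");
  const count = 1 + Math.floor(Math.random() * maxItems);
  const items: OrderItem[] = Array.from({ length: count }, () => ({
    productId: productIds[Math.floor(Math.random() * productIds.length)],
    quantity: 1 + Math.floor(Math.random() * 3),
  }));
  return {
    headers: {
      "Content-Type": "application/json",
      "Idempotency-Key": idempotencyKey,
    },
    body: { items },
  };
}
```

Generating a fresh key per virtual-user iteration keeps each order distinct, while reusing a key exercises the deduplication path.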

Custom Metrics

| Metric | Type | Description |
| --- | --- | --- |
| errors | Rate | Percentage of failed requests |
| order_latency | Trend | Order creation response time |
| inventory_latency | Trend | Product listing response time |
| payment_latency | Trend | Payment listing response time |
| orders_created | Counter | Total successful orders |

Smoke Test Results (Production)

| Metric | Value |
| --- | --- |
| p95 Latency | 9.8 ms |
| Average Latency | 6.7 ms |
| Total Requests | 512 |
| Iterations | 128 |
| Max VUs | 2 |
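
Two derived figures help when reading these numbers. The short calculation below uses only the values in the table plus one assumption: that the run lasted the nominal 2 minutes of the Smoke profile.

```typescript
// Figures copied from the smoke test results table.
const totalRequests = 512;
const iterations = 128;

// Assumption: the run lasted the nominal 2 minutes of the Smoke profile.
const durationSeconds = 120;

// Requests issued per iteration of the test script.
const requestsPerIteration = totalRequests / iterations;

// Average request rate over the whole run, ramp-up and ramp-down included.
const requestsPerSecond = totalRequests / durationSeconds;

console.log(requestsPerIteration); // 4
console.log(requestsPerSecond.toFixed(2)); // "4.27"
```

At roughly 4 requests per second, these latencies are a post-deploy sanity baseline and say little about behaviour under load.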

Running Load Tests

# Via helper script (runs as ECS task)
bash scripts/run-k6.sh smoke     # Quick sanity check
bash scripts/run-k6.sh load      # Normal load baseline
bash scripts/run-k6.sh stress    # Stress test
bash scripts/run-k6.sh spike     # Spike test

# Via Bitbucket custom pipeline
# Pipelines → Run pipeline → k6-load-test → set K6_PROFILE variable

k6/
├── Dockerfile          # grafana/k6 base image + test script
└── load-test.js        # Test scenarios, profiles, custom metrics, thresholds

Observability (Grafana Cloud + CloudWatch)

Grafana Cloud (empattech.grafana.net) is connected to AWS CloudWatch for centralized monitoring of all services, load tests, and infrastructure metrics.

Data Flow

graph LR
    subgraph "ECS Services"
        GW["Gateway"]
        OS["Order"]
        IS["Inventory"]
        PS["Payment"]
        K6["K6 Load Test"]
    end

    subgraph "AWS"
        CW_L["CloudWatch Logs<br/>/ecs/rnd-ms-*"]
        CW_M["CloudWatch Metrics<br/>ECS, RDS, ALB"]
    end

    subgraph "Grafana Cloud"
        GF_E["Explore<br/>(Logs Insights)"]
        GF_D["Dashboards<br/>(Metrics)"]
    end

    GW --> CW_L
    OS --> CW_L
    IS --> CW_L
    PS --> CW_L
    K6 --> CW_L
    CW_L --> GF_E
    CW_M --> GF_D

What's Monitored

| Source | Log Group / Namespace | What You See |
| --- | --- | --- |
| API Gateway | /ecs/rnd-ms-gateway | HTTP requests, RPC calls, errors |
| Order Service | /ecs/rnd-ms-order | Order processing, DB queries |
| Inventory Service | /ecs/rnd-ms-inventory | Product operations, stock management |
| Payment Service | /ecs/rnd-ms-payment | Payment processing |
| RabbitMQ | /ecs/rnd-ms-rabbitmq | Broker health, connections |
| Migrations | /ecs/rnd-ms-migrations | Migration execution logs |
| K6 Load Tests | /ecs/rnd-ms-k6 | Test progress, JSON summary with metrics |
| ECS Metrics | AWS/ECS | CPU, memory utilization per service |
| RDS Metrics | AWS/RDS | Connections, IOPS, latency, storage |
| ALB Metrics | AWS/ApplicationELB | Request count, error rates, latency |

Access

  • Grafana: https://empattech.grafana.net → Explore → select cloudwatch data source
  • Authentication: Dedicated IAM user (grafana-cloudwatch-reader) with CloudWatchReadOnlyAccess + CloudWatchLogsReadOnlyAccess
  • Log queries: Use CloudWatch Logs Insights QL in Grafana Explore
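
As a starting point for log queries, the sketch below builds a CloudWatch Logs Insights query that returns the most recent lines matching a search term. The helper names are hypothetical; the query uses the standard `fields`, `filter`, `sort`, and `limit` commands, and the log group names follow the /ecs/rnd-ms-* pattern listed above.

```typescript
// Services with a log group of the form /ecs/rnd-ms-<name>.
type LoggedService =
  | "gateway"
  | "order"
  | "inventory"
  | "payment"
  | "rabbitmq"
  | "migrations"
  | "k6";

// Returns the CloudWatch log group to select in Grafana Explore.
function logGroupFor(service: LoggedService): string {
  return `/ecs/rnd-ms-${service}`;
}

// Builds a Logs Insights query returning the newest lines that match `term`.
function buildLogQuery(term: string, limit = 20): string {
  // Escape regex metacharacters and "/" so the term is matched literally.
  const escaped = term.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
  return [
    "fields @timestamp, @logStream, @message",
    `| filter @message like /${escaped}/`,
    "| sort @timestamp desc",
    `| limit ${limit}`,
  ].join("\n");
}

console.log(logGroupFor("order"));
console.log(buildLogQuery("error"));
```

The printed query can be pasted into Grafana Explore with the cloudwatch data source and the matching log group selected.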

Integration Testing

Integration tests exercise the full stack: HTTP request through the API Gateway, RabbitMQ message delivery, microservice processing, and PostgreSQL queries.

Prerequisites

npm run docker:infra     # Infrastructure must be running
npm run migration:run:all # Tables must exist

Running

npm run test:integration

Test Suites

| File | Tests | Description |
| --- | --- | --- |
| order.spec.ts | 6 | Create order, idempotency (same key = same ID), validation (400), get by ID, 404 for missing, pagination |
| inventory.spec.ts | 6 | Create product, SKU deduplication, validation (400), get by ID, 404 for missing, pagination |
| payment.spec.ts | 7 | Create payment, idempotency, no-key creates different payments, validation (400), get by ID, 404 for missing, get by orderId |
| resilience.spec.ts | 5 | Health endpoint, circuit breaker stats, CLOSED state for healthy services, timeout handling, no retry on 404 |
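
The idempotency behaviour these suites check (same key returns the same ID, no key creates a new record each time) can be illustrated with an in-memory model. This is a simplified stand-in for the real services, which persist to PostgreSQL behind RabbitMQ. The class and field names are hypothetical.

```typescript
import { randomUUID } from "node:crypto";

interface Payment {
  id: string;
  orderId: string;
  amount: number;
}

// In-memory stand-in for a service that deduplicates by idempotency key.
class InMemoryPaymentStore {
  private readonly byKey = new Map<string, Payment>();
  private readonly all: Payment[] = [];

  // Returns the existing payment when the key was seen before.
  // Without a key, every call creates a new payment.
  create(orderId: string, amount: number, idempotencyKey?: string): Payment {
    if (idempotencyKey !== undefined) {
      const existing = this.byKey.get(idempotencyKey);
      if (existing) return existing;
    }
    const payment: Payment = { id: randomUUID(), orderId, amount };
    this.all.push(payment);
    if (idempotencyKey !== undefined) this.byKey.set(idempotencyKey, payment);
    return payment;
  }

  // Number of payments actually stored.
  count(): number {
    return this.all.length;
  }
}

const store = new InMemoryPaymentStore();
const first = store.create("order-1", 100, "key-abc");
const replay = store.create("order-1", 100, "key-abc");
console.log(first.id === replay.id); // true
console.log(store.count()); // 1
```

The integration tests assert the same property end to end through the gateway rather than against a single in-process object.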

Setup

test/integration/setup.ts bootstraps a real NestApplication from GatewayModule with the same global pipes, filters, and interceptors as production. OpenTelemetry tracing is initialized so test traces appear in Jaeger.

Configuration

  • Jest config: jest.integration.config.ts
  • Timeout: 30 seconds per test
  • Runs in band (--runInBand) to avoid parallel DB conflicts
  • Module path alias @shared mapped to libs/shared/src/
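
A Jest configuration matching the settings above might look like the following. Treat it as a sketch rather than the contents of jest.integration.config.ts: the `testMatch` pattern and the use of ts-jest as the transformer are assumptions, while the 30-second timeout and the @shared alias come from the list above.

```typescript
// Hypothetical sketch of jest.integration.config.ts; the repository's file may differ.
const config = {
  rootDir: ".",
  testEnvironment: "node",
  // Assumed location and naming of the integration specs.
  testMatch: ["<rootDir>/test/integration/**/*.spec.ts"],
  // Assumes ts-jest is the TypeScript transformer.
  transform: { "^.+\\.ts$": "ts-jest" },
  // 30 seconds per test.
  testTimeout: 30_000,
  // Resolve the @shared alias to libs/shared/src/.
  moduleNameMapper: {
    "^@shared$": "<rootDir>/libs/shared/src",
    "^@shared/(.*)$": "<rootDir>/libs/shared/src/$1",
  },
};

export default config;
```

`--runInBand` is a CLI flag rather than a config key, so it would be passed by the npm script, for example `jest --config jest.integration.config.ts --runInBand`.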
