Skip to content

Add retry mechanism and error handling for Elasticsearch connection failures #545

@tanmayjoddar

Description

@tanmayjoddar

Currently experiencing issues where the application fails to start when Elasticsearch is not immediately available during container startup. This creates problems in docker-compose environments where services start in parallel.

The ingestion module should have proper retry logic with exponential backoff when connecting to Elasticsearch. Right now if ES isn't ready, the whole service can crash or hang.

Proposed changes:

  • Add connection retry decorator for ES operations
  • Implement exponential backoff (start at 1s, max 30s)
  • Add proper logging for connection attempts
  • Make retry attempts configurable via environment variable
  • Handle connection pool exhaustion gracefully

This would make the deployment more robust and prevent the startup race conditions we're seeing.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions