80 changes: 80 additions & 0 deletions infrastructure/lambdas/sf-telegram-notifier/README.md
@@ -0,0 +1,80 @@
# sf-telegram-notifier

Fans EventBridge `Step Functions Execution Status Change` events for the
three Alpha Engine Step Functions into Telegram via the canonical
`alpha_engine_lib.telegram.send_message` primitive.

**Purely additive.** The existing SNS → email path on every SF
(`NotifyComplete` success + `HandleFailure` failure branches) is unchanged.
This Lambda subscribes to a separate EventBridge rule and never touches the
SF JSON definitions.

## Coverage

| SF | Source ARN suffix | Pretty label |
| --- | --- | --- |
| Saturday weekly pipeline | `alpha-engine-saturday-pipeline` | `Saturday SF` |
| Weekday daily pipeline | `alpha-engine-weekday-pipeline` | `Weekday SF` |
| EOD post-market pipeline | `alpha-engine-eod-pipeline` | `EOD SF` |
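
The label lookup in the table above can be sketched as a match on the state machine name, which is the last colon-separated segment of the ARN. This is only an illustration: `SF_LABELS` and `pretty_label` are hypothetical names that do not come from `index.py`, the account ID is a placeholder, and the fallback to the raw name for unknown ARNs is an assumption.

```python
# Hypothetical mapping; keys and labels mirror the Coverage table.
SF_LABELS = {
    "alpha-engine-saturday-pipeline": "Saturday SF",
    "alpha-engine-weekday-pipeline": "Weekday SF",
    "alpha-engine-eod-pipeline": "EOD SF",
}


def pretty_label(state_machine_arn: str) -> str:
    """Return the display label for a state machine ARN.

    Unknown state machines fall back to their raw name (assumption).
    """
    # ARN shape: arn:aws:states:<region>:<account>:stateMachine:<name>
    name = state_machine_arn.rsplit(":", 1)[-1]
    return SF_LABELS.get(name, name)


arn = "arn:aws:states:us-east-1:123456789012:stateMachine:alpha-engine-eod-pipeline"
print(pretty_label(arn))
```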

| Status | Emoji | Push? | Extra detail |
| --- | --- | --- | --- |
| `RUNNING` | 🚀 | silent | execution name only |
| `SUCCEEDED` | ✅ | loud | duration |
| `FAILED` | 🔴 | loud | duration + `error: cause` via `DescribeExecution` (best-effort, truncated at 280 chars) |
| `TIMED_OUT` | ⏰ | loud | duration |
| `ABORTED` | ⛔ | loud | duration |

`RUNNING` is delivered silently (in-channel awareness, no phone buzz) so the
weekday SF's daily 5:45 AM PT start does not page on every trading day.
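
The status table can be read as a small lookup plus a duration formatter. The sketch below is a hypothetical rendering of that mapping, not the code in `index.py`: the names `STATUS_STYLES`, `format_duration`, and `build_message` and the exact message layout are assumptions. It relies on `startDate` and `stopDate` being epoch milliseconds, as in the synthetic smoke event used by `deploy.sh`.

```python
from typing import NamedTuple


class StatusStyle(NamedTuple):
    emoji: str
    silent: bool  # True -> deliver without a push notification


# Mirrors the status table above.
STATUS_STYLES = {
    "RUNNING": StatusStyle("🚀", True),
    "SUCCEEDED": StatusStyle("✅", False),
    "FAILED": StatusStyle("🔴", False),
    "TIMED_OUT": StatusStyle("⏰", False),
    "ABORTED": StatusStyle("⛔", False),
}


def format_duration(start_ms: int, stop_ms: int) -> str:
    """Render an epoch-millisecond interval, e.g. '1h 02m 05s'."""
    total = max(0, (stop_ms - start_ms) // 1000)
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes:02d}m {seconds:02d}s"
    if minutes:
        return f"{minutes}m {seconds:02d}s"
    return f"{seconds}s"


def build_message(label: str, detail: dict) -> tuple[str, bool]:
    """Return (text, silent) for one event's `detail` payload."""
    status = detail["status"]
    style = STATUS_STYLES[status]
    text = f"{style.emoji} {label} {status}: {detail['name']}"
    if status != "RUNNING":
        # Terminal states carry both timestamps, so a duration can be shown.
        text += f" ({format_duration(detail['startDate'], detail['stopDate'])})"
    return text, style.silent
```

On the Telegram side, silent delivery normally corresponds to the Bot API's `disable_notification` field on `sendMessage`; how `alpha_engine_lib.telegram.send_message` exposes that is not shown here.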

## Architecture

```
SF status transition
        │
        ▼
EventBridge default bus
  (aws.states / Step Functions Execution Status Change,
   filtered to the 3 alpha-engine SF ARNs)
        │
        ▼
alpha-engine-sf-telegram-notifier ──► alpha_engine_lib.telegram.send_message
                                              │
                                              ▼
                                      Telegram bot API
                                      (alpha-engine primary bot)
```
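
To make the filtering step concrete, the sketch below models the rule's event pattern as a Python dict and checks events against it with a simplified exact-value matcher. It is meant for local reasoning or unit tests only: `RULE_PATTERN` and `matches` are hypothetical names, the account ID is a placeholder, and real EventBridge patterns support operators (prefix, anything-but, and others) that this matcher does not model.

```python
# Placeholder region/account; the real rule is created by deploy.sh --bootstrap.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"

RULE_PATTERN = {
    "source": ["aws.states"],
    "detail-type": ["Step Functions Execution Status Change"],
    "detail": {
        "stateMachineArn": [
            f"arn:aws:states:{REGION}:{ACCOUNT_ID}:stateMachine:alpha-engine-{name}-pipeline"
            for name in ("saturday", "weekday", "eod")
        ],
        "status": ["RUNNING", "SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must exist in the event, and the
    event value must equal one of the listed values. Nested dicts recurse."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


event = {
    "source": "aws.states",
    "detail-type": "Step Functions Execution Status Change",
    "detail": {
        "status": "FAILED",
        "stateMachineArn": f"arn:aws:states:{REGION}:{ACCOUNT_ID}:stateMachine:alpha-engine-eod-pipeline",
    },
}
print(matches(RULE_PATTERN, event))
```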

Telegram credentials are resolved at runtime by the lib from SSM under
`/alpha-engine/TELEGRAM_BOT_TOKEN` + `/alpha-engine/TELEGRAM_CHAT_ID`,
which were provisioned for the executor `notifier.py` arc
(ROADMAP L1067, 2026-05-13). No new secret material is required.
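
For orientation, resolving the two parameters with boto3 could look roughly like the sketch below. This is not the lib's actual code: `load_telegram_credentials` is a hypothetical helper, and the SSM client is passed in so the function can be exercised without AWS access. It issues one `get_parameter` call per name, which lines up with the role being granted only `ssm:GetParameter`.

```python
PARAM_BOT_TOKEN = "/alpha-engine/TELEGRAM_BOT_TOKEN"
PARAM_CHAT_ID = "/alpha-engine/TELEGRAM_CHAT_ID"


def load_telegram_credentials(ssm_client) -> tuple[str, str]:
    """Fetch (bot_token, chat_id) from SSM Parameter Store.

    `ssm_client` is any object exposing boto3's `get_parameter` method.
    """
    def fetch(name: str) -> str:
        # WithDecryption is required for SecureString values and is
        # ignored for plain String parameters.
        resp = ssm_client.get_parameter(Name=name, WithDecryption=True)
        return resp["Parameter"]["Value"]

    return fetch(PARAM_BOT_TOKEN), fetch(PARAM_CHAT_ID)


# Inside a Lambda this would typically be called as:
#   import boto3
#   token, chat_id = load_telegram_credentials(boto3.client("ssm"))
```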

## Deploy

```bash
# First-time bootstrap — creates IAM role, Lambda, EventBridge rule, permission
bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --bootstrap

# Code-only update (default)
bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh

# Dry-run (validate + package, do not apply)
bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --dry-run

# Smoke-test (invoke with a synthetic SUCCEEDED event)
bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --smoke
```

Auth: the script uses whatever AWS CLI credentials are active in the shell.
A personal IAM user has sufficient permissions. The deploy is deliberately
not wired into CI, which keeps the OIDC role's blast radius narrow and
matches the spot-orphan-reaper / changelog-cloudwatch-mirror convention.

## IAM (inline policy)

- `logs:CreateLogGroup`, `logs:CreateLogStream`, and `logs:PutLogEvents` on
  the Lambda's own log group
- `ssm:GetParameter` on `/alpha-engine/TELEGRAM_BOT_TOKEN` and
  `/alpha-engine/TELEGRAM_CHAT_ID` (no other parameters)
- `states:DescribeExecution` on `arn:aws:states:…:execution:alpha-engine-*:*`,
  used only to enrich `FAILED` events with the error and cause snippet
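
The `states:DescribeExecution` grant exists for the `FAILED` enrichment, which can be sketched as follows. This is a hedged illustration rather than the handler's real code: `failure_snippet` and `MAX_SNIPPET_CHARS` are hypothetical names, the Step Functions client is injected for testability, and counting the trailing ellipsis toward the 280-character limit is an assumption. It reads the `error` and `cause` fields that boto3's `describe_execution` returns for failed executions.

```python
from typing import Optional

MAX_SNIPPET_CHARS = 280  # truncation limit stated in the status table


def failure_snippet(sfn_client, execution_arn: str) -> Optional[str]:
    """Return 'error: cause' for a failed execution, or None if unavailable.

    Best-effort: any exception (throttling, missing permission, ...) is
    swallowed so enrichment can never block the notification itself.
    """
    try:
        resp = sfn_client.describe_execution(executionArn=execution_arn)
    except Exception:
        return None
    # Either field may be absent; keep whichever parts are present.
    parts = [p for p in (resp.get("error"), resp.get("cause")) if p]
    if not parts:
        return None
    snippet = ": ".join(parts)
    if len(snippet) > MAX_SNIPPET_CHARS:
        # Reserve one character for the ellipsis so the total stays at the limit.
        snippet = snippet[: MAX_SNIPPET_CHARS - 1] + "…"
    return snippet
```

Returning `None` on any error keeps the notification path independent of the enrichment call, so a missing permission degrades the message instead of dropping it.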
214 changes: 214 additions & 0 deletions infrastructure/lambdas/sf-telegram-notifier/deploy.sh
@@ -0,0 +1,214 @@
#!/usr/bin/env bash
# deploy.sh — Create or update the alpha-engine-sf-telegram-notifier Lambda
# and wire its EventBridge SF status-change trigger.
#
# This Lambda subscribes to `aws.states` / "Step Functions Execution Status
# Change" events for the three Alpha Engine SFs (saturday / weekday / eod)
# and forwards human-readable summaries to Telegram via
# `alpha_engine_lib.telegram.send_message`. Existing SNS → email path is
# unaffected.
#
# Managed outside CloudFormation — same rationale as spot-orphan-reaper +
# changelog-cloudwatch-mirror (keeps the github-actions-lambda-deploy
# OIDC role's blast radius narrow; operator-deployed only).
#
# Usage:
# bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh # update code only
# bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --bootstrap # first-time create + wire EventBridge
# bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --dry-run # show actions, do not apply
# bash infrastructure/lambdas/sf-telegram-notifier/deploy.sh --smoke # invoke once with a synthetic SUCCEEDED event

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
FUNCTION_NAME="alpha-engine-sf-telegram-notifier"
ROLE_NAME="alpha-engine-sf-telegram-notifier-role"
POLICY_NAME="alpha-engine-sf-telegram-notifier-policy"
RULE_NAME="alpha-engine-sf-status-change"
REGION="${AWS_REGION:-us-east-1}"
ACCOUNT_ID="${ACCOUNT_ID:-711398986525}"

DRY_RUN=false
BOOTSTRAP=false
SMOKE=false
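for arg in "$@"; do
  case "$arg" in
    --dry-run)   DRY_RUN=true ;;
    --bootstrap) BOOTSTRAP=true ;;
    --smoke)     SMOKE=true ;;
    -h|--help)   sed -n '2,/^$/p' "$0"; exit 0 ;;
    # Reject typos such as --bootstap instead of silently doing a code-only update.
    *)           echo "Unknown argument: ${arg}" >&2; exit 2 ;;
  esac
done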

run() {
  if $DRY_RUN; then
    echo "DRY: $*"
  else
    "$@"
  fi
}

# ----- 0. Validate handler + run unit tests ----------------------------------

python3 -c "
import ast
src = open('${SCRIPT_DIR}/index.py').read()
ast.parse(src)
print('index.py syntax OK')
"

if [[ -f "${SCRIPT_DIR}/test_handler.py" ]]; then
  echo "Running handler unit tests..."
  python3 -m pytest "${SCRIPT_DIR}/test_handler.py" -q
fi

# ----- 1. Package: pip install deps + zip handler ---------------------------

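PKG=$(mktemp -d)
# Single quotes defer expansion until the trap fires; the inner double quotes
# keep the path intact even if it contains spaces or quotes.
trap 'rm -rf "${PKG}"' EXIT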

echo "Installing deps into ${PKG} (pip install -t)..."
python3 -m pip install \
  --quiet \
  --target "${PKG}" \
  --upgrade \
  -r "${SCRIPT_DIR}/requirements.txt"

cp "${SCRIPT_DIR}/index.py" "${PKG}/index.py"
ZIP="${PKG}/function.zip"
(cd "${PKG}" && zip -qr "function.zip" . -x "function.zip")
echo "Packaged ${ZIP} ($(wc -c < "${ZIP}") bytes)"

# ----- 2. Bootstrap (first-time only) ---------------------------------------

if $BOOTSTRAP; then
  echo "Bootstrapping ${FUNCTION_NAME}..."

  TRUST_POLICY='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
  if ! aws iam get-role --role-name "${ROLE_NAME}" --query 'Role.RoleName' --output text >/dev/null 2>&1; then
    echo " Creating IAM role: ${ROLE_NAME}"
    run aws iam create-role \
      --role-name "${ROLE_NAME}" \
      --assume-role-policy-document "${TRUST_POLICY}" \
      --query 'Role.RoleName' --output text
  else
    echo " IAM role exists: ${ROLE_NAME}"
  fi

  echo " Applying inline policy: ${POLICY_NAME}"
  run aws iam put-role-policy \
    --role-name "${ROLE_NAME}" \
    --policy-name "${POLICY_NAME}" \
    --policy-document "file://${SCRIPT_DIR}/iam-policy.json"

  if ! $DRY_RUN; then
    echo " Waiting 10s for IAM role propagation..."
    sleep 10
  fi

  ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}"
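  if ! aws lambda get-function --function-name "${FUNCTION_NAME}" --region "${REGION}" --query 'Configuration.FunctionName' --output text >/dev/null 2>&1; then
    echo " Creating Lambda: ${FUNCTION_NAME}"
    run aws lambda create-function \
      --function-name "${FUNCTION_NAME}" \
      --runtime python3.12 \
      --role "${ROLE_ARN}" \
      --handler index.handler \
      --zip-file "fileb://${ZIP}" \
      --timeout 30 \
      --memory-size 256 \
      --environment 'Variables={LOG_LEVEL=INFO}' \
      --region "${REGION}" \
      --query 'FunctionArn' --output text
    if ! $DRY_RUN; then
      # A new function starts in the Pending state; wait until it is Active so
      # update-function-code in step 3 does not hit a ResourceConflictException.
      aws lambda wait function-active \
        --function-name "${FUNCTION_NAME}" \
        --region "${REGION}"
    fi
  else
    echo " Lambda exists, code will be updated in step 3"
  fi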

  # EventBridge rule: Step Functions Execution Status Change for the 3 alpha-engine SFs
  echo " Creating EventBridge rule: ${RULE_NAME}"
  EVENT_PATTERN=$(cat <<EOF
{
  "source": ["aws.states"],
  "detail-type": ["Step Functions Execution Status Change"],
  "detail": {
    "stateMachineArn": [
      "arn:aws:states:${REGION}:${ACCOUNT_ID}:stateMachine:alpha-engine-saturday-pipeline",
      "arn:aws:states:${REGION}:${ACCOUNT_ID}:stateMachine:alpha-engine-weekday-pipeline",
      "arn:aws:states:${REGION}:${ACCOUNT_ID}:stateMachine:alpha-engine-eod-pipeline"
    ],
    "status": ["RUNNING", "SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"]
  }
}
EOF
)
  run aws events put-rule \
    --name "${RULE_NAME}" \
    --event-pattern "${EVENT_PATTERN}" \
    --description "Fan SF status changes to alpha-engine-sf-telegram-notifier" \
    --region "${REGION}" \
    --query 'RuleArn' --output text

  FN_ARN="arn:aws:lambda:${REGION}:${ACCOUNT_ID}:function:${FUNCTION_NAME}"
  run aws events put-targets \
    --rule "${RULE_NAME}" \
    --targets "Id=1,Arn=${FN_ARN}" \
    --region "${REGION}"

  RULE_ARN="arn:aws:events:${REGION}:${ACCOUNT_ID}:rule/${RULE_NAME}"
  run aws lambda add-permission \
    --function-name "${FUNCTION_NAME}" \
    --statement-id "eventbridge-${RULE_NAME}" \
    --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn "${RULE_ARN}" \
    --region "${REGION}" 2>/dev/null || true
fi

# ----- 3. Update function code (always after bootstrap, idempotent) ---------

echo "Updating Lambda function code: ${FUNCTION_NAME}"
run aws lambda update-function-code \
  --function-name "${FUNCTION_NAME}" \
  --zip-file "fileb://${ZIP}" \
  --region "${REGION}" \
  --query 'LastUpdateStatus' --output text

if ! $DRY_RUN; then
  aws lambda wait function-updated \
    --function-name "${FUNCTION_NAME}" \
    --region "${REGION}"
fi

echo "✓ Code deployed."

# ----- 4. Smoke (synthetic SUCCEEDED event) ---------------------------------

if $SMOKE; then
  echo ""
  echo "Smoke-testing via direct invoke (synthetic SUCCEEDED event)..."
  RESP=$(mktemp)
  # Unquoted heredoc so the ARNs follow REGION / ACCOUNT_ID instead of
  # hard-coding the default account.
  PAYLOAD=$(cat <<EOF
{
  "source": "aws.states",
  "detail-type": "Step Functions Execution Status Change",
  "detail": {
    "status": "SUCCEEDED",
    "stateMachineArn": "arn:aws:states:${REGION}:${ACCOUNT_ID}:stateMachine:alpha-engine-saturday-pipeline",
    "executionArn": "arn:aws:states:${REGION}:${ACCOUNT_ID}:execution:alpha-engine-saturday-pipeline:smoke-test",
    "name": "smoke-test",
    "startDate": 0,
    "stopDate": 60000
  }
}
EOF
)
  aws lambda invoke \
    --function-name "${FUNCTION_NAME}" \
    --cli-binary-format raw-in-base64-out \
    --payload "${PAYLOAD}" \
    --region "${REGION}" \
    "${RESP}" >/dev/null
  cat "${RESP}"
  echo ""
  rm -f "${RESP}"
fi
30 changes: 30 additions & 0 deletions infrastructure/lambdas/sf-telegram-notifier/iam-policy.json
@@ -0,0 +1,30 @@
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Logs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:us-east-1:711398986525:log-group:/aws/lambda/alpha-engine-sf-telegram-notifier:*"
    },
    {
      "Sid": "TelegramSecretsSSM",
      "Effect": "Allow",
      "Action": ["ssm:GetParameter"],
      "Resource": [
        "arn:aws:ssm:us-east-1:711398986525:parameter/alpha-engine/TELEGRAM_BOT_TOKEN",
        "arn:aws:ssm:us-east-1:711398986525:parameter/alpha-engine/TELEGRAM_CHAT_ID"
      ]
    },
    {
      "Sid": "DescribeExecutionForFailureCause",
      "Effect": "Allow",
      "Action": ["states:DescribeExecution"],
      "Resource": "arn:aws:states:us-east-1:711398986525:execution:alpha-engine-*:*"
    }
  ]
}