Updating s3 backup scripts to work with boto3 INFRA-448#368
Updating s3 backup scripts to work with boto3 INFRA-448#368ScottMillard wants to merge 13 commits into
Conversation
There was a problem hiding this comment.
Python dependency stuff
I had to install python [and boto3] in each container
@ScottMillard, there is a toggle-backup-activation.sh for each of the 3 databases that takes care of Python and Python dependency installation:
Those scripts are also still looking for the AWS_… access key environment variables to determine whether or not S3 is being used. You mention that you now use "boto's built in method to figure out how to authenticate"; maybe that could also be used to determine (in conjunction with BACKUP_AWS_STORAGE_BUCKET_NAME?) whether S3 backups are configured, dropping the reliance on AWS_… env. vars.
Streaming directly to S3
We did a fair amount of work to stream backups directly to S3 instead of writing them out completely to a local disk prior to upload, since that can require a very large amount of space to be available in /tmp/ or wherever. Can that still be done with boto3?
I've updated the files to install boto3 and tested them on my local kobo-install/kobo-docker. I've removed
Thank you for this John - I've updated the PR and added back in using smart_open to stream directly to S3 - I also added it for the redis backup which did not previously use it. |
jnm
left a comment
There was a problem hiding this comment.
@ScottMillard sorry I haven't re-reviewed this for real, but here's Claude's take with light revisions just to avoid delays. I think the first two issues at least are valid:
Redis S3 backup is broken
The redis
backup-to-s3.pyusesPopen(..., stdout=subprocess.PIPE)to stream into S3 viasmart_open. ButBACKUP_COMMANDcallsbackup-to-disk.bash, which writes the gzipped file to/srv/backups/and only prints a success message to stdout. The streaming code will capture that message text, not the actual backup data. The uploaded S3 "backup" will be empty/trivial.Fix: either have the bash script output gzip data to stdout, or upload the resulting file from
/srv/backups/after the script completes.Postgres timezone mismatch will crash
The postgres
backup-to-s3.pyuses naivedatetime.datetime.now()but compares againstbackup.last_modifiedfrom boto3, which is timezone-aware (UTC). This will raiseTypeError: can't subtract offset-naive and offset-aware datetimeswhen cleanup runs. The mongo and redis scripts correctly usedatetime.datetime.now(datetime.timezone.utc)— postgres was missed.Medium: No subprocess return code check
After streaming from
process.stdout, none of the scripts callprocess.wait()or checkprocess.returncode. Ifmongodumporpg_dumpfails mid-stream, a truncated/corrupt file gets silently uploaded to S3 and counted as a valid backup by retention logic.Low:
smart_openunpinned
smart-open[s3]is installed without a version constraint. Thesmart_open.smart_open()function used in the scripts is deprecated (replaced bysmart_open.open()in v5+). Not urgent but could break on a future pip install.Pre-existing:
keepsvsdayssemantic confusion in cleanupThe cleanup logic compares
delta.days > keepswherekeepsis a retention count (e.g., 2 yearly backups). This conflates "how many to keep" with "age in days." This is carried over from the old code, not introduced by this PR.
Backwards compatibility with kobo-install
Good news here — fully compatible:
AWS_ACCESS_KEY_IDandAWS_SECRET_ACCESS_KEYfrom kobo-install'saws.txt.tplare still passed through to cron and picked up by boto3's credential chain- They're just no longer required, so IAM roles work too
- No docker-compose, volume, service name, or network changes
- kobo-install doesn't need changes for existing key-based users
If kobo-install wants to support IAM-role-only deployments (no keys), it would need a minor update to make access keys optional in its config flow, but that's a separate concern.
Verdict
Two critical bugs block merge: the redis backup producing empty files and the postgres timezone crash. The subprocess return code check is a strong "should fix" to prevent silent data loss. The rest are low-priority.
I've fixed this, very good catch. Makes sense that previously the redis didn't use smart_open to stream to S3, but I left it in place anyway. I validated the file sent to S3 contains the actual rdb backup.
This is resolved also. |
|
@greptileai welcome to this repo. Please review! |
|
This updates the s3 backup scripts to work with boto3 so that they can take advantage of short lived credentials. It removes any codified reference to the AWS access/secret key and instead relies on boto's built in method to figure out how to authenticate, which will use many different methods which will allow this script to work in many ways for users that want to authenticate differently.
This will require boto3 to be installed in the environment it is running in.
For the redis backup, it removes the dependency for s3cmd and instead relies on boto for the upload. The redis backup did not previously use smart_open to stream direct to S3 but it has been updated to do so to match the others.
I tested this locally, results are below - I had to install python in each container and authorized to AWS by exporting short lived AWS credentials to the shell prior to running. I validated the file was in the bucket afterwards and deleted them.
I'm not a python dev so copilot did most the heavy lifting 💪
I also added the
backups.mdfile with some information on how to enable/disable backups.redis:
postgres:
mongo: