Add Celery beat task archive_pending_files to automatically archive files that have been created but not made live after 24 hours. #4875

Merged
rparke merged 1 commit into main from rp-cleanup-pending-files-scheduled-task on Jun 15, 2026

Conversation

@rparke rparke commented Jun 8, 2026

This is vital cleanup work: it ensures that stale pending files don't persist. Once archived, they are eventually removed from S3 too, when they are picked up by the scheduled task remove-archived-template-email-files-from-s3.

24 hours was chosen because that's the session timeout for our users, which should be long enough that if they were going to make the file live, they'd have done it by then.
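As a rough illustration of the 24-hour rule described above, here is a minimal sketch of the staleness check. The function name `is_stale` and the module-level default are assumptions for illustration only; the PR itself reads the period from the `TEMPLATE_EMAIL_FILE_ARCHIVE_PERIOD_IN_HOURS` config value.

```python
import datetime

# Hypothetical default for illustration; the PR reads the period from the
# TEMPLATE_EMAIL_FILE_ARCHIVE_PERIOD_IN_HOURS config value instead.
ARCHIVE_PERIOD_IN_HOURS = 24


def is_stale(created_at, now, period_in_hours=ARCHIVE_PERIOD_IN_HOURS):
    """Return True if a pending file was created more than `period_in_hours` ago."""
    return now - created_at > datetime.timedelta(hours=period_in_hours)


now = datetime.datetime(2026, 6, 8, 12, 0, 0)
print(is_stale(datetime.datetime(2026, 6, 7, 11, 59, 0), now))  # True: just over 24 hours old
print(is_stale(datetime.datetime(2026, 6, 7, 12, 0, 0), now))   # False: exactly 24 hours old
```

The comparison is strict, so a file that is exactly 24 hours old is left alone until the next run of the task.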

@rparke rparke force-pushed the rp-cleanup-pending-files-scheduled-task branch 2 times, most recently from b4a0df2 to 8ed0f9d on June 8, 2026 10:04
Comment thread app/config.py Outdated
Comment thread tests/app/celery/test_scheduled_tasks.py Outdated
Comment thread app/dao/template_email_files_dao.py Outdated
@rparke rparke force-pushed the rp-cleanup-pending-files-scheduled-task branch from 8ed0f9d to 1775df8 on June 8, 2026 10:22
Comment thread app/dao/template_email_files_dao.py Outdated

@CrystalPea CrystalPea left a comment

Nice work! I left a few comments 🙌🏼

Comment thread app/celery/scheduled_tasks.py Outdated
Comment thread app/celery/scheduled_tasks.py Outdated
datetime.datetime.utcnow() - TemplateEmailFile.created_at
> datetime.timedelta(hours=current_app.config.get("TEMPLATE_EMAIL_FILE_ARCHIVE_PERIOD_IN_HOURS")),
).all()
for file in files_in_pending:

Could we archive all pending files in a single transaction, instead of looping? That should make the query much faster.

Contributor Author

I don't know how much faster this will make the query, because we don't get thousands of new files a day (at the moment). I can update it if needed.
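For reference, the suggestion in this thread (archiving every stale pending file in one statement rather than looping) could look roughly like the sketch below. This is not the code from the PR: it uses the standard-library `sqlite3` module as a stand-in for the project's database layer, and the table name, the `pending` and `archived_at` columns, and the function signature are all hypothetical.

```python
import datetime
import sqlite3


def archive_pending_files(conn, now, period_in_hours=24):
    """Archive every stale pending file with one UPDATE; return the row count."""
    # ISO-8601 strings in the same format compare in chronological order.
    cutoff = (now - datetime.timedelta(hours=period_in_hours)).isoformat()
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE template_email_files "
            "SET archived_at = ? "
            "WHERE pending = 1 AND archived_at IS NULL AND created_at < ?",
            (now.isoformat(), cutoff),
        )
    return cursor.rowcount


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE template_email_files ("
    "id INTEGER PRIMARY KEY, created_at TEXT NOT NULL, "
    "pending INTEGER NOT NULL, archived_at TEXT)"
)
now = datetime.datetime(2026, 6, 8, 12, 0, 0)
rows = [
    ((now - datetime.timedelta(hours=30)).isoformat(), 1),  # stale and pending
    ((now - datetime.timedelta(hours=2)).isoformat(), 1),   # pending but recent
    ((now - datetime.timedelta(hours=30)).isoformat(), 0),  # old but already live
]
conn.executemany(
    "INSERT INTO template_email_files (created_at, pending) VALUES (?, ?)", rows
)
conn.commit()

archived_count = archive_pending_files(conn, now)
print(archived_count)  # 1: only the stale pending row is archived
```

Comparing `created_at` against a precomputed cutoff, rather than subtracting inside the filter, keeps the condition a plain column comparison. Whether the single statement is measurably faster at current volumes is the open question raised in the reply above.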

Comment thread app/config.py
Comment thread tests/app/celery/test_scheduled_tasks.py
Comment thread tests/app/celery/test_scheduled_tasks.py
Comment thread app/celery/scheduled_tasks.py Outdated
Comment thread tests/app/db.py Outdated
Comment thread app/celery/scheduled_tasks.py
Comment thread app/dao/template_email_files_dao.py
Comment thread app/dao/template_email_files_dao.py
…e files that have been created but not made live after 24 hours.

This is vital cleanup work that ensures that these stale files don't persist and are properly cleaned up, eventually being removed from S3 too.
@rparke rparke force-pushed the rp-cleanup-pending-files-scheduled-task branch from 87f2cc1 to 152bb0c on June 15, 2026 11:47
@rparke rparke merged commit 8181bbe into main Jun 15, 2026
10 checks passed
@rparke rparke deleted the rp-cleanup-pending-files-scheduled-task branch June 15, 2026 12:01
3 participants