worker-python

The Cellenics single cell analysis tasks wrapper, written in Python.

Overview

The Python part of the worker is the main entry point for data analysis tasks. It fulfills the following functions:

  • Receives tasks from the API by listening to an SQS queue associated with the relevant experiment ID.
  • Prepares the received task for the R part of the worker. This is task-dependent and may include additional task validation, cleaning and formatting the data, or fetching additional data from AWS services.
  • Forwards the task to the R part of the worker, using the local network.
  • Receives the results back from the R part of the worker, uploads the data to S3, and uses the socket.io API to send a notification to Redis that the task has been computed.
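
The four responsibilities above can be pictured as a single pass through a pipeline. The following is only a minimal sketch, not the worker's actual code: WorkerSteps, run_once and every field name are hypothetical, and the real SQS, R, S3 and Redis calls are replaced by plain callables so the example runs without any AWS services.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Task = Dict[str, Any]


@dataclass
class WorkerSteps:
    """Hypothetical bundle of the four stages; these names are not from the worker code."""

    receive: Callable[[], Task]           # stands in for reading a task from the SQS queue
    prepare: Callable[[Task], Task]       # stands in for validation, cleaning, extra fetches
    forward_to_r: Callable[[Task], Any]   # stands in for the call to the R part of the worker
    publish: Callable[[Task, Any], None]  # stands in for the S3 upload and the notification


def run_once(steps: WorkerSteps) -> Any:
    """Run one task through the four stages, in the order listed above."""
    task = steps.receive()
    prepared = steps.prepare(task)
    result = steps.forward_to_r(prepared)
    steps.publish(prepared, result)
    return result


if __name__ == "__main__":
    published = []  # collects what the stand-in publish step was given
    demo_steps = WorkerSteps(
        receive=lambda: {"experimentId": "demo", "body": {"name": "ExampleTask"}},
        prepare=lambda task: {**task, "validated": True},
        forward_to_r=lambda task: {"echo": task["body"]},
        publish=lambda task, result: published.append((task["experimentId"], result)),
    )
    print(run_once(demo_steps))
    print(published)
```

Passing the stages in as callables is a choice made for this illustration, so that each one can be swapped for a stub; it says nothing about how the worker itself is structured.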

Running the worker

Start the worker

See the main README for instructions on how to run the workers in Docker.

Process tasks

Tasks are automatically processed when they are received from the SQS queue specified.

For local development, make sure you have InfraMock running alongside the ui and api projects. Refer to their respective documentation for how to run them locally. Once all of these are running, tasks are submitted and processed automatically when you perform actions in the ui. There is nothing else to do.

Advanced: pushing custom work to the local worker

You can also push work to a locally running worker instance without using the ui or the api projects.

First, make sure you have InfraMock running. Then, you can use aws-cli to send a payload directly to the queue the worker is listening to:

aws --endpoint-url=http://localhost:4566 sqs send-message --queue-url http://localhost:4566/queue/development-queue.fifo --message-body "$(< payload.json)" --message-group-id "$(date +%s)"

This will push the payload in the file payload.json to the SQS queue the worker is listening to.

The payload.json file should include the following:

    {
      "ETag": "",
      "socketId": "",
      "experimentId": "",
      "Authorization": "",
      "timeout": "2099-09-23T13:46:32.522Z",
      "body": {},
      "PipelineRunETag": "",
      "broadcast": false
    }

The values can be generated by running an example experiment in production and looking at the values in the console.
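
If you would rather generate payload.json than edit it by hand, a small script can fill in the fields listed above. This is a minimal sketch using only the standard library. The helper names build_payload and write_payload are hypothetical, and the experiment ID and body in the example are placeholders; real values still have to be copied from an actual experiment as described above.

```python
import json
from pathlib import Path
from typing import Any, Dict


def build_payload(
    experiment_id: str,
    body: Dict[str, Any],
    timeout: str = "2099-09-23T13:46:32.522Z",
) -> Dict[str, Any]:
    """Return a dict with the same keys as the payload.json template.

    Fields not passed in are left as empty strings, exactly as in the template.
    """
    return {
        "ETag": "",
        "socketId": "",
        "experimentId": experiment_id,
        "Authorization": "",
        "timeout": timeout,
        "body": body,
        "PipelineRunETag": "",
        "broadcast": False,
    }


def write_payload(payload: Dict[str, Any], path: str = "payload.json") -> Path:
    """Serialize the payload to disk so the aws-cli command can read it."""
    target = Path(path)
    target.write_text(json.dumps(payload, indent=2))
    return target


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = build_payload("my-experiment-id", {})
    print(write_payload(example))
```

The far-future default timeout is taken from the template.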

Development

Open the Visual Studio Code workspace:

code python.code-workspace

You should be prompted to run the code in a container. If not, hit Cmd+Shift+P and search for Remote Containers: Reopen in Container. Selecting this item will cause the Python worker to reopen in a container for development.

Tests

You can usually run make test-py in the root directory to execute the unit tests.

Task formatting

Task definitions are stored in the api project as an OpenAPI schema. You can find this here.

Download the schema and open it using Stoplight Studio. Looking into WorkRequest should give you the schemas and parameters for all supported tasks.