diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json index b290e09..e2a410d 100644 --- a/.devcontainer/devcontainer.json +++ b/.devcontainer/devcontainer.json @@ -1,4 +1,5 @@ { + "$schema": "https://raw.githubusercontent.com/devcontainers/spec/main/schemas/devContainer.schema.json", "name": "nfcore", "image": "nfcore/gitpod:latest", "remoteUser": "gitpod", diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md deleted file mode 100644 index 3a5f6ae..0000000 --- a/.github/CONTRIBUTING.md +++ /dev/null @@ -1,125 +0,0 @@ -# `nf-core/dmscore`: Contributing Guidelines - -Hi there! -Many thanks for taking an interest in improving nf-core/dmscore. - -We try to manage the required tasks for nf-core/dmscore using GitHub issues, you probably came to this page when creating one. -Please use the pre-filled template to save time. - -However, don't be put off by this template - other more general issues and suggestions are welcome! -Contributions to the code are even more welcome ;) - -> [!NOTE] -> If you need help using or modifying nf-core/dmscore then the best place to ask is on the nf-core Slack [#dmscore](https://nfcore.slack.com/channels/dmscore) channel ([join our Slack here](https://nf-co.re/join/slack)). - -## Contribution workflow - -If you'd like to write some code for nf-core/dmscore, the standard workflow is as follows: - -1. Check that there isn't already an issue about your idea in the [nf-core/dmscore issues](https://github.com/nf-core/dmscore/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this -2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/dmscore repository](https://github.com/nf-core/dmscore) to your GitHub account -3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions) -4. 
Use `nf-core pipelines schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). -5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged - -If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/). - -## Tests - -You have the option to test your changes locally by running the pipeline. For receiving warnings about process selectors and other `debug` information, it is recommended to use the debug profile. Execute all the tests with the following command: - -```bash -nf-test test --profile debug,test,docker --verbose -``` - -When you create a pull request with changes, [GitHub Actions](https://github.com/features/actions) will run automatic tests. -Typically, pull-requests are only fully reviewed when these tests are passing, though of course we can help out before then. - -There are typically two types of tests that run: - -### Lint tests - -`nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to. -To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core pipelines lint ` command. - -If any failures or warnings are encountered, please follow the listed URL for more documentation. - -### Pipeline tests - -Each `nf-core` pipeline should be set up with a minimal set of test-data. -`GitHub Actions` then runs the pipeline on this data to ensure that it exits successfully. -If there are any failures then the automated tests fail. 
-These tests are run both with the latest available version of `Nextflow` and also the minimum required version that is stated in the pipeline code. - -## Patch - -:warning: Only in the unlikely and regretful event of a release happening with a bug. - -- On your own fork, make a new branch `patch` based on `upstream/main` or `upstream/master`. -- Fix the bug, and bump version (X.Y.Z+1). -- Open a pull-request from `patch` to `main`/`master` with the changes. - -## Getting help - -For further information/help, please consult the [nf-core/dmscore documentation](https://nf-co.re/dmscore/usage) and don't hesitate to get in touch on the nf-core Slack [#dmscore](https://nfcore.slack.com/channels/dmscore) channel ([join our Slack here](https://nf-co.re/join/slack)). - -## Pipeline contribution conventions - -To make the `nf-core/dmscore` code and processing logic more understandable for new contributors and to ensure quality, we semi-standardise the way the code and other contributions are written. - -### Adding a new step - -If you wish to contribute a new step, please use the following coding standards: - -1. Define the corresponding input channel into your new process from the expected previous process channel. -2. Write the process block (see below). -3. Define the output channel if needed (see below). -4. Add any new parameters to `nextflow.config` with a default (see below). -5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool). -6. Add sanity checks and validation for all relevant parameters. -7. Perform local tests to validate that the new code works as expected. -8. If applicable, add a new test command in `.github/workflow/ci.yml`. -9. Update MultiQC config `assets/multiqc_config.yml` so relevant suffixes, file name clean up and module plots are in the appropriate order. If applicable, add a [MultiQC](https://https://multiqc.info/) module. -10. 
Add a description of the output files and if relevant any appropriate images from the MultiQC report to `docs/output.md`. - -### Default values - -Parameters should be initialised / defined with default values within the `params` scope in `nextflow.config`. - -Once there, use `nf-core pipelines schema build` to add to `nextflow_schema.json`. - -### Default processes resource requirements - -Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. - -The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block. - -### Naming schemes - -Please use the following naming schemes, to make it easy to understand what is going where. - -- initial process channel: `ch_output_from_` -- intermediate and terminal channels: `ch__for_` - -### Nextflow version bumping - -If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]` - -### Images and figures - -For overview images and other documents we follow the nf-core [style guidelines and examples](https://nf-co.re/developers/design_guidelines). - -## GitHub Codespaces - -This repo includes a devcontainer configuration which will create a GitHub Codespaces for Nextflow development! 
This is an online developer environment that runs in your browser, complete with VSCode and a terminal. - -To get started: - -- Open the repo in [Codespaces](https://github.com/nf-core/dmscore/codespaces) -- Tools installed - - nf-core - - Nextflow - -Devcontainer specs: - -- [DevContainer config](.devcontainer/devcontainer.json) diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml index 58a5523..78f6876 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.yml +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -8,7 +8,7 @@ body: Before you post this issue, please check the documentation: - [nf-core website: troubleshooting](https://nf-co.re/usage/troubleshooting) - - [nf-core/dmscore pipeline documentation](https://nf-co.re/dmscore/usage) + - [nf-core/deepmutscan pipeline documentation](https://nf-co.re/deepmutscan/usage) - type: textarea id: description attributes: @@ -46,4 +46,4 @@ body: * Executor _(eg. slurm, local, awsbatch)_ * Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter, Charliecloud, or Apptainer)_ * OS _(eg. CentOS Linux, macOS, Linux Mint)_ - * Version of nf-core/dmscore _(eg. 1.1, 1.5, 1.8.2)_ + * Version of nf-core/deepmutscan _(eg. 
1.1, 1.5, 1.8.2)_ diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml index 08f0c49..572e0cf 100644 --- a/.github/ISSUE_TEMPLATE/config.yml +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -2,6 +2,6 @@ contact_links: - name: Join nf-core url: https://nf-co.re/join about: Please join the nf-core community here - - name: "Slack #dmscore channel" - url: https://nfcore.slack.com/channels/dmscore - about: Discussion about the nf-core/dmscore pipeline + - name: "Slack #deepmutscan channel" + url: https://nfcore.slack.com/channels/deepmutscan + about: Discussion about the nf-core/deepmutscan pipeline diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml index a08dd5f..0731019 100644 --- a/.github/ISSUE_TEMPLATE/feature_request.yml +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -1,5 +1,5 @@ name: Feature request -description: Suggest an idea for the nf-core/dmscore pipeline +description: Suggest an idea for the nf-core/deepmutscan pipeline labels: enhancement body: - type: textarea diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 0d6f133..1e78c19 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -1,22 +1,22 @@ ## PR checklist - [ ] This comment contains a description of changes (with reason). - [ ] If you've fixed a bug or added code that should be tested, add tests! -- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/dmscore/tree/master/.github/CONTRIBUTING.md) -- [ ] If necessary, also make a PR on the nf-core/dmscore _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. 
+- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/deepmutscan/tree/master/docs/CONTRIBUTING.md) +- [ ] If necessary, also make a PR on the nf-core/deepmutscan _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. - [ ] Make sure your code lints (`nf-core pipelines lint`). - [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir `). - [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir `). diff --git a/.github/actions/get-shards/action.yml b/.github/actions/get-shards/action.yml new file mode 100644 index 0000000..e2833ee --- /dev/null +++ b/.github/actions/get-shards/action.yml @@ -0,0 +1,69 @@ +name: "Get number of shards" +description: "Get the number of nf-test shards for the current CI job" +inputs: + max_shards: + description: "Maximum number of shards allowed" + required: true + paths: + description: "Component paths to test" + required: false + tags: + description: "Tags to pass as argument for nf-test --tag parameter" + required: false +outputs: + shard: + description: "Array of shard numbers" + value: ${{ steps.shards.outputs.shard }} + total_shards: + description: "Total number of shards" + value: ${{ steps.shards.outputs.total_shards }} +runs: + using: "composite" + steps: + - name: Install nf-test + uses: nf-core/setup-nf-test@4069fbbaabe94c08faba4ad261bfa88225ba133f # v2 + with: + version: ${{ env.NFT_VER }} + - name: Get number of shards + id: shards + shell: bash + run: | + # Run nf-test with dynamic parameter + nftest_output=$(nf-test test \ + --profile +docker \ + $(if [ -n "${{ inputs.tags }}" ]; then echo "--tag ${{ inputs.tags }}"; fi) \ + --dry-run \ + --ci \ + --changed-since HEAD^) || { + echo "nf-test command failed with exit code $?" 
+ echo "Full output: $nftest_output" + exit 1 + } + echo "nf-test dry-run output: $nftest_output" + + # Default values for shard and total_shards + shard="[]" + total_shards=0 + + # Check if there are related tests + if echo "$nftest_output" | grep -q 'No tests to execute'; then + echo "No related tests found." + else + # Extract the number of related tests + number_of_shards=$(echo "$nftest_output" | sed -n 's|.*Executed \([0-9]*\) tests.*|\1|p') + if [[ -n "$number_of_shards" && "$number_of_shards" -gt 0 ]]; then + shards_to_run=$(( $number_of_shards < ${{ inputs.max_shards }} ? $number_of_shards : ${{ inputs.max_shards }} )) + shard=$(seq 1 "$shards_to_run" | jq -R . | jq -c -s .) + total_shards="$shards_to_run" + else + echo "Unexpected output format. Falling back to default values." + fi + fi + + # Write to GitHub Actions outputs + echo "shard=$shard" >> $GITHUB_OUTPUT + echo "total_shards=$total_shards" >> $GITHUB_OUTPUT + + # Debugging output + echo "Final shard array: $shard" + echo "Total number of shards: $total_shards" diff --git a/.github/actions/nf-test/action.yml b/.github/actions/nf-test/action.yml new file mode 100644 index 0000000..ad686e8 --- /dev/null +++ b/.github/actions/nf-test/action.yml @@ -0,0 +1,111 @@ +name: "nf-test Action" +description: "Runs nf-test with common setup steps" +inputs: + profile: + description: "Profile to use" + required: true + shard: + description: "Shard number for this CI job" + required: true + total_shards: + description: "Total number of test shards(NOT the total number of matrix jobs)" + required: true + paths: + description: "Test paths" + required: true + tags: + description: "Tags to pass as argument for nf-test --tag parameter" + required: false +runs: + using: "composite" + steps: + - name: Setup Nextflow + uses: nf-core/setup-nextflow@b4ec1bc7c16a94435159de94a05253542fddf6ef # v3 + with: + version: "${{ env.NXF_VERSION }}" + + - name: Set up Python + uses: 
actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6 + with: + python-version: "3.14" + + - name: Install nf-test + uses: nf-core/setup-nf-test@4069fbbaabe94c08faba4ad261bfa88225ba133f # v2 + with: + version: "${{ env.NFT_VER }}" + install-pdiff: true + + - name: Setup apptainer + if: contains(inputs.profile, 'singularity') + uses: eWaterCycle/setup-apptainer@3f706d898c9db585b1d741b4692e66755f3a1b40 # v2 + + - name: Set up Singularity + if: contains(inputs.profile, 'singularity') + shell: bash + run: | + mkdir -p $NXF_SINGULARITY_CACHEDIR + mkdir -p $NXF_SINGULARITY_LIBRARYDIR + + - name: Conda setup + if: contains(inputs.profile, 'conda') + uses: conda-incubator/setup-miniconda@8ee1f361103df19b6f8c8655fd3967a8ecb162d5 # v4 + with: + auto-update-conda: true + conda-solver: libmamba + channels: conda-forge + channel-priority: strict + conda-remove-defaults: true + + - name: Run nf-test + shell: bash + env: + NFT_WORKDIR: ${{ env.NFT_WORKDIR }} + run: | + nf-test test \ + --profile=+${{ inputs.profile }} \ + $(if [ -n "${{ inputs.tags }}" ]; then echo "--tag ${{ inputs.tags }}"; fi) \ + --ci \ + --changed-since HEAD^ \ + --verbose \ + --tap=test.tap \ + --shard ${{ inputs.shard }}/${{ inputs.total_shards }} + + # Save the absolute path of the test.tap file to the output + echo "tap_file_path=$(realpath test.tap)" >> $GITHUB_OUTPUT + + - name: Generate test summary + if: always() + shell: bash + run: | + # Add header if it doesn't exist (using a token file to track this) + if [ ! 
-f ".summary_header" ]; then + echo "# 🚀 nf-test results" >> $GITHUB_STEP_SUMMARY + echo "" >> $GITHUB_STEP_SUMMARY + echo "| Status | Test Name | Profile | Shard |" >> $GITHUB_STEP_SUMMARY + echo "|:------:|-----------|---------|-------|" >> $GITHUB_STEP_SUMMARY + touch .summary_header + fi + + if [ -f test.tap ]; then + while IFS= read -r line; do + if [[ $line =~ ^ok ]]; then + test_name="${line#ok }" + # Remove the test number from the beginning + test_name="${test_name#* }" + echo "| ✅ | ${test_name} | ${{ inputs.profile }} | ${{ inputs.shard }}/${{ inputs.total_shards }} |" >> $GITHUB_STEP_SUMMARY + elif [[ $line =~ ^not\ ok ]]; then + test_name="${line#not ok }" + # Remove the test number from the beginning + test_name="${test_name#* }" + echo "| ❌ | ${test_name} | ${{ inputs.profile }} | ${{ inputs.shard }}/${{ inputs.total_shards }} |" >> $GITHUB_STEP_SUMMARY + fi + done < test.tap + else + echo "| ⚠️ | No test results found | ${{ inputs.profile }} | ${{ inputs.shard }}/${{ inputs.total_shards }} |" >> $GITHUB_STEP_SUMMARY + fi + + - name: Clean up + if: always() + shell: bash + run: | + sudo rm -rf /home/ubuntu/tests/ diff --git a/.github/workflows/awsfulltest.yml b/.github/workflows/awsfulltest.yml index 50287a8..a094526 100644 --- a/.github/workflows/awsfulltest.yml +++ b/.github/workflows/awsfulltest.yml @@ -4,66 +4,64 @@ name: nf-core AWS full size tests # It runs the -profile 'test_full' on AWS batch on: - pull_request: - branches: - - main - - master workflow_dispatch: pull_request_review: types: [submitted] + release: + types: [published] jobs: run-platform: name: Run AWS full tests - # run only if the PR is approved by at least 2 reviewers and against the master branch or manually triggered - if: github.repository == 'nf-core/dmscore' && github.event.review.state == 'approved' && github.event.pull_request.base.ref == 'master' || github.event_name == 'workflow_dispatch' + # run only if the PR is approved by at least 2 reviewers and against the 
master/main branch or manually triggered + if: github.repository == 'nf-core/deepmutscan' && github.event.review.state == 'approved' && (github.event.pull_request.base.ref == 'master' || github.event.pull_request.base.ref == 'main') || github.event_name == 'workflow_dispatch' || github.event_name == 'release' runs-on: ubuntu-latest steps: - - name: Get PR reviews - uses: octokit/request-action@v2.x - if: github.event_name != 'workflow_dispatch' - id: check_approvals - continue-on-error: true - with: - route: GET /repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/reviews?per_page=100 - env: - GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} - - - name: Check for approvals - if: ${{ failure() && github.event_name != 'workflow_dispatch' }} - run: | - echo "No review approvals found. At least 2 approvals are required to run this action automatically." - exit 1 - - - name: Check for enough approvals (>=2) - id: test_variables - if: github.event_name != 'workflow_dispatch' + - name: Set revision variable + id: revision run: | - JSON_RESPONSE='${{ steps.check_approvals.outputs.data }}' - CURRENT_APPROVALS_COUNT=$(echo $JSON_RESPONSE | jq -c '[.[] | select(.state | contains("APPROVED")) ] | length') - test $CURRENT_APPROVALS_COUNT -ge 2 || exit 1 # At least 2 approvals are required + echo "revision=${{ (github.event_name == 'workflow_dispatch' || github.event_name == 'release') && github.sha || 'dev' }}" >> "$GITHUB_OUTPUT" - name: Launch workflow via Seqera Platform - uses: seqeralabs/action-tower-launch@v2 + uses: seqeralabs/action-tower-launch@51565b514bff1827cf34620de25d0055759f1fc9 # v2 # TODO nf-core: You can customise AWS full pipeline tests as required # Add full size test data (but still relatively small datasets for few samples) # on the `test_full.config` test runs with only one set of parameters with: - workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }} + workspace_id: ${{ vars.TOWER_WORKSPACE_ID }} access_token: ${{ secrets.TOWER_ACCESS_TOKEN 
}} - compute_env: ${{ secrets.TOWER_COMPUTE_ENV }} - revision: ${{ github.sha }} - workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/dmscore/work-${{ github.sha }} + compute_env: ${{ vars.TOWER_COMPUTE_ENV }} + revision: ${{ steps.revision.outputs.revision }} + workdir: s3://${{ vars.AWS_S3_BUCKET }}/work/deepmutscan/work-${{ steps.revision.outputs.revision }} + nextflow_config: | + plugins { + id 'nf-slack@0.5.0' + } + slack { + enabled = true + bot { + token = '${{ secrets.NFSLACK_BOT_TOKEN }}' + channel = 'deepmutscan' + } + onStart { + enabled = false + } + onComplete { + message = ':white_check_mark: *deepmutscan/test_full* completed successfully! :tada:' + } + onError { + message = ':x: *deepmutscan/test_full* failed :crying_cat_face:' + } + } parameters: | { - "hook_url": "${{ secrets.MEGATESTS_ALERTS_SLACK_HOOK_URL }}", - "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/dmscore/results-${{ github.sha }}" + "outdir": "s3://${{ vars.AWS_S3_BUCKET }}/deepmutscan/results-${{ steps.revision.outputs.revision }}" } profiles: test_full - - uses: actions/upload-artifact@v4 + - uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7 with: name: Seqera Platform debug log file path: | - seqera_platform_action_*.log - seqera_platform_action_*.json + tower_action_*.log + tower_action_*.json diff --git a/.github/workflows/awstest.yml b/.github/workflows/awstest.yml index 4a4b5db..2471331 100644 --- a/.github/workflows/awstest.yml +++ b/.github/workflows/awstest.yml @@ -7,27 +7,27 @@ on: jobs: run-platform: name: Run AWS tests - if: github.repository == 'nf-core/dmscore' + if: github.repository == 'nf-core/deepmutscan' runs-on: ubuntu-latest steps: # Launch workflow using Seqera Platform CLI tool action - name: Launch workflow via Seqera Platform - uses: seqeralabs/action-tower-launch@v2 + uses: seqeralabs/action-tower-launch@51565b514bff1827cf34620de25d0055759f1fc9 # v2 with: - workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }} + workspace_id: ${{ 
vars.TOWER_WORKSPACE_ID }} access_token: ${{ secrets.TOWER_ACCESS_TOKEN }} - compute_env: ${{ secrets.TOWER_COMPUTE_ENV }} + compute_env: ${{ vars.TOWER_COMPUTE_ENV }} revision: ${{ github.sha }} - workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/dmscore/work-${{ github.sha }} + workdir: s3://${{ vars.AWS_S3_BUCKET }}/work/deepmutscan/work-${{ github.sha }} parameters: | { - "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/dmscore/results-test-${{ github.sha }}" + "outdir": "s3://${{ vars.AWS_S3_BUCKET }}/deepmutscan/results-test-${{ github.sha }}" } profiles: test - - uses: actions/upload-artifact@v4 + - uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7 with: name: Seqera Platform debug log file path: | - seqera_platform_action_*.log - seqera_platform_action_*.json + tower_action_*.log + tower_action_*.json diff --git a/.github/workflows/branch.yml b/.github/workflows/branch.yml index 92e3db0..ef39819 100644 --- a/.github/workflows/branch.yml +++ b/.github/workflows/branch.yml @@ -13,15 +13,15 @@ jobs: steps: # PRs to the nf-core repo main/master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches - name: Check PRs - if: github.repository == 'nf-core/dmscore' + if: github.repository == 'nf-core/deepmutscan' run: | - { [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/dmscore ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]] + { [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/deepmutscan ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]] # If the above check failed, post a comment on the PR explaining the failure # NOTE - this doesn't currently work if the PR is coming from a fork, due to limitations in GitHub actions secrets - name: Post PR comment if: failure() - uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2 + uses: mshick/add-pr-comment@8e4927817251f1ff60c001f04568532b38e0b4a0 # v3 with: message: | 
## This PR is against the `${{github.event.pull_request.base.ref}}` branch :x: diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml deleted file mode 100644 index 92396ce..0000000 --- a/.github/workflows/ci.yml +++ /dev/null @@ -1,87 +0,0 @@ -name: nf-core CI -# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors -on: - push: - branches: - - dev - pull_request: - release: - types: [published] - workflow_dispatch: - -env: - NXF_ANSI_LOG: false - NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity - NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity - -concurrency: - group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}" - cancel-in-progress: true - -jobs: - test: - name: "Run pipeline with test data (${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }})" - # Only run on push if this is the nf-core dev branch (merged PRs) - if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/dmscore') }}" - runs-on: ubuntu-latest - strategy: - matrix: - NXF_VER: - - "24.04.2" - - "latest-everything" - profile: - - "conda" - - "docker" - - "singularity" - test_name: - - "test" - isMaster: - - ${{ github.base_ref == 'master' }} - # Exclude conda and singularity on dev - exclude: - - isMaster: false - profile: "conda" - - isMaster: false - profile: "singularity" - steps: - - name: Check out pipeline code - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 - with: - fetch-depth: 0 - - - name: Set up Nextflow - uses: nf-core/setup-nextflow@v2 - with: - version: "${{ matrix.NXF_VER }}" - - - name: Set up Apptainer - if: matrix.profile == 'singularity' - uses: eWaterCycle/setup-apptainer@main - - - name: Set up Singularity - if: matrix.profile == 'singularity' - run: | - mkdir -p $NXF_SINGULARITY_CACHEDIR - mkdir -p $NXF_SINGULARITY_LIBRARYDIR - - - name: Set up Miniconda - 
if: matrix.profile == 'conda' - uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3 - with: - miniconda-version: "latest" - auto-update-conda: true - conda-solver: libmamba - channels: conda-forge,bioconda - - - name: Set up Conda - if: matrix.profile == 'conda' - run: | - echo $(realpath $CONDA)/condabin >> $GITHUB_PATH - echo $(realpath python) >> $GITHUB_PATH - - - name: Clean up Disk space - uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 - - - name: "Run pipeline with test data ${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }}" - run: | - nextflow run ${GITHUB_WORKSPACE} -profile ${{ matrix.test_name }},${{ matrix.profile }} --outdir ./results diff --git a/.github/workflows/clean-up.yml b/.github/workflows/clean-up.yml index 0b6b1f2..172de6f 100644 --- a/.github/workflows/clean-up.yml +++ b/.github/workflows/clean-up.yml @@ -10,7 +10,7 @@ jobs: issues: write pull-requests: write steps: - - uses: actions/stale@28ca1036281a5e5922ead5184a1bbf96e5fc984e # v9 + - uses: actions/stale@b5d41d4e1d5dceea10e7104786b73624c18a190f # v10 with: stale-issue-message: "This issue has been tagged as awaiting-changes or awaiting-feedback by an nf-core contributor. Remove stale label or add a comment otherwise this issue will be closed in 20 days." stale-pr-message: "This PR has been tagged as awaiting-changes or awaiting-feedback by an nf-core contributor. Remove stale label or add a comment if it is still useful." 
diff --git a/.github/workflows/download_pipeline.yml b/.github/workflows/download_pipeline.yml index ab06316..a7bf4fc 100644 --- a/.github/workflows/download_pipeline.yml +++ b/.github/workflows/download_pipeline.yml @@ -12,14 +12,6 @@ on: required: true default: "dev" pull_request: - types: - - opened - - edited - - synchronize - branches: - - main - - master - pull_request_target: branches: - main - master @@ -46,15 +38,18 @@ jobs: runs-on: ubuntu-latest needs: configure steps: + - name: Check out pipeline code + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 + - name: Install Nextflow - uses: nf-core/setup-nextflow@v2 + uses: nf-core/setup-nextflow@b4ec1bc7c16a94435159de94a05253542fddf6ef # v3 - name: Disk space cleanup uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 - - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 + - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6 with: - python-version: "3.12" + python-version: "3.14" architecture: "x64" - name: Setup Apptainer @@ -62,10 +57,15 @@ jobs: with: apptainer-version: 1.3.4 + - name: Read .nf-core.yml + id: read_yml + run: | + echo "nf_core_version=$(yq '.nf_core_version' ${{ github.workspace }}/.nf-core.yml)" >> "$GITHUB_OUTPUT" + - name: Install dependencies run: | python -m pip install --upgrade pip - pip install git+https://github.com/nf-core/tools.git@dev + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} - name: Make a cache directory for the container images run: | @@ -120,6 +120,7 @@ jobs: echo "IMAGE_COUNT_AFTER=$image_count" >> "$GITHUB_OUTPUT" - name: Compare container image counts + id: count_comparison run: | if [ "${{ steps.count_initial.outputs.IMAGE_COUNT_INITIAL }}" -ne "${{ steps.count_afterwards.outputs.IMAGE_COUNT_AFTER }}" ]; then initial_count=${{ steps.count_initial.outputs.IMAGE_COUNT_INITIAL }} @@ -132,3 +133,10 @@ jobs: else echo "The pipeline can be downloaded 
successfully!" fi + + - name: Upload Nextflow logfile for debugging purposes + uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7 + with: + name: nextflow_logfile.txt + path: .nextflow.log* + include-hidden-files: true diff --git a/.github/workflows/fix-linting.yml b/.github/workflows/fix-linting.yml index addd34e..8bda64e 100644 --- a/.github/workflows/fix-linting.yml +++ b/.github/workflows/fix-linting.yml @@ -9,7 +9,7 @@ jobs: if: > contains(github.event.comment.html_url, '/pull/') && contains(github.event.comment.body, '@nf-core-bot fix linting') && - github.repository == 'nf-core/dmscore' + github.repository == 'nf-core/deepmutscan' runs-on: ubuntu-latest steps: # Use the @nf-core-bot token to check out so we can push later @@ -86,4 +86,4 @@ jobs: issue-number: ${{ github.event.issue.number }} body: | @${{ github.actor }} I tried to fix the linting errors, but it didn't work. Please fix them manually. - See [CI log](https://github.com/nf-core/dmscore/actions/runs/${{ github.run_id }}) for more details. + See [CI log](https://github.com/nf-core/deepmutscan/actions/runs/${{ github.run_id }}) for more details. 
diff --git a/.github/workflows/fix_linting.yml b/.github/workflows/fix_linting.yml new file mode 100644 index 0000000..844a1c0 --- /dev/null +++ b/.github/workflows/fix_linting.yml @@ -0,0 +1,85 @@ +name: Fix linting from a comment +on: + issue_comment: + types: [created] + +jobs: + fix-linting: + # Only run if comment is on a PR with the main repo, and if it contains the magic keywords + if: > + contains(github.event.comment.html_url, '/pull/') && + contains(github.event.comment.body, '@nf-core-bot fix linting') && + github.repository == 'nf-core/deepmutscan' + runs-on: ubuntu-latest + steps: + # Use the @nf-core-bot token to check out so we can push later + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 + with: + token: ${{ secrets.nf_core_bot_auth_token }} + + # indication that the linting is being fixed + - name: React on comment + uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5 + with: + comment-id: ${{ github.event.comment.id }} + reactions: eyes + + # Action runs on the issue comment, so we don't get the PR by default + # Use the gh cli to check out the PR + - name: Checkout Pull Request + run: gh pr checkout ${{ github.event.issue.number }} + env: + GITHUB_TOKEN: ${{ secrets.nf_core_bot_auth_token }} + + - name: Install Nextflow + uses: nf-core/setup-nextflow@b4ec1bc7c16a94435159de94a05253542fddf6ef # v3 + + # Install and run prek + - name: Run prek + id: prek + uses: j178/prek-action@6ad80277337ad479fe43bd70701c3f7f8aa74db3 # v2 + continue-on-error: true + + # indication that the linting has finished + - name: react if linting finished succesfully + if: steps.prek.outcome == 'success' + uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5 + with: + comment-id: ${{ github.event.comment.id }} + reactions: "+1" + + - name: Commit & push changes + id: commit-and-push + if: steps.prek.outcome == 'failure' + run: | + git config user.email "core@nf-co.re" + git 
config user.name "nf-core-bot" + git config push.default upstream + git add . + git status + git commit -m "[automated] Fix code linting" + git push + + - name: react if linting errors were fixed + id: react-if-fixed + if: steps.commit-and-push.outcome == 'success' + uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5 + with: + comment-id: ${{ github.event.comment.id }} + reactions: hooray + + - name: react if linting errors were not fixed + if: steps.commit-and-push.outcome == 'failure' + uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5 + with: + comment-id: ${{ github.event.comment.id }} + reactions: confused + + - name: react if linting errors were not fixed + if: steps.commit-and-push.outcome == 'failure' + uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5 + with: + issue-number: ${{ github.event.issue.number }} + body: | + @${{ github.actor }} I tried to fix the linting errors, but it didn't work. Please fix them manually. + See [CI log](https://github.com/nf-core/deepmutscan/actions/runs/${{ github.run_id }}) for more details. diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index dbd52d5..8738ffc 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -3,9 +3,6 @@ name: nf-core linting # It runs the `nf-core pipelines lint` and markdown lint tests to ensure # that the code meets the nf-core guidelines. 
on: - push: - branches: - - dev pull_request: release: types: [published] @@ -14,46 +11,42 @@ jobs: pre-commit: runs-on: ubuntu-latest steps: - - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 - - name: Set up Python 3.12 - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 - with: - python-version: "3.12" - - - name: Install pre-commit - run: pip install pre-commit + - name: Install Nextflow + uses: nf-core/setup-nextflow@b4ec1bc7c16a94435159de94a05253542fddf6ef # v3 - - name: Run pre-commit - run: pre-commit run --all-files + - name: Run prek + uses: j178/prek-action@6ad80277337ad479fe43bd70701c3f7f8aa74db3 # v2 nf-core: runs-on: ubuntu-latest steps: - name: Check out pipeline code - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 - name: Install Nextflow - uses: nf-core/setup-nextflow@v2 + uses: nf-core/setup-nextflow@b4ec1bc7c16a94435159de94a05253542fddf6ef # v3 - - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 + - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6 with: - python-version: "3.12" + python-version: "3.14" architecture: "x64" + - name: Setup uv + uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0 + - name: read .nf-core.yml - uses: pietrobolcato/action-read-yaml@1.1.0 + uses: pietrobolcato/action-read-yaml@9f13718d61111b69f30ab4ac683e67a56d254e1d # 1.1.0 id: read_yml with: config: ${{ github.workspace }}/.nf-core.yml - name: Install dependencies - run: | - python -m pip install --upgrade pip - pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + run: uv tool install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} - name: Run nf-core pipelines lint - if: ${{ github.base_ref != 'master' }} + if: ${{ github.base_ref != 'master' &&
github.base_ref != 'main' }} env: GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} @@ -61,7 +54,7 @@ jobs: run: nf-core -l lint_log.txt pipelines lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md - name: Run nf-core pipelines lint --release - if: ${{ github.base_ref == 'master' }} + if: ${{ github.base_ref == 'master' || github.base_ref == 'main' }} env: GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} @@ -74,7 +67,7 @@ jobs: - name: Upload linting log file artifact if: ${{ always() }} - uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4 + uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7 with: name: linting-logs path: | diff --git a/.github/workflows/linting_comment.yml b/.github/workflows/linting_comment.yml index 0bed96d..5b0c24f 100644 --- a/.github/workflows/linting_comment.yml +++ b/.github/workflows/linting_comment.yml @@ -11,7 +11,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Download lint results - uses: dawidd6/action-download-artifact@80620a5d27ce0ae443b965134db88467fc607b43 # v7 + uses: dawidd6/action-download-artifact@b6e2e70617bc3265edd6dab6c906732b2f1ae151 # v21 with: workflow: linting.yml workflow_conclusion: completed @@ -21,7 +21,7 @@ jobs: run: echo "pr_number=$(cat linting-logs/PR_number.txt)" >> $GITHUB_OUTPUT - name: Post PR comment - uses: marocchino/sticky-pull-request-comment@331f8f5b4215f0445d3c07b4967662a32a2d3e31 # v2 + uses: marocchino/sticky-pull-request-comment@70d2764d1a7d5d9560b100cbea0077fc8f633987 # v3 with: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} number: ${{ steps.pr_number.outputs.pr_number }} diff --git a/.github/workflows/nf-test.yml b/.github/workflows/nf-test.yml new file mode 100644 index 0000000..efd72d6 --- /dev/null +++ b/.github/workflows/nf-test.yml @@ -0,0 +1,144 @@ +name: Run nf-test +on: + pull_request: + paths-ignore: + - "docs/**" + 
- "**/meta.yml" + - "**/*.md" + - "**/*.png" + - "**/*.svg" + release: + types: [published] + workflow_dispatch: + +# Cancel if a newer run is started +concurrency: + group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }} + cancel-in-progress: true + +env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + NFT_VER: "0.9.4" + NFT_WORKDIR: "~" + NXF_ANSI_LOG: false + NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity + NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity + +jobs: + nf-test-changes: + name: nf-test-changes + runs-on: # use self-hosted runners + - runs-on=${{ github.run_id }}-nf-test-changes + - runner=4cpu-linux-x64 + outputs: + shard: ${{ steps.set-shards.outputs.shard }} + total_shards: ${{ steps.set-shards.outputs.total_shards }} + steps: + - name: Clean Workspace # Purge the workspace in case it's running on a self-hosted runner + run: | + ls -la ./ + rm -rf ./* || true + rm -rf ./.??* || true + ls -la ./ + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 + with: + fetch-depth: 0 + + - name: get number of shards + id: set-shards + uses: ./.github/actions/get-shards + env: + NFT_VER: ${{ env.NFT_VER }} + with: + max_shards: 7 + + - name: debug + run: | + echo ${{ steps.set-shards.outputs.shard }} + echo ${{ steps.set-shards.outputs.total_shards }} + + nf-test: + name: "${{ matrix.profile }} | ${{ matrix.NXF_VER }} | ${{ matrix.shard }}/${{ needs.nf-test-changes.outputs.total_shards }}" + needs: [nf-test-changes] + if: ${{ needs.nf-test-changes.outputs.total_shards != '0' }} + runs-on: # use self-hosted runners + - runs-on=${{ github.run_id }}-nf-test + - runner=4cpu-linux-x64 + strategy: + fail-fast: false + matrix: + shard: ${{ fromJson(needs.nf-test-changes.outputs.shard) }} + profile: [conda, docker, singularity] + isMain: + - ${{ github.base_ref == 'master' || github.base_ref == 'main' }} + # Exclude conda and singularity on dev + exclude: + - isMain: false + profile: "conda" 
+ - isMain: false + profile: "singularity" + NXF_VER: + - "25.10.4" + - "latest-everything" + env: + NXF_ANSI_LOG: false + TOTAL_SHARDS: ${{ needs.nf-test-changes.outputs.total_shards }} + + steps: + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 + with: + fetch-depth: 0 + + - name: Run nf-test + id: run_nf_test + uses: ./.github/actions/nf-test + continue-on-error: ${{ matrix.NXF_VER == 'latest-everything' }} + env: + NFT_WORKDIR: ${{ env.NFT_WORKDIR }} + NXF_VERSION: ${{ matrix.NXF_VER }} + with: + profile: ${{ matrix.profile }} + shard: ${{ matrix.shard }} + total_shards: ${{ env.TOTAL_SHARDS }} + + - name: Report test status + if: ${{ always() }} + run: | + if [[ "${{ steps.run_nf_test.outcome }}" == "failure" ]]; then + echo "::error::Test with ${{ matrix.NXF_VER }} failed" + # Add to workflow summary + echo "## ❌ Test failed: ${{ matrix.profile }} | ${{ matrix.NXF_VER }} | Shard ${{ matrix.shard }}/${{ env.TOTAL_SHARDS }}" >> $GITHUB_STEP_SUMMARY + if [[ "${{ matrix.NXF_VER }}" == "latest-everything" ]]; then + echo "::warning::Test with latest-everything failed but will not cause workflow failure. Please check if the error is expected or if it needs fixing." 
+ fi + if [[ "${{ matrix.NXF_VER }}" != "latest-everything" ]]; then + exit 1 + fi + fi + + confirm-pass: + needs: [nf-test] + if: always() + runs-on: # use self-hosted runners + - runs-on=${{ github.run_id }}-confirm-pass + - runner=2cpu-linux-x64 + steps: + - name: One or more tests failed (excluding latest-everything) + if: ${{ contains(needs.*.result, 'failure') }} + run: exit 1 + + - name: One or more tests cancelled + if: ${{ contains(needs.*.result, 'cancelled') }} + run: exit 1 + + - name: All tests ok + if: ${{ contains(needs.*.result, 'success') }} + run: exit 0 + + - name: debug-print + if: always() + run: | + echo "::group::DEBUG: `needs` Contents" + echo "DEBUG: toJSON(needs) = ${{ toJSON(needs) }}" + echo "DEBUG: toJSON(needs.*.result) = ${{ toJSON(needs.*.result) }}" + echo "::endgroup::" diff --git a/.github/workflows/release-announcements.yml b/.github/workflows/release-announcements.yml index 450b1d5..78d5dbe 100644 --- a/.github/workflows/release-announcements.yml +++ b/.github/workflows/release-announcements.yml @@ -14,7 +14,11 @@ jobs: run: | echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" | sed 's/-//g' >> $GITHUB_OUTPUT - - uses: rzr/fediverse-action@master + - name: get description + id: get_description + run: | + echo "description=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .description')" >> $GITHUB_OUTPUT + - uses: rzr/fediverse-action@563159eb8d45f70ab6aaba36ed55cd037e51f441 # master with: access-token: ${{ secrets.MASTODON_ACCESS_TOKEN }} host: "mstdn.science" # custom host if not "mastodon.social" (default) @@ -22,48 +26,15 @@ jobs: # https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads#release message: | Pipeline release! 
${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! - + ${{ steps.get_description.outputs.description }} Please see the changelog: ${{ github.event.release.html_url }} ${{ steps.get_topics.outputs.topics }} #nfcore #openscience #nextflow #bioinformatics - send-tweet: - runs-on: ubuntu-latest - - steps: - - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5 - with: - python-version: "3.10" - - name: Install dependencies - run: pip install tweepy==4.14.0 - - name: Send tweet - shell: python - run: | - import os - import tweepy - - client = tweepy.Client( - access_token=os.getenv("TWITTER_ACCESS_TOKEN"), - access_token_secret=os.getenv("TWITTER_ACCESS_TOKEN_SECRET"), - consumer_key=os.getenv("TWITTER_CONSUMER_KEY"), - consumer_secret=os.getenv("TWITTER_CONSUMER_SECRET"), - ) - tweet = os.getenv("TWEET") - client.create_tweet(text=tweet) - env: - TWEET: | - Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! - - Please see the changelog: ${{ github.event.release.html_url }} - TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }} - TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }} - TWITTER_ACCESS_TOKEN: ${{ secrets.TWITTER_ACCESS_TOKEN }} - TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }} - bsky-post: runs-on: ubuntu-latest steps: - - uses: zentered/bluesky-post-action@80dbe0a7697de18c15ad22f4619919ceb5ccf597 # v0.1.0 + - uses: zentered/bluesky-post-action@5a91cc2ad10a304a4e96c16182dbe4918710bcf6 # v0.4.0 with: post: | Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! 
diff --git a/.github/workflows/template-version-comment.yml b/.github/workflows/template-version-comment.yml new file mode 100644 index 0000000..ea30827 --- /dev/null +++ b/.github/workflows/template-version-comment.yml @@ -0,0 +1,46 @@ +name: nf-core template version comment +# This workflow is triggered on PRs to check if the pipeline template version matches the latest nf-core version. +# It posts a comment to the PR, even if it comes from a fork. + +on: pull_request_target + +jobs: + template_version: + runs-on: ubuntu-latest + steps: + - name: Check out pipeline code + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6 + with: + ref: ${{ github.event.pull_request.head.sha }} + + - name: Read template version from .nf-core.yml + uses: nichmor/minimal-read-yaml@1f7205277e25e156e1f63815781db80a6d490b8f # v0.0.2 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + + - name: Install nf-core + run: | + python -m pip install --upgrade pip + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Check nf-core outdated + id: nf_core_outdated + run: echo "OUTPUT=$(pip list --outdated | grep nf-core)" >> ${GITHUB_ENV} + + - name: Post nf-core template version comment + uses: mshick/add-pr-comment@8e4927817251f1ff60c001f04568532b38e0b4a0 # v3 + if: | + contains(env.OUTPUT, 'nf-core') + with: + repo-token: ${{ secrets.NF_CORE_BOT_AUTH_TOKEN }} + allow-repeats: false + message: | + > [!WARNING] + > Newer version of the nf-core template is available. + > + > Your pipeline is using an old version of the nf-core template: ${{ steps.read_yml.outputs['nf_core_version'] }}. + > Please update your pipeline to the latest version. + > + > For more documentation on how to update your pipeline, please see the [Synchronisation documentation](https://nf-co.re/docs/developing/template-syncs/overview). 
+ # diff --git a/.gitignore b/.gitignore index a42ce01..0187478 100644 --- a/.gitignore +++ b/.gitignore @@ -7,3 +7,11 @@ testing/ testing* *.pyc null/ +.lineage/ + +# Pipeline outputs +test_results/ +work/ +.nextflow* +.nf-test* +.Rhistory diff --git a/.nf-core.yml b/.nf-core.yml index 59e948b..4d3ffb7 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -1,12 +1,8 @@ -repository_type: pipeline - -nf_core_version: 3.1.2 - lint: {} - +nf_core_version: 4.0.2 +repository_type: pipeline template: - org: nf-core - name: dmscore + author: Benjamin Wehnert & Max Stammnitz description: "Until now, most Deep Mutational Scanning (DMS) experiments relied\ \ on variant-specific barcoded libraries for sequencing. This method enabled DMS\ \ on large proteins and led to many great publications. Recently, efforts have\ @@ -17,8 +13,9 @@ template: \ files and generating a count table of variants. Along the way, it provides multiple\ \ QC metrics, enabling users to quickly evaluate the success of their experimental\ \ setup." - author: Benjamin Wehnert & Max Stammnitz - version: 1.0.0dev - force: true - outdir: . + force: false is_nfcore: true + name: deepmutscan + org: nf-core + outdir: . 
+ version: 1.0.0 diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 9e9f0e1..f51e1a2 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -4,10 +4,30 @@ repos: hooks: - id: prettier additional_dependencies: - - prettier@3.2.5 - - - repo: https://github.com/editorconfig-checker/editorconfig-checker.python - rev: "3.0.3" + - prettier@3.8.3 + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 hooks: - - id: editorconfig-checker - alias: ec + - id: trailing-whitespace + args: [--markdown-linebreak-ext=md] + exclude: | + (?x)^( + .*ro-crate-metadata.json$| + modules/(?!local/).*| + subworkflows/(?!local/).*| + .*\.snap$ + )$ + - id: end-of-file-fixer + exclude: | + (?x)^( + .*ro-crate-metadata.json$| + modules/(?!local/).*| + subworkflows/(?!local/).*| + .*\.snap$ + )$ + - repo: https://github.com/seqeralabs/nf-lint-pre-commit + rev: v0.3.0 + hooks: + - id: nextflow-lint + files: '\.nf$|nextflow\.config$' + args: ["-output", "json"] diff --git a/.prettierignore b/.prettierignore index edd29f0..63cde50 100644 --- a/.prettierignore +++ b/.prettierignore @@ -1,6 +1,4 @@ email_template.html -adaptivecard.json -slackreport.json .nextflow* work/ data/ @@ -10,4 +8,7 @@ testing/ testing* *.pyc bin/ +.nf-test/ ro-crate-metadata.json +modules/nf-core/ +subworkflows/nf-core/ diff --git a/.prettierrc.yml b/.prettierrc.yml index c81f9a7..07dbd8b 100644 --- a/.prettierrc.yml +++ b/.prettierrc.yml @@ -1 +1,6 @@ printWidth: 120 +tabWidth: 4 +overrides: + - files: "*.{md,yml,yaml,html,css,scss,js,cff}" + options: + tabWidth: 2 diff --git a/CHANGELOG.md b/CHANGELOG.md index 40ea82f..0039a1b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,11 +1,11 @@ -# nf-core/dmscore: Changelog +# nf-core/deepmutscan: Changelog The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). 
-## v1.0.0dev - [date] +## v1.0.0 - [date] -Initial release of nf-core/dmscore, created with the [nf-core](https://nf-co.re/) template. +Initial release of nf-core/deepmutscan, created with the [nf-core](https://nf-co.re/) template. ### `Added` diff --git a/CITATIONS.md b/CITATIONS.md index a7e69f7..3ea1015 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -1,4 +1,4 @@ -# nf-core/dmscore: Citations +# nf-core/deepmutscan: Citations ## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/) diff --git a/LICENSE b/LICENSE index 48133aa..8119d7b 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -Copyright (c) The nf-core/dmscore team +Copyright (c) The nf-core/deepmutscan team Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/README.md b/README.md index b6ae647..7a45e11 100644 --- a/README.md +++ b/README.md @@ -1,106 +1,125 @@

- - nf-core/dmscore + + nf-core/deepmutscan

-[![GitHub Actions CI Status](https://github.com/nf-core/dmscore/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-core/dmscore/actions/workflows/ci.yml) -[![GitHub Actions Linting Status](https://github.com/nf-core/dmscore/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/dmscore/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/dmscore/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX) +[![GitHub Actions CI Status](https://github.com/nf-core/deepmutscan/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-core/deepmutscan/actions/workflows/ci.yml) +[![GitHub Actions Linting Status](https://github.com/nf-core/deepmutscan/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/deepmutscan/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/deepmutscan/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX) [![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com) -[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/) +[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.10.4-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/) +[![nf-core template version](https://img.shields.io/badge/nf--core_template-4.0.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/4.0.2) [![run with 
conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) -[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/dmscore) +[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/deepmutscan) -[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23dmscore-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/dmscore)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core) +[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23deepmutscan-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/deepmutscan)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core) ## Introduction -**nf-core/dmscore** is a bioinformatics 
pipeline that ... +**nf-core/deepmutscan** is a workflow designed for the analysis of deep mutational scanning (DMS) data. DMS enables researchers to experimentally measure the fitness effects of thousands of genes or gene variants simultaneously, helping to classify disease causing mutants in human and animal populations, to learn the fundamental rules of protein architecture, small-molecule binding, mRNA splicing, viral evolution and many other quantifiable phenotypes. - +While DNA synthesis and sequencing technologies have advanced substantially, long open reading frame (ORF) targets still present a major challenge for DMS studies. Shotgun DNA sequencing can be used to greatly speed up the inference of long ORF mutant fitness landscapes, theoretically at no expense in accuracy. We have designed the `nf-core/deepmutscan` pipeline to unlock the power of shotgun sequencing based DMS studies on long ORFs, to simplify and standardise the complex bioinformatics steps involved in data processing of such experiments – from read alignment to QC reporting and fitness landscape inferences. - -1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/)) +![nf-core/deepmutscan workflow](docs/images/pipeline.png) -## Usage +The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. 
Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community! -> [!NOTE] -> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data. +On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/deepmutscan/results). - +1. Alignment of reads to the reference open reading frame (ORF) (`BWA-mem`) +2. Filtering of wildtype and erroneous reads (`samtools view`) +3. Read merging for base error reduction (`vsearch merge`) +4. Mutation counting +5. Single nucleotide variant error correction +6. DMS library quality control +7. Data summarisation across samples +8. Fitness estimation (`DiMSum`, `mutscan`) -Now, you can run the pipeline using: +## Usage - +> [!NOTE] +> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/get_started/environment_setup/overview) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/get_started/run-your-first-pipeline) with `-profile test` before running the workflow on actual data. 
-```bash -nextflow run nf-core/dmscore \ - -profile \ - --input samplesheet.csv \ - --outdir +First, prepare a samplesheet with your input/output data in which each row represents a pair of fastq files (paired end). This should look as follows: + +```csv title="samplesheet.csv" +sample,type,replicate,file1,file2 +ORF1,input,1,/reads/forward1.fastq.gz,/reads/reverse1.fastq.gz +ORF1,input,2,/reads/forward2.fastq.gz,/reads/reverse2.fastq.gz +ORF1,output,1,/reads/forward3.fastq.gz,/reads/reverse3.fastq.gz +ORF1,output,2,/reads/forward4.fastq.gz,/reads/reverse4.fastq.gz ``` -> [!WARNING] -> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files). +Secondly, specify the gene or gene region of interest using a reference FASTA file via `--fasta`. Provide the exact codon coordinates using `--reading_frame`. -For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/dmscore/usage) and the [parameter documentation](https://nf-co.re/dmscore/parameters). +Now, you can run the pipeline using: + +```bash title="example_run.sh" +nextflow run nf-core/deepmutscan \ + -profile \ + --input ./samplesheet.csv \ + --fasta ./ref.fa \ + --reading_frame 1-300 \ + --outdir ./results +``` ## Pipeline output -To see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/dmscore/results) tab on the nf-core website pipeline page. +To see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/deepmutscan/results) tab on the nf-core website pipeline page. + For more details about the output files and reports, please refer to the -[output documentation](https://nf-co.re/dmscore/output). 
+[output documentation](https://nf-co.re/deepmutscan/output). -## Credits +## Contributing + +We welcome contributions from the community! -nf-core/dmscore was originally written by Benjamin Wehnert & Max Stammnitz. +For technical challenges and feedback on the pipeline, please use our [Github repository](https://github.com/nf-core/deepmutscan). Please open an [issue](https://github.com/nf-core/deepmutscan/issues/new) or [pull request](https://github.com/nf-core/deepmutscan/compare) to: -We thank the following people for their extensive assistance in the development of this pipeline: +- Report bugs or solve data incompatibilities when running `nf-core/deepmutscan` +- Suggest the implementation of new modules for custom DMS workflows +- Help improve this documentation - +If you are interested in getting involved as a developer, please consider joining our interactive [`#deepmutscan` Slack channel](https://nfcore.slack.com/channels/deepmutscan) (via [this invite](https://nf-co.re/join/slack)). + +## Credits -## Contributions and Support +nf-core/deepmutscan was originally written by [Benjamin Wehnert](https://github.com/BenjaminWehnert1008) and [Max Stammnitz](https://github.com/MaximilianStammnitz) at the [Centre for Genomic Regulation, Barcelona](https://www.crg.eu/), with the generous support of an EMBO Long-term Postdoctoral Fellowship and a Marie Skłodowska-Curie grant by the European Union. -If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md). +If you use `nf-core/deepmutscan` in your analyses, please cite: -For further information or help, don't hesitate to get in touch on the [Slack `#dmscore` channel](https://nfcore.slack.com/channels/dmscore) (you can join with [this invite](https://nf-co.re/join/slack)). 
+> 📄 Wehnert et al., _bioRxiv_ preprint (coming soon) -## Citations +Please also cite the `nf-core` framework: - - +> 📄 Ewels et al., _Nature Biotechnology_, 2020 +> [https://doi.org/10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x) - +For further information or help, don't hesitate to get in touch on the [Slack `#deepmutscan` channel](https://nfcore.slack.com/channels/deepmutscan) (you can join with [this invite](https://nf-co.re/join/slack)). -An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file. +## Scientific contact -You can cite the `nf-core` publication as follows: +For scientific discussions around the use of this pipeline (e.g. on experimental design or sequencing data requirements), please feel free to get in touch with us directly: -> **The nf-core framework for community-curated bioinformatics pipelines.** -> -> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen. -> -> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x). 
+- Benjamin Wehnert — wehnertbenjamin@gmail.com +- Maximilian Stammnitz — maximilian.stammnitz@crg.eu diff --git a/assets/adaptivecard.json b/assets/adaptivecard.json deleted file mode 100644 index 62ad62a..0000000 --- a/assets/adaptivecard.json +++ /dev/null @@ -1,67 +0,0 @@ -{ - "type": "message", - "attachments": [ - { - "contentType": "application/vnd.microsoft.card.adaptive", - "contentUrl": null, - "content": { - "\$schema": "http://adaptivecards.io/schemas/adaptive-card.json", - "msteams": { - "width": "Full" - }, - "type": "AdaptiveCard", - "version": "1.2", - "body": [ - { - "type": "TextBlock", - "size": "Large", - "weight": "Bolder", - "color": "<% if (success) { %>Good<% } else { %>Attention<%} %>", - "text": "nf-core/dmscore v${version} - ${runName}", - "wrap": true - }, - { - "type": "TextBlock", - "spacing": "None", - "text": "Completed at ${dateComplete} (duration: ${duration})", - "isSubtle": true, - "wrap": true - }, - { - "type": "TextBlock", - "text": "<% if (success) { %>Pipeline completed successfully!<% } else { %>Pipeline completed with errors. 
The full error message was: ${errorReport}.<% } %>", - "wrap": true - }, - { - "type": "TextBlock", - "text": "The command used to launch the workflow was as follows:", - "wrap": true - }, - { - "type": "TextBlock", - "text": "${commandLine}", - "isSubtle": true, - "wrap": true - } - ], - "actions": [ - { - "type": "Action.ShowCard", - "title": "Pipeline Configuration", - "card": { - "type": "AdaptiveCard", - "\$schema": "http://adaptivecards.io/schemas/adaptive-card.json", - "body": [ - { - "type": "FactSet", - "facts": [<% out << summary.collect{ k,v -> "{\"title\": \"$k\", \"value\" : \"$v\"}"}.join(",\n") %> - ] - } - ] - } - } - ] - } - } - ] -} diff --git a/assets/email_template.html b/assets/email_template.html index 8baab9e..c051109 100644 --- a/assets/email_template.html +++ b/assets/email_template.html @@ -4,21 +4,21 @@ - - nf-core/dmscore Pipeline Report + + nf-core/deepmutscan Pipeline Report
-

nf-core/dmscore ${version}

+

nf-core/deepmutscan ${version}

Run Name: $runName

<% if (!success){ out << """
-

nf-core/dmscore execution completed unsuccessfully!

+

nf-core/deepmutscan execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: $exitStatus.

The full error message was:

${errorReport}
@@ -27,7 +27,7 @@

nf-core/dmscore execution completed un } else { out << """
- nf-core/dmscore execution completed successfully! + nf-core/deepmutscan execution completed successfully!
""" } @@ -44,8 +44,8 @@

Pipeline Configuration:

-

nf-core/dmscore

-

https://github.com/nf-core/dmscore

+

nf-core/deepmutscan

+

https://github.com/nf-core/deepmutscan

diff --git a/assets/email_template.txt b/assets/email_template.txt index 4247bc4..907c089 100644 --- a/assets/email_template.txt +++ b/assets/email_template.txt @@ -4,15 +4,15 @@ |\\ | |__ __ / ` / \\ |__) |__ } { | \\| | \\__, \\__/ | \\ |___ \\`-._,-`-, `._,._,' - nf-core/dmscore ${version} + nf-core/deepmutscan ${version} ---------------------------------------------------- Run Name: $runName <% if (success){ - out << "## nf-core/dmscore execution completed successfully! ##" + out << "## nf-core/deepmutscan execution completed successfully! ##" } else { out << """#################################################### -## nf-core/dmscore execution completed unsuccessfully! ## +## nf-core/deepmutscan execution completed unsuccessfully! ## #################################################### The exit status of the task that caused the workflow execution to fail was: $exitStatus. The full error message was: @@ -35,5 +35,5 @@ Pipeline Configuration: <% out << summary.collect{ k,v -> " - $k: $v" }.join("\n") %> -- -nf-core/dmscore -https://github.com/nf-core/dmscore +nf-core/deepmutscan +https://github.com/nf-core/deepmutscan diff --git a/assets/methods_description_template.yml b/assets/methods_description_template.yml index e4e7ab3..0076d67 100644 --- a/assets/methods_description_template.yml +++ b/assets/methods_description_template.yml @@ -1,13 +1,13 @@ -id: "nf-core-dmscore-methods-description" +id: "nf-core-deepmutscan-methods-description" description: "Suggested text and references to use when describing pipeline usage within the methods section of a publication." -section_name: "nf-core/dmscore Methods Description" -section_href: "https://github.com/nf-core/dmscore" +section_name: "nf-core/deepmutscan Methods Description" +section_href: "https://github.com/nf-core/deepmutscan" plot_type: "html" ## TODO nf-core: Update the HTML below to your preferred methods description, e.g. 
add publication citation for this pipeline ## You inject any metadata in the Nextflow '${workflow}' object data: |

Methods

-

Data was processed using nf-core/dmscore v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.

+

Data was processed using nf-core/deepmutscan v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.

The pipeline was executed with Nextflow v${workflow.nextflow.version} (Di Tommaso et al., 2017) with the following command:

${workflow.commandLine}

${tool_citations}

diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml index 2f6ceb9..bcfed63 100644 --- a/assets/multiqc_config.yml +++ b/assets/multiqc_config.yml @@ -1,13 +1,11 @@ report_comment: > - This report has been generated by the nf-core/dmscore - analysis pipeline. For information about how to interpret these results, please see the - documentation. + This report has been generated by the nf-core/deepmutscan analysis pipeline. For information about how to interpret these results, please see the documentation. report_section_order: - "nf-core-dmscore-methods-description": + "nf-core-deepmutscan-methods-description": order: -1000 software_versions: order: -1001 - "nf-core-dmscore-summary": + "nf-core-deepmutscan-summary": order: -1002 export_plots: true diff --git a/assets/nf-core-deepmutscan_logo_light.png b/assets/nf-core-deepmutscan_logo_light.png new file mode 100644 index 0000000..d6c8e55 Binary files /dev/null and b/assets/nf-core-deepmutscan_logo_light.png differ diff --git a/assets/nf-core-dmscore_logo_light.png b/assets/nf-core-dmscore_logo_light.png deleted file mode 100644 index 3faa4ca..0000000 Binary files a/assets/nf-core-dmscore_logo_light.png and /dev/null differ diff --git a/assets/samplesheet.csv b/assets/samplesheet.csv index 5f653ab..5b5d503 100644 --- a/assets/samplesheet.csv +++ b/assets/samplesheet.csv @@ -1,3 +1,5 @@ -sample,fastq_1,fastq_2 -SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz -SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz, +sample,type,replicate,file1,file2 +ORF1,input,1,/reads/forward1.fastq.gz,/reads/reverse1.fastq.gz +ORF1,input,2,/reads/forward2.fastq.gz,/reads/reverse2.fastq.gz +ORF1,output,1,/reads/forward3.fastq.gz,/reads/reverse3.fastq.gz +ORF1,output,2,/reads/forward4.fastq.gz,/reads/reverse4.fastq.gz diff --git a/assets/schema_input.json b/assets/schema_input.json index 2d9d8f6..134ee17 100644 --- 
a/assets/schema_input.json +++ b/assets/schema_input.json @@ -1,7 +1,7 @@ { "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/dmscore/master/assets/schema_input.json", - "title": "nf-core/dmscore pipeline - params.input schema", + "$id": "https://raw.githubusercontent.com/nf-core/deepmutscan/master/assets/schema_input.json", + "title": "nf-core/deepmutscan pipeline - params.input schema", "description": "Schema for the file provided with params.input", "type": "array", "items": { @@ -9,25 +9,46 @@ "properties": { "sample": { "type": "string", - "pattern": "^\\S+$", - "errorMessage": "Sample name must be provided and cannot contain spaces", + "pattern": "^[^\\s/]+$", + "errorMessage": "Sample name must be provided, cannot contain spaces, and must not include special characters", "meta": ["id"] }, - "fastq_1": { + "file1": { "type": "string", "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", - "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" + "pattern": "^\\S+\\.(bam|fastq|fq|fastq\\.gz|fq\\.gz)$", + "allOf": [ + { + "if": { + "pattern": "^\\S+\\.bam$" + }, + "then": { + "pattern": "^\\S+_(pe|se)\\.bam$", + "errorMessage": "If file1 ends with .bam, it must contain '_pe.bam' or '_se.bam', defining paired-end or single-end" + } + } + ], + "errorMessage": "File 1 must be provided, cannot contain spaces, and must have an allowed extension (.bam, .fastq, .fq, .fastq.gz, .fq.gz)" }, - "fastq_2": { - "type": "string", + "file2": { + "type": ["string", "null"], "format": "file-path", "exists": true, - "pattern": "^\\S+\\.f(ast)?q\\.gz$", - "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" + "pattern": "^\\S+\\.(fastq|fq|fastq\\.gz|fq\\.gz)$", + "errorMessage": "File 2 must have an allowed extension (.fastq, .fq, .fastq.gz, .fq.gz) or be null for 
single-end reads" + }, + "type": { + "type": "string", + "enum": ["input", "output", "quality"], + "errorMessage": "Type must be one of: input, output, or quality" + }, + "replicate": { + "type": "integer", + "minimum": 1, + "errorMessage": "Replicate must be a positive integer" } }, - "required": ["sample", "fastq_1"] + "required": ["sample", "file1", "type", "replicate"] } } diff --git a/assets/sendmail_template.txt b/assets/sendmail_template.txt index cdd2e19..4c84df3 100644 --- a/assets/sendmail_template.txt +++ b/assets/sendmail_template.txt @@ -9,12 +9,12 @@ Content-Type: text/html; charset=utf-8 $email_html --nfcoremimeboundary -Content-Type: image/png;name="nf-core-dmscore_logo.png" +Content-Type: image/png;name="nf-core-deepmutscan_logo.png" Content-Transfer-Encoding: base64 Content-ID: -Content-Disposition: inline; filename="nf-core-dmscore_logo_light.png" +Content-Disposition: inline; filename="nf-core-deepmutscan_logo_light.png" -<% out << new File("$projectDir/assets/nf-core-dmscore_logo_light.png"). +<% out << new File("$projectDir/assets/nf-core-deepmutscan_logo_light.png"). bytes. encodeBase64(). toString(). 
diff --git a/assets/slackreport.json b/assets/slackreport.json deleted file mode 100644 index e5aa3f8..0000000 --- a/assets/slackreport.json +++ /dev/null @@ -1,34 +0,0 @@ -{ - "attachments": [ - { - "fallback": "Plain-text summary of the attachment.", - "color": "<% if (success) { %>good<% } else { %>danger<%} %>", - "author_name": "nf-core/dmscore ${version} - ${runName}", - "author_icon": "https://www.nextflow.io/docs/latest/_static/favicon.ico", - "text": "<% if (success) { %>Pipeline completed successfully!<% } else { %>Pipeline completed with errors<% } %>", - "fields": [ - { - "title": "Command used to launch the workflow", - "value": "```${commandLine}```", - "short": false - } - <% - if (!success) { %> - , - { - "title": "Full error message", - "value": "```${errorReport}```", - "short": false - }, - { - "title": "Pipeline configuration", - "value": "<% out << summary.collect{ k,v -> k == "hook_url" ? "_${k}_: (_hidden_)" : ( ( v.class.toString().contains('Path') || ( v.class.toString().contains('String') && v.contains('/') ) ) ? "_${k}_: `${v}`" : (v.class.toString().contains('DateTime') ? 
("_${k}_: " + v.format(java.time.format.DateTimeFormatter.ofLocalizedDateTime(java.time.format.FormatStyle.MEDIUM))) : "_${k}_: ${v}") ) }.join(",\n") %>", - "short": false - } - <% } - %> - ], - "footer": "Completed at <% out << dateComplete.format(java.time.format.DateTimeFormatter.ofLocalizedDateTime(java.time.format.FormatStyle.MEDIUM)) %> (duration: ${duration})" - } - ] -} diff --git a/conf/base.config b/conf/base.config index 9aa66a0..6dd4679 100644 --- a/conf/base.config +++ b/conf/base.config @@ -1,6 +1,6 @@ /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - nf-core/dmscore Nextflow base config file + nf-core/deepmutscan Nextflow base config file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A 'blank slate' config file, appropriate for general use on most high performance compute environments. Assumes that all software is installed and available on @@ -9,22 +9,25 @@ */ process { - - // TODO nf-core: Check the defaults for all processes cpus = { 1 * task.attempt } memory = { 6.GB * task.attempt } time = { 4.h * task.attempt } - errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' } + errorStrategy = { task.exitStatus in ((130..145) + 104 + (175..177)) ? 'retry' : 'finish' } maxRetries = 1 maxErrors = '-1' + // Disable 'noclobber' and pre-clean versions.yml so modules can overwrite it + beforeScript = ''' + set +o noclobber + rm -f versions.yml || true + ''' + // Process-specific resource requirements // NOTE - Please try and reuse the labels below as much as possible. // These labels are used and recognised by default in DSL2 files hosted on nf-core/modules. // If possible, it would be nice to keep the same label naming convention when // adding in your local modules too. - // TODO nf-core: Customise requirements for specific processes. 
// See https://www.nextflow.io/docs/latest/config.html#config-process-selectors withLabel:process_single { cpus = { 1 } diff --git a/conf/containers_conda_lock_files_amd64.config b/conf/containers_conda_lock_files_amd64.config new file mode 100644 index 0000000..41e234d --- /dev/null +++ b/conf/containers_conda_lock_files_amd64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt' } } +process { withName: 'MULTIQC' { container = 'modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt' } } diff --git a/conf/containers_conda_lock_files_arm64.config b/conf/containers_conda_lock_files_arm64.config new file mode 100644 index 0000000..5b5b9aa --- /dev/null +++ b/conf/containers_conda_lock_files_arm64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt' } } +process { withName: 'MULTIQC' { container = 'modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt' } } diff --git a/conf/containers_docker_amd64.config b/conf/containers_docker_amd64.config new file mode 100644 index 0000000..a66e3d1 --- /dev/null +++ b/conf/containers_docker_amd64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'community.wave.seqera.io/library/fastqc:0.12.1--5cb1a2fa2f18c7c2' } } +process { withName: 'MULTIQC' { container = 'community.wave.seqera.io/library/multiqc:1.35--c17fb751507e9dfc' } } diff --git a/conf/containers_docker_arm64.config b/conf/containers_docker_arm64.config new file mode 100644 index 0000000..215f686 --- /dev/null +++ b/conf/containers_docker_arm64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'community.wave.seqera.io/library/fastqc:0.12.1--e455e32f745abe68' } } +process { withName: 'MULTIQC' { container = 'community.wave.seqera.io/library/multiqc:1.35--5c84a5000a226ab5' } } diff --git 
a/conf/containers_singularity_https_amd64.config b/conf/containers_singularity_https_amd64.config new file mode 100644 index 0000000..2fb11a3 --- /dev/null +++ b/conf/containers_singularity_https_amd64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/f2/f20b021476d1d87658820f971ebecc1e8cdbde0f338eb0d9cea2b0a8fc54a54b/data' } } +process { withName: 'MULTIQC' { container = 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/c8/c8e346f4f6080eadf1253505e6ff09ef004454fc18e8d672006fd7b222cc412e/data' } } diff --git a/conf/containers_singularity_https_arm64.config b/conf/containers_singularity_https_arm64.config new file mode 100644 index 0000000..5fa5b4f --- /dev/null +++ b/conf/containers_singularity_https_arm64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/46/46daf2dad0169afd2ae047c3e50ed3776259f664bf07e5e06b045dc23449e994/data' } } +process { withName: 'MULTIQC' { container = 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/e4/e48aa28aebc881254a499b24c3e1ce77b8df1b85a5432699ed6f72eb17ac7fb5/data' } } diff --git a/conf/containers_singularity_oras_amd64.config b/conf/containers_singularity_oras_amd64.config new file mode 100644 index 0000000..b334375 --- /dev/null +++ b/conf/containers_singularity_oras_amd64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 'oras://community.wave.seqera.io/library/fastqc:0.12.1--5c4bd442468d75dd' } } +process { withName: 'MULTIQC' { container = 'oras://community.wave.seqera.io/library/multiqc:1.35--c680f2aea25ccec2' } } diff --git a/conf/containers_singularity_oras_arm64.config b/conf/containers_singularity_oras_arm64.config new file mode 100644 index 0000000..661c656 --- /dev/null +++ b/conf/containers_singularity_oras_arm64.config @@ -0,0 +1,2 @@ +process { withName: 'FASTQC' { container = 
'oras://community.wave.seqera.io/library/fastqc:0.12.1--127a87fc06499035' } } +process { withName: 'MULTIQC' { container = 'oras://community.wave.seqera.io/library/multiqc:1.35--c0468833d65b2f81' } } diff --git a/conf/modules.config b/conf/modules.config index d203d2b..a7a29f2 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -20,6 +20,7 @@ process { withName: FASTQC { ext.args = '--quiet' + containerOptions = '' } withName: 'MULTIQC' { @@ -31,4 +32,159 @@ process { ] } + withName: 'BWA_INDEX' { + publishDir = [ + path: "${params.outdir}/intermediate_files/bam_files", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'BWA_MEM' { + publishDir = [ + path: "${params.outdir}/intermediate_files/bam_files/bwa/mem", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'BAMFILTER_DMS' { + publishDir = [ + path: "${params.outdir}/intermediate_files/bam_files/filtered", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'PREMERGE' { + publishDir = [ + path: "${params.outdir}/intermediate_files/bam_files/premerged", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'GATK_SATURATIONMUTAGENESIS' { + publishDir = [ + path: "${params.outdir}/intermediate_files/gatk", + mode: 'copy', + overwrite: false, + // put everything except versions.yml under a folder named by the sample id + saveAs: { filename -> + if (filename == 'versions.yml') return null + "${meta.id}/${filename}" + } + ] + } + + withName: 'DMSANALYSIS_POSSIBLE_MUTATIONS' { + publishDir = [ + path: "${params.outdir}/intermediate_files", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? 
null : filename } + ] + } + + withName: 'DMSANALYSIS_AASEQ' { + publishDir = [ + path: "${params.outdir}/intermediate_files", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'DMSANALYSIS_PROCESS_GATK' { + publishDir = [ + path: "${params.outdir}/intermediate_files/processed_gatk_files", + mode: 'copy', + saveAs: { filename -> + if (filename == 'versions.yml') return null + "${meta.id}/${filename}" + } + ] + } + + withName: /.*VISUALIZATION_.*/ { + publishDir = [ + path: { "${params.outdir}/library_QC" }, // e.g. results/library_QC + mode: 'copy', + overwrite: false, + saveAs: { fn -> + if (fn == 'versions.yml') return null + "${meta.id}/${fn}" // put every output under the sample's subfolder + } + ] + } + + withName: 'GATK_GATKTOFITNESS' { + publishDir = [ + path: "${params.outdir}/fitness/DiMSum_results/single_rep_counts", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'MERGE_COUNTS' { + publishDir = [ + path: "${params.outdir}/fitness", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'EXPDESIGN_FITNESS' { + publishDir = [ + path: "${params.outdir}/fitness", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'FIND_SYNONYMOUS_MUTATION' { + publishDir = [ + path: "${params.outdir}/fitness", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'FITNESS_CALCULATION' { + publishDir = [ + path: "${params.outdir}/fitness/default_results", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'FITNESS_QC' { + publishDir = [ + path: "${params.outdir}/fitness/default_results", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? 
null : filename } + ] + } + + withName: 'FITNESS_HEATMAP' { + publishDir = [ + path: "${params.outdir}/fitness/default_results", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'RUN_DIMSUM' { + publishDir = [ + path: "${params.outdir}/fitness/DiMSum_results", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } + + withName: 'RUN_MUTSCAN' { + publishDir = [ + path: "${params.outdir}/fitness/mutscan_results", + mode: 'copy', + saveAs: { filename -> filename == 'versions.yml' ? null : filename } + ] + } } diff --git a/conf/test.config b/conf/test.config index 8c566af..c89ca06 100644 --- a/conf/test.config +++ b/conf/test.config @@ -5,7 +5,7 @@ Defines input files and everything required to run a fast and simple pipeline test. Use as follows: - nextflow run nf-core/dmscore -profile test, --outdir + nextflow run nf-core/deepmutscan -profile test, --outdir ---------------------------------------------------------------------------------------- */ @@ -13,7 +13,7 @@ process { resourceLimits = [ cpus: 4, - memory: '15.GB', + memory: '8.GB', time: '1.h' ] } @@ -23,8 +23,13 @@ params { config_profile_description = 'Minimal test dataset to check pipeline function' // Input data - // TODO nf-core: Specify the paths to your test data on nf-core/test-datasets - // TODO nf-core: Give any required params for the test so that command line flags are not needed - input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv'// Genome references - genome = 'R64-1-1' + input = params.pipelines_testdata_base_path + 'deepmutscan/samplesheet/GID1A_test.csv' + fasta = params.pipelines_testdata_base_path + 'deepmutscan/testdata/GID1A.fasta' + reading_frame = '352-1383' + min_counts = 2 + mutagenesis_type = 'nnk_nns' + run_seqdepth = true + fitness = true + mutscan = true + dimsum = true } diff --git a/conf/test_full.config 
b/conf/test_full.config index b57a75c..edb899d 100644 --- a/conf/test_full.config +++ b/conf/test_full.config @@ -5,7 +5,7 @@ Defines input files and everything required to run a full size pipeline test. Use as follows: - nextflow run nf-core/dmscore -profile test_full, --outdir + nextflow run nf-core/deepmutscan -profile test_full, --outdir ---------------------------------------------------------------------------------------- */ @@ -17,7 +17,7 @@ params { // Input data for full size test // TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA) // TODO nf-core: Give any required params for the test so that command line flags are not needed - input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv' + input = params.pipelines_testdata_base_path + 'samplesheet_qc_only.csv' // Genome references genome = 'R64-1-1' diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md new file mode 100644 index 0000000..639617e --- /dev/null +++ b/docs/CONTRIBUTING.md @@ -0,0 +1,181 @@ +--- +title: Contributing +markdownPlugin: checklist +--- + +# `nf-core/deepmutscan`: Contributing guidelines + +Hi there! +Thanks for taking an interest in improving nf-core/deepmutscan. + +This page describes the recommended nf-core way to contribute to both nf-core/deepmutscan and nf-core pipelines in general, including: + +- [General contribution guidelines](#general-contribution-guidelines): common procedures or guides across all nf-core pipelines. +- [Pipeline-specific contribution guidelines](#pipeline-specific-contribution-guidelines): procedures or guides specific to the development conventions of nf-core/deepmutscan. + +> [!NOTE] +> If you need help using or modifying nf-core/deepmutscan, ask on the nf-core Slack [#deepmutscan](https://nfcore.slack.com/channels/deepmutscan) channel ([join our Slack here](https://nf-co.re/join/slack)). 
+ +## General contribution guidelines + +### Contribution quick start + +To contribute code to any nf-core pipeline: + +- [ ] Ensure you have Nextflow, nf-core tools, and nf-test installed. See the [nf-core/tools repository](https://github.com/nf-core/tools) for instructions. +- [ ] Check whether a GitHub [issue](https://github.com/nf-core/deepmutscan/issues) about your idea already exists. If an issue does not exist, create one so that others are aware you are working on it. +- [ ] [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/deepmutscan repository](https://github.com/nf-core/deepmutscan) to your GitHub account. +- [ ] Create a branch on your forked repository and make your changes following [pipeline conventions](#pipeline-contribution-conventions) (if applicable). +- [ ] To fix major bugs, name your branch `patch` and follow the [patch release](#patch-release) process. +- [ ] Update relevant documentation within the `docs/` folder, use nf-core/tools to update `nextflow_schema.json`, and update `CITATIONS.md`. +- [ ] Run and/or update tests. See [Testing](#testing) for more information. +- [ ] [Lint](#lint-tests) your code with nf-core/tools. +- [ ] Submit a pull request (PR) against the `dev` branch and request a review. + +If you are not used to this workflow with Git, see the [GitHub documentation](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or [Git resources](https://try.github.io/) for more information. + +## Use of AI and LLMs + +The nf-core stance on the use of AI and LLMs is that humans are still ultimately responsible for their submitted code, regardless of the tools they use. 
+ +If you’re using AI tools, try to stick to these guidelines: + +- Keep PRs as small and focussed as possible +- Avoid any unnecessary changes, such as moving or refactoring code (unless that is the explicit intention of the PR) +- Review all generated code yourself before opening a PR, and ensure that you understand it +- Engage with the community review process and expect to make revisions + +For more detail, see the [blog post](https://nf-co.re/blog/2026/statement-on-ai) for a statement from the nf-core/core team. + +### Getting help + +For further information and help, see the [nf-core/deepmutscan documentation](https://nf-co.re/deepmutscan/usage) or ask on the nf-core [#deepmutscan](https://nfcore.slack.com/channels/deepmutscan) Slack channel ([join our Slack here](https://nf-co.re/join/slack)). + +### GitHub Codespaces + +You can contribute to nf-core/deepmutscan without installing a local development environment on your machine by using [GitHub Codespaces](https://github.com/codespaces). + +[GitHub Codespaces](https://github.com/codespaces) is an online developer environment that runs in your browser, complete with VS Code and a terminal. +Most nf-core repositories include a devcontainer configuration, which creates a GitHub Codespaces environment specifically for Nextflow development. +The environment includes pre-installed nf-core tools, Nextflow, and a few other helpful utilities via a Docker container. + +To get started, open the repository in [Codespaces](https://github.com/nf-core/deepmutscan/codespaces). + +### Testing + +Once you have made your changes, run the pipeline with nf-test to test them locally. +For additional information, use the `--verbose` flag to view the Nextflow console log output. + +```bash +nf-test test --tag test --profile +docker --verbose +``` + +If you have added new functionality, ensure you update the test assertions in the `.nf.test` files in the `tests/` directory. 
+Update the snapshots with the following command: + +```bash +nf-test test --tag test --profile +docker --verbose --update-snapshots +``` + +When you create a pull request with changes, GitHub Actions will run automatic tests. +Pull requests are typically reviewed when these tests are passing. + +Two types of tests are typically run: + +#### Lint tests + +nf-core has a [set of guidelines](https://nf-co.re/docs/specifications/overview) which all pipelines must follow. +To enforce these, run linting with nf-core/tools: + +```bash +nf-core pipelines lint +``` + +If you encounter failures or warnings, follow the linked documentation printed to screen. +For more information about linting tests, see [nf-core/tools API documentation](https://nf-co.re/docs/nf-core-tools/api_reference/latest/pipeline_lint_tests/actions_awsfulltest). + +#### Pipeline tests + +Each nf-core pipeline should be set up with a minimal set of test data. +GitHub Actions runs the pipeline on this data to ensure it runs through and exits successfully. +If there are any failures then the automated tests fail. +These tests are run with the latest available version of Nextflow and the minimum required version specified in the pipeline code. + +### Patch release + +> [!WARNING] +> Only in the unlikely event of a release that contains a critical bug. + +- [ ] Create a new branch `patch` on your fork based on `upstream/main` or `upstream/master`. +- [ ] Fix the bug and use nf-core/tools to bump the version to the next semantic version, for example, `1.2.3` → `1.2.4`. +- [ ] Open a Pull Request from `patch` directly to `main`/`master` with the changes. + +### Pipeline contribution conventions + +nf-core semi-standardises how you write code and other contributions to make the nf-core/deepmutscan code and processing logic more understandable for new contributors and to ensure quality. + +#### Add a new pipeline step + +To contribute a new step to the pipeline, follow the general nf-core coding procedure. 
+Please also refer to the [pipeline-specific contribution guidelines](#pipeline-specific-contribution-guidelines): + +- [ ] Define the corresponding [input channel](#channel-naming-schemes) into your new process from the expected previous process channel. +- [ ] Install a module with nf-core/tools, or write a local module (see [default processes resource requirements](#default-processes-resource-requirements)), and add it to the target `.nf`. +- [ ] Define the output channel if needed. Mix the version output channel into `ch_versions` and relevant files into `ch_multiqc`. +- [ ] Add new or updated parameters to `nextflow.config` with a [default value](#default-parameter-values). +- [ ] Add new or updated parameters and relevant help text to `nextflow_schema.json` with [nf-core/tools](#default-parameter-values). +- [ ] Add validation for relevant parameters to the pipeline utilisation section of `utils_nfcore_\_pipeline/main.nf` subworkflow. +- [ ] Perform local tests to validate that the new code works as expected. + - [ ] If applicable, add a new test in the `tests` directory. +- [ ] Update `usage.md`, `output.md`, and `citation.md` as appropriate. +- [ ] [Lint](lint) the code with nf-core/tools. +- [ ] Update any diagrams or pipeline images as necessary. +- [ ] Update MultiQC config `assets/multiqc_config.yml` so relevant suffixes, file name cleanup, and module plots are in the appropriate order. +- [ ] If applicable, create a [MultiQC](https://seqera.io/multiqc/) module. +- [ ] Add a description of the output files and, if relevant, images from the MultiQC report to `docs/output.md`. + +To update the minimum required Nextflow version, see the [Nextflow version bumping](#nextflow-version-bumping) section below. For more information about pipeline contributions, see [pipeline-specific contribution guidelines](#pipeline-specific-contribution-guidelines). 
+ +#### Channel naming schemes + +Use the following naming schemes for channels to make the channel flow easier to understand: + +- Initial process channel: `ch_output_from_` +- Intermediate and terminal channels: `ch__for_` + +#### Default parameter values + +Parameters should be initialised and defined with default values within the `params` scope in `nextflow.config`. +They should also be documented in the pipeline JSON schema. + +To update `nextflow_schema.json`, run: + +```bash +nf-core pipelines schema build +``` + +The schema builder interface that loads in your browser should automatically update the defaults in the parameter documentation. + +#### Default processes resource requirements + +If you write a local module, specify a default set of resource requirements for the process. + +Sensible defaults for process resource requirements (CPUs, memory, time) should be defined in `conf/base.config`. +Specify these with generic `withLabel:` selectors, so they can be shared across multiple processes and steps of the pipeline. + +nf-core provides a set of standard labels that you should follow where possible, as seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config). +These labels define resource defaults for single-core processes, modules that require a GPU, and different levels of multi-core configurations with increasing memory requirements. + +Values assigned within these labels can be dynamically passed to a tool using the `${task.cpus}` and `${task.memory}` Nextflow variables in the `script:` block of a module (see an example in the [modules repository](https://github.com/nf-core/modules/blob/bd1b6a40f55933d94b8c9ca94ec8c1ea0eaf4b82/modules/nf-core/samtools/bam2fq/main.nf#L30)). + +#### Nextflow version bumping + +If you use a new feature from core Nextflow, bump the minimum required Nextflow version in the pipeline with: + +```bash +nf-core pipelines bump-version --nextflow . 
+``` + +#### Images and figures guidelines + +If you update images or graphics, follow the nf-core [style guidelines](https://nf-co.re/docs/community/brand/workflow-schematics). diff --git a/docs/README.md b/docs/README.md index 917d984..657c1b3 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,6 +1,6 @@ -# nf-core/dmscore: Documentation +# nf-core/deepmutscan: Documentation -The nf-core/dmscore documentation is split into the following pages: +The nf-core/deepmutscan documentation is split into the following pages: - [Usage](usage.md) - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags. diff --git a/docs/images/fastqc.png b/docs/images/fastqc.png new file mode 100644 index 0000000..32c30ff Binary files /dev/null and b/docs/images/fastqc.png differ diff --git a/docs/images/fitness_estimation_count_correlation.png b/docs/images/fitness_estimation_count_correlation.png new file mode 100644 index 0000000..98686bf Binary files /dev/null and b/docs/images/fitness_estimation_count_correlation.png differ diff --git a/docs/images/fitness_estimation_fitness_correlation.png b/docs/images/fitness_estimation_fitness_correlation.png new file mode 100644 index 0000000..81e30d8 Binary files /dev/null and b/docs/images/fitness_estimation_fitness_correlation.png differ diff --git a/docs/images/fitness_heatmap.png b/docs/images/fitness_heatmap.png new file mode 100644 index 0000000..a8e94e7 Binary files /dev/null and b/docs/images/fitness_heatmap.png differ diff --git a/docs/images/library_QC_SeqDepth.png b/docs/images/library_QC_SeqDepth.png new file mode 100644 index 0000000..bdd3ee8 Binary files /dev/null and b/docs/images/library_QC_SeqDepth.png differ diff --git a/docs/images/library_QC_counts_heatmap.png b/docs/images/library_QC_counts_heatmap.png new file mode 100644 index 0000000..850076e Binary files /dev/null and b/docs/images/library_QC_counts_heatmap.png differ diff --git 
a/docs/images/library_QC_counts_per_cov_heatmap.png b/docs/images/library_QC_counts_per_cov_heatmap.png new file mode 100644 index 0000000..03ad28f Binary files /dev/null and b/docs/images/library_QC_counts_per_cov_heatmap.png differ diff --git a/docs/images/library_QC_logdiff_plot.png b/docs/images/library_QC_logdiff_plot.png new file mode 100644 index 0000000..c04d900 Binary files /dev/null and b/docs/images/library_QC_logdiff_plot.png differ diff --git a/docs/images/library_QC_logdiff_varying_bases.png b/docs/images/library_QC_logdiff_varying_bases.png new file mode 100644 index 0000000..e26b9fb Binary files /dev/null and b/docs/images/library_QC_logdiff_varying_bases.png differ diff --git a/docs/images/library_QC_rolling_counts.png b/docs/images/library_QC_rolling_counts.png new file mode 100644 index 0000000..b7c71d5 Binary files /dev/null and b/docs/images/library_QC_rolling_counts.png differ diff --git a/docs/images/library_QC_rolling_coverage.png b/docs/images/library_QC_rolling_coverage.png new file mode 100644 index 0000000..b4ffe6b Binary files /dev/null and b/docs/images/library_QC_rolling_coverage.png differ diff --git a/docs/images/multiqc1.png b/docs/images/multiqc1.png new file mode 100644 index 0000000..a99a7b1 Binary files /dev/null and b/docs/images/multiqc1.png differ diff --git a/docs/images/multiqc2.png b/docs/images/multiqc2.png new file mode 100644 index 0000000..75ee227 Binary files /dev/null and b/docs/images/multiqc2.png differ diff --git a/docs/images/multiqc3.png b/docs/images/multiqc3.png new file mode 100644 index 0000000..d4c8c61 Binary files /dev/null and b/docs/images/multiqc3.png differ diff --git a/docs/images/nf-core-deepmutscan_logo_dark.png b/docs/images/nf-core-deepmutscan_logo_dark.png new file mode 100644 index 0000000..398e0b2 Binary files /dev/null and b/docs/images/nf-core-deepmutscan_logo_dark.png differ diff --git a/docs/images/nf-core-deepmutscan_logo_light.png b/docs/images/nf-core-deepmutscan_logo_light.png new file 
mode 100644 index 0000000..ed93fd9 Binary files /dev/null and b/docs/images/nf-core-deepmutscan_logo_light.png differ diff --git a/docs/images/nf-core-dmscore_logo_dark.png b/docs/images/nf-core-dmscore_logo_dark.png deleted file mode 100644 index 7cbb4e1..0000000 Binary files a/docs/images/nf-core-dmscore_logo_dark.png and /dev/null differ diff --git a/docs/images/nf-core-dmscore_logo_light.png b/docs/images/nf-core-dmscore_logo_light.png deleted file mode 100644 index 569ba7e..0000000 Binary files a/docs/images/nf-core-dmscore_logo_light.png and /dev/null differ diff --git a/docs/images/pipeline.png b/docs/images/pipeline.png new file mode 100644 index 0000000..fa66090 Binary files /dev/null and b/docs/images/pipeline.png differ diff --git a/docs/output.md b/docs/output.md index ff49a08..8b10a3d 100644 --- a/docs/output.md +++ b/docs/output.md @@ -1,56 +1,129 @@ -# nf-core/dmscore: Output +# nf-core/deepmutscan: Output ## Introduction -This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline. +The directories listed below will be created in the results directory after `nf-core/deepmutscan` has finished. All paths are relative to the top-level results directory: -The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory. +```tree title="nf-core/deepmutscan results" +results/ +├── fastqc/ # Individual raw sequencing QC reports for each specified fastq file, in `.html` +├── fitness/ # Merged variant count tables, fitness and error estimates, replicate correlations and heatmaps +├── intermediate_files/ # Raw alignments, raw and pre-filtered variant count tables, QC reports +├── library_QC/ # Sample-specific PDF visualizations: position-wise sequencing coverage, count heatmaps, etc. 
+├── multiqc/ # Shared raw sequencing QC report for all fastq files, in `.html` +├── pipelineinfo/ # Nextflow helper files for timeline and summary report generation +├── timeline.html # Nextflow timeline for all tasks +└── report.html # Nextflow summary report incl. detailed CPU and memory usage for all tasks +``` - +### FastQC -## Pipeline overview +[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/). -The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps: +
+Output files -- [FastQC](#fastqc) - Raw read QC- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline -- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution +- `fastqc/` + - `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images + - `*_fastqc.html`: FastQC report containing quality metrics -### FastQC +![FASTQC report](images/fastqc.png) + +
+ +### MultiQC + +[MultiQC](http://multiqc.info) is a visualization tool that generates a single HTML report summarising all samples in your project. Most of the pipeline QC results are visualised in the report and further statistics are available in the report data directory. Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQC. The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see .### Pipeline information
Output files -- `fastqc/` - - `*_fastqc.html`: FastQC report containing quality metrics. - - `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images. +- `multiqc/` + - `multiqc_report.html`: a standalone `.html` file that can be viewed in your web browser + - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline + - `multiqc_plots/`: directory containing static images from the report in various formats + +![MULTIQC overview](images/multiqc1.png) +![MULTIQC base-quality summary](images/multiqc2.png) +![MULTIQC GC-content summary](images/multiqc3.png)
-[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).### MultiQC +### Intermediate files (Core Pipeline Stages 1-4) + +This directory is created during the first series of steps of the pipeline, featuring raw read alignments, filtering and variant counting.
Output files -- `multiqc/` - - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser. - - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline. - - `multiqc_plots/`: directory containing static images from the report in various formats. +- `intermediate_files/` + - `aa_seq.txt`: a string of the reconstructed wildtype amino acid sequence of the specified open reading frame + - `possible_mutations.csv`: using the `--mutagenesis` argument, this file lists all the programmed mutations per position; these are used for variant count filtering and visualisation + - `bam_files/bwa/`: sets of BWA reference indices from the original alignment(s), with BAM files for each sample in the `mem/` subfolder + - `bam_files/filtered/`: filtered BAM files for each sample, without any wildtype or indel-matching reads + - `bam_files/premerged/`: filtered, read-merged and re-aligned BAM files for each sample, representing highest-quality alignments for subsequent variant counting + - `gatk/`: subfolders with resulting variant count table outputs from `AnalyzeSaturationMutagenesis`, stratified by sample + - `processed_gatk_files/`: subfolders with prefiltered GATK variant count tables, stratified by sample
-[MultiQC](http://multiqc.info) is a visualization tool that generates a single HTML report summarising all samples in your project. Most of the pipeline QC results are visualised in the report and further statistics are available in the report data directory. +### Library QC (Core Pipeline Stage 5) -Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQC. The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see .### Pipeline information +This directory is created during the second series of steps of the pipeline, featuring various QC visualisations for each sample.
Output files +- `library_QC/` + - `counts_heatmap.pdf`: a complete heatmap of absolute mutant counts, stratified by mutant amino acid (Y-axis) per position (X-axis) + ![Count heatmap](images/library_QC_counts_heatmap.png) + + - `counts_per_cov_heatmap.pdf`: as above, but as a fraction of the total sequencing coverage + - `logdiff_plot.pdf`: sorted, log-scale coverage distribution of all mutants + ![Logarithmic count differences](images/library_QC_logdiff_plot.png) + + - `logdiff_varying_bases.pdf`: as above, but stratified by hamming distance to the wildtype nucleotide sequence (colour shading) + - `rolling_coverage.pdf`: sliding-window rolling coverage + ![Rolling coverage](images/library_QC_rolling_coverage.png) + + - `rolling_counts.pdf`: sliding-window rolling coverage, stratified by hamming distance to the wildtype nucleotide sequence (colour shading) + ![Rolling counts](images/library_QC_rolling_counts.png) + + - `rolling_counts_per_cov.pdf`: as above, but as a fraction of the total sequencing coverage + - `SeqDepth.pdf` (optional via the `--run_seqdepth` argument): rarefaction curve of the sequencing coverage and how it relates to the percentage of programmed variants detected + ![Sequencing coverage rarefaction](images/library_QC_SeqDepth.png) + +
+ +### Fitness (Core Pipeline Stages 6-8) + +This directory is created during the final series of steps of the pipeline, featuring fitness and fitness error estimates (when DMS input/output sample groups are specified). + +
+Output files + +- `fitness/` + - `counts_merged.tsv`: summarised gene variant counts across all input and output samples + - `default_results/fitness_estimation_count_correlation.pdf`: pair-wise replicate variant count scatterplots and correlations between all specified samples + ![Variant count correlation(s)](images/fitness_estimation_count_correlation.png) + + - `default_results/fitness_estimation_fitness_correlation.pdf`: pair-wise fitness replicate scatterplots and correlations between all specified output samples + ![Fitness correlation(s)](images/fitness_estimation_fitness_correlation.png) + + - `default_results/fitness_heatmap.pdf`: a complete heatmap of absolute mutant counts, stratified by mutant amino acid (Y-axis) per position (X-axis) + ![Default fitness heatmap](images/fitness_heatmap.png) + + - `default_results/fitness_estimation.tsv`: table file with all fitness and fitness error estimates calculated + - `DiMSum_results/dimsum_results/` (optional): subfolder with the full set of [DiMSum](https://github.com/lehner-lab/DiMSum) outputs, including the associated `.HTML` report, `.Rdata` and `.tsv` files with fitness and fitness error estimates + +
+ +### Pipeline Info (Nextflow Reports) + - `pipeline_info/` - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`. - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline. - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`. - Parameters used by the pipeline run: `params.json`. - - [Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage. diff --git a/docs/pipeline_steps.md b/docs/pipeline_steps.md new file mode 100644 index 0000000..9573527 --- /dev/null +++ b/docs/pipeline_steps.md @@ -0,0 +1,111 @@ +# nf-core/deepmutscan: Detailed Pipeline Steps + +This page provides in-depth descriptions of the data processing modules implemented in **nf-core/deepmutscan**. It is mainly intended for advanced users and developers who want to understand the rationale behind design choices, explore implementation details, and consider potential future extensions. + +--- + +## Overview + +The pipeline processes deep mutational scanning (DMS) sequencing data in several stages: + +1. Alignment of reads to the reference open reading frame (ORF) +2. Filtering of wildtype and erroneous reads +3. Read merging for base error reduction +4. Mutation counting +5. DMS library quality control +6. Data summarisation across samples +7. Single nucleotide variant error correction _(in development)_ +8. 
Fitness estimation _(in development)_ + +![pipeline](/docs/pipeline.png) + +Each step is explained below. Links are provided to the primary tools and libraries used, where applicable. + +--- + +## 1. Alignment + +All paired-end raw reads are first aligned to the provided reference ORF using [**bwa-mem**](http://bio-bwa.sourceforge.net/). This is a highly efficient mapping algorithm for reads ≥100 bp, with its multi-threading support automatically handled by nf-core. + +In future versions of nf-core/deepmutscan, we consider the use of [**bwa-mem2**](https://github.com/bwa-mem2/bwa-mem2), which provides similar alignment rates with a moderate speed increase ([Vasimuddin et al., _IPDPS_ 2019](https://ieeexplore.ieee.org/document/8820962)). With the increasing diversity of sequencing platforms for DMS, new throughput, read length, and error profiles may require further alignment options to be implemented. + +--- + +## 2. Filtering + +For long ORF site-saturation mutagenesis libraries, most aligned shotgun sequencing reads contain exact matches against the reference. It is not possible to infer which of these stem from mutant versus wildtype DNA molecules prior to fragmentation, hence they are filtered out. Similarly, erroneous reads with unexpected indels are also removed. + +To this end, we use [**samtools view**](https://www.htslib.org/doc/samtools.html). + +--- + +## 3. Read Merging + +Even the highest-accuracy next-generation sequencing platforms do not have perfect base accuracy. To minimise the effect of base errors (which would otherwise be counted as "false mutations"), nf-core/deepmutscan uses the overlap of each aligned read pair. With base errors on the forward and reverse read being independent, the pipeline applies the [**vsearch fastq_mergepairs**](https://github.com/torognes/vsearch) function to convert each read pair into a single consensus molecule with adjusted base error scores. 
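To illustrate why an overlapping read pair reduces base errors, here is a minimal sketch of a Bayesian consensus error calculation for a single overlapping position. It assumes that errors on the two reads are independent and that a miscalled base is equally likely to be any of the three alternative bases. The function names are hypothetical, and the exact scoring that vsearch applies may differ.

```python
import math


def phred_to_prob(q: float) -> float:
    """Convert a Phred quality score into an error probability."""
    return 10 ** (-q / 10)


def prob_to_phred(p: float) -> float:
    """Convert an error probability back into a Phred quality score."""
    return -10 * math.log10(p)


def consensus_error(p_fwd: float, p_rev: float, bases_agree: bool) -> float:
    """Posterior error probability of the consensus base at one position.

    Assumption (not taken from the pipeline): independent errors, with a
    miscall being uniformly distributed over the three other bases.
    """
    if bases_agree:
        # Both reads are wrong *and* show the same wrong base.
        both_wrong_same = p_fwd * p_rev / 3
        return both_wrong_same / (1 - p_fwd - p_rev + 4 * p_fwd * p_rev / 3)
    # On disagreement, keep the base with the lower error probability.
    p_best, p_other = sorted((p_fwd, p_rev))
    return p_best * (1 - p_other / 3) / (p_best + p_other - 4 * p_best * p_other / 3)


if __name__ == "__main__":
    agree = consensus_error(phred_to_prob(30), phred_to_prob(30), True)
    clash = consensus_error(phred_to_prob(30), phred_to_prob(10), False)
    print(f"agreeing Q30/Q30 -> Q{prob_to_phred(agree):.1f}")
    print(f"conflicting Q30/Q10 -> Q{prob_to_phred(clash):.1f}")
```

Under these assumptions, two agreeing calls yield a consensus quality well above either input, whereas a conflicting pair yields a consensus quality below that of the better read.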
+ +> [!TIP] +> Optimal merging performance is usually obtained if the average DNA fragment size matches the read size. For example, libraries sequenced with 150 bp paired-end reads should ideally also be sheared/tagmented to a mean size of 150 bp. + +Future versions may offer additional options depending on sequencing type and error profiles. + +--- + +## 4. Variant Counting + +Aligned, non-wildtype consensus reads are screened for exact, base-level mismatches. nf-core/deepmutscan currently uses the popular [**GATK AnalyzeSaturationMutagenesis**](https://gatk.broadinstitute.org/hc/en-us/articles/360037594771-AnalyzeSaturationMutagenesis-BETA) function to count occurrences of all single, double, triple, and higher-order nucleotide changes between each read and the reference ORF. + +We are currently working on a much lighter, alternative Python implementation of mutation counting for nf-core/deepmutscan. In this script, users will be able to specify a minimum base quality cutoff for mutations to be included in the final count table (default: Q30) – an option not available in GATK. + +--- + +## 5. DMS Library Quality Control + +By integrating the reference ORF coordinates and the chosen DMS library type (default: NNK/NNs degenerate codon-based nicking), nf-core/deepmutscan calculates a number of mutation count summary statistics. + +Custom visualisations allow for inspection of (1) mutation efficiency along the ORF, (2) position-specific recovery of amino acid diversity, and (3) overall sequencing coverage evenness and saturation. + +--- + +## 6. Data Summarisation for Fitness Estimation + +Steps 1-5 are iteratively run across all samples defined in the `.csv` spreadsheet. Once read alignment, merging, mutation counting, and library QC have been completed for the full list of samples, users can opt to proceed with fitness estimation. To this end, the pipeline generates all the necessary input files by merging mutation counts across samples. 
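The merging of mutation counts across samples can be pictured with the following minimal sketch. It is an illustration only: the function name, the variant labels and the table layout are hypothetical and do not describe the actual format of the pipeline's merged count file.

```python
from collections import defaultdict


def merge_counts(per_sample):
    """Merge per-sample variant counts into one table.

    `per_sample` maps a sample name to a {variant: count} dictionary.
    Variants missing from a sample are filled with a count of 0.
    Returns a header row and a list of data rows (hypothetical layout).
    """
    samples = sorted(per_sample)
    merged = defaultdict(dict)
    for sample, counts in per_sample.items():
        for variant, n in counts.items():
            merged[variant][sample] = n
    header = ["variant"] + samples
    rows = [[v] + [merged[v].get(s, 0) for s in samples] for v in sorted(merged)]
    return header, rows


if __name__ == "__main__":
    # Hypothetical counts for one input and one output sample.
    example = {
        "input_1": {"A2C": 10, "A2G": 4},
        "output_1": {"A2C": 25},
    }
    header, rows = merge_counts(example)
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(x) for x in row))
```

Zero-filling keeps every variant in the table even if it dropped out of a sample, which matters later when output and input counts are compared.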
+ +--- + +## 7. Single Nucleotide Variant Error Correction _(in development)_ + +This module will implement strategies to distinguish true single nucleotide variants from sequencing artefacts. There are two options to perform this: + +- Empirical error rate modelling based on wildtype sequencing +- Empirical error rate modelling based on false double mutants _(in development)_ + +--- + +## 8. Fitness Estimation _(in development)_ + +The final step of the pipeline will perform fitness estimation based on mutation counts. By default, we calculate fitness scores as the logarithm of variants' output to input ratio, normalised to that of the provided wildtype sequence. Future expansions may include: + +- Integration of other popular fitness inference tools, including [DiMSum](https://github.com/lehner-lab/DiMSum), [Enrich2](https://github.com/FowlerLab/Enrich2), [rosace](https://github.com/pimentellab/rosace/) and [mutscan](https://github.com/fmicompbio/mutscan) +- Standardised output formats for downstream analyses and comparison + +> [!IMPORTANT] +> We note that exact wildtype sequence reads are filtered out in stage 2. Including synonymous wildtype codons in the original mutagenesis design is therefore essential when it comes to calibrating the fitness calculations. + +--- + +## Notes for Developers + +- Custom scripts used in filtering and mutation counting are available in the `bin/` directory of the repository. +- Modules are implemented in Nextflow DSL2 and follow the nf-core community guidelines. +- Contributions, optimisations, and additional analysis modules are welcome - please open a pull request or GitHub issue to discuss ideas. + +_This document is meant as a living reference. 
As the pipeline evolves, the descriptions of steps 7 and 8 will be expanded with concrete implementation details._ + +--- + +## Contact + +For detailed scientific or technical questions, feedback and experimental discussions, feel free to contact us directly: + +- Benjamin Wehnert — wehnertbenjamin@gmail.com +- Maximilian Stammnitz — maximilian.stammnitz@crg.eu diff --git a/docs/usage.md b/docs/usage.md index e7ae0cb..4c79b30 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,74 +1,38 @@ -# nf-core/dmscore: Usage +# nf-core/deepmutscan: Usage -## :warning: Please read this documentation on the nf-core website: [https://nf-co.re/dmscore/usage](https://nf-co.re/dmscore/usage) +## :warning: Please read this documentation on the nf-core website: [https://nf-co.re/deepmutscan/usage](https://nf-co.re/deepmutscan/usage) > _Documentation of pipeline parameters is generated automatically from the pipeline schema and can no longer be found in markdown files._ ## Introduction - +**nf-core/deepmutscan** is a workflow designed for the analysis of deep mutational scanning (DMS) data. DMS enables researchers to experimentally measure the fitness effects of thousands of genes or gene variants simultaneously, helping to classify disease-causing mutants in human and animal populations, to learn the fundamental rules of virus evolution, protein architecture, splicing, small-molecule interactions and many other phenotypes. -## Samplesheet input - -You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below. - -```bash ---input '[path to samplesheet file]' -``` - -### Multiple runs of the same sample - -The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. 
The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes: - -```csv title="samplesheet.csv" -sample,fastq_1,fastq_2 -CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz -CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz -CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz -``` - -### Full samplesheet - -The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below. - -A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice. - -```csv title="samplesheet.csv" -sample,fastq_1,fastq_2 -CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz -CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz -CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz -TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz, -TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz, -TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz, -TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz, -``` - -| Column | Description | -| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | -| `fastq_1` | Full path to FastQ file for Illumina short reads 1. 
File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | -| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | - -An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline. +This page provides in-depth descriptions of the data processing modules implemented in `nf-core/deepmutscan`. It is similarly intended for new deep mutational scanning aficionados, for advanced users and developers who want to understand the rationale behind certain design choices, to explore implementation details and consider potential future extensions. ## Running the pipeline -The typical command for running the pipeline is as follows: +The typical command for running the pipeline (on an example protein-coding gene with 100 amino acids) is as follows: -```bash -nextflow run nf-core/dmscore --input ./samplesheet.csv --outdir ./results --genome GRCh37 -profile docker +```bash title="example_run.sh" +nextflow run nf-core/deepmutscan \ + -profile \ + --input ./samplesheet.csv \ + --fasta ./ref.fa \ + --reading_frame 1-300 \ + --outdir ./results ``` -This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles. +The `-profile ` specification is mandatory and should reflect either your own institutional profile or any pipeline profile specified in the [profile section](##-profile). + +This will launch the pipeline by performing sequencing read alignments, various raw data QC analyses, optional mutant count error corrections, fitness and fitness error estimations. Note that the pipeline will create the following files in your working directory: -```bash -work # Directory containing the nextflow working files - # Finished results in specified location (defined with --outdir) -.nextflow_log # Log file from Nextflow -# Other nextflow hidden files, eg. history of pipeline runs and old logs. 
+```console title="working directory" +work # Directory containing the nextflow working files +results # Finished results in specified location (defined with --outdir); needs full write access +.nextflow_log # Log file from Nextflow ``` If you wish to repeatedly use the same parameters for multiple runs, rather than specifying each flag in the command, you can specify these in a params file. @@ -76,12 +40,12 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than Pipeline settings can be provided in a `yaml` or `json` file via `-params-file `. > [!WARNING] -> Do not use `-c ` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args). +> Do not use `-c ` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/running/run-pipelines#configuring-pipelines), other infrastructural tweaks (such as output directories), or module arguments (args). The above pipeline run specified with a params file in yaml format: -```bash -nextflow run nf-core/dmscore -profile docker -params-file params.yaml +```bash title="example_run.sh" +nextflow run nf-core/deepmutscan -profile docker -params-file params.yaml ``` with: @@ -89,32 +53,141 @@ with: ```yaml title="params.yaml" input: './samplesheet.csv' outdir: './results/' -genome: 'GRCh37' +fasta: './ref.fa' <...> ``` You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch). -### Updating the pipeline +## Inputs + +Users first need to prepare a samplesheet with their input/output data, in which each row represents a pair of matched fastq files (paired end). 
This should look as follows: + +```csv title="samplesheet.csv" +sample,type,replicate,file1,file2 +ORF1,input,1,/reads/forward1.fastq.gz,/reads/reverse1.fastq.gz +ORF1,input,2,/reads/forward2.fastq.gz,/reads/reverse2.fastq.gz +ORF1,output,1,/reads/forward3.fastq.gz,/reads/reverse3.fastq.gz +ORF1,output,2,/reads/forward4.fastq.gz,/reads/reverse4.fastq.gz +``` + +Secondly, users need to specify the gene or gene region of interest using a reference FASTA file via `--fasta`. Provide the exact codon coordinates using `--reading_frame`. + +## Optional parameters + +Several optional parameters are available for `nf-core/deepmutscan`, some of which are currently _(in development)_. + +| Parameter | Default | Description | +| -------------------- | --------------- | ------------------------------------------------------------- | +| `--run_seqdepth` | `false` | Estimate sequencing saturation by rarefaction | +| `--fitness` | `false` | Default fitness inference module | +| `--dimsum` | `false` | Optional fitness inference module _(AMD/x86_64 systems only)_ | +| `--mutagenesis` | `nnk` | Deep mutational scanning strategy used | +| `--error-estimation` | `wt_sequencing` | Error model used to correct 1nt counts _(in development)_ | +| `--read-align` | `bwa-mem` | Customised read aligner _(in development)_ | + +## Pipeline output + +After execution, the pipeline creates the following directory structure: + +```tree title="nf-core/deepmutscan results" +results/ +├── fastqc/ # Individual HTML reports for specified fastq files, raw sequencing QC +├── fitness/ # Merged variant count tables, fitness and error estimates, replicate correlations and heatmaps +├── intermediate_files/ # Raw alignments, raw and pre-filtered variant count tables, QC reports +├── library_QC/ # Sample-specific PDF visualizations: position-wise sequencing coverage, count heatmaps, etc. 
+├── multiqc/ # Shared HTML reports for all fastq files, raw sequencing QC +├── pipelineinfo/ # Nextflow helper files for timeline and summary report generation +├── timeline.html # Nextflow timeline for all tasks +└── report.html # Nextflow summary report incl. detailed CPU and memory usage for all tasks +``` + +## Detailed steps + +### 1. Alignment + +All paired-end raw reads are first aligned to the provided reference ORF using [**bwa-mem**](http://bio-bwa.sourceforge.net/). This is a highly efficient mapping algorithm for reads ≥100 bp, with its multi-threading support automatically handled by nf-core. + +In future versions of `nf-core/deepmutscan`, we are considering the use of [**bwa-mem2**](https://github.com/bwa-mem2/bwa-mem2), which provides similar alignment quality with a moderate speed increase ([Vasimuddin et al., _IPDPS_ 2019](https://ieeexplore.ieee.org/document/8820962)). With the increasing diversity of sequencing platforms for DMS, new read length, throughput and error profiles may require further alignment options to be implemented. + +### 2. Filtering + +For long ORF site-saturation mutagenesis libraries, most aligned shotgun sequencing reads contain exact matches against the reference. It is not possible to infer which of these stem from actual wildtype vs (upstream or downstream) mutant DNA molecules prior to fragmentation, hence they are filtered out. Currently, reads with likely artefactual indel-containing alignments are also removed. + +To this end, we use [**samtools view**](https://www.htslib.org/doc/samtools.html). + +### 3. Read Merging + +Even the highest-quality next-generation sequencing platforms do not feature perfect base accuracy. To minimise the effect of base errors (which would otherwise be counted as "false mutations"), `nf-core/deepmutscan` uses the overlap of each aligned read pair. 
Because base errors on the forward and reverse reads are independent, the pipeline applies the [**vsearch fastq_mergepairs**](https://github.com/torognes/vsearch) function to convert each read pair into a single consensus molecule with adjusted base error scores. + +> [!TIP] +> Merging is usually most beneficial when the average DNA fragment size matches the read length. For example, libraries sequenced with 150 bp paired-end reads should ideally also be sheared/tagmented to a mean size of 150 bp. + +Future versions may offer additional options depending on sequencing type and error profiles. + +### 4. Variant Counting + +Aligned, non-wildtype consensus reads are screened for exact, base-level mismatches. `nf-core/deepmutscan` currently uses the popular [**GATK AnalyzeSaturationMutagenesis**](https://gatk.broadinstitute.org/hc/en-us/articles/360037594771-AnalyzeSaturationMutagenesis-BETA) function to count occurrences of all single, double, triple, and higher-order nucleotide changes between each read and the reference ORF. + +We are currently working on an alternative, lightweight Python implementation for mutation counting. This will also allow users to specify a minimum base quality cutoff for mutations to be included in the final count table – an option which is unfortunately not available in GATK. + +### 5. DMS Library Quality Control + +By integrating the reference ORF coordinates and the chosen DMS library type (default: NNK degenerate codons), `nf-core/deepmutscan` calculates a number of mutation count summary statistics. + +Custom visualisations allow for inspection of (1) mutation efficiency along the ORF, (2) position-specific recovery of mutant amino acid diversity, and (3) overall sequencing coverage evenness and library saturation. + +### 6. Data Summarisation for Fitness Estimation + +Steps 1-5 are run in parallel across all individual samples defined in the `.csv` spreadsheet.
Once read alignment, filtering, merging, variant counting, and DMS library QC have been completed for the full list of samples, users can opt to proceed towards fitness estimation, provided that input/output sample pairs are available. To this end, the pipeline prepares all the necessary files by generating a merged mutation count table across samples. + +### 7. Single Nucleotide Variant Error Correction _(in development)_ + +This module will feature strategies to distinguish true single nucleotide variants from sequencing artefacts. Two approaches are planned: + +- Empirical error rate modelling based on additional wildtype sequencing +- Empirical error rate modelling based on false double mutants in the programmed single mutant library + +### 8. Fitness Estimation + +The final step of the pipeline will perform fitness estimation based on mutation counts. By default, we calculate fitness scores as the logarithm of each variant's output-to-input ratio, normalised to that of the provided wildtype nucleotide sequence. + +Future expansions may include: + +- Integration of other popular fitness inference tools, including [DiMSum](https://github.com/lehner-lab/DiMSum), [Enrich2](https://github.com/FowlerLab/Enrich2), [rosace](https://github.com/pimentellab/rosace/) and [mutscan](https://github.com/fmicompbio/mutscan) +- Standardised output formats for downstream analyses and comparison + +> [!IMPORTANT] +> We note that exact wildtype sequence reads are filtered out in step 2. Including synonymous wildtype codons in the original mutagenesis design is therefore essential for calibrating the fitness calculations. + +## Notes for Developers + +- Custom R scripts used in filtering and QC visualisation are available in the `modules/local/dmsanalysis/bin/` directory of the repository. +- Modules are implemented in Nextflow DSL2 and follow the nf-core community guidelines.
+- Contributions, optimisations, and additional analysis modules are welcome: please open a GitHub [issue](https://github.com/nf-core/deepmutscan/issues/new) or [pull request](https://github.com/nf-core/deepmutscan/compare) to discuss or suggest ideas. + +_This document is meant as a living reference. As the pipeline evolves, the descriptions of steps 7 and 8 will be further expanded with concrete implementation details._ + +## Updating the pipeline -When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline: +When you run the original command above, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use this cached version if available - even if the pipeline has been updated since. To make sure that you are running the latest version of the pipeline, regularly update the cached version: ```bash -nextflow pull nf-core/dmscore +nextflow pull nf-core/deepmutscan ``` -### Reproducibility +## Reproducibility It is a good idea to specify the pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since. -First, go to the [nf-core/dmscore releases page](https://github.com/nf-core/dmscore/releases) and find the latest pipeline version - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`.
Of course, you can switch to another version by changing the number after the `-r` flag. +First, go to the [nf-core/deepmutscan releases page](https://github.com/nf-core/deepmutscan/releases) and find the latest pipeline version - numeric only (eg. `1.0.0`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.0.0`. This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports. To further assist in reproducibility, you can use share and reuse [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter. > [!TIP] -> If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles. +> If you wish to share such a profile (e.g. providing it as supplementary material for academic publications), make sure to _not_ include your cluster-specific file paths or institution-specific profiles. ## Core Nextflow arguments @@ -135,7 +208,7 @@ The pipeline also dynamically loads configurations from [https://github.com/nf-c Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important! They are loaded in sequence, so later profiles can overwrite earlier profiles. -If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines dependent on the computer environment. +If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`.
This is _not_ recommended, since it may lead to varying or even irreproducible results across users' different computer environments. - `test` - A profile with a complete configuration for automated testing @@ -159,7 +232,7 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof ### `-resume` -Specify this when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it got to previously. For input to be considered the same, not only the names must be identical but the files' contents as well. For more info about this parameter, see [this blog post](https://www.nextflow.io/blog/2019/demystifying-nextflow-resume.html). +Specify this when restarting a pipeline. Nextflow will use cached results (from within the `work/` directory) from any pipeline steps where the inputs are the same, continuing from where it got to previously. For input to be considered the same, not only the names must be identical but the files' contents as well. For more info about this parameter, see [this blog post](https://www.nextflow.io/blog/2019/demystifying-nextflow-resume.html). You can also supply a run name to resume a specific run: `-resume [run-name]`. Use the `nextflow log` command to show previous run names. @@ -173,19 +246,19 @@ Specify the path to a specific config file (this is a core Nextflow command). Se Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time.
For most of the pipeline steps, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with higher resources request (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped. -To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website. +To change the resource requests, please see the [max resources](https://nf-co.re/docs/running/configuration/nextflow-for-your-system#set-max-resources) and [customise process resources](https://nf-co.re/docs/running/configuration/nextflow-for-your-system#customize-process-resources) section of the nf-core website. ### Custom Containers In some cases, you may wish to change the container or conda environment used by a pipeline steps for a particular tool. By default, nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However, in some cases the pipeline specified version maybe out of date. -To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website. +To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/running/configuration/nextflow-for-your-system#update-tool-versions) section of the nf-core website. ### Custom Tool Arguments A pipeline might not always support every possible argument or option of a particular tool used in pipeline. 
Fortunately, nf-core pipelines provide some freedom to users to insert additional parameters that the pipeline does not include by default. -To learn how to provide additional arguments to a particular tool of the pipeline, please see the [customising tool arguments](https://nf-co.re/docs/usage/configuration#customising-tool-arguments) section of the nf-core website. +To learn how to provide additional arguments to a particular tool of the pipeline, please see the [customising tool arguments](https://nf-co.re/docs/running/configuration/nextflow-for-your-system#modifying-tool-arguments) section of the nf-core website. ### nf-core/configs diff --git a/log.txt b/log.txt new file mode 100644 index 0000000..dceb2c1 --- /dev/null +++ b/log.txt @@ -0,0 +1,613 @@ +May-07 23:01:49.322 [main] DEBUG nextflow.cli.Launcher - $> nextflow run . -profile test,docker --fasta /Users/benjaminwehnert/GID1A_SUNi_ref_small.fasta --reading_frame 352-1383 --min_counts 2 --mutagenesis_type max_diff_to_wt --outdir ./results +May-07 23:01:49.500 [main] DEBUG nextflow.cli.CmdRun - N E X T F L O W ~ version 24.04.4 +May-07 23:01:49.524 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/Users/benjaminwehnert/.nextflow/plugins; core-plugins: nf-amazon@2.5.3,nf-azure@1.6.1,nf-cloudcache@0.4.1,nf-codecommit@0.2.1,nf-console@1.1.3,nf-ga4gh@1.3.0,nf-google@1.13.2-patch1,nf-tower@1.9.1,nf-wave@1.4.2-patch1 +May-07 23:01:49.533 [main] INFO o.pf4j.DefaultPluginStatusProvider - Enabled plugins: [] +May-07 23:01:49.534 [main] INFO o.pf4j.DefaultPluginStatusProvider - Disabled plugins: [] +May-07 23:01:49.536 [main] INFO org.pf4j.DefaultPluginManager - PF4J version 3.12.0 in 'deployment' mode +May-07 23:01:49.549 [main] INFO org.pf4j.AbstractPluginManager - No plugins +May-07 23:01:50.385 [main] WARN nextflow.config.Manifest - Invalid config manifest attribute `contributors` +May-07 23:01:50.406 [main] DEBUG nextflow.config.ConfigBuilder - 
Found config local: /Users/benjaminwehnert/dmscore/nextflow.config +May-07 23:01:50.408 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /Users/benjaminwehnert/dmscore/nextflow.config +May-07 23:01:50.426 [main] DEBUG n.secret.LocalSecretsProvider - Secrets store: /Users/benjaminwehnert/.nextflow/secrets/store.json +May-07 23:01:50.428 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@169268a7] - activable => nextflow.secret.LocalSecretsProvider@169268a7 +May-07 23:01:50.438 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `test,docker` +May-07 23:01:51.909 [main] DEBUG nextflow.config.ConfigBuilder - Available config profiles: [bih, cfc_dev, uzl_omics, ifb_core, embl_hd, denbi_qbic, alice, mjolnir_globe, uppmax, giga, incliva, ilifu, ki_luria, uge, icr_alma, rosalind_uge, lugh, mccleary, unibe_ibu, vai, czbiohub_aws, jax, roslin, ccga_med, tes, scw, unc_longleaf, tigem, tubingen_apg, google, apollo, ipop_up, vsc_calcua, pdc_kth, googlels, ceci_nic5, humantechnopole, stjude, daisybio, eddie, medair, biowulf, apptainer, bi, bigpurple, adcra, cedars, pawsey_setonix, vsc_kul_uhasselt, pawsey_nimbus, ucl_myriad, utd_ganymede, charliecloud, seattlechildrens, icr_davros, ceres, arm, munin, rosalind, hasta, cfc, uzh, shu_bmrc, ebi_codon_slurm, ebc, ccga_dx, crick, ku_sund_danhead, marvin, shifter, biohpc_gen, mana, mamba, york_viking, unc_lccc, wehi, awsbatch, wustl_htcf, arcc, ceci_dragon2, imperial, maestro, software_license, cannon, genotoul, nci_gadi, abims, janelia, nu_genomics, googlebatch, oist, sahmri, kaust, alliance_canada, mpcdf, leicester, vsc_ugent, create, sage, cambridge, jex, podman, ebi_codon, cheaha, xanadu, nyu_hpc, test, marjorie, computerome, ucd_sonic, seg_globe, mssm, sanger, dkfz, bluebear, pasteur, einstein, ethz_euler, m3c, test_full, imb, ucl_cscluster, tuos_stanage, azurebatch, hki, seadragon, crukmi, csiro_petrichor, qmul_apocrita, wave, 
docker, engaging, gis, hypatia, psmn, eva, unity, cropdiversityhpc, nygc, fgcz, conda, crg, singularity, mpcdf_viper, pe2, self_hosted_runner, tufts, uw_hyak_pedslabs, binac2, debug, genouest, cbe, unsw_katana, gitpod, phoenix, seawulf, uod_hpc, fub_curta, uct_hpc, aws_tower, binac, fsu_draco] +May-07 23:01:51.958 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=2 by global default +May-07 23:01:51.972 [main] DEBUG nextflow.cli.CmdRun - Launching `./main.nf` [modest_coulomb] DSL2 - revision: 84101fc51c +May-07 23:01:51.974 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins declared=[nf-schema@2.3.0] +May-07 23:01:51.974 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[] +May-07 23:01:51.974 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[nf-schema@2.3.0] +May-07 23:01:51.975 [main] DEBUG nextflow.plugin.PluginUpdater - Installing plugin nf-schema version: 2.3.0 +May-07 23:01:51.983 [main] INFO org.pf4j.AbstractPluginManager - Plugin 'nf-schema@2.3.0' resolved +May-07 23:01:51.983 [main] INFO org.pf4j.AbstractPluginManager - Start plugin 'nf-schema@2.3.0' +May-07 23:01:51.990 [main] DEBUG nextflow.plugin.BasePlugin - Plugin started nf-schema@2.3.0 +May-07 23:01:52.045 [main] DEBUG nextflow.Session - Session UUID: 2397b75e-3882-46d7-ba2c-8549f8a2b4a6 +May-07 23:01:52.045 [main] DEBUG nextflow.Session - Run name: modest_coulomb +May-07 23:01:52.046 [main] DEBUG nextflow.Session - Executor pool size: 8 +May-07 23:01:52.053 [main] DEBUG nextflow.file.FilePorter - File porter settings maxRetries=3; maxTransfers=50; pollTimeout=null +May-07 23:01:52.057 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=24; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false +May-07 23:01:52.077 [main] DEBUG nextflow.cli.CmdRun - + Version: 24.04.4 build 5917 + Created: 01-08-2024 07:05 UTC (09:05 CEST) + System: Mac OS X 15.0 + Runtime: Groovy 4.0.21 on OpenJDK 64-Bit Server VM 
17.0.13+0 + Encoding: UTF-8 (UTF-8) + Process: 31358@MacBook-Air-von-Benjamin.local [127.0.0.1] + CPUs: 8 - Mem: 8 GB (94.6 MB) - Swap: 8 GB (737.5 MB) +May-07 23:01:52.088 [main] DEBUG nextflow.Session - Work-dir: /Users/benjaminwehnert/dmscore/work [Mac OS X] +May-07 23:01:52.088 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /Users/benjaminwehnert/dmscore/bin +May-07 23:01:52.104 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[] +May-07 23:01:52.123 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory +May-07 23:01:52.143 [main] DEBUG nextflow.Session - Observer factory: ValidationObserverFactory +May-07 23:01:52.174 [main] WARN nextflow.config.Manifest - Invalid config manifest attribute `contributors` +May-07 23:01:52.195 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory +May-07 23:01:52.205 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 9; maxThreads: 1000 +May-07 23:01:52.264 [main] DEBUG nextflow.Session - Session start +May-07 23:01:52.266 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow started -- trace file: /Users/benjaminwehnert/dmscore/results/pipeline_info/execution_trace_2025-05-07_23-01-50.txt +May-07 23:01:52.411 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution +May-07 23:01:53.516 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [paramsSummaryMap:paramsSummaryMap]; plugin Id: nf-schema +May-07 23:01:53.917 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [paramsSummaryLog:paramsSummaryLog]; plugin Id: nf-schema +May-07 23:01:53.918 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [validateParameters:validateParameters]; plugin Id: nf-schema +May-07 23:01:53.920 [main] DEBUG nextflow.script.IncludeDef - Loading 
included plugin extensions with names: [paramsSummaryMap:paramsSummaryMap]; plugin Id: nf-schema +May-07 23:01:53.921 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [samplesheetToList:samplesheetToList]; plugin Id: nf-schema +May-07 23:01:54.131 [main] WARN nextflow.script.ScriptBinding - Access to undefined parameter `custom_codon_library` -- Initialise it to a default value eg. `params.custom_codon_library = some_value` +May-07 23:01:54.356 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [paramsSummaryLog:paramsSummaryLog]; plugin Id: nf-schema +May-07 23:01:54.357 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [validateParameters:validateParameters]; plugin Id: nf-schema +May-07 23:01:54.359 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [paramsSummaryMap:paramsSummaryMap]; plugin Id: nf-schema +May-07 23:01:54.360 [main] DEBUG nextflow.script.IncludeDef - Loading included plugin extensions with names: [samplesheetToList:samplesheetToList]; plugin Id: nf-schema +May-07 23:01:54.574 [main] INFO nextflow.Nextflow - +------------------------------------------------------ + ,--./,-. 
+ ___ __ __ __ ___ /,-._.--~' + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + nf-core/dmscore 1.0.0dev +------------------------------------------------------ +Input/output options + input : https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/samplesheet_qc_only.csv + outdir : ./results + min_counts : 2 + mutagenesis_type : max_diff_to_wt + +Reference genome options + genome : R64-1-1 + fasta : /Users/benjaminwehnert/GID1A_SUNi_ref_small.fasta + +Institutional config options + config_profile_name : Test profile + config_profile_description: Minimal test dataset to check pipeline function + +Generic options + trace_report_suffix : 2025-05-07_23-01-50 + +Core Nextflow options + runName : modest_coulomb + containerEngine : docker + launchDir : /Users/benjaminwehnert/dmscore + workDir : /Users/benjaminwehnert/dmscore/work + projectDir : /Users/benjaminwehnert/dmscore + userName : benjaminwehnert + profile : test,docker + configFiles : /Users/benjaminwehnert/dmscore/nextflow.config + +!! Only displaying parameters that differ from the pipeline defaults !! +------------------------------------------------------ +* The nf-core framework + https://doi.org/10.1038/s41587-020-0439-x + +* Software dependencies + https://github.com/nf-core/dmscore/blob/master/CITATIONS.md + +May-07 23:01:54.576 [main] DEBUG n.validation.ValidationExtension - Starting parameters validation +May-07 23:01:54.920 [main] DEBUG nextflow.validation.SchemaEvaluator - Started validating /BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/samplesheet_qc_only.csv +May-07 23:01:55.945 [main] DEBUG nextflow.validation.SchemaEvaluator - Validation of file 'https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/samplesheet_qc_only.csv' passed! 
+May-07 23:01:56.095 [main] DEBUG n.v.FormatDirectoryPathEvaluator - Cloud blob storage paths are not supported by 'FormatDirectoryPathEvaluator': 's3://ngi-igenomes/igenomes/' +May-07 23:01:56.099 [main] DEBUG n.validation.ValidationExtension - Finishing parameters validation +May-07 23:01:56.178 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_medium` matches labels `process_medium` for process with name NFCORE_DMSCORE:DMSCORE:FASTQC +May-07 23:01:56.182 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:FASTQC` matches process NFCORE_DMSCORE:DMSCORE:FASTQC +May-07 23:01:56.196 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.196 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.202 [main] DEBUG nextflow.executor.Executor - [warm up] executor > local +May-07 23:01:56.206 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=8; memory=8 GB; capacity=8; pollInterval=100ms; dumpInterval=5m +May-07 23:01:56.209 [main] DEBUG n.processor.TaskPollingMonitor - >>> barrier register (monitor: local) +May-07 23:01:56.359 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:MULTIQC +May-07 23:01:56.360 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:MULTIQC` matches process NFCORE_DMSCORE:DMSCORE:MULTIQC +May-07 23:01:56.362 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.363 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.372 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:BWA_INDEX +May-07 23:01:56.373 [main] DEBUG nextflow.script.ProcessConfig - 
Config settings `withName:BWA_INDEX` matches process NFCORE_DMSCORE:DMSCORE:BWA_INDEX +May-07 23:01:56.375 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.375 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.391 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_high` matches labels `process_high` for process with name NFCORE_DMSCORE:DMSCORE:BWA_MEM +May-07 23:01:56.391 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:BWA_MEM` matches process NFCORE_DMSCORE:DMSCORE:BWA_MEM +May-07 23:01:56.394 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.395 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.404 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:BAMFILTER_DMS +May-07 23:01:56.406 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:BAMFILTER_DMS` matches process NFCORE_DMSCORE:DMSCORE:BAMFILTER_DMS +May-07 23:01:56.409 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.409 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.418 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_medium` matches labels `process_medium` for process with name NFCORE_DMSCORE:DMSCORE:PREMERGE +May-07 23:01:56.418 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:PREMERGE` matches process NFCORE_DMSCORE:DMSCORE:PREMERGE +May-07 23:01:56.420 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.420 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.429 [main] DEBUG nextflow.script.ProcessConfig - Config settings 
`withLabel:process_high` matches labels `process_high` for process with name NFCORE_DMSCORE:DMSCORE:GATK_SATURATIONMUTAGENESIS +May-07 23:01:56.430 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:GATK_SATURATIONMUTAGENESIS` matches process NFCORE_DMSCORE:DMSCORE:GATK_SATURATIONMUTAGENESIS +May-07 23:01:56.433 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.433 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.443 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ +May-07 23:01:56.443 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:DMSANALYSIS_AASEQ` matches process NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ +May-07 23:01:56.446 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.446 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.458 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS +May-07 23:01:56.459 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withName:DMSANALYSIS_POSSIBLE_MUTATIONS` matches process NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS +May-07 23:01:56.461 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.461 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.480 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:process_single` matches labels `process_single` for process with name NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK +May-07 23:01:56.481 [main] DEBUG nextflow.script.ProcessConfig - Config settings 
`withName:DMSANALYSIS_PROCESS_GATK` matches process NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK +May-07 23:01:56.484 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null +May-07 23:01:56.484 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local' +May-07 23:01:56.499 [main] DEBUG nextflow.Session - Config process names validation disabled as requested +May-07 23:01:56.500 [main] DEBUG nextflow.Session - Igniting dataflow network (43) +May-07 23:01:56.508 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:FASTQC +May-07 23:01:56.508 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:MULTIQC +May-07 23:01:56.509 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:BWA_INDEX +May-07 23:01:56.509 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:BWA_MEM +May-07 23:01:56.511 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:BAMFILTER_DMS +May-07 23:01:56.512 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:PREMERGE +May-07 23:01:56.512 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:GATK_SATURATIONMUTAGENESIS +May-07 23:01:56.512 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ +May-07 23:01:56.512 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS +May-07 23:01:56.512 [main] DEBUG nextflow.processor.TaskProcessor - Starting process > NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK +May-07 23:01:56.514 [main] DEBUG nextflow.script.ScriptRunner - Parsed script files: + Script_e5f965e09aa3641b: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/bamprocessing/premerge.nf + Script_113f045ba19c2c46: 
/Users/benjaminwehnert/dmscore/./workflows/../modules/nf-core/fastqc/main.nf + Script_4f612cfd8c52e8cd: /Users/benjaminwehnert/dmscore/./subworkflows/local/utils_nfcore_dmscore_pipeline/../../nf-core/utils_nextflow_pipeline/main.nf + Script_215a636da7ab24a2: /Users/benjaminwehnert/dmscore/./workflows/../subworkflows/local/utils_nfcore_dmscore_pipeline/../../nf-core/utils_nfschema_plugin/main.nf + Script_68606fccdecc54d9: /Users/benjaminwehnert/dmscore/./subworkflows/local/utils_nfcore_dmscore_pipeline/main.nf + Script_0f3077e98f6a93f6: /Users/benjaminwehnert/dmscore/./workflows/../modules/nf-core/bwa/mem/main.nf + Script_5c4e8d4051efa81e: /Users/benjaminwehnert/dmscore/./subworkflows/local/utils_nfcore_dmscore_pipeline/../../nf-core/utils_nfcore_pipeline/main.nf + Script_3bf8120d0b5bcde1: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/dmsanalysis/possiblemutations.nf + Script_d65a0ba0a319d7db: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/dmsanalysis/aaseq.nf + Script_4ff7366a79e0ef06: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/bamprocessing/bamfilteringdms.nf + Script_c568aae5b239c1c5: /Users/benjaminwehnert/dmscore/main.nf + Script_09ccfa79b2802f41: /Users/benjaminwehnert/dmscore/./workflows/dmscore.nf + Script_4955288afd8ca61e: /Users/benjaminwehnert/dmscore/./workflows/../modules/nf-core/multiqc/main.nf + Script_a1878766b1a6b241: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/dmsanalysis/processgatk.nf + Script_37dcd664b2773148: /Users/benjaminwehnert/dmscore/./workflows/../modules/local/gatk/saturationmutagenesis.nf + Script_25aa31c18d513c61: /Users/benjaminwehnert/dmscore/./workflows/../modules/nf-core/bwa/index/main.nf +May-07 23:01:56.514 [main] DEBUG nextflow.script.ScriptRunner - > Awaiting termination +May-07 23:01:56.514 [main] DEBUG nextflow.Session - Session await +May-07 23:01:56.605 [Actor Thread 1] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 1; slices: 1; internal sort time: 
0.001 s; external sort time: 0.011 s; total time: 0.012 s +May-07 23:01:56.605 [Actor Thread 2] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 1; slices: 1; internal sort time: 0.001 s; external sort time: 0.011 s; total time: 0.012 s +May-07 23:01:56.611 [Actor Thread 2] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /Users/benjaminwehnert/dmscore/work/collect-file/4640128d9cfd4a45636425a4f43db374 +May-07 23:01:56.611 [Actor Thread 1] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /Users/benjaminwehnert/dmscore/work/collect-file/04a58cc29ef1c76e5af4e2fb20a13ad7 +May-07 23:01:56.619 [Actor Thread 1] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /var/folders/r0/ldrzd4wn1s3516hy0vsn8xzm0000gn/T/nxf-1651345332501275985 +May-07 23:01:56.619 [Actor Thread 2] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /var/folders/r0/ldrzd4wn1s3516hy0vsn8xzm0000gn/T/nxf-12321695686053957190 +May-07 23:01:56.623 [Actor Thread 3] DEBUG nextflow.util.HashBuilder - [WARN] Unknown hashing type: class Script_09ccfa79b2802f41$_runScript_closure1$_closure26 +May-07 23:01:56.624 [Actor Thread 14] DEBUG nextflow.util.HashBuilder - Unable to get file attributes file: /NULL -- Cause: java.nio.file.NoSuchFileException: /NULL +May-07 23:01:56.697 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:01:56.698 [Task submitter] INFO nextflow.Session - [d4/e141d7] Submitted process > NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ (amino_acid_sequence) +May-07 23:01:56.730 [FileTransfer-2] DEBUG nextflow.file.FilePorter - Copying foreign file https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/pMS190_GID1A_SUNi_S2_1_R2_50k.fastq to work dir: /Users/benjaminwehnert/dmscore/work/stage-2397b75e-3882-46d7-ba2c-8549f8a2b4a6/6a/8815724cee5e2e4e17200e98e8c0ad/pMS190_GID1A_SUNi_S2_1_R2_50k.fastq +May-07 23:01:56.730 [FileTransfer-1] 
DEBUG nextflow.file.FilePorter - Copying foreign file https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/pMS190_GID1A_SUNi_S2_1_R1_50k.fastq to work dir: /Users/benjaminwehnert/dmscore/work/stage-2397b75e-3882-46d7-ba2c-8549f8a2b4a6/40/dfe72a8fa5ce1b038fdf1aa1fb4749/pMS190_GID1A_SUNi_S2_1_R1_50k.fastq +May-07 23:01:58.744 [Actor Thread 15] INFO nextflow.file.FilePorter - Staging foreign file: https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/pMS190_GID1A_SUNi_S2_1_R1_50k.fastq +May-07 23:02:00.581 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 3; name: NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ (amino_acid_sequence); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/d4/e141d76a7c3b4130080bcdf8a831a3] +May-07 23:02:00.585 [Task monitor] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'TaskFinalizer' minSize=10; maxSize=24; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false +May-07 23:02:00.612 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:00.613 [Task submitter] INFO nextflow.Session - [49/349fd9] Submitted process > NFCORE_DMSCORE:DMSCORE:BWA_INDEX (GID1A_SUNi_ref_small.fasta) +May-07 23:02:00.663 [TaskFinalizer-1] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'PublishDir' minSize=10; maxSize=24; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false +May-07 23:02:00.756 [Actor Thread 15] INFO nextflow.file.FilePorter - Staging foreign file: https://raw.githubusercontent.com/BenjaminWehnert1008/test-datasets/dmsqc/dmsqc/pMS190_GID1A_SUNi_S2_1_R2_50k.fastq +May-07 23:02:00.757 [Actor Thread 4] WARN nextflow.processor.TaskContext - Cannot serialize context map. 
Cause: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue -- Resume will not work on this process +May-07 23:02:00.763 [Actor Thread 4] DEBUG nextflow.processor.TaskContext - Failed to serialize delegate map items: [ + 'meta':[Script_09ccfa79b2802f41$_runScript_closure1$_closure26] = + 'pos_range':[java.lang.String] = 352-1383 + '$':[java.lang.Boolean] = true + 'wt_seq':[nextflow.processor.TaskPath] = GID1A_SUNi_ref_small.fasta + 'script':[nextflow.processor.TaskPath] = aa_seq.R + 'task':[nextflow.processor.TaskConfig] = [container:community.wave.seqera.io/library/bioconductor-biostrings_r-base_r-biocmanager_r-dplyr_pruned:0fd2e39a5bf2ecaa, withName:PREMERGE:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure9$_closure21@26dc7ffc]], memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure15@13f9d575, withName:FASTQC:[ext:[args:--quiet], containerOptions:], withLabel:error_retry:[errorStrategy:retry, maxRetries:2], when:nextflow.script.TaskClosure@264d7648, withLabel:process_high:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure23@1b36f3f3, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure24@5e504cf6, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure25@35f6834], withLabel:error_ignore:[errorStrategy:ignore], resourceLimits:[cpus:4, memory:8.GB, time:1.h], withName:DMSANALYSIS_AASEQ:[publishDir:[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure13$_closure25@4eeda121]], withLabel:process_high_memory:[memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure11$_closure27@62e86a64], withName:BAMFILTER_DMS:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, 
saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure8$_closure20@3e03bd33]], publishDir:[[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure13$_closure25@4eeda121]], withName:BWA_MEM:[publishDir:[path:./results/intermediate_files/bam_files/bwa/mem, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure7$_closure19@46e56c0f]], executor:local, stub:nextflow.script.TaskClosure@6522395b, conda:null/environment.yml, withName:GATK_SATURATIONMUTAGENESIS:[publishDir:[path:./results/intermediate_files/gatk, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure10$_closure22@4438d4d1]], cacheable:true, withLabel:process_low:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure17@67f11340, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure18@a998ea5, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure19@7e85964c], withLabel:process_medium:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure20@7c995b11, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure21@131d3cd1, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure22@55b774b1], tag:amino_acid_sequence, withName:MULTIQC:[ext:[args:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure15@7d2bfbd], publishDir:[path:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure16@31a52d85, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure17@4ba464d4]], workDir:/Users/benjaminwehnert/dmscore/work/d4/e141d76a7c3b4130080bcdf8a831a3, exitStatus:0, ext:[:], withLabel:process_single:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure14@255893ed, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure15@13f9d575, 
time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure16@37e5efac], withName:BWA_INDEX:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure6$_closure18@2f3435d0]], process:NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ, debug:false, cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure14@255893ed, index:1, label:[process_single], withLabel:process_long:[time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure10$_closure26@475e6626], maxRetries:1, maxErrors:-1, shell:bash + +set -e # Exit if a tool returns a non-zero status/exit code +set -u # Treat unset variables and parameters as an error +set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute +set -C # No clobber - prevent output redirection from overwriting files. +, withName:DMSANALYSIS_POSSIBLE_MUTATIONS:[publishDir:[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure11$_closure23@267852db]], name:NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_AASEQ (amino_acid_sequence), containerOptions:-u $(id -u):$(id -g), errorStrategy:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure5@3e785137, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure16@37e5efac, withName:DMSANALYSIS_PROCESS_GATK:[publishDir:[path:./results/intermediate_files/processed_gatk_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure14$_closure26@30ec799d]], hash:d4e141d76a7c3b4130080bcdf8a831a3] +] +com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue +Serialization trace: +queue (org.codehaus.groovy.util.ReferenceManager$CallBackedManager) +manager 
(org.codehaus.groovy.util.ReferenceManager$1) +manager (org.codehaus.groovy.util.ReferenceBundle) +bundle (org.codehaus.groovy.reflection.CachedClass$4) +cachedSuperClass (org.codehaus.groovy.reflection.stdclasses.ObjectCachedClass) +cachedClass (org.codehaus.groovy.reflection.CachedMethod) +allMethods (groovy.lang.MetaClassImpl) +delegate (groovy.runtime.metaclass.NextflowDelegatingMetaClass) +metaClass (groovyx.gpars.dataflow.DataflowVariable) +first (groovyx.gpars.dataflow.stream.DataflowStream) +asyncHead (groovyx.gpars.dataflow.stream.DataflowStreamReadAdapter) +source (nextflow.extension.MapOp) +owner (nextflow.extension.MapOp$_apply_closure1) +code (groovyx.gpars.dataflow.operator.DataflowOperatorActor) +actor (groovyx.gpars.dataflow.operator.DataflowOperator) +allOperators (nextflow.Session) +session (nextflow.validation.ValidationExtension) +target (nextflow.script.FunctionDef) +definitions (nextflow.script.ScriptMeta) +meta (nextflow.script.ScriptBinding) +binding (Script_09ccfa79b2802f41) +delegate (Script_09ccfa79b2802f41$_runScript_closure1) +delegate (Script_09ccfa79b2802f41$_runScript_closure1$_closure26) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:82) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:82) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:82) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:95) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:21) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:95) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:21) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321) + at nextflow.util.KryoHelper.serialize(SerializationHelper.groovy:166) + at 
org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321) + at nextflow.processor.TaskContext.serialize(TaskContext.groovy:198) + at nextflow.cache.CacheDB.writeTaskEntry0(CacheDB.groovy:148) + at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) + at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) + at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) + at java.base/java.lang.reflect.Method.invoke(Method.java:569) + at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343) + at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328) + at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333) + at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088) + at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007) + at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:645) + at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:628) + at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:82) + at nextflow.cache.CacheDB$_putTaskAsync_closure1.doCall(CacheDB.groovy:157) + at nextflow.cache.CacheDB$_putTaskAsync_closure1.call(CacheDB.groovy) + at groovyx.gpars.agent.AgentBase.onMessage(AgentBase.java:102) + at groovyx.gpars.agent.Agent.handleMessage(Agent.java:84) + at groovyx.gpars.agent.AgentCore$1.handleMessage(AgentCore.java:48) + at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132) + at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) + at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) + at java.base/java.lang.Thread.run(Thread.java:840) +Caused by: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: 
java.lang.ref.ReferenceQueue + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:48) + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) + at com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:351) + at com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:344) + at com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) + at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:461) + at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:79) + at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:488) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:57) + ... 106 common frames omitted +Caused by: java.lang.reflect.InvocationTargetException: null + at jdk.internal.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source) + at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) + at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500) + at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481) + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:35) + ... 
114 common frames omitted +Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make field private final java.lang.ref.ReferenceQueue$Lock java.lang.ref.ReferenceQueue.lock accessible: module java.base does not "opens java.lang.ref" to unnamed module @7b02881e + at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354) + at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297) + at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:178) + at java.base/java.lang.reflect.Field.setAccessible(Field.java:172) + at com.esotericsoftware.kryo.serializers.FieldSerializer.buildValidFields(FieldSerializer.java:282) + at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:217) + at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:156) + at com.esotericsoftware.kryo.serializers.FieldSerializer.<init>(FieldSerializer.java:133) + ... 119 common frames omitted +May-07 23:02:02.378 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: NFCORE_DMSCORE:DMSCORE:BWA_INDEX (GID1A_SUNi_ref_small.fasta); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/49/349fd93ff7dd98e92e426a30457fb9] +May-07 23:02:02.387 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:02.388 [Task submitter] INFO nextflow.Session - [36/b138e8] Submitted process > NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS (table /w all possible variants) +May-07 23:02:02.422 [Actor Thread 8] WARN nextflow.processor.TaskContext - Cannot serialize context map. 
Cause: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue -- Resume will not work on this process +May-07 23:02:13.067 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 2; name: NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS (table /w all possible variants); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/36/b138e8e27f7423eea98b17aaf74451] +May-07 23:02:13.085 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:13.085 [Task submitter] INFO nextflow.Session - [67/c2a06b] Submitted process > NFCORE_DMSCORE:DMSCORE:BWA_MEM (gid1a_1_quality_1_pe) +May-07 23:02:13.133 [Actor Thread 6] WARN nextflow.processor.TaskContext - Cannot serialize context map. Cause: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue -- Resume will not work on this process +May-07 23:02:13.136 [Actor Thread 6] DEBUG nextflow.processor.TaskContext - Failed to serialize delegate map items: [ + 'meta':[Script_09ccfa79b2802f41$_runScript_closure1$_closure26] = + 'pos_range':[java.lang.String] = 352-1383 + 'mutagenesis_type':[java.lang.String] = max_diff_to_wt + '$':[java.lang.Boolean] = true + 'wt_seq':[nextflow.processor.TaskPath] = GID1A_SUNi_ref_small.fasta + 'custom_codon_library':[nextflow.processor.TaskPath] = NULL + 'script':[nextflow.processor.TaskPath] = possible_mutations.R + 'task':[nextflow.processor.TaskConfig] = [container:community.wave.seqera.io/library/bioconductor-biostrings_r-base_r-biocmanager_r-dplyr_pruned:0fd2e39a5bf2ecaa, withName:PREMERGE:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure9$_closure21@26dc7ffc]], 
memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure15@13f9d575, withName:FASTQC:[ext:[args:--quiet], containerOptions:], withLabel:error_retry:[errorStrategy:retry, maxRetries:2], when:nextflow.script.TaskClosure@5f4899a5, withLabel:process_high:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure23@1b36f3f3, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure24@5e504cf6, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure9$_closure25@35f6834], withLabel:error_ignore:[errorStrategy:ignore], resourceLimits:[cpus:4, memory:8.GB, time:1.h], withName:DMSANALYSIS_AASEQ:[publishDir:[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure13$_closure25@4eeda121]], withLabel:process_high_memory:[memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure11$_closure27@62e86a64], withName:BAMFILTER_DMS:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure8$_closure20@3e03bd33]], publishDir:[[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure11$_closure23@267852db]], withName:BWA_MEM:[publishDir:[path:./results/intermediate_files/bam_files/bwa/mem, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure7$_closure19@46e56c0f]], executor:local, stub:nextflow.script.TaskClosure@7069093e, conda:null/environment.yml, withName:GATK_SATURATIONMUTAGENESIS:[publishDir:[path:./results/intermediate_files/gatk, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure10$_closure22@4438d4d1]], cacheable:true, withLabel:process_low:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure17@67f11340, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure18@a998ea5, 
time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure7$_closure19@7e85964c], withLabel:process_medium:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure20@7c995b11, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure21@131d3cd1, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure8$_closure22@55b774b1], tag:table /w all possible variants, withName:MULTIQC:[ext:[args:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure15@7d2bfbd], publishDir:[path:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure16@31a52d85, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure5$_closure17@4ba464d4]], workDir:/Users/benjaminwehnert/dmscore/work/36/b138e8e27f7423eea98b17aaf74451, exitStatus:0, ext:[:], withLabel:process_single:[cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure14@255893ed, memory:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure15@13f9d575, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure16@37e5efac], withName:BWA_INDEX:[publishDir:[path:./results/intermediate_files/bam_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure6$_closure18@2f3435d0]], process:NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS, debug:false, cpus:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure14@255893ed, index:1, label:[process_single], withLabel:process_long:[time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure10$_closure26@475e6626], maxRetries:1, maxErrors:-1, shell:bash + +set -e # Exit if a tool returns a non-zero status/exit code +set -u # Treat unset variables and parameters as an error +set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute +set -C # No clobber - prevent output redirection from overwriting files. 
+, withName:DMSANALYSIS_POSSIBLE_MUTATIONS:[publishDir:[path:./results/intermediate_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure11$_closure23@267852db]], name:NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_POSSIBLE_MUTATIONS (table /w all possible variants), containerOptions:-u $(id -u):$(id -g), errorStrategy:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure5@3e785137, time:Script126E8BAA3BC9692B58F2A1C965A1D472$_run_closure1$_closure6$_closure16@37e5efac, withName:DMSANALYSIS_PROCESS_GATK:[publishDir:[path:./results/intermediate_files/processed_gatk_files, mode:copy, saveAs:ScriptE8646A4B8FFA7429020F836DC9CB8146$_run_closure1$_closure14$_closure26@30ec799d]], hash:36b138e8e27f7423eea98b17aaf74451] +] +com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue +Serialization trace: +queue (org.codehaus.groovy.util.ReferenceManager$CallBackedManager) +manager (org.codehaus.groovy.util.ReferenceManager$1) +manager (org.codehaus.groovy.util.ReferenceBundle) +bundle (org.codehaus.groovy.reflection.CachedClass$4) +cachedSuperClass (org.codehaus.groovy.reflection.stdclasses.ObjectCachedClass) +cachedClass (org.codehaus.groovy.reflection.CachedMethod) +allMethods (groovy.lang.MetaClassImpl) +delegate (groovy.runtime.metaclass.NextflowDelegatingMetaClass) +metaClass (groovyx.gpars.dataflow.DataflowVariable) +first (groovyx.gpars.dataflow.stream.DataflowStream) +asyncHead (groovyx.gpars.dataflow.stream.DataflowStreamReadAdapter) +source (nextflow.extension.MapOp) +owner (nextflow.extension.MapOp$_apply_closure1) +code (groovyx.gpars.dataflow.operator.DataflowOperatorActor) +actor (groovyx.gpars.dataflow.operator.DataflowOperator) +allOperators (nextflow.Session) +session (nextflow.validation.ValidationExtension) +target (nextflow.script.FunctionDef) +definitions (nextflow.script.ScriptMeta) 
+meta (nextflow.script.ScriptBinding) +binding (Script_09ccfa79b2802f41) +delegate (Script_09ccfa79b2802f41$_runScript_closure1) +delegate (Script_09ccfa79b2802f41$_runScript_closure1$_closure26) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:82) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:82) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at 
com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:82) + at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:95) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:21) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at 
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) + at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:95) + at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:21) + at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) + at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321) + at nextflow.util.KryoHelper.serialize(SerializationHelper.groovy:166) + at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321) + at nextflow.processor.TaskContext.serialize(TaskContext.groovy:198) + at nextflow.cache.CacheDB.writeTaskEntry0(CacheDB.groovy:148) + at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) + at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) + at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) + at java.base/java.lang.reflect.Method.invoke(Method.java:569) + at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343) + at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328) + at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333) + at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088) + at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007) + at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:645) + at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:628) + at 
org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:82) + at nextflow.cache.CacheDB$_putTaskAsync_closure1.doCall(CacheDB.groovy:157) + at nextflow.cache.CacheDB$_putTaskAsync_closure1.call(CacheDB.groovy) + at groovyx.gpars.agent.AgentBase.onMessage(AgentBase.java:102) + at groovyx.gpars.agent.Agent.handleMessage(Agent.java:84) + at groovyx.gpars.agent.AgentCore$1.handleMessage(AgentCore.java:48) + at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132) + at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) + at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) + at java.base/java.lang.Thread.run(Thread.java:840) +Caused by: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:48) + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) + at com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:351) + at com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:344) + at com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) + at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:461) + at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:79) + at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:488) + at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:57) + ... 
106 common frames omitted +Caused by: java.lang.reflect.InvocationTargetException: null + at jdk.internal.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source) + at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) + at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500) + at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481) + at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:35) + ... 114 common frames omitted +Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make field private final java.lang.ref.ReferenceQueue$Lock java.lang.ref.ReferenceQueue.lock accessible: module java.base does not "opens java.lang.ref" to unnamed module @7b02881e + at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354) + at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297) + at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:178) + at java.base/java.lang.reflect.Field.setAccessible(Field.java:172) + at com.esotericsoftware.kryo.serializers.FieldSerializer.buildValidFields(FieldSerializer.java:282) + at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:217) + at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:156) + at com.esotericsoftware.kryo.serializers.FieldSerializer.<init>(FieldSerializer.java:133) + ... 
119 common frames omitted +May-07 23:02:17.570 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 5; name: NFCORE_DMSCORE:DMSCORE:BWA_MEM (gid1a_1_quality_1_pe); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/67/c2a06b1c7b4781da339f3e7aacd574] +May-07 23:02:17.585 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:17.586 [Task submitter] INFO nextflow.Session - [48/c69654] Submitted process > NFCORE_DMSCORE:DMSCORE:FASTQC (gid1a_1_quality_1_pe) +May-07 23:02:17.748 [TaskFinalizer-4] DEBUG nextflow.processor.TaskProcessor - Process NFCORE_DMSCORE:DMSCORE:BWA_MEM > Skipping output binding because one or more optional files are missing: fileoutparam<1:1> +May-07 23:02:17.749 [TaskFinalizer-4] DEBUG nextflow.processor.TaskProcessor - Process NFCORE_DMSCORE:DMSCORE:BWA_MEM > Skipping output binding because one or more optional files are missing: fileoutparam<2:1> +May-07 23:02:17.749 [TaskFinalizer-4] DEBUG nextflow.processor.TaskProcessor - Process NFCORE_DMSCORE:DMSCORE:BWA_MEM > Skipping output binding because one or more optional files are missing: fileoutparam<3:1> +May-07 23:02:17.828 [Actor Thread 14] WARN nextflow.processor.TaskContext - Cannot serialize context map. 
Cause: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.lang.ref.ReferenceQueue -- Resume will not work on this process +May-07 23:02:28.533 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 4; name: NFCORE_DMSCORE:DMSCORE:FASTQC (gid1a_1_quality_1_pe); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/48/c69654e9b29a194ea81b7c28f63819] +May-07 23:02:28.582 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:28.583 [Task submitter] INFO nextflow.Session - [4e/7ee9bd] Submitted process > NFCORE_DMSCORE:DMSCORE:BAMFILTER_DMS (gid1a_1_quality_1_pe) +May-07 23:02:28.752 [Actor Thread 12] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 2; slices: 1; internal sort time: 0.013 s; external sort time: 0.008 s; total time: 0.021 s +May-07 23:02:28.776 [Actor Thread 12] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /Users/benjaminwehnert/dmscore/work/collect-file/c1149f4a215bbe0651400d7f32cdafdc +May-07 23:02:28.784 [Actor Thread 12] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /var/folders/r0/ldrzd4wn1s3516hy0vsn8xzm0000gn/T/nxf-15345989576056151242 +May-07 23:02:32.866 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 6; name: NFCORE_DMSCORE:DMSCORE:BAMFILTER_DMS (gid1a_1_quality_1_pe); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/4e/7ee9bd66291d03ed6f3c2bbdfefe0d] +May-07 23:02:32.872 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:32.873 [Task submitter] INFO nextflow.Session - [55/5dd043] Submitted process > NFCORE_DMSCORE:DMSCORE:MULTIQC +May-07 23:02:43.262 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 7; name: 
NFCORE_DMSCORE:DMSCORE:MULTIQC; status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/55/5dd0430083da5c332e7950544ac6e0] +May-07 23:02:43.292 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:43.294 [Task submitter] INFO nextflow.Session - [aa/6e83ae] Submitted process > NFCORE_DMSCORE:DMSCORE:PREMERGE (gid1a_1_quality_1_pe) +May-07 23:02:43.298 [TaskFinalizer-7] DEBUG nextflow.processor.TaskProcessor - Process NFCORE_DMSCORE:DMSCORE:MULTIQC > Skipping output binding because one or more optional files are missing: fileoutparam<2> +May-07 23:02:46.384 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 8; name: NFCORE_DMSCORE:DMSCORE:PREMERGE (gid1a_1_quality_1_pe); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/aa/6e83aebc1b031afabb4d8df09e97f3] +May-07 23:02:46.488 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run +May-07 23:02:46.489 [Task submitter] INFO nextflow.Session - [f2/f878d3] Submitted process > NFCORE_DMSCORE:DMSCORE:GATK_SATURATIONMUTAGENESIS (gid1a_1_quality_1_pe) +May-07 23:03:05.753 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 9; name: NFCORE_DMSCORE:DMSCORE:GATK_SATURATIONMUTAGENESIS (gid1a_1_quality_1_pe); status: COMPLETED; exit: 0; error: -; workDir: /Users/benjaminwehnert/dmscore/work/f2/f878d301d101b20d245c27be19bbf3] +May-07 23:03:05.837 [Actor Thread 2] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for + task: name=NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK (1); work-dir=null + error [nextflow.exception.ProcessUnrecoverableException]: Path value cannot be null +May-07 23:03:05.858 [Actor Thread 2] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK (1)' + +Caused by: + Path value cannot be 
null + + + +Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run` +May-07 23:03:05.860 [Actor Thread 2] INFO nextflow.Session - Execution cancelled -- Finishing pending tasks before exit +May-07 23:03:05.865 [Actor Thread 2] ERROR nextflow.Nextflow - Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting +May-07 23:03:05.866 [main] DEBUG nextflow.Session - Session await > all processes finished +May-07 23:03:05.867 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll loop +May-07 23:03:05.867 [main] DEBUG nextflow.Session - Session await > all barriers passed +May-07 23:03:05.868 [Actor Thread 12] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for + task: name=NFCORE_DMSCORE:DMSCORE:DMSANALYSIS_PROCESS_GATK; work-dir=null + error [java.lang.InterruptedException]: java.lang.InterruptedException +May-07 23:03:05.872 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'TaskFinalizer' shutdown completed (hard=false) +May-07 23:03:05.872 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'PublishDir' shutdown completed (hard=false) +May-07 23:03:05.880 [main] INFO nextflow.Nextflow - -[nf-core/dmscore] Pipeline completed with errors- +May-07 23:03:05.897 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=9; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=2m 24s; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=2; peakCpus=8; peakMemory=16 GB; ] +May-07 23:03:05.898 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file +May-07 23:03:05.900 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report +May-07 23:03:07.314 [main] 
DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline +May-07 23:03:07.511 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done +May-07 23:03:07.545 [main] INFO org.pf4j.AbstractPluginManager - Stop plugin 'nf-schema@2.3.0' +May-07 23:03:07.545 [main] DEBUG nextflow.plugin.BasePlugin - Plugin stopped nf-schema +May-07 23:03:07.546 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'FileTransfer' shutdown completed (hard=false) +May-07 23:03:07.547 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye diff --git a/main.nf b/main.nf index 8c9943a..41a451c 100644 --- a/main.nf +++ b/main.nf @@ -1,11 +1,11 @@ #!/usr/bin/env nextflow /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - nf-core/dmscore + nf-core/deepmutscan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Github : https://github.com/nf-core/dmscore - Website: https://nf-co.re/dmscore - Slack : https://nfcore.slack.com/channels/dmscore + Github : https://github.com/nf-core/deepmutscan + Website: https://nf-co.re/deepmutscan + Slack : https://nfcore.slack.com/channels/deepmutscan ---------------------------------------------------------------------------------------- */ @@ -15,10 +15,10 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { DMSCORE } from './workflows/dmscore' -include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_dmscore_pipeline' -include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_dmscore_pipeline' -include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_dmscore_pipeline' +include { DEEPMUTSCAN } from './workflows/deepmutscan' +include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_deepmutscan_pipeline' +include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_deepmutscan_pipeline' +include { 
getGenomeAttribute } from './subworkflows/local/utils_nfcore_deepmutscan_pipeline' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -26,9 +26,6 @@ include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_dmsc ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -// TODO nf-core: Remove this line if you don't need a FASTA file -// This is an example of how to use getGenomeAttribute() to fetch parameters -// from igenomes.config using `--genome` params.fasta = getGenomeAttribute('fasta') /* @@ -40,7 +37,7 @@ params.fasta = getGenomeAttribute('fasta') // // WORKFLOW: Run main analysis pipeline depending on type of input // -workflow NFCORE_DMSCORE { +workflow NFCORE_DEEPMUTSCAN { take: samplesheet // channel: samplesheet read in from --input @@ -50,12 +47,17 @@ workflow NFCORE_DMSCORE { // // WORKFLOW: Run pipeline // - DMSCORE ( - samplesheet + DEEPMUTSCAN ( + samplesheet, + params.multiqc_config, + params.multiqc_logo, + params.multiqc_methods_description, + params.outdir, ) emit: - multiqc_report = DMSCORE.out.multiqc_report // channel: /path/to/multiqc_report.html + multiqc_report = DEEPMUTSCAN.out.multiqc_report // channel: /path/to/multiqc_report.html } + /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ RUN MAIN WORKFLOW @@ -80,7 +82,7 @@ workflow { // // WORKFLOW: Run main workflow // - NFCORE_DMSCORE ( + NFCORE_DEEPMUTSCAN ( PIPELINE_INITIALISATION.out.samplesheet ) // @@ -92,8 +94,7 @@ workflow { params.plaintext_email, params.outdir, params.monochrome_logs, - params.hook_url, - NFCORE_DMSCORE.out.multiqc_report + NFCORE_DEEPMUTSCAN.out.multiqc_report ) } diff --git a/modules.json b/modules.json index 910602b..2d49f3a 100644 --- a/modules.json +++ b/modules.json @@ -1,18 +1,28 @@ { - "name": "nf-core/dmscore", - "homePage": "https://github.com/nf-core/dmscore", + "name": "nf-core/deepmutscan", + "homePage": 
"https://github.com/nf-core/deepmutscan", "repos": { "https://github.com/nf-core/modules.git": { "modules": { "nf-core": { + "bwa/index": { + "branch": "master", + "git_sha": "6d46786420b4d7bc88eba026eb389c0c5535d120", + "installed_by": ["modules"] + }, + "bwa/mem": { + "branch": "master", + "git_sha": "2fb127c8fd13de0adaa676df7169131e45c0b114", + "installed_by": ["modules"] + }, "fastqc": { "branch": "master", - "git_sha": "dc94b6ee04a05ddb9f7ae050712ff30a13149164", + "git_sha": "6d46786420b4d7bc88eba026eb389c0c5535d120", "installed_by": ["modules"] }, "multiqc": { "branch": "master", - "git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d", + "git_sha": "98403d15b0e50edae1f3fec5eae5e24982f1fade", "installed_by": ["modules"] } } @@ -21,17 +31,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "c2b22d85f30a706a3073387f30380704fcae013b", + "git_sha": "1a545fcbd762911c21a64ced3dbef99b2b51ac75", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "51ae5406a030d4da1e49e4dab49756844fdd6c7a", + "git_sha": "a3fb7351b1fdb2b1de282b765816bbea190e86a8", "installed_by": ["subworkflows"] }, "utils_nfschema_plugin": { "branch": "master", - "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", + "git_sha": "a7b27fd25bfa8dcc07d299e88bd790585901a436", "installed_by": ["subworkflows"] } } diff --git a/modules/local/bamprocessing/bam_filter/environment.yml b/modules/local/bamprocessing/bam_filter/environment.yml new file mode 100644 index 0000000..a5338f6 --- /dev/null +++ b/modules/local/bamprocessing/bam_filter/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::samtools=1.21 diff --git a/modules/local/bamprocessing/bam_filter/main.nf b/modules/local/bamprocessing/bam_filter/main.nf new file mode 100644 index 0000000..900bcc1 --- /dev/null +++ b/modules/local/bamprocessing/bam_filter/main.nf @@ -0,0 +1,49 @@ + +process BAMFILTER_DMS { + tag "$meta.id" + label 
'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h96c455f_1': + 'biocontainers/samtools:1.21--h96c455f_1' }" + + input: + tuple val(meta), path(bam) + + output: + tuple val(meta), path("*.bam"), emit: bam + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + samtools view -h -F 4 -F 256 -q 30 $bam | \ + samtools view -h | \ + awk '{if(\$6 !~ /I/ && \$6 !~ /D/ && \$6 !~ /N/) print \$0}' | \ + samtools view -h | \ + awk '{for(i=1;i<=NF;i++) if(\$i ~ /^NM:i:/ && \$i != "NM:i:0") {print \$0; next}} \$1 ~ /^@/' | \ + samtools view -bS > ${prefix}_filtered.bam + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(samtools --version |& sed '1!d ; s/samtools //') +END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}_filtered.bam + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(samtools --version |& sed '1!d ; s/samtools //') +END_VERSIONS + """ +} diff --git a/modules/local/bamprocessing/bam_filter/meta.yml b/modules/local/bamprocessing/bam_filter/meta.yml new file mode 100644 index 0000000..78c92ea --- /dev/null +++ b/modules/local/bamprocessing/bam_filter/meta.yml @@ -0,0 +1,48 @@ +name: "bamfilter_dms" +description: Filters BAM files specifically for Deep Mutational Scanning (DMS) analysis by removing unmapped reads, low-quality alignments, secondary alignments, reads with indels or skipped regions (CIGAR I/D/N), and perfectly matching wild-type reads. 
+keywords: + - dms + - bam + - filtering + - samtools + - indels +tools: + - "samtools": + description: "Tools for manipulating next-generation sequencing data" + homepage: "http://www.htslib.org/" + documentation: "http://www.htslib.org/doc/samtools.html" + tool_dev_url: "https://github.com/samtools/samtools" + doi: "10.1093/bioinformatics/btp352" + licence: ["MIT/Expat"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - bam: + type: file + description: Input BAM file to be filtered + pattern: "*.{bam}" + +output: + - bam: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - "*.bam": + type: file + description: Filtered BAM file containing only high-quality mutated reads without indels + pattern: "*_filtered.bam" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/bamprocessing/premerge/environment.yml b/modules/local/bamprocessing/premerge/environment.yml new file mode 100644 index 0000000..0e199ab --- /dev/null +++ b/modules/local/bamprocessing/premerge/environment.yml @@ -0,0 +1,7 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bwa=0.7.19 + - bioconda::samtools=1.21 + - bioconda::vsearch=2.30.0 diff --git a/modules/local/bamprocessing/premerge/main.nf b/modules/local/bamprocessing/premerge/main.nf new file mode 100644 index 0000000..fbf433f --- /dev/null +++ b/modules/local/bamprocessing/premerge/main.nf @@ -0,0 +1,51 @@ + +process PREMERGE { + tag "$meta.id" + label 'process_medium' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/bwa_samtools_vsearch:28e8640725d3d8e9" + + input: + tuple val(meta), path(bam) + path wt_seq + + output: + tuple val(meta), path("*.bam"), emit: bam + 
path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + # Convert BAM to paired FASTQ files + samtools fastq -1 forward_reads.fastq -2 reverse_reads.fastq -0 /dev/null -s /dev/null -n $bam + + # Merge paired reads + vsearch --fastq_mergepairs forward_reads.fastq --reverse reverse_reads.fastq --fastqout merged_reads.fastq --fastq_minovlen 10 --fastq_allowmergestagger + + # Re-align merged reads + bwa index $wt_seq + bwa mem $wt_seq merged_reads.fastq | samtools view -Sb - > ${prefix}_merged.bam + + # Save version information + cat <<-END_VERSIONS > versions.yml + "${task.process}": + premerge: \$(samtools --version |& sed '1!d ; s/samtools //') +END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch merged_reads.fastq + touch merged_reads.bam + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + premerge: dummy_version +END_VERSIONS + """ +} diff --git a/modules/local/bamprocessing/premerge/meta.yml b/modules/local/bamprocessing/premerge/meta.yml new file mode 100644 index 0000000..76186fe --- /dev/null +++ b/modules/local/bamprocessing/premerge/meta.yml @@ -0,0 +1,65 @@ +name: "premerge" +description: Processes paired-end BAM files by converting them to FASTQ, merging overlapping paired reads using VSEARCH, and re-aligning the merged single-end reads to the wild-type reference using BWA. 
+keywords: + - paired-end merging + - alignment + - bwa + - vsearch + - samtools + - dms +tools: + - "samtools": + description: "Tools for manipulating next-generation sequencing data" + homepage: "http://www.htslib.org/" + documentation: "http://www.htslib.org/doc/samtools.html" + tool_dev_url: "https://github.com/samtools/samtools" + doi: "10.1093/bioinformatics/btp352" + licence: ["MIT/Expat"] + - "vsearch": + description: "Versatile open-source tool for metagenomics" + homepage: "https://github.com/torognes/vsearch" + documentation: "https://github.com/torognes/vsearch/blob/master/README.md" + doi: "10.7717/peerj.2584" + licence: ["GPL-3.0-or-later", "BSD-2-Clause"] + - "bwa": + description: "Burrows-Wheeler Aligner for short-read alignment" + homepage: "http://bio-bwa.sourceforge.net/" + documentation: "http://bio-bwa.sourceforge.net/bwa.shtml" + doi: "10.1093/bioinformatics/btp324" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - "*.bam": + type: file + description: BAM file containing the merged and re-aligned reads + pattern: "*_merged.bam" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/dmsanalysis/aa_seq/environment.yml b/modules/local/dmsanalysis/aa_seq/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/dmsanalysis/aa_seq/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/dmsanalysis/aa_seq/main.nf b/modules/local/dmsanalysis/aa_seq/main.nf new file mode 100644 index 0000000..01ac974 --- /dev/null +++ b/modules/local/dmsanalysis/aa_seq/main.nf @@ -0,0 +1,31 @@ +process DMSANALYSIS_AASEQ { + tag "amino_acid_sequence" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(wt_seq) + val pos_range + + output: + tuple val(meta), path("aa_seq.txt"), emit: aa_seq + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'aa_seq.R' + + stub: + """ + touch aa_seq.txt + echo "DMSANALYSIS_AASEQ:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/dmsanalysis/aa_seq/meta.yml b/modules/local/dmsanalysis/aa_seq/meta.yml new file mode 100644 index 0000000..46a761d --- /dev/null +++ b/modules/local/dmsanalysis/aa_seq/meta.yml @@ -0,0 +1,54 @@ +name: "dmsanalysis_aaseq" +description: Translates a wild-type DNA sequence into an amino acid sequence based on a provided coding region (ORF) range. +keywords: + - translation + - amino acid + - dna + - dms + - biostrings +tools: + - "bioconductor-biostrings": + description: "Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences" + homepage: "https://bioconductor.org/packages/Biostrings" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/Biostrings/man/Biostrings.pdf" + licence: ["Artistic-2.0"] + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/mutscan" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'ref' ]` + - wt_seq: + type: file + description: FASTA file containing the wild-type DNA reference sequence + pattern: "*.{fasta,fa}" + - - pos_range: + type: string + description: Start and stop codon positions in the format 'start-stop', e.g., '352-1383' + +output: + - aa_seq: + - meta: + type: map + description: | + Groovy Map containing sample information + - aa_seq.txt: + type: file + description: Text file containing the translated amino acid sequence + pattern: "aa_seq.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/dmsanalysis/aa_seq/templates/aa_seq.R b/modules/local/dmsanalysis/aa_seq/templates/aa_seq.R new file mode 100644 index 0000000..9e759cd --- /dev/null +++ b/modules/local/dmsanalysis/aa_seq/templates/aa_seq.R @@ -0,0 +1,65 @@ +#!/usr/bin/env Rscript + +# input: wild-type sequence, start and stop positions +# output: amino acid sequence within the start-stop frame (.txt) + +# Load necessary libraries +suppressMessages(library(Biostrings)) + +# Define the function +aa_seq <- function(wt_seq_input, pos_range, output_file) { + # Parse the start and stop positions from the input format "start-stop" + positions <- unlist(strsplit(pos_range, "-")) + start_pos <- as.numeric(positions[1]) + stop_pos <- as.numeric(positions[2]) + + # Check if the input is a file or a string + if (file.exists(wt_seq_input)) { + # If it's a file, read the sequence from the fasta file + seq_data <- readDNAStringSet(filepath = wt_seq_input) + wt_seq <- seq_data[[1]] # Extract the sequence + } else { + # Otherwise, treat the input as a sequence string + wt_seq <- DNAString(wt_seq_input) + } + + # Extract the sequence between start and stop codons + coding_seq <- subseq(wt_seq, start = start_pos, end = stop_pos) + + # Translate the coding sequence into an amino acid sequence + aa_seq <- translate(coding_seq) + + # Write 
the amino acid sequence to a .txt file + write(as.character(aa_seq), file = output_file) +} + + +##### +# run function +##### +aa_seq( + wt_seq_input = "$wt_seq", + "$pos_range", + "aa_seq.txt" + ) + + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +Biostrings_version <- as.character(packageVersion("Biostrings")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"\${task.process}":', + paste(' r-base:', r_version), + paste(' r-Biostrings:', Biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/environment.yml b/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/environment.yml new file mode 100644 index 0000000..4070243 --- /dev/null +++ b/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/environment.yml @@ -0,0 +1,4 @@ +channels: + - conda-forge +dependencies: + - conda-forge::r-base=4.4.1 diff --git a/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/templates/false-doubles_based_seq_error_correction.R b/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/templates/false-doubles_based_seq_error_correction.R new file mode 100644 index 0000000..835d438 --- /dev/null +++ b/modules/local/dmsanalysis/false-doubles_based_seq_error_correction/templates/false-doubles_based_seq_error_correction.R @@ -0,0 +1,202 @@ +#!/usr/bin/env Rscript + +## false-doubles based sequencing error correction in nf-core/deepmutscan +## 30.03.2026 + + +## --- Helper functions --- + +# function to run the false-doubles based sequencing error correction (nucleotide level) +seq_error_correct_by_false_doubles <- function(input_count_path_raw, input_count_path_processed, output_file_path){ + + ## load data (nucleotide-level counts from GATK) + + # Set the column names + colnames <- 
c("counts", "cov", "mean_length_variant_reads", "varying_bases", + "base_mut", "varying_codons", "codon_mut", "aa_mut", "pos_mut") + input.counts.raw <- read.table(input_count_path_raw, sep = "\t", header = FALSE, fill = TRUE, col.names = colnames) + input.counts.processed <- read.csv(input_count_path_processed) + + ## append key columns + input.counts.processed$counts_corrected <- input.counts.processed$counts + input.counts.processed$counts_per_cov_corrected <- input.counts.processed$counts_per_cov + + # Process the GATK file, error-correct single nucleotide variant counts + cat("Sequencing error correction of GATK counts...\n") + for (i in grep("[,]", input.counts.processed[,"base_mut"], invert = T)){ + + tmp.single <- input.counts.processed[i,"base_mut"] + + ## locate this variant across all multi-codon variants in the original count matrix + tmp.false.doubles <- input.counts.raw[grep(paste0("(?= 3 & tmp.false.doubles[,"varying_codons"] < 3),] + if(nrow(tmp.false.doubles) == 0){ + next + } + + ## need to match with the corresponding correct single codon variant(s) + tmp.true.singles <- tmp.false.doubles$base_mut + tmp.true.singles <- strsplit(tmp.true.singles, ", ") + tmp.true.singles <- lapply(tmp.true.singles, function(x){x <- x[x != tmp.single]; x <- paste0(x, collapse = ", "); return(x)}) + tmp.true.singles <- do.call(c, tmp.true.singles) + tmp.true.singles <- input.counts.processed[match(tmp.true.singles, input.counts.processed[,"base_mut"]),] + if(all(is.na(tmp.true.singles) == T)){ + next + } + + ## make sure that we only count events for which there are both true single codon and false double codon variants + if(any(is.na(tmp.true.singles$counts))){ + tmp.false.doubles <- tmp.false.doubles[-which(is.na(tmp.true.singles$counts)),] + tmp.true.singles <- tmp.true.singles[-which(is.na(tmp.true.singles$counts)),] + } + if(all(is.na(tmp.true.singles) == T)){ + next + } + + ## calculate the expected false 1nt count probability (MLE) + e_MLE <- 
sum(tmp.false.doubles$counts) / sum(tmp.true.singles$counts * c(tmp.false.doubles$cov / tmp.true.singles$cov)) + input.counts.processed[i,"counts_per_cov_corrected"] <- input.counts.processed[i,"counts_per_cov"] - e_MLE + + ## based on this, also adjust the total_counts by cross-multiplication + input.counts.processed[i,"counts_corrected"] <- input.counts.processed[i,"counts"] * c(input.counts.processed[i,"counts_per_cov_corrected"] / input.counts.processed[i,"counts_per_cov"]) + + ## round to nearest integer, do not allow for negative counts + input.counts.processed[i,"counts_corrected"] <- round(input.counts.processed[i,"counts_corrected"]) + if(input.counts.processed[i,"counts_corrected"] < 0){ + input.counts.processed[i,"counts_corrected"] <- 0 + input.counts.processed[i,"counts_per_cov_corrected"] <- 0 + } + + } + + # Write the processed data + write.csv(input.counts.processed, file = output_file_path, row.names = F) + +} + +# to re-make the AA level input table after WT sequencing error correction +seq_error_correct_counts_for_heatmaps <- function(gatk_file_path, aa_seq_file_path, output_csv_path, threshold = 3) { + + # Load the raw GATK data + raw_gatk <- read.table(gatk_file_path, sep = ",", header = TRUE, stringsAsFactors = FALSE) + + # Read the wild-type amino acid sequence from the text file + wt_seq <- readLines(aa_seq_file_path) + wt_seq <- unlist(strsplit(wt_seq, "")) + + # Replace 'X' with '*', indicating the stop codon + wt_seq[wt_seq == "X"] <- "*" + + # Summarize counts for each unique pos_mut + aggregated_counts_per_cov <- aggregate( + counts_per_cov_corrected ~ pos_mut, + data = raw_gatk, + FUN = function(x) sum(x, na.rm = TRUE) + ) + + aggregated_counts <- aggregate( + counts_corrected ~ pos_mut, + data = raw_gatk, + FUN = function(x) sum(x, na.rm = TRUE) + ) + + # Merge the two aggregated tables + aggregated_data <- merge( + aggregated_counts_per_cov, + aggregated_counts, + by = "pos_mut", + all = TRUE + ) + + # Rename columns to match 
original output + names(aggregated_data)[names(aggregated_data) == "counts_per_cov_corrected"] <- "total_counts_per_cov_corrected" + names(aggregated_data)[names(aggregated_data) == "counts_corrected"] <- "total_counts_corrected" + + # Extract wt_aa, position, and mut_aa from pos_mut + aggregated_data$wt_aa <- sub("(\\D)(\\d+)(\\D)", "\\1", aggregated_data$pos_mut) + aggregated_data$position <- as.numeric(sub("(\\D)(\\d+)(\\D)", "\\2", aggregated_data$pos_mut)) + aggregated_data$mut_aa <- sub("(\\D)(\\d+)(\\D)", "\\3", aggregated_data$pos_mut) + + # Replace 'X' with '*' + aggregated_data$mut_aa[aggregated_data$mut_aa == "X"] <- "*" + + # Define all 20 standard amino acids and stop codon + all_amino_acids <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L", + "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y", "*") + + # Create all positions + all_positions <- seq_along(wt_seq) + + # Create complete grid of all possible position/mutation combinations + complete_data <- expand.grid( + mut_aa = all_amino_acids, + position = all_positions, + stringsAsFactors = FALSE + ) + + # Merge aggregated data into complete grid + heatmap_data <- merge( + complete_data, + aggregated_data, + by = c("mut_aa", "position"), + all.x = TRUE, + sort = FALSE + ) + + # Fill missing values + heatmap_data$total_counts_per_cov_corrected[is.na(heatmap_data$total_counts_per_cov_corrected)] <- 0 + heatmap_data$wt_aa <- wt_seq[heatmap_data$position] + + # Apply threshold + low_count <- is.na(heatmap_data$total_counts_corrected) | heatmap_data$total_counts_corrected < threshold + heatmap_data$total_counts_per_cov_corrected[low_count] <- NA + heatmap_data$total_counts_corrected[low_count] <- NA + + # Fill missing pos_mut values + missing_pos_mut <- is.na(heatmap_data$pos_mut) + heatmap_data$pos_mut[missing_pos_mut] <- paste0( + heatmap_data$wt_aa[missing_pos_mut], + heatmap_data$position[missing_pos_mut], + heatmap_data$mut_aa[missing_pos_mut] + ) + + # Re-order rows + out.order <- 
paste0(rep(1:max(heatmap_data$position), each = 21), + rep(c("A", "C", "D", "E", "F", "G", "H", + "I", "K", "L","M", "N", "P", "Q", + "R", "S", "T", "V", "W", "Y", "*"), + max(heatmap_data$position))) + heatmap_data <- heatmap_data[match(out.order, paste0(heatmap_data$position, heatmap_data$mut_aa)),] + rownames(heatmap_data) <- 1:nrow(heatmap_data) + + # Save output + write.csv(heatmap_data, file = output_csv_path, row.names = FALSE) + print(paste("Aggregated data saved to:", output_csv_path)) + +} + +## --- Main functions --- +seq_error_correct_by_false_doubles(input_count_path_raw = "intermediate_files/gatk/input_1_pe/gatk_output.variantCounts", + input_count_path_processed = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library.csv", + output_file_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library_err_corrected_false_doubles.csv") + +seq_error_correct_counts_for_heatmaps(gatk_file_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library_err_corrected_false_doubles.csv", + aa_seq_file_path = "intermediate_files/aa_seq.txt", + output_csv_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_for_heatmaps_err_corrected_false_doubles.csv") + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +if (is.null(r_version)) r_version <- "unknown" +f <- file("versions.yml", "w") +writeLines( + c( + '"\${task.process}":', + paste(' r-base:', r_version) + ), + f +) +close(f) diff --git a/modules/local/dmsanalysis/possible_mutations/environment.yml b/modules/local/dmsanalysis/possible_mutations/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/dmsanalysis/possible_mutations/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - 
conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/dmsanalysis/possible_mutations/main.nf b/modules/local/dmsanalysis/possible_mutations/main.nf new file mode 100644 index 0000000..e98eb0f --- /dev/null +++ b/modules/local/dmsanalysis/possible_mutations/main.nf @@ -0,0 +1,33 @@ +process DMSANALYSIS_POSSIBLE_MUTATIONS { + tag "table w/ all possible variants" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(wt_seq) + val pos_range + val mutagenesis_type + path custom_codon_library + + output: + tuple val(meta), path("possible_mutations.csv"), emit: possible_mutations + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'possible_mutations.R' + + stub: + """ + touch possible_mutations.csv + echo "DMSANALYSIS_POSSIBLE_MUTATIONS:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/dmsanalysis/possible_mutations/meta.yml b/modules/local/dmsanalysis/possible_mutations/meta.yml new file mode 100644 index 0000000..4fbd8c3 --- /dev/null +++ b/modules/local/dmsanalysis/possible_mutations/meta.yml @@ -0,0 +1,63 @@ +name: "dmsanalysis_possible_mutations" +description: | + Generates a comprehensive table of all theoretically possible variants based on the + mutagenesis strategy (e.g., NNK, NNS) and the reference sequence. 
This table acts + as the ground truth for downstream filtering and analysis. +keywords: + - deep mutational scanning + - dms + - mutagenesis + - library design + - variants +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'ref' ]` + - wt_seq: + type: file + description: FASTA file containing the wild-type reference DNA sequence + pattern: "*.{fasta,fa}" + - - pos_range: + type: string + description: Start and stop codon positions (ORF) in the format 'start-stop', e.g., '352-1383' + - - mutagenesis_type: + type: string + description: | + Type of mutagenic primers used. + Supported types: nnk, nns, nnh, nnn, nnk_nns, nnk_nns_nnh, or custom. + - - custom_codon_library: + type: file + description: | + Optional CSV file defining a custom codon library. + Only required if mutagenesis_type is set to 'custom'. 
+ +output: + - possible_mutations: + - meta: + type: map + description: | + Groovy Map containing sample information + - possible_mutations.csv: + type: file + description: CSV file containing all theoretically possible mutations for the experiment + pattern: "possible_mutations.csv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/dmsanalysis/possible_mutations/templates/possible_mutations.R b/modules/local/dmsanalysis/possible_mutations/templates/possible_mutations.R new file mode 100644 index 0000000..13a0702 --- /dev/null +++ b/modules/local/dmsanalysis/possible_mutations/templates/possible_mutations.R @@ -0,0 +1,197 @@ +#!/usr/bin/env Rscript + +# ------------------------------------------------------------------------------ +# Script: Generate Programmed Codon Variants +# Description: Generates all possible programmed codon mutations for a given +# wild-type sequence based on a specified mutagenesis strategy. +# Input: +# - wt_seq_input: Wild-type sequence (string or path to FASTA file). +# - start_stop_pos: Target sequence range format "start-stop". +# - mutagenesis_type: Strategy ('nnk', 'nns', 'nnh', 'nnn', 'nnk_nns', 'nnk_nns_nnh', 'custom'). +# - custom_codon_library_path: Path to custom library. Automatically detects +# if the file is a global list ("AAA, AAC...") or a position-wise CSV +# (requires a "Position" header). +# - output_file: Desired name/path for the output CSV. +# Output: A CSV file containing all possible mutated codons per position. 
+# ------------------------------------------------------------------------------ + +suppressMessages(library(Biostrings)) +suppressMessages(library(methods)) + +generate_possible_variants <- function(wt_seq_input, start_stop_pos, mutagenesis_type, + custom_codon_library_path, output_file) { + + # Parse the start and stop positions from the input format "start-stop" + positions <- unlist(strsplit(start_stop_pos, "-")) + start_pos <- as.numeric(positions[1]) + stop_pos <- as.numeric(positions[2]) + + # Load sequence from file or process as a direct string + if (file.exists(wt_seq_input)) { + seq_data <- Biostrings::readDNAStringSet(filepath = wt_seq_input) + wt_seq <- seq_data[[1]] + } else { + wt_seq <- Biostrings::DNAString(wt_seq_input) + } + + # Extract the target coding sequence + coding_seq <- Biostrings::subseq(wt_seq, start = start_pos, end = stop_pos) + coding_seq <- as.character(coding_seq) + + # Predefined codon dictionaries + nnk_codons <- c('AAG', 'AAT', 'ATG', 'ATT', 'AGG', 'AGT', 'ACG', 'ACT', 'TAG', 'TAT', 'TTG', 'TTT', 'TGG', 'TGT', 'TCG', 'TCT', 'GAG', 'GAT', 'GTG', 'GTT', 'GGG', 'GGT', 'GCG', 'GCT', 'CAG', 'CAT', 'CTG', 'CTT', 'CGG', 'CGT', 'CCG', 'CCT') + nns_codons <- c('AAG', 'AAC', 'ATG', 'ATC', 'AGG', 'AGC', 'ACG', 'ACC', 'TAG', 'TAC', 'TTG', 'TTC', 'TGG', 'TGC', 'TCG', 'TCC', 'GAG', 'GAC', 'GTG', 'GTC', 'GGG', 'GGC', 'GCG', 'GCC', 'CAG', 'CAC', 'CTG', 'CTC', 'CGG', 'CGC', 'CCG', 'CCC') + nnh_codons <- c('AAA', 'AAC', 'AAT', 'ATA', 'ATC', 'ATT', 'AGA', 'AGC', 'AGT', 'ACA', 'ACC', 'ACT', 'TAA', 'TAC', 'TAT', 'TTA', 'TTC', 'TTT', 'TGA', 'TGC', 'TGT', 'TCA', 'TCC', 'TCT', 'GAA', 'GAC', 'GAT', 'GTA', 'GTC', 'GTT', 'GGA', 'GGC', 'GGT', 'GCA', 'GCC', 'GCT', 'CAA', 'CAC', 'CAT', 'CTA', 'CTC', 'CTT', 'CGA', 'CGC', 'CGT', 'CCA', 'CCC', 'CCT') + nnn_codons <- c('AAA', 'AAC', 'AAG', 'AAT', 'ATA', 'ATC', 'ATG', 'ATT', 'AGA', 'AGC', 'AGG', 'AGT', 'ACA', 'ACC', 'ACG', 'ACT', 'TAA', 'TAC', 'TAG', 'TAT', 'TTA', 'TTC', 'TTG', 'TTT', 'TGA', 'TGC', 'TGG', 
'TGT', 'TCA', 'TCC', 'TCG', 'TCT', 'GAA', 'GAC', 'GAG', 'GAT', 'GTA', 'GTC', 'GTG', 'GTT', 'GGA', 'GGC', 'GGG', 'GGT', 'GCA', 'GCC', 'GCG', 'GCT', 'CAA', 'CAC', 'CAG', 'CAT', 'CTA', 'CTC', 'CTG', 'CTT', 'CGA', 'CGC', 'CGG', 'CGT', 'CCA', 'CCC', 'CCG', 'CCT') + + # -------------------------------------------------------------------------- + # Custom Library Parsing with Auto-Detection + # -------------------------------------------------------------------------- + is_position_wise <- FALSE + position_lookup <- list() + custom_codons <- NULL + + if (mutagenesis_type == "custom") { + # Check for NULL first so that file.exists() is never called on a missing path + if (is.null(custom_codon_library_path) || !file.exists(custom_codon_library_path)) { + stop("Custom codons file must be provided and valid when using 'custom' mutagenesis_type.") + } + + # Auto-detect format by inspecting the first line + first_line <- readLines(custom_codon_library_path, n = 1) + + if (grepl("Position", first_line, ignore.case = TRUE)) { + # Format 1: Position-wise CSV file + is_position_wise <- TRUE + + # Read line-by-line instead of read.csv to avoid strict column matching errors + lines <- readLines(custom_codon_library_path) + + # Loop through lines, skipping the header (index 1) + for (line in lines[-1]) { + # Skip empty lines + if (trimws(line) == "") next + + # Split the line by commas + parts <- trimws(unlist(strsplit(line, ","))) + + # The first part is the position; the remaining parts are the codons + pos_idx <- parts[1] + codon_vec <- parts[-1] + + # Remove any accidental empty strings (e.g., from trailing commas) + codon_vec <- codon_vec[codon_vec != ""] + + position_lookup[[pos_idx]] <- codon_vec + } + } else { + # Format 2: Global comma-separated list (Legacy compatibility) + custom_codons <- unlist(strsplit(readLines(custom_codon_library_path), ",")) + custom_codons <- trimws(custom_codons) + } + } + + # Helper function to split a DNA sequence into nucleotide triplets + split_into_codons <- function(seq) { + # Note: Double escaping is required for Perl 
regular expressions here + return(strsplit(seq, "(?<=.{3})", perl = TRUE)[[1]]) + } + + wt_codons <- split_into_codons(coding_seq) + + # Initialize dataframe to store final variant results + # Note: \$ escaping is maintained for Nextflow compatibility + result <- data.frame(Codon_Number = integer(), wt_codon = character(), Variant = character(), stringsAsFactors = FALSE) + + # Helper function to determine the target codon list per position + get_codon_list <- function(wt_codon, codon_index) { + if (mutagenesis_type == "nnk") { + return(nnk_codons) + } else if (mutagenesis_type == "nns") { + return(nns_codons) + } else if (mutagenesis_type == "nnh") { + return(nnh_codons) + } else if (mutagenesis_type == "nnn") { + return(nnn_codons) + } else if (mutagenesis_type == "nnk_nns") { + if (substr(wt_codon, 3, 3) == "T") return(nns_codons) else return(nnk_codons) + } else if (mutagenesis_type == "nnk_nns_nnh") { + if (substr(wt_codon, 3, 3) == "T") { + return(nns_codons) + } else if (substr(wt_codon, 3, 3) == "G"){ + return(nnh_codons) + } else { + return(nnk_codons) + } + } else if (mutagenesis_type == "custom") { + if (is_position_wise) { + idx_str <- as.character(codon_index) + if (!is.null(position_lookup[[idx_str]])) { + return(position_lookup[[idx_str]]) + } else { + return(NULL) # Skip positions not explicitly defined in the CSV + } + } else { + return(custom_codons) + } + } else { + stop("Invalid mutagenesis_type. 
Choose from 'nnk', 'nns', 'nnh', 'nnn', 'nnk_nns', 'nnk_nns_nnh', or 'custom'.") + } + } + + # Iterate over each wild-type codon to assign programmed variants + for (i in seq_along(wt_codons)) { + wt_codon <- wt_codons[i] + codon_list <- get_codon_list(wt_codon, i) + + # Skip iteration if no custom codons were assigned to this specific position + if (is.null(codon_list)) next + + # Filter out the wild-type codon from the mutation list + possible_variants <- codon_list[codon_list != wt_codon] + + for (variant in possible_variants) { + # Note: \$ escaping is maintained for Nextflow compatibility + result <- rbind(result, data.frame(Codon_Number = i, wt_codon = wt_codon, Variant = variant, stringsAsFactors = FALSE)) + } + } + + write.csv(result, output_file, row.names = FALSE) +} + +# ------------------------------------------------------------------------------ +# Main Execution Block (Nextflow variable substitution) +# ------------------------------------------------------------------------------ + +# Replaces bash if/else logic. If Nextflow omits the optional file, +# it passes "/NULL", which R translates to an actual NULL object. 
+custom_lib_arg <- if ("$custom_codon_library" == "/NULL") { + NULL +} else { + "$custom_codon_library" +} + +generate_possible_variants( + wt_seq_input = "$wt_seq", + start_stop_pos = "$pos_range", + mutagenesis_type = "$mutagenesis_type", + custom_codon_library_path = "$custom_codon_library", + output_file = "possible_mutations.csv" +) + +# ------------------------------------------------------------------------------ +# Versioning Generation +# ------------------------------------------------------------------------------ + +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +biostrings_version <- as.character(packageVersion("Biostrings")) + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' biostrings:', biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/dmsanalysis/process_gatk/environment.yml b/modules/local/dmsanalysis/process_gatk/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/dmsanalysis/process_gatk/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/dmsanalysis/process_gatk/main.nf b/modules/local/dmsanalysis/process_gatk/main.nf new file mode 100644 index 0000000..44656cf --- /dev/null +++ b/modules/local/dmsanalysis/process_gatk/main.nf @@ -0,0 +1,41 @@ +process DMSANALYSIS_PROCESS_GATK { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + publishDir "${params.outdir}/intermediate_files", mode: 'copy' + + input: + tuple val(meta), path(variantCounts) + path possible_mutations + path aa_seq + val min_counts + + output: + tuple val(meta), + path("annotated_variantCounts.csv"), + path("variantCounts_filtered_by_library.csv"), + path("library_completed_variantCounts.csv"), + path("variantCounts_for_heatmaps.csv"), + emit: processed_variantCounts + + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'process_gatk.R' + + stub: + """ + touch annotated_variantCounts.csv variantCounts_filtered_by_library.csv library_completed_variantCounts.csv variantCounts_for_heatmaps.csv + echo "DMSANALYSIS_PROCESS_GATK:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/dmsanalysis/process_gatk/meta.yml b/modules/local/dmsanalysis/process_gatk/meta.yml new file mode 100644 index 0000000..c962cea --- /dev/null +++ b/modules/local/dmsanalysis/process_gatk/meta.yml @@ -0,0 +1,69 @@ +name: "dmsanalysis_process_gatk" +description: "Processes variant counts generated by GATK, filters them against expected libraries, and prepares data for downstream visualization and fitness calculation." 
+keywords: + - deep mutational scanning + - gatk + - variant filtering + - dms +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - variantCounts: + type: file + description: CSV file containing variant counts from GATK + pattern: "*.{csv}" + - - possible_mutations: + type: file + description: CSV file containing all theoretically possible mutations based on library design + pattern: "*.{csv}" + - - aa_seq: + type: file + description: FASTA or text file containing the reference amino acid sequence + - - min_counts: + type: integer + description: Minimum count threshold to consider a variant valid + +output: + - processed_variantCounts: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - annotated_variantCounts.csv: + type: file + description: Raw GATK counts annotated with mutation details + pattern: "annotated_variantCounts.csv" + - variantCounts_filtered_by_library.csv: + type: file + description: Variant counts filtered to include only those present in the provided library + pattern: "variantCounts_filtered_by_library.csv" + - library_completed_variantCounts.csv: + type: file + description: Full library table including zero-count variants + pattern: "library_completed_variantCounts.csv" + - variantCounts_for_heatmaps.csv: + type: file + description: Processed counts formatted specifically for heatmap visualization + pattern: "variantCounts_for_heatmaps.csv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/dmsanalysis/process_gatk/templates/process_gatk.R b/modules/local/dmsanalysis/process_gatk/templates/process_gatk.R new file mode 100644 index 0000000..6abc053 --- /dev/null +++ b/modules/local/dmsanalysis/process_gatk/templates/process_gatk.R @@ -0,0 +1,364 @@ +#!/usr/bin/env Rscript + +# four parts condensed into one R script + +######### +# process raw gatk +######### + +# input: gatk variantcounts tsv file, output_path +# output: csv with column names. Creates additional counts_per_cov column. Fills pos_mut column for synonymous mutations. 
Sorted out variants that have mutations, but do not show up in the specifying columns -> was affecting roughly 30 low-count variants out of over 15000 in Taylor's data + +library("dplyr") +process_raw_gatk <- function(gatk_file_path, output_csv_path) { + + # Set the column names + colnames <- c("counts", "cov", "mean_length_variant_reads", "varying_bases", + "base_mut", "varying_codons", "codon_mut", "aa_mut", "pos_mut") + + # Read the GATK file into a data frame + gatk_raw <- read.table(gatk_file_path, sep = "\\t", header = FALSE, fill = TRUE, col.names = colnames) + + # Filter out rows where 'aa_mut' is empty or NA + gatk_raw <- gatk_raw[!(gatk_raw\$aa_mut == "" | is.na(gatk_raw\$aa_mut)), ] + + # Handle synonymous mutations: where aa_mut starts with "S" and pos_mut is either NA or "" + gatk_raw <- gatk_raw %>% + rowwise() %>% + mutate( + pos_mut = ifelse( + (is.na(pos_mut) | pos_mut == "") & grepl("^S:", aa_mut), + # Construct the new 'pos_mut' entry for synonymous mutations + paste0( + sub("S:([A-Z])>[A-Z]", "\\\\1", aa_mut), # Get the original amino acid from 'aa_mut' + sub("^(\\\\d+):.*", "\\\\1", codon_mut), # Get the position from 'codon_mut' + sub("S:[A-Z]>([A-Z])", "\\\\1", aa_mut) # Get the mutated amino acid from 'aa_mut' + ), + pos_mut # Keep the existing 'pos_mut' if it's not NA or "" + ) + ) %>% + ungroup() %>% + mutate(counts_per_cov = counts / cov) + + # Write the cleaned data frame to a CSV file + write.csv(gatk_raw, file = output_csv_path, row.names = FALSE) +} + +##### +# run function +##### +process_raw_gatk( + gatk_file_path = "$variantCounts", + output_csv_path = "annotated_variantCounts.csv" +) + + + + + + + + +######### +# filter gatk by codon library +######### + +# Input: pre_processed_raw_gatk_path, mutation library .csv path (former "possible_NNK_mutations.csv"), output_path +# Output: gatk table filtered for only single-codon mutations that are part of the library + +library("dplyr") +library("stringr") + +filter_gatk_by_codon_library 
<- function(gatk_file_path, codon_library_path, output_file_path) { + # Load the GATK table from the provided file path + gatk_table <- read.csv(gatk_file_path) + + # Load the codon library from the provided .csv file + codon_library <- read.csv(codon_library_path) + + # Ensure the codon library has the expected columns + if (!all(c("Codon_Number", "wt_codon", "Variant") %in% colnames(codon_library))) { + stop("Codon library must contain columns 'Codon_Number', 'wt_codon', and 'Variant'.") + } + + # Filter the GATK table + filtered_gatk <- gatk_table %>% + filter(varying_codons == 1) %>% # Keep rows with single-codon mutations + rowwise() %>% + filter({ + # Extract the position and mutated codon + codon_position <- as.numeric(sub(":.*", "", codon_mut)) # Extract position before ':' + mutated_codon <- sub(".*>", "", codon_mut) # Extract codon after '>' + + # Check if the position and codon are valid + is_in_library <- any( + codon_library\$Codon_Number == codon_position & + (codon_library\$Variant == mutated_codon | # Check Variant column + codon_library\$wt_codon == mutated_codon) # Check wt_codon column + ) + is_in_library + }) %>% + ungroup() %>% + # Apply additional filtering based on mutation distances + rowwise() %>% + filter({ + # Split base_mut into individual mutations + mutations <- unlist(strsplit(base_mut, ",\\\\s*")) # Splits by comma and removes extra spaces + # Extract numeric positions from each mutation string + positions <- as.numeric(str_extract(mutations, "^[0-9]+")) + + # Calculate the distance between the first and last position + distance <- max(positions, na.rm = TRUE) - min(positions, na.rm = TRUE) + + # Keep rows where the distance is <= 2 + distance <= 2 + }) %>% + ungroup() + + # Write the filtered GATK table to the output file path + write.csv(filtered_gatk, file = output_file_path, row.names = FALSE) +} + +##### +# run function +##### +filter_gatk_by_codon_library( + gatk_file_path = "annotated_variantCounts.csv", + codon_library_path = 
"$possible_mutations", + output_file_path = "variantCounts_filtered_by_library.csv" +) + + + + +######### +# complete filtered gatk +######### + +# input: NNK_codon_library_filtered_gatk.csv-path, prefiltered_gatk.csv-path (containing only NNK mutations), output_folder-path +# output: completed gatk_file with all possible variants (even if not measured in sequencing) -> NA in counts and counts_per_cov to 0.0000001 to deal with log-scale in following calculations + +library(dplyr) +library(Biostrings) # Required for codon-to-amino-acid translation + +# Function to calculate Hamming distance (varying_bases) +hamming_distance <- function(wt_codon, variant_codon) { + sum(strsplit(wt_codon, "")[[1]] != strsplit(variant_codon, "")[[1]]) +} + +# Function to get amino acid from codon +get_amino_acid <- function(codon) { + codon_table <- GENETIC_CODE + aa <- codon_table[[toupper(codon)]] + if (is.null(aa)) { + return(NA) # Handle cases where codon is not valid + } + return(aa) +} + +# Function to calculate mutation type (aa_mut) and pos_mut +mutation_details <- function(wt_codon, variant_codon, codon_number) { + wt_aa <- get_amino_acid(wt_codon) + variant_aa <- get_amino_acid(variant_codon) + + # If amino acids are different, it's a missense mutation; otherwise, synonymous + if (wt_aa != variant_aa) { + mutation_type <- "M" # Missense mutation + } else { + mutation_type <- "S" # Synonymous mutation + } + + # aa_mut: Type of mutation and amino acid changes (e.g., M:D>S) + aa_mut <- paste0(mutation_type, ":", wt_aa, ">", variant_aa) + + # pos_mut: Wild-type AA, codon position, mutated AA (e.g., D2Q) + pos_mut <- paste0(wt_aa, codon_number, variant_aa) + + return(list(aa_mut = aa_mut, pos_mut = pos_mut)) +} + +complete_prefiltered_gatk <- function(possible_nnk_path, prefiltered_gatk_path, output_file_path) { + + # Load the possible NNK mutations CSV + possible_nnk <- read.csv(possible_nnk_path) + + # Load the prefiltered GATK CSV + prefiltered_gatk <- 
read.csv(prefiltered_gatk_path) + + # Create codon_mut column in possible_NNK_mutations in the format 'Codon_Number:wt_codon>Variant' + possible_nnk <- possible_nnk %>% + mutate(codon_mut = paste0(Codon_Number, ":", wt_codon, ">", Variant)) + + # Merge both dataframes based on the codon_mut column (full join to include all) + merged_data <- full_join(prefiltered_gatk, possible_nnk, by = "codon_mut") + + # Fill missing values in counts_per_cov and counts with 0.0000001 + merged_data <- merged_data %>% + mutate(counts_per_cov = ifelse(is.na(counts_per_cov), 0.0000001, counts_per_cov), + counts = ifelse(is.na(counts), 0.0000001, counts)) + + # Calculate Hamming distance (varying_bases) and mutation details (aa_mut, pos_mut) + merged_data <- merged_data %>% + rowwise() %>% + mutate(varying_bases = hamming_distance(wt_codon, Variant), + mutation_info = list(mutation_details(wt_codon, Variant, Codon_Number))) %>% + mutate(aa_mut = mutation_info\$aa_mut, # Extract aa_mut + pos_mut = mutation_info\$pos_mut) %>% # Extract pos_mut + ungroup() %>% + select(-mutation_info) # Remove the temporary list column + + # Save the merged data to a new CSV file + write.csv(merged_data, file = output_file_path, row.names = FALSE) +} + +##### +# run function +##### +complete_prefiltered_gatk( + possible_nnk_path = "$possible_mutations", + prefiltered_gatk_path = "variantCounts_filtered_by_library.csv", + output_file_path = "library_completed_variantCounts.csv" +) + + + + + + + + + +######### +# prepare gatk data for counts heatmap +######### + +# input: prefiltered GATK path (filtered for codon library), aa-seq file path, output path, threshold (for minimum counts to recognize variant) +# output: csv file serving as basis for counts_per_cov_heatmap function + +suppressMessages(library(dplyr)) +suppressMessages(library(ggplot2)) +suppressMessages(library(tidyr)) +suppressMessages(library(reshape2)) +suppressMessages(library(scales)) + +prepare_gatk_data_for_counts_heatmaps <- 
function(gatk_file_path, aa_seq_file_path, output_csv_path, threshold = 3) { + # Load the raw GATK data + raw_gatk <- read.table(gatk_file_path, sep = ",", header = TRUE) + + # Read the wild-type amino acid sequence from the text file + wt_seq <- readLines(aa_seq_file_path) + wt_seq <- unlist(strsplit(wt_seq, "")) # Split the sequence into individual amino acids + + # Summarize counts-per-cov for each unique aa mutation in pos_mut + aggregated_data <- raw_gatk %>% + group_by(pos_mut) %>% + summarize(total_counts_per_cov = sum(counts_per_cov, na.rm = TRUE), + total_counts = sum(counts, na.rm = TRUE)) # Also sum the counts + + # Extract the wild-type position and mutations from 'pos_mut' + aggregated_data <- aggregated_data %>% + mutate( + wt_aa = sub("(\\\\D)(\\\\d+)(\\\\D)", "\\\\1", pos_mut), # Wild-type amino acid (e.g., S) + position = as.numeric(sub("(\\\\D)(\\\\d+)(\\\\D)", "\\\\2", pos_mut)), # Position (e.g., 3) + mut_aa = sub("(\\\\D)(\\\\d+)(\\\\D)", "\\\\3", pos_mut) # Mutant amino acid (e.g., R) + ) + + # Replace 'X' with '*', indicating the stop codon + aggregated_data <- aggregated_data %>% + mutate(mut_aa = ifelse(mut_aa == "X", "*", mut_aa)) + + # Replace 'X' with '*' in the wild-type amino acid sequence as well + wt_seq <- ifelse(wt_seq == "X", "*", wt_seq) + + # Define all 20 standard amino acids and the stop codon "*" + all_amino_acids <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L", + "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y", "*") + + # Create a list of all positions in the wild-type sequence + all_positions <- 1:length(wt_seq) + + # Create a complete grid of all possible combinations of positions and amino acids + complete_data <- expand.grid(mut_aa = all_amino_acids, position = all_positions) + + # Merge the summarized data with the complete grid (filling missing entries with 0) + heatmap_data <- complete_data %>% + left_join(aggregated_data, by = c("mut_aa", "position")) %>% + mutate(total_counts_per_cov = 
ifelse(is.na(total_counts_per_cov), 0, total_counts_per_cov), + wt_aa = wt_seq[position]) # Assign the wild-type amino acid + + # Set variants with counts < threshold to NA + heatmap_data <- heatmap_data %>% + mutate( + total_counts_per_cov = ifelse(total_counts < threshold, NA, total_counts_per_cov), + total_counts = ifelse(total_counts < threshold, NA, total_counts) + ) + + # Fill pos_mut column + heatmap_data <- heatmap_data %>% + mutate( + pos_mut = ifelse(is.na(pos_mut), + paste0(wt_aa, position, mut_aa), + pos_mut) + ) + + # Save the aggregated data to a CSV file + write.csv(heatmap_data, file = output_csv_path, row.names = FALSE) + print(paste("Aggregated data saved to:", output_csv_path)) +} + + +##### +# run function +##### +prepare_gatk_data_for_counts_heatmaps( + gatk_file_path = "variantCounts_filtered_by_library.csv", + aa_seq_file_path = "$aa_seq", + output_csv_path = "variantCounts_for_heatmaps.csv", + threshold = $min_counts +) + + + + + + + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +Biostrings_version <- as.character(packageVersion("Biostrings")) +stringr_version <- as.character(packageVersion("stringr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) +tidyr_version <- as.character(packageVersion("tidyr")) +reshape2_version <- as.character(packageVersion("reshape2")) +scales_version <- as.character(packageVersion("scales")) + + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" +if (length(stringr_version) == 0) stringr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" +if (length(tidyr_version) == 0) tidyr_version <- "unknown" +if (length(reshape2_version) == 0) reshape2_version <- "unknown" +if (length(scales_version) == 0) scales_version <- "unknown" + + +f <- 
file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-Biostrings:', Biostrings_version), + paste(' r-stringr:', stringr_version), + paste(' r-ggplot2:', ggplot2_version), + paste(' r-tidyr:', tidyr_version), + paste(' r-reshape2:', reshape2_version), + paste(' r-scales:', scales_version) + ), + f +) +close(f) diff --git a/modules/local/dmsanalysis/wildtype_based_seq_error_correction/environment.yml b/modules/local/dmsanalysis/wildtype_based_seq_error_correction/environment.yml new file mode 100644 index 0000000..4070243 --- /dev/null +++ b/modules/local/dmsanalysis/wildtype_based_seq_error_correction/environment.yml @@ -0,0 +1,4 @@ +channels: + - conda-forge +dependencies: + - conda-forge::r-base=4.4.1 diff --git a/modules/local/dmsanalysis/wildtype_based_seq_error_correction/templates/wt_based_seq_error_correction.R b/modules/local/dmsanalysis/wildtype_based_seq_error_correction/templates/wt_based_seq_error_correction.R new file mode 100644 index 0000000..c7dd640 --- /dev/null +++ b/modules/local/dmsanalysis/wildtype_based_seq_error_correction/templates/wt_based_seq_error_correction.R @@ -0,0 +1,172 @@ +#!/usr/bin/env Rscript + +## WT-based sequencing error correction in nf-core/deepmutscan +## 21.03.2026 + + +## --- Helper functions --- + +# function to run the WT sequencing error correction (nucleotide level) +seq_error_correct_by_WT_nt <- function(wt_seq_count_path, input_count_path, output_file_path){ + + ## load data (nucleotide-level counts from GATK) + WT.counts <- read.csv(wt_seq_count_path) + input.counts <- read.csv(input_count_path) + + ## append key columns + input.counts$counts_corrected <- input.counts$counts + input.counts$counts_per_cov_corrected <- input.counts$counts_per_cov + + # Process the GATK file + for (i in 1:nrow(WT.counts)){ + + ## only look at variants observed in both the input and WT sequencing + tmp.id <- match(WT.counts[i,"base_mut"], 
input.counts[,"base_mut"]) + if(is.na(tmp.id) == T){ + next + } + + ## subtract the observed per-base coverage in the WT sequencing + input.counts[tmp.id,"counts_per_cov_corrected"] <- input.counts[tmp.id,"counts_per_cov"] - WT.counts[i,"counts_per_cov"] + + ## based on this, also adjust the total_counts (cross-multiplication) + ## total_counts_corrected ~ total_counts * c(total_counts_per_cov_corrected / total_counts_per_cov) + input.counts[tmp.id,"counts_corrected"] <- input.counts[tmp.id,"counts"] * c(input.counts[tmp.id,"counts_per_cov_corrected"] / input.counts[tmp.id,"counts_per_cov"]) + + ## round to nearest integer, do not allow for negative counts + input.counts[tmp.id,"counts_corrected"] <- round(input.counts[tmp.id,"counts_corrected"]) + if(input.counts[tmp.id,"counts_corrected"] < 0){ + input.counts[tmp.id,"counts_corrected"] <- 0 + input.counts[tmp.id,"counts_per_cov_corrected"] <- 0 + } + + } + + # Write the processed data + write.csv(input.counts, file = output_file_path, row.names = F) + +} + +# to re-make the AA level input table after WT sequencing error correction +seq_error_correct_counts_for_heatmaps <- function(gatk_file_path, aa_seq_file_path, output_csv_path, threshold = 3) { + + # Load the raw GATK data + raw_gatk <- read.table(gatk_file_path, sep = ",", header = TRUE, stringsAsFactors = FALSE) + + # Read the wild-type amino acid sequence from the text file + wt_seq <- readLines(aa_seq_file_path) + wt_seq <- unlist(strsplit(wt_seq, "")) + + # Replace 'X' with '*', indicating the stop codon + wt_seq[wt_seq == "X"] <- "*" + + # Summarize counts for each unique pos_mut + aggregated_counts_per_cov <- aggregate( + counts_per_cov_corrected ~ pos_mut, + data = raw_gatk, + FUN = function(x) sum(x, na.rm = TRUE) + ) + + aggregated_counts <- aggregate( + counts_corrected ~ pos_mut, + data = raw_gatk, + FUN = function(x) sum(x, na.rm = TRUE) + ) + + # Merge the two aggregated tables + aggregated_data <- merge( + aggregated_counts_per_cov, + 
aggregated_counts, + by = "pos_mut", + all = TRUE + ) + + # Rename columns to match original output + names(aggregated_data)[names(aggregated_data) == "counts_per_cov_corrected"] <- "total_counts_per_cov_corrected" + names(aggregated_data)[names(aggregated_data) == "counts_corrected"] <- "total_counts_corrected" + + # Extract wt_aa, position, and mut_aa from pos_mut + aggregated_data$wt_aa <- sub("(\\D)(\\d+)(\\D)", "\\1", aggregated_data$pos_mut) + aggregated_data$position <- as.numeric(sub("(\\D)(\\d+)(\\D)", "\\2", aggregated_data$pos_mut)) + aggregated_data$mut_aa <- sub("(\\D)(\\d+)(\\D)", "\\3", aggregated_data$pos_mut) + + # Replace 'X' with '*' + aggregated_data$mut_aa[aggregated_data$mut_aa == "X"] <- "*" + + # Define all 20 standard amino acids and stop codon + all_amino_acids <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L", + "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y", "*") + + # Create all positions + all_positions <- seq_along(wt_seq) + + # Create complete grid of all possible position/mutation combinations + complete_data <- expand.grid( + mut_aa = all_amino_acids, + position = all_positions, + stringsAsFactors = FALSE + ) + + # Merge aggregated data into complete grid + heatmap_data <- merge( + complete_data, + aggregated_data, + by = c("mut_aa", "position"), + all.x = TRUE, + sort = FALSE + ) + + # Fill missing values + heatmap_data$total_counts_per_cov_corrected[is.na(heatmap_data$total_counts_per_cov_corrected)] <- 0 + heatmap_data$wt_aa <- wt_seq[heatmap_data$position] + + # Apply threshold + low_count <- is.na(heatmap_data$total_counts_corrected) | heatmap_data$total_counts_corrected < threshold + heatmap_data$total_counts_per_cov_corrected[low_count] <- NA + heatmap_data$total_counts_corrected[low_count] <- NA + + # Fill missing pos_mut values + missing_pos_mut <- is.na(heatmap_data$pos_mut) + heatmap_data$pos_mut[missing_pos_mut] <- paste0( + heatmap_data$wt_aa[missing_pos_mut], + heatmap_data$position[missing_pos_mut], + 
heatmap_data$mut_aa[missing_pos_mut] + ) + + # Re-order rows + out.order <- paste0(rep(1:max(heatmap_data$position), each = 21), + rep(c("A", "C", "D", "E", "F", "G", "H", + "I", "K", "L","M", "N", "P", "Q", + "R", "S", "T", "V", "W", "Y", "*"), + max(heatmap_data$position))) + heatmap_data <- heatmap_data[match(out.order, paste0(heatmap_data$position, heatmap_data$mut_aa)),] + rownames(heatmap_data) <- 1:nrow(heatmap_data) + + # Save output + write.csv(heatmap_data, file = output_csv_path, row.names = F) + +} + +## --- Main functions --- +seq_error_correct_by_WT_nt(wt_seq_count_path = "intermediate_files/processed_gatk_files/wildtype_pe/variantCounts_filtered_by_library.csv", + input_count_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library.csv", + output_file_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library_err_corrected.csv") + +seq_error_correct_counts_for_heatmaps(gatk_file_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_filtered_by_library_err_corrected.csv", + aa_seq_file_path = "intermediate_files/aa_seq.txt", + output_csv_path = "intermediate_files/processed_gatk_files/input_1_pe/variantCounts_for_heatmaps_err_corrected.csv") + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +if (is.null(r_version)) r_version <- "unknown" +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/find_synonymous_mutation/environment.yml b/modules/local/fitness/find_synonymous_mutation/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/find_synonymous_mutation/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - 
conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/find_synonymous_mutation/main.nf b/modules/local/fitness/find_synonymous_mutation/main.nf new file mode 100644 index 0000000..eeb9c86 --- /dev/null +++ b/modules/local/fitness/find_synonymous_mutation/main.nf @@ -0,0 +1,22 @@ +process FIND_SYNONYMOUS_MUTATION { + tag { sample.sample } + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), path(counts_merged) // from MERGE_COUNTS.out.merged_counts + path wt_fasta // broadcast singleton + val pos_range // "start-end", broadcast singleton + + output: + tuple val(sample), path("synonymous_wt.txt"), emit: synonymous_wt + path "versions.yml", emit: versions + + script: + template 'find_syn_mutation.R' +} diff --git a/modules/local/fitness/find_synonymous_mutation/meta.yml b/modules/local/fitness/find_synonymous_mutation/meta.yml new file mode 100644 index 0000000..8ec3ab3 --- /dev/null +++ b/modules/local/fitness/find_synonymous_mutation/meta.yml @@ -0,0 +1,58 @@ +name: "find_synonymous_mutation" +description: Identifies synonymous mutations within the merged count table, prioritizing those with high counts (typically 2nt changes) to serve as a reference for normalizing fitness scores. 
+keywords: + - deep mutational scanning + - dms + - synonymous mutation + - fitness calculation + - normalization +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + - "bioconductor-biostrings": + description: "Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences" + homepage: "https://bioconductor.org/packages/Biostrings" + licence: ["Artistic-2.0"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test' ]` + - counts_merged: + type: file + description: TSV/CSV file containing variant counts merged across samples or replicates + pattern: "*.{tsv,csv}" + - - wt_fasta: + type: file + description: FASTA file containing the wild-type reference DNA sequence + pattern: "*.{fasta,fa}" + - - pos_range: + type: string + description: Start and stop codon positions (ORF) in the format 'start-stop', e.g., '352-1383' + +output: + - synonymous_wt: + - sample: + type: map + description: | + Groovy Map containing sample information + - synonymous_wt.txt: + type: file + description: Text file identifying the chosen synonymous mutation to be used as a wild-type reference + pattern: "synonymous_wt.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/find_synonymous_mutation/templates/find_syn_mutation.R b/modules/local/fitness/find_synonymous_mutation/templates/find_syn_mutation.R new file mode 100644 index 
0000000..a0ee07a --- /dev/null +++ b/modules/local/fitness/find_synonymous_mutation/templates/find_syn_mutation.R @@ -0,0 +1,119 @@ +#!/usr/bin/env Rscript + +suppressMessages(library(Biostrings)) + +# Pick a synonymous "WT substitute" for DiMSum normalization using a fixed coding window. +# Inputs: +# wt_fasta : path to FASTA (single WT sequence) +# counts_merged_tsv : path to merged counts (columns: nt_seq, input1..N, output1..M) +# pos_range : "start-end" (1-based, inclusive), e.g. "352-1383" +# Returns: +# character scalar: chosen nt sequence +# +# Preference: +# 1) AA-identical (fully synonymous) AND exactly 2 nt mismatches vs WT, both within ONE codon. +# 2) If none, AA-identical AND exactly 1 nt mismatch vs WT. +# 3) If more than one, pick highest mean of input counts. +# 4) If still none, stop with an error. + +pick_synonymous_wt_from_range <- function(wt_fasta, counts_merged_tsv, pos_range) { + ## ---- parse range ---- + pr <- strsplit(as.character(pos_range), "-", fixed = TRUE)[[1]] + if (length(pr) != 2L) stop("pos_range must be 'start-end', got: ", pos_range) + start_pos <- as.integer(pr[1]); end_pos <- as.integer(pr[2]) + if (is.na(start_pos) || is.na(end_pos) || start_pos < 1L || end_pos < start_pos) + stop("Invalid pos_range: ", pos_range) + + ## ---- WT window ---- + wt_set <- Biostrings::readDNAStringSet(wt_fasta) + if (length(wt_set) != 1L) stop("WT FASTA must contain exactly one sequence.") + wt_subseq <- Biostrings::subseq(wt_set[[1]], start = start_pos, end = end_pos) + wt_seq_chr <- as.character(wt_subseq) + wt_len <- nchar(wt_seq_chr) + if ((wt_len %% 3) != 0) stop("Provided window length is not divisible by 3: ", wt_len) + wt_aa <- Biostrings::translate(wt_subseq, if.fuzzy.codon = "X") + wt_chars <- strsplit(wt_seq_chr, "", fixed = TRUE)[[1]] + + ## ---- counts ---- + df <- utils::read.delim(counts_merged_tsv, sep = "\\t", header = TRUE, + stringsAsFactors = FALSE, check.names = FALSE) + if (!"nt_seq" %in% names(df)) stop("counts_merged_tsv 
must have a 'nt_seq' column.") + + df\$nt_seq <- toupper(df\$nt_seq) + keep_len <- nchar(df\$nt_seq) == wt_len + if (!any(keep_len)) stop("No sequences match WT window length (", wt_len, ").") + if (!all(keep_len)) df <- df[keep_len, , drop = FALSE] + + # input columns & mean (works with 1+ replicates) + input_cols <- grep("^input", names(df), value = TRUE) + if (length(input_cols) == 0L) stop("No input columns found (expect names starting with 'input').") + input_mat <- as.data.frame(lapply(df[, input_cols, drop = FALSE], function(x) as.numeric(as.character(x)))) + input_mean <- if (length(input_cols) == 1L) input_mat[[1]] else rowMeans(as.matrix(input_mat), na.rm = TRUE) + + ## ---- synonymous filter ---- + var_set <- Biostrings::DNAStringSet(df\$nt_seq) + var_aa <- Biostrings::translate(var_set, if.fuzzy.codon = "X") + syn_idx <- which(as.character(var_aa) == as.character(wt_aa)) + if (length(syn_idx) == 0L) stop("No fully-synonymous variants found relative to WT translation.") + + # helpers + mismatch_positions <- function(seq_nt_chars) which(seq_nt_chars != wt_chars) # 1-based positions + codon_index <- function(pos_vec) floor((pos_vec - 1L) / 3L) # 0-based codon bin + + # preference 1: exactly 2 mismatches, both within the same codon + cand_two_one <- Filter(function(i) { + vchars <- strsplit(df\$nt_seq[i], "", fixed = TRUE)[[1]] + pos <- mismatch_positions(vchars) + length(pos) == 2L && length(unique(codon_index(pos))) == 1L + }, syn_idx) + + choose_best <- function(idx_vec) idx_vec[ which.max(input_mean[idx_vec]) ] + + if (length(cand_two_one) > 0L) { + best_i <- choose_best(cand_two_one) + return(as.character(df\$nt_seq[best_i])) + } + + # preference 2 (fallback): exactly 1 mismatch (still synonymous) + cand_one <- Filter(function(i) { + vchars <- strsplit(df\$nt_seq[i], "", fixed = TRUE)[[1]] + length(mismatch_positions(vchars)) == 1L + }, syn_idx) + + if (length(cand_one) > 0L) { + best_i <- choose_best(cand_one) + 
return(as.character(df\$nt_seq[best_i])) + } + + stop("No suitable synonymous variant found: neither 2-in-1-codon nor 1-nt synonymous candidates present.") +} + +##### +# run function +##### +seq <- pick_synonymous_wt_from_range( + wt_fasta = "$wt_fasta", + counts_merged_tsv = "$counts_merged", + pos_range = "$pos_range" + ) +write(seq, file='synonymous_wt.txt') + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +Biostrings_version <- as.character(packageVersion("Biostrings")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-Biostrings:', Biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/fitness_QC/environment.yml b/modules/local/fitness/fitness_QC/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/fitness_QC/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/fitness_QC/main.nf b/modules/local/fitness/fitness_QC/main.nf new file mode 100644 index 0000000..78420eb --- /dev/null +++ b/modules/local/fitness/fitness_QC/main.nf @@ -0,0 +1,31 @@ +process FITNESS_QC { + tag { sample.sample } + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), path(fitness_estimation_tsv) // from FITNESS_CALCULATION + + output: + tuple val(sample), path("fitness_estimation_count_correlation.pdf"), emit: counts_corr_pdf + tuple val(sample), path("fitness_estimation_fitness_correlation.pdf"), emit: fitness_corr_pdf + path "versions.yml", emit: versions + + script: + template 'fitness_QC.R' + + stub: + """ + touch fitness_estimation_count_correlation.pdf + touch fitness_estimation_fitness_correlation.pdf + cat > versions.yml <<'EOF' + FITNESS_QC: + stub-version: "0.0.0" + EOF + """ +} diff --git a/modules/local/fitness/fitness_QC/meta.yml b/modules/local/fitness/fitness_QC/meta.yml new file mode 100644 index 0000000..309e225 --- /dev/null +++ b/modules/local/fitness/fitness_QC/meta.yml @@ -0,0 +1,54 @@ +name: "fitness_qc" +description: Generates quality control visualizations for fitness estimation results, including correlation plots of raw counts and calculated fitness scores to assess experimental reproducibility. +keywords: + - deep mutational scanning + - dms + - quality control + - correlation + - fitness +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test' ]` + - fitness_estimation_tsv: + type: file + description: TSV file containing calculated fitness estimates and associated statistics + pattern: "*.{tsv}" + +output: + - counts_corr_pdf: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_estimation_count_correlation.pdf: + type: file + description: PDF plot showing the correlation of variant counts between samples/replicates + pattern: "fitness_estimation_count_correlation.pdf" + - fitness_corr_pdf: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_estimation_fitness_correlation.pdf: + type: file + description: PDF plot showing the correlation of fitness scores between samples/replicates + pattern: "fitness_estimation_fitness_correlation.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/fitness_QC/templates/fitness_QC.R b/modules/local/fitness/fitness_QC/templates/fitness_QC.R new file mode 100644 index 0000000..6df7dfe --- /dev/null +++ b/modules/local/fitness/fitness_QC/templates/fitness_QC.R @@ -0,0 +1,128 @@ +#!/usr/bin/env Rscript + +## fitness QC plots for nf-core/deepmutscan +## 28.10.2025 +## maximilian.stammnitz@crg.eu + +## lower panels: scatter + x=y (log-log version for counts) +panel_xy_abline_counts <- function(x, y, ...) { + op <- par("xpd"); on.exit(par(xpd = op), add = TRUE) + par(xpd = FALSE) + points(x, y, pch = 16, cex = 0.1, ...) + abline(a = 0, b = 1, lty = 2, col = "grey50") +} + +## upper panels: Pearson r (log-log-transformed) +panel_cor_counts <- function(x, y, digits = 2, prefix = "r = ", cex.text = 1.4, ...) 
{ + r <- suppressWarnings(cor(log(x), log(y), use = "pairwise.complete.obs", method = "pearson")) + lab <- if (is.finite(r)) bquote(italic(r) == .(round(r, digits))) else bquote(italic(r) == NA) + + ## Save/restore full graphics state we touch + op <- par(c("usr", "xpd", "xlog", "ylog")) + on.exit(par(op), add = TRUE) + + ## Draw in normalized 0..1 panel coords with logs OFF so text is visible + par(xlog = FALSE, ylog = FALSE, xpd = FALSE, usr = c(0, 1, 0, 1)) + text(0.5, 0.5, labels = lab, cex = cex.text, font = 1, col = "black") +} + +## lower panels: scatter + x=y (linear version for fitness) +panel_xy_abline_fitness <- function(x, y, ...) { + op <- par("xpd"); on.exit(par(xpd = op), add = TRUE) + par(xpd = FALSE) + points(x, y, pch = 16, cex = 0.5, ...) + abline(a = 0, b = 1, lty = 2, col = "grey50") +} + +## upper panels: Pearson r (linear) +panel_cor_fitness <- function(x, y, digits = 2, prefix = "r = ", cex.text = 1.4, ...) { + r <- suppressWarnings(cor(x, y, use = "pairwise.complete.obs", method = "pearson")) + lab <- if (is.finite(r)) bquote(italic(r) == .(round(r, digits))) else bquote(italic(r) == NA) + + ## Save/restore full graphics state we touch + op <- par(c("usr", "xpd", "xlog", "ylog")) + on.exit(par(op), add = TRUE) + + ## Draw in normalized 0..1 panel coords with logs OFF so text is visible + par(xlog = FALSE, ylog = FALSE, xpd = FALSE, usr = c(0, 1, 0, 1)) + text(0.5, 0.5, labels = lab, cex = cex.text, font = 1, col = "black") +} + +#' Plot input/output count correlations and fitness replicate correlations +#' +#' @param fitness_table_path Path to the input table (fitness_estimation.tsv) +#' @param out_counts_corr_pdf Path to write the counts correlation PDF +#' @param out_fitness_corr_pdf Path to write the fitness correlation PDF +#' +#' @return Invisibly returns TRUE; writes the two PDFs. 
+run_fitness_plots <- function(fitness_table_path, + out_counts_corr_pdf, + out_fitness_corr_pdf) { + + merged.counts.fitness <- read.table(fitness_table_path, sep = "\\t", header = TRUE, check.names = FALSE) + + ## identify the right samples + inputs <- grep("input", colnames(merged.counts.fitness)) + outputs <- grep("output", colnames(merged.counts.fitness)) + + ## 5. Plot input vs. output counts ## + ##################################### + pdf(out_counts_corr_pdf, height = 9, width = 14) + pairs(merged.counts.fitness[, c(inputs, outputs)] + 1, ## use a pseudo-count + lower.panel = panel_xy_abline_counts, + upper.panel = panel_cor_counts, + cex.text = 2, + log = "xy") + dev.off() + + ## 6. Plot fitness correlations ## + ################################## + fitness.repl <- grep("rescaled_fitness", colnames(merged.counts.fitness)) + + if (length(fitness.repl) > 1) { + pdf(out_fitness_corr_pdf, height = 9, width = 14) + pairs(merged.counts.fitness[, fitness.repl], + lower.panel = panel_xy_abline_fitness, + upper.panel = panel_cor_fitness, + cex.text = 2, + xlim = c(-3, 1), + ylim = c(-3, 1)) + dev.off() + } else { + ## If only one (or zero) rescaled_fitness columns exist, still create an empty placeholder + ## so Nextflow finds the declared output. 
+ pdf(out_fitness_corr_pdf, height = 9, width = 14) + plot.new() + title("No replicate fitness columns found (need ≥2 'rescaled_fitness...')\\nCreated placeholder PDF.") + dev.off() + } + + invisible(TRUE) +} + + +##### +# run function +##### +run_fitness_plots( + fitness_table_path = "$fitness_estimation_tsv", + out_counts_corr_pdf = "fitness_estimation_count_correlation.pdf", + out_fitness_corr_pdf = "fitness_estimation_fitness_correlation.pdf" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] + +if (is.null(r_version)) r_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/fitness_calculation/environment.yml b/modules/local/fitness/fitness_calculation/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/fitness_calculation/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/fitness_calculation/main.nf b/modules/local/fitness/fitness_calculation/main.nf new file mode 100644 index 0000000..5bf6208 --- /dev/null +++ b/modules/local/fitness/fitness_calculation/main.nf @@ -0,0 +1,29 @@ +process FITNESS_CALCULATION { + tag { sample.sample } + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), path(counts_merged) + path(exp_design) + path(syn_wt_txt) + + output: + tuple val(sample), path("fitness_estimation.tsv"), emit: fitness_estimation + path "versions.yml", emit: versions + + script: + template 'fitness_calculation.R' + + stub: + """ + touch fitness_estimation.tsv + echo "FITNESS_CALCULATION:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/fitness/fitness_calculation/meta.yml b/modules/local/fitness/fitness_calculation/meta.yml new file mode 100644 index 0000000..6be779d --- /dev/null +++ b/modules/local/fitness/fitness_calculation/meta.yml @@ -0,0 +1,54 @@ +name: "fitness_calculation" +description: Calculates fitness scores for Deep Mutational Scanning (DMS) variants. It typically uses a log-enrichment ratio approach, comparing variant frequencies in selection (output) versus baseline (input) samples, normalized to a synonymous wild-type reference. +keywords: + - deep mutational scanning + - dms + - fitness calculation + - log-ratio + - enrichment +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test' ]` + - counts_merged: + type: file + description: TSV or CSV file containing the consolidated variant counts across all replicates + pattern: "*.{tsv,csv}" + - - exp_design: + type: file + description: CSV or text file defining the experimental design, including input/output mappings and replicates + pattern: "*.{csv,txt}" + - - syn_wt_txt: + type: file + description: Text file identifying the synonymous wild-type variant used as the normalization baseline + pattern: "*.txt" + +output: + - fitness_estimation: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_estimation.tsv: + type: file + description: TSV file containing the final calculated fitness scores for all variants + pattern: "fitness_estimation.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/fitness_calculation/templates/fitness_calculation.R b/modules/local/fitness/fitness_calculation/templates/fitness_calculation.R new file mode 100644 index 0000000..07d8d11 --- /dev/null +++ b/modules/local/fitness/fitness_calculation/templates/fitness_calculation.R @@ -0,0 +1,363 @@ +#!/usr/bin/env Rscript + +## default fitness estimation for nf-core/deepmutscan +## 18.02.2026 +## maximilian.stammnitz@crg.eu + +## 0. 
Libraries ## +################## + +suppressPackageStartupMessages({ + library(Biostrings) +}) + +## --- Helper functions --- + +# calculate nt hamming distances from the specified WT +compute_nt_hamming <- function(merged.counts, wt.seq) { + merged.counts <- cbind("nt_ham" = rep(NA, nrow(merged.counts)), merged.counts) + for (i in 1:nrow(merged.counts)){ + tmp.wt <- strsplit(as.character(wt.seq), "")[[1]] + tmp.mut <- strsplit(as.character(merged.counts\$nt_seq[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + merged.counts\$nt_ham[i] <- 0 + rm(tmp.mut, tmp.wt) + next + }else{ + merged.counts\$nt_ham[i] <- length(which(tmp.mut != tmp.wt)) + rm(tmp.mut, tmp.wt) + next + } + } + merged.counts +} + +# translate sequences and add aa_seq +add_aa_seq <- function(merged.counts) { + merged.counts <- cbind("aa_seq" = as.character(translate(DNAStringSet(merged.counts\$nt_seq))), merged.counts) + merged.counts +} + +# calculate AA hamming distances from the WT +compute_aa_hamming <- function(merged.counts, wt.seq.aa) { + merged.counts <- cbind("aa_ham" = rep(NA, nrow(merged.counts)), merged.counts) + for (i in 1:nrow(merged.counts)){ + tmp.wt <- strsplit(as.character(wt.seq.aa), "")[[1]] + tmp.mut <- strsplit(as.character(merged.counts\$aa_seq[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + merged.counts\$aa_ham[i] <- 0 + rm(tmp.mut, tmp.wt) + next + }else{ + merged.counts\$aa_ham[i] <- length(which(tmp.mut != tmp.wt)) + rm(tmp.mut, tmp.wt) + next + } + } + merged.counts +} + +# name the mutations +name_mutations <- function(merged.counts, wt.seq.aa) { + merged.counts <- cbind("wt_aa" = rep(NA, nrow(merged.counts)), + "pos" = rep(NA, nrow(merged.counts)), + "mut_aa" = rep(NA, nrow(merged.counts)), merged.counts) + for (i in 1:nrow(merged.counts)){ + if(merged.counts\$aa_ham[i] == 0){ + next + }else{ + tmp.wt <- strsplit(as.character(wt.seq.aa), "")[[1]] + tmp.mut <- strsplit(as.character(merged.counts\$aa_seq[i]), "")[[1]] + merged.counts\$pos[i] <- 
which(tmp.mut != tmp.wt) + merged.counts\$'wt_aa'[i] <- tmp.wt[merged.counts\$pos[i]] + merged.counts\$'mut_aa'[i] <- tmp.mut[merged.counts\$pos[i]] + rm(tmp.mut, tmp.wt) + } + } + merged.counts +} + +# find stops, WT and WT; aggregate counts of variants which are identical on the aa (but not nt) level +aggregate_by_aa <- function(merged.counts) { + ## find stops, WT and WT + merged.counts <- cbind(merged.counts, + "wt" = rep(NA, nrow(merged.counts)), + "stop" = rep(NA, nrow(merged.counts))) + merged.counts\$wt[which(merged.counts\$nt_ham == 0)] <- TRUE + merged.counts\$stop[which(merged.counts\$'mut_aa' == "*")] <- TRUE + + ## aggregate counts of variants which are identical on the aa (but not nt) level + ## exception: wildtype ones + ## thereby shrinking the matrix + uniq.aa.vars <- unique(merged.counts\$aa_seq) + uniq.aa.vars <- uniq.aa.vars[-which(uniq.aa.vars == merged.counts\$aa_seq[which(merged.counts\$wt == TRUE)])] + for(i in 1:length(uniq.aa.vars)){ + tmp.aa_seq <- uniq.aa.vars[i] + hits <- which(as.character(merged.counts\$aa_seq) == tmp.aa_seq) + if(length(hits) == 1){ + rm(tmp.aa_seq, hits) + next + }else{ + for(j in grep("input|output", colnames(merged.counts))){ + merged.counts[hits[1],j] <- sum(merged.counts[hits,j], na.rm = TRUE) + } + merged.counts[hits[1], "nt_seq"] <- paste(merged.counts[hits, "nt_seq"], collapse = ", ") + merged.counts <- merged.counts[-hits[-1],] + rm(tmp.aa_seq, hits) + next + } + } + merged.counts +} + +# calculate bimodal fitness distribution +density_peaks <- function(x, adjust = 1, ...) { + + ## obtain density distribution + d <- density(x, adjust = adjust, n = 5000, na.rm = T, ...) + y <- d\$y + idx <- which(diff(sign(diff(y))) == -2) + 1 + if (length(idx) == 0) return(numeric(0)) + + ## order density peaks by height (y value); take top 2 + idx <- idx[order(y[idx], decreasing = TRUE)] + peaks_x <- d\$x[idx] + peaks_x[seq_len(min(2, length(peaks_x)))] + +} + +# 3. 
Raw fitness calculations ## +calc_raw_fitness <- function(merged.counts, exp.design) { + ## how many fitness replicates are there + reps <- length(unique(exp.design\$experiment_replicate)) + for (i in 1:reps){ + merged.counts <- cbind(merged.counts, rep(NA, nrow(merged.counts))) + colnames(merged.counts)[ncol(merged.counts)] <- paste0("raw_fitness_rep", i) + } + + ## calculate raw fitness of all variants vs. WT variant + for (i in 1:reps){ + + ### collect counts + tmp.input.counts <- merged.counts[,paste0("input", i)] + tmp.output.counts <- merged.counts[,paste0("output", i)] + + ### add pseudo-count to zero-outputs (if the corresponding input count is non-zero) + tmp.output.counts[which(tmp.output.counts == 0 & tmp.input.counts != 0)] <- 1 + + ### take logs + tmp.wt.log.ratio <- log(tmp.output.counts[which(merged.counts\$wt == TRUE)] / + tmp.input.counts[which(merged.counts\$wt == TRUE)]) + tmp.fitness <- log(tmp.output.counts / + tmp.input.counts) - tmp.wt.log.ratio + + ### uncertain values to NA + tmp.fitness[which(is.na(tmp.fitness) == TRUE)] <- NA + tmp.fitness[which(tmp.fitness == "Inf")] <- NA + + ### add to table + merged.counts[,c(ncol(merged.counts) - reps + i)] <- tmp.fitness + + ### clean up + rm(tmp.fitness, tmp.wt.log.ratio, tmp.output.counts, tmp.input.counts) + } + + list(merged.counts = merged.counts, reps = reps) +} + +# 4. 
Fitness and error refinements ## +rescale_and_summarize <- function(merged.counts, reps) { + ## center the raw fitness distributions on 0 (median of wildtype synonymous) and -1 (median of stops) + for (i in 1:reps){ + + merged.counts <- cbind(merged.counts, rep(NA, nrow(merged.counts))) + colnames(merged.counts)[ncol(merged.counts)] <- paste0("rescaled_fitness_rep", i) + + ### fetch the key counts + tmp.wt.fitness <- merged.counts[which(merged.counts\$aa_ham == 0),ncol(merged.counts) - reps] + tmp.stop.fitness <- merged.counts[which(merged.counts\$stop == TRUE),ncol(merged.counts) - reps] + + ### rescale + tmp.wt.fitness.med <- median(tmp.wt.fitness, na.rm = TRUE) + tmp.stop.fitness.med <- median(tmp.stop.fitness, na.rm = TRUE) + ## if both WT and STOP mutants are available + if(!is.na(tmp.wt.fitness.med) & !is.na(tmp.stop.fitness.med)){ + lm.rescale <- lm(c(0, -1) ~ c(tmp.wt.fitness.med, tmp.stop.fitness.med)) + merged.counts[,ncol(merged.counts)] <- merged.counts[,ncol(merged.counts) - reps] * lm.rescale\$coefficients[[2]] + lm.rescale\$coefficients[[1]] + rm(tmp.wt.fitness, tmp.stop.fitness, + tmp.wt.fitness.med, tmp.stop.fitness.med, lm.rescale) + + ## if only WT mutants are available: lower peak determined by bimodal distribution fitting + }else if(!is.na(tmp.wt.fitness.med) & is.na(tmp.stop.fitness.med)){ + tmp.peaks <- sort(density_peaks(x = merged.counts[,ncol(merged.counts) - reps])) + lm.rescale <- lm(c(0, -1) ~ c(tmp.wt.fitness.med, tmp.peaks[1])) + merged.counts[,ncol(merged.counts)] <- merged.counts[,ncol(merged.counts) - reps] * lm.rescale\$coefficients[[2]] + lm.rescale\$coefficients[[1]] + rm(tmp.wt.fitness, tmp.stop.fitness, + tmp.wt.fitness.med, tmp.stop.fitness.med, lm.rescale, tmp.peaks) + + ## if only STOP mutants are available: higher peak determined by bimodal distribution fitting + }else if(is.na(tmp.wt.fitness.med) & !is.na(tmp.stop.fitness.med)){ + tmp.peaks <- sort(density_peaks(x = merged.counts[,ncol(merged.counts) - reps])) + 
lm.rescale <- lm(c(0, -1) ~ c(tmp.peaks[2], tmp.stop.fitness.med)) + merged.counts[,ncol(merged.counts)] <- merged.counts[,ncol(merged.counts) - reps] * lm.rescale\$coefficients[[2]] + lm.rescale\$coefficients[[1]] + rm(tmp.wt.fitness, tmp.stop.fitness, + tmp.wt.fitness.med, tmp.stop.fitness.med, lm.rescale, tmp.peaks) + + ## if neither WT nor STOP mutants are available: both peak determined by bimodal distribution fitting + }else if(is.na(tmp.wt.fitness.med) & is.na(tmp.stop.fitness.med)){ + tmp.peaks <- sort(density_peaks(x = merged.counts[,ncol(merged.counts) - reps])) + lm.rescale <- lm(c(0, -1) ~ c(tmp.peaks[2], tmp.peaks[1])) + merged.counts[,ncol(merged.counts)] <- merged.counts[,ncol(merged.counts) - reps] * lm.rescale\$coefficients[[2]] + lm.rescale\$coefficients[[1]] + rm(tmp.wt.fitness, tmp.stop.fitness, + tmp.wt.fitness.med, tmp.stop.fitness.med, lm.rescale, tmp.peaks) + + } + } + + ## calculate fitness mean and standard deviation across replicates + merged.counts <- cbind(merged.counts, + "mean fitness" = rep(NA, nrow(merged.counts)), + "fitness sd" = rep(NA, nrow(merged.counts))) + + if(reps == 1){ + + merged.counts\$'mean fitness' <- merged.counts[,ncol(merged.counts) - 2] + + }else if(reps > 1){ + + merged.counts\$'mean fitness' <- apply(merged.counts[,c(ncol(merged.counts) - 1 - reps):c(ncol(merged.counts) - 2)], + 1, + mean, + na.rm = TRUE) + merged.counts\$'fitness sd' <- apply(merged.counts[,c(ncol(merged.counts) - 1 - reps):c(ncol(merged.counts) - 2)], + 1, + sd, + na.rm = TRUE) + + } + + merged.counts +} + +## --- Main function --- + +#' Run default fitness estimation with configurable I/O paths +#' +#' @param counts_path Path to counts_merged.tsv +#' @param design_path Path to experimentalDesign.tsv +#' @param wt_seq_path Path to synonymous_wt.txt (single line DNA sequence) +#' @param output_path Path to write fitness_estimation.tsv +#' +#' @return Invisibly returns the final data.frame; writes the output to output_path. 
+run_fitness_estimation <- function(counts_path, + design_path, + wt_seq_path, + output_path) { + ## 1. Import key files ## + ######################### + + merged.counts <- read.table(counts_path, sep = "\t", header = TRUE, check.names = FALSE) + exp.design <- read.table(design_path, sep = "\t", header = TRUE, check.names = FALSE) + wt.seq <- DNAString(as.character(read.table(wt_seq_path))) + wt.seq.aa <- translate(wt.seq) + + ## 2. Pre-processing the count table ## + ####################################### + + ## calculate nt hamming distances from the specified WT + merged.counts <- compute_nt_hamming(merged.counts, wt.seq) + + ## translate sequences + merged.counts <- add_aa_seq(merged.counts) + + ## calculate AA hamming distances from the WT + merged.counts <- compute_aa_hamming(merged.counts, wt.seq.aa) + + ## name the mutations + merged.counts <- name_mutations(merged.counts, wt.seq.aa) + + ## find stops, WT and WT; aggregate AA-identical variants (except WT) + merged.counts <- aggregate_by_aa(merged.counts) + + ## 3. Raw fitness calculations ## + ################################# + fitness_res <- calc_raw_fitness(merged.counts, exp.design) + merged.counts <- fitness_res\$merged.counts + reps <- fitness_res\$reps + + ## 4. Fitness and error refinements ## + ###################################### + merged.counts <- rescale_and_summarize(merged.counts, reps) + + ## clean up + rm(reps) + + ## export + write.table(merged.counts, output_path, + col.names = TRUE, row.names = FALSE, quote = FALSE, sep = "\t", na = "") + + invisible(merged.counts) +} + + +## 5. 
Version ## +################ + +# sessionInfo() +# R version 4.5.1 (2025-06-13) +# Platform: aarch64-apple-darwin20 +# Running under: macOS Sonoma 14.6.1 +# +# Matrix products: default +# BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib +# LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1 +# +# locale: +# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 +# +# time zone: Europe/Madrid +# tzcode source: internal +# +# attached base packages: +# [1] stats4 stats graphics grDevices utils datasets methods base +# +# other attached packages: +# [1] Biostrings_2.76.0 GenomeInfoDb_1.44.2 XVector_0.48.0 IRanges_2.42.0 S4Vectors_0.46.0 +# [6] BiocGenerics_0.54.0 generics_0.1.4 +# +# loaded via a namespace (and not attached): +# [1] httr_1.4.7 compiler_4.5.1 R6_2.6.1 tools_4.5.1 +# [5] GenomeInfoDbData_1.2.14 rstudioapi_0.17.1 crayon_1.5.3 UCSC.utils_1.4.0 +# [9] jsonlite_2.0.0 + + + +##### +# run function +##### +run_fitness_estimation( + counts_path = "$counts_merged", + design_path = "$exp_design", + wt_seq_path = "$syn_wt_txt", + output_path = "fitness_estimation.tsv" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +Biostrings_version <- as.character(packageVersion("Biostrings")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"\${task.process}":', + paste(' r-base:', r_version), + paste(' r-Biostrings:', Biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/fitness_experimental_design/environment.yml b/modules/local/fitness/fitness_experimental_design/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/fitness_experimental_design/environment.yml @@ -0,0 +1,15 @@ +channels: 
+ - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/fitness_experimental_design/main.nf b/modules/local/fitness/fitness_experimental_design/main.nf new file mode 100644 index 0000000..c15fcac --- /dev/null +++ b/modules/local/fitness/fitness_experimental_design/main.nf @@ -0,0 +1,20 @@ +process EXPDESIGN_FITNESS { + tag "experimentalDesign" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + path samplesheet_csv + + output: + path "experimentalDesign.tsv", emit: experimental_design + path "versions.yml", emit: versions + + script: + template 'dimsum_experimentalDesign.R' +} diff --git a/modules/local/fitness/fitness_experimental_design/meta.yml b/modules/local/fitness/fitness_experimental_design/meta.yml new file mode 100644 index 0000000..74d5c93 --- /dev/null +++ b/modules/local/fitness/fitness_experimental_design/meta.yml @@ -0,0 +1,37 @@ +name: "expdesign_fitness" +description: Transforms a standard nf-core samplesheet into a tab-separated experimental design file that defines replicates, selection conditions, and input/output relationships for fitness calculation. 
+keywords: + - deep mutational scanning + - dms + - experimental design + - samplesheet + - dimsum + - mutscan +tools: + - "r-base": + description: "R is a free software environment for statistical computing and graphics" + homepage: "https://www.r-project.org/" + documentation: "https://cran.r-project.org/manuals.html" + licence: ["GPL-2.0-or-later"] + +input: + - - samplesheet_csv: + type: file + description: The primary nf-core pipeline samplesheet containing metadata for all sequencing libraries + pattern: "*.{csv}" + +output: + - experimental_design: + - experimentalDesign.tsv: + type: file + description: A TSV file formatted specifically for DiMSum or mutscan experimental design requirements + pattern: "experimentalDesign.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/fitness_experimental_design/templates/dimsum_experimentalDesign.R b/modules/local/fitness/fitness_experimental_design/templates/dimsum_experimentalDesign.R new file mode 100644 index 0000000..66a9be6 --- /dev/null +++ b/modules/local/fitness/fitness_experimental_design/templates/dimsum_experimentalDesign.R @@ -0,0 +1,95 @@ +#!/usr/bin/env Rscript + +# Make a DiMSum experimental design from a deepmutscan samplesheet. 
+# - samplesheet_csv: path to CSV with columns sample,type,replicate,file1,file2 +# - out_path: where to write the TSV (default "experimentalDesign.tsv") +# Returns: the experimental design as a data.frame +make_dimsum_experimental_design <- function(samplesheet_csv, out_path = "experimentalDesign.tsv") { + # ---- read & normalize ---- + ss <- read.csv(samplesheet_csv, stringsAsFactors = FALSE, check.names = FALSE) + names(ss) <- tolower(names(ss)) + + # tolerate missing file2 column (single-end) + if (!"file2" %in% names(ss)) ss\$file2 <- "" + + required <- c("sample", "type", "replicate", "file1", "file2") + missing <- setdiff(required, names(ss)) + if (length(missing) > 0) stop("Samplesheet missing columns: ", paste(missing, collapse = ", ")) + + # coerce types + ss\$replicate <- as.integer(ss\$replicate) + + # ---- derive sample_name strategy ---- + # If only one biological sample present (e.g. one protein), use "input1", "output2", ... + # If multiple biological samples present, prefix with 'sample' to avoid collisions: + # "GID1A_input1", "GID1B_output2", ... 
+ multi_base <- length(unique(ss\$sample)) > 1 + if (multi_base) { + sample_name <- paste(ss\$sample, ss\$type, ss\$replicate, sep = "") + } else { + sample_name <- paste0(ss\$type, ss\$replicate) + } + + # ---- build DiMSum columns ---- + experiment_replicate <- ss\$replicate + selection_id <- ifelse(ss\$type == "input", 0L, + ifelse(ss\$type == "output", 1L, NA_integer_)) + # assume one selection batch + selection_replicate <- ifelse(ss\$type == "output", 1L, NA_integer_) + # assume one technical batch + technical_replicate <- rep(1L, nrow(ss)) + + pair1 <- basename(ss\$file1) + # keep empty string for single-end / missing file2 + pair2 <- ifelse(is.na(ss\$file2) | ss\$file2 == "", "", basename(ss\$file2)) + + ed <- data.frame( + sample_name = sample_name, + experiment_replicate = experiment_replicate, + selection_id = selection_id, + selection_replicate = selection_replicate, + technical_replicate = technical_replicate, + pair1 = pair1, + pair2 = pair2, + stringsAsFactors = FALSE + ) + + # ---- order rows: by sample (if multiple), type (input, output, quality), then replicate ---- + type_rank <- match(ss\$type, c("input", "output", "quality")) + ord <- if (multi_base) { + order(ss\$sample, type_rank, ss\$replicate, na.last = TRUE) + } else { + order(type_rank, ss\$replicate, na.last = TRUE) + } + ed <- ed[ord, , drop = FALSE] + rownames(ed) <- NULL + + # ---- write & return ---- + write.table(ed, file = out_path, sep = "\\t", row.names = FALSE, col.names = TRUE, quote = FALSE, na = "") + return(ed) +} + +##### +# run function +##### +make_dimsum_experimental_design( + samplesheet_csv = "$samplesheet_csv", + out_path = "experimentalDesign.tsv" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] + +if (is.null(r_version)) r_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version) + ), + f +) +close(f) diff --git 
a/modules/local/fitness/fitness_heatmap/environment.yml b/modules/local/fitness/fitness_heatmap/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/fitness_heatmap/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/fitness_heatmap/main.nf b/modules/local/fitness/fitness_heatmap/main.nf new file mode 100644 index 0000000..4106ba6 --- /dev/null +++ b/modules/local/fitness/fitness_heatmap/main.nf @@ -0,0 +1,30 @@ +process FITNESS_HEATMAP { + tag { sample.sample } + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), path(fitness_estimation_tsv) // from FITNESS_CALCULATION + tuple val(sample2), path(wt_seq) // WT sequence + + output: + tuple val(sample), path("fitness_heatmap.pdf"), emit: fitness_heatmap + path "versions.yml", emit: versions + + script: + template 'fitness_heatmap.R' + + stub: + """ + touch fitness_heatmap.pdf + cat > versions.yml <<'EOF' + FITNESS_HEATMAP: + stub-version: "0.0.0" + EOF + """ +} diff --git a/modules/local/fitness/fitness_heatmap/meta.yml b/modules/local/fitness/fitness_heatmap/meta.yml new file mode 100644 index 0000000..7b22869 --- /dev/null +++ b/modules/local/fitness/fitness_heatmap/meta.yml @@ -0,0 +1,59 @@ +name: "fitness_heatmap" +description: Generates a fitness landscape heatmap from calculated fitness scores, displaying the functional impact of amino acid substitutions at each position of the reference sequence. +keywords: + - deep mutational scanning + - dms + - fitness + - visualization + - heatmap + - mutation landscape +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + - "bioconductor-biostrings": + description: "Efficient manipulation of biological strings in R" + homepage: "https://bioconductor.org/packages/Biostrings" + licence: ["Artistic-2.0"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information for the fitness data + e.g. 
`[ id:'test' ]` + - fitness_estimation_tsv: + type: file + description: TSV file containing calculated fitness scores and statistics + pattern: "*.{tsv}" + - - sample2: + type: map + description: | + Groovy Map containing metadata for the reference sequence + - wt_seq: + type: file + description: FASTA file or text file containing the wild-type reference sequence + pattern: "*.{fasta,fa,txt}" + +output: + - fitness_heatmap: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_heatmap.pdf: + type: file + description: PDF heatmap visualizing the fitness scores across the protein sequence + pattern: "fitness_heatmap.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/fitness_heatmap/templates/fitness_heatmap.R b/modules/local/fitness/fitness_heatmap/templates/fitness_heatmap.R new file mode 100644 index 0000000..e860382 --- /dev/null +++ b/modules/local/fitness/fitness_heatmap/templates/fitness_heatmap.R @@ -0,0 +1,312 @@ +#!/usr/bin/env Rscript + +suppressPackageStartupMessages({ + library(methods) + library(dplyr) + library(ggplot2) + library(grid) # for unit() +}) + +# ---------- helper functions ---------- +find_col <- function(df, candidates) { + norm <- function(x) gsub("[^a-z0-9]+", "_", tolower(x)) + nms <- colnames(df); nn <- norm(nms) + for (cand in candidates) { + hit <- which(nn == norm(cand)) + if (length(hit) == 1) return(nms[hit]) + } + stop(sprintf("Could not find any of columns: %s", paste(candidates, collapse = ", "))) +} + +get_rescaled_cols <- function(df) { + nms <- colnames(df) + hits <- grep("^rescaled[_ ]?fitness", nms, ignore.case = TRUE, value = TRUE) + if (!length(hits)) stop("No 'rescaled_fitness' columns found.") + idx <- suppressWarnings(as.integer(gsub(".*?([0-9]+)\$", "\\\\1", hits))) # additional backslashs to make it groovy 
readable + hits[order(is.na(idx), idx, hits)] +} + +# Find "mean fitness" column (if present) +find_mean_col <- function(df) { + nms <- colnames(df) + key <- tolower(gsub("[^a-z0-9]+", "_", nms)) + hit <- which(key == "mean_fitness") + if (length(hit) == 1) nms[hit] else NULL +} + +# NEW: read WT amino acid sequence from .txt file (single line) +read_wt_seq_aa_txt <- function(path) { + if (is.null(path)) stop("wt_seq_aa_txt_path must be provided.") + x <- readLines(path, warn = FALSE) + x <- x[nzchar(x)] + if (!length(x)) stop("WT AA TXT is empty.") + aa <- toupper(gsub("\\\\s+", "", x[which.max(nchar(x))])) + # Keep only valid amino acid letters (including stop '*') + aa <- gsub("[^ACDEFGHIKLMNPQRSTVWY*]", "", aa) + if (!nchar(aa)) stop("WT AA TXT contains no valid AA letters.") + aa +} + +# Build full AA×position grid for positions 1..wt_len (from WT sequence), +# join fitness values; any missing combos stay NA (-> grey). +# Positions > wt_len are considered padded and will be white. +build_heatmap_long <- function(df, + wt_aa_col, + pos_col, + mut_aa_col, + fitness_col, + positions_per_row = 75, + wt_seq_aa, + fill_missing_as_zero = FALSE) { + + # authoritative WT length from provided sequence + letters <- strsplit(wt_seq_aa, "", fixed = TRUE)[[1]] + wt_len <- length(letters) + + # normalize data + df0 <- df %>% + transmute( + position = suppressWarnings(as.numeric(.data[[pos_col]])), + wt_aa_in = .data[[wt_aa_col]], + mut_aa = .data[[mut_aa_col]], + fitness = suppressWarnings(as.numeric(.data[[fitness_col]])) + ) %>% + filter(is.finite(position)) + + # drop any rows that claim positions beyond WT length + if (nrow(df0) && any(df0\$position > wt_len, na.rm = TRUE)) { + dropped <- sum(df0\$position > wt_len, na.rm = TRUE) + warning(sprintf("Dropping %d row(s) with position > WT length (%d).", dropped, wt_len)) + df0 <- df0 %>% filter(position <= wt_len) + } + + # pad to next multiple of 75 (by rows) + rem <- wt_len %% positions_per_row + pad_need <- if (rem == 0) 
0 else positions_per_row - rem + max_paded <- wt_len + pad_need + + # full grid: positions 1..max_paded (so the tail exists), AA set of 21 + all_positions <- seq_len(max_paded) + all_amino_acids <- c("A","C","D","E","F","G","H","I","K","L", + "M","N","P","Q","R","S","T","V","W","Y","*") + + grid_df <- expand.grid(position = all_positions, + mut_aa = all_amino_acids, + KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE) %>% + mutate(is_padded = position > wt_len) + + # join fitness only for real positions (<= wt_len) + fit_df <- df0 %>% select(position, mut_aa, fitness) + d <- grid_df %>% + left_join(fit_df, by = c("position","mut_aa")) + + if (fill_missing_as_zero) { + d\$fitness[is.na(d\$fitness) & d\$position <= wt_len] <- 0 + } + + # authoritative WT AA per real position; tail gets placeholder 'Y' + wt_map <- tibble(position = seq_len(wt_len), wt_aa = letters) + d <- d %>% + left_join(wt_map, by = "position") %>% + mutate(wt_aa = ifelse(is.na(wt_aa) & position > wt_len, "Y", wt_aa)) + + # layout fields + d <- d %>% + mutate( + row_group = ((position - 1) %/% positions_per_row) + 1, + wt_aa_pos = paste0(wt_aa, position), + wt_aa_pos = factor(wt_aa_pos, levels = unique(wt_aa_pos)), + synonymous = mut_aa == wt_aa + ) + + # IMPORTANT: use WT length as the true end of the protein + d\$max_pos <- wt_len + d +} + +syn_segments <- function(d, positions_per_row = 75) { + amino_order <- rev(c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*")) + d %>% + mutate( + mut_aa = factor(mut_aa, levels = amino_order), + x = as.numeric(factor(wt_aa_pos, levels = levels(wt_aa_pos))) - + ((row_group - 1) * positions_per_row), + y = as.numeric(factor(mut_aa, levels = amino_order)) + ) %>% + filter(synonymous, position <= max_pos) +} + +# Draw one solid white rectangle per row group covering the padded tail region +white_tail_rects <- function(d, positions_per_row = 75) { + wt_len <- unique(d\$max_pos)[1] + if (!is.finite(wt_len)) 
return(dplyr::tibble()[0,]) + + # if perfectly divisible by 75, there is no tail to cover + if (wt_len %% positions_per_row == 0) return(dplyr::tibble()[0,]) + + # which facet (row group) contains the last real position? + last_group <- ((wt_len - 1) %/% positions_per_row) + 1 + last_local_idx <- ((wt_len - 1) %% positions_per_row) + 1 # 1..75 within the facet + + tibble::tibble( + row_group = last_group, + xmin = last_local_idx + 0.5 - 0.025, # tiny epsilon to avoid hairlines + xmax = positions_per_row + 0.5 + 0.025, + ymin = 0.5 - 0.025, + ymax = 21.5 + 0.025 + ) +} + +plot_heatmap <- function(d, title_text, positions_per_row = 75) { + amino_order <- rev(c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*")) + d <- d %>% mutate(mut_aa = factor(mut_aa, levels = amino_order)) + + min_f <- suppressWarnings(min(d\$fitness, na.rm = TRUE)); if (!is.finite(min_f)) min_f <- 0 + max_f <- suppressWarnings(max(d\$fitness, na.rm = TRUE)); if (!is.finite(max_f)) max_f <- 0 + max_orig_pos <- unique(d\$max_pos)[1] + + syn <- syn_segments(d, positions_per_row) + rect <- white_tail_rects(d, positions_per_row) + + ggplot(d, aes(x = wt_aa_pos, y = mut_aa, fill = fitness)) + + scale_fill_gradientn( + colours = c("#D73027", "#F0F0F0", "#4575B4"), + values = if ((abs(min_f) + max_f) > 0) c(0, abs(min_f)/(abs(min_f)+max_f), 1) else c(0, 0.5, 1), + na.value = "grey35", + limits = c(min_f, max_f) + ) + + scale_x_discrete( + labels = function(x) { + num <- suppressWarnings(as.numeric(gsub("[^0-9]", "", x))) + ifelse(num > max_orig_pos, " ", x) + }, + expand = expansion(mult = c(0, 0)) # no extra margin area + ) + + geom_tile() + + # Solid white block covering the tail (no pattern / no seams) + { if (nrow(rect)) geom_rect(data = rect, inherit.aes = FALSE, + aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax), + fill = "white", color = NA) } + + geom_segment( + data = syn, + aes(x = x - 0.485, xend = x + 0.485, y = y - 0.485, yend 
= y + 0.485), + linewidth = 0.2, inherit.aes = FALSE, color = "grey10" + ) + + theme_minimal() + + labs(title = title_text, x = "Wild-type amino acid", y = "Mutant amino acid", fill = "Fitness") + + theme( + plot.title = element_text(size = 16, face = "bold"), + axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 10), + axis.text.y = element_text(size = 10), + axis.title.x = element_text(size = 14), + axis.title.y = element_text(size = 14), + legend.title = element_text(size = 12), + legend.text = element_text(size = 10), + panel.grid.major = element_blank(), + panel.grid.minor = element_blank(), + strip.text = element_blank(), + strip.background = element_blank(), + panel.spacing = grid::unit(0.2, "lines") + ) + + facet_wrap(~ row_group, scales = "free_x", ncol = 1) +} + +# ---------- main callable ---------- +# fitness_table_path : path to fitness_estimation.tsv +# wt_seq_aa_txt_path : path to TXT file containing WT AA sequence (one line) +# output_pdf_path : output PDF (default "fitness_heatmap.pdf") +# positions_per_row : default 75 +run_fitness_rescaled_heatmaps <- function(fitness_table_path, + wt_seq_aa_txt_path, + output_pdf_path = "fitness_heatmap.pdf", + positions_per_row = 75) { + + df <- read.table( + fitness_table_path, sep = "\t", header = TRUE, + check.names = FALSE, quote = "", comment.char = "" + ) + + wt_aa_col <- find_col(df, c("wt aa", "wt_aa", "wt")) + pos_col <- find_col(df, c("pos", "position")) + mut_aa_col <- find_col(df, c("mut aa", "mut_aa", "aa")) + rescaled_cols <- get_rescaled_cols(df) + + wt_seq_aa <- read_wt_seq_aa_txt(wt_seq_aa_txt_path) + + plots <- list() + + ## 1) Mean first – use existing "mean fitness" column if available + mean_col <- find_mean_col(df) + if (is.null(mean_col)) { + df\$`rescaled_fitness_mean` <- if (length(rescaled_cols) == 1) df[[rescaled_cols[1]]] else rowMeans(df[, rescaled_cols], na.rm = TRUE) + mean_col <- "rescaled_fitness_mean" + } + long_df_mean <- build_heatmap_long(df, wt_aa_col, 
pos_col, mut_aa_col, mean_col, + positions_per_row, wt_seq_aa) + plots[[length(plots) + 1]] <- list( + title = sprintf("Fitness — mean of %d replicate(s)", length(rescaled_cols)), + data = long_df_mean + ) + + ## 2) Then individual replicates + for (i in seq_along(rescaled_cols)) { + col <- rescaled_cols[i] + long_df <- build_heatmap_long(df, wt_aa_col, pos_col, mut_aa_col, col, + positions_per_row, wt_seq_aa) + plots[[length(plots) + 1]] <- list( + title = sprintf("Fitness — rep%d", i), + data = long_df + ) + } + + # Device height: (#row groups × 4) + page_heights <- vapply(plots, function(p) max(p\$data\$row_group, na.rm = TRUE), numeric(1)) + device_height <- max(4, as.numeric(page_heights) * 4, na.rm = TRUE) + + grDevices::pdf(output_pdf_path, width = 16, height = device_height) + on.exit(try(grDevices::dev.off(), silent = TRUE), add = TRUE) + for (p in plots) print(plot_heatmap(p\$data, p\$title, positions_per_row)) + invisible(TRUE) +} + +#### +# run function +#### +run_fitness_rescaled_heatmaps( + fitness_table_path = "$fitness_estimation_tsv", + wt_seq_aa_txt_path = "$wt_seq", + output_pdf_path = "fitness_heatmap.pdf" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) +methods_version <- as.character(packageVersion("methods")) +grid_version <- as.character(packageVersion("grid")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" +if (length(methods_version) == 0) methods_version <- "unknown" +if (length(grid_version) == 0) grid_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version), + paste(' r-methods:', 
methods_version), + paste(' r-grid:', grid_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/merge_counts/environment.yml b/modules/local/fitness/merge_counts/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/fitness/merge_counts/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/fitness/merge_counts/main.nf b/modules/local/fitness/merge_counts/main.nf new file mode 100644 index 0000000..d34e1f8 --- /dev/null +++ b/modules/local/fitness/merge_counts/main.nf @@ -0,0 +1,20 @@ +process MERGE_COUNTS { + tag "${sample.sample}" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), val(metas), path(input_counts), path(output_counts) + + output: + tuple val(sample), path("counts_merged.tsv"), emit: merged_counts + path "versions.yml", emit: versions + + script: + template 'merge_counts.R' +} diff --git a/modules/local/fitness/merge_counts/meta.yml b/modules/local/fitness/merge_counts/meta.yml new file mode 100644 index 0000000..87e035a --- /dev/null +++ b/modules/local/fitness/merge_counts/meta.yml @@ -0,0 +1,51 @@ +name: "merge_counts" +description: Consolidates variant counts from multiple input (baseline) and output (selection) files into a single merged TSV table, organized by sample or replicate. +keywords: + - deep mutational scanning + - dms + - count merging + - variant table + - preprocessing +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test' ]` + - metas: + type: list + description: A list of metadata maps corresponding to the input and output count files + - input_counts: + type: list + description: A list of paths to count files representing the baseline/input population + - output_counts: + type: list + description: A list of paths to count files representing the selection/output population + +output: + - merged_counts: + - sample: + type: map + description: Groovy Map containing sample information + - counts_merged.tsv: + type: file + description: A consolidated TSV file containing merged counts for all provided inputs and outputs + pattern: "counts_merged.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/merge_counts/templates/merge_counts.R b/modules/local/fitness/merge_counts/templates/merge_counts.R new file mode 100644 index 0000000..8e39be8 --- /dev/null +++ b/modules/local/fitness/merge_counts/templates/merge_counts.R @@ -0,0 +1,112 @@ +#!/usr/bin/env Rscript + +# -------------------------------------------------------------- +# 1. SETUP & INPUTS +# -------------------------------------------------------------- + +raw_inputs_str <- "$input_counts" +raw_outputs_str <- "$output_counts" + +parse_paths <- function(x) { + if (x == "" || x == "[]") return(character(0)) + # Split at spaces + parts <- strsplit(x, "\\\\s+")[[1]] + # Remove empty strings + parts[parts != ""] +} + +input_paths <- parse_paths(raw_inputs_str) +output_paths <- parse_paths(raw_outputs_str) + +message("Found ", length(input_paths), " input files.") +message("Found ", length(output_paths), " output files.") + +# -------------------------------------------------------------- +# 2. 
MERGE LOGIC +# -------------------------------------------------------------- + +# Helper to read a 2-col TSV without header +read_counts <- function(fp) { + utils::read.table( + fp, header = FALSE, sep = "\\t", quote = "", + col.names = c("nt_seq", "count"), + colClasses = c("character", "numeric"), + comment.char = "", check.names = FALSE + ) +} + +# Read all inputs / outputs +input_list <- lapply(input_paths, read_counts) +output_list <- lapply(output_paths, read_counts) + +# Collect universe of sequences +all_seqs <- unique(c( + unlist(lapply(input_list, function(x) x\$nt_seq)), + unlist(lapply(output_list, function(x) x\$nt_seq)) +)) + +# Pre-allocate output frame +n_in <- length(input_list) +n_out <- length(output_list) + +col_names <- c( + "nt_seq", + if (n_in > 0) paste0("input", seq_len(n_in)) else character(0), + if (n_out > 0) paste0("output", seq_len(n_out)) else character(0) +) + +# Initialize dataframe with 0 counts +out <- data.frame( + nt_seq = all_seqs, + matrix(0, nrow = length(all_seqs), ncol = n_in + n_out), + stringsAsFactors = FALSE, check.names = FALSE +) +names(out) <- col_names + +# Fill inputs +if (n_in > 0) { + for (i in seq_len(n_in)) { + df <- input_list[[i]] + # Match sequences (IMPORTANT: \$ must stay escaped in this Nextflow template) + idx <- match(df\$nt_seq, out\$nt_seq) + # Assign counts (IMPORTANT: \$ must stay escaped) + out[idx, paste0("input", i)] <- df\$count + } +} + +# Fill outputs +if (n_out > 0) { + for (j in seq_len(n_out)) { + df <- output_list[[j]] + # IMPORTANT: \$ must stay escaped + idx <- match(df\$nt_seq, out\$nt_seq) + out[idx, paste0("output", j)] <- df\$count + } +} + +# Write Output +utils::write.table( + out, + file = "counts_merged.tsv", + sep = "\\t", + row.names = FALSE, + col.names = TRUE, + quote = FALSE +) + +# -------------------------------------------------------------- +# 3. 
VERSIONING +# -------------------------------------------------------------- + +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +if (is.null(r_version)) r_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version) + ), + f +) +close(f) diff --git a/modules/local/fitness/run_dimsum/environment.yml b/modules/local/fitness/run_dimsum/environment.yml new file mode 100644 index 0000000..4d75327 --- /dev/null +++ b/modules/local/fitness/run_dimsum/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::r-dimsum=1.4 diff --git a/modules/local/fitness/run_dimsum/main.nf b/modules/local/fitness/run_dimsum/main.nf new file mode 100644 index 0000000..edf4917 --- /dev/null +++ b/modules/local/fitness/run_dimsum/main.nf @@ -0,0 +1,43 @@ +process RUN_DIMSUM { + tag { sample.sample } + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/92/9298c391f285d7f2c1155f6a519a96c4f7591971780ba4f10569558282f40b6f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_r-dimsum_r-base_r-data.table_pruned:cca13eed371a9d84' }" + + input: + tuple val(sample), path(counts_merged) + path(wt_txt) + path(exp_design) + + output: + path "dimsum_results**", emit: results_dir + path "versions.yml", emit: versions + + script: + """ + set -euo pipefail + + # DiMSum expects the sequence string, not a file path + WT=\$(tr -d ' \r\\n\\t' < "$wt_txt") + + DiMSum \ + --experimentDesignPath "$exp_design" \ + --wildtypeSequence "\$WT" \ + --countPath "$counts_merged" \ + --startStage 4 \ + --stopStage 5 \ + --fitnessErrorModel F \ + --retainIntermediateFiles T \ + --projectName "dimsum_results" \ + --fastqFileDir . 
\ + + R_VERSION=\$(R --version | head -n 1 | sed -E 's/^R version ([0-9.]+).*/\\1/') + cat <<-END_VERSIONS > versions.yml + DIMSUM_RUN: + r-base: \$R_VERSION +END_VERSIONS + """ +} diff --git a/modules/local/fitness/run_dimsum/meta.yml b/modules/local/fitness/run_dimsum/meta.yml new file mode 100644 index 0000000..3e889f9 --- /dev/null +++ b/modules/local/fitness/run_dimsum/meta.yml @@ -0,0 +1,49 @@ +name: "run_dimsum" +description: Runs DiMSum to estimate fitness and functionality scores from Deep Mutational Scanning (DMS) count data. +keywords: + - deep mutational scanning + - dms + - fitness + - dimsum +tools: + - "dimsum": + description: "An error-model-enriched pipeline for analyzing deep mutational scanning data" + homepage: "https://github.com/lehner-lab/DiMSum" + documentation: "https://github.com/lehner-lab/DiMSum/blob/master/README.md" + doi: "10.1186/s12859-020-03709-3" + licence: ["GPL-3.0-or-later"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information. 
+ Matches the structure used in the pipeline (e.g., [ id:'test' ]) + - counts_merged: + type: file + description: A TSV file containing variant counts merged across samples + pattern: "*.{tsv,csv}" + - - wt_txt: + type: file + description: A text file containing the wild-type DNA/amino acid sequence string + pattern: "*.{txt}" + - - exp_design: + type: file + description: DiMSum-specific experimental design file describing replicates and conditions + pattern: "*.{txt,csv}" + +output: + - results_dir: + - dimsum_results**: + type: directory + description: Directory containing the full suite of DiMSum output files, including fitness estimates and plots + pattern: "dimsum_results*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/run_mutscan/environment.yml b/modules/local/fitness/run_mutscan/environment.yml new file mode 100644 index 0000000..aeaf45d --- /dev/null +++ b/modules/local/fitness/run_mutscan/environment.yml @@ -0,0 +1,7 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.78.0 + - bioconda::bioconductor-mutscan=1.0.0 + - conda-forge::r-base=4.5.3 diff --git a/modules/local/fitness/run_mutscan/main.nf b/modules/local/fitness/run_mutscan/main.nf new file mode 100644 index 0000000..c027e99 --- /dev/null +++ b/modules/local/fitness/run_mutscan/main.nf @@ -0,0 +1,24 @@ +process RUN_MUTSCAN { + tag "${sample.sample}" + label 'process_medium' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(sample), path(counts_merged) + path(syn_wt_txt) + path(exp_design) + + output: + tuple val(sample), path("fitness_estimation_mutscan_edgeR.tsv"), emit: fitness_mutscan_edgeR + tuple val(sample), path("fitness_estimation_mutscan_limma.tsv"), emit: fitness_mutscan_limma + tuple val(sample), path("*.pdf"), emit: qc_plots + path "versions.yml", emit: versions + + script: + template 'fitness_calculation_mutscan.R' +} diff --git a/modules/local/fitness/run_mutscan/meta.yml b/modules/local/fitness/run_mutscan/meta.yml new file mode 100644 index 0000000..f80e038 --- /dev/null +++ b/modules/local/fitness/run_mutscan/meta.yml @@ -0,0 +1,81 @@ +name: "run_mutscan" +description: Calculates fitness and functionality scores using the mutscan R package, employing edgeR and limma-voom statistical frameworks for robust estimation from input and output variant counts. 
+keywords: + - deep mutational scanning + - dms + - fitness estimation + - mutscan + - edger + - limma +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + - "edger": + description: "Empirical Analysis of Digital Gene Expression Data in R" + homepage: "https://bioconductor.org/packages/release/bioc/html/edgeR.html" + doi: "10.1093/bioinformatics/btp616" + licence: ["GPL-2.0-or-later"] + - "limma": + description: "Linear Models for Microarray Data" + homepage: "https://bioconductor.org/packages/release/bioc/html/limma.html" + doi: "10.1093/nar/gkv007" + licence: ["GPL-2.0-or-later"] + +input: + - - sample: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test' ]` + - counts_merged: + type: file + description: TSV/CSV file containing variant counts merged across samples or replicates + pattern: "*.{tsv,csv}" + - - syn_wt_txt: + type: file + description: Text file identifying the synonymous wild-type mutation used as the reference baseline + pattern: "*.txt" + - - exp_design: + type: file + description: Experimental design file defining replicates, conditions, and groupings for statistical analysis + pattern: "*.{csv,txt}" + +output: + - fitness_mutscan_edgeR: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_estimation_mutscan_edgeR.tsv: + type: file + description: Fitness estimation results generated using the edgeR framework + pattern: "fitness_estimation_mutscan_edgeR.tsv" + - fitness_mutscan_limma: + - sample: + type: map + description: Groovy Map containing sample information + - fitness_estimation_mutscan_limma.tsv: + type: file + description: Fitness estimation results generated using the limma framework + pattern: "fitness_estimation_mutscan_limma.tsv" + - qc_plots: + - sample: + type: map + description: Groovy Map containing sample information + - "*.pdf": + type: file + description: Quality control plots generated by mutscan during fitness estimation + pattern: "*.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/fitness/run_mutscan/templates/fitness_calculation_mutscan.R b/modules/local/fitness/run_mutscan/templates/fitness_calculation_mutscan.R new file mode 100644 index 0000000..e981d28 --- /dev/null +++ b/modules/local/fitness/run_mutscan/templates/fitness_calculation_mutscan.R @@ -0,0 +1,371 @@ +#!/usr/bin/env Rscript + +## mutscan (Soneson et al., Genome Biology 2023) fitness estimation for nf-core/deepmutscan +## 19.03.2026 + +## 0. 
Libraries ## +################## + +suppressPackageStartupMessages({ + library(mutscan) + library(Biostrings) +}) + +## --- Helper functions --- + +# calculate nt hamming distances from the specified WT +nbrMutBases <- function(x, wt.seq) { + x <- cbind(x, "nbrMutBases" = rep(NA, nrow(x))) + for (i in 1:nrow(x)){ + tmp.wt <- strsplit(as.character(wt.seq), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequence[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + x\$nbrMutBases[i] <- 0 + rm(tmp.mut, tmp.wt) + next + }else{ + x\$nbrMutBases[i] <- length(which(tmp.mut != tmp.wt)) + rm(tmp.mut, tmp.wt) + next + } + } + x\$nbrMutBases <- as.character(x\$nbrMutBases) + return(x) +} + +# calculate codon hamming distances from the specified WT +nbrMutCodons <- function(x, wt.seq) { + x <- cbind(x, "nbrMutCodons" = rep(NA, nrow(x))) + for (i in 1:nrow(x)){ + tmp.wt <- strsplit(as.character(wt.seq), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequence[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + x\$nbrMutCodons[i] <- 0 + rm(tmp.mut, tmp.wt) + next + }else if(length(which(tmp.mut != tmp.wt)) == 1){ + x\$nbrMutCodons[i] <- 1 + rm(tmp.mut, tmp.wt) + }else if(length(which(tmp.mut != tmp.wt)) > 1){ + mut.pos <- which(tmp.mut != tmp.wt) + x\$nbrMutCodons[i] <- length(unique(ceiling(mut.pos / 3))) + rm(tmp.mut, tmp.wt) + next + } + } + x\$nbrMutCodons <- as.character(x\$nbrMutCodons) + return(x) +} + +# translate sequences and add aa_seq +sequenceAA <- function(x) { + x <- cbind(x, "sequenceAA" = as.character(translate(DNAStringSet(x\$sequence)))) + return(x) +} + +# calculate AA hamming distances from the WT +nbrMutAAs <- function(x, wt.seq) { + x <- cbind(x, "nbrMutAAs" = rep(NA, nrow(x))) + for (i in 1:nrow(x)){ + tmp.wt <- strsplit(as.character(translate(wt.seq)), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequenceAA[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + x\$nbrMutAAs[i] <- 0 + rm(tmp.mut, tmp.wt) + next + }else{ + 
x\$nbrMutAAs[i] <- length(which(tmp.mut != tmp.wt)) + rm(tmp.mut, tmp.wt) + next + } + } + x\$nbrMutAAs <- as.character(x\$nbrMutAAs) + return(x) +} + +# mutant base labels +mutantNameBase <- function(x, wt.seq){ + x <- cbind(x, "mutantNameBase" = rep(NA, nrow(x))) + for(i in 1:nrow(x)){ + tmp.wt <- strsplit(as.character(wt.seq), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequence[i]), "")[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + x\$mutantNameBase[i] <- "f.0.WT" + rm(tmp.mut, tmp.wt) + next + }else if(length(which(tmp.mut != tmp.wt)) > 0){ + mut.pos <- which(tmp.mut != tmp.wt) + x\$mutantNameBase[i] <- paste0("f.", mut.pos, ".", tmp.mut[mut.pos], collapse = "_") + rm(tmp.mut, tmp.wt, mut.pos) + next + } + } + return(x) +} + +# mutant codon labels +mutantNameCodon <- function(x, wt.seq){ + x <- cbind(x, "mutantNameCodon" = rep(NA, nrow(x))) + for(i in 1:nrow(x)){ + tmp.wt <- strsplit(as.character(wt.seq), "(?<=.{3})", perl = TRUE)[[1]] + tmp.mut <- strsplit(as.character(x\$sequence[i]), "(?<=.{3})", perl = TRUE)[[1]] + if(length(which(tmp.mut != tmp.wt)) == 0){ + x\$mutantNameCodon[i] <- "f.0.WT" + rm(tmp.mut, tmp.wt) + next + }else if(length(which(tmp.mut != tmp.wt)) > 0){ + mut.pos <- which(tmp.mut != tmp.wt) + x\$mutantNameCodon[i] <- paste0("f.", mut.pos, ".", tmp.mut[mut.pos], collapse = "_") + rm(tmp.mut, tmp.wt, mut.pos) + next + } + } + return(x) +} + +# name the mutations (AA level) +mutantNameAA <- function(x, wt.seq) { + x <- cbind(x, "mutantNameAA" = rep(NA, nrow(x))) + for (i in 1:nrow(x)){ + if(x\$nbrMutAAs[i] == 0){ + x\$mutantNameAA[i] <- "f.0.WT" + }else{ + tmp.wt <- strsplit(as.character(translate(wt.seq)), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequenceAA[i]), "")[[1]] + tmp.pos <- which(tmp.mut != tmp.wt) + x\$mutantNameAA[i] <- paste0("f.", tmp.pos, ".", tmp.mut[tmp.pos]) + rm(tmp.mut, tmp.wt,tmp.pos) + } + } + return(x) +} + +# categorise the mutation types +mutationTypes <- function(x){ + x <- cbind(x, "mutationTypes" = 
rep(NA, nrow(x))) + x[grep("WT", x\$mutantNameAA), "mutationTypes"] <- "silent" ## silent/synonymous + x[grep("[*]", x\$mutantNameAA), "mutationTypes"] <- "stop" ## stop + x[-grep("[*]|WT", x\$mutantNameAA), "mutationTypes"] <- "nonsynonymous" ## nonsynonymous + x +} + +# mutant name +mutantName <- function(x, wt.seq){ + x <- cbind("mutantName" = rep(NA, nrow(x)), x) + for (i in 1:nrow(x)){ + if(x\$nbrMutAAs[i] == 0){ + x\$mutantName[i] <- "WT" + }else{ + tmp.wt <- strsplit(as.character(translate(wt.seq)), "")[[1]] + tmp.mut <- strsplit(as.character(x\$sequenceAA[i]), "")[[1]] + tmp.pos <- which(tmp.mut != tmp.wt) + x\$mutantName[i] <- paste0(tmp.wt[tmp.pos], tmp.pos,tmp.mut[tmp.pos]) + rm(tmp.mut, tmp.wt,tmp.pos) + } + } + return(x) +} + +# mutscan 'summaryTable' building (from merged counts TSV file) +mutscan.summaryTable.from.counts <- function(x, sample, wt.seq){ + + ### pre-process + x <- x[,c(1, grep(sample, colnames(x)))] + colnames(x) <- c("sequence", "nbrReads") + x <- cbind(x, "maxNbrReads" = x[,"nbrReads"], "nbrUmis" = x[,"nbrReads"]) + + ### nbrMutBases + x <- nbrMutBases(x, wt.seq) + + ### nbrMutCodons + x <- nbrMutCodons(x, wt.seq) + + ### sequenceAA + x <- sequenceAA(x) + + ### nbrMutAAs + x <- nbrMutAAs(x, wt.seq) + + ### varLengths + x <- cbind(x, "varLengths" = as.character(nchar(x\$sequence))) + + ### mutantNameBase + x <- mutantNameBase(x, wt.seq) + + ### mutantNameCodon + x <- mutantNameCodon(x, wt.seq) + + ### mutantNameBaseHGVS (would ideally look like this: "f:c", "f:c.32_33delinsAC", etc.) 
+ x <- cbind(x, "mutantNameBaseHGVS" = x\$mutantNameCodon) + + ### mutantNameAA + x <- mutantNameAA(x, wt.seq) + + ### mutantNameAAHGVS (would ideally look like this: "f:p" or "f:p.(Leu11His)") + x <- cbind(x, "mutantNameAAHGVS" = x\$mutantNameAA) + + ### mutationTypes: silent, stop, nonsynonymous + x <- mutationTypes(x) + + ### mutantName + x <- mutantName(x, wt.seq) + + ### re-order columns + x <- x[,c(1:7,9:16,8)] + + ### output + return(x) +} + +## --- Main function --- + +#' Run mutscan fitness estimation with configurable I/O paths +#' +#' @param counts_path Path to counts_merged.tsv +#' @param design_path Path to the experimental design TSV file +#' @param wt_seq_path Path to synonymous_wt.txt (single line DNA sequence) +#' @param output_path_edgeR,output_path_limma Paths to write the edgeR and limma fitness estimation TSV files +#' +#' @return Invisibly returns the limma logFC table; writes both tables to output_path_edgeR and output_path_limma. +run_mutscan_fitness_estimation <- function(counts_path, + design_path, + wt_seq_path, + output_path_edgeR, + output_path_limma){ + + ## 1. Import key files ## + ######################### + + merged.counts <- read.table(counts_path, sep = "\t", header = TRUE, check.names = FALSE) + exp.design <- read.table(design_path, sep = "\t", header = TRUE, check.names = FALSE) + wt.seq <- DNAString(as.character(read.table(wt_seq_path))) + + ## 2. 
Variant count matrix reformatting ## + ########################################## + + var.tables <- vector(mode = "list", length = nrow(exp.design)) + names(var.tables) <- exp.design[,"sample_name"] + var.tables <- lapply(var.tables, function(x){x <- vector(mode = "list", length = 4); + names(x) <- c("summaryTable", "filterSummary", "errorStatistics", "parameters"); + return(x)}) + + # mutscan 'summaryTable' + for(i in 1:length(var.tables)){ + print(i) + var.tables[[i]]\$summaryTable <- mutscan.summaryTable.from.counts(merged.counts, names(var.tables)[i], wt.seq) + } + + # mutscan 'filterSummary' (fill with minimal decoy) + var.tables <- lapply(var.tables, function(x){x\$filterSummary <- data.frame(NA); return(x)}) + + # mutscan 'errorStatistics' (fill with minimal decoy) + var.tables <- lapply(var.tables, function(x){x\$errorStatistics <- NA; return(x)}) + + # mutscan 'parameters' (fill with minimal decoy) + var.tables <- lapply(var.tables, function(x){x\$parameters\$mutNameDelimiter <- '.'; return(x)}) + + ## 3. summarizeExperiment object ## + ################################### + + # mutscan 'coldata' object (from experimental design TSV file) + condition <- exp.design\$selection_id + condition[which(condition == '0')] <- "input" + condition[which(condition == '1')] <- "output" + coldata <- as.data.frame(cbind("Name" = exp.design\$sample_name, + "Condition" = condition, + "Replicate" = exp.design\$experiment_replicate)) + class(coldata\$Replicate) <- "integer" + + # mutscan 'summarizeExperiment' object + se <- summarizeExperiment(x = var.tables, + coldata = coldata, + countType = "reads") + + ## 4. 
logFC calculations ## + ########################### + + # mutscan 'model.matrix' object to calculate logFC values + model.design <- model.matrix(~ Replicate + Condition, data = se@colData) + + # edgeR + logFC.edgeR <- calculateRelativeFC(se = se, + design = model.design, + coef = "Conditionoutput", + WTrows = "WT", + selAssay = "counts", + pseudocount = 1, + method = "edgeR") + + # limma + logFC.limma <- calculateRelativeFC(se = se, + design = model.design, + coef = "Conditionoutput", + WTrows = "WT", + selAssay = "counts", + pseudocount = 1, + method = "limma") + + ## 5. QC plots ## + ################# + + # raw counts comparison + pdf('mutscan_counts_corr.pdf', height = 9, width = 14) + print(plotPairs(se, selAssay = "counts", addIdentityLine = TRUE)) + dev.off() + + # edgeR volcano plot + pdf('mutscan_edgeR_volcano.pdf', height = 9, width = 14) + print(plotVolcano(logFC.edgeR, pointSize = "large")) + dev.off() + + # limma volcano plot + pdf('mutscan_limma_volcano.pdf', height = 9, width = 14) + print(plotVolcano(logFC.limma, pointSize = "large")) + dev.off() + + ## 6. 
Data export ## + #################### + + write.table(logFC.edgeR, output_path_edgeR, + col.names = TRUE, row.names = FALSE, quote = FALSE, sep = "\t", na = "") + write.table(logFC.limma, output_path_limma, + col.names = TRUE, row.names = FALSE, quote = FALSE, sep = "\t", na = "") + + invisible(logFC.edgeR) + invisible(logFC.limma) + +} + +##### +# run function +##### +run_mutscan_fitness_estimation( + counts_path = "$counts_merged", + design_path = "$exp_design", + wt_seq_path = "$syn_wt_txt", + output_path_edgeR = "fitness_estimation_mutscan_edgeR.tsv", + output_path_limma = "fitness_estimation_mutscan_limma.tsv" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +mutscan_version <- as.character(packageVersion("mutscan")) +Biostrings_version <- as.character(packageVersion("Biostrings")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(mutscan_version) == 0) mutscan_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-mutscan:', mutscan_version), + paste(' r-Biostrings:', Biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/gatk/gatk_to_fitness/environment.yml b/modules/local/gatk/gatk_to_fitness/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/gatk/gatk_to_fitness/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/gatk/gatk_to_fitness/main.nf 
b/modules/local/gatk/gatk_to_fitness/main.nf new file mode 100644 index 0000000..543ef9a --- /dev/null +++ b/modules/local/gatk/gatk_to_fitness/main.nf @@ -0,0 +1,32 @@ +process GATK_GATKTOFITNESS { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_filtered_by_library) + path wt_seq + val pos_range + + output: + tuple val(meta), path("${meta.id}_fitness_input.tsv"), emit: fitness_input + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'gatk_to_fitness.R' + + stub: + """ + touch ${meta.id}_fitness_input.tsv + echo "GATK_GATKTOFITNESS:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/gatk/gatk_to_fitness/meta.yml b/modules/local/gatk/gatk_to_fitness/meta.yml new file mode 100644 index 0000000..e9df307 --- /dev/null +++ b/modules/local/gatk/gatk_to_fitness/meta.yml @@ -0,0 +1,57 @@ +name: "gatk_gatktofitness" +description: Replaces or reformats GATK variant count data into a standardized TSV format compatible with downstream fitness calculation tools, ensuring mutation nomenclature is consistent with the reference sequence. 
+keywords: + - deep mutational scanning + - dms + - gatk + - format conversion + - fitness +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + - "bioconductor-biostrings": + description: "Efficient manipulation of biological strings" + homepage: "https://bioconductor.org/packages/Biostrings" + licence: ["Artistic-2.0"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - variantCounts_filtered_by_library: + type: file + description: CSV file containing GATK variant counts already filtered against the design library + pattern: "*.{csv}" + - - wt_seq: + type: file + description: FASTA file containing the wild-type DNA reference sequence + pattern: "*.{fasta,fa}" + - - pos_range: + type: string + description: Start and stop codon positions (ORF) in the format 'start-stop' + +output: + - fitness_input: + - meta: + type: map + description: Groovy Map containing sample information + - "${meta.id}_fitness_input.tsv": + type: file + description: A standardized TSV file ready for input into fitness calculation modules + pattern: "*_fitness_input.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/gatk/gatk_to_fitness/templates/gatk_to_fitness.R b/modules/local/gatk/gatk_to_fitness/templates/gatk_to_fitness.R new file mode 100644 index 0000000..38d1be2 --- /dev/null +++ b/modules/local/gatk/gatk_to_fitness/templates/gatk_to_fitness.R @@ -0,0 +1,103 @@ +#!/usr/bin/env Rscript + 
+suppressMessages(library(Biostrings)) + +generate_fitness_input <- function(wt_seq_path, gatk_file, pos_range, output_file_path) { + # Parse the position range + positions <- unlist(strsplit(pos_range, "-")) + start_pos <- as.numeric(positions[1]) + stop_pos <- as.numeric(positions[2]) + + # Load the wild-type sequence + seq_data <- Biostrings::readDNAStringSet(filepath = wt_seq_path) + wt_seq <- seq_data[[1]] # Extract the sequence + wt_seq <- subseq(wt_seq, start = start_pos, end = stop_pos) + + # Convert wt_seq to a character string + wt_seq <- as.character(wt_seq) + + # Split the wild-type sequence into codons (groups of 3 bases) + wt_codons <- substring(wt_seq, seq(1, nchar(wt_seq), 3), seq(3, nchar(wt_seq), 3)) + + # Helper function to process GATK CSVs into count data + process_gatk_file <- function(gatk_csv) { + # Load the input GATK CSV file + gatk_data <- read.csv(gatk_csv, stringsAsFactors = FALSE) + + # Initialize a data frame for results + results <- data.frame( + nt_seq = character(), + count = numeric(), + stringsAsFactors = FALSE + ) + + # Iterate over each row in the input data + for (i in 1:nrow(gatk_data)) { + # Extract the mutation info + codon_mut <- gatk_data\$codon_mut[i] + + if("counts_corrected" %in% colnames(gatk_data)){ + counts <- gatk_data\$counts_corrected[i] + }else{ + counts <- gatk_data\$counts[i] + } + + # Create a mutable copy of the wild-type codons + mutated_codons <- wt_codons + + # Apply the mutation + mutations <- strsplit(codon_mut, ", ")[[1]] + for (mutation in mutations) { + codon_position <- as.numeric(sub(":.*", "", mutation)) + new_codon <- sub(".*>", "", mutation) + # Replace the codon at the specified position + mutated_codons[codon_position] <- new_codon + } + + # Convert the mutated codons back to a sequence string + mutated_seq_string <- paste(mutated_codons, collapse = "") + + # Add the result to the data frame + results <- rbind(results, data.frame(nt_seq = mutated_seq_string, count = counts)) + } + + 
return(results) + } + + # Process the GATK file + cat("Processing GATK file...\\n") + processed_data <- process_gatk_file(gatk_file) + + # Write the processed data to a file without column names + write.table(processed_data, file = output_file_path, sep = "\\t", row.names = FALSE, col.names = FALSE, quote = FALSE) +} + +##### +# run function +##### +generate_fitness_input( + wt_seq_path = "$wt_seq", + gatk_file = "$variantCounts_filtered_by_library", + pos_range = "$pos_range", + output_file_path = "${meta.id}_fitness_input.tsv" +) + +##### +# create versions.yml +##### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +Biostrings_version <- as.character(packageVersion("Biostrings")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(Biostrings_version) == 0) Biostrings_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-Biostrings:', Biostrings_version) + ), + f +) +close(f) diff --git a/modules/local/gatk/saturationmutagenesis/environment.yml b/modules/local/gatk/saturationmutagenesis/environment.yml new file mode 100644 index 0000000..be64e16 --- /dev/null +++ b/modules/local/gatk/saturationmutagenesis/environment.yml @@ -0,0 +1,7 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::gatk4=4.6.2.0 + - bioconda::samtools=1.21 + - conda-forge::java-1.7.0-openjdk-conda-aarch64=1.7.0.261 diff --git a/modules/local/gatk/saturationmutagenesis/main.nf b/modules/local/gatk/saturationmutagenesis/main.nf new file mode 100644 index 0000000..1ec3921 --- /dev/null +++ b/modules/local/gatk/saturationmutagenesis/main.nf @@ -0,0 +1,56 @@ +process GATK_SATURATIONMUTAGENESIS { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/gatk4_samtools_java-1.7.0-openjdk-conda-aarch64:7c1f89018b5d5103" + + input: + tuple val(meta), path(premerged_reads) + path wt_seq + val pos_range + 
val min_counts + + output: + tuple val(meta), path("gatk_output.variantCounts"), emit: variantCounts // to access the output + tuple val(meta), path("gatk_output.*"), emit: gatk_output + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + # Index reference + samtools faidx $wt_seq + gatk CreateSequenceDictionary -R $wt_seq + + # Read start and stop codon from input + start_stop_codon="$pos_range" + + # Run GATK AnalyzeSaturationMutagenesis + gatk AnalyzeSaturationMutagenesis \ + -I $premerged_reads \ + -R $wt_seq \ + --orf \$start_stop_codon \ + --paired-mode false \ + --min-q 30 \ + --min-variant-obs $min_counts \ + -O gatk_output + + # Save versions + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(samtools --version |& sed '1!d ; s/samtools //') + gatk: \$(gatk --version |& sed 's/^.*GATK/\1/') +END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch gatk_output.variantCounts + touch versions.yml + """ +} diff --git a/modules/local/gatk/saturationmutagenesis/meta.yml b/modules/local/gatk/saturationmutagenesis/meta.yml new file mode 100644 index 0000000..de457df --- /dev/null +++ b/modules/local/gatk/saturationmutagenesis/meta.yml @@ -0,0 +1,67 @@ +name: "gatk_saturationmutagenesis" +description: Uses GATK AnalyzeSaturationMutagenesis to identify and quantify variants from premerged reads in a Deep Mutational Scanning (DMS) experiment. 
+keywords: + - variant calling + - dms + - gatk + - saturation mutagenesis + - counting +tools: + - "gatk4": + description: "Genome Analysis Toolkit (GATK) for variant discovery and high-throughput sequencing analysis" + homepage: "https://gatk.broadinstitute.org/hc/en-us" + documentation: "https://gatk.broadinstitute.org/hc/en-us/articles/360037593731-AnalyzeSaturationMutagenesis-BETA" + licence: ["BSD-3-Clause"] + - "samtools": + description: "Tools for manipulating next-generation sequencing data" + homepage: "http://www.htslib.org/" + documentation: "http://www.htslib.org/doc/samtools.html" + licence: ["MIT/Expat"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - premerged_reads: + type: file + description: BAM/SAM file containing premerged reads + pattern: "*.{bam,sam}" + - - wt_seq: + type: file + description: FASTA file containing the wild-type reference sequence + pattern: "*.{fasta,fa}" + - - pos_range: + type: string + description: Start and stop codon positions (ORF) in format 'start-stop' + - - min_counts: + type: integer + description: Minimum number of variant observations required to report a variant + +output: + - variantCounts: + - meta: + type: map + description: Groovy Map containing sample information + - gatk_output.variantCounts: + type: file + description: The primary GATK output file containing mutation counts + pattern: "*.variantCounts" + - gatk_output: + - meta: + type: map + description: Groovy Map containing sample information + - gatk_output.*: + type: file + description: All files generated by the GATK AnalyzeSaturationMutagenesis command + pattern: "gatk_output.*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/counts_heatmap/environment.yml 
b/modules/local/visualization/counts_heatmap/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/counts_heatmap/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/counts_heatmap/main.nf b/modules/local/visualization/counts_heatmap/main.nf new file mode 100644 index 0000000..4ef05dd --- /dev/null +++ b/modules/local/visualization/counts_heatmap/main.nf @@ -0,0 +1,31 @@ +process VISUALIZATION_COUNTS_HEATMAP { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_for_heatmaps) + val min_counts + + output: + tuple val(meta), path("counts_heatmap.pdf"), emit: counts_heatmap + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'counts_heatmap.R' + + stub: + """ + touch counts_heatmap.pdf + echo "VISUALIZATION_COUNTS_HEATMAP:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/visualization/counts_heatmap/meta.yml b/modules/local/visualization/counts_heatmap/meta.yml new file mode 100644 index 0000000..41ac5d7 --- /dev/null +++ b/modules/local/visualization/counts_heatmap/meta.yml @@ -0,0 +1,51 @@ +name: "visualization_counts_heatmap" +description: Generates a comprehensive heatmap of variant counts, visualizing the mutation landscape across the entire reference sequence. +keywords: + - deep mutational scanning + - dms + - visualization + - heatmap + - mutation landscape +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - variantCounts_for_heatmaps: + type: file + description: CSV file containing processed variant counts formatted for heatmap generation + pattern: "*.{csv}" + - - min_counts: + type: integer + description: Minimum count threshold used to filter variants before plotting + +output: + - counts_heatmap: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - counts_heatmap.pdf: + type: file + description: PDF heatmap visualizing mutation frequencies across the sequence positions + pattern: "counts_heatmap.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/counts_heatmap/templates/counts_heatmap.R b/modules/local/visualization/counts_heatmap/templates/counts_heatmap.R new file mode 100644 index 0000000..44ecb8d --- /dev/null +++ b/modules/local/visualization/counts_heatmap/templates/counts_heatmap.R @@ -0,0 +1,187 @@ +#!/usr/bin/env Rscript + +# Input: prepared GATK data path, output_path, threshold (same as used for prepare_gatk_data_for_counts_per_cov_heatmap function) +# Output: counts_per_cov_heatmap.pdf + +library(dplyr) +library(ggplot2) + +counts_heatmap <- function(input_csv_path, threshold = 3, output_pdf_path, img_format = "pdf") { + + # Inner function to add padding to the last row, adding 21 amino acids per position + pad_heatmap_data_long <- function(heatmap_data_long, min_non_na_value, num_positions_per_row = 75) { + all_amino_acids <- c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*") + + max_position <- max(heatmap_data_long\$position) + num_missing_positions <- num_positions_per_row - (max_position %% num_positions_per_row) + + if (num_missing_positions < num_positions_per_row) { + new_positions <- (max_position 
+ 1):(max_position + num_missing_positions) + + # Add all 21 amino acid variants for each new position + padding_data <- expand.grid( + mut_aa = all_amino_acids, # All possible amino acids + position = new_positions # New positions to be padded + ) + + # Set placeholder values for the added positions to the exact smallest non-NA value + padding_data\$total_counts <- min_non_na_value # Set to the smallest non-NA value + padding_data\$wt_aa <- "Y" # Set wild-type amino acid to 'Y' + padding_data\$wt_aa_pos <- paste0("Y", padding_data\$position) # Create wt_aa_pos with correct positions + padding_data\$row_group <- max(heatmap_data_long\$row_group) # Set row group to the current last group + + # Add the new padding rows to heatmap_data_long + heatmap_data_long <- dplyr::bind_rows(heatmap_data_long, padding_data) + } + + return(heatmap_data_long) + } + + # Load the CSV data + heatmap_data <- read.csv(input_csv_path) + + # Check if the necessary column exists in the data + if (!"total_counts" %in% colnames(heatmap_data)) { + stop("The column 'total_counts' is not found in the data.") + } + + # Create heatmap_data_long by selecting necessary columns + heatmap_data_long <- heatmap_data %>% + select(mut_aa, position, total_counts, wt_aa) # Use 'total_counts' + + # Find the smallest non-NA value in total_counts + min_non_na_value <- min(heatmap_data_long\$total_counts, na.rm = TRUE) + + # Group positions by rows (75 positions per row) and calculate row_group + heatmap_data_long <- heatmap_data_long %>% + mutate(row_group = ((position - 1) %/% 75) + 1) # Grouping positions into rows + + # Apply padding to add missing positions at the end of the last row, using the calculated min value + heatmap_data_long <- pad_heatmap_data_long(heatmap_data_long, min_non_na_value) + + # Convert positions to numeric, sort them, and create wt_aa_pos for the plot + heatmap_data_long <- heatmap_data_long %>% + mutate(position = as.numeric(position)) %>% # Ensure position is numeric + 
arrange(position) %>% # Sort by position + mutate(wt_aa_pos = factor(paste0(wt_aa, position), levels = unique(paste0(wt_aa, position)))) # Create sorted factor levels for wt_aa_pos + + # Add a column to identify synonymous mutations (where mut_aa == wt_aa) + heatmap_data_long <- heatmap_data_long %>% + mutate(synonymous = mut_aa == wt_aa) + + # Define the correct order of the amino acids + amino_acid_order <- rev(c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*")) + + heatmap_data_long <- heatmap_data_long %>% + mutate(mut_aa = factor(mut_aa, levels = amino_acid_order)) + + # Process heatmap_data_long and create syn_positions at the same time + syn_positions <- heatmap_data_long %>% + mutate(mut_aa = factor(mut_aa, levels = amino_acid_order), + # Compute the x-coordinate, which always runs from 1 to 75 within each group + x = as.numeric(factor(wt_aa_pos, levels = unique(wt_aa_pos))) - ((row_group - 1) * 75), + y = as.numeric(factor(mut_aa, levels = amino_acid_order))) %>% + filter(synonymous == TRUE) + + # Calculate the number of row groups and adjust plot height dynamically + num_row_groups <- max(heatmap_data_long\$row_group) + plot_height <- num_row_groups * 4 + + # Set the limits for the color scale, ignoring NA (negative values are now NA) + min_count <- min(heatmap_data_long\$total_counts, na.rm = TRUE) + max_count <- max(heatmap_data_long\$total_counts, na.rm = TRUE) + max_position <- max(heatmap_data\$position) + + # Create the heatmap plot with explicit handling for positions > max_position + heatmap_plot <- ggplot(heatmap_data_long, aes(x = wt_aa_pos, y = mut_aa, fill = total_counts)) + + scale_fill_gradientn(colours = c(alpha("blue", 0), "blue"), na.value = "grey35", trans = "log", # Apply log transformation to the scale + limits = c(min_count, max_count), + breaks = scales::trans_breaks("log10", function(x) 10^x), # Logarithmic scale breaks + labels = scales::trans_format("log10",
scales::math_format(10^.x))) + + scale_x_discrete(labels = function(x) { + numeric_pos <- as.numeric(gsub("[^0-9]", "", x)) + ifelse(numeric_pos > max_position, " ", x) + }) + + geom_tile() + + + # Add diagonal lines for synonymous mutations using geom_segment + geom_segment(data = syn_positions[syn_positions\$position <= max_position, ], + aes(x = x - 0.485, xend = x + 0.485, + y = y - 0.485, yend = y + 0.485, color = synonymous), + size = 0.2) + + + # Manual color scale for the diagonal lines + scale_color_manual(values = c("TRUE" = "grey10"), labels = c("TRUE" = "")) + + + theme_minimal() + + labs(title = "Heatmap of Counts per Variant", x = "Wild-type Amino Acid", y = "Mutant Amino Acid", fill = "Counts", color = "Synonymous Mutation") + + theme(plot.title = element_text(size = 16, face = "bold"), + axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 12), + axis.text.y = element_text(size = 12), # Larger y-axis labels + axis.title.x = element_text(size = 16), + axis.title.y = element_text(size = 16), + legend.title = element_text(size = 14), # Larger legend title + legend.text = element_text(size = 12), # Larger legend text + panel.spacing = unit(0.1, "lines"), # Adjust panel spacing + strip.text = element_blank(), # Remove row group labels (facet numbers) + strip.background = element_blank(), + panel.grid.major = element_blank(), # Remove major grid lines + panel.grid.minor = element_blank()) + # Remove minor grid lines + facet_wrap(~ row_group, scales = "free_x", ncol = 1) + # Group by 75 positions per row + theme(panel.spacing = unit(0.2, "lines")) + + heatmap_plot <- heatmap_plot + + geom_point(data = heatmap_data_long, aes(size = ""), colour = "black", alpha = 0) # Invisible points for legend + heatmap_plot <- heatmap_plot + + guides(size = guide_legend(paste("Dropout (Counts <", threshold, ")"), override.aes = list(shape = 15, size = 8, colour = "grey35", alpha = 1))) # Define Legend for Dropouts + + # Save the heatmap plot + if
(img_format == "pdf") { + ggsave(output_pdf_path, plot = heatmap_plot, width = 16, height = plot_height, dpi = 150, device = cairo_pdf) + } else { + ggsave(output_pdf_path, plot = heatmap_plot, width = 16, height = plot_height, dpi = 150) + } + + if (file.exists(output_pdf_path)) { + print("Heatmap image successfully created!") + } else { + print("Error: Heatmap image was not created.") + } +} + +##### +# run function +##### +counts_heatmap( + input_csv_path = "$variantCounts_for_heatmaps", + threshold = $min_counts, + output_pdf_path = "counts_heatmap.pdf", + img_format = "pdf" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version) + ), + f +) +close(f) diff --git a/modules/local/visualization/counts_per_cov/environment.yml b/modules/local/visualization/counts_per_cov/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/counts_per_cov/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/counts_per_cov/main.nf 
b/modules/local/visualization/counts_per_cov/main.nf new file mode 100644 index 0000000..e19bd40 --- /dev/null +++ b/modules/local/visualization/counts_per_cov/main.nf @@ -0,0 +1,31 @@ +process VISUALIZATION_COUNTS_PER_COV { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_for_heatmaps) + val min_counts + + output: + tuple val(meta), path("counts_per_cov_heatmap.pdf"), emit: counts_per_cov_heatmap + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'counts_per_cov_heatmap.R' + + stub: + """ + touch counts_per_cov_heatmap.pdf + echo "VISUALIZATION_COUNTS_PER_COV:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/visualization/counts_per_cov/meta.yml b/modules/local/visualization/counts_per_cov/meta.yml new file mode 100644 index 0000000..354351c --- /dev/null +++ b/modules/local/visualization/counts_per_cov/meta.yml @@ -0,0 +1,49 @@ +name: "visualization_counts_per_cov" +description: "Generates a heatmap visualizing the distribution of variant counts per coverage depth in Deep Mutational Scanning data." +keywords: + - deep mutational scanning + - dms + - visualization + - heatmap + - coverage +tools: + - "r-base": + description: "R is a free software environment for statistical computing and graphics" + homepage: "https://www.r-project.org/" + documentation: "https://cran.r-project.org/manuals.html" + licence: ["GPL-2.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - variantCounts_for_heatmaps: + type: file + description: CSV file containing processed variant counts formatted for heatmap generation + pattern: "*.{csv}" + - - min_counts: + type: integer + description: Minimum count threshold used to filter variants for visualization + +output: + - counts_per_cov_heatmap: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - counts_per_cov_heatmap.pdf: + type: file + description: PDF heatmap showing variant counts distributed across coverage depths + pattern: "counts_per_cov_heatmap.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/counts_per_cov/templates/counts_per_cov_heatmap.R b/modules/local/visualization/counts_per_cov/templates/counts_per_cov_heatmap.R new file mode 100644 index 0000000..77518b7 --- /dev/null +++ b/modules/local/visualization/counts_per_cov/templates/counts_per_cov_heatmap.R @@ -0,0 +1,187 @@ +#!/usr/bin/env Rscript + +# Input: prefiltered GATK data path, output_path, threshold (same as used for prepare_gatk_data_for_counts_per_cov_heatmap function) +# Output: counts_per_cov_heatmap.pdf + +library(dplyr) +library(ggplot2) + +counts_per_cov_heatmap <- function(input_csv_path, threshold = 3, output_pdf_path, img_format = "pdf") { + + # Inner function to add padding to the last row, adding 21 amino acids per position + pad_heatmap_data_long <- function(heatmap_data_long, min_non_na_value, num_positions_per_row = 75) { + all_amino_acids <- c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*") + + max_position <- max(heatmap_data_long\$position) + num_missing_positions <- num_positions_per_row - (max_position %% num_positions_per_row) + + if (num_missing_positions < 
num_positions_per_row) { + new_positions <- (max_position + 1):(max_position + num_missing_positions) + + # Add all 21 amino acid variants for each new position + padding_data <- expand.grid( + mut_aa = all_amino_acids, # All possible amino acids + position = new_positions # New positions to be padded + ) + + # Set placeholder values for the added positions to the exact smallest non-NA value + padding_data\$total_counts_per_cov <- min_non_na_value # Set to the smallest non-NA value + padding_data\$wt_aa <- "Y" # Set wild-type amino acid to 'Y' + padding_data\$wt_aa_pos <- paste0("Y", padding_data\$position) # Create wt_aa_pos with correct positions + padding_data\$row_group <- max(heatmap_data_long\$row_group) # Set row group to the current last group + + # Add the new padding rows to heatmap_data_long + heatmap_data_long <- dplyr::bind_rows(heatmap_data_long, padding_data) + } + + return(heatmap_data_long) + } + + # Load the CSV data + heatmap_data <- read.csv(input_csv_path) + + # Check if the necessary column exists in the data + if (!"total_counts_per_cov" %in% colnames(heatmap_data)) { + stop("The column 'total_counts_per_cov' is not found in the data.") + } + + # Create heatmap_data_long by selecting necessary columns + heatmap_data_long <- heatmap_data %>% + select(mut_aa, position, total_counts_per_cov, wt_aa) # Use 'total_counts_per_cov' + + # Find the smallest non-NA value in total_counts_per_cov + min_non_na_value <- min(heatmap_data_long\$total_counts_per_cov, na.rm = TRUE) + + # Group positions by rows (75 positions per row) and calculate row_group + heatmap_data_long <- heatmap_data_long %>% + mutate(row_group = ((position - 1) %/% 75) + 1) # Grouping positions into rows + + # Apply padding to add missing positions at the end of the last row, using the calculated min value + heatmap_data_long <- pad_heatmap_data_long(heatmap_data_long, min_non_na_value) + + # Convert positions to numeric, sort them, and create wt_aa_pos for the plot + 
heatmap_data_long <- heatmap_data_long %>% + mutate(position = as.numeric(position)) %>% # Ensure position is numeric + arrange(position) %>% # Sort by position + mutate(wt_aa_pos = factor(paste0(wt_aa, position), levels = unique(paste0(wt_aa, position)))) # Create sorted factor levels for wt_aa_pos + + # Add a column to identify synonymous mutations (where mut_aa == wt_aa) + heatmap_data_long <- heatmap_data_long %>% + mutate(synonymous = mut_aa == wt_aa) + + # Define the correct order of the amino acids + amino_acid_order <- rev(c("G", "A", "V", "L", "M", "I", "F", + "Y", "W", "K", "R", "H", "D", "E", + "S", "T", "C", "N", "Q", "P", "*")) + + heatmap_data_long <- heatmap_data_long %>% + mutate(mut_aa = factor(mut_aa, levels = amino_acid_order)) + + # Process heatmap_data_long and create syn_positions at the same time + syn_positions <- heatmap_data_long %>% + mutate(mut_aa = factor(mut_aa, levels = amino_acid_order), + # Compute the x-coordinate, which always runs from 1 to 75 within each group + x = as.numeric(factor(wt_aa_pos, levels = unique(wt_aa_pos))) - ((row_group - 1) * 75), + y = as.numeric(factor(mut_aa, levels = amino_acid_order))) %>% + filter(synonymous == TRUE) + + # Calculate the number of row groups and adjust plot height dynamically + num_row_groups <- max(heatmap_data_long\$row_group) + plot_height <- num_row_groups * 4 + + # Set the limits for the color scale, ignoring NA (negative values are now NA) + min_count <- min(heatmap_data_long\$total_counts_per_cov, na.rm = TRUE) + max_count <- max(heatmap_data_long\$total_counts_per_cov, na.rm = TRUE) + max_position <- max(heatmap_data\$position) + + # Create the heatmap plot with explicit handling for positions > max_position + heatmap_plot <- ggplot(heatmap_data_long, aes(x = wt_aa_pos, y = mut_aa, fill = total_counts_per_cov)) + + scale_fill_gradientn(colours = c(alpha("blue", 0), "blue"), na.value = "grey35", trans = "log", # Apply log transformation to the scale + limits = c(min_count,
max_count), + breaks = scales::trans_breaks("log10", function(x) 10^x), # Logarithmic scale breaks + labels = scales::trans_format("log10", scales::math_format(10^.x))) + + scale_x_discrete(labels = function(x) { + numeric_pos <- as.numeric(gsub("[^0-9]", "", x)) + ifelse(numeric_pos > max_position, " ", x) + }) + + geom_tile() + + + # Add diagonal lines for synonymous mutations using geom_segment + geom_segment(data = syn_positions[syn_positions\$position <= max_position, ], + aes(x = x - 0.485, xend = x + 0.485, + y = y - 0.485, yend = y + 0.485, color = synonymous), + size = 0.2) + + + # Manual color scale for the diagonal lines + scale_color_manual(values = c("TRUE" = "grey10"), labels = c("TRUE" = "")) + + + theme_minimal() + + labs(title = "Heatmap of Counts per Coverage for Mutations", x = "Wild-type Amino Acid", y = "Mutant Amino Acid", fill = "Counts per \\n Coverage", color = "Synonymous Mutation") + + theme(plot.title = element_text(size = 16, face = "bold"), + axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 12), + axis.text.y = element_text(size = 12), # Larger y-axis labels + axis.title.x = element_text(size = 16), + axis.title.y = element_text(size = 16), + legend.title = element_text(size = 14), # Larger legend title + legend.text = element_text(size = 12), # Larger legend text + panel.spacing = unit(0.1, "lines"), # Adjust panel spacing + strip.text = element_blank(), # Remove row group labels (facet numbers) + strip.background = element_blank(), + panel.grid.major = element_blank(), # Remove major grid lines + panel.grid.minor = element_blank()) + # Remove minor grid lines + facet_wrap(~ row_group, scales = "free_x", ncol = 1) + # Group by 75 positions per row + theme(panel.spacing = unit(0.2, "lines")) + + heatmap_plot <- heatmap_plot + + geom_point(data = heatmap_data_long, aes(size = ""), colour = "black", alpha = 0) # Invisible points for legend + heatmap_plot <- heatmap_plot + + guides(size =
guide_legend(paste("Dropout (Counts <", threshold, ")"), override.aes = list(shape = 15, size = 8, colour = "grey35", alpha = 1))) # Define Legend for Dropouts + + # Save the heatmap plot + if (img_format == "pdf") { + ggsave(output_pdf_path, plot = heatmap_plot, width = 16, height = plot_height, dpi = 150, device = cairo_pdf) + } else { + ggsave(output_pdf_path, plot = heatmap_plot, width = 16, height = plot_height, dpi = 150) + } + + if (file.exists(output_pdf_path)) { + print("Heatmap image successfully created!") + } else { + print("Error: Heatmap image was not created.") + } +} + +##### +# run function +##### +counts_per_cov_heatmap( + input_csv_path = "$variantCounts_for_heatmaps", + threshold = $min_counts, + output_pdf_path = "counts_per_cov_heatmap.pdf", + img_format = "pdf" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version) + ), + f +) +close(f) diff --git a/modules/local/visualization/global_pos_biases_counts/environment.yml b/modules/local/visualization/global_pos_biases_counts/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/global_pos_biases_counts/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 
+ - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/global_pos_biases_counts/main.nf b/modules/local/visualization/global_pos_biases_counts/main.nf new file mode 100644 index 0000000..f28fb1d --- /dev/null +++ b/modules/local/visualization/global_pos_biases_counts/main.nf @@ -0,0 +1,34 @@ +process VISUALIZATION_GLOBAL_POS_BIASES_COUNTS { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_filtered_by_library) + path aa_seq + val sliding_window_size + + output: + tuple val(meta), path("rolling_counts.pdf"), emit: rolling_counts + tuple val(meta), path("rolling_counts_per_cov.pdf"), emit: rolling_counts_per_cov + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'global_position_biases_counts_and_counts_per_cov.R' + + stub: + """ + touch rolling_counts.pdf + touch rolling_counts_per_cov.pdf + echo "VISUALIZATION_GLOBAL_POS_BIASES_COUNTS:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/visualization/global_pos_biases_counts/meta.yml b/modules/local/visualization/global_pos_biases_counts/meta.yml new file mode 100644 index 0000000..8a02b08 --- /dev/null +++ b/modules/local/visualization/global_pos_biases_counts/meta.yml @@ -0,0 +1,61 @@ +name: "visualization_global_pos_biases_counts" +description: Calculates and visualizes the rolling mean of variant counts and counts normalized by coverage across the reference sequence to identify regional 
biases or mutation hotspots. +keywords: + - deep mutational scanning + - dms + - visualization + - rolling mean + - sliding window + - bias +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - variantCounts_filtered_by_library: + type: file + description: CSV file containing variant counts filtered against the target library + pattern: "*.{csv}" + - - aa_seq: + type: file + description: Text or FASTA file containing the reference amino acid sequence + - - sliding_window_size: + type: integer + description: The size of the window (number of amino acids) used for calculating the rolling mean + +output: + - rolling_counts: + - meta: + type: map + description: Groovy Map containing sample information + - rolling_counts.pdf: + type: file + description: PDF plot showing the rolling mean of raw variant counts across positions + pattern: "rolling_counts.pdf" + - rolling_counts_per_cov: + - meta: + type: map + description: Groovy Map containing sample information + - rolling_counts_per_cov.pdf: + type: file + description: PDF plot showing the rolling mean of counts normalized by local coverage + pattern: "rolling_counts_per_cov.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/global_pos_biases_counts/templates/global_position_biases_counts_and_counts_per_cov.R 
b/modules/local/visualization/global_pos_biases_counts/templates/global_position_biases_counts_and_counts_per_cov.R new file mode 100644 index 0000000..2ec9d91 --- /dev/null +++ b/modules/local/visualization/global_pos_biases_counts/templates/global_position_biases_counts_and_counts_per_cov.R @@ -0,0 +1,186 @@ +#!/usr/bin/env Rscript + +# input: prefiltered gatk path, aa seq path, window_size (sliding window), output_path_folder +# output: two lineplots showing counts & counts per coverage divided in types of variants (single-/double-/triple-base exchange) + +library(zoo) # sliding window +library(dplyr) +library(ggplot2) +library(scales) + +position_biases <- function(prefiltered_gatk_path, aa_seq_path, window_size = 10) { + + # Load and process the data + prefiltered_gatk <- read.table(prefiltered_gatk_path, sep = ",", fill = NA, header = TRUE) + prefiltered_gatk\$pos <- as.numeric(sub("(\\\\D)(\\\\d+)(\\\\D)", "\\\\2", prefiltered_gatk\$pos_mut)) + unique_pos <- unique(as.numeric(prefiltered_gatk\$pos)) + aa_seq <- readLines(aa_seq_path, warn = FALSE) + aa_seq_length <- nchar(aa_seq) + aa_positions <- seq(nchar(aa_seq)) + + means_counts_single <- rep(NA, nchar(aa_seq)) + means_counts_double <- rep(NA, nchar(aa_seq)) + means_counts_triple <- rep(NA, nchar(aa_seq)) + means_counts_per_cov_single <- rep(NA, nchar(aa_seq)) + means_counts_per_cov_double <- rep(NA, nchar(aa_seq)) + means_counts_per_cov_triple <- rep(NA, nchar(aa_seq)) + + + # Loop through each position in the amino acid sequence + for (i in 1:(nchar(aa_seq))) { + + # Filter the data for the current position in aa_positions + window_data <- prefiltered_gatk %>% filter(prefiltered_gatk\$pos %in% aa_positions[i]) + + # Calculate mean for Single mutations (where varying_bases == 1) + window_data_single <- window_data %>% filter(varying_bases == 1) + means_counts_single[i] <- mean(window_data_single\$counts, na.rm = FALSE) + means_counts_per_cov_single[i] <- mean(window_data_single\$counts_per_cov, na.rm = 
FALSE) + + # Calculate mean for Double mutations (where varying_bases == 2) + window_data_double <- window_data %>% filter(varying_bases == 2) + means_counts_double[i] <- mean(window_data_double\$counts, na.rm = FALSE) + means_counts_per_cov_double[i] <- mean(window_data_double\$counts_per_cov, na.rm = FALSE) + + # Calculate mean for Triple mutations (where varying_bases == 3) + window_data_triple <- window_data %>% filter(varying_bases == 3) + means_counts_triple[i] <- mean(window_data_triple\$counts, na.rm = FALSE) + means_counts_per_cov_triple[i] <- mean(window_data_triple\$counts_per_cov, na.rm = FALSE) + } + + pos_bias_df <- data.frame(pos = seq(nchar(aa_seq))) + + + pos_bias_df\$rolling_mean_counts_single <- rollapply(means_counts_single, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + pos_bias_df\$rolling_mean_counts_double <- rollapply(means_counts_double, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + pos_bias_df\$rolling_mean_counts_triple <- rollapply(means_counts_triple, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + + pos_bias_df\$rolling_mean_counts_per_cov_single <- rollapply(means_counts_per_cov_single, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + pos_bias_df\$rolling_mean_counts_per_cov_double <- rollapply(means_counts_per_cov_double, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + pos_bias_df\$rolling_mean_counts_per_cov_triple <- rollapply(means_counts_per_cov_triple, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE) + 0.00001), fill = "extend") + + + # Replace remaining NAs in the rolling means with a small pseudocount so the log10 y-axis stays defined + pos_bias_df\$rolling_mean_counts_single[is.na(pos_bias_df\$rolling_mean_counts_single)] <- 0.00001 + pos_bias_df\$rolling_mean_counts_double[is.na(pos_bias_df\$rolling_mean_counts_double)] <- 0.00001 + 
pos_bias_df\$rolling_mean_counts_triple[is.na(pos_bias_df\$rolling_mean_counts_triple)] <- 0.00001 + pos_bias_df\$rolling_mean_counts_per_cov_single[is.na(pos_bias_df\$rolling_mean_counts_per_cov_single)] <- 0.000001 + pos_bias_df\$rolling_mean_counts_per_cov_double[is.na(pos_bias_df\$rolling_mean_counts_per_cov_double)] <- 0.000001 + pos_bias_df\$rolling_mean_counts_per_cov_triple[is.na(pos_bias_df\$rolling_mean_counts_per_cov_triple)] <- 0.000001 + + + + plots_theme <- list( + + # Add the minimal theme to the list + theme_minimal(), + + # Customize legend title and appearance + theme(legend.title = element_text(size = 10, face = "bold"), + legend.position = "right"), + + # Customize guides for the legend elements + guides( + color = guide_legend(title = "Mutation Type", order = 1), # Title for line color + fill = guide_legend(title = "Standard Deviation", order = 2), # Title for ribbon fill + linetype = guide_legend(title = "Required Coverage", order = 3) # Title for line type + ) + ) + + + # Placeholder name for the linetype + linetype_placeholder <- "Required coverage" + + rolling_counts_plot <- ggplot(pos_bias_df, aes(x = pos)) + + + # Add line for rolling mean coverage + geom_line(aes(y = rolling_mean_counts_single, color = "One Varying Base")) + + geom_line(aes(y = rolling_mean_counts_double, color = "Two Varying Bases")) + + geom_line(aes(y = rolling_mean_counts_triple, color = "Three Varying Bases")) + + + # Individual axis labels for this plot + xlab("Amino Acid Position") + + ylab("Counts") + + + # Manually set color and fill labels + scale_color_manual(name = "Mutation Type", + values = c("One Varying Base" = "chocolate", "Two Varying Bases" = "darkolivegreen3", "Three Varying Bases" = "deepskyblue1"), + limits = c("One Varying Base", "Two Varying Bases", "Three Varying Bases")) + # Color for line + + # Apply the saved theme and design elements at the end + plots_theme + + + scale_y_continuous(trans = 'log10', labels = scales::comma) + + + + + 
rolling_counts_per_cov_plot <- ggplot(pos_bias_df, aes(x = pos)) + + + # Add one line per mutation type for the rolling mean counts per coverage + geom_line(aes(y = rolling_mean_counts_per_cov_single, color = "One Varying Base")) + + geom_line(aes(y = rolling_mean_counts_per_cov_double, color = "Two Varying Bases")) + + geom_line(aes(y = rolling_mean_counts_per_cov_triple, color = "Three Varying Bases")) + + + # Individual axis labels for this plot + xlab("Amino Acid Position") + + ylab("Counts per Coverage") + + + # Manually set color and fill labels + scale_color_manual(name = "Mutation Type", + values = c("One Varying Base" = "chocolate", "Two Varying Bases" = "darkolivegreen3", "Three Varying Bases" = "deepskyblue1"), + limits = c("One Varying Base", "Two Varying Bases", "Three Varying Bases")) + # Color and legend order for lines + + # Apply the saved theme and design elements at the end + plots_theme + + + scale_y_continuous(trans = 'log10', labels = scales::comma) + + + ggsave(filename = "rolling_counts.pdf", plot = rolling_counts_plot, device = "pdf", width = 10, height = 6) + ggsave(filename = "rolling_counts_per_cov.pdf", plot = rolling_counts_per_cov_plot, device = "pdf", width = 10, height = 6) +} + +##### +# run pipeline +##### +position_biases( + prefiltered_gatk_path = "$variantCounts_filtered_by_library", + aa_seq_path = "$aa_seq", + window_size = $sliding_window_size +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) +scales_version <- as.character(packageVersion("scales")) +zoo_version <- as.character(packageVersion("zoo")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" +if (length(scales_version) == 0) scales_version <- "unknown" +if (length(zoo_version) == 0) zoo_version <- "unknown" + +f <- 
file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version), + paste(' r-scales:', scales_version), + paste(' r-zoo:', zoo_version) + ), + f +) +close(f) diff --git a/modules/local/visualization/global_pos_biases_cov/environment.yml b/modules/local/visualization/global_pos_biases_cov/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/global_pos_biases_cov/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/global_pos_biases_cov/main.nf b/modules/local/visualization/global_pos_biases_cov/main.nf new file mode 100644 index 0000000..b6b18e3 --- /dev/null +++ b/modules/local/visualization/global_pos_biases_cov/main.nf @@ -0,0 +1,33 @@ +process VISUALIZATION_GLOBAL_POS_BIASES_COV { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_filtered_by_library) + path aa_seq + val sliding_window_size + val aimed_cov + + output: + tuple val(meta), path("rolling_coverage.pdf"), emit: rolling_coverage + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'global_position_biases_cov.R' + + stub: + """ + touch rolling_coverage.pdf + echo "VISUALIZATION_GLOBAL_POS_BIASES_COV:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/visualization/global_pos_biases_cov/meta.yml b/modules/local/visualization/global_pos_biases_cov/meta.yml new file mode 100644 index 0000000..c0b6216 --- /dev/null +++ b/modules/local/visualization/global_pos_biases_cov/meta.yml @@ -0,0 +1,56 @@ +name: "visualization_global_pos_biases_cov" +description: Calculates and plots the rolling mean coverage across the protein sequence to identify positional biases, regions of low coverage, or potential experimental dropouts. +keywords: + - deep mutational scanning + - dms + - visualization + - coverage + - rolling window + - bias +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - variantCounts_filtered_by_library: + type: file + description: CSV file containing variant counts filtered against the target library + pattern: "*.{csv}" + - - aa_seq: + type: file + description: Text file containing the reference amino acid sequence + - - sliding_window_size: + type: integer + description: The size of the window (in amino acids) used for calculating the rolling mean coverage + - - aimed_cov: + type: integer + description: Targeted counts per amino acid variant, used to draw a horizontal reference line at the coverage required to reach them (21 x sequence length x this value) + +output: + - rolling_coverage: + - meta: + type: map + description: Groovy Map containing sample information + - rolling_coverage.pdf: + type: file + description: PDF plot showing the rolling mean coverage across sequence positions + pattern: "rolling_coverage.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/global_pos_biases_cov/templates/global_position_biases_cov.R b/modules/local/visualization/global_pos_biases_cov/templates/global_position_biases_cov.R new file mode 100644 index 0000000..0a2fda1 --- /dev/null +++ b/modules/local/visualization/global_pos_biases_cov/templates/global_position_biases_cov.R @@ -0,0 +1,121 @@ +#!/usr/bin/env Rscript + +# input: prefiltered gatk path, aa seq path, window_size (sliding window), output_path_folder, aimed counts per aa variant +# output: lineplot showing coverage over positions and a dotted line marking the coverage one needs to reach the aimed counts per variant, assuming all 21 potential variants per position are equally distributed. 
+ +library(zoo) # sliding window +library(dplyr) +library(ggplot2) +library(scales) +position_biases <- function(prefiltered_gatk_path, aa_seq_path, window_size = 10, output_file_path, targeted_counts_per_aa_variant = 100) { + + # Load and process the data + prefiltered_gatk <- read.table(prefiltered_gatk_path, sep = ",", fill = NA, header = TRUE) + prefiltered_gatk\$pos <- as.numeric(sub("(\\\\D)(\\\\d+)(\\\\D)", "\\\\2", prefiltered_gatk\$pos_mut)) + unique_pos <- unique(as.numeric(prefiltered_gatk\$pos)) + aa_seq <- readLines(aa_seq_path, warn = FALSE) + aa_seq_length <- nchar(aa_seq) + aa_positions <- seq(nchar(aa_seq)) + + means_cov <- rep(NA, nchar(aa_seq)) + + # Calculate means for cov (should be the same over all variants in one position) + for (i in 1:(nchar(aa_seq))) { + window_data <- prefiltered_gatk %>% filter(prefiltered_gatk\$pos %in% aa_positions[i]) + means_cov[i] <- mean(window_data\$cov, na.rm = FALSE) + } + + pos_bias_df <- data.frame(pos = seq(nchar(aa_seq))) + + # Rolling mean of the per-position coverage (the log10 transform is applied later, on the y-axis scale) + pos_bias_df\$rolling_mean_cov <- rollapply(means_cov, width = window_size, + FUN = function(x) (mean(x, na.rm = TRUE)), fill = "extend") + + # Generate the plot + plots_theme <- list( + + # Customize legend appearance (leave titles blank) + guides( + color = guide_legend(title = NULL, order = 1), # Title for line color + fill = guide_legend(title = NULL, order = 2), # Title for ribbon fill + linetype = guide_legend(title = NULL, order = 3) # Title for line type + ), + + # Apply minimal theme + theme_minimal() + ) + + + linetype_placeholder <- "Required coverage" + + rolling_cov_plot <- ggplot(pos_bias_df, aes(x = pos, y = rolling_mean_cov)) + + + # Add line for rolling mean coverage + geom_line(aes(color = "Coverage")) + + + # Add a horizontal line with a mapped linetype so it appears in the legend + geom_hline(aes(yintercept = ((21 * nchar(aa_seq) * targeted_counts_per_aa_variant)), + linetype = 
linetype_placeholder), + color = "black", linewidth = 0.3) + + + # Individual axis labels for this plot + xlab("Amino Acid Position") + + ylab("Coverage") + + + # Manually set color and fill labels + scale_color_manual(values = c("Coverage" = "black")) + # Color for line + + # Set the linetype with a label that uses the targeted threshold dynamically + scale_linetype_manual(values = c("Required coverage" = "dotted"), + labels = paste("Required coverage \\n for", as.character(targeted_counts_per_aa_variant), "counts per variant \\n if equally present")) + + + # Apply the saved theme and design elements at the end + plots_theme + + + # Set y-axis to log scale and apply comma formatting + scale_y_continuous(trans = 'log10', labels = scales::comma) + + + # Return the plot + ggsave(filename = output_file_path, plot = rolling_cov_plot, device = "pdf", width = 10, height = 6) +} + +##### +# run function +##### +position_biases( + prefiltered_gatk_path = "$variantCounts_filtered_by_library", + aa_seq_path = "$aa_seq", + window_size = $sliding_window_size, + output_file_path = "rolling_coverage.pdf", + targeted_counts_per_aa_variant = $aimed_cov +) + +##### +# create versions.yml +##### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) +scales_version <- as.character(packageVersion("scales")) +zoo_version <- as.character(packageVersion("zoo")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" +if (length(scales_version) == 0) scales_version <- "unknown" +if (length(zoo_version) == 0) zoo_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version), + paste(' r-scales:', 
scales_version), + paste(' r-zoo:', zoo_version) + ), + f +) +close(f) diff --git a/modules/local/visualization/logdiff/environment.yml b/modules/local/visualization/logdiff/environment.yml new file mode 100644 index 0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/logdiff/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/logdiff/main.nf b/modules/local/visualization/logdiff/main.nf new file mode 100644 index 0000000..3b733c5 --- /dev/null +++ b/modules/local/visualization/logdiff/main.nf @@ -0,0 +1,32 @@ +process VISUALIZATION_LOGDIFF { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(library_completed_variantCounts) + + output: + tuple val(meta), path("logdiff_plot.pdf"), emit: logdiff_plot + tuple val(meta), path("logdiff_varying_bases.pdf"), emit: logdiff_varying_bases + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'logdiff.R' + + stub: + """ + touch logdiff_plot.pdf + touch logdiff_varying_bases.pdf + echo "VISUALIZATION_LOGDIFF:" > versions.yml + echo " stub-version: 0.0.0" >> versions.yml + """ +} diff --git a/modules/local/visualization/logdiff/meta.yml b/modules/local/visualization/logdiff/meta.yml new file mode 100644 index 0000000..dd4e343 --- /dev/null +++ b/modules/local/visualization/logdiff/meta.yml @@ -0,0 +1,54 @@ +name: "visualization_logdiff" +description: Generates diagnostic plots of variants sorted by counts per coverage, annotated with the LogDiff value (log10 of the 90th percentile minus log10 of the 10th percentile); a second plot colors each variant by its number of varying bases. +keywords: + - deep mutational scanning + - dms + - visualization + - diagnostics + - logdiff +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - library_completed_variantCounts: + type: file + description: CSV file containing processed variant counts, including zero-count entries from the library + pattern: "*.{csv}" + +output: + - logdiff_plot: + - meta: + type: map + description: Groovy Map containing sample information + - logdiff_plot.pdf: + type: file + description: PDF plot showing the overall logarithmic differences in variant counts + pattern: "logdiff_plot.pdf" + - logdiff_varying_bases: + - meta: + type: map + description: Groovy Map containing sample information + - logdiff_varying_bases.pdf: + type: file + description: PDF plot showing logarithmic differences categorized by varying nucleotide bases + pattern: "logdiff_varying_bases.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/logdiff/templates/logdiff.R b/modules/local/visualization/logdiff/templates/logdiff.R new file mode 100644 index 0000000..752e574 --- /dev/null +++ b/modules/local/visualization/logdiff/templates/logdiff.R @@ -0,0 +1,115 @@ +#!/usr/bin/env Rscript + +# Maybe develop additional ideas to characterise variants below 10 percentile -> Do we find patterns that help the user to understand why certain variants are hard to count? 
+ +# input: completed_prefiltered_gatk path, output folder path +# output: two logdiff-plots (counts_per_cov): 1st lineplot 2nd dotplot showing type of variant (varying bases - 1/2/3) + +library(dplyr) +library(ggplot2) +library(scales) + + logdiff_plot <- function(prefiltered_gatk_path) { + + # Load the data + prefiltered_gatk <- read.table(prefiltered_gatk_path, sep = ",", fill = NA, header = TRUE) + + # Sort by counts_per_cov while keeping corresponding varying_bases + sorted_counts <- prefiltered_gatk %>% + arrange(counts_per_cov) # Sort by counts_per_cov + + # Create a new column for the sorted index (1, 2, 3, ...) + sorted_counts\$ids <- 1:nrow(sorted_counts) + + # Calculate Q1 (10% quantile) and Q3 (90% quantile) + Q1 <- quantile(sorted_counts\$counts_per_cov, 0.10, na.rm = TRUE) + Q3 <- quantile(sorted_counts\$counts_per_cov, 0.90, na.rm = TRUE) + + # Calculate the LogDiff + LogDiff <- log10(Q3) - log10(Q1) + + # Create the first plot with a line + line_plot <- ggplot(sorted_counts, aes(x = ids, y = counts_per_cov)) + + geom_line(color = "black") + + + # Set axis labels + xlab("Variants") + + ylab("Counts per Coverage") + + + # Apply logarithmic scale to the y-axis + scale_y_continuous(trans = 'log10') + + + # Add horizontal dotted lines at Q1 and Q3 + geom_hline(yintercept = Q1, linetype = "dotted", color = "black") + + geom_hline(yintercept = Q3, linetype = "dotted", color = "black") + + + # Add the LogDiff value to the top left corner + annotate("text", x = 0, y = max(sorted_counts\$counts_per_cov), label = paste("LogDiff =", round(LogDiff, 2)), hjust = 0, vjust = 1, size = 5, color = "black") + + + # Apply the minimal theme + theme_minimal() + + # Save the line plot as a PDF + ggsave(filename = "logdiff_plot.pdf", plot = line_plot, device = "pdf", width = 10, height = 6) + + # Create the second plot with colored dots based on varying_bases + colored_plot <- ggplot(sorted_counts, aes(x = ids, y = counts_per_cov, color = as.factor(varying_bases))) + + + # 
Add horizontal dotted lines at Q1 and Q3 + geom_hline(yintercept = Q1, linetype = "dotted", color = "black") + + geom_hline(yintercept = Q3, linetype = "dotted", color = "black") + + + geom_point(size = 0.9) + # Use points instead of lines + + # Set axis labels + xlab("Variants") + + ylab("Counts per Coverage") + + + # Apply logarithmic scale to the y-axis + scale_y_continuous(trans = 'log10') + + + # Add the LogDiff value to the top left corner + annotate("text", x = 1, y = max(sorted_counts\$counts_per_cov), label = paste("LogDiff =", round(LogDiff, 2)), hjust = 0, vjust = 1, size = 5, color = "black") + + + # Add color scale for varying_bases + scale_color_manual(values = c("1" = "chocolate", "2" = "darkolivegreen3", "3" = "deepskyblue1"), name = "Varying \\n Bases") + + + # Apply the minimal theme + theme_minimal() + + # Save the colored plot as a PDF + ggsave(filename = "logdiff_varying_bases.pdf", plot = colored_plot, device = "pdf", width = 10, height = 6) + } + +##### +# run function +##### +logdiff_plot( + prefiltered_gatk_path = "$library_completed_variantCounts" +) + +#### +# create versions.yml +#### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) +scales_version <- as.character(packageVersion("scales")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" +if (length(scales_version) == 0) scales_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version), + paste(' r-scales:', scales_version) + ), + f +) +close(f) diff --git a/modules/local/visualization/seqdepth/environment.yml b/modules/local/visualization/seqdepth/environment.yml new file mode 100644 index 
0000000..1b0726d --- /dev/null +++ b/modules/local/visualization/seqdepth/environment.yml @@ -0,0 +1,15 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioconductor-biostrings=2.74.0 + - conda-forge::r-base=4.4.1 + - conda-forge::r-biocmanager=1.30.25 + - conda-forge::r-dplyr=1.1.4 + - conda-forge::r-ggplot2=3.5.1 + - conda-forge::r-reshape2=1.4.4 + - conda-forge::r-scales=1.3.0 + - conda-forge::r-stringr=1.5.1 + - conda-forge::r-tidyr=1.3.1 + - conda-forge::r-tidyverse=2.0.0 + - conda-forge::r-zoo=1.8_12 diff --git a/modules/local/visualization/seqdepth/main.nf b/modules/local/visualization/seqdepth/main.nf new file mode 100644 index 0000000..7db51b3 --- /dev/null +++ b/modules/local/visualization/seqdepth/main.nf @@ -0,0 +1,30 @@ +process VISUALIZATION_SEQDEPTH { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + + container "${ workflow.containerEngine == 'singularity' + ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/73/73a72ec77725aeb67678a74228938fdd6827b669d01a8c96951b1a8ef96eeb0f/data' + : 'community.wave.seqera.io/library/bioconductor-biostrings_bioconductor-mutscan_r-base_r-biocmanager_pruned:c65036d76406f342' }" + + input: + tuple val(meta), path(variantCounts_filtered_by_library) + path possible_mutations + val min_counts + + output: + tuple val(meta), path("SeqDepth.pdf"), emit: SeqDepth + path "versions.yml", emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + template 'SeqDepth_simulation.R' + + stub: + """ + touch SeqDepth.pdf + """ +} diff --git a/modules/local/visualization/seqdepth/meta.yml b/modules/local/visualization/seqdepth/meta.yml new file mode 100644 index 0000000..e3a35c6 --- /dev/null +++ b/modules/local/visualization/seqdepth/meta.yml @@ -0,0 +1,55 @@ +name: "visualization_seqdepth" +description: Performs a sequencing depth simulation to assess library coverage and saturation, helping to determine if the sequencing effort was 
sufficient to capture all intended variants. +keywords: + - deep mutational scanning + - dms + - visualization + - sequencing depth + - saturation +tools: + - "mutscan": + description: "R package for analysis of deep mutational scanning data" + homepage: "https://bioconductor.org/packages/release/bioc/html/mutscan.html" + documentation: "https://bioconductor.org/packages/release/bioc/manuals/mutscan/man/mutscan.pdf" + tool_dev_url: "https://github.com/csoneson/mutscan" + doi: "10.1186/s12859-023-05187-y" + licence: ["GPL-3.0-or-later"] + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - variantCounts_filtered_by_library: + type: file + description: CSV file containing variant counts filtered against the target library + pattern: "*.{csv}" + - - possible_mutations: + type: file + description: CSV file defining all theoretically possible mutations for the library design + pattern: "*.{csv}" + - - min_counts: + type: integer + description: Minimum count threshold used in the simulation logic + +output: + - SeqDepth: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - SeqDepth.pdf: + type: file + description: PDF plot showing the simulation of variant detection as a function of sequencing depth + pattern: "SeqDepth.pdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/modules/local/visualization/seqdepth/templates/SeqDepth_simulation.R b/modules/local/visualization/seqdepth/templates/SeqDepth_simulation.R new file mode 100644 index 0000000..18b38d0 --- /dev/null +++ b/modules/local/visualization/seqdepth/templates/SeqDepth_simulation.R @@ -0,0 +1,154 @@ +#!/usr/bin/env Rscript + +# input: prefiltered (by codon library) gatk path, possible mutations path, output_folder, reduction_fraction (steps in % to reduce counts), threshold to count variant as present in dataset +# output: pdf showing the plot +# limitation: takes quite a lot time: depends on number of counts in prefiltered gatk (4 min on M1 MacBook for 340,000 counts in total) -> reduction_fraction only has a low impact -> need to find more efficient random-sampling algorithm + +library(dplyr) +library(ggplot2) + +SeqDepth_simulation_plot <- function(prefiltered_gatk_path, possible_mutations_path, output_file_path, reduction_fraction = 0.01, threshold = 3) { + + # Read data from the specified CSV file + data <- read.csv(prefiltered_gatk_path) + data <- data %>% mutate(counts = as.numeric(counts)) # Ensure counts are numeric + original_counts <- data\$counts # Store the original counts for weight calculations + possible_mutations <- read.csv(possible_mutations_path) + + # Initialize variables + total_counts <- sum(data\$counts) + reduction_per_step <- floor(total_counts * reduction_fraction) # Round down to the nearest integer + results <- data.frame(remaining_counts = numeric(), remaining_variants = numeric()) + + # Track the initial state + remaining_variants <- sum(data\$counts >= 
threshold) + remaining_counts <- total_counts + results <- rbind(results, data.frame(remaining_counts = remaining_counts, remaining_variants = remaining_variants)) + + # Get a list of indices for non-zero variants + non_zero_indices <- which(data\$counts > 0) + + # Calculate initial weights based on the original counts for non-zero variants + weights <- as.numeric(original_counts[non_zero_indices]) + weights <- weights / sum(weights) # Normalize weights + + # Loop until all counts are zero + while (remaining_counts > 0) { + # If no more non-zero variants are available, break the loop + if (length(non_zero_indices) == 0) { + break + } + + # Randomly reduce counts by the specified amount using weighted sampling + for (i in 1:reduction_per_step) { + # Randomly choose an index from the non-zero variants based on weights + selected_idx <- sample(length(non_zero_indices), 1, prob = weights) + index <- non_zero_indices[selected_idx] + + # Reduce the count by 1 + data\$counts[index] <- data\$counts[index] - 1 + + # If the count reaches zero, remove the index from the list of non-zero variants + if (data\$counts[index] == 0) { + non_zero_indices <- non_zero_indices[-selected_idx] + weights <- weights[-selected_idx] # Remove the corresponding weight + } + } + + # Update remaining counts and remaining variants after the reduction + remaining_counts <- sum(data\$counts) + remaining_variants <- sum(data\$counts >= threshold & data\$counts > 0) + + # Store the results for this step + results <- rbind(results, data.frame(remaining_counts = remaining_counts, remaining_variants = remaining_variants)) + + # Adjust the reduction_per_step if the total remaining counts are less than the reduction amount + if (remaining_counts < reduction_per_step) { + reduction_per_step <- remaining_counts + } + + # Recalculate weights using the original counts, but only for remaining non-zero variants + if (length(non_zero_indices) > 0) { + weights <- as.numeric(original_counts[non_zero_indices]) + 
weights <- weights / sum(weights) # Normalize weights + } + } + + # Transform results for plotting + baseline_count <- max(results\$remaining_counts) + plot_data <- results %>% + mutate( + remaining_counts_fold = round(remaining_counts / baseline_count, 2), # X-axis in fold-change from max, rounded to 2 decimals + remaining_variants_percent = (remaining_variants / nrow(possible_mutations)) * 100 # Y-axis in percent + ) + + # Set plot limits + x_max <- 1 # X-axis limit exactly at 1 + y_max <- 100 # Y-axis limit at 100% + + # Create the plot + plot <- ggplot(plot_data, aes(x = remaining_counts_fold, y = remaining_variants_percent)) + + geom_line(color = "black", size = 0.4) + # Solid line for the data + geom_hline(yintercept = 100, linetype = "dotted", color = "black") + # Horizontal line at 100% + + # Main plot settings with fine grid lines + scale_y_continuous( + labels = scales::percent_format(scale = 1), + limits = c(0, y_max), + breaks = seq(0, y_max, by = 5) # Y-axis grid lines every 5% + ) + + scale_x_continuous( + limits = c(0, x_max), + breaks = seq(0, x_max, length.out = 20), # X-axis with 20 grid lines ending at 1 + labels = scales::number_format(accuracy = 0.01) # Round labels to 2 decimal places + ) + + labs( + x = "Fold-Change of Sequencing Depth", + y = "Variants (% of Maximum)" + ) + + theme_minimal() + + theme( + panel.border = element_rect(color = "black", fill = NA), # Add a box-like border + panel.grid.major = element_line(size = 0.2, linetype = "solid", color = "grey80"), # Fine grid lines + panel.grid.minor = element_blank(), # No minor grid lines for clarity + axis.text.x = element_text(angle = 45, hjust = 1) # Rotate X-axis labels at 45 degrees + ) + + # Save the plot as a PDF in the specified output folder + ggsave(output_file_path, plot = plot, device = "pdf", width = 8, height = 6) +} + + +##### +# run function +##### +SeqDepth_simulation_plot( + prefiltered_gatk_path = "$variantCounts_filtered_by_library", + possible_mutations_path = 
"$possible_mutations", + output_file_path = "SeqDepth.pdf", + reduction_fraction = 0.01, + threshold = $min_counts + ) + +##### +# create versions.yml +##### +r_version <- strsplit(version[['version.string']], ' ')[[1]][3] +dplyr_version <- as.character(packageVersion("dplyr")) +ggplot2_version <- as.character(packageVersion("ggplot2")) + +if (is.null(r_version)) r_version <- "unknown" +if (length(dplyr_version) == 0) dplyr_version <- "unknown" +if (length(ggplot2_version) == 0) ggplot2_version <- "unknown" + +f <- file("versions.yml", "w") +writeLines( + c( + '"${task.process}":', + paste(' r-base:', r_version), + paste(' r-dplyr:', dplyr_version), + paste(' r-ggplot2:', ggplot2_version) + ), + f +) +close(f) diff --git a/modules/nf-core/bwa/index/environment.yml b/modules/nf-core/bwa/index/environment.yml new file mode 100644 index 0000000..54e6794 --- /dev/null +++ b/modules/nf-core/bwa/index/environment.yml @@ -0,0 +1,13 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda + +dependencies: + # renovate: datasource=conda depName=bioconda/bwa + - bioconda::bwa=0.7.19 + # renovate: datasource=conda depName=bioconda/htslib + - bioconda::htslib=1.22.1 + # renovate: datasource=conda depName=bioconda/samtools + - bioconda::samtools=1.22.1 diff --git a/modules/nf-core/bwa/index/main.nf b/modules/nf-core/bwa/index/main.nf new file mode 100644 index 0000000..a1c98ac --- /dev/null +++ b/modules/nf-core/bwa/index/main.nf @@ -0,0 +1,44 @@ +process BWA_INDEX { + tag "$fasta" + // NOTE requires 5.37N memory where N is the size of the database + // source: https://bio-bwa.sourceforge.net/bwa.shtml#8 + memory { 7.B * fasta.size() } + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine in ['singularity', 'apptainer'] && !task.ext.singularity_pull_docker_container ? 
+ 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/d7/d7e24dc1e4d93ca4d3a76a78d4c834a7be3985b0e1e56fddd61662e047863a8a/data' : + 'community.wave.seqera.io/library/bwa_htslib_samtools:83b50ff84ead50d0' }" + + input: + tuple val(meta), path(fasta) + + output: + tuple val(meta), path("bwa"), emit: index + tuple val("${task.process}"), val('bwa'), eval('bwa 2>&1 | sed -n "s/^Version: //p"'), topic: versions, emit: versions_bwa + + when: + task.ext.when == null || task.ext.when + + script: + def prefix = task.ext.prefix ?: "${fasta.baseName}" + def args = task.ext.args ?: '' + """ + mkdir bwa + bwa \\ + index \\ + $args \\ + -p bwa/${prefix} \\ + $fasta + """ + + stub: + def prefix = task.ext.prefix ?: "${fasta.baseName}" + """ + mkdir bwa + touch bwa/${prefix}.amb + touch bwa/${prefix}.ann + touch bwa/${prefix}.bwt + touch bwa/${prefix}.pac + touch bwa/${prefix}.sa + """ +} diff --git a/modules/nf-core/bwa/index/meta.yml b/modules/nf-core/bwa/index/meta.yml new file mode 100644 index 0000000..f5bf7f5 --- /dev/null +++ b/modules/nf-core/bwa/index/meta.yml @@ -0,0 +1,71 @@ +name: bwa_index +description: Create BWA index for reference genome +keywords: + - index + - fasta + - genome + - reference +tools: + - bwa: + description: | + BWA is a software package for mapping DNA sequences against + a large reference genome, such as the human genome. + homepage: http://bio-bwa.sourceforge.net/ + documentation: https://bio-bwa.sourceforge.net/bwa.shtml + arxiv: arXiv:1303.3997 + licence: ["GPL-3.0-or-later"] + identifier: "biotools:bwa" +input: + - - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Input genome fasta file + ontologies: + - edam: "http://edamontology.org/data_2044" # Sequence + - edam: "http://edamontology.org/format_1929" # FASTA +output: + index: + - - meta: + type: map + description: | + Groovy Map containing reference information. 
+ e.g. [ id:'test', single_end:false ] + - bwa: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + pattern: "*.{amb,ann,bwt,pac,sa}" + ontologies: + - edam: "http://edamontology.org/data_3210" # Genome index + versions_bwa: + - - ${task.process}: + type: string + description: The process the versions were collected from + - bwa: + type: string + description: The tool name + - 'bwa 2>&1 | sed -n "s/^Version: //p"': + type: string + description: The command used to generate the version of the tool +topics: + versions: + - - ${task.process}: + type: string + description: The process the versions were collected from + - bwa: + type: string + description: The tool name + - 'bwa 2>&1 | sed -n "s/^Version: //p"': + type: string + description: The command used to generate the version of the tool +authors: + - "@drpatelh" + - "@maxulysse" +maintainers: + - "@maxulysse" + - "@gallvp" diff --git a/modules/nf-core/bwa/index/tests/main.nf.test b/modules/nf-core/bwa/index/tests/main.nf.test new file mode 100644 index 0000000..f0fba82 --- /dev/null +++ b/modules/nf-core/bwa/index/tests/main.nf.test @@ -0,0 +1,57 @@ +nextflow_process { + + name "Test Process BWA_INDEX" + tag "modules_nfcore" + tag "modules" + tag "bwa" + tag "bwa/index" + script "../main.nf" + process "BWA_INDEX" + + test("BWA index") { + + when { + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot(process.out).match() } + ) + } + + } + + test("BWA index - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + + then { + assert process.success + assertAll( + { assert snapshot(process.out).match() } + ) + } + 
+ } + +} diff --git a/modules/nf-core/bwa/index/tests/main.nf.test.snap b/modules/nf-core/bwa/index/tests/main.nf.test.snap new file mode 100644 index 0000000..21a6f73 --- /dev/null +++ b/modules/nf-core/bwa/index/tests/main.nf.test.snap @@ -0,0 +1,108 @@ +{ + "BWA index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + [ + "genome.amb:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.ann:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.bwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.pac:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.sa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + [ + "BWA_INDEX", + "bwa", + "0.7.19-r1273" + ] + ], + "index": [ + [ + { + "id": "test" + }, + [ + "genome.amb:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.ann:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.bwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.pac:md5,d41d8cd98f00b204e9800998ecf8427e", + "genome.sa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions_bwa": [ + [ + "BWA_INDEX", + "bwa", + "0.7.19-r1273" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.3", + "nextflow": "25.10.2" + }, + "timestamp": "2026-01-23T16:58:59.966558606" + }, + "BWA index": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + [ + "genome.amb:md5,3a68b8b2287e07dd3f5f95f4344ba76e", + "genome.ann:md5,c32e11f6c859f166c7525a9c1d583567", + "genome.bwt:md5,0469c30a1e239dd08f68afe66fde99da", + "genome.pac:md5,983e3d2cd6f36e2546e6d25a0da78d66", + "genome.sa:md5,ab3952cabf026b48cd3eb5bccbb636d1" + ] + ] + ], + "1": [ + [ + "BWA_INDEX", + "bwa", + "0.7.19-r1273" + ] + ], + "index": [ + [ + { + "id": "test" + }, + [ + "genome.amb:md5,3a68b8b2287e07dd3f5f95f4344ba76e", + "genome.ann:md5,c32e11f6c859f166c7525a9c1d583567", + "genome.bwt:md5,0469c30a1e239dd08f68afe66fde99da", + "genome.pac:md5,983e3d2cd6f36e2546e6d25a0da78d66", + "genome.sa:md5,ab3952cabf026b48cd3eb5bccbb636d1" + ] + ] + ], + "versions_bwa": [ + [ + "BWA_INDEX", + "bwa", + 
"0.7.19-r1273" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.3", + "nextflow": "25.10.2" + }, + "timestamp": "2026-01-23T16:58:53.330725134" + } +} \ No newline at end of file diff --git a/modules/nf-core/bwa/mem/environment.yml b/modules/nf-core/bwa/mem/environment.yml new file mode 100644 index 0000000..54e6794 --- /dev/null +++ b/modules/nf-core/bwa/mem/environment.yml @@ -0,0 +1,13 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda + +dependencies: + # renovate: datasource=conda depName=bioconda/bwa + - bioconda::bwa=0.7.19 + # renovate: datasource=conda depName=bioconda/htslib + - bioconda::htslib=1.22.1 + # renovate: datasource=conda depName=bioconda/samtools + - bioconda::samtools=1.22.1 diff --git a/modules/nf-core/bwa/mem/main.nf b/modules/nf-core/bwa/mem/main.nf new file mode 100644 index 0000000..bde6a9a --- /dev/null +++ b/modules/nf-core/bwa/mem/main.nf @@ -0,0 +1,73 @@ +process BWA_MEM { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine in ['singularity', 'apptainer'] && !task.ext.singularity_pull_docker_container ? 
+ 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/d7/d7e24dc1e4d93ca4d3a76a78d4c834a7be3985b0e1e56fddd61662e047863a8a/data' : + 'community.wave.seqera.io/library/bwa_htslib_samtools:83b50ff84ead50d0' }" + + input: + tuple val(meta) , path(reads) + tuple val(meta2), path(index) + tuple val(meta3), path(fasta) + val sort_bam + + output: + tuple val(meta), path("*.bam") , emit: bam, optional: true + tuple val(meta), path("*.cram") , emit: cram, optional: true + tuple val(meta), path("*.sam") , emit: sam, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true + tuple val(meta), path("*.crai") , emit: crai, optional: true + tuple val("${task.process}"), val('bwa'), eval('bwa 2>&1 | sed -n "s/^Version: //p"'), topic: versions, emit: versions_bwa + tuple val("${task.process}"), val('samtools'), eval("samtools version | sed '1!d;s/.* //'"), topic: versions, emit: versions_samtools + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def samtools_command = sort_bam ? 'sort' : 'view' + def extension = args2.contains("--output-fmt sam") ? "sam" : + args2.contains("--output-fmt cram") ? "cram": + sort_bam && args2.contains("-O cram")? "cram": + !sort_bam && args2.contains("-C") ? "cram": + "bam" + def reference = fasta && extension=="cram" ? 
"--reference ${fasta}" : "" + if (!fasta && extension=="cram") error "Fasta reference is required for CRAM output" + // + // For SAM output we can skip samtools view + // + def pipe_command = "" + if (extension == "sam") { + pipe_command = "> ${prefix}.${extension}" + } else { + pipe_command = "| samtools $samtools_command $args2 ${reference} --threads $task.cpus -o ${prefix}.${extension} -" + } + """ + INDEX=`find -L ./ -name "*.amb" | sed 's/\\.amb\$//'` + + bwa mem \\ + $args \\ + -t $task.cpus \\ + \$INDEX \\ + $reads \\ + $pipe_command + """ + + stub: + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def extension = args2.contains("--output-fmt sam") ? "sam" : + args2.contains("--output-fmt cram") ? "cram": + sort_bam && args2.contains("-O cram")? "cram": + !sort_bam && args2.contains("-C") ? "cram": + "bam" + """ + touch ${prefix}.${extension} + touch ${prefix}.csi + touch ${prefix}.crai + """ +} diff --git a/modules/nf-core/bwa/mem/meta.yml b/modules/nf-core/bwa/mem/meta.yml new file mode 100644 index 0000000..1c4ee88 --- /dev/null +++ b/modules/nf-core/bwa/mem/meta.yml @@ -0,0 +1,159 @@ +name: bwa_mem +description: Performs fastq alignment to a fasta reference using BWA +keywords: + - mem + - bwa + - alignment + - map + - fastq + - bam + - sam +tools: + - bwa: + description: | + BWA is a software package for mapping DNA sequences against + a large reference genome, such as the human genome. + homepage: http://bio-bwa.sourceforge.net/ + documentation: https://bio-bwa.sourceforge.net/bwa.shtml + arxiv: arXiv:1303.3997 + licence: + - "GPL-3.0-or-later" + identifier: "biotools:bwa" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. 
+ ontologies: + - edam: "http://edamontology.org/data_2044" + - edam: "http://edamontology.org/format_1930" + - - meta2: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - index: + type: file + description: BWA genome index files + pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}" + ontologies: + - edam: "http://edamontology.org/data_3210" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Reference genome in FASTA format + pattern: "*.{fasta,fa}" + ontologies: + - edam: "http://edamontology.org/data_2044" + - edam: "http://edamontology.org/format_1929" + - sort_bam: + type: boolean + description: use samtools sort (true) or samtools view (false) + pattern: "true or false" +output: + bam: + - - meta: + type: map + description: Groovy Map containing sample information + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + ontologies: + - edam: "http://edamontology.org/format_2572" + cram: + - - meta: + type: map + description: Groovy Map containing sample information + - "*.cram": + type: file + description: Output CRAM file containing read alignments + pattern: "*.{cram}" + ontologies: + - edam: "http://edamontology.org/format_3462" + sam: + - - meta: + type: map + description: Groovy Map containing sample information + - "*.sam": + type: file + description: Output SAM file containing read alignments + pattern: "*.{sam}" + ontologies: + - edam: "http://edamontology.org/format_2573" + csi: + - - meta: + type: map + description: Groovy Map containing sample information + - "*.csi": + type: file + description: Optional index file for BAM file + pattern: "*.{csi}" + ontologies: [] + crai: + - - meta: + type: map + description: Groovy Map containing sample information + - "*.crai": + type: file + description: Optional index 
file for CRAM file + pattern: "*.{crai}" + ontologies: [] + versions_bwa: + - - ${task.process}: + type: string + description: The name of the process + - bwa: + type: string + description: The name of the tool + - 'bwa 2>&1 | sed -n "s/^Version: //p"': + type: eval + description: The expression to obtain the version of the tool + versions_samtools: + - - ${task.process}: + type: string + description: The name of the process + - samtools: + type: string + description: The name of the tool + - samtools version | sed '1!d;s/.* //': + type: eval + description: The expression to obtain the version of the tool +topics: + versions: + - - ${task.process}: + type: string + description: The name of the process + - bwa: + type: string + description: The name of the tool + - 'bwa 2>&1 | sed -n "s/^Version: //p"': + type: eval + description: The expression to obtain the version of the tool + - - ${task.process}: + type: string + description: The name of the process + - samtools: + type: string + description: The name of the tool + - samtools version | sed '1!d;s/.* //': + type: eval + description: The expression to obtain the version of the tool +authors: + - "@drpatelh" + - "@jeremy1805" + - "@matthdsm" +maintainers: + - "@drpatelh" + - "@jeremy1805" + - "@matthdsm" diff --git a/modules/nf-core/bwa/mem/tests/main.nf.test b/modules/nf-core/bwa/mem/tests/main.nf.test new file mode 100644 index 0000000..e284e2e --- /dev/null +++ b/modules/nf-core/bwa/mem/tests/main.nf.test @@ -0,0 +1,343 @@ +nextflow_process { + + name "Test Process BWA_MEM" + tag "modules_nfcore" + tag "modules" + tag "bwa" + tag "bwa/mem" + tag "bwa/index" + script "../main.nf" + process "BWA_MEM" + + setup { + run("BWA_INDEX") { + script "../../index/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + test("Single-End") { + + when { + process { + """ + input[0] = [ + [ 
id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.cram, + process.out.sam, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.bam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Single-End Sort") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.cram, + process.out.sam, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.bam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Single-End - SAM output") { + + config "./nextflow_sam.config" + + when { + params { + module_args2 = "--output-fmt sam" + } + + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { 
assert snapshot( + process.out.bam, + process.out.cram, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.sam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Paired-End") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.cram, + process.out.sam, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.bam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Paired-End Sort") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.cram, + process.out.sam, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.bam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Paired-End - no fasta") { + + when { + process { + 
""" + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[:],[]] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.cram, + process.out.sam, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.bam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test ("Paired-end - SAM output") { + + config "./nextflow_sam.config" + + when { + params { + module_args2 = "--output-fmt sam" + } + + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[:],[]] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.bam, + process.out.cram, + process.out.csi, + process.out.crai, + process.out.findAll { key, val -> key.startsWith("versions") }, + bam(process.out.sam[0][1]).getReadsMD5() + ).match() + } + ) + } + + } + + test("Single-end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false + """ + } + 
} + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Paired-end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BWA_INDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } +} diff --git a/modules/nf-core/bwa/mem/tests/main.nf.test.snap b/modules/nf-core/bwa/mem/tests/main.nf.test.snap new file mode 100644 index 0000000..6f9031d --- /dev/null +++ b/modules/nf-core/bwa/mem/tests/main.nf.test.snap @@ -0,0 +1,478 @@ +{ + "Single-End - SAM output": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "798439cbd7fd81cbcc5078022dc5479d" + ], + "timestamp": "2026-05-11T12:09:32.334359515", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Single-End": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "798439cbd7fd81cbcc5078022dc5479d" + ], + "timestamp": "2026-05-11T12:07:21.233636979", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Single-End Sort": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + 
], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "94fcf617f5b994584c4e8d4044e16b4f" + ], + "timestamp": "2026-05-11T12:07:28.74614221", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Paired-End": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "57aeef88ed701a8ebc8e2f0a381b2a6" + ], + "timestamp": "2026-05-11T12:07:42.612131595", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Paired-End Sort": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "af8628d9df18b2d3d4f6fd47ef2bb872" + ], + "timestamp": "2026-05-11T12:09:45.938323098", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Single-end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": true + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "6": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ], + "bam": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + [ + { + "id": "test", + "single_end": true + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": true + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "sam": [ + + ], + "versions_bwa": [ + [ + 
"BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + } + ], + "timestamp": "2026-05-11T12:10:09.92486753", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Paired-End - no fasta": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "57aeef88ed701a8ebc8e2f0a381b2a6" + ], + "timestamp": "2026-05-11T12:09:52.820539909", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Paired-end - SAM output": { + "content": [ + [ + + ], + [ + + ], + [ + + ], + [ + + ], + { + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + }, + "57aeef88ed701a8ebc8e2f0a381b2a6" + ], + "timestamp": "2026-05-11T12:10:00.199968933", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + }, + "Paired-end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "6": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "sam": [ + + 
], + "versions_bwa": [ + [ + "BWA_MEM", + "bwa", + "0.7.19-r1273" + ] + ], + "versions_samtools": [ + [ + "BWA_MEM", + "samtools", + "1.22.1" + ] + ] + } + ], + "timestamp": "2026-05-11T12:10:16.940291647", + "meta": { + "nf-test": "0.9.5", + "nextflow": "26.04.0" + } + } +} \ No newline at end of file diff --git a/modules/nf-core/bwa/mem/tests/nextflow_sam.config b/modules/nf-core/bwa/mem/tests/nextflow_sam.config new file mode 100644 index 0000000..831332e --- /dev/null +++ b/modules/nf-core/bwa/mem/tests/nextflow_sam.config @@ -0,0 +1,5 @@ +process { + withName: BWA_MEM { + ext.args2 = params.module_args2 + } +} diff --git a/modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt b/modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt new file mode 100644 index 0000000..7770ccd --- /dev/null +++ b/modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt @@ -0,0 +1,822 @@ + +version: 6 +environments: +default: +channels: +- url: https://conda.anaconda.org/conda-forge/ +- url: https://conda.anaconda.org/bioconda/ +- url: https://conda.anaconda.org/bioconda/ +options: +pypi-prerelease-mode: if-necessary-or-explicit +packages: +linux-64: +- conda: https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-20_gnu.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/alsa-lib-1.2.15.3-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_9.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.2.25-hbd8a1cb_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/cairo-1.18.4-he90730b_1.conda +- conda: https://conda.anaconda.org/bioconda/noarch/fastqc-0.12.1-hdfd78af_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +- conda: 
https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/fontconfig-2.17.1-h27c8c51_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-ecosystem-1-0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/giflib-5.2.2-hd590300_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/graphite2-1.3.14-hecca717_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/harfbuzz-13.2.1-h6083320_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/icu-78.3-h33c6efd_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/keyutils-1.6.3-hb9d3cd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/krb5-1.22.2-ha1258a1_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/lcms2-2.18-h0c24ade_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/lerc-4.1.0-hdb68285_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libcups-2.3.3-h7a8fb5f_6.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libdeflate-1.25-h17f619e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libedit-3.1.20250104-pl5321h7949ede_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.7.4-hecca717_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h3435931_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype-2.14.3-ha770c72_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype6-2.14.3-h73754d4_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_18.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libglib-2.86.4-h6548e54_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libiconv-1.18-h3b78370_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libjpeg-turbo-3.1.2-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.2-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libpng-1.6.55-h421ea60_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libtiff-4.7.1-h9d88235_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.41.3-h5347b49_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libwebp-base-1.6.0-hd42ef1d_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcb-1.17.0-h8a09558_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.2-h25fd6f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h2d0b736_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/openjdk-25.0.2-ha668962_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.1-h35e630c_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pcre2-10.47-haa7fec5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/perl-5.32.1-7_hd590300_perl5.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pixman-0.46.4-h54a6638_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/procps-ng-4.0.6-h18c060e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pthread-stubs-0.4-hb9d3cd8_1002.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/xorg-libice-1.1.2-hb9d3cd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libsm-1.2.6-he73a12e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libx11-1.8.13-he1eb515_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxau-1.0.12-hb03c661_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxdmcp-1.1.5-hb03c661_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxext-1.3.7-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxfixes-6.0.2-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxi-1.8.2-hb9d3cd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxrandr-1.5.5-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxrender-0.9.12-hb9d3cd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxt-1.3.1-hb9d3cd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxtst-1.2.5-hb9d3cd8_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb78ec9c_6.conda +packages: +- conda: https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-20_gnu.conda +build_number: 20 +sha256: 1dd3fffd892081df9726d7eb7e0dea6198962ba775bd88842135a4ddb4deb3c9 +md5: a9f577daf3de00bca7c3c76c0ecbd1de +depends: +- __glibc >=2.17,<3.0.a0 +- libgomp >=7.5.0 +constrains: +- openmp_impl <0.0a0 +license: BSD-3-Clause +license_family: BSD +size: 28948 +timestamp: 1770939786096 +- conda: https://conda.anaconda.org/conda-forge/linux-64/alsa-lib-1.2.15.3-hb03c661_0.conda +sha256: d88aa7ae766cf584e180996e92fef2aa7d8e0a0a5ab1d4d49c32390c1b5fff31 +md5: dcdc58c15961dbf17a0621312b01f5cb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: LGPL-2.1-or-later +license_family: GPL +size: 584660 +timestamp: 1768327524772 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_9.conda +sha256: 0b75d45f0bba3e95dc693336fa51f40ea28c980131fec438afb7ce6118ed05f6 +md5: d2ffd7602c02f2b316fd921d39876885 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: bzip2-1.0.6 +license_family: BSD +size: 260182 +timestamp: 1771350215188 +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.2.25-hbd8a1cb_0.conda +sha256: 67cc7101b36421c5913a1687ef1b99f85b5d6868da3abbf6ec1a4181e79782fc +md5: 4492fd26db29495f0ba23f146cd5638d +depends: +- __unix +license: ISC +size: 147413 +timestamp: 1772006283803 +- conda: https://conda.anaconda.org/conda-forge/linux-64/cairo-1.18.4-he90730b_1.conda +sha256: 06525fa0c4e4f56e771a3b986d0fdf0f0fc5a3270830ee47e127a5105bde1b9a +md5: bb6c4808bfa69d6f7f6b07e5846ced37 +depends: +- __glibc >=2.17,<3.0.a0 +- fontconfig >=2.15.0,<3.0a0 +- fonts-conda-ecosystem +- icu >=78.1,<79.0a0 +- libexpat >=2.7.3,<3.0a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- libgcc >=14 +- libglib >=2.86.3,<3.0a0 +- libpng >=1.6.53,<1.7.0a0 +- libstdcxx >=14 +- libxcb >=1.17.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- pixman >=0.46.4,<1.0a0 +- xorg-libice >=1.1.2,<2.0a0 +- xorg-libsm >=1.2.6,<2.0a0 +- xorg-libx11 >=1.8.12,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxrender >=0.9.12,<0.10.0a0 +license: LGPL-2.1-only or MPL-1.1 +size: 989514 +timestamp: 1766415934926 +- conda: https://conda.anaconda.org/bioconda/noarch/fastqc-0.12.1-hdfd78af_0.tar.bz2 +sha256: 7cc26225d590540ae95cd24940ff42f2da7479dd4cd22ae9ab9298665d06790c +md5: c9f6a4b12229f7331f79c9a00dd6e240 +depends: +- font-ttf-dejavu-sans-mono +- fontconfig +- openjdk >=8.0.144 +- perl +license: GPL >=3 +size: 11664291 +timestamp: 1677946722445 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +sha256: 58d7f40d2940dd0a8aa28651239adbf5613254df0f75789919c4e6762054403b +md5: 0c96522c6bdaed4b1566d11387caaf45 +license: BSD-3-Clause +license_family: 
BSD +size: 397370 +timestamp: 1566932522327 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +sha256: c52a29fdac682c20d252facc50f01e7c2e7ceac52aa9817aaf0bb83f7559ec5c +md5: 34893075a5c9e55cdafac56607368fc6 +license: OFL-1.1 +license_family: Other +size: 96530 +timestamp: 1620479909603 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +sha256: 00925c8c055a2275614b4d983e1df637245e19058d79fc7dd1a93b8d9fb4b139 +md5: 4d59c254e01d9cde7957100457e2d5fb +license: OFL-1.1 +license_family: Other +size: 700814 +timestamp: 1620479612257 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +sha256: 2821ec1dc454bd8b9a31d0ed22a7ce22422c0aef163c59f49dfdf915d0f0ca14 +md5: 49023d73832ef61042f6a237cb2687e7 +license: LicenseRef-Ubuntu-Font-Licence-Version-1.0 +license_family: Other +size: 1620504 +timestamp: 1727511233259 +- conda: https://conda.anaconda.org/conda-forge/linux-64/fontconfig-2.17.1-h27c8c51_0.conda +sha256: aa4a44dba97151221100a637c7f4bde619567afade9c0265f8e1c8eed8d7bd8c +md5: 867127763fbe935bab59815b6e0b7b5c +depends: +- __glibc >=2.17,<3.0.a0 +- libexpat >=2.7.4,<3.0a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- libgcc >=14 +- libuuid >=2.41.3,<3.0a0 +- libzlib >=1.3.1,<2.0a0 +license: MIT +license_family: MIT +size: 270705 +timestamp: 1771382710863 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-ecosystem-1-0.tar.bz2 +sha256: a997f2f1921bb9c9d76e6fa2f6b408b7fa549edd349a77639c9fe7a23ea93e61 +md5: fee5683a3f04bd15cbd8318b096a27ab +depends: +- fonts-conda-forge +license: BSD-3-Clause +license_family: BSD +size: 3667 +timestamp: 1566974674465 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +sha256: 54eea8469786bc2291cc40bca5f46438d3e062a399e8f53f013b6a9f50e98333 +md5: a7970cd949a077b7cb9696379d338681 +depends: +- font-ttf-ubuntu +- font-ttf-inconsolata 
+- font-ttf-dejavu-sans-mono +- font-ttf-source-code-pro +license: BSD-3-Clause +license_family: BSD +size: 4059 +timestamp: 1762351264405 +- conda: https://conda.anaconda.org/conda-forge/linux-64/giflib-5.2.2-hd590300_0.conda +sha256: aac402a8298f0c0cc528664249170372ef6b37ac39fdc92b40601a6aed1e32ff +md5: 3bf7b9fd5a7136126e0234db4b87c8b6 +depends: +- libgcc-ng >=12 +license: MIT +license_family: MIT +size: 77248 +timestamp: 1712692454246 +- conda: https://conda.anaconda.org/conda-forge/linux-64/graphite2-1.3.14-hecca717_2.conda +sha256: 25ba37da5c39697a77fce2c9a15e48cf0a84f1464ad2aafbe53d8357a9f6cc8c +md5: 2cd94587f3a401ae05e03a6caf09539d +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: LGPL-2.0-or-later +license_family: LGPL +size: 99596 +timestamp: 1755102025473 +- conda: https://conda.anaconda.org/conda-forge/linux-64/harfbuzz-13.2.1-h6083320_0.conda +sha256: 477f2c553f72165020d3c56740ba354be916c2f0b76fd9f535e83d698277d5ec +md5: 14470902326beee192e33719a2e8bb7f +depends: +- __glibc >=2.17,<3.0.a0 +- cairo >=1.18.4,<2.0a0 +- graphite2 >=1.3.14,<2.0a0 +- icu >=78.3,<79.0a0 +- libexpat >=2.7.4,<3.0a0 +- libfreetype >=2.14.2 +- libfreetype6 >=2.14.2 +- libgcc >=14 +- libglib >=2.86.4,<3.0a0 +- libstdcxx >=14 +- libzlib >=1.3.2,<2.0a0 +license: MIT +license_family: MIT +size: 2384060 +timestamp: 1774276284520 +- conda: https://conda.anaconda.org/conda-forge/linux-64/icu-78.3-h33c6efd_0.conda +sha256: fbf86c4a59c2ed05bbffb2ba25c7ed94f6185ec30ecb691615d42342baa1a16a +md5: c80d8a3b84358cb967fa81e7075fbc8a +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: MIT +license_family: MIT +size: 12723451 +timestamp: 1773822285671 +- conda: https://conda.anaconda.org/conda-forge/linux-64/keyutils-1.6.3-hb9d3cd8_0.conda +sha256: 0960d06048a7185d3542d850986d807c6e37ca2e644342dd0c72feefcf26c2a4 +md5: b38117a3c920364aff79f870c984b4a3 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +license: LGPL-2.1-or-later +size: 134088 
+timestamp: 1754905959823 +- conda: https://conda.anaconda.org/conda-forge/linux-64/krb5-1.22.2-ha1258a1_0.conda +sha256: 3e307628ca3527448dd1cb14ad7bb9d04d1d28c7d4c5f97ba196ae984571dd25 +md5: fb53fb07ce46a575c5d004bbc96032c2 +depends: +- __glibc >=2.17,<3.0.a0 +- keyutils >=1.6.3,<2.0a0 +- libedit >=3.1.20250104,<3.2.0a0 +- libedit >=3.1.20250104,<4.0a0 +- libgcc >=14 +- libstdcxx >=14 +- openssl >=3.5.5,<4.0a0 +license: MIT +license_family: MIT +size: 1386730 +timestamp: 1769769569681 +- conda: https://conda.anaconda.org/conda-forge/linux-64/lcms2-2.18-h0c24ade_0.conda +sha256: 836ec4b895352110335b9fdcfa83a8dcdbe6c5fb7c06c4929130600caea91c0a +md5: 6f2e2c8f58160147c4d1c6f4c14cbac4 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.2,<4.0a0 +- libtiff >=4.7.1,<4.8.0a0 +license: MIT +license_family: MIT +size: 249959 +timestamp: 1768184673131 +- conda: https://conda.anaconda.org/conda-forge/linux-64/lerc-4.1.0-hdb68285_0.conda +sha256: f84cb54782f7e9cea95e810ea8fef186e0652d0fa73d3009914fa2c1262594e1 +md5: a752488c68f2e7c456bcbd8f16eec275 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: Apache-2.0 +license_family: Apache +size: 261513 +timestamp: 1773113328888 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libcups-2.3.3-h7a8fb5f_6.conda +sha256: 205c4f19550f3647832ec44e35e6d93c8c206782bdd620c1d7cf66237580ff9c +md5: 49c553b47ff679a6a1e9fc80b9c5a2d4 +depends: +- __glibc >=2.17,<3.0.a0 +- krb5 >=1.22.2,<1.23.0a0 +- libgcc >=14 +- libstdcxx >=14 +- libzlib >=1.3.1,<2.0a0 +license: Apache-2.0 +license_family: Apache +size: 4518030 +timestamp: 1770902209173 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libdeflate-1.25-h17f619e_0.conda +sha256: aa8e8c4be9a2e81610ddf574e05b64ee131fab5e0e3693210c9d6d2fba32c680 +md5: 6c77a605a7a689d17d4819c0f8ac9a00 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 73490 +timestamp: 1761979956660 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libedit-3.1.20250104-pl5321h7949ede_0.conda +sha256: d789471216e7aba3c184cd054ed61ce3f6dac6f87a50ec69291b9297f8c18724 +md5: c277e0a4d549b03ac1e9d6cbbe3d017b +depends: +- ncurses +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- ncurses >=6.5,<7.0a0 +license: BSD-2-Clause +license_family: BSD +size: 134676 +timestamp: 1738479519902 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.7.4-hecca717_0.conda +sha256: d78f1d3bea8c031d2f032b760f36676d87929b18146351c4464c66b0869df3f5 +md5: e7f7ce06ec24cfcfb9e36d28cf82ba57 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- expat 2.7.4.* +license: MIT +license_family: MIT +size: 76798 +timestamp: 1771259418166 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h3435931_0.conda +sha256: 31f19b6a88ce40ebc0d5a992c131f57d919f73c0b92cd1617a5bec83f6e961e6 +md5: a360c33a5abe61c07959e449fa1453eb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 58592 +timestamp: 1769456073053 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype-2.14.3-ha770c72_0.conda +sha256: 38f014a7129e644636e46064ecd6b1945e729c2140e21d75bb476af39e692db2 +md5: e289f3d17880e44b633ba911d57a321b +depends: +- libfreetype6 >=2.14.3 +license: GPL-2.0-only OR FTL +size: 8049 +timestamp: 1774298163029 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype6-2.14.3-h73754d4_0.conda +sha256: 16f020f96da79db1863fcdd8f2b8f4f7d52f177dd4c58601e38e9182e91adf1d +md5: fb16b4b69e3f1dcfe79d80db8fd0c55d +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libpng >=1.6.55,<1.7.0a0 +- libzlib >=1.3.2,<2.0a0 +constrains: +- freetype >=2.14.3 +license: GPL-2.0-only OR FTL +size: 384575 +timestamp: 1774298162622 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_18.conda +sha256: faf7d2017b4d718951e3a59d081eb09759152f93038479b768e3d612688f83f5 +md5: 0aa00f03f9e39fb9876085dee11a85d4 +depends: 
+- __glibc >=2.17,<3.0.a0 +- _openmp_mutex >=4.5 +constrains: +- libgcc-ng ==15.2.0=*_18 +- libgomp 15.2.0 he0feb66_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 1041788 +timestamp: 1771378212382 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_18.conda +sha256: e318a711400f536c81123e753d4c797a821021fb38970cebfb3f454126016893 +md5: d5e96b1ed75ca01906b3d2469b4ce493 +depends: +- libgcc 15.2.0 he0feb66_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27526 +timestamp: 1771378224552 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libglib-2.86.4-h6548e54_1.conda +sha256: a27e44168a1240b15659888ce0d9b938ed4bdb49e9ea68a7c1ff27bcea8b55ce +md5: bb26456332b07f68bf3b7622ed71c0da +depends: +- __glibc >=2.17,<3.0.a0 +- libffi >=3.5.2,<3.6.0a0 +- libgcc >=14 +- libiconv >=1.18,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- pcre2 >=10.47,<10.48.0a0 +constrains: +- glib 2.86.4 *_1 +license: LGPL-2.1-or-later +size: 4398701 +timestamp: 1771863239578 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_18.conda +sha256: 21337ab58e5e0649d869ab168d4e609b033509de22521de1bfed0c031bfc5110 +md5: 239c5e9546c38a1e884d69effcf4c882 +depends: +- __glibc >=2.17,<3.0.a0 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 603262 +timestamp: 1771378117851 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libiconv-1.18-h3b78370_2.conda +sha256: c467851a7312765447155e071752d7bf9bf44d610a5687e32706f480aad2833f +md5: 915f5995e94f60e9a4826e0b0920ee88 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: LGPL-2.1-only +size: 790176 +timestamp: 1754908768807 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libjpeg-turbo-3.1.2-hb03c661_0.conda +sha256: cc9aba923eea0af8e30e0f94f2ad7156e2984d80d1e8e7fe6be5a1f257f0eb32 +md5: 8397539e3a0bbd1695584fb4f927485a +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- jpeg <0.0.0a 
+license: IJG AND BSD-3-Clause AND Zlib +size: 633710 +timestamp: 1762094827865 +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.2-hb03c661_0.conda +sha256: 755c55ebab181d678c12e49cced893598f2bab22d582fbbf4d8b83c18be207eb +md5: c7c83eecbb72d88b940c249af56c8b17 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- xz 5.8.2.* +license: 0BSD +size: 113207 +timestamp: 1768752626120 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libpng-1.6.55-h421ea60_0.conda +sha256: 36ade759122cdf0f16e2a2562a19746d96cf9c863ffaa812f2f5071ebbe9c03c +md5: 5f13ffc7d30ffec87864e678df9957b4 +depends: +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- libzlib >=1.3.1,<2.0a0 +license: zlib-acknowledgement +size: 317669 +timestamp: 1770691470744 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_18.conda +sha256: 78668020064fdaa27e9ab65cd2997e2c837b564ab26ce3bf0e58a2ce1a525c6e +md5: 1b08cd684f34175e4514474793d44bcb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc 15.2.0 he0feb66_18 +constrains: +- libstdcxx-ng ==15.2.0=*_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 5852330 +timestamp: 1771378262446 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libtiff-4.7.1-h9d88235_1.conda +sha256: e5f8c38625aa6d567809733ae04bb71c161a42e44a9fa8227abe61fa5c60ebe0 +md5: cd5a90476766d53e901500df9215e927 +depends: +- __glibc >=2.17,<3.0.a0 +- lerc >=4.0.0,<5.0a0 +- libdeflate >=1.25,<1.26.0a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.0,<4.0a0 +- liblzma >=5.8.1,<6.0a0 +- libstdcxx >=14 +- libwebp-base >=1.6.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- zstd >=1.5.7,<1.6.0a0 +license: HPND +size: 435273 +timestamp: 1762022005702 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.41.3-h5347b49_0.conda +sha256: 1a7539cfa7df00714e8943e18de0b06cceef6778e420a5ee3a2a145773758aee +md5: db409b7c1720428638e7c0d509d3e1b5 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: BSD-3-Clause 
+license_family: BSD +size: 40311 +timestamp: 1766271528534 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libwebp-base-1.6.0-hd42ef1d_0.conda +sha256: 3aed21ab28eddffdaf7f804f49be7a7d701e8f0e46c856d801270b470820a37b +md5: aea31d2e5b1091feca96fcfe945c3cf9 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- libwebp 1.6.0 +license: BSD-3-Clause +license_family: BSD +size: 429011 +timestamp: 1752159441324 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcb-1.17.0-h8a09558_0.conda +sha256: 666c0c431b23c6cec6e492840b176dde533d48b7e6fb8883f5071223433776aa +md5: 92ed62436b625154323d40d5f2f11dd7 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- pthread-stubs +- xorg-libxau >=1.0.11,<2.0a0 +- xorg-libxdmcp +license: MIT +license_family: MIT +size: 395888 +timestamp: 1727278577118 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda +sha256: 6ae68e0b86423ef188196fff6207ed0c8195dd84273cb5623b85aa08033a410c +md5: 5aa797f8787fe7a17d1b0821485b5adc +depends: +- libgcc-ng >=12 +license: LGPL-2.1-or-later +size: 100393 +timestamp: 1702724383534 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.2-h25fd6f3_2.conda +sha256: 55044c403570f0dc26e6364de4dc5368e5f3fc7ff103e867c487e2b5ab2bcda9 +md5: d87ff7921124eccd67248aa483c23fec +depends: +- __glibc >=2.17,<3.0.a0 +constrains: +- zlib 1.3.2 *_2 +license: Zlib +license_family: Other +size: 63629 +timestamp: 1774072609062 +- conda: https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h2d0b736_3.conda +sha256: 3fde293232fa3fca98635e1167de6b7c7fda83caf24b9d6c91ec9eefb4f4d586 +md5: 47e340acb35de30501a76c7c799c41d7 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +license: X11 AND BSD-3-Clause +size: 891641 +timestamp: 1738195959188 +- conda: https://conda.anaconda.org/conda-forge/linux-64/openjdk-25.0.2-ha668962_0.conda +sha256: 3825a4c84676a8a5cc23b397a2911e4efa4a805daf2af764153bd904e142ec41 +md5: a41092b0177362dbe5eb2a18501e86c0 
+depends: +- xorg-libx11 +- xorg-libxext +- xorg-libxi +- xorg-libxrender +- xorg-libxtst +- libstdcxx >=14 +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- xorg-libxrender >=0.9.12,<0.10.0a0 +- libjpeg-turbo >=3.1.2,<4.0a0 +- giflib >=5.2.2,<5.3.0a0 +- xorg-libxrandr >=1.5.5,<2.0a0 +- harfbuzz >=12.3.2 +- fontconfig >=2.17.1,<3.0a0 +- fonts-conda-ecosystem +- xorg-libxtst >=1.2.5,<2.0a0 +- xorg-libxi >=1.8.2,<2.0a0 +- lcms2 >=2.18,<3.0a0 +- alsa-lib >=1.2.15.3,<1.3.0a0 +- libpng >=1.6.55,<1.7.0a0 +- xorg-libxt >=1.3.1,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- xorg-libxext >=1.3.7,<2.0a0 +- xorg-libx11 >=1.8.13,<2.0a0 +- libcups >=2.3.3,<2.4.0a0 +license: GPL-2.0-or-later WITH Classpath-exception-2.0 +license_family: GPL +size: 122465031 +timestamp: 1771443671180 +- conda: https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.1-h35e630c_1.conda +sha256: 44c877f8af015332a5d12f5ff0fb20ca32f896526a7d0cdb30c769df1144fb5c +md5: f61eb8cd60ff9057122a3d338b99c00f +depends: +- __glibc >=2.17,<3.0.a0 +- ca-certificates +- libgcc >=14 +license: Apache-2.0 +license_family: Apache +size: 3164551 +timestamp: 1769555830639 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pcre2-10.47-haa7fec5_0.conda +sha256: 5e6f7d161356fefd981948bea5139c5aa0436767751a6930cb1ca801ebb113ff +md5: 7a3bff861a6583f1889021facefc08b1 +depends: +- __glibc >=2.17,<3.0.a0 +- bzip2 >=1.0.8,<2.0a0 +- libgcc >=14 +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 1222481 +timestamp: 1763655398280 +- conda: https://conda.anaconda.org/conda-forge/linux-64/perl-5.32.1-7_hd590300_perl5.conda +build_number: 7 +sha256: 9ec32b6936b0e37bcb0ed34f22ec3116e75b3c0964f9f50ecea5f58734ed6ce9 +md5: f2cfec9406850991f4e3d960cc9e3321 +depends: +- libgcc-ng >=12 +- libxcrypt >=4.4.36 +license: GPL-1.0-or-later OR Artistic-1.0-Perl +size: 13344463 +timestamp: 1703310653947 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/pixman-0.46.4-h54a6638_1.conda +sha256: 43d37bc9ca3b257c5dd7bf76a8426addbdec381f6786ff441dc90b1a49143b6a +md5: c01af13bdc553d1a8fbfff6e8db075f0 +depends: +- libgcc >=14 +- libstdcxx >=14 +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +license: MIT +license_family: MIT +size: 450960 +timestamp: 1754665235234 +- conda: https://conda.anaconda.org/conda-forge/linux-64/procps-ng-4.0.6-h18c060e_0.conda +sha256: 4ce2e1ee31a6217998f78c31ce7dc0a3e0557d9238b51d49dd20c52d467a126d +md5: f2c23a77b25efcad57d377b34bd84941 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- ncurses >=6.5,<7.0a0 +license: GPL-2.0-or-later AND LGPL-2.0-or-later +license_family: GPL +size: 593603 +timestamp: 1769710381284 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pthread-stubs-0.4-hb9d3cd8_1002.conda +sha256: 9c88f8c64590e9567c6c80823f0328e58d3b1efb0e1c539c0315ceca764e0973 +md5: b3c17d95b5a10c6e64a21fa17573e70e +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +license: MIT +license_family: MIT +size: 8252 +timestamp: 1726802366959 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libice-1.1.2-hb9d3cd8_0.conda +sha256: c12396aabb21244c212e488bbdc4abcdef0b7404b15761d9329f5a4a39113c4b +md5: fb901ff28063514abb6046c9ec2c4a45 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +license: MIT +license_family: MIT +size: 58628 +timestamp: 1734227592886 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libsm-1.2.6-he73a12e_0.conda +sha256: 277841c43a39f738927145930ff963c5ce4c4dacf66637a3d95d802a64173250 +md5: 1c74ff8c35dcadf952a16f752ca5aa49 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- libuuid >=2.38.1,<3.0a0 +- xorg-libice >=1.1.2,<2.0a0 +license: MIT +license_family: MIT +size: 27590 +timestamp: 1741896361728 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libx11-1.8.13-he1eb515_0.conda +sha256: 516d4060139dbb4de49a4dcdc6317a9353fb39ebd47789c14e6fe52de0deee42 +md5: 861fb6ccbc677bb9a9fb2468430b9c6a 
+depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libxcb >=1.17.0,<2.0a0 +license: MIT +license_family: MIT +size: 839652 +timestamp: 1770819209719 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxau-1.0.12-hb03c661_1.conda +sha256: 6bc6ab7a90a5d8ac94c7e300cc10beb0500eeba4b99822768ca2f2ef356f731b +md5: b2895afaf55bf96a8c8282a2e47a5de0 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 15321 +timestamp: 1762976464266 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxdmcp-1.1.5-hb03c661_1.conda +sha256: 25d255fb2eef929d21ff660a0c687d38a6d2ccfbcbf0cc6aa738b12af6e9d142 +md5: 1dafce8548e38671bea82e3f5c6ce22f +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 20591 +timestamp: 1762976546182 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxext-1.3.7-hb03c661_0.conda +sha256: 79c60fc6acfd3d713d6340d3b4e296836a0f8c51602327b32794625826bd052f +md5: 34e54f03dfea3e7a2dcf1453a85f1085 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +license: MIT +license_family: MIT +size: 50326 +timestamp: 1769445253162 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxfixes-6.0.2-hb03c661_0.conda +sha256: 83c4c99d60b8784a611351220452a0a85b080668188dce5dfa394b723d7b64f4 +md5: ba231da7fccf9ea1e768caf5c7099b84 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +license: MIT +license_family: MIT +size: 20071 +timestamp: 1759282564045 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxi-1.8.2-hb9d3cd8_0.conda +sha256: 1a724b47d98d7880f26da40e45f01728e7638e6ec69f35a3e11f92acd05f9e7a +md5: 17dcc85db3c7886650b8908b183d6876 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- xorg-libx11 >=1.8.10,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxfixes >=6.0.1,<7.0a0 +license: MIT +license_family: MIT +size: 47179 +timestamp: 1727799254088 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/xorg-libxrandr-1.5.5-hb03c661_0.conda +sha256: 80ed047a5cb30632c3dc5804c7716131d767089f65877813d4ae855ee5c9d343 +md5: e192019153591938acf7322b6459d36e +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxrender >=0.9.12,<0.10.0a0 +license: MIT +license_family: MIT +size: 30456 +timestamp: 1769445263457 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxrender-0.9.12-hb9d3cd8_0.conda +sha256: 044c7b3153c224c6cedd4484dd91b389d2d7fd9c776ad0f4a34f099b3389f4a1 +md5: 96d57aba173e878a2089d5638016dc5e +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- xorg-libx11 >=1.8.10,<2.0a0 +license: MIT +license_family: MIT +size: 33005 +timestamp: 1734229037766 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxt-1.3.1-hb9d3cd8_0.conda +sha256: a8afba4a55b7b530eb5c8ad89737d60d60bc151a03fbef7a2182461256953f0e +md5: 279b0de5f6ba95457190a1c459a64e31 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- xorg-libice >=1.1.1,<2.0a0 +- xorg-libsm >=1.2.4,<2.0a0 +- xorg-libx11 >=1.8.10,<2.0a0 +license: MIT +license_family: MIT +size: 379686 +timestamp: 1731860547604 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxtst-1.2.5-hb9d3cd8_3.conda +sha256: 752fdaac5d58ed863bbf685bb6f98092fe1a488ea8ebb7ed7b606ccfce08637a +md5: 7bbe9a0cc0df0ac5f5a8ad6d6a11af2f +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- xorg-libx11 >=1.8.10,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxi >=1.7.10,<2.0a0 +license: MIT +license_family: MIT +size: 32808 +timestamp: 1727964811275 +- conda: https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb78ec9c_6.conda +sha256: 68f0206ca6e98fea941e5717cec780ed2873ffabc0e1ed34428c061e2c6268c7 +md5: 4a13eeac0b5c8e5b8ab496e6c4ddd829 +depends: +- __glibc >=2.17,<3.0.a0 +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 601375 +timestamp: 1764777111296 diff --git 
a/modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt b/modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt new file mode 100644 index 0000000..cdc434c --- /dev/null +++ b/modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt @@ -0,0 +1,769 @@ + +version: 6 +environments: +default: +channels: +- url: https://conda.anaconda.org/conda-forge/ +- url: https://conda.anaconda.org/bioconda/ +- url: https://conda.anaconda.org/bioconda/ +options: +pypi-prerelease-mode: if-necessary-or-explicit +packages: +linux-aarch64: +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/_openmp_mutex-4.5-20_gnu.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/alsa-lib-1.2.15.3-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/bzip2-1.0.8-h4777abc_9.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.2.25-hbd8a1cb_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/cairo-1.18.4-h0b6afd8_1.conda +- conda: https://conda.anaconda.org/bioconda/noarch/fastqc-0.12.1-hdfd78af_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/fontconfig-2.17.1-hba86a56_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-ecosystem-1-0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/giflib-5.2.2-h31becfc_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/graphite2-1.3.14-hfae3067_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/harfbuzz-13.2.1-h1134a53_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/icu-78.3-hcab7f73_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/keyutils-1.6.3-h86ecc28_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/krb5-1.22.2-hfd895c2_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lcms2-2.18-h9d5b58d_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lerc-4.1.0-h52b7260_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libcups-2.3.3-h4f2b762_6.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libdeflate-1.25-h1af38f5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libedit-3.1.20250104-pl5321h976ea20_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libexpat-2.7.4-hfae3067_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libffi-3.5.2-h376a255_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype-2.14.3-h8af1aa0_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype6-2.14.3-hdae7a39_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-15.2.0-h8acb6b2_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-ng-15.2.0-he9431aa_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libglib-2.86.4-hf53f6bf_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgomp-15.2.0-h8acb6b2_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libiconv-1.18-h90929bb_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libjpeg-turbo-3.1.2-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/liblzma-5.8.2-he30d5cf_0.conda +- 
conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libpng-1.6.55-h1abf092_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libstdcxx-15.2.0-hef695bb_18.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libtiff-4.7.1-hdb009f0_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libuuid-2.41.3-h1022ec0_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libwebp-base-1.6.0-ha2e29f5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libxcb-1.17.0-h262b8f6_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libxcrypt-4.4.36-h31becfc_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libzlib-1.3.2-hdc9db2a_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/ncurses-6.5-ha32ae93_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openjdk-25.0.2-h488f50d_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openssl-3.6.1-h546c87b_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pcre2-10.47-hf841c20_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/perl-5.32.1-7_h31becfc_perl5.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pixman-0.46.4-h7ac5ae9_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/procps-ng-4.0.6-h1779866_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pthread-stubs-0.4-h86ecc28_1002.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libice-1.1.2-h86ecc28_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libsm-1.2.6-h0808dbd_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libx11-1.8.13-h63a1b12_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxau-1.0.12-he30d5cf_1.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxdmcp-1.1.5-he30d5cf_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxext-1.3.7-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxfixes-6.0.2-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxi-1.8.2-h57736b2_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxrandr-1.5.5-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxrender-0.9.12-h86ecc28_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxt-1.3.1-h57736b2_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxtst-1.2.5-h57736b2_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zstd-1.5.7-h85ac4a6_6.conda +packages: +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/_openmp_mutex-4.5-20_gnu.conda +build_number: 20 +sha256: a2527b1d81792a0ccd2c05850960df119c2b6d8f5fdec97f2db7d25dc23b1068 +md5: 468fd3bb9e1f671d36c2cbc677e56f1d +depends: +- libgomp >=7.5.0 +constrains: +- openmp_impl <0.0a0 +license: BSD-3-Clause +license_family: BSD +size: 28926 +timestamp: 1770939656741 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/alsa-lib-1.2.15.3-he30d5cf_0.conda +sha256: ea2233e2db9908c2e5f29d3ca420a546b4583253f4f70abb5494cdd676866d42 +md5: 4a98cbc4ade694520227402ff8880630 +depends: +- libgcc >=14 +license: LGPL-2.1-or-later +license_family: GPL +size: 615729 +timestamp: 1768327548407 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/bzip2-1.0.8-h4777abc_9.conda +sha256: b3495077889dde6bb370938e7db82be545c73e8589696ad0843a32221520ad4c +md5: 840d8fc0d7b3209be93080bc20e07f2d +depends: +- libgcc >=14 +license: bzip2-1.0.6 +license_family: BSD +size: 192412 +timestamp: 1771350241232 +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.2.25-hbd8a1cb_0.conda 
+sha256: 67cc7101b36421c5913a1687ef1b99f85b5d6868da3abbf6ec1a4181e79782fc +md5: 4492fd26db29495f0ba23f146cd5638d +depends: +- __unix +license: ISC +size: 147413 +timestamp: 1772006283803 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/cairo-1.18.4-h0b6afd8_1.conda +sha256: 675db823f3d6fb6bf747fab3b0170ba99b269a07cf6df1e49fff2f9972be9cd1 +md5: 043c13ed3a18396994be9b4fab6572ad +depends: +- fontconfig >=2.15.0,<3.0a0 +- fonts-conda-ecosystem +- icu >=78.1,<79.0a0 +- libexpat >=2.7.3,<3.0a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- libgcc >=14 +- libglib >=2.86.3,<3.0a0 +- libpng >=1.6.53,<1.7.0a0 +- libstdcxx >=14 +- libxcb >=1.17.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- pixman >=0.46.4,<1.0a0 +- xorg-libice >=1.1.2,<2.0a0 +- xorg-libsm >=1.2.6,<2.0a0 +- xorg-libx11 >=1.8.12,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxrender >=0.9.12,<0.10.0a0 +license: LGPL-2.1-only or MPL-1.1 +size: 927045 +timestamp: 1766416003626 +- conda: https://conda.anaconda.org/bioconda/noarch/fastqc-0.12.1-hdfd78af_0.tar.bz2 +sha256: 7cc26225d590540ae95cd24940ff42f2da7479dd4cd22ae9ab9298665d06790c +md5: c9f6a4b12229f7331f79c9a00dd6e240 +depends: +- font-ttf-dejavu-sans-mono +- fontconfig +- openjdk >=8.0.144 +- perl +license: GPL >=3 +size: 11664291 +timestamp: 1677946722445 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +sha256: 58d7f40d2940dd0a8aa28651239adbf5613254df0f75789919c4e6762054403b +md5: 0c96522c6bdaed4b1566d11387caaf45 +license: BSD-3-Clause +license_family: BSD +size: 397370 +timestamp: 1566932522327 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +sha256: c52a29fdac682c20d252facc50f01e7c2e7ceac52aa9817aaf0bb83f7559ec5c +md5: 34893075a5c9e55cdafac56607368fc6 +license: OFL-1.1 +license_family: Other +size: 96530 +timestamp: 1620479909603 +- conda: 
https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +sha256: 00925c8c055a2275614b4d983e1df637245e19058d79fc7dd1a93b8d9fb4b139 +md5: 4d59c254e01d9cde7957100457e2d5fb +license: OFL-1.1 +license_family: Other +size: 700814 +timestamp: 1620479612257 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +sha256: 2821ec1dc454bd8b9a31d0ed22a7ce22422c0aef163c59f49dfdf915d0f0ca14 +md5: 49023d73832ef61042f6a237cb2687e7 +license: LicenseRef-Ubuntu-Font-Licence-Version-1.0 +license_family: Other +size: 1620504 +timestamp: 1727511233259 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/fontconfig-2.17.1-hba86a56_0.conda +sha256: 835aff8615dd8d8fff377679710ce81b8a2c47b6404e21a92fb349fda193a15c +md5: 0fed1ff55f4938a65907f3ecf62609db +depends: +- libexpat >=2.7.4,<3.0a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- libgcc >=14 +- libuuid >=2.41.3,<3.0a0 +- libzlib >=1.3.1,<2.0a0 +license: MIT +license_family: MIT +size: 279044 +timestamp: 1771382728182 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-ecosystem-1-0.tar.bz2 +sha256: a997f2f1921bb9c9d76e6fa2f6b408b7fa549edd349a77639c9fe7a23ea93e61 +md5: fee5683a3f04bd15cbd8318b096a27ab +depends: +- fonts-conda-forge +license: BSD-3-Clause +license_family: BSD +size: 3667 +timestamp: 1566974674465 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +sha256: 54eea8469786bc2291cc40bca5f46438d3e062a399e8f53f013b6a9f50e98333 +md5: a7970cd949a077b7cb9696379d338681 +depends: +- font-ttf-ubuntu +- font-ttf-inconsolata +- font-ttf-dejavu-sans-mono +- font-ttf-source-code-pro +license: BSD-3-Clause +license_family: BSD +size: 4059 +timestamp: 1762351264405 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/giflib-5.2.2-h31becfc_0.conda +sha256: a79dc3bd54c4fb1f249942ee2d5b601a76ecf9614774a4cff9af49adfa458db2 +md5: 2f809afaf0ba1ea4135dce158169efac +depends: +- libgcc-ng 
>=12 +license: MIT +license_family: MIT +size: 82124 +timestamp: 1712692444545 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/graphite2-1.3.14-hfae3067_2.conda +sha256: c9b1781fe329e0b77c5addd741e58600f50bef39321cae75eba72f2f381374b7 +md5: 4aa540e9541cc9d6581ab23ff2043f13 +depends: +- libgcc >=14 +- libstdcxx >=14 +license: LGPL-2.0-or-later +license_family: LGPL +size: 102400 +timestamp: 1755102000043 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/harfbuzz-13.2.1-h1134a53_0.conda +sha256: e22f485fddaaea3ff4b6cae98e0197b9dccd2ed2770337ad6ff38a92afe04e59 +md5: 05d65a2cf410adc331c9ea61f59f1013 +depends: +- cairo >=1.18.4,<2.0a0 +- graphite2 >=1.3.14,<2.0a0 +- icu >=78.3,<79.0a0 +- libexpat >=2.7.4,<3.0a0 +- libfreetype >=2.14.2 +- libfreetype6 >=2.14.2 +- libgcc >=14 +- libglib >=2.86.4,<3.0a0 +- libstdcxx >=14 +- libzlib >=1.3.2,<2.0a0 +license: MIT +license_family: MIT +size: 2345732 +timestamp: 1774281448329 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/icu-78.3-hcab7f73_0.conda +sha256: 49ba6aed2c6b482bb0ba41078057555d29764299bc947b990708617712ef6406 +md5: 546da38c2fa9efacf203e2ad3f987c59 +depends: +- libgcc >=14 +- libstdcxx >=14 +license: MIT +license_family: MIT +size: 12837286 +timestamp: 1773822650615 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/keyutils-1.6.3-h86ecc28_0.conda +sha256: 5ce830ca274b67de11a7075430a72020c1fb7d486161a82839be15c2b84e9988 +md5: e7df0aab10b9cbb73ab2a467ebfaf8c7 +depends: +- libgcc >=13 +license: LGPL-2.1-or-later +size: 129048 +timestamp: 1754906002667 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/krb5-1.22.2-hfd895c2_0.conda +sha256: b53999d888dda53c506b264e8c02b5f5c8e022c781eda0718f007339e6bc90ba +md5: d9ca108bd680ea86a963104b6b3e95ca +depends: +- keyutils >=1.6.3,<2.0a0 +- libedit >=3.1.20250104,<3.2.0a0 +- libedit >=3.1.20250104,<4.0a0 +- libgcc >=14 +- libstdcxx >=14 +- openssl >=3.5.5,<4.0a0 +license: MIT +license_family: MIT +size: 1517436 
+timestamp: 1769773395215 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lcms2-2.18-h9d5b58d_0.conda +sha256: 379ef5e91a587137391a6149755d0e929f1a007d2dcb211318ac670a46c8596f +md5: bb960f01525b5e001608afef9d47b79c +depends: +- libgcc >=14 +- libjpeg-turbo >=3.1.2,<4.0a0 +- libtiff >=4.7.1,<4.8.0a0 +license: MIT +license_family: MIT +size: 293039 +timestamp: 1768184778398 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lerc-4.1.0-h52b7260_0.conda +sha256: 8957fd460c1c132c8031f65fd5f56ec3807fd71b7cab2c5e2b0937b13404ab36 +md5: d13423b06447113a90b5b1366d4da171 +depends: +- libgcc >=14 +- libstdcxx >=14 +license: Apache-2.0 +license_family: Apache +size: 240444 +timestamp: 1773114901155 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libcups-2.3.3-h4f2b762_6.conda +sha256: 41b04f995c9f63af8c4065a35931e46cbc2fdd6b9bf7e4c19f90d53cbb2bc8e5 +md5: 67828c963b17db7dc989fe5d509ef04a +depends: +- krb5 >=1.22.2,<1.23.0a0 +- libgcc >=14 +- libstdcxx >=14 +- libzlib >=1.3.1,<2.0a0 +license: Apache-2.0 +license_family: Apache +size: 4553739 +timestamp: 1770903929794 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libdeflate-1.25-h1af38f5_0.conda +sha256: 48814b73bd462da6eed2e697e30c060ae16af21e9fbed30d64feaf0aad9da392 +md5: a9138815598fe6b91a1d6782ca657b0c +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 71117 +timestamp: 1761979776756 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libedit-3.1.20250104-pl5321h976ea20_0.conda +sha256: c0b27546aa3a23d47919226b3a1635fccdb4f24b94e72e206a751b33f46fd8d6 +md5: fb640d776fc92b682a14e001980825b1 +depends: +- ncurses +- libgcc >=13 +- ncurses >=6.5,<7.0a0 +license: BSD-2-Clause +license_family: BSD +size: 148125 +timestamp: 1738479808948 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libexpat-2.7.4-hfae3067_0.conda +sha256: 995ce3ad96d0f4b5ed6296b051a0d7b6377718f325bc0e792fbb96b0e369dad7 +md5: 57f3b3da02a50a1be2a6fe847515417d 
+depends: +- libgcc >=14 +constrains: +- expat 2.7.4.* +license: MIT +license_family: MIT +size: 76564 +timestamp: 1771259530958 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libffi-3.5.2-h376a255_0.conda +sha256: 3df4c539449aabc3443bbe8c492c01d401eea894603087fca2917aa4e1c2dea9 +md5: 2f364feefb6a7c00423e80dcb12db62a +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 55952 +timestamp: 1769456078358 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype-2.14.3-h8af1aa0_0.conda +sha256: 752e4f66283d7deb4c6fd47d88df644d8daa2aaa825a54f3bf350a625190192a +md5: a229e22d4d8814a07702b0919d8e6701 +depends: +- libfreetype6 >=2.14.3 +license: GPL-2.0-only OR FTL +size: 8125 +timestamp: 1774301094057 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype6-2.14.3-hdae7a39_0.conda +sha256: 8e6b27fe4eec4c2fa7b7769a21973734c8dba1de80086fb0213e58375ac09f4c +md5: b99ed99e42dafb27889483b3098cace7 +depends: +- libgcc >=14 +- libpng >=1.6.55,<1.7.0a0 +- libzlib >=1.3.2,<2.0a0 +constrains: +- freetype >=2.14.3 +license: GPL-2.0-only OR FTL +size: 422941 +timestamp: 1774301093473 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-15.2.0-h8acb6b2_18.conda +sha256: 43df385bedc1cab11993c4369e1f3b04b4ca5d0ea16cba6a0e7f18dbc129fcc9 +md5: 552567ea2b61e3a3035759b2fdb3f9a6 +depends: +- _openmp_mutex >=4.5 +constrains: +- libgcc-ng ==15.2.0=*_18 +- libgomp 15.2.0 h8acb6b2_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 622900 +timestamp: 1771378128706 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-ng-15.2.0-he9431aa_18.conda +sha256: 83bb0415f59634dccfa8335d4163d1f6db00a27b36666736f9842b650b92cf2f +md5: 4feebd0fbf61075a1a9c2e9b3936c257 +depends: +- libgcc 15.2.0 h8acb6b2_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27568 +timestamp: 1771378136019 +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/libglib-2.86.4-hf53f6bf_1.conda +sha256: afc503dbd04a5bf2709aa9d8318a03a8c4edb389f661ff280c3494bfef4341ec +md5: 4ac4372fc4d7f20630a91314cdac8afd +depends: +- libffi >=3.5.2,<3.6.0a0 +- libgcc >=14 +- libiconv >=1.18,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- pcre2 >=10.47,<10.48.0a0 +constrains: +- glib 2.86.4 *_1 +license: LGPL-2.1-or-later +size: 4512186 +timestamp: 1771863220969 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgomp-15.2.0-h8acb6b2_18.conda +sha256: fc716f11a6a8525e27a5d332ef6a689210b0d2a4dd1133edc0f530659aa9faa6 +md5: 4faa39bf919939602e594253bd673958 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 588060 +timestamp: 1771378040807 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libiconv-1.18-h90929bb_2.conda +sha256: 1473451cd282b48d24515795a595801c9b65b567fe399d7e12d50b2d6cdb04d9 +md5: 5a86bf847b9b926f3a4f203339748d78 +depends: +- libgcc >=14 +license: LGPL-2.1-only +size: 791226 +timestamp: 1754910975665 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libjpeg-turbo-3.1.2-he30d5cf_0.conda +sha256: 84064c7c53a64291a585d7215fe95ec42df74203a5bf7615d33d49a3b0f08bb6 +md5: 5109d7f837a3dfdf5c60f60e311b041f +depends: +- libgcc >=14 +constrains: +- jpeg <0.0.0a +license: IJG AND BSD-3-Clause AND Zlib +size: 691818 +timestamp: 1762094728337 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/liblzma-5.8.2-he30d5cf_0.conda +sha256: 843c46e20519651a3e357a8928352b16c5b94f4cd3d5481acc48be2e93e8f6a3 +md5: 96944e3c92386a12755b94619bae0b35 +depends: +- libgcc >=14 +constrains: +- xz 5.8.2.* +license: 0BSD +size: 125916 +timestamp: 1768754941722 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libpng-1.6.55-h1abf092_0.conda +sha256: c7378c6b79de4d571d00ad1caf0a4c19d43c9c94077a761abb6ead44d891f907 +md5: be4088903b94ea297975689b3c3aeb27 +depends: +- libgcc >=14 +- libzlib >=1.3.1,<2.0a0 +license: zlib-acknowledgement +size: 
340156 +timestamp: 1770691477245 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libstdcxx-15.2.0-hef695bb_18.conda +sha256: 31fdb9ffafad106a213192d8319b9f810e05abca9c5436b60e507afb35a6bc40 +md5: f56573d05e3b735cb03efeb64a15f388 +depends: +- libgcc 15.2.0 h8acb6b2_18 +constrains: +- libstdcxx-ng ==15.2.0=*_18 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 5541411 +timestamp: 1771378162499 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libtiff-4.7.1-hdb009f0_1.conda +sha256: 7ff79470db39e803e21b8185bc8f19c460666d5557b1378d1b1e857d929c6b39 +md5: 8c6fd84f9c87ac00636007c6131e457d +depends: +- lerc >=4.0.0,<5.0a0 +- libdeflate >=1.25,<1.26.0a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.0,<4.0a0 +- liblzma >=5.8.1,<6.0a0 +- libstdcxx >=14 +- libwebp-base >=1.6.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- zstd >=1.5.7,<1.6.0a0 +license: HPND +size: 488407 +timestamp: 1762022048105 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libuuid-2.41.3-h1022ec0_0.conda +sha256: c37a8e89b700646f3252608f8368e7eb8e2a44886b92776e57ad7601fc402a11 +md5: cf2861212053d05f27ec49c3784ff8bb +depends: +- libgcc >=14 +license: BSD-3-Clause +license_family: BSD +size: 43453 +timestamp: 1766271546875 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libwebp-base-1.6.0-ha2e29f5_0.conda +sha256: b03700a1f741554e8e5712f9b06dd67e76f5301292958cd3cb1ac8c6fdd9ed25 +md5: 24e92d0942c799db387f5c9d7b81f1af +depends: +- libgcc >=14 +constrains: +- libwebp 1.6.0 +license: BSD-3-Clause +license_family: BSD +size: 359496 +timestamp: 1752160685488 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libxcb-1.17.0-h262b8f6_0.conda +sha256: 461cab3d5650ac6db73a367de5c8eca50363966e862dcf60181d693236b1ae7b +md5: cd14ee5cca2464a425b1dbfc24d90db2 +depends: +- libgcc >=13 +- pthread-stubs +- xorg-libxau >=1.0.11,<2.0a0 +- xorg-libxdmcp +license: MIT +license_family: MIT +size: 397493 +timestamp: 1727280745441 +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/libxcrypt-4.4.36-h31becfc_1.conda +sha256: 6b46c397644091b8a26a3048636d10b989b1bf266d4be5e9474bf763f828f41f +md5: b4df5d7d4b63579d081fd3a4cf99740e +depends: +- libgcc-ng >=12 +license: LGPL-2.1-or-later +size: 114269 +timestamp: 1702724369203 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libzlib-1.3.2-hdc9db2a_2.conda +sha256: eb111e32e5a7313a5bf799c7fb2419051fa2fe7eff74769fac8d5a448b309f7f +md5: 502006882cf5461adced436e410046d1 +constrains: +- zlib 1.3.2 *_2 +license: Zlib +license_family: Other +size: 69833 +timestamp: 1774072605429 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/ncurses-6.5-ha32ae93_3.conda +sha256: 91cfb655a68b0353b2833521dc919188db3d8a7f4c64bea2c6a7557b24747468 +md5: 182afabe009dc78d8b73100255ee6868 +depends: +- libgcc >=13 +license: X11 AND BSD-3-Clause +size: 926034 +timestamp: 1738196018799 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openjdk-25.0.2-h488f50d_0.conda +sha256: 6fd2c872b275fa5d42a61a4b6dc28a819cde29f9048adb547363597432e0720e +md5: 27fdd5d67e235c20d23b2d66406497d3 +depends: +- xorg-libx11 +- xorg-libxext +- xorg-libxi +- xorg-libxrender +- xorg-libxtst +- libstdcxx >=14 +- libgcc >=14 +- libzlib >=1.3.1,<2.0a0 +- xorg-libxtst >=1.2.5,<2.0a0 +- libpng >=1.6.55,<1.7.0a0 +- alsa-lib >=1.2.15.3,<1.3.0a0 +- xorg-libx11 >=1.8.13,<2.0a0 +- xorg-libxi >=1.8.2,<2.0a0 +- xorg-libxrandr >=1.5.5,<2.0a0 +- lcms2 >=2.18,<3.0a0 +- xorg-libxrender >=0.9.12,<0.10.0a0 +- libcups >=2.3.3,<2.4.0a0 +- libfreetype >=2.14.1 +- libfreetype6 >=2.14.1 +- harfbuzz >=12.3.2 +- xorg-libxext >=1.3.7,<2.0a0 +- giflib >=5.2.2,<5.3.0a0 +- xorg-libxt >=1.3.1,<2.0a0 +- libjpeg-turbo >=3.1.2,<4.0a0 +- fontconfig >=2.17.1,<3.0a0 +- fonts-conda-ecosystem +license: GPL-2.0-or-later WITH Classpath-exception-2.0 +license_family: GPL +size: 106988620 +timestamp: 1771443741031 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openssl-3.6.1-h546c87b_1.conda 
+sha256: 7f8048c0e75b2620254218d72b4ae7f14136f1981c5eb555ef61645a9344505f +md5: 25f5885f11e8b1f075bccf4a2da91c60 +depends: +- ca-certificates +- libgcc >=14 +license: Apache-2.0 +license_family: Apache +size: 3692030 +timestamp: 1769557678657 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pcre2-10.47-hf841c20_0.conda +sha256: 04df2cee95feba440387f33f878e9f655521e69f4be33a0cd637f07d3d81f0f9 +md5: 1a30c42e32ca0ea216bd0bfe6f842f0b +depends: +- bzip2 >=1.0.8,<2.0a0 +- libgcc >=14 +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 1166552 +timestamp: 1763655534263 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/perl-5.32.1-7_h31becfc_perl5.conda +build_number: 7 +sha256: d78296134263b5bf476cad838ded65451e7162db756f9997c5d06b08122572ed +md5: 17d019cb2a6c72073c344e98e40dfd61 +depends: +- libgcc-ng >=12 +- libxcrypt >=4.4.36 +license: GPL-1.0-or-later OR Artistic-1.0-Perl +size: 13338804 +timestamp: 1703310557094 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pixman-0.46.4-h7ac5ae9_1.conda +sha256: e6b0846a998f2263629cfeac7bca73565c35af13251969f45d385db537a514e4 +md5: 1587081d537bd4ae77d1c0635d465ba5 +depends: +- libgcc >=14 +- libstdcxx >=14 +- libgcc >=14 +license: MIT +license_family: MIT +size: 357913 +timestamp: 1754665583353 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/procps-ng-4.0.6-h1779866_0.conda +sha256: e9cbcbc94e151ada3d6dc365380aaaf591f65012c16d9a2abaea4b9b90adc402 +md5: ab7288cc39545556d1bc5e71ab2df9a9 +depends: +- libgcc >=14 +- ncurses >=6.5,<7.0a0 +license: GPL-2.0-or-later AND LGPL-2.0-or-later +license_family: GPL +size: 636733 +timestamp: 1769712412683 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pthread-stubs-0.4-h86ecc28_1002.conda +sha256: 977dfb0cb3935d748521dd80262fe7169ab82920afd38ed14b7fee2ea5ec01ba +md5: bb5a90c93e3bac3d5690acf76b4a6386 +depends: +- libgcc >=13 +license: MIT +license_family: MIT +size: 8342 +timestamp: 1726803319942 +- 
conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libice-1.1.2-h86ecc28_0.conda +sha256: a2ba1864403c7eb4194dacbfe2777acf3d596feae43aada8d1b478617ce45031 +md5: c8d8ec3e00cd0fd8a231789b91a7c5b7 +depends: +- libgcc >=13 +license: MIT +license_family: MIT +size: 60433 +timestamp: 1734229908988 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libsm-1.2.6-h0808dbd_0.conda +sha256: b86a819cd16f90c01d9d81892155126d01555a20dabd5f3091da59d6309afd0a +md5: 2d1409c50882819cb1af2de82e2b7208 +depends: +- libgcc >=13 +- libuuid >=2.38.1,<3.0a0 +- xorg-libice >=1.1.2,<2.0a0 +license: MIT +license_family: MIT +size: 28701 +timestamp: 1741897678254 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libx11-1.8.13-h63a1b12_0.conda +sha256: cf886160e2ff580d77f7eb8ec1a77c41c2c5b05343e329bc35f0ddf40b8d92ab +md5: 22dd10425ef181e80e130db50675d615 +depends: +- libgcc >=14 +- libxcb >=1.17.0,<2.0a0 +license: MIT +license_family: MIT +size: 869058 +timestamp: 1770819244991 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxau-1.0.12-he30d5cf_1.conda +sha256: e9f6e931feeb2f40e1fdbafe41d3b665f1ab6cb39c5880a1fcf9f79a3f3c84a5 +md5: 1c246e1105000c3660558459e2fd6d43 +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 16317 +timestamp: 1762977521691 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxdmcp-1.1.5-he30d5cf_1.conda +sha256: 128d72f36bcc8d2b4cdbec07507542e437c7d67f677b7d77b71ed9eeac7d6df1 +md5: bff06dcde4a707339d66d45d96ceb2e2 +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 21039 +timestamp: 1762979038025 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxext-1.3.7-he30d5cf_0.conda +sha256: db2188bc0d844d4e9747bac7f6c1d067e390bd769c5ad897c93f1df759dc5dba +md5: fb42b683034619915863d68dd9df03a3 +depends: +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +license: MIT +license_family: MIT +size: 52409 +timestamp: 1769446753771 +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxfixes-6.0.2-he30d5cf_0.conda +sha256: 8cb9c88e25c57e47419e98f04f9ef3154ad96b9f858c88c570c7b91216a64d0e +md5: e8b4056544341daf1d415eaeae7a040c +depends: +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +license: MIT +license_family: MIT +size: 20704 +timestamp: 1759284028146 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxi-1.8.2-h57736b2_0.conda +sha256: 7b587407ecb9ccd2bbaf0fb94c5dbdde4d015346df063e9502dc0ce2b682fb5e +md5: eeee3bdb31c6acde2b81ad1b8c287087 +depends: +- libgcc >=13 +- xorg-libx11 >=1.8.9,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxfixes >=6.0.1,<7.0a0 +license: MIT +license_family: MIT +size: 48197 +timestamp: 1727801059062 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxrandr-1.5.5-he30d5cf_0.conda +sha256: 9f5196665a8d72f4f119c40dcc4bafeb0b540b102cc7b8b299c2abf599e7919f +md5: 1f64c613f0b8d67e9fb0e165d898fb6b +depends: +- libgcc >=14 +- xorg-libx11 >=1.8.12,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxrender >=0.9.12,<0.10.0a0 +license: MIT +license_family: MIT +size: 31122 +timestamp: 1769445286951 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxrender-0.9.12-h86ecc28_0.conda +sha256: ffd77ee860c9635a28cfda46163dcfe9224dc6248c62404c544ae6b564a0be1f +md5: ae2c2dd0e2d38d249887727db2af960e +depends: +- libgcc >=13 +- xorg-libx11 >=1.8.10,<2.0a0 +license: MIT +license_family: MIT +size: 33649 +timestamp: 1734229123157 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxt-1.3.1-h57736b2_0.conda +sha256: 7c109792b60720809a580612aba7f8eb2a0bd425b9fc078748a9d6ffc97cbfa8 +md5: a9e4852c8e0b68ee783e7240030b696f +depends: +- libgcc >=13 +- xorg-libice >=1.1.1,<2.0a0 +- xorg-libsm >=1.2.4,<2.0a0 +- xorg-libx11 >=1.8.9,<2.0a0 +license: MIT +license_family: MIT +size: 384752 +timestamp: 1731860572314 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxtst-1.2.5-h57736b2_3.conda 
+sha256: 6eaffce5a34fc0a16a21ddeaefb597e792a263b1b0c387c1ce46b0a967d558e1 +md5: c05698071b5c8e0da82a282085845860 +depends: +- libgcc >=13 +- xorg-libx11 >=1.8.9,<2.0a0 +- xorg-libxext >=1.3.6,<2.0a0 +- xorg-libxi >=1.7.10,<2.0a0 +license: MIT +license_family: MIT +size: 33786 +timestamp: 1727964907993 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zstd-1.5.7-h85ac4a6_6.conda +sha256: 569990cf12e46f9df540275146da567d9c618c1e9c7a0bc9d9cfefadaed20b75 +md5: c3655f82dcea2aa179b291e7099c1fcc +depends: +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 614429 +timestamp: 1764777145593 diff --git a/modules/nf-core/fastqc/environment.yml b/modules/nf-core/fastqc/environment.yml index 691d4c7..f9f54ee 100644 --- a/modules/nf-core/fastqc/environment.yml +++ b/modules/nf-core/fastqc/environment.yml @@ -1,3 +1,5 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda diff --git a/modules/nf-core/fastqc/main.nf b/modules/nf-core/fastqc/main.nf index 752c3a1..1085126 100644 --- a/modules/nf-core/fastqc/main.nf +++ b/modules/nf-core/fastqc/main.nf @@ -1,19 +1,19 @@ process FASTQC { - tag "$meta.id" - label 'process_medium' + tag "${meta.id}" + label 'process_low' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/fastqc:0.12.1--hdfd78af_0' : - 'biocontainers/fastqc:0.12.1--hdfd78af_0' }" + container "${workflow.containerEngine in ['singularity', 'apptainer'] && !task.ext.singularity_pull_docker_container + ? 
'https://depot.galaxyproject.org/singularity/fastqc:0.12.1--hdfd78af_0' + : 'quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0'}" input: - tuple val(meta), path(reads) + tuple val(meta), path(reads, stageAs: '?/*') output: tuple val(meta), path("*.html"), emit: html - tuple val(meta), path("*.zip") , emit: zip - path "versions.yml" , emit: versions + tuple val(meta), path("*.zip"), emit: zip + tuple val("${task.process}"), val('fastqc'), eval('fastqc --version | sed "/FastQC v/!d; s/.*v//"'), emit: versions_fastqc, topic: versions when: task.ext.when == null || task.ext.when @@ -22,32 +22,30 @@ process FASTQC { def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" // Make list of old name and new name pairs to use for renaming in the bash while loop - def old_new_pairs = reads instanceof Path || reads.size() == 1 ? [[ reads, "${prefix}.${reads.extension}" ]] : reads.withIndex().collect { entry, index -> [ entry, "${prefix}_${index + 1}.${entry.extension}" ] } + def old_new_pairs = reads instanceof Path || reads.size() == 1 ? [[reads, "${prefix}.${reads.extension}"]] : reads.withIndex().collect { entry, index -> [entry, "${prefix}_${index + 1}.${entry.extension}"] } def rename_to = old_new_pairs*.join(' ').join(' ') - def renamed_files = old_new_pairs.collect{ _old_name, new_name -> new_name }.join(' ') + def renamed_files = old_new_pairs.collect { _old_name, new_name -> new_name }.join(' ') // The total amount of allocated RAM by FastQC is equal to the number of threads defined (--threads) time the amount of RAM defined (--memory) // https://github.com/s-andrews/FastQC/blob/1faeea0412093224d7f6a07f777fad60a5650795/fastqc#L211-L222 - // Dividing the task.memory by task.cpu allows to stick to requested amount of RAM in the label - def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') / task.cpus + // Dividing the task.memory by task.cpus allows to stick to requested amount of RAM in the label + def memory_in_mb = task.memory + ? 
(task.memory.toUnit('MB') / task.cpus).intValue() + : null // FastQC memory value allowed range (100 - 10000) def fastqc_memory = memory_in_mb > 10000 ? 10000 : (memory_in_mb < 100 ? 100 : memory_in_mb) + def fastqc_memory_arg = fastqc_memory ? "--memory ${fastqc_memory}" : '' """ - printf "%s %s\\n" $rename_to | while read old_name new_name; do + printf "%s %s\\n" ${rename_to} | while read old_name new_name; do [ -f "\${new_name}" ] || ln -s \$old_name \$new_name done fastqc \\ - $args \\ - --threads $task.cpus \\ - --memory $fastqc_memory \\ - $renamed_files - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - fastqc: \$( fastqc --version | sed '/FastQC v/!d; s/.*v//' ) - END_VERSIONS + ${args} \\ + --threads ${task.cpus} \\ + ${fastqc_memory_arg} \\ + ${renamed_files} """ stub: @@ -55,10 +53,5 @@ process FASTQC { """ touch ${prefix}.html touch ${prefix}.zip - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - fastqc: \$( fastqc --version | sed '/FastQC v/!d; s/.*v//' ) - END_VERSIONS """ } diff --git a/modules/nf-core/fastqc/meta.yml b/modules/nf-core/fastqc/meta.yml index 2b2e62b..2f6cfef 100644 --- a/modules/nf-core/fastqc/meta.yml +++ b/modules/nf-core/fastqc/meta.yml @@ -29,9 +29,10 @@ input: description: | List of input FastQ files of size 1 and 2 for single-end and paired-end data, respectively. 
+ ontologies: [] output: - - html: - - meta: + html: + - - meta: type: map description: | Groovy Map containing sample information @@ -40,8 +41,9 @@ output: type: file description: FastQC report pattern: "*_{fastqc.html}" - - zip: - - meta: + ontologies: [] + zip: + - - meta: type: map description: | Groovy Map containing sample information @@ -50,11 +52,29 @@ output: type: file description: FastQC report archive pattern: "*_{fastqc.zip}" - - versions: - - versions.yml: - type: file - description: File containing software versions - pattern: "versions.yml" + ontologies: [] + versions_fastqc: + - - ${task.process}: + type: string + description: The process the versions were collected from + - fastqc: + type: string + description: The tool name + - fastqc --version | sed "/FastQC v/!d; s/.*v//": + type: eval + description: The expression to obtain the version of the tool + +topics: + versions: + - - ${task.process}: + type: string + description: The process the versions were collected from + - fastqc: + type: string + description: The tool name + - fastqc --version | sed "/FastQC v/!d; s/.*v//": + type: eval + description: The expression to obtain the version of the tool authors: - "@drpatelh" - "@grst" @@ -65,3 +85,27 @@ maintainers: - "@grst" - "@ewels" - "@FelixKrueger" +containers: + docker: + linux/arm64: + name: community.wave.seqera.io/library/fastqc:0.12.1--e455e32f745abe68 + build_id: bd-e455e32f745abe68_1 + scan_id: sc-f102f736465af88c_1 + linux/amd64: + name: community.wave.seqera.io/library/fastqc:0.12.1--5cb1a2fa2f18c7c2 + build_id: bd-5cb1a2fa2f18c7c2_1 + scan_id: sc-0c0466326b6b77d2_1 + singularity: + linux/amd64: + name: oras://community.wave.seqera.io/library/fastqc:0.12.1--5c4bd442468d75dd + build_id: bd-5c4bd442468d75dd_1 + https: https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/f2/f20b021476d1d87658820f971ebecc1e8cdbde0f338eb0d9cea2b0a8fc54a54b/data + linux/arm64: + name: 
oras://community.wave.seqera.io/library/fastqc:0.12.1--127a87fc06499035 + build_id: bd-127a87fc06499035_1 + https: https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/46/46daf2dad0169afd2ae047c3e50ed3776259f664bf07e5e06b045dc23449e994/data + conda: + linux/amd64: + lock_file: modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt + linux/arm64: + lock_file: modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt diff --git a/modules/nf-core/fastqc/tests/main.nf.test b/modules/nf-core/fastqc/tests/main.nf.test index e9d79a0..66c44da 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test +++ b/modules/nf-core/fastqc/tests/main.nf.test @@ -30,7 +30,7 @@ nextflow_process { { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } @@ -58,7 +58,7 @@ nextflow_process { { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } @@ -82,7 +82,7 @@ nextflow_process { { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } @@ -106,7 
+106,7 @@ nextflow_process { { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } @@ -142,7 +142,7 @@ nextflow_process { { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, { assert path(process.out.html[0][1][2]).text.contains("File typeConventional base calls") }, { assert path(process.out.html[0][1][3]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } @@ -166,7 +166,7 @@ nextflow_process { { assert process.out.html[0][1] ==~ ".*/mysample_fastqc.html" }, { assert process.out.zip[0][1] ==~ ".*/mysample_fastqc.zip" }, { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - { assert snapshot(process.out.versions).match() } + { assert snapshot(sanitizeOutput(process.out).findAll { key, val -> key != 'html' && key != 'zip' }).match() } ) } } diff --git a/modules/nf-core/fastqc/tests/main.nf.test.snap b/modules/nf-core/fastqc/tests/main.nf.test.snap index d5db309..c8ee120 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test.snap +++ b/modules/nf-core/fastqc/tests/main.nf.test.snap @@ -1,15 +1,21 @@ { "sarscov2 custom_prefix": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:16.374038" + "timestamp": "2025-10-28T16:39:14.518503" }, "sarscov2 single-end 
[fastq] - stub": { "content": [ @@ -33,7 +39,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -44,8 +54,12 @@ "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -59,10 +73,10 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:24.993809" + "timestamp": "2025-10-28T16:39:19.309008" }, "sarscov2 custom_prefix - stub": { "content": [ @@ -86,7 +100,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -97,8 +115,12 @@ "mysample.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -112,58 +134,82 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:03:10.93942" + "timestamp": "2025-10-28T16:39:44.94888" }, "sarscov2 interleaved [fastq]": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:01:42.355718" + "timestamp": "2025-10-28T16:38:45.168496" }, "sarscov2 paired-end [bam]": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:01:53.276274" + "timestamp": 
"2025-10-28T16:38:53.268919" }, "sarscov2 multiple [fastq]": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:05.527626" + "timestamp": "2025-10-28T16:39:05.050305" }, "sarscov2 paired-end [fastq]": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:01:31.188871" + "timestamp": "2025-10-28T16:38:37.2373" }, "sarscov2 paired-end [fastq] - stub": { "content": [ @@ -187,7 +233,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -198,8 +248,12 @@ "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -213,10 +267,10 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:34.273566" + "timestamp": "2025-10-28T16:39:24.450398" }, "sarscov2 multiple [fastq] - stub": { "content": [ @@ -240,7 +294,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -251,8 +309,12 @@ "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -266,22 +328,28 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - 
"timestamp": "2024-07-22T11:03:02.304411" + "timestamp": "2025-10-28T16:39:39.758762" }, "sarscov2 single-end [fastq]": { "content": [ - [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] + ] + } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:01:19.095607" + "timestamp": "2025-10-28T16:38:29.555068" }, "sarscov2 interleaved [fastq] - stub": { "content": [ @@ -305,7 +373,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -316,8 +388,12 @@ "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -331,10 +407,10 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:44.640184" + "timestamp": "2025-10-28T16:39:29.193136" }, "sarscov2 paired-end [bam] - stub": { "content": [ @@ -358,7 +434,11 @@ ] ], "2": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "html": [ [ @@ -369,8 +449,12 @@ "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], - "versions": [ - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + "versions_fastqc": [ + [ + "FASTQC", + "fastqc", + "0.12.1" + ] ], "zip": [ [ @@ -384,9 +468,9 @@ } ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nf-test": "0.9.2", + "nextflow": "25.10.0" }, - "timestamp": "2024-07-22T11:02:53.550742" + "timestamp": "2025-10-28T16:39:34.144919" } } \ No newline at end of file diff --git a/modules/nf-core/fastqc/tests/tags.yml b/modules/nf-core/fastqc/tests/tags.yml deleted file mode 100644 index 7834294..0000000 --- a/modules/nf-core/fastqc/tests/tags.yml +++ /dev/null @@ -1,2 
+0,0 @@ -fastqc: - - modules/nf-core/fastqc/** diff --git a/modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt b/modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt new file mode 100644 index 0000000..2a91c22 --- /dev/null +++ b/modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt @@ -0,0 +1,1526 @@ + +version: 6 +environments: +default: +channels: +- url: https://conda.anaconda.org/conda-forge/ +- url: https://conda.anaconda.org/bioconda/ +- url: https://conda.anaconda.org/bioconda/ +options: +pypi-prerelease-mode: if-necessary-or-explicit +packages: +linux-64: +- conda: https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-20_gnu.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/_python_abi3_support-1.0-hd8ed1ab_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/annotated-types-0.7.0-pyhd8ed1ab_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/attrs-26.1.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/backports.zstd-1.5.0-py314h680f03e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/brotli-python-1.2.0-py314h3de4e8d_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_9.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.5.20-hbd8a1cb_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/certifi-2026.5.20-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/charset-normalizer-3.4.7-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/click-8.4.0-pyhc90fa1f_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/coloredlogs-15.0.1-pyhd8ed1ab_4.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/colormath-3.0.0-pyhd8ed1ab_4.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/cpython-3.14.5-py314hd8ed1ab_100.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/expat-2.8.1-hecca717_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/fontconfig-2.18.0-h27c8c51_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/h2-4.3.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/hpack-4.1.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/humanfriendly-10.0-pyh707e725_8.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/humanize-4.15.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/hyperframe-6.1.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/idna-3.15-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/importlib-metadata-9.0.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/jinja2-3.1.6-pyhcf101f3_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-4.26.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-specifications-2025.9.1-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/kaleido-core-0.2.1-h3644ca4_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/linux-64/lcms2-2.19.1-h0c24ade_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.45.1-default_hbd61a6d_102.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/lerc-4.1.0-hdb68285_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libblas-3.11.0-7_h4a7cf45_openblas.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libcblas-3.11.0-7_h0358290_openblas.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libdeflate-1.25-h17f619e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.8.1-hecca717_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h3435931_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype-2.14.3-ha770c72_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype6-2.14.3-h73754d4_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgfortran-15.2.0-h69a702a_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgfortran5-15.2.0-h68bc16d_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libjpeg-turbo-3.1.4.1-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblapack-3.11.0-7_h47877c9_openblas.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.3-hb03c661_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libmpdec-4.0.0-hb03c661_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libopenblas-0.3.33-pthreads_h94d23a6_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libpng-1.6.58-h421ea60_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.53.1-h0c1763c_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libtiff-4.7.1-h9d88235_1.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.42.1-h5347b49_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libwebp-base-1.6.0-hd42ef1d_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcb-1.17.0-h8a09558_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.2-h25fd6f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-3.10.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-it-py-4.2.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/markupsafe-3.0.3-py314h67df5f8_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/mathjax-2.7.7-ha770c72_3.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/mdurl-0.1.2-pyhd8ed1ab_1.conda +- conda: https://conda.anaconda.org/bioconda/noarch/multiqc-1.35-pyhdfd78af_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/narwhals-2.21.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/natsort-8.4.0-pyhcf101f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.6-hdb14827_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/networkx-3.6.1-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/nspr-4.38-h29cc59b_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/nss-3.118-h445c969_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/numpy-2.4.6-py314h2b28147_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/openjpeg-2.5.4-h55fea9a_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.2-h35e630c_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/packaging-26.2-pyhc364b38_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pillow-12.2.0-py314h8ec4b1a_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/plotly-6.6.0-pyhd8ed1ab_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/polars-1.41.0-pyh58ad624_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/polars-runtime-32-1.41.0-py310h49dadd8_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/polars-runtime-compat-1.41.0-py310hcbd6021_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/procps-ng-4.0.6-h18c060e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pthread-stubs-0.4-hb9d3cd8_1002.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pyaml-env-1.2.2-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pydantic-2.13.4-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pydantic-core-2.46.4-py314h2e6c369_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pygments-2.20.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pysocks-1.7.1-pyha55dd90_7.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/python-3.14.5-habeac84_100_cp314.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/python-dotenv-1.2.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/python-gil-3.14.5-h4df99d1_100.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/python-kaleido-0.2.1-pyhd8ed1ab_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/python_abi-3.14-8_cp314.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/pyyaml-6.0.3-py314h67df5f8_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/readline-8.3-h853b02a_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/referencing-0.37.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/regex-2026.5.9-py314h5bd0f2a_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/requests-2.34.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-15.0.0-pyhcf101f3_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/rich-click-1.9.7-pyh8f84b5b_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/rpds-py-0.30.0-py314h2e6c369_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/spectra-0.0.11-pyhd8ed1ab_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/sqlite-3.53.1-hbc0de68_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/tiktoken-0.12.0-py314h67fec18_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_h366c992_103.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/tqdm-4.67.3-pyh8f84b5b_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typeguard-4.5.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-extensions-4.15.0-h396c80c_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-inspection-0.4.2-pyhcf101f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing_extensions-4.15.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/tzdata-2025c-hc9c84f9_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/urllib3-2.7.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxau-1.0.12-hb03c661_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxdmcp-1.1.5-hb03c661_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/yaml-0.2.5-h280c20c_3.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/zipp-4.1.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/zlib-ng-2.3.3-hceb46e0_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb78ec9c_6.conda +packages: +- conda: https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-20_gnu.conda +build_number: 20 +sha256: 1dd3fffd892081df9726d7eb7e0dea6198962ba775bd88842135a4ddb4deb3c9 +md5: a9f577daf3de00bca7c3c76c0ecbd1de +depends: +- __glibc >=2.17,<3.0.a0 
+- libgomp >=7.5.0 +constrains: +- openmp_impl <0.0a0 +license: BSD-3-Clause +license_family: BSD +size: 28948 +timestamp: 1770939786096 +- conda: https://conda.anaconda.org/conda-forge/noarch/_python_abi3_support-1.0-hd8ed1ab_2.conda +sha256: a3967b937b9abf0f2a99f3173fa4630293979bd1644709d89580e7c62a544661 +md5: aaa2a381ccc56eac91d63b6c1240312f +depends: +- cpython +- python-gil +license: MIT +license_family: MIT +size: 8191 +timestamp: 1744137672556 +- conda: https://conda.anaconda.org/conda-forge/noarch/annotated-types-0.7.0-pyhd8ed1ab_1.conda +sha256: e0ea1ba78fbb64f17062601edda82097fcf815012cf52bb704150a2668110d48 +md5: 2934f256a8acfe48f6ebb4fce6cde29c +depends: +- python >=3.9 +- typing-extensions >=4.0.0 +license: MIT +license_family: MIT +size: 18074 +timestamp: 1733247158254 +- conda: https://conda.anaconda.org/conda-forge/noarch/attrs-26.1.0-pyhcf101f3_0.conda +sha256: 1b6124230bb4e571b1b9401537ecff575b7b109cc3a21ee019f65e083b8399ab +md5: c6b0543676ecb1fb2d7643941fe375f2 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 64927 +timestamp: 1773935801332 +- conda: https://conda.anaconda.org/conda-forge/noarch/backports.zstd-1.5.0-py314h680f03e_0.conda +noarch: generic +sha256: a1c97297e867776760489537bc5ae36fa83a154be30e3b79385a39ca4cb058fe +md5: 1133126d840e75287d83947be3fc3e71 +depends: +- python >=3.14 +license: BSD-3-Clause AND MIT AND EPL-2.0 +size: 7533 +timestamp: 1778594057496 +- conda: https://conda.anaconda.org/conda-forge/linux-64/brotli-python-1.2.0-py314h3de4e8d_1.conda +sha256: 3ad3500bff54a781c29f16ce1b288b36606e2189d0b0ef2f67036554f47f12b0 +md5: 8910d2c46f7e7b519129f486e0fe927a +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +constrains: +- libbrotlicommon 1.2.0 hb03c661_1 +license: MIT +license_family: MIT +size: 367376 +timestamp: 1764017265553 +- conda: https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_9.conda 
+sha256: 0b75d45f0bba3e95dc693336fa51f40ea28c980131fec438afb7ce6118ed05f6 +md5: d2ffd7602c02f2b316fd921d39876885 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: bzip2-1.0.6 +license_family: BSD +size: 260182 +timestamp: 1771350215188 +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.5.20-hbd8a1cb_0.conda +sha256: 9812a303a1395e1dafbd92e5bc8a1ff6013bcbba0a09c7f03a8d23e43560aa9b +md5: 489b8e97e666c93f68fdb35c3c9b957f +depends: +- __unix +license: ISC +size: 129868 +timestamp: 1779289852439 +- conda: https://conda.anaconda.org/conda-forge/noarch/certifi-2026.5.20-pyhd8ed1ab_0.conda +sha256: 645655a3510e38e625da136595f3f16f2130c3263630cc3bc8f60f619ddbe490 +md5: 9fefff2f745ea1cc2ef15211a20c054a +depends: +- python >=3.10 +license: ISC +size: 134201 +timestamp: 1779285131141 +- conda: https://conda.anaconda.org/conda-forge/noarch/charset-normalizer-3.4.7-pyhd8ed1ab_0.conda +sha256: 3f9483d62ce24ecd063f8a5a714448445dc8d9e201147c46699fc0033e824457 +md5: a9167b9571f3baa9d448faa2139d1089 +depends: +- python >=3.10 +license: MIT +license_family: MIT +size: 58872 +timestamp: 1775127203018 +- conda: https://conda.anaconda.org/conda-forge/noarch/click-8.4.0-pyhc90fa1f_0.conda +sha256: 99ab8ef815c4520cce3a7482c2513f377c14348206857661d84c76a55e030f97 +md5: 003767c47f1f0a474c4de268b57839c3 +depends: +- __unix +- python +- python >=3.10 +license: BSD-3-Clause +license_family: BSD +size: 104631 +timestamp: 1779108494556 +- conda: https://conda.anaconda.org/conda-forge/noarch/coloredlogs-15.0.1-pyhd8ed1ab_4.conda +sha256: 8021c76eeadbdd5784b881b165242db9449783e12ce26d6234060026fd6a8680 +md5: b866ff7007b934d564961066c8195983 +depends: +- humanfriendly >=9.1 +- python >=3.9 +license: MIT +license_family: MIT +size: 43758 +timestamp: 1733928076798 +- conda: https://conda.anaconda.org/conda-forge/noarch/colormath-3.0.0-pyhd8ed1ab_4.conda +sha256: 59c9e29800b483b390467f90e82b0da3a4fbf0612efe1c90813fca232780e160 +md5: 071cf7b0ce333c81718b054066c15102 
+depends: +- networkx >=2.0 +- numpy +- python >=3.9 +license: BSD-3-Clause +license_family: BSD +size: 39326 +timestamp: 1735759976140 +- conda: https://conda.anaconda.org/conda-forge/noarch/cpython-3.14.5-py314hd8ed1ab_100.conda +noarch: generic +sha256: 777882d2685f368417f31bbe1b28f73687fc6c8f6a5768bda20ffeefa6b07f5b +md5: a749029ce5d0632a913db19d17f944ab +depends: +- python >=3.14,<3.15.0a0 +- python_abi * *_cp314 +license: Python-2.0 +size: 50212 +timestamp: 1779236682725 +- conda: https://conda.anaconda.org/conda-forge/linux-64/expat-2.8.1-hecca717_0.conda +sha256: 29a10599d56d93bd750914888ebe6822d47722070762b4647b34d12df9f4476e +md5: d0757fd84af06f065eba49d39af6c546 +depends: +- __glibc >=2.17,<3.0.a0 +- libexpat 2.8.1 hecca717_0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 148238 +timestamp: 1779278694477 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +sha256: 58d7f40d2940dd0a8aa28651239adbf5613254df0f75789919c4e6762054403b +md5: 0c96522c6bdaed4b1566d11387caaf45 +license: BSD-3-Clause +license_family: BSD +size: 397370 +timestamp: 1566932522327 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +sha256: c52a29fdac682c20d252facc50f01e7c2e7ceac52aa9817aaf0bb83f7559ec5c +md5: 34893075a5c9e55cdafac56607368fc6 +license: OFL-1.1 +license_family: Other +size: 96530 +timestamp: 1620479909603 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +sha256: 00925c8c055a2275614b4d983e1df637245e19058d79fc7dd1a93b8d9fb4b139 +md5: 4d59c254e01d9cde7957100457e2d5fb +license: OFL-1.1 +license_family: Other +size: 700814 +timestamp: 1620479612257 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +sha256: 2821ec1dc454bd8b9a31d0ed22a7ce22422c0aef163c59f49dfdf915d0f0ca14 +md5: 49023d73832ef61042f6a237cb2687e7 +license: LicenseRef-Ubuntu-Font-Licence-Version-1.0 
+license_family: Other +size: 1620504 +timestamp: 1727511233259 +- conda: https://conda.anaconda.org/conda-forge/linux-64/fontconfig-2.18.0-h27c8c51_0.conda +sha256: e798086d8a65d55dc4c51f5746705639c9a5f2eeb0b8fc50e6152cfc0d69a4e8 +md5: 06965b2f9854d0b15e0443ee81fe83dc +depends: +- __glibc >=2.17,<3.0.a0 +- libexpat >=2.8.1,<3.0a0 +- libfreetype >=2.14.3 +- libfreetype6 >=2.14.3 +- libgcc >=14 +- libuuid >=2.42.1,<3.0a0 +- libzlib >=1.3.2,<2.0a0 +license: MIT +license_family: MIT +size: 280882 +timestamp: 1779421631622 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +sha256: 54eea8469786bc2291cc40bca5f46438d3e062a399e8f53f013b6a9f50e98333 +md5: a7970cd949a077b7cb9696379d338681 +depends: +- font-ttf-ubuntu +- font-ttf-inconsolata +- font-ttf-dejavu-sans-mono +- font-ttf-source-code-pro +license: BSD-3-Clause +license_family: BSD +size: 4059 +timestamp: 1762351264405 +- conda: https://conda.anaconda.org/conda-forge/noarch/h2-4.3.0-pyhcf101f3_0.conda +sha256: 84c64443368f84b600bfecc529a1194a3b14c3656ee2e832d15a20e0329b6da3 +md5: 164fc43f0b53b6e3a7bc7dce5e4f1dc9 +depends: +- python >=3.10 +- hyperframe >=6.1,<7 +- hpack >=4.1,<5 +- python +license: MIT +license_family: MIT +size: 95967 +timestamp: 1756364871835 +- conda: https://conda.anaconda.org/conda-forge/noarch/hpack-4.1.0-pyhd8ed1ab_0.conda +sha256: 6ad78a180576c706aabeb5b4c8ceb97c0cb25f1e112d76495bff23e3779948ba +md5: 0a802cb9888dd14eeefc611f05c40b6e +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 30731 +timestamp: 1737618390337 +- conda: https://conda.anaconda.org/conda-forge/noarch/humanfriendly-10.0-pyh707e725_8.conda +sha256: fa2071da7fab758c669e78227e6094f6b3608228740808a6de5d6bce83d9e52d +md5: 7fe569c10905402ed47024fc481bb371 +depends: +- __unix +- python >=3.9 +license: MIT +license_family: MIT +size: 73563 +timestamp: 1733928021866 +- conda: https://conda.anaconda.org/conda-forge/noarch/humanize-4.15.0-pyhd8ed1ab_0.conda +sha256: 
6c4343b376d0b12a4c75ab992640970d36c933cad1fd924f6a1181fa91710e80 +md5: daddf757c3ecd6067b9af1df1f25d89e +depends: +- python >=3.10 +license: MIT +license_family: MIT +size: 67994 +timestamp: 1766267728652 +- conda: https://conda.anaconda.org/conda-forge/noarch/hyperframe-6.1.0-pyhd8ed1ab_0.conda +sha256: 77af6f5fe8b62ca07d09ac60127a30d9069fdc3c68d6b256754d0ffb1f7779f8 +md5: 8e6923fc12f1fe8f8c4e5c9f343256ac +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 17397 +timestamp: 1737618427549 +- conda: https://conda.anaconda.org/conda-forge/noarch/idna-3.15-pyhcf101f3_0.conda +sha256: 3d25f9f6f7ab3e1ce6429fc8c8aae0335cf446692e715068488536d220cc43de +md5: 1b9083b7f00609605d1483dbc6071a81 +depends: +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 62642 +timestamp: 1779294335905 +- conda: https://conda.anaconda.org/conda-forge/noarch/importlib-metadata-9.0.0-pyhcf101f3_0.conda +sha256: 43e2a5497cad1598ff88a3e69f69bc88b7b8f141fa63c60eab5db296317318b8 +md5: ffc17e785d64e12fc311af9184221839 +depends: +- python >=3.10 +- zipp >=3.20 +- python +license: Apache-2.0 +size: 34766 +timestamp: 1779714582554 +- conda: https://conda.anaconda.org/conda-forge/noarch/jinja2-3.1.6-pyhcf101f3_1.conda +sha256: fc9ca7348a4f25fed2079f2153ecdcf5f9cf2a0bc36c4172420ca09e1849df7b +md5: 04558c96691bed63104678757beb4f8d +depends: +- markupsafe >=2.0 +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 120685 +timestamp: 1764517220861 +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-4.26.0-pyhcf101f3_0.conda +sha256: db973a37d75db8e19b5f44bbbdaead0c68dde745407f281e2a7fe4db74ec51d7 +md5: ada41c863af263cc4c5fcbaff7c3e4dc +depends: +- attrs >=22.2.0 +- jsonschema-specifications >=2023.3.6 +- python >=3.10 +- referencing >=0.28.4 +- rpds-py >=0.25.0 +- python +license: MIT +license_family: MIT +size: 82356 +timestamp: 1767839954256 +- conda: 
https://conda.anaconda.org/conda-forge/noarch/jsonschema-specifications-2025.9.1-pyhcf101f3_0.conda +sha256: 0a4f3b132f0faca10c89fdf3b60e15abb62ded6fa80aebfc007d05965192aa04 +md5: 439cd0f567d697b20a8f45cb70a1005a +depends: +- python >=3.10 +- referencing >=0.31.0 +- python +license: MIT +license_family: MIT +size: 19236 +timestamp: 1757335715225 +- conda: https://conda.anaconda.org/conda-forge/linux-64/kaleido-core-0.2.1-h3644ca4_0.tar.bz2 +sha256: 7f243680ca03eba7457b7a48f93a9440ba8181a8eac20a3eb5ef165ab6c96664 +md5: b3723b235b0758abaae8c82ce4d80146 +depends: +- __glibc >=2.17,<3.0.a0 +- expat >=2.2.10,<3.0.0a0 +- fontconfig +- fonts-conda-forge +- libgcc-ng >=9.3.0 +- mathjax 2.7.* +- nspr >=4.29,<5.0a0 +- nss >=3.62,<4.0a0 +- sqlite >=3.34.0,<4.0a0 +license: MIT +license_family: MIT +size: 62099926 +timestamp: 1615199463039 +- conda: https://conda.anaconda.org/conda-forge/linux-64/lcms2-2.19.1-h0c24ade_0.conda +sha256: eb89c6c39f2f6a93db55723dbb2f6bba8c8e63e6312bf1abf13e6e9ff45849c8 +md5: f92f984b558e6e6204014b16d212b271 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.4.1,<4.0a0 +- libtiff >=4.7.1,<4.8.0a0 +license: MIT +license_family: MIT +size: 251086 +timestamp: 1778079286384 +- conda: https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.45.1-default_hbd61a6d_102.conda +sha256: 3d584956604909ff5df353767f3a2a2f60e07d070b328d109f30ac40cd62df6c +md5: 18335a698559cdbcd86150a48bf54ba6 +depends: +- __glibc >=2.17,<3.0.a0 +- zstd >=1.5.7,<1.6.0a0 +constrains: +- binutils_impl_linux-64 2.45.1 +license: GPL-3.0-only +license_family: GPL +size: 728002 +timestamp: 1774197446916 +- conda: https://conda.anaconda.org/conda-forge/linux-64/lerc-4.1.0-hdb68285_0.conda +sha256: f84cb54782f7e9cea95e810ea8fef186e0652d0fa73d3009914fa2c1262594e1 +md5: a752488c68f2e7c456bcbd8f16eec275 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: Apache-2.0 +license_family: Apache +size: 261513 +timestamp: 1773113328888 +- 
conda: https://conda.anaconda.org/conda-forge/linux-64/libblas-3.11.0-7_h4a7cf45_openblas.conda +build_number: 7 +sha256: 081c850f99bc355821fac9c6e3727d40b3f8ce3beb50a5437cf03726b611ff39 +md5: 955b44e8b00b7f7ef4ce0130cef12394 +depends: +- libopenblas >=0.3.33,<0.3.34.0a0 +- libopenblas >=0.3.33,<1.0a0 +constrains: +- libcblas 3.11.0 7*_openblas +- blas 2.307 openblas +- liblapack 3.11.0 7*_openblas +- liblapacke 3.11.0 7*_openblas +- mkl <2027 +license: BSD-3-Clause +license_family: BSD +size: 18716 +timestamp: 1778489854108 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libcblas-3.11.0-7_h0358290_openblas.conda +build_number: 7 +sha256: 956ae0bb1ec8b0c3663d75b151aceb0521b54e513bf97f621a035f9c87037970 +md5: 0675639dc24cb0032f199e7ff68e4633 +depends: +- libblas 3.11.0 7_h4a7cf45_openblas +constrains: +- liblapacke 3.11.0 7*_openblas +- blas 2.307 openblas +- liblapack 3.11.0 7*_openblas +license: BSD-3-Clause +license_family: BSD +size: 18675 +timestamp: 1778489861559 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libdeflate-1.25-h17f619e_0.conda +sha256: aa8e8c4be9a2e81610ddf574e05b64ee131fab5e0e3693210c9d6d2fba32c680 +md5: 6c77a605a7a689d17d4819c0f8ac9a00 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 73490 +timestamp: 1761979956660 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.8.1-hecca717_0.conda +sha256: 363018b25fdb5534c79783d912bd4b685a3547f4fc5996357ad548899b0ee8e7 +md5: 93764a5ca80616e9c10106cdaec92f74 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- expat 2.8.1.* +license: MIT +license_family: MIT +size: 77294 +timestamp: 1779278686680 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h3435931_0.conda +sha256: 31f19b6a88ce40ebc0d5a992c131f57d919f73c0b92cd1617a5bec83f6e961e6 +md5: a360c33a5abe61c07959e449fa1453eb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 58592 +timestamp: 
1769456073053 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype-2.14.3-ha770c72_0.conda +sha256: 38f014a7129e644636e46064ecd6b1945e729c2140e21d75bb476af39e692db2 +md5: e289f3d17880e44b633ba911d57a321b +depends: +- libfreetype6 >=2.14.3 +license: GPL-2.0-only OR FTL +size: 8049 +timestamp: 1774298163029 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libfreetype6-2.14.3-h73754d4_0.conda +sha256: 16f020f96da79db1863fcdd8f2b8f4f7d52f177dd4c58601e38e9182e91adf1d +md5: fb16b4b69e3f1dcfe79d80db8fd0c55d +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libpng >=1.6.55,<1.7.0a0 +- libzlib >=1.3.2,<2.0a0 +constrains: +- freetype >=2.14.3 +license: GPL-2.0-only OR FTL +size: 384575 +timestamp: 1774298162622 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_19.conda +sha256: 8e0a3b5e41272e5678499b5dfc4cddb673f9e935de01eb0767ce857001229f46 +md5: 57736f29cc2b0ec0b6c2952d3f101b6a +depends: +- __glibc >=2.17,<3.0.a0 +- _openmp_mutex >=4.5 +constrains: +- libgcc-ng ==15.2.0=*_19 +- libgomp 15.2.0 he0feb66_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 1041084 +timestamp: 1778269013026 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_19.conda +sha256: 9dcf54adfaa5e861123c2da4f2f0451a685464ea7e5a41ad91cf67b31d658d98 +md5: 331ee9b72b9dff570d56b1302c5ab37d +depends: +- libgcc 15.2.0 he0feb66_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27694 +timestamp: 1778269016987 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgfortran-15.2.0-h69a702a_19.conda +sha256: 561a42758ef25b9ce308c4e2cf56daee4f06138385a17e29a492cd928e00be6f +md5: 42bf7eca1a951735fa06c0e3c0d5c8e6 +depends: +- libgfortran5 15.2.0 h68bc16d_19 +constrains: +- libgfortran-ng ==15.2.0=*_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27655 +timestamp: 1778269042954 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libgfortran5-15.2.0-h68bc16d_19.conda +sha256: 057978bb69fea29ed715a9b98adf71015c31baecc4aeb2bfc20d4fd5d83579d4 +md5: 85072b0ad177c966294f129b7c04a2d5 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=15.2.0 +constrains: +- libgfortran 15.2.0 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 2483673 +timestamp: 1778269025089 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_19.conda +sha256: 5abe4ab9d93f6c9757d654f1969ae2267d4505315c1f2f8fe705fd60af084f1b +md5: faac990cb7aedc7f3a2224f2c9b0c26c +depends: +- __glibc >=2.17,<3.0.a0 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 603817 +timestamp: 1778268942614 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libjpeg-turbo-3.1.4.1-hb03c661_0.conda +sha256: 10056646c28115b174de81a44e23e3a0a3b95b5347d2e6c45cc6d49d35294256 +md5: 6178c6f2fb254558238ef4e6c56fb782 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- jpeg <0.0.0a +license: IJG AND BSD-3-Clause AND Zlib +size: 633831 +timestamp: 1775962768273 +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblapack-3.11.0-7_h47877c9_openblas.conda +build_number: 7 +sha256: 96962084921f197c9ad13fb7f8b324f2351d50ff3d8d962148751ad532f54a01 +md5: 6569b4f273740e25dc0dc7e3232c2a6c +depends: +- libblas 3.11.0 7_h4a7cf45_openblas +constrains: +- liblapacke 3.11.0 7*_openblas +- libcblas 3.11.0 7*_openblas +- blas 2.307 openblas +license: BSD-3-Clause +license_family: BSD +size: 18694 +timestamp: 1778489869038 +- conda: https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.3-hb03c661_0.conda +sha256: ec30e52a3c1bf7d0425380a189d209a52baa03f22fb66dd3eb587acaa765bd6d +md5: b88d90cad08e6bc8ad540cb310a761fb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- xz 5.8.3.* +license: 0BSD +size: 113478 +timestamp: 1775825492909 +- conda: 
https://conda.anaconda.org/conda-forge/linux-64/libmpdec-4.0.0-hb03c661_1.conda +sha256: fe171ed5cf5959993d43ff72de7596e8ac2853e9021dec0344e583734f1e0843 +md5: 2c21e66f50753a083cbe6b80f38268fa +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: BSD-2-Clause +license_family: BSD +size: 92400 +timestamp: 1769482286018 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libopenblas-0.3.33-pthreads_h94d23a6_0.conda +sha256: 3d9aa85648e5e18a6d66db98b8c4317cc426721ad7a220aa86330d1ccedc8903 +md5: 2d3278b721e40468295ca755c3b84070 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libgfortran +- libgfortran5 >=14.3.0 +constrains: +- openblas >=0.3.33,<0.3.34.0a0 +license: BSD-3-Clause +license_family: BSD +size: 5931919 +timestamp: 1776993658641 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libpng-1.6.58-h421ea60_0.conda +sha256: 377cfe037f3eeb3b1bf3ad333f724a64d32f315ee1958581fc671891d63d3f89 +md5: eba48a68a1a2b9d3c0d9511548db85db +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libzlib >=1.3.2,<2.0a0 +license: zlib-acknowledgement +size: 317729 +timestamp: 1776315175087 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.53.1-h0c1763c_0.conda +sha256: 54cdcd3214313b62c2a8ee277e6f42150d9b748264c1b70d958bf735e420ef8d +md5: 7dc38adcbf71e6b38748e919e16e0dce +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libzlib >=1.3.2,<2.0a0 +license: blessing +size: 954962 +timestamp: 1777986471789 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_19.conda +sha256: dff1058c76ec6b8759e41cefa2508162d00e4a5e6721aa68ec3fd10094e702dc +md5: 5794b3bdc38177caf969dabd3af08549 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc 15.2.0 he0feb66_19 +constrains: +- libstdcxx-ng ==15.2.0=*_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 5852044 +timestamp: 1778269036376 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libtiff-4.7.1-h9d88235_1.conda +sha256: 
e5f8c38625aa6d567809733ae04bb71c161a42e44a9fa8227abe61fa5c60ebe0 +md5: cd5a90476766d53e901500df9215e927 +depends: +- __glibc >=2.17,<3.0.a0 +- lerc >=4.0.0,<5.0a0 +- libdeflate >=1.25,<1.26.0a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.0,<4.0a0 +- liblzma >=5.8.1,<6.0a0 +- libstdcxx >=14 +- libwebp-base >=1.6.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- zstd >=1.5.7,<1.6.0a0 +license: HPND +size: 435273 +timestamp: 1762022005702 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.42.1-h5347b49_0.conda +sha256: 3f0edf1280e2f6684a986f821eaa3e123d2694a00b31b96ca0d4a4c12c129231 +md5: 7d0a66598195ef00b6efc55aefc7453b +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: BSD-3-Clause +license_family: BSD +size: 40163 +timestamp: 1779118517630 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libwebp-base-1.6.0-hd42ef1d_0.conda +sha256: 3aed21ab28eddffdaf7f804f49be7a7d701e8f0e46c856d801270b470820a37b +md5: aea31d2e5b1091feca96fcfe945c3cf9 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +constrains: +- libwebp 1.6.0 +license: BSD-3-Clause +license_family: BSD +size: 429011 +timestamp: 1752159441324 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libxcb-1.17.0-h8a09558_0.conda +sha256: 666c0c431b23c6cec6e492840b176dde533d48b7e6fb8883f5071223433776aa +md5: 92ed62436b625154323d40d5f2f11dd7 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +- pthread-stubs +- xorg-libxau >=1.0.11,<2.0a0 +- xorg-libxdmcp +license: MIT +license_family: MIT +size: 395888 +timestamp: 1727278577118 +- conda: https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.2-h25fd6f3_2.conda +sha256: 55044c403570f0dc26e6364de4dc5368e5f3fc7ff103e867c487e2b5ab2bcda9 +md5: d87ff7921124eccd67248aa483c23fec +depends: +- __glibc >=2.17,<3.0.a0 +constrains: +- zlib 1.3.2 *_2 +license: Zlib +license_family: Other +size: 63629 +timestamp: 1774072609062 +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-3.10.2-pyhcf101f3_0.conda +sha256: 
20e0892592a3e7c683e3d66df704a9425d731486a97c34fc56af4da1106b2b6b +md5: ba0a9221ce1063f31692c07370d062f3 +depends: +- importlib-metadata >=4.4 +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 85893 +timestamp: 1770694658918 +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-it-py-4.2.0-pyhd8ed1ab_0.conda +sha256: 0c4c35376fe920714390d46e4b8d31c876d65f18e1655899e0763ec25f2a902f +md5: 6d03368f2b2b0a5fb6839df53b2eb5e0 +depends: +- mdurl >=0.1,<1 +- python >=3.10 +license: MIT +license_family: MIT +size: 69017 +timestamp: 1778169663339 +- conda: https://conda.anaconda.org/conda-forge/linux-64/markupsafe-3.0.3-py314h67df5f8_1.conda +sha256: c279be85b59a62d5c52f5dd9a4cd43ebd08933809a8416c22c3131595607d4cf +md5: 9a17c4307d23318476d7fbf0fedc0cde +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +constrains: +- jinja2 >=3.0.0 +license: BSD-3-Clause +license_family: BSD +size: 27424 +timestamp: 1772445227915 +- conda: https://conda.anaconda.org/conda-forge/linux-64/mathjax-2.7.7-ha770c72_3.tar.bz2 +sha256: 02fef69bde69db264a12f21386612262f545b6e3e68d8f1ccec19f3eaae58edf +md5: 86e69bd82c2a2c6fd29f5ab7e02b3691 +license: Apache-2.0 +license_family: Apache +size: 22281629 +timestamp: 1662784498331 +- conda: https://conda.anaconda.org/conda-forge/noarch/mdurl-0.1.2-pyhd8ed1ab_1.conda +sha256: 78c1bbe1723449c52b7a9df1af2ee5f005209f67e40b6e1d3c7619127c43b1c7 +md5: 592132998493b3ff25fd7479396e8351 +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 14465 +timestamp: 1733255681319 +- conda: https://conda.anaconda.org/bioconda/noarch/multiqc-1.35-pyhdfd78af_1.conda +sha256: e86033aa55a9e915e2d0957e770bdb81e3feb26a227d1adb17f9d6c528da6a71 +md5: cdb20309681ba3ce8f52c110e214d4f3 +depends: +- click +- coloredlogs +- humanize +- importlib-metadata +- jinja2 >=3.0.0 +- jsonschema +- markdown +- natsort +- numpy +- packaging +- pillow >=10.2.0 +- plotly >=5.18 +- polars >=1.34.0 
+- polars-runtime-compat >=1.34.0 +- pyaml-env +- pydantic >=2.7.1 +- python >=3.9,!=3.14.1 +- python-dotenv +- python-kaleido 0.2.1 +- pyyaml >=4 +- requests +- rich >=10 +- rich-click +- spectra >=0.0.10 +- tiktoken +- tqdm +- typeguard >=4 +license: GPL-3.0-or-later +license_family: GPL3 +size: 4282188 +timestamp: 1779465338806 +- conda: https://conda.anaconda.org/conda-forge/noarch/narwhals-2.21.2-pyhcf101f3_0.conda +sha256: 70f43d62450927d51673eecd8823e14f5b3cfebdb43cda1d502eba97162bab42 +md5: 6687827c332121727ce383919e1ec8c2 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 284323 +timestamp: 1778929680962 +- conda: https://conda.anaconda.org/conda-forge/noarch/natsort-8.4.0-pyhcf101f3_2.conda +sha256: aeb1548eb72e4f198e72f19d242fb695b35add2ac7b2c00e0d83687052867680 +md5: e941e85e273121222580723010bd4fa2 +depends: +- python >=3.9 +- python +license: MIT +license_family: MIT +size: 39262 +timestamp: 1770905275632 +- conda: https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.6-hdb14827_0.conda +sha256: fc89f74bbe362fb29fa3c037697a89bec140b346a2469a90f7936d1d7ea4d8a3 +md5: fc21868a1a5aacc937e7a18747acb8a5 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: X11 AND BSD-3-Clause +size: 918956 +timestamp: 1777422145199 +- conda: https://conda.anaconda.org/conda-forge/noarch/networkx-3.6.1-pyhcf101f3_0.conda +sha256: f6a82172afc50e54741f6f84527ef10424326611503c64e359e25a19a8e4c1c6 +md5: a2c1eeadae7a309daed9d62c96012a2b +depends: +- python >=3.11 +- python +constrains: +- numpy >=1.25 +- scipy >=1.11.2 +- matplotlib-base >=3.8 +- pandas >=2.0 +license: BSD-3-Clause +license_family: BSD +size: 1587439 +timestamp: 1765215107045 +- conda: https://conda.anaconda.org/conda-forge/linux-64/nspr-4.38-h29cc59b_0.conda +sha256: e3664264bd936c357523b55c71ed5a30263c6ba278d726a75b1eb112e6fb0b64 +md5: e235d5566c9cc8970eb2798dd4ecf62f +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: MPL-2.0 +license_family: 
MOZILLA +size: 228588 +timestamp: 1762348634537 +- conda: https://conda.anaconda.org/conda-forge/linux-64/nss-3.118-h445c969_0.conda +sha256: 44dd98ffeac859d84a6dcba79a2096193a42fc10b29b28a5115687a680dd6aea +md5: 567fbeed956c200c1db5782a424e58ee +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libsqlite >=3.51.0,<4.0a0 +- libstdcxx >=14 +- libzlib >=1.3.1,<2.0a0 +- nspr >=4.38,<5.0a0 +license: MPL-2.0 +license_family: MOZILLA +size: 2057773 +timestamp: 1763485556350 +- conda: https://conda.anaconda.org/conda-forge/linux-64/numpy-2.4.6-py314h2b28147_0.conda +sha256: bc61ae892973751a6b0e6ecea57ed6d7053224bddcb007165d6ceb1d7344ad47 +md5: f49b5f950379e0b97c35ca97682f7c6a +depends: +- python +- libstdcxx >=14 +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- liblapack >=3.9.0,<4.0a0 +- python_abi 3.14.* *_cp314 +- libblas >=3.9.0,<4.0a0 +- libcblas >=3.9.0,<4.0a0 +constrains: +- numpy-base <0a0 +license: BSD-3-Clause +license_family: BSD +size: 8928909 +timestamp: 1779169198391 +- conda: https://conda.anaconda.org/conda-forge/linux-64/openjpeg-2.5.4-h55fea9a_0.conda +sha256: 3900f9f2dbbf4129cf3ad6acf4e4b6f7101390b53843591c53b00f034343bc4d +md5: 11b3379b191f63139e29c0d19dee24cd +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libpng >=1.6.50,<1.7.0a0 +- libstdcxx >=14 +- libtiff >=4.7.1,<4.8.0a0 +- libzlib >=1.3.1,<2.0a0 +license: BSD-2-Clause +license_family: BSD +size: 355400 +timestamp: 1758489294972 +- conda: https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.2-h35e630c_0.conda +sha256: c0ef482280e38c71a08ad6d71448194b719630345b0c9c60744a2010e8a8e0cb +md5: da1b85b6a87e141f5140bb9924cecab0 +depends: +- __glibc >=2.17,<3.0.a0 +- ca-certificates +- libgcc >=14 +license: Apache-2.0 +license_family: Apache +size: 3167099 +timestamp: 1775587756857 +- conda: https://conda.anaconda.org/conda-forge/noarch/packaging-26.2-pyhc364b38_0.conda +sha256: 3906abfb6511a3bb309e39b9b1b7bc38f50a723971de2395489fd1f379255890 +md5: 4c06a92e74452cfa53623a81592e8934 +depends: +- 
python >=3.8 +- python +license: Apache-2.0 +license_family: APACHE +size: 91574 +timestamp: 1777103621679 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pillow-12.2.0-py314h8ec4b1a_0.conda +sha256: 123d8a7c16c88658b4f29e9f115a047598c941708dade74fbaff373a32dbec5e +md5: 76c4757c0ec9d11f969e8eb44899307b +depends: +- python +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- libtiff >=4.7.1,<4.8.0a0 +- openjpeg >=2.5.4,<3.0a0 +- libxcb >=1.17.0,<2.0a0 +- libwebp-base >=1.6.0,<2.0a0 +- zlib-ng >=2.3.3,<2.4.0a0 +- libjpeg-turbo >=3.1.2,<4.0a0 +- python_abi 3.14.* *_cp314 +- libfreetype >=2.14.3 +- libfreetype6 >=2.14.3 +- lcms2 >=2.18,<3.0a0 +- tk >=8.6.13,<8.7.0a0 +license: HPND +size: 1082797 +timestamp: 1775060059882 +- conda: https://conda.anaconda.org/conda-forge/noarch/plotly-6.6.0-pyhd8ed1ab_0.conda +sha256: c418d325359fc7a0074cea7f081ef1bce26e114d2da8a0154c5d27ecc87a08e7 +md5: 3e9427ee186846052e81fadde8ebe96a +depends: +- narwhals >=1.15.1 +- packaging +- python >=3.10 +constrains: +- ipywidgets >=7.6 +license: MIT +license_family: MIT +size: 5251872 +timestamp: 1772628857717 +- conda: https://conda.anaconda.org/conda-forge/noarch/polars-1.41.0-pyh58ad624_0.conda +sha256: 70fc56877c4a095ee658d61924d8019768fbae4a48437058d181fc94b0a7c4d8 +md5: 25a883fed9f1f3f21ff317a3e7c92ac4 +depends: +- polars-runtime-32 ==1.41.0 +- python >=3.10 +- python +constrains: +- numpy >=1.16.0 +- pyarrow >=7.0.0 +- fastexcel >=0.9 +- openpyxl >=3.0.0 +- xlsx2csv >=0.8.0 +- connectorx >=0.3.2 +- deltalake >=1.0.0 +- pyiceberg >=0.7.1 +- altair >=5.4.0 +- great_tables >=0.8.0 +- polars-runtime-32 ==1.41.0 +- polars-runtime-64 ==1.41.0 +- polars-runtime-compat ==1.41.0 +license: MIT +size: 539656 +timestamp: 1779630790562 +- conda: https://conda.anaconda.org/conda-forge/linux-64/polars-runtime-32-1.41.0-py310h49dadd8_0.conda +noarch: python +sha256: e51ee3fe5259f2e115b2f78f8fbe3554e419c7c82b0c110878e12a5ff95ce3ab +md5: 7682765a1588e5ac887c99736d297c93 +depends: +- python +- __glibc 
>=2.17,<3.0.a0 +- libstdcxx >=14 +- libgcc >=14 +- _python_abi3_support 1.* +- cpython >=3.10 +constrains: +- __glibc >=2.17 +license: MIT +size: 42578921 +timestamp: 1779630790562 +- conda: https://conda.anaconda.org/conda-forge/linux-64/polars-runtime-compat-1.41.0-py310hcbd6021_0.conda +noarch: python +sha256: 29c3831c92394af11d9f7d04882dda9479ffbb76a3d36ba155d52159d67805fa +md5: cb0b620c9914a07a9022cb8b183ea9ee +depends: +- python +- libstdcxx >=14 +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- _python_abi3_support 1.* +- cpython >=3.10 +constrains: +- __glibc >=2.17 +license: MIT +size: 41864944 +timestamp: 1779630722548 +- conda: https://conda.anaconda.org/conda-forge/linux-64/procps-ng-4.0.6-h18c060e_0.conda +sha256: 4ce2e1ee31a6217998f78c31ce7dc0a3e0557d9238b51d49dd20c52d467a126d +md5: f2c23a77b25efcad57d377b34bd84941 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- ncurses >=6.5,<7.0a0 +license: GPL-2.0-or-later AND LGPL-2.0-or-later +license_family: GPL +size: 593603 +timestamp: 1769710381284 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pthread-stubs-0.4-hb9d3cd8_1002.conda +sha256: 9c88f8c64590e9567c6c80823f0328e58d3b1efb0e1c539c0315ceca764e0973 +md5: b3c17d95b5a10c6e64a21fa17573e70e +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=13 +license: MIT +license_family: MIT +size: 8252 +timestamp: 1726802366959 +- conda: https://conda.anaconda.org/conda-forge/noarch/pyaml-env-1.2.2-pyhd8ed1ab_0.conda +sha256: 58994e0d2ea8584cb399546e6f6896d771995e6121d1a7b6a2c9948388358932 +md5: e17be1016bcc3516827b836cd3e4d9dc +depends: +- python >=3.9 +- pyyaml >=5.0,<=7.0 +license: MIT +license_family: MIT +size: 14645 +timestamp: 1736766960536 +- conda: https://conda.anaconda.org/conda-forge/noarch/pydantic-2.13.4-pyhcf101f3_0.conda +sha256: 69700e31165df070e9716315e042196aa92525dae5deb5107785847ab9f4189f +md5: 729843edafc0899b3348bd3f19525b9d +depends: +- typing-inspection >=0.4.2 +- typing_extensions >=4.14.1 +- python >=3.10 +- annotated-types >=0.6.0 +- 
pydantic-core ==2.46.4 +- python +license: MIT +license_family: MIT +size: 346511 +timestamp: 1778103405862 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pydantic-core-2.46.4-py314h2e6c369_0.conda +sha256: 802e216c39f1359aed60823b6e11d8ccd812b0ae1c81ae5ac1c81f99446409ab +md5: 0c96993dbeadf3a277cf757b9f1c9412 +depends: +- python +- typing-extensions >=4.6.0,!=4.7.0 +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- python_abi 3.14.* *_cp314 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 1895020 +timestamp: 1778084229247 +- conda: https://conda.anaconda.org/conda-forge/noarch/pygments-2.20.0-pyhd8ed1ab_0.conda +sha256: cf70b2f5ad9ae472b71235e5c8a736c9316df3705746de419b59d442e8348e86 +md5: 16c18772b340887160c79a6acc022db0 +depends: +- python >=3.10 +license: BSD-2-Clause +license_family: BSD +size: 893031 +timestamp: 1774796815820 +- conda: https://conda.anaconda.org/conda-forge/noarch/pysocks-1.7.1-pyha55dd90_7.conda +sha256: ba3b032fa52709ce0d9fd388f63d330a026754587a2f461117cac9ab73d8d0d8 +md5: 461219d1a5bd61342293efa2c0c90eac +depends: +- __unix +- python >=3.9 +license: BSD-3-Clause +license_family: BSD +size: 21085 +timestamp: 1733217331982 +- conda: https://conda.anaconda.org/conda-forge/linux-64/python-3.14.5-habeac84_100_cp314.conda +build_number: 100 +sha256: 55eed9bf2a3f6e90311276f0834737fe7c2d9ec3e5e2e557507858df4c7521e6 +md5: da92e59ff92f2d5ede4f612af20f583f +depends: +- __glibc >=2.17,<3.0.a0 +- bzip2 >=1.0.8,<2.0a0 +- ld_impl_linux-64 >=2.36.1 +- libexpat >=2.8.0,<3.0a0 +- libffi >=3.5.2,<3.6.0a0 +- libgcc >=14 +- liblzma >=5.8.3,<6.0a0 +- libmpdec >=4.0.0,<5.0a0 +- libsqlite >=3.53.1,<4.0a0 +- libuuid >=2.42.1,<3.0a0 +- libzlib >=1.3.2,<2.0a0 +- ncurses >=6.6,<7.0a0 +- openssl >=3.5.6,<4.0a0 +- python_abi 3.14.* *_cp314 +- readline >=8.3,<9.0a0 +- tk >=8.6.13,<8.7.0a0 +- tzdata +- zstd >=1.5.7,<1.6.0a0 +license: Python-2.0 +size: 36745188 +timestamp: 1779236923603 +python_site_packages_path: lib/python3.14/site-packages +- 
conda: https://conda.anaconda.org/conda-forge/noarch/python-dotenv-1.2.2-pyhcf101f3_0.conda +sha256: 74e417a768f59f02a242c25e7db0aa796627b5bc8c818863b57786072aeb85e5 +md5: 130584ad9f3a513cdd71b1fdc1244e9c +depends: +- python >=3.10 +license: BSD-3-Clause +license_family: BSD +size: 27848 +timestamp: 1772388605021 +- conda: https://conda.anaconda.org/conda-forge/noarch/python-gil-3.14.5-h4df99d1_100.conda +sha256: 41dd7da285d71d519257fa7dacb1cae060d5ebfaa5f92cba5994899d2978e943 +md5: 41954747ba952ec4b01e16c2c9e8d8ff +depends: +- cpython 3.14.5.* +- python_abi * *_cp314 +license: Python-2.0 +size: 50212 +timestamp: 1779236703009 +- conda: https://conda.anaconda.org/conda-forge/noarch/python-kaleido-0.2.1-pyhd8ed1ab_0.tar.bz2 +sha256: e17bf63a30aec33432f1ead86e15e9febde9fc40a7f869c0e766be8d2db44170 +md5: 310259a5b03ff02289d7705f39e2b1d2 +depends: +- kaleido-core 0.2.1.* +- python >=3.5 +license: MIT +license_family: MIT +size: 18320 +timestamp: 1615204747600 +- conda: https://conda.anaconda.org/conda-forge/noarch/python_abi-3.14-8_cp314.conda +build_number: 8 +sha256: ad6d2e9ac39751cc0529dd1566a26751a0bf2542adb0c232533d32e176e21db5 +md5: 0539938c55b6b1a59b560e843ad864a4 +constrains: +- python 3.14.* *_cp314 +license: BSD-3-Clause +license_family: BSD +size: 6989 +timestamp: 1752805904792 +- conda: https://conda.anaconda.org/conda-forge/linux-64/pyyaml-6.0.3-py314h67df5f8_1.conda +sha256: b318fb070c7a1f89980ef124b80a0b5ccf3928143708a85e0053cde0169c699d +md5: 2035f68f96be30dc60a5dfd7452c7941 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +- yaml >=0.2.5,<0.3.0a0 +license: MIT +license_family: MIT +size: 202391 +timestamp: 1770223462836 +- conda: https://conda.anaconda.org/conda-forge/linux-64/readline-8.3-h853b02a_0.conda +sha256: 12ffde5a6f958e285aa22c191ca01bbd3d6e710aa852e00618fa6ddc59149002 +md5: d7d95fc8287ea7bf33e0e7116d2b95ec +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- ncurses >=6.5,<7.0a0 
+license: GPL-3.0-only +license_family: GPL +size: 345073 +timestamp: 1765813471974 +- conda: https://conda.anaconda.org/conda-forge/noarch/referencing-0.37.0-pyhcf101f3_0.conda +sha256: 0577eedfb347ff94d0f2fa6c052c502989b028216996b45c7f21236f25864414 +md5: 870293df500ca7e18bedefa5838a22ab +depends: +- attrs >=22.2.0 +- python >=3.10 +- rpds-py >=0.7.0 +- typing_extensions >=4.4.0 +- python +license: MIT +license_family: MIT +size: 51788 +timestamp: 1760379115194 +- conda: https://conda.anaconda.org/conda-forge/linux-64/regex-2026.5.9-py314h5bd0f2a_0.conda +sha256: c7a4aca4977c15c82d053b06cbc676460974c1b25757cfeea8a9a2497ac911f8 +md5: 9dd235b6ac69a0198080dac39f9891aa +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +license: Apache-2.0 AND CNRI-Python +license_family: PSF +size: 413611 +timestamp: 1778374155646 +- conda: https://conda.anaconda.org/conda-forge/noarch/requests-2.34.2-pyhcf101f3_0.conda +sha256: 1715246b19c9f85ee022933b4845f2fc14ac9184981b7b7d9b728bec8e9588da +md5: 4a85203c1d80c1059086ae860836ffb9 +depends: +- python >=3.10 +- certifi >=2023.5.7 +- charset-normalizer >=2,<4 +- idna >=2.5,<4 +- urllib3 >=1.26,<3 +- python +constrains: +- chardet >=3.0.2,<8 +license: Apache-2.0 +license_family: APACHE +size: 68709 +timestamp: 1778851103479 +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-15.0.0-pyhcf101f3_0.conda +sha256: 3d6ba2c0fcdac3196ba2f0615b4104e532525ffa1335b50a2878be5ff488814a +md5: 0242025a3c804966bf71aa04eee82f66 +depends: +- markdown-it-py >=2.2.0 +- pygments >=2.13.0,<3.0.0 +- python >=3.10 +- typing_extensions >=4.0.0,<5.0.0 +- python +license: MIT +license_family: MIT +size: 208577 +timestamp: 1775991661559 +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-click-1.9.7-pyh8f84b5b_0.conda +sha256: aa3fcb167321bae51998de2e94d199109c9024f25a5a063cb1c28d8f1af33436 +md5: 0c20a8ebcddb24a45da89d5e917e6cb9 +depends: +- python >=3.10 +- rich >=12 +- click >=8 +- 
typing-extensions >=4 +- __unix +- python +license: MIT +license_family: MIT +size: 64356 +timestamp: 1769850479089 +- conda: https://conda.anaconda.org/conda-forge/linux-64/rpds-py-0.30.0-py314h2e6c369_0.conda +sha256: e53b0cbf3b324eaa03ca1fe1a688fdf4ab42cea9c25270b0a7307d8aaaa4f446 +md5: c1c368b5437b0d1a68f372ccf01cb133 +depends: +- python +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +- python_abi 3.14.* *_cp314 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 376121 +timestamp: 1764543122774 +- conda: https://conda.anaconda.org/conda-forge/noarch/spectra-0.0.11-pyhd8ed1ab_2.conda +sha256: 7c65782d2511738e62c70462e89d65da4fa54d5a7e47c46667bcd27a59f81876 +md5: 472239e4eb7b5a84bb96b3ed7e3a596a +depends: +- colormath >=3.0.0 +- python >=3.9 +license: MIT +license_family: MIT +size: 22284 +timestamp: 1735770589188 +- conda: https://conda.anaconda.org/conda-forge/linux-64/sqlite-3.53.1-hbc0de68_0.conda +sha256: d167fa92781bcdcd3b9aaa6bb1cd50c5b108f6190c170098a118b5cf5df2f881 +md5: 8e0b8654ead18e50af552e54b5a08a61 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libsqlite 3.53.1 h0c1763c_0 +- libzlib >=1.3.2,<2.0a0 +- ncurses >=6.6,<7.0a0 +- readline >=8.3,<9.0a0 +license: blessing +size: 205399 +timestamp: 1777986477546 +- conda: https://conda.anaconda.org/conda-forge/linux-64/tiktoken-0.12.0-py314h67fec18_3.conda +sha256: 7e395d67fd249d901beb1ae269057763c0d8c3ee5f7a348694bdb16d158a37d9 +md5: d705f9d8a1185a2b01cced191177a028 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +- regex >=2022.1.18 +- requests >=2.26.0 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 939648 +timestamp: 1764028306357 +- conda: https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_h366c992_103.conda +sha256: cafeec44494f842ffeca27e9c8b0c27ed714f93ac77ddadc6aaf726b5554ebac +md5: cffd3bdd58090148f4cfcd831f4b26ab +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc 
>=14 +- libzlib >=1.3.1,<2.0a0 +constrains: +- xorg-libx11 >=1.8.12,<2.0a0 +license: TCL +license_family: BSD +size: 3301196 +timestamp: 1769460227866 +- conda: https://conda.anaconda.org/conda-forge/noarch/tqdm-4.67.3-pyh8f84b5b_0.conda +sha256: 9ef8e47cf00e4d6dcc114eb32a1504cc18206300572ef14d76634ba29dfe1eb6 +md5: e5ce43272193b38c2e9037446c1d9206 +depends: +- python >=3.10 +- __unix +- python +license: MPL-2.0 and MIT +size: 94132 +timestamp: 1770153424136 +- conda: https://conda.anaconda.org/conda-forge/noarch/typeguard-4.5.2-pyhcf101f3_0.conda +sha256: 59d7851d32fddb5b510272e6557aa982edeb927d349648dac27f5bf01d18bb26 +md5: 4460f039b7dedf15f7df086446ca75ae +depends: +- typing_extensions >=4.14.0 +- python >=3.10 +- importlib-metadata >=3.6 +- python +constrains: +- pytest >=7 +license: MIT +license_family: MIT +size: 38297 +timestamp: 1778779291237 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-extensions-4.15.0-h396c80c_0.conda +sha256: 7c2df5721c742c2a47b2c8f960e718c930031663ac1174da67c1ed5999f7938c +md5: edd329d7d3a4ab45dcf905899a7a6115 +depends: +- typing_extensions ==4.15.0 pyhcf101f3_0 +license: PSF-2.0 +license_family: PSF +size: 91383 +timestamp: 1756220668932 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-inspection-0.4.2-pyhcf101f3_2.conda +sha256: 8b90d2f19f9458b8c58a55e1fcdc1d90c1603a847a47654d8a454549413ba60a +md5: 53f5409c5cfd6c5a66417d68e3f0a864 +depends: +- python >=3.10 +- typing_extensions >=4.12.0 +- python +license: MIT +license_family: MIT +size: 20935 +timestamp: 1777105465795 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing_extensions-4.15.0-pyhcf101f3_0.conda +sha256: 032271135bca55aeb156cee361c81350c6f3fb203f57d024d7e5a1fc9ef18731 +md5: 0caa1af407ecff61170c9437a808404d +depends: +- python >=3.10 +- python +license: PSF-2.0 +license_family: PSF +size: 51692 +timestamp: 1756220668932 +- conda: https://conda.anaconda.org/conda-forge/noarch/tzdata-2025c-hc9c84f9_1.conda +sha256: 
1d30098909076af33a35017eed6f2953af1c769e273a0626a04722ac4acaba3c +md5: ad659d0a2b3e47e38d829aa8cad2d610 +license: LicenseRef-Public-Domain +size: 119135 +timestamp: 1767016325805 +- conda: https://conda.anaconda.org/conda-forge/noarch/urllib3-2.7.0-pyhd8ed1ab_0.conda +sha256: feff959a816f7988a0893201aa9727bbb7ee1e9cec2c4f0428269b489eb93fb4 +md5: cbb88288f74dbe6ada1c6c7d0a97223e +depends: +- backports.zstd >=1.0.0 +- brotli-python >=1.2.0 +- h2 >=4,<5 +- pysocks >=1.5.6,<2.0,!=1.5.7 +- python >=3.10 +license: MIT +license_family: MIT +size: 103560 +timestamp: 1778188657149 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxau-1.0.12-hb03c661_1.conda +sha256: 6bc6ab7a90a5d8ac94c7e300cc10beb0500eeba4b99822768ca2f2ef356f731b +md5: b2895afaf55bf96a8c8282a2e47a5de0 +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 15321 +timestamp: 1762976464266 +- conda: https://conda.anaconda.org/conda-forge/linux-64/xorg-libxdmcp-1.1.5-hb03c661_1.conda +sha256: 25d255fb2eef929d21ff660a0c687d38a6d2ccfbcbf0cc6aa738b12af6e9d142 +md5: 1dafce8548e38671bea82e3f5c6ce22f +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 20591 +timestamp: 1762976546182 +- conda: https://conda.anaconda.org/conda-forge/linux-64/yaml-0.2.5-h280c20c_3.conda +sha256: 6d9ea2f731e284e9316d95fa61869fe7bbba33df7929f82693c121022810f4ad +md5: a77f85f77be52ff59391544bfe73390a +depends: +- libgcc >=14 +- __glibc >=2.17,<3.0.a0 +license: MIT +license_family: MIT +size: 85189 +timestamp: 1753484064210 +- conda: https://conda.anaconda.org/conda-forge/noarch/zipp-4.1.0-pyhcf101f3_0.conda +sha256: 210bd31c22bb88f5e2a167df24c95bb5f152b2ada7502f9b8c49d1f5366db423 +md5: ba3dcdc8584155c97c648ae9c044b7a3 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 24190 +timestamp: 1779159948016 +- conda: https://conda.anaconda.org/conda-forge/linux-64/zlib-ng-2.3.3-hceb46e0_1.conda +sha256: 
ea4e50c465d70236408cb0bfe0115609fd14db1adcd8bd30d8918e0291f8a75f +md5: 2aadb0d17215603a82a2a6b0afd9a4cb +depends: +- __glibc >=2.17,<3.0.a0 +- libgcc >=14 +- libstdcxx >=14 +license: Zlib +license_family: Other +size: 122618 +timestamp: 1770167931827 +- conda: https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb78ec9c_6.conda +sha256: 68f0206ca6e98fea941e5717cec780ed2873ffabc0e1ed34428c061e2c6268c7 +md5: 4a13eeac0b5c8e5b8ab496e6c4ddd829 +depends: +- __glibc >=2.17,<3.0.a0 +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 601375 +timestamp: 1764777111296 diff --git a/modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt b/modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt new file mode 100644 index 0000000..3d5b93d --- /dev/null +++ b/modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt @@ -0,0 +1,1476 @@ + +version: 6 +environments: +default: +channels: +- url: https://conda.anaconda.org/conda-forge/ +- url: https://conda.anaconda.org/bioconda/ +- url: https://conda.anaconda.org/bioconda/ +options: +pypi-prerelease-mode: if-necessary-or-explicit +packages: +linux-aarch64: +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/_openmp_mutex-4.5-20_gnu.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/_python_abi3_support-1.0-hd8ed1ab_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/annotated-types-0.7.0-pyhd8ed1ab_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/attrs-26.1.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/backports.zstd-1.5.0-py314h680f03e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/brotli-python-1.2.0-py314h352cb57_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/bzip2-1.0.8-h4777abc_9.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.5.20-hbd8a1cb_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/certifi-2026.5.20-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/charset-normalizer-3.4.7-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/click-8.4.0-pyhc90fa1f_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/coloredlogs-15.0.1-pyhd8ed1ab_4.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/colormath-3.0.0-pyhd8ed1ab_4.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/cpython-3.14.5-py314hd8ed1ab_100.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/expat-2.8.1-hfae3067_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/fontconfig-2.18.0-hba86a56_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/h2-4.3.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/hpack-4.1.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/humanfriendly-10.0-pyh707e725_8.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/humanize-4.15.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/hyperframe-6.1.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/idna-3.15-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/importlib-metadata-9.0.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/jinja2-3.1.6-pyhcf101f3_1.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/jsonschema-4.26.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-specifications-2025.9.1-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/kaleido-core-0.2.1-he5a581e_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lcms2-2.19.1-h9d5b58d_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/ld_impl_linux-aarch64-2.45.1-default_h1979696_102.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lerc-4.1.0-h52b7260_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libblas-3.11.0-7_haddc8a3_openblas.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libcblas-3.11.0-7_hd72aa62_openblas.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libdeflate-1.25-h1af38f5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libexpat-2.8.1-hfae3067_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libffi-3.5.2-h376a255_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype-2.14.3-h8af1aa0_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype6-2.14.3-hdae7a39_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-15.2.0-h8acb6b2_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-ng-15.2.0-he9431aa_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgfortran-15.2.0-he9431aa_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgfortran5-15.2.0-h1b7bec0_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgomp-15.2.0-h8acb6b2_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libjpeg-turbo-3.1.4.1-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/liblapack-3.11.0-7_h88aeb00_openblas.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/liblzma-5.8.3-he30d5cf_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libmpdec-4.0.0-he30d5cf_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libopenblas-0.3.33-pthreads_h9d3fd7e_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libpng-1.6.58-h1abf092_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libsqlite-3.53.1-h022381a_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libstdcxx-15.2.0-hef695bb_19.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libtiff-4.7.1-hdb009f0_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libuuid-2.42.1-h1022ec0_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libwebp-base-1.6.0-ha2e29f5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libxcb-1.17.0-h262b8f6_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libzlib-1.3.2-hdc9db2a_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-3.10.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-it-py-4.2.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/markupsafe-3.0.3-py314hb76de3f_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/mathjax-2.7.7-h8af1aa0_3.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/mdurl-0.1.2-pyhd8ed1ab_1.conda +- conda: https://conda.anaconda.org/bioconda/noarch/multiqc-1.35-pyhdfd78af_1.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/narwhals-2.21.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/natsort-8.4.0-pyhcf101f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/ncurses-6.6-hf8d1292_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/networkx-3.6.1-pyhcf101f3_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/nspr-4.38-h3ad9384_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/nss-3.118-h544fa81_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/numpy-2.4.6-py314he1698a1_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openjpeg-2.5.4-h5da879a_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openssl-3.6.2-h546c87b_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/packaging-26.2-pyhc364b38_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pillow-12.2.0-py314hac3e5ec_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/plotly-6.6.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/polars-1.41.0-pyh58ad624_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/polars-runtime-32-1.41.0-py310h32c7c23_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/polars-runtime-compat-1.41.0-py310hc0e61be_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/procps-ng-4.0.6-h1779866_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pthread-stubs-0.4-h86ecc28_1002.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pyaml-env-1.2.2-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pydantic-2.13.4-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pydantic-core-2.46.4-py314h451b6cc_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pygments-2.20.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/pysocks-1.7.1-pyha55dd90_7.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/python-3.14.5-hfd9ac0a_100_cp314.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/python-dotenv-1.2.2-pyhcf101f3_0.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/python-gil-3.14.5-h4df99d1_100.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/python-kaleido-0.2.1-pyhd8ed1ab_0.tar.bz2 +- conda: https://conda.anaconda.org/conda-forge/noarch/python_abi-3.14-8_cp314.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pyyaml-6.0.3-py314h807365f_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/readline-8.3-hb682ff5_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/referencing-0.37.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/regex-2026.5.9-py314h51f160d_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/requests-2.34.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-15.0.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-click-1.9.7-pyh8f84b5b_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/rpds-py-0.30.0-py314h02b7a91_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/spectra-0.0.11-pyhd8ed1ab_2.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/sqlite-3.53.1-he8854b5_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/tiktoken-0.12.0-py314h6a36e60_3.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/tk-8.6.13-noxft_h0dc03b3_103.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/tqdm-4.67.3-pyh8f84b5b_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typeguard-4.5.2-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-extensions-4.15.0-h396c80c_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-inspection-0.4.2-pyhcf101f3_2.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/typing_extensions-4.15.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/tzdata-2025c-hc9c84f9_1.conda +- conda: 
https://conda.anaconda.org/conda-forge/noarch/urllib3-2.7.0-pyhd8ed1ab_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxau-1.0.12-he30d5cf_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxdmcp-1.1.5-he30d5cf_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/yaml-0.2.5-h80f16a2_3.conda +- conda: https://conda.anaconda.org/conda-forge/noarch/zipp-4.1.0-pyhcf101f3_0.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zlib-ng-2.3.3-ha7cb516_1.conda +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zstd-1.5.7-h85ac4a6_6.conda +packages: +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/_openmp_mutex-4.5-20_gnu.conda +build_number: 20 +sha256: a2527b1d81792a0ccd2c05850960df119c2b6d8f5fdec97f2db7d25dc23b1068 +md5: 468fd3bb9e1f671d36c2cbc677e56f1d +depends: +- libgomp >=7.5.0 +constrains: +- openmp_impl <0.0a0 +license: BSD-3-Clause +license_family: BSD +size: 28926 +timestamp: 1770939656741 +- conda: https://conda.anaconda.org/conda-forge/noarch/_python_abi3_support-1.0-hd8ed1ab_2.conda +sha256: a3967b937b9abf0f2a99f3173fa4630293979bd1644709d89580e7c62a544661 +md5: aaa2a381ccc56eac91d63b6c1240312f +depends: +- cpython +- python-gil +license: MIT +license_family: MIT +size: 8191 +timestamp: 1744137672556 +- conda: https://conda.anaconda.org/conda-forge/noarch/annotated-types-0.7.0-pyhd8ed1ab_1.conda +sha256: e0ea1ba78fbb64f17062601edda82097fcf815012cf52bb704150a2668110d48 +md5: 2934f256a8acfe48f6ebb4fce6cde29c +depends: +- python >=3.9 +- typing-extensions >=4.0.0 +license: MIT +license_family: MIT +size: 18074 +timestamp: 1733247158254 +- conda: https://conda.anaconda.org/conda-forge/noarch/attrs-26.1.0-pyhcf101f3_0.conda +sha256: 1b6124230bb4e571b1b9401537ecff575b7b109cc3a21ee019f65e083b8399ab +md5: c6b0543676ecb1fb2d7643941fe375f2 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 64927 +timestamp: 1773935801332 +- 
conda: https://conda.anaconda.org/conda-forge/noarch/backports.zstd-1.5.0-py314h680f03e_0.conda +noarch: generic +sha256: a1c97297e867776760489537bc5ae36fa83a154be30e3b79385a39ca4cb058fe +md5: 1133126d840e75287d83947be3fc3e71 +depends: +- python >=3.14 +license: BSD-3-Clause AND MIT AND EPL-2.0 +size: 7533 +timestamp: 1778594057496 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/brotli-python-1.2.0-py314h352cb57_1.conda +sha256: 5a5b0cdcd7ed89c6a8fb830924967f6314a2b71944bc1ebc2c105781ba97aa75 +md5: a1b5c571a0923a205d663d8678df4792 +depends: +- libgcc >=14 +- libstdcxx >=14 +- python >=3.14,<3.15.0a0 +- python >=3.14,<3.15.0a0 *_cp314 +- python_abi 3.14.* *_cp314 +constrains: +- libbrotlicommon 1.2.0 he30d5cf_1 +license: MIT +license_family: MIT +size: 373193 +timestamp: 1764017486851 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/bzip2-1.0.8-h4777abc_9.conda +sha256: b3495077889dde6bb370938e7db82be545c73e8589696ad0843a32221520ad4c +md5: 840d8fc0d7b3209be93080bc20e07f2d +depends: +- libgcc >=14 +license: bzip2-1.0.6 +license_family: BSD +size: 192412 +timestamp: 1771350241232 +- conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2026.5.20-hbd8a1cb_0.conda +sha256: 9812a303a1395e1dafbd92e5bc8a1ff6013bcbba0a09c7f03a8d23e43560aa9b +md5: 489b8e97e666c93f68fdb35c3c9b957f +depends: +- __unix +license: ISC +size: 129868 +timestamp: 1779289852439 +- conda: https://conda.anaconda.org/conda-forge/noarch/certifi-2026.5.20-pyhd8ed1ab_0.conda +sha256: 645655a3510e38e625da136595f3f16f2130c3263630cc3bc8f60f619ddbe490 +md5: 9fefff2f745ea1cc2ef15211a20c054a +depends: +- python >=3.10 +license: ISC +size: 134201 +timestamp: 1779285131141 +- conda: https://conda.anaconda.org/conda-forge/noarch/charset-normalizer-3.4.7-pyhd8ed1ab_0.conda +sha256: 3f9483d62ce24ecd063f8a5a714448445dc8d9e201147c46699fc0033e824457 +md5: a9167b9571f3baa9d448faa2139d1089 +depends: +- python >=3.10 +license: MIT +license_family: MIT +size: 58872 +timestamp: 
1775127203018 +- conda: https://conda.anaconda.org/conda-forge/noarch/click-8.4.0-pyhc90fa1f_0.conda +sha256: 99ab8ef815c4520cce3a7482c2513f377c14348206857661d84c76a55e030f97 +md5: 003767c47f1f0a474c4de268b57839c3 +depends: +- __unix +- python +- python >=3.10 +license: BSD-3-Clause +license_family: BSD +size: 104631 +timestamp: 1779108494556 +- conda: https://conda.anaconda.org/conda-forge/noarch/coloredlogs-15.0.1-pyhd8ed1ab_4.conda +sha256: 8021c76eeadbdd5784b881b165242db9449783e12ce26d6234060026fd6a8680 +md5: b866ff7007b934d564961066c8195983 +depends: +- humanfriendly >=9.1 +- python >=3.9 +license: MIT +license_family: MIT +size: 43758 +timestamp: 1733928076798 +- conda: https://conda.anaconda.org/conda-forge/noarch/colormath-3.0.0-pyhd8ed1ab_4.conda +sha256: 59c9e29800b483b390467f90e82b0da3a4fbf0612efe1c90813fca232780e160 +md5: 071cf7b0ce333c81718b054066c15102 +depends: +- networkx >=2.0 +- numpy +- python >=3.9 +license: BSD-3-Clause +license_family: BSD +size: 39326 +timestamp: 1735759976140 +- conda: https://conda.anaconda.org/conda-forge/noarch/cpython-3.14.5-py314hd8ed1ab_100.conda +noarch: generic +sha256: 777882d2685f368417f31bbe1b28f73687fc6c8f6a5768bda20ffeefa6b07f5b +md5: a749029ce5d0632a913db19d17f944ab +depends: +- python >=3.14,<3.15.0a0 +- python_abi * *_cp314 +license: Python-2.0 +size: 50212 +timestamp: 1779236682725 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/expat-2.8.1-hfae3067_0.conda +sha256: a9cd5eb1700e11cc39acc36630a2d72a4e317943bd7c5695cd8804419f04ff42 +md5: 89f0247b3cea528d8ad1a6664a313153 +depends: +- libexpat 2.8.1 hfae3067_0 +- libgcc >=14 +license: MIT +license_family: MIT +size: 140114 +timestamp: 1779278679081 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-dejavu-sans-mono-2.37-hab24e00_0.tar.bz2 +sha256: 58d7f40d2940dd0a8aa28651239adbf5613254df0f75789919c4e6762054403b +md5: 0c96522c6bdaed4b1566d11387caaf45 +license: BSD-3-Clause +license_family: BSD +size: 397370 +timestamp: 1566932522327 
+- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-inconsolata-3.000-h77eed37_0.tar.bz2 +sha256: c52a29fdac682c20d252facc50f01e7c2e7ceac52aa9817aaf0bb83f7559ec5c +md5: 34893075a5c9e55cdafac56607368fc6 +license: OFL-1.1 +license_family: Other +size: 96530 +timestamp: 1620479909603 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-source-code-pro-2.038-h77eed37_0.tar.bz2 +sha256: 00925c8c055a2275614b4d983e1df637245e19058d79fc7dd1a93b8d9fb4b139 +md5: 4d59c254e01d9cde7957100457e2d5fb +license: OFL-1.1 +license_family: Other +size: 700814 +timestamp: 1620479612257 +- conda: https://conda.anaconda.org/conda-forge/noarch/font-ttf-ubuntu-0.83-h77eed37_3.conda +sha256: 2821ec1dc454bd8b9a31d0ed22a7ce22422c0aef163c59f49dfdf915d0f0ca14 +md5: 49023d73832ef61042f6a237cb2687e7 +license: LicenseRef-Ubuntu-Font-Licence-Version-1.0 +license_family: Other +size: 1620504 +timestamp: 1727511233259 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/fontconfig-2.18.0-hba86a56_0.conda +sha256: 1805f4ab3d9e1734a5a17abccc2cb0fdade51d4d5f29bdc410600ea0115ec050 +md5: b660d59a9d0fb3297327418624acaec3 +depends: +- libexpat >=2.8.1,<3.0a0 +- libfreetype >=2.14.3 +- libfreetype6 >=2.14.3 +- libgcc >=14 +- libuuid >=2.42.1,<3.0a0 +- libzlib >=1.3.2,<2.0a0 +license: MIT +license_family: MIT +size: 293348 +timestamp: 1779421661332 +- conda: https://conda.anaconda.org/conda-forge/noarch/fonts-conda-forge-1-hc364b38_1.conda +sha256: 54eea8469786bc2291cc40bca5f46438d3e062a399e8f53f013b6a9f50e98333 +md5: a7970cd949a077b7cb9696379d338681 +depends: +- font-ttf-ubuntu +- font-ttf-inconsolata +- font-ttf-dejavu-sans-mono +- font-ttf-source-code-pro +license: BSD-3-Clause +license_family: BSD +size: 4059 +timestamp: 1762351264405 +- conda: https://conda.anaconda.org/conda-forge/noarch/h2-4.3.0-pyhcf101f3_0.conda +sha256: 84c64443368f84b600bfecc529a1194a3b14c3656ee2e832d15a20e0329b6da3 +md5: 164fc43f0b53b6e3a7bc7dce5e4f1dc9 +depends: +- python >=3.10 +- hyperframe 
>=6.1,<7 +- hpack >=4.1,<5 +- python +license: MIT +license_family: MIT +size: 95967 +timestamp: 1756364871835 +- conda: https://conda.anaconda.org/conda-forge/noarch/hpack-4.1.0-pyhd8ed1ab_0.conda +sha256: 6ad78a180576c706aabeb5b4c8ceb97c0cb25f1e112d76495bff23e3779948ba +md5: 0a802cb9888dd14eeefc611f05c40b6e +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 30731 +timestamp: 1737618390337 +- conda: https://conda.anaconda.org/conda-forge/noarch/humanfriendly-10.0-pyh707e725_8.conda +sha256: fa2071da7fab758c669e78227e6094f6b3608228740808a6de5d6bce83d9e52d +md5: 7fe569c10905402ed47024fc481bb371 +depends: +- __unix +- python >=3.9 +license: MIT +license_family: MIT +size: 73563 +timestamp: 1733928021866 +- conda: https://conda.anaconda.org/conda-forge/noarch/humanize-4.15.0-pyhd8ed1ab_0.conda +sha256: 6c4343b376d0b12a4c75ab992640970d36c933cad1fd924f6a1181fa91710e80 +md5: daddf757c3ecd6067b9af1df1f25d89e +depends: +- python >=3.10 +license: MIT +license_family: MIT +size: 67994 +timestamp: 1766267728652 +- conda: https://conda.anaconda.org/conda-forge/noarch/hyperframe-6.1.0-pyhd8ed1ab_0.conda +sha256: 77af6f5fe8b62ca07d09ac60127a30d9069fdc3c68d6b256754d0ffb1f7779f8 +md5: 8e6923fc12f1fe8f8c4e5c9f343256ac +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 17397 +timestamp: 1737618427549 +- conda: https://conda.anaconda.org/conda-forge/noarch/idna-3.15-pyhcf101f3_0.conda +sha256: 3d25f9f6f7ab3e1ce6429fc8c8aae0335cf446692e715068488536d220cc43de +md5: 1b9083b7f00609605d1483dbc6071a81 +depends: +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 62642 +timestamp: 1779294335905 +- conda: https://conda.anaconda.org/conda-forge/noarch/importlib-metadata-9.0.0-pyhcf101f3_0.conda +sha256: 43e2a5497cad1598ff88a3e69f69bc88b7b8f141fa63c60eab5db296317318b8 +md5: ffc17e785d64e12fc311af9184221839 +depends: +- python >=3.10 +- zipp >=3.20 +- python +license: Apache-2.0 +size: 34766 +timestamp: 1779714582554 +- conda: 
https://conda.anaconda.org/conda-forge/noarch/jinja2-3.1.6-pyhcf101f3_1.conda +sha256: fc9ca7348a4f25fed2079f2153ecdcf5f9cf2a0bc36c4172420ca09e1849df7b +md5: 04558c96691bed63104678757beb4f8d +depends: +- markupsafe >=2.0 +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 120685 +timestamp: 1764517220861 +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-4.26.0-pyhcf101f3_0.conda +sha256: db973a37d75db8e19b5f44bbbdaead0c68dde745407f281e2a7fe4db74ec51d7 +md5: ada41c863af263cc4c5fcbaff7c3e4dc +depends: +- attrs >=22.2.0 +- jsonschema-specifications >=2023.3.6 +- python >=3.10 +- referencing >=0.28.4 +- rpds-py >=0.25.0 +- python +license: MIT +license_family: MIT +size: 82356 +timestamp: 1767839954256 +- conda: https://conda.anaconda.org/conda-forge/noarch/jsonschema-specifications-2025.9.1-pyhcf101f3_0.conda +sha256: 0a4f3b132f0faca10c89fdf3b60e15abb62ded6fa80aebfc007d05965192aa04 +md5: 439cd0f567d697b20a8f45cb70a1005a +depends: +- python >=3.10 +- referencing >=0.31.0 +- python +license: MIT +license_family: MIT +size: 19236 +timestamp: 1757335715225 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/kaleido-core-0.2.1-he5a581e_0.tar.bz2 +sha256: d3c7f4797566e6f983d16c2a87063a18e4b2d819a66230190a21584d70042755 +md5: 4f0d284f5d11e04277b552eb1c172c7f +depends: +- __glibc >=2.17,<3.0.a0 +- expat >=2.2.10,<3.0.0a0 +- fontconfig +- fonts-conda-forge +- libgcc-ng >=9.3.0 +- mathjax 2.7.* +- nspr >=4.29,<5.0a0 +- nss >=3.62,<4.0a0 +- sqlite >=3.34.0,<4.0a0 +license: MIT +license_family: MIT +size: 65750397 +timestamp: 1615199465742 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lcms2-2.19.1-h9d5b58d_0.conda +sha256: 1e5f68e4b36a0e1a278c6dc026bc3d7775518a15832cbc9d7fc1c0e4c47784b1 +md5: b1f8bee3c53a6d2c103fb4a1ae44f5c4 +depends: +- libgcc >=14 +- libjpeg-turbo >=3.1.4.1,<4.0a0 +- libtiff >=4.7.1,<4.8.0a0 +license: MIT +license_family: MIT +size: 296899 +timestamp: 1778079402392 +- conda: 
https://conda.anaconda.org/conda-forge/linux-aarch64/ld_impl_linux-aarch64-2.45.1-default_h1979696_102.conda +sha256: 7abd913d81a9bf00abb699e8987966baa2065f5132e37e815f92d90fc6bba530 +md5: a21644fc4a83da26452a718dc9468d5f +depends: +- zstd >=1.5.7,<1.6.0a0 +constrains: +- binutils_impl_linux-aarch64 2.45.1 +license: GPL-3.0-only +license_family: GPL +size: 875596 +timestamp: 1774197520746 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/lerc-4.1.0-h52b7260_0.conda +sha256: 8957fd460c1c132c8031f65fd5f56ec3807fd71b7cab2c5e2b0937b13404ab36 +md5: d13423b06447113a90b5b1366d4da171 +depends: +- libgcc >=14 +- libstdcxx >=14 +license: Apache-2.0 +license_family: Apache +size: 240444 +timestamp: 1773114901155 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libblas-3.11.0-7_haddc8a3_openblas.conda +build_number: 7 +sha256: f27ba323c2f1e1731b5e880fe520f178f55047f25be94f77e649605b2343c066 +md5: e8d07b777f6ff1fab69665336561910b +depends: +- libopenblas >=0.3.33,<0.3.34.0a0 +- libopenblas >=0.3.33,<1.0a0 +constrains: +- liblapack 3.11.0 7*_openblas +- libcblas 3.11.0 7*_openblas +- mkl <2027 +- blas 2.307 openblas +- liblapacke 3.11.0 7*_openblas +license: BSD-3-Clause +license_family: BSD +size: 18696 +timestamp: 1778489796402 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libcblas-3.11.0-7_hd72aa62_openblas.conda +build_number: 7 +sha256: c8f0192362966df0828419f042d6f94c079e5df00ad6bd05b5e84c12b42f8cc7 +md5: 90ac57b82c055faa9be25031864b7d8f +depends: +- libblas 3.11.0 7_haddc8a3_openblas +constrains: +- liblapack 3.11.0 7*_openblas +- blas 2.307 openblas +- liblapacke 3.11.0 7*_openblas +license: BSD-3-Clause +license_family: BSD +size: 18664 +timestamp: 1778489802790 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libdeflate-1.25-h1af38f5_0.conda +sha256: 48814b73bd462da6eed2e697e30c060ae16af21e9fbed30d64feaf0aad9da392 +md5: a9138815598fe6b91a1d6782ca657b0c +depends: +- libgcc >=14 +license: MIT +license_family: MIT 
+size: 71117 +timestamp: 1761979776756 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libexpat-2.8.1-hfae3067_0.conda +sha256: 1fc392b997c6ee2bd3226a7cd870d0edbcbb367e25f9f18dd4a7025fced6efc0 +md5: 513dd884361dfb8a554298ed69b58823 +depends: +- libgcc >=14 +constrains: +- expat 2.8.1.* +license: MIT +license_family: MIT +size: 77140 +timestamp: 1779278671302 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libffi-3.5.2-h376a255_0.conda +sha256: 3df4c539449aabc3443bbe8c492c01d401eea894603087fca2917aa4e1c2dea9 +md5: 2f364feefb6a7c00423e80dcb12db62a +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 55952 +timestamp: 1769456078358 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype-2.14.3-h8af1aa0_0.conda +sha256: 752e4f66283d7deb4c6fd47d88df644d8daa2aaa825a54f3bf350a625190192a +md5: a229e22d4d8814a07702b0919d8e6701 +depends: +- libfreetype6 >=2.14.3 +license: GPL-2.0-only OR FTL +size: 8125 +timestamp: 1774301094057 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libfreetype6-2.14.3-hdae7a39_0.conda +sha256: 8e6b27fe4eec4c2fa7b7769a21973734c8dba1de80086fb0213e58375ac09f4c +md5: b99ed99e42dafb27889483b3098cace7 +depends: +- libgcc >=14 +- libpng >=1.6.55,<1.7.0a0 +- libzlib >=1.3.2,<2.0a0 +constrains: +- freetype >=2.14.3 +license: GPL-2.0-only OR FTL +size: 422941 +timestamp: 1774301093473 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-15.2.0-h8acb6b2_19.conda +sha256: 4592b096e553f67799ae70d4b6167eeda3ec74587d68c7aecbf4e7b1df136681 +md5: f35b3f52d0a2ec4ffe3c89ba135cdb9a +depends: +- _openmp_mutex >=4.5 +constrains: +- libgomp 15.2.0 h8acb6b2_19 +- libgcc-ng ==15.2.0=*_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 622462 +timestamp: 1778268755949 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgcc-ng-15.2.0-he9431aa_19.conda +sha256: 1137f93f477f56199ded24117430045a0c02cbe8b10031beac3b9ad2138539d3 +md5: 
770cf892e5530f43e63cadc673e85653 +depends: +- libgcc 15.2.0 h8acb6b2_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27738 +timestamp: 1778268759211 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgfortran-15.2.0-he9431aa_19.conda +sha256: e5ad94be72634233510b33ba792a3339921bd468f0b8bc6961ea05eded251d9b +md5: c7a5b5decf969ead5ecada83654164cf +depends: +- libgfortran5 15.2.0 h1b7bec0_19 +constrains: +- libgfortran-ng ==15.2.0=*_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 27728 +timestamp: 1778268784621 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgfortran5-15.2.0-h1b7bec0_19.conda +sha256: af8e9bdcaa77f133a8ee4c1ef57ef564d9c45aa262abf9f5ef9b50eb99d96407 +md5: 779dbb494de6d3d6477cab52eb34285a +depends: +- libgcc >=15.2.0 +constrains: +- libgfortran 15.2.0 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 1487244 +timestamp: 1778268767295 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libgomp-15.2.0-h8acb6b2_19.conda +sha256: 2370ef0ffcbae5bede3c4bf136add4abc257245eb91f724c99bb4a43116c5a83 +md5: c5e8a379c4a2ec2aea4ba22758c001d9 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 587387 +timestamp: 1778268674393 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libjpeg-turbo-3.1.4.1-he30d5cf_0.conda +sha256: e97ec2af5f09f8f6ea8ecd550055c95ae80fae22015fcfadaa94eafe025c9ccc +md5: a85ba48648f6868016f2741fd9170250 +depends: +- libgcc >=14 +constrains: +- jpeg <0.0.0a +license: IJG AND BSD-3-Clause AND Zlib +size: 693143 +timestamp: 1775962625956 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/liblapack-3.11.0-7_h88aeb00_openblas.conda +build_number: 7 +sha256: 20b38a0156200ac65f597bf0a93914c565435f2cc58b1042581854231a99ac35 +md5: 5899cbd743cc74fd655c1ed2af7120f3 +depends: +- libblas 3.11.0 7_haddc8a3_openblas +constrains: +- libcblas 3.11.0 7*_openblas +- blas 2.307 openblas +- 
liblapacke 3.11.0 7*_openblas +license: BSD-3-Clause +license_family: BSD +size: 18685 +timestamp: 1778489809140 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/liblzma-5.8.3-he30d5cf_0.conda +sha256: d61962b9cd54c3554361550203c64d5b65b71e3058a285b66e4b04b9769f0a5c +md5: 76298a9e6d71ee6e832a8d0d7373b261 +depends: +- libgcc >=14 +constrains: +- xz 5.8.3.* +license: 0BSD +size: 126102 +timestamp: 1775828008518 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libmpdec-4.0.0-he30d5cf_1.conda +sha256: 57c0dd12d506e84541c4e877898bd2a59cca141df493d34036f18b2751e0a453 +md5: 7b9813e885482e3ccb1fa212b86d7fd0 +depends: +- libgcc >=14 +license: BSD-2-Clause +license_family: BSD +size: 114056 +timestamp: 1769482343003 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libopenblas-0.3.33-pthreads_h9d3fd7e_0.conda +sha256: b018ecfb05e75a8eea3f21f6b5c5c2a54b5178bdcf19e2e2df2735740214a8c8 +md5: 58a66cd95e9692f08abe89f55a6f3f12 +depends: +- libgcc >=14 +- libgfortran +- libgfortran5 >=14.3.0 +constrains: +- openblas >=0.3.33,<0.3.34.0a0 +license: BSD-3-Clause +license_family: BSD +size: 5121336 +timestamp: 1776993423004 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libpng-1.6.58-h1abf092_0.conda +sha256: 483eaa53da40a6a3e558709d9f7b1ca388735364ae21a1ba58cf942514649c92 +md5: f51503ac45a4888bce71af9027a2ecc9 +depends: +- libgcc >=14 +- libzlib >=1.3.2,<2.0a0 +license: zlib-acknowledgement +size: 341202 +timestamp: 1776315188425 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libsqlite-3.53.1-h022381a_0.conda +sha256: ad03b7d8e4d08001f0df88ee7a56108bb35bae4795a42b9a04cc1abfa822bd07 +md5: 2ec1119217d8f0d086e9a62f3cb0e5ea +depends: +- libgcc >=14 +- libzlib >=1.3.2,<2.0a0 +license: blessing +size: 955361 +timestamp: 1777986487553 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libstdcxx-15.2.0-hef695bb_19.conda +sha256: 1dadc45e599f510dd5f97141dddcdbb9844d9f1430c1f3a38075cf1c58f87b4e +md5: 
543fbc8d71f2a0baf04cf88ce96cb8bb +depends: +- libgcc 15.2.0 h8acb6b2_19 +constrains: +- libstdcxx-ng ==15.2.0=*_19 +license: GPL-3.0-only WITH GCC-exception-3.1 +license_family: GPL +size: 5546559 +timestamp: 1778268777463 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libtiff-4.7.1-hdb009f0_1.conda +sha256: 7ff79470db39e803e21b8185bc8f19c460666d5557b1378d1b1e857d929c6b39 +md5: 8c6fd84f9c87ac00636007c6131e457d +depends: +- lerc >=4.0.0,<5.0a0 +- libdeflate >=1.25,<1.26.0a0 +- libgcc >=14 +- libjpeg-turbo >=3.1.0,<4.0a0 +- liblzma >=5.8.1,<6.0a0 +- libstdcxx >=14 +- libwebp-base >=1.6.0,<2.0a0 +- libzlib >=1.3.1,<2.0a0 +- zstd >=1.5.7,<1.6.0a0 +license: HPND +size: 488407 +timestamp: 1762022048105 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libuuid-2.42.1-h1022ec0_0.conda +sha256: 1628839b062e98b2192857d4da8496ac9ac6b0dbb77aa040c34efc9192c440ee +md5: 0f42f9fedd2a32d798de95a7f65c456f +depends: +- libgcc >=14 +license: BSD-3-Clause +license_family: BSD +size: 43453 +timestamp: 1779118526838 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libwebp-base-1.6.0-ha2e29f5_0.conda +sha256: b03700a1f741554e8e5712f9b06dd67e76f5301292958cd3cb1ac8c6fdd9ed25 +md5: 24e92d0942c799db387f5c9d7b81f1af +depends: +- libgcc >=14 +constrains: +- libwebp 1.6.0 +license: BSD-3-Clause +license_family: BSD +size: 359496 +timestamp: 1752160685488 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libxcb-1.17.0-h262b8f6_0.conda +sha256: 461cab3d5650ac6db73a367de5c8eca50363966e862dcf60181d693236b1ae7b +md5: cd14ee5cca2464a425b1dbfc24d90db2 +depends: +- libgcc >=13 +- pthread-stubs +- xorg-libxau >=1.0.11,<2.0a0 +- xorg-libxdmcp +license: MIT +license_family: MIT +size: 397493 +timestamp: 1727280745441 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/libzlib-1.3.2-hdc9db2a_2.conda +sha256: eb111e32e5a7313a5bf799c7fb2419051fa2fe7eff74769fac8d5a448b309f7f +md5: 502006882cf5461adced436e410046d1 +constrains: +- zlib 1.3.2 *_2 
+license: Zlib +license_family: Other +size: 69833 +timestamp: 1774072605429 +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-3.10.2-pyhcf101f3_0.conda +sha256: 20e0892592a3e7c683e3d66df704a9425d731486a97c34fc56af4da1106b2b6b +md5: ba0a9221ce1063f31692c07370d062f3 +depends: +- importlib-metadata >=4.4 +- python >=3.10 +- python +license: BSD-3-Clause +license_family: BSD +size: 85893 +timestamp: 1770694658918 +- conda: https://conda.anaconda.org/conda-forge/noarch/markdown-it-py-4.2.0-pyhd8ed1ab_0.conda +sha256: 0c4c35376fe920714390d46e4b8d31c876d65f18e1655899e0763ec25f2a902f +md5: 6d03368f2b2b0a5fb6839df53b2eb5e0 +depends: +- mdurl >=0.1,<1 +- python >=3.10 +license: MIT +license_family: MIT +size: 69017 +timestamp: 1778169663339 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/markupsafe-3.0.3-py314hb76de3f_1.conda +sha256: 383c188496d13a55658c06e61e7d4cdff2c9f9d5a0648769fca8250bece7e0ef +md5: e5de3c36dd548b35ff2a8aa49208dcb3 +depends: +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +constrains: +- jinja2 >=3.0.0 +license: BSD-3-Clause +license_family: BSD +size: 27913 +timestamp: 1772446407659 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/mathjax-2.7.7-h8af1aa0_3.tar.bz2 +sha256: 8fd4c79d6eda3d4cba73783114305a53a154ada4d1e334d4e02cb3521429599b +md5: 7b08314a6867a9d5648a1c3265e9eb8e +license: Apache-2.0 +license_family: Apache +size: 22257008 +timestamp: 1662784555011 +- conda: https://conda.anaconda.org/conda-forge/noarch/mdurl-0.1.2-pyhd8ed1ab_1.conda +sha256: 78c1bbe1723449c52b7a9df1af2ee5f005209f67e40b6e1d3c7619127c43b1c7 +md5: 592132998493b3ff25fd7479396e8351 +depends: +- python >=3.9 +license: MIT +license_family: MIT +size: 14465 +timestamp: 1733255681319 +- conda: https://conda.anaconda.org/bioconda/noarch/multiqc-1.35-pyhdfd78af_1.conda +sha256: e86033aa55a9e915e2d0957e770bdb81e3feb26a227d1adb17f9d6c528da6a71 +md5: cdb20309681ba3ce8f52c110e214d4f3 +depends: +- click +- coloredlogs +- 
humanize +- importlib-metadata +- jinja2 >=3.0.0 +- jsonschema +- markdown +- natsort +- numpy +- packaging +- pillow >=10.2.0 +- plotly >=5.18 +- polars >=1.34.0 +- polars-runtime-compat >=1.34.0 +- pyaml-env +- pydantic >=2.7.1 +- python >=3.9,!=3.14.1 +- python-dotenv +- python-kaleido 0.2.1 +- pyyaml >=4 +- requests +- rich >=10 +- rich-click +- spectra >=0.0.10 +- tiktoken +- tqdm +- typeguard >=4 +license: GPL-3.0-or-later +license_family: GPL3 +size: 4282188 +timestamp: 1779465338806 +- conda: https://conda.anaconda.org/conda-forge/noarch/narwhals-2.21.2-pyhcf101f3_0.conda +sha256: 70f43d62450927d51673eecd8823e14f5b3cfebdb43cda1d502eba97162bab42 +md5: 6687827c332121727ce383919e1ec8c2 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 284323 +timestamp: 1778929680962 +- conda: https://conda.anaconda.org/conda-forge/noarch/natsort-8.4.0-pyhcf101f3_2.conda +sha256: aeb1548eb72e4f198e72f19d242fb695b35add2ac7b2c00e0d83687052867680 +md5: e941e85e273121222580723010bd4fa2 +depends: +- python >=3.9 +- python +license: MIT +license_family: MIT +size: 39262 +timestamp: 1770905275632 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/ncurses-6.6-hf8d1292_0.conda +sha256: 369db85c5cd8d99dde364ce70725d76511d9c8199e5b820c740414091bf5bcca +md5: b2a43456aa56fe80c2477a5094899eff +depends: +- libgcc >=14 +license: X11 AND BSD-3-Clause +size: 960036 +timestamp: 1777422174534 +- conda: https://conda.anaconda.org/conda-forge/noarch/networkx-3.6.1-pyhcf101f3_0.conda +sha256: f6a82172afc50e54741f6f84527ef10424326611503c64e359e25a19a8e4c1c6 +md5: a2c1eeadae7a309daed9d62c96012a2b +depends: +- python >=3.11 +- python +constrains: +- numpy >=1.25 +- scipy >=1.11.2 +- matplotlib-base >=3.8 +- pandas >=2.0 +license: BSD-3-Clause +license_family: BSD +size: 1587439 +timestamp: 1765215107045 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/nspr-4.38-h3ad9384_0.conda +sha256: 
78a06e89285fef242e272998b292c1e621e3ee3dd4fba62ec014e503c7ec118f +md5: 6dd4f07147774bf720075a210f8026b9 +depends: +- libgcc >=14 +- libstdcxx >=14 +license: MPL-2.0 +license_family: MOZILLA +size: 235140 +timestamp: 1762350120355 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/nss-3.118-h544fa81_0.conda +sha256: 48942696889367ffd448f8dccfc080fb7e130b9938a4a3b6b20ef8e6af856463 +md5: 4540f9570d12db2150f42ba036154552 +depends: +- libgcc >=14 +- libsqlite >=3.51.0,<4.0a0 +- libstdcxx >=14 +- libzlib >=1.3.1,<2.0a0 +- nspr >=4.38,<5.0a0 +license: MPL-2.0 +license_family: MOZILLA +size: 2061869 +timestamp: 1763490303490 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/numpy-2.4.6-py314he1698a1_0.conda +sha256: 04af718b911f8a3a0095481c7e283aa081a175fe626eccbc2c5644bcb2aba9a1 +md5: 8b173772deea177b45d2a133b509b3f7 +depends: +- python +- libstdcxx >=14 +- libgcc >=14 +- python_abi 3.14.* *_cp314 +- libblas >=3.9.0,<4.0a0 +- liblapack >=3.9.0,<4.0a0 +- libcblas >=3.9.0,<4.0a0 +constrains: +- numpy-base <0a0 +license: BSD-3-Clause +license_family: BSD +size: 8002900 +timestamp: 1779169206742 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openjpeg-2.5.4-h5da879a_0.conda +sha256: bd1bc8bdde5e6c5cbac42d462b939694e40b59be6d0698f668515908640c77b8 +md5: cea962410e327262346d48d01f05936c +depends: +- libgcc >=14 +- libpng >=1.6.50,<1.7.0a0 +- libstdcxx >=14 +- libtiff >=4.7.1,<4.8.0a0 +- libzlib >=1.3.1,<2.0a0 +license: BSD-2-Clause +license_family: BSD +size: 392636 +timestamp: 1758489353577 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/openssl-3.6.2-h546c87b_0.conda +sha256: 348cb74c1530ac241215d047ef65d134cf797af935c97a68655319362b7e6a01 +md5: 3b129669089e4d6a5c6871dbb4669b99 +depends: +- ca-certificates +- libgcc >=14 +license: Apache-2.0 +license_family: Apache +size: 3706406 +timestamp: 1775589602258 +- conda: https://conda.anaconda.org/conda-forge/noarch/packaging-26.2-pyhc364b38_0.conda +sha256: 
3906abfb6511a3bb309e39b9b1b7bc38f50a723971de2395489fd1f379255890 +md5: 4c06a92e74452cfa53623a81592e8934 +depends: +- python >=3.8 +- python +license: Apache-2.0 +license_family: APACHE +size: 91574 +timestamp: 1777103621679 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pillow-12.2.0-py314hac3e5ec_0.conda +sha256: 96b26c2657275ffe84ab510edf0865e21999d791485d12794edd4a71b837beb6 +md5: 87d58d103b47c4a8567b3d7666647684 +depends: +- python +- libgcc >=14 +- python 3.14.* *_cp314 +- openjpeg >=2.5.4,<3.0a0 +- libxcb >=1.17.0,<2.0a0 +- libwebp-base >=1.6.0,<2.0a0 +- zlib-ng >=2.3.3,<2.4.0a0 +- python_abi 3.14.* *_cp314 +- lcms2 >=2.18,<3.0a0 +- tk >=8.6.13,<8.7.0a0 +- libtiff >=4.7.1,<4.8.0a0 +- libjpeg-turbo >=3.1.2,<4.0a0 +- libfreetype >=2.14.3 +- libfreetype6 >=2.14.3 +license: HPND +size: 1062080 +timestamp: 1775060067775 +- conda: https://conda.anaconda.org/conda-forge/noarch/plotly-6.6.0-pyhd8ed1ab_0.conda +sha256: c418d325359fc7a0074cea7f081ef1bce26e114d2da8a0154c5d27ecc87a08e7 +md5: 3e9427ee186846052e81fadde8ebe96a +depends: +- narwhals >=1.15.1 +- packaging +- python >=3.10 +constrains: +- ipywidgets >=7.6 +license: MIT +license_family: MIT +size: 5251872 +timestamp: 1772628857717 +- conda: https://conda.anaconda.org/conda-forge/noarch/polars-1.41.0-pyh58ad624_0.conda +sha256: 70fc56877c4a095ee658d61924d8019768fbae4a48437058d181fc94b0a7c4d8 +md5: 25a883fed9f1f3f21ff317a3e7c92ac4 +depends: +- polars-runtime-32 ==1.41.0 +- python >=3.10 +- python +constrains: +- numpy >=1.16.0 +- pyarrow >=7.0.0 +- fastexcel >=0.9 +- openpyxl >=3.0.0 +- xlsx2csv >=0.8.0 +- connectorx >=0.3.2 +- deltalake >=1.0.0 +- pyiceberg >=0.7.1 +- altair >=5.4.0 +- great_tables >=0.8.0 +- polars-runtime-32 ==1.41.0 +- polars-runtime-64 ==1.41.0 +- polars-runtime-compat ==1.41.0 +license: MIT +size: 539656 +timestamp: 1779630790562 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/polars-runtime-32-1.41.0-py310h32c7c23_0.conda +noarch: python +sha256: 
d903b774ec09189e164207328aac157eee82fed8cc5c9ace46aeb5d1c15cb5b3 +md5: 8c08c506ed1ea8ce0ca37af5e918c58d +depends: +- python +- libgcc >=14 +- libstdcxx >=14 +- _python_abi3_support 1.* +- cpython >=3.10 +constrains: +- __glibc >=2.17 +license: MIT +size: 38704429 +timestamp: 1779630794932 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/polars-runtime-compat-1.41.0-py310hc0e61be_0.conda +noarch: python +sha256: 101696adff43a654146376c62ef9611bf7946b95fa46f604fe247d77eefc6267 +md5: 65b73e4260677ee5162bdbb252e28e06 +depends: +- python +- libstdcxx >=14 +- libgcc >=14 +- _python_abi3_support 1.* +- cpython >=3.10 +constrains: +- __glibc >=2.17 +license: MIT +size: 38651498 +timestamp: 1779630714016 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/procps-ng-4.0.6-h1779866_0.conda +sha256: e9cbcbc94e151ada3d6dc365380aaaf591f65012c16d9a2abaea4b9b90adc402 +md5: ab7288cc39545556d1bc5e71ab2df9a9 +depends: +- libgcc >=14 +- ncurses >=6.5,<7.0a0 +license: GPL-2.0-or-later AND LGPL-2.0-or-later +license_family: GPL +size: 636733 +timestamp: 1769712412683 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pthread-stubs-0.4-h86ecc28_1002.conda +sha256: 977dfb0cb3935d748521dd80262fe7169ab82920afd38ed14b7fee2ea5ec01ba +md5: bb5a90c93e3bac3d5690acf76b4a6386 +depends: +- libgcc >=13 +license: MIT +license_family: MIT +size: 8342 +timestamp: 1726803319942 +- conda: https://conda.anaconda.org/conda-forge/noarch/pyaml-env-1.2.2-pyhd8ed1ab_0.conda +sha256: 58994e0d2ea8584cb399546e6f6896d771995e6121d1a7b6a2c9948388358932 +md5: e17be1016bcc3516827b836cd3e4d9dc +depends: +- python >=3.9 +- pyyaml >=5.0,<=7.0 +license: MIT +license_family: MIT +size: 14645 +timestamp: 1736766960536 +- conda: https://conda.anaconda.org/conda-forge/noarch/pydantic-2.13.4-pyhcf101f3_0.conda +sha256: 69700e31165df070e9716315e042196aa92525dae5deb5107785847ab9f4189f +md5: 729843edafc0899b3348bd3f19525b9d +depends: +- typing-inspection >=0.4.2 +- typing_extensions >=4.14.1 
+- python >=3.10 +- annotated-types >=0.6.0 +- pydantic-core ==2.46.4 +- python +license: MIT +license_family: MIT +size: 346511 +timestamp: 1778103405862 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pydantic-core-2.46.4-py314h451b6cc_0.conda +sha256: 1a7c6b18e404c13c4d959888ecb48a9ed9de0e41be2872932b83a35278088df0 +md5: 9c3ace6aba6df14b943256095ac1281e +depends: +- python +- typing-extensions >=4.6.0,!=4.7.0 +- libgcc >=14 +- python 3.14.* *_cp314 +- python_abi 3.14.* *_cp314 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 1780773 +timestamp: 1778084251775 +- conda: https://conda.anaconda.org/conda-forge/noarch/pygments-2.20.0-pyhd8ed1ab_0.conda +sha256: cf70b2f5ad9ae472b71235e5c8a736c9316df3705746de419b59d442e8348e86 +md5: 16c18772b340887160c79a6acc022db0 +depends: +- python >=3.10 +license: BSD-2-Clause +license_family: BSD +size: 893031 +timestamp: 1774796815820 +- conda: https://conda.anaconda.org/conda-forge/noarch/pysocks-1.7.1-pyha55dd90_7.conda +sha256: ba3b032fa52709ce0d9fd388f63d330a026754587a2f461117cac9ab73d8d0d8 +md5: 461219d1a5bd61342293efa2c0c90eac +depends: +- __unix +- python >=3.9 +license: BSD-3-Clause +license_family: BSD +size: 21085 +timestamp: 1733217331982 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/python-3.14.5-hfd9ac0a_100_cp314.conda +build_number: 100 +sha256: d37bad5447365346166c72950ea8f49689aa49cecc1b0623d00458427627b8df +md5: d956e09feb806f5974675ce92ad81d45 +depends: +- bzip2 >=1.0.8,<2.0a0 +- ld_impl_linux-aarch64 >=2.36.1 +- libexpat >=2.8.0,<3.0a0 +- libffi >=3.5.2,<3.6.0a0 +- libgcc >=14 +- liblzma >=5.8.3,<6.0a0 +- libmpdec >=4.0.0,<5.0a0 +- libsqlite >=3.53.1,<4.0a0 +- libuuid >=2.42.1,<3.0a0 +- libzlib >=1.3.2,<2.0a0 +- ncurses >=6.6,<7.0a0 +- openssl >=3.5.6,<4.0a0 +- python_abi 3.14.* *_cp314 +- readline >=8.3,<9.0a0 +- tk >=8.6.13,<8.7.0a0 +- tzdata +- zstd >=1.5.7,<1.6.0a0 +license: Python-2.0 +size: 37510439 +timestamp: 1779236267040 
+python_site_packages_path: lib/python3.14/site-packages +- conda: https://conda.anaconda.org/conda-forge/noarch/python-dotenv-1.2.2-pyhcf101f3_0.conda +sha256: 74e417a768f59f02a242c25e7db0aa796627b5bc8c818863b57786072aeb85e5 +md5: 130584ad9f3a513cdd71b1fdc1244e9c +depends: +- python >=3.10 +license: BSD-3-Clause +license_family: BSD +size: 27848 +timestamp: 1772388605021 +- conda: https://conda.anaconda.org/conda-forge/noarch/python-gil-3.14.5-h4df99d1_100.conda +sha256: 41dd7da285d71d519257fa7dacb1cae060d5ebfaa5f92cba5994899d2978e943 +md5: 41954747ba952ec4b01e16c2c9e8d8ff +depends: +- cpython 3.14.5.* +- python_abi * *_cp314 +license: Python-2.0 +size: 50212 +timestamp: 1779236703009 +- conda: https://conda.anaconda.org/conda-forge/noarch/python-kaleido-0.2.1-pyhd8ed1ab_0.tar.bz2 +sha256: e17bf63a30aec33432f1ead86e15e9febde9fc40a7f869c0e766be8d2db44170 +md5: 310259a5b03ff02289d7705f39e2b1d2 +depends: +- kaleido-core 0.2.1.* +- python >=3.5 +license: MIT +license_family: MIT +size: 18320 +timestamp: 1615204747600 +- conda: https://conda.anaconda.org/conda-forge/noarch/python_abi-3.14-8_cp314.conda +build_number: 8 +sha256: ad6d2e9ac39751cc0529dd1566a26751a0bf2542adb0c232533d32e176e21db5 +md5: 0539938c55b6b1a59b560e843ad864a4 +constrains: +- python 3.14.* *_cp314 +license: BSD-3-Clause +license_family: BSD +size: 6989 +timestamp: 1752805904792 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/pyyaml-6.0.3-py314h807365f_1.conda +sha256: 496b5e65dfdd0aaaaa5de0dcaaf3bceea00fcb4398acf152f89e567c82ec1046 +md5: 9ae2c92975118058bd720e9ba2bb7c58 +depends: +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python >=3.14,<3.15.0a0 *_cp314 +- python_abi 3.14.* *_cp314 +- yaml >=0.2.5,<0.3.0a0 +license: MIT +license_family: MIT +size: 195678 +timestamp: 1770223441816 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/readline-8.3-hb682ff5_0.conda +sha256: fe695f9d215e9a2e3dd0ca7f56435ab4df24f5504b83865e3d295df36e88d216 +md5: 3d49cad61f829f4f0e0611547a9cda12 
+depends: +- libgcc >=14 +- ncurses >=6.5,<7.0a0 +license: GPL-3.0-only +license_family: GPL +size: 357597 +timestamp: 1765815673644 +- conda: https://conda.anaconda.org/conda-forge/noarch/referencing-0.37.0-pyhcf101f3_0.conda +sha256: 0577eedfb347ff94d0f2fa6c052c502989b028216996b45c7f21236f25864414 +md5: 870293df500ca7e18bedefa5838a22ab +depends: +- attrs >=22.2.0 +- python >=3.10 +- rpds-py >=0.7.0 +- typing_extensions >=4.4.0 +- python +license: MIT +license_family: MIT +size: 51788 +timestamp: 1760379115194 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/regex-2026.5.9-py314h51f160d_0.conda +sha256: 05ef55f09f31eabd0a205f6b065e13fc746675f41924620977692ef0ffe5aad8 +md5: 34ed7bc9febeca70f55b757ca09c354d +depends: +- libgcc >=14 +- python >=3.14,<3.15.0a0 +- python >=3.14,<3.15.0a0 *_cp314 +- python_abi 3.14.* *_cp314 +license: Apache-2.0 AND CNRI-Python +license_family: PSF +size: 409780 +timestamp: 1778374195988 +- conda: https://conda.anaconda.org/conda-forge/noarch/requests-2.34.2-pyhcf101f3_0.conda +sha256: 1715246b19c9f85ee022933b4845f2fc14ac9184981b7b7d9b728bec8e9588da +md5: 4a85203c1d80c1059086ae860836ffb9 +depends: +- python >=3.10 +- certifi >=2023.5.7 +- charset-normalizer >=2,<4 +- idna >=2.5,<4 +- urllib3 >=1.26,<3 +- python +constrains: +- chardet >=3.0.2,<8 +license: Apache-2.0 +license_family: APACHE +size: 68709 +timestamp: 1778851103479 +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-15.0.0-pyhcf101f3_0.conda +sha256: 3d6ba2c0fcdac3196ba2f0615b4104e532525ffa1335b50a2878be5ff488814a +md5: 0242025a3c804966bf71aa04eee82f66 +depends: +- markdown-it-py >=2.2.0 +- pygments >=2.13.0,<3.0.0 +- python >=3.10 +- typing_extensions >=4.0.0,<5.0.0 +- python +license: MIT +license_family: MIT +size: 208577 +timestamp: 1775991661559 +- conda: https://conda.anaconda.org/conda-forge/noarch/rich-click-1.9.7-pyh8f84b5b_0.conda +sha256: aa3fcb167321bae51998de2e94d199109c9024f25a5a063cb1c28d8f1af33436 +md5: 0c20a8ebcddb24a45da89d5e917e6cb9 
+depends: +- python >=3.10 +- rich >=12 +- click >=8 +- typing-extensions >=4 +- __unix +- python +license: MIT +license_family: MIT +size: 64356 +timestamp: 1769850479089 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/rpds-py-0.30.0-py314h02b7a91_0.conda +sha256: a587240f16eac7c6a80f9585cef679cd1cb9a287b8dfcdd36dcef1f7e7db15dc +md5: e7f6ed9e60043bb5cbcc527764897f0d +depends: +- python +- libgcc >=14 +- python_abi 3.14.* *_cp314 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 376332 +timestamp: 1764543345455 +- conda: https://conda.anaconda.org/conda-forge/noarch/spectra-0.0.11-pyhd8ed1ab_2.conda +sha256: 7c65782d2511738e62c70462e89d65da4fa54d5a7e47c46667bcd27a59f81876 +md5: 472239e4eb7b5a84bb96b3ed7e3a596a +depends: +- colormath >=3.0.0 +- python >=3.9 +license: MIT +license_family: MIT +size: 22284 +timestamp: 1735770589188 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/sqlite-3.53.1-he8854b5_0.conda +sha256: 27467e4bfb0681546f149718c33b806fec078185fbaa6a4d17d440bc8f56185c +md5: 46009bdca2315a99e0a3a7d0ba1af3b9 +depends: +- libgcc >=14 +- libsqlite 3.53.1 h022381a_0 +- libzlib >=1.3.2,<2.0a0 +- ncurses >=6.6,<7.0a0 +- readline >=8.3,<9.0a0 +license: blessing +size: 209964 +timestamp: 1777986493350 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/tiktoken-0.12.0-py314h6a36e60_3.conda +sha256: c1da41c79262b27efa168407cfecc47b20270e5fc071a8307f95a2c85fb94170 +md5: 55bf7b559202236157b14323b40f19e6 +depends: +- libgcc >=14 +- libstdcxx >=14 +- python >=3.14,<3.15.0a0 +- python_abi 3.14.* *_cp314 +- regex >=2022.1.18 +- requests >=2.26.0 +constrains: +- __glibc >=2.17 +license: MIT +license_family: MIT +size: 914402 +timestamp: 1764030357702 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/tk-8.6.13-noxft_h0dc03b3_103.conda +sha256: e25c314b52764219f842b41aea2c98a059f06437392268f09b03561e4f6e5309 +md5: 7fc6affb9b01e567d2ef1d05b84aa6ed +depends: +- libgcc >=14 +- libzlib >=1.3.1,<2.0a0 
+constrains: +- xorg-libx11 >=1.8.12,<2.0a0 +license: TCL +license_family: BSD +size: 3368666 +timestamp: 1769464148928 +- conda: https://conda.anaconda.org/conda-forge/noarch/tqdm-4.67.3-pyh8f84b5b_0.conda +sha256: 9ef8e47cf00e4d6dcc114eb32a1504cc18206300572ef14d76634ba29dfe1eb6 +md5: e5ce43272193b38c2e9037446c1d9206 +depends: +- python >=3.10 +- __unix +- python +license: MPL-2.0 and MIT +size: 94132 +timestamp: 1770153424136 +- conda: https://conda.anaconda.org/conda-forge/noarch/typeguard-4.5.2-pyhcf101f3_0.conda +sha256: 59d7851d32fddb5b510272e6557aa982edeb927d349648dac27f5bf01d18bb26 +md5: 4460f039b7dedf15f7df086446ca75ae +depends: +- typing_extensions >=4.14.0 +- python >=3.10 +- importlib-metadata >=3.6 +- python +constrains: +- pytest >=7 +license: MIT +license_family: MIT +size: 38297 +timestamp: 1778779291237 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-extensions-4.15.0-h396c80c_0.conda +sha256: 7c2df5721c742c2a47b2c8f960e718c930031663ac1174da67c1ed5999f7938c +md5: edd329d7d3a4ab45dcf905899a7a6115 +depends: +- typing_extensions ==4.15.0 pyhcf101f3_0 +license: PSF-2.0 +license_family: PSF +size: 91383 +timestamp: 1756220668932 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing-inspection-0.4.2-pyhcf101f3_2.conda +sha256: 8b90d2f19f9458b8c58a55e1fcdc1d90c1603a847a47654d8a454549413ba60a +md5: 53f5409c5cfd6c5a66417d68e3f0a864 +depends: +- python >=3.10 +- typing_extensions >=4.12.0 +- python +license: MIT +license_family: MIT +size: 20935 +timestamp: 1777105465795 +- conda: https://conda.anaconda.org/conda-forge/noarch/typing_extensions-4.15.0-pyhcf101f3_0.conda +sha256: 032271135bca55aeb156cee361c81350c6f3fb203f57d024d7e5a1fc9ef18731 +md5: 0caa1af407ecff61170c9437a808404d +depends: +- python >=3.10 +- python +license: PSF-2.0 +license_family: PSF +size: 51692 +timestamp: 1756220668932 +- conda: https://conda.anaconda.org/conda-forge/noarch/tzdata-2025c-hc9c84f9_1.conda +sha256: 
1d30098909076af33a35017eed6f2953af1c769e273a0626a04722ac4acaba3c +md5: ad659d0a2b3e47e38d829aa8cad2d610 +license: LicenseRef-Public-Domain +size: 119135 +timestamp: 1767016325805 +- conda: https://conda.anaconda.org/conda-forge/noarch/urllib3-2.7.0-pyhd8ed1ab_0.conda +sha256: feff959a816f7988a0893201aa9727bbb7ee1e9cec2c4f0428269b489eb93fb4 +md5: cbb88288f74dbe6ada1c6c7d0a97223e +depends: +- backports.zstd >=1.0.0 +- brotli-python >=1.2.0 +- h2 >=4,<5 +- pysocks >=1.5.6,<2.0,!=1.5.7 +- python >=3.10 +license: MIT +license_family: MIT +size: 103560 +timestamp: 1778188657149 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxau-1.0.12-he30d5cf_1.conda +sha256: e9f6e931feeb2f40e1fdbafe41d3b665f1ab6cb39c5880a1fcf9f79a3f3c84a5 +md5: 1c246e1105000c3660558459e2fd6d43 +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 16317 +timestamp: 1762977521691 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/xorg-libxdmcp-1.1.5-he30d5cf_1.conda +sha256: 128d72f36bcc8d2b4cdbec07507542e437c7d67f677b7d77b71ed9eeac7d6df1 +md5: bff06dcde4a707339d66d45d96ceb2e2 +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 21039 +timestamp: 1762979038025 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/yaml-0.2.5-h80f16a2_3.conda +sha256: 66265e943f32ce02396ad214e27cb35f5b0490b3bd4f064446390f9d67fa5d88 +md5: 032d8030e4a24fe1f72c74423a46fb88 +depends: +- libgcc >=14 +license: MIT +license_family: MIT +size: 88088 +timestamp: 1753484092643 +- conda: https://conda.anaconda.org/conda-forge/noarch/zipp-4.1.0-pyhcf101f3_0.conda +sha256: 210bd31c22bb88f5e2a167df24c95bb5f152b2ada7502f9b8c49d1f5366db423 +md5: ba3dcdc8584155c97c648ae9c044b7a3 +depends: +- python >=3.10 +- python +license: MIT +license_family: MIT +size: 24190 +timestamp: 1779159948016 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zlib-ng-2.3.3-ha7cb516_1.conda +sha256: 638a3a41a4fbfed52d3c60c8ef5a3693b3f12a5b1a3f58fa29f5698d0a0702e2 +md5: 
f731af71c723065d91b4c01bb822641b +depends: +- libgcc >=14 +- libstdcxx >=14 +license: Zlib +license_family: Other +size: 121046 +timestamp: 1770167944449 +- conda: https://conda.anaconda.org/conda-forge/linux-aarch64/zstd-1.5.7-h85ac4a6_6.conda +sha256: 569990cf12e46f9df540275146da567d9c618c1e9c7a0bc9d9cfefadaed20b75 +md5: c3655f82dcea2aa179b291e7099c1fcc +depends: +- libzlib >=1.3.1,<2.0a0 +license: BSD-3-Clause +license_family: BSD +size: 614429 +timestamp: 1764777145593 diff --git a/modules/nf-core/multiqc/environment.yml b/modules/nf-core/multiqc/environment.yml index 6f5b867..7a970e2 100644 --- a/modules/nf-core/multiqc/environment.yml +++ b/modules/nf-core/multiqc/environment.yml @@ -1,5 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda dependencies: - - bioconda::multiqc=1.25.1 + - bioconda::multiqc=1.35 diff --git a/modules/nf-core/multiqc/main.nf b/modules/nf-core/multiqc/main.nf index cc0643e..c4bc715 100644 --- a/modules/nf-core/multiqc/main.nf +++ b/modules/nf-core/multiqc/main.nf @@ -1,24 +1,21 @@ process MULTIQC { + tag "${meta.id}" label 'process_single' conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/multiqc:1.25.1--pyhdfd78af_0' : - 'biocontainers/multiqc:1.25.1--pyhdfd78af_0' }" + container "${workflow.containerEngine in ['singularity', 'apptainer'] && !task.ext.singularity_pull_docker_container + ? 
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/c8/c8e346f4f6080eadf1253505e6ff09ef004454fc18e8d672006fd7b222cc412e/data' + : 'community.wave.seqera.io/library/multiqc:1.35--c17fb751507e9dfc'}" input: - path multiqc_files, stageAs: "?/*" - path(multiqc_config) - path(extra_multiqc_config) - path(multiqc_logo) - path(replace_names) - path(sample_names) + tuple val(meta), path(multiqc_files, stageAs: "?/*"), path(multiqc_config, stageAs: "?/*"), path(multiqc_logo), path(replace_names), path(sample_names) output: - path "*multiqc_report.html", emit: report - path "*_data" , emit: data - path "*_plots" , optional:true, emit: plots - path "versions.yml" , emit: versions + tuple val(meta), path("*.html"), emit: report + tuple val(meta), path("*_data"), emit: data + tuple val(meta), path("*_plots"), emit: plots, optional: true + // MultiQC should not push its versions to the `versions` topic. Its input depends on the versions topic to be resolved thus outputting to the topic will let the pipeline hang forever + tuple val("${task.process}"), val('multiqc'), eval('multiqc --version | sed "s/.* //g"'), emit: versions when: task.ext.when == null || task.ext.when @@ -26,38 +23,28 @@ process MULTIQC { script: def args = task.ext.args ?: '' def prefix = task.ext.prefix ? "--filename ${task.ext.prefix}.html" : '' - def config = multiqc_config ? "--config $multiqc_config" : '' - def extra_config = extra_multiqc_config ? "--config $extra_multiqc_config" : '' + def config = multiqc_config ? multiqc_config instanceof List ? "--config ${multiqc_config.join(' --config ')}" : "--config ${multiqc_config}" : "" def logo = multiqc_logo ? "--cl-config 'custom_logo: \"${multiqc_logo}\"'" : '' def replace = replace_names ? "--replace-names ${replace_names}" : '' def samples = sample_names ? 
"--sample-names ${sample_names}" : '' """ multiqc \\ --force \\ - $args \\ - $config \\ - $prefix \\ - $extra_config \\ - $logo \\ - $replace \\ - $samples \\ + ${args} \\ + ${config} \\ + ${prefix} \\ + ${logo} \\ + ${replace} \\ + ${samples} \\ . - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" ) - END_VERSIONS """ stub: """ mkdir multiqc_data + touch multiqc_data/.stub mkdir multiqc_plots + touch multiqc_plots/.stub touch multiqc_report.html - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" ) - END_VERSIONS """ } diff --git a/modules/nf-core/multiqc/meta.yml b/modules/nf-core/multiqc/meta.yml index b16c187..27ce18d 100644 --- a/modules/nf-core/multiqc/meta.yml +++ b/modules/nf-core/multiqc/meta.yml @@ -1,6 +1,6 @@ name: multiqc -description: Aggregate results from bioinformatics analyses across many samples into - a single report +description: Aggregate results from bioinformatics analyses across many samples + into a single report keywords: - QC - bioinformatics tools @@ -12,60 +12,91 @@ tools: It's a general use tool, perfect for summarising the output from numerous bioinformatics tools. homepage: https://multiqc.info/ documentation: https://multiqc.info/docs/ - licence: ["GPL-3.0-or-later"] + licence: + - "GPL-3.0-or-later" identifier: biotools:multiqc input: - - - multiqc_files: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'sample1', single_end:false ] + - multiqc_files: type: file description: | List of reports / files recognised by MultiQC, for example the html and zip output of FastQC - - - multiqc_config: + ontologies: [] + - multiqc_config: type: file description: Optional config yml for MultiQC pattern: "*.{yml,yaml}" - - - extra_multiqc_config: - type: file - description: Second optional config yml for MultiQC. 
Will override common sections - in multiqc_config. - pattern: "*.{yml,yaml}" - - - multiqc_logo: + ontologies: + - edam: http://edamontology.org/format_3750 + - multiqc_logo: type: file description: Optional logo file for MultiQC pattern: "*.{png}" - - - replace_names: + ontologies: [] + - replace_names: type: file description: | Optional two-column sample renaming file. First column a set of patterns, second column a set of corresponding replacements. Passed via MultiQC's `--replace-names` option. pattern: "*.{tsv}" - - - sample_names: + ontologies: + - edam: http://edamontology.org/format_3475 + - sample_names: type: file description: | Optional TSV file with headers, passed to the MultiQC --sample_names argument. pattern: "*.{tsv}" + ontologies: + - edam: http://edamontology.org/format_3475 output: - - report: - - "*multiqc_report.html": + report: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'sample1', single_end:false ] + - "*.html": type: file description: MultiQC report file - pattern: "multiqc_report.html" - - data: + pattern: ".html" + ontologies: [] + data: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'sample1', single_end:false ] - "*_data": type: directory description: MultiQC data dir pattern: "multiqc_data" - - plots: + plots: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'sample1', single_end:false ] - "*_plots": type: file description: Plots created by MultiQC - pattern: "*_data" - - versions: - - versions.yml: - type: file - description: File containing software versions - pattern: "versions.yml" + pattern: "*_plots" + ontologies: [] + versions: + - - ${task.process}: + type: string + description: The process the versions were collected from + - multiqc: + type: string + description: The tool name + - multiqc --version | sed "s/.* //g": + type: eval + description: The expression to obtain the version of the tool authors: - "@abhi18av" - "@bunop" @@ -76,3 +107,27 @@ maintainers: - "@bunop" - "@drpatelh" - "@jfy133" +containers: + conda: + linux/amd64: + lock_file: modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt + linux/arm64: + lock_file: modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt + docker: + linux/amd64: + name: community.wave.seqera.io/library/multiqc:1.35--c17fb751507e9dfc + build_id: bd-c17fb751507e9dfc_1 + scan_id: sc-3b1b3932f9846892_1 + linux/arm64: + name: community.wave.seqera.io/library/multiqc:1.35--5c84a5000a226ab5 + build_id: bd-5c84a5000a226ab5_1 + scan_id: sc-0d39df41e9737bbd_1 + singularity: + linux/amd64: + name: oras://community.wave.seqera.io/library/multiqc:1.35--c680f2aea25ccec2 + build_id: bd-c680f2aea25ccec2_1 + https: https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/c8/c8e346f4f6080eadf1253505e6ff09ef004454fc18e8d672006fd7b222cc412e/data + linux/arm64: + name: oras://community.wave.seqera.io/library/multiqc:1.35--c0468833d65b2f81 + build_id: bd-c0468833d65b2f81_1 + https: https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/e4/e48aa28aebc881254a499b24c3e1ce77b8df1b85a5432699ed6f72eb17ac7fb5/data diff --git a/modules/nf-core/multiqc/tests/custom_prefix.config b/modules/nf-core/multiqc/tests/custom_prefix.config new file mode 100644 index 0000000..b30b135 --- /dev/null +++ 
b/modules/nf-core/multiqc/tests/custom_prefix.config @@ -0,0 +1,5 @@ +process { + withName: 'MULTIQC' { + ext.prefix = "custom_prefix" + } +} diff --git a/modules/nf-core/multiqc/tests/main.nf.test b/modules/nf-core/multiqc/tests/main.nf.test index 33316a7..4cbdb95 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test +++ b/modules/nf-core/multiqc/tests/main.nf.test @@ -15,25 +15,84 @@ nextflow_process { when { process { """ - input[0] = Channel.of(file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true)) - input[1] = [] - input[2] = [] - input[3] = [] - input[4] = [] - input[5] = [] + input[0] = channel.of([ + [ id: 'FASTQC' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true), + [], + [], + [], + [] + ]) """ } } then { - assertAll( - { assert process.success }, - { assert process.out.report[0] ==~ ".*/multiqc_report.html" }, - { assert process.out.data[0] ==~ ".*/multiqc_data" }, - { assert snapshot(process.out.versions).match("multiqc_versions_single") } - ) + assert process.success + assert snapshot( + sanitizeOutput(process.out).collectEntries { key, val -> + if (key == "data") { + return [key, val.collect { [path(it[1]).list().collect { file(it.toString()).name }] }] + } + else if (key == "plots") { + return [key, val.collect { [ + "pdf", + path("${it[1]}/pdf").list().collect { file(it.toString()).name }, + "png", + path("${it[1]}/png").list().collect { file(it.toString()).name }, + "svg", + path("${it[1]}/svg").list().collect { file(it.toString()).name }] }] + } + else if (key == "report") { + return [key, file(val[0][1].toString()).name] + } + return [key, val] + } + ).match() + } + } + + test("sarscov2 single-end [fastqc] - custom prefix") { + config "./custom_prefix.config" + + when { + process { + """ + input[0] = channel.of([ + [ id: 'FASTQC' ], + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true), + [], + [], + [], + [] + ]) + """ + } } + then { + assert process.success + assert snapshot( + sanitizeOutput(process.out).collectEntries { key, val -> + if (key == "data") { + return [key, val.collect { [path(it[1]).list().collect { file(it.toString()).name }] }] + } + else if (key == "plots") { + return [key, val.collect { [ + "pdf", + path("${it[1]}/pdf").list().collect { file(it.toString()).name }, + "png", + path("${it[1]}/png").list().collect { file(it.toString()).name }, + "svg", + path("${it[1]}/svg").list().collect { file(it.toString()).name }] }] + } + else if (key == "report") { + return [key, file(val[0][1].toString()).name] + } + return [key, val] + } + ).match() + } } test("sarscov2 single-end [fastqc] [config]") { @@ -41,23 +100,85 @@ nextflow_process { when { process { """ - input[0] = Channel.of(file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true)) - input[1] = Channel.of(file("https://github.com/nf-core/tools/raw/dev/nf_core/pipeline-template/assets/multiqc_config.yml", checkIfExists: true)) - input[2] = [] - input[3] = [] - input[4] = [] - input[5] = [] + input[0] = channel.of([ + [ id: 'FASTQC' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true), + file("https://raw.githubusercontent.com/nf-core/seqinspector/1.0.0/assets/multiqc_config.yml", checkIfExists: true), + [], + [], + [] + ]) """ } } then { - assertAll( - { assert process.success }, - { assert process.out.report[0] ==~ ".*/multiqc_report.html" }, - { assert process.out.data[0] ==~ ".*/multiqc_data" }, - { assert snapshot(process.out.versions).match("multiqc_versions_config") } - ) + assert process.success + assert snapshot( + sanitizeOutput(process.out).collectEntries { key, val -> + if (key == "data") { + return [key, val.collect { [path(it[1]).list().collect { 
file(it.toString()).name }] }] + } + else if (key == "plots") { + return [key, val.collect { [ + "pdf", + path("${it[1]}/pdf").list().collect { file(it.toString()).name }, + "png", + path("${it[1]}/png").list().collect { file(it.toString()).name }, + "svg", + path("${it[1]}/svg").list().collect { file(it.toString()).name }] }] + } + else if (key == "report") { + return [key, file(val[0][1].toString()).name] + } + return [key, val] + } + ).match() + } + } + + test("sarscov2 single-end [fastqc] [multiple configs]") { + + when { + process { + """ + input[0] = channel.of([ + [ id: 'FASTQC' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true), + [ + file("https://raw.githubusercontent.com/nf-core/seqinspector/1.0.0/assets/multiqc_config.yml", checkIfExists: true), + file("https://raw.githubusercontent.com/nf-core/seqinspector/1.0.0/assets/multiqc_config.yml", checkIfExists: true) + ], + [], + [], + [] + ]) + """ + } + } + + then { + assert process.success + assert snapshot( + sanitizeOutput(process.out).collectEntries { key, val -> + if (key == "data") { + return [key, val.collect { [path(it[1]).list().collect { file(it.toString()).name }] }] + } + else if (key == "plots") { + return [key, val.collect { [ + "pdf", + path("${it[1]}/pdf").list().collect { file(it.toString()).name }, + "png", + path("${it[1]}/png").list().collect { file(it.toString()).name }, + "svg", + path("${it[1]}/svg").list().collect { file(it.toString()).name }] }] + } + else if (key == "report") { + return [key, file(val[0][1].toString()).name] + } + return [key, val] + } + ).match() } } @@ -68,25 +189,23 @@ nextflow_process { when { process { """ - input[0] = Channel.of(file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true)) - input[1] = [] - input[2] = [] - input[3] = [] - input[4] = [] - input[5] = [] + input[0] = channel.of([ + [ id: 'FASTQC' ], + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastqc/test_fastqc.zip', checkIfExists: true), + [], + [], + [], + [] + ]) """ } } then { + assert process.success assertAll( - { assert process.success }, - { assert snapshot(process.out.report.collect { file(it).getName() } + - process.out.data.collect { file(it).getName() } + - process.out.plots.collect { file(it).getName() } + - process.out.versions ).match("multiqc_stub") } + { assert snapshot(sanitizeOutput(process.out)).match() } ) } - } } diff --git a/modules/nf-core/multiqc/tests/main.nf.test.snap b/modules/nf-core/multiqc/tests/main.nf.test.snap index 2fcbb5f..4489921 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test.snap +++ b/modules/nf-core/multiqc/tests/main.nf.test.snap @@ -1,41 +1,422 @@ { - "multiqc_versions_single": { + "sarscov2 single-end [fastqc] [multiple configs]": { "content": [ - [ - "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" - ] + { + "data": [ + [ + [ + "fastqc-status-check-heatmap.txt", + "fastqc_overrepresented_sequences_plot.txt", + "fastqc_per_base_n_content_plot.txt", + "fastqc_per_base_sequence_quality_plot.txt", + "fastqc_per_sequence_gc_content_plot_Counts.txt", + "fastqc_per_sequence_gc_content_plot_Percentages.txt", + "fastqc_per_sequence_quality_scores_plot.txt", + "fastqc_sequence_counts_plot.txt", + "fastqc_sequence_duplication_levels_plot.txt", + "fastqc_sequence_length_distribution_plot.txt", + "fastqc_top_overrepresented_sequences_table.txt", + "llms-full.txt", + "multiqc.log", + "multiqc.parquet", + "multiqc_citations.txt", + "multiqc_data.json", + "multiqc_fastqc.txt", + "multiqc_general_stats.txt", + "multiqc_sources.txt" + ] + ] + ], + "plots": [ + [ + "pdf", + [ + "fastqc-status-check-heatmap.pdf", + "fastqc_overrepresented_sequences_plot.pdf", + "fastqc_per_base_n_content_plot.pdf", + "fastqc_per_base_sequence_quality_plot.pdf", + "fastqc_per_sequence_gc_content_plot_Counts.pdf", + 
"fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "fastqc_per_sequence_quality_scores_plot.pdf", + "fastqc_sequence_counts_plot-cnt.pdf", + "fastqc_sequence_counts_plot-pct.pdf", + "fastqc_sequence_duplication_levels_plot.pdf", + "fastqc_sequence_length_distribution_plot.pdf", + "fastqc_top_overrepresented_sequences_table.pdf" + ], + "png", + [ + "fastqc-status-check-heatmap.png", + "fastqc_overrepresented_sequences_plot.png", + "fastqc_per_base_n_content_plot.png", + "fastqc_per_base_sequence_quality_plot.png", + "fastqc_per_sequence_gc_content_plot_Counts.png", + "fastqc_per_sequence_gc_content_plot_Percentages.png", + "fastqc_per_sequence_quality_scores_plot.png", + "fastqc_sequence_counts_plot-cnt.png", + "fastqc_sequence_counts_plot-pct.png", + "fastqc_sequence_duplication_levels_plot.png", + "fastqc_sequence_length_distribution_plot.png", + "fastqc_top_overrepresented_sequences_table.png" + ], + "svg", + [ + "fastqc-status-check-heatmap.svg", + "fastqc_overrepresented_sequences_plot.svg", + "fastqc_per_base_n_content_plot.svg", + "fastqc_per_base_sequence_quality_plot.svg", + "fastqc_per_sequence_gc_content_plot_Counts.svg", + "fastqc_per_sequence_gc_content_plot_Percentages.svg", + "fastqc_per_sequence_quality_scores_plot.svg", + "fastqc_sequence_counts_plot-cnt.svg", + "fastqc_sequence_counts_plot-pct.svg", + "fastqc_sequence_duplication_levels_plot.svg", + "fastqc_sequence_length_distribution_plot.svg", + "fastqc_top_overrepresented_sequences_table.svg" + ] + ] + ], + "report": "multiqc_report.html", + "versions": [ + [ + "MULTIQC", + "multiqc", + "1.35" + ] + ] + } ], + "timestamp": "2026-03-17T16:15:42.577775492", "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-10-02T17:51:46.317523" + "nf-test": "0.9.4", + "nextflow": "25.10.4" + } }, - "multiqc_stub": { + "sarscov2 single-end [fastqc]": { "content": [ - [ - "multiqc_report.html", - "multiqc_data", - "multiqc_plots", - 
"versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" - ] + { + "data": [ + [ + [ + "fastqc-status-check-heatmap.txt", + "fastqc_overrepresented_sequences_plot.txt", + "fastqc_per_base_n_content_plot.txt", + "fastqc_per_base_sequence_quality_plot.txt", + "fastqc_per_sequence_gc_content_plot_Counts.txt", + "fastqc_per_sequence_gc_content_plot_Percentages.txt", + "fastqc_per_sequence_quality_scores_plot.txt", + "fastqc_sequence_counts_plot.txt", + "fastqc_sequence_duplication_levels_plot.txt", + "fastqc_sequence_length_distribution_plot.txt", + "fastqc_top_overrepresented_sequences_table.txt", + "llms-full.txt", + "multiqc.log", + "multiqc.parquet", + "multiqc_citations.txt", + "multiqc_data.json", + "multiqc_fastqc.txt", + "multiqc_general_stats.txt", + "multiqc_software_versions.txt", + "multiqc_sources.txt" + ] + ] + ], + "plots": [ + [ + "pdf", + [ + "fastqc-status-check-heatmap.pdf", + "fastqc_overrepresented_sequences_plot.pdf", + "fastqc_per_base_n_content_plot.pdf", + "fastqc_per_base_sequence_quality_plot.pdf", + "fastqc_per_sequence_gc_content_plot_Counts.pdf", + "fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "fastqc_per_sequence_quality_scores_plot.pdf", + "fastqc_sequence_counts_plot-cnt.pdf", + "fastqc_sequence_counts_plot-pct.pdf", + "fastqc_sequence_duplication_levels_plot.pdf", + "fastqc_sequence_length_distribution_plot.pdf", + "fastqc_top_overrepresented_sequences_table.pdf" + ], + "png", + [ + "fastqc-status-check-heatmap.png", + "fastqc_overrepresented_sequences_plot.png", + "fastqc_per_base_n_content_plot.png", + "fastqc_per_base_sequence_quality_plot.png", + "fastqc_per_sequence_gc_content_plot_Counts.png", + "fastqc_per_sequence_gc_content_plot_Percentages.png", + "fastqc_per_sequence_quality_scores_plot.png", + "fastqc_sequence_counts_plot-cnt.png", + "fastqc_sequence_counts_plot-pct.png", + "fastqc_sequence_duplication_levels_plot.png", + "fastqc_sequence_length_distribution_plot.png", + 
"fastqc_top_overrepresented_sequences_table.png" + ], + "svg", + [ + "fastqc-status-check-heatmap.svg", + "fastqc_overrepresented_sequences_plot.svg", + "fastqc_per_base_n_content_plot.svg", + "fastqc_per_base_sequence_quality_plot.svg", + "fastqc_per_sequence_gc_content_plot_Counts.svg", + "fastqc_per_sequence_gc_content_plot_Percentages.svg", + "fastqc_per_sequence_quality_scores_plot.svg", + "fastqc_sequence_counts_plot-cnt.svg", + "fastqc_sequence_counts_plot-pct.svg", + "fastqc_sequence_duplication_levels_plot.svg", + "fastqc_sequence_length_distribution_plot.svg", + "fastqc_top_overrepresented_sequences_table.svg" + ] + ] + ], + "report": "multiqc_report.html", + "versions": [ + [ + "MULTIQC", + "multiqc", + "1.35" + ] + ] + } ], + "timestamp": "2026-03-17T16:21:17.072841555", "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-10-02T17:52:20.680978" + "nf-test": "0.9.4", + "nextflow": "25.10.4" + } }, - "multiqc_versions_config": { + "sarscov2 single-end [fastqc] - stub": { "content": [ - [ - "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" - ] + { + "data": [ + [ + { + "id": "FASTQC" + }, + [ + ".stub:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "plots": [ + [ + { + "id": "FASTQC" + }, + [ + ".stub:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "report": [ + [ + { + "id": "FASTQC" + }, + "multiqc_report.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + [ + "MULTIQC", + "multiqc", + "1.35" + ] + ] + } ], + "timestamp": "2026-02-26T15:14:39.789193051", "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-10-02T17:52:09.185842" + "nf-test": "0.9.4", + "nextflow": "25.10.4" + } + }, + "sarscov2 single-end [fastqc] [config]": { + "content": [ + { + "data": [ + [ + [ + "fastqc-status-check-heatmap.txt", + "fastqc_overrepresented_sequences_plot.txt", + "fastqc_per_base_n_content_plot.txt", + "fastqc_per_base_sequence_quality_plot.txt", + 
"fastqc_per_sequence_gc_content_plot_Counts.txt", + "fastqc_per_sequence_gc_content_plot_Percentages.txt", + "fastqc_per_sequence_quality_scores_plot.txt", + "fastqc_sequence_counts_plot.txt", + "fastqc_sequence_duplication_levels_plot.txt", + "fastqc_sequence_length_distribution_plot.txt", + "fastqc_top_overrepresented_sequences_table.txt", + "llms-full.txt", + "multiqc.log", + "multiqc.parquet", + "multiqc_citations.txt", + "multiqc_data.json", + "multiqc_fastqc.txt", + "multiqc_general_stats.txt", + "multiqc_sources.txt" + ] + ] + ], + "plots": [ + [ + "pdf", + [ + "fastqc-status-check-heatmap.pdf", + "fastqc_overrepresented_sequences_plot.pdf", + "fastqc_per_base_n_content_plot.pdf", + "fastqc_per_base_sequence_quality_plot.pdf", + "fastqc_per_sequence_gc_content_plot_Counts.pdf", + "fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "fastqc_per_sequence_quality_scores_plot.pdf", + "fastqc_sequence_counts_plot-cnt.pdf", + "fastqc_sequence_counts_plot-pct.pdf", + "fastqc_sequence_duplication_levels_plot.pdf", + "fastqc_sequence_length_distribution_plot.pdf", + "fastqc_top_overrepresented_sequences_table.pdf" + ], + "png", + [ + "fastqc-status-check-heatmap.png", + "fastqc_overrepresented_sequences_plot.png", + "fastqc_per_base_n_content_plot.png", + "fastqc_per_base_sequence_quality_plot.png", + "fastqc_per_sequence_gc_content_plot_Counts.png", + "fastqc_per_sequence_gc_content_plot_Percentages.png", + "fastqc_per_sequence_quality_scores_plot.png", + "fastqc_sequence_counts_plot-cnt.png", + "fastqc_sequence_counts_plot-pct.png", + "fastqc_sequence_duplication_levels_plot.png", + "fastqc_sequence_length_distribution_plot.png", + "fastqc_top_overrepresented_sequences_table.png" + ], + "svg", + [ + "fastqc-status-check-heatmap.svg", + "fastqc_overrepresented_sequences_plot.svg", + "fastqc_per_base_n_content_plot.svg", + "fastqc_per_base_sequence_quality_plot.svg", + "fastqc_per_sequence_gc_content_plot_Counts.svg", + 
"fastqc_per_sequence_gc_content_plot_Percentages.svg", + "fastqc_per_sequence_quality_scores_plot.svg", + "fastqc_sequence_counts_plot-cnt.svg", + "fastqc_sequence_counts_plot-pct.svg", + "fastqc_sequence_duplication_levels_plot.svg", + "fastqc_sequence_length_distribution_plot.svg", + "fastqc_top_overrepresented_sequences_table.svg" + ] + ] + ], + "report": "multiqc_report.html", + "versions": [ + [ + "MULTIQC", + "multiqc", + "1.35" + ] + ] + } + ], + "timestamp": "2026-03-17T16:15:30.372239611", + "meta": { + "nf-test": "0.9.4", + "nextflow": "25.10.4" + } + }, + "sarscov2 single-end [fastqc] - custom prefix": { + "content": [ + { + "data": [ + [ + [ + "fastqc-status-check-heatmap.txt", + "fastqc_overrepresented_sequences_plot.txt", + "fastqc_per_base_n_content_plot.txt", + "fastqc_per_base_sequence_quality_plot.txt", + "fastqc_per_sequence_gc_content_plot_Counts.txt", + "fastqc_per_sequence_gc_content_plot_Percentages.txt", + "fastqc_per_sequence_quality_scores_plot.txt", + "fastqc_sequence_counts_plot.txt", + "fastqc_sequence_duplication_levels_plot.txt", + "fastqc_sequence_length_distribution_plot.txt", + "fastqc_top_overrepresented_sequences_table.txt", + "llms-full.txt", + "multiqc.log", + "multiqc.parquet", + "multiqc_citations.txt", + "multiqc_data.json", + "multiqc_fastqc.txt", + "multiqc_general_stats.txt", + "multiqc_software_versions.txt", + "multiqc_sources.txt" + ] + ] + ], + "plots": [ + [ + "pdf", + [ + "fastqc-status-check-heatmap.pdf", + "fastqc_overrepresented_sequences_plot.pdf", + "fastqc_per_base_n_content_plot.pdf", + "fastqc_per_base_sequence_quality_plot.pdf", + "fastqc_per_sequence_gc_content_plot_Counts.pdf", + "fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "fastqc_per_sequence_quality_scores_plot.pdf", + "fastqc_sequence_counts_plot-cnt.pdf", + "fastqc_sequence_counts_plot-pct.pdf", + "fastqc_sequence_duplication_levels_plot.pdf", + "fastqc_sequence_length_distribution_plot.pdf", + 
"fastqc_top_overrepresented_sequences_table.pdf" + ], + "png", + [ + "fastqc-status-check-heatmap.png", + "fastqc_overrepresented_sequences_plot.png", + "fastqc_per_base_n_content_plot.png", + "fastqc_per_base_sequence_quality_plot.png", + "fastqc_per_sequence_gc_content_plot_Counts.png", + "fastqc_per_sequence_gc_content_plot_Percentages.png", + "fastqc_per_sequence_quality_scores_plot.png", + "fastqc_sequence_counts_plot-cnt.png", + "fastqc_sequence_counts_plot-pct.png", + "fastqc_sequence_duplication_levels_plot.png", + "fastqc_sequence_length_distribution_plot.png", + "fastqc_top_overrepresented_sequences_table.png" + ], + "svg", + [ + "fastqc-status-check-heatmap.svg", + "fastqc_overrepresented_sequences_plot.svg", + "fastqc_per_base_n_content_plot.svg", + "fastqc_per_base_sequence_quality_plot.svg", + "fastqc_per_sequence_gc_content_plot_Counts.svg", + "fastqc_per_sequence_gc_content_plot_Percentages.svg", + "fastqc_per_sequence_quality_scores_plot.svg", + "fastqc_sequence_counts_plot-cnt.svg", + "fastqc_sequence_counts_plot-pct.svg", + "fastqc_sequence_duplication_levels_plot.svg", + "fastqc_sequence_length_distribution_plot.svg", + "fastqc_top_overrepresented_sequences_table.svg" + ] + ] + ], + "report": "custom_prefix.html", + "versions": [ + [ + "MULTIQC", + "multiqc", + "1.35" + ] + ] + } + ], + "timestamp": "2026-03-17T16:15:18.189023981", + "meta": { + "nf-test": "0.9.4", + "nextflow": "25.10.4" + } } } \ No newline at end of file diff --git a/modules/nf-core/multiqc/tests/nextflow.config b/modules/nf-core/multiqc/tests/nextflow.config index c537a6a..374dfef 100644 --- a/modules/nf-core/multiqc/tests/nextflow.config +++ b/modules/nf-core/multiqc/tests/nextflow.config @@ -1,5 +1,6 @@ process { withName: 'MULTIQC' { ext.prefix = null + ext.args = '-p' } } diff --git a/modules/nf-core/multiqc/tests/tags.yml b/modules/nf-core/multiqc/tests/tags.yml deleted file mode 100644 index bea6c0d..0000000 --- a/modules/nf-core/multiqc/tests/tags.yml +++ /dev/null @@ 
-1,2 +0,0 @@ -multiqc: - - modules/nf-core/multiqc/** diff --git a/nextflow.config b/nextflow.config index 3fd3e96..64944d7 100644 --- a/nextflow.config +++ b/nextflow.config @@ -1,6 +1,6 @@ /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - nf-core/dmscore Nextflow config file + nf-core/deepmutscan Nextflow config file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Default config options for all compute environments ---------------------------------------------------------------------------------------- @@ -9,14 +9,23 @@ // Global default params, used in configs params { - // TODO nf-core: Specify your pipeline's command line flags - // Input options input = null + min_counts = 10 + mutagenesis_type = 'nnk' + run_seqdepth = false + reading_frame = null + custom_codon_library = '/NULL' + sliding_window_size = 10 + aimed_cov = 100 + fitness = false + dimsum = false + mutscan = false + // References genome = null igenomes_base = 's3://ngi-igenomes/igenomes/' - igenomes_ignore = false + igenomes_ignore = true // MultiQC options multiqc_config = null @@ -32,13 +41,14 @@ params { email_on_fail = null plaintext_email = false monochrome_logs = false - hook_url = null help = false help_full = false show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' - trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')// Config options + trace_report_suffix = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') + + // Config options config_profile_name = null config_profile_description = null @@ -51,6 +61,10 @@ params { validate_params = true } +// Backwards compatibility for publishDir syntax +outputDir = params.outdir +workflow.output.mode = params.publish_dir_mode + // Load base.config by default for all pipelines includeConfig 'conf/base.config' @@ -91,7 +105,17 @@ profiles { apptainer.enabled = false 
docker.runOptions = '-u $(id -u):$(id -g)' } - arm { + arm64 { + process.arch = 'arm64' + // For now if you're using arm64 you have to use wave for the sake of the maintainers + // wave profile + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' + } + emulate_amd64 { docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { @@ -148,28 +172,24 @@ profiles { wave.freeze = true wave.strategy = 'conda,container' } - gitpod { - executor.name = 'local' - executor.cpus = 4 - executor.memory = 8.GB - process { - resourceLimits = [ - memory: 8.GB, - cpus : 4, - time : 1.h - ] - } + gpu { + docker.runOptions = '-u $(id -u):$(id -g) --gpus all' + apptainer.runOptions = '--nv' + singularity.runOptions = '--nv' } test { includeConfig 'conf/test.config' } test_full { includeConfig 'conf/test_full.config' } } -// Load nf-core custom profiles from different Institutions -includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" +// Load nf-core custom profiles from different institutions + +// If params.custom_config_base is set AND either the NXF_OFFLINE environment variable is not set or params.custom_config_base is a local path, the nfcore_custom.config file from the specified base path is included. +// Load nf-core/deepmutscan custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" -// Load nf-core/dmscore custom profiles from different institutions. -// TODO nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs -// includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? 
"${params.custom_config_base}/pipeline/dmscore.config" : "/dev/null" + +// Load nf-core/deepmutscan custom profiles from different institutions. +includeConfig params.custom_config_base && (!System.getenv('NXF_OFFLINE') || !params.custom_config_base.startsWith('http')) ? "${params.custom_config_base}/pipeline/deepmutscan.config" : "/dev/null" // Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile // Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled @@ -195,14 +215,14 @@ env { } // Set bash options -process.shell = """\ -bash - -set -e # Exit if a tool returns a non-zero status/exit code -set -u # Treat unset variables and parameters as an error -set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute -set -C # No clobber - prevent output redirection from overwriting files. -""" +process.shell = [ + "bash", + "-C", // No clobber - prevent output redirection from overwriting files. + "-e", // Exit if a tool returns a non-zero status/exit code + "-u", // Treat unset variables and parameters as an error + "-o", // Returns the status of the last command to exit.. + "pipefail" // ..with a non-zero status or zero if all successfully execute +] // Disable process selector warnings by default. Use debug profile to enable warnings. nextflow.enable.configProcessNamesValidation = false @@ -225,10 +245,8 @@ dag { } manifest { - name = 'nf-core/dmscore' - author = """Benjamin Wehnert & Max Stammnitz""" // The author field is deprecated from Nextflow version 24.10.0, use contributors instead + name = 'nf-core/deepmutscan' contributors = [ - // TODO nf-core: Update the field with the details of the contributors to your pipeline. 
New with Nextflow version 24.10.0 [ name: 'Benjamin Wehnert & Max Stammnitz', affiliation: '', @@ -238,51 +256,20 @@ manifest { orcid: '' ], ] - homePage = 'https://github.com/nf-core/dmscore' + homePage = 'https://github.com/nf-core/deepmutscan' description = """Until now, most Deep Mutational Scanning (DMS) experiments relied on variant-specific barcoded libraries for sequencing. This method enabled DMS on large proteins and led to many great publications. Recently, efforts have increased to make use of the classic and more simple random fragmentation-based short-read sequencing (“shotgun-sequencing”). This saves time and money and due to its simpler experimental design is less prone to mistakes. dmscore handles the essential computational steps, processing the raw FASTQ files and generating a count table of variants. Along the way, it provides multiple QC metrics, enabling users to quickly evaluate the success of their experimental setup.""" mainScript = 'main.nf' defaultBranch = 'master' - nextflowVersion = '!>=24.04.2' - version = '1.0.0dev' + nextflowVersion = '!>=25.10.4' + version = '1.0.0' doi = '' } // Nextflow plugins plugins { - id 'nf-schema@2.3.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet + id 'nf-schema@2.5.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet } -validation { - defaultIgnoreParams = ["genomes"] - monochromeLogs = params.monochrome_logs - help { - enabled = true - command = "nextflow run nf-core/dmscore -profile --input samplesheet.csv --outdir " - fullParameter = "help_full" - showHiddenParameter = "show_hidden" - beforeText = """ --\033[2m----------------------------------------------------\033[0m- - \033[0;32m,--.\033[0;30m/\033[0;32m,-.\033[0m -\033[0;34m ___ __ __ __ ___ \033[0;32m/,-._.--~\'\033[0m -\033[0;34m |\\ | |__ __ / ` / \\ |__) |__ \033[0;33m} {\033[0m -\033[0;34m | \\| | \\__, \\__/ | \\ |___ \033[0;32m\\`-._,-`-,\033[0m - 
\033[0;32m`._,._,\'\033[0m -\033[0;35m nf-core/dmscore ${manifest.version}\033[0m --\033[2m----------------------------------------------------\033[0m- -""" - afterText = """${manifest.doi ? "\n* The pipeline\n" : ""}${manifest.doi.tokenize(",").collect { " https://doi.org/${it.trim().replace('https://doi.org/','')}"}.join("\n")}${manifest.doi ? "\n" : ""} -* The nf-core framework - https://doi.org/10.1038/s41587-020-0439-x - -* Software dependencies - https://github.com/nf-core/dmscore/blob/master/CITATIONS.md -""" - } - summary { - beforeText = validation.help.beforeText - afterText = validation.help.afterText - } -} // Load modules.config for DSL2 module specific options includeConfig 'conf/modules.config' diff --git a/nextflow_schema.json b/nextflow_schema.json index e4e5523..a8da70f 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -1,8 +1,8 @@ { "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://raw.githubusercontent.com/nf-core/dmscore/master/nextflow_schema.json", - "title": "nf-core/dmscore pipeline parameters", - "description": "Until now, most Deep Mutational Scanning (DMS) experiments relied on variant-specific barcoded libraries for sequencing. This method enabled DMS on large proteins and led to many great publications. Recently, efforts have increased to make use of the classic and more simple random fragmentation-based short-read sequencing (“shotgun-sequencing”). This saves time and money and due to its simpler experimental design is less prone to mistakes. dmscore handles the essential computational steps, processing the raw FASTQ files and generating a count table of variants. 
Along the way, it provides multiple QC metrics, enabling users to quickly evaluate the success of their experimental setup.", + "$id": "https://raw.githubusercontent.com/nf-core/deepmutscan/master/nextflow_schema.json", + "title": "nf-core/deepmutscan pipeline parameters", + "description": "Until now, most Deep Mutational Scanning (DMS) experiments relied on variant-specific barcoded libraries for sequencing. This method enabled DMS on large proteins and led to many great publications. Recently, efforts have increased to make use of the classic and more simple random fragmentation-based short-read sequencing (\u201cshotgun-sequencing\u201d). This saves time and money and due to its simpler experimental design is less prone to mistakes. dmscore handles the essential computational steps, processing the raw FASTQ files and generating a count table of variants. Along the way, it provides multiple QC metrics, enabling users to quickly evaluate the success of their experimental setup.", "type": "object", "$defs": { "input_output_options": { @@ -10,7 +10,7 @@ "type": "object", "fa_icon": "fas fa-terminal", "description": "Define where the pipeline should find input data and save output data.", - "required": ["input", "outdir"], + "required": ["input", "outdir", "fasta", "reading_frame"], "properties": { "input": { "type": "string", @@ -20,7 +20,7 @@ "mimetype": "text/csv", "pattern": "^\\S+\\.csv$", "description": "Path to comma-separated file containing information about the samples in the experiment.", - "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/dmscore/usage#samplesheet-input).", + "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. 
Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/deepmutscan/usage#samplesheet-input).", "fa_icon": "fas fa-file-csv" }, "outdir": { @@ -40,6 +40,54 @@ "type": "string", "description": "MultiQC report title. Printed as page header, used for filename if not otherwise specified.", "fa_icon": "fas fa-file-signature" + }, + "reading_frame": { + "type": "string", + "description": "Start and stop codon positions in the format 'start-stop', e.g., '352-1383'.", + "pattern": "^\\d+-\\d+$" + }, + "min_counts": { + "type": "integer", + "description": "minimum counts for variant to be recognized. All variants below min_counts will be set to 0", + "minimum": 1, + "default": 10 + }, + "sliding_window_size": { + "type": "integer", + "description": "To flatten graphs in plots (e.g. `GLOBAL_POS_BIASES_COUNTS` function)", + "default": 10 + }, + "aimed_cov": { + "type": "integer", + "description": "aimed coverage (assuming equal spread) to visualize threshold in plots", + "default": 100 + }, + "mutagenesis_type": { + "type": "string", + "description": "Type of mutagenic primers. Choose from nnk, nns, nnh, nnn, nnk_nns, nnk_nns_nnh, custom. When using 'custom', also provide the parameter 'custom_codon_library'.", + "default": "nnk" + }, + "custom_codon_library": { + "type": "string", + "format": "file-path", + "description": "Path to a file defining a custom codon library. Required when mutagenesis_type is set to `custom`. The script auto-detects the format: provide either a global comma-separated list of codons (e.g., 'AAA,AAC,AAG') OR a position-specific, headerless list where each row specifies the target sequence position followed by its allowed codons (e.g., '1,ACG,AAA,ACA' and '2,AAA,TTT,ACA' on separate lines). 
Both as .csv.", + "default": "/NULL" + }, + "dimsum": { + "type": ["boolean", "string"], + "description": "Run DiMSum for fitness/functionality scores from growth-based selection input & output samples" + }, + "mutscan": { + "type": ["boolean", "string"], + "description": "Run mutscan (Soneson et al., 2023) to estimate fitness/functionality scores from input & output samples" + }, + "fitness": { + "type": ["boolean", "string"], + "description": "Enable basic fitness calculation and preceded data preparation." + }, + "run_seqdepth": { + "type": ["boolean", "string"], + "description": "Whether to run the SeqDepth simulation module." } } }, @@ -74,7 +122,6 @@ }, "igenomes_base": { "type": "string", - "format": "directory-path", "description": "The base path to the igenomes reference files", "fa_icon": "fas fa-ban", "hidden": true, @@ -180,13 +227,6 @@ "fa_icon": "fas fa-palette", "hidden": true }, - "hook_url": { - "type": "string", - "description": "Incoming hook URL for messaging service", - "fa_icon": "fas fa-people-group", - "help_text": "Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.", - "hidden": true - }, "multiqc_config": { "type": "string", "format": "file-path", @@ -224,6 +264,14 @@ "fa_icon": "far calendar", "description": "Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.", "hidden": true + }, + "help_full": { + "type": "boolean", + "description": "Display the full detailed help message." + }, + "show_hidden": { + "type": "boolean", + "description": "Display hidden parameters in the help message (only works when --help or --help_full are provided)." } } } diff --git a/nf-test.config b/nf-test.config new file mode 100644 index 0000000..c0c14da --- /dev/null +++ b/nf-test.config @@ -0,0 +1,39 @@ +config { + // location for all nf-test tests + testsDir = "." 
+ + // nf-test directory including temporary files for each test + workDir = System.getenv("NFT_WORKDIR") ?: ".nf-test" + + // location of an optional nextflow.config file specific for executing tests + configFile = "tests/nextflow.config" + + // ignore tests coming from the nf-core/modules repo + ignore = [ + 'modules/nf-core/**/tests/*', + 'subworkflows/nf-core/**/tests/*', + ] + + // run all test with defined profile(s) from the main nextflow.config + profile = "test" + + // list of filenames or patterns that should be trigger a full test run + triggers = [ + '.github/actions/nf-test/action.yml', + '.github/workflows/nf-test.yml', + 'assets/schema_input.json', + 'bin/*', + 'conf/test.config', + 'nextflow.config', + 'nextflow_schema.json', + 'nf-test.config', + 'tests/.nftignore', + 'tests/nextflow.config', + ] + + // load the necessary plugins + plugins { + load "nft-utils@0.0.3" + load "nft-csv@0.1.0" + } +} diff --git a/ro-crate-metadata.json b/ro-crate-metadata.json index 766d934..1a0a29a 100644 --- a/ro-crate-metadata.json +++ b/ro-crate-metadata.json @@ -1,6 +1,6 @@ { "@context": [ - "https://w3id.org/ro/crate/1.1/context", + "https://w3id.org/ro/crate/1.2/context", { "GithubService": "https://w3id.org/ro/terms/test#GithubService", "JenkinsService": "https://w3id.org/ro/terms/test#JenkinsService", @@ -21,9 +21,9 @@ { "@id": "./", "@type": "Dataset", - "creativeWorkStatus": "InProgress", - "datePublished": "2025-01-22T16:52:07+00:00", - "description": "

\n \n \n \"nf-core/dmscore\"\n \n

\n\n[![GitHub Actions CI Status](https://github.com/nf-core/dmscore/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-core/dmscore/actions/workflows/ci.yml)\n[![GitHub Actions Linting Status](https://github.com/nf-core/dmscore/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/dmscore/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/dmscore/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/)\n[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)\n[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/dmscore)\n\n[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23dmscore-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/dmscore)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on 
YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)\n\n## Introduction\n\n**nf-core/dmscore** is a bioinformatics pipeline that ...\n\n\n\n\n1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.\n\n\n\nNow, you can run the pipeline using:\n\n\n\n```bash\nnextflow run nf-core/dmscore \\\n -profile \\\n --input samplesheet.csv \\\n --outdir \n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).\n\nFor more details and further functionality, please refer to the [usage documentation](https://nf-co.re/dmscore/usage) and the [parameter documentation](https://nf-co.re/dmscore/parameters).\n\n## Pipeline output\n\nTo see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/dmscore/results) tab on the nf-core website pipeline page.\nFor more details about the output files and reports, please refer to the\n[output documentation](https://nf-co.re/dmscore/output).\n\n## Credits\n\nnf-core/dmscore was originally written by Benjamin Wehnert & Max Stammnitz.\n\nWe thank the following people for their extensive assistance in the development of this pipeline:\n\n\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the 
[contributing guidelines](.github/CONTRIBUTING.md).\n\nFor further information or help, don't hesitate to get in touch on the [Slack `#dmscore` channel](https://nfcore.slack.com/channels/dmscore) (you can join with [this invite](https://nf-co.re/join/slack)).\n\n## Citations\n\n\n\n\n\n\nAn extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nYou can cite the `nf-core` publication as follows:\n\n> **The nf-core framework for community-curated bioinformatics pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", + "creativeWorkStatus": "Stable", + "datePublished": "2026-06-13T18:39:59+00:00", + "description": "

\n \n \n \"nf-core/deepmutscan\"\n \n

\n\n[![GitHub Actions CI Status](https://github.com/nf-core/deepmutscan/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-core/deepmutscan/actions/workflows/ci.yml)\n[![GitHub Actions Linting Status](https://github.com/nf-core/deepmutscan/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/deepmutscan/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/deepmutscan/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.10.4-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)\n[![nf-core template version](https://img.shields.io/badge/nf--core_template-4.0.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/4.0.2)\n[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)\n[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/deepmutscan)\n\n[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23deepmutscan-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/deepmutscan)[![Follow on 
Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)\n\n## Introduction\n\n**nf-core/deepmutscan** is a workflow designed for the analysis of deep mutational scanning (DMS) data. DMS enables researchers to experimentally measure the fitness effects of thousands of genes or gene variants simultaneously, helping to classify disease causing mutants in human and animal populations, to learn the fundamental rules of protein architecture, small-molecule binding, mRNA splicing, viral evolution and many other quantifiable phenotypes.\n\nWhile DNA synthesis and sequencing technologies have advanced substantially, long open reading frame (ORF) targets still present a major challenge for DMS studies. Shotgun DNA sequencing can be used to greatly speed up the inference of long ORF mutant fitness landscapes, theoretically at no expense in accuracy. We have designed the `nf-core/deepmutscan` pipeline to unlock the power of shotgun sequencing based DMS studies on long ORFs, to simplify and standardise the complex bioinformatics steps involved in data processing of such experiments – from read alignment to QC reporting and fitness landscape inferences.\n\n![nf-core/deepmutscan workflow](docs/images/pipeline.png)\n\nThe pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. 
The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!\n\nOn release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/deepmutscan/results).\n\n## Major features\n\n- End-to-end analyses of various DMS data\n- Modular, three-stage workflow: alignment → QC → error-aware fitness estimation\n- Integration with popular statistical fitness estimation tools like [DiMSum](https://github.com/lehner-lab/DiMSum), [Enrich2](https://github.com/FowlerLab/Enrich2), [rosace](https://github.com/pimentellab/rosace/) and [mutscan](https://github.com/fmicompbio/mutscan)\n- Support of multiple mutagenesis strategies, e.g. by nicking with degenerate NNK and NNS codons\n- Containerisation via Docker, Singularity and Apptainer\n- Scalability across HPC and Cloud systems\n- Monitoring of CPU, memory, and CO₂ usage\n\nFor more details on the pipeline and on potential future expansions, please consider reading our [usage description](https://nf-co.re/deepmutscan/usage).\n\n## Step-by-step pipeline summary\n\nThe pipeline processes deep mutational scanning (DMS) sequencing data in several stages:\n\n1. Alignment of reads to the reference open reading frame (ORF) (`BWA-mem`)\n2. 
Filtering of wildtype and erroneous reads (`samtools view`)\n3. Read merging for base error reduction (`vsearch merge`)\n4. Mutation counting\n5. Single nucleotide variant error correction\n6. DMS library quality control\n7. Data summarisation across samples\n8. Fitness estimation (`DiMSum`, `mutscan`)\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/get_started/environment_setup/overview) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/get_started/run-your-first-pipeline) with `-profile test` before running the workflow on actual data.\n\nFirst, prepare a samplesheet with your input/output data in which each row represents a pair of fastq files (paired end). This should look as follows:\n\n```csv title=\"samplesheet.csv\"\nsample,type,replicate,file1,file2\nORF1,input,1,/reads/forward1.fastq.gz,/reads/reverse1.fastq.gz\nORF1,input,2,/reads/forward2.fastq.gz,/reads/reverse2.fastq.gz\nORF1,output,1,/reads/forward3.fastq.gz,/reads/reverse3.fastq.gz\nORF1,output,2,/reads/forward4.fastq.gz,/reads/reverse4.fastq.gz\n```\n\nSecondly, specify the gene or gene region of interest using a reference FASTA file via `--fasta`. 
Provide the exact codon coordinates using `--reading_frame`.\n\nNow, you can run the pipeline using:\n\n```bash title=\"example_run.sh\"\nnextflow run nf-core/deepmutscan \\\n -profile \\\n --input ./samplesheet.csv \\\n --fasta ./ref.fa \\\n --reading_frame 1-300 \\\n --outdir ./results\n```\n\n## Pipeline output\n\nTo see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/deepmutscan/results) tab on the nf-core website pipeline page.\n\nFor more details about the output files and reports, please refer to the\n[output documentation](https://nf-co.re/deepmutscan/output).\n\n## Contributing\n\nWe welcome contributions from the community!\n\nFor technical challenges and feedback on the pipeline, please use our [Github repository](https://github.com/nf-core/deepmutscan). Please open an [issue](https://github.com/nf-core/deepmutscan/issues/new) or [pull request](https://github.com/nf-core/deepmutscan/compare) to:\n\n- Report bugs or solve data incompatibilities when running `nf-core/deepmutscan`\n- Suggest the implementation of new modules for custom DMS workflows\n- Help improve this documentation\n\nIf you are interested in getting involved as a developer, please consider joining our interactive [`#deepmutscan` Slack channel](https://nfcore.slack.com/channels/deepmutscan) (via [this invite](https://nf-co.re/join/slack)).\n\n## Credits\n\nnf-core/deepmutscan was originally written by [Benjamin Wehnert](https://github.com/BenjaminWehnert1008) and [Max Stammnitz](https://github.com/MaximilianStammnitz) at the [Centre for Genomic Regulation, Barcelona](https://www.crg.eu/), with the generous support of an EMBO Long-term Postdoctoral Fellowship and a Marie Skłodowska-Curie grant by the European Union.\n\nIf you use `nf-core/deepmutscan` in your analyses, please cite:\n\n> 📄 Wehnert et al., _bioRxiv_ preprint (coming soon)\n\nPlease also cite the `nf-core` framework:\n\n> 📄 Ewels et al., _Nature Biotechnology_, 2020\n> 
[https://doi.org/10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x)\n\nFor further information or help, don't hesitate to get in touch on the [Slack `#deepmutscan` channel](https://nfcore.slack.com/channels/deepmutscan) (you can join with [this invite](https://nf-co.re/join/slack)).\n\n## Scientific contact\n\nFor scientific discussions around the use of this pipeline (e.g. on experimental design or sequencing data requirements), please feel free to get in touch with us directly:\n\n- Benjamin Wehnert — wehnertbenjamin@gmail.com\n- Maximilian Stammnitz — maximilian.stammnitz@crg.eu\n", "hasPart": [ { "@id": "main.nf" @@ -43,6 +43,9 @@ { "@id": "modules/" }, + { + "@id": "modules/local/" + }, { "@id": "modules/nf-core/" }, @@ -92,17 +95,17 @@ "@id": ".prettierignore" } ], - "isBasedOn": "https://github.com/nf-core/dmscore", + "isBasedOn": "https://github.com/nf-core/deepmutscan", "license": "MIT", "mainEntity": { "@id": "main.nf" }, "mentions": [ { - "@id": "#f9815654-37bd-4781-a133-ab36324210f5" + "@id": "#7146010e-cfc6-420d-99f0-b8197ccd79c7" } ], - "name": "nf-core/dmscore" + "name": "nf-core/deepmutscan" }, { "@id": "ro-crate-metadata.json", @@ -112,7 +115,7 @@ }, "conformsTo": [ { - "@id": "https://w3id.org/ro/crate/1.1" + "@id": "https://w3id.org/ro/crate/1.2" }, { "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0" @@ -126,18 +129,27 @@ "SoftwareSourceCode", "ComputationalWorkflow" ], + "contributor": [ + { + "@id": "#72177bf6-a8bc-482d-af8f-8a6c470a26a6" + } + ], "dateCreated": "", - "dateModified": "2025-01-22T17:52:07Z", + "dateModified": "2026-06-13T20:39:59Z", "dct:conformsTo": "https://bioschemas.org/profiles/ComputationalWorkflow/1.0-RELEASE/", "keywords": [ "nf-core", - "nextflow" + "nextflow", + "community-workflow", + "deep-mutational-scanning", + "genotype-phenotype", + "shotgun-sequencing" ], "license": [ "MIT" ], "name": [ - "nf-core/dmscore" + "nf-core/deepmutscan" ], "programmingLanguage": { "@id": 
"https://w3id.org/workflowhub/workflow-ro-crate#nextflow" @@ -146,11 +158,11 @@ "@id": "https://nf-co.re/" }, "url": [ - "https://github.com/nf-core/dmscore", - "https://nf-co.re/dmscore/dev/" + "https://github.com/nf-core/deepmutscan", + "https://nf-co.re/deepmutscan/1.0.0/" ], "version": [ - "1.0.0dev" + "1.0.0" ] }, { @@ -163,26 +175,26 @@ "url": { "@id": "https://www.nextflow.io/" }, - "version": "!>=24.04.2" + "version": "!>=25.10.4" }, { - "@id": "#f9815654-37bd-4781-a133-ab36324210f5", + "@id": "#7146010e-cfc6-420d-99f0-b8197ccd79c7", "@type": "TestSuite", "instance": [ { - "@id": "#34dab5c1-0a1b-41b0-80d1-74e2f2d38434" + "@id": "#a6466342-1368-4c50-86fc-cb19fd46c1cc" } ], "mainEntity": { "@id": "main.nf" }, - "name": "Test suite for nf-core/dmscore" + "name": "Test suite for nf-core/deepmutscan" }, { - "@id": "#34dab5c1-0a1b-41b0-80d1-74e2f2d38434", + "@id": "#a6466342-1368-4c50-86fc-cb19fd46c1cc", "@type": "TestInstance", - "name": "GitHub Actions workflow for testing nf-core/dmscore", - "resource": "repos/nf-core/dmscore/actions/workflows/ci.yml", + "name": "GitHub Actions workflow for testing nf-core/deepmutscan", + "resource": "repos/nf-core/deepmutscan/actions/workflows/nf-test.yml", "runsOn": { "@id": "https://w3id.org/ro/terms/test#GithubService" }, @@ -221,6 +233,11 @@ "@type": "Dataset", "description": "Modules used by the pipeline" }, + { + "@id": "modules/local/", + "@type": "Dataset", + "description": "Pipeline-specific modules" + }, { "@id": "modules/nf-core/", "@type": "Dataset", @@ -306,6 +323,11 @@ "@type": "Organization", "name": "nf-core", "url": "https://nf-co.re/" + }, + { + "@id": "#72177bf6-a8bc-482d-af8f-8a6c470a26a6", + "@type": "Person", + "name": "Benjamin Wehnert & Max Stammnitz" } ] } \ No newline at end of file diff --git a/subworkflows/local/calculate_fitness/main.nf b/subworkflows/local/calculate_fitness/main.nf new file mode 100644 index 0000000..5f63c16 --- /dev/null +++ b/subworkflows/local/calculate_fitness/main.nf @@ -0,0 
+1,144 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + IMPORT MODULES +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ +include { MERGE_COUNTS } from '../../../modules/local/fitness/merge_counts/main' +include { EXPDESIGN_FITNESS } from '../../../modules/local/fitness/fitness_experimental_design/main' +include { FIND_SYNONYMOUS_MUTATION } from '../../../modules/local/fitness/find_synonymous_mutation/main' +include { FITNESS_CALCULATION } from '../../../modules/local/fitness/fitness_calculation/main' +include { FITNESS_QC } from '../../../modules/local/fitness/fitness_QC/main' +include { FITNESS_HEATMAP } from '../../../modules/local/fitness/fitness_heatmap/main' +include { RUN_DIMSUM } from '../../../modules/local/fitness/run_dimsum/main' +include { RUN_MUTSCAN } from '../../../modules/local/fitness/run_mutscan/main' + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + SUBWORKFLOW DEFINITION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow CALCULATE_FITNESS { + + take: + ch_fitness_input // channel: output from GATK_GATKTOFITNESS (grouped by sample/replicate logic) + ch_samplesheet_csv // channel: path to samplesheet csv + ch_fasta // channel: tuple([meta], path(fasta)) + val_reading_frame // value: reading frame param + ch_aa_seq // channel: output from DMSANALYSIS_AASEQ (for heatmap) + + main: + ch_versions = Channel.empty() + + // 1. Group per biological sample and merge counts + ch_fitness_input + .map { meta, tsv -> + def s = meta.sample as String + def id = meta.id as String + def base = s ? 
(s.replaceFirst(/_(input|output|quality)\d+$/, '')) + : (id?.tokenize('_')?.first()) + tuple(base as String, tuple(meta, tsv)) + } + .groupTuple() + .map { base, pairs -> + def metas = pairs.collect { it[0] } + def inputs = pairs.findAll { it[0].type == 'input' }.sort { it[0].replicate }.collect { it[1] } + def outputs = pairs.findAll { it[0].type == 'output' }.sort { it[0].replicate }.collect { it[1] } + tuple([sample: base], metas, inputs, outputs) + } + .filter { smeta, metas, ins, outs -> ins && outs } + .set { ch_fitness_bundled } + + // 2. Merge Counts + MERGE_COUNTS( ch_fitness_bundled ) + ch_versions = ch_versions.mix(MERGE_COUNTS.out.versions) + + // 3. Experimental Design + EXPDESIGN_FITNESS( ch_samplesheet_csv ) + ch_versions = ch_versions.mix(EXPDESIGN_FITNESS.out.versions) + + // 4. Find Synonymous Mutations + // Prepare inputs (broadcast logic) + def ch_fasta_path = ch_fasta.map { it[1] } // strip meta + + FIND_SYNONYMOUS_MUTATION( + MERGE_COUNTS.out.merged_counts, + ch_fasta_path.combine(MERGE_COUNTS.out.merged_counts).map { it[0] }, + val_reading_frame.combine(MERGE_COUNTS.out.merged_counts).map { it[0] } + ) + ch_versions = ch_versions.mix(FIND_SYNONYMOUS_MUTATION.out.versions) + + + // 5. 
Align Channels for Fitness Calculation + + // Key counts and WT by biological sample name + def ch_counts_keyed_d = MERGE_COUNTS.out.merged_counts + .map { smp, counts -> tuple(smp.sample as String, smp, counts) } + + def ch_wt_keyed_d = FIND_SYNONYMOUS_MUTATION.out.synonymous_wt + .map { smp, wt -> tuple(smp.sample as String, wt) } + + // Join by key + def ch_counts_wt_d = ch_counts_keyed_d.join(ch_wt_keyed_d) + .map { key, smp, counts, wt -> tuple(smp, counts, wt) } + + // Broadcast experimental design + def ch_exp_for_each_d = EXPDESIGN_FITNESS.out.experimental_design + .combine(ch_counts_wt_d) + .map { it[0] } + + // Final aligned channels + def ch_run_counts_d = ch_counts_wt_d.map { smp, counts, wt -> tuple(smp, counts) } + def ch_run_wt_d = ch_counts_wt_d.map { smp, counts, wt -> wt } + def ch_run_exp_d = ch_exp_for_each_d + + // 6. Run Fitness Calculation & QC + FITNESS_CALCULATION( + ch_run_counts_d, + ch_run_exp_d, + ch_run_wt_d + ) + ch_versions = ch_versions.mix(FITNESS_CALCULATION.out.versions) + + FITNESS_QC( FITNESS_CALCULATION.out.fitness_estimation ) + ch_versions = ch_versions.mix(FITNESS_QC.out.versions) + + FITNESS_HEATMAP( + FITNESS_CALCULATION.out.fitness_estimation, + ch_aa_seq + ) + ch_versions = ch_versions.mix(FITNESS_HEATMAP.out.versions) + + // 7. Run DiMSum (optional based on params inside subworkflow or handled by control logic) + // Note: Logic checking for 'params.dimsum' needs to be handled. + // Since subworkflows inherit params, we can check params.dimsum here. + + if (params.dimsum) { + log.warn(""" + '--dimsum true' only works together with '--fitness true' + and is currently (30 Oct 2025) NOT supported on ARM processors. + Use AMD/x86_64 systems for DiMSum execution. + """) + + RUN_DIMSUM( + ch_run_counts_d, + ch_run_wt_d, + ch_run_exp_d + ) + ch_versions = ch_versions.mix(RUN_DIMSUM.out.versions) + } + + // 8. 
Run Mutscan + if (params.mutscan) { + RUN_MUTSCAN( + ch_run_counts_d, + ch_run_wt_d, + ch_run_exp_d + ) + ch_versions = ch_versions.mix(RUN_MUTSCAN.out.versions) + } + + emit: + fitness_estimation = FITNESS_CALCULATION.out.fitness_estimation + versions = ch_versions +} diff --git a/subworkflows/local/calculate_fitness/meta.yml b/subworkflows/local/calculate_fitness/meta.yml new file mode 100644 index 0000000..46a4a9e --- /dev/null +++ b/subworkflows/local/calculate_fitness/meta.yml @@ -0,0 +1,42 @@ +name: "calculate_fitness" +description: Consolidates variant counts and calculates fitness scores. +keywords: + - "deep mutational scanning" + - "fitness estimation" + - "dms" + - "enrichment scores" +components: + - "merge/counts" + - "fitness/calculation" + - "fitness/heatmap" + - "fitness/qc" + - "run/dimsum" + - "run/mutscan" + - "expdesign/fitness" + - "find/synonymous/mutation" +input: + - ch_fitness_input: + type: channel + description: GATK variant count channels + - ch_samplesheet_csv: + type: channel + description: Pipeline samplesheet + - ch_fasta: + type: channel + description: Reference sequence channel + - val_reading_frame: + type: value + description: Reading frame parameter + - ch_aa_seq: + type: channel + description: Amino acid sequence channel +output: + - fitness_estimation: + type: channel + description: Output fitness metrics + - versions: + type: channel + description: Software versions +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/subworkflows/local/utils_nfcore_dmscore_pipeline/main.nf b/subworkflows/local/utils_nfcore_deepmutscan_pipeline/main.nf similarity index 73% rename from subworkflows/local/utils_nfcore_dmscore_pipeline/main.nf rename to subworkflows/local/utils_nfcore_deepmutscan_pipeline/main.nf index 14eba3c..50e4edb 100644 --- a/subworkflows/local/utils_nfcore_dmscore_pipeline/main.nf +++ b/subworkflows/local/utils_nfcore_deepmutscan_pipeline/main.nf @@ -1,5 +1,5 @@ // -// Subworkflow with 
functionality specific to the nf-core/dmscore pipeline +// Subworkflow with functionality specific to the nf-core/deepmutscan pipeline // /* @@ -10,10 +10,8 @@ include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' include { paramsSummaryMap } from 'plugin/nf-schema' -include { samplesheetToList } from 'plugin/nf-schema' include { completionEmail } from '../../nf-core/utils_nfcore_pipeline' include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' -include { imNotification } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' @@ -50,10 +48,43 @@ workflow PIPELINE_INITIALISATION { // // Validate parameters and generate parameter summary to stdout // + + def before_text = "" + def after_text = "" + before_text = """ +-\033[2m----------------------------------------------------\033[0m- + \033[0;32m,--.\033[0;30m/\033[0;32m,-.\033[0m +\033[0;34m ___ __ __ __ ___ \033[0;32m/,-._.--~\'\033[0m +\033[0;34m |\\ | |__ __ / ` / \\ |__) |__ \033[0;33m} {\033[0m +\033[0;34m | \\| | \\__, \\__/ | \\ |___ \033[0;32m\\`-._,-`-,\033[0m + \033[0;32m`._,._,\'\033[0m +\033[0;35m nf-core/deepmutscan ${workflow.manifest.version}\033[0m +-\033[2m----------------------------------------------------\033[0m- +""" + after_text = """${workflow.manifest.doi ? "\n* The pipeline\n" : ""}${workflow.manifest.doi.tokenize(",").collect { doi -> " https://doi.org/${doi.trim().replace('https://doi.org/','')}"}.join("\n")}${workflow.manifest.doi ? 
"\n" : ""} +* The nf-core framework + https://doi.org/10.1038/s41587-020-0439-x + +* Software dependencies + https://github.com/nf-core/deepmutscan/blob/master/CITATIONS.md +""" + if (monochrome_logs) { + before_text = before_text.replaceAll(/\033\[[0-9;]*m/, '') + } + + command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " + UTILS_NFSCHEMA_PLUGIN ( workflow, validate_params, - null + params.help, + params.help_full, + params.show_hidden, + null, + before_text, + after_text, + command, + "${projectDir}/nextflow_schema.json" ) // @@ -73,25 +104,39 @@ workflow PIPELINE_INITIALISATION { // Channel - .fromList(samplesheetToList(params.input, "${projectDir}/assets/schema_input.json")) - .map { - meta, fastq_1, fastq_2 -> - if (!fastq_2) { - return [ meta.id, meta + [ single_end:true ], [ fastq_1 ] ] - } else { - return [ meta.id, meta + [ single_end:false ], [ fastq_1, fastq_2 ] ] - } - } - .groupTuple() - .map { samplesheet -> - validateInputSamplesheet(samplesheet) + .fromPath(params.input) + .splitCsv(header: true) + .filter { row -> + // Skip rows where file1 or file2 are BAM files + !(row.file1.endsWith('.bam') || (row.file2 && row.file2.endsWith('.bam'))) + } + .map { row -> + // Determine suffix based on the presence of file2 + def suffix = row.file2 ? 
"_pe" : "_se" + + // Construct metadata object with updated ID + def meta = [ + id : "${row.sample}_${row.type}_${row.replicate}${suffix}", // Base ID with suffix + sample : row.sample, + type : row.type, + replicate : row.replicate as int + ] + + // Generate file paths based on the presence of file1 and file2 + def reads = [] + if (row.file1) { + reads << row.file1 // Add file1 path } - .map { - meta, fastqs -> - return [ meta, fastqs.flatten() ] + if (row.file2) { + reads << row.file2 // Add file2 path } - .set { ch_samplesheet } + // Return metadata and file paths as a tuple + return [meta, reads] + } + .set { ch_samplesheet } + + // Emit the samplesheet channel and an empty version channel for use in the workflow emit: samplesheet = ch_samplesheet versions = ch_versions @@ -111,7 +156,6 @@ workflow PIPELINE_COMPLETION { plaintext_email // boolean: Send plain-text email instead of HTML outdir // path: Path to output directory where results will be published monochrome_logs // boolean: Disable ANSI colour codes in log output - hook_url // string: hook URL for notifications multiqc_report // string: Path to MultiQC report main: @@ -135,13 +179,11 @@ workflow PIPELINE_COMPLETION { } completionSummary(monochrome_logs) - if (hook_url) { - imNotification(summary_params, hook_url) - } + } workflow.onError { - log.error "Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting" + log.error "Pipeline failed. Please refer to troubleshooting docs for common issues: https://nf-co.re/docs/running/troubleshooting" } } @@ -200,7 +242,6 @@ def genomeExistsError() { // Generate methods description for MultiQC // def toolCitationText() { - // TODO nf-core: Optionally add in-text citation tools to this list. // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "Tool (Foo et al. 
2023)" : "", // Uncomment function in methodsDescriptionText to render in MultiQC report def citation_text = [ @@ -214,7 +255,6 @@ def toolCitationText() { } def toolBibliographyText() { - // TODO nf-core: Optionally add bibliographic entries to this list. // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "
<li>Author (2023) Pub name, Journal, DOI</li>" : "", // Uncomment function in methodsDescriptionText to render in MultiQC report def reference_text = [ @@ -249,9 +289,8 @@ def methodsDescriptionText(mqc_methods_yaml) { meta["tool_citations"] = "" meta["tool_bibliography"] = "" - // TODO nf-core: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled! - // meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".") - // meta["tool_bibliography"] = toolBibliographyText() + meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".") + meta["tool_bibliography"] = toolBibliographyText() def methods_text = mqc_methods_yaml.text @@ -261,4 +300,3 @@ def methodsDescriptionText(mqc_methods_yaml) { return description_html.toString() } - diff --git a/subworkflows/local/utils_nfcore_deepmutscan_pipeline/meta.yml b/subworkflows/local/utils_nfcore_deepmutscan_pipeline/meta.yml new file mode 100644 index 0000000..2f5669a --- /dev/null +++ b/subworkflows/local/utils_nfcore_deepmutscan_pipeline/meta.yml @@ -0,0 +1,58 @@ +name: "PIPELINE_INITIALISATION" +description: Local utility functions, pipeline initialization, completion workflows, and parameter validation wrappers for the deepmutscan pipeline. 
+keywords: + - "utility" + - "validation" + - "manifest" + - "initialisation" + - "completion" +components: + - "utils_nfschema_plugin" + - "utils_nfcore_pipeline" + - "utils_nextflow_pipeline" + - "completionemail" + - "completionsummary" + +input: + - version: + type: boolean + description: Display version and exit + - validate_params: + type: boolean + description: Boolean whether to validate parameters against the schema at runtime + - monochrome_logs: + type: boolean + description: Do not use coloured log outputs + - nextflow_cli_args: + type: array + description: List of positional nextflow CLI args + - outdir: + type: string + description: The output directory where the results will be saved + - input: + type: string + description: Path to input samplesheet + - email: + type: string + description: Email address for completion summary + - email_on_fail: + type: string + description: Email address sent on pipeline failure + - plaintext_email: + type: boolean + description: Send plain-text email instead of HTML + - multiqc_report: + type: string + description: Path to MultiQC report + +output: + - samplesheet: + type: channel + description: Processed samplesheet channel containing metadata objects and file paths + - versions: + type: channel + description: Empty version channel for use in the workflow + +authors: + - "@BenjaminWehnert1008" + - "@MaximilianStammnitz" diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf index d6e593e..37939ac 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf @@ -73,11 +73,23 @@ def getWorkflowVersion() { def dumpParametersToJSON(outdir) { def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') def filename = "params_${timestamp}.json" - def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") - def jsonStr = groovy.json.JsonOutput.toJson(params) + def temp_pf = 
workflow.launchDir.resolve(".${filename}") + def jsonGenerator = new groovy.json.JsonGenerator.Options() + .excludeNulls() + .addConverter(Path) { Path path -> path.toUriString() } + .addConverter(Duration) { Duration duration -> duration.toMillis() } + .addConverter(MemoryUnit) { MemoryUnit memory -> memory.toBytes() } + .addConverter(nextflow.script.types.VersionNumber) { nextflow.script.types.VersionNumber version -> version.toString() } + .build() + def jsonStr = jsonGenerator.toJson(params) temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) - - nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + if (outdir instanceof Path) { + temp_pf.copyTo(outdir.resolve("pipeline_info/${filename}")) + } else if (outdir instanceof String) { + temp_pf.copyTo("${outdir}/pipeline_info/params_${timestamp}.json") + } else { + log.warn("Could not determine type of outdir, parameters JSON file will not be copied to output directory!") + } temp_pf.delete() } diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml b/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml deleted file mode 100644 index f847611..0000000 --- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nextflow_pipeline: - - subworkflows/nf-core/utils_nextflow_pipeline/** diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf index bfd2587..afca543 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf @@ -17,7 +17,7 @@ workflow UTILS_NFCORE_PIPELINE { checkProfileProvided(nextflow_cli_args) emit: - valid_config + valid_config = valid_config } /* @@ -98,7 +98,7 @@ def workflowVersionToYAML() { // Get channel of software versions used in pipeline in YAML format // def softwareVersionsToYAML(ch_versions) { - return ch_versions.unique().map { version 
-> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML())) + return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(channel.of(workflowVersionToYAML())) } // @@ -353,67 +353,3 @@ def completionSummary(monochrome_logs=true) { log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") } } - -// -// Construct and send a notification to a web server as JSON e.g. Microsoft Teams and Slack -// -def imNotification(summary_params, hook_url) { - def summary = [:] - summary_params - .keySet() - .sort() - .each { group -> - summary << summary_params[group] - } - - def misc_fields = [:] - misc_fields['start'] = workflow.start - misc_fields['complete'] = workflow.complete - misc_fields['scriptfile'] = workflow.scriptFile - misc_fields['scriptid'] = workflow.scriptId - if (workflow.repository) { - misc_fields['repository'] = workflow.repository - } - if (workflow.commitId) { - misc_fields['commitid'] = workflow.commitId - } - if (workflow.revision) { - misc_fields['revision'] = workflow.revision - } - misc_fields['nxf_version'] = workflow.nextflow.version - misc_fields['nxf_build'] = workflow.nextflow.build - misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp - - def msg_fields = [:] - msg_fields['version'] = getWorkflowVersion() - msg_fields['runName'] = workflow.runName - msg_fields['success'] = workflow.success - msg_fields['dateComplete'] = workflow.complete - msg_fields['duration'] = workflow.duration - msg_fields['exitStatus'] = workflow.exitStatus - msg_fields['errorMessage'] = (workflow.errorMessage ?: 'None') - msg_fields['errorReport'] = (workflow.errorReport ?: 'None') - msg_fields['commandLine'] = workflow.commandLine.replaceFirst(/ +--hook_url +[^ ]+/, "") - msg_fields['projectDir'] = workflow.projectDir - msg_fields['summary'] = summary << misc_fields - - // Render the JSON template - def engine = new 
groovy.text.GStringTemplateEngine() - // Different JSON depending on the service provider - // Defaults to "Adaptive Cards" (https://adaptivecards.io), except Slack which has its own format - def json_path = hook_url.contains("hooks.slack.com") ? "slackreport.json" : "adaptivecard.json" - def hf = new File("${workflow.projectDir}/assets/${json_path}") - def json_template = engine.createTemplate(hf).make(msg_fields) - def json_message = json_template.toString() - - // POST - def post = new URL(hook_url).openConnection() - post.setRequestMethod("POST") - post.setDoOutput(true) - post.setRequestProperty("Content-Type", "application/json") - post.getOutputStream().write(json_message.getBytes("UTF-8")) - def postRC = post.getResponseCode() - if (!postRC.equals(200)) { - log.warn(post.getErrorStream().getText()) - } -} diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.nf.test similarity index 100% rename from subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test rename to subworkflows/nf-core/utils_nfcore_pipeline/tests/main.nf.test diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test.snap b/subworkflows/nf-core/utils_nfcore_pipeline/tests/main.nf.test.snap similarity index 100% rename from subworkflows/nf-core/utils_nfcore_pipeline/tests/main.workflow.nf.test.snap rename to subworkflows/nf-core/utils_nfcore_pipeline/tests/main.nf.test.snap diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml b/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml deleted file mode 100644 index ac8523c..0000000 --- a/subworkflows/nf-core/utils_nfcore_pipeline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfcore_pipeline: - - subworkflows/nf-core/utils_nfcore_pipeline/** diff --git a/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/subworkflows/nf-core/utils_nfschema_plugin/main.nf index 4994303..9ff0681 
100644 --- a/subworkflows/nf-core/utils_nfschema_plugin/main.nf +++ b/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -4,6 +4,7 @@ include { paramsSummaryLog } from 'plugin/nf-schema' include { validateParameters } from 'plugin/nf-schema' +include { paramsHelp } from 'plugin/nf-schema' workflow UTILS_NFSCHEMA_PLUGIN { @@ -15,32 +16,62 @@ workflow UTILS_NFSCHEMA_PLUGIN { // when this input is empty it will automatically use the configured schema or // "${projectDir}/nextflow_schema.json" as default. This input should not be empty // for meta pipelines + help // boolean: show help message + help_full // boolean: show full help message + show_hidden // boolean: show hidden parameters in help message + before_text // string: text to show before the help message and parameters summary + after_text // string: text to show after the help message and parameters summary + command // string: an example command of the pipeline + cli_typecast // boolean: whether to perform typecasting of CLI parameters. Set this to `null` to use the default behaviour main: + if(help || help_full) { + help_options = [ + beforeText: before_text, + afterText: after_text, + command: command, + showHidden: show_hidden, + fullHelp: help_full, + ] + if(parameters_schema) { + help_options << [parameters_schema: parameters_schema] + } + log.info paramsHelp( + help_options, + (help instanceof String && help != "true") ? help : "", + ) + exit 0 + } + // // Print parameter summary to stdout. 
This will display the parameters // that differ from the default given in the JSON schema // + + summary_options = [:] if(parameters_schema) { - log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) - } else { - log.info paramsSummaryLog(input_workflow) + summary_options << [parameters_schema: parameters_schema] } + log.info before_text + log.info paramsSummaryLog(summary_options, input_workflow) + log.info after_text // // Validate the parameters using nextflow_schema.json or the schema // given via the validation.parametersSchema configuration option // if(validate_params) { + validateOptions = [:] if(parameters_schema) { - validateParameters(parameters_schema:parameters_schema) - } else { - validateParameters() + validateOptions << [parameters_schema: parameters_schema] + } + if(cli_typecast != null) { + validateOptions << [cast_cli_params: cli_typecast] } + validateParameters(validateOptions) } emit: dummy_emit = true } - diff --git a/subworkflows/nf-core/utils_nfschema_plugin/meta.yml b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml index f7d9f02..1d8c75a 100644 --- a/subworkflows/nf-core/utils_nfschema_plugin/meta.yml +++ b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml @@ -25,6 +25,30 @@ input: option. When this input is empty it will automatically use the configured schema or "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way for meta pipelines. + - help: + type: boolean, string + description: | + Show the help message and exit. When a parameter name is given, show the help message for that parameter instead of the general help message. + - help_full: + type: boolean + description: Show the full help message and exit. + - show_hidden: + type: boolean + description: Show hidden parameters in the help message. + - before_text: + type: string + description: Text to show before the parameters summary and help message. 
+ - after_text: + type: string + description: Text to show after the parameters summary and help message. + - command: + type: string + description: An example command to run the pipeline, to show in the help message and the summary. + - cli_typecast: + type: boolean + description: | + Whether to apply typecasting to the parameters given via the CLI before validation. + Set this to `null` to use the default behavior. output: - dummy_emit: type: boolean diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test index 8fb3016..1fd1eac 100644 --- a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -25,6 +25,13 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + input[9] = null """ } } @@ -51,6 +58,13 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + input[9] = null """ } } @@ -77,6 +91,13 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + input[9] = null """ } } @@ -103,6 +124,13 @@ nextflow_workflow { input[0] = workflow input[1] = validate_params input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = false + input[4] = false + input[5] = false + input[6] = "" + input[7] = "" + input[8] = "" + input[9] = null """ } } @@ -114,4 +142,37 @@ nextflow_workflow { ) } } + + test("Should create a help message") { + + when { + + params { + test_data = '' + outdir = 
null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + input[3] = true + input[4] = false + input[5] = false + input[6] = "Before" + input[7] = "After" + input[8] = "nextflow run test/test" + input[9] = null + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } } diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config index 0907ac5..fd71cb8 100644 --- a/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -1,8 +1,8 @@ plugins { - id "nf-schema@2.1.0" + id "nf-schema@2.7.2" } validation { parametersSchema = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" monochromeLogs = true -} \ No newline at end of file +} diff --git a/tests/.nftignore b/tests/.nftignore new file mode 100644 index 0000000..324b115 --- /dev/null +++ b/tests/.nftignore @@ -0,0 +1,21 @@ +.DS_Store +multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt +multiqc/multiqc_data/multiqc.parquet +multiqc/multiqc_data/multiqc.log +multiqc/multiqc_data/multiqc_data.json +multiqc/multiqc_data/multiqc_sources.txt +multiqc/multiqc_data/multiqc_software_versions.txt +multiqc/multiqc_data/llms-full.txt +multiqc/multiqc_plots/{svg,pdf,png}/*.{svg,pdf,png} +multiqc/multiqc_report.html +fastqc/*_fastqc.{html,zip} +pipeline_info/*.{html,json,txt,yml} +library_QC/**/*.pdf +fitness/DiMSum_results/**/*.pdf +fitness/DiMSum_results/**/dimsum_results_workspace.RData +fitness/DiMSum_results/**/report.html +**/*.pdf +**/*.RData +fitness/DiMSum_results/**/report.html +**/fitness_estimation_mutscan_edgeR.tsv +**/fitness_estimation_mutscan_limma.tsv diff --git a/tests/default.nf.test b/tests/default.nf.test new file mode 100644 
index 0000000..5599aee --- /dev/null +++ b/tests/default.nf.test @@ -0,0 +1,41 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + + test("-profile test") { + + when { + params { + outdir = "$outputDir" + } + } + + then { + // stable_path: All files + folders in ${params.outdir}/ with a stable path (including file name) + def stable_path = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}', 'multiqc/multiqc_plots', 'multiqc/multiqc_plots/**']) + // stable_content: All files in ${params.outdir}/ with stable content + def stable_content = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // Parse the unstable TSVs to snapshot their structure instead of their hashes + def edger_tsv = path("$outputDir/fitness/mutscan_results/fitness_estimation_mutscan_edgeR.tsv").csv(sep: '\t') + def limma_tsv = path("$outputDir/fitness/mutscan_results/fitness_estimation_mutscan_limma.tsv").csv(sep: '\t') + assert workflow.success + assertAll( + { assert snapshot( + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_deepmutscan_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_path, + // All files with stable contents + stable_content, + // Partial structural snapshot of the unstable TSV files + [ + edger_rowCount: edger_tsv.rowCount, + limma_rowCount: limma_tsv.rowCount + ] + ).match() } + ) + } + } +} diff --git a/tests/default.nf.test.snap b/tests/default.nf.test.snap new file mode 100644 index 0000000..3984d27 --- /dev/null +++ b/tests/default.nf.test.snap @@ -0,0 +1,412 @@ +{ + "-profile test": { + "content": [ + { + "${task.process}": { + "r-base": "4.5.1", + "r-mutscan": "1.0.0", + "r-Biostrings": "2.78.0" + }, + "BWA_INDEX": { + "bwa": "0.7.19-r1273" + }, + "BWA_MEM": { + "bwa": 
"0.7.19-r1273", + "samtools": "1.22.1" + }, + "DIMSUM_RUN": { + "r-base": "4.4.2" + }, + "DMSANALYSIS_POSSIBLE_MUTATIONS": { + "r-base": "4.5.1", + "biostrings": "2.78.0" + }, + "EXPDESIGN_FITNESS": { + "r-base": "4.5.1" + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "FIND_SYNONYMOUS_MUTATION": { + "r-base": "4.5.1", + "r-Biostrings": "2.78.0" + }, + "FITNESS_HEATMAP": { + "r-base": "4.5.1", + "r-dplyr": "1.1.4", + "r-ggplot2": "4.0.2", + "r-methods": "4.5.1", + "r-grid": "4.5.1" + }, + "FITNESS_QC": { + "r-base": "4.5.1" + }, + "MERGE_COUNTS": { + "r-base": "4.5.1" + }, + "Workflow": { + "nf-core/deepmutscan": "v1.0.0" + } + }, + [ + "fastqc", + "fastqc/GID1A_input_1_pe_1_fastqc.html", + "fastqc/GID1A_input_1_pe_1_fastqc.zip", + "fastqc/GID1A_input_1_pe_2_fastqc.html", + "fastqc/GID1A_input_1_pe_2_fastqc.zip", + "fastqc/GID1A_input_2_pe_1_fastqc.html", + "fastqc/GID1A_input_2_pe_1_fastqc.zip", + "fastqc/GID1A_input_2_pe_2_fastqc.html", + "fastqc/GID1A_input_2_pe_2_fastqc.zip", + "fastqc/GID1A_output_1_pe_1_fastqc.html", + "fastqc/GID1A_output_1_pe_1_fastqc.zip", + "fastqc/GID1A_output_1_pe_2_fastqc.html", + "fastqc/GID1A_output_1_pe_2_fastqc.zip", + "fastqc/GID1A_output_2_pe_1_fastqc.html", + "fastqc/GID1A_output_2_pe_1_fastqc.zip", + "fastqc/GID1A_output_2_pe_2_fastqc.html", + "fastqc/GID1A_output_2_pe_2_fastqc.zip", + "fitness", + "fitness/DiMSum_results", + "fitness/DiMSum_results/dimsum_results", + "fitness/DiMSum_results/dimsum_results/dimsum_results_fitness_intermediate.RData", + "fitness/DiMSum_results/dimsum_results/dimsum_results_fitness_replicates.RData", + "fitness/DiMSum_results/dimsum_results/dimsum_results_indel_variant_data_merge.tsv", + "fitness/DiMSum_results/dimsum_results/dimsum_results_nobarcode_variant_data_merge.tsv", + "fitness/DiMSum_results/dimsum_results/dimsum_results_rejected_variant_data_merge.tsv", + "fitness/DiMSum_results/dimsum_results/dimsum_results_sessionInfo.RData", + 
"fitness/DiMSum_results/dimsum_results/dimsum_results_variant_data_merge.RData", + "fitness/DiMSum_results/dimsum_results/dimsum_results_variant_data_merge.tsv", + "fitness/DiMSum_results/dimsum_results/dimsum_results_workspace.RData", + "fitness/DiMSum_results/dimsum_results/fitness_doubles.txt", + "fitness/DiMSum_results/dimsum_results/fitness_singles.txt", + "fitness/DiMSum_results/dimsum_results/fitness_singles_MaveDB.csv", + "fitness/DiMSum_results/dimsum_results/fitness_synonymous.txt", + "fitness/DiMSum_results/dimsum_results/fitness_wildtype.txt", + "fitness/DiMSum_results/dimsum_results/report.html", + "fitness/DiMSum_results/dimsum_results/reports", + "fitness/DiMSum_results/dimsum_results/reports/Dumpling.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__cutadapt_report_pair1.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__cutadapt_report_pair2.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__diagnostics_report_count_hist_input_aa.pdf", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__diagnostics_report_count_hist_input_aa.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__diagnostics_report_count_hist_input_nt.pdf", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__diagnostics_report_count_hist_input_nt.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__diagnostics_report_scatterplotmatrix_all.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__fastqc_report_pair1_fastqc.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__fastqc_report_pair2_fastqc.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__merge_report_aamutationcounts.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__merge_report_aamutationpercentages.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__merge_report_nucmutationcounts.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__merge_report_nucmutationpercentages.png", + 
"fitness/DiMSum_results/dimsum_results/reports/dimsum__vsearch_report_mergedlength.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum__vsearch_report_paircounts.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_fitness_inputcounts.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_fitness_replicates_density.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_fitness_replicates_density_norm.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_fitness_replicates_scatter.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_fitness_replicates_scatter_norm.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_leaveoneout_qqplot.png", + "fitness/DiMSum_results/dimsum_results/reports/dimsum_stage_fitness_report_1_errormodel_repspec.png", + "fitness/DiMSum_results/dimsum_results/reports/report.Rmd", + "fitness/DiMSum_results/dimsum_results/reports/report_settings.RData", + "fitness/DiMSum_results/dimsum_results/tmp", + "fitness/DiMSum_results/dimsum_results/tmp/mutation_stats_dicts.RData", + "fitness/DiMSum_results/single_rep_counts", + "fitness/DiMSum_results/single_rep_counts/GID1A_input_1_pe_fitness_input.tsv", + "fitness/DiMSum_results/single_rep_counts/GID1A_input_2_pe_fitness_input.tsv", + "fitness/DiMSum_results/single_rep_counts/GID1A_output_1_pe_fitness_input.tsv", + "fitness/DiMSum_results/single_rep_counts/GID1A_output_2_pe_fitness_input.tsv", + "fitness/counts_merged.tsv", + "fitness/default_results", + "fitness/default_results/fitness_estimation.tsv", + "fitness/default_results/fitness_estimation_count_correlation.pdf", + "fitness/default_results/fitness_estimation_fitness_correlation.pdf", + "fitness/default_results/fitness_heatmap.pdf", + "fitness/experimentalDesign.tsv", + 
"fitness/mutscan_results", + "fitness/mutscan_results/fitness_estimation_mutscan_edgeR.tsv", + "fitness/mutscan_results/fitness_estimation_mutscan_limma.tsv", + "fitness/mutscan_results/mutscan_counts_corr.pdf", + "fitness/mutscan_results/mutscan_edgeR_volcano.pdf", + "fitness/mutscan_results/mutscan_limma_volcano.pdf", + "fitness/synonymous_wt.txt", + "intermediate_files", + "intermediate_files/aa_seq.txt", + "intermediate_files/bam_files", + "intermediate_files/bam_files/bwa", + "intermediate_files/bam_files/bwa/GID1A.amb", + "intermediate_files/bam_files/bwa/GID1A.ann", + "intermediate_files/bam_files/bwa/GID1A.bwt", + "intermediate_files/bam_files/bwa/GID1A.pac", + "intermediate_files/bam_files/bwa/GID1A.sa", + "intermediate_files/bam_files/bwa/mem", + "intermediate_files/bam_files/bwa/mem/GID1A_input_1_pe.bam", + "intermediate_files/bam_files/bwa/mem/GID1A_input_2_pe.bam", + "intermediate_files/bam_files/bwa/mem/GID1A_output_1_pe.bam", + "intermediate_files/bam_files/bwa/mem/GID1A_output_2_pe.bam", + "intermediate_files/bam_files/filtered", + "intermediate_files/bam_files/filtered/GID1A_input_1_pe_filtered.bam", + "intermediate_files/bam_files/filtered/GID1A_input_2_pe_filtered.bam", + "intermediate_files/bam_files/filtered/GID1A_output_1_pe_filtered.bam", + "intermediate_files/bam_files/filtered/GID1A_output_2_pe_filtered.bam", + "intermediate_files/bam_files/premerged", + "intermediate_files/bam_files/premerged/GID1A_input_1_pe_merged.bam", + "intermediate_files/bam_files/premerged/GID1A_input_2_pe_merged.bam", + "intermediate_files/bam_files/premerged/GID1A_output_1_pe_merged.bam", + "intermediate_files/bam_files/premerged/GID1A_output_2_pe_merged.bam", + "intermediate_files/gatk", + "intermediate_files/gatk/GID1A_input_1_pe", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.aaCounts", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.aaFractions", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.codonCounts", + 
"intermediate_files/gatk/GID1A_input_1_pe/gatk_output.codonFractions", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.coverageLengthCounts", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.readCounts", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.refCoverage", + "intermediate_files/gatk/GID1A_input_1_pe/gatk_output.variantCounts", + "intermediate_files/gatk/GID1A_input_2_pe", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.aaCounts", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.aaFractions", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.codonCounts", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.codonFractions", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.coverageLengthCounts", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.readCounts", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.refCoverage", + "intermediate_files/gatk/GID1A_input_2_pe/gatk_output.variantCounts", + "intermediate_files/gatk/GID1A_output_1_pe", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.aaCounts", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.aaFractions", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.codonCounts", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.codonFractions", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.coverageLengthCounts", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.readCounts", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.refCoverage", + "intermediate_files/gatk/GID1A_output_1_pe/gatk_output.variantCounts", + "intermediate_files/gatk/GID1A_output_2_pe", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.aaCounts", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.aaFractions", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.codonCounts", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.codonFractions", + 
"intermediate_files/gatk/GID1A_output_2_pe/gatk_output.coverageLengthCounts", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.readCounts", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.refCoverage", + "intermediate_files/gatk/GID1A_output_2_pe/gatk_output.variantCounts", + "intermediate_files/possible_mutations.csv", + "intermediate_files/processed_gatk_files", + "intermediate_files/processed_gatk_files/GID1A_input_1_pe", + "intermediate_files/processed_gatk_files/GID1A_input_1_pe/annotated_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_input_1_pe/library_completed_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_input_1_pe/variantCounts_filtered_by_library.csv", + "intermediate_files/processed_gatk_files/GID1A_input_1_pe/variantCounts_for_heatmaps.csv", + "intermediate_files/processed_gatk_files/GID1A_input_2_pe", + "intermediate_files/processed_gatk_files/GID1A_input_2_pe/annotated_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_input_2_pe/library_completed_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_input_2_pe/variantCounts_filtered_by_library.csv", + "intermediate_files/processed_gatk_files/GID1A_input_2_pe/variantCounts_for_heatmaps.csv", + "intermediate_files/processed_gatk_files/GID1A_output_1_pe", + "intermediate_files/processed_gatk_files/GID1A_output_1_pe/annotated_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_output_1_pe/library_completed_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_output_1_pe/variantCounts_filtered_by_library.csv", + "intermediate_files/processed_gatk_files/GID1A_output_1_pe/variantCounts_for_heatmaps.csv", + "intermediate_files/processed_gatk_files/GID1A_output_2_pe", + "intermediate_files/processed_gatk_files/GID1A_output_2_pe/annotated_variantCounts.csv", + "intermediate_files/processed_gatk_files/GID1A_output_2_pe/library_completed_variantCounts.csv", + 
"intermediate_files/processed_gatk_files/GID1A_output_2_pe/variantCounts_filtered_by_library.csv", + "intermediate_files/processed_gatk_files/GID1A_output_2_pe/variantCounts_for_heatmaps.csv", + "library_QC", + "library_QC/GID1A_input_1_pe", + "library_QC/GID1A_input_1_pe/SeqDepth.pdf", + "library_QC/GID1A_input_1_pe/counts_heatmap.pdf", + "library_QC/GID1A_input_1_pe/counts_per_cov_heatmap.pdf", + "library_QC/GID1A_input_1_pe/logdiff_plot.pdf", + "library_QC/GID1A_input_1_pe/logdiff_varying_bases.pdf", + "library_QC/GID1A_input_1_pe/rolling_counts.pdf", + "library_QC/GID1A_input_1_pe/rolling_counts_per_cov.pdf", + "library_QC/GID1A_input_1_pe/rolling_coverage.pdf", + "library_QC/GID1A_input_2_pe", + "library_QC/GID1A_input_2_pe/SeqDepth.pdf", + "library_QC/GID1A_input_2_pe/counts_heatmap.pdf", + "library_QC/GID1A_input_2_pe/counts_per_cov_heatmap.pdf", + "library_QC/GID1A_input_2_pe/logdiff_plot.pdf", + "library_QC/GID1A_input_2_pe/logdiff_varying_bases.pdf", + "library_QC/GID1A_input_2_pe/rolling_counts.pdf", + "library_QC/GID1A_input_2_pe/rolling_counts_per_cov.pdf", + "library_QC/GID1A_input_2_pe/rolling_coverage.pdf", + "library_QC/GID1A_output_1_pe", + "library_QC/GID1A_output_1_pe/SeqDepth.pdf", + "library_QC/GID1A_output_1_pe/counts_heatmap.pdf", + "library_QC/GID1A_output_1_pe/counts_per_cov_heatmap.pdf", + "library_QC/GID1A_output_1_pe/logdiff_plot.pdf", + "library_QC/GID1A_output_1_pe/logdiff_varying_bases.pdf", + "library_QC/GID1A_output_1_pe/rolling_counts.pdf", + "library_QC/GID1A_output_1_pe/rolling_counts_per_cov.pdf", + "library_QC/GID1A_output_1_pe/rolling_coverage.pdf", + "library_QC/GID1A_output_2_pe", + "library_QC/GID1A_output_2_pe/SeqDepth.pdf", + "library_QC/GID1A_output_2_pe/counts_heatmap.pdf", + "library_QC/GID1A_output_2_pe/counts_per_cov_heatmap.pdf", + "library_QC/GID1A_output_2_pe/logdiff_plot.pdf", + "library_QC/GID1A_output_2_pe/logdiff_varying_bases.pdf", + "library_QC/GID1A_output_2_pe/rolling_counts.pdf", + 
"library_QC/GID1A_output_2_pe/rolling_counts_per_cov.pdf", + "library_QC/GID1A_output_2_pe/rolling_coverage.pdf", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/fastqc-status-check-heatmap.txt", + "multiqc/multiqc_data/fastqc_adapter_content_plot.txt", + "multiqc/multiqc_data/fastqc_overrepresented_sequences_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt", + "multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_counts_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt", + "multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt", + "multiqc/multiqc_data/llms-full.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc.parquet", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_fastqc.txt", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_deepmutscan_software_mqc_versions.yml" + ], + [ + "dimsum_results_indel_variant_data_merge.tsv:md5,5f82d5e4ea2238efa12ac0012816f3d7", + "dimsum_results_nobarcode_variant_data_merge.tsv:md5,d76631f8a295e3d95d92444584395d70", + "dimsum_results_rejected_variant_data_merge.tsv:md5,339820b62a120f6dc4b559b02f0bd112", + "dimsum_results_variant_data_merge.tsv:md5,1f65b4cd8797dffe8747c0290e6e604d", + "fitness_doubles.txt:md5,68b329da9893e34099c7d8ad5cb9c940", + "fitness_singles.txt:md5,466f96eb5c18bdf1440f168a45404ccc", + "fitness_singles_MaveDB.csv:md5,1ee16ae50857532d9c2838e6b977acb1", + 
"fitness_synonymous.txt:md5,2aecd8d74788c82f23996ab537ad8a00", + "fitness_wildtype.txt:md5,c5e39c5186b30e79a5de38b9d0d22947", + "Dumpling.png:md5,b070c7f81e170ffce462fe2711563756", + "dimsum__cutadapt_report_pair1.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__cutadapt_report_pair2.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__diagnostics_report_count_hist_input_aa.png:md5,e2b78e8c54e37b6ceeafedc7cf04983a", + "dimsum__diagnostics_report_count_hist_input_nt.png:md5,516b9945c514d9d22638ef4caf3f67f8", + "dimsum__diagnostics_report_scatterplotmatrix_all.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__fastqc_report_pair1_fastqc.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__fastqc_report_pair2_fastqc.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__merge_report_aamutationcounts.png:md5,a5ac6c7706c46bec9e4d3f09edc1dcd8", + "dimsum__merge_report_aamutationpercentages.png:md5,658257268b72d27d6fdc2bc148b9d336", + "dimsum__merge_report_nucmutationcounts.png:md5,a6bdf234a20be07edcff9c288868f2fa", + "dimsum__merge_report_nucmutationpercentages.png:md5,8f6e67ce41c42ff16274da55ca4f9a29", + "dimsum__vsearch_report_mergedlength.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum__vsearch_report_paircounts.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_fitness_inputcounts.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_fitness_replicates_density.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_fitness_replicates_density_norm.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_fitness_replicates_scatter.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_fitness_replicates_scatter_norm.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "dimsum_stage_fitness_report_1_errormodel_leaveoneout_qqplot.png:md5,68b329da9893e34099c7d8ad5cb9c940", + 
"dimsum_stage_fitness_report_1_errormodel_repspec.png:md5,68b329da9893e34099c7d8ad5cb9c940", + "report.Rmd:md5,a19acae87f16d0059c6699bc1e818d95", + "GID1A_input_1_pe_fitness_input.tsv:md5,37f117c085aed5c047bfc4bdb877f6a5", + "GID1A_input_2_pe_fitness_input.tsv:md5,08c149f5af39243413d4fd3ac4a9cee6", + "GID1A_output_1_pe_fitness_input.tsv:md5,8255190abc161bc4f9146d11a6819e62", + "GID1A_output_2_pe_fitness_input.tsv:md5,86ca879069467c7883919676659000d4", + "counts_merged.tsv:md5,8f8f810518a000cd4e8b7b076f50e121", + "fitness_estimation.tsv:md5,319d093abe93cf8bbe1f7a75dbb1c3b5", + "experimentalDesign.tsv:md5,f95dea4a3e6cf7e206e789a96e502604", + "synonymous_wt.txt:md5,bb94c5975185b80f8d63acebfbf4a0da", + "aa_seq.txt:md5,421287586f566191f7da4a7aad7d24a8", + "GID1A.amb:md5,383ef17f6abb846ab669ebac30f5659b", + "GID1A.ann:md5,30c4a2204f9166d9594c790e0ca58793", + "GID1A.bwt:md5,120149e0239465625c2dd20a171701fd", + "GID1A.pac:md5,104f743d81d39b2527b5ae52db26654c", + "GID1A.sa:md5,1f697dcb3fedd2f3f29642598bdd795b", + "GID1A_input_1_pe.bam:md5,57f54fa2bad68dc8ff39d2bb79d94742", + "GID1A_input_2_pe.bam:md5,5da708382cddfaa90e7b53e3d79d8f8e", + "GID1A_output_1_pe.bam:md5,f888077ed2234b2b7b033ceea621b33f", + "GID1A_output_2_pe.bam:md5,5b147ba80d87ee3ff3be6c3f34d7f67d", + "GID1A_input_1_pe_filtered.bam:md5,3ca481d1fcdc0c5f3c1bca772b491d1e", + "GID1A_input_2_pe_filtered.bam:md5,9d4f9769ab087d37a3c7cb632ff4df73", + "GID1A_output_1_pe_filtered.bam:md5,4116f59370fc259900c3f4d6dd3f5dd4", + "GID1A_output_2_pe_filtered.bam:md5,6ae818bdb9e7ec5ea1c4c96de50d17ce", + "GID1A_input_1_pe_merged.bam:md5,97036832fa1aa512fd98899dd6d55adc", + "GID1A_input_2_pe_merged.bam:md5,4022c01b39e15881ae30999e18317ba2", + "GID1A_output_1_pe_merged.bam:md5,7304058a732059a7da38c325963be3b1", + "GID1A_output_2_pe_merged.bam:md5,9fce3acae393d542c5586f7d03d290c7", + "gatk_output.aaCounts:md5,24b4246b9a8541dd9e4a600cd117bc7a", + "gatk_output.aaFractions:md5,ad28ff8aa58621f156772013fefc1a78", + 
"gatk_output.codonCounts:md5,6a09ed13037ef40d63408089453a40a8", + "gatk_output.codonFractions:md5,cc16649790d7d025921e85d3bf6e7479", + "gatk_output.coverageLengthCounts:md5,6ad92adf2629169b61da7d8c74c81c20", + "gatk_output.readCounts:md5,53dd247510935ed95236b07ea20b097d", + "gatk_output.refCoverage:md5,2e85472b0d6d539a13d10d0d0cc7c119", + "gatk_output.variantCounts:md5,022b9d41c4affb02f682cc4a2a14c414", + "gatk_output.aaCounts:md5,2900fec9b0fa14733c03456c1db87873", + "gatk_output.aaFractions:md5,6c65be966c311f62763b2f04ff79fe6f", + "gatk_output.codonCounts:md5,1cf868ae173efa006d2d70a457232672", + "gatk_output.codonFractions:md5,5ea7101d36c48576512c171b1f94fea5", + "gatk_output.coverageLengthCounts:md5,67d304f700488942eb9e3150b401db0e", + "gatk_output.readCounts:md5,03d07b7ab4c91ba3ffb7f4d19b81060b", + "gatk_output.refCoverage:md5,643b3de32ea21faff68f4cdc059d5186", + "gatk_output.variantCounts:md5,dd84779bec019f1b1c3dd862146191e5", + "gatk_output.aaCounts:md5,e301ba5289887fe749d4f2aaddff564f", + "gatk_output.aaFractions:md5,4a643d3ab4485e2afd556cd8a1acdcd9", + "gatk_output.codonCounts:md5,9e035037f015583ec40ee7ac3f95d770", + "gatk_output.codonFractions:md5,efefcc6a81dafc3f480e21094b7797a4", + "gatk_output.coverageLengthCounts:md5,949ad6af6bc030ee3b604bd5dd796bca", + "gatk_output.readCounts:md5,991d059f48e512ec9e1eee741c6d148c", + "gatk_output.refCoverage:md5,b9c46a171cc8909cc87a16c4042b6c9a", + "gatk_output.variantCounts:md5,fbd6faf6e9cedfd4dac56aa078933292", + "gatk_output.aaCounts:md5,6c8e64ad5a2d3a83f7068ccad61c0dc7", + "gatk_output.aaFractions:md5,8a5f15131201756003f613ae5ecb5d62", + "gatk_output.codonCounts:md5,ee8d49ea420e5ae93c167f7c61ade5f4", + "gatk_output.codonFractions:md5,3dd42165eb49c347abce91687c52748a", + "gatk_output.coverageLengthCounts:md5,f60eeeb00a95c9864adf497d697784f9", + "gatk_output.readCounts:md5,306f43a55c3609510aab16660426ff79", + "gatk_output.refCoverage:md5,923c5e2f1b98088de89b2e6413e43daa", + 
"gatk_output.variantCounts:md5,74abfdcd00aea020b9b56c091c38a805", + "possible_mutations.csv:md5,9ce59fe91b726b1d88587a13fa899567", + "annotated_variantCounts.csv:md5,92ec154a8faf4626717fda3c117c119d", + "library_completed_variantCounts.csv:md5,88a4ad2575403e322c06ce57fc9b0c3f", + "variantCounts_filtered_by_library.csv:md5,630d2284b78d441511262eb7a4f21e0f", + "variantCounts_for_heatmaps.csv:md5,abe9765175ab724c8bca1bd17bfb4875", + "annotated_variantCounts.csv:md5,11fd59e89d9668a2ffeb24792443b1df", + "library_completed_variantCounts.csv:md5,dda6173170646bb39f19271b58d147aa", + "variantCounts_filtered_by_library.csv:md5,89eaef32b5e1d135de7aaf1364f289f8", + "variantCounts_for_heatmaps.csv:md5,f768ee02434502059987f3332b90ae5d", + "annotated_variantCounts.csv:md5,ed86a8575c0dccaf9aa35f3ca5b4b42a", + "library_completed_variantCounts.csv:md5,4e3f8bb562803cd5b868a5a189c57bc4", + "variantCounts_filtered_by_library.csv:md5,651fdd44ab9f7365ccba5b468cd04bd5", + "variantCounts_for_heatmaps.csv:md5,5fd8161758ccddfe6ae74d7ef919826f", + "annotated_variantCounts.csv:md5,ae8c407261b6107e67d4c9ec6b9dfa89", + "library_completed_variantCounts.csv:md5,34974d5568a315d78892139e71d2221e", + "variantCounts_filtered_by_library.csv:md5,0c38bb486fce548f2cce646a77307bb2", + "variantCounts_for_heatmaps.csv:md5,2e513e7bfe1d0ac8758af03e82685cf6", + "fastqc-status-check-heatmap.txt:md5,58a9580c11af4210b1af65753383e120", + "fastqc_adapter_content_plot.txt:md5,2990ad34b6fa29151dd4b9a38828afbe", + "fastqc_overrepresented_sequences_plot.txt:md5,c703a9df0ea3a5c1c5f515a1fd88bbe8", + "fastqc_per_base_n_content_plot.txt:md5,6af419871cdec77149cf013b72e82417", + "fastqc_per_base_sequence_quality_plot.txt:md5,460b8e65cf8d58087b3143ba78b30e17", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,7f1eacaa19e912e18a4419b23e749e3a", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,0e405d9394ae8aeaa6ae162e18f7c195", + "fastqc_per_sequence_quality_scores_plot.txt:md5,5d1ee338c146901ca9f1690f7ee47200", + 
"fastqc_sequence_counts_plot.txt:md5,8b438078b7f3475b6ce659cc4b8dca11", + "fastqc_sequence_duplication_levels_plot.txt:md5,a0839f96103959fd7265cbb986e2724a", + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f", + "multiqc_fastqc.txt:md5,72cb1ee38d354d674ced1b4b21bbeff4", + "multiqc_general_stats.txt:md5,7c2fd44ded06835128a208d3365aa23b" + ], + { + "edger_rowCount": 758, + "limma_rowCount": 758 + } + ], + "timestamp": "2026-06-13T21:44:08.811231", + "meta": { + "nf-test": "0.9.4", + "nextflow": "26.04.1" + } + } +} \ No newline at end of file diff --git a/tests/nextflow.config b/tests/nextflow.config new file mode 100644 index 0000000..b4075e0 --- /dev/null +++ b/tests/nextflow.config @@ -0,0 +1,29 @@ +/* +======================================================================================== + Nextflow config file for running nf-test tests +======================================================================================== +*/ + +process { + resourceLimits = [ + cpus: 4, + memory: '8.GB', + time: '1.h' + ] +} + +params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/refs/heads/deepmutscan/' + + // Input data + input = params.pipelines_testdata_base_path + 'samplesheet/GID1A_test.csv' + fasta = params.pipelines_testdata_base_path + 'testdata/GID1A.fasta' + reading_frame = '352-1383' + min_counts = 2 + mutagenesis_type = 'nnk_nns' + run_seqdepth = true + fitness = true +} + +aws.client.anonymous = true // fixes S3 access issues on self-hosted runners diff --git a/workflows/deepmutscan.nf b/workflows/deepmutscan.nf new file mode 100644 index 0000000..54a2887 --- /dev/null +++ b/workflows/deepmutscan.nf @@ -0,0 +1,311 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ +include { FASTQC } from '../modules/nf-core/fastqc/main' +include { MULTIQC } from '../modules/nf-core/multiqc/main' +include { BWA_INDEX } from '../modules/nf-core/bwa/index/main' +include { BWA_MEM } from '../modules/nf-core/bwa/mem/main' +include { BAMFILTER_DMS } from '../modules/local/bamprocessing/bam_filter/main' +include { PREMERGE } from '../modules/local/bamprocessing/premerge/main' +include { GATK_SATURATIONMUTAGENESIS } from '../modules/local/gatk/saturationmutagenesis/main' +include { DMSANALYSIS_AASEQ } from '../modules/local/dmsanalysis/aa_seq/main' +include { DMSANALYSIS_POSSIBLE_MUTATIONS } from '../modules/local/dmsanalysis/possible_mutations/main' +include { DMSANALYSIS_PROCESS_GATK } from '../modules/local/dmsanalysis/process_gatk/main' +include { VISUALIZATION_COUNTS_PER_COV } from '../modules/local/visualization/counts_per_cov/main' +include { VISUALIZATION_COUNTS_HEATMAP } from '../modules/local/visualization/counts_heatmap/main' +include { VISUALIZATION_GLOBAL_POS_BIASES_COUNTS } from '../modules/local/visualization/global_pos_biases_counts/main' +include { VISUALIZATION_GLOBAL_POS_BIASES_COV } from '../modules/local/visualization/global_pos_biases_cov/main' +include { VISUALIZATION_LOGDIFF } from '../modules/local/visualization/logdiff/main' +include { VISUALIZATION_SEQDEPTH } from '../modules/local/visualization/seqdepth/main' +include { GATK_GATKTOFITNESS } from '../modules/local/gatk/gatk_to_fitness/main' + +include { CALCULATE_FITNESS } from '../subworkflows/local/calculate_fitness/main' + +include { paramsSummaryMap } from 'plugin/nf-schema' +include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' +include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' +include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_deepmutscan_pipeline' + +/* 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + RUN MAIN WORKFLOW +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +workflow DEEPMUTSCAN { + + take: + ch_samplesheet // channel: samplesheet read in from --input + multiqc_config + multiqc_logo + multiqc_methods_description + outdir + + main: + + def ch_versions = Channel.empty() + def ch_multiqc_files = Channel.empty() + + // Define input channels from parameters + def ch_fasta = Channel + .fromPath(params.fasta, checkIfExists: true) + .map { fasta -> tuple( [id: 'ref'], fasta ) } + + def reading_frame_ch = Channel.value(params.reading_frame) + def min_counts_ch = Channel.value(params.min_counts) + def custom_codon_library_ch = Channel.value(params.custom_codon_library) + def mutagenesis_type_ch = Channel.value(params.mutagenesis_type) + def sliding_window_size_ch = Channel.value(params.sliding_window_size) + def aimed_cov_ch = Channel.value(params.aimed_cov) + def run_seqdepth_ch = Channel.value(params.run_seqdepth) + + // Raw samplesheet path channel for downstream subworkflows + def ch_samplesheet_csv = Channel.fromPath(params.input, checkIfExists: true) + + // + // MODULE: Run FastQC + // + FASTQC(ch_samplesheet) + ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.map{ _meta, file -> file }) + + // + // MODULE: BWA Index + // + BWA_INDEX ( + ch_fasta + ) + + // Broadcast index to all samples + def ch_bwa_index = BWA_INDEX.out.index + + // Broadcast the index to all samples + def ch_bwa_index_broadcast = ch_samplesheet + .combine(ch_bwa_index) + .map { [it[2], it[3]] } + + // Broadcast the fasta to all samples + def ch_fasta_broadcast = ch_fasta + .combine(ch_samplesheet) + .map { [it[0], it[1]] } + + // Broadcast the sort flag to all samples + def ch_sort_bam = ch_samplesheet.map { false } + + // Run BWA_MEM with all four inputs aligned + BWA_MEM( + ch_samplesheet, + ch_bwa_index_broadcast, + ch_fasta_broadcast, + 
ch_sort_bam + ) + + BAMFILTER_DMS ( + BWA_MEM.out.bam + ) + + // Broadcast the FASTA path to every BAM emitted by BAMFILTER_DMS + def ch_fasta_path_broadcast = ch_fasta + .combine(BAMFILTER_DMS.out.bam) // flattened item: [meta3, fasta, meta, bam] + .map { it[1] } // keep only the fasta path (N emissions) + + PREMERGE( + BAMFILTER_DMS.out.bam, // tuple(val(meta), path(bam)) + ch_fasta_path_broadcast // path(fasta) + ) + + // FASTA path for GATK: broadcast to N + def ch_fasta_for_gatk = ch_fasta.combine(PREMERGE.out.bam).map { it[1] } // path -- N + // Reading frame for GATK: broadcast to N (it's a val string) + def ch_rf_for_gatk = reading_frame_ch.combine(PREMERGE.out.bam).map { it[0] } // val -- N + // min_counts for GATK: broadcast to N (also a val) + def ch_min_for_gatk = min_counts_ch.combine(PREMERGE.out.bam).map { it[0] } // val -- N + + GATK_SATURATIONMUTAGENESIS( + PREMERGE.out.bam, // merged reads - tuple(val(meta), path(bam)) + ch_fasta_for_gatk, // path(fasta) + ch_rf_for_gatk, // val(reading_frame string) + ch_min_for_gatk // val(min_counts) + ) + + DMSANALYSIS_AASEQ ( + ch_fasta, + reading_frame_ch + ) + ch_versions = ch_versions.mix(DMSANALYSIS_AASEQ.out.versions) + + DMSANALYSIS_POSSIBLE_MUTATIONS( + ch_fasta, + reading_frame_ch, // pos_range (as val) + mutagenesis_type_ch, // mutagenesis_type (as val) + custom_codon_library_ch // custom_codon_library (as path) + ) + ch_versions = ch_versions.mix(DMSANALYSIS_POSSIBLE_MUTATIONS.out.versions) + + // Anchor (N items; one per sample) + def ch_vc = GATK_SATURATIONMUTAGENESIS.out.variantCounts // tuple(val(meta), path) + + // Build per-sample inputs using inline combinations (replaces fanout) + def ch_possible_mut_for_proc = DMSANALYSIS_POSSIBLE_MUTATIONS.out.possible_mutations.map { it[1] }.combine(ch_vc).map { it[0] } + def ch_aa_seq_for_proc = DMSANALYSIS_AASEQ.out.aa_seq.map { it[1] }.combine(ch_vc).map { it[0] } + def ch_min_counts_for_proc = min_counts_ch.combine(ch_vc).map { it[0] } + + // Call 
with all inputs aligned (each has N items now) + DMSANALYSIS_PROCESS_GATK( + ch_vc, // tuple(val(meta), path(variantCounts)) -- N + ch_possible_mut_for_proc, // path(possible_mutations) -- N + ch_aa_seq_for_proc, // path(aa_seq) -- N + ch_min_counts_for_proc // val(min_counts) -- N + ) + + def annotated_variantCounts_ch = DMSANALYSIS_PROCESS_GATK.out.processed_variantCounts.map { meta, a, b, c, d -> tuple(meta, a) } + def variantCounts_filtered_by_library_ch = DMSANALYSIS_PROCESS_GATK.out.processed_variantCounts.map { meta, a, b, c, d -> tuple(meta, b) } + def library_completed_variantCounts_ch = DMSANALYSIS_PROCESS_GATK.out.processed_variantCounts.map { meta, a, b, c, d -> tuple(meta, c) } + def variantCounts_for_heatmaps_ch = DMSANALYSIS_PROCESS_GATK.out.processed_variantCounts.map { meta, a, b, c, d -> tuple(meta, d) } + + // --- For VISUALIZATION_COUNTS_PER_COV & HEATMAP (replaces fanoutTo) + def min_counts_for_cov_ch = min_counts_ch.combine(variantCounts_for_heatmaps_ch).map { it[0] } + def min_counts_for_heatmap_ch = min_counts_ch.combine(variantCounts_for_heatmaps_ch).map { it[0] } + + // --- For VISUALIZATION_GLOBAL_POS_BIASES_* + def aa_seq_for_bias_ch = DMSANALYSIS_AASEQ.out.aa_seq.map { it[1] }.combine(variantCounts_filtered_by_library_ch).map { it[0] } + def sliding_window_size_N = sliding_window_size_ch.combine(variantCounts_filtered_by_library_ch).map { it[0] } + def aimed_cov_N = aimed_cov_ch.combine(variantCounts_filtered_by_library_ch).map { it[0] } + + // --- For VISUALIZATION_SEQDEPTH + def possible_mutations_N = DMSANALYSIS_POSSIBLE_MUTATIONS.out.possible_mutations.map { it[1] }.combine(variantCounts_filtered_by_library_ch).map { it[0] } + def min_counts_for_seqdepth_ch = min_counts_ch.combine(variantCounts_filtered_by_library_ch).map { it[0] } + + VISUALIZATION_COUNTS_PER_COV( + variantCounts_for_heatmaps_ch, + min_counts_for_cov_ch + ) + + VISUALIZATION_COUNTS_HEATMAP( + variantCounts_for_heatmaps_ch, + min_counts_for_heatmap_ch + ) + + 
VISUALIZATION_GLOBAL_POS_BIASES_COUNTS( + variantCounts_filtered_by_library_ch, + aa_seq_for_bias_ch, + sliding_window_size_N + ) + + VISUALIZATION_GLOBAL_POS_BIASES_COV( + variantCounts_filtered_by_library_ch, + aa_seq_for_bias_ch, + sliding_window_size_N, + aimed_cov_N + ) + + VISUALIZATION_LOGDIFF( + library_completed_variantCounts_ch + ) + + if (params.run_seqdepth) { + VISUALIZATION_SEQDEPTH( + variantCounts_filtered_by_library_ch, + possible_mutations_N, + min_counts_for_seqdepth_ch + ) + } + + // Broadcast singletons to N (one per sample), anchored on variantCounts_filtered_by_library_ch + def ch_fasta_for_fitness = ch_fasta.combine(variantCounts_filtered_by_library_ch).map { it[1] } // path(fasta) -- N + def ch_rf_for_fitness = reading_frame_ch.combine(variantCounts_filtered_by_library_ch).map { it[0] } // val(range) -- N + + // Call with aligned inputs + GATK_GATKTOFITNESS( + variantCounts_filtered_by_library_ch, // tuple(val(meta), path) + ch_fasta_for_fitness, // path(fasta) + ch_rf_for_fitness // val(reading_frame) + ) + + // Execution of fitness subworkflow, if --fitness true + if (params.fitness) { + + CALCULATE_FITNESS ( + GATK_GATKTOFITNESS.out.fitness_input, // Input from previous step + ch_samplesheet_csv, // Path to samplesheet + ch_fasta, // The original Fasta tuple + reading_frame_ch, // Reading frame value channel + DMSANALYSIS_AASEQ.out.aa_seq // Amino Acid Sequence (for Heatmap) + ) + + // Collect versions + ch_versions = ch_versions.mix(CALCULATE_FITNESS.out.versions) + } + + // + // Collate and save software versions + // + def topic_versions = channel.topic("versions") + .distinct() + .branch { entry -> + versions_file: entry instanceof Path + versions_tuple: true + } + + def topic_versions_string = topic_versions.versions_tuple + .map { process, tool, version -> + [ process[process.lastIndexOf(':')+1..-1], " ${tool}: ${version}" ] + } + .groupTuple(by:0) + .map { process, tool_versions -> + tool_versions.unique().sort() + 
"${process}:\n${tool_versions.join('\n')}" + } + + def ch_collated_versions = softwareVersionsToYAML(ch_versions.mix(topic_versions.versions_file)) + .mix(topic_versions_string) + .collectFile( + storeDir: "${outdir}/pipeline_info", + name: 'nf_core_' + 'deepmutscan_software_' + 'mqc_' + 'versions.yml', + sort: true, + newLine: true + ) + + // + // MODULE: MultiQC + // + ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) + def ch_summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + def ch_workflow_summary = channel.value(paramsSummaryMultiqc(ch_summary_params)) + ch_multiqc_files = ch_multiqc_files.mix(ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')) + def ch_multiqc_custom_methods_description = multiqc_methods_description + ? file(multiqc_methods_description, checkIfExists: true) + : file("${projectDir}/assets/methods_description_template.yml", checkIfExists: true) + def ch_methods_description = channel.value(methodsDescriptionText(ch_multiqc_custom_methods_description)) + ch_multiqc_files = ch_multiqc_files.mix(ch_methods_description.collectFile(name: 'methods_description_mqc.yaml', sort: true)) + + MULTIQC( + ch_multiqc_files.flatten().collect().map { files -> + [ + [id: 'deepmutscan'], + files, + multiqc_config + ? file(multiqc_config, checkIfExists: true) + : file("${projectDir}/assets/multiqc_config.yml", checkIfExists: true), + multiqc_logo ? 
file(multiqc_logo, checkIfExists: true) : [], + [], + [], + ] + } + ) + + emit: + multiqc_report = MULTIQC.out.report.map { _meta, report -> [report] }.toList() // channel: /path/to/multiqc_report.html + versions = ch_versions // channel: [ path(versions.yml) ] + bwa_index = BWA_INDEX.out.index + aligned_bam = BWA_MEM.out.bam + filtered_bam = BAMFILTER_DMS.out.bam + premerged_bam = PREMERGE.out.bam +} + +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + THE END +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ diff --git a/workflows/dmscore.nf b/workflows/dmscore.nf deleted file mode 100644 index 041e503..0000000 --- a/workflows/dmscore.nf +++ /dev/null @@ -1,97 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ -include { FASTQC } from '../modules/nf-core/fastqc/main' -include { MULTIQC } from '../modules/nf-core/multiqc/main' -include { paramsSummaryMap } from 'plugin/nf-schema' -include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' -include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' -include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_dmscore_pipeline' - -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - RUN MAIN WORKFLOW -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -workflow DMSCORE { - - take: - ch_samplesheet // channel: samplesheet read in from --input - main: - - ch_versions = Channel.empty() - ch_multiqc_files = Channel.empty() - // - // MODULE: Run FastQC - // - FASTQC ( - ch_samplesheet - ) - ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect{it[1]}) - ch_versions = 
ch_versions.mix(FASTQC.out.versions.first()) - - // - // Collate and save software versions - // - softwareVersionsToYAML(ch_versions) - .collectFile( - storeDir: "${params.outdir}/pipeline_info", - name: 'nf_core_' + 'dmscore_software_' + 'mqc_' + 'versions.yml', - sort: true, - newLine: true - ).set { ch_collated_versions } - - - // - // MODULE: MultiQC - // - ch_multiqc_config = Channel.fromPath( - "$projectDir/assets/multiqc_config.yml", checkIfExists: true) - ch_multiqc_custom_config = params.multiqc_config ? - Channel.fromPath(params.multiqc_config, checkIfExists: true) : - Channel.empty() - ch_multiqc_logo = params.multiqc_logo ? - Channel.fromPath(params.multiqc_logo, checkIfExists: true) : - Channel.empty() - - summary_params = paramsSummaryMap( - workflow, parameters_schema: "nextflow_schema.json") - ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params)) - ch_multiqc_files = ch_multiqc_files.mix( - ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')) - ch_multiqc_custom_methods_description = params.multiqc_methods_description ? 
- file(params.multiqc_methods_description, checkIfExists: true) : - file("$projectDir/assets/methods_description_template.yml", checkIfExists: true) - ch_methods_description = Channel.value( - methodsDescriptionText(ch_multiqc_custom_methods_description)) - - ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) - ch_multiqc_files = ch_multiqc_files.mix( - ch_methods_description.collectFile( - name: 'methods_description_mqc.yaml', - sort: true - ) - ) - - MULTIQC ( - ch_multiqc_files.collect(), - ch_multiqc_config.toList(), - ch_multiqc_custom_config.toList(), - ch_multiqc_logo.toList(), - [], - [] - ) - - emit:multiqc_report = MULTIQC.out.report.toList() // channel: /path/to/multiqc_report.html - versions = ch_versions // channel: [ path(versions.yml) ] - -} - -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - THE END -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/