Change from java-based EGA downloader to pyega3 regarding legacy data #583

@famosab

Description

Manifest files generated in the DCC require a Java-based download client that is no longer maintained and is not even linked in the ICGC ARGO docs covering the legacy data from the ICGC 25k project.

Detailed Description

The DCC documentation states that one needs to use the EGA download client to access ICGC data stored with EGA. I found that this client is no longer really supported and that EGA now usually points to its Python-based pyega3 client. This is noted in the ARGO documentation; however, the auto-generated manifest files in the DCC still require the Java-based client. Only a few adaptations are necessary to switch to pyega3.

Possible Implementation

The following explanation could be added to the documentation. The manifest generation itself could also be changed, but that might not be worth the time since the DCC will be retired soon.


You will need to

  1. Install pyega3 v5.1.0 or higher
  2. Adapt the auto-generated manifest file to look like the following file (most likely you will only need to update the file_ids and mapping variables by copying their values from the auto-generated manifest file):
#!/bin/bash
###############################################################################
# Manifest
###############################################################################

file_ids="EGAF00001074790"
mapping="{EGAF00001074790=FI41441}"

###############################################################################
# Checking
###############################################################################

# The request step below calls pyega3 directly, so check for that command
if ! command -v pyega3 &>/dev/null; then
   echo "pyega3 not found. Exiting..."
   exit 2
fi

###############################################################################
# Request
###############################################################################

echo "Requesting $mapping..."

for file_id in $file_ids
do
  echo "Requesting $file_id..."
  # -c: number of connections, -ms: max slice size in bytes (1 GiB), -cf: credentials file
  pyega3 -c 20 -ms 1073741824 -cf conf.pyega3.json fetch "$file_id"
done

echo "Finished!"
  3. Provide a second file called conf.pyega3.json, which looks like:
{
   "username":"<your email registered with EGA>",
   "password":"<your password registered with EGA>"
}
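
Since pyega3 only works with the EGA file IDs, it may help to look up which ICGC file ID each download corresponds to. The following is a minimal sketch, not part of the generated manifest: `parse_mapping` is a hypothetical helper name, and it assumes that a mapping with several entries is written as `{EGAF...=FI..., EGAF...=FI...}`, i.e. entries separated by commas and/or spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of the DCC manifest): split a mapping string
# such as "{EGAF00001074790=FI41441}" into "EGA_ID<TAB>ICGC_ID" lines.
# Assumption: multiple entries are separated by commas and/or spaces.
parse_mapping() {
  local raw="$1"
  raw="${raw#\{}"   # drop the leading brace
  raw="${raw%\}}"   # drop the trailing brace
  local pair
  local IFS=', '    # split entries on commas and spaces
  for pair in $raw; do
    [ -n "$pair" ] || continue
    # Text before the first "=" is the EGA ID, the rest is the ICGC file ID
    printf '%s\t%s\n' "${pair%%=*}" "${pair#*=}"
  done
}
```

Called as `parse_mapping "$mapping"` inside the adapted manifest script, this prints one tab-separated pair per file, which can be redirected to a file and used to rename or annotate the downloads afterwards.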

Note that this issue is mostly a duplicate of a post in the ICGC discuss forum. I was not sure how often people check the posts there.
