Allow per phenotype p-value thresholds #87
Open
Labels: enhancement (New feature or request)
Description
The idea
Currently, the p-value thresholds for SNP selection and the locus breaker are configured per study, directly in the summary stat input table (p_thresh1 and p_thresh2 columns). As a result, the same thresholds are applied to all phenotypes of a given study.
This is not ideal for molecular traits (such as eQTL or pQTL results), where one may prefer to set different thresholds for each gene / protein.
A possible implementation
- We allow an additional optional column in the summary stat input table, e.g. p_thresh_table. It must point to a TSV table with columns: study_id, pheno_id (aka gene_id), p_thresh1, p_thresh2.
- We update the locus-breaker R script to accept an extra argument, --p_thresh_table.
- When we load the summary stats, we add 2 columns representing the thresholds. These are populated with the fixed values from --p_thres1 and --p_thres2 when --p_thresh_table is not set; otherwise we merge by (study_id, pheno_id) to populate them with information from the p-value threshold table.
- SNPs are filtered based on the p_thresh columns (these columns can be removed afterwards).
We have to point out in the docs that, when the p-value threshold table is provided, it takes precedence over the fixed values.
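To make the proposed logic concrete, here is a rough sketch of the threshold resolution and filtering steps. It is written in plain Python for illustration only; the real change would live in the locus-breaker R script. The function names, the `snp` and `p` fields, the fallback to fixed values for phenotypes missing from the table, and the choice to filter on p_thresh1 with a strict inequality are all assumptions, not decisions made in this issue.

```python
import csv
import io


def load_threshold_table(handle):
    """Read the threshold TSV into {(study_id, pheno_id): (p_thresh1, p_thresh2)}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        (row["study_id"], row["pheno_id"]): (
            float(row["p_thresh1"]),
            float(row["p_thresh2"]),
        )
        for row in reader
    }


def resolve_thresholds(rows, fixed_p1, fixed_p2, table=None):
    """Add p_thresh1 / p_thresh2 to every summary-stat row.

    Without a table, the fixed values are used for all rows. With a table,
    the values are looked up by (study_id, pheno_id), so the table takes
    precedence over the fixed values.
    """
    resolved = []
    for row in rows:
        if table is None:
            p1, p2 = fixed_p1, fixed_p2
        else:
            key = (row["study_id"], row["pheno_id"])
            # Assumption: pairs missing from the table fall back to fixed values.
            p1, p2 = table.get(key, (fixed_p1, fixed_p2))
        resolved.append({**row, "p_thresh1": p1, "p_thresh2": p2})
    return resolved


def filter_snps(rows, column="p_thresh1"):
    """Keep rows whose p-value is below the per-row threshold (assumed strict)."""
    return [row for row in rows if row["p"] < row[column]]


# Toy summary stats (hypothetical values, for illustration only).
sumstats = [
    {"study_id": "s1", "pheno_id": "geneA", "snp": "rs1", "p": 1e-6},
    {"study_id": "s1", "pheno_id": "geneA", "snp": "rs2", "p": 1e-9},
    {"study_id": "s1", "pheno_id": "geneB", "snp": "rs3", "p": 1e-6},
]

# Toy threshold table: only geneA has its own thresholds.
tsv = "study_id\tpheno_id\tp_thresh1\tp_thresh2\ns1\tgeneA\t5e-8\t1e-5\n"
table = load_threshold_table(io.StringIO(tsv))

with_table = resolve_thresholds(sumstats, 1e-5, 1e-4, table)
fixed_only = resolve_thresholds(sumstats, 1e-5, 1e-4)

kept_with_table = [row["snp"] for row in filter_snps(with_table)]
kept_fixed = [row["snp"] for row in filter_snps(fixed_only)]
```

With the table, geneA uses its stricter threshold while geneB keeps the fixed one. What should happen for (study_id, pheno_id) pairs absent from the table (fallback versus error) is left open here and would need to be decided and documented.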