
Apps for analysis of GeoMx digital spatial profiling protein data.


CancerTargetLab/apps_geomx


General information.

The applications were developed by Elias Carlsson as a master's thesis project in 2022 at the Department of Immunotechnology, LTH.

Installation

Dependencies

Make sure the following packages are installed (they are required by the scripts):

dependencies <- c("rio","dplyr","tidyverse","shiny","corrplot","ggrepel","ggplot2","pheatmap","lmerTest","patchwork","digest","plotly","reshape2","Hmisc","psych","stats","reshape", "caTools", "randomForest", "caret", "survminer", "ggpubr", "janitor", "coxme", "survival", "forcats")
# Install any packages that are not yet available, then load them all.
to_install <- dependencies[!vapply(dependencies, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) install.packages(to_install)
req <- lapply(dependencies, require, character.only = TRUE)

Change working directory

It might be necessary to define the working directory for the apps to work.

your_directory = "C:/change/this/to/directory/of/this/folder"
setwd(your_directory)

If an error message containing "No such file or directory" appears, setting the working directory might solve the problem.

Dataset

The data folder currently contains an example dataset of randomly generated data. Try the apps with this dataset first to verify that the installation works. There are four apps:

  • app_normalization
  • app_lmem
  • app_survival
  • app_ML

Try to start them all and click around to ensure that everything works as it should.

Necessary folder and scripts

The apps rely on a fixed folder and script structure and may fail if it is not intact.

The following files and folders are necessary.

Top layer

  • app_survival.R
  • app_normalization.R
  • app_ML.R
  • app_lmem.R
  • configsetup.R
  • README.md
  • cache (folder)
  • functions (folder)
  • data (folder)
  • miscellaneous (folder)

cache (folder)

  • survival_functions (folder)
  • lmem_functions (folder)

functions (folder)

  • load_data.R
  • survival_functions.R
  • lmem_functions.R
  • normalization_functions.R
  • ML_functions.R

data (folder)

  • setup.RData
  • your datasets

miscellaneous

  • normalyzerDE_matrices.R
  • configsetup.R
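
The structure listed above can be checked before starting an app. The following is a minimal sketch in base R, assuming it is run from the project root; the helper name find_missing_paths is hypothetical and not part of the repository.

```r
# Paths the README lists as necessary, relative to the project root.
required_paths <- c(
  "app_survival.R", "app_normalization.R", "app_ML.R", "app_lmem.R",
  "cache/survival_functions", "cache/lmem_functions",
  "functions/load_data.R", "functions/survival_functions.R",
  "functions/lmem_functions.R", "functions/normalization_functions.R",
  "functions/ML_functions.R",
  "data/setup.RData",
  "miscellaneous/normalyzerDE_matrices.R", "miscellaneous/configsetup.R"
)

# Hypothetical helper: returns the entries of `paths` that do not exist under `root`.
find_missing_paths <- function(paths, root = ".") {
  paths[!file.exists(file.path(root, paths))]
}

missing_paths <- find_missing_paths(required_paths)
if (length(missing_paths) > 0) {
  message("Missing: ", paste(missing_paths, collapse = ", "))
} else {
  message("Folder structure looks intact.")
}
```

If anything is reported as missing, restoring it from the repository is probably the quickest fix.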

Setup for new data

You can add your own GeoMx datasets to the applications in two steps: 1) add your data, and 2) configure the setup file. Both steps are described below.

Add the data files

Add the data files at your desired location (preferably in the "data" folder or a subfolder of it). The requirements for the data are as follows:

GeoMx data and data normalized in GeoMx

  • The apps use raw GeoMx data and load the Excel file exactly as it is exported from GeoMx.

NormalyzerDE data

  • The apps can also handle data normalized in NormalyzerDE. Just place the output text files in a folder.
  • The script ./miscellaneous/normalyzerDE_matrices.R creates design and data matrices for the GeoMx data. If it is used with NormalyzerDE, everything should work fine. The script requires you to enter the names of all your proteins, so it is recommended to run it after 2.2.

Note: The data matrix passed to NormalyzerDE is scaled up by a factor of 10 and is therefore scaled down again when loaded.

"setup.RData"

Next, the apps need some information about your dataset, for instance the location of your data files and the names of your variables. An R script (./miscellaneous/configsetup.R) helps with this process.

setup.RData is an RData file that provides this information. It should be stored in the data folder, as follows: ./data/setup.RData

The first table below defines the values that setup.RData must contain; the second gives an example of each:

| Number | Description | Name | Type | Notes |
|---|---|---|---|---|
| 1 | Vector containing all available proteins | proteins | a character vector | |
| 2 | Vector containing all housekeeping and negative control proteins | cn | a named character vector | HK proteins named "HK", negative controls named "NegC" |
| 3 | Vector containing all types relevant for analysis | vec_type | a character vector | |
| 4 | Vector containing locations of datasets | loc | a named character vector | |
| 5 | Vector containing locations of NormalyzerDE datasets | loc_nDE | a named character vector | Needs to contain the element: "None" = 0 |
| 6 | Vector defining the names of relevant features | feature_names | a named character vector | * (See note) |

| Number | Example |
|---|---|
| 1 | proteins <- c("BIM", "PanCk") |
| 2 | cn <- c("HK" = "GAPDH", "NegC" = "Ms IgG2a") |
| 3 | vec_type <- c("Main_type", "Type", "Stage_1") |
| 4 | loc <- c("H3 Normalized" = "./data/H3data.xlsx", "Non normalized" = "./data/NNdata.xlsx") |
| 5 | loc_nDE <- c("None" = 0, "VSN Normalized" = "./data/nDE/VSNdata.txt") |
| 6 | feature_names <- c("Diagnosis date" = "Diagnosis_date", "Area" = "AOI surface area"...) |

* feature_names must include the following entries:

  • "Diagnosis date" =
  • "Cause of death" =
  • "Last record alive" =
  • "Date of death" =
  • "Area" =
  • "Nuclei count" =
  • "PatientID" =
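
Whether a feature_names vector covers all of these entries can be checked with a set difference. This is a minimal sketch; the helper name missing_features is hypothetical, and the example vector reuses only the two column names shown in the example table.

```r
# Names the README requires in feature_names.
required_features <- c("Diagnosis date", "Cause of death", "Last record alive",
                       "Date of death", "Area", "Nuclei count", "PatientID")

# Hypothetical helper: returns the required names that are absent from `feature_names`.
missing_features <- function(feature_names) {
  setdiff(required_features, names(feature_names))
}

# Deliberately incomplete example vector.
feature_names <- c("Diagnosis date" = "Diagnosis_date", "Area" = "AOI surface area")
missing_features(feature_names)
```

An empty result means every required entry is present.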

If the dataset already contains a time column and an event column, make sure they are named "time" and "event". Then leave the following entries empty:

  • "Diagnosis date" = ""
  • "Cause of death" = ""
  • "Last record alive" = ""
  • "Date of death" = ""
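
Put together, a setup file could be written roughly as follows. This is only a sketch, assuming that setup.RData simply stores the six objects under the names given in the table; ./miscellaneous/configsetup.R remains the intended route. The column names marked as hypothetical are placeholders, and the file is written to a temporary directory here rather than to ./data/setup.RData.

```r
# Values taken from the README's example table.
proteins <- c("BIM", "PanCk")
cn <- c("HK" = "GAPDH", "NegC" = "Ms IgG2a")
vec_type <- c("Main_type", "Type", "Stage_1")
loc <- c("H3 Normalized" = "./data/H3data.xlsx",
         "Non normalized" = "./data/NNdata.xlsx")
# Mixing 0 with strings coerces the whole vector to character, so "None" becomes "0".
loc_nDE <- c("None" = 0, "VSN Normalized" = "./data/nDE/VSNdata.txt")

# Dataset that already has "time" and "event" columns: the date entries stay empty.
feature_names <- c(
  "Diagnosis date" = "", "Cause of death" = "",
  "Last record alive" = "", "Date of death" = "",
  "Area" = "AOI surface area",
  "Nuclei count" = "AOI nuclei count",  # hypothetical column name
  "PatientID" = "Patient_ID"            # hypothetical column name
)

# Use "./data/setup.RData" in the real project; a temp file keeps this sketch harmless.
setup_file <- file.path(tempdir(), "setup.RData")
save(proteins, cn, vec_type, loc, loc_nDE, feature_names, file = setup_file)
```

Loading the file again with load(setup_file) restores all six objects under the same names.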

App information/limitations

Preselected dataset

Which dataset is preselected within the applications has to be changed manually in every app.

  • Search (Ctrl+F) for "selected = loc_nDE[" to change which NormalyzerDE dataset is preselected.
  • Search (Ctrl+F) for "selected = loc[" to change which dataset is preselected.

Choose data and filter menu

Choosing a dataset and filtering it can be done in every app and looks like the image displayed below. Selecting a dataset allows for comparison of different normalization approaches or similar projects. Filtering the data allows for selecting only specific values from some columns, for example only ROI-type = tumor or Therapy_sucessful = Yes.
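
The effect of such a filter can be illustrated outside the apps. The following base R sketch is not the apps' own implementation; the data frame, the column name ROI_type, and the helper filter_rows are hypothetical.

```r
# Hypothetical example data with one row per area of interest.
aoi <- data.frame(
  ROI_type = c("tumor", "stroma", "tumor", "tumor"),
  Therapy_sucessful = c("Yes", "Yes", "No", "Yes"),
  BIM = c(1.2, 0.8, 1.5, 1.1)
)

# Hypothetical helper: keep rows whose value in each named column
# is among the allowed values given for that column.
filter_rows <- function(data, filters) {
  keep <- rep(TRUE, nrow(data))
  for (column in names(filters)) {
    keep <- keep & data[[column]] %in% filters[[column]]
  }
  data[keep, , drop = FALSE]
}

# Keep only tumor regions from patients with successful therapy.
tumor_success <- filter_rows(aoi, list(ROI_type = "tumor", Therapy_sucessful = "Yes"))
```

Each entry of the filter list can hold several allowed values, which mirrors selecting more than one value for a column.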

app_normalization

The scatterplots in the normalization app are limited to plotting three negative controls and three housekeeping proteins.

app_survival

If the error message "Inf value..." appears (usually with two or more random effects), change the method to nelder-mead. If that does not help, check that the columns are defined correctly in feature_names and that your event and time columns are named "event" and "time".

app_ML

A random forest (RF) regression is run for all biomarkers. If regression is wanted for parameters as well, this can be edited in ML_functions.R.

Ctrl+F for "Add which parameters should be regression manually"
