This repository contains the code and data necessary to reproduce the main results of the Brazilian Reproducibility Initiative. Authorship for data and code is provided in the "author_contributions.xlsx" file using the CRediT taxonomy.
All analyses are run from the main.R script: you can simply clone the repo and run it. The exception is the code in the "maps" folder, which is written in Vega-Lite rather than R and needs to be run separately.
The main.R file calls all other functions necessary for analysis. Packages are installed automatically via the pacman package in the analysis.R script. Versions of R and packages used for running the original analysis are available in "R session info.txt".
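As an illustration of the automatic installation step, the following is a minimal sketch of how pacman is commonly used to install and load packages in one call. The helper name load_analysis_packages and the package list in the example call are hypothetical; the actual list lives in the repository's own scripts.

```r
# Hypothetical helper: installs pacman if it is missing, then uses it to
# install (when needed) and load every package named in `pkgs`.
load_analysis_packages <- function(pkgs) {
  if (!requireNamespace("pacman", quietly = TRUE)) {
    install.packages("pacman")
  }
  # `char` takes a character vector of package names.
  pacman::p_load(char = pkgs)
}

# Example call (not run here; the package list is an assumption):
# load_analysis_packages(c("metafor"))
```

To match the original environment more closely, compare the versions that get installed against those listed in "R session info.txt".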
Results from individual experiments with each method (produced by the mtt-summarizer.R, pcr-summarizer.R, pcr-summarizer-alt.R and epm-summarizer.R scripts) are saved in the "replication-results" folder. The "PCR" folder contains RT-PCR relative expression values in log scale (which are used in the primary analysis), while the "PCR-ALT" folder contains relative expression values in linear scale (as a sensitivity analysis).
Simulated results for the post-hoc power analysis can be generated by running the post-hoc-power-sim.R script (or by uncommenting the call to it in line 127 of main.R). Running the simulations every time is not necessary for the analysis, which by default uses the results of a previous round of 1,000 simulations from the "inclusion_sets.xlsx" table.
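For readers unfamiliar with simulation-based power estimates, the general idea can be sketched in base R as below. This is a generic two-group illustration and not the procedure implemented in post-hoc-power-sim.R; the function name simulate_power, the normal data model, the t-test and all parameter values are assumptions.

```r
# Generic sketch: estimate power as the share of simulated experiments
# in which a two-sample t-test reaches p < alpha.
simulate_power <- function(n_per_group, effect_size, n_sims = 1000, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    control   <- rnorm(n_per_group, mean = 0, sd = 1)
    treatment <- rnorm(n_per_group, mean = effect_size, sd = 1)
    t.test(treatment, control)$p.value
  })
  mean(p_values < alpha)
}

set.seed(1)  # fixed seed so repeated runs give the same estimate
power_estimate <- simulate_power(n_per_group = 10, effect_size = 1)
print(power_estimate)
```

Because each run draws new random samples unless a seed is fixed, storing the results of one round of simulations (as the repository does) keeps downstream results stable.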
The main analysis output will be generated in a folder named "output" (the subfolder name is set in the variable "results_path" in main.R and uses the current date by default). In it, there is a series of subfolders corresponding to (a) each set of experiments for analysis (as explained in Table 2 in the manuscript) and (b) each distribution used for analysis: z, t (based on the number of experimental units) and knha (based on the number of studies). Each of these subfolders will then contain the following:
/.
at the root level of the folder, there will be tables summarizing the main results for that set of experiments. File names are mostly self-explanatory. The main results are generated from the data in the "Replication Assessment" and "Replication Success" tables, and are summarized in the "Replication Rate Summary". A description of the variables in these tables can be found in the data dictionary under "Intermediate datasets".
/escalc
effect sizes and variances generated by the metafor escalc function for each experiment.
/summaries
statistical summaries used for analysis for each experiment.
/forest-plots
forest plots for each meta-analysis.
/individual-plots
plots for each individual replication, comparing the results for the two groups of interest.
/predictors
correlation tables and scatter plots for all combinations of experiment-level ("by experiment" folder) and replication-level ("by replication" folder) predictors and replication outcomes.
/CV plots
plots comparing coefficients of variation between original experiments and replications.
/additional-figures
plots showing comparisons/correlations between effect sizes of original experiments and replications.
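To give a sense of what the /escalc and /forest-plots outputs are built from, here is a minimal sketch using metafor. The data values, column names and the choice of standardized mean difference ("SMD") are assumptions for illustration only; the effect size measure and model settings used in the repository are defined in its own scripts. Mapping the "z" and "knha" folders to the test argument of rma is likewise an assumption.

```r
library(metafor)

# Made-up summary data for three replications of one experiment
# (means, SDs and sample sizes for the two groups being compared).
replications <- data.frame(
  lab  = c("A", "B", "C"),
  m1i  = c(10.2, 9.8, 11.0),
  sd1i = c(2.1, 1.9, 2.4),
  n1i  = c(8, 10, 9),
  m2i  = c(8.9, 9.1, 9.5),
  sd2i = c(2.0, 2.2, 2.3),
  n2i  = c(8, 10, 9)
)

# escalc() adds effect sizes (yi) and sampling variances (vi) to the data.
es <- escalc(measure = "SMD",
             m1i = m1i, sd1i = sd1i, n1i = n1i,
             m2i = m2i, sd2i = sd2i, n2i = n2i,
             data = replications)

# Random-effects meta-analysis with two ways of testing the pooled estimate:
# a z distribution and the Knapp-Hartung adjustment.
fit_z    <- rma(yi, vi, data = es, test = "z")
fit_knha <- rma(yi, vi, data = es, test = "knha")

print(fit_knha)
# forest(fit_knha) would draw the corresponding forest plot.
```

The choice of test changes the standard error, confidence interval and p-value of the pooled estimate, but not the pooled estimate itself.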
Additionally, the root /output folder also contains the following folders:
/self-assessment
includes figures and tables describing the initiative's self-assessment process.
/survey_processed_data
includes figures summarizing data from the prediction survey.
/Power Histograms
includes histograms for post-hoc power analysis.
/_manuscript figures and tables
contains all figures, tables (/tables subfolder) and numerical data (included in "/tables/Document - Text-Cited Numbers") used in the manuscript. Note that table formatting may differ from that used in the final manuscript.
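The dated output location described above can be sketched as follows. This is only an illustration: the date format, the set names and the nesting of distribution folders inside set folders are assumptions, and the sketch writes to a temporary directory rather than to the repository's "output" folder.

```r
# Illustrative only: build a dated results path and create one subfolder
# per (hypothetical) experiment set and distribution.
output_root  <- file.path(tempdir(), "output")
results_path <- file.path(output_root, format(Sys.Date(), "%Y-%m-%d"))

analysis_sets <- c("set-1", "set-2")   # hypothetical set names
distributions <- c("z", "t", "knha")   # distributions named in this README

for (set_name in analysis_sets) {
  for (dist_name in distributions) {
    dir.create(file.path(results_path, set_name, dist_name),
               recursive = TRUE, showWarnings = FALSE)
  }
}

print(list.dirs(results_path, recursive = TRUE, full.names = FALSE))
```

Setting "results_path" to a fixed name instead of the date keeps outputs from different days in the same folder, at the cost of overwriting earlier runs.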