The package `chunkfactory` is meant to facilitate the production of documents for the exploration of bivariate descriptive statistics.

icg-cat/chunkfactory

vignette - chunkfactory

author: irene cruz

date: 27-05-25

This package was developed in the context of the project Igualdad de género en los usos del tiempo: Cambios, resistencias y continuidades, GENERA Proyecto PID2021-122515NB-I00.

Objectives

The package chunkfactory is meant to minimize the coding needed for markdown documents where a handful of functions must be executed iteratively.

It solves 2 problems:

  • repetitive code when performing the same operations on multiple combinations of variables
  • presenting figures and tables in RMD when repeating functions on a list

The chunk factory can be applied to custom functions, or can be used with the built-in descriptive statistics functions for weighted data. Let's see the two cases.

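library(chunkfactory)
library(tidyverse)
# Load the penguins data from the palmerpenguins package
data("penguins", package = "palmerpenguins")
# Constant weight of 1 for every row, used later as the weight variable
penguins$pes <- 1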

Execute any function as a chunk factory {.tabset}

  1. We can use the chunk factory on any custom function that generates the collection of results we need. The only restriction is that the data argument must be passed as a text string naming the data object, and the function must then resolve that string to the object itself (for example with get()).

For example, we'll build a function that, for every pair of variables, returns:

  • a linear regression model
  • a formatted table of model results
  • a tibble of model coefficients
  • a predicted means plot
myfunc <- function(data_name, v1, v2){
  data <- get(data_name)
  
  myformula <- as.formula(glue::glue("{v1} ~ {v2}"))
  res1 <- lm(myformula, data = data)
  res2 <- sjPlot::tab_model(res1, use.viewer = FALSE)
  res3 <- broom::tidy(res1)
  res4 <- sjPlot::plot_model(res1, type = "eff", terms = v2)
  
  return(list(res1, res2, res3, res4))
}
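The string-to-object step inside myfunc() relies on base R's get(). As a minimal sketch of that lookup on its own, the helper below (fetch_data is a hypothetical name, not part of chunkfactory) adds a check so that a misspelled data name fails with a readable message:

```r
# Hypothetical helper: resolve a data object from its name, as myfunc() does
# with get(), but fail early with a clear message if the name is unknown.
fetch_data <- function(data_name, envir = parent.frame()) {
  stopifnot(is.character(data_name), length(data_name) == 1)
  if (!exists(data_name, envir = envir)) {
    stop("No object named '", data_name, "' was found.")
  }
  get(data_name, envir = envir)
}

toy_df <- data.frame(a = 1:3)
fetch_data("toy_df")
```

Keeping the name as a string also means it can be reused as text, for instance in the section titles that the chunk factory generates.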

If we were to simply execute myfunc() and print the list of results as is, each element would be preceded by the aesthetically unpleasant [[x]], and table outputs would not be printed following document guidelines (i.e. paged data frames):

myfunc("penguins", "flipper_length_mm", "species")
## [[1]]
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
##      (Intercept)  speciesChinstrap     speciesGentoo  
##           189.95              5.87             27.23  
## 
## 
## [[2]]
## 
## [[3]]
## # A tibble: 3 × 5
##   term             estimate std.error statistic   p.value
##   <chr>               <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)        190.       0.540    351.   0        
## 2 speciesChinstrap     5.87     0.970      6.05 3.79e-  9
## 3 speciesGentoo       27.2      0.807     33.8  1.84e-110
## 
## [[4]]

chunkfactory helps us render these results cleanly in an Rmd document while minimising repeated code.

  2. To create a chunk factory, we first define the list of parameters to iterate over. The list elements must be named exactly like the function's arguments and appear in the same order:
myfunc_params <- list(
  data_name = "penguins", 
  v1        = c("flipper_length_mm", "bill_length_mm"),
  v2        = c("species", "island", "sex")
  )

Next we'll only need to execute our custom function with our list of parameters:

myres <- fabrica_chunks_myfunc(
  myfunc = myfunc, 
  param_list = myfunc_params, 
  title_level = 2)

This generates a character vector containing the markdown, chunks and code to be evaluated. See a sample:

head(myres)
## [1] "\n\n## penguins.flipper_length_mm.species\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[1]][[1]]'))\neval(parse(text = 'reslist[[1]][[2]]'))\neval(parse(text = 'reslist[[1]][[3]]'))\neval(parse(text = 'reslist[[1]][[4]]'))\n```\n\n"
## [2] "\n\n## penguins.bill_length_mm.species\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[2]][[1]]'))\neval(parse(text = 'reslist[[2]][[2]]'))\neval(parse(text = 'reslist[[2]][[3]]'))\neval(parse(text = 'reslist[[2]][[4]]'))\n```\n\n"   
## [3] "\n\n## penguins.flipper_length_mm.island\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[3]][[1]]'))\neval(parse(text = 'reslist[[3]][[2]]'))\neval(parse(text = 'reslist[[3]][[3]]'))\neval(parse(text = 'reslist[[3]][[4]]'))\n```\n\n" 
## [4] "\n\n## penguins.bill_length_mm.island\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[4]][[1]]'))\neval(parse(text = 'reslist[[4]][[2]]'))\neval(parse(text = 'reslist[[4]][[3]]'))\neval(parse(text = 'reslist[[4]][[4]]'))\n```\n\n"    
## [5] "\n\n## penguins.flipper_length_mm.sex\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[5]][[1]]'))\neval(parse(text = 'reslist[[5]][[2]]'))\neval(parse(text = 'reslist[[5]][[3]]'))\neval(parse(text = 'reslist[[5]][[4]]'))\n```\n\n"    
## [6] "\n\n## penguins.bill_length_mm.sex\n\n```{r echo = TRUE}\n\neval(parse(text = 'reslist[[6]][[1]]'))\neval(parse(text = 'reslist[[6]][[2]]'))\neval(parse(text = 'reslist[[6]][[3]]'))\neval(parse(text = 'reslist[[6]][[4]]'))\n```\n\n"
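The six section titles above are consistent with a full cross of the parameter vectors, with the earlier vectors varying fastest. That is an observation about the sample output, not a statement about the package internals. A minimal sketch that reproduces the same combinations and titles with base R's expand.grid():

```r
myfunc_params <- list(
  data_name = "penguins",
  v1        = c("flipper_length_mm", "bill_length_mm"),
  v2        = c("species", "island", "sex")
)

# expand.grid() accepts a named list and varies the first vectors fastest
combos <- expand.grid(myfunc_params, stringsAsFactors = FALSE)

# One title per combination, joined with dots as in the sample output
titles <- paste(combos$data_name, combos$v1, combos$v2, sep = ".")
print(titles)
```

With two values for v1 and three for v2, this yields 2 x 3 = 6 combinations, one per generated section.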
  3. The resulting character vector is then rendered by knit_child(), which produces the following output:
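In an R Markdown document, this step is commonly written as a chunk with results = "asis" that passes the character vector to knitr::knit_child(). The sketch below wraps that call in a hypothetical helper, render_chunks(), which is not part of chunkfactory. It assumes the reslist object referenced by the generated code is visible to the knitting environment, something the vignette does not spell out.

```r
# Hypothetical helper (not part of chunkfactory): knit a character vector of
# child-document text and write the result out as raw markdown.
render_chunks <- function(chunks) {
  stopifnot(is.character(chunks))
  out <- knitr::knit_child(text = chunks, quiet = TRUE)
  cat(out, sep = "\n")
  invisible(out)
}

# Intended use, inside an R Markdown chunk declared with results = "asis":
# render_chunks(myres)
```

The results = "asis" option matters here: without it, knitr would escape the knitted markdown instead of letting it become headings, tables and figures.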

penguins.flipper_length_mm.species

eval(parse(text = 'reslist[[1]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
##      (Intercept)  speciesChinstrap     speciesGentoo  
##           189.95              5.87             27.23
eval(parse(text = 'reslist[[1]][[2]]'))
flipper length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 189.95 | 188.89 – 191.02 | <0.001 |
| species [Chinstrap] | 5.87 | 3.96 – 7.78 | <0.001 |
| species [Gentoo] | 27.23 | 25.65 – 28.82 | <0.001 |
| Observations | 342 | | |
| R² / R² adjusted | 0.778 / 0.777 | | |

eval(parse(text = 'reslist[[1]][[3]]'))
## # A tibble: 3 × 5
##   term             estimate std.error statistic   p.value
##   <chr>               <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)        190.       0.540    351.   0        
## 2 speciesChinstrap     5.87     0.970      6.05 3.79e-  9
## 3 speciesGentoo       27.2      0.807     33.8  1.84e-110
eval(parse(text = 'reslist[[1]][[4]]'))

penguins.bill_length_mm.species

eval(parse(text = 'reslist[[2]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
##      (Intercept)  speciesChinstrap     speciesGentoo  
##           38.791            10.042             8.713
eval(parse(text = 'reslist[[2]][[2]]'))
bill length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 38.79 | 38.32 – 39.27 | <0.001 |
| species [Chinstrap] | 10.04 | 9.19 – 10.89 | <0.001 |
| species [Gentoo] | 8.71 | 8.01 – 9.42 | <0.001 |
| Observations | 342 | | |
| R² / R² adjusted | 0.708 / 0.706 | | |

eval(parse(text = 'reslist[[2]][[3]]'))
## # A tibble: 3 × 5
##   term             estimate std.error statistic   p.value
##   <chr>               <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)         38.8      0.241     161.  2.47e-322
## 2 speciesChinstrap    10.0      0.432      23.2 4.23e- 72
## 3 speciesGentoo        8.71     0.360      24.2 5.33e- 76
eval(parse(text = 'reslist[[2]][[4]]'))

penguins.flipper_length_mm.island

eval(parse(text = 'reslist[[3]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
##     (Intercept)      islandDream  islandTorgersen  
##          209.71           -16.63           -18.51
eval(parse(text = 'reslist[[3]][[2]]'))
flipper length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 209.71 | 208.01 – 211.40 | <0.001 |
| island [Dream] | -16.63 | -19.23 – -14.04 | <0.001 |
| island [Torgersen] | -18.51 | -22.02 – -15.00 | <0.001 |
| Observations | 342 | | |
| R² / R² adjusted | 0.376 / 0.372 | | |

eval(parse(text = 'reslist[[3]][[3]]'))
## # A tibble: 3 × 5
##   term            estimate std.error statistic  p.value
##   <chr>              <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)        210.      0.862     243.  0       
## 2 islandDream        -16.6     1.32      -12.6 4.20e-30
## 3 islandTorgersen    -18.5     1.78      -10.4 4.04e-22
eval(parse(text = 'reslist[[3]][[4]]'))

penguins.bill_length_mm.island

eval(parse(text = 'reslist[[4]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
##     (Intercept)      islandDream  islandTorgersen  
##          45.257           -1.090           -6.307
eval(parse(text = 'reslist[[4]][[2]]'))
bill length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 45.26 | 44.49 – 46.02 | <0.001 |
| island [Dream] | -1.09 | -2.26 – 0.08 | 0.069 |
| island [Torgersen] | -6.31 | -7.89 – -4.72 | <0.001 |
| Observations | 342 | | |
| R² / R² adjusted | 0.154 / 0.149 | | |

eval(parse(text = 'reslist[[4]][[3]]'))
## # A tibble: 3 × 5
##   term            estimate std.error statistic   p.value
##   <chr>              <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)        45.3      0.390    116.   4.68e-275
## 2 islandDream        -1.09     0.597     -1.83 6.88e-  2
## 3 islandTorgersen    -6.31     0.806     -7.83 6.44e- 14
eval(parse(text = 'reslist[[4]][[4]]'))

penguins.flipper_length_mm.sex

eval(parse(text = 'reslist[[5]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
## (Intercept)      sexmale  
##     197.364        7.142
eval(parse(text = 'reslist[[5]][[2]]'))
flipper length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 197.36 | 195.29 – 199.44 | <0.001 |
| sex [male] | 7.14 | 4.22 – 10.07 | <0.001 |
| Observations | 333 | | |
| R² / R² adjusted | 0.065 / 0.062 | | |

eval(parse(text = 'reslist[[5]][[3]]'))
## # A tibble: 2 × 5
##   term        estimate std.error statistic    p.value
##   <chr>          <dbl>     <dbl>     <dbl>      <dbl>
## 1 (Intercept)   197.        1.06    187.   0         
## 2 sexmale         7.14      1.49      4.80 0.00000239
eval(parse(text = 'reslist[[5]][[4]]'))

penguins.bill_length_mm.sex

eval(parse(text = 'reslist[[6]][[1]]'))
## 
## Call:
## lm(formula = myformula, data = data)
## 
## Coefficients:
## (Intercept)      sexmale  
##      42.097        3.758
eval(parse(text = 'reslist[[6]][[2]]'))
bill length mm

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 42.10 | 41.31 – 42.88 | <0.001 |
| sex [male] | 3.76 | 2.65 – 4.87 | <0.001 |
| Observations | 333 | | |
| R² / R² adjusted | 0.118 / 0.116 | | |

eval(parse(text = 'reslist[[6]][[3]]'))
## # A tibble: 2 × 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)    42.1      0.400    105.   2.18e-256
## 2 sexmale         3.76     0.564      6.67 1.09e- 10
eval(parse(text = 'reslist[[6]][[4]]'))

Examples with built-in functions

The package includes built-in functions for bivariate analyses with weighted data.

If the dependent variable is numeric, results will show:

  • grouped descriptive statistics
  • grouped boxplots

If the dependent variable is categorical, results will show:

  • a cross-tab in tidy format, including adjusted-standardized residuals [citation]
  • stacked bar chart

Results are organized into tabsets, like in the following example:

Bivariates by sex {.tabset}

myres <- fabrica_chunks(
  vd = c("bill_length_mm", "bill_depth_mm", "flipper_length_mm"), 
  vi = c("sex"), 
  d = "penguins", 
  w = "pes")

myres is a character vector containing the code that will later be evaluated with knit_child().

bill_length_mm x sex

eval(parse(text = (reslist_mytab[[1]][[1]])))
## # A tibble: 3 × 9
##   sex        n     N mitjana mediana desv_tip margin lower upper
##   <fct>  <int> <dbl>   <dbl>   <dbl>    <dbl>  <dbl> <dbl> <dbl>
## 1 female   165   165    42.1    42.8     4.90  0.754  41.3  42.9
## 2 male     168   168    45.9    46.8     5.37  0.817  45.0  46.7
## 3 <NA>      11    11    41.3    42       4.63  3.07   38.2  44.4
eval(parse(text = (reslist_mytab[[1]][[2]])))

bill_depth_mm x sex

eval(parse(text = (reslist_mytab[[2]][[1]])))
## # A tibble: 3 × 9
##   sex        n     N mitjana mediana desv_tip margin lower upper
##   <fct>  <int> <dbl>   <dbl>   <dbl>    <dbl>  <dbl> <dbl> <dbl>
## 1 female   165   165    16.4    17       1.80  0.276  16.1  16.7
## 2 male     168   168    17.9    18.4     1.86  0.284  17.6  18.2
## 3 <NA>      11    11    16.6    17.1     2.24  1.48   15.2  18.1
eval(parse(text = (reslist_mytab[[2]][[2]])))

flipper_length_mm x sex

eval(parse(text = (reslist_mytab[[3]][[1]])))
## # A tibble: 3 × 9
##   sex        n     N mitjana mediana desv_tip margin lower upper
##   <fct>  <int> <dbl>   <dbl>   <dbl>    <dbl>  <dbl> <dbl> <dbl>
## 1 female   165   165    197.    193      12.5   1.92  195.  199.
## 2 male     168   168    205.    200.     14.5   2.22  202.  207.
## 3 <NA>      11    11    199     193      16.5  10.9   188.  210.
eval(parse(text = (reslist_mytab[[3]][[2]])))
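The column names in these tables are Catalan: mitjana is the mean, mediana the median and desv_tip the standard deviation. The margin, lower and upper columns appear to be a confidence-interval half-width and its bounds. To make the idea of weighted grouped statistics concrete, here is a minimal base-R sketch on toy data. It is not the package's implementation: the function name weighted_summary, the frequency-weight variance formula and the t-based 95% margin are all assumptions made for illustration.

```r
# Toy data; column 'w' plays the role of the weight variable
toy <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 2, 3, 4, 6, 8),
  w = c(1, 1, 2, 1, 1, 1)
)

# Hypothetical helper: weighted descriptives of `value` within each `group`
weighted_summary <- function(df, value, group, weight) {
  parts <- split(df, df[[group]])
  rows <- lapply(parts, function(p) {
    ok <- !is.na(p[[value]]) & !is.na(p[[weight]])
    x <- p[[value]][ok]
    w <- p[[weight]][ok]
    n <- length(x)                      # unweighted number of cases
    m <- stats::weighted.mean(x, w)     # weighted mean
    # Weighted SD, treating weights as frequencies (an assumption)
    s <- sqrt(sum(w * (x - m)^2) / (sum(w) - 1))
    # 95% margin of error from the t distribution (an assumption)
    margin <- stats::qt(0.975, df = n - 1) * s / sqrt(n)
    data.frame(n = n, N = sum(w), mean = m, sd = s,
               margin = margin, lower = m - margin, upper = m + margin)
  })
  cbind(group = names(parts), do.call(rbind, rows))
}

res <- weighted_summary(toy, value = "x", group = "g", weight = "w")
print(res)
```

When every weight equals 1, as with the pes column in this vignette, n and N coincide and the weighted mean reduces to the ordinary mean.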

Bivariates by island {.tabset}

myres <- fabrica_chunks(
  vd = c("sex", "species"), 
  vi = c("island"), 
  d = "penguins", 
  w = "pes")

sex x island

eval(parse(text = (reslist_mytab[[1]][[1]])))
## # A tibble: 6 × 7
##   VI        VD         N     n    TT    PP  ASres
##   <chr>     <chr>  <dbl> <int> <dbl> <dbl>  <dbl>
## 1 Biscoe    female    80    80   163  49.1 -0.167
## 2 Biscoe    male      83    83   163  50.9  0.167
## 3 Dream     female    61    61   123  49.6  0.012
## 4 Dream     male      62    62   123  50.4 -0.013
## 5 Torgersen female    24    24    47  51.1  0.225
## 6 Torgersen male      23    23    47  48.9 -0.224
eval(parse(text = (reslist_mytab[[1]][[2]])))

species x island

eval(parse(text = (reslist_mytab[[2]][[1]])))
## # A tibble: 5 × 7
##   VI        VD            N     n    TT    PP  ASres
##   <chr>     <chr>     <dbl> <int> <dbl> <dbl>  <dbl>
## 1 Biscoe    Adelie       44    44   168  26.2 -6.57 
## 2 Biscoe    Gentoo      124   124   168  73.8 14.3  
## 3 Dream     Adelie       56    56   124  45.2  0.273
## 4 Dream     Chinstrap    68    68   124  54.8 12.3  
## 5 Torgersen Adelie       52    52    52 100    8.80
eval(parse(text = (reslist_mytab[[2]][[2]])))
