This package runs the SDC methods required for frequency tables in IDS. It enables the user to create a frequency table which has cell key perturbation applied to the counts, meaning that users cannot be sure whether differences between tables represent a real person, or are caused by the perturbation.
Cell Key Perturbation is consistent and repeatable, so the same cells are always perturbed in the same way.
To improve speed and reduce memory usage for very large datasets, this package uses the data.table package. Data passed in to the method must be a data table, and the perturbed frequency table is returned as a data table.
You can install cellkeyperturbation from GitHub with:
# install.packages("devtools")
devtools::install_github("ONSdigital/cell-key-perturbation-R", build_vignettes = TRUE)This is an example showing how to create a perturbed table from data which has been included in this package in order to showcase the method.
micro is an example dataset (data table) containing randomly generated data.
ptable_10_5 is an example ptable (data table) containing the rules to apply cell key perturbation with a threshold of 10 and rounding to base 5.
library(cellkeyperturbation)
perturbed_table <-create_perturbed_table(data = micro,
record_key = "record_key",
geog = c("var1"),
tab_vars = c("var5","var8"),
ptable = ptable_10_5,
threshold = 10)Links to the Help Pages and User Guide (Introduction to cellkeyperturbation) can be accessed using:
help(package="cellkeyperturbation")