IMR

The IMR package provides a framework for both matrix completion and regression on response matrices with missing values. Let $\boldsymbol{Y} \in \Re^{n\times m}$ denote the observed incomplete matrix, where missing values are designated by either NA or zero. The estimator for each entry of the matrix is the sum of any combination of the following components:

$$\boldsymbol{\hat Y_{ij}} = \boldsymbol{\hat\beta_{oi}} + \boldsymbol{\hat\Gamma_{oj}} + \boldsymbol{X_i\hat\beta_j} + \boldsymbol{\hat\Gamma_i Z^{'}_j} + \boldsymbol{\hat M_{ij}}$$

where $\boldsymbol{X}\in \Re^{n\times p}$ and $\boldsymbol{Z}\in \Re^{m\times q}$ are the row covariate matrix ($p$ predictors) and the column covariate matrix ($q$ predictors), respectively. The vectors $\boldsymbol{\hat\beta_{o}}\in \Re^{n}$ and $\boldsymbol{\hat\Gamma_{o}}\in \Re^{m}$ represent the row-level and column-level intercepts. The term $\boldsymbol{\hat\beta}$ denotes the row covariate coefficients, which may be structured either as a $p\times m$ matrix (one coefficient for each (predictor, column) pair) or as a $p$-dimensional vector (one coefficient for each predictor), which forces the coefficients of each covariate to be equal across all columns. Similarly, the column covariate coefficients, denoted by $\boldsymbol{\hat\Gamma}$, can be either an $n \times q$ matrix or a $q$-dimensional vector in which all rows share the same coefficient for each covariate. To avoid having too many parameters, we impose Lasso ($L_1$) penalties on the row and column covariate coefficients. Finally, $\boldsymbol{M}$ is a rank-$r$ low-rank matrix ($r$ is a hyperparameter) subject to a nuclear norm penalty. Together, they yield the following penalty structure:

$$\mathrm{Penalty} = \lambda_\beta {\|\boldsymbol{\beta}\|}_{1} + \lambda_\Gamma {\|\boldsymbol{\Gamma}\|}_{1} + \lambda_m {\|\boldsymbol{S_r^{1/2}MS_c^{1/2}}\|}_{*},$$

where $\boldsymbol{S_r} \in \Re^{n\times n}$ and $\boldsymbol{S_c} \in \Re^{m \times m}$ are similarity (or information) matrices that describe the correlation structure among the rows and columns of the response matrix, respectively. In the absence of a known correlation structure, the last penalty term reduces to the standard nuclear norm, ${\|\boldsymbol{M}\|}_{*}$. The model therefore has four hyperparameters: the three penalty parameters and the rank, $(\lambda_\beta, \lambda_\Gamma, \lambda_m, r)$. The package provides a method to estimate them.
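To make the notation concrete, the following base-R sketch assembles the full estimator from simulated components and evaluates the penalty. It does not use the IMR API: every name in it (`sqrt_psd`, `nuclear_norm`, `penalty`, and the simulated objects) is a hypothetical helper introduced for illustration, and the symmetric square root is only one assumed way of forming $\boldsymbol{S_r^{1/2}}$ and $\boldsymbol{S_c^{1/2}}$.

```r
# Illustrative base-R sketch; none of these names come from the IMR package.
set.seed(1)
n <- 6; m <- 5; p <- 2; q <- 3; r <- 2

X      <- matrix(rnorm(n * p), n, p)   # row covariates (n x p)
Z      <- matrix(rnorm(m * q), m, q)   # column covariates (m x q)
beta0  <- rnorm(n)                     # row intercepts
gamma0 <- rnorm(m)                     # column intercepts
beta   <- matrix(rnorm(p * m), p, m)   # row-covariate coefficients (p x m form)
Gamma  <- matrix(rnorm(n * q), n, q)   # column-covariate coefficients (n x q form)
M      <- matrix(rnorm(n * r), n, r) %*% matrix(rnorm(r * m), r, m)  # rank-r term

# Y_hat[i, j] = beta0[i] + gamma0[j] + X[i, ] beta[, j] + Gamma[i, ] Z[j, ] + M[i, j]
Y_hat <- outer(beta0, rep(1, m)) + outer(rep(1, n), gamma0) +
  X %*% beta + Gamma %*% t(Z) + M

# Symmetric square root of a symmetric positive semi-definite matrix.
sqrt_psd <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(sqrt(pmax(e$values, 0)), nrow(S)) %*% t(e$vectors)
}

# Nuclear norm: the sum of the singular values.
nuclear_norm <- function(A) sum(svd(A)$d)

# Penalty from the formula above; identity S_r and S_c give the plain nuclear norm.
penalty <- function(beta, Gamma, M, lambda_beta, lambda_gamma, lambda_m,
                    S_r = diag(nrow(M)), S_c = diag(ncol(M))) {
  lambda_beta * sum(abs(beta)) +
    lambda_gamma * sum(abs(Gamma)) +
    lambda_m * nuclear_norm(sqrt_psd(S_r) %*% M %*% sqrt_psd(S_c))
}

pen_plain <- penalty(beta, Gamma, M,
                     lambda_beta = 0.02, lambda_gamma = 0.02, lambda_m = 1)
```

Dropping a term from the sum that defines `Y_hat` corresponds to switching that model component off, and passing non-identity `S_r` or `S_c` weights the nuclear norm by the assumed row or column structure.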

As noted above, any combination of the model components can be used to define the estimator. Examples of these combinations include:

1. $\boldsymbol{\hat Y_{ij}} = \boldsymbol{\hat\beta_{oi}} + \boldsymbol{X_i\hat\beta}$ (where $\boldsymbol{\hat\beta}$ is a $p$-dimensional vector).

2. $\boldsymbol{\hat Y_{ij}} = \boldsymbol{\hat\beta_{oi}} + \boldsymbol{\hat\Gamma_{oj}} +\boldsymbol{\hat M_{ij}}$ with $\boldsymbol{S_r}$ set to the inverse of a Matérn kernel and $\boldsymbol{S_c}$ left unspecified.

3. $\boldsymbol{\hat Y_{ij}} = \boldsymbol{\hat M_{ij}}$, where neither $\boldsymbol{S_r}$ nor $\boldsymbol{S_c}$ is specified. This corresponds exactly to the Soft-Impute model [@hastie2015].
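For intuition about the last combination, here is a minimal base-R sketch of a basic soft-thresholded SVD iteration of the kind used in Soft-Impute. It is not the package's implementation, and it may differ from the algorithm in the cited reference. The functions `soft_svd` and `soft_impute_sketch` are hypothetical names, and the sketch assumes missing entries are coded as NA only.

```r
# Illustrative base-R sketch; not the IMR implementation.

# Soft-threshold the singular values of A by lambda, keeping at most r of them.
soft_svd <- function(A, lambda, r) {
  s <- svd(A, nu = r, nv = r)
  d <- pmax(s$d[seq_len(r)] - lambda, 0)
  s$u %*% diag(d, r) %*% t(s$v)
}

# Alternate between filling missing entries with the current estimate
# and re-estimating the low-rank matrix from the filled matrix.
soft_impute_sketch <- function(Y, lambda, r, n_iter = 200, tol = 1e-7) {
  observed <- !is.na(Y)                 # assumption: NA marks a missing entry
  M <- matrix(0, nrow(Y), ncol(Y))
  for (it in seq_len(n_iter)) {
    filled <- ifelse(observed, Y, M)    # observed values kept, gaps filled from M
    M_new  <- soft_svd(filled, lambda, r)
    change <- sum((M_new - M)^2) / max(sum(M^2), 1e-12)
    M <- M_new
    if (change < tol) break             # stop once the estimate stabilises
  }
  M
}

# Usage on simulated rank-2 data with 150 of the 600 entries removed.
set.seed(42)
n <- 30; m <- 20
truth <- matrix(rnorm(n * 2), n, 2) %*% matrix(rnorm(2 * m), 2, m)
Y <- truth
Y[sample(n * m, 150)] <- NA
M_hat <- soft_impute_sketch(Y, lambda = 0.1, r = 2)
```

Here `lambda` plays the role of $\lambda_m$ and `r` caps the rank of the estimate; the values used are arbitrary and chosen only so the example runs.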

Installation

You can install the development version of IMR from GitHub with:

# install.packages("remotes")
remotes::install_github("khaledfouda/IMR", build_vignettes = TRUE)
# or
# install.packages("pak")
# pak::pak("khaledfouda/IMR")

Example

To illustrate the standard workflow within the package, assume the matrices $\boldsymbol{Y}$ and $\boldsymbol{X}$ are defined as above, and the objective is to fit example 1 above to obtain the complete estimated matrix $\boldsymbol{\hat Y}$. Then, the implementation is as follows:

library(IMR)
# Set the hyperparameter value.
lambda_beta <- 0.02
# Load the example data (see ?IMR::Bixi_sample for more information).
Bixi <- IMR::Bixi_sample
# Create the data object.
data <- imr_data(Y = Bixi$Y, X = Bixi$X)
# Update the model structure to fit example 1.
data <- update(data,
               row_covariates = TRUE,      # turn XBeta on (on by default when X is provided)
               shared_beta = TRUE,         # make beta a p-dimensional vector (off by default)
               low_rank_component = FALSE, # turn M off (on by default)
               row_intercept = TRUE)       # turn row intercepts on (off by default)
# Fit the model.
fit <- imr_fit(data, lambda_beta = lambda_beta)
# Obtain \hat{Y}.
Y_hat <- reconstruct(fit, data)$estimates

For a detailed overview of the package and a step-by-step example, please read the vignette. You can access it from within R by running:

vignette("IMR", "IMR")
