Skip to content
This repository was archived by the owner on Mar 10, 2026. It is now read-only.
This repository was archived by the owner on Mar 10, 2026. It is now read-only.

CrossTabulation Error on certain data #58

@kevintcaron

Description

@kevintcaron

Hi @MamadouSDiallo,

Sorry to add another issue, but hopefully this is easier to resolve. I am attempting to tabulate smoking prevalence ('_RFSMOK3') by imputed race ('_IMPRACE') with BRFSS data. This works fine for all territories including Puerto Rico from 2018-2022. However, when using the 2021 BRFSS data for Puerto Rico I am getting the error below. I have also provided my code below the error, but it will require download of the 2021 BRFSS data (or other years for it to work). Do you know what is causing this issue?

ERROR:

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[247], line 12
      8 # df = recode_smoker(df)
      9 # df = recode_smokeless(df)
     11 stratified_count = CrossTabulation(param=PopParam.count)
---> 12 stratified_count.tabulate(
     13     vars=df[[variable, '_IMPRACE']], 
     14     samp_weight=df['_LLCPWT'],
     15     stratum=df['_STSTR'],
     16     psu=df['_PSU'],
     17     single_psu=SinglePSUEst.skip,
     18     remove_nan=True)   
     20 df_out = stratified_count.to_dataframe()
     21 df_out

File [~\.conda\envs\ntcp_env\Lib\site-packages\samplics\categorical\tabulation.py:598](http://localhost:8888/lab/tree/~/.conda/envs/ntcp_env/Lib/site-packages/samplics/categorical/tabulation.py#line=597), in CrossTabulation.tabulate(self, vars, varnames, samp_weight, stratum, psu, ssu, fpc, deff, coef_var, single_psu, strata_comb, remove_nan)
    593 # TODO:
    594 # Replace the inversion of cov_prop and cov_prop_srs below by the a multiplication
    595 # we have that inv(x' V x) = inv(x' L L' x) = z'z where z = inv(L) x
    596 # L is the Cholesky factor i.e. L = np.linalg.cholesky(V)
    597 x1_t = np.transpose(x1)
--> 598 x2_tilde = x2 - x1 @ np.linalg.inv(x1_t @ cov_prop_srs @ x1) @ (
    599     x1_t @ cov_prop_srs @ x2
    600 )
    602 delta_est = np.linalg.inv(np.transpose(x2_tilde) @ cov_prop_srs @ x2_tilde) @ (
    603     np.transpose(x2_tilde) @ cov_prop @ x2_tilde  # TODO: is it cov_prop_srs
    604 )
    606 tbl_keys = list(tbl_est.point_est.keys())

File [~\.conda\envs\ntcp_env\Lib\site-packages\numpy\linalg\_linalg.py:608](http://localhost:8888/lab/tree/~/.conda/envs/ntcp_env/Lib/site-packages/numpy/linalg/_linalg.py#line=607), in inv(a)
    605 signature = 'D->D' if isComplexType(t) else 'd->d'
    606 with errstate(call=_raise_linalgerror_singular, invalid='call',
    607               over='ignore', divide='ignore', under='ignore'):
--> 608     ainv = _umath_linalg.inv(a, signature=signature)
    609 return wrap(ainv.astype(result_t, copy=False))

File [~\.conda\envs\ntcp_env\Lib\site-packages\numpy\linalg\_linalg.py:104](http://localhost:8888/lab/tree/~/.conda/envs/ntcp_env/Lib/site-packages/numpy/linalg/_linalg.py#line=103), in _raise_linalgerror_singular(err, flag)
    103 def _raise_linalgerror_singular(err, flag):
--> 104     raise LinAlgError("Singular matrix")

LinAlgError: Singular matrix

CODE:

import pandas as pd
import samplics
import numpy as np
from samplics.estimation import TaylorEstimator, ReplicateEstimator
from samplics.categorical import Tabulation, CrossTabulation
from samplics.utils import PopParam, SinglePSUEst

df_2021 = pd.read_sas(r'LLCP2021.XPT')
df = df_2021.copy()
df = df[df['_STATE'] == 72]
recipient = 'Puerto Rico'
year = 2021
demo = _imprace.copy()
variable = '_RFSMOK3'

# df = recode_smoker(df)
# df = recode_smokeless(df)

stratified_count = CrossTabulation(param=PopParam.count)
stratified_count.tabulate(
    vars=df[[variable, '_IMPRACE']], 
    samp_weight=df['_LLCPWT'],
    stratum=df['_STSTR'],
    psu=df['_PSU'],
    single_psu=SinglePSUEst.skip,
    remove_nan=True)   

df_out = stratified_count.to_dataframe()

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions