A few tweaks from Paul's work on his real dataset#160
Merged
Conversation
Fit the non-competing-event univariate Cox model in the hazard step with survival::coxph.fit() on a prebuilt one-column design matrix instead of coxph(Surv(followup, event == 1) ~ get(tx_bas), data). The formula interface rebuilds the model.frame and model.matrix on every call, and handler() is run once for the full fit plus once per bootstrap replicate, so that overhead was paid on every iteration. On a 200k-row univariate fit with the heavy ties that integer follow-up produces, the formula path spends about 86% of its time in model.frame construction rather than the C fit, and the direct call is roughly 7x faster. The competing-event finegray path is unchanged. Results are identical: an ITT bootstrap hazard run (seed 42) agrees with the previous formula-based fit to ~1e-13 on the hazard ratio and CIs. coxph.fit uses method = "efron" and the default coxph.control() to match coxph()'s defaults. Add survival::coxph.fit and survival::coxph.control to the imports (roxygen @importFrom and NAMESPACE), a NEWS.md entry, and a test asserting coxph.fit and coxph(formula) give the same coefficient on tied data so a future survival change cannot silently break the equivalence.
Fit the competing-event Fine-Gray model with survival::agreg.fit() on a prebuilt design matrix instead of coxph(Surv(fgstart, fgstop, fgstatus) ~ get(tx_bas), data = hr.data). The counting-process formula fit rebuilds the model.frame and model.matrix on every call, and handler()
Pass the finegray() case weights (fgwt) to the competing-event subdistribution Cox fit. These inverse-probability-of-censoring weights are required for a valid Fine-Gray subdistribution-hazard estimate and had been omitted since the hazard function was first written (fgwt never appears anywhere in the git history). This is a no-op for the current hazard ratio: the model is fit on simulated data in which every subject-trial is followed across the full 0..followup.max grid until its first event, so the only censoring is administrative at a single time and all fgwt are exactly 1. Verified bit-for-bit in the pipeline - the competing-event bootstrap hazard (seed 123) is identical with weights = fgwt and with weights = NULL. The fix matters only if the simulated hazard data ever carries genuine random censoring (e.g. if IPCW/LTFU were folded into the simulation, or per-trial follow-up grids became non-uniform), in which case the unweighted estimate would be biased. Update the competing-event equivalence test to the weighted form (agreg.fit with weights = fgwt against coxph(formula, weights = fgwt)) so it validates the corrected call on data that does have random censoring, and add a NEWS.md entry.
Replace the simple table() counts with compevent.table(), a new helper mirroring outcome.table(), so competing event counts in @info are grouped by tx_init_bas rather than reported as a single total.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This,
Speed up the hazard ratio calculation by fitting the Cox model with the survival C fitters directly on a prebuilt design matrix instead of
coxph(formula, data), avoiding themodel.frame/model.matrixrebuild on every bootstrap iteration:survival::coxph.fit()for the non-competing-event model andsurvival::agreg.fit()for the competing-event Fine-Gray (counting-process) model. The hazard ratio and CIs are unchanged.Fix the competing-event Fine-Gray hazard fit to use the
finegray()case weights (fgwt), which are required for a valid subdistribution-hazard estimate and were previously omitted. This is a no-op for the current hazard simulation (which has only administrative censoring, so allfgwtare 1) but corrects the estimate should the simulated data ever carry random censoring.Report competing events per treatment arm in the
@infoslot asinfo$compevent.uniqueandinfo$compevent.nonunique, mirroring the structure ofinfo$outcome.unique/info$outcome.nonunique. Both are grouped by baseline treatment; the non-unique table counts all competing event occurrences in the expanded data and the unique table counts distinct subjects who experienced the competing event. Both areNAwhen nocompeventis specified.