refactor(converters): Migrate converters to MSstatsConvert#91
Conversation
📝 WalkthroughWalkthroughThis PR migrates converter implementations out of MSstatsTMT and replaces them with re-exports from MSstatsConvert, removes many local example datasets and their roxygen/man pages, relocates a small internal doc helper, updates vignette examples to load data from MSstatsConvert, and makes minor project metadata/gitignore edits. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User
participant MSstatsTMT as MSstatsTMT\n(package)
participant MSstatsConvert as MSstatsConvert\n(package)
participant FS as FileSystem
Note over MSstatsTMT,MSstatsConvert: Old flow (before this PR)
User->>MSstatsTMT: call MaxQtoMSstatsTMTFormat(input, annotation, ...)
MSstatsTMT->>FS: read input files / annotation
MSstatsTMT->>MSstatsTMT: run local converter logic\n(filtering, cleaning, summarizing)
MSstatsTMT->>FS: write/log output
MSstatsTMT-->>User: return MSstatsTMT-formatted data
Note over MSstatsTMT,MSstatsConvert: New flow (after this PR)
User->>MSstatsTMT: call MaxQtoMSstatsTMTFormat(input, annotation, ...)
MSstatsTMT->>MSstatsConvert: delegate to MSstatsConvert::MaxQtoMSstatsTMTFormat(...)
MSstatsConvert->>FS: read input files / annotation
MSstatsConvert->>MSstatsConvert: run converter logic\n(filtering, cleaning, summarizing)
MSstatsConvert->>FS: write/log output
MSstatsConvert-->>MSstatsTMT: return formatted data
MSstatsTMT-->>User: return MSstatsTMT-formatted data
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
R/utils_MSstatsTMT.R (1)
20-23:⚠️ Potential issue | 🟡 MinorRemove or replace these dangling documentation links.
raw.pdandannotation.pdare not documented datasets in this package—they exist only as temporary objects created in the vignette. The\code{\link{raw.pd}}and\code{\link{annotation.pd}}references in theinput.pdhelp topic will produce broken cross-references. Either drop the links and use plain code formatting (e.g.,\code{raw.pd}) or verify that equivalent documented objects exist in theMSstatsConvertpackage and update the links accordingly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_MSstatsTMT.R` around lines 20 - 23, The roxygen examples in utils_MSstatsTMT.R reference temporary vignette objects using dangling links \code{\link{raw.pd}} and \code{\link{annotation.pd}}; update the documentation for the input.pd/help topic by replacing those broken link forms with plain code formatting (e.g., \code{raw.pd} and \code{annotation.pd}) or, if equivalents are documented in MSstatsConvert, change to package-qualified links (e.g., \link[MSstatsConvert]{raw.pd}) so the cross-references resolve; make this change where PDtoMSstatsTMTFormat and input.pd are referenced to remove the broken links.
🧹 Nitpick comments (1)
.gitignore (1)
9-9: Remove duplicate ignore rule.
*.Rprojis already listed at Line 7, so duplicating it at Line 9 adds noise without changing behavior.Proposed cleanup
*.Rproj *.DS_Store -*.Rproj🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 9, Remove the duplicate ignore rule for the pattern "*.Rproj" in .gitignore: locate the duplicate entry (the second occurrence matching "*.Rproj") and delete it so the pattern only appears once, leaving the original "*.Rproj" entry intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@data/datalist`:
- Line 2: The roxygen documentation in R/utils_MSstatsTMT.R still points to
now-removed files raw.pd and annotation.pd causing broken links while
data/datalist still lists input.pd; update the roxygen block(s) in
R/utils_MSstatsTMT.R that mention raw.pd and annotation.pd to instead reference
the new MSstatsConvert-based source files (or remove the link tags altogether),
and ensure any mention of input.pd matches the current data/datalist entry so
the generated docs no longer include dead links.
In `@R/utils_docs.R`:
- Around line 5-24: The roxygen `@param` block documents logging parameters but
the .documentFunction formals still list legacy converter arguments, causing
mismatched usage/arguments in generated Rd files and broken `@inheritParams` for
callers; fix by aligning them: either (A) replace the .documentFunction formals
(fewMeasurements, useUniquePeptide, summaryforMultipleRows,
removeProtein_with1Feature, removeProtein_with1Protein,
removeOxidationMpeptides, removeMpeptides) with the documented logging formals
(use_log_file, append, verbose, log_file_path) and update any internal
references or callers to the new names, or (B) restore explicit `@param` entries
for each legacy formal so documentation matches the actual formals—choose one
approach and make the code+roxygen consistent for .documentFunction and any
functions using `@inheritParams` .documentFunction.
---
Outside diff comments:
In `@R/utils_MSstatsTMT.R`:
- Around line 20-23: The roxygen examples in utils_MSstatsTMT.R reference
temporary vignette objects using dangling links \code{\link{raw.pd}} and
\code{\link{annotation.pd}}; update the documentation for the input.pd/help
topic by replacing those broken link forms with plain code formatting (e.g.,
\code{raw.pd} and \code{annotation.pd}) or, if equivalents are documented in
MSstatsConvert, change to package-qualified links (e.g.,
\link[MSstatsConvert]{raw.pd}) so the cross-references resolve; make this change
where PDtoMSstatsTMTFormat and input.pd are referenced to remove the broken
links.
---
Nitpick comments:
In @.gitignore:
- Line 9: Remove the duplicate ignore rule for the pattern "*.Rproj" in
.gitignore: locate the duplicate entry (the second occurrence matching
"*.Rproj") and delete it so the pattern only appears once, leaving the original
"*.Rproj" entry intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2a9c08f1-a0a5-4491-9106-1ffb79fa7b55
⛔ Files ignored due to path filters (2)
inst/raw_data/Philosopher/MSstatsTMT_annotation.csvis excluded by!**/*.csvinst/raw_data/Philosopher/msstats.csvis excluded by!**/*.csv
📒 Files selected for processing (34)
.gitignoreDESCRIPTIONMSstatsTMT.RprojNAMESPACER/converters.RR/utils_MSstatsTMT.RR/utils_docs.Rdata/annotation.mine.rdadata/annotation.mq.rdadata/annotation.pd.rdadata/datalistdata/evidence.rdadata/proteinGroups.rdadata/raw.mine.rdadata/raw.om.rdadata/raw.pd.rdaman/MaxQtoMSstatsTMTFormat.Rdman/OpenMStoMSstatsTMTFormat.Rdman/PDtoMSstatsTMTFormat.Rdman/PhilosophertoMSstatsTMTFormat.Rdman/SpectroMinetoMSstatsTMTFormat.Rdman/annotation.mine.Rdman/annotation.mq.Rdman/annotation.pd.Rdman/dot-documentFunction.Rdman/dot-getPhilosopherInput.Rdman/evidence.Rdman/proteinGroups.Rdman/raw.mine.Rdman/raw.om.Rdman/raw.pd.Rdman/reexports.Rdtests/testthat/test-converters.Rvignettes/MSstatsTMT.Rmd
💤 Files with no reviewable changes (16)
- MSstatsTMT.Rproj
- man/MaxQtoMSstatsTMTFormat.Rd
- man/PDtoMSstatsTMTFormat.Rd
- man/dot-getPhilosopherInput.Rd
- man/raw.pd.Rd
- man/annotation.mine.Rd
- man/proteinGroups.Rd
- man/PhilosophertoMSstatsTMTFormat.Rd
- tests/testthat/test-converters.R
- man/annotation.mq.Rd
- man/raw.mine.Rd
- man/raw.om.Rd
- man/annotation.pd.Rd
- man/OpenMStoMSstatsTMTFormat.Rd
- man/evidence.Rd
- man/SpectroMinetoMSstatsTMTFormat.Rd
Motivation and Context
This PR refactors the MSstatsTMT package to consolidate converter implementations by delegating to the upstream
MSstatsConvertpackage. Previously, MSstatsTMT maintained local implementations of five converter functions (MaxQtoMSstatsTMTFormat,OpenMStoMSstatsTMTFormat,PDtoMSstatsTMTFormat,SpectroMinetoMSstatsTMTFormat, andPhilosophertoMSstatsTMTFormat) with wrapper logic for logging, preprocessing, and data handling. This refactoring removes the local implementations and establishes MSstatsTMT as a re-exporting interface to the canonical converter implementations in MSstatsConvert, reducing code duplication and maintenance burden while ensuring a single source of truth for data conversion logic.Changes
Converter Implementation Changes
@importFromroxygen directives. Removed associated helper functions (.documentFunction,.getPhilosopherInput) and all wrapper logic for logging, preprocessing, and data handlingMaxQtoMSstatsTMTFormat,OpenMStoMSstatsTMTFormat,PDtoMSstatsTMTFormat,SpectroMinetoMSstatsTMTFormat,PhilosophertoMSstatsTMTFormatMSstatsConvert::<function>implementationsDocumentation Refactoring
man/MaxQtoMSstatsTMTFormat.Rd,man/OpenMStoMSstatsTMTFormat.Rd,man/PDtoMSstatsTMTFormat.Rd,man/SpectroMinetoMSstatsTMTFormat.Rd,man/PhilosophertoMSstatsTMTFormat.Rd(now provided by MSstatsConvert package)man/evidence.Rd,man/proteinGroups.Rd,man/raw.pd.Rd,man/raw.mine.Rd,man/annotation.pd.Rd,man/annotation.mq.Rd,man/annotation.mine.Rd,man/raw.om.Rdman/reexports.Rddocuments the imported converter functions with cross-references to MSstatsConvert sourceman/dot-documentFunction.Rdupdated to referenceR/utils_docs.Rsource location instead ofR/converters.R; removed argument documentation for.documentFunctionman/dot-getPhilosopherInput.RdUtility Functions
.documentFunction()for roxygen documentation parameter inheritance. Function implements parameter documentation forfewMeasurements,useUniquePeptide,summaryforMultipleRows,removeProtein_with1Feature,removeProtein_with1Protein,removeOxidationMpeptides, andremoveMpeptides, but returnsNULLwith no operational logicNamespace and Dependencies
MSstatsBalancedDesign,MSstatsClean,MSstatsImport,MSstatsLogsSettings,MSstatsMakeAnnotation,MSstatsPreprocess)Data Management
annotatio.mine,annotation.mq,annotation.pd,evidence,proteinGroups,raw.mine,raw.pd,raw.om,input.pd), retaining onlytest.pairwise,input.pd, andquant.pd.msstatsVignette Updates
data.table::fread(system.file(..., package="MSstatsConvert"))instead of relying on bundled example datasetsraw.pdandannotation.pddirectly viaPDtoMSstatsTMTFormat()evidence,proteinGroups, andannotation.mqdirectly viaMaxQtoMSstatsTMTFormat()raw.mineandannotation.minedirectly viaSpectroMinetoMSstatsTMTFormat()raw.omdirectly viaOpenMStoMSstatsTMTFormat()msstats.csvandMSstatsTMT_annotation.csvdirectly viaPhilosophertoMSstatsTMTFormat()dataProcessPlotsTMTexample fromP04406toO75947head()call reference toquant.msstats$ProteinLevelDataMetadata and Configuration
7.3.2to7.3.3(reflects roxygen2 version used for documentation generation).Rprojpattern entry (resulting in redundant ignore rule with existing*.Rprojentry)Testing
tests/testthat/test-converters.R: Entire test file removed (41 lines deleted). Deleted test suites included:
PDtoMSstatsTMTFormaterror condition testsMaxQtoMSstatsTMTFormaterror condition testsSpectroMinetoMSstatsTMTFormaterror condition testsPhilosophertoMSstatsTMTFormaterror condition tests and functional integration test validating output has 550 rowsNote: Converter tests have been consolidated in the upstream MSstatsConvert package rather than duplicated in MSstatsTMT. This represents a shift in testing responsibility to the dependency package.
Coding Guidelines
No violations of standard R package coding guidelines are present. The refactoring properly uses roxygen directives (
@importFrom,@export) and follows R conventions for re-exporting functions from dependencies. However, two minor issues exist:.Rprojpattern duplicates the existing*.Rprojpattern without adding new functionalityMotivation and Solution
This refactoring consolidates duplicate converter implementations by delegating to the upstream
MSstatsConvertpackage. Five converter functions—MaxQtoMSstatsTMTFormat,OpenMStoMSstatsTMTFormat,PDtoMSstatsTMTFormat,SpectroMinetoMSstatsTMTFormat, andPhilosophertoMSstatsTMTFormat—are converted from local implementations with wrapper logic (logging, validation, preprocessing) to simple re-exports that directly reference their counterparts inMSstatsConvert. This eliminates code duplication, reduces maintenance burden, and centralizes conversion logic in a single package. Supporting infrastructure (documentation, tests, example datasets, and helper functions) are removed or relocated accordingly.Detailed Changes
Core Converter Functions:
R/converters.Rreduced from ~426 lines to ~18 lines by replacing function implementations with@importFromdirectives and re-export stubsMSstatsConvertimplementations; local wrapper logic (logging, preprocessing, renaming) removed.documentFunction()and.getPhilosopherInput()removed fromR/converters.RNamespace and Imports:
NAMESPACEupdated to import five converter functions fromMSstatsConvertand removed imports forMSstatsBalancedDesign,MSstatsClean,MSstatsImport,MSstatsLogsSettings,MSstatsMakeAnnotation, andMSstatsPreprocessDocumentation:
MaxQtoMSstatsTMTFormat.Rd,OpenMStoMSstatsTMTFormat.Rd,PDtoMSstatsTMTFormat.Rd,SpectroMinetoMSstatsTMTFormat.Rd,PhilosophertoMSstatsTMTFormat.Rd)evidence.Rd,proteinGroups.Rd,raw.pd.Rd,raw.mine.Rd,raw.om.Rd,annotation.pd.Rd,annotation.mq.Rd,annotation.mine.Rd)man/dot-getPhilosopherInput.Rd(helper function documentation)man/dot-documentFunction.Rdto reference new locationR/utils_docs.Rand simplified usage signatureman/reexports.Rdto document imported converters fromMSstatsConvertman/input.pd.Rddescription to reflect it as example dataset rather than converter outputNew/Modified Utility Files:
R/utils_docs.Rcontaining internal.documentFunction()placeholder for roxygen parameter inheritance (returnsNULL, no operational logic)R/utils_MSstatsTMT.R(~227 lines removed)Example Data:
data/datalistreduced to retain onlytest.pairwise,input.pd, andquant.pd.msstats; removed bundled example datasets for Proteome Discoverer, MaxQuant, SpectroMine, and OpenMSVignette Updates:
vignettes/MSstatsTMT.Rmdrefactored to load example inputs fromMSstatsConvertviasystem.file()anddata.table::fread()instead of bundled datasetsP04406toO75947indataProcessPlotsTMTexamplehead()reference to usequant.msstats$ProteinLevelDatainstead ofquant.pd.msstats$ProteinLevelDataBuild Configuration:
MSstatsTMT.Rproj(RStudio project configuration file)DESCRIPTIONRoxygenNote from 7.3.2 to 7.3.3.Rprojignore pattern to.gitignore(duplicate of existing*.Rprojentry)Unit Tests
tests/testthat/test-converters.Rentirely, removing all local test coverage forPDtoMSstatsTMTFormat,MaxQtoMSstatsTMTFormat,SpectroMinetoMSstatsTMTFormat, andPhilosophertoMSstatsTMTFormatMSstatsConvertbut this transition lacks explicit documentation in changelogCoding Guideline Issues
.gitignorenow contains a redundant duplicate entry: added.Rprojalongside the pre-existing*.Rprojpattern, resulting in identical ignore behavior declared twiceMSstatsConvert