Update column_spec to not put special characters inside of regular expression #900

Closed
RockfordMankiniUCSD wants to merge 1 commit into haozhu233:master from RockfordMankiniUCSD:master

@RockfordMankiniUCSD

This PR fixes the error produced by knitting the following R Markdown document:

---
title: "Untitled"
output:
  pdf_document: default
date: "2025-05-15"
---
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(kableExtra)

col1 <- c("Wave 1", "Wave 2", "Wave 3", "Wave 4", "Wave 5")
col2 <- c("ID", "ID", "ID", "ID", "ID")
col3 <- c("This is a test.", "This is a test.", "This is a test.", "This is a test", "This is a test.")
df <- data.frame(col1, col2, col3)



kbl(df[,2:3], booktabs=T, align="lll", format = "latex", col.names = NULL,
           row.names = FALSE, longtable = T,  escape=F) %>%
  kable_styling(position = "left", full_width=FALSE) %>%
  column_spec(1, width = "6cm") %>%  
  pack_rows(index=table(fct_inorder(df$col1)), bold=F) %>%
  add_indent(1:nrow(df),level_of_indent = 2) %>%
  sub("\\\\toprule", "", .) %>%  sub("\\\\bottomrule", "", .)

Error:

! Misplaced \noalign.
\addlinespace ->\noalign 
                         {\ifnum 0=`}\fi \@ifnextchar [{\@addspace }{\@addsp...
l.129 \addlinespace
                   [0.3em]
I expect to see \noalign only after the \cr of
an alignment. Proceed, and I'll ignore this case.

Error: LaTeX failed to compile testrmarkdown.tex. See https://yihui.org/tinytex/r/#debugging for debugging tips. See testrmarkdown.log for more info.
Execution halted

Analysis:
See this snippet from column_spec_latex in column_spec.R:

  for (i in rows) {
    target_row <- table_info$contents[i]
    new_row <- latex_cell_builder(
      target_row, column, table_info,
      bold[i], italic[i], monospace[i], underline[i],
      strikeout[i], color[i], background[i], link[i], image[i]
      # font_size, angle
      )
    temp_sub <- ifelse(i == 1 & (table_info$tabular == "longtable" |
                                   !is.null(table_info$repeat_header_latex)),
                       gsub, sub)
    out <- temp_sub(target_row, new_row, out, perl = T)
    table_info$contents[i] <- new_row
  }

target_row is used as a regular expression to match against the output of kbl(). Because an unescaped . is a wildcard in sub(), the pattern built from a row ending in "test." also matches the 4th row, which has no trailing period. That row is then substituted by mistake, and the snippet generates broken LaTeX.
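
To make the false match concrete, here is a minimal sketch in base R. The two row strings are illustrative assumptions modelled on the table in this report, not values captured from kableExtra internals.

```r
# Illustrative rows (assumed, not taken from table_info): in R source,
# "\\\\" is the two-character LaTeX row terminator \\.
row_with_period    <- "ID & This is a test.\\\\"
row_without_period <- "ID & This is a test\\\\"

# A pattern built from the row text, with the dot left unescaped.
pattern <- "ID & This is a test."

# The dot matches any character, including the first backslash of the
# terminator, so the row WITHOUT a period matches as well.
grepl(pattern, row_without_period, perl = TRUE)  # TRUE

# sub() then consumes that backslash and writes a period in its place,
# leaving a single trailing backslash.
damaged <- sub(pattern, "ID & This is a test.", row_without_period, perl = TRUE)
cat(damaged, "\n")
```

The damaged string ends in a period followed by a single backslash, which is the same shape as the broken Wave 4 row in the output.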

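Here is the piped kbl() output from the code above. The Wave 4 row (ID & This is a test.\) has gained a period it should not have and lost one of its two trailing backslashes, and its \addlinespace[0.3em], \multicolumn{2}{l}{Wave 4} header and \hspace indentation are missing.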

\begin{longtable}[l]{>{\raggedright\arraybackslash}p{6cm}l}

\addlinespace[0.3em]
\multicolumn{2}{l}{Wave 1}\\
\hspace{2em}\hspace{1em}ID & This is a \vphantom{3} test.\\
\addlinespace[0.3em]
\multicolumn{2}{l}{Wave 2}\\
\hspace{2em}\hspace{1em}ID & This is a \vphantom{2} test.\\
\addlinespace[0.3em]
\multicolumn{2}{l}{Wave 3}\\
\hspace{2em}\hspace{1em}ID & This is a \vphantom{1} test.\\
ID & This is a test.\
\addlinespace[0.3em]
\multicolumn{2}{l}{Wave 5}\\
\hspace{2em}\hspace{1em}ID & This is a test.\\

\end{longtable}

Solution: escape the regex special characters in target_row before passing it to sub().
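
As a rough sketch of that idea (not the actual diff in this PR), the dots can be escaped at the call site. The helper name escape_dots is hypothetical and does not exist in kableExtra, and the row strings are illustrative assumptions.

```r
# Hypothetical helper: turn every literal "." into "\." so that it only
# matches an actual period. fixed = TRUE treats both arguments literally.
escape_dots <- function(x) gsub(".", "\\.", x, fixed = TRUE)

pattern <- escape_dots("ID & This is a test.")

# Illustrative rows (assumed): "\\\\" is the LaTeX row terminator \\.
row_without_period <- "ID & This is a test\\\\"
row_with_period    <- "ID & This is a test.\\\\"

# The row without a period is no longer matched ...
grepl(pattern, row_without_period, perl = TRUE)  # FALSE
# ... while the intended row still is.
grepl(pattern, row_with_period, perl = TRUE)     # TRUE
```

In column_spec_latex this would amount to wrapping target_row in such a helper before the temp_sub() call, assuming no other step has already escaped the dots.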

@dmurdoch
Collaborator

I think your mods look as though they would work, but what worries me is I don't understand why this is necessary here and not elsewhere. The contents of table_info$contents[i] have already been processed by regex_escape(..., double_backslash = TRUE), which looks like this:

regex_escape <- function(x, double_backslash = FALSE) {
  if (double_backslash) {
    x <- gsub("\\\\", "\\\\\\\\", x)
  }
  x <- gsub("\\$", "\\\\\\$", x)
  x <- gsub("\\(", "\\\\(", x)
  x <- gsub("\\)", "\\\\)", x)
  x <- gsub("\\[", "\\\\[", x)
  x <- gsub("\\]", "\\\\]", x)
  x <- gsub("\\{", "\\\\{", x)
  x <- gsub("\\}", "\\\\}", x)
  x <- gsub("\\*", "\\\\*", x)
  x <- gsub("\\+", "\\\\+", x)
  x <- gsub("\\?", "\\\\?", x)
  x <- gsub("\\|", "\\\\|", x)
  x <- gsub("\\^", "\\\\^", x)
  return(x)
}

If dots need escaping, shouldn't that function be modified to do it?
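
For illustration only, such a modification might look like the sketch below. The name regex_escape_with_dot is hypothetical, the single added line is an assumption about one possible change, and this is not necessarily what PR #911 or the package does.

```r
# Copy of the function quoted above with ONE hypothetical extra line for dots.
regex_escape_with_dot <- function(x, double_backslash = FALSE) {
  if (double_backslash) {
    x <- gsub("\\\\", "\\\\\\\\", x)
  }
  # Hypothetical addition. It has to run after the backslash doubling,
  # otherwise the backslash it inserts would itself be doubled.
  x <- gsub("\\.", "\\\\.", x)
  x <- gsub("\\$", "\\\\\\$", x)
  x <- gsub("\\(", "\\\\(", x)
  x <- gsub("\\)", "\\\\)", x)
  x <- gsub("\\[", "\\\\[", x)
  x <- gsub("\\]", "\\\\]", x)
  x <- gsub("\\{", "\\\\{", x)
  x <- gsub("\\}", "\\\\}", x)
  x <- gsub("\\*", "\\\\*", x)
  x <- gsub("\\+", "\\\\+", x)
  x <- gsub("\\?", "\\\\?", x)
  x <- gsub("\\|", "\\\\|", x)
  x <- gsub("\\^", "\\\\^", x)
  return(x)
}

# The escaped pattern no longer matches text where the period is absent.
p <- regex_escape_with_dot("This is a test.")
grepl(p, "This is a test\\\\", perl = TRUE)   # FALSE
grepl(p, "This is a test.\\\\", perl = TRUE)  # TRUE
```

A change at this level would affect every caller of regex_escape(), so it would need checking against the other places where the escaped contents are used.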

@dmurdoch
Collaborator

PR #911 (just merged) addresses this issue differently.

@dmurdoch closed this Jul 27, 2025