Provide ability for students to access clean PDFs of Quarto slides #2

Description

@caeyo

For future versions of this class, I'd like to suggest giving students access to clean PDFs of the Quarto slides that we switched to this semester. I appreciate that this slide format is easier for the professors to work with and lets students keep up with revisions to the material. However, I found it difficult to take notes the way I usually do, by annotating slides live (which I don't think is an uncommon practice), because Quarto's options for print output are limited. See below how the Reductions slideset gets mangled in PDF export mode; the junk appears on every slide, which makes the export unusable:

Screenshot 2024-12-11 at 8 09 39 PM

I recently wrote a script that uses a headless browser and PDF conversion tooling to screenshot the slideset one slide at a time and stitch the screenshots into a single PDF. I've provided the code below in case you'd like a baseline to work from. I didn't open a PR because I didn't want to impose a particular method of integration, and the code is fairly hacky. It requires playwright and img2pdf, both of which can be installed via pip (Playwright additionally needs a one-time "playwright install chromium" to download the browser).

import argparse
import os
import tempfile
from pathlib import Path

import img2pdf
from playwright.sync_api import sync_playwright


def capture_slides_and_create_pdf(url, output_pdf):
    # Screenshots live in a temporary directory that is deleted on exit,
    # even if one of the steps below fails.
    with tempfile.TemporaryDirectory() as tmp_dir:
        with sync_playwright() as p:
            browser = p.chromium.launch()
            page = browser.new_page()
            page.goto(url)

            # Wait for the Reveal.js deck that Quarto generates.
            page.wait_for_selector('.reveal .slides')
            total_slides = page.evaluate('() => Reveal.getTotalSlides()')

            screenshots = []
            for i in range(total_slides):
                # Start at the first slide, then advance once per iteration.
                page.evaluate('Reveal.slide(0)' if i == 0 else 'Reveal.next()')
                page.wait_for_timeout(100)  # let the transition settle

                # Show every fragment so the slide is captured in its final state.
                page.evaluate('''() => {
                    const currentSlide = Reveal.getCurrentSlide();
                    const fragments = currentSlide.querySelectorAll('.fragment');
                    fragments.forEach(fragment => fragment.classList.add('visible'));
                }''')
                page.wait_for_timeout(100)

                screenshot_path = os.path.join(tmp_dir, f'slide_{i+1}.png')
                page.screenshot(path=screenshot_path, full_page=True)
                screenshots.append(screenshot_path)

            browser.close()

        # Combine the screenshots, in slide order, into a single PDF.
        with open(output_pdf, "wb") as f:
            f.write(img2pdf.convert(screenshots))


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("input", help="HTML filename", type=os.path.abspath)
    parser.add_argument("output", help="PDF filename", type=os.path.abspath)
    args = parser.parse_args()
    # Path.as_uri() builds a valid file:// URL, escaping spaces and similar characters.
    capture_slides_and_create_pdf(Path(args.input).as_uri(), args.output)
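
Since the integration method is left open, here is a rough sketch of one way the script could be driven over a whole folder of rendered decks. It only plans the work: it pairs each HTML file with a file:// URL and an output PDF path. The helper name plan_conversions and the flat directory layout are assumptions for illustration, not anything taken from the course repository.

```python
from pathlib import Path


def plan_conversions(slides_dir, output_dir):
    """Pair each HTML deck in slides_dir with a file:// URL and a PDF path.

    Hypothetical helper: the name and the flat directory layout are
    assumptions made for this sketch.
    """
    slides_dir = Path(slides_dir).resolve()
    output_dir = Path(output_dir).resolve()
    jobs = []
    # Sort so the decks are processed in a stable, predictable order.
    for html_file in sorted(slides_dir.glob("*.html")):
        pdf_path = output_dir / (html_file.stem + ".pdf")
        # as_uri() needs an absolute path and escapes characters such as spaces.
        jobs.append((html_file.as_uri(), str(pdf_path)))
    return jobs
```

Each (url, pdf_path) pair could then be passed to capture_slides_and_create_pdf(url, pdf_path), after creating the output directory if it does not exist yet.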

Thanks for the enjoyable semester!
