Skip to content

⚡ Bolt: Use CSafeLoader for faster YAML parsing#564

Open
aafre wants to merge 1 commit into
mainfrom
bolt-fast-yaml-13112542513810214310
Open

⚡ Bolt: Use CSafeLoader for faster YAML parsing#564
aafre wants to merge 1 commit into
mainfrom
bolt-fast-yaml-13112542513810214310

Conversation

@aafre
Copy link
Copy Markdown
Owner

@aafre aafre commented May 31, 2026

💡 What: The optimization implemented is the introduction of a fast_yaml_load utility in utils/yaml_converter.py that leverages PyYAML's CSafeLoader (which uses libyaml C bindings) when available, falling back to the standard SafeLoader. This utility replaces pure Python yaml.safe_load across the entire application (in app.py, resume_generator.py, scripts, and tests).

🎯 Why: The codebase frequently loads large YAML templates and configurations. PyYAML's default yaml.safe_load is a pure Python parser and performs significantly slower than its C-based counterpart. Switching to CSafeLoader drastically reduces the parsing overhead for I/O bound tasks and API responses that rely on translating these files.

📊 Impact: Expected performance improvement is a roughly ~9x speedup in YAML parsing times. Based on a synthetic benchmark during exploration using a large dummy resume configuration, the loading time decreased from ~9.275s (Slow) to ~0.981s (Fast) for 100 iterations.

🔬 Measurement: How to verify the improvement: The performance gain can be measured by comparing the execution time of endpoints or background tasks that parse YAML templates or resume configurations (e.g., PDF generation latency or the /api/templates endpoint response times).


PR created automatically by Jules for task 13112542513810214310 started by @aafre

Co-authored-by: aafre <8656674+aafre@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a high-performance YAML loading utility, fast_yaml_load, in utils/yaml_converter.py that leverages CSafeLoader when available, falling back to SafeLoader. It replaces occurrences of yaml.safe_load with fast_yaml_load globally across the application, scripts, and test suites to optimize parsing performance. Widespread code formatting and import sorting have also been applied throughout the codebase. Feedback on the changes suggests moving the try...except block and Loader imports in fast_yaml_load to the module level to avoid the overhead of importing on every function call.

Comment thread utils/yaml_converter.py
Comment on lines +11 to +26
from typing import Any, Dict

import yaml
from typing import Dict, Any


def fast_yaml_load(stream):
"""
High-performance YAML loading utility.
Leverages CSafeLoader for significantly faster parsing when available,
falling back to SafeLoader if not.
"""
try:
from yaml import CSafeLoader as Loader
except ImportError:
from yaml import SafeLoader as Loader
return yaml.load(stream, Loader=Loader)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To maximize the performance of the fast_yaml_load utility, the try...except block and the Loader imports should be moved to the module level. Importing inside the function on every invocation introduces unnecessary overhead, which defeats some of the performance benefits of using CSafeLoader.

Suggested change
from typing import Any, Dict
import yaml
from typing import Dict, Any
def fast_yaml_load(stream):
"""
High-performance YAML loading utility.
Leverages CSafeLoader for significantly faster parsing when available,
falling back to SafeLoader if not.
"""
try:
from yaml import CSafeLoader as Loader
except ImportError:
from yaml import SafeLoader as Loader
return yaml.load(stream, Loader=Loader)
from typing import Any, Dict
import yaml
try:
from yaml import CSafeLoader as Loader
except ImportError:
from yaml import SafeLoader as Loader
def fast_yaml_load(stream):
"""
High-performance YAML loading utility.
Leverages CSafeLoader for significantly faster parsing when available,
falling back to SafeLoader if not.
"""
return yaml.load(stream, Loader=Loader)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant