Better forwarding to improve read performance a little by GregTheMadMonk · Pull Request #484 · getml/reflect-cpp

GregTheMadMonk · 2025-08-19T22:28:04Z

Hello!

This isn't really a significant change, but it helps speed up rfl::json::read<rfl::Generic> almost twice in the release build (at least in the benchmarked case).

Sorry for not using your benchmark suite here, I had problems getting the vcpkg dependencies properly compiled with GCC15 that ships with my system.

I also tried to look into improving the debug build performance of rfl::to_generic, but couldn't find anything to improve there, at least not as simple.

Benchmark results (on my PC, Ryzen 3600)

In release mode

Before:

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
read_struct               95.5 ns         94.4 ns      7332806
read_generic               645 ns          637 ns      1117257
write_struct               104 ns          103 ns      6884743
write_generic              144 ns          142 ns      4990100
to_generic                 115 ns          114 ns      6160175
to_generic_via_json        774 ns          765 ns       932490

After:

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
read_struct               94.9 ns         93.7 ns      7444236
read_generic               367 ns          363 ns      1945679
write_struct               104 ns          103 ns      6845176
write_generic              132 ns          130 ns      5413376
to_generic                 119 ns          118 ns      5871778
to_generic_via_json        497 ns          491 ns      1442071

In debug mode

Before:

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
read_struct               1461 ns         1445 ns       493132
read_generic              9450 ns         9337 ns        74690
write_struct               657 ns          649 ns      1092616
write_generic             1083 ns         1072 ns       646018
to_generic                3980 ns         3937 ns       175912
to_generic_via_json      10213 ns        10119 ns        69557

After:

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
read_struct               1485 ns         1465 ns       483753
read_generic              7859 ns         7753 ns        91012
write_struct               665 ns          659 ns      1067324
write_generic             1073 ns         1063 ns       660308
to_generic                3925 ns         3899 ns       177444
to_generic_via_json       8641 ns         8539 ns        81955

Benchmark code

#include <benchmark/benchmark.h>

#include <string>

#include <rfl.hpp>
#include <rfl/json.hpp>

constexpr auto jsd = R"json({ "str": "hello", "i": 42 })json";

struct S { std::string str; int i; };

const S s_etalon{ .str = "hello", .i = 42 };
const rfl::Generic::Object g_etalon = [] () {
    rfl::Generic::Object ret;
    ret["str"] = "hello";
    ret["i"]   = 42;
    return ret;
} ();

using rfl::json::read;
using rfl::json::write;

void read_struct(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( read<S>(jsd).value() );
    }
} // <-- void read_struct(state)

void read_generic(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( read<rfl::Generic::Object>(jsd).value() );
    }
} // <-- void read_generic(state)

void write_struct(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( write(s_etalon) );
    }
} // <-- void write_struct(state)

void write_generic(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( write(g_etalon) );
    }
} // <-- void write_generic(state)

void to_generic(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( rfl::to_generic(s_etalon) );
    }
} // <-- void to_generic(state)

void to_generic_via_json(benchmark::State& state) {
    for (auto _ : state) {
        benchmark::DoNotOptimize( rfl::json::read<rfl::Generic::Object>(rfl::json::write(s_etalon)).value() );
    }
} // <-- void to_generic(state)

BENCHMARK(read_struct);
BENCHMARK(read_generic);
BENCHMARK(write_struct);
BENCHMARK(write_generic);
BENCHMARK(to_generic);
BENCHMARK(to_generic_via_json);

BENCHMARK_MAIN();

gemini-code-assist

Summary of Changes

Hello @GregTheMadMonk, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces targeted optimizations to the rfl::json library, primarily focusing on improving read performance by leveraging C++11 move semantics more effectively. The changes aim to reduce unnecessary data copying, particularly within the rfl::json::readrfl::Generic path, leading to notable speedups in release builds. Additionally, the PR enhances error handling efficiency and optimizes memory allocation in parsing routines.

Highlights

Performance Enhancement through Move Semantics: Optimized data flow by consistently applying std::move and std::forward in parsing and error handling logic, significantly reducing object copying and improving read operation performance, as demonstrated by benchmarks.
Improved Error Object Handling: The rfl::Error class now supports move construction and assignment, allowing for more efficient propagation and storage of error messages without redundant string copies.
Optimized Variant Parsing Memory: Pre-allocation of memory for error vectors in rfl::Variant and std::variant parsing paths using errors.reserve helps mitigate reallocations and improve performance when multiple parsing errors occur.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces several excellent performance improvements by leveraging move semantics and perfect forwarding more effectively across the parsing logic. The changes are well-implemented and the benchmark results clearly demonstrate significant speedups, especially for rfl::json::read<rfl::Generic>. I have found one potential issue regarding exception safety in a noexcept function that could lead to program termination under certain conditions. Please see the detailed comment for a suggested fix.

liuzicheng1987 · 2025-08-20T06:54:07Z

@GregTheMadMonk that sounds great, but some of the tests appear to be failing. Could you take a look?

GregTheMadMonk · 2025-08-20T08:06:37Z

@liuzicheng1987 if you're talking about the compilation errors (-Werror=maybe-uninitialized), then I'm honestly kind of out of ideas for what may cause it, and I also observed it with main too. I had to add -w to my CMAKE_CXX_FLAGS in order to compile tests locally.

The "offending" piece of code in NamedTupleParser.hpp:

  static std::optional<Error> read_object_or_array_with_default(
      const R& _r, const InputObjectOrArrayType& _obj_or_arr,
      NamedTupleType* _view) noexcept {
    std::vector<Error> errors;
    const auto reader = ViewReaderWithDefaultType(&_r, _view, &errors);
    std::optional<Error> err; // <-- declaration here
    if constexpr (_no_field_names) {
      err = _r.read_array(reader, _obj_or_arr);
    } else {
      err = _r.read_object(reader, _obj_or_arr);
    }
    if (err) {
      return err; // <-- -Werror=maybe-uninitialized - why?!
    }
    if (errors.size() != 0) {
      return to_single_error_message(errors);
    }
    return std::nullopt;
  }
};

(and a similar function above it) appears safe and straightforward to me... I tried to reproduce the error in a similar scenario and didn't succeed: https://godbolt.org/z/4scKPvnnM (this also didn't produce the same -Werror for me locally)

My only two ideas would be either a) GCC gets confused and it's a misfire of this warning or b) there is something in the particular implementations of .read_object/.read_array used that perform a maybe-uninitialized access to the data which GCC sees but doesn't report as the operation that set the optional into that potentially incorrect state

liuzicheng1987 · 2025-08-20T08:24:43Z

Ok. Could you try to just set err = std::nullopt when you first declare it?

GregTheMadMonk · 2025-08-20T08:38:10Z

It didn't help, sadly, for me at least. Same with replacing if (err) with if (err.has_value()):

liuzicheng1987 · 2025-08-24T23:05:57Z

@GregTheMadMonk , this should fix the issue. We'll wait for the tests to run through and then this is ready to be merged.

Thank you for the contribution!

GregTheMadMonk · 2025-08-25T07:40:14Z

Thank you!

How did you come to this fix? Or was that the warning system in GCC has actually misfired and you had to do it by trial and error?

liuzicheng1987 · 2025-09-01T22:55:20Z

@GregTheMadMonk I had figured that the problem only occurs where we have a mutable optional variable, moreover I recall having seen something like this before. So that's how I figured it out.

GregTheMadMonk added 2 commits August 19, 2025 13:14

Optimize error construction with rvalues&prealloc

f60fdfe

Perf: perfect forwarding improvements

1e111a2

gemini-code-assist Bot reviewed Aug 19, 2025

View reviewed changes

Declare err using const auto

6e6aff8

liuzicheng1987 merged commit dcc4c1d into getml:main Aug 24, 2025
133 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better forwarding to improve read performance a little#484

Better forwarding to improve read performance a little#484
liuzicheng1987 merged 3 commits into
getml:mainfrom
GregTheMadMonk:speedups

GregTheMadMonk commented Aug 19, 2025 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

liuzicheng1987 commented Aug 20, 2025

Uh oh!

GregTheMadMonk commented Aug 20, 2025

Uh oh!

liuzicheng1987 commented Aug 20, 2025

Uh oh!

GregTheMadMonk commented Aug 20, 2025

Uh oh!

liuzicheng1987 commented Aug 24, 2025

Uh oh!

Uh oh!

GregTheMadMonk commented Aug 25, 2025

Uh oh!

liuzicheng1987 commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

GregTheMadMonk commented Aug 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark results (on my PC, Ryzen 3600)

In release mode

In debug mode

Benchmark code

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

liuzicheng1987 commented Aug 20, 2025

Uh oh!

GregTheMadMonk commented Aug 20, 2025

Uh oh!

liuzicheng1987 commented Aug 20, 2025

Uh oh!

GregTheMadMonk commented Aug 20, 2025

Uh oh!

liuzicheng1987 commented Aug 24, 2025

Uh oh!

Uh oh!

GregTheMadMonk commented Aug 25, 2025

Uh oh!

liuzicheng1987 commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

GregTheMadMonk commented Aug 19, 2025 •

edited

Loading