KHR_gaussian_splatting: Editorial review#2567

Open
lexaknyazev wants to merge 1 commit into KhronosGroup:main from lexaknyazev:splats-update

Conversation

@lexaknyazev
Member

No description provided.

@lexaknyazev lexaknyazev requested review from javagl and weegeekps April 8, 2026 16:02
Contributor

@javagl javagl left a comment

I did not verify the math.
The inlined comments are mainly about typos or wording details.
Some of the comments may be subjective, and may not have to be addressed.

The most "important" one is probably about which attribute semantics are 'Required'. Iff we agreed on making them all required, then that's OK. But I think that it could make sense to point out that they are only required for the specific case (ellipse kernel) that is covered here.

@NorbertNopper-Huawei

Regarding the clamping discussion:
srgb_rec709_display suggesting to clamp(0, 1)

lin_rec709_display currently doing some research ...

@lexaknyazev
Member Author

> Regarding the clamping discussion: srgb_rec709_display suggesting to clamp(0, 1)
>
> lin_rec709_display currently doing some research ...

Clamping should not depend on the encoding differences between these two options because both of them are defined only for the $[0, 1]$ range.

Please see the current language at the end of the "Splat Lighting" section.

@NorbertNopper-Huawei

> > Regarding the clamping discussion: srgb_rec709_display suggesting to clamp(0, 1)
> > lin_rec709_display currently doing some research ...
>
> Clamping should not depend on the encoding differences between these two options because both of them are defined only for the $[0, 1]$ range.
>
> Please see the current language at the end of the "Splat Lighting" section.

The point is that display-referred data is in an image working space and needs a conversion to the final display format anyway. In the common case, usually sRGB to sRGB, we can clamp if really required. When going from sRGB to another display format, a conversion is required as well.

@NorbertNopper-Huawei

Furthermore, as we do have display referred data, we are working in the display image space:
https://www.pixelsham.com/2022/12/06/scene-referred-vs-display-referred-color-workflows/?utm_source=copilot.com

For scene referred and lighting, this will come in another extension. So, you can remove the lighting in this section, as this needs to be solved differently.

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 9, 2026

If one is really nitpicking, the display referred data is not allowed to be used in scene referred space.

@NorbertNopper-Huawei

flowchart TD
  A[lin_rec709_display] <--> B[encoding and decoding]
  B <--> C[srgb_rec709_display]
  C --> D[SRGB Display]
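For reference, the "encoding and decoding" step in the diagram corresponds to the standard sRGB transfer function. The following is a minimal sketch, assuming the IEC 61966-2-1 constants; the function names `srgb_encode` and `srgb_decode` are hypothetical and not taken from the extension text. Both directions are only meaningful for inputs in the $[0, 1]$ range.

```python
def srgb_encode(linear: float) -> float:
    """Map a linear value in [0, 1] to its sRGB-encoded value in [0, 1]."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055


def srgb_decode(encoded: float) -> float:
    """Map an sRGB-encoded value in [0, 1] back to a linear value in [0, 1]."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


# Round trip for a mid-range value (hypothetical sample input).
linear_value = 0.5
round_trip = srgb_decode(srgb_encode(linear_value))
```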

@lexaknyazev
Member Author

> So, you can remove the lighting in this section, as this needs to be solved differently.

It's not a real scene lighting; "Splat Lighting" there means computing the final color from SH values. This term was used in the previous spec draft so it's still there; feel free to propose a better option.

The whole clamping issue addresses a completely separate topic. Imagine a splat containing only SH0 values (for brevity) and the SH0 values are like (10.0, 20.0, 30.0). After applying the math, the computed diffuse color will be (3.32, 6.14, 8.96). The spec simply says that if the splat's color space is not defined for such values, they must be clamped, in this case to (1.0, 1.0, 1.0).
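The arithmetic in this example can be reproduced with a short sketch. It assumes the common 3DGS convention that the degree-0 color is `0.5 + SH_C0 * sh0` with `SH_C0 = 1 / (2 * sqrt(pi))`, which is consistent with the numbers quoted above; the helper names are hypothetical.

```python
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi)), degree-0 SH basis constant


def sh0_to_color(sh0):
    """Diffuse color from degree-0 coefficients only (assumed 0.5 offset)."""
    return tuple(0.5 + SH_C0 * c for c in sh0)


def clamp01(color):
    """Clamp every component to the [0, 1] range."""
    return tuple(min(max(c, 0.0), 1.0) for c in color)


raw = sh0_to_color((10.0, 20.0, 30.0))
print([round(c, 2) for c in raw])  # [3.32, 6.14, 8.96]
print(clamp01(raw))                # (1.0, 1.0, 1.0)
```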

@NorbertNopper-Huawei

> > So, you can remove the lighting in this section, as this needs to be solved differently.
>
> It's not a real scene lighting; "Splat Lighting" there means computing the final color from SH values. This term was used in the previous spec draft so it's still there; feel free to propose a better option.
>
> The whole clamping issue addresses a completely separate topic. Imagine a splat containing only SH0 values (for brevity) and the SH0 values are like (10.0, 20.0, 30.0). After applying the math, the computed diffuse color will be (3.32, 6.14, 8.96). The spec simply says that if the splat's color space is not defined for such values, they must be clamped, in this case to (1.0, 1.0, 1.0).

This can not happen, as the values were trained in a given range.

@NorbertNopper-Huawei

It cannot happen for sRGB, and if the data was trained for linear, please look at the diagram above.

@lexaknyazev
Member Author

lexaknyazev commented Apr 9, 2026

> This can not happen, as the values were trained in a given range.

Training process is irrelevant. The spec describes what to do with input data and there are only three possible options:

  1. Say that out-of-range values are invalid and MUST be rejected (very expensive to implement).
  2. Say that out-of-range values produce undefined rendering (will lead to portability issues).
  3. Say that out-of-range values MUST be clamped (the safest option).

@NorbertNopper-Huawei

> > This can not happen, as the values were trained in a given range.
>
> Training process is irrelevant. The spec describes what to do with input data and there are only three possible options:
>
>   1. Say that out-of-range data is invalid and MUST be rejected (very expensive to implement).
>   2. Say that out-of-range data produces undefined rendering (will lead to portability issues).
>   3. Say that out-of-range data MUST be clamped (the safest option).

Training process is relevant, as we are reproducing input images.

@NorbertNopper-Huawei

We are doing a glTF extension for 3D Gaussian Splatting, so training is relevant.

@javagl
Contributor

javagl commented Apr 9, 2026

None of the test data sets in #2562 was created with a training process. The spec is supposed to say how input data is handled, regardless where and how it has been created.

@NorbertNopper-Huawei

> None of the test data sets in #2562 was created with a training process. The spec is supposed to say how input data is handled, regardless where and how it has been created.

Then ignore these test assets except the real world data ones, also shared there.

@NorbertNopper-Huawei

We are doing a 3D Gaussian Splatting glTF extension related to the paper and not splat everything.

@NorbertNopper-Huawei

So, if one is creating 3D Gaussian Splatting data manually, one has to take the training algorithm into account.

@javagl
Contributor

javagl commented Apr 9, 2026

I'll just point to #2490 (comment) , so that you can discuss this with the person who wrote that comment, and assume that you're OK with clamping the resulting values.

@NorbertNopper-Huawei

Maybe I was wrong at that point of time and learned a little bit.

@NorbertNopper-Huawei

image

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 9, 2026

So, for display-referred data and the current color spaces srgb_rec709_display and lin_rec709_display in the given glTF extension, we use clamp(0, 1).
Other colorSpace values would come with a future glTF extension, as usual.

@NorbertNopper-Huawei

I implemented something similar from scratch a few years ago at UX3D to understand the color conversion pipeline. Maybe there is a bug in that code.
The linear case is not yet implemented in the shared 3DGS viewer.

@lexaknyazev
Member Author

After further review, the clamping issue seems far from just nitpicking. Not specifying this step may lead to severe output differences when using alpha-premultiplied blending. Consider the following input:

  • srgb_rec709_display
  • Two identical splats in the same place
    • SH0: (2, 4, 7), therefore the initial color value is (1.06, 1.63, 2.47)
    • Opacity: 0.5
  • Consider only the pixel corresponding to the mean point

Clamping the SH sum

  1. Color value becomes (1.0, 1.0, 1.0)
  2. Fragment output alpha-premultiplied color is (0.5, 0.5, 0.5, 0.5) (already in range)
  3. Blending result is (0.75, 0.75, 0.75) (still in range)

Implicit clamping when rendering to a normalized color buffer

  1. Color value remains (1.06, 1.63, 2.47)
  2. Fragment shader alpha-premultiplied color is (0.53, 0.81, 1.24, 0.5)
  3. Actual fragment color after implicit clamping is (0.53, 0.81, 1.0, 0.5)
  4. Blending result is (0.8, 1.22, 1.5), further clamped to (0.8, 1.0, 1.0) on write
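Both paths can be replayed numerically. The sketch below is an illustration only: it assumes a black background, premultiplied "over" blending (source factor 1, destination factor 1 - source alpha), an unsigned normalized target that clamps both the fragment output and the written result, and the `0.5 + SH_C0 * sh0` color convention. All function names are hypothetical.

```python
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))


def clamp01(v):
    return tuple(min(max(c, 0.0), 1.0) for c in v)


def blend_over_unorm(dst, src_premul, src_alpha):
    """Premultiplied 'over' blend into an unsigned normalized color buffer."""
    src = clamp01(src_premul)  # implicit clamp of the fragment output
    out = tuple(s + d * (1.0 - src_alpha) for s, d in zip(src, dst))
    return clamp01(out)        # implicit clamp when the result is written


def render(color, alpha, count, clamp_sh_sum):
    """Blend `count` identical splats over a black background."""
    if clamp_sh_sum:
        color = clamp01(color)  # explicit clamp of the SH sum
    dst = (0.0, 0.0, 0.0)
    for _ in range(count):
        premul = tuple(c * alpha for c in color)
        dst = blend_over_unorm(dst, premul, alpha)
    return dst


color = tuple(0.5 + SH_C0 * c for c in (2.0, 4.0, 7.0))
explicit = render(color, 0.5, 2, clamp_sh_sum=True)
implicit = render(color, 0.5, 2, clamp_sh_sum=False)
print(explicit)                               # (0.75, 0.75, 0.75)
print(tuple(round(c, 2) for c in implicit))   # (0.8, 1.0, 1.0)
```

The two results differ only because of where the clamp happens; the input data and blend state are identical.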

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 9, 2026

For 3D Gaussian Splatting, this should not happen, because the original images' color values were between 0 and 1, so the sum has to be between 0 and 1. Something like 1.0000000001 I would still consider valid data. However, if the sum is, e.g., 2.0, the data is certainly incorrect.

@javagl
Contributor

javagl commented Apr 9, 2026

I took an implicit "TODO (when I'm really bored)" to check this, but maybe someone knows from the tip of their head:

The specification currently allows implementations to ignore the higher-degree spherical harmonics. Can there be a case where a color component becomes smaller by taking the higher-degree coefficients into account? Or as pseudocode: Can there be a case where
colorRed = computeFrom(sh0); yields 1.5 but
colorRed = computeFrom(sh0, sh1, sh2); would yield 0.99?

I think that this can be the case, given that the coefficients can be negative, and how they are mushed together in the shader, but I haven't confirmed.
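A constructed example suggests the answer is yes, already at degree 1. The sketch below assumes the commonly used 3DGS evaluation (0.5 offset, `SH_C0` and `SH_C1` basis constants, degree-1 terms ordered and signed as `-y, +z, -x`); that convention is an assumption here, but the conclusion does not depend on it because every higher-degree basis function takes both signs over the sphere. The coefficient values are made up to hit the numbers from the question.

```python
SH_C0 = 0.28209479177387814  # degree-0 basis constant
SH_C1 = 0.4886025119029199   # degree-1 basis constant


def eval_sh(sh0, sh1, direction, max_degree):
    """Evaluate one color channel up to max_degree (0 or 1) for a unit direction."""
    x, y, z = direction
    value = 0.5 + SH_C0 * sh0
    if max_degree >= 1:
        value += SH_C1 * (-y * sh1[0] + z * sh1[1] - x * sh1[2])
    return value


# Hypothetical coefficients for a single channel.
sh0 = 1.0 / SH_C0                 # degree 0 alone contributes +1.0
sh1 = (0.0, -0.51 / SH_C1, 0.0)   # negative coefficient on the z term
view = (0.0, 0.0, 1.0)            # unit direction along +z

deg0 = eval_sh(sh0, sh1, view, max_degree=0)  # approximately 1.5
deg1 = eval_sh(sh0, sh1, view, max_degree=1)  # approximately 0.99
```

So an implementation that ignores higher degrees would clamp this channel to 1.0, while one that evaluates degree 1 would not need to clamp it at all for this view direction.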

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 9, 2026

> I took an implicit "TODO (when I'm really bored)" to check this, but maybe someone knows from the tip of their head:
>
> The specification currently allows implementations to ignore the higher-degree spherical harmonics. Can there be a case where a color component becomes smaller by taking the higher-degree coefficients into account? Or as pseudocode: Can there be a case where colorRed = computeFrom(sh0); yields 1.5 but colorRed = computeFrom(sh0, sh1, sh2); would yield 0.99?
>
> I think that this can be the case, given that the coefficients can be negative, and how they are mushed together in the shader, but I haven't confirmed.

You can try it out e.g. in the https://superspl.at/editor
There you can change the SH bands during runtime.

Sorry, do not want to be impolite. Let me check. Or what do you not understand when asking AI?

@lexaknyazev
Member Author

@NorbertNopper-Huawei I think we have agreed on clamping in principle already. The final confirmation is where exactly it happens:

  1. On the SH sum before applying alpha
  2. On the SH sum after applying alpha

If we do (1), then (2) is not needed. If we do only (2), then blending would have to be done with alpha-premultiplied values to avoid very convoluted workarounds. Note that (2) technically allows premultiplied color values exceeding the corresponding alpha values, which is generally incorrect but that's what some viewers do today, maybe by mistake.
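The difference between the two placements shows up directly in the fragment output. This is a minimal sketch with hypothetical function names, reusing the rounded color from the earlier two-splat example; it only checks whether any premultiplied color component exceeds alpha.

```python
def clamp01(v):
    return tuple(min(max(c, 0.0), 1.0) for c in v)


def fragment_option_1(sh_sum, alpha):
    """Option (1): clamp the SH sum, then multiply by alpha."""
    return tuple(c * alpha for c in clamp01(sh_sum))


def fragment_option_2(sh_sum, alpha):
    """Option (2): multiply by alpha, then clamp the premultiplied color."""
    return clamp01(tuple(c * alpha for c in sh_sum))


def exceeds_alpha(premul_rgb, alpha):
    """True if any premultiplied component is larger than alpha."""
    return any(c > alpha for c in premul_rgb)


sh_sum = (1.06, 1.63, 2.47)
alpha = 0.5
rgb_1 = fragment_option_1(sh_sum, alpha)  # (0.5, 0.5, 0.5)
rgb_2 = fragment_option_2(sh_sum, alpha)  # last component clamps to 1.0

print(exceeds_alpha(rgb_1, alpha))  # False
print(exceeds_alpha(rgb_2, alpha))  # True
```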

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 9, 2026

@lexaknyazev What do you suggest? I am wondering why these questions pop up only now.

I do not know a solution at the moment; do you? I will try to provide one as well. Looking at my code.

@NorbertNopper-Huawei

@lexaknyazev Can you please share the links of these viewers, where this happens? Want to look at the code and understand.

@NorbertNopper-Huawei

I do not have the clamp in my code, let's look at the original paper, as this is the ground truth.

@NorbertNopper-Huawei

Let's ask my good old friend R2D2, ...

@NorbertNopper-Huawei

... it seems that in the original paper and implementation there is no clamp at all. So, a perfect pipeline would not need it. However, I suggest just tolerating the current implementations and providing a guide on how to do it correctly.

@NorbertNopper-Huawei

Of course, maybe someone else should look at this as well.

@lexaknyazev
Member Author

@NorbertNopper-Huawei Your viewer performs color clamping implicitly, like described in (2) above because:

  1. The SH sum itself is not clamped.
  2. The fragment output and the blending state use premultiplied alpha.
  3. The rendering surface is unsigned normalized.

You could try adding clamp to the end of the computeColor shader function to check what would happen if we standardize on (1).

@NorbertNopper-Huawei

@lexaknyazev The links to the viewers, please.
Also, please show me, where this happens in the original paper and code.

@NorbertNopper-Huawei

Of course, this is a glTF community decision and we will adapt after internal clarifications.

@lexaknyazev
Member Author

The original research paper provides both sides of the pipeline so it does not need to explicitly call out potential cross-vendor portability issues. Standards like glTF are generally one-sided in that sense.

By "your viewer" I mean the one currently located in the Khronos internal GitLab.

@NorbertNopper-Huawei

> @NorbertNopper-Huawei Your viewer performs color clamping implicitly, like described in (2) above because:

Yes, it would clamp but it does not, as the data itself is in the 0 to 1 range. This is why implementors of (1) and (2) should consider (0), the original paper implementation, where no clamping is done or required.

@NorbertNopper-Huawei

> ... Standards like glTF are generally one-sided in that sense.

Maybe both sides need to make changes in their implementations? That is also an option.

@NorbertNopper-Huawei

NorbertNopper-Huawei commented Apr 10, 2026

A compromise. Also, future proof for scene referred data.

@lexaknyazev
Member Author

> Yes, it would clamp but it does not, as the data itself is in the 0 to 1 range.

  • The viewer has no control over the input data.
  • The viewer (as it stands today) does clamp as in (2). To claim "no clamp", it must be changed to use a floating-point color buffer for blending (which may make rendering much slower).

@NorbertNopper-Huawei

> > Yes, it would clamp but it does not, as the data itself is in the 0 to 1 range.
>
>   • The viewer has no control over the input data.
>   • The viewer (as it stands today) does clamp as in (2). To claim "no clamp", it must be changed to use a floating-point color buffer for blending (which may make rendering much slower).

It is assumed, that the 3DGS data in the glTF is valid, of course. And again, the viewer does not clamp, as the values are not above 1 or below 0.

@lexaknyazev
Member Author

> It is assumed, that the 3DGS data in the glTF is valid, of course.

Unless we provide an exhaustive SH data validity definition, including free view direction selection and toggling of the higher degrees (see Marco's comments above), all SH data is considered as valid.

@NorbertNopper-Huawei

We could add a note saying that we are not clamping on purpose, as a) we rely on the original paper where no clamping happens and b) there are several approaches to handle this issue, not just clamping, and glTF does not want to mandate one. Also, this is a general problem of 3DGS data itself, not just of 3DGS in glTF.

@NorbertNopper-Huawei

If we had perfect data, it would not need to be clamped, as it would already be in the 0 to 1 range. However, as this cannot be guaranteed, there are several approaches to handle this issue, and clamping is one of them.
I was not aware of this. Now I am.

@lexaknyazev
Member Author

> we rely on the original paper where no clamping happens

This is not a valid argument. Research papers are not concerned with interoperability.

If you want to treat the reference rasterizer as the ground truth, then three rules should be followed:

  1. Negative SH sums must be clamped to zero before alpha multiplication (see the computeColorFromSH function in the forward pass).
  2. All rendering and blending must be done in a floating-point color buffer (because the reference rasterizer accumulates everything in floating-point variables).
  3. The very final color values should be clamped to 1.0 (because the image is saved to PNG).
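The three rules can be put together in a small CPU-side sketch. This is a simplified illustration of the rules as stated above, not a port of the reference rasterizer: it assumes front-to-back compositing with a running transmittance, and it omits the background term and any alpha thresholds. The function name and input layout are hypothetical.

```python
def composite_front_to_back(splats):
    """Composite (sh_sum_rgb, alpha) pairs, sorted front to back, for one pixel."""
    accum = [0.0, 0.0, 0.0]  # rule 2: accumulate in floating point, no clamping
    transmittance = 1.0
    for sh_sum, alpha in splats:
        color = [max(c, 0.0) for c in sh_sum]  # rule 1: clamp negatives to zero
        for i in range(3):
            accum[i] += color[i] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return tuple(min(c, 1.0) for c in accum)   # rule 3: final clamp to 1.0


# Two identical splats from the earlier example (rounded color values).
pixel = composite_front_to_back([((1.06, 1.63, 2.47), 0.5)] * 2)

# A single opaque splat with a negative red SH sum.
negative = composite_front_to_back([((-1.0, 0.2, 0.4), 1.0)])
```

With these rules the two-splat pixel ends up at roughly (0.795, 1.0, 1.0), and the negative red component is zeroed before it can darken the accumulated color.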

@NorbertNopper-Huawei

> > we rely on the original paper where no clamping happens
>
> This is not a valid argument. Research papers are not concerned with interoperability.
>
> If you want to treat the reference rasterizer as the ground truth, then three rules should be followed:
>
>   1. Negative SH sums must be clamped to zero before alpha multiplication (see the computeColorFromSH function in the forward pass).
>   2. All rendering and blending must be done in a floating-point color buffer (because the reference rasterizer accumulates everything in floating-point variables).
>   3. The very final color values should be clamped to 1.0 (because the image is saved to PNG).

Then please make a proposal for how you would do it; I trust that it will be accepted by everyone.

@lexaknyazev lexaknyazev force-pushed the splats-update branch 2 times, most recently from dcbb237 to 7827abe Compare April 14, 2026 13:55

5 participants