Skip to content

Conversation

@cstorey-monzo
Copy link

We had an internal incident where:

  • We a pre-slog.WithParams() style usage, passing around a context map to child functions
  • We did some shadow testing, running a primary and shadow implementation concurrently
  • One of the implementations would mutate the context map concurrently to it being iterated over in the other goroutine
  • This caused the process to abort with a fatal error

This occured because we took a reference to the passed in map, and stored in the context, and lazily merged all the context maps in the context chain into one on demand. So, we were storing a reference to a map was concurrently modified.

This makes that structurally impossible, by using the immutable maps from https://github.com/benbjohnson/immutable/ .

@cstorey-monzo cstorey-monzo changed the title Inc 2025 12 01 s bizops task search consumer lag immutable maps Use immutable maps internally to to slog for params Dec 2, 2025
@cstorey-monzo cstorey-monzo marked this pull request as draft December 2, 2025 10:38
@cstorey-monzo cstorey-monzo marked this pull request as ready for review December 2, 2025 11:40
tomcyk
tomcyk previously approved these changes Dec 3, 2025
params.go Outdated
// the new parameters will be merged with the existing set, with newer values taking
// precedence over older ones.
//
// It is not safe to modify the supplied map after passing it to WithParams.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cstorey-monzo this ^ makes me wonder if slog is behaving as expected and it's the service(s) that should really be updated. wdyt?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's fair, Hayden had the same challenge. I thinking about this more as a sort of insurance scheme, making it structurally impossible for this library to be a contributing factor in any concurrency bugs.

if node := paramNodeFromContext(parent); node != nil {
p = node.mergedParams
} else {
p = immutable.NewMap[string, string](nil)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably missing something obvious here, but does immutable give us any meaningful benefit compared to taking a copy of input by merging it into any existing params using standard maps?

I then wonder how useful allocating a new map in Params(...) is in reality. It seems unlikely, although definitely possible, someone would retrieve the map from the context and make changes to it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re "One of the implementations would mutate the context map concurrently to it being iterated over in the other goroutine", immutable's not going to help with that. If something is concurrently updating the input map, ranging over it here could still panic, right?

Copy link
Author

@cstorey-monzo cstorey-monzo Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but given the interface, there's not much we can practically do there, and this does mean that backtraces would lead to the callsite of slog.WithParams() instead of a random call to slog.Params(), which should hopefully make it easier to track these down.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backtraces would lead to the callsite of slog.WithParams()

Love this. Anything that makes races easier to debug is a good thing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But going back to #23 (comment), I can't see how immutable would help any more than taking a copy of inputs in some manner.

Copy link
Author

@cstorey-monzo cstorey-monzo Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the only way to update an immutable.Map is immutably, this prevents race conditions by guaranteeing that once added to the context, no values can be mutated in place. Conversely, relying on copying map[string]string instances relies on engineers remembering to copy when they need to (within slog, anyway).

Comment on lines +38 to +40
for k, v := range input {
p = p.Set(k, v)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use MapBuilder here?

Copy link
Author

@cstorey-monzo cstorey-monzo Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no way to create a builder from an existing map, without copying it.

Granted, given the map sizes involved (usually <10 items) and the max fanout per node is ~32, updates are practically a full copy, but hopefully, the change in interface makes safe usage less cumbersome.

func paramsToBuiltinMap(p params) map[string]string {
m := map[string]string{}

mergedParams params
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also effectively removes the caching for slog.Params(), I will happily put that back (I think I was just happy to get rid of the need for the mutex).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, we'd clone it in (n *paramNode) params() anyway.

@cstorey-monzo cstorey-monzo marked this pull request as draft December 9, 2025 10:51
@cstorey-monzo
Copy link
Author

We'll need to audit slog.WithParams() call sites before we merge this, I think, so I'm leaving that for the new year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants