Summary
pulseengine/core/signals.py manually implements a keyword-pattern cache using a plain module-level dict:
_KW_PATTERN_CACHE: dict[str, re.Pattern] = {}
def _kw_re(kw: str) -> re.Pattern:
if kw not in _KW_PATTERN_CACHE:
...
_KW_PATTERN_CACHE[kw] = re.compile(prefix + escaped + suffix)
return _KW_PATTERN_CACHE[kw]
Python ships functools.lru_cache specifically for this pattern. Using it is idiomatic, thread-safe by Python's GIL for pure reads, and removes the need to manage the cache dict manually.
Proposed refactor
import functools
@functools.lru_cache(maxsize=None)
def _kw_re(kw: str) -> re.Pattern:
"""Return a compiled regex that matches *kw* as a whole token in lowercase text."""
escaped = re.escape(kw)
prefix = r'\b'
suffix = r'\b' if kw[-1].isalnum() else ''
return re.compile(prefix + escaped + suffix)
This also eliminates the _KW_PATTERN_CACHE module-level variable entirely, reducing the public namespace noise.
Notes
- The pre-warming loop (
for _kw_pairs in ASSET_KEYWORDS.values(): ...) still works unchanged—it just calls _kw_re() which populates the lru_cache internally
lru_cache with maxsize=None is equivalent to an unbounded dict cache but with built-in cache-info introspection (_kw_re.cache_info())
Summary
pulseengine/core/signals.pymanually implements a keyword-pattern cache using a plain module-leveldict:Python ships
functools.lru_cachespecifically for this pattern. Using it is idiomatic, thread-safe by Python's GIL for pure reads, and removes the need to manage the cache dict manually.Proposed refactor
This also eliminates the
_KW_PATTERN_CACHEmodule-level variable entirely, reducing the public namespace noise.Notes
for _kw_pairs in ASSET_KEYWORDS.values(): ...) still works unchanged—it just calls_kw_re()which populates thelru_cacheinternallylru_cachewithmaxsize=Noneis equivalent to an unbounded dict cache but with built-in cache-info introspection (_kw_re.cache_info())