Conversation
…tebooks 1 and 2a (2b yet to do), added it to optimization and played with it a long time, still toying, improved the way imports are done in various __init__.py with the else clause, which I didn't know about before
pynumdiff/optimize/_optimize.py
Outdated
      {'q': (1e-10, 1e10),
-     'r': (1e-10, 1e10)})
+     'r': (1e-10, 1e10)}),
+ robustdiff: ({'order': {1, 2, 3}, # categorical
Might also need `huberM` to be a hyperparameter. Maybe discrete, because we want to be able to hit 0 exactly.
pynumdiff/tests/test_diff_methods.py
Outdated
- (lineardiff, {'order':3, 'gamma':5, 'window_size':11, 'solver':'CLARABEL'}),
+ (lineardiff, [3, 5, 11], {'solver':'CLARABEL'}),
  (rbfdiff, {'sigma':0.5, 'lmbd':0.001})
  ]
+ diff_methods_and_params = [(robustdiff, {'order':3, 'qr_ratio':1e6})]
Gotta remove this in the next commit. Should test all of the methods.
…the CI version is arriving at a slightly different answer
  from warnings import filterwarnings, warn
- from multiprocessing import Pool
+ from multiprocessing import Pool, Manager
+ from hashlib import sha1
Apparently calling hash() on identical objects in different processes isn't guaranteed to return the same value, so I had to get fancier.
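A minimal sketch of the idea (the helper name and inputs are illustrative, not the PR's exact code): Python's builtin `hash()` is salted per process for `str` and `bytes` (see `PYTHONHASHSEED`), so identical keys can hash differently in different workers, whereas a `sha1` digest of the same bytes is always identical, in any process, on any machine.

```python
from hashlib import sha1

def stable_key(point, categorical_params):
    """Build a digest that is identical across worker processes.

    Unlike hash(), which is salted per process for str/bytes,
    sha1 always maps the same bytes to the same digest.
    """
    text = ''.join(f"{v:.3e}" for v in point)
    text += ''.join(str(v) for _, v in sorted(categorical_params.items()))
    return sha1(text.encode()).digest()

k1 = stable_key([0.1, 2.0], {'order': 3})
k2 = stable_key([0.1, 2.0], {'order': 3})
assert k1 == k2  # deterministic, so it works as a shared cache key
```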
  # Map from method -> (search_space, bounds_low_hi)
  method_params_and_bounds = {
      spectraldiff: ({'even_extension': {True, False}, # give categorical params in a set
I changed the order of a bunch of listings around this PR, to better match the taxonomy paper and readme.
  """
  point_params = {k:(v if search_space_types[k] == float else
      int(np.round(v))) for k,v in zip(search_space_types, point)} # point -> dict
+ key = sha1((''.join(f"{v:.3e}" for v in point) + # This hash is stable across processes. Takes bytes
Hashing unhashable types reliably was more pain than I expected. It's still not totally perfect, because scientific notation isn't a totally reliable way to detect that we've already queried somewhere very nearby. For instance, 0.000e+00 and 1.000e-05 look very different as strings even though the points are nearly identical, while two values that both format to 1.100e+03 can be hiding a difference much greater than 1e-5.
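That caveat is easy to check against the `:.3e` formatting itself (a toy demonstration, not code from the PR):

```python
fmt = lambda v: f"{v:.3e}"  # the same formatting the cache key builds on

# Near-identical points can get different keys...
assert fmt(0.0) != fmt(1e-5)        # '0.000e+00' vs '1.000e-05'

# ...while points far apart can collide on the same key.
assert fmt(1100.1) == fmt(1100.4)   # both '1.100e+03', yet they differ by 0.3
```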
      int(np.round(v))) for k,v in zip(search_space_types, point)} # point -> dict
  key = sha1((''.join(f"{v:.3e}" for v in point) + # This hash is stable across processes. Takes bytes
      ''.join(str(v) for k,v in sorted(categorical_params.items()))).encode()).digest()
+ if key in cache: return cache[key] # short circuit if this hyperparam combo has already been queried, ~10% savings per #160
I'm not sure I actually need to store and return the value. Could maybe just shortcircuit by returning NaN, since there's a nanargmin at the end. But storing and returning a number is cheap.
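The two options read roughly like this (a sketch with made-up names, using a plain dict in place of the shared `Manager` dict):

```python
import math

cache = {}  # stands in for the Manager().dict() shared between workers

def cached_obj(key, expensive_eval):
    """Variant the PR uses: store the value and return it on repeat queries."""
    if key in cache:
        return cache[key]  # short circuit; repeats are free
    cache[key] = expensive_eval()
    return cache[key]

def nan_obj(key, expensive_eval):
    """Alternative floated above: return NaN for repeats, leaning on the
    nanargmin at the end to ignore it. Nothing needs to be looked up,
    but the optimizer sees NaN mid-run instead of a real value."""
    if key in cache:
        return math.nan
    cache[key] = expensive_eval()
    return cache[key]

first = cached_obj('combo1', lambda: 2.0)
second = cached_obj('combo1', lambda: 99.0)  # this lambda never runs
assert (first, second) == (2.0, 2.0)
assert math.isnan(nan_obj('combo1', lambda: 99.0))
```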
- params, bounds = method_params_and_bounds[func]
- params.update(search_space_updates) # for things not given, use defaults
+ search_space, bounds = method_params_and_bounds[func]
+ search_space.update(search_space_updates) # for things not given, use defaults
Renamed this. It's not exactly the search space. It's a bunch of directions about how to make starting conditions. But it's close in spirit.
  _minimize = partial(scipy.optimize.minimize, _obj_fun, method=opt_method, bounds=bounds, options={'maxiter':maxiter})
  results += pool.map(_minimize, starting_points) # returns a bunch of OptimizeResult objects
+ with Manager() as manager:
+     cache = manager.dict() # cache answers to avoid expensive repeat queries
What a great object somebody made. Super easy to use.
Added `robustdiff` to `kalman_smooth`, added tests for it, added it to notebooks 1 and 2a (2b yet to do), added it to optimization and played with it a long time, still toying; improved the way imports are done in various `__init__.py` files with the else clause, which I didn't know about before.

Still need to:
- add `robustdiff` to, and rerun, the 2b notebook
- add `robustdiff` to `suggest_method` and rerun notebook 3
- try `robustdiff` in notebook 4 to see whether this thing lives up to the name I've given it