Conversation
…tebooks 1 and 2a (2b yet to do), added it to optimization and played with it a long time, still toying, improved the way imports are done in various __init__.py with the else clause, which I didn't know about before
pynumdiff/optimize/_optimize.py
Outdated
      {'q': (1e-10, 1e10),
-     'r': (1e-10, 1e10)})
+     'r': (1e-10, 1e10)}),
+ robustdiff: ({'order': {1, 2, 3}, # categorical
Might also need `huberM` to be a hyperparameter. Maybe discrete, because we want to be able to hit 0 exactly.
pynumdiff/tests/test_diff_methods.py
Outdated
- (lineardiff, {'order':3, 'gamma':5, 'window_size':11, 'solver':'CLARABEL'}),
+ (lineardiff, [3, 5, 11], {'solver':'CLARABEL'}),
  (rbfdiff, {'sigma':0.5, 'lmbd':0.001})
  ]
+ diff_methods_and_params = [(robustdiff, {'order':3, 'qr_ratio':1e6})]
Gotta remove this in the next commit. Should test all of the methods.
…the CI version is arriving at a slightly different answer
  from warnings import filterwarnings, warn
- from multiprocessing import Pool
+ from multiprocessing import Pool, Manager
+ from hashlib import sha1
Apparently calling hash() on identical objects in different processes isn't guaranteed to return the same value, so I had to get fancier.
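A minimal sketch of the idea (the helper name and inputs are illustrative, not the PR's exact code): Python's builtin `hash()` is salted per process for `str` and `bytes` (see `PYTHONHASHSEED`), so identical keys can hash differently in different workers, whereas a `sha1` digest of the same bytes is always identical, in any process, on any machine.

```python
from hashlib import sha1

def stable_key(point, categorical_params):
    """Build a digest that is identical across worker processes.

    Unlike hash(), which is salted per process for str/bytes,
    sha1 always maps the same bytes to the same digest.
    """
    text = ''.join(f"{v:.3e}" for v in point)
    text += ''.join(str(v) for _, v in sorted(categorical_params.items()))
    return sha1(text.encode()).digest()

k1 = stable_key([0.1, 2.0], {'order': 3})
k2 = stable_key([0.1, 2.0], {'order': 3})
assert k1 == k2  # deterministic, so it works as a shared cache key
```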
  # Map from method -> (search_space, bounds_low_hi)
  method_params_and_bounds = {
      spectraldiff: ({'even_extension': {True, False}, # give categorical params in a set
I changed the order of a bunch of listings around this PR, to better match the taxonomy paper and readme.
  """
  point_params = {k:(v if search_space_types[k] == float else
      int(np.round(v))) for k,v in zip(search_space_types, point)} # point -> dict
+ key = sha1((''.join(f"{v:.3e}" for v in point) + # This hash is stable across processes. Takes bytes
Hashing unhashable types reliably was more pain than I expected. It's still not totally perfect, because scientific notation isn't a totally reliable way to detect that we've already queried somewhere very nearby. For instance, 0.000e+00 and 1.000e-05 look very different as strings even though the points are nearly identical, while two values that both format to 1.100e+03 can be hiding a difference much greater than 1e-5.
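That caveat is easy to check against the `:.3e` formatting itself (a toy demonstration, not code from the PR):

```python
fmt = lambda v: f"{v:.3e}"  # the same formatting the cache key builds on

# Near-identical points can get different keys...
assert fmt(0.0) != fmt(1e-5)        # '0.000e+00' vs '1.000e-05'

# ...while points far apart can collide on the same key.
assert fmt(1100.1) == fmt(1100.4)   # both '1.100e+03', yet they differ by 0.3
```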
      int(np.round(v))) for k,v in zip(search_space_types, point)} # point -> dict
  key = sha1((''.join(f"{v:.3e}" for v in point) + # This hash is stable across processes. Takes bytes
      ''.join(str(v) for k,v in sorted(categorical_params.items()))).encode()).digest()
+ if key in cache: return cache[key] # short circuit if this hyperparam combo has already been queried, ~10% savings per #160
I'm not sure I actually need to store and return the value. Could maybe just shortcircuit by returning NaN, since there's a nanargmin at the end. But storing and returning a number is cheap.
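The two options read roughly like this (a sketch with made-up names, using a plain dict in place of the shared `Manager` dict):

```python
import math

cache = {}  # stands in for the Manager().dict() shared between workers

def cached_obj(key, expensive_eval):
    """Variant the PR uses: store the value and return it on repeat queries."""
    if key in cache:
        return cache[key]  # short circuit; repeats are free
    cache[key] = expensive_eval()
    return cache[key]

def nan_obj(key, expensive_eval):
    """Alternative floated above: return NaN for repeats, leaning on the
    nanargmin at the end to ignore it. Nothing needs to be looked up,
    but the optimizer sees NaN mid-run instead of a real value."""
    if key in cache:
        return math.nan
    cache[key] = expensive_eval()
    return cache[key]

first = cached_obj('combo1', lambda: 2.0)
second = cached_obj('combo1', lambda: 99.0)  # this lambda never runs
assert (first, second) == (2.0, 2.0)
assert math.isnan(nan_obj('combo1', lambda: 99.0))
```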
- params, bounds = method_params_and_bounds[func]
- params.update(search_space_updates) # for things not given, use defaults
+ search_space, bounds = method_params_and_bounds[func]
+ search_space.update(search_space_updates) # for things not given, use defaults
Renamed this. It's not exactly the search space. It's a bunch of directions about how to make starting conditions. But it's close in spirit.
  _minimize = partial(scipy.optimize.minimize, _obj_fun, method=opt_method, bounds=bounds, options={'maxiter':maxiter})
  results += pool.map(_minimize, starting_points) # returns a bunch of OptimizeResult objects
+ with Manager() as manager:
+     cache = manager.dict() # cache answers to avoid expensive repeat queries
What a great object somebody made. Super easy to use.
Added `robustdiff` to `kalman_smooth`, added tests for it, added it to notebooks 1 and 2a (2b yet to do), added it to optimization and played with it a long time, still toying; improved the way imports are done in various `__init__.py` files with the else clause, which I didn't know about before.

Still need to:
- add `robustdiff` to, and rerun, the 2b notebook
- add `robustdiff` to `suggest_method` and rerun notebook 3
- try `robustdiff` in notebook 4 to see whether this thing lives up to the name I've given it