Feature/generalized trial data #136
refs : (N, d)
comparisons : (N, d)
responses : (N,)
inputs : (N, s, d)
I suggest renaming s to K, as outlined in the issue. K is the conventional counting variable in statistics (e.g., k-means), whereas s is non-standard as a shape label and risks confusion with "seconds" or "samples". Uppercase R and C follow the same convention for the response and context dimensions.
"""
refs = jnp.asarray(data.refs)
comparisons = jnp.asarray(data.comparisons)
stimuli = jnp.asarray(data.stimuli)
likelihood.py currently extracts stimuli by hardcoded position (stimuli[:, 0, :], stimuli[:, 1, :]), so any TrialData where the ref is not in slot 0, or where K ≠ 2, will silently compute wrong likelihoods. Adding a stimulus_names field on TrialData (defaulting to an empty tuple) and a data.stimulus("ref") accessor would let the likelihood look up slots by name instead of by index. That makes it robust to different slot orderings and generalizable to K > 2 tasks. I can implement that.
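A rough sketch of what this suggestion could look like is below. Everything in it is an assumption made for illustration: the class name NamedTrialData is hypothetical, NumPy stands in for JAX, and only the stimulus_names field and the stimulus(name) accessor come from the comment above. It is not the PsyPhy implementation.

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass(frozen=True)
class NamedTrialData:
    """Hypothetical stand-in for TrialData (illustration only)."""

    stimuli: np.ndarray                   # shape (N, K, d)
    responses: np.ndarray                 # shape (N,) or (N, r)
    stimulus_names: Tuple[str, ...] = ()  # empty tuple means unnamed slots

    def __post_init__(self) -> None:
        # If names are given, there must be exactly one per stimulus slot.
        if self.stimulus_names and len(self.stimulus_names) != self.stimuli.shape[1]:
            raise ValueError("need exactly one name per stimulus slot")

    def stimulus(self, name: str) -> np.ndarray:
        """Return the (N, d) array stored in the slot labelled `name`."""
        if name not in self.stimulus_names:
            raise KeyError(f"unknown stimulus name: {name!r}")
        return self.stimuli[:, self.stimulus_names.index(name), :]


refs = np.zeros((4, 2))
comps = np.ones((4, 2))
# The comparison is deliberately placed in slot 0 to show that slot order
# no longer matters once the lookup is done by name.
data = NamedTrialData(
    stimuli=np.stack([comps, refs], axis=1),
    responses=np.zeros(4),
    stimulus_names=("comparison", "ref"),
)
ref_slot = data.stimulus("ref")
```

With a lookup like this, a likelihood written against data.stimulus("ref") would keep working if a task stored its reference in a different slot, and an unknown name fails loudly instead of returning the wrong array.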
This update adjusts the format of ResponseData and TrialData so that they generalize to other tasks. It is a minimal initial change that adjusts only what is necessary to update the data classes while maintaining previous PsyPhy functionality. It does NOT include updates to task structures that would allow for realistic use in any task besides OddityTask.
Overview
Previously, data structures assumed that all trials included a 2D "reference," a 2D "comparison," and a binary response. Now, data structures permit trials to include any number of stimuli of any dimension, and permit responses of any dimension (potentially non-binary), so long as dimensions are consistent across trials in a given data object.
Additionally, an optional "context" attribute was added to data structures to store trial-wide features that could vary between trials and could condition response probabilities.
Examples, tutorials, documentation, tests, I/O, and any related structures that rely on the implementation of these data classes were updated to reflect these changes. A new test file was added to cover these changes.
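To make the generalized layout concrete, here is a minimal sketch of how separate reference and comparison arrays map onto one stacked stimulus array. It uses np.stack as a stand-in for jnp.stack (the call takes the same arguments here); the variable names and sizes are illustrative assumptions rather than PsyPhy code.

```python
import numpy as np

n_trials, d = 5, 2
rng = np.random.default_rng(0)

# Old layout: one (N, d) array per stimulus role.
refs = rng.normal(size=(n_trials, d))
comps = rng.normal(size=(n_trials, d))

# New layout: a single (N, s, d) array, with the stimuli stacked along axis 1.
inputs = np.stack([refs, comps], axis=1)

# Slot 0 holds the references and slot 1 the comparisons.
first_slot = inputs[:, 0, :]
second_slot = inputs[:, 1, :]
```

Stacking along axis 1 keeps the trial index first, so adding a third stimulus only grows the middle axis and leaves N and d untouched.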
Core Change: Updating Data Formats
Previous
TrialData fields:
where N = number of trials and d = stimulus dimensions
Updated
TrialData fields:
where N = number of trials, s = number of stimuli, and d/r/c = stimulus/response/context dimensions
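The shape convention above can be sketched as a small validation helper. This is an assumption-laden illustration, not the check implemented in dataset.py: the function name check_trial_shapes is hypothetical, NumPy stands in for JAX, and responses are assumed to be two-dimensional (N, r) as in the description.

```python
from typing import Optional

import numpy as np


def check_trial_shapes(
    inputs: np.ndarray,
    responses: np.ndarray,
    context: Optional[np.ndarray] = None,
) -> int:
    """Validate (N, s, d) inputs, (N, r) responses, and optional (N, c) context.

    Hypothetical helper for illustration; returns the shared trial count N.
    """
    inputs = np.asarray(inputs)
    responses = np.asarray(responses)
    if inputs.ndim != 3:
        raise ValueError(f"inputs must be (N, s, d), got shape {inputs.shape}")
    if responses.ndim != 2:
        raise ValueError(f"responses must be (N, r), got shape {responses.shape}")
    n_trials = inputs.shape[0]
    if responses.shape[0] != n_trials:
        raise ValueError("inputs and responses disagree on N")
    if context is not None:
        context = np.asarray(context)
        if context.ndim != 2 or context.shape[0] != n_trials:
            raise ValueError(f"context must be (N, c) with N = {n_trials}")
    return n_trials


# 6 trials, 3 stimuli of dimension 2, one response value, 4 context features.
n = check_trial_shapes(np.zeros((6, 3, 2)), np.zeros((6, 1)), np.zeros((6, 4)))
```

Only N is compared across arrays; s, d, r, and c are free to differ between data objects as long as each array is internally consistent.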
Previous
ResponseData fields:
Updated
ResponseData fields:
ResponseData takes in inputs in the form list[tuple[Any, ...]]. Then, it converts them to list[np.ndarray]. This is done for several reasons:
- ResponseData. A stimulus can be represented as a list of features, a single scalar value, etc., and any number of stimuli can be grouped in a tuple to create the input.
- TrialBatch does not require significant changes. TrialBatch handles groups of stimuli in a tuple (previously this was (ref, comparison)). In this update it still handles stimuli in a tuple, now of the form (stimulus1, stimulus2, ...etc) for any number of stimuli. This consistency keeps the refactoring small: TrialBatch is called directly in other files (sobol.py and grid.py), and with this structure those files do not need to change.
- ResponseData converts to NumPy arrays under the hood for simplicity and so that the shape attribute (inputs.shape) can be used as a quick check for appropriate dimensionality.
How to update previous uses of each data class:
- TrialData(refs, comps, resps) -> TrialData(jnp.stack([refs, comps], axis=1), resps)
- ResponseData(refs, comps, resps) -> ResponseData((refs, comps), resps)
Catalogue of Specific Updates
dataset.py
- TrialData attribute change
- TrialData shape check change (wants (N, s, d) & (N, r). Checks that N is the same for both, and that if context was given, it also provides a consistent N.)
- ResponseData attribute change. Update corresponding add_trial, add_batch functions
- TrialData <--> ResponseData, arrays <--> ResponseData)
- TrialBatch to match ResponseData format. (Note that TrialBatch objects created using "ref" and "comparison" exactly as before will function completely identically.)
- int(responses)
likelihood.py
- TaskLikelihood.loglik expected data to be Object with "refs", "comparisons", "responses" array attributes. -> Now expects data to be Object with "inputs", "responses" array attributes.
- dataset.py)
Note that in this update, likelihood does not significantly change. It is still specific to OddityTask.
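Because the likelihood stays specific to OddityTask, one way to make that assumption explicit is to validate the stimulus axis before unpacking. The helper below is a hypothetical sketch and not code from likelihood.py: the name split_oddity_inputs is invented, NumPy stands in for JAX, and the slot order (reference first, comparison second) is assumed from the old (refs, comps) ordering.

```python
from typing import Tuple

import numpy as np


def split_oddity_inputs(inputs: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """Split (N, 2, d) inputs into (refs, comparisons), each of shape (N, d).

    Hypothetical helper: raises instead of silently slicing when the data
    do not have exactly two stimuli per trial.
    """
    inputs = np.asarray(inputs)
    if inputs.ndim != 3 or inputs.shape[1] != 2:
        raise ValueError(
            f"oddity likelihood expects inputs of shape (N, 2, d), got {inputs.shape}"
        )
    # Assumed slot order: reference in slot 0, comparison in slot 1.
    return inputs[:, 0, :], inputs[:, 1, :]


stacked = np.stack([np.zeros((3, 2)), np.ones((3, 2))], axis=1)
refs_out, comps_out = split_oddity_inputs(stacked)
```

A guard of this kind turns a shape mismatch into an immediate error rather than a quietly wrong log-likelihood.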
io.py
- save_responses_csv is updated to infer the number of stimuli from ResponseData, and will create a CSV with columns numbered "stimulus 1", "stimulus 2", etc. The response column remains unchanged.
- load_responses_csv is updated to package "refs" and "probes" into the appropriate tuple (refs, probes) for ResponseData. The documentation clarifies that this function is still specific to OddityTask.
__init__.py + API docs
tests/
If a test used TrialData(refs, comps, resps) -> now uses TrialData(jnp.stack([refs, comps], axis=1), resps)
If a test used ResponseData(refs, comps, resps) -> now uses ResponseData((refs, comps), resps)
Updated:
- test_3stimulus_decision_rule
- test_covariance_field
- test_mc_likelihood
- test_model_api
- test_noise_models
- test_posteriors
- test_wishart_covariance
- test_imports (to add TrialData as top level API import)
docs/examples/wppm
- quick_start.py and full_wppm_example.py (Note that I'm on my laptop so could not fully test full_wppm_example with sufficiently large MC simulations or trials)
tests/test_data_format
Created new test file
Includes:
Comparison to Issue #134
This is a partial step towards completing the significant changes outlined in that issue.
Implemented:
- TrialData format update
- TaskLikelihood works with new data
Not Yet Implemented:
Divergences from Issue:
- src/psyphy/trial_placement/ was listed as requiring an update, but I believe that won't be necessary at this first step, as the TrialBatch structure in the OddityTask case was preserved. Updating trial_placement to support more general task formats can be done separately.
- docs/examples/wppm/weber_law_demo and test_nuts_sampler were both listed as needing updates, but I do not see either file in the current version.