
Feature/generalized trial data #136

Merged: hmd101 merged 18 commits into flatironinstitute:main from agorwant:feature/Generalized-TrialData on May 11, 2026

@agorwant (Contributor) commented May 5, 2026

This update adjusts formatting of ResponseData and TrialData to be more generalizable to other tasks. This is a minimal initial change which adjusts only what is necessary to update Data classes while maintaining previous PsyPhy functionality. It does NOT include updates to possible task structures that would allow for realistic use in any task besides OddityTask.

Overview

Previously, data structures assumed that all trials included a 2D "reference," a 2D "comparison," and a binary response. Now, data structures permit trials to include any number of any-dimensional stimuli and permit any-dimensional (& potentially non-binary) responses (so long as dimensions are consistent across trials in a given data object).

Additionally, an optional "context" attribute was added to data structures to store trial-wide features that could vary between trials and could condition response probabilities.

Examples, tutorials, documentation, tests, i/o, and any related structures that rely on the implementation of these data classes were updated to reflect these changes. A new test file was added to cover them.

Core Change: Updating Data Formats

Previous TrialData fields:

refs: (N, d) 
comparisons: (N, d)
responses: (N,) - binary

where N = number of trials and d = stimulus dimensions

Updated TrialData fields:

inputs: (N, s, d)
responses: (N, r)
context: (N, c) _optional_

where N = number of trials, s = number of stimuli, and d/r/c = stimulus/response/context dimensions

Previous ResponseData fields:

refs: list[Any]
comparisons: list[Any]
responses: list[int]

Updated ResponseData fields:

inputs: list[np.ndarray] where each element is (s, d)
responses: list[np.ndarray] where each element is (r,)
contexts: list[np.ndarray] where each element is (c,)

ResponseData accepts inputs as list[tuple[Any, ...]] and converts them to list[np.ndarray]. There are several reasons for this:

  1. The tuple structure means the user does not need to convert each stimulus to an np.ndarray before using ResponseData. A stimulus can be a list of features, a single scalar value, etc., and any number of stimuli can be grouped in a tuple to form the input.
  2. The tuple structure means TrialBatch does not require significant changes. TrialBatch already handled groups of stimuli as a tuple (previously (ref, comparison)); it now handles tuples of the form (stimulus1, stimulus2, ...) for any number of stimuli. Because TrialBatch is called directly in other files (sobol.py and grid.py), keeping this structure means those files do not need to change.
  3. ResponseData converts to np.ndarray under the hood for simplicity, and so that each input's .shape can be used as a quick check for appropriate dimensionality.
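
To illustrate the tuple-to-array conversion described above, here is a minimal sketch. `to_input_array` is a hypothetical helper written for this comment, not the function in dataset.py, and the rule that every stimulus in a trial must share one shape is my assumption about what "appropriate dimensionality" means.

```python
import numpy as np


def to_input_array(stimuli):
    """Hypothetical helper: turn a tuple of stimuli into one (s, d) array."""
    # atleast_1d lets a bare scalar stand in for a 1-D stimulus with d = 1.
    rows = [np.atleast_1d(np.asarray(stim, dtype=float)) for stim in stimuli]
    # Assumed rule: every stimulus within a trial has the same shape.
    if len({row.shape for row in rows}) != 1:
        raise ValueError("all stimuli in a trial must share the same shape")
    return np.stack(rows, axis=0)


# Three 2-D stimuli given as a list, a tuple, and an array.
trial = ([0.1, 0.2], (0.3, 0.4), np.array([0.5, 0.6]))
arr = to_input_array(trial)

# Two scalar stimuli.
scalar_arr = to_input_array((1.0, 2.0))
```

With this sketch, `arr.shape` is `(3, 2)` and `scalar_arr.shape` is `(2, 1)`, so comparing shapes across trials is enough to catch a trial with the wrong number of stimuli or the wrong stimulus dimension.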

How to update previous uses of each data class:

  • TrialData: TrialData(refs, comps, resps) -> TrialData(jnp.stack([refs, comps], axis=1), resps)
  • ResponseData: ResponseData(refs, comps, resps) -> ResponseData((refs, comps), resps)
  • TrialBatch: No change.
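
As a quick sanity check of the TrialData migration, the sketch below shows what stacking along axis=1 produces. It uses NumPy so that it runs without JAX; `jnp.stack` takes the same arguments. The array sizes and random data are placeholders chosen for illustration.

```python
import numpy as np

N, d = 5, 2  # placeholder sizes: 5 trials, 2-D stimuli
rng = np.random.default_rng(0)
refs = rng.normal(size=(N, d))
comps = rng.normal(size=(N, d))

# Stacking on axis=1 inserts the stimulus axis between trials and features,
# giving the (N, s, d) layout with s = 2.
inputs = np.stack([refs, comps], axis=1)

# Slot 0 holds the reference and slot 1 the comparison for every trial.
ref_slot = inputs[:, 0, :]
comp_slot = inputs[:, 1, :]
```

Stacking on axis=0 instead would give (s, N, d), which has the same number of elements but the wrong layout, so the axis argument matters here.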

Catalogue of Specific Updates

dataset.py

  • TrialData attribute change
  • TrialData shape check change (wants (N, s, d) & (N, r). Checks that N is the same for both, and that if context was given, it also provides a consistent N.)
  • ResponseData attribute change. Update corresponding add_trial, add_batch functions
  • Update conversions between different data formats (TrialData <--> ResponseData, arrays <--> ResponseData)
  • Update TrialBatch to match ResponseData format. (note that TrialBatch objects created using "ref" and "comparison" exactly as before will function completely identically.)
  • Removed all checks that responses are integers. Removed lines that force int(responses)

likelihood.py

  • TaskLikelihood.loglik previously expected data to be an object with "refs", "comparisons", and "responses" array attributes. It now expects an object with "inputs" and "responses" array attributes.
  • Now casts responses to integers (this replaces the cast that was removed from dataset.py).

Note that in this update, likelihood does not significantly change. It is still specific to OddityTask.

io.py

  • save_responses_csv is updated to infer number of stimuli from ResponseData, and will create a csv with columns numbered "stimulus 1", "stimulus 2", etc. Response column remains unchanged.
  • load_responses_csv updated to package "refs" and "probes" into appropriate tuple (refs, probes) for ResponseData. Documentation clarifies that this function is still specific to OddityTask!!
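
To make the numbered-column layout concrete, here is a rough sketch of a writer that infers the number of stimuli from the data. It is not the save_responses_csv implementation: `write_responses_csv` is a hypothetical name, the "response" column name is assumed, and serializing each stimulus as a JSON list inside its cell is my own choice for the example.

```python
import csv
import io
import json


def write_responses_csv(inputs, responses, handle):
    """Hypothetical writer: one 'stimulus k' column per stimulus, then 'response'."""
    # Infer the number of stimuli from the first trial.
    n_stimuli = len(inputs[0])
    header = [f"stimulus {k}" for k in range(1, n_stimuli + 1)] + ["response"]
    writer = csv.writer(handle)
    writer.writerow(header)
    for stimuli, response in zip(inputs, responses):
        if len(stimuli) != n_stimuli:
            raise ValueError("every trial must have the same number of stimuli")
        # Assumed encoding: each stimulus becomes a JSON list in a single cell.
        writer.writerow([json.dumps(list(stim)) for stim in stimuli] + [response])


buffer = io.StringIO()
write_responses_csv(
    inputs=[([0.1, 0.2], [0.3, 0.4], [0.5, 0.6])],
    responses=[1],
    handle=buffer,
)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

Here `rows[0]` is the header `["stimulus 1", "stimulus 2", "stimulus 3", "response"]`. Reading such a file back in a task-agnostic way still needs a rule for telling stimulus columns from response columns, which is the open question for generalized CSV input.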

__init__.py + API docs

  • Include TrialData as top level API import

tests/
If a test used TrialData(refs, comps, resps) -> now uses TrialData(jnp.stack([refs, comps], axis=1), resps)
If a test used ResponseData(refs, comps, resps) -> now uses ResponseData((refs, comps), resps)

Updated:

  • test_3stimulus_decision_rule
  • test_covariance_field
  • test_mc_likelihood
  • test_model_api
  • test_noise_models
  • test_posteriors
  • test_wishart_covariance
  • test_imports (to add TrialData as top level API import)

docs/examples/wppm

  • repackage data & alter documentation + md files for quick_start.py and full_wppm_example.py

(Note that I'm on my laptop so could not fully test full_wppm_example with sufficiently large MC simulations or trials)

tests/test_data_format
Created new test file
Includes:

  • tests to show that inputs with new numbers of stimuli and/or new dimensionalities are supported.
  • tests to show that non-binary responses with new dimensionalities are supported.
  • tests to show that context is supported
  • tests to show that mismatches in dimensionality or number of stimuli throw appropriate errors.

Comparison to Issue #134

This is a partial step towards completing the significant changes outlined in that issue.

Implemented:

  • TrialData format update
  • TaskLikelihood works with new data
  • Existing tests pass
  • New tests created for these changes also pass
  • Examples run with new data object
  • CSV output works generally with new data object. CSV input works for OddityTask with new data object.

Not Yet Implemented:

  • xarray addition
  • Generalized CSV input (this will require deciding on appropriate user input to disambiguate inputs from responses)
  • trial_id, session_id

Divergences from Issue:

  • src/psyphy/trial_placement/ was listed as requiring an update, but I believe that won't be necessary at this first step, as TrialBatch structure in the OddityTask case was preserved. Updating trial_placement to support more general task formats can be done separately.
  • Variable names are not all identical to the outline in the issue. (This is just because I'd already made those changes before the issue was published. Let me know which ones I should rename if they aren't good fits!)
  • Both docs/examples/wppm/weber_law_demo and test_nuts_sampler were listed as needing updates, but I do not see either file in the current version.
  • The ResponseData & TrialBatch changes differ from those described in the issue. (Details and explanation in the "Core Change" section above.)

@agorwant agorwant marked this pull request as draft May 5, 2026 15:16
@agorwant agorwant marked this pull request as ready for review May 6, 2026 19:57
Comment thread src/psyphy/data/dataset.py Outdated
refs : (N, d)
comparisons : (N, d)
responses : (N,)
inputs : (N, s, d)
Collaborator commented:

I suggest renaming s to k, as outlined in the issue. K is the conventional counting variable in statistics (e.g., k-means), whereas s is non-standard as a shape label and risks confusion with "seconds" or "samples". Uppercase R and C follow the same convention for the response and context dimensions.

"""
refs = jnp.asarray(data.refs)
comparisons = jnp.asarray(data.comparisons)
stimuli = jnp.asarray(data.stimuli)
Collaborator commented:

likelihood.py currently extracts stimuli by hardcoded position (stimuli[:, 0, :], stimuli[:, 1, :]), so any TrialData where the ref is not in slot 0 or K ≠ 2 will silently compute wrong likelihoods. Adding stimulus_names as an empty-tuple field on TrialData and a data.stimulus("ref") accessor would let the likelihood look up slots by name instead of index, making it robust to different slot orderings and generalizable to K > 2 tasks. I can implement that.
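
A rough sketch of the name-based lookup proposed above, to make the idea concrete. Only `stimulus_names` and the `stimulus("ref")` accessor come from the comment; the `NamedTrialData` class, its fields, and the error handling are hypothetical and not the eventual implementation.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class NamedTrialData:
    """Hypothetical container: (N, K, d) inputs plus optional slot names."""

    inputs: np.ndarray
    stimulus_names: tuple = ()  # empty tuple means the slots are unnamed

    def stimulus(self, name):
        """Return the (N, d) array for the slot called `name`."""
        if name not in self.stimulus_names:
            raise KeyError(f"unknown stimulus name: {name!r}")
        slot = self.stimulus_names.index(name)
        return self.inputs[:, slot, :]


# The reference sits in slot 1 here, which a hardcoded index 0 would miss.
inputs = np.arange(12, dtype=float).reshape(2, 3, 2)
data = NamedTrialData(inputs, stimulus_names=("comparison", "ref", "probe"))
ref = data.stimulus("ref")
```

In this example `ref` equals `inputs[:, 1, :]` regardless of where the reference is placed, which is the robustness to slot ordering the comment asks for.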

@hmd101 hmd101 marked this pull request as draft May 11, 2026 19:51
@hmd101 hmd101 assigned hmd101 and agorwant and unassigned hmd101 and agorwant May 11, 2026
@hmd101 hmd101 marked this pull request as ready for review May 11, 2026 20:08
@hmd101 hmd101 merged commit 18fb294 into flatironinstitute:main May 11, 2026
9 checks passed

3 participants