High level need
Instruments, users, etc. may generate bad values that should not be included in training but that they would still like to archive. Masking these values would also serve as a way to represent a "hole" in the dataset.
This also makes INTERSECT orchestration easier, both for campaign authors and the
Low level implementation
- represent the mask in the database, and generally hide it from the user
- allow users to initialize_workflow with masked value(s) (provide optional array of indexes with masked values)
- allow users to update workflow with bad value(s) (simple boolean flags)
- store both an all_values dataset_x + dataset_y and a good_values dataset_x + dataset_y in the database for efficiency
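
A minimal sketch of how the points above could fit together, as a starting point for discussion rather than the actual implementation. It is an in-memory stand-in for the database record; `MaskedDataset`, `add_point`, `initialize_dataset`, the `masked_indexes` parameter and the `is_bad` flag are all hypothetical names, not existing API.

```python
from dataclasses import dataclass, field


@dataclass
class MaskedDataset:
    """Hypothetical stand-in for the stored record (not the real schema)."""

    all_x: list = field(default_factory=list)   # every point, good or bad
    all_y: list = field(default_factory=list)
    good_x: list = field(default_factory=list)  # only points usable for training
    good_y: list = field(default_factory=list)
    mask: list = field(default_factory=list)    # True means "bad / excluded"

    def add_point(self, x, y, is_bad=False):
        """Archive the point; add it to the good set only if it is not masked."""
        self.all_x.append(x)
        self.all_y.append(y)
        self.mask.append(is_bad)
        if not is_bad:
            self.good_x.append(x)
            self.good_y.append(y)


def initialize_dataset(dataset_x, dataset_y, masked_indexes=()):
    """Build a dataset from initial values plus an optional list of bad indexes."""
    if len(dataset_x) != len(dataset_y):
        raise ValueError("dataset_x and dataset_y must have the same length")
    bad = set(masked_indexes)
    dataset = MaskedDataset()
    for i, (x, y) in enumerate(zip(dataset_x, dataset_y)):
        dataset.add_point(x, y, is_bad=i in bad)
    return dataset


# Example: the second measurement is a known bad value.
example = initialize_dataset(
    [[0.0], [1.0], [2.0]], [10.0, -999.0, 12.0], masked_indexes=[1]
)
# A later update reports another bad value with a simple boolean flag.
example.add_point([3.0], 13.0, is_bad=True)
```

Keeping the good set alongside the full set means training can read the good values directly without re-filtering on every call, at the cost of storing the good points twice.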