EchoXFlow conversion#424
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
fixed now, i will rerun conversion + upload to use the new timestamps |
|
Testing right now. @swpenninga could you upload the dataset on the HF branch "v0.1.0" instead of "0.1.0"? That is how it is done for the other datasets too now. After you can remove the "0.1.0". Also could you update the embedded zea version to "0.1.0" (by updating the toml and installing again). I see it is now 0.0.11 which contradicts the HF branch version. |
|
@swpenninga I see the images are now saved as |
Running that right now! |
Nice! Also wouldn't be opposed to putting everything in a dataset folder, so it is easier for people to see the readme / license etc. auxiliary files. Don't need to rerun, can use hugging face checkout to rearrange files maybe as quick fix? Sorry for last minute comment.... |
|
branch was changed to v0.1.0 now, not sure how to delete the old one |
It is allowed by the spec to have a channel dim in your images, which can be 1 (it is an empty dim, but the spec allows it.) The coordinates also look correct -- they just need to match the spatial dims |
Right, but would |
OisinNolan
left a comment
There was a problem hiding this comment.
looks good @swpenninga , nice work! This was a cool dataset to test the metadata spec with -- left one comment on that but otherwise I think it's good to go
| metadata["bmode_timestamps"] = { | ||
| "samples": ts, | ||
| "timestamps": ts - ts[0], | ||
| "start_time_offset": np.float32(ts[0]), |
There was a problem hiding this comment.
I found this part a little confusing -- we're storing here a signal whose values are absolute timestamps and whose timestamps are relative timestamps
I think this happens because (i) we want to record the timestamps of the B-mode frames, but (ii) we require a custom signal to have samples
I think there are two options here:
(i) We add an optional timestamp array to Map (non-breaking), where it's simply a 1D float array whose length needs to match the first dim of the Map.values (preferred)
(ii) We allow custom signals to be just Signal instead of SignalND
what do you think?
There was a problem hiding this comment.
I think option one would be perfect. Als found the current way of saving confusing when working on my example snippet.
@tristan-deep yeah indeed -- |
Basically just git, but using pointers so you don't have to download the entire dataset (using # clone repo, but using pointers only
export GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/datasets/zeahub/echoxflow
cd echoxflow
# see branches
git branch -a
# delete the local branch
git branch --delete 0.1.0
# delete remote branch
git push origin --delete 0.1.0
# check
git log --onelineMake sure your |
|
@OisinNolan @tristan-deep i added the timestamp to the Map, and i removed the trailing 1 dim to get (z, x) in the last commit. Let me know if I can start (hopefully the last) conversion and upload to hf! |
For me all good! Changes look good. Will test after conversion again! (or if you already have a sample available) |
OisinNolan
left a comment
There was a problem hiding this comment.
Conversion lgtm! Just one remaining issue regarding the update to the spec
| the shape is ``(*values.shape[:-1], 3)``. The last axis holds ``[x, y, z]``. | ||
| The leading ``n_frames`` axis may be omitted to broadcast one coordinate grid | ||
| across all frames. | ||
| timestamps: Optional per-frame acquisition timestamps in seconds, shape ``(n_frames,)``. |
There was a problem hiding this comment.
nice addition!
I'm just thinking -- maybe the definition here should match that of timestamps on the Signal and ProbePose spec -- there we generally define them as relative to the first transmit event of the acquisition. This would mean we could more easily compare timestamps across these different sources.
There was a problem hiding this comment.
We would also then need a start_time_offset like in the other cases, with some validation that it's also set if we specify timestamps, and that timestamps are >0


cc by nc sa 4.0.This is not the entire dataset, it contains some filters. To keep things managable (and high quality), we copy only the 2D (single plane) B-modes:
with the following filters:
>30fps, >10frames in the sequence
this avoids some low quality single frames or noise frames that are used for probe positioning for example.
All the metadata that comes with these files is transferred to zea format.
Some interesting statistics about the 1291 different data shapes with these filters: