Improve concurrent sequential access speed by per-file file pointer. #182
tmhannes wants to merge 1 commit into relan:master
Conversation
This commit extends the caching of cluster information introduced by commit 6c332e1 so that multiple open file handles accessing the same node each have their own cache. That prevents dramatic slowdowns when multiple processes are reading from a large file.

* Replace the fptr_index and fptr_cluster members of struct exfat_node introduced by 6c332e1 with a list of exfat_fptr structs, each of which holds one index and cluster.
* In fuse_exfat_open, add a new exfat_fptr to the exfat_node, and store a pointer to both in the fh member of the fuse_file_info.
* In fuse_exfat_read and fuse_exfat_write, pass the exfat_fptr from the fh through to exfat_advance_cluster, so that multiple file handles open on the same file can independently track their most recently used cluster. For all other uses of exfat_advance_cluster, we continue to use the "shared" exfat_fptr stored in the node.
* Adjust grow_file and shrink_file to detect when multiple exfat_fptr instances are stored in the node and update them all as needed (a sketch of the overall shape follows below).
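To make the shape of the change concrete, here is a minimal sketch of the data structures and the open path described above. Only struct exfat_node, fuse_file_info, exfat_advance_cluster, fptr_index/fptr_cluster and the exfat_fptr name come from the description; everything else (struct exfat_filehandle, exfat_fptr_add, the fptrs list member) is an assumed name used purely for illustration, not necessarily what the commit does.

```c
/* Sketch only: struct exfat_filehandle, exfat_fptr_add and node->fptrs
 * are assumed names, not the committed code. Assumes the usual
 * fuse/main.c context (global struct exfat ef), <stdlib.h> for malloc
 * and <errno.h> for ENOMEM. */

struct exfat_fptr
{
	uint32_t index;          /* cluster index most recently resolved */
	cluster_t cluster;       /* cluster number at that index */
	struct exfat_fptr* next; /* other fptrs registered on the same node */
};

/* Allocated once per open(); its address is stored in fi->fh so that
 * read and write can reach both the node and this handle's private
 * file pointer. */
struct exfat_filehandle
{
	struct exfat_node* node;
	struct exfat_fptr* fptr;
};

static int fuse_exfat_open(const char* path, struct fuse_file_info* fi)
{
	struct exfat_node* node;
	struct exfat_filehandle* fh;
	int rc;

	rc = exfat_lookup(&ef, &node, path);
	if (rc != 0)
		return rc;

	fh = malloc(sizeof(struct exfat_filehandle));
	if (fh == NULL)
	{
		exfat_put_node(&ef, node);
		return -ENOMEM;
	}
	fh->node = node;
	/* hypothetical helper: allocate a new exfat_fptr starting at the
	 * first cluster and link it into the node's list */
	fh->fptr = exfat_fptr_add(node, 0, node->start_cluster);
	if (fh->fptr == NULL)
	{
		free(fh);
		exfat_put_node(&ef, node);
		return -ENOMEM;
	}
	fi->fh = (uint64_t) (size_t) fh;
	fi->keep_cache = 1;
	return 0;
}
```

With a layout like this, fuse_exfat_read and fuse_exfat_write would cast fi->fh back and hand fh->fptr to exfat_advance_cluster, while internal callers keep using the node's shared fptr.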
I like the idea, multi-threaded operations are indeed suboptimal. Thanks for your proposal! I have a few concerns about the implementation:
My impression is that this is required when, for example, process A has a file open and process B then truncates it, so that process A is not left holding an invalid cached cluster.
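For example (a sketch under the same assumed names as above, not the actual patch), shrink_file could walk the node's per-handle list and rewind any fptr whose cached cluster lies beyond the new end of the file:

```c
/* Sketch only, illustrative names: after truncation, no open handle may
 * keep pointing at a cluster that has been freed, so rewind any fptr
 * that is now past the new last cluster. */
static void exfat_reset_stale_fptrs(struct exfat_node* node,
		uint32_t new_cluster_count)
{
	struct exfat_fptr* fptr;

	for (fptr = node->fptrs; fptr != NULL; fptr = fptr->next)
		if (fptr->index >= new_cluster_count)
		{
			fptr->index = 0;
			fptr->cluster = node->start_cluster;
		}
}
```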
I agree that the alternative approach would be cleaner, but I didn't pursue it because I wanted to keep the patch as small as possible: requiring every caller of exfat_advance_cluster to supply a position explicitly would have been a much larger change.
This looks useful, what's preventing getting this in?
It is indeed useful, but IMO it would be better to choose the approach where all functions working with file contents require a position argument. This would be much more bug-proof.
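If I understand the suggestion, the idea is to make the position explicit in the signature of every helper that walks file contents, instead of relying on cached state hidden in the node or in a per-handle fptr. Roughly along these lines (a sketch of the idea only, not the current API or a proposed patch):

```c
/* Sketch of the "explicit position" style: the caller states where it
 * is and where it wants to go, so no call site depends on hidden
 * shared state that another thread might invalidate. */
static cluster_t exfat_advance_cluster(const struct exfat* ef,
		struct exfat_node* node,
		cluster_t current,       /* cluster the caller is at now */
		uint32_t current_index,  /* its index within the file */
		uint32_t target_index);  /* index the caller wants to reach */
```

Sequential readers would then each keep their own (current, current_index) pair, which gives the same O(1) advance per handle without any shared cache to invalidate.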
This PR addresses #181 by adding a separate "fptr" for each open file handle, so that sequential reads never have to restart from the beginning of the file.
In light testing on an i7 CPU with an external SSD, the PR allows concurrent readers at different positions in a large file to achieve the same total read throughput and CPU usage as a single reader sequentially processing the whole file.
Any feedback would be gratefully received.