PI/PO protection mechanism for par2 #269

@sjpotter

With the size of data being protected by par2 growing, we have a problem. Because par2 limits how many blocks a set can have, one ends up with ever-growing block sizes, which means small random errors can make a dataset unrecoverable once the number of damaged blocks exceeds the number of recovery blocks.

This made me think it would be interesting if par2 could implement a DVD-like (or at least DVD-inspired, to me) recovery scheme with a PI/PO mechanism.

The PO would be the current par2 mechanism without much change at all, i.e. it provides recovery at the level of the entire data set, possibly with a large block size (or the smallest block size that lets the data fit within the 16k total blocks that can exist).

But the PI would be smarter. Many PI par2 sets would be created, and each would know that it only protects from bit x to bit y of the data set (i.e. set1 0->X, set2 X+1->2X, ...). Without changing the par2 algorithm, these PI recovery sets would be able to use a much smaller block size, and so would be better able to handle those random errors. And if a PI set can't recover its range, that's where the PO comes in to recover on a global scale, like par2 does today. If only one PI set can't be recovered, the chance that the PO will be able to recover it is much larger (especially if each PI set is significantly smaller than the amount the PO is meant to recover).
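To make the block-size difference concrete, here is a rough Python sketch. It is not par2 code: `MAX_BLOCKS` just takes the 16k figure from this issue at face value, the 4-byte alignment and the helper names `min_block_size` and `pi_ranges` are assumptions made for illustration, and the data set is treated as one contiguous stream (per-file block padding is ignored). Ranges are half-open byte ranges rather than the inclusive ones written above.

```python
import math

# Assumption: the "16k total blocks" figure quoted in this issue.
MAX_BLOCKS = 16 * 1024


def min_block_size(data_size: int, max_blocks: int = MAX_BLOCKS, align: int = 4) -> int:
    """Smallest block size (a multiple of `align`) that covers
    `data_size` bytes using at most `max_blocks` blocks."""
    size = math.ceil(data_size / max_blocks)
    return math.ceil(size / align) * align


def pi_ranges(data_size: int, segment_size: int) -> list[tuple[int, int]]:
    """Split the data set into half-open byte ranges [start, end),
    one per hypothetical PI recovery set."""
    return [
        (start, min(start + segment_size, data_size))
        for start in range(0, data_size, segment_size)
    ]


GB = 10**9
po_block = min_block_size(100 * GB)    # one global PO set: 6103516 bytes per block
pi_block = min_block_size(1 * GB)      # each 1GB PI set: 61036 bytes per block
segments = pi_ranges(100 * GB, 1 * GB)  # 100 ranges, one per PI set
```

Under these assumptions a single flipped bit costs a whole ~6.1MB block at the PO level, but only a ~61KB block inside a 1GB PI set.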

At a simplistic level: if you are protecting 100GB of data with 5% parity at the global PO level, but also divide the 100GB into 100 1GB blocks that are each protected by PI at 5% parity as well, we'd have to lose more than 5 PI blocks in total before the data wasn't recoverable. And that's assuming those PI blocks are a total loss. If they aren't, and they just aren't recoverable at the PI level, one would have to lose even more data before recovery became impossible.
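The same arithmetic as a small Python sketch. This is only a toy model of the idea, not how par2 works today: `pipo_recoverable` is a hypothetical helper, parity is given in whole percent, and it takes the pessimistic view from the paragraph above that any PI set which cannot repair itself counts as a total loss at the PO level.

```python
def pipo_recoverable(
    damaged_blocks_per_set: list[int],
    blocks_per_pi_set: int,
    pi_parity_pct: int,
    po_parity_pct: int,
) -> bool:
    """Toy two-level check.

    A PI set repairs itself if its damaged block count is within its own
    parity budget. Every PI set that fails is treated as entirely lost,
    and the PO level can then rebuild up to `po_parity_pct` percent of
    the PI sets.
    """
    pi_budget = blocks_per_pi_set * pi_parity_pct // 100
    failed_sets = sum(1 for d in damaged_blocks_per_set if d > pi_budget)
    po_budget_sets = len(damaged_blocks_per_set) * po_parity_pct // 100
    return failed_sets <= po_budget_sets


# 100 PI sets of 1000 blocks each, 5% parity at both levels.
# Each PI set can repair up to 50 damaged blocks on its own.
five_failed = [60] * 5 + [10] * 95  # 5 sets exceed the PI budget
six_failed = [60] * 6 + [10] * 94   # 6 sets exceed the PI budget

print(pipo_recoverable(five_failed, 1000, 5, 5))  # True
print(pipo_recoverable(six_failed, 1000, 5, 5))   # False
```

In practice a failed PI set would usually be only partly damaged, so the PO would need fewer of its blocks than this worst case assumes.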

Just a thought I had that seemed interesting.
