Default Replication #312

@wjmelements

We would like storage contexts to upload to multiple storage providers by default. It should still be possible to select only one storage provider, so the replica count will be configurable.

Without proof of replication, service providers that are supposed to hold separate replicas can cheat proof of data possession, because one provider could answer challenges using another provider's copy. We therefore plan to implement an endorsement system in the short term to certify availability. It should be possible to detect from latency when a service provider only has data possession by proxy.

My current plan is to add a replica count configuration parameter to createContext, along with a wrapper replicated storage context that uses Promise.all on child context operations. The wrapper would need to handle individual SP failures gracefully. However, because operations like AddPieces have no deadline, an SP may still succeed after we have determined that it failed, so simply replacing an apparently bad SP can result in paying for extra replicas.
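A minimal sketch of the wrapper idea, not the planned implementation. `ChildContext`, `ReplicaResult`, and `ReplicatedContext` are hypothetical names, and the real storage context interface may look different. Each child promise is converted into a result object before being passed to Promise.all, so one failing SP does not reject the whole upload.

```typescript
// Hypothetical minimal interface; the real storage context API may differ.
interface ChildContext {
  providerId: string;
  upload(data: Uint8Array): Promise<string>; // resolves to a piece CID
}

// Per-provider outcome, so callers can see which replicas apparently failed.
type ReplicaResult =
  | { providerId: string; ok: true; pieceCid: string }
  | { providerId: string; ok: false; error: unknown };

class ReplicatedContext {
  private readonly children: ChildContext[];

  constructor(children: ChildContext[]) {
    this.children = children;
  }

  // Upload to every child context in parallel.
  async upload(data: Uint8Array): Promise<ReplicaResult[]> {
    const attempts = this.children.map((child) =>
      child.upload(data).then(
        (pieceCid): ReplicaResult => ({ providerId: child.providerId, ok: true, pieceCid }),
        (error: unknown): ReplicaResult => ({ providerId: child.providerId, ok: false, error }),
      ),
    );
    // No attempt can reject here, so Promise.all waits for all of them.
    const results = await Promise.all(attempts);
    if (!results.some((r) => r.ok)) {
      throw new Error("upload failed on every storage provider");
    }
    return results;
  }
}
```

The sketch deliberately does not replace a failed SP with another one. As noted above, a failure reported here is only an apparent failure, so automatic replacement could pay for an extra replica if the original SP succeeds later.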

I also want to make prior replica data sets somewhat discoverable, in the same way that we discover your prior data sets. There are ways this can fail, though. For example, a prior replicated data set may have been terminated, or it may have been created with a different number of replicas. I don't have a good solution for this, but I'll fall back to large existing relationships and use the new storage context preferences for new pieces.
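One possible reading of that fallback, sketched under assumptions: rank providers by how many pieces the user already stores with them in non-terminated data sets, then take the top providers up to the replica count. `PriorDataSet` and `pickProvidersByExistingUsage` are hypothetical names, and piece count is only one way to measure how large a relationship is.

```typescript
// Hypothetical summary of a prior data set; field names are assumptions.
interface PriorDataSet {
  providerId: string;
  pieceCount: number;
  terminated: boolean;
}

// Pick up to `replicas` providers, preferring those holding the most live pieces.
function pickProvidersByExistingUsage(dataSets: PriorDataSet[], replicas: number): string[] {
  const piecesByProvider = new Map<string, number>();
  for (const ds of dataSets) {
    if (ds.terminated) continue; // terminated data sets cannot be reused
    const soFar = piecesByProvider.get(ds.providerId) ?? 0;
    piecesByProvider.set(ds.providerId, soFar + ds.pieceCount);
  }
  return Array.from(piecesByProvider.entries())
    .sort((a, b) => b[1] - a[1]) // largest existing relationship first
    .slice(0, replicas)
    .map(([providerId]) => providerId);
}
```

If fewer providers come back than the requested replica count, the remaining slots would be filled according to the new storage context preferences.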

Followups to this changeset:

  • helper to identify user's piece CID duplicates and their statuses
  • helper to boost (or repair) replication for previously uploaded pieces or data sets
  • upload via URL for faster uploads (requires Curio change)
  • prefer to use at least one endorsed SP by default
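As a rough illustration of the first followup, a duplicate finder could group a user's piece records by piece CID and report every CID that appears more than once, along with where each copy lives and whether it is still live. `PieceRecord` and `findDuplicatePieces` are hypothetical; how these records would actually be fetched is not decided here.

```typescript
// Hypothetical record of one stored piece; field names are assumptions.
interface PieceRecord {
  pieceCid: string;
  providerId: string;
  dataSetId: number;
  live: boolean; // false if the containing data set was terminated
}

// Group records by piece CID and keep only the CIDs stored more than once.
function findDuplicatePieces(pieces: PieceRecord[]): Map<string, PieceRecord[]> {
  const byCid = new Map<string, PieceRecord[]>();
  for (const piece of pieces) {
    const group = byCid.get(piece.pieceCid);
    if (group) {
      group.push(piece);
    } else {
      byCid.set(piece.pieceCid, [piece]);
    }
  }
  const duplicates = new Map<string, PieceRecord[]>();
  byCid.forEach((group, cid) => {
    if (group.length > 1) duplicates.set(cid, group);
  });
  return duplicates;
}
```

The same grouping could feed a repair helper: a CID whose group has fewer live copies than the desired replica count is a candidate for boosting.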

Labels: enhancement

Status: Done