Decentralized Data Lake Ideas #41

@davidgasquez

Random thoughts around decentralized and permissionless data lakes.

  • An easy target is blockchain data.
  • Everything should be content-addressed and immutable, which is easy to get with chain data. I should be able to query any CID without caring where it is stored.
  • Publish the CID of something like a Delta Catalog JSON file on Ethereum. You can publish your fork or write contracts on top of it. Use any compute engine to run queries on top of that.
  • Collaborate on data TrueBlocks style, where more people using the service means better data reliability and speed. If there is a section missing, I can send something like a PR to fill in that data.
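
A rough sketch of how the content-addressing and catalog ideas above could fit together, using only the Python standard library. Everything here is an assumption for illustration: `content_id`, `ContentStore`, `publish_catalog`, and `read_table` are hypothetical names, the identifier is a plain SHA-256 hex digest rather than a real IPFS CID (which uses multihash and multibase encoding), the in-memory dict stands in for whatever network serves the bytes, and the catalog layout is invented rather than the Delta format.

```python
import hashlib
import json


def content_id(data: bytes) -> str:
    """Hypothetical stand-in for a CID: the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content hash (stands in for an IPFS-like network)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blobs[cid] = data  # identical bytes always land under the same key
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blobs[cid]
        # Re-hash on read so the caller does not have to trust whoever served the bytes.
        if content_id(data) != cid:
            raise ValueError("content does not match its identifier")
        return data


def publish_catalog(store: ContentStore, table: str, chunks: list[bytes]) -> str:
    """Store each chunk, then store a JSON catalog listing the chunk IDs.

    The returned ID is the single value one would publish elsewhere
    (for example on a chain); the catalog layout is made up for this sketch.
    """
    chunk_ids = [store.put(chunk) for chunk in chunks]
    catalog = {"table": table, "chunks": chunk_ids}
    # sort_keys keeps the serialization deterministic, so the ID is reproducible.
    return store.put(json.dumps(catalog, sort_keys=True).encode())


def read_table(store: ContentStore, catalog_id: str) -> list[bytes]:
    """Resolve a catalog ID to its chunks without knowing where they live."""
    catalog = json.loads(store.get(catalog_id))
    return [store.get(cid) for cid in catalog["chunks"]]


if __name__ == "__main__":
    store = ContentStore()
    root = publish_catalog(store, "blocks", [b"block 1", b"block 2"])
    # A "fork" that fills in a missing section is just a new catalog
    # that reuses the existing chunk IDs and adds one more.
    fork = publish_catalog(store, "blocks", [b"block 1", b"block 2", b"block 3"])
    print(root)
    print(fork)
    print(read_table(store, fork))
```

Because the catalog only references chunks by hash, a fork shares all unchanged chunks with the original, and any compute engine that can resolve an identifier to bytes could read either version.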

Also from datonic/datadex#22 (comment).

Reading "The Database I Wish I Had" and thinking about something like that for OLAP workloads. It feels like OLAP use cases might be the "killer database" for IPFS/Hypercore/Dat. For analysis, you want data to be immutable, don't care that much about latency, and have to store large amounts of data.

Labels: question

Status: Done