Data Caterer is a metadata-driven data generation tool that helps create production-like data across batch and event data systems. It can also run data validations to check that your systems have ingested the data as expected. Set up and customise it with the Java API, the Scala API, or YAML files, all of which run via Docker.
This repo contains example Java and Scala API usage for Data Caterer.
Follow the detailed documentation found here for more details.
- Create new Java class similar to DocumentationJavaPlanRun.java
  - Needs to extend io.github.datacatering.datacaterer.javaapi.api.PlanRun
- Create new Scala class similar to DocumentationPlanRun.scala
  - Needs to extend io.github.datacatering.datacaterer.api.PlanRun
Requires:
- Docker
./run.sh
# check results in docker/sample/report/index.html
Create your own Docker image via:
./gradlew clean build
docker build -t <my_image_name>:<my_image_tag> .
docker run -e PLAN_CLASS=io.github.datacatering.plan.DocumentationPlanRun -v ${PWD}/docs/run:/opt/app/data <my_image_name>:<my_image_tag>
# check results under the docs/run folder
Run with your own class from either the Java or Scala API:
./gradlew clean build
cd docker
PLAN_CLASS=io.github.datacatering.plan.DocumentationPlanRun DATA_SOURCE=postgres docker-compose up -d datacaterer
Details from docs.
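The build-and-run steps for a custom class can be wrapped in a small helper. The following is only a sketch: `is_qualified_class`, `run_plan`, and the `DRY_RUN` switch are hypothetical names that do not exist in this repo. It assumes it is invoked from the repository root, that `PLAN_CLASS` should be a fully qualified class name as in the example, and that `postgres` is a reasonable default data source.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repo.

# Succeeds only if the argument looks like a fully qualified class name,
# e.g. io.github.datacatering.plan.DocumentationPlanRun.
is_qualified_class() {
  printf '%s\n' "$1" | grep -Eq '^([A-Za-z_][A-Za-z0-9_]*\.)+[A-Za-z_][A-Za-z0-9_]*$'
}

# Build the project and start Data Caterer with the given plan class.
# Usage: run_plan <plan_class> [data_source]
# Set DRY_RUN=1 to print the commands instead of running them.
run_plan() {
  local plan_class="$1"
  local data_source="${2:-postgres}"  # assumed default, matching the example
  if ! is_qualified_class "$plan_class"; then
    echo "expected a fully qualified class name, got: $plan_class" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "./gradlew clean build"
    echo "(cd docker && PLAN_CLASS=$plan_class DATA_SOURCE=$data_source docker-compose up -d datacaterer)"
    return 0
  fi
  # Stop if the build fails so a stale jar is never started.
  ./gradlew clean build || return 1
  # Run in a subshell so the caller's working directory is unchanged.
  (cd docker && PLAN_CLASS="$plan_class" DATA_SOURCE="$data_source" docker-compose up -d datacaterer)
}
```

Running `DRY_RUN=1 run_plan io.github.datacatering.plan.DocumentationPlanRun mysql` prints the two commands without touching Gradle or Docker, which is a cheap way to check the arguments first.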
A Docker Compose sample can be found under the docker folder.
cd docker
docker-compose up -d datacaterer
Check the results under here.
Change to another data source via:
- postgres
- mysql
- cassandra
- solace
- kafka
- http
DATA_SOURCE=cassandra docker-compose up -d datacaterer
Run with Helm via:
helm install data-caterer ./data-caterer-example/helm/data-caterer
Base benchmark tests can be run via:
bash benchmark/run_benchmark.sh
Results can be found under benchmark/results.
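For the Docker Compose data sources listed earlier, a small guard can catch a mistyped `DATA_SOURCE` before anything starts. This is a sketch only: `SUPPORTED_SOURCES`, `is_supported_source`, `run_with_source`, and `DRY_RUN` are hypothetical names, the list simply mirrors the one in this README, and the script assumes it is run from the docker folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repo.

# Data sources listed in this README.
SUPPORTED_SOURCES="postgres mysql cassandra solace kafka http"

# Return 0 if the given name is one of the listed data sources.
is_supported_source() {
  local candidate="$1" name
  for name in $SUPPORTED_SOURCES; do
    [ "$name" = "$candidate" ] && return 0
  done
  return 1
}

# Start the datacaterer service for one data source.
# With DRY_RUN=1 the command is printed instead of executed.
run_with_source() {
  local name="$1"
  if ! is_supported_source "$name"; then
    echo "unsupported data source: $name" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "DATA_SOURCE=$name docker-compose up -d datacaterer"
  else
    DATA_SOURCE="$name" docker-compose up -d datacaterer
  fi
}
```

For example, `DRY_RUN=1 run_with_source kafka` prints `DATA_SOURCE=kafka docker-compose up -d datacaterer`, while `run_with_source oracle` fails with an error message and starts nothing.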