Hi @wajda
I used this config with HDFS on Cloudera:
spark.jars=hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar
spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.mode=ENABLED
spark.spline.lineageDispatcher=hdfs
spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
spark.spline.lineageDispatcher.hdfs.fileNamePrefix=lineage_
spark.spline.lineageDispatcher.hdfs.fileBufferSize=4096
spark.spline.lineageDispatcher.hdfs.filePermissions=777
spark.driver.memory=4g
But it wrote the lineage to the location of the target file used in the script, and seems to ignore the spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/ setting.
Is there a way to write the lineage to a centralized location, and also to give each JSON file the execution_plan_id?
Thanks