Background [Optional]
Currently, Spline 0.6.x can track attribute-level (backward) lineage for a target data source. In the Spline UI this takes two clicks: first on the target attribute, then on its details.
Question
@wajda
We are looking for an AQL query that fetches the attribute-level dependencies for all attributes of the target data source, in an optimal way, for a given eventID.
For example, we have two datasets, Employee and Department, going through the transformation below, where effectiveBonus is derived from Employee.bonus and Department.bonusMultiplier. What would be the optimized AQL for this?
Employee:
empId
empName
deptId
bonus
Department:
deptId
deptName
bonusMultiplier
Code Logic:
Dataset empDS = // Read Employee
Dataset deptDS = // Read Department
Dataset bonus = empDS.join(deptDS, "deptId").withColumn("effectiveBonus", col("bonus").multiply(col("bonusMultiplier")));
bonus.write().save("Final_Table"); // effectiveBonus Table
Expected Result (Flexible to be tweaked if we can get required info):
Final_Table.empId, Employee.empId
Final_Table.empName, Employee.empName
Final_Table.bonus, Employee.bonus
Final_Table.deptId, Department.deptId
Final_Table.deptName, Department.deptName
Final_Table.effectiveBonus, Employee.bonus : Department.bonusMultiplier
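To make the expected result concrete, here is a minimal Java sketch of the mapping we are after. It is not AQL and does not use Spline's data model: the class name, the hard-coded dependency map, and the method names are all hypothetical, and the dependency graph is assumed to be acyclic. It only illustrates the shape of the output (each target attribute paired with the source attributes it ultimately derives from).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AttributeLineageSketch {

    // Hypothetical input: each attribute mapped to the attributes it is directly derived from.
    // Attributes that never appear as keys are treated as source attributes.
    static Map<String, List<String>> directDependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("Final_Table.empId", Arrays.asList("Employee.empId"));
        deps.put("Final_Table.empName", Arrays.asList("Employee.empName"));
        deps.put("Final_Table.bonus", Arrays.asList("Employee.bonus"));
        deps.put("Final_Table.deptId", Arrays.asList("Department.deptId"));
        deps.put("Final_Table.deptName", Arrays.asList("Department.deptName"));
        deps.put("Final_Table.effectiveBonus",
                Arrays.asList("Employee.bonus", "Department.bonusMultiplier"));
        return deps;
    }

    // Walk backward from one attribute until only source attributes remain.
    // Assumes the dependency graph has no cycles.
    static Set<String> sourcesOf(String attribute, Map<String, List<String>> deps) {
        List<String> parents = deps.get(attribute);
        if (parents == null) {
            return Collections.singleton(attribute);
        }
        Set<String> sources = new LinkedHashSet<>();
        for (String parent : parents) {
            sources.addAll(sourcesOf(parent, deps));
        }
        return sources;
    }

    // Resolve every key of the map (treated here as a target attribute) to its source attributes.
    static Map<String, Set<String>> lineage(Map<String, List<String>> deps) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String target : deps.keySet()) {
            result.put(target, sourcesOf(target, deps));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one "target, source : source" line per target attribute.
        lineage(directDependencies()).forEach((target, sources) ->
                System.out.println(target + ", " + String.join(" : ", sources)));
    }
}
```

Ideally the AQL would return the same pairs in a single query for the given eventID, rather than one lookup per attribute.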