This repository was archived by the owner on Jul 3, 2023. It is now read-only.

Running Hamilton on Flyte #233

Draft

ramannanda9 wants to merge 3 commits into main from flyte_integration

Member

ramannanda9 commented Nov 17, 2022

[Short description explaining the high-level reason for the pull request]

Changes

How I tested this

Notes

Checklist

PR has an informative and human-readable title (this will be pulled into the release notes)
Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

elijahbenizzy reviewed


Collaborator

elijahbenizzy left a comment

Some thoughts -- don't fully understand it -- I need to look into the flytekit API. I think we might want a better toolset for compiling to other frameworks...

hamilton/experimental/h_flytekit.py

+              class PandasSeriesTransformer(TypeTransformer[pd.Series]):
+                  """
+                  Creates a transformer to handle pandas Series, similar to PandasTransformer
+                  in the flytekit repo https://github.com/flyteorg/flytekit/blob/master/flytekit/types/schema/types_pandas.py

Collaborator

elijahbenizzy Nov 18, 2022

What's different between the one here and that one?

hamilton/experimental/h_flytekit.py Outdated

+                      # map inputs
+                      for input_node in node.dependencies:
+                          if input_node.name not in self.workflow.inputs:
+                              self.workflow.add_workflow_input(input_node.name, input_node.type)

Collaborator

elijahbenizzy Nov 18, 2022

Should we check if they're python approved types?

Member Author

ramannanda9 Jan 5, 2023

There is fairly limited support for custom types, but yeah a check can be added.
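
For illustration only, such a check could look roughly like the sketch below. It is not code from this PR: `NATIVE_TYPES`, `is_supported_type`, and `check_input_types` are hypothetical names, and the allow-list is a hand-picked assumption rather than the set of types flytekit actually supports (that is decided by its type engine).

```python
import dataclasses
import datetime
import typing

# Assumption: a hand-picked allow-list for illustration. The real set of
# supported types is defined by flytekit's type engine, not by this tuple.
NATIVE_TYPES = (int, float, str, bool, bytes, datetime.datetime, datetime.timedelta)


def is_supported_type(type_: typing.Any) -> bool:
    """Return True for native types, dataclasses, and lists/dicts of those."""
    if type_ in NATIVE_TYPES:
        return True
    if isinstance(type_, type) and dataclasses.is_dataclass(type_):
        return True
    # Parameterized containers such as typing.List[int] or typing.Dict[str, float].
    if typing.get_origin(type_) in (list, dict):
        return all(is_supported_type(arg) for arg in typing.get_args(type_))
    return False


def check_input_types(inputs: typing.Dict[str, typing.Any]) -> None:
    """Raise ValueError naming every input whose declared type is not supported."""
    unsupported = {name: t for name, t in inputs.items() if not is_supported_type(t)}
    if unsupported:
        raise ValueError(f"Unsupported input types: {unsupported}")
```

In the adapter, a call like `check_input_types({input_node.name: input_node.type})` before `add_workflow_input` would fail early with a readable message instead of surfacing an error from deeper in the workflow build.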

hamilton/experimental/h_flytekit.py

+                              self.workflow.add_workflow_input(input_node.name, input_node.type)
+                          input_kwargs[input_node.name] = self.workflow.inputs[input_node.name]
+                      # add the node to workflow
+                      wf_node = self.workflow.add_entity(task, **input_kwargs)

Collaborator

elijahbenizzy Nov 18, 2022

I might be misreading, but why would it already have outputs/where does it get it from?

Member Author

ramannanda9 Jan 5, 2023

Yep, see Imperative Workflow. It happens when you add the entity, i.e. the task, with input_kwargs.

hamilton/experimental/h_flytekit.py Outdated

+                          if len(outputs) == 1:
+                              self.workflow.add_workflow_output(node.name, list(outputs.values())[0], node.type)
+                          else:
+                              self.workflow.add_workflow_output(node.name, outputs, node.type)

Collaborator

elijahbenizzy Nov 18, 2022

And why might it have multiple outputs?

skrawcz reviewed


hamilton/experimental/h_flytekit.py

+                  Flytekit Python is the Python Library for easily authoring, testing, deploying,
+                  and interacting with Flyte tasks, workflows, and launch plans
+                  """

Collaborator

skrawcz Nov 18, 2022

On the right track @ramannanda9 !

Here we'd probably need an explanation of how we're actually using Flyte, e.g. the code that we map things to, how it behaves/works, and whether there are any caveats, decisions made, etc.

skrawcz reviewed


hamilton/experimental/h_flytekit.py



		# Register with FlyteKit type engine
		TypeEngine.register(PandasSeriesTransformer())

Collaborator

skrawcz Nov 18, 2022

@ramannanda9 since Hamilton supports any object type, we need to think how to handle arbitrary python object types here too.

Member Author

ramannanda9 Jan 5, 2023

Yeah, not the best idea, but ideally such types could default to pickle dumping rather than raising an unsupported-type error.
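
A minimal sketch of that fallback idea, using only the standard library. It is not flytekit's `TypeEngine` and not code from this PR: `CodecRegistry` and `PICKLE_TAG` are hypothetical names, and the sketch only shows the dispatch logic (use a registered codec when there is one, otherwise pickle).

```python
import pickle
from typing import Any, Callable, Dict, Tuple

# Hypothetical names throughout: this registry is illustrative only and
# does not mirror flytekit's TypeEngine API.
Codec = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]
PICKLE_TAG = "pickle"


class CodecRegistry:
    """Maps type names to (encode, decode) pairs, with pickle as the fallback."""

    def __init__(self) -> None:
        self._codecs: Dict[str, Codec] = {}

    def register(self, type_: type, encode: Callable[[Any], bytes],
                 decode: Callable[[bytes], Any]) -> None:
        self._codecs[type_.__name__] = (encode, decode)

    def encode(self, value: Any) -> Tuple[str, bytes]:
        """Return (tag, payload); the tag records which codec produced the payload."""
        tag = type(value).__name__
        if tag in self._codecs:
            return tag, self._codecs[tag][0](value)
        # No registered codec: fall back to pickle instead of raising.
        return PICKLE_TAG, pickle.dumps(value)

    def decode(self, tag: str, payload: bytes) -> Any:
        if tag == PICKLE_TAG:
            return pickle.loads(payload)
        return self._codecs[tag][1](payload)


registry = CodecRegistry()
registry.register(str, lambda s: s.encode("utf-8"), lambda b: b.decode("utf-8"))
```

The trade-off is the one hinted at above: pickle payloads are opaque, tied to the Python environment that produced them, and unsafe to load from untrusted sources, so a fallback like this is a convenience rather than a replacement for proper type transformers.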

ramannanda9 force-pushed the flyte_integration branch 2 times, most recently from f428ae8 to 25498cc Compare

January 6, 2023 00:41

ramannanda9 added 3 commits

January 5, 2023 16:47


          Adds ability to execute hamilton on flytekit

459dcce

See #139.

The GraphAdapter treats each node as a PythonFunctionTask and adds it to an ImperativeWorkflow in Flyte.
The workflow is executed during build_output.
This way Hamilton functions can be executed in the Flyte runtime.

We add a PandasSeriesTransformer because it is required for using pandas Series as function outputs and inputs of Flyte task nodes.

Any custom type that is not a dataclass, not a native type, and not already supported by Flyte raises a ValueError.

Adds FlyteKitGraphAdapter tests

Adds tests to show unsupported types


          Adds example for using FlyteKitGraphAdapter

99030b9


          Adds flytekit to extras

93f150c

ramannanda9 force-pushed the flyte_integration branch from 25498cc to 93f150c Compare

January 6, 2023 00:47

HamiltonRepoMigrationBot mentioned this pull request

Running Hamilton on Flyte apache/hamilton#53

Closed

7 tasks

