Add future feature plans: 9 detailed implementation plans for planned Flowfile features #377

Open
Edwardvaneechoud wants to merge 17 commits into main from claude/plan-future-features-iisFU
Conversation

@Edwardvaneechoud
Owner

Overview document (docs/future_features.md) with foundational node containment
model analysis and per-feature plan files covering:

  • Iterative Nodes (Option B: embedded sub-flow)
  • Conditional Execution (Option A: parent pointer)
  • Delta Lake Catalog Storage
  • Flow Parameters
  • Catalog Query & Data Exploration (SQL + GraphicWalker)
  • Extended Connectors (PostgreSQL, MySQL, BigQuery, Snowflake)
  • Standardized Custom Node Designer
  • Flow as Custom Node (Option C: referenced flow)
  • Enhanced Code Generation (catalog reads/writes + kernel code wrapping)

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd

claude added 16 commits March 27, 2026 14:21
… Flowfile features

Overview document (docs/future_features.md) with foundational node containment
model analysis and per-feature plan files covering:
- Iterative Nodes (Option B: embedded sub-flow)
- Conditional Execution (Option A: parent pointer)
- Delta Lake Catalog Storage
- Flow Parameters
- Catalog Query & Data Exploration (SQL + GraphicWalker)
- Extended Connectors (PostgreSQL, MySQL, BigQuery, Snowflake)
- Standardized Custom Node Designer
- Flow as Custom Node (Option C: referenced flow)
- Enhanced Code Generation (catalog reads/writes + kernel code wrapping)

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…admap

- Move docs/plans/ → docs/for-developers/roadmap/
- Add roadmap index with mermaid dependency diagram
- Register all 9 feature pages in mkdocs.yml nav
- Link roadmap from developer index page
- Remove standalone docs/future_features.md

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…support

The original plan incorrectly stated Flowfile had only "generic database
read/write." In fact, database_reader already supports PostgreSQL, MySQL,
MariaDB, SQLite, MSSQL, Oracle via ConnectorX with both table and SQL query
modes, plus stored connection references and 100+ SQL type mappings.

Rewritten to focus on what's actually missing: partitioned reads, bulk loading,
BigQuery/Snowflake cloud warehouse support, and incremental loading.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…d today

Previous version incorrectly claimed broad database support. In reality:
- Database: only PostgreSQL is tested, implemented in the UI, and documented
- Cloud storage: only AWS S3 is tested and production-ready
- Other databases (MySQL, SQLite, MSSQL) exist only as type hints
- ADLS/GCS exist only as schema definitions, not tested implementations

Restructured plan around phased delivery: MySQL first, then ADLS/GCS,
then BigQuery/Snowflake, then PostgreSQL enhancements.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
… not catalog

The catalog is internal to Flowfile (data produced/consumed within flows).
Extended connectors are about external database and cloud storage nodes.
Removed false dependency between connectors and catalog query in the
dependency diagram.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
… problem

The original plan incorrectly framed the custom node API as inaccessible.
In reality, the CustomNodeBase + process() pattern is clean and intuitive.

The actual problem is kernel code generation: generate_kernel_code()
translates clean process() code into proxy classes (_V, _Self), rewrites
return statements, and produces unintuitive generated code. Refocused the
plan on making the code users write be the code that runs in the kernel.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
Flow Parameters: rewritten to document the substantial implementation on
feature/add-flow-parameters branch (parameter_resolver.py, FlowParametersPanel,
flow_graph integration, 763 lines of tests). Refocused on what remains.

Custom Node Designer: acknowledge the visual designer already exists in the
frontend (3-panel drag-and-drop builder with Polars autocompletion). Custom
nodes always use Polars code. Refocused plan on the kernel code generation
gap and packaging/sharing.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…not script

Two key design changes:
- Generated code can depend on flowfile (from flowfile import read_from_catalog)
  rather than trying to be fully standalone
- Output is a Python package, not a single script file. Each sub-flow,
  iterator body, condition branch, and custom node gets its own module
  with a process() function. Main flow imports and calls them.

This is essential for features 1 (iterative), 2 (conditional), and 8
(flow-as-node) where a single script would be unmanageable.
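
As a rough illustration of the package-per-flow idea described above, the sketch below shows each unit exposing a process() function and a main flow that calls them in order. The module names (custom_upper.py, subflow_totals.py, main.py) and the function names are hypothetical, and plain lists of dicts stand in for DataFrames so the example runs without Flowfile or Polars installed. In a real generated package each function would live in its own module and be imported by the main flow.

```python
from typing import Any

# Stand-in for a DataFrame: a list of row dicts (assumption for this sketch).
Rows = list[dict[str, Any]]


# --- would live in my_flow/custom_upper.py (hypothetical custom node) ---
def custom_upper_process(rows: Rows) -> Rows:
    """Upper-case the 'name' column of every row."""
    return [{**row, "name": str(row["name"]).upper()} for row in rows]


# --- would live in my_flow/subflow_totals.py (hypothetical sub-flow) ---
def subflow_totals_process(rows: Rows) -> Rows:
    """Collapse the input into one summary row."""
    total = sum(int(row["amount"]) for row in rows)
    return [{"total_amount": total, "row_count": len(rows)}]


# --- would live in my_flow/main.py: imports the modules and calls them ---
def run(rows: Rows) -> Rows:
    cleaned = custom_upper_process(rows)
    return subflow_totals_process(cleaned)


if __name__ == "__main__":
    print(run([{"name": "a", "amount": 2}, {"name": "b", "amount": 3}]))
```

Keeping one process() per module means the main flow stays a short list of calls, however deeply sub-flows, iterator bodies, and branches are nested.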

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…splitting

Condition nodes are if/else at the flow level: evaluate a condition on the
whole DataFrame (df.count() == 12, column existence, aggregate checks) and
route the ENTIRE df to one branch. The other branch is skipped entirely.

This is fundamentally different from the filter node (row-level subsetting).
Updated execution semantics, code generation (if/else with branch modules),
expression examples, and schema inference accordingly.
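
A minimal sketch of the flow-level semantics described above, assuming a list of row dicts as a stand-in for a DataFrame. The names route, has_twelve_rows, monthly_report, and incomplete_alert are hypothetical and not part of Flowfile. The point it tries to show is that the condition is evaluated once on the whole input and only one branch function is ever called.

```python
from typing import Any, Callable

# Stand-in for a DataFrame: a list of row dicts (assumption for this sketch).
Rows = list[dict[str, Any]]


def route(
    rows: Rows,
    condition: Callable[[Rows], bool],
    if_true: Callable[[Rows], Rows],
    if_false: Callable[[Rows], Rows],
) -> Rows:
    """Send the ENTIRE input to exactly one branch; the other is skipped."""
    if condition(rows):
        return if_true(rows)
    return if_false(rows)


def has_twelve_rows(rows: Rows) -> bool:
    # Whole-input check, comparable to the row-count example in the text.
    return len(rows) == 12


def monthly_report(rows: Rows) -> Rows:
    return [{"report": "monthly", "rows": len(rows)}]


def incomplete_alert(rows: Rows) -> Rows:
    return [{"alert": "expected 12 rows", "rows": len(rows)}]


if __name__ == "__main__":
    full_year = [{"month": m} for m in range(1, 13)]
    print(route(full_year, has_twelve_rows, monthly_report, incomplete_alert))
    print(route(full_year[:3], has_twelve_rows, monthly_report, incomplete_alert))
```

A filter node, by contrast, would evaluate a predicate per row and emit a subset of the rows.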

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
Merge keys are specified on the catalog_writer node settings (write_mode +
merge_keys fields), not magically known. The user selects them when
configuring the write operation. Optionally, catalog table metadata can
declare primary keys for validation.
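
One way the settings and the optional primary-key validation could fit together is sketched below. Only the write_mode and merge_keys field names come from the text above. The CatalogWriterSettings class, the table_name field, the "overwrite" default, and validate_merge_keys are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogWriterSettings:
    """Hypothetical node settings; only write_mode/merge_keys are from the plan."""
    table_name: str
    write_mode: str = "overwrite"  # assumed default; "merge" enables merge_keys
    merge_keys: list[str] = field(default_factory=list)


def validate_merge_keys(
    settings: CatalogWriterSettings,
    columns: list[str],
    declared_primary_keys: Optional[list[str]] = None,
) -> list[str]:
    """Return a list of human-readable problems; empty means the config is usable."""
    problems: list[str] = []
    if settings.write_mode != "merge":
        return problems  # merge keys are irrelevant for other write modes
    if not settings.merge_keys:
        problems.append("write_mode 'merge' requires at least one merge key")
    missing = [key for key in settings.merge_keys if key not in columns]
    if missing:
        problems.append(f"merge keys not present in the data: {missing}")
    # Optional check against primary keys declared in catalog table metadata.
    if declared_primary_keys is not None and set(settings.merge_keys) != set(declared_primary_keys):
        problems.append("merge keys differ from the declared primary keys")
    return problems


if __name__ == "__main__":
    settings = CatalogWriterSettings("sales", write_mode="merge", merge_keys=["id"])
    print(validate_merge_keys(settings, ["id", "amount"], declared_primary_keys=["id"]))
```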

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…ble API

Polars LazyFrame.sink_delta() natively supports mode='merge' with
delta_merge_options for predicate/alias config, returning a TableMerger.
No need to convert to Arrow or manage DeltaTable objects manually.

Note: sink_delta is marked unstable in Polars and should be monitored.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
Extracted cloud catalog storage from Delta Lake plan (was Phase 4) and
expanded into a proper feature covering:
- PostgreSQL as alternative to SQLite for catalog metadata
- Cloud storage (S3/ADLS/GCS) for catalog table data
- Shared catalog for multi-user/team deployments
- Catalog federation (external tables without data copying)

The current catalog uses SQLite + local Parquet. The SQLAlchemy ORM and
CatalogRepository protocol provide a good foundation for making backends
pluggable. Alembic migrations needed for schema evolution.
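
To make the pluggable-backend idea concrete, here is a small sketch of a repository protocol with two interchangeable implementations. The real CatalogRepository protocol is not shown in this PR, so the method names (register_table, get_location), the table layout, and both classes are hypothetical. SQLite from the standard library stands in for the metadata store; a PostgreSQL backend would implement the same two methods.

```python
import sqlite3
from typing import Optional, Protocol


class CatalogRepositorySketch(Protocol):
    """Hypothetical minimal protocol; not Flowfile's actual CatalogRepository."""

    def register_table(self, name: str, location: str) -> None: ...
    def get_location(self, name: str) -> Optional[str]: ...


class SqliteCatalog:
    """Metadata kept in SQLite (in-memory by default for this sketch)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS catalog_tables "
            "(name TEXT PRIMARY KEY, location TEXT NOT NULL)"
        )

    def register_table(self, name: str, location: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO catalog_tables (name, location) VALUES (?, ?)",
            (name, location),
        )
        self._conn.commit()

    def get_location(self, name: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT location FROM catalog_tables WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None


class InMemoryCatalog:
    """Dict-backed variant, e.g. for tests."""

    def __init__(self) -> None:
        self._tables: dict[str, str] = {}

    def register_table(self, name: str, location: str) -> None:
        self._tables[name] = location

    def get_location(self, name: str) -> Optional[str]:
        return self._tables.get(name)


def publish(repo: CatalogRepositorySketch, name: str, location: str) -> Optional[str]:
    """Caller code depends only on the protocol, not on a concrete backend."""
    repo.register_table(name, location)
    return repo.get_location(name)


if __name__ == "__main__":
    for repo in (SqliteCatalog(), InMemoryCatalog()):
        print(type(repo).__name__, publish(repo, "sales", "s3://example-bucket/sales"))
```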

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…plitting

The previous example used df.filter() to split rows into two streams,
which contradicts the flow-level branching design (Feature 2). Conditions
route the entire DataFrame to one branch via a Python if/else, not both.

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
…tures

Overview now includes:
- Recently Shipped section documenting parallel execution, kernel runtime,
  named I/O, flow catalog, catalog reader/writer, scheduling, embeddable WASM
- Implementation order in 5 phases: Foundations → Connectivity → Control Flow
  → Composition & Generation → Scale
- Feature table reordered by implementation sequence, not original numbering
- Updated mermaid dependency diagram with all 10 features
- MkDocs nav reordered to match implementation sequence

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
- Delete 04_flow_parameters.md (shipped)
- Remove "Recently Shipped" section (inaccurate)
- Reorder feature table and nav by implementation sequence:
  Phase 1 (Storage): Delta Lake
  Phase 2 (Connectivity): Extended Connectors, Catalog Query
  Phase 3 (Control Flow): Custom Node Designer, Conditional, Iterative
  Phase 4 (Composition): Code Generation, Flow as Custom Node
  Phase 5 (Scale): Cloud & Distributed Catalog
- Update references to Flow Parameters as shipped in other plans

https://claude.ai/code/session_01AUuzPPf1NKgNZaWno58wAd
@netlify

netlify bot commented Mar 30, 2026

Deploy Preview for flowfile-wasm ready!

Name Link
🔨 Latest commit bbd2da7
🔍 Latest deploy log https://app.netlify.com/projects/flowfile-wasm/deploys/69ca885358ac7900098b203a
😎 Deploy Preview https://deploy-preview-377--flowfile-wasm.netlify.app
