Skip to content

fix: prevent SQL injection in workflow PGSqlNode #1322

Open
dongjiang1989 wants to merge 2 commits into
iflytek:mainfrom
dongjiang1989:fix-sql-injection
Open

fix: prevent SQL injection in workflow PGSqlNode #1322
dongjiang1989 wants to merge 2 commits into
iflytek:mainfrom
dongjiang1989:fix-sql-injection

Conversation

@dongjiang1989
Copy link
Copy Markdown
Contributor

Summary

  • Add input validation for PostgreSQL identifiers (table/column names) to block SQL reserved keywords and disallowed characters
  • Implement validate_sql_template() to restrict CUSTOM mode to safe DML operations (SELECT/INSERT/UPDATE/DELETE/CALL), block multi-statement SQL and dangerous keywords (DROP, ALTER, TRUNCATE,
    etc.)
  • Replace naive string substitution in replace_placeholders() with SQLAlchemy parameterized binding (text() + bindparams()) to prevent SQL injection through user-supplied values
  • Add type validation for database-returned INSERT IDs to prevent type confusion

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Refactoring

Related Issue

Fixes #1281

Changes

Testing

  • Existing tests pass
  • New tests added (if applicable)
  • Manual testing completed

Screenshots (if applicable)

Checklist

  • Code follows project coding standards
  • Self-review completed
  • Documentation updated (if needed)
  • Breaking changes documented

Signed-off-by: dongjiang <dongjiang1989@126.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces SQL validation and sanitization measures for PostgreSQL nodes, including identifier pattern enforcement, multi-statement blocking, and keyword restriction. It also migrates placeholder replacement to use SQLAlchemy's parameter binding for improved security. Feedback suggests expanding the allowed SQL keywords to include 'WITH' and 'EXPLAIN' to avoid over-restricting valid queries. Furthermore, the use of SQLAlchemy's text() on the full template needs to be reconsidered as it may cause regressions with PostgreSQL type casts, native colons, and quoted placeholders.

Comment thread core/workflow/engine/nodes/pgsql/pgsql_node.py Outdated
Comment on lines +244 to +245
stmt = text(sql).bindparams(**params)
return str(stmt.compile(compile_kwargs={"literal_binds": True}))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using SQLAlchemy's text() on the entire SQL template introduces several regressions:

  1. Type Casts: PostgreSQL's double-colon type casts (e.g., col::text) will be broken because SQLAlchemy interprets :: as an escaped parameter, resulting in a single colon in the output (e.g., col:text), which is invalid syntax.
  2. Unbound Parameters: If the user's SQL contains colons that are not intended as parameters (e.g., in a string literal or a native database parameter like :name), SQLAlchemy will attempt to bind them. Since they aren't in the params dictionary, the compilation will fail with a CompileError or KeyError.
  3. Quoted Placeholders: Placeholders wrapped in quotes in the template (e.g., '{{var}}') will not be substituted because SQLAlchemy does not perform parameter binding inside string literals. The resulting SQL will literally contain the string :p_0.
  4. Dialect Specifics: Compiling with the default dialect may lead to incorrect escaping of literals. It is safer to explicitly use the PostgreSQL dialect (e.g., stmt.compile(dialect=postgresql.dialect(), ...)).

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: dongjiang <dongjiang2010@gmail.com>
@dongjiang1989
Copy link
Copy Markdown
Contributor Author

/cc @hygao1024 PTAL

@hygao1024
Copy link
Copy Markdown
Contributor

Hi developer,

The database node in the workflow mainly serves as a carrier for invoking dbservice. It does not handle security checks such as SQL injection detection by itself. Related validation and protection logic should be uniformly handled by dbservice to ensure generality, consistency, and reusability.

Therefore, this issue is recommended to be enhanced at the dbservice layer, rather than being implemented separately within a single workflow node.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Critical: Unauthenticated RCE via /workflow/v1/code/run + SQL Injection (CWE-94, CWE-306, CWE-89)

2 participants