Security: Validate SQL identifiers in Spanner search tool to prevent injection #5952

Open
Ashutosh0x wants to merge 1 commit into google:main from Ashutosh0x:fix/spanner-sql-identifier-validation

Ashutosh0x commented Jun 3, 2026

Summary

Fix SQL injection vulnerabilities in google.adk.tools.spanner.search_tool where LLM-controllable parameters (table_name, columns, embedding_column_to_search, additional_filter, top_k) are interpolated into SQL via f-strings without validation.

Problem

The _generate_sql_for_knn() and _generate_sql_for_ann() functions build SQL queries using f-string interpolation. Since similarity_search is registered as a GoogleTool in SpannerToolset.get_tools(), the LLM populates these parameters at runtime via tool calls. A successful prompt injection can cause the LLM to call the tool with malicious values, enabling:

  1. UNION-based exfiltration via additional_filter:
    WHERE 1=1 UNION ALL SELECT password, 0.0 FROM admin_credentials

  2. INFORMATION_SCHEMA enumeration via columns:
    SELECT (SELECT STRING_AGG(table_name, ',') FROM INFORMATION_SCHEMA.TABLES) AS dump

  3. Cross-table reads via table_name:
    FROM documents JOIN admin_credentials ac ON TRUE
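
To make the three cases above concrete, here is a minimal sketch of the failure mode. The function `build_knn_sql_unsafe` is a hypothetical stand-in, not the actual body of `_generate_sql_for_knn()`; it only assumes that caller-supplied strings reach the query text through f-string interpolation.

```python
def build_knn_sql_unsafe(table_name, columns, additional_filter, top_k):
    """Hypothetical stand-in for an f-string based query builder."""
    select_list = ", ".join(columns)
    # Every argument is pasted into the SQL text verbatim.
    return (
        f"SELECT {select_list} FROM {table_name} "
        f"WHERE {additional_filter} LIMIT {top_k}"
    )


# A benign tool call produces the query the developer had in mind.
benign = build_knn_sql_unsafe("documents", ["title"], "category = 'faq'", 5)

# Values shaped by a prompt injection change the structure of the query.
hostile = build_knn_sql_unsafe(
    "documents JOIN admin_credentials ac ON TRUE",
    ["title"],
    "1=1 UNION ALL SELECT password, 0.0 FROM admin_credentials",
    5,
)

print(benign)
print(hostile)
```

The second query still reads as a single statement, which is why stripping semicolons alone would not stop it.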

Fix

  • Identifier validation: Regex-validate table_name, columns, and embedding_column_to_search against a safe pattern (^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$ or quoted with backticks/double-quotes)
  • Filter validation: Reject common injection patterns (UNION, ;, --, /* */) in additional_filter
  • Model name validation: Validate GSQL model names as identifiers and PG endpoints as URI format
  • Type enforcement: Cast top_k and num_leaves_to_search to int
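
The checks listed above could look roughly like the sketch below. It is not the code in this PR: the helper names (`validate_identifier`, `validate_additional_filter`, `coerce_int`), the quoted-identifier pattern, and the error messages are assumptions made for illustration. Only the unquoted identifier pattern and the list of rejected filter tokens come from the description. Model name and endpoint validation are left out because their formats are not spelled out here.

```python
import re

# Unquoted, optionally dot-qualified identifiers such as "docs" or "my_schema.docs".
_UNQUOTED = re.compile(r'[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*')
# Assumption: a quoted identifier is one pair of backticks or double quotes
# around text that contains neither quote character.
_QUOTED = re.compile(r'`[^`"]+`|"[^`"]+"')
# Tokens named in the description: UNION, ';', '--', and block comments.
_FILTER_DENYLIST = re.compile(r'\bUNION\b|;|--|/\*|\*/', re.IGNORECASE)


def validate_identifier(value, kind="identifier"):
    """Return value unchanged if it looks like a single SQL identifier."""
    # fullmatch is used instead of ^...$ because `$` also matches just
    # before a trailing newline.
    if not isinstance(value, str) or not (
        _UNQUOTED.fullmatch(value) or _QUOTED.fullmatch(value)
    ):
        raise ValueError(f"Invalid SQL {kind}: {value!r}")
    return value


def validate_additional_filter(value):
    """Reject filters that contain one of the listed injection tokens."""
    if _FILTER_DENYLIST.search(value):
        raise ValueError(f"Disallowed pattern in additional_filter: {value!r}")
    return value


def coerce_int(value, name):
    """Cast to int; non-numeric strings raise ValueError."""
    try:
        return int(value)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"{name} must be an integer, got {value!r}") from exc


print(validate_identifier("my_schema.documents", "table_name"))
print(validate_additional_filter("category = 'faq'"))
print(coerce_int("5", "top_k"))
```

A denylist of this kind only blocks the tokens it names; a filter containing a subquery without UNION would still pass, so it is best read as defense in depth and not as a complete SQL parser.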

Testing

Added tests/unittests/tools/spanner/test_spanner_sql_validation.py with tests covering:

  • Valid identifiers (simple, schema-qualified, quoted)
  • Rejected injection patterns (JOIN in table_name, subquery in columns, UNION in filter)
  • String-to-int coercion for top_k
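
For readers without the test file at hand, the shape of such tests might resemble the following sketch. It uses the standard-library `unittest` module so it runs without extra dependencies, and it defines its own stand-in `validate_identifier` (a hypothetical helper, not the PR's `_validate_identifier`) so the example is self-contained. The actual test file may be organized differently.

```python
import re
import unittest

# Stand-in pattern: unquoted dot-qualified names, or a backtick/double-quoted name.
_IDENTIFIER = re.compile(
    r'[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*|`[^`"]+`|"[^`"]+"'
)


def validate_identifier(value):
    """Stand-in validator so the tests below run on their own."""
    if not _IDENTIFIER.fullmatch(value):
        raise ValueError(f"Invalid SQL identifier: {value!r}")
    return value


class SqlValidationTest(unittest.TestCase):

    def test_valid_identifiers(self):
        # Simple, schema-qualified, and quoted forms are accepted.
        for value in ("documents", "my_schema.documents", "`my table`"):
            with self.subTest(value=value):
                self.assertEqual(validate_identifier(value), value)

    def test_rejected_identifiers(self):
        hostile = (
            "docs JOIN secrets ON TRUE",
            "(SELECT table_name FROM INFORMATION_SCHEMA.TABLES)",
        )
        for value in hostile:
            with self.subTest(value=value):
                with self.assertRaises(ValueError):
                    validate_identifier(value)

    def test_top_k_string_is_coerced(self):
        # A numeric string converts cleanly; trailing SQL does not.
        self.assertEqual(int("5"), 5)
        with self.assertRaises(ValueError):
            int("5 UNION ALL SELECT 1")
```

Saved to a file, the class can be run with `python -m unittest <module>`.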

Fixes #5913

…tion

Add input validation for structural SQL identifiers (table_name, columns,
embedding_column_to_search) and filter expressions (additional_filter) in
the Spanner vector search tool before they are interpolated into SQL via
f-strings.

Problem:
The _generate_sql_for_knn() and _generate_sql_for_ann() functions build SQL
queries using f-string interpolation. Since similarity_search is registered
as a GoogleTool, the LLM populates these parameters at runtime. A prompt
injection attack can cause the LLM to supply malicious values like:

  - table_name='docs JOIN secrets ON TRUE' (cross-table read)
  - columns=['(SELECT ... FROM INFORMATION_SCHEMA.TABLES)'] (schema enum)
  - additional_filter='1=1 UNION ALL SELECT password FROM creds' (exfil)

Fix:
- Regex-validate identifiers against a safe pattern (alphanumeric, _, .)
- Reject common injection patterns (UNION, ;, --, /* */) in filters
- Validate GSQL model name as identifier and PG endpoint as URI format
- Enforce int type on top_k and num_leaves_to_search

Fixes google#5913

adk-bot added the tools label ([Component] This issue is related to tools) Jun 3, 2026
Collaborator

adk-bot commented Jun 3, 2026

Response from ADK Triaging Agent

Hello @Ashutosh0x, thank you for creating this PR! It is fantastic to see SQL injection validation being added to the Spanner search tool.

To help our reviewers process your contribution more efficiently, could you please update the PR to include:

  • A summary of passed pytest results (e.g., the terminal output showing the new tests under tests/unittests/tools/spanner/test_spanner_sql_validation.py are passing).

Thank you for contributing to the ADK project!

Author

Ashutosh0x commented Jun 3, 2026

Hi @wukath — this PR fixes a SQL injection vulnerability in the Spanner search tool (issue #5913).

The similarity_search function is registered as a GoogleTool, meaning the LLM populates parameters like table_name, columns, and additional_filter at runtime. These values are interpolated directly into SQL via f-strings in _generate_sql_for_knn() and _generate_sql_for_ann() without any validation, so a prompt injection can cause the LLM to supply malicious values, enabling UNION-based data exfiltration, schema enumeration, or cross-table JOINs.

The fix adds regex-based identifier validation and keyword filtering at a single chokepoint (_validate_identifier, _validate_additional_filter), plus type enforcement on top_k. I also included a test suite covering the main injection primitives. Happy to adjust the approach if needed!

Development

Successfully merging this pull request may close these issues.

Spanner search tool: validate structural SQL identifiers (defense-in-depth for prompt-injection-driven SQLi)
