Python: Add `timeout` parameter to `FoundryAgent` to fix `ConnectTimeout` on multi-turn conversations by moonbox3 · Pull Request #6263 · microsoft/agent-framework

moonbox3 · 2026-06-02T10:08:24Z

Motivation and Context

On Azure AI endpoints, idle connections between conversation turns can be recycled by the network, so the next request has to re-establish a TCP connection. The OpenAI SDK's default connect timeout (5 s) is too short for Azure AI Foundry endpoints under load, and users had no way to override it. The result was httpx.ConnectTimeout → openai.APITimeoutError on every second (and subsequent) agent.run() call.

Fixes #6241

Description

The root cause is that FoundryAgent (and the underlying RawFoundryAgentChatClient) created the AsyncOpenAI / AsyncAzureOpenAI client without exposing any timeout knob, so callers were stuck with the SDK's hardcoded 5 s connect timeout. The fix adds an optional timeout: float | None parameter to FoundryAgent, RawFoundryAgent, _FoundryAgentChatClient, RawFoundryAgentChatClient, and RawOpenAIChatClient, threading it down to load_openai_service_settings, where it is forwarded as the timeout argument when constructing the underlying async OpenAI client. When timeout=None (the default), existing behavior is preserved. Tests cover that a non-None timeout is applied to the client, that None leaves the client's default intact, and that the parameter is present (not absorbed by **kwargs) on all public constructors.
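The last check mentioned above (that `timeout` is a declared parameter rather than something swallowed by `**kwargs`) can be sketched with `inspect.signature`. This is only an illustration: `ExampleAgent`, `LegacyAgent`, and `has_explicit_parameter` are hypothetical stand-ins, not the framework's classes or the tests added in this PR.

```python
import inspect
from typing import Optional


class ExampleAgent:
    """Hypothetical stand-in for a constructor that exposes `timeout`."""

    def __init__(self, *, agent_name: str, timeout: Optional[float] = None, **kwargs: object) -> None:
        self.agent_name = agent_name
        self.timeout = timeout
        self.extra = kwargs


class LegacyAgent:
    """Hypothetical stand-in where `timeout` would only land in **kwargs."""

    def __init__(self, *, agent_name: str, **kwargs: object) -> None:
        self.agent_name = agent_name
        self.extra = kwargs


def has_explicit_parameter(cls: type, name: str) -> bool:
    """Return True if `name` is a declared (non-**kwargs) parameter of cls.__init__."""
    param = inspect.signature(cls.__init__).parameters.get(name)
    return param is not None and param.kind is not inspect.Parameter.VAR_KEYWORD
```

A check of this kind fails as soon as a constructor stops declaring the parameter, even if passing `timeout=...` would still be accepted through `**kwargs`.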

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by an agent

…icrosoft#6241)

Expose a `timeout` parameter on `RawFoundryAgentChatClient`, `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`, and `RawOpenAIChatClient` so callers can override the HTTP timeout used by the underlying AsyncOpenAI client.

Root cause: `RawFoundryAgentChatClient.__init__` called `project_client.get_openai_client()` without configuring any timeout, inheriting the OpenAI SDK default of `httpx.Timeout(connect=5.0)`. When connections are recycled between turns under load, the 5 s connect timeout fires and surfaces as `openai.APITimeoutError`.

Fix:
- `load_openai_service_settings` (`_shared.py`): accept `timeout` and include it in `client_args` for all three `AsyncOpenAI`/`AsyncAzureOpenAI` construction paths.
- `RawOpenAIChatClient.__init__` (`_chat_client.py`): accept `timeout` and forward to `load_openai_service_settings`.
- `RawFoundryAgentChatClient.__init__` (`_agent.py`): accept `timeout` and set `openai_client.timeout = timeout` on the client returned by `get_openai_client()` before passing it to the base class.
- `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`: accept and propagate `timeout` through the construction chain.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
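The `client_args` change described in this commit can be pictured roughly as follows. This is a minimal sketch, not the code from `_shared.py`: `build_client_args` and `StubAsyncClient` are hypothetical, the stub collapses the SDK's timeout object into a single float, and the sketch assumes the value is added to `client_args` only when it is not `None` so that the default path stays untouched.

```python
from typing import Any, Dict, Optional

# Stand-in for the 5 s default connect timeout quoted in this PR.
DEFAULT_CONNECT_TIMEOUT = 5.0


class StubAsyncClient:
    """Hypothetical stand-in for the async client being constructed."""

    def __init__(self, *, api_key: str, timeout: float = DEFAULT_CONNECT_TIMEOUT) -> None:
        self.api_key = api_key
        self.timeout = timeout


def build_client_args(api_key: str, timeout: Optional[float] = None) -> Dict[str, Any]:
    """Collect constructor arguments, adding `timeout` only when it was given."""
    client_args: Dict[str, Any] = {"api_key": api_key}
    if timeout is not None:
        client_args["timeout"] = timeout
    return client_args


# Default path: no timeout key, so the client keeps its own default.
default_client = StubAsyncClient(**build_client_args("test-key"))
# Override path: the caller's value reaches the constructor.
tuned_client = StubAsyncClient(**build_client_args("test-key", timeout=60.0))
```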

Expose a timeout parameter on RawFoundryAgentChatClient, _FoundryAgentChatClient, RawFoundryAgent, FoundryAgent, and RawOpenAIChatClient. When provided, the value is applied to the underlying AsyncOpenAI client so that connect timeouts under load or after connection recycling can be tuned by callers. Previously, get_openai_client() was called without any timeout override, so the SDK default of httpx.Timeout(connect=5.0) was inherited and could fire on multi-turn conversations where the underlying connection is recycled between turns. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…out` on multi-turn conversations Fixes microsoft#6241

Copilot

Pull request overview

Adds an optional timeout: float | None parameter through FoundryAgent → _FoundryAgentChatClient → RawFoundryAgentChatClient → RawOpenAIChatClient → load_openai_service_settings, so callers can override the OpenAI SDK's 5s default connect timeout that was causing ConnectTimeout on multi-turn Foundry conversations (issue #6241). Also bumps some azure-ai-agentserver-* dependencies and applies minor formatting touch-ups in unrelated test/source files.

Changes:

Thread a timeout parameter through the OpenAI/Foundry client constructors and into the underlying AsyncOpenAI/AsyncAzureOpenAI client.
Add unit tests asserting the parameter is exposed (not absorbed by **kwargs) and that the timeout is/isn't applied based on None vs non-None.
Refresh uv.lock (notably azure-ai-agentserver-core 2.0.0b3→2.0.0b5, azure-ai-agentserver-responses 1.0.0b5→1.0.0b7, new microsoft-opentelemetry) plus minor formatting tweaks in bedrock and foundry_hosting tests.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| python/packages/openai/agent_framework_openai/_shared.py | Accept `timeout` and forward to `AsyncOpenAI` / `AsyncAzureOpenAI` constructors across all routing branches. |
| python/packages/openai/agent_framework_openai/_chat_client.py | Add `timeout` kwarg + docstrings to all `RawOpenAIChatClient.__init__` overloads; pass into `load_openai_service_settings`. |
| python/packages/openai/tests/openai/test_openai_chat_client.py | Tests that `timeout` is an explicit parameter and accepted when a preconfigured client is supplied. |
| python/packages/foundry/agent_framework_foundry/_agent.py | Add `timeout` to `RawFoundryAgentChatClient`, `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`; mutate `openai_client.timeout` post-`get_openai_client`. |
| python/packages/foundry/tests/foundry/test_foundry_agent.py | Coverage for `timeout` propagation and the `None` no-op across all four entry points. |
| python/packages/bedrock/agent_framework_bedrock/_chat_client.py | Formatting collapse of a few multi-line literals (no behavior change). |
| python/packages/bedrock/tests/test_bedrock_structured_output.py | Cosmetic blank line. |
| python/packages/foundry_hosting/tests/test_responses.py | Collapse two list comprehensions onto single lines (cosmetic). |
| python/uv.lock | Bump azure-ai-agentserver-core/responses and pull in `microsoft-opentelemetry` + `opentelemetry-instrumentation-httpx/openai*/util-genai`. |

moonbox3

Automated Code Review

Reviewers: 4 | Confidence: 74% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach

Automated review by automated agents

github-actions · 2026-06-02T10:13:27Z

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/bedrock/agent_framework_bedrock | | | | |
| _chat_client.py | 445 | 98 | 77% | 304–305, 321–330, 336, 404, 413, 424, 426, 428, 433, 452–453, 477, 490, 502, 505, 513–514, 517–518, 520–521, 526–528, 530, 540–541, 563, 570, 579–580, 582–583, 585–587, 589, 591–592, 598–600, 603–604, 610–613, 619–629, 632, 651, 656, 701–702, 715, 741, 753, 758, 786, 790–791, 794, 812, 836, 848, 852, 866, 874–875, 879, 881–888 |
| packages/foundry/agent_framework_foundry | | | | |
| _agent.py | 242 | 56 | 76% | 119, 122, 244–245, 249–251, 256–259, 352, 425–426, 438–439, 451–453, 455–456, 458–464, 466–467, 469, 471, 477–479, 482–491, 495–496, 694–695, 698, 724, 734, 750, 820, 825, 829 |
| packages/openai/agent_framework_openai | | | | |
| _chat_client.py | 1084 | 148 | 86% | 276, 289, 639–643, 651–654, 660–664, 714–721, 723–725, 732–734, 780, 788, 811, 929, 1028, 1087, 1089, 1091, 1093, 1159, 1173, 1253, 1263, 1268, 1311, 1422–1423, 1438, 1647, 1652, 1656–1658, 1662–1663, 1746, 1756, 1783, 1789, 1799, 1805, 1810, 1816, 1821–1822, 1841, 1844–1847, 1861, 1863, 1871–1872, 1884, 1926, 2016, 2038–2039, 2054–2055, 2073–2074, 2117, 2283, 2321–2322, 2340, 2420–2428, 2458, 2568, 2603, 2618, 2638–2648, 2661, 2672–2676, 2690, 2704–2715, 2724, 2756–2759, 2769–2770, 2781–2783, 2797–2799, 2809–2810, 2816, 2831 |
| _shared.py | 156 | 16 | 89% | 223, 242–244, 256, 266, 278, 284, 306, 310, 344–345, 364, 383–384, 386 |
| TOTAL | 37790 | 4423 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 7516 | 34 💤 | 0 ❌ | 0 🔥 | 1m 56s ⏱️ |

… timeout (microsoft#6241)

Replace direct `openai_client.timeout = timeout` assignment with `openai_client.with_options(timeout=timeout)` in RawFoundryAgentChatClient.__init__.

The Azure AI Projects SDK caches and returns a shared AsyncOpenAI client per AIProjectClient. Mutating its .timeout attribute leaked the override to all other code paths sharing that client (other agents, user code). with_options() returns a new client instance with the override applied, leaving the original shared client untouched.

Update tests to assert with_options is called with the correct timeout and that the original shared client's timeout attribute is not mutated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
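The difference between the two approaches can be illustrated without the SDK. In this sketch `StubOpenAIClient` is a hypothetical stand-in whose `with_options` returns a shallow copy with the override applied; it only mimics the copy-on-override behavior described in the commit message and is not the real client.

```python
import copy
from typing import Optional


class StubOpenAIClient:
    """Hypothetical stand-in for a cached client shared by several callers."""

    def __init__(self, timeout: float = 5.0) -> None:
        self.timeout = timeout

    def with_options(self, *, timeout: Optional[float] = None) -> "StubOpenAIClient":
        # Return a new instance; the original object is left untouched.
        clone = copy.copy(self)
        if timeout is not None:
            clone.timeout = timeout
        return clone


# Before: mutating the shared instance changes it for every holder.
leaky_shared = StubOpenAIClient()
other_holder = leaky_shared  # e.g. another agent using the same cached client
leaky_shared.timeout = 60.0

# After: the override lives on a separate instance.
shared = StubOpenAIClient()
scoped = shared.with_options(timeout=60.0)
```

Here `other_holder` sees the changed timeout in the first variant, while `shared` keeps its original value in the second.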

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 89%

✓ Correctness

The PR correctly adds a timeout parameter through the FoundryAgent, RawFoundryAgent, _FoundryAgentChatClient, RawFoundryAgentChatClient, and RawOpenAIChatClient constructors, threading it down to load_openai_service_settings where it is applied to all three client construction paths (AsyncOpenAI for OpenAI, AsyncOpenAI for Azure /openai/v1 endpoint, AsyncAzureOpenAI for standard Azure). The Foundry agent correctly mutates the client's timeout attribute after obtaining it from the factory. Control flow, attribute mutation ordering, and MRO propagation are all correct. No correctness bugs found.

✓ Security Reliability

The PR cleanly threads a timeout parameter through all OpenAI/Azure client construction paths. The timeout is applied correctly: as a constructor argument for newly-created clients in load_openai_service_settings, and via direct attribute mutation for Foundry's project_client.get_openai_client() path. No input validation issues, injection risks, or resource leaks were found. The silent no-op when both async_client and timeout are provided to RawOpenAIChatClient is tested and intentional (user is responsible for configuring their own pre-built client).

✓ Test Coverage

The PR adds comprehensive test coverage for the timeout parameter in the foundry package (verifying timeout is applied/not-applied to the client). However, the openai package has a significant test coverage gap: load_openai_service_settings has 3 newly-added code paths that thread timeout into AsyncOpenAI() / AsyncAzureOpenAI() constructors, but none of these paths have unit tests. The only openai-package test (test_raw_openai_chat_client_accepts_preconfigured_client_with_timeout) exercises the pre-configured client path where timeout is intentionally ignored, and asserts only client is not None. A test verifying timeout is actually applied when RawOpenAIChatClient constructs its own client (the primary fix path) would strengthen confidence in the openai-layer implementation.
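A test of the kind suggested here could look roughly like the sketch below. It is an assumption-laden illustration rather than the framework's code: `make_client` is a hypothetical helper that mirrors the described behavior (reuse a pre-built client, otherwise construct one and pass `timeout` along), and the constructor is replaced by a `MagicMock` so the forwarded arguments can be inspected.

```python
from typing import Any, Callable, Dict, Optional
from unittest.mock import MagicMock


def make_client(
    factory: Callable[..., Any],
    *,
    api_key: str,
    timeout: Optional[float] = None,
    async_client: Any = None,
) -> Any:
    """Hypothetical helper: reuse a pre-built client, else construct one."""
    if async_client is not None:
        # A caller-supplied client is used as-is; `timeout` is ignored.
        return async_client
    kwargs: Dict[str, Any] = {"api_key": api_key}
    if timeout is not None:
        kwargs["timeout"] = timeout
    return factory(**kwargs)


def test_timeout_reaches_constructor() -> None:
    factory = MagicMock()
    client = make_client(factory, api_key="test-key", timeout=42.0)
    # The constructor must have received the caller's timeout.
    factory.assert_called_once_with(api_key="test-key", timeout=42.0)
    assert client is factory.return_value


def test_preconfigured_client_skips_constructor() -> None:
    factory = MagicMock()
    prebuilt = object()
    result = make_client(factory, api_key="k", timeout=42.0, async_client=prebuilt)
    assert result is prebuilt
    factory.assert_not_called()
```

The first test covers the path where the client is constructed internally; the second documents the intentional no-op when a pre-built client is supplied.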

✗ Design Approach

The timeout plumbing is mostly consistent, but one design gap remains in the Foundry wrapper path: after adding timeout as part of RawFoundryAgentChatClient's public configuration, converting that client with as_agent() still recreates a FoundryAgent without preserving the timeout, which silently falls back to the SDK default on that path.

Flagged Issues

RawFoundryAgentChatClient.as_agent() drops the newly added timeout setting. The method is documented to "reuse this client's Foundry configuration" but rebuilds FoundryAgent with only project_client, agent_name, and agent_version (lines 298-315). A caller using RawFoundryAgentChatClient(..., timeout=60).as_agent() will silently lose the timeout and fall back to the SDK default.
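One possible shape of a fix is sketched below; this thread does not show whether or how the issue was addressed. `StubChatClient` and `StubAgent` are hypothetical stand-ins, and the sketch assumes the client stores the timeout it was given so that `as_agent()` can forward it.

```python
from typing import Optional


class StubAgent:
    """Hypothetical stand-in for the agent rebuilt by as_agent()."""

    def __init__(self, *, agent_name: str, timeout: Optional[float] = None) -> None:
        self.agent_name = agent_name
        self.timeout = timeout


class StubChatClient:
    """Hypothetical stand-in for a chat client that remembers its timeout."""

    def __init__(self, *, agent_name: str, timeout: Optional[float] = None) -> None:
        self.agent_name = agent_name
        self._timeout = timeout

    def as_agent(self) -> StubAgent:
        # Forward the stored timeout so the conversion does not reset it.
        return StubAgent(agent_name=self.agent_name, timeout=self._timeout)


agent = StubChatClient(agent_name="demo", timeout=60.0).as_agent()
```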

Automated review by moonbox3's agents

moonbox3

Automated Code Review

Reviewers: 4 | Confidence: 91%

✓ Correctness

The PR correctly fixes a shared-client mutation bug by replacing openai_client.timeout = timeout with openai_client = openai_client.with_options(timeout=timeout) in RawFoundryAgentChatClient.__init__. The returned client is immediately reassigned and passed to super().__init__, so the timeout is applied correctly. Both _FoundryAgentChatClient and FoundryAgent funnel their timeout parameters through this single code path (lines 585 and 1003), so the fix covers all three public classes. Tests are consistent with the implementation. No correctness issues found.

✓ Security Reliability

The PR correctly fixes a shared-state mutation bug by replacing openai_client.timeout = timeout (direct attribute mutation of a shared client) with openai_client = openai_client.with_options(timeout=timeout) (immutable copy with the new timeout). The change is surgical, confined to one line at _agent.py:268, and fully covered by the updated tests. No security or reliability issues were identified.

✓ Test Coverage

The production fix is correct — openai_client.with_options(timeout=timeout) is called and its return value is reassigned. All four timeout tests verify the call was made and the original mock was not mutated. However, none of them assert that the return value of with_options is actually stored in the constructed instance's .client attribute. A future regression that calls with_options but discards the return value would pass all tests while silently ignoring the timeout.

✓ Design Approach

I did not find a design-approach issue in this change. The new with_options(timeout=...) call in python/packages/foundry/agent_framework_foundry/_agent.py scopes timeout to the wrapper-specific OpenAI client instance without mutating the shared client returned by project_client.get_openai_client(...), and that matches the new tests' stated invariant. I also did not find conflicting lifecycle or ownership behavior in the surrounding Foundry wrapper code that would make this approach unsafe.

Automated review by automated agents

…ent (microsoft#6241)

The four timeout propagation tests verified that with_options was called but did not confirm that the returned (timeout-configured) client was actually stored on the instance. A silent discard of the return value would have left the tests green while the timeout had no effect.

Each test now captures the constructed instance and asserts:

    assert <instance>.client is openai_client_mock.with_options.return_value

Affected tests:
- test_raw_foundry_agent_chat_client_init_applies_timeout_to_openai_client
- test_raw_foundry_agent_chat_client_init_applies_timeout_with_preview_enabled
- test_foundry_agent_chat_client_init_propagates_timeout
- test_foundry_agent_init_propagates_timeout_to_openai_client

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
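The strengthened assertion can be sketched with `unittest.mock`. `StubFoundryChatClient` below is a hypothetical stand-in for the wrapper under test, written only to mirror the behavior this commit describes; the real tests and constructor signatures may differ.

```python
from typing import Any, Optional
from unittest.mock import MagicMock


class StubFoundryChatClient:
    """Hypothetical stand-in for the wrapper under test."""

    def __init__(self, project_client: Any, timeout: Optional[float] = None) -> None:
        openai_client = project_client.get_openai_client()
        if timeout is not None:
            # The return value must be kept; discarding it drops the timeout.
            openai_client = openai_client.with_options(timeout=timeout)
        self.client = openai_client


def test_timeout_client_is_stored() -> None:
    project_client = MagicMock()
    openai_client_mock = project_client.get_openai_client.return_value

    instance = StubFoundryChatClient(project_client, timeout=60.0)

    # Verifies the call was made with the caller's timeout ...
    openai_client_mock.with_options.assert_called_once_with(timeout=60.0)
    # ... and that the returned, timeout-configured client is what got stored.
    assert instance.client is openai_client_mock.with_options.return_value
```

If the constructor called `with_options` without reassigning the result, the first assertion would still pass and only the identity check would fail.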

moonbox3

Automated Code Review

Reviewers: 4 | Confidence: 93% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach

Automated review by automated agents

Copilot and others added 3 commits June 2, 2026 09:40

Python: Add timeout parameter to FoundryAgent to fix `ConnectTime…

5e72b47

…out` on multi-turn conversations Fixes microsoft#6241

Copilot AI review requested due to automatic review settings June 2, 2026 10:08

Copilot started reviewing on behalf of moonbox3 June 2, 2026 10:08

moonbox3 added the python label Jun 2, 2026

Copilot AI reviewed Jun 2, 2026

Comment thread python/packages/foundry/agent_framework_foundry/_agent.py

moonbox3 commented Jun 2, 2026

github-actions Bot reviewed Jun 2, 2026

Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py

eavanvalkenburg approved these changes Jun 2, 2026

moonbox3 commented Jun 2, 2026

moonbox3 enabled auto-merge June 2, 2026 10:38

moonbox3 commented Jun 2, 2026

moonbox3 requested review from giles17 and semenshi June 2, 2026 11:04

moonbox3 wants to merge 5 commits into microsoft:main from moonbox3:agent/fix-6241-1

3 participants
