Codex Router Alignment Plan
Confirmed Scope (from latest requirements)
- Keep `parallel_tool_calls: true` in outbound responses payloads.
- Do not send `prompt_cache_key` from the router for now.
- Always send `include: ["reasoning.encrypted_content"]`.
- Header work now: remove self-added duplicate headers only.
- Message payload work now: stop string `content` serialization and send content parts like the good sample.
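
The three payload-level rules above can be sketched as a single normalization step. This is a minimal illustration, not the router's actual code: the function name `apply_confirmed_scope` and the idea of post-processing a finished payload dict are assumptions.

```python
from typing import Any

REQUIRED_INCLUDE = "reasoning.encrypted_content"


def apply_confirmed_scope(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `payload` adjusted to the confirmed scope.

    Hypothetical helper: the real translator may set these fields while
    building the payload rather than patching it afterwards.
    """
    out = dict(payload)  # shallow copy so the caller's dict is not mutated

    # Keep parallel tool calls enabled on every outbound request.
    out["parallel_tool_calls"] = True

    # The router does not send a prompt cache key for now.
    out.pop("prompt_cache_key", None)

    # Always request encrypted reasoning content, without duplicating it.
    include = list(out.get("include") or [])
    if REQUIRED_INCLUDE not in include:
        include.append(REQUIRED_INCLUDE)
    out["include"] = include
    return out
```

Returning a copy keeps the step free of side effects, so it can be applied late in the translator without affecting any object the caller still holds.
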
Deferred (intentionally postponed)
- Full header parity with the golden capture (transport/runtime-level UA and low-level accept-encoding parity).
- Full one-to-one `input` history shape parity (`type` omission strategy for message items).
- Recovering or synthesizing top-level developer message from upstream chat-completions schema.
- End-to-end reasoning item roundtrip parity in history (`type: reasoning` pass-through and replay behavior).
- Prompt cache implementation strategy and lifecycle management.
Feasible Path For Deferred Items
- Header parity
  - Keep current sdk-based client for now.
  - If exact parity is required, switch codex provider transport from `AsyncOpenAI` to a custom `httpx` SSE client and set an explicit header allowlist.
- Input history shape parity
  - Add a translator mode that emits implicit message items (`{"role":...,"content":...}`) without `type: "message"`.
  - Keep explicit item support for `function_call` and `function_call_output` unchanged.
- Developer message availability
  - Add optional request extension field(s) in `model_extra`, e.g. `opencode_developer_message` or `opencode_input_items`.
  - Use extension when provided; otherwise keep current first-system/developer-to-instructions behavior.
- Reasoning item roundtrip
  - Accept explicit inbound items with `extra.type == "reasoning"` and pass through `encrypted_content` + `summary` to responses `input`.
  - Keep chat-completions output contract unchanged; reasoning passthrough is input-side only unless a dedicated raw endpoint is added.
- Prompt cache strategy
  - Keep disabled by default.
  - Add optional feature flag for deterministic hash-based key generation once cache policy is agreed.
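
As one illustration of the input history item above, the translator mode could look roughly like this. The function name `to_input_item` and the `implicit_messages` flag are hypothetical; only the behavior (drop `type` on message items, leave `function_call` and `function_call_output` untouched) comes from the plan.

```python
from typing import Any


def to_input_item(item: dict[str, Any], implicit_messages: bool = False) -> dict[str, Any]:
    """Translate one history item for the responses `input` array.

    Assumption: items without a `type` field are treated as messages.
    """
    item_type = item.get("type", "message")

    # Explicit items such as function_call / function_call_output
    # are passed through unchanged in both modes.
    if item_type != "message":
        return dict(item)

    if implicit_messages:
        # Implicit form: {"role": ..., "content": ...} with no `type` key.
        return {key: value for key, value in item.items() if key != "type"}

    # Default form keeps the explicit `type: "message"` marker.
    return {**item, "type": "message"}
```

Keeping the mode behind a flag means the current explicit shape stays the default until parity with the golden capture is actually required.
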
Schema.md Gap Breakdown (planning only, no implementation yet)
Legend
- `Supported` = already implemented.
- `Partial` = partly implemented but not schema-complete.
- `Missing` = not implemented yet.

| # | Area | What it does for users | Current status | Decision from latest review | Notes / planned behavior |
|---|---|---|---|---|---|
| 1 | Extra request controls (`provider`, `plugins`, `session_id`, `trace`, `models`, `debug`, `image_config`) | Lets users steer upstream routing, observability, plugin behavior, and image/provider-specific behavior directly from request body. | Missing | Explain each field first, then choose individually | Keep pass-through design: accept fields in API schema, preserve in internal request, forward when provider supports. |
| 2 | `reasoning` object in request (`reasoning.effort`, `reasoning.summary`) | Standard schema-compatible way to request reasoning effort and summary verbosity. | Partial (we use flat `reasoning_effort` / `reasoning_summary`) | Must support | Add canonical `reasoning` object support while preserving backward compatibility with current flat aliases. Define precedence rules if both forms are provided. |
| 3 | `modalities` alignment (`text`/`image`) | Controls output modalities users request. Must match schema contract exactly. | Partial / mismatched (`text`/`audio` now) | Must support schema behavior | Change request schema and internal mapping to `text`/`image` for the public API; ensure providers receive compatible values. |
| 4 | Full message content parts (audio/video/cache-control variants) | Enables multi-part multimodal inputs (audio, video, richer text metadata) and cache hints on message parts. | Partial | Must support | Expand accepted message content item parsing and translator mapping for all schema item variants, including preservation of unknown-but-valid provider fields where safe. |
| 5 | Assistant response extensions (`reasoning`, `reasoning_details`, `images`) | Returns richer assistant payloads: plain reasoning, structured reasoning metadata, and generated image outputs. | Missing | Must support | Extend response schemas and mappers so these fields can be emitted in non-streaming and streaming-compatible forms. |
| 6 | Encrypted reasoning passthrough (`reasoning_details` with encrypted data) | Exposes encrypted reasoning blocks from upstream exactly as received for advanced clients/debugging/replay. | Missing | High priority, must support | Capture encrypted reasoning items from responses stream (`response.output_item.*` for `type=reasoning`) and surface in API output as raw/structured reasoning details without lossy transformation. |
| 7 | Usage passthrough fidelity | Users should receive full upstream usage payload (raw), not a reduced subset. | Partial | Needed: pass full raw usage through | Do not over-normalize; preserve upstream usage object as-is when available. If upstream omits usage, return null/missing naturally. |
| 8 | Detailed HTTP error matrix parity | Strictly maps many status codes exactly like reference schema. | Partial | Not required now | Keep current error strategy unless product requirements change. |
| 9 | Optional `model` when `models` routing is used | OpenRouter-style multi-model router behavior. | Missing | Not required for this project | Keep `model` required in our API for now. |
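
For item #2, the precedence question could be settled along these lines. The sketch assumes the canonical `reasoning` object wins over the flat aliases field by field; that precedence has not been decided in this plan, and the function name `normalize_reasoning` is hypothetical.

```python
from typing import Any, Optional


def normalize_reasoning(body: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Merge `reasoning` and the flat aliases into one canonical object.

    Assumed precedence (not yet agreed): a value in the nested `reasoning`
    object overrides the matching flat `reasoning_*` alias.
    """
    nested = body.get("reasoning") or {}
    merged: dict[str, Any] = {}
    for key in ("effort", "summary"):
        value = nested.get(key)
        if value is None:
            # Fall back to the backward-compatible flat alias.
            value = body.get(f"reasoning_{key}")
        if value is not None:
            merged[key] = value
    # Return None when the request carries no reasoning settings at all.
    return merged or None
```

Resolving each key separately lets a client mix forms, for example `reasoning.effort` together with the legacy `reasoning_summary`, without one silently discarding the other.
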
Field-by-field reference for item #1 (for product decision)
| Field | User-visible purpose | Typical payload shape | Risk/complexity |
|---|---|---|---|
| `provider` | Control provider routing policy (allow/deny fallback, specific providers, price/perf constraints). | Object with routing knobs (`order`/`only`/`ignore`, pricing, latency/throughput prefs). | Medium-High (router semantics + validation + provider compatibility). |
| `plugins` | Enable optional behavior modules (web search/moderation/auto-router/etc). | Array of plugin descriptors with `id` and optional settings. | Medium (validation + pass-through + provider-specific effects). |
| `session_id` (body) | Group related requests for observability/conversation continuity. | String (usually short opaque id). | Low (mostly passthrough + precedence with headers if both exist). |
| `trace` | Attach tracing metadata for distributed observability. | Object (`trace_id`, `span_name`, etc + custom keys). | Low-Medium (schema + passthrough). |
| `models` | Candidate model set for automatic selection/router behavior. | Array of model identifiers/patterns. | Medium-High (changes model resolution flow). |
| `debug` | Request debug payloads (e.g., transformed upstream request echo in stream). | Object flags like `echo_upstream_body`. | Medium (security/sensitivity review required). |
| `image_config` | Provider/model-specific image generation tuning options. | Arbitrary object map by provider/model conventions. | Medium (loosely-typed passthrough plus safety limits). |
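
The pass-through design for these fields (accept, preserve, forward when the provider supports it) might be sketched as below. Everything here is illustrative: the helper name `forwardable_controls` and the per-provider `supported` set are assumptions, and no validation is shown.

```python
from typing import Any, Collection

# The seven item-#1 fields listed in the table above.
EXTRA_CONTROL_FIELDS = (
    "provider",
    "plugins",
    "session_id",
    "trace",
    "models",
    "debug",
    "image_config",
)


def forwardable_controls(body: dict[str, Any], supported: Collection[str]) -> dict[str, Any]:
    """Pick the extra controls that a given provider can receive.

    `supported` is a hypothetical per-provider capability set; fields the
    provider does not support are dropped instead of being forwarded.
    """
    return {
        name: body[name]
        for name in EXTRA_CONTROL_FIELDS
        if name in body and name in supported
    }
```

A call such as `forwardable_controls(request_body, {"session_id", "trace"})` would forward only those two fields, which leaves room to enable the higher-risk ones (`provider`, `models`, `debug`) individually once each is decided.
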
Execution order when implementation starts (agreed priorities)
- Encrypted reasoning + reasoning details output path (#6 + #5 core subset).
- Full usage passthrough fidelity (#7).
- Request `reasoning` object support (#2).
- Modalities contract alignment to schema (`text`/`image`) (#3).
- Message content multimodal expansion (#4).
- Decide and then implement selected item-#1 controls (`provider`/`plugins`/`session_id`/`trace`/`models`/`debug`/`image_config`).
Implementation Steps (current)
- Update codex translator payload fields:
  - remove `prompt_cache_key`
  - add mandatory `include`
- Update message content serialization:
  - serialize string message content as `[{"type":"input_text","text":...}]`
  - preserve empty-content filtering behavior
- Update codex provider header handling:
  - avoid mutating oauth headers in place
  - remove self-added duplicate `user-agent` header
- Update/extend tests for new payload contract.
- Run full `pytest` and fix regressions until green.
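
The content serialization and header steps above could take roughly the following shape. Both helper names are hypothetical, "empty content" is assumed to mean `None`, an empty string, or an empty list, and skipping the router's own `user-agent` is one reading of "remove self-added duplicate header".

```python
from typing import Any, Mapping, Optional


def to_content_parts(content: Any) -> Optional[list[dict[str, Any]]]:
    """Serialize message content as content parts; None means "filter out"."""
    if content is None:
        return None
    if isinstance(content, str):
        if not content:
            return None  # preserve empty-content filtering
        return [{"type": "input_text", "text": content}]
    parts = list(content)  # already a list of parts: copy, do not alias
    return parts or None


def build_headers(oauth_headers: Mapping[str, str], router_headers: Mapping[str, str]) -> dict[str, str]:
    """Combine headers without mutating either input mapping."""
    merged = dict(oauth_headers)  # copy instead of updating in place
    for name, value in router_headers.items():
        if name.lower() == "user-agent":
            # Assumption: the SDK/transport already sends a user-agent,
            # so the router's own copy would be a duplicate.
            continue
        merged[name] = value
    return merged
```

Tests for the new payload contract can then assert on plain values, for example that `to_content_parts("hi")` yields a single `input_text` part and that the OAuth header mapping is unchanged after `build_headers` returns.
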