ai/ai
1
0
Fork 0
ai/docs/TODO.md

7.9 KiB

Codex Router Alignment Plan

Confirmed Scope (from latest requirements)

  • Keep parallel_tool_calls: true in outbound responses payloads.
  • Do not send prompt_cache_key from the router for now.
  • Always send include: ["reasoning.encrypted_content"].
  • Header work now: remove self-added duplicate headers only.
  • Message payload work now: stop string content serialization and send content parts like the good sample.

Deferred (intentionally postponed)

  • Full header parity with the golden capture (transport/runtime-level UA and low-level accept-encoding parity).
  • Full one-to-one input history shape parity (type omission strategy for message items).
  • Recovering or synthesizing top-level developer message from upstream chat-completions schema.
  • End-to-end reasoning item roundtrip parity in history (type: reasoning pass-through and replay behavior).
  • Prompt cache implementation strategy and lifecycle management.

Feasible Path For Deferred Items

  1. Header parity

    • Keep current sdk-based client for now.
    • If exact parity is required, switch codex provider transport from AsyncOpenAI to a custom httpx SSE client and set an explicit header allowlist.
  2. Input history shape parity

    • Add a translator mode that emits implicit message items ({"role":...,"content":...}) without type: "message".
    • Keep explicit item support for function_call and function_call_output unchanged.
  3. Developer message availability

    • Add optional request extension field(s) in model_extra, e.g. opencode_developer_message or opencode_input_items.
    • Use extension when provided; otherwise keep current first-system/developer-to-instructions behavior.
  4. Reasoning item roundtrip

    • Accept explicit inbound items with extra.type == "reasoning" and pass through encrypted_content + summary to responses input.
    • Keep chat-completions output contract unchanged; reasoning passthrough is input-side only unless a dedicated raw endpoint is added.
  5. Prompt cache strategy

    • Keep disabled by default.
    • Add optional feature flag for deterministic hash-based key generation once cache policy is agreed.

Schema.md Gap Breakdown (planning only, no implementation yet)

Legend

  • Supported = already implemented.
  • Partial = partly implemented but not schema-complete.
  • Missing = not implemented yet.
# Area What it does for users Current status Decision from latest review Notes / planned behavior
1 Extra request controls (provider, plugins, session_id, trace, models, debug, image_config) Lets users steer upstream routing, observability, plugin behavior, and image/provider-specific behavior directly from request body. Missing Explain each field first, then choose individually Keep pass-through design: accept fields in API schema, preserve in internal request, forward when provider supports.
2 reasoning object in request (reasoning.effort, reasoning.summary) Standard schema-compatible way to request reasoning effort and summary verbosity. Partial (we use flat reasoning_effort / reasoning_summary) Must support Add canonical reasoning object support while preserving backward compatibility with current flat aliases. Define precedence rules if both forms are provided.
3 modalities alignment (text/image) Controls output modalities users request. Must match schema contract exactly. Partial / mismatched (text/audio now) Must support schema behavior Change request schema and internal mapping to text/image for the public API; ensure providers receive compatible values.
4 Full message content parts (audio/video/cache-control variants) Enables multi-part multimodal inputs (audio, video, richer text metadata) and cache hints on message parts. Partial Must support Expand accepted message content item parsing and translator mapping for all schema item variants, including preservation of unknown-but-valid provider fields where safe.
5 Assistant response extensions (reasoning, reasoning_details, images) Returns richer assistant payloads: plain reasoning, structured reasoning metadata, and generated image outputs. Missing Must support Extend response schemas and mappers so these fields can be emitted in non-streaming and streaming-compatible forms.
6 Encrypted reasoning passthrough (reasoning_details with encrypted data) Exposes encrypted reasoning blocks from upstream exactly as received for advanced clients/debugging/replay. Missing High priority, must support Capture encrypted reasoning items from responses stream (response.output_item.* for type=reasoning) and surface in API output as raw/structured reasoning details without lossy transformation.
7 Usage passthrough fidelity Users should receive full upstream usage payload (raw), not a reduced subset. Partial Needed: pass full raw usage through Do not over-normalize; preserve upstream usage object as-is when available. If upstream omits usage, return null/missing naturally.
8 Detailed HTTP error matrix parity Strictly maps many status codes exactly like reference schema. Partial Not required now Keep current error strategy unless product requirements change.
9 Optional model when models routing is used OpenRouter-style multi-model router behavior. Missing Not required for this project Keep model required in our API for now.

Field-by-field reference for item #1 (for product decision)

Field User-visible purpose Typical payload shape Risk/complexity
provider Control provider routing policy (allow/deny fallback, specific providers, price/perf constraints). Object with routing knobs (order/only/ignore, pricing, latency/throughput prefs). Medium-High (router semantics + validation + provider compatibility).
plugins Enable optional behavior modules (web search/moderation/auto-router/etc). Array of plugin descriptors with id and optional settings. Medium (validation + pass-through + provider-specific effects).
session_id (body) Group related requests for observability/conversation continuity. String (usually short opaque id). Low (mostly passthrough + precedence with headers if both exist).
trace Attach tracing metadata for distributed observability. Object (trace_id, span_name, etc + custom keys). Low-Medium (schema + passthrough).
models Candidate model set for automatic selection/router behavior. Array of model identifiers/patterns. Medium-High (changes model resolution flow).
debug Request debug payloads (e.g., transformed upstream request echo in stream). Object flags like echo_upstream_body. Medium (security/sensitivity review required).
image_config Provider/model-specific image generation tuning options. Arbitrary object map by provider/model conventions. Medium (loosely-typed passthrough plus safety limits).

Execution order when implementation starts (agreed priorities)

  1. Encrypted reasoning + reasoning details output path (#6 + #5 core subset).
  2. Full usage passthrough fidelity (#7).
  3. Request reasoning object support (#2).
  4. Modalities contract alignment to schema (text/image) (#3).
  5. Message content multimodal expansion (#4).
  6. Decide and then implement selected item-#1 controls (provider/plugins/session_id/trace/models/debug/image_config).

Implementation Steps (current)

  1. Update codex translator payload fields:
    • remove prompt_cache_key
    • add mandatory include
  2. Update message content serialization:
    • serialize string message content as [{"type":"input_text","text":...}]
    • preserve empty-content filtering behavior
  3. Update codex provider header handling:
    • avoid mutating oauth headers in place
    • remove self-added duplicate user-agent header
  4. Update/extend tests for new payload contract.
  5. Run full pytest and fix regressions until green.