Create a model response for a conversation. This endpoint is compatible with the OpenAI Chat Completions API, including message roles, tool calling, and streaming.
Create a chat completion
POST /v1/chat/completions
Request body
Required
model (string). The model ID to use.
messages (array). A list of messages that make up the conversation so far.
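A minimal request body needs only these two fields. A sketch of such a body follows; the model ID "gpt-4o" is a placeholder, so substitute an ID your deployment actually serves:

```python
# Minimal body for POST /v1/chat/completions.
# "gpt-4o" is a placeholder model ID, not a guaranteed name on this proxy.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
```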
Messages
Each message has a role plus content. Supported roles include system, developer, user, assistant, tool, and function.
content can be either a simple string or a structured array for multimodal inputs. For example, user content can include typed blocks like text, image_url, input_audio, document, video_url, or file.
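As a sketch of the structured form, here is a user message mixing a text block with an image_url block; the image URL is a placeholder:

```python
# User message with structured (multimodal) content: one text block
# plus one image_url block. The URL is illustrative only.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}
```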
Provide tool definitions in tools. When the model decides to call a tool, the assistant message it returns includes tool_calls; respond with a tool-role message whose tool_call_id matches the call you are answering.
Tool calling example
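A sketch of the round trip, assuming the standard OpenAI function-tool shape. The get_weather tool, its parameters, the call ID, and the weather result are all illustrative, not a real API:

```python
import json

# One tool definition in the function-tool shape. get_weather is hypothetical.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Suppose the model replied with this assistant message requesting a tool call.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",  # placeholder; the server generates this ID
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}

# Run the tool locally, then send the result back as a tool-role message
# whose tool_call_id matches the ID above.
call = assistant_message["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
tool_message = {
    "role": "tool",
    "tool_call_id": call["id"],
    "content": json.dumps({"city": args["city"], "temp_c": 18}),  # fabricated result
}
```

Append both assistant_message and tool_message to messages and call the endpoint again so the model can produce its final answer.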
Streaming
Set stream: true to receive server-sent events (SSE). Each event carries a chunk whose choices contain a delta with the incremental content. The stream ends with a data: [DONE] sentinel.
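A minimal sketch of consuming the stream, run here against a few hand-written sample lines in place of a live connection (the chunk payloads are illustrative):

```python
import json

# Sample SSE lines as the server would emit them with stream: true.
sse_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]

text = ""
for line in sse_lines:
    data = line[len("data: "):]
    if data == "[DONE]":  # sentinel marking the end of the stream
        break
    delta = json.loads(data)["choices"][0]["delta"]
    text += delta.get("content", "")  # role-only deltas carry no content
```

Accumulating delta.content fragments in order reconstructs the full assistant reply.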
Response
The response follows the OpenAI Chat Completions format: a choices array whose entries contain the assistant message and a finish_reason, plus a usage object with token counts.
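A representative non-streaming response body, shown as a sketch (IDs, text, and token counts are illustrative):

```python
# Illustrative response body in the Chat Completions shape.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

# The assistant's reply lives at choices[0].message.content.
reply = response["choices"][0]["message"]["content"]
```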
AlphaNeural proxy extensions
The proxy accepts a few optional routing and reliability fields that do not exist in the upstream OpenAI API, such as guardrails, caching, num_retries, fallbacks, and context_window_fallback_dict. Use these only if you need proxy-level behaviour controls.
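A sketch of a request body mixing standard fields with the proxy-only extensions named above. The values are illustrative, and the exact semantics of each extension depend on the proxy's configuration:

```python
# Standard Chat Completions fields plus proxy-level extensions.
# The extension values below are illustrative placeholders.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi"}],
    # Proxy extensions (not part of the upstream OpenAI API):
    "num_retries": 2,                                    # retry failed calls
    "fallbacks": ["gpt-4o-mini"],                        # try these models on failure
    "context_window_fallback_dict": {"gpt-4o": "gpt-4o-mini"},  # on context overflow
    "caching": True,                                     # enable response caching
}
```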