# Chat Completions

### Create a chat completion

`POST /v1/chat/completions`

#### Request body

**Required**

* `model` (string). The model ID to use.
* `messages` (array). A list of messages that make up the conversation so far.

**Messages**

Each message has a `role` plus `content`. Supported roles include `system`, `developer`, `user`, `assistant`, `tool`, and `function`.

`content` can be either a simple string or a structured array for multimodal inputs. For example, `user` content can include typed blocks like `text`, `image_url`, `input_audio`, `document`, `video_url`, or `file`.
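As a sketch, the two `content` forms look like this when building a request body by hand (the field names follow the message schema above; the values are illustrative):

```python
import json

# Plain-string content: the simplest user message.
text_message = {"role": "user", "content": "Describe this photo."}

# Structured content: an array of typed blocks for multimodal input.
multimodal_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this photo."},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}

# Either form slots into the same `messages` array.
payload = {"model": "qwen3", "messages": [text_message]}
body = json.dumps(payload)
```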

#### Basic example

```bash
curl https://proxy.alfnrl.io/v1/chat/completions \
  -H "Authorization: Bearer $ALPHANEURAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [
      { "role": "user", "content": "Write a haiku about routers." }
    ]
  }'
```

#### Multimodal example (text + image)

```bash
curl https://proxy.alfnrl.io/v1/chat/completions \
  -H "Authorization: Bearer $ALPHANEURAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.png" } }
        ]
      }
    ]
  }'
```

Multimodal content blocks are part of the message schema supported by the proxy.

### Common parameters

These follow the OpenAI Chat Completions shape.

* Sampling and length: `temperature`, `top_p`, `max_tokens`, `stop`, `n`, `seed`
* Penalties and biasing: `presence_penalty`, `frequency_penalty`, `logit_bias`&#x20;
* Structured outputs: `response_format`
* Logging: `logprobs`, `top_logprobs`
* Streaming: `stream`, `stream_options`
* Tool calling: `tools`, `tool_choice`, `parallel_tool_calls`
* Legacy function calling: `functions`, `function_call`
* Metadata: `metadata`, `user`
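A request that combines several of these parameters might be assembled like so (a sketch; the parameter names and the `json_object` response format follow the OpenAI shape):

```python
import json

payload = {
    "model": "qwen3",
    "messages": [{"role": "user", "content": "List three HTTP methods as JSON."}],
    "temperature": 0.2,        # lower values make sampling more deterministic
    "max_tokens": 256,         # cap on generated tokens
    "seed": 42,                # best-effort reproducibility across calls
    "response_format": {"type": "json_object"},  # ask the model to reply with JSON only
}
body = json.dumps(payload)  # send as the request body with Content-Type: application/json
```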

### Tool calling

Provide tool definitions in `tools`. When the model decides to call a tool, the assistant message in the response carries a `tool_calls` array; you continue the conversation by appending that assistant message, plus one `tool` role message per call whose `tool_call_id` matches the call's `id`.

#### Tool calling example

```bash
curl https://proxy.alfnrl.io/v1/chat/completions \
  -H "Authorization: Bearer $ALPHANEURAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [
      { "role": "user", "content": "What is the weather in Paris right now?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city",
          "parameters": {
            "type": "object",
            "properties": { "city": { "type": "string" } },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```
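Once the assistant message comes back with `tool_calls`, the follow-up request appends that message plus a `tool` message per call. A minimal sketch of that bookkeeping (the assistant message and `run_tool` dispatcher below are illustrative, not real proxy output):

```python
import json

# Illustrative assistant message as it might appear in choices[0].message.
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
}

def run_tool(name, arguments):
    """Hypothetical local dispatcher for the tools we advertised."""
    args = json.loads(arguments)
    if name == "get_weather":
        return json.dumps({"city": args["city"], "temp_c": 18})  # stubbed result
    raise ValueError(f"unknown tool: {name}")

# Append the assistant message, then one tool message per call,
# echoing each call's id back as tool_call_id.
messages = [{"role": "user", "content": "What is the weather in Paris right now?"}]
messages.append(assistant_msg)
for call in assistant_msg["tool_calls"]:
    messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": run_tool(call["function"]["name"], call["function"]["arguments"]),
    })
# `messages` is now ready to send back in a second /v1/chat/completions request.
```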

### Streaming

Set `stream: true` to receive server-sent events (SSE). Each event is a `data:` line whose JSON chunk carries incremental output in `choices[].delta`; the stream ends with a final `data: [DONE]` sentinel.

```bash
curl https://proxy.alfnrl.io/v1/chat/completions \
  -H "Authorization: Bearer $ALPHANEURAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "stream": true,
    "messages": [{ "role": "user", "content": "Explain TCP slow start in one paragraph." }]
  }'
```
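Client-side, the stream can be reassembled by concatenating `choices[0].delta.content` from each `data:` line until the `[DONE]` sentinel. A sketch against canned chunks (the chunk payloads are illustrative):

```python
import json

# Illustrative raw SSE lines as the proxy might emit them.
sse_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "TCP slow start ramps"}}]}',
    'data: {"choices": [{"delta": {"content": " the congestion window."}}]}',
    "data: [DONE]",
]

def accumulate(lines):
    """Concatenate content deltas, stopping at the [DONE] sentinel."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip SSE comments and keep-alives
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        text.append(delta.get("content", ""))  # role-only deltas carry no content
    return "".join(text)

result = accumulate(sse_lines)
```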

### Response

The response follows the OpenAI Chat Completions format (for example, `choices` with an assistant message, plus token `usage`).
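As a sketch, pulling the reply text and token usage out of a response body looks like this (the body below is illustrative, not captured from the proxy):

```python
import json

# Illustrative response body in the OpenAI Chat Completions shape.
body = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "qwen3",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Packets hum softly"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 12, "completion_tokens": 7, "total_tokens": 19},
})

resp = json.loads(body)
reply = resp["choices"][0]["message"]["content"]   # the assistant's text
total = resp["usage"]["total_tokens"]              # prompt + completion tokens
```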

### AlphaNeural proxy extensions

The proxy accepts a few optional routing and reliability fields that do not exist in the upstream OpenAI API, such as `guardrails`, `caching`, `num_retries`, `fallbacks`, and `context_window_fallback_dict`. Use these only if you need proxy-level behaviour controls.
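A request opting into retries and fallbacks might look like this (a sketch: the field names come from the list above, but the value shapes and the fallback model names are assumptions, shown only to illustrate where the fields sit in the body):

```python
import json

payload = {
    "model": "qwen3",
    "messages": [{"role": "user", "content": "Summarize RFC 793 in two sentences."}],
    # Proxy-level reliability fields (not part of the upstream OpenAI API).
    "num_retries": 2,                    # assumed shape: integer retry count
    "fallbacks": ["qwen3-small"],        # assumed shape: ordered list of fallback models
    "context_window_fallback_dict": {    # assumed shape: model -> larger-context model
        "qwen3": "qwen3-long"
    },
}
body = json.dumps(payload)
```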
