
    07 - Complete Examples

    Complete Request Examples#

    /v1/chat/completions Endpoint Reference#

    This guide provides a complete breakdown of the /v1/chat/completions endpoint, covering request parameters and response formats to help you get started with AIone's OpenAI-compatible API.
    If you plan to use Gemini image generation models or image-specific parameters such as aspect_ratio, image_size, and top_k, please also refer to the Gemini Image Generation guide.

    1. Request Parameters#

    Required Parameters#

    model (required)#

    The model ID to use. See Models & Pricing for the full list.
    "model": "claude-sonnet-4-6"

    messages (required)#

    An array of conversation messages. Each message contains a role and content:
    "messages": [
      {"role": "system", "content": "You are a professional technical consultant"},
      {"role": "user", "content": "Please explain what an API Gateway is"}
    ]
    Accepted role values:
    system: System prompt that defines the AI's behavior and persona
    user: User message
    assistant: AI response (used for multi-turn conversations)
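As a sketch, a multi-turn conversation is built by appending the model's reply as an `assistant` message, then adding the next `user` turn. The prompts and the reply text below are illustrative:

```python
# Building a multi-turn messages array (prompts are illustrative).
messages = [
    {"role": "system", "content": "You are a professional technical consultant"},
    {"role": "user", "content": "Please explain what an API Gateway is"},
]

# After receiving a reply, append it as an assistant message,
# then add the next user turn to continue the conversation.
assistant_reply = "An API Gateway is a single entry point that routes requests..."
messages.append({"role": "assistant", "content": assistant_reply})
messages.append({"role": "user", "content": "How does it differ from a load balancer?"})
```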

    Optional Parameters#

    temperature (default: 1.0)#

    Controls output randomness, range 0-2:
    0: Deterministic output; ideal for code generation and data extraction
    0.7: Balanced creativity and consistency; recommended for general use
    1.5+: High creativity; suitable for creative writing

    max_tokens#

    Maximum number of tokens to generate. If not specified, the model's default value is used.

    stream (default: false)#

    Whether to enable streaming responses. When set to true, the response is delivered as an SSE data stream.
    We strongly recommend enabling stream: true for interactive use cases. This reduces time-to-first-token from 10+ seconds down to 2-3 seconds. Non-streaming requests must wait for the model to generate the entire response before returning, which can easily time out when the model produces longer outputs.

    top_p (default: 1.0)#

    Nucleus sampling parameter. Typically, you only need to adjust either temperature or top_p, not both.

    tools#

    Function calling tool definitions, allowing the model to invoke external functions:
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the weather for a specified city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
          }
        }
      }
    ]
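When the model decides to call a tool, the assistant message carries a tool call whose `function.arguments` field is a JSON-encoded string. A minimal dispatch sketch, with a hypothetical local `get_weather` implementation standing in for your real function:

```python
import json

# Hypothetical local implementation backing the get_weather tool.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_tool_call(tool_call: dict) -> str:
    """Look up the named function and call it with the decoded arguments.

    In the OpenAI-compatible format, function.arguments is a JSON string,
    so it must be parsed before being passed as keyword arguments.
    """
    fn = TOOL_REGISTRY[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return fn(**args)

# Example tool_call shaped like an assistant message's tool call entry:
result = dispatch_tool_call({
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
})
```

In a real loop you would send `result` back as a `tool` role message so the model can compose its final answer.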

    response_format#

    Specifies the response format; supports JSON mode:
    "response_format": {"type": "json_object"}

    2. Complete Request Example (Streaming - Recommended)#

    {
      "model": "claude-sonnet-4-6",
      "messages": [
        {"role": "system", "content": "You are a helpful assistant. Please answer concisely."},
        {"role": "user", "content": "What is a RESTful API? Explain in 3 sentences."}
      ],
      "max_tokens": 500,
      "temperature": 0.7,
      "stream": true
    }
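The request above can be expressed as a payload dict and sent with the `openai` SDK. This is a sketch: the `base_url` and API key below are placeholders, and the helper function is illustrative, not part of the API.

```python
# Sketch: building the streaming request payload shown above.
def build_chat_request(user_prompt: str,
                       system_prompt: str = "You are a helpful assistant. Please answer concisely.") -> dict:
    return {
        "model": "claude-sonnet-4-6",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": 500,
        "temperature": 0.7,
        "stream": True,
    }

payload = build_chat_request("What is a RESTful API? Explain in 3 sentences.")

# With the openai SDK (uncomment and fill in real credentials to run):
# from openai import OpenAI
# client = OpenAI(api_key="YOUR_API_KEY", base_url="https://your-aione-endpoint/v1")
# stream = client.chat.completions.create(**payload)
# for chunk in stream:
#     print(chunk.choices[0].delta.content or "", end="")
```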

    3. Streaming Response Format#

    When stream: true is enabled, the response is delivered in SSE (Server-Sent Events) format. Each chunk is a JSON object:
    data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"claude-sonnet-4-6","choices":[{"index":0,"delta":{"content":"REST"},"finish_reason":null}]}
    
    data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"claude-sonnet-4-6","choices":[{"index":0,"delta":{"content":"ful"},"finish_reason":null}]}
    
    data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"claude-sonnet-4-6","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
    
    data: [DONE]
    Key fields:
    choices[0].delta.content: The incremental text fragment in this chunk
    choices[0].finish_reason: null means generation is still in progress; stop means the model finished normally; length means max_tokens was reached
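A consumer accumulates the `delta.content` fragments until `[DONE]`. A sketch, using raw lines trimmed to the fields relevant here:

```python
import json

# SSE lines mirroring the chunk format above, trimmed to the relevant fields.
raw_lines = [
    'data: {"choices":[{"index":0,"delta":{"content":"REST"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"ful"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

fragments, finish_reason = [], None
for line in raw_lines:
    body = line.removeprefix("data: ").strip()
    if body == "[DONE]":
        break
    choice = json.loads(body)["choices"][0]
    # delta may be empty on the final chunk, so default to "".
    fragments.append(choice["delta"].get("content", ""))
    finish_reason = choice["finish_reason"] or finish_reason

full_text = "".join(fragments)
```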

    4. Non-Streaming Response Format#

    {
      "id": "chatcmpl-abc123def456",
      "object": "chat.completion",
      "created": 1711500000,
      "model": "claude-sonnet-4-6",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "A RESTful API is an interface design style based on the HTTP protocol..."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 42,
        "completion_tokens": 85,
        "total_tokens": 127
      }
    }
Field descriptions:
id: Unique request identifier
choices[0].message.content: The AI-generated response
choices[0].finish_reason: Completion reason: stop (normal), length (reached max_tokens)
usage.prompt_tokens: Number of input tokens consumed
usage.completion_tokens: Number of output tokens consumed
usage.total_tokens: Total tokens (used for billing)
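Extracting the reply text and token usage from a non-streaming response body is straightforward. A sketch, parsing the example JSON above:

```python
import json

# Response body mirroring the non-streaming example above.
response_body = json.loads("""
{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1711500000,
  "model": "claude-sonnet-4-6",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "A RESTful API is an interface design style based on the HTTP protocol..."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 85, "total_tokens": 127}
}
""")

reply = response_body["choices"][0]["message"]["content"]
total_tokens = response_body["usage"]["total_tokens"]  # the billed total
```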

    5. Important Notes#

1. Use streaming mode: For interactive scenarios (chat, IDE coding), always use stream: true for a significantly better experience and stability
2. Authentication: Ensure your API Key is valid and authorized to access the selected model
3. Parameter format: messages is an array; each message must include both role and content
4. Token billing: Input and output tokens are billed separately at different rates
5. Compatibility: AIone is fully compatible with the OpenAI SDK -- use the openai library directly
6. Error handling: Implement exponential backoff for 429 (rate limit) and 5xx (server error) responses
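The exponential backoff mentioned above can be sketched as follows; the base delay, cap, jitter range, and retryable status set are illustrative choices, not prescribed by the API:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Doubles the base delay each attempt, caps it, and applies jitter
    to avoid synchronized retries from many clients.
    """
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)

# Status codes worth retrying (illustrative set).
RETRYABLE = {429, 500, 502, 503, 504}

delays = [backoff_delay(a) for a in range(5)]
```

In a real client you would sleep for `backoff_delay(attempt)` after each retryable failure and give up after a fixed number of attempts.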
    Modified at 2026-04-04 16:02:45