AiLinkLab Blog


OpenAI Compatible API Chat Completion Message Structure

OpenAI API Schema Development Guide

Introduction

Instead of treating AI model APIs as black boxes, it’s essential to understand their internal structure for better utilization. Let’s approach large language model APIs as we would any other API. Understanding the parameters’ meanings will give us a better grasp of the AI model’s capabilities.

For more information about OpenAI’s API, you can refer to: OpenAI’s Complete Interface Standard Definition Document

Why Study OpenAI’s Interface? The Reasons are Simple:

  1. OpenAI is an industry pioneer, and their interface design has become a standard
  2. Many language models in the market are now compatible with OpenAI’s interface, making it a universal key
  3. Understanding each parameter helps better control AI model behavior

Chat Completion Interface: The Most Commonly Used Dialogue Interface

OpenAI’s complete message Schema definition can be found in the link above. Since there’s a lot of content, let’s focus on some important aspects.

Here’s the Body message structure for the /chat/completions interface:

| Field Name | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Model name; available values are listed in the specific model documentation |
| messages | object[] | Yes | - | Dialogue message list; each message has a role (system/user/assistant/tool) and content |
| stream | boolean | No | false | Enable streaming; tokens are returned as Server-Sent Events when enabled |
| max_tokens | integer | No | 512 | Maximum number of tokens to generate, range: 1 < x < 8192 |
| stop | string[]/null | No | ["null"] | Sequences that stop generation (max 4); the returned text excludes these sequences |
| temperature | number | No | 0.7 | Controls output randomness; higher values increase randomness (typically 0-2) |
| top_p | number | No | 0.7 | Nucleus sampling parameter; dynamically adjusts the token selection range |
| top_k | number | No | 50 | Number of top-k tokens considered during sampling |
| frequency_penalty | number | No | 0.5 | Frequency penalty; suppresses repeated token generation |
| n | integer | No | 1 | Number of completions to generate |
| response_format | object | No | {"type": "text"} | Output format object |
| tools | object[] | No | - | Tool (function call) definitions; each entry has type: "function" and function metadata |

Example Request Body

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "What opportunities and challenges will 2025 bring?"
    }
  ],
  "stream": false,
  "max_tokens": 512,
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "analyze_industry_trend",
        "description": "Analyze AI industry trends",
        "parameters": {
          "type": "object",
          "properties": {
            "year": {"type": "integer", "description": "Target year"},
            "region": {"type": "string", "description": "Region scope"}
          }
        }
      }
    }
  ]
}
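Before wiring up HTTP calls, it can help to see that this body is nothing more than a serialized dictionary. A minimal sketch using only the standard library (the endpoint URL and API key are whatever your OpenAI-compatible provider issues, so they are omitted here):

```python
import json

# Build the same request body as a plain dict; field names follow the table above.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "What opportunities and challenges will 2025 bring?"}
    ],
    "stream": False,
    "max_tokens": 512,
    "temperature": 0.7,
}

# This JSON string is what gets sent as the POST body to /chat/completions.
body = json.dumps(payload)
```

To actually send it, POST `body` with a `Content-Type: application/json` header and an `Authorization: Bearer <your-key>` header using any HTTP client.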

Core Parameter Analysis

  1. Choose Your AI Partner (model)
"model": "gpt-3.5-turbo"

This is like choosing different levels of teachers - some excel at creative writing, others at code analysis. Different models have varying capabilities and prices; choose based on your needs.

  2. Conversation History (messages)

This is your chat history with AI, including several roles:

  • user: The questioner
  • assistant: The AI responding
  • system: Sets rules for the AI
  • tool: The result returned after the AI calls a tool
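A multi-turn history mixing these roles might look like the following sketch. The content values are illustrative, and the `tool_call_id` is a placeholder standing in for the id the model emits when it requests a tool call:

```python
# Hypothetical conversation history showing each role in order.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": "Let me check that for you."},
    # A tool message carries a tool call's result back to the model;
    # tool_call_id must match the id from the assistant's tool call (placeholder here).
    {"role": "tool", "tool_call_id": "call_123", "content": '{"temp_c": 18}'},
]

roles = [m["role"] for m in messages]
```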
  3. Control AI’s Creative Freedom
  • temperature (Range 0-2)

    • Set to 0: AI becomes conservative, answers are very certain
    • Set to 1: AI shows appropriate creativity
    • Set to 2: AI becomes highly imaginative
  • max_tokens (Word limit):

    • Think of it as setting a word count limit for AI
    • One English word is typically 1-2 tokens
    • Setting appropriate values avoids waste and overages
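Accurate token counts require the model's own tokenizer (for example, the tiktoken library for OpenAI models). As a rough budgeting aid only, a word-based heuristic can flag prompts that risk blowing past max_tokens; the 1.3 multiplier below is an assumption, not a spec:

```python
def rough_token_estimate(text: str) -> int:
    """Very rough heuristic: roughly 1.3 tokens per whitespace-separated word.
    Only for ballpark budgeting; real counts need the model's tokenizer."""
    return int(len(text.split()) * 1.3) + 1
```

For example, `rough_token_estimate("hello world")` returns 3, a loose upper-ish bound for two short English words.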
  4. Make Conversations More Fluid (stream)
"stream": true

Enabling this option makes AI responses appear character by character, like human typing, instead of all at once, creating a better experience.
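With streaming on, the response arrives as Server-Sent Events: lines of the form `data: {...}`, terminated by a `data: [DONE]` sentinel. A sketch of parsing such a stream once you have the raw text (the sample chunk shape below mirrors the common delta format, but treat it as an assumption and check your provider's docs):

```python
import json

def parse_sse_chunks(raw: str):
    """Collect the JSON payloads from 'data: {...}' SSE lines,
    stopping at the 'data: [DONE]' sentinel."""
    out = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        out.append(json.loads(data))
    return out

# Illustrative stream: two content deltas followed by the end sentinel.
sample = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n'
    'data: [DONE]\n'
)
chunks = parse_sse_chunks(sample)
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
```

Concatenating the deltas in arrival order reconstructs the full reply.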

  5. Avoid Repetition (frequency_penalty and presence_penalty)
  • frequency_penalty:

    • Positive (0.1 to 2.0): Discourages AI from using repeated words
    • Negative (-2.0 to -0.1): Encourages word repetition
    • 0: Neutral, no intervention
  • presence_penalty:

    • Positive: Encourages new topics
    • Negative: Keeps AI focused on current topic
    • 0: Natural transition
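The difference between the two penalties is easiest to see as arithmetic on a token's logit. The rule sketched below follows the formula described in OpenAI's documentation: the frequency penalty scales with how many times the token has already appeared, while the presence penalty applies once as soon as it has appeared at all.

```python
def penalized_logit(logit: float, count: int,
                    frequency_penalty: float = 0.0,
                    presence_penalty: float = 0.0) -> float:
    """Adjust a token's logit before sampling.
    count = how many times this token already appeared in the output."""
    presence_hit = 1.0 if count > 0 else 0.0
    return logit - count * frequency_penalty - presence_hit * presence_penalty
```

So a token seen twice with `frequency_penalty=0.5` loses a full point of logit, while `presence_penalty` subtracts the same fixed amount whether the token appeared once or ten times.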
  6. Sampling Control (top_p and top_k)
  • top_p (Nucleus sampling):

    • Range 0-1, default 0.7
    • Lower values make AI more conservative
    • Higher values increase response diversity
    • Avoid adjusting alongside temperature
  • top_k (Top-K sampling):

    • Default value 50
    • Controls number of candidate words considered
    • Lower values make responses more conservative
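The two filters compose naturally: top_k first trims to the k most likely tokens, then top_p keeps the smallest prefix of those whose cumulative probability reaches the threshold. A simplified sketch over a toy probability table (real samplers work on logits over the full vocabulary):

```python
def filter_candidates(probs: dict, top_p: float = 0.7, top_k: int = 50) -> list:
    """Return the candidate tokens that survive top-k then nucleus filtering."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cum += p
        if cum >= top_p:
            break  # nucleus reached: stop adding less likely tokens
    return kept

toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
```

With this toy distribution, `top_p=0.4` keeps only `"a"`, while `top_p=0.7` keeps `["a", "b"]` - lower values visibly shrink the candidate pool, which is why they make output more conservative.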
  7. Output Diversity (n)
"n": 3
  • Makes AI provide multiple different answers at once
  • Default value is 1
  • Higher values increase API costs
  • Best used with high temperature
  8. Format Control (response_format)
"response_format": {"type": "json_object"}
  • Controls AI response format
  • text: Plain text (default)
  • json_object: JSON format
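Even when json_object is requested, it is prudent to parse the returned content defensively rather than assume it is valid JSON. A small sketch:

```python
import json

def parse_json_reply(content: str):
    """Parse an assistant message produced with response_format json_object.
    Returns None instead of raising if the content is not valid JSON."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None
```

The caller can then branch on `None` and retry or fall back, instead of crashing on a malformed reply.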
  9. Stop Sequences (stop)
"stop": ["end", "complete"]
  • Sets specific words as response termination markers
  • Maximum of 4 stop sequences
  • AI stops generating when encountering these words
  10. Tool Calls (tools)

The tools parameter allows AI to call external tools for specific tasks. It’s like equipping AI with a toolbox that it can use when needed.

"tools": [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name"
                    },
                    "date": {
                        "type": "string",
                        "description": "Query date"
                    }
                }
            }
        }
    }
]
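When the model decides to use the tool, its response contains a tool_call object with the function name and a JSON string of arguments; your code runs the real function and sends the result back as a "tool" message. A sketch of that dispatch step, where `get_weather` is a hypothetical local stub matching the definition above:

```python
import json

# Hypothetical local implementation matching the get_weather tool definition.
def get_weather(city: str, date: str = "") -> dict:
    return {"city": city, "forecast": "sunny"}  # stubbed result for illustration

LOCAL_TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(tool_call: dict) -> dict:
    """Run the local function named in one tool_call and build the
    'tool' message to append to messages for the follow-up request."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = LOCAL_TOOLS[fn["name"]](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }

call = {"id": "call_1",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
tool_msg = dispatch_tool_call(call)
```

Appending `tool_msg` to the messages list and calling the API again lets the model incorporate the tool's result into its final answer.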