OpenAI Compatible API Chat Completion Message Structure
Introduction
Rather than treating AI model APIs as black boxes, it's worth understanding their internal structure so we can use them well. Let's approach large language model APIs as we would any other API: knowing what each parameter means gives us a better grasp of the model's capabilities.
For more information about OpenAI’s API, you can refer to: OpenAI’s Complete Interface Standard Definition Document
Why Study OpenAI’s Interface? The Reasons are Simple:
- OpenAI is an industry pioneer, and their interface design has become a standard
- Many language models in the market are now compatible with OpenAI’s interface, making it a universal key
- Understanding each parameter helps better control AI model behavior
Chat Completion Interface: The Most Commonly Used Dialogue Interface
OpenAI's complete message schema can be found at the link above. There's a lot of it, so let's focus on the most important parts.
Here's the request body structure for the /chat/completions interface:
| Field Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | - | Model name; available values are listed in each model's documentation |
| messages | object[] | Yes | - | Dialogue message list; each message carries a role (user / assistant / system / tool) and content |
| stream | boolean | No | false | Enable streaming; when enabled, tokens are returned as Server-Sent Events |
| max_tokens | integer | No | 512 | Maximum number of tokens to generate, range: 1 < x < 8192 |
| stop | string[] / null | No | null | Up to 4 sequences at which generation stops; the returned text excludes these sequences |
| temperature | number | No | 0.7 | Controls output randomness; higher values increase randomness (typically 0-2) |
| top_p | number | No | 0.7 | Nucleus sampling parameter; dynamically adjusts the token selection range |
| top_k | number | No | 50 | Number of top-k tokens to consider during sampling (offered by some OpenAI-compatible providers) |
| frequency_penalty | number | No | 0.5 | Frequency penalty; suppresses repeated token generation |
| n | integer | No | 1 | Number of completions to generate |
| response_format | object | No | {"type": "text"} | Output format object |
| tools | object[] | No | - | Tool (function call) list; each entry includes type: "function" and function metadata |
Example Request Body
```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "What opportunities and challenges will 2025 bring?"
    }
  ],
  "stream": false,
  "max_tokens": 512,
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "analyze_industry_trend",
        "description": "Analyze AI industry trends",
        "parameters": {
          "type": "object",
          "properties": {
            "year": { "type": "integer", "description": "Year to analyze" },
            "region": { "type": "string", "description": "Region to analyze" }
          }
        }
      }
    }
  ]
}
```
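To make this concrete, here's a minimal sketch of sending that request with Python's requests library. The endpoint URL and the OPENAI_API_KEY environment variable are assumptions; any OpenAI-compatible base URL works the same way.

```python
import os

import requests

# Assumed endpoint; swap in any OpenAI-compatible base URL.
API_URL = "https://api.openai.com/v1/chat/completions"
API_KEY = os.environ["OPENAI_API_KEY"]  # assumed to be set in your environment

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "What opportunities and challenges will 2025 bring?"}
    ],
    "stream": False,
    "max_tokens": 512,
    "temperature": 0.7,
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()

# The reply text lives at choices[0].message.content.
print(resp.json()["choices"][0]["message"]["content"])
```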
Core Parameter Analysis
- Choose Your AI Partner (model)
"model": "gpt-3.5-turbo"
This is like choosing among teachers with different strengths: some excel at creative writing, others at code analysis. Different models have varying capabilities and prices; choose based on your needs.
- Conversation History (messages)
This is your chat history with the AI, where each message carries one of these roles:
- user: The questioner
- assistant: The AI's responses
- system: Sets rules and context for the AI
- tool: Results returned to the AI after a tool call (a sketch follows this list)
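Here's a hedged sketch of what a multi-role messages list can look like; the call_abc123 id and get_weather function are illustrative, and in practice the id is generated by the model:

```python
# A multi-turn conversation touching all four roles.
# The tool message answers the tool call the model made earlier,
# so its tool_call_id must echo the id from that call.
messages = [
    {"role": "system", "content": "You are a concise weather assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # illustrative; real ids are model-generated
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp_c\": 21, \"sky\": \"clear\"}"},
]
```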
- Control AI's Creative Freedom
  - temperature (range 0-2)
    - Set to 0: AI becomes conservative, answers are nearly deterministic
    - Set to 1: AI shows appropriate creativity
    - Set to 2: AI becomes highly imaginative
  - max_tokens (length limit)
    - Think of it as setting a word count limit for the AI
    - One English word is typically 1-2 tokens
    - Setting appropriate values avoids waste and overruns (see the token-counting sketch after this list)
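Before setting max_tokens, you can estimate a prompt's token count locally. A minimal sketch using OpenAI's tiktoken library (assuming pip install tiktoken):

```python
import tiktoken

# Load the tokenizer that matches the target model.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "What opportunities and challenges will 2025 bring?"
tokens = enc.encode(prompt)

# Knowing the prompt's token count helps pick a max_tokens value
# that leaves enough room for the reply without overruns.
print(f"{len(tokens)} tokens: {tokens}")
```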
- Make Conversations More Fluid (stream)
"stream": true
Enabling this option makes the AI's response arrive token by token, like someone typing, instead of all at once, which feels much more responsive. A sketch of consuming the stream follows.
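Here's a minimal sketch of reading the stream with requests. Each Server-Sent Event line carries a JSON chunk, and the stream ends with a data: [DONE] sentinel; the endpoint and key are the same assumptions as before:

```python
import json
import os

import requests

API_URL = "https://api.openai.com/v1/chat/completions"  # assumed endpoint
API_KEY = os.environ["OPENAI_API_KEY"]

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
    stream=True,  # keep the HTTP connection open and read incrementally
)

for raw in resp.iter_lines():
    if not raw.startswith(b"data: "):
        continue  # skip blank keep-alive lines
    data = raw[len(b"data: "):].decode("utf-8")
    if data == "[DONE]":  # sentinel marking the end of the stream
        break
    chunk = json.loads(data)
    # Streamed chunks carry incremental text under choices[0].delta.content.
    piece = chunk["choices"][0]["delta"].get("content")
    if piece:
        print(piece, end="", flush=True)
```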
- Avoid Repetition (frequency_penalty and presence_penalty)
  - frequency_penalty:
    - Positive (0.1 to 2.0): Discourages AI from using repeated words
    - Negative (-2.0 to -0.1): Encourages word repetition
    - 0: Neutral, no intervention
  - presence_penalty:
    - Positive: Encourages new topics
    - Negative: Keeps AI focused on current topic
    - 0: Natural transition
- Sampling Control (top_p and top_k)
  - top_p (nucleus sampling):
    - Range 0-1, default 0.7
    - Lower values make AI more conservative
    - Higher values increase response diversity
    - Avoid adjusting alongside temperature
  - top_k (top-k sampling, offered by some OpenAI-compatible providers rather than OpenAI itself):
    - Default value 50
    - Controls number of candidate tokens considered
    - Lower values make responses more conservative (see the payload sketch after this list)
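For instance, a payload that tunes diversity through nucleus sampling alone might look like this; the top_k line is commented out because only some OpenAI-compatible providers accept it:

```python
# Payload tuning diversity via nucleus sampling only,
# leaving temperature at its default.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Suggest a name for a robotics startup."}],
    "top_p": 0.9,   # sample from the smallest token set covering 90% of probability mass
    # "top_k": 50,  # accepted by some OpenAI-compatible providers, not by OpenAI itself
}
```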
- Output Diversity (n)
"n": 3
- Makes AI provide multiple different answers at once
- Default value is 1
- Higher values increase API costs
- Best used with high temperature (see the sketch below)
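A quick sketch of reading all candidates back; each one sits at its own index in the response's choices array:

```python
import os

import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",  # assumed endpoint
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Name one strength of streaming APIs."}],
        "n": 3,              # ask for three candidate answers
        "temperature": 1.2,  # higher temperature makes the candidates differ
    },
    timeout=60,
).json()

# Each candidate completion sits at its own index in choices.
for choice in resp["choices"]:
    print(f"--- candidate {choice['index']} ---")
    print(choice["message"]["content"])
```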
- Format Control (response_format)
"response_format": {"type": "json_object"}
- Controls AI response format
- text: Plain text (default)
- json_object: JSON format
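A hedged sketch of requesting JSON mode and parsing the result; json.loads will raise if the model somehow returns invalid JSON, so production code may want a try/except around it:

```python
import json
import os

import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",  # assumed endpoint
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "response_format": {"type": "json_object"},
        # JSON mode still expects the prompt itself to ask for JSON.
        "messages": [
            {"role": "user", "content": "List three AI trends for 2025 as a JSON object."}
        ],
    },
    timeout=60,
).json()

trends = json.loads(resp["choices"][0]["message"]["content"])
print(trends)
```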
- Stop Sequences (stop)
"stop": ["end", "complete"]
- Sets specific words as response termination markers
- Maximum of 4 stop sequences
- AI stops generating when encountering these words
- Tool Calls (tools)
The tools parameter allows AI to call external tools for specific tasks. It’s like equipping AI with a toolbox that it can use when needed.
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather information for specified city",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name"
},
"date": {
"type": "string",
"description": "Query date"
}
}
}
}
}
]
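When the model decides to use a tool, the response's message carries a tool_calls array instead of text; your code executes the function and sends the result back as a tool message so the model can phrase a final answer. Here's a minimal sketch of that round trip, with get_weather as a stand-in for a real implementation:

```python
import json
import os

import requests

API_URL = "https://api.openai.com/v1/chat/completions"  # assumed endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information for specified city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "date": {"type": "string", "description": "Query date"},
            },
        },
    },
}]

def get_weather(city: str, date: str = "today") -> str:
    """Stand-in for a real weather lookup."""
    return json.dumps({"city": city, "date": date, "temp_c": 21, "sky": "clear"})

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
resp = requests.post(API_URL, headers=HEADERS, timeout=60,
                     json={"model": "gpt-3.5-turbo", "messages": messages, "tools": TOOLS}).json()
msg = resp["choices"][0]["message"]

if msg.get("tool_calls"):
    messages.append(msg)  # keep the assistant's tool call in the history
    for call in msg["tool_calls"]:
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must echo the model-generated call id
            "content": get_weather(**args),
        })
    # Second round trip: the model turns the tool result into a final answer.
    resp = requests.post(API_URL, headers=HEADERS, timeout=60,
                         json={"model": "gpt-3.5-turbo", "messages": messages, "tools": TOOLS}).json()

print(resp["choices"][0]["message"]["content"])
```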