OpenAI Chat Completion API
POSThttps://api.acedata.cloud/openai/chat/completions
POSThttps://api.acedata.cloud/v1/chat/completions

OpenAI Chat Completion API, compatible with the official API format.

Related Products

ChatGPT - Ace Data Cloud
ChatGPT dialogue generation platform, directly connected to this API, fast, stable, and available.

Request Headers

acceptstring
Specify the response format returned by the server.
Please select
authorizationstring
Bearer token

Request Body

modelstringRequired parameter
Model ID to be used.
messagesarrayRequired parameter
List of messages that make up the current conversation.
streamboolean
If set to true, it will send partial messages incrementally like ChatGPT.
Please select
stream_optionsobject
Options for streaming responses. Use only when setting `stream: true`.
include_usageboolean
If set, an additional chunk will be streamed back before the `data: [DONE]` message. The `usage` field of this chunk displays the total token usage statistics for the entire request, while the `choices` field will always be an empty array.
Please select
max_tokensnumber
The maximum number of tokens that can be generated in dialogue completion.
max_completion_tokensinteger
The maximum number of tokens that can be generated for a single completion, including visible output tokens and inference tokens. For newer models (o1/o3/o4/gpt-5 series and other inference models), this parameter replaces `max_tokens`.
reasoning_effortstring
The inference input level of the constraint reasoning model (o1/o3/o4/gpt-5 series). Currently supports `minimal`, `low`, `medium`, and `high`. Reducing the inference input can lead to faster responses and decrease the number of tokens used for inference in the responses.
Please select
modalitiesarray
You hope for the output type generated by the model. Most models default to `["text"]`. The `gpt-4o-audio-preview` model can also be used to generate audio. If you want the model to generate both text and audio responses, you can use `["text", "audio"]`.
audioobject
Audio output parameters. Required when requesting audio output via `modalities: ["audio"]`.
voicestringRequired parameter
The tone used by the model in its responses.
formatstringRequired parameter
Specify the output audio format.
predictionobject
Static prediction output content, such as the content of a text file that is being regenerated. It can be used to shorten response time when most of the response content is already known in advance.
typestringRequired parameter
Predict the type of content, fixed as `content`.
contentRequired parameter
Content for prediction.
nnumber
Generate how many dialogue completion options for each input message.
toolsarray
List of tools that can be called by the model. Currently, only functions are supported as tools.
tool_choice
Control which tool the model calls (if any). `none` means the model will not call any tools; `auto` means the model can choose between generating a message or calling one/multiple tools; `required` means the model must call one or more tools.
parallel_tool_callsboolean
Whether to enable parallel function calls during tool invocation.
Please select
web_search_optionsobject
Options for web search tools, for use with the GPT-4o model that has built-in search (`gpt-4o-search-preview`, `gpt-4o-mini-search-preview`).
user_locationobject
Approximate location parameters for search.
typestring
Please select
approximateobject
citystring
regionstring
countrystring
timezonestring
search_context_sizestring
High-level guidelines on the size of the context window space occupied by the search.
Please select
response_format
The object format that the specified model must output. Setting it to `{ "type": "json_schema", "json_schema": {...} }` enables structured output; setting it to `{ "type": "json_object" }` enables the older JSON Object mode.
temperaturenumber
The sampling temperature used ranges from 0 to 2. Higher values (such as 0.8) will make the output more random, while lower values (such as 0.2) will make the output more focused and certain.
top_pnumber
An alternative method for temperature sampling, called nucleus sampling, considers only the portion of tokens whose cumulative probability mass reaches top_p. For example, 0.1 means only considering the tokens that account for the top 10% of cumulative probability mass. It is generally recommended to adjust only one of these parameters or `temperature`, rather than adjusting both simultaneously.
frequency_penaltynumber
The value ranges from -2.0 to 2.0. Positive values will penalize based on the frequency of the token appearing in the generated text, thereby reducing the likelihood of the model repeating the same line word for word.
presence_penaltynumber
The value ranges from -2.0 to 2.0. Positive values will penalize based on whether the token has already appeared in the generated text, thereby increasing the model's likelihood of discussing new topics.
seedinteger
If specified, the system will strive to perform deterministic sampling, so that repeated requests using the same `seed` and parameters return the same results. Determinism cannot be guaranteed; please refer to the `system_fingerprint` in the response to monitor backend changes.
stop
A maximum of 4 sequences, the API will stop generating when it reaches these sequences. The returned text does not include the stopping sequences.
logprobsboolean
Whether to return the logarithmic probability of the output tokens. If true, it will return the logarithmic probability of each output token in the `content` of `message`.
Please select
top_logprobsinteger
The integer value between 0 and 20 specifies the number of most likely tokens to return at each token position, each accompanied by the corresponding log probability. When using this parameter, `logprobs` must be set to `true`.
logit_biasobject
Modify the likelihood of a specified token appearing in the completion. Accept a JSON object that maps the token (specified by its token ID in the tokenizer) to a bias value between -100 and 100.
userstring
A unique identifier representing your end users, which helps the platform monitor and detect abusive behavior.
service_tierstring
Specify the processing type used for the designated service request. Set to `auto` to let the platform choose automatically, `default` for standard pricing/performance, `flex` for a slower but cheaper Flex processing layer, and `priority` for the fastest processing layer on supported models.
Please select
storeboolean
Whether to store the output of this conversation completion request for model distillation or product evaluation.
Please select
metadataobject
A set of key-value pairs that can be attached to an object, with a maximum of 16 pairs. It can be used to store additional information about the object in a structured format. The key is a string of up to 64 characters, and the value is a string of up to 512 characters.

Response

OpenAI generation
Allow Use General Balance

When 'Allow General Balance' is enabled, the general balance is used automatically if an app's balance is insufficient.

Shell

Python

JavaScript

Java

Go

PHP

Kind reminder: For streaming requests, the above code may not be fully applicable. Please refer to the integration documentation for changes.