This documentation provides details on how to use the XML Gateway API to interact with OpenAI's language models.
The API supports two authentication methods:
Include your API key in the X-API-Key header:
X-API-Key: your-api-key-here
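As an illustration of API key authentication, here is a minimal Python sketch that builds (but does not send) an authenticated request. The base URL `https://gateway.example.com`, the `application/xml` content type, and the helper name `build_api_key_request` are assumptions made for the example, not part of the documented API.

```python
import urllib.request

# Hypothetical base URL; replace it with the address of your gateway.
BASE_URL = "https://gateway.example.com"


def build_api_key_request(path: str, api_key: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request authenticated with X-API-Key."""
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={
            "X-API-Key": api_key,
            # Assumed content type for XML request bodies.
            "Content-Type": "application/xml",
        },
        method="POST",
    )


req = build_api_key_request("/ask", "your-api-key-here", b"<Request/>")
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.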
First, obtain a JWT token by sending your API key to the /auth/token endpoint. The request body is shown first, followed by an example response:
{
  "api_key": "your-api-key-here"
}

{
  "access_token": "eyJhbGciOiJIUzI1...",
  "refresh_token": "eyJhbGciOiJIUzI1...",
  "token_type": "Bearer",
  "expires_in": 1800,
  "refresh_expires_in": 86400,
  "user": {
    "user_id": "user_1",
    "name": "User 1",
    "tier": "starter"
  }
}
Then, include the token in the Authorization header:
Authorization: Bearer eyJhbGciOiJIUzI1...
Access tokens expire after 30 minutes. Use the refresh token to obtain a new access token:
{
  "refresh_token": "eyJhbGciOiJIUzI1..."
}
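The token lifecycle described above can be sketched client-side as follows. This is an illustrative Python example only: the `TokenState` class, its method names, and the 60-second safety margin are hypothetical, and the sketch deliberately makes no network calls because the refresh endpoint's path is not stated here.

```python
import json
import time

# Sample /auth/token response from the documentation (tokens are truncated).
TOKEN_RESPONSE = json.loads("""
{
  "access_token": "eyJhbGciOiJIUzI1...",
  "refresh_token": "eyJhbGciOiJIUzI1...",
  "token_type": "Bearer",
  "expires_in": 1800,
  "refresh_expires_in": 86400
}
""")


class TokenState:
    """Tracks when the access token expires so it can be refreshed in time."""

    def __init__(self, response: dict, issued_at: float):
        self.access_token = response["access_token"]
        self.refresh_token = response["refresh_token"]
        self.token_type = response["token_type"]
        # expires_in is given in seconds (1800 s = 30 minutes).
        self.expires_at = issued_at + response["expires_in"]

    def needs_refresh(self, now: float, margin: float = 60.0) -> bool:
        # Refresh slightly early; the margin is an assumed safety buffer.
        return now >= self.expires_at - margin

    def auth_header(self) -> dict:
        # Header to attach to authenticated requests.
        return {"Authorization": f"{self.token_type} {self.access_token}"}

    def refresh_body(self) -> str:
        # JSON body for the refresh request, in the shape shown above.
        return json.dumps({"refresh_token": self.refresh_token})


state = TokenState(TOKEN_RESPONSE, issued_at=time.time())
```

A client would call `state.needs_refresh(time.time())` before each request and, when it returns `True`, send `state.refresh_body()` to the refresh endpoint and build a new `TokenState` from the reply.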
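The /ask endpoint is the main endpoint for sending prompts to OpenAI and receiving responses in XML format. An example request, a successful response, and example error responses follow: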
<Request>
  <prompt>Who was James Baldwin?</prompt>
  <model>gpt-4o</model>
</Request>

<Response>
  <answer>James Baldwin was an American writer and activist who explored racial, sexual, and class distinctions in Western society...</answer>
</Response>

<Error>
  <message>Invalid XML structure. Missing 'prompt' element.</message>
</Error>

<Error>
  <message>Rate limit exceeded. Too many requests in a short period. Please wait 30 seconds before trying again.</message>
</Error>

<Error>
  <message>Token limit exceeded. Monthly usage: 105000 tokens. Limit: 100000 tokens. Please upgrade your plan or wait until next month.</message>
</Error>
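The request and response shapes above can be built and parsed with Python's standard `xml.etree.ElementTree` module. The sketch below is one possible approach; the helper names `build_request` and `parse_reply` are hypothetical, and it assumes every reply has either `Response` or `Error` as its root element, as in the examples.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def build_request(prompt: str, model: str) -> str:
    """Serialize a prompt and model name into the <Request> XML shape."""
    root = ET.Element("Request")
    ET.SubElement(root, "prompt").text = prompt
    ET.SubElement(root, "model").text = model
    # ElementTree escapes special characters such as & and < in the text.
    return ET.tostring(root, encoding="unicode")


def parse_reply(xml_text: str) -> Tuple[bool, str]:
    """Return (ok, text): the answer on success, the message on error."""
    root = ET.fromstring(xml_text)
    if root.tag == "Response":
        return True, root.findtext("answer", default="")
    if root.tag == "Error":
        return False, root.findtext("message", default="")
    raise ValueError(f"Unexpected root element: {root.tag}")


body = build_request("Who was James Baldwin?", "gpt-4o")
ok, text = parse_reply(
    "<Error><message>Invalid XML structure. Missing 'prompt' element.</message></Error>"
)
```

Building the XML through `ElementTree` rather than string formatting avoids malformed requests when a prompt contains characters such as `<` or `&`.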
All responses include rate limit information in the following headers:
- X-RateLimit-Limit: Maximum requests per minute
- X-RateLimit-Remaining: Requests remaining in the current window
- X-RateLimit-Reset: Seconds until the rate limit resets

Rate-limited responses (429) also include a Retry-After header indicating when to retry.
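A client can use these headers to decide how long to wait before its next request. The Python sketch below is illustrative: the function name `retry_delay` is hypothetical, it assumes Retry-After carries a number of seconds (the header can also be an HTTP date in general), and it expects header names with exactly the capitalization listed above.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Seconds to wait before the next request, or None if no wait is needed."""
    if status == 429:
        # Prefer Retry-After on a rate-limited response;
        # fall back to X-RateLimit-Reset if it is absent.
        value = headers.get("Retry-After") or headers.get("X-RateLimit-Reset")
        return float(value) if value is not None else None
    # On other responses, wait only if the current window is exhausted.
    if int(headers.get("X-RateLimit-Remaining", "1")) <= 0:
        return float(headers.get("X-RateLimit-Reset", "0"))
    return None


delay = retry_delay(429, {"Retry-After": "30"})
```

A caller would typically pass the returned value to `time.sleep` before retrying.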
This test endpoint accepts the same XML as /ask but returns a static response. Use it for testing without consuming OpenAI tokens.
<Response>
  <answer>This is a test response. In a production environment, this would be generated by OpenAI's API.</answer>
</Response>
The API tracks token usage for both input (prompt) and output (completion) tokens. OpenAI uses tokens to measure usage, where a token is approximately 4 characters or 0.75 words.
For planning purposes, you can estimate token usage from these ratios: divide the character count by 4, or divide the word count by 0.75.
The API automatically tracks actual token usage from the OpenAI API response.
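The rule of thumb of roughly 4 characters or 0.75 words per token can be turned into a quick estimator. The Python sketch below is only an approximation for planning; the function name `estimate_tokens` and the choice to take the larger of the two estimates are assumptions, and actual usage reported by the API may differ.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate using ~4 characters or ~0.75 words per token."""
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    # Take the larger estimate to stay on the conservative side.
    return math.ceil(max(by_chars, by_words))


estimate = estimate_tokens("Who was James Baldwin?")
```

For the 22-character, 4-word prompt above, the character rule gives 5.5 and the word rule about 5.3, so the estimate rounds up to 6 tokens.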
You can check your current usage with the /api/usage/summary endpoint:
GET /api/usage/summary?days=7
X-API-Key: your-api-key-here
{
  "usage": {
    "total_tokens": 25000,
    "prompt_tokens": 10000,
    "completion_tokens": 15000,
    "models": {
      "gpt-4o": 20000,
      "gpt-3.5-turbo": 5000
    },
    "period_days": 7
  },
  "limits": {
    "token_limit": 100000,
    "used_tokens": 25000,
    "remaining_tokens": 75000,
    "percentage_used": 25
  }
}
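A client can turn this summary into a simple budget check. The Python sketch below works on the example response shown above; the function name `usage_report` and the 80% warning threshold are hypothetical choices for illustration.

```python
import json

# Example /api/usage/summary response from the documentation.
SAMPLE_SUMMARY = json.loads("""
{
  "usage": {
    "total_tokens": 25000,
    "prompt_tokens": 10000,
    "completion_tokens": 15000,
    "models": {"gpt-4o": 20000, "gpt-3.5-turbo": 5000},
    "period_days": 7
  },
  "limits": {
    "token_limit": 100000,
    "used_tokens": 25000,
    "remaining_tokens": 75000,
    "percentage_used": 25
  }
}
""")


def usage_report(summary: dict, warn_at: float = 80.0) -> str:
    """Summarize token usage against the limit; warn_at is an assumed threshold."""
    limits = summary["limits"]
    used, limit = limits["used_tokens"], limits["token_limit"]
    pct = 100 * used / limit
    status = "WARNING" if pct >= warn_at else "ok"
    return (
        f"{status}: {used}/{limit} tokens used ({pct:.0f}%), "
        f"{limits['remaining_tokens']} remaining"
    )


report = usage_report(SAMPLE_SUMMARY)
```

Here the percentage is recomputed from `used_tokens` and `token_limit` rather than read from `percentage_used`, so the check still works if that field is rounded.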