Drop-in replacement for OpenAI with built-in analytics and cost optimization
Replace your OpenAI endpoint with Neurometric by changing a single line. All your existing code keeps working.
Just change the base_url; everything else stays the same.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("NEUROMETRIC_API_KEY"),
    base_url="https://api.neurometric.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.NEUROMETRIC_API_KEY,
  baseURL: 'https://api.neurometric.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);
curl -X POST https://api.neurometric.ai/v1/chat/completions \
  -H "Authorization: Bearer $NEUROMETRIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Analytics are automatic. Every request through Neurometric is tracked, analyzed, and available in your dashboard without any extra code or setup.
Monitor all API calls, models used, tokens consumed, and response times in real time.
Automatic cost breakdown by model, task type, and time period. Identify optimization opportunities.
Track latency, throughput, error rates, and success metrics across all your LLM requests.
Get AI-powered suggestions, based on your usage patterns, to reduce costs while maintaining quality.
Access detailed analytics, usage reports, and optimization recommendations.
All API requests require authentication using a Bearer token in the Authorization header.
Authorization: Bearer YOUR_API_KEY
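If you're not using an SDK, the same header works with any HTTP client. A minimal sketch with Python's requests library (endpoint and header format as documented above; the payload is illustrative):

import os

import requests

resp = requests.post(
    "https://api.neurometric.ai/v1/chat/completions",
    headers={
        # Bearer token authentication, as described above
        "Authorization": f"Bearer {os.environ['NEUROMETRIC_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])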
POST /chat/completions
Create a chat completion. Fully compatible with OpenAI's chat completions API format.
Use any model ID from OpenAI, Anthropic, Google, or other providers. Neurometric automatically routes to the correct provider.
OpenAI: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
Anthropic: claude-3-5-sonnet, claude-3-opus
Google: gemini-pro, gemini-flash
Meta: llama-3-70b, llama-3-8b
Mistral: mistral-large, mistral-7b
...and more: 100+ models supported
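Because routing is keyed off the model ID, switching providers is just a string change. A quick sketch, reusing the client from the quick-start example (model IDs from the list above; exact availability may vary):

# Same client, same call shape; only the model string changes.
response = client.chat.completions.create(
    model="claude-3-5-sonnet",  # routed to Anthropic automatically
    messages=[{"role": "user", "content": "Hello!"}],
)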
model (string): ID of the model to use. See supported models above. Example: "gpt-4o", "claude-3-5-sonnet"
messages (array): Array of message objects with role and content. Roles: "system", "user", "assistant"
temperature (number): Sampling temperature between 0 and 2. Higher values = more random. Default: 1
max_tokens (integer): Maximum number of tokens to generate in the response. Default is model-dependent.
top_p (number): Nucleus sampling: consider tokens with top_p probability mass. Default: 1
stop (string | array): Sequences where the API will stop generating tokens. Up to 4 sequences.
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 150
}
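For reference, here is the same request through the Python SDK with the optional parameters above filled in (the stop sequence is illustrative; uses the client from the quick-start example):

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0.7,  # 0-2, default 1
    max_tokens=150,   # cap on generated tokens
    top_p=1,          # nucleus sampling, default 1
    stop=["\n\n"],    # up to 4 stop sequences
)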
id (string): Unique identifier for the completion.
model (string): The model that generated the response.
choices (array): Array of completion choices. Each contains:
  message - the generated message object
  finish_reason - why generation stopped ("stop", "length", etc.)
  index - index of this choice
usage (object): Token usage statistics:
  prompt_tokens - tokens in the prompt
  completion_tokens - tokens in the response
  total_tokens - total tokens used
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
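With the Python SDK, these fields are attributes on the returned object. A quick sketch of reading them, continuing from the response in the example above:

choice = response.choices[0]
print(choice.message.content)  # "The capital of France is Paris."
print(choice.finish_reason)    # "stop", "length", etc.

# Token usage, useful for local logging alongside the dashboard analytics
usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)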
Errors follow a standard format with HTTP status codes and descriptive messages:
{ "error": { "message": "...", "type": "...", "code": "..." } }
401 - Invalid API key
429 - Rate limit exceeded
500 - Server error
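Because the wire format matches OpenAI's, the official SDKs should surface these statuses as their usual exception classes. A sketch with the Python SDK, assuming Neurometric returns the standard codes (uses the client from the quick-start example):

import openai

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.AuthenticationError as e:  # 401: invalid API key
    print("Check NEUROMETRIC_API_KEY:", e)
except openai.RateLimitError as e:       # 429: rate limit exceeded
    print("Back off and retry:", e)
except openai.APIStatusError as e:       # other non-2xx, e.g. 500
    print("Server error:", e.status_code, e.message)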
If you're using TypeScript, here are the type definitions for request and response objects.
// Message in a conversation
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Request body for /chat/completions
interface ChatCompletionRequest {
  model: string;
  messages: Message[];
  temperature?: number;     // 0-2, default 1
  max_tokens?: number;      // Model-dependent
  top_p?: number;           // 0-1, default 1
  stop?: string | string[]; // Up to 4 sequences
}

// Response from /chat/completions
interface ChatCompletionResponse {
  id: string;
  object: 'chat.completion';
  created: number;
  model: string;
  choices: {
    index: number;
    message: Message;
    finish_reason: 'stop' | 'length' | 'content_filter';
  }[];
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}
1. Get your Neurometric API key: create an account and generate an API key.
2. Update your base URL: change https://api.openai.com/v1 to https://api.neurometric.ai/v1.
3. Replace your API key: use your Neurometric API key instead of your OpenAI key.
No other code changes required. Neurometric uses the same request/response format as OpenAI, so your existing code works without modification. You can even continue using the official OpenAI SDK.
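As a concrete sketch of those three steps, here is the before/after for the Python SDK; only the client constructor changes:

import os

from openai import OpenAI

# Before: pointing at OpenAI directly
# client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# After: same SDK, new base URL and key
client = OpenAI(
    api_key=os.environ["NEUROMETRIC_API_KEY"],
    base_url="https://api.neurometric.ai/v1",
)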
Need help? We're here for you.