Demo environment: all data, pricing, and workflows shown are synthetic and illustrative only. Not affiliated with any AI provider.

API Documentation

Illustrative RESTful API for per-request AI cost tracking, model tier comparison, workflow cost tracing, and spend analytics. This documentation is part of a demonstration environment using synthetic request, workflow, and pricing examples.

Demo Environment — This documentation describes a demonstration API concept. All endpoints, responses, pricing data, and workflow examples shown are synthetic and illustrative only. This API is not connected to any live AI provider, billing system, or production infrastructure.

Overview

The RequestCost API provides programmatic access to per-request AI cost attribution, model tier comparison, workflow cost tracing, and spend analytics. This RESTful API is designed to integrate with AI gateways, agent orchestration systems, FinOps platforms, and internal cost management tooling.

Request-Level Attribution: Per-request cost breakdown across all contributing factors
Model Tier Comparison: Illustrative cost comparison across model tiers and workload types
Workflow Cost Tracing: Step-level cost attribution across agent chains and tool calls
Spend Forecasting: Projected monthly and annual cost from request-level data

Authentication

All API requests use Bearer token authentication passed via the Authorization header. API keys are scoped per workspace and carry configurable permission levels.

HTTP Header
Authorization: Bearer rc_live_xxxxxxxxxxxxxxxxxxxxxxxx

// Demo key format — not a real credential
Authorization: Bearer rc_demo_a1b2c3d4e5f6g7h8i9j0

| Key Type | Prefix | Scope | Rate Limit |
| --- | --- | --- | --- |
| Live Key | rc_live_ | Full API access | 1,000 / min |
| Read-Only Key | rc_read_ | GET endpoints only | 500 / min |
| Demo Key | rc_demo_ | Synthetic data only | 100 / min |
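
As a rough illustration of the key types above, the following Python sketch builds the Authorization header and looks up a key's scope and per-minute limit from its prefix. The names `KeyType`, `KEY_TYPES`, `describe_key`, and `auth_header` are hypothetical helpers, not part of the documented API.

```python
from typing import Dict, NamedTuple


class KeyType(NamedTuple):
    name: str
    scope: str
    limit_per_min: int


# Prefix table copied from the key types listed above.
KEY_TYPES: Dict[str, KeyType] = {
    "rc_live_": KeyType("Live Key", "Full API access", 1000),
    "rc_read_": KeyType("Read-Only Key", "GET endpoints only", 500),
    "rc_demo_": KeyType("Demo Key", "Synthetic data only", 100),
}


def describe_key(api_key: str) -> KeyType:
    """Return the key type whose prefix matches, or raise ValueError."""
    for prefix, key_type in KEY_TYPES.items():
        if api_key.startswith(prefix):
            return key_type
    raise ValueError("unrecognized API key prefix")


def auth_header(api_key: str) -> Dict[str, str]:
    """Build the Bearer Authorization header for a request."""
    describe_key(api_key)  # reject keys with an unknown prefix early
    return {"Authorization": f"Bearer {api_key}"}


demo_key = "rc_demo_a1b2c3d4e5f6g7h8i9j0"  # demo key format, not a real credential
print(describe_key(demo_key))
print(auth_header(demo_key))
```

Checking the prefix on the client side is only a convenience; the server remains the authority on what a key may do.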

Base URL

All API requests should be directed to the following base URL:

Base URL
https://api.requestcost.com/v1

// All endpoints are relative to this base
// Example: GET https://api.requestcost.com/v1/request/cost
This is a demonstration API concept. The endpoint above is illustrative and does not represent a live or production API surface.

Rate Limits

Rate limits apply per API key, per minute. A request that exceeds the limit receives a 429 Too Many Requests response with a Retry-After header.

Rate Limit Headers
X-RateLimit-Limit:     1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset:     1735689600
Retry-After:           34
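
A client might turn these headers into a wait time along the following lines. This is a minimal sketch: it assumes Retry-After carries a number of seconds and X-RateLimit-Reset is a Unix epoch timestamp, neither of which is stated explicitly above. The function name `seconds_to_wait` and the one-second fallback are illustrative choices.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before retrying; 0.0 if no pause is needed."""
    if status != 429:
        return 0.0
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    retry_after = lowered.get("retry-after", "")
    if retry_after.isdigit():
        return float(retry_after)
    # Fall back to the reset timestamp (assumed to be Unix epoch seconds).
    reset = lowered.get("x-ratelimit-reset", "")
    if reset.isdigit():
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return 1.0  # arbitrary fallback when neither header is usable


print(seconds_to_wait(429, {"Retry-After": "34"}))
```

Preferring Retry-After over the reset timestamp avoids depending on the client clock being in sync with the server.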

Error Codes

| Status | Code | Description |
| --- | --- | --- |
| 200 | success | Request completed successfully |
| 400 | invalid_params | Missing or malformed request parameters |
| 401 | unauthorized | Invalid or missing API key |
| 422 | unprocessable | Valid format but semantically invalid input |
| 429 | rate_limited | Rate limit exceeded; see Retry-After header |
| 500 | server_error | Unexpected server-side error |
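
One way a client could surface these codes is to map each status to its code string and raise on anything other than 200. The sketch below is illustrative only; `RequestCostError` and `check_status` are hypothetical names, and the handling of unlisted statuses is an assumption.

```python
from typing import Dict, Tuple

# Status table copied from the error codes listed above.
ERROR_CODES: Dict[int, Tuple[str, str]] = {
    200: ("success", "Request completed successfully"),
    400: ("invalid_params", "Missing or malformed request parameters"),
    401: ("unauthorized", "Invalid or missing API key"),
    422: ("unprocessable", "Valid format but semantically invalid input"),
    429: ("rate_limited", "Rate limit exceeded; see Retry-After header"),
    500: ("server_error", "Unexpected server-side error"),
}


class RequestCostError(Exception):
    """Hypothetical exception carrying the status and code string."""

    def __init__(self, status: int, code: str, description: str) -> None:
        super().__init__(f"{status} {code}: {description}")
        self.status = status
        self.code = code
        self.description = description


def check_status(status: int) -> str:
    """Return "success" for 200; raise RequestCostError for anything else."""
    code, description = ERROR_CODES.get(
        status, ("unknown", "Status not listed in the table"))
    if status != 200:
        raise RequestCostError(status, code, description)
    return code


try:
    check_status(429)
except RequestCostError as exc:
    print(exc.code, "-", exc.description)
```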

GET /request/cost

Returns a detailed cost breakdown for a specific AI request, including token costs, tool call overhead, workflow depth multiplier, and cache savings applied.

GET https://api.requestcost.com/v1/request/cost
Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| request_id | string | Yes | Unique identifier for the AI request |
| model_tier | string | No | Filter by model tier: efficient, advanced, frontier, opensource |
| include_breakdown | boolean | No | Include per-factor cost breakdown in response. Default: true |
| currency | string | No | Response currency code. Default: USD |
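
The query parameters above could be assembled into a request URL as in this sketch. The request ID `req_demo_001` is a made-up example, `request_cost_url` is a hypothetical helper, and serializing the boolean as the lowercase strings `true`/`false` is an assumption about the wire format.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.requestcost.com/v1"
MODEL_TIERS = {"efficient", "advanced", "frontier", "opensource"}


def request_cost_url(request_id: str,
                     model_tier: Optional[str] = None,
                     include_breakdown: bool = True,
                     currency: str = "USD") -> str:
    """Build the GET /request/cost URL from the documented parameters."""
    if not request_id:
        raise ValueError("request_id is required")
    if model_tier is not None and model_tier not in MODEL_TIERS:
        raise ValueError(f"unknown model_tier: {model_tier}")
    params = {"request_id": request_id}
    if model_tier is not None:
        params["model_tier"] = model_tier
    # Assumed wire format for booleans: lowercase "true" / "false".
    params["include_breakdown"] = "true" if include_breakdown else "false"
    params["currency"] = currency
    return f"{BASE_URL}/request/cost?{urlencode(params)}"


print(request_cost_url("req_demo_001", model_tier="advanced"))
```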

POST /analyze

Submits a request payload for cost analysis. Returns a full cost attribution breakdown including all contributing factors, workflow depth multiplier, and optimization recommendations.

POST https://api.requestcost.com/v1/analyze
Request Body Schema

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model_tier | string | Yes | Model tier used: efficient, advanced, frontier, opensource |
| input_tokens | integer | Yes | Total input tokens including system prompt |
| output_tokens | integer | Yes | Total output tokens generated |
| tool_calls | integer | No | Number of tool calls invoked. Default: 0 |
| workflow_depth | integer | No | Number of workflow steps. Default: 1 |
| cache_hit | boolean | No | Whether a cache hit was applied. Default: false |
| request_type | string | No | Type: chat, rag, agent, tool, batch, embedding |
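
A request body matching this schema could be built and checked before sending, as sketched below. The `AnalyzeRequest` class is hypothetical, and the range checks (non-negative counts, at least one workflow step) are assumptions that go beyond what the schema states.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

MODEL_TIERS = {"efficient", "advanced", "frontier", "opensource"}
REQUEST_TYPES = {"chat", "rag", "agent", "tool", "batch", "embedding"}


@dataclass
class AnalyzeRequest:
    model_tier: str
    input_tokens: int
    output_tokens: int
    tool_calls: int = 0          # documented default
    workflow_depth: int = 1      # documented default
    cache_hit: bool = False      # documented default
    request_type: Optional[str] = None

    def __post_init__(self) -> None:
        if self.model_tier not in MODEL_TIERS:
            raise ValueError(f"unknown model_tier: {self.model_tier}")
        # Assumed constraints: counts cannot be negative, depth starts at 1.
        if min(self.input_tokens, self.output_tokens, self.tool_calls) < 0:
            raise ValueError("token and tool call counts must be >= 0")
        if self.workflow_depth < 1:
            raise ValueError("workflow_depth must be >= 1")
        if self.request_type is not None and self.request_type not in REQUEST_TYPES:
            raise ValueError(f"unknown request_type: {self.request_type}")

    def to_json(self) -> str:
        """Serialize to a JSON body, omitting unset optional fields."""
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body)


print(AnalyzeRequest("advanced", 3200, 1200, request_type="rag").to_json())
```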

GET /compare/models

Returns an illustrative cost comparison across all available model tiers for a given request configuration. Useful for evaluating tier selection decisions based on workload type and volume.

GET https://api.requestcost.com/v1/compare/models

GET /workflow/trace

Returns a step-by-step cost attribution trace for a workflow or agent chain. Each step includes its individual cost contribution, percentage of total cost, and step type classification.

GET https://api.requestcost.com/v1/workflow/trace

POST /batch/estimate

Accepts an array of request configurations and returns aggregated cost estimates for the entire batch, with per-request and summary-level breakdowns.

Request Body — JSON
{
  "requests": [
    {
      "model_tier":    "efficient",
      "input_tokens":  1100,
      "output_tokens": 400,
      "request_type": "batch"
    },
    {
      "model_tier":    "advanced",
      "input_tokens":  3200,
      "output_tokens": 1200,
      "request_type": "rag"
    }
  ],
  "currency": "USD"
}
Response — JSON (Synthetic)
{
  "status":          "success",
  "environment":     "demo",
  "request_count":   2,
  "total_cost_usd":  0.0288,
  "avg_cost_usd":    0.0144,
  "results": [
    {
      "index":           0,
      "model_tier":      "efficient",
      "estimated_cost":  0.0012,
      "input_cost":      0.0006,
      "output_cost":     0.0006
    },
    {
      "index":           1,
      "model_tier":      "advanced",
      "estimated_cost":  0.0276,
      "input_cost":      0.0096,
      "output_cost":     0.0180
    }
  ],
  "note": "Synthetic demo response. Illustrative pricing only."
}

GET /spend/summary

Returns an aggregated spend summary for a given time period, broken down by model tier, request type, and workflow category.

Response — JSON (Synthetic)
{
  "status":        "success",
  "environment":   "demo",
  "period":        "2026-01",
  "total_requests": 248400,
  "total_cost_usd": 3847.22,
  "by_model_tier": {
    "frontier":   { "requests": 12400,  "cost_usd": 1842.10 },
    "advanced":   { "requests": 98200,  "cost_usd": 1624.88 },
    "efficient":   { "requests": 128600, "cost_usd": 342.14  },
    "opensource":  { "requests": 9200,   "cost_usd": 38.10   }
  },
  "by_request_type": {
    "rag":       1284.40,
    "agent":     1102.18,
    "chat":      842.34,
    "batch":     482.10,
    "embedding": 136.20
  },
  "note": "Synthetic demo response. Illustrative figures only."
}
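
A consumer of this response might check that the per-tier figures add up to the reported totals and derive each tier's share of spend and average cost per request. The sketch below works on the synthetic figures shown above; `tier_report` is a hypothetical helper and the rounding choices are illustrative.

```python
from typing import Dict

# Figures copied from the synthetic response above.
summary = {
    "total_requests": 248400,
    "total_cost_usd": 3847.22,
    "by_model_tier": {
        "frontier": {"requests": 12400, "cost_usd": 1842.10},
        "advanced": {"requests": 98200, "cost_usd": 1624.88},
        "efficient": {"requests": 128600, "cost_usd": 342.14},
        "opensource": {"requests": 9200, "cost_usd": 38.10},
    },
}


def tier_report(data: dict) -> Dict[str, Dict[str, float]]:
    """Per tier: share of total spend (%) and average cost per request."""
    total = data["total_cost_usd"]
    report = {}
    for tier, row in data["by_model_tier"].items():
        report[tier] = {
            "share_pct": round(row["cost_usd"] / total * 100, 1),
            "cost_per_request": row["cost_usd"] / row["requests"],
        }
    return report


tiers = summary["by_model_tier"].values()
# Consistency checks: tier rows should sum to the reported totals.
assert sum(t["requests"] for t in tiers) == summary["total_requests"]
assert round(sum(t["cost_usd"] for t in tiers), 2) == summary["total_cost_usd"]

for tier, stats in tier_report(summary).items():
    print(tier, stats["share_pct"], f"{stats['cost_per_request']:.5f}")
```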

GET /forecast

Returns a projected cost forecast based on current usage patterns and a configurable growth rate assumption. All projections are illustrative estimates based on synthetic data.

Response — JSON (Synthetic)
{
  "status":            "success",
  "environment":       "demo",
  "forecast_months":   3,
  "growth_assumption": "10% monthly",
  "projection": [
    { "month": "2026-02", "est_cost_usd": 4231.94, "est_requests": 273240 },
    { "month": "2026-03", "est_cost_usd": 4655.14, "est_requests": 300564 },
    { "month": "2026-04", "est_cost_usd": 5120.65, "est_requests": 330620 }
  ],
  "note": "Illustrative projection. Synthetic assumptions only."
}
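
The sample projection is consistent with compounding the 2026-01 totals from /spend/summary (3847.22 USD, 248,400 requests) at 10% per month. The sketch below reproduces it under that assumption; whether the endpoint computes projections this way is not stated, and `next_month` and `project` are hypothetical names.

```python
from typing import Dict, List, Union


def next_month(period: str) -> str:
    """Advance a "YYYY-MM" period by one month, rolling over the year."""
    year, month = map(int, period.split("-"))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return f"{year:04d}-{month:02d}"


def project(base_period: str, base_cost: float, base_requests: int,
            monthly_growth: float, months: int) -> List[Dict[str, Union[str, float, int]]]:
    """Compound the base month's cost and volume forward month by month."""
    rows = []
    period = base_period
    for n in range(1, months + 1):
        period = next_month(period)
        factor = (1 + monthly_growth) ** n  # compound growth after n months
        rows.append({
            "month": period,
            "est_cost_usd": round(base_cost * factor, 2),
            "est_requests": round(base_requests * factor),
        })
    return rows


for row in project("2026-01", 3847.22, 248400, 0.10, 3):
    print(row)
```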

GET /breakdown/factors

Returns an aggregated breakdown of cost by contributing factor across all requests in the specified period. Useful for identifying the primary cost drivers in your AI workload.

Response — JSON (Synthetic)
{
  "status":      "success",
  "environment": "demo",
  "period":      "2026-01",
  "factors": {
    "input_tokens":      { "cost_usd": 1542.10, "pct": 40.1 },
    "output_tokens":     { "cost_usd": 1284.40, "pct": 33.4 },
    "tool_calls":        { "cost_usd": 482.18,  "pct": 12.5 },
    "workflow_depth":    { "cost_usd": 346.24,  "pct": 9.0  },
    "retries":          { "cost_usd": 154.22,  "pct": 4.0  },
    "cache_savings":    { "cost_usd": -61.92,  "pct": -1.6 }
  },
  "note": "Synthetic demo response. Illustrative figures only."
}
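
The `pct` values in the sample are consistent with each factor's cost divided by the period's total spend of 3847.22 USD reported by /spend/summary. The following sketch recomputes them under that assumption and picks out the largest cost driver; `factor_percentages` is a hypothetical helper.

```python
from typing import Dict

# Factor costs copied from the synthetic response above.
factor_costs = {
    "input_tokens": 1542.10,
    "output_tokens": 1284.40,
    "tool_calls": 482.18,
    "workflow_depth": 346.24,
    "retries": 154.22,
    "cache_savings": -61.92,  # savings appear as a negative cost
}

# Assumption: percentages are relative to the period total from /spend/summary.
PERIOD_TOTAL_USD = 3847.22


def factor_percentages(costs: Dict[str, float], period_total: float) -> Dict[str, float]:
    """Express each factor's cost as a percentage of the period total."""
    return {name: round(cost / period_total * 100, 1)
            for name, cost in costs.items()}


percentages = factor_percentages(factor_costs, PERIOD_TOTAL_USD)
top_driver = max(factor_costs, key=factor_costs.get)
print(percentages)
print("largest cost driver:", top_driver)
```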

Cost Factors

The cost of a single AI request depends on the combination of factors below. Understanding these factors is the foundation of request-level cost intelligence.

| Factor | Impact | Description |
| --- | --- | --- |
| model_tier_selected | High | Primary cost driver; pricing varies significantly across tiers |
| input_output_token_length | High | Token count directly determines base inference cost |
| tool_calls_invoked | Medium | Each tool call adds overhead beyond base token cost |
| workflow_depth | Medium | Multi-step chains multiply base cost per step |
| retries_and_fallbacks | Medium | Failed or retried requests add to effective cost |
| caching_behavior | Reduces | Prompt caching can reduce effective input token cost |
| routing_logic | Variable | Smart routing may reduce cost by directing to lower tiers |
| provider_pricing_tier | Variable | Volume discounts and tier agreements affect effective rate |

Model Tiers

The RequestCost demo uses four illustrative model tier classifications. These tiers are generic descriptors and are not affiliated with any specific provider or model family.

| Tier | Input / 1K tokens | Output / 1K tokens | Typical Use |
| --- | --- | --- | --- |
| frontier | ~$0.015 | ~$0.060 | Complex reasoning, research, advanced agents |
| advanced | ~$0.003 | ~$0.015 | Balanced workloads, RAG, moderate complexity |
| efficient | ~$0.0005 | ~$0.0015 | High-volume, simple chat, classification |
| opensource | ~$0.0001 | ~$0.0003 | Self-hosted models (infrastructure cost not included) |

These are illustrative pricing tiers for demo purposes only. They do not represent the pricing of any specific AI provider. Open source self-hosted costs exclude infrastructure expenses.
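
Base token cost follows from these rates as tokens divided by 1,000, multiplied by the per-1K rate, computed separately for input and output. The sketch below applies that arithmetic to the illustrative tiers; `token_cost` is a hypothetical helper, and rounding to four decimals mirrors the precision used in the sample responses.

```python
from typing import Dict, Tuple

# Illustrative (input, output) USD rates per 1K tokens from the table above.
RATES_PER_1K: Dict[str, Tuple[float, float]] = {
    "frontier": (0.015, 0.060),
    "advanced": (0.003, 0.015),
    "efficient": (0.0005, 0.0015),
    "opensource": (0.0001, 0.0003),
}


def token_cost(tier: str, input_tokens: int, output_tokens: int) -> Tuple[float, float, float]:
    """Return (input_cost, output_cost, total) in USD, rounded to 4 decimals."""
    in_rate, out_rate = RATES_PER_1K[tier]
    input_cost = input_tokens / 1000 * in_rate
    output_cost = output_tokens / 1000 * out_rate
    return (round(input_cost, 4),
            round(output_cost, 4),
            round(input_cost + output_cost, 4))


# The "advanced" request from the /batch/estimate sample: 3200 in, 1200 out.
print(token_cost("advanced", 3200, 1200))
```

For the advanced request in the /batch/estimate sample, this gives an input cost of 0.0096, an output cost of 0.018, and a total of 0.0276, matching the synthetic response.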

Workflow Depth

Workflow depth refers to the number of sequential steps in an agent or automation chain. Each additional step can add cost through additional inference calls, tool invocations, and retries.

| Classification | Steps | Cost Multiplier | Example |
| --- | --- | --- | --- |
| simple | 1 | ×1.0 | Single chat turn, classification |
| moderate | 2-4 | ×1.4 | RAG pipeline, simple tool use |
| complex | 5-10 | ×2.1 | Multi-tool agent, research workflow |
| deep | 11+ | ×3.5 | Deep agent chain, autonomous task runner |

Multipliers shown are illustrative estimates used in this demonstration only. Actual workflow cost impact varies significantly based on implementation details.
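
The classification could be applied in code as in the sketch below. It assumes a 10-step workflow counts as complex and that deep starts at 11 steps, and it assumes the multiplier is applied once to the base token cost; the demo does not spell out either detail. `classify_depth` and `depth_adjusted_cost` are hypothetical names.

```python
from typing import Tuple


def classify_depth(steps: int) -> Tuple[str, float]:
    """Map a step count to (classification, illustrative cost multiplier)."""
    if steps < 1:
        raise ValueError("a workflow has at least one step")
    if steps == 1:
        return "simple", 1.0
    if steps <= 4:
        return "moderate", 1.4
    if steps <= 10:  # assumption: exactly 10 steps is still "complex"
        return "complex", 2.1
    return "deep", 3.5


def depth_adjusted_cost(base_cost: float, steps: int) -> float:
    """Scale a base token cost by the depth multiplier (assumed usage)."""
    _, multiplier = classify_depth(steps)
    return round(base_cost * multiplier, 4)


# A 3-step RAG pipeline with a base token cost of 0.0276 USD.
print(classify_depth(3), depth_adjusted_cost(0.0276, 3))
```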