API Docs — RequestCost.com

Overview

The RequestCost API provides programmatic access to per-request AI cost attribution, model tier comparison, workflow cost tracing, and spend analytics. This RESTful API is designed to integrate with AI gateways, agent orchestration systems, FinOps platforms, and internal cost management tooling.

🎯

Request-Level Attribution

Per-request cost breakdown across all contributing factors

⚖️

Model Tier Comparison

Illustrative cost comparison across model tiers and workload types

🔗

Workflow Cost Tracing

Step-level cost attribution across agent chains and tool calls

📈

Spend Forecasting

Projected monthly and annual cost from request-level data

Authentication

Demo Concept

All API requests use Bearer token authentication passed via the Authorization header. API keys are scoped per workspace and carry configurable permission levels.

HTTP Header

Authorization: Bearer rc_live_xxxxxxxxxxxxxxxxxxxxxxxx

// Demo key format — not a real credential
Authorization: Bearer rc_demo_a1b2c3d4e5f6g7h8i9j0

Key Type	Prefix	Scope	Rate Limit
Live Key	`rc_live_`	Full API access	1,000 / min
Read-Only Key	`rc_read_`	GET endpoints only	500 / min
Demo Key	`rc_demo_`	Synthetic data only	100 / min
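
As a rough illustration of the header format and key prefixes above, the following Python sketch builds (but does not send) an authenticated request. The helper names `key_type` and `build_request` are hypothetical and not part of the API; the key is the demo placeholder shown above.

```python
import urllib.request

BASE_URL = "https://api.requestcost.com/v1"

# Prefix-to-key-type mapping taken from the key table above.
KEY_PREFIXES = {
    "rc_live_": "live",
    "rc_read_": "read-only",
    "rc_demo_": "demo",
}


def key_type(api_key: str) -> str:
    """Return the key type implied by the key's prefix."""
    for prefix, name in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return name
    raise ValueError("unrecognized API key prefix")


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the Bearer token; nothing is sent."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


demo_key = "rc_demo_a1b2c3d4e5f6g7h8i9j0"  # placeholder, not a real credential
req = build_request("/request/cost", demo_key)
```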

Base URL

Demo Concept

All API requests should be directed to the following base URL:

Base URL

https://api.requestcost.com/v1

// All endpoints are relative to this base
// Example: GET https://api.requestcost.com/v1/request/cost

ℹ This is a demonstration API concept. The endpoint above is illustrative and does not represent a live or production API surface.

Rate Limits

Demo Concept

Rate limits are applied per API key, per minute. Requests that exceed the limit receive a 429 Too Many Requests response with a Retry-After header.

Rate Limit Headers

X-RateLimit-Limit:     1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset:     1735689600
Retry-After:           34
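
A client might use these headers to decide how long to pause before the next call. The sketch below is one possible approach, not a prescribed one. It assumes that Retry-After is expressed in seconds and that X-RateLimit-Reset is a Unix timestamp in seconds; neither is stated above. The function name `seconds_to_wait` is hypothetical, and header names are matched case-sensitively for brevity.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(
    status: int,
    headers: Mapping[str, str],
    now: Optional[float] = None,
) -> float:
    """Return how long to pause before the next request."""
    now = time.time() if now is None else now
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    until_reset = max(0.0, float(headers.get("X-RateLimit-Reset", now)) - now)

    if status == 429:
        # Assumption: Retry-After is a number of seconds, not an HTTP date.
        if "Retry-After" in headers:
            return float(headers["Retry-After"])
        return until_reset

    # Not rate limited yet, but the budget for this window is exhausted.
    if int(headers.get("X-RateLimit-Remaining", "1")) == 0:
        return until_reset
    return 0.0
```

A caller would typically sleep for the returned number of seconds and then retry the request.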

Error Codes

Demo Concept

HTTP Status	Code	Description
200	`success`	Request completed successfully
400	`invalid_params`	Missing or malformed request parameters
401	`unauthorized`	Invalid or missing API key
422	`unprocessable`	Valid format but semantically invalid input
429	`rate_limited`	Rate limit exceeded — see Retry-After header
500	`server_error`	Unexpected server-side error
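
The table above can be mirrored in client code so that failures surface with their documented code. This is a minimal sketch: `ApiError` and `raise_for_status` are hypothetical names, and treating only 429 and 500 as retryable is an assumption rather than documented behavior.

```python
# Status-to-code mapping copied from the error table above.
ERROR_CODES = {
    400: "invalid_params",
    401: "unauthorized",
    422: "unprocessable",
    429: "rate_limited",
    500: "server_error",
}

# Assumption: only these statuses are worth retrying.
RETRYABLE_STATUSES = {429, 500}


class ApiError(Exception):
    """Raised for any non-200 response."""

    def __init__(self, status: int, code: str, retryable: bool):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code
        self.retryable = retryable


def raise_for_status(status: int) -> None:
    """Do nothing on 200; otherwise raise ApiError with the documented code."""
    if status == 200:
        return
    code = ERROR_CODES.get(status, "unknown")
    raise ApiError(status, code, status in RETRYABLE_STATUSES)
```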

GET

/request/cost

Core

Returns a detailed cost breakdown for a specific AI request, including token costs, tool call overhead, workflow depth multiplier, and cache savings applied.

GET https://api.requestcost.com/v1/request/cost

Query Parameters

Parameter	Type	Required	Description
`request_id`	string	Yes	Unique identifier for the AI request
`model_tier`	string	No	Filter by model tier: `efficient`, `advanced`, `frontier`, `opensource`
`include_breakdown`	boolean	No	Include per-factor cost breakdown in response. Default: `true`
`currency`	string	No	Response currency code. Default: `USD`
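
The query parameters above could be assembled into a request URL along these lines. This is a sketch only: the function name `request_cost_url` and the request ID `req_demo_001` are made up for illustration, and serializing the boolean as lowercase `true`/`false` is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.requestcost.com/v1"
MODEL_TIERS = {"efficient", "advanced", "frontier", "opensource"}


def request_cost_url(
    request_id: str,
    model_tier: Optional[str] = None,
    include_breakdown: bool = True,
    currency: str = "USD",
) -> str:
    """Build the GET /request/cost URL from the documented parameters."""
    if not request_id:
        raise ValueError("request_id is required")
    if model_tier is not None and model_tier not in MODEL_TIERS:
        raise ValueError(f"unknown model_tier: {model_tier!r}")

    params = {
        "request_id": request_id,
        # Assumption: booleans are sent as lowercase strings.
        "include_breakdown": str(include_breakdown).lower(),
        "currency": currency,
    }
    if model_tier is not None:
        params["model_tier"] = model_tier
    return f"{BASE_URL}/request/cost?{urlencode(params)}"


url = request_cost_url("req_demo_001", model_tier="frontier")
```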

POST

/analyze

Core

Submits a request payload for cost analysis. Returns a full cost attribution breakdown including all contributing factors, workflow depth multiplier, and optimization recommendations.

POST https://api.requestcost.com/v1/analyze

Request Body Schema

Field	Type	Required	Description
`model_tier`	string	Yes	Model tier used: `efficient`, `advanced`, `frontier`, `opensource`
`input_tokens`	integer	Yes	Total input tokens including system prompt
`output_tokens`	integer	Yes	Total output tokens generated
`tool_calls`	integer	No	Number of tool calls invoked. Default: `0`
`workflow_depth`	integer	No	Number of workflow steps. Default: `1`
`cache_hit`	boolean	No	Whether a cache hit was applied. Default: `false`
`request_type`	string	No	Type: `chat`, `rag`, `agent`, `tool`, `batch`, `embedding`
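
A client-side helper could apply the schema's defaults and reject obviously invalid values before sending. The sketch below is illustrative: `build_analyze_body` is a hypothetical name, and the minimum values (non-negative counts, at least one workflow step) are assumptions not stated in the schema.

```python
import json
from typing import Optional

MODEL_TIERS = ("efficient", "advanced", "frontier", "opensource")
REQUEST_TYPES = ("chat", "rag", "agent", "tool", "batch", "embedding")


def build_analyze_body(
    model_tier: str,
    input_tokens: int,
    output_tokens: int,
    tool_calls: int = 0,
    workflow_depth: int = 1,
    cache_hit: bool = False,
    request_type: Optional[str] = None,
) -> str:
    """Return a JSON body for POST /analyze with schema defaults applied."""
    if model_tier not in MODEL_TIERS:
        raise ValueError(f"unknown model_tier: {model_tier!r}")
    if request_type is not None and request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request_type: {request_type!r}")

    # Assumption: counts are non-negative and a workflow has at least one step.
    checks = (
        ("input_tokens", input_tokens, 0),
        ("output_tokens", output_tokens, 0),
        ("tool_calls", tool_calls, 0),
        ("workflow_depth", workflow_depth, 1),
    )
    for name, value, minimum in checks:
        if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
            raise ValueError(f"{name} must be an integer >= {minimum}")

    body = {
        "model_tier": model_tier,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "tool_calls": tool_calls,
        "workflow_depth": workflow_depth,
        "cache_hit": cache_hit,
    }
    if request_type is not None:
        body["request_type"] = request_type
    return json.dumps(body)


payload = build_analyze_body("advanced", 3200, 1200, request_type="rag")
```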

GET

/compare/models

Core

Returns an illustrative cost comparison across all available model tiers for a given request configuration. Useful for evaluating tier selection decisions based on workload type and volume.

GET https://api.requestcost.com/v1/compare/models

Query Parameters

`input_tokens`, `output_tokens`, `request_type`

GET

/workflow/trace

Core

Returns a step-by-step cost attribution trace for a workflow or agent chain. Each step includes its individual cost contribution, percentage of total cost, and step type classification.

GET https://api.requestcost.com/v1/workflow/trace

Query Parameters

`workflow_id`, `scenario`
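
The per-step percentage of total cost described above can be derived from the individual step costs. The sketch below shows the arithmetic only; the response shape for this endpoint is not documented here, so the field names (`step`, `type`, `cost_usd`, `pct_of_total`) and the sample values are hypothetical.

```python
from typing import Dict, List


def with_percentages(steps: List[Dict]) -> List[Dict]:
    """Attach each step's share of the total workflow cost, in percent."""
    total = sum(step["cost_usd"] for step in steps)
    return [
        {
            **step,
            "pct_of_total": round(100 * step["cost_usd"] / total, 1) if total else 0.0,
        }
        for step in steps
    ]


# Made-up sample steps, for illustration only.
sample_steps = [
    {"step": 1, "type": "retrieval", "cost_usd": 0.002},
    {"step": 2, "type": "inference", "cost_usd": 0.006},
    {"step": 3, "type": "tool_call", "cost_usd": 0.002},
]
traced = with_percentages(sample_steps)
```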

POST

/batch/estimate

Core

Accepts an array of request configurations and returns aggregated cost estimates for the entire batch, with per-request and summary-level breakdowns.

Request Body — JSON

{
  "requests": [
    {
      "model_tier":    "efficient",
      "input_tokens":  1100,
      "output_tokens": 400,
      "request_type": "batch"
    },
    {
      "model_tier":    "advanced",
      "input_tokens":  3200,
      "output_tokens": 1200,
      "request_type": "rag"
    }
  ],
  "currency": "USD"
}

Response — JSON (Synthetic)

{
  "status":          "success",
  "environment":     "demo",
  "request_count":   2,
  "total_cost_usd":  0.0284,
  "avg_cost_usd":    0.0142,
  "results": [
    {
      "index":             0,
      "model_tier":       "efficient",
      "estimated_cost":   0.0008,
      "input_cost":       0.0006,
      "output_cost":      0.0002
    },
    {
      "index":             1,
      "model_tier":       "advanced",
      "estimated_cost":   0.0276,
      "input_cost":       0.0096,
      "output_cost":      0.0180
    }
  ],
  "note": "Synthetic demo response. Illustrative pricing only."
}
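
One way the summary fields could be derived from the per-request results is sketched below. It assumes that the batch total is the plain sum of each `estimated_cost`, that the average is the mean, and that values are rounded to four decimal places. The function name `summarize` is hypothetical.

```python
from typing import Dict, List


def summarize(results: List[Dict]) -> Dict:
    """Aggregate per-request estimates into batch-level summary fields."""
    total = sum(r["estimated_cost"] for r in results)
    count = len(results)
    return {
        "request_count": count,
        "total_cost_usd": round(total, 4),
        "avg_cost_usd": round(total / count, 4) if count else 0.0,
    }


# Per-request estimates copied from the synthetic response above.
results = [
    {"index": 0, "model_tier": "efficient", "estimated_cost": 0.0008},
    {"index": 1, "model_tier": "advanced", "estimated_cost": 0.0276},
]
summary = summarize(results)
```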

GET

/spend/summary

Analytics

Returns an aggregated spend summary for a given time period, broken down by model tier, request type, and workflow category.

Response — JSON (Synthetic)

{
  "status":        "success",
  "environment":   "demo",
  "period":        "2026-01",
  "total_requests": 248400,
  "total_cost_usd": 3847.22,
  "by_model_tier": {
    "frontier":   { "requests": 12400,  "cost_usd": 1842.10 },
    "advanced":   { "requests": 98200,  "cost_usd": 1624.88 },
    "efficient":   { "requests": 128600, "cost_usd": 342.14  },
    "opensource":  { "requests": 9200,   "cost_usd": 38.10   }
  },
  "by_request_type": {
    "rag":       1284.40,
    "agent":     1102.18,
    "chat":      842.34,
    "batch":     482.10,
    "embedding": 136.20
  },
  "note": "Synthetic demo response. Illustrative figures only."
}
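
A summary like this lends itself to simple derived metrics. The sketch below computes each tier's share of cost and its average cost per request from the synthetic `by_model_tier` figures above; the function names are hypothetical.

```python
from typing import Dict

# Synthetic figures copied from the response above.
by_model_tier = {
    "frontier": {"requests": 12400, "cost_usd": 1842.10},
    "advanced": {"requests": 98200, "cost_usd": 1624.88},
    "efficient": {"requests": 128600, "cost_usd": 342.14},
    "opensource": {"requests": 9200, "cost_usd": 38.10},
}


def cost_share_pct(by_tier: Dict[str, Dict]) -> Dict[str, float]:
    """Each tier's share of total cost, in percent (one decimal place)."""
    total = sum(v["cost_usd"] for v in by_tier.values())
    return {tier: round(100 * v["cost_usd"] / total, 1) for tier, v in by_tier.items()}


def cost_per_request(by_tier: Dict[str, Dict]) -> Dict[str, float]:
    """Average cost of a single request in each tier, in USD."""
    return {tier: round(v["cost_usd"] / v["requests"], 4) for tier, v in by_tier.items()}


shares = cost_share_pct(by_model_tier)
unit_costs = cost_per_request(by_model_tier)
```

In this synthetic data the `frontier` tier handles about 5% of requests but accounts for nearly half of the spend, which is the kind of imbalance these metrics are meant to surface.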

GET

/forecast

Analytics

Returns a projected cost forecast based on current usage patterns and a configurable growth rate assumption. All projections are illustrative estimates based on synthetic data.

Response — JSON (Synthetic)

{
  "status":            "success",
  "environment":       "demo",
  "forecast_months":   3,
  "growth_assumption": "10% monthly",
  "projection": [
    { "month": "2026-02", "est_cost_usd": 4231.94, "est_requests": 273240 },
    { "month": "2026-03", "est_cost_usd": 4655.14, "est_requests": 300564 },
    { "month": "2026-04", "est_cost_usd": 5120.65, "est_requests": 330620 }
  ],
  "note": "Illustrative projection. Synthetic assumptions only."
}
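
The synthetic projection above is consistent with simple compound growth applied to the January figures from the /spend/summary example (3847.22 USD and 248,400 requests). The sketch below reproduces it under that assumption; `add_months` and `project` are hypothetical helper names, and the real forecast method is not documented here.

```python
from typing import Dict, List


def add_months(period: str, n: int) -> str:
    """Advance a 'YYYY-MM' period by n months."""
    year, month = map(int, period.split("-"))
    index = year * 12 + (month - 1) + n
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def project(
    base_period: str,
    base_cost: float,
    base_requests: int,
    monthly_growth: float,
    months: int,
) -> List[Dict]:
    """Compound the base figures forward by a fixed monthly growth rate."""
    projection = []
    for n in range(1, months + 1):
        factor = (1 + monthly_growth) ** n
        projection.append(
            {
                "month": add_months(base_period, n),
                "est_cost_usd": round(base_cost * factor, 2),
                "est_requests": round(base_requests * factor),
            }
        )
    return projection


projection = project("2026-01", 3847.22, 248400, 0.10, 3)
```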

GET

/breakdown/factors

Analytics

Returns an aggregated breakdown of cost by contributing factor across all requests in the specified period. Useful for identifying the primary cost drivers in your AI workload.

Response — JSON (Synthetic)

{
  "status":      "success",
  "environment": "demo",
  "period":      "2026-01",
  "factors": {
    "input_tokens":      { "cost_usd": 1542.10, "pct": 40.1 },
    "output_tokens":     { "cost_usd": 1284.40, "pct": 33.4 },
    "tool_calls":        { "cost_usd": 482.18,  "pct": 12.5 },
    "workflow_depth":    { "cost_usd": 346.24,  "pct": 9.0  },
    "retries":          { "cost_usd": 154.22,  "pct": 4.0  },
    "cache_savings":    { "cost_usd": -61.92,  "pct": -1.6 }
  },
  "note": "Synthetic demo response. Illustrative figures only."
}
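
To pick out the primary cost drivers from a response like this, a client could rank the factors by cost and set savings aside. A minimal sketch using the synthetic figures above; `rank_drivers` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Synthetic figures copied from the response above.
factors = {
    "input_tokens": {"cost_usd": 1542.10, "pct": 40.1},
    "output_tokens": {"cost_usd": 1284.40, "pct": 33.4},
    "tool_calls": {"cost_usd": 482.18, "pct": 12.5},
    "workflow_depth": {"cost_usd": 346.24, "pct": 9.0},
    "retries": {"cost_usd": 154.22, "pct": 4.0},
    "cache_savings": {"cost_usd": -61.92, "pct": -1.6},
}


def rank_drivers(factors: Dict[str, Dict]) -> List[Tuple[str, float]]:
    """Return cost-adding factors sorted from most to least expensive."""
    drivers = [(name, f["cost_usd"]) for name, f in factors.items() if f["cost_usd"] > 0]
    return sorted(drivers, key=lambda item: item[1], reverse=True)


def savings(factors: Dict[str, Dict]) -> Dict[str, float]:
    """Return factors that reduce cost, reported as positive amounts."""
    return {name: -f["cost_usd"] for name, f in factors.items() if f["cost_usd"] < 0}


drivers = rank_drivers(factors)
saved = savings(factors)
```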

Cost Factors

Concept

A single AI request can vary significantly in cost depending on a combination of the following factors. Understanding these factors is the foundation of request-level cost intelligence.

Factor	Impact	Description
`model_tier_selected`	High	Primary cost driver: illustrative input rates range from ~$0.0001 to ~$0.015 per 1K tokens across tiers
`input_output_token_length`	High	Token count directly determines base inference cost
`tool_calls_invoked`	Medium	Each tool call adds overhead beyond base token cost
`workflow_depth`	Medium	Multi-step chains scale the base cost; the demo uses illustrative multipliers from ×1.0 to ×3.5
`retries_and_fallbacks`	Medium	Failed or retried requests add to effective cost
`caching_behavior`	Reduces	Prompt caching can reduce effective input token cost
`routing_logic`	Variable	Smart routing may reduce cost by directing to lower tiers
`provider_pricing_tier`	Variable	Volume discounts and tier agreements affect effective rate

Model Tiers

Concept

The RequestCost demo uses four illustrative model tier classifications. These tiers are generic descriptors and are not affiliated with any specific provider or model family.

Tier	Input / 1K tokens	Output / 1K tokens	Typical Use
`frontier`	~$0.015	~$0.060	Complex reasoning, research, advanced agents
`advanced`	~$0.003	~$0.015	Balanced workloads, RAG, moderate complexity
`efficient`	~$0.0005	~$0.0015	High-volume, simple chat, classification
`opensource`	~$0.0001	~$0.0003	Self-hosted infrastructure, infra cost not included

ℹ These are illustrative pricing tiers for demo purposes only. They do not represent the pricing of any specific AI provider. Rates for the self-hosted `opensource` tier exclude infrastructure costs.
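
The base token cost implied by this table is rate times tokens, per 1K, for input and output separately. The sketch below applies the illustrative rates to a single request and compares tiers; `token_cost` is a hypothetical helper, and the approximate (~) rates are treated as exact for the arithmetic.

```python
from typing import Dict, Tuple

# (input rate, output rate) in USD per 1K tokens, from the table above.
RATES_PER_1K: Dict[str, Tuple[float, float]] = {
    "frontier": (0.015, 0.060),
    "advanced": (0.003, 0.015),
    "efficient": (0.0005, 0.0015),
    "opensource": (0.0001, 0.0003),
}


def token_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Base inference cost in USD, before tool calls or depth multipliers."""
    input_rate, output_rate = RATES_PER_1K[tier]
    cost = input_tokens / 1000 * input_rate + output_tokens / 1000 * output_rate
    return round(cost, 6)


def compare_tiers(input_tokens: int, output_tokens: int) -> Dict[str, float]:
    """Cost of the same request on every tier."""
    return {tier: token_cost(tier, input_tokens, output_tokens) for tier in RATES_PER_1K}


comparison = compare_tiers(3200, 1200)
```

For a request with 3,200 input and 1,200 output tokens, the `advanced` tier works out to 0.0276 USD, which matches the second result in the /batch/estimate example.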

Workflow Depth

Concept

Workflow depth refers to the number of sequential steps in an agent or automation chain. Each additional step can add cost through additional inference calls, tool invocations, and retries.

Classification	Steps	Cost Multiplier	Example
`simple`	1	×1.0	Single chat turn, classification
`moderate`	2–4	×1.4	RAG pipeline, simple tool use
`complex`	5–10	×2.1	Multi-tool agent, research workflow
`deep`	11+	×3.5	Deep agent chain, autonomous task runner

⚠ Multipliers shown are illustrative estimates used in this demonstration only. Actual workflow cost impact varies significantly based on implementation details.
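
The classification table can be expressed as a small lookup. In this sketch a depth of exactly 10 is treated as `complex`, and the multiplier is applied to the base token cost by simple multiplication; both are assumptions, and `classify_depth` and `apply_depth` are hypothetical names.

```python
from typing import Tuple


def classify_depth(steps: int) -> Tuple[str, float]:
    """Map a step count to its classification and illustrative multiplier."""
    if steps < 1:
        raise ValueError("a workflow has at least one step")
    if steps == 1:
        return ("simple", 1.0)
    if steps <= 4:
        return ("moderate", 1.4)
    if steps <= 10:  # assumption: exactly 10 steps counts as complex
        return ("complex", 2.1)
    return ("deep", 3.5)


def apply_depth(base_cost_usd: float, steps: int) -> float:
    """Scale a base cost by the depth multiplier (assumed to be a plain product)."""
    _, multiplier = classify_depth(steps)
    return round(base_cost_usd * multiplier, 6)
```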

Pagination

Concept

List endpoints use cursor-based pagination. Each response includes a has_more flag and a next_cursor value; when has_more is true, pass next_cursor on the following request to retrieve the next page of results.

Pagination Response — JSON (Synthetic)

{
  "data": [ /* ... results ... */ ],
  "pagination": {
    "has_more":    true,
    "next_cursor": "cur_a1b2c3d4e5f6",
    "page_size":   50,
    "total_count": 4820
  }
}
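
A client can walk every page by following `next_cursor` until `has_more` is false. The sketch below keeps the HTTP call abstract: `fetch_page` is a hypothetical callable that takes a cursor (or `None` for the first page) and returns a parsed response of the shape shown above. How the cursor is sent to the server is not specified here, so that detail is left to the callable. An in-memory stand-in replaces the network call.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every item across all pages, following next_cursor."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        pagination = page["pagination"]
        if not pagination["has_more"]:
            break
        cursor = pagination["next_cursor"]


# In-memory stand-in for the API so the sketch runs without a network.
_FAKE_PAGES = {
    None: {
        "data": [{"id": 1}, {"id": 2}],
        "pagination": {"has_more": True, "next_cursor": "cur_a1b2c3d4e5f6"},
    },
    "cur_a1b2c3d4e5f6": {
        "data": [{"id": 3}],
        "pagination": {"has_more": False, "next_cursor": None},
    },
}

items = list(iter_items(_FAKE_PAGES.__getitem__))
```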
