The Frosty AI Python SDK allows you to connect to your AI router and interact with LLM providers like OpenAI, Anthropic, Mistral, and Meta—without managing each provider directly.
📦 Installation
Install from PyPI:
pip install frosty-ai
🚀 Getting Started
from frosty_ai import Frosty

frosty = Frosty(
    router_id="your-router-id",
    router_key="your-router-key"
)

response = frosty.chat([
    {"role": "user", "content": "What is Frosty AI?"}
])
print(response['response']) # ➜ "Only the best LLM router and observability tool ever. ❄️"
⚙️ Initialization
frosty = Frosty(router_id="...", router_key="...")
- router_id: Your router's unique ID from the Frosty console.
- router_key: Your router's secret API key.
This triggers an automatic connection and fetches your default, cost, and performance models for both text generation and embeddings.
💬 chat() Method
frosty.chat(prompt, context=None, rule=None, thinking=None)
| Parameter | Type | Description |
|---|---|---|
| prompt | list | A list of messages in OpenAI/Anthropic format. |
| context | str or list | Optional grounding text. If a list is provided, items are joined with newlines. |
| rule | str | (Optional) "cost" or "performance" to route to that rule's model. |
| thinking | str | (Optional) Used with Claude 3 models that support internal thoughts. |
✅ Handles fallback automatically if the primary model fails.
🧩 Using context (optional grounding)
context lets you pass extra information (RAG chunks, tone/policy instructions, user profile, etc.) alongside the prompt.
The SDK merges this into the first user message so it works across OpenAI, Anthropic, Mistral, and Meta—no provider-specific params needed.
Quick examples
- Tone / style control

from frosty_ai import Frosty

frosty = Frosty(router_id="...", router_key="...")
res = frosty.chat(
    prompt=[{"role": "user", "content": "Tell me a 10-word weather joke"}],
    context="Always answer in pirate speak."
)
print(res["response"])
- RAG: pass retrieved chunks

chunks = [
    "SeniorPlanet offers tech classes for older adults.",
    "Council on Aging provides meals, transportation, care coordination in Cincinnati."
]
res = frosty.chat(
    [{"role": "user", "content": "What services exist for older adults in Cincinnati?"}],
    context=chunks  # list is supported
)
- Load context from a file

with open("kb.txt", "r", encoding="utf-8") as f:
    kb = f.read()
res = frosty.chat(
    [{"role": "user", "content": "Summarize key points for a one-pager."}],
    context=kb
)
📌 embeddings() Method
frosty.embeddings(prompt, rule=None)
| Parameter | Type | Description |
|---|---|---|
| prompt | str or list | Text to convert to embeddings. |
| rule | str | (Optional) "cost" or "performance" to select the corresponding embedding model. |
Returns a dictionary including tokens used, time, model, and result.
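A minimal usage sketch (the input string is illustrative; print the keys to see the exact field names your router returns):

res = frosty.embeddings("Frosty AI routes requests across multiple LLM providers.")
# The dictionary includes tokens used, time, model, and the embedding result.
print(res.keys())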
🧠 Routing Logic
- If auto_route is enabled, the SDK selects the best model based on real-time weights.
- If rule="cost" or rule="performance" is passed, it uses the respective model.
- If no rule is given, it uses your primary model.
- If any call fails, Frosty falls back to the configured backup provider.
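For illustration, a minimal sketch of how each rule maps onto a chat() call (the message content is just an example):

messages = [{"role": "user", "content": "Summarize Frosty AI in one sentence."}]

res_default = frosty.chat(messages)                    # primary (or auto-routed) model
res_cost = frosty.chat(messages, rule="cost")          # cost-optimized model
res_perf = frosty.chat(messages, rule="performance")   # performance model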
✅ Supported Providers
- OpenAI
- Anthropic
- Mistral
- Meta (via Bedrock proxy)
🧪 Logging + Metrics
All requests are logged automatically.
Frosty also tracks token usage, model latency, and performance—feeding this data back to improve routing over time.
🔄 Auto-Routing Controls
Call this to let Frosty dynamically select the best model based on your weights:
frosty.set_best_model()
Or fetch the current best model chosen by the auto-router:
model = frosty.choose_best_model()
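A short sketch combining the two calls (assumes auto-routing is enabled on your router):

# Apply the auto-router's current pick before sending traffic
frosty.set_best_model()
res = frosty.chat([{"role": "user", "content": "Hello, Frosty!"}])

# Or just inspect which model the auto-router would choose right now
model = frosty.choose_best_model()
print(model)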
🧯 Error Handling
Frosty may raise custom exceptions:
| Exception | Description |
|---|---|
|  | Invalid router ID or key |
|  | API call limit exceeded |
|  | Failed response from a provider or Frosty API |
|  | SDK setup is incomplete or invalid |
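The exception class names are not shown above, so this sketch simply catches a broad Exception around a call; substitute the specific Frosty exception classes you want to handle:

try:
    res = frosty.chat([{"role": "user", "content": "Hi"}])
    print(res["response"])
except Exception as err:  # replace with the specific Frosty exception class(es)
    print(f"Frosty request failed: {err}")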