
v1.74.6

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.6

Key Highlights

  • Vector Stores - Support for Vertex RAG Engine, PG Vector, OpenAI & Azure OpenAI Vector Stores.
  • Health Check Improvements - Separate health check app on dedicated port for better Kubernetes liveness probes.
  • Control Plane + Data Plane Architecture - Enhanced proxy architecture for better scalability and separation of concerns.
  • New LLM Providers - Added support for the Moonshot API (moonshot) and v0 providers.

MCP Gateway: Enhanced Namespacing

v1.74.6 introduces improved URL-based namespacing for MCP servers, enabling better segregation and organization of MCP tools across different environments and teams.

Key features include:

  • URL-based namespacing: Better isolation between different MCP server instances
  • Access group improvements: Enhanced management of MCP server access through configuration
  • Tool permission management: Object permissions are now updated correctly when keys and teams are updated or deleted



Vector Stores API

v1.74.6 introduces OpenAI-compatible vector store endpoints, adding vector search to the LiteLLM proxy.

New Endpoints:

  • /v1/vector_stores - Create and manage vector stores
  • /v1/vector_stores/{vector_store_id}/search - Perform vector searches

Supported Providers:

  • Vertex RAG Engine - Google's managed RAG solution
  • PG Vector - PostgreSQL vector extension support
  • OpenAI Vector Stores - Full OpenAI compatibility
  • Azure AI Search - Microsoft's vector search service

This enables developers to easily integrate vector search capabilities into their applications while maintaining compatibility with OpenAI's vector store API.
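As a rough illustration of the search endpoint listed above, the sketch below builds and sends a request with the Python standard library. The proxy address, API key, and vector store ID are placeholder assumptions, and the request body follows the shape of OpenAI's vector store search API (query, max_num_results); individual providers behind the proxy may not honor every field.

```python
import json
import urllib.request

# Assumptions for illustration only: a proxy running locally on port 4000
# and a placeholder virtual key. Replace both with real values.
PROXY_BASE_URL = "http://localhost:4000"
API_KEY = "sk-1234"


def build_search_request(
    vector_store_id: str, query: str, max_num_results: int = 5
) -> urllib.request.Request:
    """Build a POST request for /v1/vector_stores/{vector_store_id}/search."""
    url = f"{PROXY_BASE_URL}/v1/vector_stores/{vector_store_id}/search"
    body = json.dumps({"query": query, "max_num_results": max_num_results})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


def search_vector_store(vector_store_id: str, query: str) -> dict:
    """Send the search request and return the parsed JSON response."""
    request = build_search_request(vector_store_id, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling search_vector_store("vs_123", "refund policy") would then query a hypothetical store named vs_123; the ID is not taken from these notes.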



Control Plane + Data Plane Architecture


New Models / Updated Models

Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Azure AI | azure_ai/grok-3 | 131k | $3.30 | $16.50 |
| Azure AI | azure_ai/global/grok-3 | 131k | $3.00 | $15.00 |
| Azure AI | azure_ai/global/grok-3-mini | 131k | $0.25 | $1.27 |
| Azure AI | azure_ai/grok-3-mini | 131k | $0.275 | $1.38 |
| Azure AI | azure_ai/jais-30b-chat | 8k | $3200 | $9710 |
| Groq | groq/moonshotai-kimi-k2-instruct | 131k | $1.00 | $3.00 |
| AI21 | jamba-large-1.7 | 256k | $2.00 | $8.00 |
| AI21 | jamba-mini-1.7 | 256k | $0.20 | $0.40 |
| Together.ai | together_ai/moonshotai/Kimi-K2-Instruct | 131k | $1.00 | $3.00 |
| v0 | v0/v0-1.0-md | 128k | $3.00 | $15.00 |
| v0 | v0/v0-1.5-md | 128k | $3.00 | $15.00 |
| v0 | v0/v0-1.5-lg | 512k | $15.00 | $75.00 |
| Moonshot | moonshot/moonshot-v1-8k | 8k | $0.20 | $2.00 |
| Moonshot | moonshot/moonshot-v1-32k | 32k | $1.00 | $3.00 |
| Moonshot | moonshot/moonshot-v1-128k | 131k | $2.00 | $5.00 |
| Moonshot | moonshot/moonshot-v1-auto | 131k | $2.00 | $5.00 |
| Moonshot | moonshot/kimi-k2-0711-preview | 131k | $0.60 | $2.50 |
| Moonshot | moonshot/moonshot-v1-32k-0430 | 32k | $1.00 | $3.00 |
| Moonshot | moonshot/moonshot-v1-128k-0430 | 131k | $2.00 | $5.00 |
| Moonshot | moonshot/moonshot-v1-8k-0430 | 8k | $0.20 | $2.00 |
| Moonshot | moonshot/kimi-latest | 131k | $2.00 | $5.00 |
| Moonshot | moonshot/kimi-latest-8k | 8k | $0.20 | $2.00 |
| Moonshot | moonshot/kimi-latest-32k | 32k | $1.00 | $3.00 |
| Moonshot | moonshot/kimi-latest-128k | 131k | $2.00 | $5.00 |
| Moonshot | moonshot/kimi-thinking-preview | 131k | $30.00 | $30.00 |
| Moonshot | moonshot/moonshot-v1-8k-vision-preview | 8k | $0.20 | $2.00 |
| Moonshot | moonshot/moonshot-v1-32k-vision-preview | 32k | $1.00 | $3.00 |
| Moonshot | moonshot/moonshot-v1-128k-vision-preview | 131k | $2.00 | $5.00 |
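
To make the per-1M-token prices concrete, here is a small sketch that turns two rows of the pricing table into a per-request cost. The PRICES dictionary and request_cost function are illustrative names, not part of LiteLLM, and the token counts in the usage note are made up.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# (input $/1M tokens, output $/1M tokens), copied from two rows of the table.
# The dictionary itself is a hypothetical helper for this example.
PRICES = {
    "moonshot/kimi-k2-0711-preview": (Decimal("0.60"), Decimal("2.50")),
    "v0/v0-1.5-lg": (Decimal("15.00"), Decimal("75.00")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the dollar cost of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_MILLION
```

For example, request_cost("moonshot/kimi-k2-0711-preview", 10_000, 2_000) works out to 10,000 × $0.60 / 1M + 2,000 × $2.50 / 1M = $0.011.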

Features

Bugs


LLM API Endpoints

Features

Bugs


MCP Gateway

Features

Bugs

  • Fix object permissions not being updated when a key or team is updated or deleted - PR #12701
  • Include /mcp in list of available routes on proxy - PR #12612

Management Endpoints / UI

Features

  • Keys
    • Regenerate Key State Management improvements - PR #12729
  • Models
    • Wildcard model filter support - PR #12597
    • Fixes for handling team-only models on UI - PR #12632
  • Usage Page
    • Fix Y-axis labels overlap on Spend per Tag chart - PR #12754
  • Teams
    • Allow setting custom key duration + show key creation stats - PR #12722
    • Enable team admins to update member roles - PR #12629
  • Users
  • Logs Page
    • Add end_user filter on UI Logs Page - PR #12663
  • MCP Servers
    • Copy MCP Server name functionality - PR #12760
  • Vector Stores
    • UI support for clicking into Vector Stores - PR #12741
    • Allow adding Vertex RAG Engine, OpenAI, Azure through UI - PR #12752
  • General
    • Add Copy-on-Click for all IDs (Key, Team, Organization, MCP Server) - PR #12615
  • SCIM
    • Add GET /ServiceProviderConfig endpoint - PR #12664

Bugs

  • Teams
    • Ensure user id correctly added when creating new teams - PR #12719
    • Fixes for handling team-only models on UI - PR #12632

Logging / Guardrail Integrations

Features

Bugs


Performance / Loadbalancing / Reliability improvements

Features

  • Health Checks
    • Separate health app for liveness probes - PR #12669
    • Health check app on separate port - PR #12718
  • Caching
  • Router
    • Handle ZeroDivisionError with zero completion tokens in lowest_latency strategy - PR #12734
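
The router fix above concerns a latency-per-token calculation that divides by the number of completion tokens. The sketch below shows one way such a guard can look; it is not LiteLLM's actual implementation, and the function names and the fallback to total latency are assumptions made for illustration.

```python
from typing import Optional


def latency_per_token(response_seconds: float, completion_tokens: int) -> float:
    """Return seconds per completion token.

    When no completion tokens were generated, fall back to the total latency
    instead of dividing by zero (an assumed fallback for this sketch).
    """
    if completion_tokens > 0:
        return response_seconds / completion_tokens
    return response_seconds


def pick_lowest_latency(samples: dict[str, tuple[float, int]]) -> Optional[str]:
    """Pick the deployment with the lowest latency per token.

    `samples` maps a deployment name to (response_seconds, completion_tokens).
    """
    if not samples:
        return None
    return min(samples, key=lambda name: latency_per_token(*samples[name]))
```

With this guard, a deployment that returned zero completion tokens is still ranked rather than crashing the selection.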

Bugs

  • Database
    • Use upsert for managed object table to avoid UniqueViolationError - PR #11795
    • Refactor to support use_prisma_migrate for helm hook - PR #12600
  • Cache
    • Fix Redis caching for embedding response models - PR #12750
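
The database fix above replaces a plain insert with an upsert so that writing the same object twice no longer raises a uniqueness error. The sketch below illustrates the general technique with an in-memory SQLite database so it runs anywhere; the proxy itself uses a different database stack, and the managed_objects table and its columns are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical managed_objects table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE managed_objects (object_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def upsert_object(conn: sqlite3.Connection, object_id: str, status: str) -> None:
    """Insert the row, or update its status if the object_id already exists."""
    conn.execute(
        """
        INSERT INTO managed_objects (object_id, status)
        VALUES (?, ?)
        ON CONFLICT(object_id) DO UPDATE SET status = excluded.status
        """,
        (object_id, status),
    )
    conn.commit()
```

Calling upsert_object twice with the same object_id leaves a single row holding the latest status, where two plain INSERT statements would fail on the primary key.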

Helm Chart

  • DB Migration Hook: Refactor to support use_prisma_migrate for helm hook - PR
  • Add envVars and extraEnvVars support to Helm migrations job - PR #12591

General Proxy Improvements

Features

  • Control Plane + Data Plane Architecture
    • Control Plane + Data Plane support - PR #12601
  • Proxy CLI
    • Add "keys import" command to CLI - PR #12620
  • Swagger Documentation
    • Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses - PR #12618
  • Dependencies
    • Loosen rich version from ==13.7.1 to >=13.7.1 - PR #12704

Bugs

  • Fix verbose logging being enabled by default - PR #12596

  • Add support for disabling callbacks in request body - PR #12762

  • Handle circular references in spend tracking metadata JSON serialization - PR #12643
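
The last fix above deals with metadata that refers back to itself, which json.dumps rejects with a ValueError. The sketch below shows one common way to make such data serializable by replacing the back-reference with a marker string; it is an illustration of the idea, not the code from the PR, and to_json_safe is a hypothetical helper name.

```python
import json
from typing import Any, Optional


def to_json_safe(value: Any, _ancestors: Optional[set] = None) -> Any:
    """Copy `value` into JSON-friendly types, marking circular references.

    `_ancestors` tracks the ids of containers currently being traversed, so
    an object that is merely shared (not circular) is still copied in full.
    """
    ancestors = _ancestors if _ancestors is not None else set()
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, (dict, list, tuple)):
        if id(value) in ancestors:
            return "<circular reference>"
        ancestors.add(id(value))
        try:
            if isinstance(value, dict):
                return {str(k): to_json_safe(v, ancestors) for k, v in value.items()}
            return [to_json_safe(item, ancestors) for item in value]
        finally:
            ancestors.discard(id(value))
    # Anything else (custom objects, datetimes, ...) is stringified.
    return str(value)
```

A dictionary that contains itself can then be passed through to_json_safe before json.dumps without raising.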


New Contributors

Full Changelog