Available models
W&B Inference powered by CoreWeave provides API and playground access to leading open-source LLMs, including OpenAI GPT OSS, Qwen3, Kimi K2, Llama 4, DeepSeek, and Phi. Weights & Biases users can develop AI applications and agents without signing up for a hosting provider or hosting models themselves.

OpenAI GPT OSS 120B
Text
New
Aug 2025
$0.15 input / $0.60 output
131K
Efficient Mixture-of-Experts model designed for high-reasoning, agentic and general-purpose use cases.

OpenAI GPT OSS 20B
Text
New
Aug 2025
$0.05 input / $0.20 output
131K
Lower-latency Mixture-of-Experts model trained on OpenAI’s Harmony response format with reasoning capabilities.

MoonshotAI Kimi K2
Text
New
Jul 2025
$1.35 input / $4.00 output
128K
Mixture-of-Experts model optimized for complex tool use, reasoning, and code synthesis.

Qwen3 Coder 480B A35B
Text
New
Jul 2025
$1.00 input / $1.50 output
262K
Mixture-of-Experts model optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning.

Qwen3 235B A22B-2507
Text
New
Jul 2025
$0.10 input / $0.10 output
262K
Efficient multilingual, Mixture-of-Experts, instruction-tuned model, optimized for logical reasoning.

Qwen3 235B A22B Thinking-2507
Text
New
Jul 2025
$0.10 input / $0.10 output
262K
High-performance Mixture-of-Experts model optimized for structured reasoning, math, and long-form generation.

Meta Llama 3.1 8B
Text
Jul 2024
$0.22 input / $0.22 output
128K
Efficient conversational model optimized for responsive multilingual chatbot interactions.

DeepSeek V3-0324
Text
Mar 2025
$1.14 input / $2.75 output
161K
Robust Mixture-of-Experts model tailored for high-complexity language processing and comprehensive document analysis.

Meta Llama 3.3 70B
Text
Dec 2024
$0.71 input / $0.71 output
128K
Multilingual model excelling in conversational tasks, detailed instruction-following, and coding.

DeepSeek R1-0528
Text
May 2025
$1.35 input / $5.40 output
161K
Optimized for precise reasoning tasks including complex coding, math, and structured document analysis.

Meta Llama 4 Scout
Text
Vision
Apr 2025
$0.17 input / $0.66 output
64K
Multimodal model integrating text and image understanding, ideal for visual tasks and combined analysis.

Microsoft Phi 4 Mini 3.8B
Text
Feb 2025
$0.08 input / $0.35 output
128K
Compact, efficient model ideal for fast responses in resource-constrained environments.
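
The prices listed above are easier to compare on a concrete workload. The sketch below assumes the prices are in US dollars per one million tokens, a unit the list itself does not state, so check the pricing page before relying on it. The PRICES table copies three entries from the list, and estimate_cost is a hypothetical helper, not part of any W&B SDK.

```python
# Prices copied from the model list above as (input, output).
# ASSUMPTION: prices are USD per 1,000,000 tokens; the list does not state the unit.
PRICES = {
    "OpenAI GPT OSS 120B": (0.15, 0.60),
    "OpenAI GPT OSS 20B": (0.05, 0.20),
    "MoonshotAI Kimi K2": (1.35, 4.00),
}

TOKENS_PER_PRICE_UNIT = 1_000_000  # assumed pricing unit


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Same workload on each model: 2,000 prompt tokens and 500 completion tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000, 500):.6f}")
```

Under the stated assumption, output tokens dominate the cost for most of these models, so the expected length of the completion matters more than the length of the prompt when choosing between them.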

import openai
import weave

# Weave autopatches OpenAI to log calls to Weave.
# Replace the placeholder with your own team and project.
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",
    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-api-key>",
    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print(response.choices[0].message.content)

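
The comment in the example above suggests keeping the API key in the environment rather than in source code. A minimal sketch of that approach follows. OPENAI_API_KEY and the base URL come from the example; the WANDB_INFERENCE_PROJECT variable name and the inference_client_kwargs helper are hypothetical, chosen only for illustration.

```python
import os

INFERENCE_BASE_URL = "https://api.inference.wandb.ai/v1"  # from the example above


def inference_client_kwargs(env=None):
    """Build keyword arguments for openai.OpenAI(...) from environment variables.

    Hypothetical helper: WANDB_INFERENCE_PROJECT is an illustrative variable
    name, not one defined by W&B or the OpenAI SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    project = env.get("WANDB_INFERENCE_PROJECT", "")
    if not api_key:
        raise ValueError("Set OPENAI_API_KEY to your W&B API key")
    # Require a non-empty "team/project" shape rather than a bare "/".
    if "/" not in project.strip("/"):
        raise ValueError("Set WANDB_INFERENCE_PROJECT to '<team>/<project>'")
    return {"base_url": INFERENCE_BASE_URL, "api_key": api_key, "project": project}
```

With the openai package installed, the result can be passed straight through as openai.OpenAI(**inference_client_kwargs()), which keeps the key out of version control.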
Quickly explore and switch between new models
New models with better performance and pricing pop up all the time, but each new model means another provider, another account, and another API key to deal with.
W&B Inference powered by CoreWeave hosts popular open-source models on CoreWeave infrastructure, and you can access them with your existing Weights & Biases account via the SDK or the UI. Test and switch between models quickly without signing up for additional API keys or hosting models yourself.
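
Because the hosted models sit behind the same OpenAI-compatible endpoint, switching models amounts to changing the model string. The sketch below sends one prompt to several models through any client that exposes chat.completions.create, such as the openai.OpenAI client from the example above. Only moonshotai/Kimi-K2-Instruct appears on this page; the second model ID is an assumption and should be checked against the model list in your account. compare_models is a hypothetical helper.

```python
def compare_models(client, model_ids, prompt):
    """Send the same prompt to each model and collect the replies by model ID.

    `client` is any object with an OpenAI-style chat.completions.create method.
    """
    replies = {}
    for model_id in model_ids:
        response = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model_id] = response.choices[0].message.content
    return replies


MODEL_IDS = [
    "moonshotai/Kimi-K2-Instruct",  # from the example above
    "openai/gpt-oss-20b",           # ASSUMPTION: ID not confirmed by this page
]
```

A call such as compare_models(client, MODEL_IDS, "Tell me a joke.") returns a dictionary keyed by model ID, which makes side-by-side comparison straightforward.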
Access models in the playground with zero configuration
Explore open-source models instantly in the playground. No model endpoints or access keys required.
Skip the hassle of configuring model endpoints and custom providers. Your Weights & Biases account gives you instant access to a wide selection of powerful open-source foundation models, fully hosted on our infrastructure. Zero configuration needed.


Easily iterate on AI applications that use open-source models
LLM-powered apps need observability tools, but open-source model hosting providers don’t offer them, forcing developers to juggle disconnected platforms for hosting and observability.
W&B Inference runs directly on CoreWeave infrastructure with observability built in through W&B Weave, so you can evaluate, monitor, and iterate on AI applications and agents without extra instrumentation, fragmented workflows, or added complexity.
Get started for free
Experimentation can quickly get expensive when every new model you test comes with a separate price plan.
We host the latest models, ready for inference within your existing Weights & Biases subscription, keeping costs low and simple with a single plan instead of managing multiple providers.
See our pricing page for more information.

The Weights & Biases end-to-end AI developer platform
The Weights & Biases platform helps you streamline your workflow from end to end
Models
Experiments
Track and visualize your ML experiments
Sweeps
Optimize your hyperparameters
Registry
Publish and share your ML models and datasets
Automations
Trigger workflows automatically
Weave
Traces
Explore and debug LLMs
Evaluations
Rigorous evaluations of GenAI applications