Why the Provider Choice Matters
Choosing an LLM provider is not just a technical decision. It affects your compliance posture, operating costs, latency profile, and long-term vendor flexibility. For .NET teams, the decision also shapes which SDKs you depend on, how you authenticate, and whether you can leverage familiar patterns like dependency injection and middleware pipelines.
The good news: the .NET ecosystem now offers genuine provider abstraction through Microsoft.Extensions.AI. But to make an informed choice, you need to understand what each provider actually brings to the table.
OpenAI Direct
OpenAI is the origin point. The company behind GPT-4, GPT-4o, and the o-series reasoning models hosts its APIs at api.openai.com and provides the official OpenAI .NET client library.
Strengths:
- First access to new models and features — OpenAI ships capabilities to its own platform before Azure
- Simple API key authentication
- Pay-as-you-go pricing with no minimum commitment
- The .NET SDK (the OpenAI NuGet package) is well-maintained and idiomatic
Limitations:
- No enterprise SLA — the platform status page is your only guarantee
- Data processing policies may not satisfy regulated industries (healthcare, finance, government)
- No virtual network isolation or private endpoints
- Content filtering is limited compared to Azure OpenAI
When to use it: Startups, indie developers, prototyping, and teams that need bleeding-edge model access on day one.
using OpenAI;
using OpenAI.Chat;
// Load the key from configuration or an environment variable — never hard-code it
var client = new OpenAIClient(Environment.GetEnvironmentVariable("OPENAI_API_KEY"));
ChatClient chatClient = client.GetChatClient("gpt-4o");
ChatCompletion completion = await chatClient.CompleteChatAsync(
[new UserChatMessage("Explain dependency injection in three sentences.")]);
Console.WriteLine(completion.Content[0].Text);
Azure OpenAI
Azure OpenAI hosts the same OpenAI models — GPT-4o, GPT-4 Turbo, the embedding models — inside Microsoft Azure. The models are identical. The difference is everything around the models: authentication, networking, compliance, and operational guarantees.
For a detailed look at the SDK, see Azure.AI.OpenAI 2.1.0 Released.
Strengths:
- Enterprise SLA (99.9% uptime) backed by Azure credit guarantees
- Data residency — choose the Azure region where your data is processed
- Azure AD / Entra ID managed identity authentication — no API keys in production
- Virtual network integration and private endpoints
- Built-in content filtering with configurable severity thresholds
- Provisioned Throughput Units (PTUs) for predictable latency at scale
- SOC 2, HIPAA, FedRAMP compliance inheritance from Azure
Limitations:
- Model availability lags behind OpenAI direct (typically 2-8 weeks)
- Requires an Azure subscription and resource provisioning
- Deployment quotas must be managed per-region
When to use it: Enterprise applications, regulated industries, any workload that needs contractual SLAs, and teams already invested in the Azure ecosystem.
using Azure.AI.OpenAI;
using Azure.Identity;
using OpenAI.Chat;
// No API key — uses Azure Managed Identity
var client = new AzureOpenAIClient(
new Uri("https://your-resource.openai.azure.com/"),
new DefaultAzureCredential());
ChatClient chatClient = client.GetChatClient("gpt-4o"); // deployment name
ChatCompletion completion = await chatClient.CompleteChatAsync(
[new UserChatMessage("Explain dependency injection in three sentences.")]);
Console.WriteLine(completion.Content[0].Text);
Notice that the code is nearly identical to the OpenAI direct example. The AzureOpenAIClient inherits from the same base types — the SDK was co-designed to make migration straightforward.
Anthropic Claude
Anthropic’s Claude models (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku) offer a genuinely different architecture. Claude uses Constitutional AI training and tends to produce more cautious, well-structured outputs. The 200K token context window in Claude 3.5 is particularly relevant for applications processing long documents.
Strengths:
- 200K token context window — process entire codebases or legal documents in a single call
- Strong performance on reasoning, code generation, and instruction-following benchmarks
- Available via direct API and Amazon Bedrock (for AWS-centric teams)
- Constitutional AI approach to safety alignment
Limitations:
- No first-party .NET SDK — use HttpClient or community packages
- Smaller ecosystem of tooling and integrations compared to OpenAI
- Fewer fine-tuning options
When to use it: Long-context workloads (document analysis, code review), teams that need an alternative provider for redundancy, and applications where Claude’s reasoning style matches the use case.
using System.Net.Http.Json;
using System.Text.Json;
// Using HttpClient — or use a community Anthropic .NET package
var httpClient = new HttpClient
{
BaseAddress = new Uri("https://api.anthropic.com/"),
DefaultRequestHeaders =
{
{ "x-api-key", "your-anthropic-key" },
{ "anthropic-version", "2023-06-01" }
}
};
var response = await httpClient.PostAsJsonAsync("v1/messages", new
{
model = "claude-3-5-sonnet-20241022",
max_tokens = 1024,
messages = new[]
{
new { role = "user", content = "Explain dependency injection in three sentences." }
}
});
var result = await response.Content.ReadFromJsonAsync<JsonElement>();
Console.WriteLine(result.GetProperty("content")[0].GetProperty("text").GetString());
While this works, most .NET teams will prefer using Anthropic through a Semantic Kernel connector or the Microsoft.Extensions.AI abstraction layer for a cleaner integration pattern.
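Until a first-party package lands, another option is a thin adapter that implements IChatClient over the same HTTP call. The sketch below is illustrative only: it assumes the preview-era Microsoft.Extensions.AI surface (CompleteAsync returning a ChatCompletion), and the class name AnthropicChatClient plus the message-mapping details are my own, not an official API.

```csharp
using Microsoft.Extensions.AI;
using System.Net.Http.Json;
using System.Text.Json;

// Hypothetical minimal adapter — type names on the Microsoft.Extensions.AI
// side may differ between preview versions; verify against your package.
public sealed class AnthropicChatClient(HttpClient httpClient, string model) : IChatClient
{
    public ChatClientMetadata Metadata { get; } =
        new("anthropic", new Uri("https://api.anthropic.com/"), model);

    public async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Map the abstraction's messages onto Anthropic's Messages API shape.
        // (System-prompt handling via Anthropic's `system` field is omitted here.)
        var response = await httpClient.PostAsJsonAsync("v1/messages", new
        {
            model,
            max_tokens = options?.MaxOutputTokens ?? 1024,
            messages = chatMessages
                .Where(m => m.Role != ChatRole.System)
                .Select(m => new { role = m.Role.Value, content = m.Text })
        }, cancellationToken);

        response.EnsureSuccessStatusCode();
        var json = await response.Content.ReadFromJsonAsync<JsonElement>(cancellationToken);
        var text = json.GetProperty("content")[0].GetProperty("text").GetString() ?? string.Empty;

        return new ChatCompletion(new ChatMessage(ChatRole.Assistant, text));
    }

    public IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException("Streaming is omitted in this sketch.");

    public object? GetService(Type serviceType, object? serviceKey = null)
        => serviceType == typeof(AnthropicChatClient) ? this : null;

    public void Dispose() { }
}
```

With an adapter like this registered in DI, the rest of the application sees only IChatClient and never learns that Anthropic is behind it.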
Google Gemini
Google’s Gemini models (Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini 2.0) are multimodal from the ground up — native support for text, images, video, and audio in a single model. Gemini 1.5 Pro offers a 1M token context window for scenarios that exceed even Claude’s capacity.
Strengths:
- Native multimodal capabilities — not bolted on, built in
- 1M token context window (Gemini 1.5 Pro)
- Available via Google AI Studio (direct) and Vertex AI (enterprise)
- Competitive pricing, especially Gemini Flash for high-throughput, lower-complexity tasks
Limitations:
- .NET SDK support is community-driven (no official Microsoft connector)
- Enterprise features require Vertex AI, which means Google Cloud Platform commitment
- Smaller presence in the .NET ecosystem compared to OpenAI/Azure
When to use it: Multimodal applications, extremely long-context workloads, teams on Google Cloud Platform, and cost-sensitive high-volume scenarios using Gemini Flash.
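Absent a first-party .NET connector, calling Gemini can be as simple as hitting the REST endpoint directly. A minimal sketch, assuming the public v1beta generateContent endpoint and an API key in a GEMINI_API_KEY environment variable:

```csharp
using System.Net.Http.Json;
using System.Text.Json;

var http = new HttpClient
{
    BaseAddress = new Uri("https://generativelanguage.googleapis.com/")
};
var apiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY");

// Gemini's REST payload nests the prompt under contents → parts → text.
var response = await http.PostAsJsonAsync(
    $"v1beta/models/gemini-1.5-flash:generateContent?key={apiKey}",
    new
    {
        contents = new[]
        {
            new { parts = new[] { new { text = "Explain dependency injection in three sentences." } } }
        }
    });

var json = await response.Content.ReadFromJsonAsync<JsonElement>();
Console.WriteLine(json
    .GetProperty("candidates")[0]
    .GetProperty("content")
    .GetProperty("parts")[0]
    .GetProperty("text")
    .GetString());
```

For production workloads on Google Cloud, the Vertex AI endpoint with service-account authentication is the more appropriate path; this direct-API form is best suited to prototyping.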
Open-Source Models via Ollama
Open-source models — Meta’s Llama 3, DeepSeek-V3, Alibaba’s Qwen 2.5 — represent a fundamentally different approach. You run the model on your own hardware, retaining full control over data flow and eliminating per-token costs.
Ollama is the simplest way to run these models locally during development:
# Download the model, then start the local API server
ollama pull llama3.3
ollama serve
Strengths:
- Complete data sovereignty — nothing leaves your network
- Zero per-token cost after hardware investment
- Models like DeepSeek-V3 and Qwen 2.5 72B rival GPT-4 on many benchmarks
- Ideal for air-gapped environments (government, defense, healthcare)
Limitations:
- Requires GPU hardware — consumer hardware for development, enterprise GPUs for production
- No managed SLA — you own the uptime, scaling, and maintenance
- Smaller models (7B-13B) trail frontier models on complex reasoning
- Ollama is a development tool — for production, consider vLLM, TGI, or Azure ML endpoints
When to use it: Development and testing, air-gapped deployments, cost optimization for high-volume low-complexity tasks, and specialized fine-tuned models.
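From .NET, the local Ollama server can then be consumed through the Microsoft.Extensions.AI.Ollama package. A short sketch, assuming the preview-era OllamaChatClient type and the default Ollama port:

```csharp
using Microsoft.Extensions.AI;

// Assumes `ollama serve` is running locally and llama3.3 has been pulled.
IChatClient client = new OllamaChatClient(
    new Uri("http://localhost:11434"), "llama3.3");

var response = await client.CompleteAsync(
    [new ChatMessage(ChatRole.User, "Explain dependency injection in three sentences.")]);

Console.WriteLine(response.Message.Text);
```

Because this is already an IChatClient, the same service code that talks to Azure OpenAI in production works unchanged against the local model.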
.NET SDK Support Matrix
Every provider has a path into .NET. Here is the current landscape:
| Provider | Primary .NET SDK | Microsoft.Extensions.AI | Semantic Kernel |
|---|---|---|---|
| OpenAI | OpenAI NuGet | Microsoft.Extensions.AI.OpenAI | Built-in connector |
| Azure OpenAI | Azure.AI.OpenAI NuGet | Microsoft.Extensions.AI.OpenAI | Built-in connector |
| Anthropic | Community / HttpClient | Community adapter | Community connector |
| Google Gemini | Google.Cloud.AIPlatform | Community adapter | Community connector |
| Ollama | OllamaSharp NuGet | Microsoft.Extensions.AI.Ollama | Community connector |
The first-class story is clearly Azure OpenAI and OpenAI. But the abstraction layer provided by Microsoft.Extensions.AI means your application logic can remain provider-agnostic.
Provider Abstraction with IChatClient
This is the pattern that changes everything for .NET teams. Instead of depending on any specific provider SDK, your services depend on IChatClient:
using Microsoft.Extensions.AI;
public class SummarizationService(IChatClient chatClient)
{
public async Task<string> SummarizeAsync(string document)
{
var response = await chatClient.CompleteAsync(
[
new ChatMessage(ChatRole.System,
"You are a summarization assistant. Produce concise summaries."),
new ChatMessage(ChatRole.User, document)
]);
return response.Message.Text ?? string.Empty;
}
}
The provider registration happens once in Program.cs. Swapping providers is a configuration change, not a code change:
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;
var builder = WebApplication.CreateBuilder(args);
// Option A: Azure OpenAI
builder.Services.AddChatClient(pipeline => pipeline
.UseLogging()
.UseOpenTelemetry()
.Use(new AzureOpenAIClient(
new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
new DefaultAzureCredential())
.AsChatClient("gpt-4o"))); // deployment name
// Option B: Ollama (local development) — uncomment to swap
// builder.Services.AddChatClient(pipeline => pipeline
//     .UseLogging()
//     .Use(new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.3")));
builder.Services.AddScoped<SummarizationService>();
builder.Services.AddScoped<SummarizationService>();
This approach is covered in depth in What is Generative AI for .NET Developers. The key insight is that IChatClient is to LLM providers what DbContext is to database providers — a stable contract that decouples your domain logic from infrastructure.
Decision Matrix for .NET Teams
Choosing a provider comes down to a handful of concrete questions:
| Requirement | Recommended Provider |
|---|---|
| Enterprise SLA + compliance | Azure OpenAI |
| Bleeding-edge model access | OpenAI direct |
| 200K+ token context | Anthropic Claude or Google Gemini |
| Multimodal (text + image + video) | Google Gemini |
| Air-gapped / on-premises | Open-source via Ollama / vLLM |
| Cost-optimized high volume | Gemini Flash or open-source |
| Existing Azure investment | Azure OpenAI |
| Existing AWS investment | Anthropic via Bedrock |
| Local development / offline | Ollama |
Many production systems use multiple providers. Azure OpenAI handles the primary workload with an SLA, while Ollama runs locally for development. Some teams add Claude as a fallback for long-context tasks. The IChatClient abstraction makes this multi-provider strategy practical without duplicating application logic.
Enterprise Considerations
Beyond the model capabilities, enterprise teams need to evaluate:
Data Residency. Azure OpenAI lets you select the Azure region. Your prompts and completions are processed in that region and not stored beyond the API request lifecycle (when abuse monitoring is disabled). OpenAI direct processes data in the United States. Anthropic and Google have their own data processing agreements — review them with your legal team.
Content Filtering. Azure OpenAI includes configurable content filtering that blocks harmful content at the API level. This is critical for customer-facing applications. OpenAI direct has less granular filtering controls. Open-source models have no built-in content filtering — you must implement your own.
Authentication. Azure OpenAI supports Azure AD / Entra ID, meaning no API keys in your production environment. OpenAI, Anthropic, and Gemini use API keys. For enterprise security posture, keyless authentication is a significant advantage.
Cost Predictability. Most providers charge per token. Azure OpenAI offers Provisioned Throughput Units (PTUs) for fixed monthly pricing with guaranteed throughput — essential for budgeting production workloads.
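To make per-token pricing concrete, a back-of-envelope estimate helps. The prices and traffic figures below are placeholders chosen for the arithmetic, not current rates; always check the provider's pricing page.

```csharp
// Hypothetical rates — substitute your provider's actual pricing.
const decimal inputPerMillion = 2.50m;   // $/1M input tokens
const decimal outputPerMillion = 10.00m; // $/1M output tokens

long requestsPerDay = 50_000;
long inputTokensPerRequest = 1_200;
long outputTokensPerRequest = 300;

// (tokens per request × rate) summed, scaled to a 30-day month.
decimal monthlyCost =
    requestsPerDay * 30 *
    (inputTokensPerRequest * inputPerMillion + outputTokensPerRequest * outputPerMillion)
    / 1_000_000m;

Console.WriteLine($"Estimated monthly cost: ${monthlyCost:N0}");
// → Estimated monthly cost: $9,000
```

Running this kind of estimate for both pay-as-you-go and PTU pricing is the quickest way to find the traffic level at which provisioned throughput pays for itself.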
A Practical Multi-Provider Strategy
Here is a pattern that works well for teams running on Azure but needing flexibility:
- Primary: Azure OpenAI with GPT-4o — handles 90% of production traffic with SLA coverage
- Development: Ollama with Llama 3.3 — fast iteration without API costs or network dependency
- Long-context fallback: Anthropic Claude — activated for documents exceeding Azure OpenAI context limits
- All accessed through IChatClient — business logic never knows which provider is active
This strategy gives you enterprise compliance where it matters, cost efficiency during development, and architectural flexibility to add providers as your needs evolve.
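One hypothetical way to wire the long-context fallback is .NET 8 keyed services. In this sketch, which assumes it sits in the Program.cs shown earlier, the keys "primary" and "long-context" and the factory methods CreateAzureClient and CreateClaudeClient are illustrative names, not real APIs:

```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;

// Keyed registrations — the factories stand in for the provider setup
// shown in the earlier registration examples.
builder.Services.AddKeyedSingleton<IChatClient>("primary",
    (sp, _) => CreateAzureClient(builder.Configuration));
builder.Services.AddKeyedSingleton<IChatClient>("long-context",
    (sp, _) => CreateClaudeClient(builder.Configuration));

// A router that picks a provider per request.
public class DocumentRouter(
    [FromKeyedServices("primary")] IChatClient primary,
    [FromKeyedServices("long-context")] IChatClient longContext)
{
    // The 100_000-character cutoff is illustrative, not a measured token limit.
    public IChatClient ClientFor(string document) =>
        document.Length > 100_000 ? longContext : primary;
}
```

Consumers inject DocumentRouter rather than IChatClient directly, so the routing policy lives in one place and each provider remains swappable.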
Summary
The LLM provider landscape is broad, but the decision framework for .NET teams is straightforward. Start with your non-functional requirements — compliance, latency, cost, context length — and let those drive the provider choice. Then wrap everything behind IChatClient so the choice remains reversible.
Azure OpenAI is the default for enterprise .NET. OpenAI direct is the fast path for experiments. Anthropic and Gemini are situationally powerful. Open-source models via Ollama give you sovereignty and offline capability. None of these are mutually exclusive, and the .NET abstraction layer makes it practical to use several simultaneously.