What Is Microsoft.Extensions.AI?
Microsoft.Extensions.AI is the official .NET abstraction layer for AI workloads. Rather than directly coding against the Azure OpenAI SDK or Ollama client, you program against IChatClient and IEmbeddingGenerator — and swap the underlying provider at registration time.
This makes your application code provider-agnostic from day one:
```csharp
// IChatClient works with Azure OpenAI, OpenAI, Ollama, or any registered provider.
public class ChatService(IChatClient chatClient)
{
    public async Task<string> CompleteAsync(string userMessage, CancellationToken ct = default)
    {
        var response = await chatClient.CompleteAsync(
            [new ChatMessage(ChatRole.User, userMessage)], cancellationToken: ct);
        return response.Message.Text ?? string.Empty;
    }
}
```
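The provider swap itself happens at registration time. A minimal sketch of what that looks like, assuming the Microsoft.Extensions.AI.Ollama and Azure OpenAI packages (constructor shapes and extension-method names vary by package version, so treat the specifics as illustrative):

```csharp
// Program.cs — the provider is chosen here; ChatService above is unchanged.
// Sketch only: adjust type and method names to your installed package versions.
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// Local development: Ollama on its default port.
builder.Services.AddChatClient(
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3"));

// Production alternative: Azure OpenAI — register this instead of the above.
// builder.Services.AddChatClient(
//     new AzureOpenAIClient(
//             new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
//             new DefaultAzureCredential())
//         .AsChatClient(builder.Configuration["AzureOpenAI:DeploymentName"]!));
```

Because ChatService depends only on IChatClient, nothing downstream changes when the registration does.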
What’s New in 10.3.0
Improved Middleware Pipeline Composition
The IChatClient pipeline now supports cleaner composition when stacking multiple middleware layers. The ChatClientBuilder extension methods have been updated to reduce boilerplate when registering logging, caching, and resilience middleware together:
```csharp
builder.Services.AddChatClient(innerClient =>
        innerClient
            .AsBuilder()
            .UseLogging()
            .UseOpenTelemetry()
            .UseRateLimitRetry() // New in 10.3.0
            .UseFunctionInvocation()
            .Build())
    .UseAzureOpenAI(opts =>
    {
        opts.Endpoint = new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!);
        opts.DeploymentName = builder.Configuration["AzureOpenAI:DeploymentName"]!;
    });
```
Built-in Rate Limit Resilience Middleware
The new UseRateLimitRetry() middleware automatically intercepts 429 Too Many Requests responses and retries after the server-specified Retry-After delay. This removes the need for a separate Polly pipeline in common LLM rate-limiting scenarios:
```csharp
// Before 10.3.0 — manual Polly ResiliencePipeline setup required.
// After 10.3.0 — built-in, one line:
.UseRateLimitRetry(maxRetries: 3)
```
The middleware respects the Retry-After header value (in seconds) returned by Azure OpenAI, OpenAI, and other compliant providers.
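Conceptually, the middleware behaves like a delegating client wrapped around the rest of the pipeline. The following is a rough sketch of the equivalent manual logic, not the shipped implementation (the real middleware reads the Retry-After value from the provider response, which a plain HttpRequestException does not surface):

```csharp
using System.Net;
using Microsoft.Extensions.AI;

// Hypothetical illustration of the retry semantics only.
public sealed class RateLimitRetryChatClient(IChatClient inner, int maxRetries = 3)
    : DelegatingChatClient(inner)
{
    public override async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await base.CompleteAsync(messages, options, cancellationToken);
            }
            catch (HttpRequestException ex) when (
                ex.StatusCode == HttpStatusCode.TooManyRequests && attempt < maxRetries)
            {
                // The shipped middleware waits for the server's Retry-After value;
                // this sketch falls back to a fixed delay for simplicity.
                await Task.Delay(TimeSpan.FromSeconds(2), cancellationToken);
            }
        }
    }
}
```

Writing this by hand is exactly the boilerplate UseRateLimitRetry() is meant to eliminate.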
OpenTelemetry gen_ai Semantic Conventions
Tracing now emits spans following the OpenTelemetry gen_ai semantic conventions, including:
| Attribute | Description |
|---|---|
| gen_ai.system | The AI provider (e.g., az.ai.openai) |
| gen_ai.request.model | Requested model name |
| gen_ai.response.finish_reasons | Finish reason(s) from the completion |
| gen_ai.usage.input_tokens | Prompt token count |
| gen_ai.usage.output_tokens | Completion token count |
Enable in your Program.cs:
```csharp
builder.Services
    .AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddSource("Microsoft.Extensions.AI")
        .AddAzureMonitorTraceExporter());
```
How to Upgrade
```shell
dotnet add package Microsoft.Extensions.AI --version 10.3.0
dotnet add package Microsoft.Extensions.AI.OpenAI --version 10.3.0

# If using Ollama:
dotnet add package Microsoft.Extensions.AI.Ollama --version 10.3.0
```
Verify the installed versions:
```shell
dotnet list package --include-transitive | grep Microsoft.Extensions.AI
```
Compatibility
| Runtime | Supported |
|---|---|
| .NET 8 LTS | ✅ Full support |
| .NET 9 | ✅ Full support |
| .NET 10 Preview | ✅ Full support |
| .NET Standard 2.0 | ✅ via netstandard2.0 target |
All Microsoft.Extensions.AI packages target netstandard2.0 in addition to net8.0, so you can consume them from class libraries without imposing a hard .NET 8+ requirement on consumers.
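For example, a plain netstandard2.0 class library can reference the package directly (project file shown as a sketch; the version matches this release):

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.AI" Version="10.3.0" />
  </ItemGroup>
</Project>
```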
Breaking Changes
None in 10.3.0; this is a fully additive release. The only deprecation concerns the IChatClient.CompleteAsync overloads that accept raw string messages (deprecated since 10.1.0); these will be removed in 11.0.0.
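If you still call the deprecated string overloads, the migration is mechanical. A sketch, assuming the deprecated shape is a single-string convenience overload as described above:

```csharp
// Before (deprecated since 10.1.0, removed in 11.0.0):
var before = await chatClient.CompleteAsync("Summarize this document.");

// After — pass explicit ChatMessage instances:
var after = await chatClient.CompleteAsync(
    [new ChatMessage(ChatRole.User, "Summarize this document.")]);
```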
Next Steps
- Microsoft.Extensions.AI on NuGet
- Official abstractions documentation
- See the Workshop tutorial for streaming chat completion for a full integration example