What Is Microsoft.Extensions.AI?
Microsoft.Extensions.AI is the official .NET abstraction layer for AI workloads. Rather than directly coding against the Azure OpenAI SDK or Ollama client, you program against IChatClient and IEmbeddingGenerator — and swap the underlying provider at registration time.
This makes your application code provider-agnostic from day one:
```csharp
// IChatClient works with Azure OpenAI, OpenAI, Ollama, or any registered provider.
public class ChatService(IChatClient chatClient)
{
    public async Task<string> CompleteAsync(string userMessage, CancellationToken ct = default)
    {
        var response = await chatClient.CompleteAsync(
            [new ChatMessage(ChatRole.User, userMessage)], cancellationToken: ct);
        return response.Message.Text ?? string.Empty;
    }
}
```
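The provider swap itself happens at registration time. A minimal sketch of what that looks like, assuming the Microsoft.Extensions.AI.Ollama and Azure OpenAI packages (constructor shapes and extension-method names vary by package version, so treat the specifics as illustrative):

```csharp
// Program.cs — the provider is chosen here; ChatService above is unchanged.
// Sketch only: adjust type and method names to your installed package versions.
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// Local development: Ollama on its default port.
builder.Services.AddChatClient(
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3"));

// Production alternative: Azure OpenAI — register this instead of the above.
// builder.Services.AddChatClient(
//     new AzureOpenAIClient(
//             new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
//             new DefaultAzureCredential())
//         .AsChatClient(builder.Configuration["AzureOpenAI:DeploymentName"]!));
```

Because ChatService depends only on IChatClient, nothing downstream changes when the registration does.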
What’s New in 10.3.0
Improved Middleware Pipeline Composition
The IChatClient pipeline now supports cleaner composition when stacking multiple middleware layers. The ChatClientBuilder extension methods have been updated to reduce boilerplate when registering logging, caching, and resilience middleware together:
```csharp
builder.Services.AddChatClient(innerClient =>
        innerClient
            .AsBuilder()
            .UseLogging()
            .UseOpenTelemetry()
            .UseRateLimitRetry() // New in 10.3.0
            .UseFunctionInvocation()
            .Build())
    .UseAzureOpenAI(opts =>
    {
        opts.Endpoint = new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!);
        opts.DeploymentName = builder.Configuration["AzureOpenAI:DeploymentName"]!;
    });
```
Built-in Rate Limit Resilience Middleware
The new UseRateLimitRetry() middleware automatically intercepts 429 Too Many Requests responses and retries after the server-specified Retry-After delay. This removes the need for a separate Polly pipeline in common LLM rate-limiting scenarios:
```csharp
// Before 10.3.0 — manual Polly ResiliencePipeline setup required.
// After 10.3.0 — built-in, one line:
.UseRateLimitRetry(maxRetries: 3)
```
The middleware respects the Retry-After header value (in seconds) returned by Azure OpenAI, OpenAI, and other compliant providers.
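Conceptually, the middleware behaves like a delegating client wrapped around the rest of the pipeline. The following is a rough sketch of the equivalent manual logic, not the shipped implementation (the real middleware reads the Retry-After value from the provider response, which a plain HttpRequestException does not surface):

```csharp
using System.Net;
using Microsoft.Extensions.AI;

// Hypothetical illustration of the retry semantics only.
public sealed class RateLimitRetryChatClient(IChatClient inner, int maxRetries = 3)
    : DelegatingChatClient(inner)
{
    public override async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await base.CompleteAsync(messages, options, cancellationToken);
            }
            catch (HttpRequestException ex) when (
                ex.StatusCode == HttpStatusCode.TooManyRequests && attempt < maxRetries)
            {
                // The shipped middleware waits for the server's Retry-After value;
                // this sketch falls back to a fixed delay for simplicity.
                await Task.Delay(TimeSpan.FromSeconds(2), cancellationToken);
            }
        }
    }
}
```

Writing this by hand is exactly the boilerplate UseRateLimitRetry() is meant to eliminate.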
OpenTelemetry gen_ai Semantic Conventions
Tracing now emits spans following the OpenTelemetry gen_ai semantic conventions, including:
| Attribute | Description |
|---|---|
| gen_ai.system | The AI provider (e.g., az.ai.openai) |
| gen_ai.request.model | Requested model name |
| gen_ai.response.finish_reasons | Finish reason(s) from the completion |
| gen_ai.usage.input_tokens | Prompt token count |
| gen_ai.usage.output_tokens | Completion token count |
Enable in your Program.cs:
```csharp
builder.Services
    .AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddSource("Microsoft.Extensions.AI")
        .AddAzureMonitorTraceExporter());
```
How to Upgrade
```shell
dotnet add package Microsoft.Extensions.AI --version 10.3.0
dotnet add package Microsoft.Extensions.AI.OpenAI --version 10.3.0

# If using Ollama:
dotnet add package Microsoft.Extensions.AI.Ollama --version 10.3.0
```
Verify the installed versions:
```shell
dotnet list package --include-transitive | grep Microsoft.Extensions.AI
```
Compatibility
| Runtime | Supported |
|---|---|
| .NET 8 LTS | ✅ Full support |
| .NET 9 | ✅ Full support |
| .NET 10 Preview | ✅ Full support |
| .NET Standard 2.0 | ✅ via netstandard2.0 target |
All Microsoft.Extensions.AI packages target netstandard2.0 in addition to net8.0, so you can consume them from class libraries without imposing a hard .NET 8+ requirement on consumers.
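For example, a plain netstandard2.0 class library can reference the package directly (project file shown as a sketch; the version matches this release):

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.AI" Version="10.3.0" />
  </ItemGroup>
</Project>
```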
Breaking Changes
None in 10.3.0; this is a fully additive release. The only deprecation concerns the IChatClient.CompleteAsync overloads that accept raw string messages (deprecated since 10.1.0); these will be removed in 11.0.0.
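If you still call the deprecated string overloads, the migration is mechanical. A sketch, assuming the deprecated shape is a single-string convenience overload as described above:

```csharp
// Before (deprecated since 10.1.0, removed in 11.0.0):
var before = await chatClient.CompleteAsync("Summarize this document.");

// After — pass explicit ChatMessage instances:
var after = await chatClient.CompleteAsync(
    [new ChatMessage(ChatRole.User, "Summarize this document.")]);
```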
Next Steps
- Microsoft.Extensions.AI on NuGet
- Official abstractions documentation
- See the Workshop tutorial for streaming chat completion for a full integration example