Microsoft.Extensions.AI 10.3.0 Released — Unified AI Abstractions for .NET

By Rajesh Mishra · Feb 18, 2026 · Verified: Feb 18, 2026 · 5 min read

What Is Microsoft.Extensions.AI?

Microsoft.Extensions.AI is the official .NET abstraction layer for AI workloads. Rather than directly coding against the Azure OpenAI SDK or Ollama client, you program against IChatClient and IEmbeddingGenerator — and swap the underlying provider at registration time.

This makes your application code provider-agnostic from day one:

// IChatClient works with Azure OpenAI, OpenAI, Ollama, or any registered provider
public class ChatService(IChatClient chatClient)
{
    public async Task<string> CompleteAsync(string userMessage, CancellationToken ct = default)
    {
        var response = await chatClient.CompleteAsync(
            [new ChatMessage(ChatRole.User, userMessage)], cancellationToken: ct);

        return response.Message.Text ?? string.Empty;
    }
}
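Because ChatService depends only on IChatClient, moving between providers is a registration-time change. A minimal sketch of a local-development registration backed by Ollama — the endpoint and the "llama3" model name are placeholder values for your own setup, and OllamaChatClient ships in the Microsoft.Extensions.AI.Ollama package:

```csharp
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// Local development: back IChatClient with an Ollama model.
// Endpoint and model name are placeholders for your own setup.
builder.Services.AddChatClient(
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3"));

// ChatService keeps its IChatClient constructor dependency; only this
// registration changes when moving to Azure OpenAI or another provider.
builder.Services.AddScoped<ChatService>();
```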

What’s New in 10.3.0

Improved Middleware Pipeline Composition

The IChatClient pipeline now supports cleaner composition when stacking multiple middleware layers. The ChatClientBuilder extension methods have been updated to reduce boilerplate when registering logging, caching, and resilience middleware together:

// Provider registration (Microsoft.Extensions.AI.OpenAI + Azure.Identity),
// followed by the middleware pipeline. Outermost middleware first.
var azureOpenAI = new AzureOpenAIClient(
    new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
    new DefaultAzureCredential());

builder.Services
    .AddChatClient(azureOpenAI
        .GetChatClient(builder.Configuration["AzureOpenAI:DeploymentName"]!)
        .AsIChatClient())
    .UseLogging()
    .UseOpenTelemetry()
    .UseRateLimitRetry()          // New in 10.3.0
    .UseFunctionInvocation();

Built-in Rate Limit Resilience Middleware

The new UseRateLimitRetry() middleware automatically intercepts 429 Too Many Requests responses and retries with the server-specified Retry-After delay. This removes the need for a separate Polly pipeline for common LLM rate limiting scenarios:

// Before 10.3.0 — manual Polly ResiliencePipeline setup required
// After 10.3.0 — built-in, one line:
.UseRateLimitRetry(maxRetries: 3)

The middleware respects the Retry-After header value (in seconds) returned by Azure OpenAI, OpenAI, and other compliant providers.
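The behavior described above amounts to reading Retry-After and delaying before the next attempt. Below is an illustrative reimplementation of just the delay calculation, not the library's actual source; note that per RFC 9110 the header may also carry an HTTP-date rather than delta-seconds, which the sketch handles as well:

```csharp
using System;
using System.Net;
using System.Net.Http;

// Sketch of the delay computation a Retry-After-aware retry policy performs:
// honor the server-specified delay on a 429, falling back to a fixed delay
// when the header is absent. Illustrative only, not the library's source.
static TimeSpan GetRetryDelay(HttpResponseMessage response, TimeSpan fallback)
{
    var retryAfter = response.Headers.RetryAfter;
    if (retryAfter?.Delta is { } delta) return delta;   // "Retry-After: 5"
    if (retryAfter?.Date is { } date)                   // "Retry-After: <http-date>"
        return date - DateTimeOffset.UtcNow;
    return fallback;
}

var throttled = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
throttled.Headers.Add("Retry-After", "5");
Console.WriteLine(GetRetryDelay(throttled, TimeSpan.FromSeconds(2))); // 00:00:05
```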

OpenTelemetry gen_ai Semantic Conventions

Tracing now emits spans following the OpenTelemetry gen_ai semantic conventions, including:

Attribute                         Description
gen_ai.system                     The AI provider (e.g., az.ai.openai)
gen_ai.request.model              Requested model name
gen_ai.response.finish_reasons    Finish reason(s) from the completion
gen_ai.usage.input_tokens         Prompt token count
gen_ai.usage.output_tokens        Completion token count

Enable in your Program.cs:

builder.Services
    .AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddSource("Microsoft.Extensions.AI")
        .AddAzureMonitorTraceExporter());
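For local debugging without an Azure Monitor workspace, the same activity source can be routed to any exporter; a console sketch, assuming the OpenTelemetry.Exporter.Console package is installed:

```csharp
builder.Services
    .AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddSource("Microsoft.Extensions.AI")
        .AddConsoleExporter());   // writes spans, with their gen_ai.* attributes, to stdout
```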

How to Upgrade

dotnet add package Microsoft.Extensions.AI --version 10.3.0
dotnet add package Microsoft.Extensions.AI.OpenAI --version 10.3.0
# If using Ollama:
dotnet add package Microsoft.Extensions.AI.Ollama --version 10.3.0

Verify the installed versions:

dotnet list package --include-transitive | grep Microsoft.Extensions.AI

Compatibility

Runtime              Supported
.NET 8 LTS           ✅ Full support
.NET 9               ✅ Full support
.NET 10 Preview      ✅ Full support
.NET Standard 2.0    ✅ via netstandard2.0 target

All Microsoft.Extensions.AI packages target netstandard2.0 in addition to net8.0, so you can use them in class libraries without a .NET 8+ hard requirement.

Breaking Changes

None in 10.3.0; the release is fully additive. One existing deprecation carries over: the IChatClient.CompleteAsync overloads that accepted raw string messages (deprecated in 10.1.0) will be removed in 11.0.0.
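If any call sites still use the deprecated string overloads, the migration is mechanical. A sketch against the CompleteAsync signature used throughout this post, where chatClient is an injected IChatClient as in the first example:

```csharp
// Deprecated since 10.1.0, removed in 11.0.0:
// var response = await chatClient.CompleteAsync("Summarize this document.");

// Replacement: wrap the prompt in an explicit ChatMessage list.
var response = await chatClient.CompleteAsync(
    [new ChatMessage(ChatRole.User, "Summarize this document.")]);
```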

Summary

Microsoft.Extensions.AI 10.3.0 ships updated IChatClient and IEmbeddingGenerator interfaces with improved middleware pipeline composition, enhanced OpenTelemetry tracing, and new resilience middleware for 429 rate limit handling. Compatible with .NET 8 LTS and later.

Key Takeaways

  • IChatClient and IEmbeddingGenerator now support richer pipeline middleware composition
  • New built-in resilience middleware handles 429 rate limit responses automatically
  • OpenTelemetry integration now emits gen_ai.* semantic convention spans
  • Compatible with .NET 8 LTS — no need to upgrade to .NET 10
  • Semantic Kernel 1.34+ delegates to IChatClient under the hood

Implementation Checklist

  • Update Microsoft.Extensions.AI NuGet package to 10.3.0
  • Update Microsoft.Extensions.AI.OpenAI to matching version
  • Review middleware pipeline for any breaking changes in builder API
  • Enable OpenTelemetry gen_ai spans for observability
  • Test Ollama and Azure OpenAI backends with updated client

Frequently Asked Questions

What is Microsoft.Extensions.AI?

Microsoft.Extensions.AI is a set of core .NET libraries that provide a unified abstraction layer for interacting with AI services. The key interfaces — IChatClient and IEmbeddingGenerator — let you write AI-integration code that works with Azure OpenAI, OpenAI, Ollama, and other providers without locking into a single SDK.

Is Microsoft.Extensions.AI stable for production use?

Yes. From version 9.x onward, the core abstractions (IChatClient, IEmbeddingGenerator) are stable API surface. The 10.3.0 release continues the stable track with new middleware pipeline improvements and telemetry hooks.

How does Microsoft.Extensions.AI relate to Semantic Kernel?

They are complementary. Microsoft.Extensions.AI provides low-level, provider-agnostic abstractions. Semantic Kernel builds on these abstractions (since SK 1.30+) and adds orchestration, planning, memory, and agent-level capabilities on top.


#Microsoft.Extensions.AI #.NET 8+ #AI Abstractions #NuGet Release #IChatClient