
Microsoft.Extensions.AI: Foundation for .NET AI

Verified Mar 2026 · Intermediate · Original · .NET 9 · Microsoft.Extensions.AI 1.0.3 · Microsoft.Extensions.AI.Abstractions 1.0.3
By Rajesh Mishra · Mar 11, 2026 · 12 min read
In 30 Seconds

Microsoft.Extensions.AI is the .NET abstraction layer for AI services. It defines the IChatClient and IEmbeddingGenerator interfaces that all providers implement. It sits below Semantic Kernel in the stack, providing provider-agnostic AI access with native DI support, and it supports middleware pipelines for logging, caching, rate limiting, and telemetry.

Why This Layer Exists

Before Microsoft.Extensions.AI, every AI provider in .NET had its own client library with its own API surface. Azure OpenAI used AzureOpenAIClient. OpenAI used OpenAIClient. Ollama used third-party libraries with completely different method signatures. Switching providers meant rewriting every call site.

Microsoft.Extensions.AI solves this the same way ILogger solved logging — by defining a common interface that all providers implement. You program against IChatClient, register the provider in DI, and swap implementations without changing your business code.

Where It Sits in the Stack

Your Application Code
        ↓
Microsoft Agent Framework (optional): Agents · Orchestration · MCP
        ↓
Semantic Kernel (optional): Plugins · Memory · Planning
        ↓
Microsoft.Extensions.AI: IChatClient · IEmbeddingGenerator
        ↓
Azure OpenAI / OpenAI / Ollama provider calls

Microsoft.Extensions.AI is the abstraction layer. If you only need chat completion or embedding generation, stop at IChatClient; Semantic Kernel and Agent Framework are optional layers that add orchestration and agent capabilities on top.

The Two Core Interfaces

IChatClient

The interface for text generation — chat, completion, reasoning:

public interface IChatClient : IDisposable
{
    Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    object? GetService(Type serviceType, object? serviceKey = null);
}

Two request methods, one buffered and one streaming, plus a GetService escape hatch for retrieving provider-specific services such as ChatClientMetadata. That is the entire contract. (Earlier previews exposed a Metadata property directly; the released interface retrieves it through GetService.)
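Because the contract is this small, it is easy to fake in unit tests. A minimal canned-response stub might look like this (FakeChatClient is a hypothetical name for illustration; note that the released interface also carries a GetService member for provider-specific services):

```csharp
using System.Runtime.CompilerServices;
using Microsoft.Extensions.AI;

// Hypothetical canned-response client for unit-testing code that depends on IChatClient.
public sealed class FakeChatClient(string reply) : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, reply)));

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await Task.Yield();
        yield return new ChatResponseUpdate(ChatRole.Assistant, reply);
    }

    public object? GetService(Type serviceType, object? serviceKey = null) => null;
    public void Dispose() { }
}
```

Register it with builder.Services.AddChatClient(_ => new FakeChatClient("stub reply")) and any service that depends on IChatClient runs without a network call.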

IEmbeddingGenerator

The interface for vector embeddings:

public interface IEmbeddingGenerator<in TInput, TEmbedding> : IDisposable
    where TEmbedding : Embedding
{
    Task<GeneratedEmbeddings<TEmbedding>> GenerateAsync(
        IEnumerable<TInput> values,
        EmbeddingGenerationOptions? options = null,
        CancellationToken cancellationToken = default);
}

Turns text (or other inputs) into vectors. Used by RAG systems, semantic search, and memory stores.

Getting Started

Install the package:

dotnet add package Microsoft.Extensions.AI

For Azure OpenAI:

dotnet add package Azure.AI.OpenAI

Basic Chat Completion

using Azure.AI.OpenAI;
using Microsoft.Extensions.AI;

IChatClient client = new AzureOpenAIClient(
    new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
    new System.ClientModel.ApiKeyCredential(
        Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!))
    .GetChatClient("chat-deployment")
    .AsIChatClient();

var response = await client.GetResponseAsync("What is dependency injection?");
Console.WriteLine(response.Text);

The key method is .AsIChatClient() — it wraps the provider-specific client in the IChatClient interface.
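The string overload is a convenience extension. For multi-turn conversations you pass a list of ChatMessage values with explicit roles, and ChatOptions tunes the request. A sketch, reusing the client from above:

```csharp
// Build a conversation with explicit roles.
List<ChatMessage> history =
[
    new(ChatRole.System, "You are a concise .NET assistant."),
    new(ChatRole.User, "What is dependency injection?"),
];

var response = await client.GetResponseAsync(history, new ChatOptions
{
    Temperature = 0.2f,       // lower = more deterministic
    MaxOutputTokens = 300,    // cap the reply length
});

// Fold the assistant's reply back into the history for the next turn.
history.AddRange(response.Messages);
history.Add(new(ChatRole.User, "Now show a one-line example."));
```

Because the history is just a list you own, chat-history management (trimming, summarizing) stays in your code unless you bring in a higher layer like Semantic Kernel.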

Using Dependency Injection

The real power shows in DI-based applications:

using Azure.AI.OpenAI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

// Register the AI client — change this one line to switch providers
builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(
        new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
        new System.ClientModel.ApiKeyCredential(
            builder.Configuration["AzureOpenAI:Key"]!))
    .GetChatClient("chat-deployment")
    .AsIChatClient());

builder.Services.AddTransient<MyAiService>();

var app = builder.Build();
var service = app.Services.GetRequiredService<MyAiService>();
await service.RunAsync();

Your service depends only on IChatClient:

public class MyAiService
{
    private readonly IChatClient _chatClient;

    public MyAiService(IChatClient chatClient)
    {
        _chatClient = chatClient;
    }

    public async Task RunAsync()
    {
        var response = await _chatClient.GetResponseAsync(
            "Explain the strategy pattern in 3 sentences.");
        Console.WriteLine(response.Text);
    }
}

Switching to Ollama

To swap Azure OpenAI for a local Ollama instance, change only the registration:

dotnet add package Microsoft.Extensions.AI.Ollama
// Before (Azure OpenAI)
builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(endpoint, key)
        .GetChatClient("chat-deployment")
        .AsIChatClient());

// After (Ollama - local)
builder.Services.AddChatClient(services =>
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3"));

MyAiService doesn’t change at all. It still receives an IChatClient and calls GetResponseAsync on it.
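If you want the provider chosen at runtime rather than at compile time, the registration itself can branch on configuration. A sketch, assuming a hypothetical AI:Provider configuration key:

```csharp
// Hypothetical "AI:Provider" key selects the implementation;
// business code still depends only on IChatClient.
builder.Services.AddChatClient(services =>
    builder.Configuration["AI:Provider"] switch
    {
        "ollama" => (IChatClient)new OllamaChatClient(
            new Uri("http://localhost:11434"), "llama3"),

        _ => new AzureOpenAIClient(
                new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
                new System.ClientModel.ApiKeyCredential(
                    builder.Configuration["AzureOpenAI:Key"]!))
            .GetChatClient("chat-deployment")
            .AsIChatClient(),
    });
```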

Middleware Pipelines

The killer feature of Microsoft.Extensions.AI is composable middleware. You can wrap any IChatClient with cross-cutting concerns:

Logging

builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(endpoint, key)
        .GetChatClient("chat-deployment")
        .AsIChatClient()
        .AsBuilder()
        .UseLogging()
        .Build(services));

Every request and response gets logged through ILogger.

Caching

builder.Services.AddDistributedMemoryCache();

builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(endpoint, key)
        .GetChatClient("chat-deployment")
        .AsIChatClient()
        .AsBuilder()
        .UseDistributedCache()
        .Build(services));

Identical prompts return cached responses, saving tokens and reducing latency.

Stacking Middleware

Middleware composes. Order matters — outermost wraps first:

builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(endpoint, key)
        .GetChatClient("chat-deployment")
        .AsIChatClient()
        .AsBuilder()
        .UseLogging()              // Log all requests
        .UseDistributedCache()     // Cache after logging
        .UseOpenTelemetry()        // Trace all calls
        .Build(services));

This gives you observability (logging + telemetry) and performance (caching) across every AI call in your application, regardless of which service makes the call. You can also add resilience middleware to handle 429 rate limit errors from Azure OpenAI — see Fix Azure OpenAI 429 Too Many Requests in .NET for the Polly-based patterns that integrate cleanly with this pipeline.
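You are not limited to the built-in middleware. Subclass DelegatingChatClient, override only the calls you care about, and splice it into the pipeline with Use. A sketch of a timing middleware (TimingChatClient is a hypothetical name):

```csharp
using System.Diagnostics;
using Microsoft.Extensions.AI;

// Hypothetical middleware that measures each non-streaming chat call.
public sealed class TimingChatClient(IChatClient innerClient)
    : DelegatingChatClient(innerClient)
{
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        var response = await base.GetResponseAsync(messages, options, cancellationToken);
        Console.WriteLine($"Chat call took {stopwatch.ElapsedMilliseconds} ms");
        return response;
    }
}
```

Add `.Use(inner => new TimingChatClient(inner))` anywhere in the builder chain; its position determines what it measures (outside the cache it times every request, inside it times only cache misses).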

Streaming

Every IChatClient supports streaming out of the box:

await foreach (var update in _chatClient.GetStreamingResponseAsync("Explain SOLID principles"))
{
    Console.Write(update.Text);
}

This works with Azure OpenAI, OpenAI, Ollama — any provider. The streaming contract is part of the interface.
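When you want streaming on the wire but a complete ChatResponse in memory afterward (for logging or chat history), the library ships extensions to coalesce the updates:

```csharp
// Stream to the console while buffering every update, then coalesce them.
var updates = new List<ChatResponseUpdate>();
await foreach (var update in _chatClient.GetStreamingResponseAsync("Explain SOLID principles"))
{
    Console.Write(update.Text);
    updates.Add(update);
}

// Combine the buffered updates into a single ChatResponse.
ChatResponse response = updates.ToChatResponse();
```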

Embedding Generation

For RAG and semantic search, use IEmbeddingGenerator:

using Azure.AI.OpenAI;
using Microsoft.Extensions.AI;

IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator =
    new AzureOpenAIClient(endpoint, key)
        .GetEmbeddingClient("text-embedding-3-small")
        .AsIEmbeddingGenerator();

var embeddings = await embeddingGenerator.GenerateAsync(
    ["What is dependency injection?", "Explain the repository pattern"]);

foreach (var embedding in embeddings)
{
    Console.WriteLine($"Vector dimension: {embedding.Vector.Length}");
}
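Once you have vectors, ranking is plain arithmetic. A minimal cosine-similarity helper for scoring a query against the embeddings above (for production workloads, TensorPrimitives.CosineSimilarity in System.Numerics.Tensors does the same with SIMD):

```csharp
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
static float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    float dot = 0f, magA = 0f, magB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}

// Embed the query, then score it against the first stored embedding.
var query = await embeddingGenerator.GenerateAsync(["What is DI in .NET?"]);
float score = CosineSimilarity(query[0].Vector.Span, embeddings[0].Vector.Span);
Console.WriteLine($"Similarity: {score:F3}");
```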

When to Use This vs Semantic Kernel

The decision tree is straightforward:

Use Microsoft.Extensions.AI directly when:

  • You need chat completion in a service
  • You need embeddings for vector search
  • You want provider-agnostic code with DI
  • You don’t need function calling, plugins, or memory
  • You want the lightest possible dependency

Use Semantic Kernel when:

  • You need automatic function calling (tool use)
  • You need plugins exposing C# code to the LLM
  • You need memory (vector stores, chat history management)
  • You need prompt templates with variable substitution
  • You’re building features that chain multiple AI operations

SK uses IChatClient internally. Registering SK in your DI still gives you provider-agnostic code — SK just adds the orchestration layer.

Integration with Semantic Kernel

SK consumes IChatClient implementations. When you configure SK with AddAzureOpenAIChatCompletion, it registers an IChatClient under the hood:

// These are equivalent:
// 1. Direct Extensions.AI registration
builder.Services.AddChatClient(services => azureClient.AsIChatClient());

// 2. SK registration (also registers IChatClient)
builder.Services.AddKernel()
    .AddAzureOpenAIChatCompletion("chat-deployment", endpoint, key);

This means services that only need IChatClient can coexist with services that need the full Kernel — both share the same underlying AI connection.

Next Steps

⚠ Production Considerations

  • Don't register IChatClient as a singleton if your provider has per-request state. The Azure OpenAI implementation is stateless and safe as singleton, but verify this for other providers.
  • Middleware order matters. Place rate limiting before caching — you don't want to rate-limit cache hits.


🧠 Architect’s Note

Microsoft.Extensions.AI is the interface layer you standardize on. Even if you use Semantic Kernel today, having services depend on IChatClient means you can swap SK's implementation for a lighter one in contexts where full orchestration isn't needed.

AI-Friendly Summary


Key Takeaways

  • IChatClient is the provider-agnostic interface for chat completions in .NET
  • IEmbeddingGenerator provides the same abstraction for embedding generation
  • Swap Azure OpenAI for Ollama by changing one DI registration
  • Middleware pipelines add logging, caching, and rate limiting to any provider
  • Semantic Kernel and Agent Framework both build on these interfaces

Implementation Checklist

  • Install Microsoft.Extensions.AI NuGet package
  • Register an IChatClient implementation in DI
  • Use IChatClient for chat completions in your services
  • Add middleware (logging, caching) to the client pipeline
  • Migrate to a different provider by changing the registration only

Frequently Asked Questions

What is Microsoft.Extensions.AI?

It's the official .NET abstraction layer for AI services. It defines provider-agnostic interfaces — IChatClient for chat completions and IEmbeddingGenerator for embeddings — that Azure OpenAI, OpenAI, Ollama, and other providers implement. You write code against the interface, swap providers with one line.

What is the difference between Microsoft.Extensions.AI and Semantic Kernel?

Microsoft.Extensions.AI provides abstractions (interfaces). Semantic Kernel provides orchestration (plugins, memory, planning) built on top of those abstractions. Use Extensions.AI when you need simple chat or embedding calls. Use SK when you need function calling, multi-step workflows, or agent capabilities.

Can I use Microsoft.Extensions.AI without Semantic Kernel?

Yes. Microsoft.Extensions.AI is a standalone package. You can use IChatClient and IEmbeddingGenerator directly with dependency injection for simple AI integration — no SK required. Many applications that only need basic chat completion or embedding generation don't need SK's overhead.

Which providers support IChatClient?

Azure OpenAI, OpenAI, Ollama, and any provider that ships a Microsoft.Extensions.AI adapter. The ecosystem is growing — most major .NET AI packages now implement IChatClient.

