
Fix Ollama Connection Refused Error in .NET and Semantic Kernel

From GitHub Issue · .NET 9 · OllamaSharp 5.1.0 · Microsoft.SemanticKernel 1.54.0
By Rajesh Mishra · Mar 21, 2026 · 9 min read
Verified Mar 2026 · .NET 9 · OllamaSharp 5.1.0
In 30 Seconds

Ollama Connection refused errors in .NET have four common causes: Ollama not running (run 'ollama serve'), Docker networking (use host.docker.internal), wrong endpoint URL path (no /v1 for OllamaSharp), or using the deprecated Microsoft.Extensions.AI.Ollama preview package. Fix by verifying Ollama is running, adjusting the endpoint URL for your environment, migrating to OllamaSharp 5.x, and adding a health check.

Error Fix Guide

Root cause analysis and verified fix. Code examples use OllamaSharp 5.1.0.


The Error

Your .NET application tries to connect to Ollama and throws:

System.Net.Http.HttpRequestException: Connection refused (127.0.0.1:11434)
 ---> System.Net.Sockets.SocketException (111): Connection refused
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)

Unlike a 401 or 404, a Connection refused error means the TCP connection never established. The process listening on port 11434 is either not running or not reachable from your application’s network context.
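Before touching any .NET code, you can reproduce the failure at the TCP level from a shell. A minimal probe sketch using bash's /dev/tcp pseudo-device (port 11434 is Ollama's default):

```shell
# TCP-level probe of Ollama's default port using bash's /dev/tcp.
# "open" means something is listening; "refused" reproduces the exact
# condition behind the .NET SocketException.
if (exec 3<>/dev/tcp/127.0.0.1/11434) 2>/dev/null; then
  echo "port 11434: open"
else
  echo "port 11434: refused"
fi
```

If this prints refused, no change inside your application code will help; the problem is the Ollama process itself or the network path to it, which is what the root causes below walk through.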

Fixes at a Glance

  1. Start Ollama — the process is not running; run ollama serve or launch the desktop app
  2. Fix Docker networking — replace localhost with host.docker.internal when running inside a container
  3. Correct the endpoint URL — OllamaSharp uses no /v1; Semantic Kernel needs /v1
  4. Migrate from the deprecated preview package — uninstall Microsoft.Extensions.AI.Ollama and switch to OllamaSharp 5.x

Root Cause 1: Ollama Is Not Running

The simplest cause. Verify Ollama is running and models are installed:

ollama list          # Lists installed models — if this fails, Ollama is not running
ollama serve         # Start Ollama server manually

Or launch the Ollama desktop application. Once Ollama is running, you can confirm it is accepting connections by hitting the API directly:

curl http://localhost:11434/api/tags

A successful response returns a JSON object listing your installed models. If curl instead reports Connection refused, Ollama did not start correctly.

Running local models like Phi-4 with Ollama can dramatically reduce inference costs compared to hosted APIs — but only if Ollama stays running. Consider adding it to your system startup or running it as a service.
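On Linux, running Ollama as a systemd service keeps it alive across reboots and crashes. A minimal unit sketch, assuming the binary is at /usr/local/bin/ollama (the official Linux install script creates a similar unit for you automatically):

```ini
# /etc/systemd/system/ollama.service -- minimal sketch; adjust the
# ExecStart path to wherever the ollama binary is installed.
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now ollama, then re-run the curl check above to confirm the port is open.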

Root Cause 2: Docker Networking

This is the most common cause of Connection refused when moving from local development to a containerised environment. Inside a Docker container, localhost and 127.0.0.1 resolve to the container itself, not the host machine where Ollama is running.

// ❌ Wrong — localhost inside Docker refers to the container
var client = new OllamaApiClient(new Uri("http://localhost:11434"));

// ✅ Correct — reach the host machine from Docker Desktop (Windows/Mac)
var client = new OllamaApiClient(new Uri("http://host.docker.internal:11434"));

For Linux Docker hosts where host.docker.internal is not available, use one of these alternatives:

# Option A: use the host's gateway IP (usually 172.17.0.1 on default bridge network)
http://172.17.0.1:11434

# Option B: run the container with host networking (no network isolation)
docker run --network=host your-dotnet-app

In Docker Compose, set the endpoint via environment variable so it is configurable per environment:

services:
  api:
    image: your-dotnet-app
    environment:
      - Ollama__BaseUrl=http://host.docker.internal:11434
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"

When Ollama runs as a sibling container (as shown above), replace host.docker.internal with the service name: http://ollama:11434.

Root Cause 3: Wrong Endpoint URL Format

OllamaSharp and Semantic Kernel use different URL paths. A wrong port produces Connection refused; a correct port with the wrong path produces 404 Not Found. Here is the correct format for each client:

// OllamaSharp — connects to Ollama's native API (NO /v1)
var ollamaClient = new OllamaApiClient(new Uri("http://localhost:11434"));

// Semantic Kernel — uses the OpenAI-compatible endpoint (WITH /v1)
var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.AddOpenAIChatCompletion(
    modelId: "phi4-mini",
    endpoint: new Uri("http://localhost:11434/v1"),
    apiKey: "ollama");   // Any non-empty string; Ollama ignores the key

var kernel = kernelBuilder.Build();

// Microsoft.Extensions.AI via OpenAI-compat — also uses /v1
builder.Services.AddOpenAIChatClient(
    modelId: "phi4-mini",
    endpoint: new Uri("http://localhost:11434/v1"),
    apiKey: "ollama");

The /v1 suffix activates Ollama’s OpenAI-compatible API layer. Without it, an OpenAI-style client posts to paths such as /chat/completions, which do not exist on Ollama’s native API, so you get 404 Not Found errors rather than Connection refused.

Root Cause 4: Deprecated Preview Package Migration

The Microsoft.Extensions.AI.Ollama NuGet package was released as a preview during early 2025 and has since been deprecated. If you are still referencing it, remove it and migrate to OllamaSharp 5.x:

# Remove the deprecated package
dotnet remove package Microsoft.Extensions.AI.Ollama

# Install OllamaSharp
dotnet add package OllamaSharp --version 5.1.0

Update the registration code:

// ❌ Old preview package (deprecated — remove this)
// services.AddOllamaChatClient(new Uri("http://localhost:11434"), "phi4-mini");

// ✅ OllamaSharp 5.x — current approach
builder.Services.AddSingleton(new OllamaApiClient(new Uri("http://localhost:11434")));
builder.Services.AddSingleton<IChatClient>(sp =>
    sp.GetRequiredService<OllamaApiClient>().AsChatClient("phi4-mini"));

Note the constructor: new OllamaApiClient(new Uri(...)), not new OllamaClient(). The type name changed in OllamaSharp 5.x, so using the old name fails at compile time rather than surfacing as a runtime error.

With OllamaSharp properly registered, you can resolve IChatClient anywhere in your application and send requests in the standard Microsoft.Extensions.AI format:

public class ChatService(IChatClient chatClient)
{
    public async Task<string> AskAsync(string question)
    {
        var response = await chatClient.GetResponseAsync(question);
        return response.Text;
    }
}

Fix: Add a Health Check for Ollama

Rather than discovering that Ollama is unreachable when a user makes a request, add an ASP.NET Core health check that detects the problem immediately. This is especially important in containerised deployments where Ollama may not have finished loading the model when your application starts.

using Microsoft.Extensions.Diagnostics.HealthChecks;

public class OllamaHealthCheck : IHealthCheck
{
    private readonly HttpClient _httpClient;
    private readonly string _ollamaUrl;

    public OllamaHealthCheck(IHttpClientFactory factory, IConfiguration config)
    {
        _httpClient = factory.CreateClient();
        _ollamaUrl = config["Ollama:BaseUrl"] ?? "http://localhost:11434";
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken ct = default)
    {
        try
        {
            var response = await _httpClient.GetAsync(
                $"{_ollamaUrl}/api/tags", ct);

            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy("Ollama is running")
                : HealthCheckResult.Degraded($"Ollama returned {response.StatusCode}");
        }
        catch (HttpRequestException)
        {
            return HealthCheckResult.Unhealthy("Ollama is not reachable");
        }
    }
}

Register the health check and expose the endpoint in Program.cs:

builder.Services.AddHttpClient();
builder.Services.AddHealthChecks()
    .AddCheck<OllamaHealthCheck>("ollama");

var app = builder.Build();
app.MapHealthChecks("/health");

With this in place, GET /health returns a Healthy or Unhealthy status that Kubernetes liveness probes, Docker health checks, and load balancers can act on. You can also add a readiness probe that waits until Ollama is healthy before accepting traffic:

# kubernetes deployment excerpt
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Running local AI with Ollama reduces API costs significantly compared to hosted services, but it introduces an operational dependency that the health check makes visible.

Quick Diagnostic Checklist

If you are still getting Connection refused after working through the fixes above, run through this checklist:

Check | Command | Expected
Ollama running | ollama list | Shows model list
Port open | curl http://localhost:11434/api/tags | JSON response
From Docker | curl http://host.docker.internal:11434/api/tags | JSON response
OllamaSharp URL | no /v1 suffix | http://localhost:11434
SK/MEA URL | with /v1 suffix | http://localhost:11434/v1
Package installed | dotnet list package | OllamaSharp 5.x present


⚠ Production Considerations

  • Ollama has no built-in authentication. When running Ollama in a production environment accessible on a network, add a reverse proxy (nginx, Caddy) with API key authentication in front of the Ollama endpoint. Exposing http://0.0.0.0:11434 directly gives anyone on the network full access to your models.
  • Ollama loads models into VRAM on first request, which can take 5-30 seconds. The first HTTP request after startup may time out. Pre-warm the model by sending a probe request at application startup, or increase your HTTP client timeout for the first call.
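The reverse-proxy point deserves a concrete shape. A minimal nginx sketch that requires a static API key before forwarding to a loopback-only Ollama; the listen port, header name, and key value are illustrative placeholders, not anything Ollama itself defines:

```nginx
# Minimal sketch: terminate on 8080, require an X-Api-Key header,
# and proxy to Ollama bound to loopback only. Key and port are placeholders.
server {
    listen 8080;

    location / {
        if ($http_x_api_key != "change-me") {
            return 401;
        }
        proxy_pass http://127.0.0.1:11434;
        proxy_read_timeout 300s;   # allow for model loads and long generations
    }
}
```

Keep Ollama itself bound to 127.0.0.1 so the proxy is the only way in, and generate the key from a secrets store rather than hard-coding it.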


🧠 Architect’s Note

For production Ollama deployments, run Ollama as a systemd service (Linux) and use Docker health checks to ensure it is ready before your .NET container starts. Use container orchestration dependencies (depends_on with a health condition in Docker Compose, initContainers in Kubernetes) to prevent your app from starting before Ollama is available.
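The Compose side of that advice can be sketched with a healthcheck plus a condition on depends_on. This assumes the ollama binary inside the official image is a workable probe, since ollama list only succeeds once the server answers on port 11434:

```yaml
services:
  api:
    image: your-dotnet-app
    depends_on:
      ollama:
        condition: service_healthy   # hold the app until the probe passes

  ollama:
    image: ollama/ollama
    healthcheck:
      # "ollama list" fails until the server is accepting connections
      test: ["CMD", "ollama", "list"]
      interval: 5s
      timeout: 3s
      retries: 12
```

Note that a healthy server does not mean the model is loaded into VRAM yet; combine this with the pre-warming advice above for first-request latency.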

AI-Friendly Summary

Key Takeaways

  • Check Ollama is running: ollama list or GET http://localhost:11434/api/tags
  • Docker: use host.docker.internal:11434 instead of localhost:11434
  • OllamaSharp URL: http://localhost:11434 (no /v1). SK URL: http://localhost:11434/v1
  • Migrate from deprecated Microsoft.Extensions.AI.Ollama to OllamaSharp 5.x
  • Add IHealthChecks for Ollama to detect failures before requests reach AI code

Implementation Checklist

  • Run ollama list to verify Ollama is running and models are installed
  • Check endpoint URL: OllamaSharp needs http://localhost:11434, SK needs http://localhost:11434/v1
  • For Docker: replace localhost with host.docker.internal
  • Uninstall Microsoft.Extensions.AI.Ollama preview package if present
  • Install OllamaSharp 5.1.0 and update OllamaApiClient constructor
  • Add OllamaHealthCheck to your ASP.NET Core health checks

Frequently Asked Questions

What causes the Connection refused error when connecting to Ollama from .NET?

The most common causes are: Ollama is not running (start it with 'ollama serve' or the desktop app), wrong endpoint URL format (use http://localhost:11434 without /v1 for OllamaSharp, with /v1 for OpenAI-compat), or Docker networking where 127.0.0.1 resolves to the container itself instead of the host.

How do I fix Ollama Connection refused inside a Docker container?

Replace 127.0.0.1 or localhost with host.docker.internal in your endpoint URL. For example: new Uri('http://host.docker.internal:11434'). On Linux Docker hosts, you may need to use the host's IP address directly or add --network=host to your docker run command.

What is the correct Ollama endpoint URL for OllamaSharp vs Semantic Kernel?

OllamaSharp connects to the Ollama API directly at http://localhost:11434 (no /v1). Semantic Kernel's AddOpenAIChatCompletion uses the OpenAI-compatible endpoint at http://localhost:11434/v1. Using the wrong path returns 404 Not Found, not Connection refused.

How do I migrate from the deprecated Microsoft.Extensions.AI.Ollama preview package?

Uninstall Microsoft.Extensions.AI.Ollama (the old preview package). Install OllamaSharp 5.x. Replace new OllamaClient() with new OllamaApiClient(new Uri('http://localhost:11434')). Register IChatClient by calling .AsChatClient('phi4-mini') on the OllamaApiClient instance.

How do I check if Ollama is running from my .NET application?

Send an HTTP GET request to http://localhost:11434/api/tags. If Ollama is running, it returns a JSON list of installed models. A Connection refused response means Ollama is not running. Add this as a health check using IHealthChecks.

Can I use ASP.NET Core health checks to monitor Ollama availability?

Yes. Add a custom health check that calls GET http://localhost:11434/api/tags. Register it with services.AddHealthChecks().AddCheck<OllamaHealthCheck>('ollama'). Expose it at /health. This lets Kubernetes or load balancers detect when Ollama is unavailable.

Why does my Ollama connection work in development but fail in production Docker?

In development, Ollama runs on your host machine and localhost resolves correctly. In Docker, each container has its own network namespace — localhost inside the container refers to the container itself, not the host. Use host.docker.internal (Docker Desktop) or the host's IP (Linux Docker) to reach Ollama on the host.


#Ollama #Connection Refused #Error Fix #Semantic Kernel #Local AI #.NET AI