
AI Agent Architecture in .NET: Single-Agent, Orchestrator & Group Chat Patterns

Verified Apr 2026 · Intermediate · Original · .NET 10 · Microsoft.SemanticKernel 1.54.0 · Microsoft.SemanticKernel.Agents.Core 1.x
By Rajesh Mishra · Mar 12, 2026 · 12 min read
In 30 Seconds

Architecture guide for AI agents in .NET. Covers the five components of agent anatomy (perception, reasoning, tools, memory, action), three production patterns (single-agent, orchestrator-worker, group chat), and implementation guidance using Semantic Kernel and Microsoft Agent Framework.

The Five Components of an AI Agent

Every AI agent, regardless of framework, has the same fundamental anatomy. Understanding these components helps you design agents that work reliably — and debug them when they don’t.

The five-component agent loop — perception feeds reasoning, which calls tools and reads memory to produce actions.

1. Perception — Understanding Input

Perception is how the agent receives and interprets information. In .NET agent systems, this typically means:

  • User messages — Natural language input from chat
  • System prompts — Instructions that define the agent’s role and constraints
  • Tool results — Structured data returned from previous tool calls
  • Environment state — Context like current time, user identity, or session metadata

The perception layer converts raw input into a representation the reasoning component can process. In Semantic Kernel, this happens through ChatHistory — each message, tool result, and system instruction becomes a structured entry the LLM can reason about.
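A minimal sketch of assembling those perception inputs into a ChatHistory — the session metadata and user identity shown here are illustrative, not part of any SK API:

```csharp
using Microsoft.SemanticKernel.ChatCompletion;

var history = new ChatHistory();

// System prompt — the agent's role and constraints
history.AddSystemMessage("You are a customer support agent for Contoso.");

// Environment state — injected as additional system context
history.AddSystemMessage(
    $"Current time (UTC): {DateTime.UtcNow:O}. Authenticated user: alice@example.com.");

// User message — the natural-language request
history.AddUserMessage("Why was my last order delayed?");

// Tool results from earlier calls are appended by the function-calling loop,
// so the next LLM call sees them as structured entries alongside the messages.
```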

2. Reasoning — Deciding What to Do

Reasoning is the LLM call. The agent sends its perception context (system prompt, chat history, tool descriptions) to the model, and the model returns either a direct response or a tool call request.

This is where prompt engineering matters most. The system prompt shapes reasoning:

var agent = new ChatCompletionAgent
{
    Name = "DataAnalyst",
    Instructions = """
        You are a data analyst. When asked a question:
        1. First check what data sources are available using list_tables
        2. Write a query to answer the question
        3. Execute the query and interpret results
        4. Present findings with specific numbers, never vague summaries

        If the data doesn't contain what you need, say so clearly.
        Never fabricate data.
        """,
    Kernel = kernel
};

The quality of reasoning depends on three factors: the model’s capability, the system prompt’s clarity, and the tool descriptions’ precision.
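Invoking the agent drives the reasoning loop — the framework handles reason → tool call → result → reason until the model produces a final answer. A sketch (the `InvokeAsync` overloads have shifted between Agents package versions, so check your version's signature):

```csharp
var history = new ChatHistory();
history.AddUserMessage("What were our top 3 products by revenue last month?");

// Each yielded message is a step in the agent's reasoning;
// the last one is the final answer.
await foreach (var message in agent.InvokeAsync(history))
{
    Console.WriteLine(message.Content);
}
```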

3. Tools — Taking Action

Tools are functions the agent can call to interact with systems beyond its training data. In .NET, tools are regular C# methods exposed to the LLM:

public class CustomerTools
{
    private readonly ICustomerRepository _repository;

    public CustomerTools(ICustomerRepository repository) => _repository = repository;

    [KernelFunction("lookup_customer")]
    [Description("Find a customer by email address. Returns name, plan, and account status.")]
    public async Task<CustomerInfo?> LookupCustomerAsync(
        [Description("Customer email address")] string email)
    {
        return await _repository.FindByEmailAsync(email);
    }

    [KernelFunction("get_recent_orders")]
    [Description("Get the last N orders for a customer. Returns order ID, date, total, and status.")]
    public async Task<IReadOnlyList<OrderSummary>> GetRecentOrdersAsync(
        [Description("Customer email address")] string email,
        [Description("Number of orders to return, max 20")] int count = 5)
    {
        return await _repository.GetOrdersAsync(email, Math.Min(count, 20));
    }
}

Tool design principles:

  • One tool, one job — A tool that “searches and updates” should be two tools
  • Rich descriptions — The LLM reads descriptions to decide when and how to call tools
  • Return structured data — Let the LLM format for the user; don’t pre-format in tools
  • Validate inputs — Tools are the boundary between AI reasoning and real systems
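The last principle deserves an example. This sketch shows input validation at the tool boundary — `_repository`, `FindOrderAsync`, and `OrderStatus` are hypothetical names, but the pattern is the point:

```csharp
[KernelFunction("cancel_order")]
[Description("Cancel a pending order. Only works for orders with status 'Pending'.")]
public async Task<string> CancelOrderAsync(
    [Description("Order ID, e.g. ORD-12345")] string orderId)
{
    // Validate before touching real systems — the LLM can pass
    // malformed or unexpected arguments.
    if (string.IsNullOrWhiteSpace(orderId) || !orderId.StartsWith("ORD-"))
        return "Invalid order ID format. Expected something like ORD-12345.";

    var order = await _repository.FindOrderAsync(orderId);
    if (order is null)
        return $"No order found with ID {orderId}.";
    if (order.Status != OrderStatus.Pending)
        return $"Order {orderId} is {order.Status} and cannot be cancelled.";

    await _repository.CancelAsync(orderId);
    return $"Order {orderId} has been cancelled.";
}
```

Note that validation failures return descriptive strings instead of throwing — the LLM can read the message and recover (ask the user for a correct ID), whereas an unhandled exception ends the turn.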

4. Memory — Retaining Context

Agent memory operates at three levels:

Working memory is the current conversation. In Semantic Kernel, that’s the ChatHistory object — it holds every message, tool call, and result from the current session. The LLM sees all of this on every turn.

var history = new ChatHistory();
history.AddSystemMessage("You are a customer support agent.");
history.AddUserMessage("I want to return my order");
// Agent reasons and calls tools — all added to history
// Next turn, the LLM sees the full conversation

Short-term memory persists across conversations within a session but gets cleared eventually. Think of it as “the user came back to continue yesterday’s task.” You implement this by storing and reloading chat history from a session store.
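A sketch of that store-and-reload approach, assuming a distributed cache backend — `ISessionStore` is a hypothetical interface, and you should verify that ChatHistory round-trips through System.Text.Json cleanly (including tool-call entries) in your SK version:

```csharp
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.SemanticKernel.ChatCompletion;

public interface ISessionStore
{
    Task SaveAsync(string sessionId, ChatHistory history);
    Task<ChatHistory?> LoadAsync(string sessionId);
}

public class JsonSessionStore : ISessionStore
{
    private readonly IDistributedCache _cache;
    public JsonSessionStore(IDistributedCache cache) => _cache = cache;

    public async Task SaveAsync(string sessionId, ChatHistory history)
    {
        var json = JsonSerializer.Serialize(history);
        await _cache.SetStringAsync($"chat:{sessionId}", json,
            new DistributedCacheEntryOptions
            {
                // "Cleared eventually" — expire the session after a day of inactivity
                SlidingExpiration = TimeSpan.FromHours(24)
            });
    }

    public async Task<ChatHistory?> LoadAsync(string sessionId)
    {
        var json = await _cache.GetStringAsync($"chat:{sessionId}");
        return json is null ? null : JsonSerializer.Deserialize<ChatHistory>(json);
    }
}
```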

Long-term memory is permanent knowledge stored in a vector database. The agent searches this before responding to find relevant context:

// Illustrative — the exact search API depends on your vector store
// connector and SK version; Kernel itself does not expose SearchAsync.
var memoryResults = await memorySearch.SearchAsync(
    userQuestion,
    top: 3);

For most .NET applications, working memory plus a vector store for domain knowledge covers the common requirements. Short-term session memory adds complexity and is worth implementing only when users genuinely need multi-session continuity.

5. Action — Producing Output

Action is the agent’s response — text back to the user, a file written, an API called, or a task delegated to another agent. The action layer also handles:

  • Streaming — Delivering tokens as they generate for responsive UX
  • Structured output — Returning JSON when downstream systems need it
  • Side effects — Confirming destructive operations before executing
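Streaming is the most common of these to implement first. A sketch with ChatCompletionAgent — the streaming overloads vary between Agents package versions, so treat the signature as approximate:

```csharp
// Deliver tokens as the model generates them instead of
// waiting for the complete reply.
await foreach (var chunk in agent.InvokeStreamingAsync(history))
{
    Console.Write(chunk.Content); // in a web app: flush to the response or push via SignalR
}
```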

Three Agent Patterns for Production

Pattern 1: Single Agent with Tools

One agent, one LLM, a set of tools. This is where 80% of agent projects should start.

var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion("chat-deployment", endpoint, credential)
    .Build();

kernel.Plugins.AddFromObject(new CustomerTools(repository));
kernel.Plugins.AddFromObject(new OrderTools(orderService));

var agent = new ChatCompletionAgent
{
    Name = "SupportAgent",
    Instructions = supportPrompt,
    Kernel = kernel
};

When it works: Focused domains, 3-10 tools, single responsibility.
When it breaks: Tool count exceeds 15 (LLM starts confusing tools), tasks require different models for different sub-problems, or you need agents to verify each other’s output.

Pattern 2: Orchestrator-Worker

A central orchestrator agent receives the user request, breaks it into subtasks, and delegates to specialized worker agents.

var researcher = new ChatCompletionAgent
{
    Name = "Researcher",
    Instructions = "Search knowledge bases and return raw findings. Do not interpret.",
    Kernel = researchKernel // Has search tools
};

var analyst = new ChatCompletionAgent
{
    Name = "Analyst",
    Instructions = "Analyze data and produce charts. Use statistical methods.",
    Kernel = analysisKernel // Has data tools
};

var orchestrator = new ChatCompletionAgent
{
    Name = "Orchestrator",
    Instructions = """
        You coordinate research tasks. 
        Delegate searches to Researcher, analysis to Analyst.
        Combine their outputs into a final report for the user.
        """,
    Kernel = orchestratorKernel
};

When it works: Complex tasks that span multiple domains, when different sub-tasks benefit from different deployments (for example, a primary chat deployment for reasoning and a lower-cost mini deployment for summarization).
When it breaks: Simple tasks where the orchestration overhead adds latency without value.
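A low-tech way to wire the agents above together is to run the workers yourself and feed their outputs to the orchestrator — a sketch, assuming the `InvokeAsync(ChatHistory)` overload (SK also supports exposing agents as callable functions so the orchestrator can delegate via tool calls instead):

```csharp
using System.Text;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// 1. Researcher gathers raw findings
var researchHistory = new ChatHistory();
researchHistory.AddUserMessage(userRequest);
var findings = await CollectAsync(researcher.InvokeAsync(researchHistory));

// 2. Analyst works over the findings
var analysisHistory = new ChatHistory();
analysisHistory.AddUserMessage($"Analyze these findings:\n{findings}");
var analysis = await CollectAsync(analyst.InvokeAsync(analysisHistory));

// 3. Orchestrator combines both into the final report
var finalHistory = new ChatHistory();
finalHistory.AddUserMessage(
    $"Findings:\n{findings}\n\nAnalysis:\n{analysis}\n\nWrite the final report.");
var report = await CollectAsync(orchestrator.InvokeAsync(finalHistory));

// Helper: concatenate an agent's streamed messages into one string
static async Task<string> CollectAsync(IAsyncEnumerable<ChatMessageContent> messages)
{
    var sb = new StringBuilder();
    await foreach (var m in messages) sb.Append(m.Content);
    return sb.ToString();
}
```

Code-level sequencing like this keeps the flow deterministic and debuggable; letting the orchestrator LLM route via tool calls is more flexible but harder to trace.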

Pattern 3: Group Chat

Peer agents collaborate on a shared task, taking turns based on routing logic. No single agent is “in charge.”

var groupChat = new AgentGroupChat(researcher, analyst, writer)
{
    ExecutionSettings = new()
    {
        TerminationStrategy = new MaxMessageTermination(maxMessages: 20),
        SelectionStrategy = new SequentialSelectionStrategy()
    }
};

groupChat.AddChatMessage(new ChatMessageContent(
    AuthorRole.User, "Analyze our Q1 sales data and draft a board summary"));

await foreach (var message in groupChat.InvokeAsync())
{
    Console.WriteLine($"[{message.AuthorName}]: {message.Content}");
}

When it works: Creative tasks where multiple perspectives improve quality, review workflows where one agent checks another’s work.
When it breaks: Agents get stuck in loops, or the conversation diverges because no agent takes ownership of convergence.

Choosing the Right Pattern

Three production patterns — start with Single Agent. Add Orchestrator-Worker when single-agent scope overflows. Use Group Chat only when peer collaboration genuinely improves quality.
| Factor | Single Agent | Orchestrator-Worker | Group Chat |
|---|---|---|---|
| Tool count | 3-10 | 10-30 (distributed) | 5-15 per agent |
| Latency | Low (1-3 LLM calls) | Medium (3-10 calls) | High (5-20+ calls) |
| Cost per turn | $ | $$ | $$$ |
| Debugging | Simple | Moderate | Complex |
| Best for | Focused tasks | Workflows | Creative/Review |

Start with the simplest pattern that works. Single agent handles most real-world requirements. Upgrade to orchestrator-worker when you’ve proven the single agent is insufficient — not because the architecture diagram looks better.

Memory Architecture for Production

For production .NET agents, here’s the memory stack that works:

// Working memory — built-in ChatHistory
var history = new ChatHistory();

// Long-term memory — Azure AI Search or Cosmos DB vector store
var memoryStore = new AzureAISearchVectorStore(
    new SearchIndexClient(new Uri(searchEndpoint), credential));

// Memory-enhanced agent loop
while (true)
{
    var userInput = GetUserInput();
    history.AddUserMessage(userInput);

    // Search long-term memory for relevant context
    // (simplified — real code gets a typed collection from the store and
    // runs a vectorized search over an embedding of userInput)
    var context = await memoryStore.SearchAsync(
        "knowledge", userInput, limit: 3);

    // Inject context into the conversation
    if (context.Any())
    {
        var contextText = string.Join("\n", context.Select(c => c.Text));
        history.AddSystemMessage($"Relevant context:\n{contextText}");
    }

    var response = await chatService.GetChatMessageContentAsync(
        history, settings, kernel);
    history.Add(response);
}

Observability

You cannot debug agents in production without observability. Instrument everything:

  • Tool calls — What was called, with what arguments, what was returned
  • LLM interactions — Token counts, latency, model used
  • Reasoning traces — The full prompt sent to the LLM for each turn
  • Error paths — Tool failures, retries, fallbacks
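For the first bullet, a function-invocation filter gives you structured logs of every tool call without touching the tools themselves. A sketch against the SK 1.x filter interface — verify `IFunctionInvocationFilter` and `FunctionInvocationContext` against your package version:

```csharp
using System.Diagnostics;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

// Logs every tool call the agent makes: name, arguments, result, duration.
public class ToolCallLoggingFilter : IFunctionInvocationFilter
{
    private readonly ILogger<ToolCallLoggingFilter> _logger;
    public ToolCallLoggingFilter(ILogger<ToolCallLoggingFilter> logger) => _logger = logger;

    public async Task OnFunctionInvocationAsync(
        FunctionInvocationContext context,
        Func<FunctionInvocationContext, Task> next)
    {
        var sw = Stopwatch.StartNew();
        _logger.LogInformation("Tool {Tool} called with {Args}",
            context.Function.Name, context.Arguments);

        await next(context); // run the actual tool

        _logger.LogInformation("Tool {Tool} returned in {Ms} ms: {Result}",
            context.Function.Name, sw.ElapsedMilliseconds, context.Result);
    }
}

// Registration, before building the kernel:
// kernelBuilder.Services.AddSingleton<IFunctionInvocationFilter, ToolCallLoggingFilter>();
```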

Semantic Kernel integrates with OpenTelemetry:

builder.Services.AddOpenTelemetry()
    .WithTracing(tracing =>
    {
        tracing.AddSource("Microsoft.SemanticKernel*");
        tracing.AddOtlpExporter();
    });

Next Steps

⚠ Production Considerations

  • Tool descriptions are prompts. A vague description like 'searches stuff' causes the LLM to misuse the tool. Write descriptions as if explaining to a new team member what the tool does, when to use it, and what it returns.
  • Agents without memory limits will stuff entire chat histories into each LLM call, burning tokens and hitting context limits. Implement a sliding window or summarization strategy from day one.
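A naive sliding window for that second point — keep the system prompt and drop the oldest non-system messages. SK also ships chat history reducers (truncation and summarization variants) that handle this more carefully; this hand-rolled version is just the idea:

```csharp
using System.Linq;
using Microsoft.SemanticKernel.ChatCompletion;

// Trim the history to at most maxMessages entries, preserving system messages.
// Caveat: dropping a tool-call message without its paired result can confuse
// some models — a production reducer should remove them together.
static void TrimHistory(ChatHistory history, int maxMessages = 30)
{
    while (history.Count > maxMessages)
    {
        var oldest = history.FirstOrDefault(m => m.Role != AuthorRole.System);
        if (oldest is null) break; // nothing left but system messages
        history.Remove(oldest);
    }
}
```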


🧠 Architect’s Note

The right number of agents is the fewest that solve the problem. Start with one. Add a second only when you've proven the first can't handle the scope. Multi-agent systems are distributed systems — they inherit all the complexity of network calls, state synchronization, and partial failures.

Key Takeaways

  • Agents have five components: perception, reasoning, tools, memory, action
  • Single-agent pattern: one LLM with tool-calling for focused tasks
  • Orchestrator-worker: central agent delegates to specialized workers
  • Group chat: peer agents collaborate with routing logic
  • Memory spans three levels: working (chat history), short-term (session), long-term (vector store)

Implementation Checklist

  • Identify the agent pattern that fits your use case
  • Define tools as discrete, well-described functions
  • Implement appropriate memory strategy
  • Add observability to tool calls and reasoning steps
  • Test with adversarial and edge-case inputs

Frequently Asked Questions

What makes an AI agent different from a chatbot?

A chatbot responds to user input with text. An agent acts — it selects tools, queries databases, writes files, and chains multiple steps together autonomously. The key difference is agency: the agent decides what to do next based on reasoning, not a scripted conversation flow.

When should I use multi-agent instead of single-agent?

Single-agent works well for focused tasks with 3-10 tools. Once you exceed roughly 15 tools, need different models for different tasks, or need agents to check each other's work, switch to multi-agent. The trade-off is increased latency and cost per conversation turn.

Can agents run without an LLM?

Technically yes — rule-based agents existed before LLMs. But the LLM is what gives modern agents flexible reasoning. Without one, you're building a traditional workflow engine, which is fine for deterministic tasks but lacks the adaptability that makes agents valuable.




#AI Agents #Architecture #.NET AI #Design Patterns #Semantic Kernel