Prompt Engineering Fundamentals for C# Developers

Beginner · Original · .NET 9 · Azure.AI.OpenAI 2.2.0 · Microsoft.Extensions.AI 9.1.0
By Rajesh Mishra · Feb 28, 2026 · Verified: Feb 28, 2026 · 14 min read

From Theory to Practice

In the previous article, we explored how LLMs generate text through next-token prediction. That understanding is not academic — it directly informs how you write prompts that produce reliable results.

Prompt engineering is where mechanical understanding meets practical application. The model predicts tokens one at a time, influenced by everything in its context window. Your job is to fill that context with information that biases the model toward the output you need.

This article covers the fundamentals with C# code you can use in production. We will work with Microsoft.Extensions.AI and Azure OpenAI, but the prompting principles apply to any LLM provider.

The Anatomy of a Prompt: Message Roles

Modern LLM APIs structure conversations as a sequence of messages, each with a role. Understanding these roles is foundational.

System Message

The system message sets the model’s behavior for the entire conversation. It is processed before any user input and influences every response the model generates. Think of it as configuration — it defines who the model is and how it should behave.

User Message

User messages contain the actual requests. In a chatbot, these are literally what the user types. In an automated pipeline, these are the prompts your application constructs.

Assistant Message

Assistant messages represent the model’s previous responses. Including them in the conversation history gives the model context about what it has already said, enabling multi-turn conversations.

Here is how these roles look in C# using Microsoft.Extensions.AI:

using Microsoft.Extensions.AI;

var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are a senior .NET architect. Answer questions about C# and .NET with precise, production-focused guidance. Use code examples when helpful."),
    new(ChatRole.User, "How should I implement the repository pattern with EF Core in .NET 9?")
};

var response = await chatClient.GetResponseAsync(messages);
Console.WriteLine(response.Text);

The order matters. System first, then alternating user and assistant messages to build conversation history. The model sees the entire message sequence and generates its response based on all of it.
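To continue a conversation, append the model's reply as an assistant message and add the next user turn before calling the client again. A minimal sketch that continues the snippet above:

```csharp
// Multi-turn continuation: the model only "remembers" what is in the
// message list, so each prior response must be appended explicitly.
messages.Add(new(ChatRole.Assistant, response.Text));
messages.Add(new(ChatRole.User, "How would I add a unit of work on top of that repository?"));

var followUp = await chatClient.GetResponseAsync(messages);
Console.WriteLine(followUp.Text);
```

Each call is stateless from the API's perspective; the illusion of memory comes entirely from resending the history.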

Designing Effective System Messages

The system message is the single highest-leverage prompt engineering decision. A well-designed system message can eliminate entire categories of bad output. A vague one leaves the model guessing.

The Four Components

Every production system message should address four concerns:

1. Role — Who is the model?

You are a .NET technical documentation assistant for the DotNetStudioAI platform.

2. Task — What should it do?

Answer developer questions about C#, .NET, Azure AI services, and Semantic Kernel. Provide code examples targeting .NET 9 unless a different version is specified.

3. Format — How should it respond?

Use markdown formatting. Include code blocks with language identifiers. Keep responses concise — under 500 words unless the question requires a detailed walkthrough.

4. Boundaries — What should it refuse?

If a question is outside .NET development or AI integration, politely redirect. Do not speculate about unreleased features. Do not provide security advice beyond referencing official Microsoft documentation.

Combined into a single system message:

const string SystemPrompt = """
    You are a .NET technical documentation assistant for the DotNetStudioAI platform.

    Answer developer questions about C#, .NET, Azure AI services, and Semantic Kernel.
    Provide code examples targeting .NET 9 unless a different version is specified.

    Use markdown formatting. Include code blocks with language identifiers.
    Keep responses concise — under 500 words unless the question requires a detailed walkthrough.

    If a question is outside .NET development or AI integration, politely redirect.
    Do not speculate about unreleased features.
    Do not provide security advice beyond referencing official Microsoft documentation.
    """;

Notice the use of C# raw string literals ("""). They are ideal for multi-line system prompts — no escaping needed, clean indentation, easy to read and maintain.

Few-Shot Prompting

Few-shot prompting gives the model examples of the exact input-output pattern you want. It is the most reliable technique for controlling output format without fine-tuning.

When Zero-Shot Is Not Enough

Zero-shot prompting — just telling the model what to do without examples — works for simple, well-understood tasks. But when the output needs to follow a specific format or convention that the model would not naturally adopt, examples are far more effective than lengthy instructions.

Building Few-Shot Prompts in C#

In the message-based API, few-shot examples are pairs of user and assistant messages injected into the conversation before the real question:

var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are an API that classifies .NET exceptions into categories. Respond with only the category name."),

    // Few-shot example 1
    new(ChatRole.User, "System.NullReferenceException: Object reference not set to an instance of an object."),
    new(ChatRole.Assistant, "Null Reference"),

    // Few-shot example 2
    new(ChatRole.User, "System.Net.Http.HttpRequestException: Connection refused (localhost:5001)"),
    new(ChatRole.Assistant, "Network/Connectivity"),

    // Few-shot example 3
    new(ChatRole.User, "System.Text.Json.JsonException: The JSON value could not be converted to System.Int32."),
    new(ChatRole.Assistant, "Serialization"),

    // Actual request
    new(ChatRole.User, "System.InvalidOperationException: Sequence contains no elements")
};

var response = await chatClient.GetResponseAsync(messages);
// Expected output: "Collection/LINQ"

Three to five examples typically suffice. More examples consume tokens (and cost money) without proportionally improving quality. Choose examples that cover the diversity of cases you expect.

Dynamic Few-Shot Selection

For production systems, you might select examples dynamically based on the input. This is especially powerful when combined with embeddings — retrieve the most similar examples from a database rather than using a fixed set:

using System.Numerics.Tensors; // TensorPrimitives (System.Numerics.Tensors package)
using Microsoft.Extensions.AI;

public record FewShotExample(string Input, string Output, ReadOnlyMemory<float> Embedding);

public class FewShotExampleSelector
{
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embedder;
    private readonly List<FewShotExample> _examples;

    public FewShotExampleSelector(
        IEmbeddingGenerator<string, Embedding<float>> embedder,
        List<FewShotExample> examples)
    {
        _embedder = embedder;
        _examples = examples;
    }

    public async Task<List<FewShotExample>> SelectExamplesAsync(
        string userInput, int count = 3)
    {
        // Embed the incoming input once, then rank stored examples by similarity
        var inputEmbedding = await _embedder.GenerateAsync([userInput]);

        return _examples
            .Select(ex => new
            {
                Example = ex,
                Similarity = CosineSimilarity(
                    inputEmbedding[0].Vector, ex.Embedding)
            })
            .OrderByDescending(x => x.Similarity)
            .Take(count)
            .Select(x => x.Example)
            .ToList();
    }

    private static float CosineSimilarity(
        ReadOnlyMemory<float> a, ReadOnlyMemory<float> b) =>
        TensorPrimitives.CosineSimilarity(a.Span, b.Span);
}
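Once selected, the examples are stitched into the message list exactly as in the static version. A sketch, assuming the selector and `FewShotExample` type above, plus a hypothetical `incomingException` string:

```csharp
// Build the request with dynamically retrieved examples instead of a fixed set
var selected = await selector.SelectExamplesAsync(incomingException);

var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are an API that classifies .NET exceptions into categories. Respond with only the category name.")
};

foreach (var ex in selected)
{
    messages.Add(new(ChatRole.User, ex.Input));
    messages.Add(new(ChatRole.Assistant, ex.Output));
}

messages.Add(new(ChatRole.User, incomingException));
```

The system message and pattern stay fixed; only the examples vary per request, keeping token spend proportional to relevance.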

Chain-of-Thought Prompting

Chain-of-thought (CoT) prompting asks the model to reason through a problem step by step before providing a final answer. This significantly improves accuracy on tasks that require multi-step reasoning — math, logic, code analysis, complex decision-making.

The technique works because of how next-token prediction works. When the model generates intermediate reasoning tokens, those tokens become part of its context and influence subsequent token predictions. The reasoning steps literally help the model “think” its way to a better answer.

var messages = new List<ChatMessage>
{
    new(ChatRole.System, """
        You are a .NET performance analyst. When analyzing code for performance issues:
        1. First, identify what the code does at a high level
        2. Then, analyze each operation's time and space complexity
        3. Identify specific bottlenecks with line references
        4. Finally, provide your recommendation with corrected code

        Always show your reasoning before your conclusion.
        """),
    new(ChatRole.User, """
        Analyze this code for performance issues:

        public List<Customer> GetActiveCustomers(List<Customer> customers, List<Order> orders)
        {
            var result = new List<Customer>();
            foreach (var customer in customers)
            {
                foreach (var order in orders)
                {
                    if (order.CustomerId == customer.Id && order.IsActive)
                    {
                        if (!result.Contains(customer))
                            result.Add(customer);
                    }
                }
            }
            return result;
        }
        """)
};

The model will walk through its analysis step by step rather than jumping to a conclusion. This produces more accurate and more useful output.
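If you need to consume the conclusion programmatically, one workable convention (an assumption on my part, not a provider feature) is to instruct the model to end with a delimited verdict line, then parse it out:

```csharp
// Assumes the system prompt ends with an instruction like:
// "End your analysis with a single line starting with RECOMMENDATION:"
const string VerdictMarker = "RECOMMENDATION:";

var response = await chatClient.GetResponseAsync(messages);

var markerIndex = response.Text.LastIndexOf(VerdictMarker, StringComparison.Ordinal);
var recommendation = markerIndex >= 0
    ? response.Text[(markerIndex + VerdictMarker.Length)..].Trim()
    : response.Text; // fall back to the full reasoning if the marker is missing

Console.WriteLine(recommendation);
```

This keeps the accuracy benefit of visible reasoning while giving your code a stable extraction point.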

Structured Output: Getting JSON from LLMs

One of the most common tasks in production AI applications is extracting structured data from model responses. C# developers need strongly-typed objects, not free-form text.

The Prompt Strategy

Three elements work together: system instructions specifying the format, an example of the expected structure, and a low temperature to minimize deviation.

var messages = new List<ChatMessage>
{
    new(ChatRole.System, """
        You are a data extraction API. Extract structured information from
        user-provided text and return it as JSON.

        Always respond with valid JSON matching this exact schema:
        {
            "entities": [
                {
                    "name": "string",
                    "type": "Person | Organization | Technology",
                    "context": "string (brief description of how the entity appears)"
                }
            ],
            "summary": "string (one-sentence summary)",
            "sentiment": "positive | neutral | negative"
        }

        Do not include any text outside the JSON object.
        """),
    new(ChatRole.User, "Microsoft announced that Semantic Kernel 2.0 will include native support for multi-agent orchestration, building on the success of their AI integration in Visual Studio and GitHub Copilot.")
};

Enabling JSON Mode

Azure OpenAI and other providers support a JSON mode that guarantees the response is valid JSON (though not necessarily matching your schema — that is your responsibility to validate):

using OpenAI.Chat; // ChatCompletionOptions and ChatResponseFormat live in the underlying OpenAI package

var options = new ChatCompletionOptions
{
    Temperature = 0.1f,
    ResponseFormat = ChatResponseFormat.CreateJsonObjectFormat()
};

Deserializing and Validating in C#

Never trust model output blindly. Always validate:

using System.Text.Json;

public record ExtractionResult(
    List<ExtractedEntity> Entities,
    string Summary,
    string Sentiment);

public record ExtractedEntity(
    string Name,
    string Type,
    string Context);

public static ExtractionResult? ParseAndValidate(string modelOutput)
{
    try
    {
        var result = JsonSerializer.Deserialize<ExtractionResult>(
            modelOutput,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

        if (result is null || result.Entities is null || result.Summary is null)
            return null;

        // Validate enum-like fields
        var validTypes = new HashSet<string> { "Person", "Organization", "Technology" };
        if (result.Entities.Any(e => !validTypes.Contains(e.Type)))
            return null;

        var validSentiments = new HashSet<string> { "positive", "neutral", "negative" };
        if (!validSentiments.Contains(result.Sentiment))
            return null;

        return result;
    }
    catch (JsonException)
    {
        return null;
    }
}

This pattern — prompt for JSON, enable JSON mode, deserialize with validation — is the standard approach for structured extraction in production .NET applications.
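When JSON mode is not available (or a model disregards it during degraded responses), output sometimes arrives wrapped in markdown code fences. A small defensive cleanup step before deserialization — a sketch, not part of any SDK — is cheap insurance:

```csharp
// Strip a surrounding ```json ... ``` fence, if present, before parsing
public static string StripCodeFences(string modelOutput)
{
    var text = modelOutput.Trim();
    if (text.StartsWith("```"))
    {
        var firstNewline = text.IndexOf('\n');
        var lastFence = text.LastIndexOf("```", StringComparison.Ordinal);
        if (firstNewline >= 0 && lastFence > firstNewline)
            text = text[(firstNewline + 1)..lastFence].Trim();
    }
    return text;
}
```

Call it on the raw response before handing the string to `ParseAndValidate`.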

Temperature and Max Tokens: Tuning for Your Task

These two parameters have the most immediate impact on output quality and cost.

Temperature by Task Type

| Task | Temperature | Reasoning |
|------|-------------|-----------|
| JSON extraction | 0 – 0.1 | Deterministic output, minimal variation |
| Code generation | 0 – 0.2 | Correctness over creativity |
| Summarization | 0.3 – 0.5 | Some variety in phrasing, consistent content |
| Conversational chat | 0.5 – 0.7 | Natural-sounding, varied responses |
| Creative writing | 0.7 – 1.0 | Diverse, unexpected outputs |

Max Tokens

The max-tokens setting (MaxOutputTokenCount on the OpenAI SDK's ChatCompletionOptions, MaxOutputTokens on Microsoft.Extensions.AI's ChatOptions) caps the model's response length. Set it thoughtfully:

var options = new ChatCompletionOptions
{
    Temperature = 0.1f,
    MaxOutputTokenCount = 500 // Cap response at ~375 words
};

Setting this too low truncates useful responses mid-sentence. Setting it too high wastes money when the model generates unnecessary padding. For JSON extraction, you can often estimate the maximum response size from your schema and set a tight limit.
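One way to derive that limit: measure a worst-case sample response and apply the rough rule of thumb of about four characters per token for English text and JSON (an approximation, not an exact tokenizer count), plus a safety margin:

```csharp
// Heuristic estimate: ~4 chars/token is an approximation; pad with a
// safety factor so legitimate responses are never truncated
static int EstimateMaxTokens(string worstCaseSample, double safetyFactor = 1.5)
{
    var estimatedTokens = worstCaseSample.Length / 4.0;
    return (int)Math.Ceiling(estimatedTokens * safetyFactor);
}
```

For precise counts, run your schema sample through the provider's actual tokenizer instead of the heuristic.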

Common Anti-Patterns and Fixes

After working with dozens of production LLM integrations, certain mistakes appear repeatedly. Here are the most damaging ones and how to fix them.

Anti-Pattern 1: Vague System Messages

Bad:

You are a helpful assistant.

Better:

You are a .NET 9 code review assistant. Analyze C# code for bugs, performance issues, and style violations. Reference specific line numbers. Suggest fixes with corrected code snippets. If the code has no issues, say so explicitly.

The vague version gives the model no guidance. It will produce generic, unfocused output. The specific version constrains the model’s behavior in ways that match your application’s needs.

Anti-Pattern 2: Instructions Without Structure

Bad:

Summarize this document and also extract any dates mentioned and list the people involved and tell me if the sentiment is positive or negative.

Better:

Analyze the following document. Provide your analysis in these sections:

## Summary
Two to three sentences summarizing the key points.

## People Mentioned
Bulleted list of names with their role or context.

## Key Dates
Bulleted list of dates with associated events.

## Overall Sentiment
One word: positive, neutral, or negative. Followed by a one-sentence justification.

Structured instructions produce structured output. The model follows formatting patterns when they are clearly demonstrated.

Anti-Pattern 3: Ignoring Token Economics

Every token in your system prompt is charged on every request. A 3,000-token system prompt across 100,000 daily requests is 300 million tokens per day in system prompt alone. At typical pricing, that adds up fast.

Audit your system prompts. Remove redundant instructions. Move examples to few-shot messages that you can conditionally include. Measure whether each sentence in your system prompt actually changes the model’s behavior — if it does not, remove it.
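The arithmetic is worth making concrete. A back-of-envelope calculation, using a hypothetical price of $2.50 per million input tokens (check your provider's current pricing):

```csharp
// System-prompt-only cost per day at an assumed input-token price
const int SystemPromptTokens = 3_000;
const int DailyRequests = 100_000;
const decimal PricePerMillionInputTokens = 2.50m; // hypothetical rate

long dailyTokens = (long)SystemPromptTokens * DailyRequests; // 300,000,000
decimal dailyCost = dailyTokens / 1_000_000m * PricePerMillionInputTokens;

Console.WriteLine($"{dailyTokens:N0} tokens/day ≈ ${dailyCost:N2}/day");
// 300,000,000 tokens/day ≈ $750.00/day — before any user input or output tokens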

Anti-Pattern 4: No Retry or Validation

LLM responses are probabilistic. Even with temperature 0, the model can occasionally produce malformed output (especially at API boundaries like rate limits or timeouts). Always implement:

public async Task<ExtractionResult> ExtractWithRetryAsync(
    IChatClient client,
    List<ChatMessage> messages,
    int maxRetries = 3)
{
    for (int attempt = 0; attempt < maxRetries; attempt++)
    {
        var response = await client.GetResponseAsync(messages);
        var result = ParseAndValidate(response.Text);

        if (result is not null)
            return result;

        // Log the failed attempt for prompt iteration
        Log.Warning("Extraction attempt {Attempt} failed. Raw output: {Output}",
            attempt + 1, response.Text);
    }

    throw new InvalidOperationException(
        $"Failed to extract valid result after {maxRetries} attempts");
}

Building Reusable Prompt Templates

Scattering prompt strings across your codebase is a maintenance nightmare. Centralize them in template classes that are easy to find, test, and iterate on.

public static class PromptTemplates
{
    public static string CodeReviewSystem(string dotnetVersion = "9") => $"""
        You are a code review assistant specializing in .NET {dotnetVersion} and C#.

        For each code snippet submitted:
        1. Identify bugs, including null reference risks and unhandled exceptions
        2. Flag performance concerns with estimated impact
        3. Note style issues per current .NET conventions
        4. Provide corrected code for any issues found

        If the code is clean, explicitly state that no issues were found.
        Respond in markdown with separate sections for each category.
        """;

    public static string EntityExtractionSystem(string[] entityTypes) => $"""
        You are a named entity extraction API.
        Extract entities of these types: {string.Join(", ", entityTypes)}.

        Return a JSON array of objects with "name", "type", and "context" fields.
        Return an empty array if no entities are found.
        Do not include any text outside the JSON array.
        """;

    public static string SummarizationUser(string content, int maxSentences = 3) => $"""
        Summarize the following content in {maxSentences} sentences or fewer.
        Focus on actionable information relevant to software engineers.

        Content:
        {content}
        """;
}

This approach keeps prompt logic testable and versionable. When you need to iterate on a prompt — and you will — you change it in one place.
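Centralized templates can also be covered by ordinary unit tests. A sketch using xUnit (an assumed test framework) with cheap guardrails that catch accidental prompt regressions:

```csharp
using Xunit;

public class PromptTemplateTests
{
    [Fact]
    public void CodeReviewSystem_MentionsRequestedVersion()
    {
        var prompt = PromptTemplates.CodeReviewSystem("8");
        Assert.Contains(".NET 8", prompt);
    }

    [Fact]
    public void EntityExtractionSystem_ListsAllEntityTypes()
    {
        var prompt = PromptTemplates.EntityExtractionSystem(["Person", "Organization"]);
        Assert.Contains("Person", prompt);
        Assert.Contains("Organization", prompt);
    }
}
```

These tests do not verify model behavior — only that the template text still contains what your downstream parsing depends on.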

Complete Example: Azure OpenAI Structured Extraction

Here is a production-ready example that ties everything together — calling Azure OpenAI with Microsoft.Extensions.AI, using a structured prompt, and validating the response:

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;
using System.Text.Json;

// Configure the client with Managed Identity (no API keys in production)
var azureClient = new AzureOpenAIClient(
    new Uri("https://your-resource.openai.azure.com/"),
    new DefaultAzureCredential());

IChatClient chatClient = azureClient
    .GetChatClient("gpt-4o")
    .AsIChatClient();

// Build the prompt
var messages = new List<ChatMessage>
{
    new(ChatRole.System, """
        You are a technical content classifier for .NET developer articles.

        Classify the given article title and description into:
        - category: one of "tutorial", "reference", "troubleshooting", "news", "opinion"
        - difficulty: one of "beginner", "intermediate", "advanced"
        - tags: array of 3-5 relevant technology tags
        - summary: one-sentence summary of what the article covers

        Respond with valid JSON only.
        """),

    // Few-shot example
    new(ChatRole.User, """
        Title: Fix: CS8032 Instance of Analyzer Cannot Be Created in .NET 8+
        Description: How to resolve the CS8032 error when building .NET 8 projects with Roslyn analyzers.
        """),
    new(ChatRole.Assistant, """
        {"category":"troubleshooting","difficulty":"intermediate","tags":["Roslyn","Analyzers",".NET 8","Build Errors"],"summary":"Guide to resolving CS8032 analyzer instantiation errors caused by Roslyn API version mismatches in .NET 8+ projects."}
        """),

    // Actual request
    new(ChatRole.User, """
        Title: Build a RAG Chatbot with .NET, Semantic Kernel, and Azure Cosmos DB
        Description: End-to-end tutorial for building a production-ready RAG chatbot using Semantic Kernel and Cosmos DB vector search.
        """)
};

var options = new ChatOptions
{
    Temperature = 0.1f,
    MaxOutputTokens = 300
};

var response = await chatClient.GetResponseAsync(messages, options);

// Parse and validate
var classification = JsonSerializer.Deserialize<ArticleClassification>(
    response.Text,
    new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

if (classification is not null)
{
    Console.WriteLine($"Category: {classification.Category}");
    Console.WriteLine($"Difficulty: {classification.Difficulty}");
    Console.WriteLine($"Tags: {string.Join(", ", classification.Tags)}");
    Console.WriteLine($"Summary: {classification.Summary}");
}

public record ArticleClassification(
    string Category,
    string Difficulty,
    string[] Tags,
    string Summary);

This example demonstrates the complete pattern: typed client setup with managed identity, system message with clear schema, few-shot example, low temperature for consistency, and strongly-typed deserialization.

What Comes Next

You now have the prompting fundamentals — roles, system messages, few-shot patterns, chain-of-thought, structured output, and template management. These techniques apply to every LLM interaction you build, regardless of provider.

The next step is understanding the provider landscape. Different models have different strengths, pricing, and API patterns. In Comparing LLM Providers: OpenAI, Azure, and Anthropic, we break down the trade-offs to help you choose the right provider for your use case.

For hands-on practice with Azure OpenAI streaming in C#, check out the Azure OpenAI Chat Completion with Streaming in .NET workshop.

⚠ Production Considerations

  • Hardcoding prompts as inline strings throughout your codebase makes iteration and A/B testing nearly impossible — centralize prompts in template classes or configuration.
  • Trusting model JSON output without schema validation leads to runtime deserialization failures that surface only with specific inputs.

🧠 Architect’s Note

Treat prompts as configuration, not code. They will change more frequently than your C# logic. Design your architecture so prompt text can be updated, versioned, and A/B tested without redeploying your application.
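One way to act on this (a sketch; the file name and configuration keys such as "Prompts:CodeReviewSystem" are assumptions, not a standard) is to load prompt text through Microsoft.Extensions.Configuration with reload-on-change enabled:

```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Configuration;

// Prompts live in a JSON file that can be swapped without redeploying
var config = new ConfigurationBuilder()
    .AddJsonFile("prompts.json", optional: false, reloadOnChange: true)
    .Build();

string systemPrompt = config["Prompts:CodeReviewSystem"]
    ?? throw new InvalidOperationException("Missing prompt: CodeReviewSystem");

var messages = new List<ChatMessage> { new(ChatRole.System, systemPrompt) };
```

From there, versioning and A/B testing become a matter of which prompt file or key the application reads, not which binary is deployed.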

Summary

This article teaches C# developers prompt engineering fundamentals with practical code examples. It covers the message role system (system/user/assistant), system message design patterns, few-shot prompting with C# string interpolation, chain-of-thought prompting, structured JSON output extraction, temperature tuning for different tasks, common anti-patterns, reusable prompt template classes, and a complete Azure OpenAI integration example using Microsoft.Extensions.AI.

Key Takeaways

  • System messages set the behavioral foundation — define role, constraints, output format, and boundaries
  • Few-shot examples are the most reliable way to control output format without fine-tuning
  • Chain-of-thought prompting improves accuracy on reasoning tasks by forcing the model to show its work
  • Use JSON mode and schema validation together — JSON mode ensures valid JSON, not valid schema
  • Build reusable prompt template classes in C# instead of scattering string interpolation across your codebase

Implementation Checklist

  • Design system messages with role, task, format, and boundary sections
  • Use few-shot examples for consistent output formatting
  • Set temperature based on task type (0 for code/JSON, 0.5-0.7 for conversation)
  • Enable JSON response format for structured output tasks
  • Validate deserialized output against expected schema
  • Create reusable prompt template classes for maintainability
  • Test prompts with edge cases before deploying to production

Frequently Asked Questions

What is prompt engineering?

Prompt engineering is the practice of designing inputs to large language models that reliably produce the outputs you need. It includes crafting system messages, structuring user prompts, providing examples (few-shot prompting), and tuning parameters like temperature. For C# developers, prompt engineering also involves building reusable prompt templates and handling structured output deserialization.

How do I write effective system messages for Azure OpenAI?

Effective system messages define the model's role, set behavioral constraints, specify output format, and establish boundaries. Keep them focused: state who the model is, what it should do, what format to use, and what it should refuse. Avoid vague instructions like 'be helpful' — instead, be specific: 'You are a .NET documentation assistant. Answer questions about C# and .NET only. Respond in markdown format. If a question is outside .NET development, say so.'

What is few-shot prompting and when should I use it?

Few-shot prompting is providing 2-5 examples of input-output pairs in your prompt before the actual request. This teaches the model the pattern you expect without fine-tuning. Use it when zero-shot prompting produces inconsistent formatting, when you need domain-specific output patterns, or when the task requires a specific structure the model would not infer from instructions alone.

How do I get structured JSON output from an LLM in C#?

Use a system message that specifies JSON output format, provide a schema or example, set temperature to 0-0.2 for consistency, and enable JSON mode if your provider supports it (Azure OpenAI response_format: json_object). In C#, deserialize the response using System.Text.Json with strict validation. Always validate the parsed output against your expected schema — models can produce valid JSON that does not match your structure.


#Prompt Engineering #C# #Azure OpenAI #System Messages #Few-Shot Prompting