Project — Build a Customer Support Agent with Escalation Workflows in LangGraph

Written by Selva Prabhakaran | 29 min read

A customer writes in: “I was charged twice for my order.” Your agent needs to check the knowledge base, realize this isn’t a simple FAQ, and escalate to a human — with full context. Not just forward the message. Actually hand it off with sentiment, category, and a suggested resolution.

Most LangGraph tutorials stop at “agent calls a tool.” Real support systems need more. Routing logic. Escalation workflows. Knowledge base search. Conversation logging. Graceful handoffs. That’s what we’re building here.

Before we write any code, here’s how the pieces connect.

A customer message arrives. First, the agent classifies it — billing issue, product question, complaint, or something else? That classification feeds a router.

Simple questions go straight to the knowledge base. The agent finds an answer and responds. But if the query is complex, or the customer sounds frustrated, the router sends it down the escalation path. The escalation node packages everything — the message, category, sentiment, and a suggested resolution — then flags it for a human.

Every interaction gets logged, whether resolved automatically or escalated. We’ll build each piece as a separate LangGraph node.

Prerequisites

  • Python version: 3.10+
  • Required libraries: langgraph (0.4+), langchain-openai (0.3+), langchain-core (0.3+), python-dotenv
  • Install: pip install langgraph langchain-openai langchain-core python-dotenv
  • API key: OpenAI API key stored in a .env file as OPENAI_API_KEY
  • Prior knowledge: Basic LangGraph concepts (nodes, edges, state). If you haven’t built a graph yet, start with our LangGraph Installation and First Graph post.
  • Time to complete: 35-40 minutes

Defining the Agent State

What data does the agent need to carry between nodes? The customer’s message, of course. The classification result. Sentiment. The response. Whether it escalated. That’s your state.

We’ll use TypedDict for this. Each field has a clear purpose — no vague catch-all dictionaries.

python
import os
from dotenv import load_dotenv
from typing import TypedDict, Literal, Annotated
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI

load_dotenv()

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


class SupportState(TypedDict):
    customer_message: str
    category: str
    sentiment: str
    sentiment_score: float
    kb_results: str
    response: str
    escalated: bool
    escalation_reason: str
    log_entry: str

print("State schema defined -- 9 fields covering the full support workflow")
python
State schema defined -- 9 fields covering the full support workflow

Notice there’s no messages list here. We’re not building a chatbot — we’re building a single-turn support processor. Each customer message flows through the pipeline once. That keeps the state simple and the logic predictable.

Key Insight: **Design your state for your workflow, not for a generic chatbot.** A support agent that classifies, routes, and escalates needs structured fields (category, sentiment, escalated). A flat messages list would force every node to parse unstructured text.

Building the Classifier Node

Why should a billing question from a calm customer be handled differently from the same question from an angry one? Because sentiment matters as much as category.

The classifier assigns two things: a category (billing, product, technical, complaint, general) and a sentiment (positive, neutral, negative, angry). A calm billing question goes to the knowledge base. An angry billing question gets escalated — even if the KB has the answer.

The classifier calls the LLM with a structured prompt and asks for JSON output.

python
import json


def classify_message(state: SupportState) -> dict:
    """Classify the customer message by category and sentiment."""
    prompt = f"""Analyze this customer support message.

Message: "{state['customer_message']}"

Return JSON with exactly these fields:
- category: one of [billing, product, technical, complaint, general]
- sentiment: one of [positive, neutral, negative, angry]
- sentiment_score: float from -1.0 (very negative) to 1.0 (very positive)

Return ONLY valid JSON, no extra text."""

    result = llm.invoke(prompt)
    parsed = json.loads(result.content)

    return {
        "category": parsed["category"],
        "sentiment": parsed["sentiment"],
        "sentiment_score": parsed["sentiment_score"],
    }

print("classify_message node defined")
python
classify_message node defined

I kept the prompt minimal on purpose. In production, you’d add few-shot examples for each category. But for learning the architecture, a simple prompt keeps the focus on the graph structure.
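One failure mode worth guarding against: some models wrap their JSON in markdown fences despite the "ONLY valid JSON" instruction, which would make the bare json.loads call in classify_message raise. Here is a hedged helper (the name is mine, not part of the tutorial's core code) that strips fence lines before parsing:

```python
import json


def parse_llm_json(raw: str) -> dict:
    """Parse JSON from LLM output, tolerating markdown code fences.

    Illustrative helper, not part of the tutorial's original code.
    """
    # Drop any fence lines (e.g. an opening line with a language tag
    # and a closing line), then parse what remains
    lines = [ln for ln in raw.strip().splitlines()
             if not ln.strip().startswith("```")]
    return json.loads("\n".join(lines))


print(parse_llm_json('```json\n{"category": "billing", "sentiment": "neutral"}\n```'))
```

Swapping `json.loads(result.content)` for `parse_llm_json(result.content)` in the classifier makes it tolerant of fenced output without changing anything else.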

Building the Knowledge Base Node

In production, this node would query a vector database — Pinecone, ChromaDB, or Weaviate. For this project, a dictionary-based KB keeps us focused on the graph architecture.

The node grabs the category from the classifier and looks up matching entries. A keyword match returns that entry; if no keyword matches, the node falls back to the first entry in the category. Only an unknown category yields an empty string, which the router uses as an escalation trigger.

python
KNOWLEDGE_BASE = {
    "billing": {
        "double_charge": "If you were charged twice, we'll refund the duplicate within 3-5 business days. No action needed from your side.",
        "refund_policy": "Refunds are processed within 5-7 business days after approval. You'll receive an email confirmation.",
        "payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
    },
    "product": {
        "shipping": "Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available for $9.99.",
        "returns": "You can return any item within 30 days of delivery for a full refund. Items must be unused.",
        "sizing": "Check our sizing guide at /sizing. If between sizes, we recommend going one size up.",
    },
    "technical": {
        "login": "Try resetting your password at /reset-password. If that fails, clear your browser cache and try again.",
        "app_crash": "Update to the latest version from your app store. If crashes persist, email bugs@support.com with your device model.",
    },
}


def search_knowledge_base(state: SupportState) -> dict:
    """Search the knowledge base for relevant answers."""
    category = state["category"]
    message = state["customer_message"].lower()

    if category not in KNOWLEDGE_BASE:
        return {"kb_results": ""}

    best_match = ""
    for key, answer in KNOWLEDGE_BASE[category].items():
        if any(word in message for word in key.split("_")):
            best_match = answer
            break

    if not best_match:
        entries = list(KNOWLEDGE_BASE[category].values())
        best_match = entries[0] if entries else ""

    return {"kb_results": best_match}

print("search_knowledge_base node defined")
python
search_knowledge_base node defined
Tip: **Swap this for a vector search in production.** Replace the dictionary lookup with an embedding-based search using a LangChain vector store (e.g., from `langchain_community.vectorstores`). The node signature stays identical — only the internals change. That’s the power of LangGraph’s modular design.
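To make that swap concrete without standing up a vector database, here is a sketch of the same node with a token-overlap score standing in for embedding similarity. Everything in it (the scoring scheme, the trimmed-down KB copy) is illustrative; the point is that the node's signature and return shape don't change.

```python
import re

# Trimmed-down copy of the KB so this snippet runs on its own
KB = {
    "billing": {
        "refund_policy": "Refunds are processed within 5-7 business days after approval.",
        "payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
    },
}


def _tokens(text: str) -> set:
    """Lowercase alphanumeric tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_kb_by_similarity(state: dict) -> dict:
    """Same signature as search_knowledge_base; only the internals differ."""
    message = _tokens(state["customer_message"])
    best_score, best_answer = 0, ""
    for key, answer in KB.get(state["category"], {}).items():
        # Stand-in for cosine similarity over embeddings: count shared tokens
        score = len(message & (_tokens(key) | _tokens(answer)))
        if score > best_score:
            best_score, best_answer = score, answer
    return {"kb_results": best_answer}


result = search_kb_by_similarity(
    {"category": "billing", "customer_message": "How long does a refund take?"}
)
print(result["kb_results"])  # matches the refund_policy entry
```

In production the scoring line is where the embedding call goes; the rest of the graph never notices the difference.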

Building the Response Generator Node

Raw KB answers sound robotic. “Refunds are processed within 5-7 business days” is accurate but cold. The response generator turns that into a human-sounding reply.

It adapts tone based on sentiment. An angry customer gets extra empathy. A neutral customer gets a direct, professional answer.

python
def generate_response(state: SupportState) -> dict:
    """Generate a customer-friendly response using KB results."""
    tone_guide = {
        "positive": "friendly and warm",
        "neutral": "professional and clear",
        "negative": "empathetic and solution-focused",
        "angry": "deeply empathetic, apologetic, and action-oriented",
    }
    tone = tone_guide.get(state["sentiment"], "professional")

    prompt = f"""Write a customer support response.

Customer message: "{state['customer_message']}"
Category: {state['category']}
Relevant information: {state['kb_results']}

Tone: Be {tone}.
Keep the response under 3 sentences.
Address the customer's concern directly.
Do NOT make up information not in the relevant information above."""

    result = llm.invoke(prompt)
    return {"response": result.content}

print("generate_response node defined")
python
generate_response node defined

That tone mapping is deliberate. A support bot that responds to “I’m furious about this charge!” with “Thanks for reaching out!” kills trust instantly. Matching tone to sentiment is the difference between helpful and infuriating.

Building the Escalation Node

Sometimes the agent needs to step back. The escalation node packages everything a human agent needs: the message, category, sentiment, and a suggested resolution.

Why the suggested resolution? Human agents handle dozens of tickets daily. A starting point like “suggest refunding the duplicate charge” saves them time and keeps resolutions consistent.

python
def escalate_to_human(state: SupportState) -> dict:
    """Package the ticket for human agent review."""
    prompt = f"""A customer support ticket needs human review.

Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})
Knowledge base results: {state['kb_results'] or 'No relevant KB entry found'}

Write a brief escalation summary (2-3 sentences) that:
1. States why this needs human attention
2. Suggests a resolution
3. Notes the customer's emotional state"""

    result = llm.invoke(prompt)

    return {
        "escalated": True,
        "escalation_reason": result.content,
        "response": f"I understand your concern. I've escalated this to a specialist who will review your case. You'll hear back within 2 hours.",
    }

print("escalate_to_human node defined")
python
escalate_to_human node defined
Warning: **Never tell the customer “I can’t help you.”** Always provide a concrete next step. “I’ve escalated this to a specialist” with a time frame is much better than “I don’t know the answer.” Your auto-response should set clear expectations.

Building the Logger Node

No paper trail? No accountability. The logger creates a structured JSON entry capturing what happened, what the agent decided, and the outcome.

It runs at the end of every path — both the auto-response branch and the escalation branch.

python
from datetime import datetime


def log_interaction(state: SupportState) -> dict:
    """Create a structured log entry for the interaction."""
    log = {
        "timestamp": datetime.now().isoformat(),
        "customer_message": state["customer_message"],
        "category": state["category"],
        "sentiment": state["sentiment"],
        "sentiment_score": state["sentiment_score"],
        "escalated": state.get("escalated", False),
        "response_preview": state["response"][:100],
    }

    if state.get("escalated"):
        log["escalation_reason"] = state.get("escalation_reason", "")

    log_str = json.dumps(log, indent=2)
    print(f"--- Interaction Log ---\n{log_str}")
    return {"log_entry": log_str}

print("log_interaction node defined")
python
log_interaction node defined
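Printing the log is fine for learning. In a real service you would persist each entry somewhere queryable; here is a minimal sketch (the file path is my choice, not from the tutorial) that appends entries to a JSONL file:

```python
import json
from datetime import datetime
from pathlib import Path

LOG_FILE = Path("support_log.jsonl")  # hypothetical location for this sketch


def append_log(entry: dict) -> None:
    """Append one interaction log as a single JSON line (JSONL format)."""
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


append_log({
    "timestamp": datetime.now().isoformat(),
    "category": "billing",
    "escalated": False,
})
print(LOG_FILE.read_text(encoding="utf-8").strip().splitlines()[-1])
```

Each line is an independent JSON object, so the file can be tailed, grepped, or bulk-loaded into a log pipeline without parsing the whole file.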

Wiring the Router — Conditional Edges

Here’s the brain of the system. Should the agent respond directly, or hand off to a human?

Three factors drive the decision:

  • Sentiment: Angry customers always get escalated.
  • Category: Complaints go to humans regardless of tone.
  • KB coverage: No knowledge base results? Escalate.

LangGraph handles this with conditional edges — a function that returns the next node name based on state.

python
def route_support(state: SupportState) -> str:
    """Decide whether to respond directly or escalate."""
    # Angry customers always get escalated
    if state["sentiment"] == "angry":
        return "escalate"

    # Complaints go to humans
    if state["category"] == "complaint":
        return "escalate"

    # Very negative sentiment gets escalated
    if state["sentiment_score"] < -0.5:
        return "escalate"

    # No KB results -- escalate
    if not state.get("kb_results"):
        return "escalate"

    return "respond"

print("route_support router defined")
python
route_support router defined

I prefer rule-based routing over LLM-based decisions for this. Rules are predictable, testable, and don’t burn API credits on decisions you can codify. Save the LLM for tasks that genuinely need language understanding.
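That predictability is easy to demonstrate: because the router is a pure function of state, every rule can be unit-tested without touching the LLM. A quick sketch (with a local copy of route_support so the snippet runs standalone):

```python
def route_support(state: dict) -> str:
    """Local copy of the router above, repeated so this snippet is self-contained."""
    if state["sentiment"] == "angry":
        return "escalate"
    if state["category"] == "complaint":
        return "escalate"
    if state["sentiment_score"] < -0.5:
        return "escalate"
    if not state.get("kb_results"):
        return "escalate"
    return "respond"


# A calm billing question with KB coverage: the only case that auto-responds
base = {"sentiment": "neutral", "category": "billing",
        "sentiment_score": 0.0, "kb_results": "refund info"}

assert route_support(base) == "respond"
assert route_support({**base, "sentiment": "angry"}) == "escalate"
assert route_support({**base, "category": "complaint"}) == "escalate"
assert route_support({**base, "sentiment_score": -0.9}) == "escalate"
assert route_support({**base, "kb_results": ""}) == "escalate"
print("all routing rules verified")
```

Try doing that with an LLM-based router: you would need mocked API calls and you still couldn't guarantee the same answer twice.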

Assembling the Graph

Five nodes. Two paths. One StateGraph. Here’s how they connect.

The classifier runs first, then the KB search, then the router picks a path. One path leads to the response generator. The other leads to escalation. Both converge at the logger before ending.

python
# Build the graph
workflow = StateGraph(SupportState)

# Add nodes
workflow.add_node("classify", classify_message)
workflow.add_node("search_kb", search_knowledge_base)
workflow.add_node("respond", generate_response)
workflow.add_node("escalate", escalate_to_human)
workflow.add_node("log", log_interaction)

# Wire edges
workflow.add_edge(START, "classify")
workflow.add_edge("classify", "search_kb")

# Conditional routing after KB search
workflow.add_conditional_edges(
    "search_kb",
    route_support,
    {"respond": "respond", "escalate": "escalate"},
)

workflow.add_edge("respond", "log")
workflow.add_edge("escalate", "log")
workflow.add_edge("log", END)

# Compile with checkpointer
checkpointer = MemorySaver()
support_agent = workflow.compile(checkpointer=checkpointer)

print("Support agent compiled -- 5 nodes, 2 paths, conditional routing active")
python
Support agent compiled -- 5 nodes, 2 paths, conditional routing active
Key Insight: **The graph structure IS your business logic.** When a product manager asks “why did the agent escalate?”, you can point to the exact conditional edge and routing function. That’s not possible with a single-prompt approach where the LLM makes all decisions inside one call.

Running the Agent — Test Scenarios

Does the routing actually work? Let’s find out. Three test cases: a simple question (should auto-respond), a furious customer (should escalate), and a formal complaint (should escalate regardless of tone).

Each invocation needs a unique thread_id. That’s how LangGraph keeps separate conversations apart in the checkpointer.

Test 1: Simple billing question (should auto-respond)

python
config = {"configurable": {"thread_id": "test-1"}}

result = support_agent.invoke(
    {
        "customer_message": "What payment methods do you accept?",
        "category": "",
        "sentiment": "",
        "sentiment_score": 0.0,
        "kb_results": "",
        "response": "",
        "escalated": False,
        "escalation_reason": "",
        "log_entry": "",
    },
    config=config,
)

print(f"\nCategory: {result['category']}")
print(f"Sentiment: {result['sentiment']}")
print(f"Escalated: {result['escalated']}")
print(f"Response: {result['response']}")

The agent should classify this as billing with neutral sentiment, find the payment methods entry, and respond directly. No escalation. (If the model labeled it general instead, the KB has no general section, so the empty result would trigger escalation — a reminder that routing quality depends on classification quality.)

Test 2: Angry customer (should escalate)

python
config_2 = {"configurable": {"thread_id": "test-2"}}

result_2 = support_agent.invoke(
    {
        "customer_message": "I was charged THREE TIMES for the same order! This is absolutely unacceptable. I want a full refund NOW and I'm filing a complaint with my bank.",
        "category": "",
        "sentiment": "",
        "sentiment_score": 0.0,
        "kb_results": "",
        "response": "",
        "escalated": False,
        "escalation_reason": "",
        "log_entry": "",
    },
    config=config_2,
)

print(f"\nCategory: {result_2['category']}")
print(f"Sentiment: {result_2['sentiment']}")
print(f"Escalated: {result_2['escalated']}")
print(f"Escalation reason: {result_2['escalation_reason'][:200]}")

Clearly angry. The classifier should assign sentiment: angry, which triggers immediate escalation. Category and KB results don’t matter here — the sentiment alone is enough.

Test 3: Complaint (should escalate regardless of sentiment)

python
config_3 = {"configurable": {"thread_id": "test-3"}}

result_3 = support_agent.invoke(
    {
        "customer_message": "I'd like to file a formal complaint. Your delivery service damaged my package and the item inside is broken.",
        "category": "",
        "sentiment": "",
        "sentiment_score": 0.0,
        "kb_results": "",
        "response": "",
        "escalated": False,
        "escalation_reason": "",
        "log_entry": "",
    },
    config=config_3,
)

print(f"\nCategory: {result_3['category']}")
print(f"Sentiment: {result_3['sentiment']}")
print(f"Escalated: {result_3['escalated']}")
print(f"Response: {result_3['response']}")

Even if this customer sounds relatively calm, the complaint category triggers escalation. That’s your routing rules doing their job.

Exercise 1: Add a Priority Level to the Router

Now that you’ve seen the routing logic, try modifying it. Add a priority field to the state and set it based on the routing decision.

typescript
{
  type: 'exercise',
  id: 'priority-routing',
  title: 'Exercise 1: Add Priority-Based Routing',
  difficulty: 'advanced',
  exerciseType: 'write',
  instructions: 'Modify the route_support function to also return a priority level. Add a "priority" field to SupportState. Set priority to "urgent" for angry customers, "high" for complaints and negative sentiment, and "normal" for everything else. Print the priority after routing.',
  starterCode: '# Add "priority" to SupportState\n# Modify route_support to set priority\n\ndef route_support_v2(state):\n    # Your escalation logic here\n    # Set state["priority"] based on rules\n    # Return "escalate" or "respond"\n    pass\n\n# Test with an angry message\ntest_state = {\n    "customer_message": "This is outrageous!",\n    "sentiment": "angry",\n    "sentiment_score": -0.9,\n    "category": "billing",\n    "kb_results": "Refund policy info",\n}\nresult = route_support_v2(test_state)\nprint(f"Route: {result}")\nprint(f"Priority: {test_state.get(\'priority\', \'not set\')}")',
  testCases: [
    { id: 'tc1', input: 'test_state = {"sentiment": "angry", "sentiment_score": -0.9, "category": "billing", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'urgent', description: 'Angry customers should be urgent priority' },
    { id: 'tc2', input: 'test_state = {"sentiment": "neutral", "sentiment_score": 0.2, "category": "complaint", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'high', description: 'Complaints should be high priority' },
    { id: 'tc3', input: 'test_state = {"sentiment": "neutral", "sentiment_score": 0.5, "category": "product", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'normal', description: 'Neutral product queries should be normal priority' },
  ],
  hints: [
    'Check sentiment first (angry = urgent), then category (complaint = high), then default to normal.',
    'Set state["priority"] = "urgent" at the top if sentiment == "angry", then return "escalate". For complaints, set "high". Everything else gets "normal".',
  ],
  solution: 'def route_support_v2(state):\n    if state["sentiment"] == "angry":\n        state["priority"] = "urgent"\n        return "escalate"\n    if state["category"] == "complaint":\n        state["priority"] = "high"\n        return "escalate"\n    if state["sentiment_score"] < -0.5:\n        state["priority"] = "high"\n        return "escalate"\n    if not state.get("kb_results"):\n        state["priority"] = "high"\n        return "escalate"\n    state["priority"] = "normal"\n    return "respond"',
  solutionExplanation: 'The priority assignment mirrors the routing logic. Angry customers get the highest priority (urgent) and always escalate. Complaints and very negative sentiment get high priority. Everything else is normal and routes to the auto-response path.',
  xpReward: 20,
}

Adding Human-in-the-Loop Approval

What if escalation decisions themselves need review? Maybe a junior agent shouldn’t escalate directly — a supervisor should approve first.

LangGraph’s interrupt function handles this. It pauses the graph mid-execution, shows context to a human reviewer, and waits. The graph only continues when the reviewer sends a Command to resume.

Two changes are needed: an interrupt inside the escalation node, and Command(resume=...) to continue with the human’s decision.

python
from langgraph.types import interrupt, Command


def escalate_with_approval(state: SupportState) -> dict:
    """Escalate and pause for human approval."""
    prompt = f"""A customer support ticket needs human review.

Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})

Write a 2-sentence escalation summary."""

    result = llm.invoke(prompt)
    escalation_summary = result.content

    # Pause here -- show the human what's happening
    human_decision = interrupt({
        "escalation_summary": escalation_summary,
        "customer_message": state["customer_message"],
        "question": "Approve this escalation? (approve/reject/modify)",
    })

    # human_decision comes from Command(resume=...)
    if human_decision.get("action") == "reject":
        return {
            "escalated": False,
            "response": "Let me look into this further for you.",
            "escalation_reason": "Rejected by reviewer",
        }

    return {
        "escalated": True,
        "escalation_reason": escalation_summary,
        "response": human_decision.get(
            "custom_response",
            "I've escalated this to a specialist. You'll hear back within 2 hours.",
        ),
    }

print("escalate_with_approval node defined -- uses interrupt for HITL")
python
escalate_with_approval node defined -- uses interrupt for HITL

Here’s the graph with the approval node wired in. The first invoke runs until it hits the interrupt and pauses. The second invoke resumes with the reviewer’s decision.

python
# Build graph with approval step
approval_workflow = StateGraph(SupportState)
approval_workflow.add_node("classify", classify_message)
approval_workflow.add_node("search_kb", search_knowledge_base)
approval_workflow.add_node("respond", generate_response)
approval_workflow.add_node("escalate", escalate_with_approval)
approval_workflow.add_node("log", log_interaction)

approval_workflow.add_edge(START, "classify")
approval_workflow.add_edge("classify", "search_kb")
approval_workflow.add_conditional_edges(
    "search_kb",
    route_support,
    {"respond": "respond", "escalate": "escalate"},
)
approval_workflow.add_edge("respond", "log")
approval_workflow.add_edge("escalate", "log")
approval_workflow.add_edge("log", END)

approval_agent = approval_workflow.compile(checkpointer=MemorySaver())
print("Approval-enabled agent compiled")
python
Approval-enabled agent compiled

The approval workflow takes two steps. First, invoke until the interrupt fires. Second, resume with the reviewer’s decision.

python
config_approval = {"configurable": {"thread_id": "approval-test-1"}}

# Step 1: Invoke -- this will pause at the interrupt
first_result = approval_agent.invoke(
    {
        "customer_message": "I'm really frustrated. I've been waiting 3 weeks for my order and nobody responds to my emails.",
        "category": "",
        "sentiment": "",
        "sentiment_score": 0.0,
        "kb_results": "",
        "response": "",
        "escalated": False,
        "escalation_reason": "",
        "log_entry": "",
    },
    config=config_approval,
)
print("Graph paused at interrupt -- waiting for human decision")

python
# Step 2: Resume with human approval
final_result = approval_agent.invoke(
    Command(resume={"action": "approve", "custom_response": "I sincerely apologize for the delay. I've personally flagged your order for priority review. Expect an update within 24 hours."}),
    config=config_approval,
)
print(f"Escalated: {final_result['escalated']}")
print(f"Response: {final_result['response']}")

Note: **The interrupt/Command pattern is the core of LangGraph’s HITL system.** The graph saves its full state at the interrupt point. Even if your server restarts, the graph can resume from exactly where it paused — as long as your checkpointer persists state (use PostgresSaver in production).
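As the note says, durable pauses need a persistent checkpointer. Here is a sketch of the swap, assuming the separately installed langgraph-checkpoint-postgres package and a placeholder connection string (not runnable without a Postgres instance):

```python
# Requires: pip install langgraph-checkpoint-postgres psycopg
from langgraph.checkpoint.postgres import PostgresSaver

# Placeholder URI -- point this at your own database
DB_URI = "postgresql://user:password@localhost:5432/support_db"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first use
    durable_agent = approval_workflow.compile(checkpointer=checkpointer)
    # Interrupted runs now survive process restarts: resume with the same
    # thread_id and Command(resume=...) as before.
```

The graph code is untouched; only the compile call changes.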

Exercise 2: Add a Modification Path

The current approval step has two options: approve or reject. But human reviewers often want to modify the response before sending it. Add a “modify” action to the escalation approval.

typescript
{
  type: 'exercise',
  id: 'modify-escalation',
  title: 'Exercise 2: Add Response Modification to HITL',
  difficulty: 'advanced',
  exerciseType: 'write',
  instructions: 'Extend the escalate_with_approval function to handle a third action: "modify". When the human sends action="modify" with a "modified_response" field, use their modified response instead of the auto-generated one. The escalation should still be marked as True.',
  starterCode: '# Extend the human_decision handling\ndef handle_human_decision(human_decision, escalation_summary):\n    """Process the human reviewer decision.\n    \n    Args:\n        human_decision: dict with "action" and optional fields\n        escalation_summary: the AI-generated summary\n    \n    Returns:\n        dict with escalated, response, escalation_reason\n    """\n    # Handle "approve", "reject", and "modify" actions\n    pass\n\n# Test\nresult = handle_human_decision(\n    {"action": "modify", "modified_response": "Custom reply here"},\n    "Customer upset about shipping"\n)\nprint(f"Escalated: {result[\'escalated\']}")\nprint(f"Response: {result[\'response\']}")',
  testCases: [
    { id: 'tc1', input: 'r = handle_human_decision({"action": "modify", "modified_response": "Custom reply"}, "summary"); print(r["escalated"]); print(r["response"])', expectedOutput: 'True\nCustom reply', description: 'Modify action should use the custom response' },
    { id: 'tc2', input: 'r = handle_human_decision({"action": "reject"}, "summary"); print(r["escalated"])', expectedOutput: 'False', description: 'Reject should not escalate' },
  ],
  hints: [
    'Add an elif for action == "modify" that reads human_decision["modified_response"].',
    'if action == "approve": return {"escalated": True, "response": default, ...}. elif action == "modify": return {"escalated": True, "response": human_decision["modified_response"], ...}. else reject.',
  ],
  solution: 'def handle_human_decision(human_decision, escalation_summary):\n    action = human_decision.get("action", "approve")\n    if action == "reject":\n        return {"escalated": False, "response": "Let me look into this further.", "escalation_reason": "Rejected"}\n    elif action == "modify":\n        return {"escalated": True, "response": human_decision["modified_response"], "escalation_reason": escalation_summary}\n    else:\n        return {"escalated": True, "response": "Escalated to specialist. Response within 2 hours.", "escalation_reason": escalation_summary}',
  solutionExplanation: 'The modify path keeps escalated=True (the ticket still goes to a human) but uses the reviewer custom response instead of the auto-generated one. This gives human agents control over the customer-facing message while preserving the escalation workflow.',
  xpReward: 20,
}

Common Mistakes and How to Fix Them

Mistake 1: Forgetting the checkpointer for HITL workflows

Wrong:

python
# This compiles but interrupt() will fail at runtime
agent = workflow.compile()

Why it breaks: Without a checkpointer, LangGraph can’t save the graph state when interrupt() pauses execution. The graph has nowhere to store its progress.

Correct:

python
agent = workflow.compile(checkpointer=MemorySaver())

Mistake 2: Reusing thread IDs across different conversations

Wrong:

python
# Same thread_id for different customers
config = {"configurable": {"thread_id": "main"}}
agent.invoke(message_1, config=config)
agent.invoke(message_2, config=config)  # Sees state from message_1!

Why it breaks: LangGraph tracks conversations by thread_id. Reusing the same ID makes the second invocation inherit state from the first — including the previous customer’s data.

Correct:

python
import uuid
config_1 = {"configurable": {"thread_id": str(uuid.uuid4())}}
config_2 = {"configurable": {"thread_id": str(uuid.uuid4())}}

Mistake 3: Not handling missing state fields

Wrong:

python
def route_support(state):
    if state["sentiment"] == "angry":  # KeyError if classify didn't run
        return "escalate"

Why it breaks: If a node upstream fails or doesn’t set a field, accessing it directly raises a KeyError.

Correct:

python
def route_support(state):
    if state.get("sentiment", "") == "angry":
        return "escalate"

When NOT to Use This Architecture

This agent pattern works well for structured, category-based support. But it’s not the right tool for every scenario.

Don’t use this when:

  • You need multi-turn conversations. This agent handles single messages. For back-and-forth dialogue, you’d need a messages list in state and a conversation loop.
  • Your categories change weekly. A hard-coded category list requires redeployment. If categories shift often, use an LLM-based classifier without fixed options.
  • You have fewer than 50 support tickets per day. The engineering overhead doesn’t pay off at small scale. A simple chain or even a prompt is sufficient.

Use this when:

  • You need auditable routing decisions with clear logic.
  • Human escalation is a core requirement, not an afterthought.
  • You want to swap components (KB, classifier, response generator) independently.

Production Considerations

The code in this tutorial is designed for learning. Here’s what changes for production.

Tip: **Production vs. tutorial differences** — keep these in mind when deploying.
| Tutorial Version | Production Version |
| --- | --- |
| MemorySaver (in-memory) | PostgresSaver or RedisSaver (persistent) |
| Dictionary knowledge base | Vector database (Pinecone, ChromaDB, Weaviate) |
| gpt-4o-mini for all nodes | Cheaper model for classification, stronger model for response generation |
| No authentication | JWT tokens, API keys for every endpoint |
| Print-based logging | Structured logging with ELK stack or CloudWatch |
| Single-turn processing | Queue-based processing with retry logic |

The biggest difference is the checkpointer. MemorySaver loses everything on restart. For any system where escalation pauses matter, you need persistence.

Summary

You built a customer support agent with five components: a classifier, a knowledge base search, a response generator, an escalation handler, and a logger. The conditional routing makes it smart — it responds to simple queries automatically but escalates complex or emotional cases to humans.

The key pattern to remember: design your graph around business decisions, not just LLM calls. The routing function, the escalation criteria, the tone mapping — these are business rules expressed as code. They’re testable, auditable, and don’t depend on LLM behavior.

Here’s a practice exercise to solidify what you’ve learned:

Practice: Add a feedback collection node

After the response is sent (but before logging), add a node that asks the customer to rate the interaction on a 1-5 scale. Store the rating in a new `feedback_score` field in the state. The logger should include this score in its output.

**Solution:**

python
def collect_feedback(state: SupportState) -> dict:
    """Collect a customer rating for the interaction."""
    # In production, this pauses the graph and waits for real customer input.
    # interrupt() surfaces the payload to the caller and returns whatever
    # value the graph is resumed with.
    feedback = interrupt({
        "question": "How would you rate this interaction? (1-5)",
        "response_given": state["response"],
    })
    # Requires adding `feedback_score: int` to the SupportState definition
    return {"feedback_score": feedback.get("rating", 0)}

Add it to the graph between the response/escalation nodes and the logger:

python
workflow.add_node("feedback", collect_feedback)
workflow.add_edge("respond", "feedback")
workflow.add_edge("escalate", "feedback")
workflow.add_edge("feedback", "log")

Complete Code

Click to expand the full script (copy-paste and run)
python
# Complete code from: Build a Customer Support Agent with Escalation Workflows
# Requires: pip install langgraph langchain-openai langchain-core python-dotenv
# Python 3.10+

import os
import json
import uuid
from datetime import datetime
from dotenv import load_dotenv
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import interrupt, Command
from langchain_openai import ChatOpenAI

load_dotenv()

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# --- State Definition ---
class SupportState(TypedDict):
    customer_message: str
    category: str
    sentiment: str
    sentiment_score: float
    kb_results: str
    response: str
    escalated: bool
    escalation_reason: str
    log_entry: str

# --- Knowledge Base ---
KNOWLEDGE_BASE = {
    "billing": {
        "double_charge": "If you were charged twice, we'll refund the duplicate within 3-5 business days. No action needed from your side.",
        "refund_policy": "Refunds are processed within 5-7 business days after approval. You'll receive an email confirmation.",
        "payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
    },
    "product": {
        "shipping": "Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available for $9.99.",
        "returns": "You can return any item within 30 days of delivery for a full refund. Items must be unused.",
        "sizing": "Check our sizing guide at /sizing. If between sizes, we recommend going one size up.",
    },
    "technical": {
        "login": "Try resetting your password at /reset-password. If that fails, clear your browser cache and try again.",
        "app_crash": "Update to the latest version from your app store. If crashes persist, email bugs@support.com with your device model.",
    },
}

# --- Nodes ---
def classify_message(state: SupportState) -> dict:
    prompt = f"""Analyze this customer support message.
Message: "{state['customer_message']}"
Return JSON with exactly these fields:
- category: one of [billing, product, technical, complaint, general]
- sentiment: one of [positive, neutral, negative, angry]
- sentiment_score: float from -1.0 (very negative) to 1.0 (very positive)
Return ONLY valid JSON, no extra text."""
    result = llm.invoke(prompt)
    # Strip accidental markdown fences before parsing, in case the model
    # wraps the JSON despite the instructions
    content = result.content.strip().removeprefix("```json").removeprefix("```").rstrip("`").strip()
    parsed = json.loads(content)
    return {
        "category": parsed["category"],
        "sentiment": parsed["sentiment"],
        "sentiment_score": parsed["sentiment_score"],
    }

def search_knowledge_base(state: SupportState) -> dict:
    category = state["category"]
    message = state["customer_message"].lower()
    if category not in KNOWLEDGE_BASE:
        return {"kb_results": ""}
    best_match = ""
    for key, answer in KNOWLEDGE_BASE[category].items():
        if any(word in message for word in key.split("_")):
            best_match = answer
            break
    if not best_match:
        entries = list(KNOWLEDGE_BASE[category].values())
        best_match = entries[0] if entries else ""
    return {"kb_results": best_match}

def generate_response(state: SupportState) -> dict:
    tone_guide = {
        "positive": "friendly and warm",
        "neutral": "professional and clear",
        "negative": "empathetic and solution-focused",
        "angry": "deeply empathetic, apologetic, and action-oriented",
    }
    tone = tone_guide.get(state["sentiment"], "professional")
    prompt = f"""Write a customer support response.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Relevant information: {state['kb_results']}
Tone: Be {tone}. Keep under 3 sentences. Address the concern directly."""
    result = llm.invoke(prompt)
    return {"response": result.content}

def escalate_to_human(state: SupportState) -> dict:
    prompt = f"""A customer support ticket needs human review.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})
KB results: {state['kb_results'] or 'No relevant KB entry found'}
Write a 2-3 sentence escalation summary."""
    result = llm.invoke(prompt)
    return {
        "escalated": True,
        "escalation_reason": result.content,
        "response": "I understand your concern. I've escalated this to a specialist who will review your case. You'll hear back within 2 hours.",
    }

def log_interaction(state: SupportState) -> dict:
    log = {
        "timestamp": datetime.now().isoformat(),
        "customer_message": state["customer_message"],
        "category": state["category"],
        "sentiment": state["sentiment"],
        "sentiment_score": state["sentiment_score"],
        "escalated": state.get("escalated", False),
        "response_preview": state["response"][:100],
    }
    if state.get("escalated"):
        log["escalation_reason"] = state.get("escalation_reason", "")
    log_str = json.dumps(log, indent=2)
    print(f"--- Interaction Log ---\n{log_str}")
    return {"log_entry": log_str}

# --- Router ---
def route_support(state: SupportState) -> str:
    if state.get("sentiment", "") == "angry":
        return "escalate"
    if state.get("category", "") == "complaint":
        return "escalate"
    if state.get("sentiment_score", 0) < -0.5:
        return "escalate"
    if not state.get("kb_results"):
        return "escalate"
    return "respond"

# --- Build Graph ---
workflow = StateGraph(SupportState)
workflow.add_node("classify", classify_message)
workflow.add_node("search_kb", search_knowledge_base)
workflow.add_node("respond", generate_response)
workflow.add_node("escalate", escalate_to_human)
workflow.add_node("log", log_interaction)

workflow.add_edge(START, "classify")
workflow.add_edge("classify", "search_kb")
workflow.add_conditional_edges(
    "search_kb",
    route_support,
    {"respond": "respond", "escalate": "escalate"},
)
workflow.add_edge("respond", "log")
workflow.add_edge("escalate", "log")
workflow.add_edge("log", END)

checkpointer = MemorySaver()
support_agent = workflow.compile(checkpointer=checkpointer)

# --- Run ---
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
result = support_agent.invoke(
    {
        "customer_message": "What payment methods do you accept?",
        "category": "",
        "sentiment": "",
        "sentiment_score": 0.0,
        "kb_results": "",
        "response": "",
        "escalated": False,
        "escalation_reason": "",
        "log_entry": "",
    },
    config=config,
)
print(f"\nCategory: {result['category']}")
print(f"Escalated: {result['escalated']}")
print(f"Response: {result['response']}")

print("\nScript completed successfully.")

Frequently Asked Questions

Can I use a different LLM instead of OpenAI?

Yes. Swap ChatOpenAI for any LangChain-compatible chat model. ChatAnthropic, ChatGroq, or ChatOllama all work. The graph structure doesn’t change — only the model initialization.

How do I add more categories to the classifier?

Add the new category to the classifier prompt and add corresponding entries to KNOWLEDGE_BASE. The router’s route_support function may also need updates if the new category has special routing rules.

What’s the difference between interrupt and interrupt_before?

interrupt pauses execution inside a node at runtime and can pass data to the human reviewer. interrupt_before (the older, static approach, passed to compile()) pauses before a listed node runs, with no payload. For new projects, use interrupt — it gives you more control and cleaner code.

How do I persist state across server restarts?

Replace MemorySaver() with PostgresSaver from langgraph-checkpoint-postgres. Install it with pip install langgraph-checkpoint-postgres and configure your database URL. In recent versions, from_conn_string is a context manager, and you call setup() once to create the checkpoint tables — the rest of the graph API stays identical.

python
from langgraph.checkpoint.postgres import PostgresSaver

with PostgresSaver.from_conn_string("postgresql://user:pass@localhost/db") as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    support_agent = workflow.compile(checkpointer=checkpointer)

Can this agent handle multiple languages?

The LLM handles language detection and response generation natively. Add a language field to the state and include “detect the language” in the classifier prompt. The response generator will reply in the detected language automatically.
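One way to wire that in — the `language` field and the extra prompt line below are illustrative assumptions, not part of the tutorial code:

```python
from typing import TypedDict

# Sketch: extend the state with a detected-language field (assumed addition).
class SupportState(TypedDict):
    customer_message: str
    language: str  # e.g. "en", "es" -- filled in by the classifier
    # ...the remaining tutorial fields stay unchanged

# Extra line to append to the classifier prompt (illustrative wording):
LANGUAGE_INSTRUCTION = (
    '- language: the ISO 639-1 code of the message language (e.g. "en", "de")'
)
```

The response-generation prompt can then include "Reply in {language}" so the tone mapping and KB answer are delivered in the customer's own language.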
