Project — Build a Customer Support Agent with Escalation Workflows in LangGraph
A customer writes in: “I was charged twice for my order.” Your agent needs to check the knowledge base, realize this isn’t a simple FAQ, and escalate to a human — with full context. Not just forward the message. Actually hand it off with sentiment, category, and a suggested resolution.
Most LangGraph tutorials stop at “agent calls a tool.” Real support systems need more. Routing logic. Escalation workflows. Knowledge base search. Conversation logging. Graceful handoffs. That’s what we’re building here.
Before we write any code, here’s how the pieces connect.
A customer message arrives. First, the agent classifies it — billing issue, product question, complaint, or something else? That classification feeds a router.
Simple questions go straight to the knowledge base. The agent finds an answer and responds. But if the query is complex, or the customer sounds frustrated, the router sends it down the escalation path. The escalation node packages everything — the message, category, sentiment, and a suggested resolution — then flags it for a human.
Every interaction gets logged, whether resolved automatically or escalated. We’ll build each piece as a separate LangGraph node.
Prerequisites
- Python version: 3.10+
- Required libraries: langgraph (0.4+), langchain-openai (0.3+), langchain-core (0.3+), python-dotenv
- Install:
pip install langgraph langchain-openai langchain-core python-dotenv - API key: OpenAI API key stored in a
.envfile asOPENAI_API_KEY(Get one here) - Prior knowledge: Basic LangGraph concepts (nodes, edges, state). If you haven’t built a graph yet, start with our LangGraph Installation and First Graph post.
- Time to complete: 35-40 minutes
Defining the Agent State
What data does the agent need to carry between nodes? The customer’s message, of course. The classification result. Sentiment. The response. Whether it escalated. That’s your state.
We’ll use TypedDict for this. Each field has a clear purpose — no vague catch-all dictionaries.
import os
from dotenv import load_dotenv
from typing import TypedDict, Literal, Annotated
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI
load_dotenv()
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
class SupportState(TypedDict):
customer_message: str
category: str
sentiment: str
sentiment_score: float
kb_results: str
response: str
escalated: bool
escalation_reason: str
log_entry: str
print("State schema defined -- 9 fields covering the full support workflow")
State schema defined -- 9 fields covering the full support workflow
Notice there’s no messages list here. We’re not building a chatbot — we’re building a single-turn support processor. Each customer message flows through the pipeline once. That keeps the state simple and the logic predictable.
Building the Classifier Node
Why would a billing question from a calm customer get a different treatment than the same question from an angry customer? Because sentiment matters as much as category.
The classifier assigns two things: a category (billing, product, technical, complaint, general) and a sentiment (positive, neutral, negative, angry). A calm billing question goes to the knowledge base. An angry billing question gets escalated — even if the KB has the answer.
The classifier calls the LLM with a structured prompt and asks for JSON output.
import json
def classify_message(state: SupportState) -> dict:
"""Classify the customer message by category and sentiment."""
prompt = f"""Analyze this customer support message.
Message: "{state['customer_message']}"
Return JSON with exactly these fields:
- category: one of [billing, product, technical, complaint, general]
- sentiment: one of [positive, neutral, negative, angry]
- sentiment_score: float from -1.0 (very negative) to 1.0 (very positive)
Return ONLY valid JSON, no extra text."""
result = llm.invoke(prompt)
parsed = json.loads(result.content)
return {
"category": parsed["category"],
"sentiment": parsed["sentiment"],
"sentiment_score": parsed["sentiment_score"],
}
print("classify_message node defined")
classify_message node defined
I kept the prompt minimal on purpose. In production, you’d add few-shot examples for each category. But for learning the architecture, a simple prompt keeps the focus on the graph structure.
Building the Knowledge Base Node
In production, this node would query a vector database — Pinecone, ChromaDB, or Weaviate. For this project, a dictionary-based KB keeps us focused on the graph architecture.
The node grabs the category from the classifier and looks up matching entries. Found a match? Return the answer. No match? Return an empty string. The router uses that empty string as an escalation trigger.
KNOWLEDGE_BASE = {
"billing": {
"double_charge": "If you were charged twice, we'll refund the duplicate within 3-5 business days. No action needed from your side.",
"refund_policy": "Refunds are processed within 5-7 business days after approval. You'll receive an email confirmation.",
"payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
},
"product": {
"shipping": "Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available for $9.99.",
"returns": "You can return any item within 30 days of delivery for a full refund. Items must be unused.",
"sizing": "Check our sizing guide at /sizing. If between sizes, we recommend going one size up.",
},
"technical": {
"login": "Try resetting your password at /reset-password. If that fails, clear your browser cache and try again.",
"app_crash": "Update to the latest version from your app store. If crashes persist, email bugs@support.com with your device model.",
},
}
def search_knowledge_base(state: SupportState) -> dict:
"""Search the knowledge base for relevant answers."""
category = state["category"]
message = state["customer_message"].lower()
if category not in KNOWLEDGE_BASE:
return {"kb_results": ""}
best_match = ""
for key, answer in KNOWLEDGE_BASE[category].items():
if any(word in message for word in key.split("_")):
best_match = answer
break
if not best_match:
entries = list(KNOWLEDGE_BASE[category].values())
best_match = entries[0] if entries else ""
return {"kb_results": best_match}
print("search_knowledge_base node defined")
search_knowledge_base node defined
Building the Response Generator Node
Raw KB answers sound robotic. “Refunds are processed within 5-7 business days” is accurate but cold. The response generator turns that into a human-sounding reply.
It adapts tone based on sentiment. An angry customer gets extra empathy. A neutral customer gets a direct, professional answer.
def generate_response(state: SupportState) -> dict:
"""Generate a customer-friendly response using KB results."""
tone_guide = {
"positive": "friendly and warm",
"neutral": "professional and clear",
"negative": "empathetic and solution-focused",
"angry": "deeply empathetic, apologetic, and action-oriented",
}
tone = tone_guide.get(state["sentiment"], "professional")
prompt = f"""Write a customer support response.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Relevant information: {state['kb_results']}
Tone: Be {tone}.
Keep the response under 3 sentences.
Address the customer's concern directly.
Do NOT make up information not in the relevant information above."""
result = llm.invoke(prompt)
return {"response": result.content}
print("generate_response node defined")
generate_response node defined
That tone mapping is deliberate. A support bot that responds to “I’m furious about this charge!” with “Thanks for reaching out!” kills trust instantly. Matching tone to sentiment is the difference between helpful and infuriating.
Building the Escalation Node
Sometimes the agent needs to step back. The escalation node packages everything a human agent needs: the message, category, sentiment, and a suggested resolution.
Why the suggested resolution? Human agents handle dozens of tickets daily. A starting point like “suggest refunding the duplicate charge” saves them time and keeps resolutions consistent.
def escalate_to_human(state: SupportState) -> dict:
"""Package the ticket for human agent review."""
prompt = f"""A customer support ticket needs human review.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})
Knowledge base results: {state['kb_results'] or 'No relevant KB entry found'}
Write a brief escalation summary (2-3 sentences) that:
1. States why this needs human attention
2. Suggests a resolution
3. Notes the customer's emotional state"""
result = llm.invoke(prompt)
return {
"escalated": True,
"escalation_reason": result.content,
"response": f"I understand your concern. I've escalated this to a specialist who will review your case. You'll hear back within 2 hours.",
}
print("escalate_to_human node defined")
escalate_to_human node defined
Building the Logger Node
No paper trail? No accountability. The logger creates a structured JSON entry capturing what happened, what the agent decided, and the outcome.
It runs at the end of every path — both the auto-response branch and the escalation branch.
from datetime import datetime
def log_interaction(state: SupportState) -> dict:
"""Create a structured log entry for the interaction."""
log = {
"timestamp": datetime.now().isoformat(),
"customer_message": state["customer_message"],
"category": state["category"],
"sentiment": state["sentiment"],
"sentiment_score": state["sentiment_score"],
"escalated": state.get("escalated", False),
"response_preview": state["response"][:100],
}
if state.get("escalated"):
log["escalation_reason"] = state.get("escalation_reason", "")
log_str = json.dumps(log, indent=2)
print(f"--- Interaction Log ---\n{log_str}")
return {"log_entry": log_str}
print("log_interaction node defined")
log_interaction node defined
Wiring the Router — Conditional Edges
Here’s the brain of the system. Should the agent respond directly, or hand off to a human?
Three factors drive the decision:
- Sentiment: Angry customers always get escalated.
- Category: Complaints go to humans regardless of tone.
- KB coverage: No knowledge base results? Escalate.
LangGraph handles this with conditional edges — a function that returns the next node name based on state.
def route_support(state: SupportState) -> str:
"""Decide whether to respond directly or escalate."""
# Angry customers always get escalated
if state["sentiment"] == "angry":
return "escalate"
# Complaints go to humans
if state["category"] == "complaint":
return "escalate"
# Very negative sentiment gets escalated
if state["sentiment_score"] < -0.5:
return "escalate"
# No KB results -- escalate
if not state.get("kb_results"):
return "escalate"
return "respond"
print("route_support router defined")
route_support router defined
I prefer rule-based routing over LLM-based decisions for this. Rules are predictable, testable, and don’t burn API credits on decisions you can codify. Save the LLM for tasks that genuinely need language understanding.
Assembling the Graph
Five nodes. Two paths. One StateGraph. Here’s how they connect.
The classifier runs first, then the KB search, then the router picks a path. One path leads to the response generator. The other leads to escalation. Both converge at the logger before ending.
# Build the graph
workflow = StateGraph(SupportState)
# Add nodes
workflow.add_node("classify", classify_message)
workflow.add_node("search_kb", search_knowledge_base)
workflow.add_node("respond", generate_response)
workflow.add_node("escalate", escalate_to_human)
workflow.add_node("log", log_interaction)
# Wire edges
workflow.add_edge(START, "classify")
workflow.add_edge("classify", "search_kb")
# Conditional routing after KB search
workflow.add_conditional_edges(
"search_kb",
route_support,
{"respond": "respond", "escalate": "escalate"},
)
workflow.add_edge("respond", "log")
workflow.add_edge("escalate", "log")
workflow.add_edge("log", END)
# Compile with checkpointer
checkpointer = MemorySaver()
support_agent = workflow.compile(checkpointer=checkpointer)
print("Support agent compiled -- 5 nodes, 2 paths, conditional routing active")
Support agent compiled -- 5 nodes, 2 paths, conditional routing active
Running the Agent — Test Scenarios
Does the routing actually work? Let’s find out. Three test cases: a simple question (should auto-respond), a furious customer (should escalate), and a formal complaint (should escalate regardless of tone).
Each invocation needs a unique thread_id. That’s how LangGraph keeps separate conversations apart in the checkpointer.
Test 1: Simple billing question (should auto-respond)
config = {"configurable": {"thread_id": "test-1"}}
result = support_agent.invoke(
{
"customer_message": "What payment methods do you accept?",
"category": "",
"sentiment": "",
"sentiment_score": 0.0,
"kb_results": "",
"response": "",
"escalated": False,
"escalation_reason": "",
"log_entry": "",
},
config=config,
)
print(f"\nCategory: {result['category']}")
print(f"Sentiment: {result['sentiment']}")
print(f"Escalated: {result['escalated']}")
print(f"Response: {result['response']}")
The agent should classify this as billing or general with neutral sentiment. It’ll find the payment methods entry and respond directly. No escalation.
Test 2: Angry customer (should escalate)
config_2 = {"configurable": {"thread_id": "test-2"}}
result_2 = support_agent.invoke(
{
"customer_message": "I was charged THREE TIMES for the same order! This is absolutely unacceptable. I want a full refund NOW and I'm filing a complaint with my bank.",
"category": "",
"sentiment": "",
"sentiment_score": 0.0,
"kb_results": "",
"response": "",
"escalated": False,
"escalation_reason": "",
"log_entry": "",
},
config=config_2,
)
print(f"\nCategory: {result_2['category']}")
print(f"Sentiment: {result_2['sentiment']}")
print(f"Escalated: {result_2['escalated']}")
print(f"Escalation reason: {result_2['escalation_reason'][:200]}")
Clearly angry. The classifier should assign sentiment: angry, which triggers immediate escalation. Category and KB results don’t matter here — the sentiment alone is enough.
Test 3: Complaint (should escalate regardless of sentiment)
config_3 = {"configurable": {"thread_id": "test-3"}}
result_3 = support_agent.invoke(
{
"customer_message": "I'd like to file a formal complaint. Your delivery service damaged my package and the item inside is broken.",
"category": "",
"sentiment": "",
"sentiment_score": 0.0,
"kb_results": "",
"response": "",
"escalated": False,
"escalation_reason": "",
"log_entry": "",
},
config=config_3,
)
print(f"\nCategory: {result_3['category']}")
print(f"Sentiment: {result_3['sentiment']}")
print(f"Escalated: {result_3['escalated']}")
print(f"Response: {result_3['response']}")
Even if this customer sounds relatively calm, the complaint category triggers escalation. That’s your routing rules doing their job.
Exercise 1: Add a Priority Level to the Router
Now that you’ve seen the routing logic, try modifying it. Add a priority field to the state and set it based on the routing decision.
{
type: 'exercise',
id: 'priority-routing',
title: 'Exercise 1: Add Priority-Based Routing',
difficulty: 'advanced',
exerciseType: 'write',
instructions: 'Modify the route_support function to also return a priority level. Add a "priority" field to SupportState. Set priority to "urgent" for angry customers, "high" for complaints and negative sentiment, and "normal" for everything else. Print the priority after routing.',
starterCode: '# Add "priority" to SupportState\n# Modify route_support to set priority\n\ndef route_support_v2(state):\n # Your escalation logic here\n # Set state["priority"] based on rules\n # Return "escalate" or "respond"\n pass\n\n# Test with an angry message\ntest_state = {\n "customer_message": "This is outrageous!",\n "sentiment": "angry",\n "sentiment_score": -0.9,\n "category": "billing",\n "kb_results": "Refund policy info",\n}\nresult = route_support_v2(test_state)\nprint(f"Route: {result}")\nprint(f"Priority: {test_state.get(\'priority\', \'not set\')}")',
testCases: [
{ id: 'tc1', input: 'test_state = {"sentiment": "angry", "sentiment_score": -0.9, "category": "billing", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'urgent', description: 'Angry customers should be urgent priority' },
{ id: 'tc2', input: 'test_state = {"sentiment": "neutral", "sentiment_score": 0.2, "category": "complaint", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'high', description: 'Complaints should be high priority' },
{ id: 'tc3', input: 'test_state = {"sentiment": "neutral", "sentiment_score": 0.5, "category": "product", "kb_results": "info"}; route_support_v2(test_state); print(test_state["priority"])', expectedOutput: 'normal', description: 'Neutral product queries should be normal priority' },
],
hints: [
'Check sentiment first (angry = urgent), then category (complaint = high), then default to normal.',
'Set state["priority"] = "urgent" at the top if sentiment == "angry", then return "escalate". For complaints, set "high". Everything else gets "normal".',
],
solution: 'def route_support_v2(state):\n if state["sentiment"] == "angry":\n state["priority"] = "urgent"\n return "escalate"\n if state["category"] == "complaint":\n state["priority"] = "high"\n return "escalate"\n if state["sentiment_score"] < -0.5:\n state["priority"] = "high"\n return "escalate"\n if not state.get("kb_results"):\n state["priority"] = "high"\n return "escalate"\n state["priority"] = "normal"\n return "respond"',
solutionExplanation: 'The priority assignment mirrors the routing logic. Angry customers get the highest priority (urgent) and always escalate. Complaints and very negative sentiment get high priority. Everything else is normal and routes to the auto-response path.',
xpReward: 20,
}
Adding Human-in-the-Loop Approval
What if escalation decisions themselves need review? Maybe a junior agent shouldn’t escalate directly — a supervisor should approve first.
LangGraph’s interrupt function handles this. It pauses the graph mid-execution, shows context to a human reviewer, and waits. The graph only continues when the reviewer sends a Command to resume.
Two changes are needed: an interrupt inside the escalation node, and Command(resume=...) to continue with the human’s decision.
from langgraph.types import interrupt, Command
def escalate_with_approval(state: SupportState) -> dict:
"""Escalate and pause for human approval."""
prompt = f"""A customer support ticket needs human review.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})
Write a 2-sentence escalation summary."""
result = llm.invoke(prompt)
escalation_summary = result.content
# Pause here -- show the human what's happening
human_decision = interrupt({
"escalation_summary": escalation_summary,
"customer_message": state["customer_message"],
"question": "Approve this escalation? (approve/reject/modify)",
})
# human_decision comes from Command(resume=...)
if human_decision.get("action") == "reject":
return {
"escalated": False,
"response": "Let me look into this further for you.",
"escalation_reason": "Rejected by reviewer",
}
return {
"escalated": True,
"escalation_reason": escalation_summary,
"response": human_decision.get(
"custom_response",
"I've escalated this to a specialist. You'll hear back within 2 hours.",
),
}
print("escalate_with_approval node defined -- uses interrupt for HITL")
escalate_with_approval node defined -- uses interrupt for HITL
Here’s the graph with the approval node wired in. The first invoke runs until it hits the interrupt and pauses. The second invoke resumes with the reviewer’s decision.
# Build graph with approval step
approval_workflow = StateGraph(SupportState)
approval_workflow.add_node("classify", classify_message)
approval_workflow.add_node("search_kb", search_knowledge_base)
approval_workflow.add_node("respond", generate_response)
approval_workflow.add_node("escalate", escalate_with_approval)
approval_workflow.add_node("log", log_interaction)
approval_workflow.add_edge(START, "classify")
approval_workflow.add_edge("classify", "search_kb")
approval_workflow.add_conditional_edges(
"search_kb",
route_support,
{"respond": "respond", "escalate": "escalate"},
)
approval_workflow.add_edge("respond", "log")
approval_workflow.add_edge("escalate", "log")
approval_workflow.add_edge("log", END)
approval_agent = approval_workflow.compile(checkpointer=MemorySaver())
print("Approval-enabled agent compiled")
Approval-enabled agent compiled
The approval workflow takes two steps. First, invoke until the interrupt fires. Second, resume with the reviewer’s decision.
config_approval = {"configurable": {"thread_id": "approval-test-1"}}
# Step 1: Invoke -- this will pause at the interrupt
first_result = approval_agent.invoke(
{
"customer_message": "I'm really frustrated. I've been waiting 3 weeks for my order and nobody responds to my emails.",
"category": "",
"sentiment": "",
"sentiment_score": 0.0,
"kb_results": "",
"response": "",
"escalated": False,
"escalation_reason": "",
"log_entry": "",
},
config=config_approval,
)
print("Graph paused at interrupt -- waiting for human decision")
# Step 2: Resume with human approval
final_result = approval_agent.invoke(
Command(resume={"action": "approve", "custom_response": "I sincerely apologize for the delay. I've personally flagged your order for priority review. Expect an update within 24 hours."}),
config=config_approval,
)
print(f"Escalated: {final_result['escalated']}")
print(f"Response: {final_result['response']}")
Exercise 2: Add a Modification Path
The current approval step has two options: approve or reject. But human reviewers often want to modify the response before sending it. Add a “modify” action to the escalation approval.
{
type: 'exercise',
id: 'modify-escalation',
title: 'Exercise 2: Add Response Modification to HITL',
difficulty: 'advanced',
exerciseType: 'write',
instructions: 'Extend the escalate_with_approval function to handle a third action: "modify". When the human sends action="modify" with a "modified_response" field, use their modified response instead of the auto-generated one. The escalation should still be marked as True.',
starterCode: '# Extend the human_decision handling\ndef handle_human_decision(human_decision, escalation_summary):\n """Process the human reviewer decision.\n \n Args:\n human_decision: dict with "action" and optional fields\n escalation_summary: the AI-generated summary\n \n Returns:\n dict with escalated, response, escalation_reason\n """\n # Handle "approve", "reject", and "modify" actions\n pass\n\n# Test\nresult = handle_human_decision(\n {"action": "modify", "modified_response": "Custom reply here"},\n "Customer upset about shipping"\n)\nprint(f"Escalated: {result[\'escalated\']}")\nprint(f"Response: {result[\'response\']}")',
testCases: [
{ id: 'tc1', input: 'r = handle_human_decision({"action": "modify", "modified_response": "Custom reply"}, "summary"); print(r["escalated"]); print(r["response"])', expectedOutput: 'True\nCustom reply', description: 'Modify action should use the custom response' },
{ id: 'tc2', input: 'r = handle_human_decision({"action": "reject"}, "summary"); print(r["escalated"])', expectedOutput: 'False', description: 'Reject should not escalate' },
],
hints: [
'Add an elif for action == "modify" that reads human_decision["modified_response"].',
'if action == "approve": return {"escalated": True, "response": default, ...}. elif action == "modify": return {"escalated": True, "response": human_decision["modified_response"], ...}. else reject.',
],
solution: 'def handle_human_decision(human_decision, escalation_summary):\n action = human_decision.get("action", "approve")\n if action == "reject":\n return {"escalated": False, "response": "Let me look into this further.", "escalation_reason": "Rejected"}\n elif action == "modify":\n return {"escalated": True, "response": human_decision["modified_response"], "escalation_reason": escalation_summary}\n else:\n return {"escalated": True, "response": "Escalated to specialist. Response within 2 hours.", "escalation_reason": escalation_summary}',
solutionExplanation: 'The modify path keeps escalated=True (the ticket still goes to a human) but uses the reviewer custom response instead of the auto-generated one. This gives human agents control over the customer-facing message while preserving the escalation workflow.',
xpReward: 20,
}
Common Mistakes and How to Fix Them
Mistake 1: Forgetting the checkpointer for HITL workflows
Wrong:
# This compiles but interrupt() will fail at runtime
agent = workflow.compile()
Why it breaks: Without a checkpointer, LangGraph can’t save the graph state when interrupt() pauses execution. The graph has nowhere to store its progress.
Correct:
agent = workflow.compile(checkpointer=MemorySaver())
Mistake 2: Reusing thread IDs across different conversations
Wrong:
# Same thread_id for different customers
config = {"configurable": {"thread_id": "main"}}
agent.invoke(message_1, config=config)
agent.invoke(message_2, config=config) # Sees state from message_1!
Why it breaks: LangGraph tracks conversations by thread_id. Reusing the same ID makes the second invocation inherit state from the first — including the previous customer’s data.
Correct:
import uuid
config_1 = {"configurable": {"thread_id": str(uuid.uuid4())}}
config_2 = {"configurable": {"thread_id": str(uuid.uuid4())}}
Mistake 3: Not handling missing state fields
Wrong:
def route_support(state):
if state["sentiment"] == "angry": # KeyError if classify didn't run
return "escalate"
Why it breaks: If a node upstream fails or doesn’t set a field, accessing it directly raises a KeyError.
Correct:
def route_support(state):
if state.get("sentiment", "") == "angry":
return "escalate"
When NOT to Use This Architecture
This agent pattern works well for structured, category-based support. But it’s not the right tool for every scenario.
Don’t use this when:
- You need multi-turn conversations. This agent handles single messages. For back-and-forth dialogue, you’d need a
messageslist in state and a conversation loop. - Your categories change weekly. A hard-coded category list requires redeployment. If categories shift often, use an LLM-based classifier without fixed options.
- You have fewer than 50 support tickets per day. The engineering overhead doesn’t pay off at small scale. A simple chain or even a prompt is sufficient.
Use this when:
- You need auditable routing decisions with clear logic.
- Human escalation is a core requirement, not an afterthought.
- You want to swap components (KB, classifier, response generator) independently.
Production Considerations
The code in this tutorial is designed for learning. Here’s what changes for production.
| Tutorial Version | Production Version |
|---|---|
MemorySaver (in-memory) |
PostgresSaver or RedisSaver (persistent) |
| Dictionary knowledge base | Vector database (Pinecone, ChromaDB, Weaviate) |
gpt-4o-mini for all nodes |
Cheaper model for classification, stronger for response generation |
| No authentication | JWT tokens, API keys for every endpoint |
| Print-based logging | Structured logging with ELK stack or CloudWatch |
| Single-turn processing | Queue-based processing with retry logic |
The biggest difference is the checkpointer. MemorySaver loses everything on restart. For any system where escalation pauses matter, you need persistence.
Summary
You built a customer support agent with five components: a classifier, a knowledge base search, a response generator, an escalation handler, and a logger. The conditional routing makes it smart — it responds to simple queries automatically but escalates complex or emotional cases to humans.
The key pattern to remember: design your graph around business decisions, not just LLM calls. The routing function, the escalation criteria, the tone mapping — these are business rules expressed as code. They’re testable, auditable, and don’t depend on LLM behavior.
Here’s a practice exercise to solidify what you’ve learned:
Practice: Add a feedback collection node
After the response is sent (but before logging), add a node that asks the customer to rate the interaction on a 1-5 scale. Store the rating in a new `feedback_score` field in the state. The logger should include this score in its output.
**Solution:**
def collect_feedback(state: SupportState) -> dict:
"""Simulate collecting customer feedback."""
# In production, this would pause for real customer input
# For this exercise, simulate with interrupt()
feedback = interrupt({
"question": "How would you rate this interaction? (1-5)",
"response_given": state["response"],
})
return {"feedback_score": feedback.get("rating", 0)}
Add it to the graph between the response/escalation nodes and the logger:
workflow.add_node("feedback", collect_feedback)
workflow.add_edge("respond", "feedback")
workflow.add_edge("escalate", "feedback")
workflow.add_edge("feedback", "log")
Complete Code
Click to expand the full script (copy-paste and run)
# Complete code from: Build a Customer Support Agent with Escalation Workflows
# Requires: pip install langgraph langchain-openai langchain-core python-dotenv
# Python 3.10+
import os
import json
import uuid
from datetime import datetime
from dotenv import load_dotenv
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import interrupt, Command
from langchain_openai import ChatOpenAI
load_dotenv()
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# --- State Definition ---
class SupportState(TypedDict):
customer_message: str
category: str
sentiment: str
sentiment_score: float
kb_results: str
response: str
escalated: bool
escalation_reason: str
log_entry: str
# --- Knowledge Base ---
KNOWLEDGE_BASE = {
"billing": {
"double_charge": "If you were charged twice, we'll refund the duplicate within 3-5 business days. No action needed from your side.",
"refund_policy": "Refunds are processed within 5-7 business days after approval. You'll receive an email confirmation.",
"payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
},
"product": {
"shipping": "Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available for $9.99.",
"returns": "You can return any item within 30 days of delivery for a full refund. Items must be unused.",
"sizing": "Check our sizing guide at /sizing. If between sizes, we recommend going one size up.",
},
"technical": {
"login": "Try resetting your password at /reset-password. If that fails, clear your browser cache and try again.",
"app_crash": "Update to the latest version from your app store. If crashes persist, email bugs@support.com with your device model.",
},
}
# --- Nodes ---
def classify_message(state: SupportState) -> dict:
prompt = f"""Analyze this customer support message.
Message: "{state['customer_message']}"
Return JSON with exactly these fields:
- category: one of [billing, product, technical, complaint, general]
- sentiment: one of [positive, neutral, negative, angry]
- sentiment_score: float from -1.0 (very negative) to 1.0 (very positive)
Return ONLY valid JSON, no extra text."""
result = llm.invoke(prompt)
parsed = json.loads(result.content)
return {
"category": parsed["category"],
"sentiment": parsed["sentiment"],
"sentiment_score": parsed["sentiment_score"],
}
def search_knowledge_base(state: SupportState) -> dict:
category = state["category"]
message = state["customer_message"].lower()
if category not in KNOWLEDGE_BASE:
return {"kb_results": ""}
best_match = ""
for key, answer in KNOWLEDGE_BASE[category].items():
if any(word in message for word in key.split("_")):
best_match = answer
break
if not best_match:
entries = list(KNOWLEDGE_BASE[category].values())
best_match = entries[0] if entries else ""
return {"kb_results": best_match}
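The underscore-splitting match is easy to miss on a first read: a KB key like `double_charge` matches if *either* "double" or "charge" appears as a substring of the message. Here is the same logic isolated with a trimmed two-entry knowledge base (sample data, for illustration only) so you can poke at it without an LLM:

```python
# Standalone version of the keyword lookup in search_knowledge_base,
# with a trimmed sample knowledge base.
SAMPLE_KB = {
    "billing": {
        "double_charge": "We'll refund the duplicate charge within 3-5 business days.",
        "payment_methods": "We accept Visa, Mastercard, PayPal, and bank transfers.",
    },
}

def lookup(category: str, message: str) -> str:
    entries = SAMPLE_KB.get(category, {})
    for key, answer in entries.items():
        # "double_charge" matches if "double" OR "charge" appears in the message
        if any(word in message.lower() for word in key.split("_")):
            return answer
    # no keyword hit: fall back to the category's first entry, if any
    return next(iter(entries.values()), "")

print(lookup("billing", "I was charged twice for my order"))
# → We'll refund the duplicate charge within 3-5 business days.
```

Substring matching is deliberately crude — "charge" also matches "charger" — which is fine for a demo KB but worth replacing with embedding-based retrieval in production.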
def generate_response(state: SupportState) -> dict:
tone_guide = {
"positive": "friendly and warm",
"neutral": "professional and clear",
"negative": "empathetic and solution-focused",
"angry": "deeply empathetic, apologetic, and action-oriented",
}
tone = tone_guide.get(state["sentiment"], "professional")
prompt = f"""Write a customer support response.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Relevant information: {state['kb_results']}
Tone: Be {tone}. Keep under 3 sentences. Address the concern directly."""
result = llm.invoke(prompt)
return {"response": result.content}
def escalate_to_human(state: SupportState) -> dict:
prompt = f"""A customer support ticket needs human review.
Customer message: "{state['customer_message']}"
Category: {state['category']}
Sentiment: {state['sentiment']} (score: {state['sentiment_score']})
KB results: {state['kb_results'] or 'No relevant KB entry found'}
Write a 2-3 sentence escalation summary."""
result = llm.invoke(prompt)
return {
"escalated": True,
"escalation_reason": result.content,
"response": "I understand your concern. I've escalated this to a specialist who will review your case. You'll hear back within 2 hours.",
}
def log_interaction(state: SupportState) -> dict:
log = {
"timestamp": datetime.now().isoformat(),
"customer_message": state["customer_message"],
"category": state["category"],
"sentiment": state["sentiment"],
"sentiment_score": state["sentiment_score"],
"escalated": state.get("escalated", False),
"response_preview": state["response"][:100],
}
if state.get("escalated"):
log["escalation_reason"] = state.get("escalation_reason", "")
log_str = json.dumps(log, indent=2)
print(f"--- Interaction Log ---\n{log_str}")
return {"log_entry": log_str}
# --- Router ---
def route_support(state: SupportState) -> str:
if state.get("sentiment", "") == "angry":
return "escalate"
if state.get("category", "") == "complaint":
return "escalate"
if state.get("sentiment_score", 0) < -0.5:
return "escalate"
if not state.get("kb_results"):
return "escalate"
return "respond"
# --- Build Graph ---
workflow = StateGraph(SupportState)
workflow.add_node("classify", classify_message)
workflow.add_node("search_kb", search_knowledge_base)
workflow.add_node("respond", generate_response)
workflow.add_node("escalate", escalate_to_human)
workflow.add_node("log", log_interaction)
workflow.add_edge(START, "classify")
workflow.add_edge("classify", "search_kb")
workflow.add_conditional_edges(
"search_kb",
route_support,
{"respond": "respond", "escalate": "escalate"},
)
workflow.add_edge("respond", "log")
workflow.add_edge("escalate", "log")
workflow.add_edge("log", END)
checkpointer = MemorySaver()
support_agent = workflow.compile(checkpointer=checkpointer)
# --- Run ---
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
result = support_agent.invoke(
{
"customer_message": "What payment methods do you accept?",
"category": "",
"sentiment": "",
"sentiment_score": 0.0,
"kb_results": "",
"response": "",
"escalated": False,
"escalation_reason": "",
"log_entry": "",
},
config=config,
)
print(f"\nCategory: {result['category']}")
print(f"Escalated: {result['escalated']}")
print(f"Response: {result['response']}")
print("\nScript completed successfully.")
Frequently Asked Questions
Can I use a different LLM instead of OpenAI?
Yes. Swap ChatOpenAI for any LangChain-compatible chat model. ChatAnthropic, ChatGroq, or ChatOllama all work. The graph structure doesn’t change — only the model initialization.
How do I add more categories to the classifier?
Add the new category to the classifier prompt and add corresponding entries to KNOWLEDGE_BASE. The router’s route_support function may also need updates if the new category has special routing rules.
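For example, a hypothetical "warranty" category would need three touches: the classifier prompt's category list, a KNOWLEDGE_BASE entry, and (optionally) a routing rule. A sketch, with illustrative names and answer text:

```python
# Existing categories shown empty for brevity; in the real script these
# hold the FAQ entries defined earlier.
KNOWLEDGE_BASE = {"billing": {}, "product": {}, "technical": {}}

# Hypothetical new category — keys follow the same underscore convention
# the keyword matcher splits on.
KNOWLEDGE_BASE["warranty"] = {
    "coverage": "All products include a 1-year limited warranty on defects.",
    "claim": "Start a warranty claim at /warranty with your order number.",
}

# The classifier prompt's category line would then read:
# - category: one of [billing, product, technical, complaint, general, warranty]
print(sorted(KNOWLEDGE_BASE))
# → ['billing', 'product', 'technical', 'warranty']
```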
What’s the difference between interrupt and interrupt_before?
The interrupt() function pauses execution inside a node and can surface data to the human reviewer. interrupt_before, passed at compile time, pauses before a listed node runs but carries no payload. For new projects, prefer interrupt() — it keeps the pause logic next to the code it affects and gives you more control.
How do I persist state across server restarts?
Replace MemorySaver() with PostgresSaver from langgraph-checkpoint-postgres. Install it with pip install langgraph-checkpoint-postgres, configure your database URL, and call setup() once to create the checkpoint tables — the rest of the graph code stays identical. Note that in current versions from_conn_string returns a context manager:
from langgraph.checkpoint.postgres import PostgresSaver

with PostgresSaver.from_conn_string("postgresql://user:pass@localhost/db") as checkpointer:
    checkpointer.setup()  # run once to create the checkpoint tables
    support_agent = workflow.compile(checkpointer=checkpointer)
Can this agent handle multiple languages?
The LLM handles language detection and response generation natively. Add a language field to the state and include “detect the language” in the classifier prompt. The response generator will reply in the detected language automatically.
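Concretely, that means one extra state field and one extra classifier output. A sketch — the field name `language` is our choice here, not something the tutorial defines:

```python
from typing import TypedDict

class SupportState(TypedDict):
    customer_message: str
    category: str
    sentiment: str
    sentiment_score: float
    language: str  # ISO code like "en" or "es", filled in by classify_message
    # ...remaining fields (kb_results, response, etc.) unchanged

# The classifier prompt would gain one line, e.g.:
# - language: ISO 639-1 code of the customer's language
print("language" in SupportState.__annotations__)
# → True
```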
References
- LangGraph documentation — StateGraph and Conditional Edges
- LangGraph documentation — Human-in-the-Loop patterns
- LangGraph documentation — Checkpointers and Persistence
- LangChain documentation — ChatOpenAI
- LangGraph Customer Support Tutorial (official)
- LangGraph documentation — interrupt and Command
- OpenAI API documentation — Chat Completions