LangGraph Cycles & Recursion Limits: Control Agent Loops
Learn why LangGraph agents loop, how recursion limits keep them in check, and three ways to build clean exit paths so your graphs never crash.
Your ReAct agent handles easy queries just fine. It calls a tool, gets a reply, and wraps up. But then someone asks a vague question. The LLM calls the same tool again. And again. Twenty-five rounds later, your screen shows GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition. I’ll show you why this happens and how to fix it for good.
What Are Cycles in a LangGraph Graph?
Cycle: A path in your graph that sends control back to a node it already ran. Cycles let agents loop — think, act, observe, then decide if they should keep going.
Think of a cycle as a path that circles back. Normal DAGs (directed acyclic graphs) don’t allow this. Each node fires once, data flows forward, and the graph is done. LangGraph breaks that rule on purpose.
Prerequisites
- Python version: 3.10+
- Required library: langgraph (0.4+)
- Install: pip install langgraph
- Prior knowledge: LangGraph basics (nodes, edges, state, conditional edges)
- Time to complete: 20-25 minutes
Here’s the simplest cycle you can build. The increment node bumps a counter. A routing edge checks — if the count is under 3, go back to increment. If not, head to END.
python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict

class CounterState(TypedDict):
    count: int
    message: str

def increment(state: CounterState) -> dict:
    # Bump the counter and record which step we are on
    new_count = state["count"] + 1
    return {"count": new_count, "message": f"Step {new_count}"}

def should_continue(state: CounterState) -> str:
    # Loop back until the counter reaches 3, then exit
    if state["count"] < 3:
        return "increment"
    return END

graph = StateGraph(CounterState)
graph.add_node("increment", increment)
graph.add_edge(START, "increment")
graph.add_conditional_edges("increment", should_continue)

app = graph.compile()
result = app.invoke({"count": 0, "message": ""})
print(result)
python
{'count': 3, 'message': 'Step 3'}
The node ran three times. After each pass, should_continue looked at the counter and looped back until it hit 3. That’s a cycle — a node running again through a routing edge.
KEY INSIGHT: Cycles are what make LangGraph agents tick. Without them, every node fires once and the graph stops. With them, an agent can think, act, check the result, and decide if it needs to keep going — the core loop behind every ReAct agent.
Why Do Agents Need Cycles?
You might ask: if cycles can cause problems, why use them? Because three core agent patterns won’t work without them.
The ReAct loop. Your agent picks a tool, reads the output, then chooses whether to call another tool or reply. That pick-run-read cycle keeps going until the agent has what it needs.
Step-by-step polish. One node writes a draft. A second node scores it. If the score is too low, control goes back to the writer. This loop runs until the output is good enough.
Page-by-page data pulls. An agent hits an API, checks for more pages, and loops back to grab the next batch. The cycle stops when there are no pages left.
Each of these needs a cycle. And each one can run forever if you don’t add a way out.
WARNING: Every cycle needs an exit path. If your routing edge can send control back to a past node, you must make sure it will also reach END at some point. Skip this, and your graph loops until the hard cap kicks in and throws an error.
When Should You Avoid Cycles?
Not every task calls for a loop. If your pipeline runs a fixed set of steps — say, extract, clean, load — use a straight graph with plain edges. No cycle needed.
Go with map-reduce when you’re handling items on their own. Rather than looping through a list one by one, fan out to many nodes at once and merge at the end. It’s faster and kills the risk of runaway loops.
A good rule: if you know the step count at build time, skip cycles. They shine when the loop count depends on live data — LLM choices, quality scores, or API pages.
How Does the recursion_limit Work?
LangGraph counts steps as it runs (it calls them super-steps). In a loop where one node runs at a time, each node run is one step. The recursion_limit setting caps how many steps a graph can take before it raises GraphRecursionError. The default is 25 steps.
You set it through a config dict when you call invoke() or stream(). Here are three options — the default, a loose cap, and a tight one.
python
# Default: 25 steps
result = app.invoke({"count": 0, "message": ""})

# Custom: allow up to 50 steps
result = app.invoke(
    {"count": 0, "message": ""},
    {"recursion_limit": 50}
)

# Tight limit: only 5 steps allowed
result = app.invoke(
    {"count": 0, "message": ""},
    {"recursion_limit": 5}
)
Think of this limit as a circuit breaker. It doesn’t drive your graph’s logic — it’s a safety net that kills the run if your logic fails to stop the loop on its own.
Quick check: Say your graph has a 4-node cycle and you set the cap to 25. How many full passes can it make? About 6 (25 / 4 = 6.25, cut to 6 full rounds).
The same cap works with stream() too. Each event you see still counts as a step toward the limit.
python
# Streaming also respects recursion_limit
for event in app.stream(
    {"count": 0, "message": ""},
    {"recursion_limit": 30}
):
    print(event)
TIP: Always set this limit by hand for live systems. Don’t lean on the default 25. If your agent usually needs 10 tool calls, set it to 30-40. If it’s a simple 3-step flow, use 10. A clear limit tells the next dev what “normal” looks like.
What Is GraphRecursionError?
When your graph blows past the cap, LangGraph throws GraphRecursionError. It’s a normal Python error you can catch with try/except.
Let’s force one on purpose. This graph has no way out — should_loop always sends control back to "loop_node".
python
from langgraph.errors import GraphRecursionError

class LoopState(TypedDict):
    value: int

def loop_node(state: LoopState) -> dict:
    return {"value": state["value"] + 1}

def should_loop(state: LoopState) -> str:
    return "loop_node"  # always loops, no exit!

bad_graph = StateGraph(LoopState)
bad_graph.add_node("loop_node", loop_node)
bad_graph.add_edge(START, "loop_node")
bad_graph.add_conditional_edges("loop_node", should_loop)

bad_app = bad_graph.compile()

try:
    bad_app.invoke({"value": 0}, {"recursion_limit": 5})
except GraphRecursionError as e:
    print(f"Caught it: {e}")
python
Caught it: Recursion limit of 5 reached without hitting a stop condition. If you are using a graph with a cycle, set a higher recursion limit.
The error says what went wrong and hints at raising the cap. But if your graph truly has no exit, a bigger cap just delays the crash. The real fix is adding logic that sends control to END.
How Do You Debug Loops That Won’t Stop?
You hit GraphRecursionError and didn’t see it coming. Where do you look? Start with these three checks. They cover the most common causes.
Step 1: Print your routing choices. Drop a print inside your routing function. You’ll see right away if it ever picks END.
python
def should_continue_debug(state: CounterState) -> str:
    decision = "increment" if state["count"] < 3 else END
    print(f"  count={state['count']}, routing to: {decision}")
    return decision
Step 2: Make sure your state gets updated. The top cause of endless loops isn’t bad routing logic. It’s a node that never changes the field the router checks. If the router reads state["count"] but no node ever bumps count, the test stays true forever.
Guess the outcome: What happens if you run this graph? The node returns {"message": "done"} but never touches count.
python
def broken_node(state: CounterState) -> dict:
    return {"message": "done"}  # oops: count never changes

def check_count(state: CounterState) -> str:
    return END if state["count"] >= 3 else "broken_node"
Answer: it loops until it crashes. The router checks count, but count stays at 0 because broken_node never updates it.
Step 3: Watch for off-by-one bugs. If your node bumps the counter and your router checks < 3, the cycle runs 3 times (count goes 0, 1, 2, 3 — exits when it equals 3). Double-check that the cutoff matches what you want.
KEY INSIGHT: Most endless loops aren’t routing bugs — they’re state update bugs. The routing logic is usually right. The node it sends control back to just never changes the value the router looks at.
How Does a State Counter Help You Exit Cleanly?
The simplest exit plan uses a counter in your state. Each time the agent loops, the counter goes up. When it hits a cap, the router sends control to END.
Here’s the setup for a real agent loop. The agent_step node tracks passes in a counter field. When it reaches the max, it writes a safe fallback message and sets done to True. The route_agent function checks that flag and either loops or exits.
python
class AgentState(TypedDict):
    query: str
    iterations: int
    max_iterations: int
    result: str
    done: bool

def agent_step(state: AgentState) -> dict:
    iteration = state["iterations"] + 1
    # Cap reached: write a fallback answer and flag the loop as finished
    if iteration >= state["max_iterations"]:
        return {
            "iterations": iteration,
            "result": "Reached max iterations. Best answer so far.",
            "done": True,
        }
    return {
        "iterations": iteration,
        "result": f"Working... (iteration {iteration})",
        "done": False,
    }

def route_agent(state: AgentState) -> str:
    if state["done"]:
        return END
    return "agent_step"
Wire it up and run with a 3-pass cap.
python
agent_graph = StateGraph(AgentState)
agent_graph.add_node("agent_step", agent_step)
agent_graph.add_edge(START, "agent_step")
agent_graph.add_conditional_edges("agent_step", route_agent)

agent_app = agent_graph.compile()
result = agent_app.invoke({
    "query": "What is the capital of France?",
    "iterations": 0,
    "max_iterations": 3,
    "result": "",
    "done": False,
})
print(f"Iterations: {result['iterations']}")
print(f"Result: {result['result']}")
python
Iterations: 3
Result: Reached max iterations. Best answer so far.
No crash. The agent stopped after 3 rounds. And since max_iterations lives in the state, you can tune it per request — low caps for simple queries, higher ones for deep research.
Exercise 1: Build a Retry Loop with State Counter
Create a LangGraph graph with a node called process that increments a step counter. Add a conditional edge that loops back to process if step < 4, and routes to END otherwise. The node should set status to "done" when step reaches 4, and "processing" otherwise. Invoke the graph with step=0 and print the final state. The final step count should be 4 and the final status should be "done".
Hint: In process, check if new_step >= 4 and return status "done" if so, "processing" otherwise.
Solution:
python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict

class RetryState(TypedDict):
    step: int
    status: str

def process(state: RetryState) -> dict:
    new_step = state["step"] + 1
    status = "done" if new_step >= 4 else "processing"
    return {"step": new_step, "status": status}

def should_retry(state: RetryState) -> str:
    if state["step"] < 4:
        return "process"
    return END

graph = StateGraph(RetryState)
graph.add_node("process", process)
graph.add_edge(START, "process")
graph.add_conditional_edges("process", should_retry)

app = graph.compile()
result = app.invoke({"step": 0, "status": ""})
print(result)
The process node increments step and sets status based on whether the cap was reached. The should_retry edge checks the step count and routes back or to END. This gives you a clean, predictable loop.
How Does a Quality Gate Help You Exit?
What if the right answer isn’t “stop after N passes” but “stop when the output is good enough”?
Picture this: a graph writes text and rates it. If the score tops 0.8, it exits. If not, it loops back to rewrite. But what if the score never gets high enough? That’s where the two-door exit comes in.
The generate function fakes a steady gain — each pass adds 0.3 to the score. The route_refinement function checks two things: did we clear the bar, OR did we hit the max pass count? Two exit doors cover the happy path and the worst case.
python
class RefinementState(TypedDict):
    draft: str
    score: float
    iterations: int
    max_iterations: int

def generate(state: RefinementState) -> dict:
    iteration = state["iterations"] + 1
    # Simulated improvement each round; rounded so 0.3 * 3 prints as 0.9, not 0.8999999999999999
    score = round(0.3 * iteration, 2)
    return {
        "draft": f"Draft v{iteration}",
        "score": score,
        "iterations": iteration,
    }

def route_refinement(state: RefinementState) -> str:
    # Exit door 1: the draft is good enough
    if state["score"] >= 0.8:
        return END
    # Exit door 2: we ran out of passes
    if state["iterations"] >= state["max_iterations"]:
        return END
    return "generate"
python
refine_graph = StateGraph(RefinementState)
refine_graph.add_node("generate", generate)
refine_graph.add_edge(START, "generate")
refine_graph.add_conditional_edges("generate", route_refinement)

refine_app = refine_graph.compile()
result = refine_app.invoke({
    "draft": "",
    "score": 0.0,
    "iterations": 0,
    "max_iterations": 5,
})
print(f"Final draft: {result['draft']}")
print(f"Score: {result['score']}")
print(f"Iterations used: {result['iterations']}")
python
Final draft: Draft v3
Score: 0.9
Iterations used: 3
By round three, the score hit 0.9 (0.3 times 3). That clears the 0.8 bar, so it stopped early. The max cap never fired — but it would have if the bar had been set too high.
TIP: Always pair a quality gate with a max pass guard. A quality test on its own can loop without end if the score never climbs high enough. The hard cap catches that edge case.
How Do You Catch GraphRecursionError at Runtime?
What if your agent’s loop count is truly hard to predict? Catch the error where you call the graph and return a fallback reply. This fits well in API routes and chatbot backends where a raw crash means a 500 error to the user.
python
from langgraph.errors import GraphRecursionError

def run_agent_safely(app, inputs, limit=25):
    try:
        result = app.invoke(inputs, {"recursion_limit": limit})
        return result
    except GraphRecursionError:
        # Return a friendly fallback instead of letting the error bubble up
        return {
            "result": "I couldn't complete the task within "
                      "the step limit. Try a simpler question.",
            "error": "recursion_limit_exceeded",
        }

output = run_agent_safely(bad_app, {"value": 0}, limit=10)
print(output["result"])
python
I couldn't complete the task within the step limit. Try a simpler question.
No 500 error for your users — they get a clear note instead. I like this pattern more than letting crashes bubble up. It keeps you in charge of what people see.
Exercise 2: Add a Max-Iteration Guard to a Quality Loop
Build a refinement graph where the quality threshold is 0.95, which the simulated 0.3-per-iteration scores do not reach in the first three passes. Set max_iterations to 4. Run the graph and verify it exits cleanly after 4 iterations, producing Draft v4.
Hint: In route_refinement, check if score >= 0.95 OR iterations >= max_iterations. If either is true, return END.
Solution:
python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict

class RefinementState(TypedDict):
    draft: str
    score: float
    iterations: int
    max_iterations: int

def generate(state: RefinementState) -> dict:
    iteration = state["iterations"] + 1
    score = 0.3 * iteration
    return {"draft": f"Draft v{iteration}", "score": score, "iterations": iteration}

def route_refinement(state: RefinementState) -> str:
    if state["score"] >= 0.95:
        return END
    if state["iterations"] >= state["max_iterations"]:
        return END
    return "generate"

refine_graph = StateGraph(RefinementState)
refine_graph.add_node("generate", generate)
refine_graph.add_edge(START, "generate")
refine_graph.add_conditional_edges("generate", route_refinement)

refine_app = refine_graph.compile()
result = refine_app.invoke({"draft": "", "score": 0.0, "iterations": 0, "max_iterations": 4})
print(f"Iterations: {result['iterations']}")
print(f"Score: {result['score']}")
Scores for passes 1 to 3 are roughly 0.3, 0.6, and 0.9, all below 0.95, so the graph keeps looping. On pass 4 the score reaches 1.2 and iterations equals max_iterations, so the router returns END. To watch the max_iterations guard work on its own, raise the threshold above 1.2 (for example, 2.0): the graph still stops at exactly 4 iterations. That is why you always pair a quality check with a cap.
Which Cycles Are Safe and Which Are Risky?
Not all loops carry the same danger. Here’s a quick guide for sizing up yours.
| Cycle Type | Example | Risk | Why |
|---|---|---|---|
| Counter-capped | Retry up to 3 times | Low | A hard cap forces the exit |
| Quality gate + cap | Polish until score > 0.8, max 5 | Low | Two exit paths handle all cases |
| LLM-driven | Agent calls tools until it wants to stop | Medium | The LLM might never want to stop |
| No exit | Node always sends control back to itself | Very high | Crash is certain |
| Hidden multi-node loop | A goes to B goes to C goes back to A | High | Easy to miss in big graphs |
The rule of thumb: if a person wrote the exit logic with a hard cap, you’re safe. If the LLM picks when to stop and there’s no backup cap, you’re rolling dice.
WARNING: LLM-driven loops need a backup cap. LLMs sometimes call the same tool over and over with small tweaks, hoping for a new answer. Always add a max_iterations check next to the LLM’s own choice to block runaway loops.
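A router for an LLM-driven loop might combine the LLM's choice, the backup cap, and a repeated-call check. The sketch below is plain Python so it runs without a model. The state fields wants_tool and tool_calls, the "tools" node name, and the three-identical-calls rule are all assumptions made for illustration, not part of LangGraph.

```python
# Same string value as langgraph.graph.END, redefined here so the sketch runs on its own
END = "__end__"


def route_after_llm(state: dict) -> str:
    """Pick the next step after the LLM's turn: run tools again or stop."""
    # Hypothetical field: list of (tool_name, args_json) tuples, newest last
    calls = state["tool_calls"]

    # Exit 1: the LLM chose not to call a tool on its latest turn
    if not state["wants_tool"]:
        return END
    # Exit 2: backup cap, regardless of what the LLM wants
    if state["iterations"] >= state["max_iterations"]:
        return END
    # Exit 3: the last three calls are identical, so the agent is likely stuck
    if len(calls) >= 3 and len(set(calls[-3:])) == 1:
        return END
    return "tools"


stuck = {
    "wants_tool": True,
    "iterations": 3,
    "max_iterations": 10,
    "tool_calls": [("search", '{"q": "weather"}')] * 3,
}
making_progress = {
    "wants_tool": True,
    "iterations": 2,
    "max_iterations": 10,
    "tool_calls": [("search", '{"q": "weather"}'), ("search", '{"q": "weather paris"}')],
}
print(route_after_llm(stuck))            # __end__
print(route_after_llm(making_progress))  # tools
```

The cap is checked before the repeat rule, so even if the repeat heuristic misses a loop with small argument tweaks, the run still ends at max_iterations.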
How Do You Build a Full Task Loop with All Three Safety Layers?
Let’s bring it all together. We’ll use a state counter, a “tasks done” check, and recursion_limit at call time — all three layers in one graph.
The agent works through a list of tasks. Each loop pass handles one. The work_on_task node grabs the first task left, marks it done, and bumps the counter. When nothing is left, it sets status to "all_done". The route_tasks function checks three things before picking the next step.
python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict

class TaskState(TypedDict):
    tasks: list[str]
    completed: list[str]
    iterations: int
    max_iterations: int
    status: str

def work_on_task(state: TaskState) -> dict:
    remaining = [t for t in state["tasks"] if t not in state["completed"]]
    if not remaining:
        return {"iterations": state["iterations"] + 1, "status": "all_done"}
    # Handle one task per pass
    current = remaining[0]
    new_completed = state["completed"] + [current]
    return {
        "completed": new_completed,
        "iterations": state["iterations"] + 1,
        "status": f"completed: {current}",
    }

def route_tasks(state: TaskState) -> str:
    if state["status"] == "all_done":
        return END
    if state["iterations"] >= state["max_iterations"]:
        return END
    if len(state["completed"]) >= len(state["tasks"]):
        return END
    return "work_on_task"
Build and run with three tasks.
python
task_graph = StateGraph(TaskState)
task_graph.add_node("work_on_task", work_on_task)
task_graph.add_edge(START, "work_on_task")
task_graph.add_conditional_edges("work_on_task", route_tasks)

task_app = task_graph.compile()
result = task_app.invoke({
    "tasks": ["fetch_data", "clean_data", "train_model"],
    "completed": [],
    "iterations": 0,
    "max_iterations": 10,
    "status": "",
}, {"recursion_limit": 15})
print(f"Completed: {result['completed']}")
print(f"Iterations: {result['iterations']}")
print(f"Status: {result['status']}")
python
Completed: ['fetch_data', 'clean_data', 'train_model']
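Iterations: 3
Status: completed: train_model
Three passes, one per task. After the third pass, the router's len(completed) >= len(tasks) check saw that every task was finished and returned END. The node never needed a fourth pass, so status was never set to "all_done". That status only appears if the node runs with nothing left to do.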
What Mistakes Do People Make with Cycles?
Mistake 1: The node doesn’t update the field the router reads
This causes more endless loops than any other bug.
Wrong:
python
def my_node(state: MyState) -> dict:
    return {"result": "done"}  # never updates 'count'

def should_stop(state: MyState) -> str:
    if state["count"] >= 3:  # count never changes!
        return END
    return "my_node"
Why it breaks: The router reads count, but the node never touches it. The test count >= 3 stays false forever.
Right:
python
def my_node(state: MyState) -> dict:
    return {"result": "done", "count": state["count"] + 1}
Mistake 2: The cap is too low for multi-node loops
Wrong:
python
# Graph has 5 nodes per cycle, needs ~15 steps
result = app.invoke(inputs, {"recursion_limit": 10})
Why it breaks: Each loop pass burns 5 steps. Two passes take 10. One more node run and you hit the wall on a perfectly fine run.
Right:
python
# 5 nodes x 3 expected passes = 15, plus a buffer
result = app.invoke(inputs, {"recursion_limit": 25})
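The arithmetic behind that number is simple enough to put in a helper. This is a rough sketch that assumes a sequential loop where one node runs per step. The function names and the two-pass buffer are illustrative choices, not LangGraph features.

```python
def full_passes(recursion_limit: int, nodes_per_pass: int) -> int:
    """Whole loop passes that fit under a limit, assuming one node runs per step."""
    return recursion_limit // nodes_per_pass


def suggested_limit(nodes_per_pass: int, expected_passes: int, buffer_passes: int = 2) -> int:
    """Expected steps plus a few spare passes. The buffer size is a judgment call."""
    return nodes_per_pass * (expected_passes + buffer_passes)


print(full_passes(10, 5))     # 2: a limit of 10 fits only two passes of a 5-node loop
print(full_passes(25, 4))     # 6: matches the quick check for a 4-node cycle
print(suggested_limit(5, 3))  # 25: 15 expected steps plus two spare passes
```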
Mistake 3: No error handling in your live code
Wrong:
python
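# handle_request is an illustrative wrapper, e.g. an API route handler
def handle_request(user_input):
    result = app.invoke(user_input)  # unhandled crash
    return {"response": result["answer"]}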
Right:
python
from langgraph.errors import GraphRecursionError

# handle_request is an illustrative wrapper, e.g. an API route handler
def handle_request(user_input):
    try:
        result = app.invoke(user_input, {"recursion_limit": 30})
        return {"response": result["answer"]}
    except GraphRecursionError:
        return {"response": "I need more steps to answer this."}
What Should You Take Away?
Cycles give LangGraph agents the power to think step by step. They drive ReAct loops, draft-and-refine flows, and page-by-page data pulls. But a cycle with no brake will crash your agent.
Guard yourself with three layers:
- State-based exits: counters or quality bars in your routing logic
- recursion_limit: a hard cap at call time as a crash guard
- GraphRecursionError handling: try/except at the call site for user-friendly fallbacks
I’d use all three in any live system. The state exit covers the normal path. The hard cap catches your bugs. The error handler makes sure users never see a raw crash.
Practice exercise: Build a research agent loop. It starts with a query, “searches” for info (faked), rates its own progress (a score that climbs by 0.2 each pass, starting at 0.1), and stops when it hits 0.85 or runs 5 times. Print the final score and pass count.
Frequently Asked Questions
What sets recursion_limit apart from a custom max_iterations counter?
recursion_limit is LangGraph’s built-in safety net. It counts every step the graph takes and throws GraphRecursionError when the cap is hit. A custom max_iterations is a field you put in your state. You own the logic, and it triggers a smooth exit through your router. Use both: max_iterations for the expected path, recursion_limit as a crash guard.
Can I lock in recursion_limit at compile time?
Yes. Call graph.compile().with_config({"recursion_limit": 50}) to set a default for all runs. You can still swap it per call by passing a config to invoke().
Does the limit count single nodes or full loop passes?
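It counts steps (LangGraph calls them super-steps), not full loop passes. In a sequential loop each node run is one step, so if your loop has 3 nodes and the cap is 25, you get about 8 full passes. Nodes that run in parallel share a single step. Keep this in mind for loops that touch many nodes.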
What happens to state when the error fires?
Without a checkpointer, the state in progress is lost. With one set up, you can pull the last saved state and look at what went on before the crash.
Do subgraphs share the parent’s limit?
Yes. Each node run inside a subgraph counts toward the parent’s recursion_limit. If the parent allows 25 and a subgraph burns 10, that leaves 15 for the rest. Plan your caps with nesting in mind.
How do I stop an LLM from calling the same tool in a loop?
Put a max_iterations counter in your state and check it in the router. Once it crosses your cap, force the return value to END no matter what the LLM wants. This is the most solid approach — you can’t rely on the LLM to stop on its own.