AI Agent Architecture Patterns: ReAct, Planning & Memory

By Vishnu

Building AI agents that don’t hallucinate, get stuck in loops, or drain your API budget requires more than plugging an LLM into a chat interface. You need architectural patterns — proven templates for how agents think, plan, use tools, and remember context.

This guide covers the five essential patterns every developer should know when building production-ready AI agents.

TL;DR
  • ReAct Pattern: Think → Act → Observe → Repeat. Best for short, tool-driven tasks that need step-by-step reasoning.
  • Plan-and-Execute: Plan all steps first, then execute. Better for complex multi-step workflows.
  • Multi-Agent Systems: Divide work between specialized agents. Scales to enterprise complexity.
  • Tool Use with Reflection: Let agents call APIs, then verify results before proceeding.
  • Memory Systems: Working (context window) + Short-term (vector DB) + Long-term (knowledge graph).

What Is an AI Agent?

An AI agent is an LLM-powered system that can:

  1. Reason through complex problems
  2. Plan sequences of actions
  3. Use tools (APIs, databases, code execution)
  4. Observe results and adapt
  5. Remember context across interactions

Unlike simple chatbots, agents can take autonomous actions to achieve goals.

The Scenario: Your company needs to automate competitor analysis. A simple prompt won’t work — you need an agent that searches websites, extracts pricing, compares features, and generates a report. That’s where architecture patterns matter.

Pattern 1: ReAct (Reasoning + Acting)

The ReAct pattern alternates between reasoning and action. It’s the simplest effective agent architecture.

How It Works

plaintext
Thought: I need to check the weather in New York
Action: weather_api(location="New York")
Observation: {"temp": 72, "condition": "sunny"}
Thought: Now I have the weather data. The user asked about outdoor activities.
Action: search_activities(weather="sunny", location="New York")
Observation: ["Central Park", "High Line", "Brooklyn Bridge Walk"]
Final Answer: Based on sunny 72°F weather, I recommend Central Park, the High Line, or a Brooklyn Bridge walk.
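
The Action lines in the trace above use call syntax, so the agent has to turn each one into a tool name and arguments before it can run anything. A minimal sketch of one way to do that without `eval`, using the standard library's `ast` module; the `parse_action` helper is hypothetical and only handles keyword arguments with literal values:

```python
import ast
from typing import Any, Dict, Tuple


def parse_action(action: str) -> Tuple[str, Dict[str, Any]]:
    """Split 'tool(arg="value")' into the tool name and its keyword arguments."""
    call = ast.parse(action.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"Not a tool call: {action!r}")
    if call.args:
        raise ValueError("Only keyword arguments are supported")
    # literal_eval accepts only literals, so no model-written code is executed
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return call.func.id, kwargs


name, kwargs = parse_action('weather_api(location="New York")')
print(name, kwargs)
```

Anything that is not a plain `name(key=literal, ...)` call raises `ValueError`, which the agent can feed back to the model as an observation.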

Code Implementation

python
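import json
import re
from typing import Any, Callable, Dict, Optional, Tuple

class ReActAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable[..., Any]]):
        self.llm = llm_client  # any client exposing generate(prompt) -> str
        self.tools = tools
        self.max_iterations = 10
    
    def run(self, query: str) -> str:
        context = f"Query: {query}\n\n"
        
        for _ in range(self.max_iterations):
            # Generate thought and action
            prompt = self._build_prompt(context)
            response = self.llm.generate(prompt)
            
            # A final answer ends the loop
            if "Final Answer:" in response:
                return response.split("Final Answer:", 1)[1].strip()
            
            # Parse response
            thought = self._extract_field(response, "Thought") or ""
            action = self._extract_field(response, "Action")
            
            if not action:
                # Neither an action nor a final answer: return the raw response
                return response
            
            # Execute tool and record the step, including failures
            tool_name, tool_input = self._parse_action(action)
            if tool_name in self.tools:
                observation = self.tools[tool_name](**tool_input)
            else:
                observation = f"Error: Tool '{tool_name}' not found"
            context += f"Thought: {thought}\n"
            context += f"Action: {action}\n"
            context += f"Observation: {observation}\n\n"
        
        return "Max iterations reached"
    
    @staticmethod
    def _extract_field(response: str, label: str) -> Optional[str]:
        """Return the text after 'label:' on its line, or None if absent"""
        match = re.search(rf"^{label}:\s*(.+)$", response, re.MULTILINE)
        return match.group(1).strip() if match else None
    
    @staticmethod
    def _parse_action(action: str) -> Tuple[str, Dict[str, Any]]:
        """Split 'tool_name({"key": "value"})' into a name and keyword arguments"""
        match = re.match(r"(\w+)\((.*)\)$", action.strip(), re.DOTALL)
        if not match:
            return action.strip(), {}
        try:
            return match.group(1), json.loads(match.group(2) or "{}")
        except json.JSONDecodeError:
            return action.strip(), {}  # surfaces as "tool not found" in run()
    
    def _build_prompt(self, context: str) -> str:
        return f"""You are a helpful assistant. Use the following format:

Thought: [Your reasoning about what to do next]
Action: tool_name({{"param": "value"}})

When you have enough information, reply with:
Final Answer: [Your answer]

Available tools:
- weather_api(location: str)
- search_activities(weather: str, location: str)
- calculator(expression: str)

{context}
"""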

When to Use ReAct

| Use Case | Example |
|---|---|
| Single-step decisions | “Should I bring an umbrella?” |
| Sequential tool calls | Search → Filter → Summarize |
| Interactive debugging | Fix code errors step by step |
| Customer support | Diagnose issues through questioning |

Limitations

  • No backtracking: Can’t revise earlier decisions
  • Short horizon: Reliability tends to drop on long tasks (roughly 10+ steps) as the context grows and early mistakes compound
  • No parallel execution: Steps happen sequentially

Pattern 2: Plan-and-Execute

For complex tasks, plan everything first, then execute. Committing to a full plan up front reduces the risk of mid-task dead ends.

How It Works

plaintext
Plan:
1. Search for competitor pricing data
2. Extract pricing from top 3 results
3. Compare with our pricing
4. Generate analysis report

Execution:
[Execute step 1] → [Execute step 2] → [Execute step 3] → [Execute step 4]

Code Implementation

python
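import json
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

@dataclass
class Step:
    description: str
    tool: Optional[str]
    input_params: Dict[str, Any]
    output_var: str

class PlanAndExecuteAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable[..., Any]]):
        self.llm = llm_client  # any client exposing generate(prompt) -> str
        self.tools = tools
    
    def run(self, query: str) -> str:
        # Phase 1: Planning
        plan = self._create_plan(query)
        
        # Phase 2: Execution
        results = {}
        for step in plan:
            if step.tool:
                # Substitute variables from previous steps
                params = self._substitute_vars(step.input_params, results)
                results[step.output_var] = self.tools[step.tool](**params)
            else:
                # LLM reasoning step
                results[step.output_var] = self._llm_reason(step.description, results)
        
        return results.get('final_output', 'Task completed')
    
    def _create_plan(self, query: str) -> List[Step]:
        prompt = f"""Create a step-by-step plan for: {query}

Format each step as:
- Step: [description]
- Tool: [tool_name or "none"]
- Input: [params as JSON; reference an earlier output as "{{{{variable_name}}}}"]
- Output: [variable_name; use final_output for the last step]

Available tools: {list(self.tools.keys())}
"""
        response = self.llm.generate(prompt)
        return self._parse_plan(response)
    
    def _parse_plan(self, response: str) -> List[Step]:
        """Turn the '- Step / - Tool / - Input / - Output' lines into Step objects"""
        steps, fields = [], {}
        for line in response.splitlines():
            match = re.match(r"\s*-\s*(Step|Tool|Input|Output):\s*(.*)", line)
            if not match:
                continue
            fields[match.group(1)] = match.group(2).strip()
            if match.group(1) == "Output":  # an Output line closes the step
                tool = fields.get("Tool", "none").strip('"')
                steps.append(Step(
                    description=fields.get("Step", ""),
                    tool=None if tool.lower() == "none" else tool,
                    input_params=json.loads(fields.get("Input") or "{}"),
                    output_var=fields["Output"],
                ))
                fields = {}
        return steps
    
    def _llm_reason(self, description: str, results: Dict[str, Any]) -> str:
        """Ask the LLM to carry out a step that needs no tool"""
        prompt = f"Task: {description}\n\nResults so far:\n{json.dumps(results, default=str)}"
        return self.llm.generate(prompt)
    
    def _substitute_vars(self, params: Dict, results: Dict) -> Dict:
        """Replace {{var}} with actual values from previous steps"""
        resolved = {}
        for key, value in params.items():
            if isinstance(value, str) and value.startswith('{{') and value.endswith('}}'):
                var_name = value[2:-2]
                resolved[key] = results.get(var_name, value)
            else:
                resolved[key] = value
        return resolved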

When to Use Plan-and-Execute

| Use Case | Example |
|---|---|
| Multi-step workflows | Research → Draft → Review → Publish |
| Data pipelines | Extract → Transform → Load → Validate |
| Report generation | Gather data → Analyze → Visualize → Write |
| Code generation | Plan architecture → Generate files → Test |

Advantages Over ReAct

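  • Global view: The plan considers all steps upfront, so dependencies are visible before anything runs
  • Parallel execution: Steps that don’t depend on each other can run concurrently, provided the executor is built for it (the sequential loop above is not)
  • Better error recovery: A failed step can trigger replanning from that point instead of a full restart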
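
To make the parallel-execution point concrete, here is a minimal sketch of an executor that runs a plan in waves with `asyncio.gather`. The plan shape (`id` and `depends_on` keys) and the `run_step` callback are assumptions for illustration, not part of the agent above:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical plan shape: each step lists the ids of the steps it depends on.
PlanStep = Dict[str, Any]


async def run_plan(
    plan: List[PlanStep],
    run_step: Callable[[PlanStep, Dict[str, Any]], Awaitable[Any]],
) -> Dict[str, Any]:
    """Run steps in waves: every step whose dependencies are done runs concurrently."""
    results: Dict[str, Any] = {}
    pending = {step["id"]: step for step in plan}

    while pending:
        ready = [s for s in pending.values()
                 if all(dep in results for dep in s.get("depends_on", []))]
        if not ready:
            raise ValueError("Plan has a cycle or an unknown dependency")
        outputs = await asyncio.gather(*(run_step(s, results) for s in ready))
        for step, output in zip(ready, outputs):
            results[step["id"]] = output
            del pending[step["id"]]
    return results


async def demo_step(step: PlanStep, results: Dict[str, Any]) -> str:
    await asyncio.sleep(0.01)  # stands in for a tool or LLM call
    return f"{step['id']} done"


plan = [
    {"id": "search_a", "depends_on": []},
    {"id": "search_b", "depends_on": []},
    {"id": "compare", "depends_on": ["search_a", "search_b"]},
]
print(asyncio.run(run_plan(plan, demo_step)))
```

Here the two searches run in the same wave and `compare` waits for both. A failed step raises out of `gather`, which is a natural place to hook in replanning.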

Pattern 3: Multi-Agent Systems

Divide complex tasks between specialized agents. Each agent has a specific role and expertise.

Architecture Overview

plaintext
┌─────────────────────────────────────────┐
│           Orchestrator Agent            │
│    (Routes tasks, manages workflow)     │
└────────────────────┬────────────────────┘
                     │
    ┌──────────┬─────┴────┬──────────┐
    ▼          ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Research│ │ Writer │ │ Coder  │ │ Review │
│ Agent  │ │ Agent  │ │ Agent  │ │ Agent  │
└────────┘ └────────┘ └────────┘ └────────┘

Code Implementation

python
import asyncio
import json
from typing import Dict, List

class Agent:
    def __init__(self, name: str, system_prompt: str, tools: List[str], llm_client):
        self.name = name
        self.system_prompt = system_prompt
        self.tools = tools
        self.llm = llm_client  # any client exposing async generate(prompt) -> str
    
    async def execute(self, task: str, context: Dict) -> str:
        # Each agent uses ReAct or Plan-and-Execute internally
        prompt = f"{self.system_prompt}\n\nTask: {task}\nContext: {context}"
        return await self.llm.generate(prompt)

class MultiAgentSystem:
    def __init__(self, llm_client):
        self.llm = llm_client  # used by the orchestrator and the synthesizer
        self.agents: Dict[str, Agent] = {}
    
    def register_agent(self, agent: Agent):
        self.agents[agent.name] = agent
    
    async def run(self, query: str) -> str:
        # Orchestrator decides which agents to call
        plan = await self._orchestrate(query)
        
        # Steps run in the order listed; 'depends_on' is not enforced here
        results = {}
        for step in plan:
            agent_name = step['agent']
            task = step['task']
            
            if agent_name in self.agents:
                agent = self.agents[agent_name]
                results[agent_name] = await agent.execute(task, results)
        
        # Synthesize final output
        return await self._synthesize(query, results)
    
    async def _orchestrate(self, query: str) -> List[Dict]:
        """Determine which agents to use and in what order"""
        agent_list = "\n".join(
            f"- {name}: {agent.system_prompt[:100]}..."
            for name, agent in self.agents.items()
        )
        orchestrator_prompt = f"""Given this task: {query}

Available agents:
{agent_list}

Create an execution plan:
1. Which agents to use
2. What task to give each
3. Dependencies between agents

Output as JSON list with 'agent', 'task', and 'depends_on' keys.
"""
        response = await self.llm.generate(orchestrator_prompt)
        return json.loads(response)
    
    async def _synthesize(self, query: str, results: Dict[str, str]) -> str:
        """Combine the agents' outputs into one final response"""
        prompt = f"""Original task: {query}

Agent outputs:
{json.dumps(results, indent=2)}

Combine these into a single final response.
"""
        return await self.llm.generate(prompt)

# Usage
async def main(llm_client) -> str:
    system = MultiAgentSystem(llm_client)

    system.register_agent(Agent(
        name="researcher",
        system_prompt="You are a research specialist. Find accurate, up-to-date information.",
        tools=["web_search", "academic_search", "news_api"],
        llm_client=llm_client,
    ))

    system.register_agent(Agent(
        name="writer",
        system_prompt="You are a technical writer. Create clear, engaging content.",
        tools=["grammar_check", "readability_score"],
        llm_client=llm_client,
    ))

    system.register_agent(Agent(
        name="coder",
        system_prompt="You are a senior developer. Write clean, tested code.",
        tools=["code_executor", "linter", "test_runner"],
        llm_client=llm_client,
    ))

    return await system.run("Create a Python script that fetches weather data and sends email alerts")

# Top-level await is not valid in a script; pass in your own client:
# result = asyncio.run(main(llm_client))
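
The orchestrator prompt asks for `depends_on`, so it can help to validate and order the plan before running it. A minimal sketch using the standard library's `graphlib` (Python 3.9+); the `order_plan` helper is hypothetical, and it assumes one step per agent and that `depends_on` holds agent names:

```python
import json
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def order_plan(raw_plan: str, known_agents: Set[str]) -> List[Dict]:
    """Parse the orchestrator's JSON plan and return its steps in dependency order."""
    steps = json.loads(raw_plan)
    by_agent = {step["agent"]: step for step in steps}

    unknown = set(by_agent) - known_agents
    if unknown:
        raise ValueError(f"Plan uses unregistered agents: {sorted(unknown)}")

    # Map each agent to the agents whose output it needs
    graph = {name: set(step.get("depends_on", [])) for name, step in by_agent.items()}
    missing = set().union(*graph.values()) - set(by_agent)
    if missing:
        raise ValueError(f"Plan depends on agents with no step: {sorted(missing)}")

    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("Plan has circular dependencies") from exc
    return [by_agent[name] for name in order]


raw = json.dumps([
    {"agent": "writer", "task": "Draft the report", "depends_on": ["researcher"]},
    {"agent": "researcher", "task": "Collect pricing data", "depends_on": []},
])
ordered = order_plan(raw, {"researcher", "writer", "coder"})
print([step["agent"] for step in ordered])
```

Rejecting a bad plan up front is cheaper than discovering halfway through a run that an agent was waiting on output that never arrives.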

Agent Specialization Examples

| Agent Type | Responsibility | Tools |
|---|---|---|
| Research Agent | Information gathering | Search APIs, databases, web scraping |
| Analysis Agent | Data processing | Pandas, SQL, visualization |
| Code Agent | Implementation | Code execution, linters, tests |
| Review Agent | Quality assurance | Fact-checking, style guides |
| UI Agent | Interface design | Component libraries, design systems |

When to Use Multi-Agent

| Use Case | Why Multiple Agents? |
|---|---|
| Content platform | Research → Write → Edit → SEO optimize |
| DevOps automation | Monitor → Analyze → Plan → Execute |
| Customer support | Triage → Resolve → Escalate → Follow-up |
| Research assistant | Literature review → Analysis → Synthesis |

Pattern 4: Tool Use with Reflection

Don’t just call tools — verify the results before proceeding.

The Reflection Loop

plaintext
Plan → Act → Observe → Reflect → [Retry if needed] → Continue

Code Implementation

python
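import json
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, List, Optional

@dataclass
class ToolResult:
    success: bool
    data: Any = None  # default needed so failures can omit it
    error: Optional[str] = None

@dataclass
class Reflection:
    is_valid: bool
    issues: List[str] = field(default_factory=list)
    corrected_params: Optional[Dict[str, Any]] = None

    @classmethod
    def parse(cls, response: str) -> 'Reflection':
        """Build a Reflection from the LLM's JSON reply"""
        data = json.loads(response)
        return cls(
            is_valid=bool(data.get("is_valid", False)),
            issues=data.get("issues", []),
            corrected_params=data.get("corrected_params"),
        )

class ReflectiveToolAgent:
    def __init__(self, llm_client, tools: Dict[str, Callable[..., Awaitable[Any]]]):
        self.llm = llm_client  # any client exposing async generate(prompt) -> str
        self.tools = tools     # tools are async callables
        self.max_retries = 3
    
    async def use_tool(self, tool_name: str, params: Dict) -> ToolResult:
        for attempt in range(self.max_retries):
            # Execute tool
            try:
                raw_result = await self.tools[tool_name](**params)
                
                # Reflect on result
                reflection = await self._reflect(tool_name, params, raw_result)
                
                if reflection.is_valid:
                    return ToolResult(success=True, data=raw_result)
                
                # Retry with corrections, or the same params if none were suggested
                params = reflection.corrected_params or params
                
            except Exception as e:
                if attempt == self.max_retries - 1:
                    return ToolResult(success=False, error=str(e))
        
        return ToolResult(success=False, error="Max retries exceeded")
    
    async def _reflect(self, tool_name: str, params: Dict, result: Any) -> 'Reflection':
        """Analyze if tool output is valid and useful"""
        prompt = f"""Tool: {tool_name}
Input: {json.dumps(params, default=str)}
Output: {json.dumps(result, default=str)}

Evaluate:
1. Is the output valid and well-formed?
2. Does it contain the expected data?
3. Are there any errors or anomalies?

Respond with JSON only. Include "corrected_params" only if a retry is needed:
{{
    "is_valid": true,
    "issues": ["list of problems if any"],
    "corrected_params": {{"param": "value"}}
}}
"""
        response = await self.llm.generate(prompt)
        return Reflection.parse(response)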

Reflection Checks

| Check | Example |
|---|---|
| Format validation | JSON parsing, schema validation |
| Semantic validation | “Does this answer the user’s question?” |
| Error detection | Empty results, rate limits, timeouts |
| Quality assessment | “Is this search result relevant?” |
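
The format-validation and error-detection rows need no LLM call, so they can run first as a cheap filter. A minimal sketch, assuming tools return JSON objects and report failures under an `error` key (both are assumptions; `precheck` is a hypothetical helper):

```python
import json
from typing import Any, List, Sequence


def precheck(raw_output: str, required_keys: Sequence[str]) -> List[str]:
    """Return the problems found without calling the LLM; an empty list means it passed."""
    try:
        data: Any = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not data:
        return ["empty result"]
    if not isinstance(data, dict):
        return ["expected a JSON object"]

    issues = []
    missing = [key for key in required_keys if key not in data]
    if missing:
        issues.append(f"missing keys: {missing}")
    if "error" in data:
        issues.append(f"tool reported an error: {data['error']}")
    return issues


print(precheck('{"temp": 72, "condition": "sunny"}', ["temp", "condition"]))
print(precheck('{"error": "rate limit exceeded"}', ["temp"]))
```

Only outputs that pass these checks need the more expensive semantic and quality checks.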

Pattern 5: Memory Systems

Agents need to remember context across sessions and learn from past interactions.

Three Types of Memory

plaintext
┌─────────────────────────────────────────────────────┐
│                   WORKING MEMORY                    │
│           (Current conversation context)            │
│        ~128K+ tokens, depending on the model        │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                  SHORT-TERM MEMORY                  │
│       (Recent conversations, session history)       │
│        Vector DB: Pinecone, Chroma, Weaviate        │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│                  LONG-TERM MEMORY                   │
│   (User preferences, learned facts, entity graph)   │
│          Knowledge Graph + Document Store           │
└─────────────────────────────────────────────────────┘

Implementation

python
import hashlib
import json
import time
from typing import Dict, List

class AgentMemory:
    def __init__(self, vector_store, knowledge_graph, llm_client, embed_fn):
        self.vector_store = vector_store  # async vector store wrapper
        self.kg = knowledge_graph
        self.llm = llm_client             # exposes async generate(prompt) -> str
        self.embed = embed_fn             # async callable: text -> embedding vector
        self.session_id = None
    
    def start_session(self, user_id: str):
        # md5 is used only to build a compact ID, not for security
        self.session_id = hashlib.md5(f"{user_id}:{time.time()}".encode()).hexdigest()
    
    async def remember(self, content: str, memory_type: str = "short_term"):
        """Store information for later retrieval"""
        if memory_type == "short_term":
            # Vector embedding for semantic search
            embedding = await self.embed(content)
            await self.vector_store.upsert(
                ids=[f"{self.session_id}:{time.time()}"],
                embeddings=[embedding],
                metadatas=[{"content": content, "session": self.session_id}]
            )
        
        elif memory_type == "long_term":
            # Extract entities and relationships
            entities = await self._extract_entities(content)
            for entity in entities:
                await self.kg.add_entity(entity)
    
    async def recall(self, query: str, k: int = 5) -> List[str]:
        """Retrieve relevant past information"""
        # Semantic search, scoped to the current session;
        # drop the filter to search across sessions
        query_embedding = await self.embed(query)
        results = await self.vector_store.query(
            query_embeddings=[query_embedding],
            n_results=k,
            filter={"session": self.session_id}
        )
        
        return [r['content'] for r in results['metadatas'][0]]
    
    async def _extract_entities(self, text: str) -> List[Dict]:
        """Use LLM to extract entities and relationships"""
        prompt = f"""Extract entities and relationships from:
{text}

Format: JSON list of {{"entity": "name", "type": "person/place/thing", "relationships": [{{"to": "other", "type": "works_with/located_in/etc"}}]}}"""
        response = await self.llm.generate(prompt)
        return json.loads(response)

Memory Retrieval Strategies

| Strategy | Use Case |
|---|---|
| Semantic search | “Find similar past conversations” |
| Entity lookup | “What’s the user’s company?” |
| Temporal recall | “What did we discuss last week?” |
| Structured query | “List all API integrations mentioned” |
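
To show what the semantic-search row does under the hood, here is a minimal in-memory sketch that ranks stored memories by cosine similarity. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and `InMemoryStore` is a hypothetical substitute for a vector database:

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def embed(text: str) -> Vector:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return dict(Counter(re.findall(r"[a-z0-9']+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(token, 0.0) for token, value in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryStore:
    def __init__(self) -> None:
        self.items: List[Tuple[str, Vector]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> List[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = InMemoryStore()
store.add("The user's company is Acme Corp")
store.add("The user prefers Python over Java")
store.add("Deployment failed last Friday")
print(store.search("What is the user's company?", k=1))
```

A real embedding model matches on meaning rather than shared words, but the retrieval step (embed the query, score every memory, keep the top k) has the same shape.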

Choosing the Right Pattern

| Task Complexity | Recommended Pattern |
|---|---|
| Simple Q&A with tool use | ReAct |
| Multi-step workflow | Plan-and-Execute |
| Cross-functional automation | Multi-Agent |
| Critical operations (finance, health) | Tool Use + Reflection |
| Persistent user relationships | Any pattern + Memory |

Common Pitfalls

The Infinite Loop

python
# BAD: No iteration limit
while not task_complete:
    agent.step()

# GOOD: Bounded execution
for i in range(max_iterations):
    if task_complete:
        break
    agent.step()
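
An iteration cap stops a runaway agent eventually, but a stuck agent usually repeats the same action long before it hits the cap. A minimal sketch that also stops on repeated actions; `run_bounded` and its `next_action` callback are hypothetical names, and actions are compared as plain strings:

```python
from collections import Counter
from typing import Callable, Optional


def run_bounded(
    next_action: Callable[[], Optional[str]],
    max_iterations: int = 10,
    max_repeats: int = 2,
) -> str:
    """Run until the agent finishes, repeats itself too often, or hits the cap."""
    seen: Counter = Counter()
    for _ in range(max_iterations):
        action = next_action()
        if action is None:  # the agent signals completion with None
            return "done"
        seen[action] += 1
        if seen[action] > max_repeats:
            return f"stopped: {action} repeated {seen[action]} times"
    return "stopped: max iterations reached"


stuck = iter(["search"] * 10)
print(run_bounded(lambda: next(stuck)))
```

Stopping on the third identical call saves the remaining iterations, and the API cost that comes with them.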

The Context Explosion

python
# BAD: Unlimited context growth
context += f"Step {i}: {result}\n"

# GOOD: Summarize old context
if len(context) > 100000:  # length in characters, not tokens
    context = await agent.summarize(context)
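
Summarizing the whole context also discards the detail of the latest steps, which the agent usually still needs. A minimal sketch of an alternative that folds only the older steps into a summary; `compact_history` is a hypothetical helper, and `summarize` stands in for an LLM call:

```python
from typing import Callable, List


def compact_history(
    steps: List[str],
    summarize: Callable[[str], str],
    max_chars: int = 2000,
    keep_recent: int = 3,
) -> List[str]:
    """Fold older steps into one summary once the history exceeds max_chars."""
    total = sum(len(step) for step in steps)
    if total <= max_chars or len(steps) <= keep_recent:
        return steps
    old, recent = steps[:-keep_recent], steps[-keep_recent:]
    summary = summarize("\n".join(old))
    return [f"Summary of earlier steps: {summary}"] + recent


def fake_summarize(text: str) -> str:
    # Stand-in for an LLM summarization call
    return f"{len(text.splitlines())} steps omitted"


history = [f"Step {i}: " + "x" * 100 for i in range(10)]
compacted = compact_history(history, fake_summarize, max_chars=500)
print(len(compacted), compacted[0])
```

The budget here is in characters for simplicity; a token count from your model's tokenizer would be the more accurate measure.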

The Tool Overload

python
# BAD: 50 tools confuses the agent
tools = [tool1, tool2, ..., tool50]

# GOOD: Group tools by function
research_tools = [search, scrape, summarize]
code_tools = [execute, lint, test]
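
Grouping only helps if the agent is shown one group at a time. A minimal sketch that builds the tool section of a prompt from the matching group; the tool functions are stubs, and the `task_type` labels are assumptions for illustration:

```python
from typing import Callable, Dict


def search(query: str) -> str:
    """Search the web for a query."""
    return f"results for {query}"


def scrape(url: str) -> str:
    """Fetch the text of a web page."""
    return f"text of {url}"


def execute(code: str) -> str:
    """Run a code snippet in a sandbox."""
    return "ok"


def lint(code: str) -> str:
    """Check code for style problems."""
    return "no issues"


TOOL_GROUPS: Dict[str, Dict[str, Callable[..., str]]] = {
    "research": {"search": search, "scrape": scrape},
    "code": {"execute": execute, "lint": lint},
}


def tool_prompt(task_type: str) -> str:
    """List only the tools in the group that matches the task type."""
    tools = TOOL_GROUPS.get(task_type)
    if tools is None:
        raise ValueError(f"Unknown task type: {task_type!r}")
    return "\n".join(f"- {name}: {fn.__doc__}" for name, fn in tools.items())


print(tool_prompt("research"))
```

The task type could come from a cheap classification step or from the orchestrator in a multi-agent setup.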

Production Checklist

  • [ ] Set maximum iteration limits
  • [ ] Implement timeout handling
  • [ ] Add cost tracking per request
  • [ ] Log all tool calls for debugging
  • [ ] Cache frequent tool results
  • [ ] Implement graceful degradation
  • [ ] Add human-in-the-loop for critical decisions
  • [ ] Monitor hallucination rates
  • [ ] A/B test different prompts
  • [ ] Version control your agent configurations
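
Several checklist items (logging tool calls, caching frequent results, tracking time per call) can share one wrapper. A minimal sketch; `ToolTracker` is a hypothetical helper, it assumes tool parameters are hashable, and it does not log calls that raise:

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class ToolTracker:
    """Wraps tool calls to log them, time them, and cache repeated calls."""

    def __init__(self) -> None:
        self.log: List[Dict[str, Any]] = []
        self._cache: Dict[Tuple, Any] = {}

    def call(self, name: str, fn: Callable[..., Any], **params: Any) -> Any:
        key = (name, tuple(sorted(params.items())))  # params must be hashable
        cached = key in self._cache
        start = time.perf_counter()
        if not cached:
            self._cache[key] = fn(**params)
        self.log.append({
            "tool": name,
            "params": params,
            "cached": cached,
            "seconds": time.perf_counter() - start,
        })
        return self._cache[key]


def weather_api(location: str) -> dict:
    return {"temp": 72, "condition": "sunny"}  # stand-in for a real API call


tracker = ToolTracker()
tracker.call("weather_api", weather_api, location="New York")
tracker.call("weather_api", weather_api, location="New York")
print([entry["cached"] for entry in tracker.log])
```

The same log can feed cost tracking if each entry also records token usage or a per-call price.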

Summary

  • ReAct: Simple, effective for short tool-driven tasks. Think → Act → Observe.
  • Plan-and-Execute: Complex workflows need upfront planning.
  • Multi-Agent: Scale to enterprise by specializing agents.
  • Tool Use + Reflection: Verify results, don’t blindly trust.
  • Memory: Context across sessions separates toys from tools.

The best agents combine these patterns. Start with ReAct, add planning for complexity, specialize into multi-agent for scale, and always verify critical operations.