Building Effective AI Agents
Start with the simplest solution and add complexity only when needed:
```python
# Basic augmented LLM pattern
def enhanced_llm_call(prompt, tools=None, context=None):
    """Foundation building block for all agent patterns"""
    enhanced_prompt = f"""
    Context: {context or "No additional context"}
    Available tools: {[tool.name for tool in tools] if tools else "None"}
    Task: {prompt}

    Think step by step and use tools when helpful.
    """
    response = llm_call(enhanced_prompt, tools=tools)
    return response


# Prompt chaining example
def generate_marketing_copy(product_info, target_language):
    # Step 1: Generate copy
    copy = enhanced_llm_call(f"Create marketing copy for: {product_info}")

    # Step 2: Gate check
    if not meets_quality_criteria(copy):
        copy = enhanced_llm_call(f"Improve this copy: {copy}")

    # Step 3: Translate
    translated = enhanced_llm_call(f"Translate to {target_language}: {copy}")
    return translated
```
1. Prompt Chaining
Decompose tasks into sequential steps with quality gates.
When to use: Tasks can be cleanly broken into fixed subtasks.
```python
def document_workflow(requirements):
    # Generate outline
    outline = llm_call(f"Create outline for: {requirements}")

    # Validate outline
    if not validate_outline(outline):
        outline = llm_call(f"Fix this outline: {outline}")

    # Write document
    document = llm_call(f"Write document based on: {outline}")
    return document
```
2. Routing
Classify inputs and direct to specialized handlers.
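```python
def customer_support_router(query):
    # Constrain the reply to a fixed label set so it can be matched reliably
    classification = llm_call(
        "Classify this query as exactly one of: refund, technical, general. "
        f"Reply with the label only.\nQuery: {query}"
    )
    # Assumes llm_call returns text; normalize before comparing
    label = classification.strip().lower()

    if label == "refund":
        return handle_refund(query)
    elif label == "technical":
        return handle_technical_support(query)
    else:
        return handle_general_inquiry(query)
```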
3. Parallelization
Run multiple LLM calls simultaneously and aggregate results.
```python
import asyncio


async def code_review_voting(code):
    """Multiple reviewers vote on code quality"""
    tasks = [
        review_security(code),
        review_performance(code),
        review_style(code)
    ]
    reviews = await asyncio.gather(*tasks)
    final_verdict = aggregate_reviews(reviews)
    return final_verdict


async def content_sectioning(large_document):
    """Process document sections in parallel"""
    sections = split_document(large_document)
    tasks = [process_section(section) for section in sections]
    processed_sections = await asyncio.gather(*tasks)
    return combine_sections(processed_sections)
```
4. Orchestrator-Workers
Dynamic task delegation with synthesis.
```python
def coding_orchestrator(task_description):
    # Orchestrator plans the work
    plan = llm_call(f"""
    Analyze this coding task and create a plan:
    {task_description}

    Return one line per file to modify, describing the changes it needs.
    """)

    # Workers execute subtasks
    # (assumes llm_call returns text with one subtask per line)
    subtasks = [line.strip() for line in plan.splitlines() if line.strip()]
    results = []
    for subtask in subtasks:
        result = worker_llm_call(f"Execute: {subtask}")
        results.append(result)

    # Orchestrator synthesizes
    final_code = llm_call(f"""
    Combine these code changes into final implementation:
    {results}
    """)
    return final_code
```
5. Evaluator-Optimizer
Iterative improvement through feedback loops.
```python
import json


def literary_translation_loop(text, target_language, max_iterations=3):
    translation = llm_call(f"Translate to {target_language}: {text}")

    for i in range(max_iterations):
        # Assumes llm_call returns text and the model follows the JSON
        # instruction; the 1-10 scale is an assumption
        evaluation = json.loads(llm_call(f"""
        Evaluate this translation for accuracy and nuance:
        Original: {text}
        Translation: {translation}

        Reply with JSON only: {{"score": <1-10>, "feedback": "<specific feedback>"}}
        """))

        if evaluation['score'] >= 8:  # Good enough
            break

        translation = llm_call(f"""
        Improve this translation based on feedback:
        Translation: {translation}
        Feedback: {evaluation['feedback']}
        """)

    return translation
```
6. Autonomous Agents
Self-directing systems with tool use and environmental feedback.
```python
def autonomous_coding_agent(github_issue):
    """Agent that resolves GitHub issues independently"""
    progress = []
    max_steps = 20

    for step in range(max_steps):
        # Agent plans next action
        # (assumes llm_call returns the chosen action name as text)
        action = llm_call(f"""
        Current progress: {progress}
        GitHub issue: {github_issue}

        What should I do next? Choose from:
        - analyze_codebase
        - run_tests
        - modify_file
        - create_file
        - submit_solution
        """).strip()

        # Execute action and get feedback
        # (assumes execute_action returns an object with .success and .error)
        result = execute_action(action)
        progress.append(f"Step {step}: {action} -> {result}")

        # Check if task is complete
        if action == "submit_solution" and result.success:
            break

        # Safety check: attempt recovery and record it so the next step sees it
        if result.error:
            recovery_action = llm_call(f"How to recover from: {result.error}")
            result = execute_action(recovery_action)
            progress.append(f"Step {step} recovery: {recovery_action} -> {result}")

    return progress
```
Use this checklist to choose the right pattern:
- Can the task be solved with a single LLM call? → Use basic augmented LLM
- Can the task be broken into fixed sequential steps? → Use prompt chaining
- Are there distinct categories requiring different handling? → Use routing
- Can subtasks run in parallel, or does the task benefit from multiple attempts? → Use parallelization
- Do subtasks depend on the specific input and can't be predefined? → Use orchestrator-workers
- Can iterative feedback demonstrably improve results? → Use evaluator-optimizer
- Is the solution path unpredictable, requiring autonomous decision-making? → Use agents
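The checklist reads top to bottom, from simplest to most complex pattern, and that ordering can be sketched as a small lookup. This is only an illustration: `TaskProfile`, its field names, and `choose_pattern` are hypothetical, and falling back to agents when nothing simpler fits is an assumption about how the last question is meant to be applied.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical yes/no answers to the checklist questions."""
    single_call_suffices: bool = False
    fixed_sequential_steps: bool = False
    distinct_categories: bool = False
    parallel_or_multiple_attempts: bool = False
    input_dependent_subtasks: bool = False
    feedback_improves_results: bool = False


def choose_pattern(profile: TaskProfile) -> str:
    """Return the first (simplest) pattern whose question is answered yes."""
    checks = [
        (profile.single_call_suffices, "basic augmented LLM"),
        (profile.fixed_sequential_steps, "prompt chaining"),
        (profile.distinct_categories, "routing"),
        (profile.parallel_or_multiple_attempts, "parallelization"),
        (profile.input_dependent_subtasks, "orchestrator-workers"),
        (profile.feedback_improves_results, "evaluator-optimizer"),
    ]
    for answered_yes, pattern in checks:
        if answered_yes:
            return pattern
    # Assumption: nothing simpler fits, so the path is treated as unpredictable.
    return "agents"
```

Because the checks are ordered, a task that qualifies for several patterns gets the simplest one, which matches the advice to add complexity only when needed.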
Example 1 - Customer Support System:
Input: "I want to return my order but the website won't let me"
Output: Routes to refund specialist → Pulls order history → Processes return → Confirms with customer

Example 2 - Code Review Agent:
Input: Python function with potential security issues
Output: Parallel security, performance, and style reviews → Aggregated feedback → Improvement suggestions

Example 3 - Research Agent:
Input: "Analyze market trends for electric vehicles in 2024"
Output: Plans research approach → Gathers data from multiple sources → Synthesizes findings → Iteratively refines analysis
Tool Design:
- Provide clear, comprehensive tool documentation
- Include examples of successful tool usage
- Design tools to return structured, actionable feedback
- Test tools extensively in isolation
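As one possible shape for a tool that follows these points, the sketch below documents its arguments, includes a usage example in the docstring, and returns structured feedback instead of raising. The names `ToolResult` and `lookup_order`, the order ID format, and the in-memory `orders` dict are all hypothetical stand-ins, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    """Structured feedback the agent can act on (hypothetical shape)."""
    ok: bool
    data: Optional[dict] = None
    error: Optional[str] = None
    hint: Optional[str] = None  # what the caller could try next


def lookup_order(order_id: str, orders: dict) -> ToolResult:
    """Look up an order by ID.

    Args:
        order_id: Identifier such as "A-1001" (the format is an assumption).
        orders: Mapping of order ID to record; stands in for a real store.

    Example:
        lookup_order("A-1001", {"A-1001": {"status": "shipped"}})
        returns ToolResult(ok=True, data={"status": "shipped"})
    """
    if not order_id.strip():
        return ToolResult(ok=False, error="empty order_id",
                          hint="Ask the customer for their order number.")
    record = orders.get(order_id)
    if record is None:
        return ToolResult(ok=False, error=f"order {order_id!r} not found",
                          hint="Check the ID for typos or search by email.")
    return ToolResult(ok=True, data=record)
```

Since the tool takes its data source as an argument, it can be tested in isolation with a plain dict before an agent ever calls it.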
Error Handling:
- Implement stopping conditions (max iterations, timeouts)
- Add recovery mechanisms for common failure modes
- Log all agent decisions for debugging
- Include human oversight checkpoints for critical decisions
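These four points can be combined in a single loop wrapper. The sketch below is a minimal illustration under stated assumptions: `run_with_guardrails` and its parameters are hypothetical names, a "step" is any callable returning a string, and recovery is simplified to logging the failure and moving on to the next step.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("agent")


def run_with_guardrails(
    step: Callable[[int], str],
    is_done: Callable[[str], bool],
    max_steps: int = 10,
    timeout_s: float = 30.0,
    needs_approval: Callable[[str], bool] = lambda result: False,
    approve: Callable[[str], bool] = lambda result: True,
) -> dict:
    """Run `step` until done, out of steps, out of time, or rejected."""
    deadline = time.monotonic() + timeout_s
    history = []
    for i in range(max_steps):
        # Stopping condition: wall-clock timeout
        if time.monotonic() > deadline:
            return {"status": "timeout", "history": history}
        try:
            result = step(i)
        except Exception as exc:
            # Recovery (simplified): record the failure and try the next step
            logger.warning("step %d failed: %s", i, exc)
            history.append(f"error: {exc}")
            continue
        # Log every decision for later debugging
        logger.info("step %d -> %s", i, result)
        history.append(result)
        # Human oversight checkpoint for results flagged as critical
        if needs_approval(result) and not approve(result):
            return {"status": "rejected", "history": history}
        if is_done(result):
            return {"status": "done", "history": history}
    # Stopping condition: iteration cap
    return {"status": "max_steps", "history": history}
```

In a real system `approve` would block on a human reviewer; here it is just a callable so the control flow stays visible.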
Performance Optimization:
- Start with smaller models for simple tasks
- Use caching for repeated operations
- Implement early stopping when quality thresholds are met
- Monitor costs and set budgets
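Caching and budgeting can be sketched together with the standard library's `functools.lru_cache`. Everything else here is a hypothetical stand-in: `CostTracker`, `BudgetExceeded`, the flat cost of 40 units per call, and `cached_llm_call`, whose body returns a placeholder string instead of calling a real model.

```python
from functools import lru_cache


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the budget."""


class CostTracker:
    """Tracks spend against a fixed budget (integer units, e.g. cents)."""

    def __init__(self, budget: int):
        self.budget = budget
        self.spent = 0

    def charge(self, cost: int) -> None:
        if self.spent + cost > self.budget:
            raise BudgetExceeded(f"would exceed budget of {self.budget}")
        self.spent += cost


tracker = CostTracker(budget=100)


@lru_cache(maxsize=256)
def cached_llm_call(prompt: str) -> str:
    """Only cache misses reach this body, so repeated prompts cost nothing."""
    tracker.charge(40)  # hypothetical flat cost per call
    return f"response to: {prompt}"  # placeholder for a real model call
```

With these numbers, calling `cached_llm_call` twice with the same prompt charges the tracker once, and a third distinct prompt raises `BudgetExceeded` before any further spend. Exact-match caching only helps when prompts repeat verbatim.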
Evaluation:
- Define clear success metrics before building
- Test in sandboxed environments
- Compare against simpler baselines
- Measure both accuracy and efficiency
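A small harness makes the baseline comparison concrete. In this sketch, `evaluate`, `baseline`, and `agent` are hypothetical, the test cases are toy data, and the number of LLM calls stands in for whatever efficiency measure matters to you (tokens, latency, cost).

```python
from typing import Callable, List, Tuple


def evaluate(system: Callable[[str], Tuple[str, int]],
             cases: List[Tuple[str, str]]) -> dict:
    """Score a system on (input, expected) pairs.

    `system` returns (answer, llm_calls_used).
    """
    correct = 0
    total_calls = 0
    for question, expected in cases:
        answer, calls = system(question)
        correct += int(answer == expected)
        total_calls += calls
    return {"accuracy": correct / len(cases),
            "calls_per_case": total_calls / len(cases)}


# Toy stand-ins: a single-call baseline and a three-call "agent"
def baseline(question: str) -> Tuple[str, int]:
    return question.upper(), 1


def agent(question: str) -> Tuple[str, int]:
    return question.strip().upper(), 3


cases = [("a", "A"), (" b", "B"), ("c", "C"), ("d ", "D")]
baseline_report = evaluate(baseline, cases)
agent_report = evaluate(agent, cases)
```

On this toy data the agent is more accurate but uses three times as many calls per case, which is the trade-off the two metrics are meant to expose.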
Over-engineering: Don't use complex patterns when simple prompts work. Many tasks need only a basic augmented LLM, not a full agent.
Framework dependency: Don't let abstractions hide the underlying prompts. Understand what's happening under the hood.
Insufficient guardrails: Agents can compound errors. Always include stopping conditions and human oversight for critical paths.
Poor tool documentation: Agents are only as good as their tools. Invest heavily in clear tool interfaces and documentation.
Missing feedback loops: Agents need environmental feedback to self-correct. Don't just chain LLM calls without validation.
Ignoring costs: Agent patterns trade latency and cost for capability. Monitor usage and optimize for your specific constraints.
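To illustrate the feedback-loop pitfall above, the sketch below validates each output and feeds the validator's error back into the next prompt rather than passing unchecked text down the chain. `call_with_validation` is a hypothetical helper, `generate` stands in for any LLM call that returns text, and JSON parsing is just one example of environmental feedback.

```python
import json
from typing import Any, Callable


def call_with_validation(generate: Callable[[str], str],
                         prompt: str,
                         max_attempts: int = 3) -> Any:
    """Ask for JSON; on a parse failure, feed the error into the next prompt."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = generate(current_prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            # Environmental feedback: the parser's message goes back to the model
            current_prompt = (
                f"{prompt}\n\nYour previous output was not valid JSON "
                f"({exc.msg}). Previous output:\n{raw}\n"
                "Return only valid JSON."
            )
    # Stopping condition: give up after a bounded number of attempts
    raise ValueError(f"no valid JSON after {max_attempts} attempts")
```

The attempt cap keeps the loop from retrying forever, which also bounds its cost.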