AI Skill Report Card

Building Effective AI Agents

A- · 85 · May 30, 2026 · Source: Extension-page
15 / 15

Start with the simplest solution and add complexity only when needed:

Python
# Basic augmented LLM pattern
def enhanced_llm_call(prompt, tools=None, context=None):
    """Foundation building block for all agent patterns"""
    enhanced_prompt = f"""
    Context: {context or "No additional context"}
    Available tools: {[tool.name for tool in tools] if tools else "None"}
    Task: {prompt}

    Think step by step and use tools when helpful.
    """
    response = llm_call(enhanced_prompt, tools=tools)
    return response


# Prompt chaining example
def generate_marketing_copy(product_info, target_language):
    # Step 1: Generate copy
    copy = enhanced_llm_call(f"Create marketing copy for: {product_info}")

    # Step 2: Gate check
    if not meets_quality_criteria(copy):
        copy = enhanced_llm_call(f"Improve this copy: {copy}")

    # Step 3: Translate
    translated = enhanced_llm_call(f"Translate to {target_language}: {copy}")
    return translated
Recommendation
Add more concrete input/output examples showing specific agent responses and decision chains

1. Prompt Chaining

Decompose tasks into sequential steps with quality gates.

When to use: Tasks can be cleanly broken into fixed subtasks.

Python
def document_workflow(requirements):
    # Generate outline
    outline = llm_call(f"Create outline for: {requirements}")

    # Validate outline
    if not validate_outline(outline):
        outline = llm_call(f"Fix this outline: {outline}")

    # Write document
    document = llm_call(f"Write document based on: {outline}")
    return document
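
The workflow above calls `validate_outline` without defining it. A minimal sketch of such a gate follows, assuming the outline is plain text with one entry per line. The parameters `min_sections` and `required_keywords` are hypothetical and not part of the original skill; a real gate might instead use a second LLM call.

```python
def validate_outline(outline: str, min_sections: int = 3,
                     required_keywords: tuple[str, ...] = ()) -> bool:
    """Hypothetical gate: cheap, deterministic checks on an LLM-produced outline."""
    # Treat each non-empty line as one outline entry (an assumption about format).
    entries = [line.strip() for line in outline.splitlines() if line.strip()]
    if len(entries) < min_sections:
        return False
    text = outline.lower()
    # Every required keyword must appear somewhere in the outline.
    return all(keyword.lower() in text for keyword in required_keywords)
```

Deterministic checks like these are cheap to run before spending another model call on a repair step.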

2. Routing

Classify inputs and direct to specialized handlers.

Python
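def customer_support_router(query):
    # Constrain the model to a fixed label set, then normalize its reply
    # so stray whitespace or capitalization does not break the comparison.
    classification = llm_call(
        "Classify this query as refund, technical, or general. "
        f"Reply with the label only.\n\nQuery: {query}"
    ).strip().lower()

    if classification == "refund":
        return handle_refund(query)
    elif classification == "technical":
        return handle_technical_support(query)
    else:
        return handle_general_inquiry(query)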

3. Parallelization

Run multiple LLM calls simultaneously and aggregate results.

Python
import asyncio


async def code_review_voting(code):
    """Multiple reviewers vote on code quality"""
    tasks = [
        review_security(code),
        review_performance(code),
        review_style(code),
    ]
    reviews = await asyncio.gather(*tasks)
    final_verdict = aggregate_reviews(reviews)
    return final_verdict


async def content_sectioning(large_document):
    """Process document sections in parallel"""
    sections = split_document(large_document)
    tasks = [process_section(section) for section in sections]
    processed_sections = await asyncio.gather(*tasks)
    return combine_sections(processed_sections)
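
The voting example leaves `aggregate_reviews` undefined. One possible sketch is a majority vote, assuming each reviewer returns a dict with a `verdict` key. The `fake_reviewer` coroutine is a hypothetical stand-in for an LLM-backed reviewer, and resolving ties to "reject" is an illustrative policy choice.

```python
import asyncio
from collections import Counter


async def fake_reviewer(name: str, verdict: str) -> dict:
    """Stand-in for an LLM-backed reviewer; a real one would await an API call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"reviewer": name, "verdict": verdict}


def aggregate_reviews(reviews: list[dict]) -> str:
    """Majority vote over verdicts; a tie for first place resolves to 'reject'."""
    if not reviews:
        raise ValueError("no reviews to aggregate")
    ranked = Counter(review["verdict"] for review in reviews).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "reject"
    return ranked[0][0]


async def demo() -> str:
    # The three reviewers run concurrently; gather preserves argument order.
    reviews = await asyncio.gather(
        fake_reviewer("security", "reject"),
        fake_reviewer("performance", "approve"),
        fake_reviewer("style", "approve"),
    )
    return aggregate_reviews(list(reviews))


verdict = asyncio.run(demo())
```

For security-sensitive reviews, a stricter policy (any single "reject" blocks the change) may be more appropriate than a majority.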

4. Orchestrator-Workers

Dynamic task delegation with synthesis.

Python
import json


def coding_orchestrator(task_description):
    # Orchestrator plans the work
    plan_text = llm_call(f"""
    Analyze this coding task and create a plan:
    {task_description}

    Return a JSON list of subtasks, each naming a file to modify
    and the changes it needs.
    """)
    # Assumes the model returns valid JSON; add error handling before relying on it.
    subtasks = json.loads(plan_text)

    # Workers execute subtasks
    results = []
    for subtask in subtasks:
        result = worker_llm_call(f"Execute: {subtask}")
        results.append(result)

    # Orchestrator synthesizes
    final_code = llm_call(f"""
    Combine these code changes into final implementation:
    {results}
    """)
    return final_code

5. Evaluator-Optimizer

Iterative improvement through feedback loops.

Python
import json


def literary_translation_loop(text, target_language, max_iterations=3):
    translation = llm_call(f"Translate to {target_language}: {text}")

    for _ in range(max_iterations):
        # Assumes the evaluator replies with valid JSON in the requested shape.
        evaluation = json.loads(llm_call(f"""
        Evaluate this translation for accuracy and nuance:
        Original: {text}
        Translation: {translation}

        Reply as JSON: {{"score": <1-10>, "feedback": "<specific improvements>"}}
        """))

        if evaluation["score"] >= 8:  # Good enough
            break

        translation = llm_call(f"""
        Improve this translation based on feedback:
        Translation: {translation}
        Feedback: {evaluation["feedback"]}
        """)

    return translation

6. Autonomous Agents

Self-directing systems with tool use and environmental feedback.

Python
def autonomous_coding_agent(github_issue):
    """Agent that resolves GitHub issues independently"""
    progress = []
    max_steps = 20

    for step in range(max_steps):
        # Agent plans next action
        action = llm_call(f"""
        Current progress: {progress}
        GitHub issue: {github_issue}

        What should I do next? Choose from:
        - analyze_codebase
        - run_tests
        - modify_file
        - create_file
        - submit_solution

        Reply with the action name only.
        """).strip()

        # Execute action and get feedback
        # (execute_action is assumed to return an object with success and error fields)
        result = execute_action(action)
        progress.append(f"Step {step}: {action} -> {result}")

        # Check if task is complete
        if action == "submit_solution" and result.success:
            break

        # Safety check: attempt one recovery and record it for later steps
        if result.error:
            recovery_action = llm_call(f"How to recover from: {result.error}").strip()
            result = execute_action(recovery_action)
            progress.append(f"Step {step} (recovery): {recovery_action} -> {result}")

    return progress
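
The agent loop above relies on `execute_action`, which is not defined. A minimal sketch follows, assuming the model's reply is a bare action name. `ActionResult`, `ACTION_HANDLERS`, and the placeholder `run_tests` handler are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ActionResult:
    success: bool
    output: str = ""
    error: Optional[str] = None


def run_tests() -> ActionResult:
    # Placeholder: a real handler would shell out to the project's test runner.
    return ActionResult(success=True, output="0 failures")


# Hypothetical registry mapping action names to handlers.
ACTION_HANDLERS: dict[str, Callable[[], ActionResult]] = {
    "run_tests": run_tests,
}


def execute_action(action: str) -> ActionResult:
    """Dispatch an action name chosen by the model; never raise into the agent loop."""
    handler = ACTION_HANDLERS.get(action.strip())
    if handler is None:
        # Unknown actions come back as feedback the agent can read and correct.
        return ActionResult(success=False, error=f"unknown action: {action!r}")
    try:
        return handler()
    except Exception as exc:  # report failures as feedback instead of crashing
        return ActionResult(success=False, error=str(exc))
```

Restricting execution to a fixed registry means the model can only trigger actions that were explicitly allowed.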

Use this checklist to choose the right pattern:

Progress:

  • Can the task be solved with a single LLM call? → Use basic augmented LLM
  • Can the task be broken into fixed sequential steps? → Use prompt chaining
  • Are there distinct categories requiring different handling? → Use routing
  • Can subtasks run in parallel, or does the task benefit from multiple attempts? → Use parallelization
  • Do subtasks depend on the specific input and can't be predefined? → Use orchestrator-workers
  • Can iterative feedback demonstrably improve results? → Use evaluator-optimizer
  • Is the solution path unpredictable, requiring autonomous decision-making? → Use agents
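
The checklist can be read as an ordered decision procedure in which the first matching question wins. A sketch under that assumption follows. `TaskProfile` and its field names are hypothetical, and falling back to agents when nothing simpler matches is an interpretation of the last item, not something the checklist states.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    single_call_suffices: bool = False
    fixed_sequential_steps: bool = False
    distinct_categories: bool = False
    parallelizable: bool = False
    input_dependent_subtasks: bool = False
    feedback_improves_results: bool = False


def choose_pattern(task: TaskProfile) -> str:
    """Return the pattern for the first checklist question that matches."""
    checks = [
        (task.single_call_suffices, "basic augmented LLM"),
        (task.fixed_sequential_steps, "prompt chaining"),
        (task.distinct_categories, "routing"),
        (task.parallelizable, "parallelization"),
        (task.input_dependent_subtasks, "orchestrator-workers"),
        (task.feedback_improves_results, "evaluator-optimizer"),
    ]
    for matches, pattern in checks:
        if matches:
            return pattern
    # Nothing simpler fits: treat the solution path as unpredictable.
    return "agents"
```

Ordering the checks from simplest to most complex keeps the choice aligned with the advice to add complexity only when needed.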
Examples: 17 / 20

Example 1 - Customer Support System:
Input: "I want to return my order but the website won't let me"
Output: Routes to refund specialist → Pulls order history → Processes return → Confirms with customer

Example 2 - Code Review Agent:
Input: Python function with potential security issues
Output: Parallel security, performance, and style reviews → Aggregated feedback → Improvement suggestions

Example 3 - Research Agent:
Input: "Analyze market trends for electric vehicles in 2024"
Output: Plans research approach → Gathers data from multiple sources → Synthesizes findings → Iteratively refines analysis

Recommendation
Include actual code for the execute_action() and validation functions referenced in examples

Tool Design:

  • Provide clear, comprehensive tool documentation
  • Include examples of successful tool usage
  • Design tools to return structured, actionable feedback
  • Test tools extensively in isolation
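
As an illustration of these points, the sketch below shows a tool that documents itself with a usage example and returns structured feedback the agent can act on. `ToolResult` and `lookup_order` are hypothetical names; the in-memory `orders` dict stands in for a real data store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    ok: bool
    data: Optional[dict] = None
    error: Optional[str] = None
    hint: Optional[str] = None  # what the agent could try next


def lookup_order(order_id: str, orders: dict[str, dict]) -> ToolResult:
    """Look up an order by its ID.

    Example: lookup_order("A100", {"A100": {"status": "shipped"}})
    returns ToolResult(ok=True, data={"status": "shipped"}).
    """
    if not order_id.strip():
        return ToolResult(ok=False, error="order_id is empty",
                          hint="Ask the customer for their order number.")
    order = orders.get(order_id)
    if order is None:
        return ToolResult(ok=False, error=f"no order with id {order_id!r}",
                          hint="Check for typos or search by customer email.")
    return ToolResult(ok=True, data=order)
```

Because the tool takes its data source as an argument, it can be tested in isolation without a model or a database.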

Error Handling:

  • Implement stopping conditions (max iterations, timeouts)
  • Add recovery mechanisms for common failure modes
  • Log all agent decisions for debugging
  • Include human oversight checkpoints for critical decisions
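
Stopping conditions and decision logging can be combined in one small wrapper. The sketch below is one way to do it, using only the standard library. `run_with_limits` and `step_fn` are hypothetical names, and the timeout is only checked between steps, so it cannot interrupt a step that is already running.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("agent")


def run_with_limits(step_fn: Callable[[int], bool], max_steps: int = 20,
                    timeout_s: float = 60.0) -> str:
    """Call step_fn until it returns True, or a step or time limit is hit."""
    deadline = time.monotonic() + timeout_s
    for step in range(max_steps):
        if time.monotonic() >= deadline:
            logger.warning("stopping: timeout after %d steps", step)
            return "timeout"
        done = step_fn(step)
        logger.info("step %d finished, done=%s", step, done)
        if done:
            return "completed"
    logger.warning("stopping: reached max_steps=%d", max_steps)
    return "max_steps"
```

Returning the reason for stopping lets the caller decide whether to escalate to a human reviewer.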

Performance Optimization:

  • Start with smaller models for simple tasks
  • Use caching for repeated operations
  • Implement early stopping when quality thresholds are met
  • Monitor costs and set budgets
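
Caching and budget enforcement can be sketched together with the standard library. In the example below, `CostTracker`, `BudgetExceeded`, and the flat per-call cost are illustrative assumptions, and the function body stands in for a real model call. Caching like this only suits calls where a repeated prompt should yield the same answer.

```python
from functools import lru_cache


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the budget."""


class CostTracker:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        # Refuse the charge before spending, not after.
        if self.spent_usd + amount_usd > self.budget_usd:
            raise BudgetExceeded(
                f"budget {self.budget_usd} would be exceeded by {amount_usd}"
            )
        self.spent_usd += amount_usd


tracker = CostTracker(budget_usd=0.05)


@lru_cache(maxsize=256)
def cached_llm_call(prompt: str) -> str:
    # Only cache misses reach this body, so repeated prompts are not charged again.
    tracker.charge(0.02)  # hypothetical flat cost per call
    return f"response to: {prompt}"  # stand-in for a real model call
```

Real costs depend on token counts and model pricing, so a production tracker would charge per response instead of using a flat rate.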

Evaluation:

  • Define clear success metrics before building
  • Test in sandboxed environments
  • Compare against simpler baselines
  • Measure both accuracy and efficiency
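
Comparing against a simpler baseline can start with a very small harness. The sketch below assumes exact-match scoring on (input, expected) pairs; `evaluate`, `baseline`, and `agent` are hypothetical stand-ins, with plain string functions in place of real model calls.

```python
import time
from typing import Callable


def evaluate(system: Callable[[str], str], cases: list[tuple[str, str]]) -> dict:
    """Score a system on (input, expected) pairs; report accuracy and wall time."""
    start = time.perf_counter()
    correct = sum(1 for prompt, expected in cases if system(prompt) == expected)
    elapsed = time.perf_counter() - start
    return {"accuracy": correct / len(cases), "seconds": elapsed}


# Hypothetical stand-ins for a single-prompt baseline and a more capable agent.
def baseline(prompt: str) -> str:
    return prompt.upper()


def agent(prompt: str) -> str:
    return prompt.strip().upper()


cases = [("ok", "OK"), (" padded ", "PADDED")]
baseline_report = evaluate(baseline, cases)
agent_report = evaluate(agent, cases)
```

Running both systems on the same cases makes it visible whether the added complexity buys enough accuracy to justify its extra time and cost.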

Over-engineering: Don't use complex patterns when simple prompts work. Most tasks need basic augmented LLMs, not full agents.

Framework dependency: Don't let abstractions hide the underlying prompts. Understand what's happening under the hood.

Insufficient guardrails: Agents can compound errors. Always include stopping conditions and human oversight for critical paths.

Poor tool documentation: Agents are only as good as their tools. Invest heavily in clear tool interfaces and documentation.

Missing feedback loops: Agents need environmental feedback to self-correct. Don't just chain LLM calls without validation.

Ignoring costs: Agent patterns trade latency and cost for capability. Monitor usage and optimize for your specific constraints.

Grade A- · AI Skill Framework
Scorecard
Criteria Breakdown
Quick Start: 15/15
Workflow: 15/15
Examples: 17/20
Completeness: 10/20
Format: 15/15
Conciseness: 13/15