AI Skill Report Card
Monitoring Grafana Errors
Quick Start: 10 / 15
Basic monitoring workflow:
1. Navigate to Grafana → Error Dashboard
2. Filter by time range (last 1h/24h)
3. Identify error patterns/spikes
4. Create ClickUp ticket with error details
Recommendation
Add concrete input/output examples showing actual Grafana screenshots or error logs with corresponding ClickUp ticket JSON/API responses
Workflow: 13 / 15
Progress:
- Access Grafana monitoring dashboard
- Set appropriate time range for investigation
- Review error metrics and logs
- Identify root cause or pattern
- Document findings
- Create ClickUp ticket with details
- Assign priority and team member
Step-by-Step Process
1. Open Grafana Dashboard
   - Navigate to the primary error monitoring dashboard
   - Set the time range based on issue scope (1h for active issues, 24h for trends)
2. Analyze Error Data
   - Check error rate graphs for spikes
   - Review log panels for specific error messages
   - Note affected services/components
   - Capture screenshots of relevant graphs
3. Create ClickUp Ticket
   - Title: "ERROR: [Service] - [Brief Description] - [Timestamp]"
   - Priority based on impact (Critical/High/Normal)
   - Include Grafana dashboard links
   - Attach error screenshots
   - Add relevant team labels
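The title convention and priority choice in the ticket step can be sketched in a few lines of Python. This is only an illustration: the function names and the impact keys (`service_down`, `degraded`, `minor`) are hypothetical, and mapping minor impact to Normal is an assumption, since the skill only states Critical for service down and High for degraded performance.

```python
from datetime import datetime

# Hypothetical impact keys; only the Critical and High rules come from the skill.
PRIORITY_BY_IMPACT = {
    "service_down": "Critical",
    "degraded": "High",
    "minor": "Normal",  # assumption: anything less severe is Normal
}


def build_ticket_title(service: str, description: str, when: datetime) -> str:
    """Format a title as 'ERROR: [Service] - [Brief Description] - [Timestamp]'."""
    return f"ERROR: {service} - {description} - {when:%Y-%m-%d %H:%M}"


def pick_priority(impact: str) -> str:
    """Return the priority for an impact key, falling back to Normal."""
    return PRIORITY_BY_IMPACT.get(impact, "Normal")


title = build_ticket_title(
    "Payment Service", "500 Error Spike", datetime(2024, 1, 15, 14, 30)
)
print(title)  # ERROR: Payment Service - 500 Error Spike - 2024-01-15 14:30
print(pick_priority("degraded"))  # High
```

Generating the title from structured fields keeps tickets searchable by service and time.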
Recommendation
Include specific error threshold values and baseline metrics (e.g., 'normal error rate: <0.1%, alert threshold: >1%')
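One way to make such thresholds explicit is a small classifier. The sketch below uses the illustrative values from the recommendation (baseline 0.1%, alert 1%) as defaults; they are placeholders, not measured baselines, and the function names and the intermediate "elevated" band are assumptions.

```python
def error_rate_pct(errors: int, total: int) -> float:
    """Return errors as a percentage of total requests (0.0 when there is no traffic)."""
    return 0.0 if total == 0 else 100.0 * errors / total


def classify_error_rate(rate_pct: float,
                        baseline_pct: float = 0.1,
                        alert_pct: float = 1.0) -> str:
    """Classify a rate as 'normal', 'elevated', or 'alert'.

    The default thresholds are example values only; replace them with
    baselines measured for the service being monitored.
    """
    if rate_pct > alert_pct:
        return "alert"
    if rate_pct >= baseline_pct:
        return "elevated"
    return "normal"


rate = error_rate_pct(errors=500, total=20_000)
print(rate, classify_error_rate(rate))  # 2.5 alert
```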
Examples: 14 / 20
Example 1:
Input: API error rate spike at 14:30, 500 errors in payment service
Output: ClickUp ticket "ERROR: Payment Service - 500 Error Spike - 2024-01-15 14:30"
- Priority: High
- Description: Payment API showing 500 errors starting 14:30
- Dashboard: [Grafana link]
- Screenshots: Error rate graph, log sample
Example 2:
Input: Database connection timeouts increasing over 2 hours
Output: ClickUp ticket "ERROR: Database - Connection Timeouts Trending Up - 2024-01-15"
- Priority: Normal
- Description: DB connection timeouts increased 300% over a 2h period
- Trend data and connection pool metrics attached
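A figure like "increased 300%" is easy to misread, so it helps to compute it the same way every time. The sketch below is a plain percent-change helper; the counts in the usage line are made up for illustration and do not come from the example above. A 300% increase means the later value is four times the earlier one.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percentage change from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline must be positive to compute a percent increase")
    return (after - before) / before * 100.0


# Hypothetical counts: 20 timeouts in the first window, 80 in the last.
print(percent_increase(20, 80))  # 300.0
```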
Recommendation
Provide ClickUp ticket templates or API integration examples for automated ticket creation rather than just manual process description
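As a rough sketch of what automated ticket creation might look like, the snippet below builds (but does not send) a request for ClickUp's v2 "create task" endpoint. The endpoint path, the integer priority scale (1 = Urgent, 2 = High, 3 = Normal), and the `Authorization` header are taken from the public ClickUp API as understood here and should be checked against the current documentation. Mapping the skill's "Critical" to ClickUp's "Urgent" is an assumption, and the list ID and token are placeholders.

```python
import json
import urllib.request

# Assumption: the skill's "Critical" corresponds to ClickUp's "Urgent" (1).
CLICKUP_PRIORITY = {"Critical": 1, "High": 2, "Normal": 3}


def build_create_task_request(list_id: str, token: str, title: str,
                              description: str, priority: str,
                              tags: list[str]) -> urllib.request.Request:
    """Build a POST request that would create a task in the given ClickUp list."""
    payload = {
        "name": title,
        "description": description,
        "priority": CLICKUP_PRIORITY[priority],
        "tags": list(tags),
    }
    return urllib.request.Request(
        url=f"https://api.clickup.com/api/v2/list/{list_id}/task",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


req = build_create_task_request(
    list_id="YOUR_LIST_ID",      # placeholder
    token="YOUR_API_TOKEN",      # placeholder
    title="ERROR: Payment Service - 500 Error Spike - 2024-01-15 14:30",
    description="Payment API showing 500 errors starting 14:30",
    priority="High",
    tags=["payments"],
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect and test without network access.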
Best Practices
- Set consistent time ranges: use 1h for immediate issues, 24h for trend analysis
- Include context: always link back to specific Grafana panels
- Use descriptive titles: include service, error type, and timestamp
- Prioritize correctly: Critical for service down, High for degraded performance
- Tag appropriately: add relevant team and component labels
- Document threshold values: note when errors exceed normal baselines
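Linking back to a specific panel with a consistent time range can be done by building the dashboard URL programmatically. The sketch below assumes Grafana's usual `/d/<uid>/<slug>` dashboard path and the `from`, `to`, and `viewPanel` query parameters; the host, dashboard UID, slug, and panel ID are hypothetical, and the exact `viewPanel` format may differ between Grafana versions.

```python
from urllib.parse import urlencode


def grafana_panel_link(base_url: str, dashboard_uid: str, slug: str,
                       panel_id: int, time_range: str = "1h") -> str:
    """Build a link to one panel over a relative time range such as '1h' or '24h'."""
    query = urlencode({
        "from": f"now-{time_range}",
        "to": "now",
        "viewPanel": panel_id,
    })
    return f"{base_url.rstrip('/')}/d/{dashboard_uid}/{slug}?{query}"


# Hypothetical host, UID, slug, and panel ID.
print(grafana_panel_link("https://grafana.example.com", "abc123", "errors", 4))
# https://grafana.example.com/d/abc123/errors?from=now-1h&to=now&viewPanel=4
```

For a ticket that must stay accurate after the incident, absolute epoch-millisecond values for `from` and `to` are a safer choice than relative ranges, since `now-1h` shifts every time the link is opened.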
Common Pitfalls
- Don't create tickets for single isolated errors - look for patterns
- Don't forget to include Grafana dashboard links in tickets
- Don't use vague titles like "Error found" - be specific about service and time
- Don't skip priority assignment - setting a priority forces proper triage
- Don't create duplicate tickets - search existing issues first
- Don't ignore related metrics - check CPU, memory, network alongside errors