AI Skill Report Card
Analyzing PostgreSQL Statistics
Quick Start: 15 / 15
SQL
-- Basic multi-dimensional analysis template
SELECT
    DATE_TRUNC('month', created_at) AS month,
    category,
    COUNT(*) AS total_count,
    AVG(amount) AS avg_amount,
    SUM(amount) AS total_amount
FROM transactions
WHERE created_at >= '2024-01-01'
GROUP BY DATE_TRUNC('month', created_at), category
ORDER BY month DESC, total_amount DESC;
Recommendation:
Add actual schema validation query examples in the workflow (e.g., \d table_name, information_schema queries)
Workflow: 12 / 15
Progress:
- Parse user intent (count, sum, avg, group by dimensions)
- Verify table structure exists (no fictional tables/fields)
- Generate standard SQL with proper dimensions
- Add time/grouping/sorting/pagination logic
- Execute query and analyze results
- Provide business interpretation
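The sorting and pagination step in the checklist above can be sketched as follows. This is a minimal example that reuses the illustrative transactions table from the Quick Start; the page size of 20 and the choice of page 3 are arbitrary assumptions.

```sql
-- Page 3 of category totals, 20 rows per page (OFFSET = (page - 1) * page_size).
-- The secondary sort key (category) breaks ties so that pages stay stable
-- between requests; without it, rows with equal totals may shift pages.
SELECT
    category,
    SUM(amount) AS total_amount
FROM transactions
GROUP BY category
ORDER BY total_amount DESC, category
LIMIT 20 OFFSET 40;
```

LIMIT without a deterministic ORDER BY returns an unspecified subset of rows, so the ordering should be fixed before pagination is added.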
Step-by-Step Process
- Intent Analysis
  - Identify statistical goals (COUNT, SUM, AVG, etc.)
  - Determine grouping dimensions (time, category, status)
  - Extract filtering conditions
- Schema Validation
  - Query existing table structure
  - Verify all referenced columns exist
  - Check data types for proper operations
- SQL Generation
  - Use standard PostgreSQL syntax
  - Apply proper date functions (DATE_TRUNC, EXTRACT)
  - Include appropriate JOINs if multi-table
  - Add filtering, grouping, sorting
- Result Analysis
  - Structure data interpretation
  - Identify trends and patterns
  - Provide business insights
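The Schema Validation step can be sketched with the standard information_schema views. The schema name public and the table name transactions are placeholders taken from the Quick Start template, not a confirmed schema.

```sql
-- List the columns and data types of a table before referencing them in a query.
-- 'public' and 'transactions' are placeholder names; substitute the real ones.
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'transactions'
ORDER BY ordinal_position;

-- Check that the table exists at all; the result is NULL when it does not.
SELECT to_regclass('public.transactions') AS table_ref;
```

An empty result from the first query means the table is missing or not visible to the current role. In an interactive psql session, \d transactions shows the same information.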
Recommendation:
Include error handling patterns and common PostgreSQL-specific gotchas with concrete solutions
Examples: 15 / 20
Example 1: Time-based User Activity
Input: "Summarize monthly user activity"
Output:
SQL
SELECT
    DATE_TRUNC('month', login_time) AS month,
    COUNT(DISTINCT user_id) AS active_users,
    COUNT(*) AS total_logins,
    ROUND(COUNT(*)::numeric / COUNT(DISTINCT user_id), 2) AS avg_logins_per_user
FROM user_activities
WHERE login_time >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY DATE_TRUNC('month', login_time)
ORDER BY month DESC;
Example 2: Multi-dimensional Sales Analysis
Input: "Summarize sales by region and product category"
Output:
SQL
SELECT
    region,
    product_category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_sales,
    AVG(amount) AS avg_order_value,
    RANK() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS category_rank
FROM sales_orders
WHERE order_date >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY region, product_category
ORDER BY region, total_sales DESC;
Example 3: User Trajectory Analysis
Input: "Show the trajectory of the person named 张三 in the user_tracks table"
Output:
SQL
SELECT
    track_time,
    location,
    action_type,
    LAG(location) OVER (ORDER BY track_time) AS previous_location,
    EXTRACT(EPOCH FROM (track_time - LAG(track_time) OVER (ORDER BY track_time))) / 60 AS minutes_elapsed
FROM user_tracks
WHERE user_name = '张三'
ORDER BY track_time;
Recommendation:
Provide complete business interpretation examples showing how to translate SQL results into actionable insights
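One way to move from raw totals toward an interpretable trend is to compute the month-over-month change. The sketch below assumes the illustrative transactions table (created_at, amount) from the Quick Start; the output column names are hypothetical.

```sql
-- Monthly totals with the percentage change against the previous month.
WITH monthly AS (
    SELECT
        DATE_TRUNC('month', created_at) AS month,
        SUM(amount) AS total_amount
    FROM transactions
    GROUP BY DATE_TRUNC('month', created_at)
)
SELECT
    month,
    total_amount,
    LAG(total_amount) OVER (ORDER BY month) AS prev_total,
    -- NULLIF avoids division by zero; the ::numeric cast is needed because
    -- ROUND(value, digits) is defined for numeric, not double precision.
    ROUND(
        (100.0 * (total_amount - LAG(total_amount) OVER (ORDER BY month))
         / NULLIF(LAG(total_amount) OVER (ORDER BY month), 0))::numeric,
        2
    ) AS pct_change
FROM monthly
ORDER BY month;
```

A row with a large negative pct_change can then be reported as a statement such as "sales fell against the previous month", followed by a drill-down by category. The first month has no predecessor, so its prev_total and pct_change are NULL.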
Best Practices
- Date Dimensions: Use DATE_TRUNC() for grouping by year/month/day/hour
- Window Functions: Leverage RANK(), ROW_NUMBER(), LAG() for advanced analysis
- Performance: Add appropriate indexes for GROUP BY and WHERE columns
- Null Handling: Aggregates such as SUM() and AVG() skip NULLs; use COALESCE() when NULL should count as a value such as 0
- Data Types: Cast to numeric (::numeric) before dividing integers, because integer division truncates
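The NULL-handling and casting practices above can be checked on a throwaway inline dataset. The three-row VALUES list is made up for illustration, so the query runs without any tables.

```sql
-- Three sample rows, one with a NULL amount.
SELECT
    COUNT(*) AS all_rows,                           -- 3: counts every row
    COUNT(amount) AS non_null_amounts,              -- 2: NULLs are skipped
    AVG(amount) AS avg_ignoring_nulls,              -- 15: (10 + 20) / 2
    AVG(COALESCE(amount, 0)) AS avg_nulls_as_zero,  -- 10: (10 + 20 + 0) / 3
    7 / 2 AS integer_division,                      -- 3: integer division truncates
    7::numeric / 2 AS numeric_division              -- 3.5
FROM (VALUES (10), (20), (NULL)) AS t(amount);
```

Whether a NULL should be skipped or treated as zero is a business decision; the two averages differ, so the choice should be stated alongside the result.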
Common Pitfalls
- Fictional Columns: Never assume column names - always verify schema first
- Date Formatting: Don't mix different date formats in WHERE clauses
- Aggregation Logic: Avoid COUNT(*) when you need COUNT(DISTINCT column)
- Join Conditions: Missing or incorrect JOIN conditions cause data multiplication
- Time Zones: Be explicit about timezone handling in date operations
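The time zone pitfall is easiest to see with a timestamp near a month boundary. The sketch below uses a single made-up timestamptz value; Asia/Shanghai is an arbitrary example zone.

```sql
-- 20:00 UTC on 31 January is already 04:00 on 1 February in Shanghai (UTC+8),
-- so the same event lands in different months depending on the zone used.
SELECT
    DATE_TRUNC('month', ts AT TIME ZONE 'UTC') AS month_utc,                 -- 2024-01-01 00:00:00
    DATE_TRUNC('month', ts AT TIME ZONE 'Asia/Shanghai') AS month_shanghai   -- 2024-02-01 00:00:00
FROM (VALUES (TIMESTAMPTZ '2024-01-31 20:00:00+00')) AS t(ts);
```

Without an explicit AT TIME ZONE, DATE_TRUNC on a timestamptz column uses the session's TimeZone setting, so the same report can return different monthly totals on different connections.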