AI Skill Report Card
Maintaining Fabric Pipelines
```yaml
---
name: maintaining-fabric-pipelines
description: Maintains and troubleshoots Microsoft Fabric data pipelines by monitoring flows, diagnosing issues, and implementing fixes. Use when pipelines are failing, performance is degraded, or data quality issues arise.
---
```
Quick Start (14 / 15)
```python
# Check pipeline run status and errors
from notebookutils import mssparkutils

# Get recent pipeline runs
runs = mssparkutils.fabric.list_runs(pipeline_name="your_pipeline")
failed_runs = [r for r in runs if r['status'] == 'Failed']

# Get detailed error info
for run in failed_runs[:5]:  # Last 5 failures
    error_details = mssparkutils.fabric.get_run_details(run['runId'])
    print(f"Run {run['runId']}: {error_details['error']}")
```
Recommendation:
Add concrete input/output examples showing before/after states of actual pipeline configurations and error messages
Workflow (14 / 15)
Progress:
- Monitor Pipeline Health - Check run history, success rates, duration trends
- Identify Issues - Analyze failed runs, performance bottlenecks, data quality problems
- Diagnose Root Cause - Review activity logs, data lineage, resource usage
- Implement Fixes - Update pipeline logic, adjust configurations, optimize queries
- Test & Deploy - Validate fixes in dev environment, deploy to production
- Document Changes - Update runbooks, create alerts for similar issues
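The first step, monitoring pipeline health, can be sketched as a small summary over run records like those returned in Quick Start. The `status`/`duration` field names here mirror that snippet but are an assumption, not a documented Fabric schema:

```python
def summarize_runs(runs):
    """Summarize pipeline health from a list of run records.

    `runs` is assumed to be a list of dicts with 'status' and 'duration'
    (seconds) keys; adjust the field names to match your run-listing call.
    """
    total = len(runs)
    succeeded = sum(1 for r in runs if r["status"] == "Succeeded")
    durations = [r["duration"] for r in runs if r.get("duration") is not None]
    return {
        "success_rate": succeeded / total if total else 0.0,
        "avg_duration_s": sum(durations) / len(durations) if durations else 0.0,
        "failed": total - succeeded,
    }

# Example: two successes and one failure
print(summarize_runs([
    {"status": "Succeeded", "duration": 120},
    {"status": "Failed", "duration": 45},
    {"status": "Succeeded", "duration": 150},
]))
# → {'success_rate': 0.6666666666666666, 'avg_duration_s': 105.0, 'failed': 1}
```

Tracking this summary over time (e.g. per day) surfaces the duration trends and success-rate drops the workflow calls out.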
Issue Diagnosis Steps
1. Check Activity Logs
   - Navigate to Monitor > Pipeline runs
   - Click failed run → View details → Activity logs
   - Look for error patterns in timestamps
2. Analyze Data Flow
   - Verify source data availability and format
   - Check transformation logic for edge cases
   - Validate destination connectivity and permissions
3. Review Resource Usage
   - Check Spark cluster scaling and memory usage
   - Monitor concurrent pipeline execution limits
   - Verify compute capacity during peak hours
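Step 1's "look for error patterns" can be partly automated: masking variable tokens (run IDs, GUIDs, durations) makes repeated failures collapse into countable patterns. A minimal sketch, assuming the input is a plain list of error-message strings copied from the activity logs:

```python
import re
from collections import Counter

def top_error_patterns(log_entries, n=3):
    """Collapse activity-log error messages into coarse patterns.

    `log_entries` is assumed to be a list of error-message strings pulled
    from Monitor > Pipeline runs > Activity logs. GUID-like tokens and
    numbers are masked so repeated failures group together.
    """
    masked = [re.sub(r"[0-9a-fA-F-]{8,}|\d+", "<id>", msg) for msg in log_entries]
    return Counter(masked).most_common(n)

errors = [
    "Copy activity timed out after 30 seconds",
    "Copy activity timed out after 45 seconds",
    "Destination table locked",
]
print(top_error_patterns(errors))
# → [('Copy activity timed out after <id> seconds', 2), ('Destination table locked', 1)]
```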
Recommendation:
Include specific monitoring queries or dashboard configurations that can be immediately implemented
Examples (16 / 20)
Example 1: Memory Error Fix

Input: Pipeline failing with "OutOfMemoryError" in data transformation

Output:
Update the pipeline activity settings:

```json
{
  "typeProperties": {
    "sparkConfig": {
      "spark.executor.memory": "8g",
      "spark.executor.cores": "4",
      "spark.dynamicAllocation.maxExecutors": "20"
    }
  }
}
```
Example 2: Data Quality Issue

Input: Pipeline completing but producing duplicate records

Output:
```sql
-- Add a deduplication step: keep the most recent row per key_column
-- (drop the rn helper column in a later step or list columns explicitly)
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY key_column
               ORDER BY updated_date DESC
           ) AS rn
    FROM source_table
) ranked
WHERE rn = 1;
```
Recommendation:
Provide template alert configurations and specific threshold values for proactive monitoring setup
Best Practices
- Set up proactive monitoring with alerts on pipeline failures and SLA breaches
- Use parameterized pipelines to avoid hardcoded values and improve maintainability
- Implement retry logic with exponential backoff for transient failures
- Partition large datasets by date/region to improve performance and enable parallel processing
- Version control pipeline definitions using Git integration in Fabric workspace
- Create runbooks documenting common failure scenarios and resolution steps
- Use debug mode sparingly in production; enable only when troubleshooting specific issues
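The retry-with-exponential-backoff practice can be sketched in notebook code. Fabric pipeline activities also expose built-in retry count/interval settings; this shows the same pattern for code you invoke yourself (the function names are illustrative, not a Fabric API):

```python
import random
import time

def run_with_retry(activity, max_attempts=4, base_delay=2.0):
    """Retry a callable that may hit transient failures, backing off exponentially."""
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error
            # Exponential backoff with a little jitter: ~2s, ~4s, ~8s, ...
            delay = base_delay * 2 ** (attempt - 1) * (1 + 0.1 * random.random())
            time.sleep(delay)
```

The jitter spreads out retries so concurrent runs hitting the same transient failure do not all retry at the same instant.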
Common Pitfalls
- Ignoring concurrent execution limits - Fabric has workspace-level pipeline concurrency limits
- Not monitoring data drift - Source schema changes can break downstream transformations silently
- Overlooking time zone issues - Fabric uses UTC; ensure proper timezone handling in schedules
- Insufficient error handling - Always include try-catch blocks and meaningful error messages
- Not testing with production data volumes - Performance issues often appear only at scale
- Hardcoding connection strings - Use Key Vault references for credentials and connection details
- Skipping impact analysis - Changes to shared datasets can break dependent pipelines
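One way to catch the data-drift pitfall before it silently breaks downstream transformations is a pre-flight schema comparison. A minimal sketch, assuming column-name-to-type mappings (in practice `actual` could be built from source metadata, e.g. `dict(df.dtypes)` on a Spark DataFrame):

```python
def detect_schema_drift(expected, actual):
    """Compare an expected source schema against the observed one.

    Both arguments map column name to a type string; the shape of these
    mappings is an assumption for illustration.
    """
    missing = sorted(set(expected) - set(actual))    # columns that disappeared
    added = sorted(set(actual) - set(expected))      # new, unexpected columns
    changed = sorted(                                # type changes
        col for col in set(expected) & set(actual) if expected[col] != actual[col]
    )
    return {"missing": missing, "added": added, "type_changed": changed}

drift = detect_schema_drift(
    {"id": "int", "name": "string"},
    {"id": "bigint", "name": "string", "email": "string"},
)
print(drift)  # → {'missing': [], 'added': ['email'], 'type_changed': ['id']}
```

Failing the pipeline (or firing an alert) when any of the three lists is non-empty turns silent drift into an explicit, diagnosable error.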