AI Skill Report Card
Accessing the Reddit API
Quick Start
```python
import requests

# Basic subreddit posts fetch
def get_subreddit_posts(subreddit, limit=25):
    url = f"https://www.reddit.com/r/{subreddit}/hot.json?limit={limit}"
    headers = {'User-Agent': 'YourApp/1.0 (by u/yourusername)'}
    response = requests.get(url, headers=headers)
    return response.json()

# Get top posts from r/Python
posts = get_subreddit_posts('Python', 10)
for post in posts['data']['children']:
    print(f"{post['data']['title']} - {post['data']['score']} upvotes")
```
Workflow
Progress:
- Set up authentication (if needed)
- Define User-Agent header
- Construct API endpoint URL
- Make request with rate limiting
- Parse JSON response
- Handle errors and edge cases
1. Authentication Setup
For read-only public data (no auth needed):
```python
headers = {'User-Agent': 'YourApp/1.0 (by u/yourusername)'}
```
For write operations (OAuth required):
```python
import praw

reddit = praw.Reddit(
    client_id="your_client_id",
    client_secret="your_client_secret",
    user_agent="your_user_agent"
)
```
2. Common Endpoints
- Posts: `/r/{subreddit}/{sort}.json` (sort: hot, new, top, rising)
- Comments: `/r/{subreddit}/comments/{post_id}.json`
- User: `/user/{username}.json`
- Search: `/r/{subreddit}/search.json?q={query}`
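The endpoint patterns above can be composed with a small helper. This is a minimal sketch; `build_reddit_url` is an illustrative name, not part of any library:

```python
from urllib.parse import urlencode

BASE = "https://www.reddit.com"

# Hypothetical helper: fills in the endpoint patterns listed above
# and appends query parameters in properly encoded form.
def build_reddit_url(path, **params):
    url = f"{BASE}{path}.json"
    if params:
        url += "?" + urlencode(params)
    return url

# Posts from r/Python sorted by 'top'
print(build_reddit_url("/r/Python/top", limit=5))
# Search within a subreddit
print(build_reddit_url("/r/MachineLearning/search", q="transformers", limit=10))
```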
3. Rate Limiting
```python
import time

def make_request_with_delay(url, headers, delay=2):
    response = requests.get(url, headers=headers)
    time.sleep(delay)  # Reddit allows ~60 requests/minute
    return response
```
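Instead of a fixed delay, the remaining quota reported in Reddit's rate-limit response headers can drive an adaptive pause. The header names (`x-ratelimit-remaining`, `x-ratelimit-reset`) are assumed from Reddit's documented behavior, and `adaptive_delay` is an illustrative helper, not an official API:

```python
# Sketch: derive a sleep interval from rate-limit response headers,
# falling back to a fixed delay when they are missing or malformed.
def adaptive_delay(headers, default=2.0):
    try:
        remaining = float(headers["x-ratelimit-remaining"])
        reset = float(headers["x-ratelimit-reset"])  # seconds until the window resets
    except (KeyError, ValueError, TypeError):
        return default
    if remaining <= 0:
        return reset  # quota exhausted: wait out the rest of the window
    return reset / remaining  # spread remaining requests over the window

# 30 requests left in a 60-second window -> pause ~2s between requests
print(adaptive_delay({"x-ratelimit-remaining": "30", "x-ratelimit-reset": "60"}))
```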
Examples
Example 1: Get Hot Posts
Input: Subreddit "MachineLearning", top 5 posts

```python
url = "https://www.reddit.com/r/MachineLearning/hot.json?limit=5"
response = requests.get(url, headers=headers)
data = response.json()
```
Output: JSON with post titles, scores, URLs, timestamps
Example 2: Search Within Subreddit
Input: Search for "transformers" in r/MachineLearning

```python
url = "https://www.reddit.com/r/MachineLearning/search.json?q=transformers&limit=10"
```
Output: Posts matching search term with relevance ranking
Example 3: Get Post Comments
Input: Post ID "abc123" from r/Python

```python
url = "https://www.reddit.com/r/Python/comments/abc123.json"
```
Output: Nested comment tree with replies and scores
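The nested comment tree can be flattened with a small recursive walker. The sample data below is a hypothetical fragment shaped like the second element of a `/comments/{post_id}.json` response, not real API output:

```python
# Recursively flatten Reddit's nested comment Listing into (body, score)
# pairs; uses .get() throughout since deleted comments may lack fields.
def flatten_comments(listing):
    comments = []
    if not isinstance(listing, dict):
        return comments  # 'replies' is "" when a comment has no replies
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":  # t1 = comment; skip "more" stubs
            continue
        data = child.get("data", {})
        comments.append((data.get("body", "[deleted]"), data.get("score", 0)))
        comments.extend(flatten_comments(data.get("replies")))
    return comments

# Hypothetical fragment mimicking Reddit's Listing structure
sample = {"data": {"children": [
    {"kind": "t1", "data": {"body": "Nice post", "score": 12,
        "replies": {"data": {"children": [
            {"kind": "t1", "data": {"body": "Agreed", "score": 3, "replies": ""}}
        ]}}}}
]}}
print(flatten_comments(sample))  # [('Nice post', 12), ('Agreed', 3)]
```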
Best Practices
- Always use descriptive User-Agent: Include app name, version, and contact info
- Respect rate limits: Max 60 requests/minute for unauthenticated requests
- Use PRAW library for complex operations: Handles authentication and rate limiting automatically
- Cache responses: Reddit data doesn't change rapidly
- Handle pagination: Use the `after` parameter for large result sets
- Parse timestamps: Reddit uses Unix timestamps in the `created_utc` field
```python
# Proper pagination example
def get_all_posts(subreddit, limit=100):
    posts = []
    after = None
    while len(posts) < limit:
        url = f"https://www.reddit.com/r/{subreddit}/hot.json?limit=25"
        if after:
            url += f"&after={after}"
        response = requests.get(url, headers=headers)
        data = response.json()
        batch = data['data']['children']
        if not batch:
            break
        posts.extend(batch)
        after = data['data']['after']
        time.sleep(2)
    return posts[:limit]
```
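The `created_utc` field mentioned in the best practices is seconds since the Unix epoch, in UTC. A minimal conversion sketch using the standard library:

```python
from datetime import datetime, timezone

# Convert Reddit's created_utc (Unix seconds, UTC) to a datetime
def parse_created(created_utc):
    return datetime.fromtimestamp(created_utc, tz=timezone.utc)

ts = parse_created(1700000000.0)
print(ts.isoformat())  # 2023-11-14T22:13:20+00:00
```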
Common Pitfalls
- Missing User-Agent: Reddit blocks requests without proper User-Agent headers
- Rate limit violations: Sending requests too quickly results in 429 errors
- Ignoring authentication scope: Some endpoints require OAuth even for reading
- Not handling deleted content: Posts/comments may be `[deleted]` or `[removed]`
- Assuming data structure: Reddit's JSON structure can vary; always check for key existence
- Using wrong sort parameters: Valid sorts are hot, new, top, rising (not popular, trending, etc.)
- Forgetting URL encoding: Special characters in search queries need proper encoding
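The URL-encoding pitfall is covered by the standard library. A minimal sketch using `urllib.parse.quote_plus` (the `restrict_sr` parameter limits search results to the given subreddit):

```python
from urllib.parse import quote_plus

# Encode the query before placing it in the URL; spaces, '&', '#'
# and similar characters would otherwise corrupt the query string.
query = "attention is all you need"
url = f"https://www.reddit.com/r/MachineLearning/search.json?q={quote_plus(query)}&restrict_sr=1"
print(url)
```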
```python
# Error handling example
def safe_get_posts(subreddit):
    url = f"https://www.reddit.com/r/{subreddit}/hot.json"
    try:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        data = response.json()
        if 'error' in data:
            print(f"Reddit API error: {data['error']}")
            return None
        return data
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
```