AI Skill Report Card
Extracting Contact Info
Contact Information Extraction
Quick Start: 12 / 15
```python
import re


def extract_company_info(text):
    info = {}

    # Extract CNPJ (Brazilian company ID), expected after a "CNPJ:" label
    cnpj_match = re.search(r'CNPJ:\s*(\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2})', text)
    if cnpj_match:
        info['cnpj'] = cnpj_match.group(1)

    # Extract emails (the TLD class is [A-Za-z]; a "|" inside a character
    # class would match a literal pipe)
    emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)
    info['emails'] = emails

    # Extract address components: street type plus the next two comma-separated parts
    address_match = re.search(r'(Avenida|Rua|Alameda)[^,]+,[^,]+,[^,]+', text)
    if address_match:
        info['address'] = address_match.group(0)

    return info
```
Recommendation: Add more concrete input/output examples that show the complete text blocks being processed, not just snippets.
Workflow: 12 / 15
- Identify text source: company footer, legal notice, or contact page
- Extract structured data:
  - Company name and legal entity type
  - Registration numbers (CNPJ, tax ID, etc.)
  - Physical address with postal code
  - Contact emails by department
  - Phone numbers if present
- Categorize contacts: sales, support, legal, etc.
- Validate format: check email domains, postal codes, registration numbers
- Structure output: organized by contact type and purpose
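The "categorize contacts" and "structure output" steps can be sketched as a keyword lookup on the part of each email before the `@`. The `CATEGORY_KEYWORDS` map and the `categorize_emails` name are hypothetical, and the keywords are illustrative assumptions (`compradores` is mapped to support only because Example 1 labels it that way); a real extractor would likely need a larger, language-aware list.

```python
from collections import defaultdict

# Hypothetical keyword map: local-part keywords -> contact category.
# These keywords are illustrative assumptions, not an exhaustive list.
CATEGORY_KEYWORDS = {
    "sales": ("sales", "vendas", "comercial"),
    "support": ("support", "suporte", "help", "compradores"),
    "legal": ("legal", "juridico", "privacy", "dpo"),
}


def categorize_emails(emails):
    """Group emails by function based on keywords in the local part."""
    grouped = defaultdict(list)
    for email in emails:
        local_part = email.split("@", 1)[0].lower()
        # Pick the first category with a matching keyword, else "other".
        category = next(
            (name for name, keywords in CATEGORY_KEYWORDS.items()
             if any(keyword in local_part for keyword in keywords)),
            "other",
        )
        grouped[category].append(email)
    return dict(grouped)
```

Addresses that match no keyword land in an "other" bucket rather than being dropped, so they can still be reviewed by hand.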
Recommendation: Include extraction methodology beyond basic regex, such as address parsing logic, validation rules, and structured output formats.
Examples: 12 / 20
Example 1:
Input: "© 2026 Kiwify Tecnologia e Serviços Ltda. CNPJ: 36.149.947/0001-06 compradores@kiwify.com.br"
Output:
- Company: Kiwify Tecnologia e Serviços Ltda
- CNPJ: 36.149.947/0001-06
- Customer Support: compradores@kiwify.com.br
Example 2:
Input: "ABC Corp Ltd. Registration: 12345678 support@abc.com sales@abc.com 123 Main St, City"
Output:
- Company: ABC Corp Ltd
- Registration: 12345678
- Support: support@abc.com
- Sales: sales@abc.com
- Address: 123 Main St, City
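The company name in Example 1 sits between the copyright year and a legal suffix. One possible way to pull both out is sketched below, assuming the notice follows the shape "© year name suffix". The `COPYRIGHT_RE` pattern, the `parse_copyright_line` name, and the short suffix list are assumptions for illustration. Matching is case-insensitive so that variants such as LTDA and Ltda are treated alike.

```python
import re

# Assumed pattern: "© <year> <company name ending in a legal suffix>".
# The suffix list is illustrative and deliberately short.
COPYRIGHT_RE = re.compile(
    r"©\s*(?P<year>\d{4})\s+(?P<company>.+?\b(?:Ltda|Ltd|Inc|LLC|S\.A)\b)",
    re.IGNORECASE,
)


def parse_copyright_line(text):
    """Return (year, company) from a copyright notice, or None if absent."""
    match = COPYRIGHT_RE.search(text)
    if match is None:
        return None
    return int(match.group("year")), match.group("company")
```

Text without a "©" notice, such as the Example 2 input, yields `None`, so the caller needs a fallback for the company name.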
Recommendation: Provide templates or frameworks for different document types (website footers vs legal documents vs contact pages) with specific extraction patterns.
Best Practices
- Preserve original formatting for registration numbers and postal codes
- Group emails by function (support, sales, info, legal)
- Validate country-specific formats (CNPJ for Brazil, VAT for EU)
- Extract complete addresses including postal codes
- Note copyright years for business age estimation
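For the country-specific validation point, a CNPJ can be checked beyond its punctuation by recomputing its two mod-11 check digits. This is a minimal sketch that assumes the traditional all-numeric 14-digit format; the `is_valid_cnpj` name is hypothetical. It only reads the input, so the original formatting can still be stored as-is.

```python
import re


def is_valid_cnpj(cnpj):
    """Check the two mod-11 check digits of an all-numeric CNPJ."""
    digits = [int(ch) for ch in re.sub(r"\D", "", cnpj)]
    # Reject wrong lengths and trivially repeated sequences such as 00000000000000.
    if len(digits) != 14 or len(set(digits)) == 1:
        return False

    def check_digit(partial):
        # Weights run 2..9 from right to left, then wrap back to 2.
        weights = [2 + (i % 8) for i in range(len(partial))]
        total = sum(d * w for d, w in zip(reversed(partial), weights))
        remainder = total % 11
        return 0 if remainder < 2 else 11 - remainder

    return (check_digit(digits[:12]) == digits[12]
            and check_digit(digits[:13]) == digits[13])
```

The CNPJ from Example 1, `36.149.947/0001-06`, passes this check, while changing its last digit makes it fail.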
Common Pitfalls
- Don't assume standard email prefixes such as support@ or sales@ - some companies use different naming conventions
- Don't split addresses at commas blindly - some addresses contain comma-separated elements
- Don't ignore case variations in company suffixes (Ltd, LTDA, Inc, SA)
- Don't extract personal emails mixed with business contacts
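One rough heuristic for the last pitfall is to set aside addresses on free-mail domains for review. The sketch below assumes a small, non-exhaustive `FREE_MAIL_DOMAINS` set, and the `split_business_and_personal` name is hypothetical. Some small businesses do use such domains for official contact, so the second list is better treated as "needs review" than as discarded.

```python
# Illustrative, non-exhaustive set of free-mail domains (an assumption).
FREE_MAIL_DOMAINS = {"gmail.com", "hotmail.com", "outlook.com", "yahoo.com"}


def split_business_and_personal(emails):
    """Partition emails into (business, likely_personal) by domain."""
    business, likely_personal = [], []
    for email in emails:
        # Compare the domain case-insensitively; keep the email as written.
        domain = email.rsplit("@", 1)[-1].lower()
        if domain in FREE_MAIL_DOMAINS:
            likely_personal.append(email)
        else:
            business.append(email)
    return business, likely_personal
```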