Data Enricher

Enrich leads by finding verified emails via contact pages, Instagram, Hunter.io, and patterns, then format data for Notion with deduplication and batching.

تثبيت
$clawhub install data-enricher

SKILL.md - Data Enricher

Purpose

Enrich leads with email addresses and format data for Notion.

Model to Use

  • ollama/llama3.2:8b (FREE) for data formatting

  • haiku for Hunter.io API calls

Rate Limits

  • Max 10 Hunter.io lookups per session (API limit)

  • 5 seconds between API calls

  • Batch similar domains together

Email Discovery Methods (In Order)

1. Website Contact Page

  • Check /contact, /about, /pages/contact

  • Look for mailto: links

  • Check footer

2. Instagram Bio

  • Check bio for email

  • Check "Contact" button

3. Hunter.io API

GET https://api.hunter.io/v2/domain-search
?domain={domain}
&api_key={HUNTER_API_KEY}

Response includes:

  • emails[]

  • confidence score

  • type (generic/personal)

Only use emails with confidence > 70%

4. Email Pattern Guessing

Common patterns:

Email Priority

  1. Founder/owner personal email (best)

  2. hello@ or hi@ (good)

  3. info@ or contact@ (okay)

  4. Generic support@ (last resort)

Output Format

{
  "domain_key": "brandname.com",
  "brand_name": "Brand Name",
  "niche": "skincare",
  "website_url": "https://brandname.com",
  "ig_handle": "@brandname",
  "followers_est": 15000,
  "contact_email": "[email protected]",
  "email_confidence": "high",
  "email_source": "hunter.io",
  "source": "meta_ads",
  "status": "new"
}

Deduplication

Before adding any lead:

  1. Normalize domain: lowercase, remove www., remove https://

  2. Check if domain_key exists in Notion

  3. If exists, skip (don't duplicate)

  4. Log: "Skipped [domain] - already in pipeline"

Batch Processing

  • Process 10 leads at a time

  • Format all data before Notion sync

  • Save formatted batch to workspace/leads-enriched-YYYY-MM-DD.json