npx clawhub@latest install healthcare-providers-extract
Structured practitioner extraction from healthcare practice websites, powered by Nimble's web data APIs.
User request: $ARGUMENTS
Before running any commands, read references/nimble-playbook.md for Claude Code
constraints (no shell state, no &/wait, sub-agent permissions, communication style).
Run the preflight pattern from references/nimble-playbook.md (5 simultaneous Bash
calls: date calc, today, CLI check, profile load, index.md load).
Also simultaneously — run WSA discovery and setup:
mkdir -p ~/.nimble/memory/{reports,healthcare-providers-extract/checkpoints}
ls ~/.nimble/memory/healthcare-providers-extract/checkpoints/ 2>/dev/null
Follow references/wsa-reference.md for WSA discovery. Layer 2 (session-specific) runs after Step 1 when you know the user's specialty.
Classify discovered agents into phases and validate with nimble agent get per references/wsa-reference.md.
From the preflight results:
- references/profile-and-onboarding.md, stop
- references/nimble-playbook.md:
- If last_runs.healthcare-providers-extract is today, check for an existing report at ~/.nimble/memory/reports/healthcare-providers-extract-*[today].md. If found, ask: "Already ran today. Run again for fresh data?"
Parse $ARGUMENTS for input type using the Input Parsing Pattern from references/nimble-playbook.md. Key routing:
If input is clear, confirm and ask one shaping question (plain text, not AskUserQuestion):
"Extracting providers from N practice sites. Quick questions:
- Healthcare vertical? (ophthalmology, dental, dermatology, general, or other)
- Quick scan (names + credentials only) or full extraction (all 5 fields)?"
If input is ambiguous, use AskUserQuestion (counts as 1 of max 2 prompts):
What practice sites should I extract providers from?
- Paste URLs directly (one per line)
- Provide a CSV file path or Google Sheet URL with practice URLs
- Or describe what you're looking for (e.g., "ophthalmologists in Austin, TX") and I'll find practices first
Skip questions the user already answered in their initial message.
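The input types listed above could be told apart with a small classifier. This is only a sketch: the actual Input Parsing Pattern lives in references/nimble-playbook.md and is not reproduced here, and the function name `classify_input` and its routing labels are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def classify_input(arguments: str) -> str:
    """Return a hypothetical routing label for the raw user request."""
    text = arguments.strip()
    if not text:
        return "ambiguous"
    urls = URL_RE.findall(text)
    # A single Google Sheets link is a list of practice URLs, not a practice site.
    if len(urls) == 1 and "docs.google.com/spreadsheets" in urls[0]:
        return "sheet"
    if urls:
        return "urls"
    if text.lower().endswith(".csv"):
        return "csv"
    # Anything else is treated as a description, e.g. "ophthalmologists in Austin, TX".
    return "description"
```

Only the "description" label would lead into discovery; the other labels already carry practice URLs.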
Run discovery only if the user provided a specialty + location instead of URLs.
Two input paths into discovery:
Path A: Fresh discovery. User gave specialty + location. Run Layer 2 WSA discovery for session-specific agents:
nimble agent list --limit 50 --search "[specialty]"
nimble agent list --limit 50 --search "[directory-user-mentioned]"
See references/wsa-reference.md for the full discovery strategy, agent evaluation criteria, and healthcare discovery prioritization.
Run all discovery-phase agents simultaneously. Validate params with nimble agent get first.
Path B: Market-finder handoff. User ran market-finder first and wants to extract providers from those results. Read the market-finder output:
cat ~/.nimble/memory/market-finder/{slug}/entities.json 2>/dev/null
Extract practice records. Note: Google Maps results contain place_url (a Maps link) but not the practice's actual website URL. Proceed to Step 2b to resolve real website URLs before site mapping.
After either path: Deduplicate by domain. Present discovered practices:
"Found N practices for [specialty] in [location] across [M] data sources. Proceeding to extract providers from these sites..."
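Deduplicating by domain can be sketched as follows. The `url` key and both function names are assumptions made for illustration. The skill's own dedup rules are in the referenced playbook and are not shown here.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Lowercased host without a leading 'www.'; accepts URLs with or without a scheme."""
    parsed = urlparse(url if "://" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(practices: list[dict]) -> list[dict]:
    """Keep the first practice seen for each domain, preserving input order.

    Records with no parseable domain are dropped here; a real run would
    probably want to report them instead.
    """
    seen: set[str] = set()
    unique = []
    for practice in practices:
        domain = domain_of(practice.get("url", ""))
        if not domain or domain in seen:
            continue
        seen.add(domain)
        unique.append(practice)
    return unique
```

With this normalization, `https://www.practice.com/our-doctors` and `http://practice.com` count as the same practice.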
Fallback — if no discovery WSAs were found, or results are sparse (< 3):
nimble search --query "[specialty] in [location]" --max-results 20 --search-depth lite
Discovery sources (Google Maps, Yelp, BBB) return listing URLs, not practice website URLs. Before site mapping, resolve the actual website for each practice:
- Check for a website field in the structured output. Use it if present.
- If there is no website field, extract the Maps listing to find the practice website link:
nimble extract --url "[maps-listing-url]" --format markdown
nimble search --query "[practice-name] [city] official website" --max-results 3 --search-depth lite
Skip practices where no website URL can be resolved — note them in the "Data Quality Summary" output section.
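The resolution order can be expressed as a small decision function. This is a sketch under assumptions: `website` and `place_url` come from the text above, while the `name` and `city` keys, the function name, and the step labels are hypothetical. It only chooses the first step to try and does not call the nimble CLI.

```python
def resolution_step(practice: dict) -> tuple[str, str]:
    """Pick the first action to try when looking for a practice's real website."""
    website = (practice.get("website") or "").strip()
    if website:
        return ("use_website_field", website)
    listing = (practice.get("place_url") or "").strip()
    if listing:
        # Corresponds to: nimble extract --url "[maps-listing-url]" --format markdown
        return ("extract_listing", listing)
    # Corresponds to the nimble search command with the "official website" query.
    parts = [practice.get("name", ""), practice.get("city", ""), "official website"]
    return ("search", " ".join(part for part in parts if part))
```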
Follow the Site Mapping Pattern from references/nimble-playbook.md for each
practice URL. Skill-specific settings:
- references/provider-extraction-patterns.md
- site:[domain] doctors OR providers OR team
For 6+ practices, use sub-agents (see Sub-Agent Strategy below).
Save checkpoint: ~/.nimble/memory/healthcare-providers-extract/checkpoints/{slug}/mapping.json
WSA shortcuts first: If WSA discovery found agents that extract provider data from healthcare directories, use those for matching practices — structured WSA output is higher quality than parsed markdown.
For all other practices, follow the Page Extraction with Retry pattern from
references/nimble-playbook.md. Scale using the Scaled Execution pattern from
the same reference.
Save checkpoint: ~/.nimble/memory/healthcare-providers-extract/checkpoints/{slug}/extraction.json
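A checkpoint helper along these lines would cover both the mapping and extraction stages. It is a sketch: the directory layout follows the paths above, but the function names and the shape of the saved data are assumptions.

```python
import json
from pathlib import Path
from typing import Optional

CHECKPOINT_ROOT = (
    Path.home() / ".nimble" / "memory" / "healthcare-providers-extract" / "checkpoints"
)


def save_checkpoint(slug: str, stage: str, data: dict, root: Path = CHECKPOINT_ROOT) -> Path:
    """Write {root}/{slug}/{stage}.json, creating the directory if needed."""
    path = root / slug / f"{stage}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


def load_checkpoint(slug: str, stage: str, root: Path = CHECKPOINT_ROOT) -> Optional[dict]:
    """Return the saved data, or None if this stage has no checkpoint yet."""
    path = root / slug / f"{stage}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Calling `load_checkpoint(slug, "mapping")` before site mapping would let an interrupted run resume instead of starting over.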
Parse extracted markdown to identify providers and their fields. Read
references/provider-extraction-patterns.md for the 5 core fields, credential
regex patterns, and specialty keywords.
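As an illustration of credential matching, the sketch below splits a heading such as "Dr. Jane Smith, MD, FACS" into a name and a credential string. The credential list is a hypothetical stand-in. The skill's real regex patterns are in references/provider-extraction-patterns.md and are not reproduced here.

```python
import re

# Hypothetical credential list, for illustration only.
CREDENTIAL_RE = re.compile(r"\b(?:MD|DO|OD|DDS|DMD|PhD|FACS|FAAO|NP|PA-C)\b")


def split_name_and_credentials(line: str) -> tuple[str, str]:
    """Split a provider heading into (name, comma-separated credentials)."""
    credentials = CREDENTIAL_RE.findall(line)
    first = CREDENTIAL_RE.search(line)
    # Everything before the first credential is treated as the name.
    name = line[: first.start()] if first else line
    return name.strip(" ,"), ", ".join(credentials)
```

Matching is case-sensitive on purpose, so that "Doe" or "do" in running text is not read as the DO credential.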
For each extracted page:
- references/provider-extraction-patterns.md
Build structured records:
{
  "name": "Dr. Jane Smith",
  "credentials": "MD, FACS",
  "specialty": "Retinal Surgery",
  "contact": {"phone": "(555) 123-4567", "scheduling_url": "..."},
  "education": "Fellowship: Bascom Palmer Eye Institute",
  "source_url": "https://practice.com/our-doctors",
  "practice_name": "Shore Center for Eye Care",
  "practice_url": "https://practice.com",
  "confidence": "High"
}
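The record above maps naturally onto a typed structure. The sketch below uses a dataclass with the same field names. The default values and the rule that an empty `source_url` raises an error are assumptions, the latter chosen to reflect the requirement that every record trace back to a source page.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ProviderRecord:
    name: str
    source_url: str
    practice_name: str = ""
    practice_url: str = ""
    credentials: str = ""
    specialty: str = ""
    contact: dict = field(default_factory=dict)
    education: str = ""
    confidence: str = "Low"  # assumed default until the record is scored

    def __post_init__(self) -> None:
        # Reject records that cannot be traced back to an extracted page.
        if not self.source_url.strip():
            raise ValueError(f"provider {self.name!r} has no source_url")
```

`asdict(record)` returns a plain dictionary with the same keys as the JSON example, ready for `json.dumps`.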
Follow the Entity Deduplication and Entity Confidence Scoring patterns from
references/nimble-playbook.md. Skill-specific dedup rules and the 5-field
confidence criteria are in references/provider-extraction-patterns.md.
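To make the idea of 5-field confidence scoring concrete, here is a minimal sketch that counts populated core fields. The thresholds are hypothetical. The actual criteria are defined in references/provider-extraction-patterns.md and may weigh fields differently.

```python
CORE_FIELDS = ("name", "credentials", "specialty", "contact", "education")


def _has_value(value) -> bool:
    """A dict counts as populated only if at least one of its values is non-empty."""
    if isinstance(value, dict):
        return any(value.values())
    return bool(value)


def score_confidence(record: dict) -> str:
    """Assumed rule: High for 4-5 populated core fields, Medium for 2-3, else Low."""
    populated = sum(1 for key in CORE_FIELDS if _has_value(record.get(key)))
    if populated >= 4:
        return "High"
    if populated >= 2:
        return "Medium"
    return "Low"
```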
Present results grouped by practice, sorted by confidence within each practice.
# Provider Extraction: [N] Providers from [M] Practices
*[Date] | [H] High, [M] Medium, [L] Low confidence*
Extracted [N] providers from [M] practice websites. [H] with complete profiles, [L] with partial data. [Key finding: e.g., "12 of 15 providers are board-certified"].
| # | Name | Credentials | Specialty | Contact | Education | Confidence |
|---|---|---|---|---|---|---|
| 1 | Dr. Jane Smith | MD, FACS | Retinal Surgery | (555) 123-4567 | Fellowship: Bascom Palmer | High |
| 2 | Dr. John Doe | OD | General Ophthalmology | Book | Residency: Wills Eye | Medium |
[Repeat per practice]
[Clickable URL for every page extracted, grouped by practice]
**Source links are mandatory.** Every provider record must trace back to a source URL.
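Grouping by practice, sorting by confidence, and rendering a table row could look like the sketch below. The function names are hypothetical, and using the phone number for the Contact column is an assumption. The column order follows the table template above.

```python
from collections import defaultdict

CONFIDENCE_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def group_for_report(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by practice_name, highest confidence first within each practice."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        grouped[record["practice_name"]].append(record)
    for providers in grouped.values():
        # list.sort is stable, so ties keep their extraction order.
        providers.sort(
            key=lambda r: CONFIDENCE_ORDER.get(r.get("confidence"), len(CONFIDENCE_ORDER))
        )
    return dict(grouped)


def table_row(index: int, record: dict) -> str:
    """Render one markdown row in the column order of the report table."""
    cells = [
        str(index),
        record.get("name", ""),
        record.get("credentials", ""),
        record.get("specialty", ""),
        record.get("contact", {}).get("phone", ""),
        record.get("education", ""),
        record.get("confidence", ""),
    ]
    return "| " + " | ".join(cells) + " |"
```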
Make all Write calls simultaneously:
- ~/.nimble/memory/reports/healthcare-providers-extract-{slug}-{date}.md
- ~/.nimble/memory/healthcare-providers-extract/{slug}/providers.json
- last_runs.healthcare-providers-extract in ~/.nimble/business-profile.json (only if profile exists)
- references/memory-and-distribution.md: update index.md rows for all affected entity files, append a log.md entry for this run.
Always offer distribution; do not skip. Follow references/memory-and-distribution.md for connector detection and sharing flow.
Notion: full provider table as a dated subpage. Slack: TL;DR with provider count and confidence breakdown only.
Enrichment from discovered WSAs: If Step 0 found enrichment-phase agents (reviews, regulatory, practice details), offer them as immediate follow-ups:
"I also found [N] WSAs that could enrich this data: [brief list]. Want me to run reputation checks or regulatory lookups on these providers/practices?"
See references/wsa-reference.md for enrichment phase mapping and fallback chains.
Sibling skill suggestions:
Next steps:
- Run healthcare-providers-enrich to fill data gaps (NPI lookup, board certification verification, additional contact info)
- Run healthcare-providers-verify to validate credentials and license status
- Run market-finder to discover more practice URLs in this area
For batch extraction (6+ practices), use nimble-researcher agents
(agents/nimble-researcher.md) to parallelize site mapping and extraction.
Follow the sub-agent spawning rules from references/nimble-playbook.md
(bypassPermissions, batch max 4, explicit Bash instruction, fallback on failure).
Spawn pattern: One agent per practice (or per batch of 3 practices for large jobs). Each agent runs Steps 3-5 for its assigned practices and returns structured provider records.
Single-practice optimization: If only 1-2 practices, run directly from the main context instead of spawning agents.
Fallback: If any agent fails, run those extractions directly from the main context. Never leave gaps in the output.
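The batching rules can be sketched as a planning function. Several points are assumptions: "batch max 4" is read as at most four agents spawned at once, jobs below the 6-practice threshold run directly, and `per_agent` stands in for the undefined "large jobs" switch to 3 practices per agent.

```python
def plan_execution(
    practices: list[str],
    per_agent: int = 1,
    max_parallel: int = 4,
    subagent_threshold: int = 6,
) -> dict:
    """Decide between direct execution and waves of sub-agents."""
    if len(practices) < subagent_threshold:
        return {"mode": "direct", "waves": []}
    # One assignment per agent: a single practice, or a small batch for large jobs.
    assignments = [
        practices[i:i + per_agent] for i in range(0, len(practices), per_agent)
    ]
    # No more than max_parallel agents are spawned in the same wave.
    waves = [
        assignments[i:i + max_parallel] for i in range(0, len(assignments), max_parallel)
    ]
    return {"mode": "sub-agents", "waves": waves}
```

Any assignment whose agent fails would then be rerun from the main context, as the fallback rule requires.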
See references/nimble-playbook.md for the standard error table (missing API key,
429, 401, empty results, extraction garbage). Skill-specific errors:
- --render per the shared pattern)
- nimble search --query "[practice name] [location] doctors" --max-results 5 --search-depth lite