LLMWhisperer

Extract text and layout from images and PDFs using the LLMWhisperer API. Well suited to handwriting and complex forms.

Install
$ clawhub install llmwhisperer

Configuration

Requires LLMWHISPERER_API_KEY in ~/.clawdbot/.env. To add it:

echo "LLMWHISPERER_API_KEY=your_key_here" >> ~/.clawdbot/.env
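Appending with >> adds a new line every time it is run, so rotating a key leaves the old entry in the file. A minimal sketch of an idempotent alternative follows. The function name set_env_key is hypothetical and not part of this skill, and restricting the file to owner-only permissions is an assumption about how you want to store secrets.

```shell
# Hypothetical helper (not part of the skill): write KEY=VALUE to an env file,
# replacing any earlier entry for KEY instead of appending a duplicate.
# Intended use: set_env_key ~/.clawdbot/.env LLMWHISPERER_API_KEY your_key_here
set_env_key() {
  local file="$1" key="$2" value="$3" tmp
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp="$(mktemp)" || return 1
  # Keep every line except earlier assignments of the same key.
  # grep exits non-zero when nothing is left, which is fine here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
  # Assumption: the file holds secrets, so limit it to the owner.
  chmod 600 "$file"
}
```

Other lines in the file are preserved; only earlier assignments of the same key are dropped.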

Get an API Key

Get a free API key at unstract.com/llmwhisperer.

- Free Tier: 100 pages/day

Usage

llmwhisperer <file>

Script Source

The executable script is located at scripts/llmwhisperer.

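#!/bin/bash
# Extract text from an image or PDF using the LLMWhisperer API.

ENV_FILE="$HOME/.clawdbot/.env"

# Fall back to ~/.clawdbot/.env when the key is not already in the environment.
if [ -z "$LLMWHISPERER_API_KEY" ] && [ -f "$ENV_FILE" ]; then
  # Use the last matching assignment and strip optional surrounding quotes.
  LLMWHISPERER_API_KEY="$(grep -E '^LLMWHISPERER_API_KEY=' "$ENV_FILE" | tail -n 1 | cut -d= -f2- | sed -e "s/^[\"']//" -e "s/[\"']\$//")"
  export LLMWHISPERER_API_KEY
fi

if [ -z "$LLMWHISPERER_API_KEY" ]; then
  echo "Error: LLMWHISPERER_API_KEY not found in env or ~/.clawdbot/.env" >&2
  exit 1
fi

FILE="$1"
if [ -z "$FILE" ]; then
  echo "Usage: $0 <file>" >&2
  exit 1
fi

if [ ! -f "$FILE" ]; then
  echo "Error: file not found: $FILE" >&2
  exit 1
fi

# Upload the raw file bytes. The query string requests high-quality mode
# with layout-preserving output.
curl -s -X POST "https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper?mode=high_quality&output_mode=layout_preserving" \
  -H "Content-Type: application/octet-stream" \
  -H "unstract-key: $LLMWHISPERER_API_KEY" \
  --data-binary "@$FILE"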
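The script hardcodes mode=high_quality and output_mode=layout_preserving in the request URL. If you want to vary them, a small helper can assemble the URL. This is a sketch only: the function name whisper_url is hypothetical, and the two defaults are the only parameter values this page confirms, so check any other value against the LLMWhisperer API documentation before using it.

```shell
# Hypothetical helper: build the request URL used by the script above.
# Defaults mirror the values hardcoded in the script; other values are
# passed through unchecked and must be validated against the API docs.
whisper_url() {
  local mode="${1:-high_quality}"
  local output_mode="${2:-layout_preserving}"
  local base="https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper"
  printf '%s?mode=%s&output_mode=%s\n' "$base" "$mode" "$output_mode"
}
```

Called with no arguments, it reproduces the URL in the script, so the curl line could take "$(whisper_url)" in place of the literal string.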

Examples

Print text to terminal:

llmwhisperer flyer.jpg

Save output to a text file:

llmwhisperer invoice.pdf > invoice.txt

Process a handwritten note:

llmwhisperer notes.jpg
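The same redirect pattern extends to several files at once. The sketch below is not part of the skill: extract_all is a hypothetical wrapper, and LLMWHISPERER_CMD is an assumed override hook that defaults to the llmwhisperer command shown above. Each call counts against the free-tier page limit.

```shell
# Hypothetical wrapper: run the extractor on each file and save the output
# next to it with the extension replaced by .txt.
# LLMWHISPERER_CMD is an assumed override hook; it defaults to llmwhisperer.
extract_all() {
  local cmd="${LLMWHISPERER_CMD:-llmwhisperer}" file out
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      echo "Skipping missing file: $file" >&2
      continue
    fi
    out="${file%.*}.txt"
    "$cmd" "$file" > "$out" || echo "Extraction failed: $file" >&2
  done
}
```

For example, extract_all invoice.pdf notes.jpg would write invoice.txt and notes.txt in the same directory.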

Details

Version
v0.0.7
