OCR Service

The OCR service extracts text from images using pluggable backends. It uses a Redis queue for async processing with callback notifications, and validates results via configurable minimum-character thresholds.

Quick Reference

Port             7031
Health endpoint  GET /health
Source           jarvis-ocr-service/
Framework        FastAPI + Uvicorn
Tier             3 - Specialized

API Endpoints

Method  Path                    Description
GET     /health                 Health check
GET     /v1/providers           List available OCR backends
POST    /v1/ocr                 Submit an image for OCR (queued async)
GET     /v1/ocr/jobs/{job_id}   Check job status and retrieve result
POST    /v1/ocr/batch           Submit multiple images in one request
GET     /v1/queue/status        Queue depth and worker status
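
Because POST /v1/ocr only queues a job, a client that does not rely on callbacks has to poll GET /v1/ocr/jobs/{job_id} until the job finishes. The following is a minimal sketch of such a polling loop, not the service's own client. The response field "status" and the terminal values "completed" and "failed" are assumptions, as is the localhost base URL; the HTTP call is injected as a hypothetical fetch function so the loop stays independent of any HTTP library.

```python
import time
from typing import Any, Callable, Dict
from urllib.parse import quote

# Port 7031 comes from the Quick Reference; the host is an assumption.
BASE_URL = "http://localhost:7031"

# ASSUMPTION: the job-status schema is not documented here. The "status"
# field and these terminal values are hypothetical placeholders.
TERMINAL_STATUSES = {"completed", "failed"}


def job_status_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /v1/ocr/jobs/{job_id}."""
    return f"{base_url.rstrip('/')}/v1/ocr/jobs/{quote(job_id, safe='')}"


def poll_job(
    fetch: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(url) until the job reaches a terminal status.

    fetch is any callable that performs the GET and returns the decoded
    JSON body as a dict.
    """
    url = job_status_url(job_id)
    for _ in range(max_polls):
        body = fetch(url)
        if body.get("status") in TERMINAL_STATUSES:
            return body
        sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

In practice fetch could wrap urllib.request or any HTTP client and attach whatever auth headers the deployment requires.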

Backends

Backend            Platform  Notes
Tesseract          All       Classic OCR, no GPU required
EasyOCR            All       Deep learning, GPU optional
PaddleOCR          Linux     Best accuracy on dense text, GPU recommended
RapidOCR           All       Fast CPU-based inference
LLM Proxy Vision   All       Routes to jarvis-llm-proxy-api vision endpoint
LLM Proxy Cloud    All       Routes to a cloud vision API via the LLM proxy REST backend

The active backend is selected at runtime via settings. Multiple backends can be enabled simultaneously; the tier configuration controls fallback order.
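
One way to picture the fallback order is a loop that tries each enabled backend in turn and returns the first usable result. This is a sketch under stated assumptions, not the service's actual dispatch code: run_with_fallback and the callable-per-backend interface are hypothetical, and treating a too-short result as a reason to fall through is an assumption based on the minimum-character validation described above.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: a backend takes image bytes and returns text.
Backend = Callable[[bytes], str]


def run_with_fallback(
    image: bytes,
    backends: Iterable[Tuple[str, Backend]],
    min_valid_chars: int = 1,
) -> Tuple[str, str]:
    """Try backends in order; return (backend_name, text) for the first
    one that produces at least min_valid_chars of non-blank text."""
    errors: List[str] = []
    for name, backend in backends:
        try:
            text = backend(image)
        except Exception as exc:  # a failing backend should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if len(text.strip()) >= min_valid_chars:
            return name, text
        errors.append(f"{name}: result too short")
    raise RuntimeError("all backends failed: " + "; ".join(errors))
```

The order of the backends iterable stands in for the tier configuration: earlier entries are preferred, later ones are only reached when the earlier ones raise or return too little text.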

Environment Variables

Core

Variable                      Description
OCR_PORT                      API port (default 7031)
OCR_BACKEND                   Default OCR backend (tesseract, easyocr, paddleocr, rapidocr, llm_proxy_vision, llm_proxy_cloud)
OCR_ENABLED_TIERS             Comma-separated list of enabled backend tiers
OCR_ENABLE_RAPIDOCR           Enable RapidOCR backend (default false)
OCR_ENABLE_LLM_PROXY_VISION   Enable LLM Proxy Vision backend (default false)
OCR_ENABLE_LLM_PROXY_CLOUD    Enable LLM Proxy Cloud backend (default false)
OCR_PUBLIC_URL                Publicly reachable URL for this service (used in callback URLs)
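
As an illustration of how these variables might be read, the sketch below parses them into a small settings object. It is not the service's own loader: CoreSettings and load_core_settings are hypothetical names, the set of strings accepted as "true" is an assumption, and the "tesseract" fallback for OCR_BACKEND is a placeholder, since no default is documented for it.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    # ASSUMPTION: accepted truthy spellings are a guess; unset means false.
    return env.get(name, "false").strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class CoreSettings:
    port: int
    backend: str
    enabled_tiers: List[str]
    enable_rapidocr: bool
    enable_llm_proxy_vision: bool
    enable_llm_proxy_cloud: bool
    public_url: Optional[str]


def load_core_settings(env: Mapping[str, str] = os.environ) -> CoreSettings:
    """Read the core OCR_* variables from a mapping (os.environ by default)."""
    tiers = [t.strip() for t in env.get("OCR_ENABLED_TIERS", "").split(",")]
    return CoreSettings(
        port=int(env.get("OCR_PORT", "7031")),
        # ASSUMPTION: "tesseract" is a placeholder default, not documented.
        backend=env.get("OCR_BACKEND", "tesseract"),
        enabled_tiers=[t for t in tiers if t],  # drop empty entries
        enable_rapidocr=_flag(env, "OCR_ENABLE_RAPIDOCR"),
        enable_llm_proxy_vision=_flag(env, "OCR_ENABLE_LLM_PROXY_VISION"),
        enable_llm_proxy_cloud=_flag(env, "OCR_ENABLE_LLM_PROXY_CLOUD"),
        public_url=env.get("OCR_PUBLIC_URL"),
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests.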

Result Validation

Variable               Description
OCR_MAX_TEXT_BYTES     Maximum text size returned per job
OCR_MIN_VALID_CHARS    Minimum characters for a result to be considered valid
OCR_LANGUAGE_DEFAULT   Default language hint (e.g. en)
OCR_MAX_ATTEMPTS       Retry attempts per job before failing
OCR_VALIDATION_MODEL   LLM model used for result validation (when enabled)
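
The two size-related settings can be sketched as a pair of small helpers: one decides whether a result meets the minimum-character threshold, the other caps the returned text at a byte limit. This is an assumption-laden illustration, not the service's validator. Whether whitespace counts toward OCR_MIN_VALID_CHARS and whether oversized text is truncated or rejected are not stated in this document; the helpers below ignore surrounding whitespace and truncate, and both function names are hypothetical.

```python
def is_valid_result(text: str, min_valid_chars: int) -> bool:
    """Return True if the text has at least min_valid_chars characters.

    ASSUMPTION: leading and trailing whitespace is not counted.
    """
    return len(text.strip()) >= min_valid_chars


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cap text at max_bytes of UTF-8 without splitting a character.

    ASSUMPTION: truncation (rather than rejection) is the chosen policy.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    # errors="ignore" drops a partial multi-byte sequence left at the cut.
    return data[:max_bytes].decode("utf-8", errors="ignore")
```

Truncating on the encoded bytes rather than on the string keeps the limit meaningful for non-ASCII text, where one character can occupy several bytes.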

Redis (async queue)

Variable         Description
REDIS_URL        Full Redis connection URL (takes precedence over host/port/password)
REDIS_HOST       Redis host (default localhost)
REDIS_PORT       Redis port (default 6379)
REDIS_PASSWORD   Redis password
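
The precedence rule can be shown with a short resolver: if REDIS_URL is set it is used as-is, otherwise a URL is assembled from the host, port, and password variables. This is a sketch, not the service's connection code. resolve_redis_url is a hypothetical name, and the trailing database index /0 is an assumption, since no database variable is listed.

```python
import os
from typing import Mapping
from urllib.parse import quote


def resolve_redis_url(env: Mapping[str, str] = os.environ) -> str:
    """Return REDIS_URL if set, else build a URL from host/port/password."""
    url = env.get("REDIS_URL")
    if url:
        return url  # the full URL wins over the individual variables
    host = env.get("REDIS_HOST", "localhost")
    port = env.get("REDIS_PORT", "6379")
    password = env.get("REDIS_PASSWORD")
    # Percent-encode the password so characters like "@" or "/" are safe.
    auth = f":{quote(password, safe='')}@" if password else ""
    # ASSUMPTION: database 0; no database setting is documented.
    return f"redis://{auth}{host}:{port}/0"
```

The resulting string is in the form accepted by common Redis clients that take a connection URL.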

S3/MinIO (optional artifact storage)

Variable              Description
S3_ENDPOINT_URL       S3-compatible endpoint (e.g. MinIO URL)
S3_REGION             S3 region
S3_FORCE_PATH_STYLE   Use path-style S3 URLs (required for MinIO)

Auth

Variable               Description
JARVIS_AUTH_BASE_URL   Auth service URL
JARVIS_APP_ID          App identity for service-to-service auth
JARVIS_APP_KEY         App key for service-to-service auth

Dependencies

  • OCR engine -- one or more of: Tesseract, EasyOCR, PaddleOCR, RapidOCR
  • Redis -- async job queue
  • jarvis-auth -- app-to-app auth validation
  • jarvis-logs -- structured logging
  • jarvis-settings-client -- runtime backend selection
  • jarvis-llm-proxy-api -- optional vision inference backend

Dependents

  • jarvis-recipes-server -- sends recipe images for text extraction
  • jarvis-command-center -- optional OCR for image-based commands

Impact if Down

Image-to-text extraction is unavailable. Recipe image scanning and any image-based command processing will fail. Text-based workflows are unaffected.