OCR Service¶

The OCR service extracts text from images using pluggable backends. It uses a Redis queue for async processing with callback notifications, and validates results via configurable minimum-character thresholds.

Quick Reference¶


Port	7031
Health endpoint	`GET /health`
Source	`jarvis-ocr-service/`
Framework	FastAPI + Uvicorn
Tier	3 - Specialized

API Endpoints¶

Method	Path	Description
`GET`	`/health`	Health check
`GET`	`/v1/providers`	List available OCR backends
`POST`	`/v1/ocr`	Submit an image for OCR (queued async)
`GET`	`/v1/ocr/jobs/{job_id}`	Check job status and retrieve result
`POST`	`/v1/ocr/batch`	Submit multiple images in one request
`GET`	`/v1/queue/status`	Queue depth and worker status

Backends¶

Backend	Platform	Notes
Tesseract	All	Classic OCR, no GPU required
EasyOCR	All	Deep learning, GPU optional
PaddleOCR	Linux	Best accuracy on dense text, GPU recommended
RapidOCR	All	Fast CPU-based inference
LLM Proxy Vision	All	Routes to jarvis-llm-proxy-api vision endpoint
LLM Proxy Cloud	All	Routes to a cloud vision API via the LLM proxy REST backend

The active backend is selected at runtime via settings. Multiple backends can be enabled simultaneously; the tier configuration controls fallback order.

Environment Variables¶

Core¶

Variable	Description
`OCR_PORT`	API port (default `7031`)
`OCR_BACKEND`	Default OCR backend (`tesseract`, `easyocr`, `paddleocr`, `rapidocr`, `llm_proxy_vision`, `llm_proxy_cloud`)
`OCR_ENABLED_TIERS`	Comma-separated list of enabled backend tiers
`OCR_ENABLE_RAPIDOCR`	Enable RapidOCR backend (`false`)
`OCR_ENABLE_LLM_PROXY_VISION`	Enable LLM Proxy Vision backend (`false`)
`OCR_ENABLE_LLM_PROXY_CLOUD`	Enable LLM Proxy Cloud backend (`false`)
`OCR_PUBLIC_URL`	Publicly reachable URL for this service (used in callback URLs)

Result Validation¶

Variable	Description
`OCR_MAX_TEXT_BYTES`	Maximum text size returned per job
`OCR_MIN_VALID_CHARS`	Minimum characters for a result to be considered valid
`OCR_LANGUAGE_DEFAULT`	Default language hint (e.g. `en`)
`OCR_MAX_ATTEMPTS`	Retry attempts per job before failing
`OCR_VALIDATION_MODEL`	LLM model used for result validation (when enabled)

Redis (async queue)¶

Variable	Description
`REDIS_URL`	Full Redis connection URL (takes precedence over host/port/password)
`REDIS_HOST`	Redis host (default `localhost`)
`REDIS_PORT`	Redis port (default `6379`)
`REDIS_PASSWORD`	Redis password

S3/MinIO (optional artifact storage)¶

Variable	Description
`S3_ENDPOINT_URL`	S3-compatible endpoint (e.g. MinIO URL)
`S3_REGION`	S3 region
`S3_FORCE_PATH_STYLE`	Use path-style S3 URLs (required for MinIO)

Auth¶

Variable	Description
`JARVIS_AUTH_BASE_URL`	Auth service URL
`JARVIS_APP_ID`	App identity for service-to-service auth
`JARVIS_APP_KEY`	App key for service-to-service auth

Dependencies¶

OCR engine -- one or more of: Tesseract, EasyOCR, PaddleOCR, RapidOCR
Redis -- async job queue
jarvis-auth -- app-to-app auth validation
jarvis-logs -- structured logging
jarvis-settings-client -- runtime backend selection
jarvis-llm-proxy-api -- optional vision inference backend

Dependents¶

jarvis-recipes-server -- sends recipe images for text extraction
jarvis-command-center -- optional OCR for image-based commands

Impact if Down¶

No image-to-text extraction. Recipe image scanning and any image-based command processing will fail. Text-based workflows are unaffected.