OCR Operator
v0.2.1 (New)
Overview
The OCR LOP extracts text from images using Optical Character Recognition. It sends image data to the vision_sidecar HTTP service for processing with either EasyOCR or PaddleOCR, keeping heavy ML inference out of TouchDesigner’s main process. Processing runs asynchronously so TouchDesigner stays responsive while OCR is in progress.
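To make the request path more concrete, here is a minimal sketch of how an image might be packaged for an HTTP OCR request. The changelog notes that images are Base64-encoded for transmission; the JSON layout and the field names (`image`, `model`, `scale`) are assumptions for illustration, not the actual vision_sidecar schema.

```python
import base64
import json


def build_ocr_payload(image_bytes, model="paddleocr", scale=1.0):
    """Package raw image bytes as a JSON string for an HTTP request.

    The keys used here are hypothetical; the real service may differ.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded, "model": model, "scale": scale})


def decode_ocr_payload(payload):
    """Inverse of build_ocr_payload, as a receiving service might do."""
    data = json.loads(payload)
    data["image"] = base64.b64decode(data["image"])
    return data
```

Base64 lets binary image data travel inside a text-based JSON body, at the cost of roughly a third more bytes on the wire.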
Requirements
- SideCar must be available with the vision_sidecar service. The operator communicates with it over HTTP and will attempt to start the service automatically if it is not already running.
- The selected OCR library must be installed in the SideCar’s Python environment:
  - PaddleOCR: paddleocr, paddlepaddle-gpu
  - EasyOCR: easyocr
- On first use, if the required library is missing, the operator will prompt you to install it (including PyTorch with CUDA support on Windows if needed).
Input/Output
Inputs
- Input 1 (TOP): The image to run OCR on. Can also be set via the Input TOP parameter on the OCR page.
Outputs
- Output 1 (DAT): Results table with columns for timestamp, text, u, v, width, height, and confidence. All coordinates are normalized (0-1).
- Output 2 (TOP): The input image overlaid with bounding boxes around detected text regions. Line width and color are configurable under the Display / TOP Out2 section on the OCR page.
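Because the table coordinates are normalized, they usually need scaling before they can be used with a specific image resolution. The sketch below converts table rows into pixel-space boxes. It assumes the first row is a header containing the column names listed above; the docs do not say whether (u, v) marks a corner or the center of a box, so the code only scales the values and leaves that interpretation to the caller.

```python
def box_to_pixels(u, v, width, height, image_width, image_height):
    """Scale a normalized (0-1) box to integer pixel units: (x, y, w, h)."""
    x = round(u * image_width)
    y = round(v * image_height)
    w = round(width * image_width)
    h = round(height * image_height)
    return x, y, w, h


def rows_to_boxes(rows, image_width, image_height):
    """Convert table rows (header row first, an assumption) to pixel boxes."""
    header, *body = rows
    col = {name: i for i, name in enumerate(header)}
    boxes = []
    for row in body:
        u, v, w, h = (float(row[col[k]]) for k in ("u", "v", "width", "height"))
        box = box_to_pixels(u, v, w, h, image_width, image_height)
        boxes.append((row[col["text"]], box))
    return boxes
```

Inside TouchDesigner, the rows could be gathered from a DAT wired to Output 1 with something like `[[c.val for c in r] for r in op('ocr_results').rows()]`, where `ocr_results` is a hypothetical operator name.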
- Connect an image TOP to the OCR operator’s input (or set it via the Input TOP parameter)
- On the OCR page, select a Model Type: PaddleOCR or EasyOCR
- Adjust Min Confidence to filter out low-confidence detections
- Optionally lower Image Scale for faster processing on large images
- Pulse Process
- The Active indicator turns on while processing runs asynchronously
- Results appear in the output table and the Output Text parameter once complete. The second TOP output shows detected regions visually.
Enable Combine Text to join all detected text blocks into a single string in Output Text. When off, only the first detected block is shown.
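The same workflow can be driven from a script. The following is a minimal sketch using the parameter names listed in the Parameters section; the helper names `start_ocr` and `read_ocr_text` are made up for this example, and the operator is passed in as an argument so the functions do not depend on a particular network path.

```python
def start_ocr(ocr_op, min_confidence=0.5, image_scale=1.0, combine_text=True):
    """Configure the OCR operator and trigger one asynchronous run."""
    ocr_op.par.Minconfidence = min_confidence
    ocr_op.par.Imagescale = image_scale
    ocr_op.par.Combinetext = combine_text
    ocr_op.par.Process.pulse()  # returns immediately; OCR runs in the background


def read_ocr_text(ocr_op):
    """Return Output Text once processing has finished, or None while active."""
    if ocr_op.par.Active.eval():
        return None  # still processing
    return ocr_op.par.Outputtext.eval()
```

In TouchDesigner this would be called as `start_ocr(op('ocr'))`. Because processing is asynchronous, avoid looping on `read_ocr_text` in the same script, which would block the main thread; check it on a later frame or use the onComplete callback instead.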
Best Practices
- Image Scale: For large images, lowering Image Scale reduces processing time without significantly impacting text detection quality. Start at 1.0 and reduce if processing is slow.
- Model selection: PaddleOCR generally provides better accuracy for structured documents. EasyOCR supports more languages out of the box.
- Confidence filtering: Raise Min Confidence to eliminate noisy detections in complex scenes. A value around 0.5 is a good starting point.
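To illustrate what confidence filtering and Combine Text amount to, here is a small sketch that operates on a list of result dicts shaped like the `results` entry passed to the onComplete callback. The inclusive threshold and the single-space separator are assumptions; the operator's internal behavior may differ.

```python
def filter_results(results, min_confidence=0.5):
    """Keep only detections at or above the confidence threshold."""
    return [r for r in results if r["confidence"] >= min_confidence]


def output_text(results, combine=True, separator=" "):
    """Join all text blocks when combine is True, else return the first one."""
    texts = [r["text"] for r in results]
    if not texts:
        return ""
    return separator.join(texts) if combine else texts[0]
```

Filtering downstream like this can be useful for comparing thresholds on one set of results without re-running OCR.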
Troubleshooting
- “No input TOP specified”: Make sure an image TOP is connected to the operator’s input or set via the Input TOP parameter.
- “ML Server not available”: The vision_sidecar service could not be reached. The operator will queue the task and retry. Check that the SideCar is running and healthy.
- Slow first run: The OCR model is loaded into memory on the first request. Subsequent requests are significantly faster.
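The “No input TOP specified” case can be checked from a script before pulsing Process. This is a rough sketch: `has_input_top` is a hypothetical helper, and it assumes an unset Input TOP parameter evaluates to `None` or an empty string.

```python
def has_input_top(ocr_op):
    """True if a TOP is wired to the input or assigned on Input TOP."""
    wired = len(ocr_op.inputs) > 0
    assigned = ocr_op.par.Inputtop.eval() not in (None, "")
    return wired or assigned
```

A typical use would be `if has_input_top(op('ocr')): op('ocr').par.Process.pulse()`.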
Parameters
Process (Process)
op('ocr').par.Process (Pulse)
- Default: False
Input TOP (Inputtop)
op('ocr').par.Inputtop (TOP)
- Default: "" (Empty String)
Status (Status)
op('ocr').par.Status (Str)
- Default: "" (Empty String)
Active (Active)
op('ocr').par.Active (Toggle)
- Default: False
Min Confidence (Minconfidence)
op('ocr').par.Minconfidence (Float)
- Default: 0.0
- Range: 0 to 1
- Slider Range: 0 to 1
Image Scale (Imagescale)
op('ocr').par.Imagescale (Float)
- Default: 0.0
- Range: 0.1 to 1
- Slider Range: 0.1 to 1
Combine Text (Combinetext)
op('ocr').par.Combinetext (Toggle)
- Default: False
Output Text (Outputtext)
op('ocr').par.Outputtext (Str)
- Default: "" (Empty String)
Display / TOP Out2 Header
Line Width (Width)
op('ocr').par.Width (Float)
- Default: 0.0
- Range: 0 to 1
- Slider Range: 0 to 1
Color (Colorr)
op('ocr').par.Colorr (RGB)
- Default: 0.0
- Range: 0 to 1
- Slider Range: 0 to 1
Color (Colorg)
op('ocr').par.Colorg (RGB)
- Default: 0.0
- Range: 0 to 1
- Slider Range: 0 to 1
Color (Colorb)
op('ocr').par.Colorb (RGB)
- Default: 0.0
- Range: 0 to 1
- Slider Range: 0 to 1
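The overlay style can also be set from a script. The sketch below writes the Line Width and Color parameters in one call and clamps each value to the documented 0 to 1 range; `set_box_style` is a hypothetical helper name, not part of the operator.

```python
def _clamp01(value):
    """Clamp a number to the documented 0 to 1 parameter range."""
    return min(max(float(value), 0.0), 1.0)


def set_box_style(ocr_op, line_width, rgb):
    """Set bounding-box line width and color on the OCR operator."""
    r, g, b = (_clamp01(c) for c in rgb)
    ocr_op.par.Width = _clamp01(line_width)
    ocr_op.par.Colorr = r
    ocr_op.par.Colorg = g
    ocr_op.par.Colorb = b
```

For example, `set_box_style(op('ocr'), 0.02, (1, 0, 0))` would request thin red boxes on the second TOP output.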
Callbacks
Callbacks Header
Callback DAT (Callbackdat)
op('ocr').par.Callbackdat (DAT)
- Default: ChatTD_callbacks
Edit Callbacks (Editcallbacksscript)
op('ocr').par.Editcallbacksscript (Pulse)
- Default: False
Create Callbacks (Createpulse)
op('ocr').par.Createpulse (Pulse)
- Default: False
onComplete (Oncomplete)
op('ocr').par.Oncomplete (Toggle)
- Default: False
Available Callbacks:
- onComplete
Example Callback Structure:
def onComplete(info):
    # Called when OCR processing completes.
    # info contains:
    #   text: list of detected text strings
    #   results: list of dicts with text, confidence, u, v, width, height
    #   processing_time: server processing time in seconds
    texts = info.get('text', [])
    results = info.get('results', [])  # .get avoids a KeyError if the key is missing
    combined = " ".join(texts)
    print(f"OCR found {len(results)} text blocks: {combined[:100]}")

Changelog
v0.2.1 (2026-03-26)
- Added OCR processing via vision_sidecar HTTP API
- Supports EasyOCR and PaddleOCR models
- Implemented results table and text output options
v0.2.0 (2026-03-01)
- Refactor to call ml_server HTTP API directly instead of SideCar methods
- Async image processing via TDAsyncIO
- Base64 image encoding for HTTP transmission
- Fix TD 32050+ freeze by using importlib.metadata for torch check
- SideCar handles actual OCR inference
- Initial commit