Caption
v2.0.0
The Caption LOP generates text descriptions of images using vision-capable large language models. Point it at any TOP in your network, write a prompt, and get back a detailed caption — useful for accessibility, content tagging, visual analysis, or feeding image understanding into downstream LLM workflows.
Input/Output

Inputs

- Input 1 (TOP): The image to caption. Any TOP operator in your network can be used as the source.
- Input 2 (DAT, optional): Conversation history table for multi-turn captioning. Must have the columns: role, message, id, timestamp.
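The shape of that optional history table can be sketched in plain Python. The column names come from the docs above; the id and timestamp formats shown here are illustrative assumptions, not a documented requirement:

```python
# Hypothetical sketch of the conversation table expected on Input 2.
# Columns (role, message, id, timestamp) match the docs; the uuid-based
# id and ISO-8601 timestamp are assumptions for illustration only.
import uuid
from datetime import datetime, timezone

def make_row(role, message):
    return [role, message, uuid.uuid4().hex,
            datetime.now(timezone.utc).isoformat()]

table = [["role", "message", "id", "timestamp"]]          # header row
table.append(make_row("user", "Describe this image in detail"))
table.append(make_row("assistant", "A sunlit street with two cyclists."))
```

In TouchDesigner, a Table DAT filled with rows like these can be wired into Input 2 to seed the conversation.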
Outputs

- Output 1 (DAT): Conversation history with the user prompt and assistant response appended.
- Output 2 (DAT): The generated caption text only.
Requirements

- ChatTD Operator: Must be configured with API keys. Set the ChatTD Operator parameter on the About page.
- Vision-capable model: The selected model must support image inputs (e.g., Gemini Flash, GPT-4o, Claude Sonnet).
Usage Examples

Basic Image Captioning

- Connect a TOP (e.g., a moviefilein or render TOP) to the Caption LOP's TOP input.
- On the Caption page, enter your prompt in Caption Prompt (e.g., “Describe this image in detail”).
- On the Model page, select an API Server and AI Model that supports vision.
- Pulse Generate Caption.
- The caption appears in the output DAT.
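The steps above can also be scripted from TouchDesigner's textport or a Script DAT. A minimal sketch, assuming a Caption LOP named caption1; a small stand-in for TouchDesigner's built-in op() is defined here so the logic can be followed (and run) outside TD:

```python
# Scripted version of the basic captioning steps. Inside TouchDesigner,
# delete the stand-in classes below -- op() and parameter access are built in.
class _Par:
    def __init__(self, val):
        self.val = val
    def pulse(self):
        self.val = True  # stand-in for pulsing a Pulse parameter

class _Op:
    class par:
        Prompt = _Par("")
        Temperature = _Par(0.0)
        Call = _Par(False)

def op(path):  # stand-in for TouchDesigner's op() lookup
    return _Op

caption = op('caption1')
caption.par.Prompt.val = 'Describe this image in detail'  # Caption Prompt
caption.par.Temperature.val = 0                           # deterministic output
caption.par.Call.pulse()                                  # Generate Caption
```

The parameter names (Prompt, Temperature, Call) match the scripting names listed in the Parameters section below.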
Multi-Turn Visual Conversation

- Set up a basic caption as above.
- Enable Include Input Conversation to carry forward previous exchanges.
- Enable Append to Conversation to build up a running dialogue.
- Enable Add User Message to include your prompt in the conversation history.
- Each time you pulse Generate Caption, the new prompt and response are added to the conversation output, allowing follow-up questions about the same image.
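The accumulation behavior of those toggles can be modeled in plain Python. This is a simulation of the documented pattern, not the operator's actual implementation; the placeholder id and timestamp values are assumptions:

```python
# Illustrative model of how Append to Conversation + Add User Message grow
# the conversation output (Output 1) across pulses of Generate Caption.
history = [["role", "message", "id", "timestamp"]]  # header row

def pulse(history, prompt, reply, append=True, add_user=True):
    turn = []
    if add_user:                                     # Add User Message toggle
        turn.append(["user", prompt, "id-u", "ts"])  # placeholder id/timestamp
    turn.append(["assistant", reply, "id-a", "ts"])
    # Append to Conversation keeps prior rows; otherwise only the header survives
    return (history + turn) if append else history[:1] + turn

history = pulse(history, "Describe this image", "A red bridge at dusk.")
history = pulse(history, "What color is the bridge?", "Red.")
# history now holds the header plus two full user/assistant exchanges
```

With Append to Conversation off, each pulse would replace the previous exchange instead of extending it.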
Continuous Captioning

- Configure a model and prompt as above.
- Toggle Active to On.
- The operator will continuously generate captions whenever triggered, useful for real-time image analysis pipelines.
Best Practices

- Use Temperature at 0 for consistent, factual descriptions; increase it for more creative or varied captions.
- Set Max Tokens to control response length; 0 uses the model's default.
- For multi-turn conversations, enable both Append to Conversation and Add User Message to maintain full context.
- Wire a conversation DAT into the input to give the model prior context when captioning related images in sequence.
Parameters

Caption

Caption Prompt (Prompt)
op('caption').par.Prompt - Str. Default: "" (empty string).

Include Input Conversation (Includeinput)
op('caption').par.Includeinput - Toggle. Default: False.

Append to Conversation (Appendconversation)
op('caption').par.Appendconversation - Toggle. Default: False.

Add User Message (Adduser)
op('caption').par.Adduser - Toggle. Default: False.

Max Tokens (Maxtokens)
op('caption').par.Maxtokens - Int. Default: 0. Range: 0 to 1. Slider range: 0 to 1.

Temperature (Temperature)
op('caption').par.Temperature - Float. Default: 0.0. Range: 0 to 1. Slider range: 0 to 1.

Active (Active)
op('caption').par.Active - Toggle. Default: False.

Generate Caption (Call)
op('caption').par.Call - Pulse. Default: False.

Model Controller (Modelcontroller)
op('caption').par.Modelcontroller - OP. Default: "" (empty string).

Search Models (Search)
op('caption').par.Search - Toggle. Default: False.

Model Search (Modelsearch)
op('caption').par.Modelsearch - Str. Default: "" (empty string).
Provider Model Documentation
Consult the documentation for your chosen provider to find supported models, API key information, and usage limits.
View LiteLLM Supported Providers →
Changelog

v2.0.0 (2025-07-30)
- Migrated from DotChatUtil to DotLOPUtils base class
- Reduced code from 380 to 130 lines (about 66% reduction)
- Simplified model selection using setup_standard_model_page()
- Replaced manual ChatTD.Customapicall with make_api_call() method
- Inherited streaming, tool execution, and error handling from DotLOPUtils
- Maintained standard conversation_dat output pattern for LLM/agent consistency
- Preserved all original functionality: vision support, conversation chaining, parameter compatibility
- Added ResetOp function with logger.Clearlog() integration
v1.0.0 (2024-11-09)
Initial release