
Token Count

v1.1.0

The Token Count LOP estimates the number of tokens in text using industry-standard tokenizers. This is essential when working with LLMs, where API costs are based on token usage and each model has a fixed context window size. Wire it into your network to monitor token counts as conversations grow.

The operator reads text from one of four configurable sources — a full conversation table, a data table column, all text content from the input, or a custom text parameter — and runs it through the selected tokenizer. The result is displayed directly on the Token Count page as a read-only parameter, updating automatically when inputs change (if enabled).

Requirements:
  • Python package: tiktoken is required for the default OpenAI tokenizer (installed automatically via the shared LOPs Python environment).
  • Optional: the transformers package is needed only if you want to use the LLaMA3 tokenizer. LLaMA3 is a gated model on Hugging Face and may require authentication via huggingface-cli login.
Inputs and Outputs:
  • Input 1 (optional): a DAT table used in Full Conversation and Table input modes. For conversation mode, the table should have role and message columns. For table mode, specify the column name using Column Selection.
  • Output 1: passes through the input connection. The token count result is displayed as the read-only Token Count parameter on the operator itself.
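The counting step itself can be sketched in a few lines of Python. This is a hedged illustration, not the operator's actual implementation: the LOP's default tokenizer is tiktoken's cl100k_base encoding, and the character-based fallback below is purely our own approximation so the sketch runs even where tiktoken is absent.

```python
# Hedged sketch of the counting step. The LOP's default tokenizer is
# tiktoken's cl100k_base; if tiktoken is unavailable, fall back to a rough
# ~4-characters-per-token approximation (our own stand-in, not the LOP's).
def count_tokens(text: str) -> int:
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        return (len(text) + 3) // 4  # crude approximation, not a real count

print(count_tokens("Tokens are the billing unit for LLM APIs."))
```

Whichever path runs, the result is a single integer, which is what the operator writes into its read-only Token Count parameter.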
Counting a full conversation:
  1. Wire the output of a Chat LOP (or any operator producing a conversation table with role and message columns) into the Token Count LOP’s input.
  2. On the Token Count page, set Input Source to “Full Conversation”.
  3. The Token Count parameter displays the total tokens across all messages.
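The steps above amount to summing per-message counts over the table's message column. The sketch below is hypothetical: the rows are made up, and a whitespace split stands in for the real tokenizer so it runs without extra packages.

```python
# Hypothetical sketch of Full Conversation mode: sum token counts over the
# `message` column of a role/message table.
count_tokens = lambda text: len(text.split())  # placeholder tokenizer

# Stand-in for the conversation DAT wired into Input 1: header row first.
rows = [
    ["role", "message"],
    ["user", "What is the context window?"],
    ["assistant", "The maximum number of tokens the model can attend to."],
]

# Skip the header, count each message, and sum the results.
total = sum(count_tokens(message) for _role, message in rows[1:])
print(total)
```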
Counting a table column:
  1. Connect a table DAT to the input — for example, a RAG index table with a content column.
  2. Set Input Source to “Table”.
  3. Set Column Selection to the column containing the text you want to count (defaults to content).
  4. The operator sums tokens across all rows in that column.
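Table mode first resolves the configured column by its header name, then sums over that column's rows. A hedged sketch, with made-up rows and a whitespace split standing in for the tokenizer; the error message mirrors the one listed under troubleshooting:

```python
# Hypothetical sketch of Table mode: look up the Column Selection name in the
# header row, then sum token counts down that column.
count_tokens = lambda text: len(text.split())  # placeholder tokenizer

rows = [
    ["id", "content"],
    ["1", "First indexed chunk."],
    ["2", "Second indexed chunk."],
]
column = "content"  # value of the Column Selection parameter

header = rows[0]
if column not in header:
    # Mirrors the operator's "No content in column 'X'" status.
    raise ValueError(f"No content in column '{column}'")
idx = header.index(column)
total = sum(count_tokens(row[idx]) for row in rows[1:])
print(total)
```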
Counting custom text:
  1. Set Input Source to “Custom Text Par”.
  2. Enter your text in the Custom Text field.
  3. The token count updates as you type (when onInputChange is enabled).
Using the LLaMA3 tokenizer:
  1. On the Token Count page, change Tokenizer Model to “LLaMA3 (Transformers)”.
  2. This uses the Hugging Face transformers library with the Meta LLaMA 3 tokenizer, which produces different counts from the OpenAI tokenizer.
  3. Make sure the transformers package is installed in your Python environment.
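If the transformers package is missing or the gated model has never been authorized, the setup typically looks like the fragment below; exact commands depend on your Python environment and Hugging Face account.

```shell
# One-time setup for the LLaMA3 tokenizer (assumes pip is available and your
# Hugging Face account has access to the gated Meta LLaMA 3 repository).
pip install transformers
huggingface-cli login   # paste a Hugging Face access token when prompted
```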
Best practices:
  • Leave onInputChange enabled for real-time monitoring during development. Disable it if the input text changes at high frequency and you want to reduce processing overhead.
  • Use OpenAI (cl100k) as the default tokenizer — it covers GPT-4, GPT-4o, and most OpenAI models. Switch to LLaMA3 only when you need accurate counts for LLaMA-family models.
  • Place the Token Count LOP downstream of your conversation-building chain to monitor total context usage before it reaches an Agent or Chat LOP.
Troubleshooting:
  • “No text input found” / “No conversation found”: The selected input mode expects data on the input wire, but nothing is connected or the table is empty. Verify the wire connection and that the upstream operator is producing output.
  • “No content in column ‘X’”: In Table mode, the specified column name does not exist in the input table or all rows are empty. Check that Column Selection matches an actual column header.
  • “Counting failed”: The tokenizer library encountered an error. Check the operator’s Logger for details. Common causes include tiktoken not being installed, or transformers failing to import due to a dependency conflict (e.g., a numpy version mismatch).
  • LLaMA3 authentication errors: The LLaMA3 model is gated on Hugging Face. Run huggingface-cli login in your Python environment to authenticate before using this tokenizer.
Status (Status) op('token_count').par.Status Str

Current processing status

Default:
No text input found
Token Count (Tokencount) op('token_count').par.Tokencount Int

Total number of tokens counted

Default:
0
Range:
0 to 1
Slider Range:
0 to 1
onInputChange (Oninputchange) op('token_count').par.Oninputchange Toggle

Recount tokens automatically whenever the input changes

Default:
True
Tokenizer Model (Tokenizer) op('token_count').par.Tokenizer Menu

Select the tokenizer to use for counting

Default:
cl100k_base
Options:
cl100k_base, llama3
Input Source (Inputmode) op('token_count').par.Inputmode Menu

Source of text to count tokens from

Default:
text
Options:
conversation, table, text, custom
Custom Text (Customtext) op('token_count').par.Customtext Str

Enter custom text to count (when using Custom Text mode)

Default:
"" (Empty String)
Column Selection (Contentcolumn) op('token_count').par.Contentcolumn StrMenu

Column name containing text to count (for Data Table mode)

Default:
content
Menu Options:
  • Index / Source Table Content (content)
  • Conversation Messages (message)
v1.1.0 (2025-07-12)

Fixed and cleaned up parameters; refactored and simplified code.

v1.0.0 (2025-01-30)

Initial release