Safety Check
The Safety Check LOP analyzes text for harmful content using toxicity detection and profanity filtering. It processes conversations or custom text, writes results to internal tables, and fires a callback when scores exceed configurable thresholds.
Overview
Safety Check runs two independent analysis backends:
- Toxicity Detection uses the Detoxify library to score text across categories including toxicity, severe toxicity, obscenity, threat, insult, and identity hate.
- Profanity Filtering uses the better-profanity library to detect and flag profane words.
Each backend writes results to its own output table. When a score exceeds the configured threshold, the operator fires an onViolation callback so you can react in real time.
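The threshold-and-callback flow can be sketched in plain Python (names like check_scores and the callback signature here are illustrative, not the operator's internal API; only the info dictionary keys match the documented callback):

```python
# Sketch: fire a callback when any score in a result exceeds the threshold.
# check_scores is a hypothetical helper; the operator does this internally.

def check_scores(scores, threshold, message_id, on_violation):
    """Fire on_violation with the documented info dict if any score exceeds threshold."""
    exceeded = {name: value for name, value in scores.items() if value > threshold}
    if exceeded:
        on_violation({
            "check_type": "toxicity",
            "details": exceeded,          # full breakdown of offending scores
            "message_id": message_id,
        })
        return True
    return False

violations = []
check_scores(
    {"toxicity": 0.91, "insult": 0.12},
    threshold=0.8,
    message_id="msg_42",
    on_violation=violations.append,
)
# violations now holds one dict flagging the 0.91 toxicity score
```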
Requirements
- Python Packages: detoxify and better_profanity. These are installed automatically through ChatTD's Python manager if missing when you first run a check.
Input/Output
Inputs
The operator reads from an internal input_table DAT. Connect or populate a table with columns: id, role, message, timestamp.
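When scripting the input, rows can be staged as plain lists in the column order above before being written into the DAT (a sketch with hypothetical helper names; in TouchDesigner you would pass each row to the table's appendRow method):

```python
# Sketch: stage conversation rows matching the input_table schema.
import time

COLUMNS = ["id", "role", "message", "timestamp"]

def make_row(msg_id, role, message):
    """Build one input_table row in column order, with a Unix timestamp."""
    return [str(msg_id), role, message, str(int(time.time()))]

rows = [COLUMNS]  # header row first
rows.append(make_row(1, "user", "Hello there"))
rows.append(make_row(2, "assistant", "Hi! How can I help?"))
```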
Outputs
One output connector. The operator maintains several internal result tables:
- Toxicity Table: Columns include toxicity_score, severe_toxicity, obscene, threat, insult, identity_hate, message_id, role, message, timestamp.
- Profanity Table: Columns include contains_profanity, profanity_probability, flagged_words, message_id, role, message, timestamp.
- Summary Table: Columns metric and value for overall analysis summaries.
Switching the Safety Checks menu between Toxicity Detection and Profanity Filtering also changes which table is shown in the operator’s viewer.
Usage Examples
Analyzing a Full Conversation
- Place a Safety Check LOP in your network.
- Populate the internal input_table with conversation data (columns: id, role, message, timestamp).
- On the Safety page, set Analyze Mode to “Full Conversation”.
- Choose the check type from the Safety Checks menu — either “Toxicity Detection” or “Profanity Filtering”.
- Set the Toxicity Threshold or Profanity Threshold to your desired sensitivity (0.0 to 1.0).
- Set Table Update Mode to control how results are written — “Append Results” adds new rows, “Clear Before Analysis” wipes previous results first.
- Pulse Start Safety Checks.
- Monitor the Status field. Results appear in the corresponding output table once processing completes.
Checking Only the Last Message
- Set Analyze Mode to “Last Message” to analyze only the most recent row of the input table.
- Pulse Start Safety Checks.
- This is useful for real-time moderation where you only need to check new incoming messages.
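The “Last Message” behavior amounts to slicing the newest data row from the table (a sketch with plain lists; the function name is illustrative):

```python
def last_message(rows):
    """Return the most recent data row, or None if the table is header-only.

    rows[0] is assumed to be the header row (id, role, message, timestamp).
    """
    return rows[-1] if len(rows) > 1 else None

table = [
    ["id", "role", "message", "timestamp"],
    ["1", "user", "first", "100"],
    ["2", "user", "latest", "200"],
]
# last_message(table) -> ["2", "user", "latest", "200"]
```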
Reacting to Violations with Callbacks
- On the Callbacks page, enable the onViolation toggle.
- Set the Callback DAT to a Text DAT containing your callback function.
- When a check result exceeds its threshold, the onViolation callback fires with a dictionary containing check_type, details (the full score breakdown), and message_id.
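A handler along these lines could route violations to a running log so downstream logic can react (a hedged sketch; only the info dictionary keys are from the documentation, the logging itself is illustrative):

```python
# Sketch: collect violations as they fire so other logic can inspect them.
violation_log = []

def onViolation(info):
    """Record each violation as a (check_type, message_id) pair."""
    entry = (info["check_type"], info["message_id"])
    violation_log.append(entry)
    return entry

onViolation({
    "check_type": "profanity",
    "details": {"contains_profanity": True},
    "message_id": "7",
})
```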
Table Update Modes
The Table Update Mode parameter controls how results are written to the output tables:
- Append Results: Adds a new row for each analyzed message. Good for building up a history of checks.
- Replace By Index: Overwrites the row at the same index as the source message. Useful when re-checking an updated conversation.
- Batch Update: Collects all results and writes them at once after all messages are processed. Can be faster for large inputs.
- Clear Before Analysis: Clears all result tables before starting, then appends new results.
You can also pulse Clear Results at any time to manually wipe all output tables.
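The four modes above can be sketched against a list-of-rows table (illustrative only; the operator manages its result DATs internally, and apply_results is a hypothetical helper):

```python
def apply_results(table, results, mode):
    """Write result rows into table under the given update mode.

    table: [header_row, *data_rows]; results: new data rows.
    """
    if mode == "Clear Before Analysis":
        del table[1:]          # wipe old data rows, keep the header
        table.extend(results)
    elif mode in ("Append Results", "Batch Update"):
        # Batch Update differs only in timing: all rows arrive in one pass.
        table.extend(results)
    elif mode == "Replace By Index":
        for i, row in enumerate(results, start=1):
            if i < len(table):
                table[i] = row  # overwrite the row at the same index
            else:
                table.append(row)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return table
```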
Performance Considerations
- The Detoxify model loads on first use, which may cause a brief delay. Subsequent checks are faster.
- For large conversations, use “Last Message” analyze mode to check only new content instead of re-processing the entire history.
- “Batch Update” mode can be more efficient for processing many messages at once since it writes all results in a single pass.
Troubleshooting
- “Error installing dependencies”: The operator tried to auto-install detoxify or better_profanity but failed. Check your Python environment and internet connection, then try again.
- “Safety checks already in progress”: Wait for the current check to finish before pulsing Start Safety Checks again.
- “Message index out of range”: When using “Specific Message Index” analyze mode, ensure the index points to a valid row in the input table.
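The index check behind that last error can be illustrated with a small guard (a sketch; validate_index is a hypothetical helper, and row 0 is assumed to be the header):

```python
def validate_index(table, index):
    """Return the data row at index, or raise if it falls outside the table."""
    if not 1 <= index < len(table):
        raise IndexError(
            f"message index {index} out of range (valid: 1..{len(table) - 1})"
        )
    return table[index]
```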
Parameters
Safety
op('safety_check').par.Check (Pulse) - Default: False
op('safety_check').par.Status (Str) - Default: "" (empty string)
op('safety_check').par.Toxicitythreshold (Float) - Default: 0.0 - Range: 0 to 1 - Slider Range: 0 to 1
op('safety_check').par.Profanitythreshold (Float) - Default: 0.0 - Range: 0 to 1 - Slider Range: 0 to 1
op('safety_check').par.Clear (Pulse) - Default: False
Callbacks
op('safety_check').par.Callbackdat (DAT) - Default: ChatTD_callbacks
op('safety_check').par.Editcallbacksscript (Pulse) - Default: False
op('safety_check').par.Createpulse (Pulse) - Default: False
op('safety_check').par.Onviolation (Toggle) - Default: False
Callbacks
onViolation
def onViolation(info):
    # Called when a safety check score exceeds its threshold
    # info contains:
    #   check_type: 'toxicity' or 'profanity'
    #   details: dict with scores and flags from the check
    #   message_id: ID of the message that triggered the violation
    print(f"Violation: {info['check_type']} on message {info['message_id']}")

Changelog
v1.0.0 - 2024-11-10
Initial release