Safety Check
The Safety Check LOP analyzes text for harmful content using toxicity detection and profanity filtering. It processes conversations or custom text, writes results to internal tables, and fires a callback when scores exceed configurable thresholds.
Overview
Safety Check runs two independent analysis backends:
- Toxicity Detection uses the Detoxify library to score text across categories including toxicity, severe toxicity, obscenity, threat, insult, and identity hate.
- Profanity Filtering uses the better-profanity library to detect and flag profane words.
Each backend writes results to its own output table. When a score exceeds the configured threshold, the operator fires an onViolation callback so you can react in real time.
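The threshold-and-callback flow can be sketched in plain Python (names like check_scores and the callback signature here are illustrative, not the operator's internal API; only the info dictionary keys match the documented callback):

```python
# Sketch: fire a callback when any score in a result exceeds the threshold.
# check_scores is a hypothetical helper; the operator does this internally.

def check_scores(scores, threshold, message_id, on_violation):
    """Fire on_violation with the documented info dict if any score exceeds threshold."""
    exceeded = {name: value for name, value in scores.items() if value > threshold}
    if exceeded:
        on_violation({
            "check_type": "toxicity",
            "details": exceeded,          # full breakdown of offending scores
            "message_id": message_id,
        })
        return True
    return False

violations = []
check_scores(
    {"toxicity": 0.91, "insult": 0.12},
    threshold=0.8,
    message_id="msg_42",
    on_violation=violations.append,
)
# violations now holds one dict flagging the 0.91 toxicity score
```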
Requirements
- Python Packages: detoxify and better_profanity. These are installed automatically through ChatTD's Python manager if missing when you first run a check.
Input/Output
Inputs
The operator reads from an internal input_table DAT. Connect or populate a table with columns: id, role, message, timestamp.
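When scripting the input, rows can be staged as plain lists in the column order above before being written into the DAT (a sketch with hypothetical helper names; in TouchDesigner you would pass each row to the table's appendRow method):

```python
# Sketch: stage conversation rows matching the input_table schema.
import time

COLUMNS = ["id", "role", "message", "timestamp"]

def make_row(msg_id, role, message):
    """Build one input_table row in column order, with a Unix timestamp."""
    return [str(msg_id), role, message, str(int(time.time()))]

rows = [COLUMNS]  # header row first
rows.append(make_row(1, "user", "Hello there"))
rows.append(make_row(2, "assistant", "Hi! How can I help?"))
```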
Outputs
One output connector. The operator maintains several internal result tables:
- Toxicity Table: Columns include toxicity_score, severe_toxicity, obscene, threat, insult, identity_hate, message_id, role, message, timestamp.
- Profanity Table: Columns include contains_profanity, profanity_probability, flagged_words, message_id, role, message, timestamp.
- Summary Table: Columns metric and value for overall analysis summaries.
Switching the Safety Checks menu between Toxicity Detection and Profanity Filtering also changes which table is shown in the operator’s viewer.
Usage Examples
Analyzing a Full Conversation
- Place a Safety Check LOP in your network.
- Populate the internal input_table with conversation data (columns: id, role, message, timestamp).
- On the Safety page, set Analyze Mode to “Full Conversation”.
- Choose the check type from the Safety Checks menu — either “Toxicity Detection” or “Profanity Filtering”.
- Set the Toxicity Threshold or Profanity Threshold to your desired sensitivity (0.0 to 1.0).
- Set Table Update Mode to control how results are written — “Append Results” adds new rows, “Clear Before Analysis” wipes previous results first.
- Pulse Start Safety Checks.
- Monitor the Status field. Results appear in the corresponding output table once processing completes.
Checking Only the Last Message
- Set Analyze Mode to “Last Message” to analyze only the most recent row of the input table.
- Pulse Start Safety Checks.
- This is useful for real-time moderation where you only need to check new incoming messages.
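The “Last Message” behavior amounts to slicing the newest data row from the table (a sketch with plain lists; the function name is illustrative):

```python
def last_message(rows):
    """Return the most recent data row, or None if the table is header-only.

    rows[0] is assumed to be the header row (id, role, message, timestamp).
    """
    return rows[-1] if len(rows) > 1 else None

table = [
    ["id", "role", "message", "timestamp"],
    ["1", "user", "first", "100"],
    ["2", "user", "latest", "200"],
]
# last_message(table) -> ["2", "user", "latest", "200"]
```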
Reacting to Violations with Callbacks
- On the Callbacks page, enable the onViolation toggle.
- Set the Callback DAT to a Text DAT containing your callback function.
- When a check result exceeds its threshold, the onViolation callback fires with a dictionary containing check_type, details (the full score breakdown), and message_id.
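A handler along these lines could route violations to a running log so downstream logic can react (a hedged sketch; only the info dictionary keys are from the documentation, the logging itself is illustrative):

```python
# Sketch: collect violations as they fire so other logic can inspect them.
violation_log = []

def onViolation(info):
    """Record each violation as a (check_type, message_id) pair."""
    entry = (info["check_type"], info["message_id"])
    violation_log.append(entry)
    return entry

onViolation({
    "check_type": "profanity",
    "details": {"contains_profanity": True},
    "message_id": "7",
})
```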
Table Update Modes
The Table Update Mode parameter controls how results are written to the output tables:
- Append Results: Adds a new row for each analyzed message. Good for building up a history of checks.
- Replace By Index: Overwrites the row at the same index as the source message. Useful when re-checking an updated conversation.
- Batch Update: Collects all results and writes them at once after all messages are processed. Can be faster for large inputs.
- Clear Before Analysis: Clears all result tables before starting, then appends new results.
You can also pulse Clear Results at any time to manually wipe all output tables.
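The four modes above can be sketched against a list-of-rows table (illustrative only; the operator manages its result DATs internally, and apply_results is a hypothetical helper):

```python
def apply_results(table, results, mode):
    """Write result rows into table under the given update mode.

    table: [header_row, *data_rows]; results: new data rows.
    """
    if mode == "Clear Before Analysis":
        del table[1:]          # wipe old data rows, keep the header
        table.extend(results)
    elif mode in ("Append Results", "Batch Update"):
        # Batch Update differs only in timing: all rows arrive in one pass.
        table.extend(results)
    elif mode == "Replace By Index":
        for i, row in enumerate(results, start=1):
            if i < len(table):
                table[i] = row  # overwrite the row at the same index
            else:
                table.append(row)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return table
```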
Performance Considerations
- The Detoxify model loads on first use, which may cause a brief delay. Subsequent checks are faster.
- For large conversations, use “Last Message” analyze mode to check only new content instead of re-processing the entire history.
- “Batch Update” mode can be more efficient for processing many messages at once since it writes all results in a single pass.
Troubleshooting
- “Error installing dependencies”: The operator tried to auto-install detoxify or better_profanity but failed. Check your Python environment and internet connection, then try again.
- “Safety checks already in progress”: Wait for the current check to finish before pulsing Start Safety Checks again.
- “Message index out of range”: When using “Specific Message Index” analyze mode, ensure the index points to a valid row in the input table.
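The index check behind that last error can be illustrated with a small guard (a sketch; validate_index is a hypothetical helper, and row 0 is assumed to be the header):

```python
def validate_index(table, index):
    """Return the data row at index, or raise if it falls outside the table."""
    if not 1 <= index < len(table):
        raise IndexError(
            f"message index {index} out of range (valid: 1..{len(table) - 1})"
        )
    return table[index]
```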
Parameters
Safety
op('safety_check').par.Check (Pulse) - Default: False
op('safety_check').par.Status (Str) - Default: "" (empty string)
op('safety_check').par.Toxicitythreshold (Float) - Default: 0.0 - Range: 0 to 1 - Slider Range: 0 to 1
op('safety_check').par.Profanitythreshold (Float) - Default: 0.0 - Range: 0 to 1 - Slider Range: 0 to 1
op('safety_check').par.Clear (Pulse) - Default: False
Callbacks
op('safety_check').par.Callbackdat (DAT) - Default: ChatTD_callbacks
op('safety_check').par.Editcallbacksscript (Pulse) - Default: False
op('safety_check').par.Createpulse (Pulse) - Default: False
op('safety_check').par.Onviolation (Toggle) - Default: False
Callbacks
onViolation
def onViolation(info):
    # Called when a safety check score exceeds its threshold
    # info contains:
    #   check_type: 'toxicity' or 'profanity'
    #   details: dict with scores and flags from the check
    #   message_id: ID of the message that triggered the violation
    print(f"Violation: {info['check_type']} on message {info['message_id']}")

Changelog
v1.0.0 - 2024-11-10
Initial release