Overview
Guardrails AI validators are available as first-class scorers in MLflow’s GenAI evaluation framework starting with MLflow 3.10.0. This integration was contributed by Debu Sinha in MLflow PR #20038. It lets you use Guardrails validators to evaluate LLM outputs for safety, PII detection, and content quality directly within MLflow’s evaluation pipelines.
Key Features
- No LLM Required: All validators run locally using efficient classifiers - no API calls needed
- Production Tested: Battle-tested validators from the Guardrails Hub
- Easy Integration: Works seamlessly with MLflow’s `mlflow.genai.evaluate()` API
- Comprehensive Coverage: Safety, PII, secrets, and quality validators included
Prerequisites
Install MLflow with Guardrails support:
Available Validators
The following Guardrails validators are available as MLflow scorers:
| Scorer | Description | Use Case |
|---|---|---|
| `ToxicLanguage` | Detects toxic or harmful content | Content moderation |
| `NSFWText` | Identifies inappropriate content | Safety filtering |
| `DetectJailbreak` | Detects prompt injection attempts | Security |
| `DetectPII` | Identifies PII (emails, phones, names) | Privacy compliance |
| `SecretsPresent` | Detects API keys and secrets | Security |
| `GibberishText` | Identifies nonsensical text | Quality control |
Basic Usage
Direct Scorer Calls
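A direct call scores one string at a time, which is handy for spot checks. The sketch below is a minimal illustration rather than the integration's documented usage. It assumes the scorers are importable from `mlflow.genai.scorers.guardrails`, that a scorer instance is callable with an `outputs` keyword like other MLflow scorers, and that the call returns an MLflow `Feedback` object. `check_toxicity` is a hypothetical helper name.

```python
def check_toxicity(text: str):
    """Score a single string with the ToxicLanguage scorer."""
    # Assumption: import path for the Guardrails scorers in MLflow >= 3.10.0.
    # Imported lazily so this module still loads where the packages are missing.
    from mlflow.genai.scorers.guardrails import ToxicLanguage

    scorer = ToxicLanguage()
    # Assumption: the scorer takes the text under the `outputs` keyword
    # and returns an MLflow Feedback object.
    return scorer(outputs=text)
```

If those assumptions hold, `check_toxicity("Thanks for your patience, your refund is on its way.")` should return a feedback object whose `value` and `rationale` attributes carry the verdict and the validator's explanation.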
Batch Evaluation with mlflow.genai.evaluate
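For batch runs, `mlflow.genai.evaluate()` accepts a list of records plus a list of scorers. The following is a minimal sketch: the sample records are invented for illustration, `run_batch_evaluation` is a hypothetical helper, and the import path `mlflow.genai.scorers.guardrails` is an assumption.

```python
# Each record pairs the request (`inputs`, a dict) with the model's reply (`outputs`).
# The records are made-up sample data.
eval_data = [
    {
        "inputs": {"question": "How do I reset my password?"},
        "outputs": "Open Settings, choose Security, then select Reset Password.",
    },
    {
        "inputs": {"question": "Who should I contact about billing?"},
        "outputs": "You can reach our billing team at billing@example.com.",
    },
]


def run_batch_evaluation(data):
    """Score every record with two Guardrails scorers in one evaluation run."""
    import mlflow

    # Assumption: import path for the Guardrails scorers in MLflow >= 3.10.0.
    from mlflow.genai.scorers.guardrails import DetectPII, ToxicLanguage

    return mlflow.genai.evaluate(
        data=data,
        scorers=[ToxicLanguage(), DetectPII()],
    )
```

Calling `run_batch_evaluation(eval_data)` requires MLflow and the Guardrails validators to be installed; the second record contains an email address, so it is the one a PII scorer would be expected to flag.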
Configuration Options
ToxicLanguage
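The Guardrails Hub `ToxicLanguage` validator takes a `threshold` and a `validation_method` (`"sentence"` or `"full"`). The sketch below assumes the MLflow scorer of the same name forwards those keyword arguments unchanged; the values are illustrative rather than recommended defaults, and `make_toxic_language_scorer` is a hypothetical helper.

```python
# Illustrative values only.
TOXIC_LANGUAGE_OPTIONS = {
    # Classifier confidence above which text is treated as toxic.
    "threshold": 0.7,
    # "sentence" scores each sentence separately; "full" scores the whole text.
    "validation_method": "sentence",
}


def make_toxic_language_scorer(**overrides):
    """Build a ToxicLanguage scorer from the options above plus any overrides."""
    # Assumption: import path, and that the scorer forwards these keyword
    # arguments to the underlying Guardrails validator.
    from mlflow.genai.scorers.guardrails import ToxicLanguage

    return ToxicLanguage(**{**TOXIC_LANGUAGE_OPTIONS, **overrides})
```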
DetectPII
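The Guardrails Hub `DetectPII` validator is configured with a `pii_entities` list of Presidio entity names. As a hedged sketch, assuming the MLflow scorer passes that argument through, the entity list can be narrowed to the categories named in the table above (emails, phones, names). `make_pii_scorer` is a hypothetical helper.

```python
# Presidio entity names for emails, phone numbers, and personal names.
PII_ENTITIES = ["EMAIL_ADDRESS", "PHONE_NUMBER", "PERSON"]


def make_pii_scorer(entities=PII_ENTITIES):
    """Build a DetectPII scorer restricted to the given entity types."""
    # Assumption: import path, and that `pii_entities` is forwarded to the
    # underlying Guardrails validator.
    from mlflow.genai.scorers.guardrails import DetectPII

    return DetectPII(pii_entities=list(entities))
```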
DetectJailbreak
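Jailbreak detection is usually tuned through a single score threshold. The sketch below is built on several assumptions: that the MLflow `DetectJailbreak` scorer accepts a `threshold` keyword like the Guardrails Hub validator, that a lower threshold flags more inputs, and that the profile names and numbers (all hypothetical) suit your traffic.

```python
# Hypothetical profiles; the numbers are placeholders to tune on your own data.
JAILBREAK_THRESHOLDS = {
    "strict": 0.6,    # flags more borderline prompts
    "balanced": 0.8,  # flags only higher-confidence attempts
}


def make_jailbreak_scorer(profile: str = "balanced"):
    """Build a DetectJailbreak scorer for one of the named profiles."""
    # Assumption: import path and the `threshold` keyword.
    from mlflow.genai.scorers.guardrails import DetectJailbreak

    return DetectJailbreak(threshold=JAILBREAK_THRESHOLDS[profile])
```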
Dynamic Scorer Creation
Use `get_scorer` to create scorers dynamically:
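Dynamic creation is useful when the set of validators comes from configuration rather than code. The sketch below assumes `get_scorer` lives in `mlflow.genai.scorers.guardrails`, takes the validator name as its first argument, and forwards extra keyword arguments to the validator. `SCORER_SPECS` and `build_scorers` are hypothetical names.

```python
# (validator name, keyword options) pairs, e.g. loaded from a config file.
SCORER_SPECS = [
    ("ToxicLanguage", {"threshold": 0.7}),
    ("DetectPII", {}),
    ("SecretsPresent", {}),
]


def build_scorers(specs=SCORER_SPECS):
    """Turn (name, options) pairs into scorer instances."""
    # Assumption: import path and the `get_scorer(name, **options)` signature.
    from mlflow.genai.scorers.guardrails import get_scorer

    return [get_scorer(name, **options) for name, options in specs]
```

The resulting list can be passed as the `scorers` argument of `mlflow.genai.evaluate()`.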
Example: Safety Pipeline
Here’s a complete example evaluating LLM outputs for safety:
Viewing Results
Results are automatically logged to MLflow:
Best Practices
- Layer Multiple Validators: Combine safety validators for comprehensive coverage
- Tune Thresholds: Adjust thresholds based on your use case sensitivity
- Run Early: Evaluate outputs before returning to users
- Log Results: Use MLflow tracking to monitor safety metrics over time
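The first and third practices (layering validators and running them before a response is returned) can be sketched as a small gate function. This is an illustration with stand-in checks written in plain Python, not part of the MLflow or Guardrails APIs; `gate_response`, `no_email`, `not_empty`, and `FALLBACK` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

FALLBACK = "Sorry, I can't share that response."


def gate_response(
    text: str,
    checks: Dict[str, Callable[[str], bool]],
    fallback: str = FALLBACK,
) -> Tuple[str, List[str]]:
    """Run every check; return the text to send and the names of failed checks."""
    failed = [name for name, passes in checks.items() if not passes(text)]
    return (fallback if failed else text), failed


# Stand-in checks for illustration; real checks would wrap scorer calls.
def no_email(text: str) -> bool:
    return "@" not in text


def not_empty(text: str) -> bool:
    return bool(text.strip())


checks = {"pii": no_email, "non_empty": not_empty}

print(gate_response("Your order has shipped.", checks))
print(gate_response("Contact me at jane@example.com", checks))
```

With real scorers, each check could wrap a scorer call and compare the returned feedback value against whichever value your scorer version uses to signal a pass. Logging the list of failed check names alongside each response gives a simple safety metric to track over time.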
Related Resources
- MLflow GenAI Evaluation Docs
- Guardrails Hub - Browse all available validators
- MLflow PR #20038 - Original integration PR