Guardrails AI
Similar To Previous Values
Checks if a value is similar to a list of previously known correct values.
en
string
integer
ML
Factuality
Structured data

Overview

Updated 2 years ago
Developed by:
Guardrails AI
Date of development:
Feb 15, 2024
Validator type:
Quality
License:
Apache 2
Input/Output:
Output

Description
Intended Use

This validator checks that a value is similar to a list of previously known correct values.

For example, suppose you are extracting structured data from a PDF document. If you have a golden dataset of previously extracted values, this validator ensures that a newly extracted value is not too different from those known good values.

This validator works on numerical and string types in the following manner:

  1. For numbers, this validator checks that the extracted value is within k standard deviations of the mean of the previous values.
  2. For strings, this validator embeds the extracted value, generates embeddings for all reference values, and checks that the average semantic similarity exceeds a threshold.
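
The two checks above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the validator's actual source: the function names are hypothetical, and it assumes the population standard deviation for numbers and cosine similarity for embeddings.

```python
import numpy as np


def within_k_std(value, prev_values, k=3):
    """Numeric check: is value within k standard deviations of the mean?

    Assumption: uses the population standard deviation (NumPy's default).
    """
    prev = np.asarray(prev_values, dtype=float)
    mean, std = prev.mean(), prev.std()
    return bool(abs(value - mean) <= k * std)


def avg_cosine_similarity(value_vec, prev_vecs):
    """String check: mean cosine similarity between one embedding and a set.

    value_vec has shape (d,), prev_vecs has shape (n, d).
    """
    value_vec = np.asarray(value_vec, dtype=float)
    prev_vecs = np.asarray(prev_vecs, dtype=float)
    norms = np.linalg.norm(prev_vecs, axis=1) * np.linalg.norm(value_vec)
    sims = (prev_vecs @ value_vec) / norms
    return float(sims.mean())
```

With previous values [10, 11, 9, 10], the mean is 10 and the standard deviation is about 0.71, so at k=3 a value of 10.5 passes and a value of 20 fails. For strings, the returned average would then be compared against the threshold.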
Requirements
  • Dependencies:
    • guardrails-ai>=0.4.0
Installation
$ guardrails hub install hub://guardrails/similar_to_previous_values
Usage Examples
Validating string output via Python

In this example, we apply the validator to a string output generated by an LLM.

# Import Guard and Validator
from guardrails.hub import SimilarToPreviousValues
from guardrails import Guard
import numpy as np
import os
from typing import List, Union

try:
    import cohere
except ImportError:
    raise ImportError(
        "This example requires the `cohere` package. "
        "Install it with `pip install cohere`, and try again."
    )

# Create a cohere client
cohere_key = os.environ["COHERE_API_KEY"]
cohere_client = cohere.Client(api_key=cohere_key)


def embed_function(text: Union[str, List[str]]) -> np.ndarray:
    """Embed the text using cohere's small model."""
    # If text is a string, wrap it in a list
    if isinstance(text, str):
        text = [text]

    response = cohere_client.embed(
        model="embed-english-light-v2.0",
        texts=text,
    )
    embeddings_list = response.embeddings
    return np.array(embeddings_list)


# Use the Guard with the validator
guard = Guard().use(
    SimilarToPreviousValues,
    threshold=0.6,  # Increase the threshold to make the validator stricter
    on_fail="exception",
)


# Test passing response
guard.validate(
    """
    You are so amazing!
    """,
    metadata={
        "prev_values": ["You are amazing", "You are awesome.", "You are great!"],
        "embed_function": embed_function,
    },
)

try:
    # Test failing response
    guard.validate(
        """
        Why don't you go to hell?
        """,
        metadata={
            "prev_values": ["You are amazing", "You are awesome.", "You are great!"],
            "embed_function": embed_function,
        },
    )
except Exception as e:
    print(e)

Output:

Validation failed for field with errors: The value 
	- Why don't you go to hell?
is not semantically similar to the previous values. Avg. similarity: 0.24 < Threshold: 0.6.
API Reference

__init__(self, standard_deviations=3, threshold=0.3, on_fail="noop")

Initializes a new instance of the Validator class.

Parameters

  • standard_deviations (int): Maximum number of standard deviations from the mean that the extracted value may lie. Applies to numbers. Defaults to 3.
  • threshold (float): Average similarity threshold below which the validator will fail. Applies to strings. Defaults to 0.8.
  • on_fail (str, Callable): The policy to enact when a validator fails. If str, must be one of reask, fix, filter, refrain, noop, exception or fix_reask. Otherwise, must be a function that is called when the validator fails.

__call__(self, value, metadata={}) -> ValidationResult

Validates the given value using the rules defined in this validator, relying on the metadata provided to customize the validation process. This method is invoked automatically by guard.parse(...) (or guard.validate(...), as in the usage example), so the validation logic is applied to the input data without calling it yourself.

Note:

  1. This method should not be called directly by the user. Instead, invoke guard.parse(...), which calls this method internally for each associated Validator.
  2. When invoking guard.parse(...), be sure to pass a metadata dictionary that includes the keys and values required by this validator. If the guard is associated with multiple validators, combine all necessary metadata into a single dictionary.
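
As a sketch of the second note, one way to combine metadata for several validators is a small helper that merges dictionaries and refuses to silently overwrite a key. The helper name and the `other_validator_key` entry are hypothetical and not part of Guardrails; only `prev_values` comes from the usage example.

```python
def merge_metadata(*parts):
    """Merge per-validator metadata dicts into one, rejecting duplicate keys."""
    merged = {}
    for part in parts:
        overlap = merged.keys() & part.keys()
        if overlap:
            raise ValueError(f"Conflicting metadata keys: {sorted(overlap)}")
        merged.update(part)
    return merged


# Metadata for this validator (key taken from the usage example).
similar_meta = {"prev_values": ["You are amazing", "You are awesome."]}
# Hypothetical metadata for some other validator on the same guard.
other_meta = {"other_validator_key": "some value"}

metadata = merge_metadata(similar_meta, other_meta)
```

The resulting `metadata` dictionary is what would be passed as the `metadata` argument when running the guard.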

Parameters

  • value (Any): The input value to validate.

  • metadata (dict): A dictionary containing metadata required for validation. Keys and values must match the expectations of this validator.

    Key             Type      Description                                        Default
    prev_values     list      List of previous values to pass to the validator   N/A
    embed_function  Callable  Function to embed the input text                   sentence-transformer's paraphrase-MiniLM-L6-v2
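
For offline experiments, a stand-in `embed_function` can follow the same contract as the Cohere example: accept a string or a list of strings and return a 2-D NumPy array with one row per text. The sketch below is a hypothetical toy (a hashed bag-of-words), is not semantically meaningful, and is not a substitute for a real embedding model. The name `toy_embed_function` and the dimension of 64 are assumptions made for illustration.

```python
import hashlib
from typing import List, Union

import numpy as np

DIM = 64  # arbitrary embedding size chosen for this sketch


def toy_embed_function(text: Union[str, List[str]]) -> np.ndarray:
    """Hashed bag-of-words embedding: one row per input text."""
    # If text is a string, wrap it in a list
    if isinstance(text, str):
        text = [text]

    out = np.zeros((len(text), DIM), dtype=float)
    for row, sentence in enumerate(text):
        for token in sentence.lower().split():
            # Stable hash so results are reproducible across runs
            digest = hashlib.md5(token.encode("utf-8")).digest()
            idx = int.from_bytes(digest[:4], "big") % DIM
            out[row, idx] += 1.0
    return out
```

Such a function would be supplied under the `embed_function` metadata key in place of the Cohere-backed one. Because it only counts shared tokens, similarity scores from it reflect word overlap rather than meaning.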