LettuceDetect

A lightweight hallucination detection framework for RAG applications.

LettuceDetect is an encoder-based model built on ModernBERT that detects unsupported spans in LLM-generated answers by comparing them against provided context. It provides token-level precision for identifying exactly which parts of an answer are hallucinated.

Highlights

Token-level precision — identifies exact hallucinated spans, not just "this answer has a problem"
Fast inference — 30-60 samples/sec on A100, suitable for production
Long context — supports up to 4K tokens (ModernBERT) or 8K tokens (EuroBERT)
Multilingual — English, German, French, Spanish, Italian, Polish, Chinese, Hungarian
Open source — MIT license, models on HuggingFace
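To make the token-level idea concrete, here is a minimal sketch of how per-token labels can be merged into character spans. It does not call LettuceDetect: the `Token` record, its fields, and `merge_flagged_tokens` are hypothetical names chosen for illustration, and the library's real token output may be shaped differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical per-token prediction (not LettuceDetect's actual schema)."""
    start: int     # character offset where the token begins in the answer
    end: int       # character offset just past the end of the token
    flagged: bool  # True if the token was classified as unsupported


def merge_flagged_tokens(answer: str, tokens: list[Token]) -> list[dict]:
    """Merge runs of consecutive flagged tokens into character-level spans."""
    spans: list[dict] = []
    current = None  # span being built, or None when outside a flagged run
    for tok in tokens:
        if tok.flagged:
            if current is None:
                current = {"start": tok.start, "end": tok.end}
            else:
                current["end"] = tok.end  # extend the open span
        elif current is not None:
            spans.append(current)  # an unflagged token closes the run
            current = None
    if current is not None:
        spans.append(current)
    for span in spans:
        span["text"] = answer[span["start"]:span["end"]]
    return spans


answer = "Paris has 69 million people."
tokens = [
    Token(0, 5, False),    # "Paris"
    Token(6, 9, False),    # "has"
    Token(10, 12, True),   # "69"
    Token(13, 20, True),   # "million"
    Token(21, 28, False),  # "people."
]
spans = merge_flagged_tokens(answer, tokens)
print(spans)
```

Because adjacent flagged tokens collapse into one span, the result points at the unsupported phrase itself ("69 million") rather than marking the whole answer as problematic.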

Quick Example

from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedetect-base-modernbert-en-v1"
)

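# Retrieved context passages, the user question, and the LLM answer to check.
contexts = ["The capital of France is Paris. The population is 67 million."]
question = "What is the capital and population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

# Request character-level spans of the answer that the context does not support.
predictions = detector.predict(
    context=contexts, question=question, answer=answer, output_format="spans"
)
print(predictions)
# Example output (start/end are character offsets into `answer`):
# [{'start': 31, 'end': 71, 'confidence': 0.99, 'text': ' The population of France is 69 million.'}]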
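Once spans are returned, a common next step is to mark them in the answer for display. The sketch below does this with plain string slicing, using the sample span from the example output above; the answer string is written so that the offsets 31 to 71 line up. `highlight_spans`, the `[[...]]` markers, and the `min_confidence` threshold are illustrative choices, not part of LettuceDetect, and the helper assumes the spans do not overlap.

```python
from __future__ import annotations


def highlight_spans(answer: str, spans: list[dict], min_confidence: float = 0.5) -> str:
    """Wrap each predicted span in [[...]] markers, skipping low-confidence ones.

    Assumes spans are non-overlapping and use character offsets into `answer`.
    """
    kept = sorted(
        (s for s in spans if s["confidence"] >= min_confidence),
        key=lambda s: s["start"],
    )
    parts = []
    cursor = 0  # position in `answer` up to which text has been emitted
    for span in kept:
        parts.append(answer[cursor:span["start"]])                    # supported text
        parts.append("[[" + answer[span["start"]:span["end"]] + "]]")  # flagged text
        cursor = span["end"]
    parts.append(answer[cursor:])  # trailing supported text, if any
    return "".join(parts)


answer = "The capital of France is Paris. The population of France is 69 million."
predictions = [
    {
        "start": 31,
        "end": 71,
        "confidence": 0.99,
        "text": " The population of France is 69 million.",
    }
]
marked = highlight_spans(answer, predictions)
print(marked)
```

Raising `min_confidence` trades recall for precision: spans below the threshold are left unmarked.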

Performance

Model	Example F1	vs GPT-4	vs Luna	Parameters
lettucedetect-base-v1	76.8%	+13.4%	+11.4%	149M
lettucedetect-large-v1	79.2%	+15.8%	+13.8%	395M

Evaluated on the RAGTruth test set; the "vs" columns give the absolute difference in F1, in percentage points. LettuceDetect surpasses GPT-4, Luna, and fine-tuned Llama-2-13B.
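Reading the "vs" columns as absolute F1 differences (an interpretation, not something the table states), the baseline scores they imply can be recovered by subtraction. The short check below uses only the numbers in the table; the dictionary keys are made up for illustration.

```python
# Values copied from the table above; key names are illustrative.
rows = {
    "lettucedetect-base-v1": {"f1": 76.8, "vs_gpt4": 13.4, "vs_luna": 11.4},
    "lettucedetect-large-v1": {"f1": 79.2, "vs_gpt4": 15.8, "vs_luna": 13.8},
}

# Baseline F1 implied by each row: model F1 minus the reported margin.
implied = {
    name: {
        "gpt4": round(r["f1"] - r["vs_gpt4"], 1),
        "luna": round(r["f1"] - r["vs_luna"], 1),
    }
    for name, r in rows.items()
}

for name, baseline in implied.items():
    print(name, baseline)
```

Both rows imply the same baselines (63.4 for GPT-4 and 65.4 for Luna), which is what one would expect if the margins are percentage-point differences rather than relative improvements.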

What's New

Code Hallucination Dataset — A pipeline for generating span-level code hallucination data from SWE-bench (~18k samples across 53 repos)
Multilingual models — EuroBERT-based models for 8 languages
Web API — FastAPI server with async client support
