
Core Types

Data structures used throughout RuleChef.

TaskType


Bases: Enum

Type of task being performed.

  • EXTRACTION: Find text spans (untyped). Output: {"spans": [{"text", "start", "end"}]}
  • NER: Find typed entities. Output: {"entities": [{"text", "start", "end", "type"}]}
  • CLASSIFICATION: Classify input into a label. Output: {"label": "class_name"}
  • TRANSFORMATION: Extract structured data to a custom output schema you define. Like GLiNER: define {"company": "str", "amount": "str"} and get exactly that.
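To make the shapes concrete, here are illustrative outputs for each task type. The specific values (drug dosages, company names) are invented examples, not RuleChef defaults:

```python
# Illustrative outputs for each TaskType; the values are made-up examples.
extraction_out = {"spans": [{"text": "200 mg", "start": 5, "end": 11}]}
ner_out = {"entities": [{"text": "200 mg", "start": 5, "end": 11, "type": "DOSE"}]}
classification_out = {"label": "dosage_instruction"}
# TRANSFORMATION: you define the schema, e.g. {"company": "str", "amount": "str"}
transformation_out = {"company": "Acme Corp", "amount": "$5,000"}
```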

RuleFormat


Bases: Enum

Rule representation formats: REGEX, CODE, or SPACY.

Task

Task(name, description, input_schema, output_schema, type=TaskType.EXTRACTION, output_matcher=None, matching_mode='text', text_field=None) dataclass

Abstract task definition. Describes what we're trying to accomplish.

Attributes:

Name Type Description
name str

Task name

description str

Free text description

input_schema dict[str, str]

Dict describing input fields

output_schema OutputSchema

Dict or Pydantic model describing output fields.

  • Dict: simple string descriptions (e.g., {"spans": "List[Span]"})
  • Pydantic model: full type validation with Literal labels

type TaskType

TaskType enum (EXTRACTION, NER, CLASSIFICATION, TRANSFORMATION)

output_matcher OutputMatcher | None

Optional custom function to compare outputs. Signature: (expected: Dict, actual: Dict) -> bool. If not provided, uses the default matcher for the task type.

matching_mode Literal['text', 'exact']

For extraction tasks, choose "text" (default) or "exact" to control how span matches are evaluated.

text_field str | None

Optional input key to use for regex/spaCy matching. If not set, the longest string field is used.
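The difference between the two matching modes can be pictured with a small stand-in function. spans_match here is invented for illustration; RuleChef's actual matcher may differ:

```python
# Hypothetical stand-in for span comparison, to illustrate the two modes.
def spans_match(expected: dict, actual: dict, mode: str = "text") -> bool:
    if mode == "exact":  # text AND character offsets must all agree
        return (expected["text"] == actual["text"]
                and expected["start"] == actual["start"]
                and expected["end"] == actual["end"])
    return expected["text"] == actual["text"]  # "text": offsets ignored

a = {"text": "200 mg", "start": 5, "end": 11}
b = {"text": "200 mg", "start": 17, "end": 23}  # same text, different position
print(spans_match(a, b, "text"))   # True
print(spans_match(a, b, "exact"))  # False
```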

get_labels(field_name='type')

Get label values from output schema.

For Pydantic schemas, extracts Literal values from the specified field. For dict schemas, returns empty list (labels not defined).
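Conceptually, pulling labels out of a Literal field works like this stdlib sketch. EntitySchema and this get_labels are made up for illustration and are not RuleChef's implementation:

```python
from typing import Literal, get_args, get_type_hints

class EntitySchema:  # stands in for a Pydantic model with a Literal label field
    text: str
    type: Literal["DRUG", "DOSE"]

def get_labels(schema: type, field_name: str = "type") -> list:
    # Read the field's annotation and unpack the Literal's allowed values.
    annotation = get_type_hints(schema).get(field_name)
    return list(get_args(annotation)) if annotation is not None else []

print(get_labels(EntitySchema))  # ['DRUG', 'DOSE']
```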

validate_output(output)

Validate output against schema.

For Pydantic schemas, uses model validation. For dict schemas, returns (True, []) - no validation.

Returns:

Type Description
tuple[bool, list[str]]

Tuple of (is_valid, list_of_error_messages)

get_schema_for_prompt()

Render schema for inclusion in LLM prompts.

For Pydantic schemas, generates a readable representation with descriptions. For dict schemas, returns the dict as a string.

to_dict()

Rule

Rule(id, name, description, format, content, priority=5, confidence=0.5, times_applied=0, successes=0, failures=0, created_at=datetime.now(), output_template=None, output_key=None) dataclass

Learned extraction rule.

For schema-aware rules (NER, TRANSFORMATION), use output_template and output_key to control how matches are mapped to structured output. For legacy rules (EXTRACTION), content holds the pattern directly.

Attributes:

Name Type Description
id str

Unique identifier.

name str

Human-readable rule name (used for merge-by-name in patching).

description str

What this rule matches or does.

format RuleFormat

Rule format (REGEX, CODE, or SPACY).

content str

Pattern string (regex, code, or JSON-encoded spaCy pattern). Also accessible via the pattern property.

priority int

Execution priority (1-10, higher runs first).

confidence float

Confidence score (0.0-1.0), adjusted based on success rate.

times_applied int

Total number of times this rule has been applied.

successes int

Number of successful applications.

failures int

Number of failed applications.

created_at datetime

When the rule was created.

output_template dict[str, Any] | None

JSON template for each match, using variables like $0, $1, $start, $end, $ent_type. None for plain span extraction.

output_key str | None

Which key in the output dict to populate (e.g. 'entities'). Inferred from task type if not set.
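The template variables can be pictured as substitutions against a regex match. render_template below is a guessed mini-version for illustration only (it covers $0, $1, $start, $end; $ent_type would come from spaCy entity matches):

```python
import re

# Invented for illustration: expand $0/$1/$start/$end against a regex match.
def render_template(template: dict, m: re.Match) -> dict:
    values = {"$0": m.group(0), "$start": m.start(), "$end": m.end()}
    for i, group in enumerate(m.groups(), start=1):
        values[f"${i}"] = group
    # Replace any string value that names a variable; pass others through.
    return {key: values.get(val, val) if isinstance(val, str) else val
            for key, val in template.items()}

m = re.search(r"(\d+)\s*mg", "take 200 mg daily")
out = render_template({"text": "$0", "dose": "$1", "start": "$start", "end": "$end"}, m)
print(out)  # {'text': '200 mg', 'dose': '200', 'start': 5, 'end': 11}
```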

pattern property writable

Alias for content, with clearer semantics for regex/spaCy patterns.

update_stats(success)

Update performance stats and adjust confidence
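One plausible shape for this update, sketched with a minimal stand-in class. The confidence formula here (plain success rate) is a guess; RuleChef's actual adjustment may differ:

```python
from dataclasses import dataclass

@dataclass
class RuleStats:  # minimal stand-in for the Rule stats fields
    times_applied: int = 0
    successes: int = 0
    failures: int = 0
    confidence: float = 0.5

def update_stats(rule: RuleStats, success: bool) -> None:
    rule.times_applied += 1
    if success:
        rule.successes += 1
    else:
        rule.failures += 1
    # Guessed adjustment: confidence tracks the observed success rate.
    rule.confidence = rule.successes / rule.times_applied

r = RuleStats()
for outcome in (True, True, False, True):
    update_stats(r, outcome)
print(r.confidence)  # 0.75
```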

to_dict()

Span

Span(text, start, end, score=1.0) dataclass

A text span with character-level position information.

Attributes:

Name Type Description
text str

The matched text content.

start int

Start character offset (inclusive) in the source string.

end int

End character offset (exclusive) in the source string.

score float

Confidence score for the match, between 0.0 and 1.0.

overlaps(other)

Check if spans overlap

overlap_ratio(other)

Calculate overlap ratio (IoU)

to_dict()
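overlaps and overlap_ratio can be sketched directly from the attribute definitions above. This is a re-implementation for illustration, not RuleChef's source:

```python
from dataclasses import dataclass

@dataclass
class Span:
    text: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    score: float = 1.0

    def overlaps(self, other: "Span") -> bool:
        # Half-open intervals overlap iff each starts before the other ends.
        return self.start < other.end and other.start < self.end

    def overlap_ratio(self, other: "Span") -> float:
        # Intersection over union of the two character ranges.
        inter = max(0, min(self.end, other.end) - max(self.start, other.start))
        union = (self.end - self.start) + (other.end - other.start) - inter
        return inter / union if union else 0.0

a = Span("200 mg", 5, 11)
b = Span("mg daily", 9, 17)
print(a.overlaps(b))                 # True
print(round(a.overlap_ratio(b), 3))  # 0.167 (2 shared chars / 12 total)
```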

Dataset

Dataset(name, task, description='', examples=list(), corrections=list(), feedback=list(), structured_feedback=list(), rules=list(), version=1) dataclass

Complete training dataset containing examples, corrections, feedback, and rules.

Attributes:

Name Type Description
name str

Dataset name, used as the persistence filename.

task Task

Task definition describing the extraction/classification goal.

description str

Optional human-readable description.

examples list[Example]

List of labeled training examples.

corrections list[Correction]

List of user corrections (highest-value training signal).

feedback list[str]

Legacy list of plain-text feedback strings (task-level only).

structured_feedback list[Feedback]

Structured feedback entries at task/example/rule level.

rules list[Rule]

Learned rules (populated by learn_rules).

version int

Dataset schema version for forward compatibility.

get_all_training_data()

Get all examples and corrections combined

get_feedback_for(level, target_id='')

Get feedback filtered by level and optional target.
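The filtering logic is straightforward; a sketch of the likely behavior, with plain dicts standing in for Feedback objects and a module-level function standing in for the method:

```python
# Plain dicts stand in for Feedback entries; the filter itself is a sketch.
feedback = [
    {"level": "task", "target_id": "", "text": "drugs usually follow dosage like 'mg'"},
    {"level": "rule", "target_id": "r1", "text": "too broad"},
    {"level": "rule", "target_id": "r2", "text": "too specific"},
]

def get_feedback_for(items, level, target_id=""):
    # Empty target_id returns everything at that level.
    return [f for f in items
            if f["level"] == level and (not target_id or f["target_id"] == target_id)]

print([f["text"] for f in get_feedback_for(feedback, "rule", "r1")])  # ['too broad']
```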

to_dict()

Example

Example(id, input, expected_output, source, confidence=0.8, timestamp=datetime.now()) dataclass

Regular training example. Lower priority than corrections.

Attributes:

Name Type Description
id str

Unique identifier.

input dict[str, Any]

Input data dict matching the task's input_schema.

expected_output dict[str, Any]

Expected output dict matching the task's output_schema.

source str

Origin of the example ('human_labeled' or 'llm_generated').

confidence float

Confidence score for this example (0.0-1.0).

timestamp datetime

When the example was created.

Correction

Correction(id, input, model_output, expected_output, feedback=None, timestamp=datetime.now()) dataclass

User correction -- the highest-value training signal.

Contains both the wrong output and the correct output so the learner can understand what to fix.

Attributes:

Name Type Description
id str

Unique identifier.

input dict[str, Any]

Input data dict that was processed.

model_output dict[str, Any]

The incorrect output that was produced.

expected_output dict[str, Any]

The correct output the model should have produced.

feedback str | None

Optional free-text explanation of what went wrong.

timestamp datetime

When the correction was created.

Feedback

Feedback(id, text, level, target_id='', timestamp=datetime.now()) dataclass

User feedback at any level: task, example, or rule.

  • task: general guidance ("drugs usually follow dosage like 'mg'")
  • example: feedback on a specific training item
  • rule: feedback on a specific rule ("too broad", "too specific")

Attributes:

Name Type Description
id str

Unique identifier.

text str

The feedback text.

level str

Feedback scope -- 'task', 'example', or 'rule'.

target_id str

Empty for task-level; example_id or rule_id otherwise.

timestamp datetime

When the feedback was created.