Core Types¶
Data structures used throughout RuleChef.
TaskType¶

Bases: Enum

The type of task being performed. Each member implies a distinct output shape (sketched in code below):

- EXTRACTION: Find untyped text spans. Output: {"spans": [{"text", "start", "end"}]}
- NER: Find typed entities. Output: {"entities": [{"text", "start", "end", "type"}]}
- CLASSIFICATION: Classify the input into a label. Output: {"label": "class_name"}
- TRANSFORMATION: Extract structured data into a custom output schema you define, GLiNER-style: define {"company": "str", "amount": "str"} and get back exactly that shape.
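As a quick illustration, here is one plausible output dict per task type. The concrete values are invented for the example:

```python
# Illustrative output dicts for each TaskType (values are made up).
extraction_output = {"spans": [{"text": "200 mg", "start": 5, "end": 11}]}
ner_output = {"entities": [{"text": "ibuprofen", "start": 15, "end": 24, "type": "DRUG"}]}
classification_output = {"label": "prescription"}
# TRANSFORMATION output matches whatever custom schema you define;
# e.g. {"company": "str", "amount": "str"} yields:
transformation_output = {"company": "Acme Corp", "amount": "$1,200"}
```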
RuleFormat¶

Bases: Enum

Rule representation formats: REGEX, CODE, or SPACY.
Task¶
Task(name, description, input_schema, output_schema, type=TaskType.EXTRACTION, output_matcher=None, matching_mode='text', text_field=None)¶

dataclass
Abstract task definition. Describes what we're trying to accomplish.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | Task name. |
| `description` | `str` | Free-text description of the task. |
| `input_schema` | `dict[str, str]` | Dict describing input fields. |
| `output_schema` | `OutputSchema` | Dict or Pydantic model describing output fields. A dict gives simple string descriptions (e.g. `{"spans": "List[Span]"}`); a Pydantic model gives full type validation with `Literal` labels. |
| `type` | `TaskType` | `TaskType` enum value (`EXTRACTION`, `NER`, `CLASSIFICATION`, or `TRANSFORMATION`). |
| `output_matcher` | `OutputMatcher \| None` | Optional custom function to compare outputs, with signature `(expected: Dict, actual: Dict) -> bool`. If not provided, the default matcher for the task type is used. |
| `matching_mode` | `Literal['text', 'exact']` | For extraction tasks, choose `"text"` (default) or `"exact"` to control how span matches are evaluated. |
| `text_field` | `str \| None` | Optional input key to use for regex/spaCy matching. If not set, the longest string field is used. |
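A minimal sketch of defining an NER task with a Pydantic output schema. The import path `rulechef.core` and the model names here are assumptions for illustration:

```python
from typing import List, Literal

from pydantic import BaseModel

from rulechef.core import Task, TaskType  # assumed import path

class Entity(BaseModel):
    text: str
    start: int
    end: int
    type: Literal["DRUG", "DOSAGE"]

class NEROutput(BaseModel):
    entities: List[Entity]

task = Task(
    name="drug_ner",
    description="Find drug names and dosages in clinical notes",
    input_schema={"text": "str"},
    output_schema=NEROutput,  # Pydantic model: full validation with Literal labels
    type=TaskType.NER,
)
```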
get_labels(field_name='type')¶
Get label values from output schema.
For Pydantic schemas, this extracts the Literal values from the specified field. For dict schemas, it returns an empty list (labels are not defined).
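Continuing the Task sketch above, this would pull the `Literal` labels out of the entity model (that `get_labels` discovers the field inside the nested model is an assumption based on the description):

```python
# Continuing the Task sketch above.
labels = task.get_labels("type")
print(labels)  # expected: ["DRUG", "DOSAGE"]
```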
validate_output(output)¶
Validate output against schema.
For Pydantic schemas, this uses model validation. For dict schemas, it returns (True, []), performing no validation.
Returns:

| Type | Description |
|---|---|
| `tuple[bool, list[str]]` | Tuple of `(is_valid, list_of_error_messages)`. |
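For example, validating a candidate output against the Pydantic schema from the earlier sketch:

```python
# Continuing the Task sketch above.
ok, errors = task.validate_output(
    {"entities": [{"text": "ibuprofen", "start": 15, "end": 24, "type": "DRUG"}]}
)
if not ok:
    for message in errors:
        print(message)
```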
get_schema_for_prompt()¶
Render schema for inclusion in LLM prompts.
For Pydantic schemas, generates a readable representation with descriptions. For dict schemas, returns the dict as a string.
to_dict()¶

Serialize the task definition to a plain dict.
Rule¶
Rule(id, name, description, format, content, priority=5, confidence=0.5, times_applied=0, successes=0, failures=0, created_at=datetime.now(), output_template=None, output_key=None)¶

dataclass
Learned extraction rule.
For schema-aware rules (NER, TRANSFORMATION), use output_template and output_key to control how matches are mapped to structured output. For legacy rules (EXTRACTION), content holds the pattern directly.
Attributes:

| Name | Type | Description |
|---|---|---|
| `id` | `str` | Unique identifier. |
| `name` | `str` | Human-readable rule name (used for merge-by-name in patching). |
| `description` | `str` | What this rule matches or does. |
| `format` | `RuleFormat` | Rule format (REGEX, CODE, or SPACY). |
| `content` | `str` | Pattern string (regex, code, or JSON-encoded spaCy pattern). |
| `priority` | `int` | Execution priority (1-10; higher runs first). |
| `confidence` | `float` | Confidence score (0.0-1.0), adjusted based on success rate. |
| `times_applied` | `int` | Total number of times this rule has been applied. |
| `successes` | `int` | Number of successful applications. |
| `failures` | `int` | Number of failed applications. |
| `created_at` | `datetime` | When the rule was created. |
| `output_template` | `dict[str, Any] \| None` | JSON template for each match, using variables like `$0`, `$1`, `$start`, `$end`, `$ent_type`. `None` for plain span extraction. |
| `output_key` | `str \| None` | Which key in the output dict to populate (e.g. `'entities'`). Inferred from the task type if not set. |
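A sketch of a schema-aware regex rule that uses output_template and output_key. The import path and the exact template-variable semantics are assumptions based on the table above:

```python
from rulechef.core import Rule, RuleFormat  # assumed import path

# Hypothetical rule: tag numeric dosages like "200 mg" as DOSAGE entities.
dosage_rule = Rule(
    id="rule-dosage-mg",
    name="dosage-mg",
    description="Match numeric dosages followed by 'mg'",
    format=RuleFormat.REGEX,
    content=r"\b\d+(?:\.\d+)?\s*mg\b",
    priority=7,
    # $0/$start/$end stand in for the match text and offsets per the table above.
    output_template={"text": "$0", "start": "$start", "end": "$end", "type": "DOSAGE"},
    output_key="entities",
)
```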
Span¶
Span(text, start, end, score=1.0)¶

dataclass
A text span with character-level position information.
Attributes:

| Name | Type | Description |
|---|---|---|
| `text` | `str` | The matched text content. |
| `start` | `int` | Start character offset (inclusive) in the source string. |
| `end` | `int` | End character offset (exclusive) in the source string. |
| `score` | `float` | Confidence score for the match, between 0.0 and 1.0. |
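Because start is inclusive and end is exclusive, a span slices cleanly out of its source string (import path assumed):

```python
from rulechef.core import Span  # assumed import path

text = "Take 200 mg daily"
span = Span(text="200 mg", start=5, end=11)
assert text[span.start:span.end] == span.text  # end is exclusive
```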
Dataset¶
Dataset(name, task, description='', examples=list(), corrections=list(), feedback=list(), structured_feedback=list(), rules=list(), version=1)¶

dataclass
Complete training dataset containing examples, corrections, feedback, and rules.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | Dataset name, used as the persistence filename. |
| `task` | `Task` | Task definition describing the extraction/classification goal. |
| `description` | `str` | Optional human-readable description. |
| `examples` | `list[Example]` | List of labeled training examples. |
| `corrections` | `list[Correction]` | List of user corrections (the highest-value training signal). |
| `feedback` | `list[str]` | Legacy list of plain-text feedback strings (task-level only). |
| `structured_feedback` | `list[Feedback]` | Structured feedback entries at task/example/rule level. |
| `rules` | `list[Rule]` | Learned rules (populated by learn_rules). |
| `version` | `int` | Dataset schema version, for forward compatibility. |
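A minimal sketch of creating an empty dataset for the task defined earlier (import path assumed; the collection fields all default to empty lists):

```python
from rulechef.core import Dataset  # assumed import path

dataset = Dataset(
    name="drug_ner_v1",  # also used as the persistence filename
    task=task,           # the Task sketched earlier
    description="Clinical drug/dosage NER training data",
)
# examples, corrections, feedback, structured_feedback, and rules start empty.
```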
Example¶
Example(id, input, expected_output, source, confidence=0.8, timestamp=datetime.now())¶

dataclass
Regular training example. Lower priority than corrections.
Attributes:

| Name | Type | Description |
|---|---|---|
| `id` | `str` | Unique identifier. |
| `input` | `dict[str, Any]` | Input data dict matching the task's input_schema. |
| `expected_output` | `dict[str, Any]` | Expected output dict matching the task's output_schema. |
| `source` | `str` | Origin of the example (`'human_labeled'` or `'llm_generated'`). |
| `confidence` | `float` | Confidence score for this example (0.0-1.0). |
| `timestamp` | `datetime` | When the example was created. |
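For instance, a human-labeled example for the NER task sketched earlier might look like this (import path assumed):

```python
from rulechef.core import Example  # assumed import path

example = Example(
    id="ex-001",
    input={"text": "Take 200 mg of ibuprofen"},
    expected_output={
        "entities": [{"text": "ibuprofen", "start": 15, "end": 24, "type": "DRUG"}]
    },
    source="human_labeled",
)
```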
Correction¶
Correction(id, input, model_output, expected_output, feedback=None, timestamp=datetime.now())¶

dataclass
User correction -- the highest-value training signal.
Contains both the wrong output and the correct output so the learner can understand what to fix.
Attributes:

| Name | Type | Description |
|---|---|---|
| `id` | `str` | Unique identifier. |
| `input` | `dict[str, Any]` | Input data dict that was processed. |
| `model_output` | `dict[str, Any]` | The incorrect output that was produced. |
| `expected_output` | `dict[str, Any]` | The correct output the model should have produced. |
| `feedback` | `str \| None` | Optional free-text explanation of what went wrong. |
| `timestamp` | `datetime` | When the correction was created. |
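A sketch of recording a correction for the example above, pairing the wrong output with the right one (import path assumed):

```python
from rulechef.core import Correction  # assumed import path

correction = Correction(
    id="corr-001",
    input={"text": "Take 200 mg of ibuprofen"},
    model_output={"entities": []},  # the model missed the drug name
    expected_output={
        "entities": [{"text": "ibuprofen", "start": 15, "end": 24, "type": "DRUG"}]
    },
    feedback="Drug names often follow a dosage like '200 mg'",
)
```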
Feedback¶
Feedback(id, text, level, target_id='', timestamp=datetime.now())¶

dataclass
User feedback at any level: task, example, or rule.
- task: general guidance ("drugs usually follow dosage like 'mg'")
- example: feedback on a specific training item
- rule: feedback on a specific rule ("too broad", "too specific")
Attributes:

| Name | Type | Description |
|---|---|---|
| `id` | `str` | Unique identifier. |
| `text` | `str` | The feedback text. |
| `level` | `str` | Feedback scope: `'task'`, `'example'`, or `'rule'`. |
| `target_id` | `str` | Empty for task-level feedback; an example_id or rule_id otherwise. |
| `timestamp` | `datetime` | When the feedback was created. |
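For example, rule-level feedback targeting the hypothetical dosage rule sketched earlier (import path assumed):

```python
from rulechef.core import Feedback  # assumed import path

rule_feedback = Feedback(
    id="fb-001",
    text="Too broad: this pattern also matches dosages inside excluded sections",
    level="rule",
    target_id="rule-dosage-mg",  # the hypothetical rule id from the earlier sketch
)
```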