Training Data Logger¶
Captures LLM calls as structured training data for model distillation.
TrainingDataLogger¶
TrainingDataLogger(path, run_metadata=None)
¶
Logs LLM calls as training data for model distillation.
Each logged call is a (messages, response) pair with metadata. Written as JSONL, one entry per LLM call.
Attributes:
| Name | Type | Description |
|---|---|---|
path |
Path to the output JSONL file. |
|
stats |
dict[str, int]
|
Dict of call counts by type. |
run_id |
dict[str, int]
|
Identifier for this logging session. |
Initialize the logger.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
path
|
str
|
Path to the output JSONL file. Created if it doesn't exist. Appends if it already exists (safe for multiple runs). |
required |
run_metadata
|
dict[str, Any] | None
|
Optional dict of metadata to attach to every entry in this session (e.g. dataset name, model, configuration). |
None
|
count
property
¶
Total number of entries logged.
log(call_type, messages, response, metadata=None)
¶
Log a single LLM call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
call_type
|
str
|
Type of call. One of: - "rule_synthesis": Initial rule generation from examples - "rule_synthesis_per_class": Per-class rule generation - "rule_patch": Patch rules for failures - "guide_refinement": Coordinator refinement guidance - "audit_rules": Rule pruning/merging audit - "trigger_decision": Should-learn decision - "synthetic_generation": Synthetic example generation |
required |
messages
|
list[dict[str, str]]
|
The messages sent to the LLM (system + user). |
required |
response
|
str
|
The raw response text from the LLM. |
required |
metadata
|
dict[str, Any] | None
|
Additional context for this specific call. Useful fields: - task_name, task_type: What task this is for - dataset_size, num_classes: Data stats - iteration, max_iterations: Where in the refinement loop - eval_before: Metrics before this call's output is applied - num_rules_in_response: How many rules were parsed - response_valid: Whether the response parsed successfully - target_class: For per-class synthesis - num_failures: For patch calls - guidance: For coordinator calls |
None
|