# Extractor
The main entry point for tool output extraction.
## `ToolOutputExtractor(model_path=None, base_url=None, model_name=None, max_length=4096, device='auto')`
Extract relevant lines from tool output using a fine-tuned model.
Supports four backends:

- vLLM/OpenAI-compatible server: pass `base_url`
- Local transformers (generative): pass `model_path`
- Encoder (token-level): auto-detected from the model config, or pass `backend="encoder"`
- Pooled (line-level): auto-detected from the model config, or pass `backend="pooled"`
Usage:

```python
# vLLM (connects to a running server)
extractor = ToolOutputExtractor(base_url="http://localhost:8000/v1")

# Local generative
extractor = ToolOutputExtractor(model_path="./output/qwen-lora")

# Encoder (auto-detected)
extractor = ToolOutputExtractor(model_path="./output/squeez_encoder")

# Pooled line classifier (auto-detected)
extractor = ToolOutputExtractor(model_path="./output/squeez_pooled")

filtered = extractor.extract(task="Fix the bug", tool_output=raw)
```
### `extract(task, tool_output, max_new_tokens=1024, temperature=0.1)`
Extract relevant lines from tool output.
Args:

- `task`: Description of the coding task/issue
- `tool_output`: Raw tool output text
- `max_new_tokens`: Maximum tokens to generate
- `temperature`: Sampling temperature

Returns: Filtered output containing only the relevant lines.
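For example, a single call that overrides the generation defaults. The server URL, task text, and tool output below are placeholders, and the import path is assumed to match your install:

```python
# Assumes ToolOutputExtractor has been imported from the package
# (the import path is not shown on this page).
extractor = ToolOutputExtractor(base_url="http://localhost:8000/v1")

raw = "FAILED tests/test_parser.py::test_unicode - AssertionError ..."

# Near-greedy sampling and a tighter generation cap than the defaults.
filtered = extractor.extract(
    task="Fix the failing test in test_parser.py",
    tool_output=raw,
    max_new_tokens=512,
    temperature=0.0,
)
print(filtered)  # only the lines the model judged relevant
```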
### `extract_many(items, max_new_tokens=1024, temperature=0.1, concurrency=1)`
Extract relevant lines for many (task, tool_output) pairs.
Remote backends can use concurrent requests for higher throughput. Local backends fall back to sequential execution.
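A minimal batch sketch, continuing from the extractor constructed above. It assumes each item is a `(task, tool_output)` tuple as described, and that results come back as a list of filtered strings in input order (the return type and ordering are assumptions, not documented here):

```python
# Two hypothetical (task, tool_output) pairs.
items = [
    ("Fix the bug", "Traceback (most recent call last): ..."),
    ("Silence the deprecation warning", "DeprecationWarning: ..."),
]

# Remote backends can fan these out as concurrent requests;
# local backends ignore `concurrency` and run sequentially.
results = extractor.extract_many(items, concurrency=8)

for (task, _), filtered in zip(items, results):
    print(f"{task!r}: kept {len(filtered.splitlines())} lines")
```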
## Helper functions
### `_build_messages(task, tool_output)`
Build chat messages for extraction.
The public API keeps the argument name `task`, but under the v3 benchmark this value is the focused extraction query.
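For orientation only, a minimal sketch of the shape such a helper could take, assuming an OpenAI-style chat format; the actual prompt template is defined by the package and is not reproduced here:

```python
def _build_messages(task: str, tool_output: str) -> list[dict]:
    """Hypothetical sketch; the real prompt template lives in the package."""
    # Pair the extraction query with the raw output in a single user turn,
    # matching the chat-completions message format the backends consume.
    return [
        {
            "role": "user",
            "content": f"Task: {task}\n\nTool output:\n{tool_output}",
        }
    ]
```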