# Extractor
The main entry point for tool output extraction.
## `ToolOutputExtractor(model_path=None, base_url=None, model_name=None, max_length=4096, device='auto')`
Extract relevant lines from tool output using a fine-tuned model.
Supports four backends:

- vLLM/OpenAI-compatible server: pass `base_url`
- Local transformers (generative): pass `model_path`
- Encoder (token-level): auto-detected from the model config, or pass `backend="encoder"`
- Pooled (line-level): auto-detected from the model config, or pass `backend="pooled"`
Usage:

```python
# vLLM (connects to a running server)
extractor = ToolOutputExtractor(base_url="http://localhost:8000/v1")

# Local generative
extractor = ToolOutputExtractor(model_path="./output/qwen-lora")

# Encoder (auto-detected)
extractor = ToolOutputExtractor(model_path="./output/squeez_encoder")

# Pooled line classifier (auto-detected)
extractor = ToolOutputExtractor(model_path="./output/squeez_pooled")

filtered = extractor.extract(task="Fix the bug", tool_output=raw)
```
### `extract(task, tool_output, max_new_tokens=1024, temperature=0.1)`
Extract relevant lines from tool output.
Args:

- `task`: Description of the coding task/issue
- `tool_output`: Raw tool output text
- `max_new_tokens`: Maximum tokens to generate
- `temperature`: Sampling temperature

Returns: Filtered output containing only the relevant lines.
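For example, a single call that overrides the generation defaults. The server URL, task text, and tool output below are placeholders, and the import path is assumed to match your install:

```python
# Assumes ToolOutputExtractor has been imported from the package
# (the import path is not shown on this page).
extractor = ToolOutputExtractor(base_url="http://localhost:8000/v1")

raw = "FAILED tests/test_parser.py::test_unicode - AssertionError ..."

# Near-greedy sampling and a tighter generation cap than the defaults.
filtered = extractor.extract(
    task="Fix the failing test in test_parser.py",
    tool_output=raw,
    max_new_tokens=512,
    temperature=0.0,
)
print(filtered)  # only the lines the model judged relevant
```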
### `extract_many(items, max_new_tokens=1024, temperature=0.1, concurrency=1)`
Extract relevant lines for many (task, tool_output) pairs.
Remote backends can use concurrent requests for higher throughput. Local backends fall back to sequential execution.
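A minimal batch sketch, continuing from the extractor constructed above. It assumes each item is a `(task, tool_output)` tuple as described, and that results come back as a list of filtered strings in input order (the return type and ordering are assumptions, not documented here):

```python
# Two hypothetical (task, tool_output) pairs.
items = [
    ("Fix the bug", "Traceback (most recent call last): ..."),
    ("Silence the deprecation warning", "DeprecationWarning: ..."),
]

# Remote backends can fan these out as concurrent requests;
# local backends ignore `concurrency` and run sequentially.
results = extractor.extract_many(items, concurrency=8)

for (task, _), filtered in zip(items, results):
    print(f"{task!r}: kept {len(filtered.splitlines())} lines")
```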
## Helper functions
### `_build_messages(task, tool_output)`
Build chat messages for extraction.
The public API keeps the argument name `task`, but under the v3 benchmark this value is the focused extraction query.
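For orientation only, a minimal sketch of the shape such a helper could take, assuming an OpenAI-style chat format; the actual prompt template is defined by the package and is not reproduced here:

```python
def _build_messages(task: str, tool_output: str) -> list[dict]:
    """Hypothetical sketch; the real prompt template lives in the package."""
    # Pair the extraction query with the raw output in a single user turn,
    # matching the chat-completions message format the backends consume.
    return [
        {
            "role": "user",
            "content": f"Task: {task}\n\nTool output:\n{tool_output}",
        }
    ]
```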