Squeez

KRLabsOrg/squeez

Config

Pipeline configuration and constants.

`PipelineConfig(output_dir=Path('data'), source_cache_dir=Path('data/source_cache'), repos_dir=Path('data/repos'), github_token='', openai_api_key='', distillation_model='gpt-5.4', distillation_base_url=None, swebench_dataset='princeton-nlp/SWE-bench', splits=(lambda: ['test'])(), max_instances=None, min_tools_per_instance=3, max_tools_per_instance=7, max_tool_output_lines=MAX_TOOL_OUTPUT_LINES, distillation_max_concurrent=50, distillation_temperature=0.3, command_timeout=30)` `dataclass`

Configuration for the data generation pipeline.
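As a rough illustration of the signature above, the following self-contained sketch mirrors the documented fields and defaults in a stand-in dataclass. The type hints are assumptions (the page shows only default values), and the `splits=(lambda: ['test'])()` rendering is assumed to come from a `default_factory`, which gives each instance its own list. In real use you would import `PipelineConfig` from the package rather than redefine it.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional

MAX_TOOL_OUTPUT_LINES = 500  # value taken from the Constants section


@dataclass
class PipelineConfig:
    """Stand-in mirroring the documented defaults; type hints are assumed."""

    output_dir: Path = Path("data")
    source_cache_dir: Path = Path("data/source_cache")
    repos_dir: Path = Path("data/repos")
    github_token: str = ""
    openai_api_key: str = ""
    distillation_model: str = "gpt-5.4"
    distillation_base_url: Optional[str] = None
    swebench_dataset: str = "princeton-nlp/SWE-bench"
    # Assumed default_factory: each instance gets a fresh ['test'] list.
    splits: List[str] = field(default_factory=lambda: ["test"])
    max_instances: Optional[int] = None
    min_tools_per_instance: int = 3
    max_tools_per_instance: int = 7
    max_tool_output_lines: int = MAX_TOOL_OUTPUT_LINES
    distillation_max_concurrent: int = 50
    distillation_temperature: float = 0.3
    command_timeout: int = 30


# Override only the fields that differ from the defaults.
cfg = PipelineConfig(output_dir=Path("out"), max_instances=100)
```

Keeping secrets such as `github_token` and `openai_api_key` out of source and passing them in at construction time (for example from environment variables) is a common choice, though the page does not say how the pipeline expects them to be supplied.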

Constants

`config`

Configuration for the data generation pipeline.

`SYSTEM_PROMPT = 'You extract relevant lines from tool output for a coding task. Return the relevant lines inside <relevant_lines> tags, one per line. Include ONLY lines the agent needs to see.'` `module-attribute`
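The prompt asks the model to return its selection inside `<relevant_lines>` tags, one line per row. A minimal sketch of how such a reply could be parsed follows. The helper name `parse_relevant_lines` is hypothetical and is not part of the documented API, and the choice to drop blank lines and return an empty list when the tags are missing is an assumption.

```python
import re
from typing import List


def parse_relevant_lines(response: str) -> List[str]:
    """Return the non-blank lines found inside the first <relevant_lines> block."""
    match = re.search(r"<relevant_lines>(.*?)</relevant_lines>", response, re.DOTALL)
    if match is None:
        # Assumption: a reply without the tags is treated as "nothing relevant".
        return []
    # Keep indentation intact, since it can matter for code lines.
    return [line for line in match.group(1).splitlines() if line.strip()]


reply = "<relevant_lines>\ndef foo():\n    return 1\n</relevant_lines>"
lines = parse_relevant_lines(reply)
```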

`TOOL_WEIGHTS = {'read_file': 0.28, 'grep': 0.18, 'python': 0.08, 'git_log': 0.08, 'test_output': 0.08, 'git_diff': 0.05, 'git_blame': 0.04, 'ls': 0.04, 'lint_output': 0.02, 'build_output': 0.02, 'curl': 0.03, 'pip_install': 0.04, 'type_check': 0.04, 'coverage': 0.02}` `module-attribute`
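The weights above sum to 1.0, which suggests they act as a sampling distribution over tool types. The sketch below shows one way to draw tool names in proportion to those weights. The function `sample_tools` is hypothetical, and sampling with replacement via `random.choices` is an assumption; the pipeline may select tools differently.

```python
import random
from typing import List, Optional

TOOL_WEIGHTS = {
    "read_file": 0.28, "grep": 0.18, "python": 0.08, "git_log": 0.08,
    "test_output": 0.08, "git_diff": 0.05, "git_blame": 0.04, "ls": 0.04,
    "lint_output": 0.02, "build_output": 0.02, "curl": 0.03,
    "pip_install": 0.04, "type_check": 0.04, "coverage": 0.02,
}


def sample_tools(k: int, seed: Optional[int] = None) -> List[str]:
    """Draw k tool names with replacement, proportionally to TOOL_WEIGHTS."""
    rng = random.Random(seed)  # local RNG so a seed gives reproducible draws
    names = list(TOOL_WEIGHTS)
    weights = list(TOOL_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=k)


picked = sample_tools(5, seed=0)
```

With these weights, `read_file` and `grep` together account for 46% of draws, so most sampled tool calls would be file reads and searches.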

`MIN_RELEVANT_RATIO = 0.02` `module-attribute`

`MAX_RELEVANT_RATIO = 0.4` `module-attribute`

`MIN_RELEVANT_LINES = 3` `module-attribute`

`MIN_TOTAL_LINES = 10` `module-attribute`

`MAX_TOOL_OUTPUT_LINES = 500` `module-attribute`
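The page lists these thresholds without describing how they are applied. One plausible reading, sketched below, is that a tool output is truncated to `MAX_TOOL_OUTPUT_LINES` and a sample is kept only if it has enough total lines, enough relevant lines, and a relevant-to-total ratio within the given bounds. Both helper names (`truncate_output`, `is_valid_sample`) and the inclusive bounds are assumptions, not the documented behavior.

```python
from typing import List

MIN_RELEVANT_RATIO = 0.02
MAX_RELEVANT_RATIO = 0.4
MIN_RELEVANT_LINES = 3
MIN_TOTAL_LINES = 10
MAX_TOOL_OUTPUT_LINES = 500


def truncate_output(lines: List[str], limit: int = MAX_TOOL_OUTPUT_LINES) -> List[str]:
    """Keep at most `limit` lines of tool output (assumed head truncation)."""
    return lines[:limit]


def is_valid_sample(total_lines: int, relevant_lines: int) -> bool:
    """Check a sample against the size and ratio thresholds (assumed semantics)."""
    if total_lines < MIN_TOTAL_LINES or relevant_lines < MIN_RELEVANT_LINES:
        return False
    ratio = relevant_lines / total_lines
    return MIN_RELEVANT_RATIO <= ratio <= MAX_RELEVANT_RATIO


ok = is_valid_sample(total_lines=100, relevant_lines=10)       # ratio 0.10
too_dense = is_valid_sample(total_lines=10, relevant_lines=5)  # ratio 0.50
```

Under this reading, `ok` is `True` and `too_dense` is `False`, because a 0.5 ratio exceeds `MAX_RELEVANT_RATIO`.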