quant-trader-service/training/README.md
Codex 9acb3460a1 Improve Trader V4 training pipeline
Align entry labels with max future edge, tune direction labeling, and harden regression evaluation.

Add training diagnostics, price-plan search, feature screening, and nonlinear benchmark scripts.
2026-06-27 19:57:29 +08:00


Trader V4 Training Pipeline

This directory contains the executable training chain for Trader V4. Large data stays under /Users/zach/Desktop/quant-strategy-training-data.

Run order:

PY=/Users/zach/IdeaProjects/quant-trading-ai/quant-strategy-server/.venv/bin/python
RUN_ID=btc-v4-p0-001
ROOT=/Users/zach/Desktop/quant-strategy-training-data

$PY training/scripts/01_audit_source_data.py --run-id $RUN_ID --data-root $ROOT --symbol BTC-USDT-PERP --start-date 2025-05-01 --end-date 2026-06-25
$PY training/scripts/02_build_replay_1m.py --run-id $RUN_ID --data-root $ROOT --symbol BTC-USDT-PERP --start-date 2025-05-01 --end-date 2026-06-25
$PY training/scripts/03_build_splits.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/04_build_feature_frame.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/05_build_price_plan_context.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/06_build_direction_labels.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/07_build_entry_labels.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/08_build_position_state_samples.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/09_build_continue_exit_risk_labels.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/10_build_train_datasets.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/11_train_small_models.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/12_calibrate_models.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/13_search_pm_thresholds.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/14_integrated_backtest.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/15_export_artifact_bundle.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/16_validate_artifact_bundle.py --artifact-root $ROOT/trader-v4/runs/$RUN_ID/export/trader-model-bundle-$RUN_ID/artifact_bundle
$PY training/scripts/17_promote_artifact_bundle.py --artifact-root $ROOT/trader-v4/runs/$RUN_ID/export/trader-model-bundle-$RUN_ID/artifact_bundle --reason "validation_locked and latest_stress passed for SHADOW"
$PY training/scripts/16_validate_artifact_bundle.py --artifact-root $ROOT/trader-v4/runs/$RUN_ID/export/trader-model-bundle-$RUN_ID/artifact_bundle --require-active --run-onnx
$PY training/scripts/18_diagnose_training_run.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/19_search_price_plan.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/20_screen_entry_features.py --run-id $RUN_ID --data-root $ROOT
$PY training/scripts/21_benchmark_nonlinear_models.py --run-id $RUN_ID --data-root $ROOT
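
The steps above depend on each other, so a later step is not meaningful if an earlier one failed. The following is a minimal sketch of a fail-fast wrapper for steps 01 to 15, assuming those scripts accept exactly the flags shown above. The `build_commands` and `run_all` helpers are hypothetical and are not part of this repository; steps 16 and later are omitted because they take different flags.

```python
import subprocess

# Script names copied from the run order above.
# Steps 01 and 02 also take --symbol, --start-date, and --end-date.
DATED_STEPS = ["01_audit_source_data.py", "02_build_replay_1m.py"]
PLAIN_STEPS = [
    "03_build_splits.py",
    "04_build_feature_frame.py",
    "05_build_price_plan_context.py",
    "06_build_direction_labels.py",
    "07_build_entry_labels.py",
    "08_build_position_state_samples.py",
    "09_build_continue_exit_risk_labels.py",
    "10_build_train_datasets.py",
    "11_train_small_models.py",
    "12_calibrate_models.py",
    "13_search_pm_thresholds.py",
    "14_integrated_backtest.py",
    "15_export_artifact_bundle.py",
]


def build_commands(python, run_id, data_root, symbol, start_date, end_date):
    """Return one argv list per step, in execution order (hypothetical helper)."""
    common = ["--run-id", run_id, "--data-root", data_root]
    dated = ["--symbol", symbol, "--start-date", start_date, "--end-date", end_date]
    commands = []
    for name in DATED_STEPS:
        commands.append([python, f"training/scripts/{name}", *common, *dated])
    for name in PLAIN_STEPS:
        commands.append([python, f"training/scripts/{name}", *common])
    return commands


def run_all(commands):
    """Run each command in order and stop at the first non-zero exit code."""
    for argv in commands:
        print("running:", " ".join(argv))
        # check=True raises CalledProcessError, which aborts the remaining steps.
        subprocess.run(argv, check=True)
```

For example, `run_all(build_commands(PY, "btc-v4-p0-001", ROOT, "BTC-USDT-PERP", "2025-05-01", "2026-06-25"))` would replay steps 01 to 15 with the values used above, where `PY` and `ROOT` hold the same paths as the shell variables.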

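Java SHADOW loads only ACTIVE bundles. Script 15 always produces a CANDIDATE bundle; script 17 may promote it to ACTIVE only after script 16 validation passes and the release gate passes.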
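
The promotion rule can be read as a small state machine. The sketch below is illustrative only: `BundleStatus`, `promote`, and `shadow_can_load` are hypothetical names, and the actual scripts may record bundle status differently.

```python
from enum import Enum


class BundleStatus(Enum):
    CANDIDATE = "CANDIDATE"  # what script 15 always produces
    ACTIVE = "ACTIVE"        # the only status Java SHADOW loads


def promote(status, validation_passed, gate_passed):
    """Return the bundle status after a promotion attempt (hypothetical helper)."""
    if status is not BundleStatus.CANDIDATE:
        raise ValueError("only a CANDIDATE bundle can be promoted")
    # Both conditions are required: script 16 validation and the release gate.
    if validation_passed and gate_passed:
        return BundleStatus.ACTIVE
    return BundleStatus.CANDIDATE


def shadow_can_load(status):
    """Java SHADOW loads a bundle only when it is ACTIVE."""
    return status is BundleStatus.ACTIVE
```

A bundle that validates cleanly but fails the release gate stays CANDIDATE and is therefore never loaded by SHADOW.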

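If scripts 16/17 report that the bundle is complete but the release gate fails, run script 18. Script 18 is diagnostic only: it changes neither the model nor the thresholds, and it shows whether the labels, the model scores, or the PM rules are blocking trades.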
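
One way to picture this kind of diagnosis is a funnel that counts how many samples survive each stage. This is a sketch under stated assumptions, not the logic of `18_diagnose_training_run.py`: the per-sample fields `label_positive`, `score`, and `pm_allowed` are hypothetical, as are both helper functions.

```python
def trade_funnel(samples, score_threshold):
    """Count the samples that survive the label, score, and PM stages in turn."""
    labeled = [s for s in samples if s["label_positive"]]
    scored = [s for s in labeled if s["score"] >= score_threshold]
    allowed = [s for s in scored if s["pm_allowed"]]
    return {
        "total": len(samples),
        "label": len(labeled),
        "score": len(scored),
        "pm": len(allowed),
    }


def blocking_stage(funnel):
    """Name the stage that removes the largest share of what reaches it."""
    stages = ["total", "label", "score", "pm"]
    worst, worst_drop = None, -1.0
    for prev, cur in zip(stages, stages[1:]):
        if funnel[prev] == 0:
            break  # nothing left to filter; later stages cannot be blamed
        drop = 1.0 - funnel[cur] / funnel[prev]
        if drop > worst_drop:
            worst, worst_drop = cur, drop
    return worst
```

If `blocking_stage` returns `"pm"`, most candidates that passed the label and score stages were rejected by the PM rules; `"label"` would instead point at the label definition.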

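If the script 18 diagnosis shows a negative net Entry return, run script 19. Script 19 only searches for a price plan to use in the next round of experiments; its output is not a release decision. Any selected price plan still requires regenerating the labels, retraining, and re-running the backtest.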

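If there are still no consistently profitable trades after script 19 changes the price plan, run script 20. Script 20 examines the high and low ranges of each Entry feature to check whether the history contains any stable signal. If even single-feature ranges show no stable improvement, stop blindly tuning thresholds and prioritize adding features or moving to a more expressive model.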
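
A single-feature range check can be sketched as follows. This is not the implementation of `20_screen_entry_features.py`: the helpers are hypothetical, the equal-count bucketing is one possible choice, and "stable" is assumed here to mean a positive mean return in every data split.

```python
from statistics import mean


def bucket_returns(values, returns, n_buckets=5):
    """Sort samples by feature value, cut them into equal-count buckets,
    and return the mean return of each bucket from lowest to highest."""
    pairs = sorted(zip(values, returns))
    size = len(pairs) // n_buckets
    if size == 0:
        raise ValueError("not enough samples for the requested bucket count")
    means = []
    for b in range(n_buckets):
        start = b * size
        # The last bucket absorbs any remainder.
        end = len(pairs) if b == n_buckets - 1 else start + size
        means.append(mean(r for _, r in pairs[start:end]))
    return means


def stable_buckets(per_split_means):
    """Return the bucket indices whose mean return is positive in every split."""
    n_buckets = len(per_split_means[0])
    return [
        b for b in range(n_buckets)
        if all(split[b] > 0 for split in per_split_means)
    ]
```

Calling `bucket_returns` once per split and passing the results to `stable_buckets` yields the ranges that look profitable everywhere; an empty list for every feature corresponds to the "no stable signal" case described above.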

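If script 20 still finds no range with a stable positive return, run script 21. Script 21 is diagnostic only and does not export a release model; it uses somewhat stronger tree models to check whether the same features and labels contain any learnable signal. If the tree models also find no stable signal, go back to the feature and label definitions instead of continuing to loosen the PM thresholds.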
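
The escalation from script 18 to script 21 can be summarized as a small decision function. This is only a restatement of the paragraphs above: the `findings` keys are hypothetical, and any branch this README does not describe returns a placeholder rather than a guess.

```python
UNCOVERED = "not covered by this README"


def next_step(findings):
    """Map what has been observed so far to the next action.

    `findings` is a dict of booleans; a missing key means "not checked yet".
    """
    if findings.get("gate_passed"):
        return "17_promote_artifact_bundle.py"
    if "entry_net_negative" not in findings:
        return "18_diagnose_training_run.py"
    if not findings["entry_net_negative"]:
        return UNCOVERED
    if "profitable_after_replan" not in findings:
        # After script 19, labels must be rebuilt and the model retrained
        # and backtested before this finding can be recorded.
        return "19_search_price_plan.py"
    if findings["profitable_after_replan"]:
        return UNCOVERED
    if "stable_feature_range" not in findings:
        return "20_screen_entry_features.py"
    if findings["stable_feature_range"]:
        return UNCOVERED
    if "tree_signal" not in findings:
        return "21_benchmark_nonlinear_models.py"
    if findings["tree_signal"]:
        return UNCOVERED
    return "revisit feature and label definitions"
```

For instance, `next_step({"entry_net_negative": True, "profitable_after_replan": False})` points at script 20, matching the order given above.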