managing-apps/.cursor/commands/benchmark-backtest-performance.md
cryptooda fc036bb7de docs: enhance benchmark command with business logic validation tests
- Add 2 ETH-based validation tests to benchmark script
- Validates ExecuteBacktest_With_ETH_FifteenMinutes_Data_Should_Return_LightBacktest
- Validates ExecuteBacktest_With_ETH_FifteenMinutes_Data_Second_File_Should_Return_LightBacktest
- Ensures performance optimizations don't break trading logic
- Update command documentation with comprehensive validation details
- All 3 validation levels must pass for benchmark success
2025-11-11 12:32:56 +07:00

Benchmark Backtest Performance

This command runs the backtest performance tests and records the results in the performance benchmark CSV file.

Usage

Run this command to benchmark backtest performance and update the tracking CSV:

/benchmark-backtest-performance

Or run the script directly:

./scripts/benchmark-backtest-performance.sh

What it does

  1. Runs the main performance telemetry test (ExecuteBacktest_With_Large_Dataset_Should_Show_Performance_Telemetry)
  2. Runs two business logic validation tests:
    • ExecuteBacktest_With_ETH_FifteenMinutes_Data_Should_Return_LightBacktest
    • ExecuteBacktest_With_ETH_FifteenMinutes_Data_Second_File_Should_Return_LightBacktest
  3. Validates business logic: compares the Final PnL against the first-run baseline to ensure optimizations do not change trading behavior
  4. Extracts performance metrics from the test output
  5. Appends a new row to src/Managing.Workers.Tests/performance-benchmarks.csv
  6. Never commits changes automatically
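
The test runs in steps 1 and 2 could be selected with a `dotnet test --filter` expression. The following is a minimal sketch, not the actual script: the helper name `build_filter` is hypothetical, and the project path in the commented invocation is an assumption based on the CSV location.

```shell
#!/bin/sh
# Hypothetical helper: join test names into a `dotnet test --filter` expression.
# `FullyQualifiedName~X` selects tests whose full name contains X; `|` means OR.
build_filter() {
  _filter=""
  for _name in "$@"; do
    if [ -n "$_filter" ]; then
      _filter="${_filter}|"
    fi
    _filter="${_filter}FullyQualifiedName~${_name}"
  done
  printf '%s\n' "$_filter"
}

VALIDATION_FILTER=$(build_filter \
  ExecuteBacktest_With_ETH_FifteenMinutes_Data_Should_Return_LightBacktest \
  ExecuteBacktest_With_ETH_FifteenMinutes_Data_Second_File_Should_Return_LightBacktest)

# Assumed invocation (the project path is a guess, not taken from the script):
# dotnet test src/Managing.Workers.Tests --filter "$VALIDATION_FILTER"
```

Building the expression in one place keeps the list of validation tests easy to extend if more are added later.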

CSV Format

The CSV file contains clean numeric values for all telemetry metrics:

  • DateTime: ISO 8601 timestamp when the benchmark was run
  • TestName: Name of the test that was executed
  • CandlesCount: Integer - Number of candles processed
  • ExecutionTimeSeconds: Decimal - Total execution time in seconds
  • ProcessingRateCandlesPerSec: Decimal - Candles processed per second
  • MemoryStartMB: Decimal - Memory usage at start, in MB
  • MemoryEndMB: Decimal - Memory usage at end, in MB
  • MemoryPeakMB: Decimal - Peak memory usage, in MB
  • SignalUpdatesCount: Decimal - Total signal updates performed
  • SignalUpdatesSkipped: Integer - Number of signal updates skipped
  • SignalUpdateEfficiencyPercent: Decimal - Percentage of signal updates that were skipped
  • BacktestStepsCount: Decimal - Number of backtest steps executed
  • AverageSignalUpdateMs: Decimal - Average time per signal update, in milliseconds
  • AverageBacktestStepMs: Decimal - Average time per backtest step, in milliseconds
  • FinalPnL: Decimal - Final profit and loss
  • WinRatePercent: Integer - Win rate percentage
  • GrowthPercentage: Decimal - Growth percentage
  • Score: Decimal - Backtest score
  • CommitHash: Git commit hash
  • GitBranch: Git branch name
  • Environment: Environment where test was run
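
To see how a tracked metric changes between runs, the CSV can be queried by column name. This is a minimal sketch, assuming the file starts with a header row containing the column names listed above and that no field contains an embedded comma. The function name `csv_column` is hypothetical.

```shell
#!/bin/sh
# Print every value of one named column from a CSV file with a header row.
# Usage: csv_column FILE COLUMN_NAME
csv_column() {
  awk -F',' -v col="$2" '
    NR == 1 {
      # Locate the requested column in the header row.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    { print $idx }
  ' "$1"
}

# Example: processing rate of the most recent run (last row of the file).
# csv_column src/Managing.Workers.Tests/performance-benchmarks.csv ProcessingRateCandlesPerSec | tail -n 1
```

Looking the column up by name avoids depending on the column order staying fixed.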

Implementation Details

The command extracts metrics from the test console output with regex patterns and formats them as a CSV row. It records the current Git branch and commit hash for tracking, but never commits changes automatically.
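
The extraction step might look like the following. This is a sketch only: the sample line format (`Processing Rate: 5688.8 candles/sec`) is borrowed from the example output in this document and may not match the raw test console output, and `extract_metric` is a hypothetical helper. The label is interpolated into the pattern, so it is assumed to contain no regex metacharacters.

```shell
#!/bin/sh
# Extract the first number that follows "LABEL:" in text read from stdin.
# Usage: extract_metric LABEL < test-output.log
extract_metric() {
  # Capture an optional minus sign, then digits and dots, after "LABEL:".
  sed -n "s/.*$1: *\(-\{0,1\}[0-9][0-9.]*\).*/\1/p" | head -n 1
}

# Example with an inline sample line:
# printf 'Processing Rate: 5688.8 candles/sec\n' | extract_metric 'Processing Rate'
```

Stripping units at extraction time is one way to get the clean numeric values the CSV expects.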

Example Output

🚀 Running backtest performance benchmark...
📊 Running main performance test...
✅ Performance test passed!
📊 Running business logic validation tests...
✅ Business logic validation tests passed!
✅ Business Logic OK: Final PnL matches baseline (±0)
📊 Benchmark Results:
   • Processing Rate: 5688.8 candles/sec
   • Execution Time: 1.005 seconds
   • Memory Peak: 24.66 MB
   • Signal Efficiency: 33.2%
   • Candles Processed: 5760
   • Score: 6015

✅ Benchmark data recorded successfully!

Business Logic Validation

The benchmark validates business logic at three levels:

1. Dedicated ETH Backtest Tests (2 tests)

  • ExecuteBacktest_With_ETH_FifteenMinutes_Data_Should_Return_LightBacktest

    • Tests backtest with ETH 15-minute data
    • Validates specific trading scenarios and positions
    • Ensures indicator calculations are correct
  • ExecuteBacktest_With_ETH_FifteenMinutes_Data_Second_File_Should_Return_LightBacktest

    • Tests with a different ETH dataset
    • Validates consistency across different market data
    • Confirms trading logic works reliably

2. Large Dataset Telemetry Test (1 test)

  • ExecuteBacktest_With_Large_Dataset_Should_Show_Performance_Telemetry
    • Validates performance metrics extraction
    • Confirms signal updates and backtest steps
    • Ensures telemetry data is accurate

3. PnL Baseline Comparison

  • Consistent: Final PnL matches the first run within a ±0.01 tolerance
  • Baseline OK: Expected baseline is 24560.79
  • ⚠️ Warning: A difference larger than the tolerance indicates broken business logic
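
The tolerance check can be expressed with `awk`, since POSIX shell arithmetic is integer-only. This is a minimal sketch, assuming the ±0.01 tolerance and the 24560.79 baseline quoted above. `pnl_matches_baseline` is a hypothetical function name, not necessarily what the script uses.

```shell
#!/bin/sh
# Succeed (exit status 0) when |CURRENT - BASELINE| <= TOLERANCE.
# Usage: pnl_matches_baseline CURRENT BASELINE [TOLERANCE]
pnl_matches_baseline() {
  awk -v cur="$1" -v base="$2" -v tol="${3:-0.01}" 'BEGIN {
    diff = cur - base
    if (diff < 0) diff = -diff   # absolute difference
    if (diff > tol) exit 1
    exit 0
  }'
}

# Example: compare a freshly measured PnL against the documented baseline.
if pnl_matches_baseline 24560.79 24560.79; then
  echo "Business Logic OK: Final PnL matches baseline"
else
  echo "Warning: Final PnL differs from baseline"
fi
```

Returning an exit status rather than printing a value lets the check be used directly in an `if` statement.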

All three validation levels must pass for the benchmark to succeed!

This prevents performance improvements from accidentally changing trading outcomes!

Files Modified

  • src/Managing.Workers.Tests/performance-benchmarks.csv - Modified (new benchmark row added)

Note: Changes are not committed automatically. Review the results and commit manually if satisfied.