2025-11-09 02:08:31 +07:00
parent 1ed58d1a98
commit 7dba29c66f
57 changed files with 8362 additions and 359 deletions

@@ -11,61 +11,79 @@
## Phase 2: Compute Worker Project
- [ ] Create `Managing.Compute` project (console app/worker service)
- [ ] Add project reference to shared projects (Application, Domain, Infrastructure)
- [ ] Configure DI container (NO Orleans)
- [ ] Refactor `Managing.Workers.Api` project (or rename to `Managing.Compute`)
- [ ] Remove Orleans dependencies completely
- [ ] Add project references to shared projects (Application, Domain, Infrastructure)
- [ ] Configure DI container with all required services (NO Orleans)
- [ ] Create `BacktestComputeWorker` background service
- [ ] Implement job polling logic (every 5 seconds)
- [ ] Implement job claiming with PostgreSQL advisory locks
- [ ] Implement semaphore-based concurrency control
- [ ] Implement progress callback mechanism
- [ ] Implement heartbeat mechanism (every 30 seconds)
- [ ] Add configuration: `MaxConcurrentBacktests`, `JobPollIntervalSeconds`
- [ ] Add configuration: `MaxConcurrentBacktests`, `JobPollIntervalSeconds`, `WorkerId`
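
The polling and semaphore items in this phase can be sketched as one small loop. This is an illustrative Python sketch, not the project's .NET implementation. `claim_job` and `execute_job` are hypothetical callables standing in for the advisory-lock claim query and the backtest run. `max_concurrent` and `poll_interval` mirror `MaxConcurrentBacktests` and `JobPollIntervalSeconds`, and `max_polls` exists only so that a demo run can stop.

```python
import asyncio


async def run_worker(claim_job, execute_job, max_concurrent=2,
                     poll_interval=5.0, max_polls=None):
    """Poll for jobs and run at most `max_concurrent` of them at a time."""
    slots = asyncio.Semaphore(max_concurrent)
    running = set()
    polls = 0

    async def run_one(job):
        try:
            await execute_job(job)
        finally:
            # Free the slot whether the job succeeded or raised.
            slots.release()

    while max_polls is None or polls < max_polls:
        polls += 1
        # Wait for a free slot BEFORE claiming, so a claimed job never
        # sits idle while other workers could have picked it up.
        await slots.acquire()
        job = await claim_job()
        if job is None:
            # Nothing to do: give the slot back and sleep until the next poll.
            slots.release()
            await asyncio.sleep(poll_interval)
            continue
        task = asyncio.create_task(run_one(job))
        running.add(task)
        task.add_done_callback(running.discard)

    # Let in-flight jobs finish before the worker exits.
    if running:
        await asyncio.gather(*running)
```

Acquiring the slot before claiming is a design choice in this sketch: it keeps a saturated worker from holding jobs it cannot start yet.
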
## Phase 3: API Server Updates
- [ ] Update `BacktestController` to create jobs instead of calling grains directly
- [ ] Implement `CreateBundleBacktest` endpoint (returns immediately)
- [ ] Implement `GetBundleStatus` endpoint (polls database)
- [ ] Implement `GetJobStatus` endpoint (polls database for single job)
- [ ] Implement `GetBundleStatus` endpoint (polls database, aggregates job statuses)
- [ ] Update `Backtester.cs` to generate `BacktestJob` entities from bundle variants
- [ ] Remove direct Orleans grain calls for backtests (keep for other operations)
- [ ] Remove all Orleans grain calls for backtests (direct replacement, no feature flags)
- [ ] Remove `IGrainFactory` dependency from `Backtester.cs`
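
As an illustration of what "aggregates job statuses" could mean for `GetBundleStatus`, here is a minimal Python sketch. The status names (`Pending`, `Running`, `Completed`, `Failed`) and the rule that a bundle counts as `Failed` once all jobs have finished and at least one failed are assumptions, not taken from the codebase.

```python
from collections import Counter


def aggregate_bundle_status(job_statuses):
    """Collapse per-job statuses into one bundle-level summary."""
    counts = Counter(job_statuses)
    total = len(job_statuses)
    finished = counts["Completed"] + counts["Failed"]

    if total == 0 or (finished == 0 and counts["Running"] == 0):
        status = "Pending"      # nothing has started yet
    elif finished < total:
        status = "Running"      # some work is still outstanding
    elif counts["Failed"] > 0:
        status = "Failed"       # assumption: any failed job fails the bundle
    else:
        status = "Completed"

    return {
        "status": status,
        "total": total,
        "completed": counts["Completed"],
        "failed": counts["Failed"],
        # Integer percentage of jobs that reached a terminal state.
        "progress_percentage": 100 * finished // total if total else 0,
    }
```
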
## Phase 4: Shared Logic
## Phase 4: Shared Logic Extraction
- [ ] Extract backtest execution logic from `BacktestTradingBotGrain` to `Backtester.cs`
- [ ] Make backtest logic Orleans-agnostic (can run in worker or grain)
- [ ] Add progress callback support to `RunBacktestAsync` method
- [ ] Ensure candle loading works in both contexts
- [ ] Create `BacktestExecutor.cs` service (new file)
- [ ] Extract backtest execution logic from `BacktestTradingBotGrain` to `BacktestExecutor`
- [ ] Make backtest logic Orleans-agnostic (no grain dependencies)
- [ ] Add progress callback support to execution method
- [ ] Ensure candle loading works in compute worker context
- [ ] Handle credit debiting/refunding in executor
- [ ] Handle user context resolution in executor
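
The progress callback item could look roughly like the following Python sketch. It is an assumption-laden illustration rather than the `BacktestExecutor` design: `run_backtest`, `on_progress`, and `report_every` are hypothetical names, and the strategy evaluation is left as a placeholder. The callback is throttled to whole-percentage steps so that a long backtest does not trigger a database write or notification on every candle.

```python
def run_backtest(candles, on_progress=None, report_every=10):
    """Walk the candles and report progress in `report_every` percent steps."""
    total = len(candles)
    last_reported = 0
    for index, candle in enumerate(candles, start=1):
        # Placeholder: the strategy would evaluate `candle` here.
        percent = index * 100 // total
        if (on_progress is not None
                and percent != last_reported
                and percent % report_every == 0):
            on_progress(percent)
            last_reported = percent
    return total
```

A caller would pass whatever should happen on each step, for example a function that stores the percentage on the job row.
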
## Phase 5: Monitoring & Health Checks
- [ ] Add health check endpoint to compute worker
- [ ] Add health check endpoint to compute worker (`/health` or `/healthz`)
- [ ] Add metrics: pending jobs, running jobs, completed/failed counts
- [ ] Add stale job detection (reclaim jobs from dead workers)
- [ ] Add logging for job lifecycle events
- [ ] Add stale job detection (reclaim jobs from dead workers whose `LastHeartbeat` is more than 5 minutes old)
- [ ] Add comprehensive logging for job lifecycle events
- [ ] Include structured logging: JobId, BundleRequestId, UserId, WorkerId, Duration
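
Stale job detection, as described in this phase, amounts to finding running jobs whose last heartbeat is too old and returning them to the queue. The Python sketch below works on in-memory dictionaries purely for illustration. The field names (`status`, `last_heartbeat`, `worker_id`) and the choice to reset a reclaimed job to `Pending` are assumptions. In the real system this would presumably be a single database update.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)  # threshold named in the checklist


def reclaim_stale_jobs(jobs, now, stale_after=STALE_AFTER):
    """Return IDs of running jobs whose heartbeat is older than `stale_after`,
    resetting each one so another worker can claim it."""
    reclaimed = []
    for job in jobs:
        if job["status"] == "Running" and now - job["last_heartbeat"] > stale_after:
            job["status"] = "Pending"   # assumption: reclaimed jobs are re-queued
            job["worker_id"] = None     # drop ownership by the dead worker
            reclaimed.append(job["id"])
    return reclaimed
```

With a 30-second heartbeat, a 5-minute threshold tolerates several missed beats before a worker is treated as dead.
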
## Phase 6: Deployment
## Phase 6: SignalR & Notifications
- [ ] Create Dockerfile for `Managing.Compute`
- [ ] Create deployment configuration for compute workers
- [ ] Configure environment variables for compute cluster
- [ ] Set up monitoring dashboards (Prometheus/Grafana)
- [ ] Configure auto-scaling rules for compute workers
- [ ] Inject `IHubContext<BacktestHub>` into compute worker or executor
- [ ] Send SignalR progress updates during job execution
- [ ] Update `BacktestJob.ProgressPercentage` in database
- [ ] Update `BundleBacktestRequest` progress when jobs complete
- [ ] Send completion notifications via SignalR and Telegram
## Phase 7: Testing & Validation
## Phase 7: Deployment
- [ ] Test single backtest job processing
- [ ] Test bundle backtest with multiple jobs
- [ ] Test concurrent job processing (multiple workers)
- [ ] Test job recovery after worker failure
- [ ] Test priority queue ordering
- [ ] Load test with 1000+ concurrent users
- [ ] Create Dockerfile for `Managing.Compute` (or update existing)
- [ ] Update `docker-compose.yml` to add compute worker service
- [ ] Configure environment variables: `MaxConcurrentBacktests`, `JobPollIntervalSeconds`, `WorkerId`
- [ ] Set up health check configuration in Docker
- [ ] Configure auto-scaling rules for compute workers (min: 1, max: 10)
## Phase 8: Migration Strategy
## Phase 9: Testing & Validation
- [ ] Keep Orleans grains as fallback during transition
- [ ] Feature flag to switch between Orleans and Compute workers
- [ ] Gradual migration: test with small percentage of traffic
- [ ] Monitor performance and error rates
- [ ] Full cutover once validated
- [ ] Unit tests: BacktestJobRepository (advisory locks, job claiming, stale detection)
- [ ] Unit tests: BacktestExecutor (core logic, progress callbacks)
- [ ] Integration tests: Single backtest job processing
- [ ] Integration tests: Bundle backtest with multiple jobs
- [ ] Integration tests: Concurrent job processing (multiple workers)
- [ ] Integration tests: Job recovery after worker failure
- [ ] Integration tests: Priority queue ordering
- [ ] Load tests: 100+ concurrent users, 1000+ pending jobs, multiple workers
## Phase 8: Cleanup & Removal
- [ ] Remove or deprecate `BacktestTradingBotGrain.cs` (no longer used)
- [ ] Remove or deprecate `BundleBacktestGrain.cs` (replaced by compute workers)
- [ ] Remove Orleans grain interfaces for backtests (if not used elsewhere)
- [ ] Update `ApiBootstrap.cs` to remove Orleans backtest grain registrations
- [ ] Remove Orleans dependencies from `Backtester.cs` (keep for other operations)
- [ ] Update documentation to reflect new architecture