# Redis + SignalR Multi-Instance Deployment Guide

## Summary

The Managing API now supports **multiple instances** with **SignalR** (for LlmHub, BotHub, BacktestHub) using a **Redis backplane**. This solves the "No Connection with that ID" error that occurs when:

- `/llmhub/negotiate` hits instance A
- WebSocket connection hits instance B (which doesn't know about the connection ID)

## What Was Added

### 1. Infrastructure Layer - Generic Redis Service

**Files Created:**
- `src/Managing.Application.Abstractions/Services/IRedisConnectionService.cs` - Interface
- `src/Managing.Infrastructure.Storage/RedisConnectionService.cs` - Implementation
- `src/Managing.Infrastructure.Storage/README-REDIS.md` - Documentation

**Purpose:** Generic Redis connectivity that can be used for SignalR, caching, or any Redis needs.

### 2. SignalR Redis Backplane

**Files Modified:**
- `src/Managing.Api/Program.cs` - Auto-configures SignalR with Redis when available
- `src/Managing.Bootstrap/ApiBootstrap.cs` - Registers Redis service

**How It Works:**
- Checks if Redis is configured
- If yes: Adds Redis backplane to SignalR
- If no: Runs in single-instance mode (graceful degradation)

### 3. Configuration

**Files Modified:**
- `src/Managing.Api/appsettings.json` - Default config (empty, for local dev)
- `src/Managing.Api/appsettings.Sandbox.json` - `srv-captain--redis:6379`
- `src/Managing.Api/appsettings.Production.json` - `srv-captain--redis:6379`

### 4. NuGet Packages Added

- `Microsoft.AspNetCore.SignalR.StackExchangeRedis` (8.0.10) - SignalR backplane
- `Microsoft.Extensions.Caching.StackExchangeRedis` (8.0.10) - Redis caching
- `StackExchange.Redis` (2.8.16) - Redis client

## Deployment Steps for CapRover

### Step 1: Create Redis Service

1. In CapRover, go to **Apps**
2. Click **One-Click Apps/Databases**
3. Search for "Redis"
4. Deploy Redis (or use existing one)
5.
   Note the service name: `srv-captain--redis` (or your custom name)

### Step 2: Configure CapRover App

For `dev-managing-api` (Sandbox):

1. **Enable WebSocket Support**
   - Go to **HTTP Settings**
   - Toggle **"WebSocket Support"** to ON
   - Save

2. **Enable Sticky Sessions**
   - In **HTTP Settings**
   - Toggle **"Enable Sticky Sessions"** to ON
   - Save

3. **Verify Redis Connection String**
   - The connection string is already in `appsettings.Sandbox.json`
   - Default: `srv-captain--redis:6379`
   - If you used a different Redis service name, update via environment variable:
     ```
     ConnectionStrings__Redis=srv-captain--your-redis-name:6379
     ```
   - Or use the fallback:
     ```
     REDIS_URL=srv-captain--your-redis-name:6379
     ```

### Step 3: Deploy

1. Build and deploy the API:
   ```bash
   cd src/Managing.Api
   # Your normal deployment process
   ```

2. Watch the logs during startup. You should see:
   ```
   ✅ Configuring SignalR with Redis backplane: srv-captain--redis:6379
   ✅ Redis connection established successfully
   ```

### Step 4: Scale to Multiple Instances

1. In CapRover, go to your `dev-managing-api` app
2. **App Configs** tab
3. Set **"Number of app instances"** to `2` or `3`
4. Click **Save & Update**

### Step 5: Test

1. Open the frontend (Kaigen Web UI)
2. Open the AI Chat
3. Send a message
4. Should work without "No Connection with that ID" errors

## Verification Checklist

After deployment, verify:

- [ ] Redis service is running in CapRover
- [ ] WebSocket support is enabled
- [ ] Sticky sessions are enabled
- [ ] API logs show Redis connection success
- [ ] Multiple instances are running
- [ ] AI Chat works without connection errors
- [ ] Browser Network tab shows WebSocket upgrade successful

## Troubleshooting

### Issue: "No Connection with that ID" Still Appears

**Check:**
1. Redis service is running: `redis-cli -h srv-captain--redis ping`
2. API logs show Redis connected (not "Redis not configured")
3. Sticky sessions are ON
4.
   WebSocket support is ON

**Quick Test:**
- Temporarily set instances to 1
- If it works with 1 instance, the issue is multi-instance setup
- If it fails with 1 instance, check WebSocket/proxy configuration

### Issue: Redis Connection Failed

**Check Logs For:**
```
⚠️ Failed to configure SignalR Redis backplane: SignalR will work in single-instance mode only
```

**Solutions:**
1. Verify Redis service name matches configuration
2. Ensure Redis is not password-protected (or add password to config)
3. Check Redis service health in CapRover

### Issue: WebSocket Upgrade Failed

Not related to Redis. Check:
1. CapRover WebSocket support is ON
2. Nginx configuration allows upgrades
3. Browser console for detailed error

## Configuration Reference

### Connection String Formats

**Simple (no password):**
```
srv-captain--redis:6379
```

**With Password:**
```
srv-captain--redis:6379,password=your-password
```

**Multiple Options:**
```
srv-captain--redis:6379,password=pwd,ssl=true,abortConnect=false
```

### Configuration Priority

The app checks these in order:
1. `ConnectionStrings:Redis` (appsettings.json or `ConnectionStrings__Redis` environment variable)
2. `REDIS_URL` (fallback environment variable)

**Recommended**: Use `ConnectionStrings__Redis` environment variable to override appsettings without rebuilding.

## Architecture Benefits

### Before (Single Instance)

```
Frontend → Nginx → API Instance
                   - In-memory SignalR
                   - Connection IDs stored locally
                   ❌ Scale limited to 1 instance
```

### After (Multi-Instance with Redis)

```
Frontend → Nginx (sticky) → API Instance 1 ┐
                          → API Instance 2 ├─→ Redis ← SignalR Backplane
                          → API Instance 3 ┘

- Connection IDs in Redis
- Messages distributed via pub/sub
- Any instance can handle any connection
✅ Scale to N instances
```

## Next Steps

After successful deployment:

1. **Monitor Performance**
   - Watch Redis memory usage
   - Check API response times
   - Monitor WebSocket connection stability

2.
   **Consider Redis Clustering**
   - For high availability
   - If scaling beyond 3-4 API instances

3. **Extend Redis Usage**
   - Distributed caching
   - Rate limiting
   - Session storage

## Rollback Plan

If issues occur:

1. **Immediate**: Set instances to 1
2. **Environment Variable**: Set `REDIS_URL=` (empty) to disable Redis. Because `ConnectionStrings:Redis` is checked first, this only takes effect when that value is also empty.
3. **Code Rollback**: Previous version still works (graceful degradation)

The implementation is backward-compatible and doesn't require Redis to function.

## Support

For issues:

1. Check the Redis documentation: `src/Managing.Infrastructure.Storage/README-REDIS.md`
2. Review this guide
3. Check CapRover app logs for Redis/SignalR messages
4. Test with 1 instance first, then scale up
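
### Optional: Pre-Flight Check Script

Before working through the support steps, it can help to confirm which Redis endpoint a container would actually pick up. The following is a minimal sketch that mirrors the configuration priority described in this guide (`ConnectionStrings__Redis` first, then `REDIS_URL`). The function names `resolve_redis_connection`, `redis_host`, and `redis_port` are hypothetical helpers and not part of the repository. The script only inspects environment variables, so values set in `appsettings.*.json` are not visible to it, and the default port of 6379 for a connection string without a port is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper functions; not part of the Managing repository.

# Print the Redis connection string from the environment, following the
# documented priority: ConnectionStrings__Redis first, then REDIS_URL.
# Prints nothing when neither is set or both are empty.
resolve_redis_connection() {
  if [ -n "${ConnectionStrings__Redis:-}" ]; then
    printf '%s\n' "$ConnectionStrings__Redis"
  elif [ -n "${REDIS_URL:-}" ]; then
    printf '%s\n' "$REDIS_URL"
  fi
}

# Print the host part of "host:port[,option=value...]".
redis_host() {
  local endpoint="${1%%,*}"   # drop ",password=..." style options
  printf '%s\n' "${endpoint%%:*}"
}

# Print the port part, or 6379 (assumed default) when no port is given.
redis_port() {
  local endpoint="${1%%,*}"
  case "$endpoint" in
    *:*) printf '%s\n' "${endpoint##*:}" ;;
    *)   printf '6379\n' ;;
  esac
}

# Example usage (run inside a container that has redis-cli installed):
#   conn="$(resolve_redis_connection)"
#   [ -n "$conn" ] && redis-cli -h "$(redis_host "$conn")" -p "$(redis_port "$conn")" ping
```

If `resolve_redis_connection` prints nothing, no Redis endpoint is set through the environment, and the API falls back to whatever `appsettings` provides or to single-instance mode.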