Add monitoring on queries with sentry alert + Fix check position list in db for backtest
336
SQL_MONITORING_README.md
Normal file
@@ -0,0 +1,336 @@
# SQL Query Monitoring and Loop Detection System

## Overview

This SQL monitoring system was implemented to track down the SQL query loop that was generating DDoS-like load on the server. It provides detailed logging, performance monitoring, and automatic loop detection to help identify the root cause of problematic database operations.

## Features

### 🔍 **Comprehensive SQL Query Logging**
- **Detailed Query Tracking**: Every SQL query is logged with timing, parameters, and execution context
- **Performance Metrics**: Automatic tracking of query execution times, row counts, and resource usage
- **Connection State Monitoring**: Tracks database connection open/close operations with timing
- **Error Logging**: Comprehensive error logging with stack traces and context information

### 🚨 **Automatic Loop Detection**
- **Pattern Recognition**: Identifies repeated query patterns that may indicate infinite loops
- **Frequency Analysis**: Monitors query execution frequency and detects abnormally high rates
- **Performance Thresholds**: Automatically flags slow queries and high-frequency operations
- **Real-time Alerts**: Immediate notification when potential loops are detected

### 📊 **Performance Monitoring**
- **Query Execution Statistics**: Tracks execution counts, average times, and performance trends
- **Resource Usage Monitoring**: Monitors memory, CPU, and I/O usage during database operations
- **Connection Pool Monitoring**: Tracks database connection pool health and usage
- **Transaction Monitoring**: Monitors transaction duration and rollback rates

### 🎯 **Smart Alerting System**
- **Configurable Thresholds**: Customizable thresholds for slow queries, high frequency, and error rates
- **Multi-level Alerts**: Different alert levels (Info, Warning, Error, Critical) based on severity
- **Contextual Information**: Alerts include repository name, method name, and query patterns
- **Automatic Escalation**: Critical issues are automatically escalated with detailed diagnostics

## Components

### 1. SqlQueryLogger
**Location**: `src/Managing.Infrastructure.Database/PostgreSql/SqlQueryLogger.cs`

Provides comprehensive logging for individual database operations:
- Operation start/completion logging
- Query execution timing and parameters
- Connection state changes
- Error handling and exception logging
- Performance issue detection

### 2. SqlLoopDetectionService
**Location**: `src/Managing.Infrastructure.Database/PostgreSql/SqlLoopDetectionService.cs`

Advanced loop detection and performance monitoring:
- Real-time query pattern analysis
- Execution frequency tracking
- Performance threshold monitoring
- Automatic cleanup of old tracking data
- Configurable detection rules

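To make the frequency tracking and cleanup ideas concrete, here is a minimal sliding-window sketch. It is not the actual `SqlLoopDetectionService`: the class name `SlidingWindowLoopDetector`, the `Record` method, and the key format are assumptions made for illustration. Only the window and maximum-executions concepts come from this document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the real SqlLoopDetectionService may work differently.
public sealed class SlidingWindowLoopDetector
{
    private readonly TimeSpan _window;
    private readonly int _maxExecutions;
    private readonly Dictionary<string, Queue<DateTime>> _executions = new();
    private readonly object _gate = new();

    public SlidingWindowLoopDetector(TimeSpan window, int maxExecutions)
    {
        _window = window;
        _maxExecutions = maxExecutions;
    }

    // Records one execution and returns true when the pattern has run more
    // than the allowed number of times inside the tracking window.
    public bool Record(string repository, string method, string pattern, DateTime nowUtc)
    {
        var key = $"{repository}.{method}|{pattern}";
        lock (_gate)
        {
            if (!_executions.TryGetValue(key, out var timestamps))
            {
                timestamps = new Queue<DateTime>();
                _executions[key] = timestamps;
            }

            timestamps.Enqueue(nowUtc);

            // Cleanup: drop executions that have fallen out of the window.
            while (timestamps.Count > 0 && nowUtc - timestamps.Peek() > _window)
            {
                timestamps.Dequeue();
            }

            return timestamps.Count > _maxExecutions;
        }
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic and easy to test. A production service would more likely read the clock itself and evict idle keys periodically.
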
### 3. BaseRepositoryWithLogging
**Location**: `src/Managing.Infrastructure.Database/PostgreSql/BaseRepositoryWithLogging.cs`

Base class for repositories with integrated monitoring:
- Automatic query execution tracking
- Performance monitoring for all database operations
- Error handling and logging
- Loop detection integration

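The wrapping pattern behind this base class can be sketched as a standalone helper that times an operation and logs its start, completion, and failures. This is an illustration only, not the real `BaseRepositoryWithLogging`: `LoggedExecutor`, its `Action<string>` log sink (standing in for `ILogger`), and the `[SQL-OP-ERROR]` tag are all assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical sketch of the timing-and-logging wrapper pattern.
public sealed class LoggedExecutor
{
    private readonly Action<string> _log;          // stand-in for ILogger
    private readonly TimeSpan _slowQueryThreshold;

    public LoggedExecutor(Action<string> log, TimeSpan slowQueryThreshold)
    {
        _log = log;
        _slowQueryThreshold = slowQueryThreshold;
    }

    public async Task<T> ExecuteAsync<T>(string repository, string methodName, Func<Task<T>> operation)
    {
        // Short id used to correlate the log lines of one operation.
        var operationId = Guid.NewGuid().ToString("N").Substring(0, 8);
        var stopwatch = Stopwatch.StartNew();
        _log($"[SQL-OP-START] {operationId} | {repository}.{methodName}");
        try
        {
            var result = await operation();
            stopwatch.Stop();
            _log($"[SQL-OP-COMPLETE] {operationId} | {repository}.{methodName} | Completed in {stopwatch.ElapsedMilliseconds}ms");
            if (stopwatch.Elapsed > _slowQueryThreshold)
            {
                _log($"[SQL-PERFORMANCE] {repository} | {methodName} took {stopwatch.ElapsedMilliseconds}ms (threshold: {_slowQueryThreshold.TotalMilliseconds}ms)");
            }
            return result;
        }
        catch (Exception ex)
        {
            // The error tag is illustrative; the exception is rethrown unchanged.
            _log($"[SQL-OP-ERROR] {operationId} | {repository}.{methodName} | {ex.GetType().Name}: {ex.Message}");
            throw;
        }
    }
}
```

Rethrowing with `throw;` preserves the original stack trace, so the wrapper adds visibility without changing how callers handle failures.
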
### 4. Enhanced ManagingDbContext
**Location**: `src/Managing.Infrastructure.Database/PostgreSql/ManagingDbContext.cs`

Extended DbContext with monitoring capabilities:
- Query execution tracking
- Performance metrics collection
- Loop detection integration
- Statistics and health monitoring

### 5. SqlMonitoringController
**Location**: `src/Managing.Api/Controllers/SqlMonitoringController.cs`

REST API endpoints for monitoring and management:
- Real-time query statistics
- Alert management
- Performance metrics
- Health monitoring
- Configuration management

## API Endpoints

### Get Query Statistics
```http
GET /api/SqlMonitoring/statistics
```
Returns comprehensive query execution statistics including:
- Loop detection statistics
- Context execution counts
- Active query patterns
- Performance metrics

### Get Alerts
```http
GET /api/SqlMonitoring/alerts
```
Returns current alerts and potential issues:
- High frequency queries
- Slow query patterns
- Performance issues
- Loop detection alerts

### Clear Tracking Data
```http
POST /api/SqlMonitoring/clear-tracking
```
Clears all tracking data and resets monitoring counters.

### Get Query Details
```http
GET /api/SqlMonitoring/query-details/{repositoryName}/{methodName}
```
Returns detailed information about specific query patterns.

### Get Monitoring Health
```http
GET /api/SqlMonitoring/health
```
Returns overall monitoring system health status.

## Configuration

### SqlMonitoringSettings
**Location**: `src/Managing.Infrastructure.Database/PostgreSql/SqlMonitoringSettings.cs`

Comprehensive configuration options:
- **TrackingWindow**: Time window for query tracking (default: 5 minutes)
- **MaxExecutionsPerWindow**: Maximum executions per window (default: 10)
- **SlowQueryThresholdMs**: Slow query threshold (default: 1000ms)
- **HighFrequencyThreshold**: High frequency threshold (default: 20 executions/minute)
- **EnableDetailedLogging**: Enable detailed SQL logging (default: true)
- **EnableLoopDetection**: Enable loop detection (default: true)
- **EnablePerformanceMonitoring**: Enable performance monitoring (default: true)

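The options above map naturally onto a plain settings class. The sketch below mirrors the documented names and defaults, but the property types are assumptions and the real `SqlMonitoringSettings` may differ, which is why the class is named `SqlMonitoringSettingsSketch`.

```csharp
using System;

// Hypothetical shape of the settings class. Names and defaults follow the
// list above; the property types are assumed.
public sealed class SqlMonitoringSettingsSketch
{
    public TimeSpan TrackingWindow { get; set; } = TimeSpan.FromMinutes(5);
    public int MaxExecutionsPerWindow { get; set; } = 10;
    public int SlowQueryThresholdMs { get; set; } = 1000;
    public int HighFrequencyThreshold { get; set; } = 20; // executions per minute
    public bool EnableDetailedLogging { get; set; } = true;
    public bool EnableLoopDetection { get; set; } = true;
    public bool EnablePerformanceMonitoring { get; set; } = true;
}
```

With a `TimeSpan` property, the .NET configuration binder can parse a value such as `"00:05:00"` from `appsettings.json` directly.
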
## Usage Examples

### 1. Using Enhanced Repository
```csharp
using Microsoft.EntityFrameworkCore;   // FirstOrDefaultAsync
using Microsoft.Extensions.Logging;    // ILogger<T>

public class MyRepository : BaseRepositoryWithLogging, IMyRepository
{
    public MyRepository(ManagingDbContext context, ILogger<MyRepository> logger, SqlLoopDetectionService loopDetectionService)
        : base(context, logger, loopDetectionService)
    {
    }

    public async Task<User> GetUserAsync(string name)
    {
        // The method name and named parameters are passed along for logging.
        return await ExecuteWithLoggingAsync(async () =>
        {
            // Your database operation here (returns null when no user matches)
            return await _context.Users.FirstOrDefaultAsync(u => u.Name == name);
        }, nameof(GetUserAsync), ("name", name));
    }
}
```

### 2. Manual Query Tracking
```csharp
// Track a specific query execution
_context.TrackQueryExecution("GetUserByName", TimeSpan.FromMilliseconds(150), "UserRepository", "GetUserAsync");
```

### 3. Monitoring API Usage
```bash
# Get current statistics
curl -X GET "https://your-api/api/SqlMonitoring/statistics"

# Get alerts
curl -X GET "https://your-api/api/SqlMonitoring/alerts"

# Clear tracking data
curl -X POST "https://your-api/api/SqlMonitoring/clear-tracking"
```

## Logging Output Examples

### Query Execution Log
```
[SQL-OP-START] a1b2c3d4 | PostgreSqlUserRepository.GetUserByNameAsync | Started at 14:30:15.123
[SQL-CONNECTION] a1b2c3d4 | PostgreSqlUserRepository.GetUserByNameAsync | Connection OPENED (took 5ms)
[SQL-QUERY] a1b2c3d4 | PostgreSqlUserRepository.GetUserByNameAsync | Executed in 25ms | Rows: 1
[SQL-CONNECTION] a1b2c3d4 | PostgreSqlUserRepository.GetUserByNameAsync | Connection CLOSED (took 2ms)
[SQL-OP-COMPLETE] a1b2c3d4 | PostgreSqlUserRepository.GetUserByNameAsync | Completed in 32ms | Queries: 1 | Result: User
```

### Loop Detection Alert
```
[SQL-LOOP-DETECTED] e5f6g7h8 | PostgreSqlTradingRepository.GetPositionsAsync | Pattern 'GetPositionsAsync()' executed 15 times | Possible infinite loop!
[SQL-LOOP-ALERT] Potential infinite loop detected in PostgreSqlTradingRepository.GetPositionsAsync with pattern 'GetPositionsAsync()'
```

### Performance Warning
```
[SQL-PERFORMANCE] PostgreSqlTradingRepository | GetPositionsAsync took 2500ms (threshold: 1000ms)
[SQL-QUERY-DETAILS] i9j0k1l2 | Query: SELECT * FROM Positions WHERE Status = @status | Parameters: {"status":"Active"}
```

## Troubleshooting

### Common Issues and Solutions

#### 1. High Query Frequency
**Symptoms**: Multiple queries executing rapidly

**Detection**: `[SQL-LOOP-DETECTED]` logs with high execution counts

**Solution**:
- Check for recursive method calls
- Verify loop conditions in business logic
- Review async/await patterns

#### 2. Slow Query Performance
**Symptoms**: Queries taking longer than expected

**Detection**: `[SQL-PERFORMANCE]` warnings

**Solution**:
- Review query execution plans
- Check database indexes
- Optimize query parameters

#### 3. Connection Issues
**Symptoms**: Connection timeouts or pool exhaustion

**Detection**: `[SQL-CONNECTION]` error logs

**Solution**:
- Review connection management
- Check connection pool settings
- Verify proper connection disposal

#### 4. Memory Issues
**Symptoms**: High memory usage during database operations

**Detection**: Memory monitoring alerts

**Solution**:
- Review query result set sizes
- Implement pagination
- Check for memory leaks in entity tracking

## Integration Steps

### 1. Update Existing Repositories
Change existing repositories to inherit from the enhanced base class:

```csharp
// Before
public class MyRepository : IMyRepository
{
    private readonly ManagingDbContext _context;
    // ...
}

// After
public class MyRepository : BaseRepositoryWithLogging, IMyRepository
{
    public MyRepository(ManagingDbContext context, ILogger<MyRepository> logger, SqlLoopDetectionService loopDetectionService)
        : base(context, logger, loopDetectionService)
    {
    }
    // ...
}
```

### 2. Update Dependency Injection
The services are registered in `Program.cs`:
- `SqlLoopDetectionService` as Singleton
- Enhanced `ManagingDbContext` with monitoring
- All repositories with logging capabilities

### 3. Configure Monitoring Settings
Add configuration to `appsettings.json`:

```json
{
  "SqlMonitoring": {
    "TrackingWindow": "00:05:00",
    "MaxExecutionsPerWindow": 10,
    "SlowQueryThresholdMs": 1000,
    "HighFrequencyThreshold": 20,
    "EnableDetailedLogging": true,
    "EnableLoopDetection": true,
    "EnablePerformanceMonitoring": true
  }
}
```

## Monitoring Dashboard

### Key Metrics to Monitor

1. **Query Execution Count**: Track total queries per minute
2. **Average Execution Time**: Monitor query performance trends
3. **Error Rate**: Track database error frequency
4. **Connection Pool Usage**: Monitor connection health
5. **Loop Detection Alerts**: Immediate notification of potential issues

### Alert Thresholds

- **Critical**: >50 queries/minute, >5 second execution time
- **Warning**: >20 queries/minute, >1 second execution time
- **Info**: Normal operation metrics

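The thresholds above can be expressed as a small classification function. This sketch assumes that exceeding either the frequency or the duration limit is enough to raise the level; the document does not say whether the two conditions are combined with "and" or "or". `AlertLevel` and `AlertClassifier` are hypothetical names, and the `Error` level mentioned under Features is left out because no threshold is given for it.

```csharp
using System;

public enum AlertLevel { Info, Warning, Critical }

// Hypothetical helper; assumes either condition alone triggers the level.
public static class AlertClassifier
{
    public static AlertLevel Classify(double queriesPerMinute, TimeSpan executionTime)
    {
        if (queriesPerMinute > 50 || executionTime > TimeSpan.FromSeconds(5))
        {
            return AlertLevel.Critical;
        }

        if (queriesPerMinute > 20 || executionTime > TimeSpan.FromSeconds(1))
        {
            return AlertLevel.Warning;
        }

        return AlertLevel.Info;
    }
}
```

For instance, `AlertClassifier.Classify(15, TimeSpan.FromMilliseconds(2500))` yields `Warning` under these assumptions, which matches the 2500ms performance warning in the logging examples.
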
## Best Practices

### 1. Repository Design
- Always inherit from `BaseRepositoryWithLogging`
- Use `ExecuteWithLoggingAsync` for all database operations
- Include meaningful parameter names in logging calls
- Handle exceptions properly with logging

### 2. Performance Optimization
- Monitor slow queries regularly
- Implement proper indexing strategies
- Use pagination for large result sets
- Avoid N+1 query problems

### 3. Error Handling
- Log all database errors with context
- Implement proper retry mechanisms
- Use circuit breaker patterns for external dependencies
- Monitor error rates and trends

### 4. Security Considerations
- Avoid logging sensitive data in query parameters
- Use parameterized queries to prevent SQL injection
- Implement proper access controls for monitoring endpoints
- Audit database operations regularly

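One way to follow the first security point is to mask sensitive values before parameters reach the log. The helper below is a sketch under assumptions: `ParameterRedactor`, its marker list, and the `***` mask are hypothetical and not part of the monitoring system described here. The `(name, value)` tuple shape follows the `("name", name)` convention used in the repository example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the marker list is illustrative, not exhaustive.
public static class ParameterRedactor
{
    private static readonly string[] SensitiveMarkers =
        { "password", "secret", "token", "apikey", "privatekey" };

    // Returns a copy of the parameters with sensitive values masked.
    public static IReadOnlyDictionary<string, object> Redact(
        IEnumerable<(string Name, object Value)> parameters)
    {
        return parameters.ToDictionary(
            p => p.Name,
            p => IsSensitive(p.Name) ? "***" : p.Value);
    }

    private static bool IsSensitive(string name) =>
        SensitiveMarkers.Any(marker => name.Contains(marker, StringComparison.OrdinalIgnoreCase));
}
```

Matching on parameter names is a coarse filter. Values that are sensitive but carry neutral names would still need to be excluded at the call site.
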
## Conclusion

This SQL monitoring system provides the tools needed to identify the SQL query loop and to help resolve it. The system offers:

- **Real-time monitoring** of all database operations
- **Automatic loop detection** with configurable thresholds
- **Performance tracking** with detailed metrics
- **Comprehensive logging** for debugging and analysis
- **REST API endpoints** for monitoring and management
- **Configurable settings** for different environments

The system is designed to be non-intrusive while giving clear visibility into database operations, so that performance issues and potential infinite loops can be identified quickly.