Troubleshooting
Common issues, solutions, and debugging techniques for ScoutQuest deployment and operation.
Common Issues
Service Registration Problems
❌ Problem: "Connection refused" when registering service
Symptoms: Client throws connection errors when trying to register
Error: Request failed with status code ECONNREFUSED
at ScoutQuestClient.registerService (client.js:45)
// Or in Rust (os error 61 is the macOS code; Linux reports os error 111):
Error: Network error: Connection refused (os error 61)
Causes & Solutions:
- ScoutQuest server not running: Start the server with ./scoutquest-server
- Wrong server URL: Check that the client uses the correct host and port (default: http://localhost:8080)
- Firewall blocking: Ensure port 8080 is open and accessible
- Network connectivity: Test with curl http://localhost:8080/health
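When the error is caught in Node.js, the error code usually narrows down which of the causes above applies. The sketch below is a hypothetical helper, not part of the ScoutQuest client; it only assumes that the caught error carries a standard Node.js system error code, either directly or on its cause.

```javascript
// Hypothetical helper, not part of the ScoutQuest client API.
// Maps Node.js system error codes to the likely causes listed above.
const CONNECTION_HINTS = new Map([
  ['ECONNREFUSED', 'Nothing is listening on that host and port: start the server or fix the server URL.'],
  ['ENOTFOUND', 'The hostname did not resolve: check the host in the server URL.'],
  ['ETIMEDOUT', 'The connection timed out: check firewall rules and network connectivity.'],
  ['EHOSTUNREACH', 'The host is unreachable: check network connectivity.'],
]);

function explainConnectionError(err) {
  // Node.js puts the code on the error itself; fetch-style clients often
  // wrap the system error in `cause`, so look in both places.
  const code = err?.code ?? err?.cause?.code;
  return CONNECTION_HINTS.get(code) ?? `No hint for error code: ${code}`;
}

// Example with a hand-built error shaped like a refused connection.
const refused = Object.assign(new Error('connect ECONNREFUSED 127.0.0.1:8080'), {
  code: 'ECONNREFUSED',
});
console.log(explainConnectionError(refused));
```

Calling it from the catch block around registration gives a first hint before reaching for curl.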
❌ Problem: Service registration succeeds but service not discoverable
Symptoms: Registration returns success, but discoverService() fails
// Registration succeeds
const instance = await client.registerService('my-service', 'localhost', 3000);
console.log('Registered:', instance.id); // Works
// Discovery fails
const discovered = await client.discoverService('my-service'); // ServiceNotFoundError
Solutions:
- Check service name spelling: Service names are case-sensitive
- Verify health status: Service might be registered but marked unhealthy
- Database issues: Check server logs for database connection errors
- Cache issues: Clear discovery cache or wait for TTL expiration
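Because service names are case-sensitive, a near miss such as different capitalization or a stray space is easy to overlook. A minimal sketch of a check for this follows; it assumes you already have the registered names as an array of strings (for example, extracted by hand from the service list), and the helper name is hypothetical.

```javascript
// Hypothetical helper: finds registered names that match the requested
// name only after trimming whitespace and ignoring case.
function findNearMisses(requested, registeredNames) {
  const normalize = (name) => name.trim().toLowerCase();
  const target = normalize(requested);
  return registeredNames.filter(
    (name) => name !== requested && normalize(name) === target
  );
}

// Example data, made up for illustration.
const registered = ['My-Service', 'user-service', 'my-service '];
console.log(findNearMisses('my-service', registered));
```

An empty result means spelling is not the problem, so move on to health status and cache checks.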
❌ Problem: Health check failures
Symptoms: Services registered but marked as unhealthy
// Health check endpoint not responding
GET http://localhost:3000/health → 404 Not Found
// Or timeout
Error: Health check timeout after 5000ms
Solutions:
- Implement health endpoint: Add a /health endpoint to your service
- Check endpoint path: Ensure the health check URL is correct
- Verify service is running: Test service accessibility directly
- Adjust timeout: Increase the health check timeout in configuration
Service Discovery Issues
❌ Problem: "ServiceNotFoundError" for existing services
Symptoms: Service exists but cannot be discovered
ScoutQuestError: Service 'user-service' not found
at ServiceDiscoveryClient.discoverService
Debug Steps:
# 1. List all services
curl http://localhost:8080/api/services
# 2. Check specific service instances
curl http://localhost:8080/api/services/user-service/instances
# 3. Register the service again to confirm that registration succeeds
curl -X POST http://localhost:8080/api/services \
  -H "Content-Type: application/json" \
  -d '{
    "name": "user-service",
    "host": "localhost",
    "port": 3000
  }'
❌ Problem: Load balancing not working
Symptoms: Every discovery call returns the same instance; requests are not distributed
Solutions:
- Multiple instances: Ensure multiple healthy instances are registered
- Check strategy: Verify load balancing strategy configuration
- Clear cache: Discovery cache might return same cached instance
// Force fresh discovery (bypass cache)
const options = {
  bypassCache: true,
  loadBalancingStrategy: 'round_robin'
};
const instance = await client.discoverServiceWithOptions('user-service', options);
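To see why the number of healthy instances matters, here is a minimal round-robin sketch. It is not ScoutQuest's implementation, and the healthy field is a hypothetical stand-in for however instance health is reported. With only one healthy instance, rotation necessarily returns that instance every time.

```javascript
// Minimal round-robin sketch. The `healthy` field is an assumption made
// for illustration, not a documented ScoutQuest property.
function createRoundRobin(instances) {
  let counter = 0;
  return function pick() {
    // Only healthy instances take part in the rotation.
    const healthy = instances.filter((instance) => instance.healthy);
    if (healthy.length === 0) return null;
    const chosen = healthy[counter % healthy.length];
    counter += 1;
    return chosen;
  };
}

const pick = createRoundRobin([
  { id: 'a', port: 3000, healthy: true },
  { id: 'b', port: 3001, healthy: false },
  { id: 'c', port: 3002, healthy: true },
]);
console.log([pick().id, pick().id, pick().id]); // a, c, a
```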
HTTP Communication Problems
❌ Problem: HTTP requests through ScoutQuest fail
Symptoms: Direct calls work, but calls through discovery fail
// Direct call works
const directResponse = await fetch('http://localhost:3000/api/users');
// The same call through ScoutQuest fails
const response = await client.getService('user-service', '/api/users'); // Error
Debug approach:
// 1. Verify service discovery works
const instance = await client.discoverService('user-service');
console.log('Discovered:', instance);
// 2. Test direct call to discovered instance
const directResponse = await fetch(`http://${instance.host}:${instance.port}/api/users`);
console.log('Direct call status:', directResponse.status);
// 3. Enable debug logging (set debug: true where the client is created, before running steps 1 and 2)
const client = new ScoutQuestClient({
  serverUrl: 'http://localhost:8080',
  debug: true
});
❌ Problem: Timeout errors
Symptoms: Requests time out intermittently
ScoutQuestError: Request timeout after 5000ms
Solutions:
- Increase timeout: Adjust client timeout configuration
- Check service performance: Monitor target service response times
- Network issues: Verify network connectivity and latency
// Increase timeout for specific calls
const response = await client.getService('slow-service', '/api/data', {
  timeout: 30000 // 30 seconds
});
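Rather than guessing a timeout, you can derive one from measured response times of the target service. The sketch below uses the nearest-rank percentile; the helper names and the 1.5x headroom factor are assumptions chosen for illustration, not ScoutQuest defaults.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical rule of thumb: 99th percentile plus some headroom.
function suggestTimeout(samplesMs, headroom = 1.5) {
  return Math.ceil(percentile(samplesMs, 99) * headroom);
}

// Made-up response times in milliseconds, including one slow outlier.
const observedMs = [120, 80, 4500, 95, 110, 100, 90, 105, 130, 85];
console.log('median:', percentile(observedMs, 50));
console.log('suggested timeout:', suggestTimeout(observedMs));
```

A large gap between the median and the 99th percentile points at the target service or the network rather than at the timeout setting.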
Server-Side Issues
❌ Problem: ScoutQuest server won't start
Common error messages and solutions:
# Error: Address already in use
Error: Address already in use (os error 48)
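Solution: Another process is already using port 8080. Stop that process or start the server on a different port.
# Find the process using port 8080
lsof -i :8080
# Stop it, replacing <PID> with the process ID reported by lsof
kill -9 <PID>
# Or use a different port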
SCOUTQUEST_PORT=8081 ./scoutquest-server
# Error: Database connection failed
Error: Failed to connect to database: Connection refused
Solutions:
- Start PostgreSQL: brew services start postgresql (macOS) or sudo systemctl start postgresql (Linux)
- Create database: createdb scoutquest
- Check the connection string in the config file
❌ Problem: High memory usage
Symptoms: Server memory consumption grows over time
Diagnostic steps:
# Check memory usage
ps aux | grep scoutquest-server
# Monitor over time
watch -n 5 'ps aux | grep scoutquest-server'
# Check metrics endpoint
curl http://localhost:8080/metrics | grep memory
Solutions:
- Reduce cache size: Lower cache TTL and max size in configuration
- Connection pool limits: Reduce database connection pool size
- Service cleanup: Implement automatic deregistration of dead services
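The service cleanup idea can be sketched as a filter over registered instances. This is an illustration only: the lastHeartbeatMs field and the helper name are hypothetical, and how stale instances are then deregistered depends on your setup.

```javascript
// Hypothetical sketch: selects instances that have been silent for longer
// than the allowed window. Field names are assumptions for illustration.
function findStaleInstances(instances, nowMs, maxSilenceMs) {
  return instances.filter(
    (instance) => nowMs - instance.lastHeartbeatMs > maxSilenceMs
  );
}

// Made-up data: "b" last reported 95 seconds before the current time.
const currentTimeMs = 100000;
const knownInstances = [
  { id: 'a', lastHeartbeatMs: 95000 },
  { id: 'b', lastHeartbeatMs: 5000 },
];
const stale = findStaleInstances(knownInstances, currentTimeMs, 30000);
console.log(stale.map((instance) => instance.id)); // only "b" is stale
```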
Debugging Techniques
Enable Debug Logging
// Enable debug logging in JavaScript
const client = new ScoutQuestClient({
  serverUrl: 'http://localhost:8080',
  debug: true,
  logLevel: 'debug'
});
// Or set environment variable
process.env.SCOUTQUEST_DEBUG = 'true';
process.env.SCOUTQUEST_LOG_LEVEL = 'debug';
# Enable debug logging in Rust
RUST_LOG=debug cargo run
# Or limit it to the scoutquest_rust crate
RUST_LOG=scoutquest_rust=debug cargo run
// In code
use tracing::{debug, info, warn, error};
let client = ServiceDiscoveryClient::builder()
    .server_url("http://localhost:8080")
    .enable_debug_logging(true)
    .build()?;
# Enable server debug logging
RUST_LOG=debug ./scoutquest-server
# Or via configuration
[logging]
level = "debug"
format = "text" # Easier to read during debugging
Network Debugging
# Test ScoutQuest server connectivity
curl -v http://localhost:8080/health
# Test service registration
curl -X POST http://localhost:8080/api/services \
  -H "Content-Type: application/json" \
  -d '{
    "name": "test-service",
    "host": "localhost",
    "port": 3000,
    "metadata": {"version": "1.0.0"}
  }'
# Test service discovery
curl http://localhost:8080/api/services/test-service
# Test health checks
curl http://localhost:3000/health
# Monitor network traffic on the loopback interface (lo0 on macOS; use lo on Linux)
sudo tcpdump -i lo0 port 8080
Database Debugging
# Connect to the PostgreSQL database
psql -h localhost -U scoutquest -d scoutquest
-- Check the services table
SELECT * FROM services;
-- Check service instances
SELECT * FROM service_instances;
-- Check health check status
SELECT si.*, hc.status, hc.last_check
FROM service_instances si
LEFT JOIN health_checks hc ON si.id = hc.service_instance_id;
Performance Analysis
# Server metrics
curl http://localhost:8080/metrics
# Specific metrics to check
curl -s http://localhost:8080/metrics | grep -E "(http_requests|service_discovery|health_check)"
# Load testing with Apache Bench
ab -n 1000 -c 10 http://localhost:8080/api/services
# Monitor with htop
htop -p $(pgrep scoutquest-server)
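If you want to inspect metrics programmatically instead of with grep, a small parser is enough. The sketch below assumes the /metrics endpoint returns the Prometheus text format without trailing timestamps, which this guide does not confirm; the metric names in the sample are made up.

```javascript
// Assumes Prometheus-style lines: "<name>{labels} <value>".
// Comment lines start with "#". Trailing timestamps are not handled.
function parseMetrics(text) {
  const samples = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const lastSpace = line.lastIndexOf(' ');
    if (lastSpace === -1) continue;
    const value = Number(line.slice(lastSpace + 1));
    if (Number.isNaN(value)) continue;
    samples.push({ name: line.slice(0, lastSpace), value });
  }
  return samples;
}

// Made-up sample output for illustration.
const sampleText = `
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET"} 1027
health_check_failures_total 3
process_resident_memory_bytes 52428800
`;

// Same filter as the grep command above.
const interesting = parseMetrics(sampleText).filter((sample) =>
  /http_requests|service_discovery|health_check/.test(sample.name)
);
console.log(interesting);
```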
Common Error Messages
Client Errors
ServiceNotFoundError: Service 'xyz' not found
Service is not registered or all instances are unhealthy
ServiceUnavailableError: Service 'xyz' is unavailable
Service registered but no healthy instances available
NetworkError: Connection refused
Cannot connect to ScoutQuest server
TimeoutError: Request timeout after Xms
Request took longer than configured timeout
CircuitBreakerOpenError: Circuit breaker is open
Too many failures, circuit breaker protecting service
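For readers unfamiliar with the pattern behind CircuitBreakerOpenError, here is a minimal circuit breaker sketch. It is not ScoutQuest's implementation; the class, the threshold, and the cool-down values are assumptions chosen for illustration.

```javascript
// Minimal circuit breaker sketch with an injectable clock for testing.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null;
  }

  // True while calls should be rejected without reaching the service.
  isOpen() {
    if (this.openedAt === null) return false;
    // After the cool-down, let a trial call through (half-open state).
    return this.now() - this.openedAt < this.resetAfterMs;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

const breaker = new CircuitBreaker({ failureThreshold: 2 });
breaker.recordFailure();
breaker.recordFailure();
console.log('open after two failures:', breaker.isOpen());
```

Seeing this error therefore usually means the underlying failures (timeouts, refused connections) need fixing first; the breaker is reporting them, not causing them.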
Server Errors
Address already in use (os error 48)
Port 8080 is already in use by another process (os error 48 on macOS, os error 98 on Linux)
Failed to connect to database
PostgreSQL is not running or connection configuration is incorrect
Redis connection failed
Redis server is not accessible (if using Redis for caching)
TLS certificate error
Invalid or expired TLS certificates
Health Check Debugging
Manual Health Check Testing
# Test health endpoint directly
curl -v http://localhost:3000/health
# Expected response
HTTP/1.1 200 OK
Content-Type: application/json
{"status": "healthy", "timestamp": "2024-01-15T10:30:00Z"}
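To compare a real response against the expected one above, the check can be written down explicitly. This is a sketch under assumptions: the guide does not state whether ScoutQuest inspects the response body or only the status code, and the helper name is hypothetical.

```javascript
// Hypothetical check: a 2xx status plus a JSON body with status "healthy".
function evaluateHealthResponse(statusCode, bodyText) {
  if (statusCode < 200 || statusCode >= 300) {
    return { healthy: false, reason: `unexpected status ${statusCode}` };
  }
  let body;
  try {
    body = JSON.parse(bodyText);
  } catch {
    return { healthy: false, reason: 'body is not valid JSON' };
  }
  if (body === null || typeof body !== 'object' || body.status !== 'healthy') {
    return { healthy: false, reason: 'status field is not "healthy"' };
  }
  return { healthy: true, reason: 'ok' };
}

console.log(evaluateHealthResponse(200, '{"status": "healthy"}'));
console.log(evaluateHealthResponse(404, 'Not Found'));
```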
Health Check Implementation Examples
// Basic health check
app.get('/health', (req, res) => {
  res.json({
    status: 'healthy',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    version: process.env.npm_package_version
  });
});
// Advanced health check with dependencies. Register this instead of the basic
// handler above: Express only runs the first matching handler that sends a response.
// Assumes db and redis are already-connected clients.
app.get('/health', async (req, res) => {
  const checks = {};
  let healthy = true;
  // Database check
  try {
    await db.query('SELECT 1');
    checks.database = 'healthy';
  } catch (error) {
    checks.database = 'unhealthy';
    healthy = false;
  }
  // Redis check
  try {
    await redis.ping();
    checks.redis = 'healthy';
  } catch (error) {
    checks.redis = 'unhealthy';
    healthy = false;
  }
  // Report 503 so that health checks mark the instance as unhealthy
  const status = healthy ? 200 : 503;
  res.status(status).json({
    status: healthy ? 'healthy' : 'unhealthy',
    checks,
    timestamp: new Date().toISOString()
  });
});
use axum::{extract::Extension, http::StatusCode, response::Json, routing::get, Router};
use serde_json::{json, Value};

async fn health_check() -> Result<Json<Value>, StatusCode> {
    // Basic health check
    Ok(Json(json!({
        "status": "healthy",
        "timestamp": chrono::Utc::now().to_rfc3339(),
        "version": env!("CARGO_PKG_VERSION")
    })))
}

// Advanced health check
// The generic parameters were lost from the original listing. The pool types
// used here (sqlx::PgPool and r2d2::Pool<redis::Client>) are assumptions that
// match the calls made on them; substitute your own pool types.
async fn advanced_health_check(
    Extension(db_pool): Extension<sqlx::PgPool>,
    Extension(redis_pool): Extension<r2d2::Pool<redis::Client>>,
) -> (StatusCode, Json<Value>) {
    let mut checks = serde_json::Map::new();
    let mut healthy = true;
    // Database check
    match sqlx::query("SELECT 1").fetch_one(&db_pool).await {
        Ok(_) => {
            checks.insert("database".to_string(), json!("healthy"));
        }
        Err(_) => {
            checks.insert("database".to_string(), json!("unhealthy"));
            healthy = false;
        }
    }
    // Redis check
    let redis_status = match redis_pool.get() {
        Ok(mut conn) => match redis::cmd("PING").query::<String>(&mut *conn) {
            Ok(_) => "healthy",
            Err(_) => {
                healthy = false;
                "unhealthy"
            }
        },
        Err(_) => {
            healthy = false;
            "unhealthy"
        }
    };
    checks.insert("redis".to_string(), json!(redis_status));
    // Return the status code with the body so that failures yield 503
    let status_code = if healthy { StatusCode::OK } else { StatusCode::SERVICE_UNAVAILABLE };
    (
        status_code,
        Json(json!({
            "status": if healthy { "healthy" } else { "unhealthy" },
            "checks": checks,
            "timestamp": chrono::Utc::now().to_rfc3339()
        })),
    )
}
Performance Issues
❌ Problem: Slow service discovery
Solutions:
- Enable caching: Use discovery cache with appropriate TTL
- Database optimization: Ensure proper indexes on services table
- Connection pooling: Use database connection pooling
// Enable client-side caching
const client = new ScoutQuestClient({
  serverUrl: 'http://localhost:8080',
  cache: {
    enabled: true,
    ttl: 60000, // 1 minute
    maxSize: 1000
  }
});
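To make the effect of ttl and maxSize concrete, here is a minimal bounded cache sketch. It is not the ScoutQuest client's cache; the class is hypothetical and evicts the oldest entry when full, which may differ from the real eviction policy.

```javascript
// Minimal TTL cache sketch with a size bound and an injectable clock.
class TtlCache {
  constructor({ ttl, maxSize, now = Date.now }) {
    this.ttl = ttl;
    this.maxSize = maxSize;
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop it so the caller performs a fresh discovery.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.delete(key); // re-insert so the key becomes the newest
    this.entries.set(key, { value, expiresAt: this.now() + this.ttl });
  }
}

const discoveryCache = new TtlCache({ ttl: 60000, maxSize: 1000 });
discoveryCache.set('user-service', { host: 'localhost', port: 3000 });
console.log(discoveryCache.get('user-service'));
```

A longer ttl reduces load on the server but delays noticing instances that became unhealthy, so pick it with your health check interval in mind.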
❌ Problem: High latency HTTP calls
Diagnostic approach:
# Measure different components
time curl http://localhost:8080/api/services/user-service # Discovery time
time curl http://localhost:3000/api/users # Direct service time
time scoutquest-client get user-service /api/users # Through ScoutQuest
Solutions:
- Keep-alive connections: Enable HTTP keep-alive
- Connection pooling: Reuse HTTP connections
- Local instances: Prefer geographically closer instances
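The three timings from the diagnostic commands above can be combined into a rough breakdown. The sketch below is plain arithmetic with a hypothetical helper name; it assumes the three measurements were taken under similar conditions.

```javascript
// Splits the latency of a call through ScoutQuest into the part explained
// by discovery and the remainder (connection setup, proxying, and so on).
function latencyBreakdown({ discoveryMs, directMs, throughMs }) {
  const overheadMs = throughMs - directMs;
  const unexplainedMs = overheadMs - discoveryMs;
  return { overheadMs, unexplainedMs };
}

// Made-up timings in milliseconds.
console.log(latencyBreakdown({ discoveryMs: 12, directMs: 40, throughMs: 95 }));
```

If most of the overhead is explained by discovery, caching helps; if a large part remains unexplained, look at connection reuse first.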
Getting Help
Before Opening an Issue
- Check this troubleshooting guide
- Search existing GitHub issues
- Enable debug logging and gather relevant logs
- Prepare a minimal reproducible example
- Include your environment details (OS, ScoutQuest version, etc.)