User stories: US-166 to US-185
📈 Metrics Observatory
SOCVault Internal · Real-time nano-level visibility · AWS + Claude AI + API + Compute + System + Application
Wireframe state: 16 Jun 2026 11:30 UTC · Showing last 24h unless noted. Drill-down links represented by clickable table rows.
| Metric | Value | Detail |
|---|---|---|
| AWS Cost Today | $127.42 | ↑ $12 vs yesterday |
| Claude AI Today | $48.20 | ↑ 18% vs 7d avg |
| API Requests Today | 12,482 | 84/min avg |
| API Error Rate | 0.07% | p95: 312ms |
| Lambda Concurrency | 42 | MVP · Fargate/EKS at paid tier |
| Platform Uptime | 99.98% | 30d rolling |
☁️ AWS Cost Breakdown — Today US-166, FR-170
Tabs: By Service · By Region · By Tenant (Top 10) · 30-Day Forecast
| Service | Share | Cost |
|---|---|---|
| Lambda + API Gateway | 45% | $57.20 |
| Fargate / EKS (paid tier) | 23% | $29.45 |
| MongoDB Atlas (eu-west-2) | 14% | $17.84 |
| S3 Storage | 6% | $7.65 |
| SNS (SMS Alerts) | 4% | $5.10 |
| Cognito + SSM | 4% | $5.08 |
| API Gateway | 3% | $3.82 |
| Amazon SQS | 1% | $1.28 |
| ElastiCache Redis (paid tier) | 1% | $1.28 |
30-Day Forecast (at current burn rate): $3,822 / month · Budget ceiling: $5,000 · 76% of budget
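The forecast figures can be reproduced with simple arithmetic. The sketch below assumes "current burn rate" means today's AWS cost repeated for 30 days; the function names are hypothetical and the dashboard may round or truncate differently.

```python
def forecast_month(daily_cost: float, days: int = 30) -> float:
    """Project spend over `days`, assuming every day costs `daily_cost`."""
    return daily_cost * days


def budget_share(forecast: float, ceiling: float) -> float:
    """Return the forecast as a percentage of the budget ceiling."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return forecast / ceiling * 100


# Figures taken from the wireframe: $127.42 today, $5,000 ceiling.
monthly = forecast_month(127.42)
share = budget_share(monthly, 5000)
print(f"${monthly:,.2f} / month, {share:.1f}% of budget")
```

If the real forecast uses a trailing average rather than a single day, only the `daily_cost` input would change.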
✦ Claude AI Usage US-167, FR-171, FR-172
Tabs: Today · 7 Days · 30 Days · Per Tenant
| Input Tokens | Output Tokens | Total Cost |
|---|---|---|
| 4,821,320 | 1,240,810 | $48.20 |
Model: claude-sonnet-4-6 · Cap: $50/day · Usage: 96.4%
| Feature | Cost | Share |
|---|---|---|
| Scan Analysis | $27.96 | 58% |
| SOAR Triage | $10.60 | 22% |
| AI Chat (Users) | $5.78 | 12% |
| L9 Agent Scans | $2.41 | 5% |
| Financial Reports | $1.45 | 3% |
💰 Prompt caching saved $12.34 today vs non-cached equivalent. Cache hit rate: 73%. System prompt cached tokens: 2,140,000.
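The cap usage and per-feature shares above follow directly from the cost figures. This is a minimal sketch, assuming the "non-cached equivalent" is today's spend plus the reported saving; that reading and all function names are assumptions, not something the wireframe defines.

```python
DAILY_CAP_USD = 50.00
CACHE_SAVINGS_USD = 12.34
SPEND_BY_FEATURE = {
    "Scan Analysis": 27.96,
    "SOAR Triage": 10.60,
    "AI Chat (Users)": 5.78,
    "L9 Agent Scans": 2.41,
    "Financial Reports": 1.45,
}


def cap_usage_pct(spent: float, cap: float) -> float:
    """Spend as a percentage of the daily cap."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    return spent / cap * 100


def feature_shares(spend: dict[str, float]) -> dict[str, int]:
    """Each feature's share of total spend, rounded to a whole percent."""
    total = sum(spend.values())
    return {name: round(cost / total * 100) for name, cost in spend.items()}


def cache_savings_pct(spent: float, saved: float) -> float:
    """Saving relative to the assumed non-cached cost (spent + saved)."""
    return saved / (spent + saved) * 100


total_spend = sum(SPEND_BY_FEATURE.values())
usage = cap_usage_pct(total_spend, DAILY_CAP_USD)
shares = feature_shares(SPEND_BY_FEATURE)
savings = cache_savings_pct(total_spend, CACHE_SAVINGS_USD)
print(f"usage {usage:.1f}%, cache saving {savings:.1f}%", shares)
```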
⚡ API Performance US-169, FR-173
| p50 Latency | p95 Latency | p99 Latency | Error Rate |
|---|---|---|---|
| 118ms | 312ms | 1,840ms | 0.07% |
| Endpoint | Req/hr | p95 (ms) | Errors | Status |
|---|---|---|---|---|
| POST /api/scans/trigger | 284 | 198 | 0 | Normal |
| GET /api/scans/{id}/status | 2,140 | 42 | 0 | Normal |
| POST /api/auth/login | 820 | 112 | 12 | Normal |
| GET /api/reports/{id} | 1,080 | 88 | 0 | Normal |
| POST /api/soar/triage | 144 | 1,240 | 2 | Elevated |
| POST /api/ai/analyse | 380 | 2,180 | 1 | AI Dep. |
| GET /api/admin/metrics | 240 | 62 | 0 | Normal |
Note: /api/soar/triage and /api/ai/analyse latency includes Claude API call time (avg 1.1s).
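For readers unfamiliar with how p50, p95 and p99 figures are derived, here is a minimal sketch using the nearest-rank method. The wireframe does not say which percentile method the platform uses, so this is one common choice rather than the actual implementation; the sample data is synthetic.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Synthetic latencies of 1..100 ms, purely for illustration.
latencies_ms = list(range(1, 101))
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p50, p95, p99)
```

A wide gap between p95 and p99, as in the tiles above, usually points to a small number of slow requests, which is consistent with the note about Claude API call time.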
⚡ Lambda Compute Metrics US-170, FR-174 · Fargate/EKS at paid tier
| Metric | Value | Detail |
|---|---|---|
| Running Tasks | 14 | +2 in 1h |
| CPU Avg | 38% | Peak: 81% |
| Memory Avg | 52% | 4/8 GB used |
| Crashes 24h | 0 | 1 last week |
| Task / Service | CPU | Memory | Count | Health |
|---|---|---|---|---|
| api-gateway-service | 22% | 45% | 3 | Healthy |
| scanner-worker (L1–L6) | 72% | 68% | 6 | Healthy |
| l9-agent-worker | 48% | 55% | 2 | Healthy |
| soar-celery-worker | 18% | 32% | 2 | Healthy |
| wazuh-manager | 28% | 74% | 1 | Healthy |
🔬 Application Metrics US-174, FR-175
Tabs: Scan Engine · SOAR · AI Analysis
| Metric | Today | 7d Avg | Status |
|---|---|---|---|
| Scans Triggered (all layers) | 142 | 118 | Normal |
| Scan Success Rate | 98.6% | 97.2% | Good |
| L1 Recon — Avg Duration (pipeline) | 72s | 68s | Normal |
| L2+ VAPT — Avg Duration | 8m 12s | 7m 55s | Normal |
| L2 Web AppSec — Avg Duration | 14m 38s | 13m 42s | Normal |
| L9 AI Agent — Avg Duration | 7m 02s | 6m 48s | Normal |
| Scan Queue Depth (now) | 4 jobs | — | Normal |
| Scan Queue Wait Time | 42s | 38s | Normal |
| Rate-Limited Scan Attempts | 18 | 12 | Elevated |
| AI Report Generation — Success | 100% | 99.1% | Excellent |
| AI Fallback Activations (outage) | 0 | 0.3 | None |
🏗️ System Health US-175, FR-176
| Component | Metric | Value | Status |
|---|---|---|---|
| MongoDB Atlas | Connections | 184 / 500 | Normal |
| MongoDB Atlas | IOPS (read/write) | 2,840 / 480 | Normal |
| MongoDB Atlas | Storage used | 48 GB / 100 GB | 48% |
| Redis (ElastiCache) | Memory used | 1.8 GB / 4 GB | 45% |
| Redis | Cache hit rate | 84% | Good |
| Redis | Evictions / min | 0 | None |
| S3 (Scan Reports) | Objects stored | 18,420 | Normal |
| S3 | Requests / hr | 3,240 | Normal |
| Network | Egress / hr | 4.2 GB | Normal |
| SSL Certs | Expires | api-staging: 82d · api (prod): dormant | OK |
⚙️ API Settings — SOCVault Admin US-172, FR-178
Tabs: Rate Limits · CORS Origins · API Keys · Claude Model Config
| Endpoint Group | Limit | Window | Current Usage | Action |
|---|---|---|---|---|
| Auth endpoints | 10 | 60s / IP | 2.4/10 avg | |
| Scan trigger | 5 | 60s / tenant | 1.1/5 avg | |
| AI analysis (public API) | 20 | 60s / tenant | 4.2/20 avg | |
| Admin endpoints | 60 | 60s / user | 8.4/60 avg | |
| Report download | 30 | 60min / tenant | 6.2/30 avg | |
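One way to enforce per-key limits like these is a fixed-window counter. The sketch below assumes the scan-trigger group allows 5 requests per 60s per tenant, as the "1.1/5 avg" usage figure suggests. `FixedWindowLimiter` is a hypothetical class, and the platform may well use a different algorithm such as a sliding window or token bucket.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` calls per key within each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counts: dict[str, tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, key: str) -> bool:
        window_index = int(self.clock() // self.window)
        seen_index, count = self._counts.get(key, (window_index, 0))
        if seen_index != window_index:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            self._counts[key] = (window_index, count)
            return False
        self._counts[key] = (window_index, count + 1)
        return True


# A fake clock keeps the example deterministic.
now = [0.0]
limiter = FixedWindowLimiter(limit=5, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("tenant-a") for _ in range(6)]
now[0] = 60.0  # move into the next window
after_reset = limiter.allow("tenant-a")
print(results, after_reset)
```

Fixed windows are simple but permit short bursts across a window boundary; that trade-off may or may not be acceptable for auth endpoints.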
Claude AI Model Configuration
Model used for: scan analysis, SOAR triage, AI chat, financial reports. L9 agent uses claude-opus-4-8 for extended thinking.
Active CORS Origins
Allowed: https://app-staging.socvault.io
https://api-staging.socvault.io
https://app.socvault.io [post-cutover]
https://admin.socvault.io
https://msp.socvault.io
🔔 Alert Thresholds US-173, FR-179
Breach triggers: PagerDuty (critical) · Slack #alerts (high) · Email (medium)
| Metric | Threshold | Current | Status |
|---|---|---|---|
| AWS Cost / day | | $127.42 | OK |
| Claude AI Cost / day | | $48.20 | Alert! 96% |
| API Error Rate (5min) | | 0.07% | OK |
| Scan Queue Depth | | 4 jobs | OK |
| Fargate CPU Avg | | 38% | OK |
| MongoDB Connections | | 184 | OK |
| Redis Memory | | 45% | OK |
| API p99 Latency | | 1,840ms | OK |
⚠ Claude AI cost at 96% of daily cap ($50). At current burn rate, cap will be reached in ~1.4 hours. 3 tenants are running scans. Consider raising the cap or pausing low-priority scans.
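The severity-to-channel routing described for this section can be sketched as a small rule evaluator. The threshold values and severity assignments below are hypothetical placeholders, since the wireframe does not show them; only the channel mapping and the current values come from the page.

```python
from dataclasses import dataclass

# Channel mapping as stated in the wireframe.
CHANNELS = {"critical": "PagerDuty", "high": "Slack #alerts", "medium": "Email"}


@dataclass(frozen=True)
class Rule:
    metric: str
    threshold: float  # hypothetical value, not taken from the wireframe
    severity: str     # hypothetical assignment: "critical", "high" or "medium"


def evaluate(rules: list[Rule], current: dict[str, float]) -> list[tuple[str, str]]:
    """Return (metric, channel) for every rule whose current value meets
    or exceeds its threshold."""
    alerts = []
    for rule in rules:
        value = current.get(rule.metric)
        if value is not None and value >= rule.threshold:
            alerts.append((rule.metric, CHANNELS[rule.severity]))
    return alerts


rules = [
    Rule("AWS Cost / day", threshold=200.0, severity="high"),       # placeholder
    Rule("Claude AI Cost / day", threshold=45.0, severity="high"),  # placeholder: 90% of the $50 cap
    Rule("API p99 Latency (ms)", threshold=3000.0, severity="critical"),  # placeholder
]
current = {"AWS Cost / day": 127.42, "Claude AI Cost / day": 48.20,
           "API p99 Latency (ms)": 1840.0}
alerts = evaluate(rules, current)
print(alerts)
```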
🔬 Nano-Level Drill Down — Per-Request Trace US-171, FR-177
📡 Click any API request in the table above to open a per-request trace. The trace shows: route → auth middleware → rate limiter → handler → DB query → Claude API call → response. Timings are broken down to the microsecond level.
| Request ID | Timestamp | Endpoint | Tenant | Total Time | DB Time | AI Time | Status |
|---|---|---|---|---|---|---|---|
| req_4f2a1b | 11:28:42.104 | POST /api/scans/trigger | Acme Corp | 142ms | 38ms | — | 200 OK |
| req_7c9d3e | 11:28:39.881 | POST /api/ai/analyse | PrimeCare Health | 2,184ms | 12ms | 2,142ms | 200 OK |
| req_2b8f9a | 11:28:31.220 | POST /api/soar/triage | Internal — SOAR | 1,480ms | 22ms | 1,441ms | 200 OK |
| req_9e1c4d | 11:28:18.553 | GET /api/reports/R-0812 | SpamCorp | 4,820ms | 4,802ms | — | 200 OK (slow) |
Last row: MongoDB slow query detected (missing index on tenant_id + report_id). Open query analyser →
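The trace rows can be read by subtracting DB and AI time from the total to see what remains for routing, middleware and handler work. The sketch below uses the four rows shown above; treating the remainder as a single "other" bucket is an assumption, and the function names are hypothetical.

```python
# (request id, total ms, DB ms, AI ms) from the trace table; 0 where AI time is absent.
TRACES = [
    ("req_4f2a1b", 142, 38, 0),
    ("req_7c9d3e", 2184, 12, 2142),
    ("req_2b8f9a", 1480, 22, 1441),
    ("req_9e1c4d", 4820, 4802, 0),
]


def other_ms(total: int, db: int, ai: int) -> int:
    """Time not attributed to the database or the Claude API call."""
    return total - db - ai


def dominant_stage(total: int, db: int, ai: int) -> str:
    """Name the stage that accounts for the most time in a request."""
    stages = {"db": db, "ai": ai, "other": other_ms(total, db, ai)}
    return max(stages, key=stages.get)


breakdown = {
    req_id: (other_ms(total, db, ai), dominant_stage(total, db, ai))
    for req_id, total, db, ai in TRACES
}
for req_id, (remaining, stage) in breakdown.items():
    print(f"{req_id}: {remaining}ms outside DB/AI, dominated by {stage}")
```

On this data the slow report request is dominated by database time, which matches the slow-query note on the last row.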