Reading time: 8 minutes
1. System Overview
What is AI Risk Scoring?
AI Risk Scoring analyzes every detected vulnerability and computes a contextual risk score from 0 to 10, taking into account:
- Intrinsic criticality of the service
- Availability of public exploits
- Age of the vulnerability (CVE age)
- Non-standard configuration
- Historical attack patterns
- The specific context of your asset
Why does it matter?
The CVSS (Common Vulnerability Scoring System) score is a standard, but it does not capture context:
- SSH exposed on port 22 is riskier than the same service on port 65222
- A two-year-old vulnerability with a public exploit is more dangerous than a recent one with no exploit
- A publicly exposed MySQL database is critical even with a moderate CVSS score
AI Risk Scoring addresses this by providing a contextual, actionable score.
System Output
For each vulnerability you get:
{
  "ai_risk_score": 8.5,
  "exploit_likelihood": 0.85,
  "confidence_score": 0.95,
  "reasoning": "SSH service on standard port with public exploit available. CVE age: 2 years (weaponized). High service criticality.",
  "contributing_factors": {
    "service_criticality": 2.5,
    "exploit_available": 2.0,
    "cve_age": 1.0,
    "port_exposure": 0.8,
    "ml_adjustment": 1.2
  },
  "model_version": "v1.2.0-hybrid",
  "is_anomaly": false,
  "anomaly_type": null,
  "anomaly_score": 0.15
}
2. Hybrid Architecture
Three Levels of Analysis
The system uses a multi-layer approach to maximize accuracy and robustness:
+-----------------------------------------------------------+
|                    Vulnerability Data                     |
|    (port, service, version, CVE, CVSS, exploit, etc.)     |
+-----------------------------+-----------------------------+
                              |
                              v
               +------------------------------+
               | 1. Rule-Based Engine         |
               |    (always active)           |
               |    - Deterministic rules     |
               |    - Service criticality     |
               |    - Exploit analysis        |
               |    - CVE age factor          |
               +--------------+---------------+
                              |
                              v
               +------------------------------+
               | 2. ML Model (optional)       |
               |    Random Forest Regressor   |
               |    - Learned patterns        |
               |    - Historical context      |
               |    - User feedback           |
               +--------------+---------------+
                              |
                              v
               +------------------------------+
               | 3. Anomaly Detection         |
               |    - Statistical analysis    |
               |    - Unexpected exposures    |
               |    - Service-specific flags  |
               +--------------+---------------+
                              |
                              v
               +------------------------------+
               | Final AI Risk Score          |
               |    + Confidence              |
               |    + Explanation             |
               +------------------------------+
Why Hybrid?
Advantages of the hybrid design:
- Always available - The rule engine works immediately, with no training required
- Interpretable - The rules are transparent and explainable
- Accurate - ML improves predictions using real data
- Robust - Falls back to rules when ML is unavailable
- Evolving - Continuous learning from user feedback
3. Rule-Based Engine
How It Works
The Rule-Based Engine is the deterministic core of the system. It applies established security rules to compute a base risk score.
3.1 Service Criticality
Each service has an intrinsic criticality score based on:
- Historical exposure to attacks
- Potential impact of a compromise
- Ease of exploitation
Service classification:
# CRITICAL SERVICES (8.5 - 10.0)
# Should NEVER be publicly exposed
{
    'telnet': 9.5,  # Plaintext, no encryption
    'rexec': 9.0,   # Remote execution, insecure
    'rlogin': 9.0,  # Remote login, no encryption
    'rdp': 8.5,     # Frequent target, RCE risk
    'vnc': 8.5      # Remote desktop, weak auth
}
# HIGH RISK SERVICES (7.0 - 8.4)
# Require significant hardening
{
    'ssh': 7.0,         # Remote access, brute-forceable
    'ftp': 7.5,         # Often misconfigured
    'smb': 8.0,         # EternalBlue, ransomware vector
    'mysql': 7.5,       # Direct DB access
    'postgresql': 7.5,  # Direct DB access
    'mongodb': 7.8,     # NoSQL, frequent misconfig
    'redis': 7.6,       # In-memory, often no auth
    'docker': 8.0,      # Container escape risk
    'kubernetes': 8.2   # Orchestration, high impact
}
# MEDIUM RISK SERVICES (4.0 - 6.9)
{
    'http': 5.0,   # Web server, XSS/SQLi risk
    'https': 4.0,  # Encrypted, but still web app risk
    'smtp': 5.5,   # Email relay risk
    'dns': 5.5,    # DNS amplification, cache poisoning
    'ldap': 6.0    # Directory access
}
# LOW RISK SERVICES (< 4.0)
{
    'ntp': 2.0,   # Time sync
    'snmp': 3.5   # Network monitoring
}
3.2 CVSS Score Integration
The CVSS score is used as a baseline:
if cvss_score is not None:
    base_score = max(base_score, cvss_score)
else:
    base_score = 5.0  # Conservative default
Rationale: CVSS reflects the severity of the specific CVE, so when it is higher than the service criticality, it takes precedence.
3.3 Exploit Availability
The availability of public exploits sharply increases real-world risk:
if has_exploit:
    risk_score += 2.0           # +2.0 CRITICAL bonus
    exploit_likelihood += 0.30  # +30% likelihood
Exploit sources:
- Metasploit Framework
- ExploitDB
- GitHub PoC repositories
- Security researcher disclosures
3.4 CVE Age Factor
Vulnerabilities do not become less dangerous over time. On the contrary, older CVEs have had more time to be weaponized.
cve_age_days = (today - cve_published_date).days
if cve_age_days < 365:       # < 1 year
    risk_score += 0.5        # New, active research
elif cve_age_days < 1095:    # 1-3 years
    risk_score += 1.0        # Weaponized
elif cve_age_days < 1825:    # 3-5 years
    risk_score += 1.5        # Well-known, tools available
else:                        # 5+ years
    risk_score += 1.5        # Ancient, no excuse
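As a minimal sketch, the age tiers above could be wrapped in a standalone function. The name `cve_age_bonus` and its signature are illustrative assumptions, not part of the product's API. Passing `today` explicitly keeps the result deterministic and easy to test.

```python
from datetime import date


def cve_age_bonus(published: date, today: date) -> float:
    """Return the age bonus for a CVE, using the tiers from section 3.4."""
    age_days = (today - published).days
    if age_days < 365:    # under 1 year: new, active research
        return 0.5
    if age_days < 1095:   # 1-3 years: weaponized
        return 1.0
    return 1.5            # 3+ years: both older tiers add the same bonus


# 2021-03-01 to 2023-03-01 is 730 days, which falls in the 1-3 year tier
print(cve_age_bonus(date(2021, 3, 1), date(2023, 3, 1)))  # 1.0
```

Because the 3-5 year and 5+ year tiers carry the same bonus, the sketch collapses them into a single branch.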
3.5 Port Configuration Risk
The port a service runs on affects its risk:
STANDARD_PORTS = {
    'ssh': 22,
    'http': 80,
    'https': 443,
    'mysql': 3306,
    # ... etc
}
# .get() avoids a KeyError for services with no known standard port
if port == STANDARD_PORTS.get(service):
    risk_score += 0.8  # More visible to automated scanners
if port >= 49152:      # IANA dynamic/private ports (49152-65535)
    risk_score -= 0.3  # Slightly less exposed
Rationale: Services on standard ports are the first target of automated scanners (Shodan, Masscan, Nmap, botnets).
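The port rule can be sketched as a small pure function. The helper name `port_risk` is hypothetical, and the table contains only the ports listed above. The sketch returns the adjustment instead of mutating a running score, which makes the two cases easy to check in isolation.

```python
STANDARD_PORTS = {"ssh": 22, "http": 80, "https": 443, "mysql": 3306}
DYNAMIC_PORT_START = 49152  # IANA dynamic/private range is 49152-65535


def port_risk(port: int, service: str) -> float:
    """Return the score adjustment for a service listening on `port`."""
    if STANDARD_PORTS.get(service) == port:
        return 0.8    # standard port: first target of automated scanners
    if port >= DYNAMIC_PORT_START:
        return -0.3   # high dynamic port: slightly less exposed
    return 0.0        # anything else: no adjustment


print(port_risk(22, "ssh"))     # 0.8
print(port_risk(65222, "ssh"))  # -0.3
print(port_risk(2222, "ssh"))   # 0.0
```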
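Final Rule-Based Calculation
def calculate_rule_based_score(vuln_data):
    # Fields read from the input record (key names are assumed here)
    service = vuln_data.get('service')
    port = vuln_data.get('port')
    cvss_score = vuln_data.get('cvss_score')
    factors = {}
    # 1. Service criticality (5.0 baseline for unknown services)
    score = SERVICE_CRITICALITY.get(service, 5.0)
    factors['service_criticality'] = score
    # 2. CVSS override (None check, so a CVSS of 0.0 is not treated as missing)
    if cvss_score is not None:
        score = max(score, cvss_score)
    # 3. Exploit bonus
    if vuln_data.get('has_exploit'):
        score += 2.0
        factors['exploit_available'] = 2.0
    # 4. CVE age
    factors['cve_age'] = calculate_age_bonus(vuln_data.get('cve_age_days'))
    score += factors['cve_age']
    # 5. Port risk
    factors['port_exposure'] = calculate_port_risk(port, service)
    score += factors['port_exposure']
    # 6. Protocol risk
    if vuln_data.get('protocol') == 'udp':
        score += 0.5
    # 7. Version risk
    if vuln_data.get('is_legacy'):
        score += 1.5
    # Cap at 10.0
    score = min(score, 10.0)
    return {
        'risk_score': score,
        'confidence': 0.95,
        'reasoning': build_reasoning(factors)
    }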
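To make the arithmetic concrete, here is a self-contained sketch of the rule pipeline applied to two sample findings. The function names, the input keys, and the choice to skip the age bonus when no CVE age is given are assumptions made for illustration. The tables hold only a subset of the values listed in sections 3.1 and 3.5.

```python
# Illustrative subset of the tables from sections 3.1 and 3.5
SERVICE_CRITICALITY = {"ssh": 7.0, "http": 5.0, "mysql": 7.5}
STANDARD_PORTS = {"ssh": 22, "http": 80, "https": 443, "mysql": 3306}


def age_bonus(age_days):
    """CVE age tiers from section 3.4; no bonus without a CVE (assumption)."""
    if age_days is None:
        return 0.0
    if age_days < 365:
        return 0.5
    if age_days < 1095:
        return 1.0
    return 1.5


def port_risk(port, service):
    """Port adjustment from section 3.5."""
    if STANDARD_PORTS.get(service) == port:
        return 0.8
    if port >= 49152:
        return -0.3
    return 0.0


def rule_score(vuln):
    """Apply the rules in order and return (capped score, contributions)."""
    base = SERVICE_CRITICALITY.get(vuln["service"], 5.0)
    cvss = vuln.get("cvss_score")
    if cvss is not None:
        base = max(base, cvss)
    factors = {
        "base": base,
        "exploit": 2.0 if vuln.get("has_exploit") else 0.0,
        "cve_age": age_bonus(vuln.get("cve_age_days")),
        "port": port_risk(vuln["port"], vuln["service"]),
        "udp": 0.5 if vuln.get("protocol") == "udp" else 0.0,
        "legacy": 1.5 if vuln.get("is_legacy") else 0.0,
    }
    return min(sum(factors.values()), 10.0), factors


ssh = {"service": "ssh", "port": 22, "cvss_score": 7.5,
       "has_exploit": True, "cve_age_days": 730, "protocol": "tcp"}
web = {"service": "http", "port": 8080, "cvss_score": 5.3,
       "has_exploit": False, "cve_age_days": 200, "protocol": "tcp"}

ssh_score, ssh_factors = rule_score(ssh)
web_score, web_factors = rule_score(web)
print(ssh_score)            # raw sum is 7.5 + 2.0 + 1.0 + 0.8 = 11.3, capped to 10.0
print(round(web_score, 2))  # 5.3 + 0.5 = 5.8
```

Under these rules, an exploitable finding on a standard port reaches the 10.0 cap quickly, so the contribution breakdown is often more informative than the capped total.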
4. Machine Learning Model
Algorithm: Random Forest Regressor
Why Random Forest?
- Robust - Handles outliers and missing values well
- Interpretable - Feature importance analysis
- Non-linear - Captures complex relationships
- Ensemble - Reduces overfitting
- Fast inference - Real-time predictions
Hyperparameters
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,     # 100 decision trees
    max_depth=10,         # Max tree depth
    min_samples_split=5,  # Min samples to split node
    min_samples_leaf=2,   # Min samples in leaf
    random_state=42,      # Reproducibility
    n_jobs=-1             # Parallel processing
)
Prediction Process
During a scan:
# 1. Calculate rule-based score
rule_score = rule_engine.calculate(vuln_data)
# 2. Extract features
features = feature_engineer.extract(vuln_data)
# 3. ML prediction
if ml_model_available:
    ml_score = model.predict([features])[0]
    # 4. Blend scores (weighted average)
    final_score = (
        0.6 * rule_score +  # 60% rules
        0.4 * ml_score      # 40% ML
    )
else:
    final_score = rule_score  # Fallback to rules
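The blending step can be isolated into a small function, sketched below. The name `blend_scores` is hypothetical, and clamping the result to the 0-10 range is an added assumption, since an ML regressor is not guaranteed to stay within bounds.

```python
from typing import Optional

RULE_WEIGHT = 0.6  # 60% rules, as in the prediction process above
ML_WEIGHT = 0.4    # 40% ML


def blend_scores(rule_score: float, ml_score: Optional[float] = None) -> float:
    """Blend rule and ML scores; fall back to the rule score when ML is absent."""
    if ml_score is None:
        return rule_score
    blended = RULE_WEIGHT * rule_score + ML_WEIGHT * ml_score
    return max(0.0, min(blended, 10.0))  # clamp to the 0-10 scale (assumption)


print(round(blend_scores(8.0, 9.0), 2))  # 0.6 * 8.0 + 0.4 * 9.0 = 8.4
print(blend_scores(8.0))                 # 8.0 (rules only)
```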
When ML Adds Value
ML is particularly useful for:
- Non-obvious patterns - Port + service + version combinations that have historically been exploited more often
- Historical context - If this asset has already been attacked on this port, the risk score increases
- User feedback integration - If users have marked similar vulnerabilities as false positives, confidence is reduced
- Environmental factors - Production and staging assets carry different risk
5. Anomaly Detection
Purpose
Detect unusual configurations or unexpected exposures that may indicate:
- A critical misconfiguration
- An attack in progress
- Shadow IT
- A development service running in production
Implemented Checks
5.1 Unexpected Service Detection
UNEXPECTED_SERVICES = {
    'telnet': 'Telnet should NEVER be exposed',
    'rexec': 'Remote execution service - extremely dangerous',
    'rlogin': 'Insecure remote login',
    'redis': 'Redis without auth is common misconfiguration',
    'mongodb': 'MongoDB should be behind firewall'
}
if service in UNEXPECTED_SERVICES:
    return {
        'is_anomaly': True,
        'anomaly_type': 'unexpected_service',
        'anomaly_score': 0.9,
        'message': UNEXPECTED_SERVICES[service]
    }
5.2 Database Exposure Check
DATABASE_SERVICES = ['mysql', 'postgresql', 'mongodb',
                     'redis', 'cassandra', 'elasticsearch']
if service in DATABASE_SERVICES and is_public_facing:
    return {
        'is_anomaly': True,
        'anomaly_type': 'public_database',
        'anomaly_score': 0.85,
        'message': f'{service} should not be publicly accessible'
    }
5.3 Non-Standard Port Analysis
# MySQL on port 8080 (typically a web server port)
if service == 'mysql' and port == 8080:
    return {
        'is_anomaly': True,
        'anomaly_type': 'suspicious_port',
        'anomaly_score': 0.7,
        'message': 'Database on web server port - possible compromise'
    }
5.4 Development Service Warnings
DEVELOPMENT_SERVICES = {
    'jupyter': 'Jupyter notebook in production',
    'webpack-dev-server': 'Development server exposed',
    'php-fpm-status': 'PHP-FPM status page exposed'
}
if service in DEVELOPMENT_SERVICES and env == 'production':
    return {
        'is_anomaly': True,
        'anomaly_type': 'dev_in_prod',
        'anomaly_score': 0.75,
        'message': DEVELOPMENT_SERVICES[service]
    }
Anomaly Score Impact
if anomaly_detected:
    # Increase risk score based on anomaly severity
    risk_score += (anomaly_score * 2.0)  # Max +2.0 bonus
    # Flag in reasoning
    reasoning += f" ⚠️ ANOMALY: {anomaly_message}"
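The four checks and the score adjustment can be combined into one runnable sketch. The function names, the first-match ordering of the checks, and the 10.0 cap after the adjustment are assumptions. Messages are omitted for brevity.

```python
from typing import Optional

UNEXPECTED_SERVICES = {"telnet", "rexec", "rlogin", "redis", "mongodb"}
DATABASE_SERVICES = {"mysql", "postgresql", "mongodb", "redis",
                     "cassandra", "elasticsearch"}
DEVELOPMENT_SERVICES = {"jupyter", "webpack-dev-server", "php-fpm-status"}


def detect_anomaly(service: str, port: int, is_public_facing: bool,
                   env: str) -> Optional[dict]:
    """Return the first matching anomaly, or None (check order is assumed)."""
    if service in UNEXPECTED_SERVICES:
        return {"anomaly_type": "unexpected_service", "anomaly_score": 0.9}
    if service in DATABASE_SERVICES and is_public_facing:
        return {"anomaly_type": "public_database", "anomaly_score": 0.85}
    if service == "mysql" and port == 8080:
        return {"anomaly_type": "suspicious_port", "anomaly_score": 0.7}
    if service in DEVELOPMENT_SERVICES and env == "production":
        return {"anomaly_type": "dev_in_prod", "anomaly_score": 0.75}
    return None


def apply_anomaly(risk_score: float, anomaly: Optional[dict]) -> float:
    """Add anomaly_score * 2.0 to the risk score, capped at 10.0 (cap assumed)."""
    if anomaly is None:
        return risk_score
    return min(risk_score + anomaly["anomaly_score"] * 2.0, 10.0)


finding = detect_anomaly("mysql", 3306, is_public_facing=True, env="production")
print(finding["anomaly_type"])                # public_database
print(round(apply_anomaly(7.5, finding), 2))  # 7.5 + 0.85 * 2.0 = 9.2
```

Note that `redis` and `mongodb` appear in both the unexpected-service and database lists, so with this ordering they are always reported as `unexpected_service`.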
6. Feature Engineering
Feature Set (20+ Features)
The ML model requires numeric features. The feature engineer converts vulnerability data into a feature vector.
6.1 Numerical Features
{
    'port_number': 22,           # 0-65535
    'cvss_score': 7.5,           # 0-10
    'cve_age_days': 730,         # Days since CVE publication
    'version_major': 7,          # Major version number
    'version_minor': 4,          # Minor version number
    'service_criticality': 7.0,  # From rule engine
    'port_exposure_risk': 0.8,   # Calculated risk
    'anomaly_score': 0.15        # From anomaly detector
}
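The `version_major` and `version_minor` features have to be parsed out of a version string such as `OpenSSH 7.4`. One possible approach is sketched below with a regular expression. The helper name `parse_version` and the `(0, 0)` fallback for banners without a version are assumptions.

```python
import re
from typing import Tuple

# First run of digits, optionally followed by ".<digits>"
_VERSION_RE = re.compile(r"(\d+)(?:\.(\d+))?")


def parse_version(banner: str) -> Tuple[int, int]:
    """Extract (major, minor) from a service banner; (0, 0) if none is found."""
    match = _VERSION_RE.search(banner)
    if match is None:
        return (0, 0)
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return (major, minor)


print(parse_version("OpenSSH 7.4"))          # (7, 4)
print(parse_version("Apache httpd 2.4.57"))  # (2, 4)
print(parse_version("nginx"))                # (0, 0)
```

This simple pattern takes the first number it sees, so a product name that itself contains a digit would be misread. A production parser would need per-service handling.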
6.2 Categorical Features (One-Hot Encoded)
Service Category:
SERVICE_CATEGORIES = {
    'authentication': ['ssh', 'telnet', 'rdp', 'vnc'],
    'database': ['mysql', 'postgresql', 'mongodb', 'redis'],
    'web': ['http', 'https', 'apache', 'nginx'],
    'email': ['smtp', 'imap', 'pop3'],
    'file_transfer': ['ftp', 'sftp', 'smb'],
    'container': ['docker', 'kubernetes'],
    'other': ['*']
}
# One-hot encode
features = {
    'service_cat_authentication': 1,  # if SSH
    'service_cat_database': 0,
    'service_cat_web': 0,
    # ... etc
}
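A minimal sketch of the one-hot step follows. The function names are hypothetical, and the `'*'` wildcard from the table is modeled as a fallback to `other` when no category lists the service.

```python
SERVICE_CATEGORIES = {
    "authentication": ["ssh", "telnet", "rdp", "vnc"],
    "database": ["mysql", "postgresql", "mongodb", "redis"],
    "web": ["http", "https", "apache", "nginx"],
    "email": ["smtp", "imap", "pop3"],
    "file_transfer": ["ftp", "sftp", "smb"],
    "container": ["docker", "kubernetes"],
}
# Fixed column order, with "other" as the catch-all category
CATEGORY_ORDER = list(SERVICE_CATEGORIES) + ["other"]


def service_category(service: str) -> str:
    """Return the category that lists `service`, or "other" if none does."""
    for category, members in SERVICE_CATEGORIES.items():
        if service in members:
            return category
    return "other"


def one_hot_service(service: str) -> dict:
    """Return one 0/1 feature per category, with exactly one set to 1."""
    active = service_category(service)
    return {f"service_cat_{name}": int(name == active) for name in CATEGORY_ORDER}


encoded = one_hot_service("ssh")
print(encoded["service_cat_authentication"])       # 1
print(sum(encoded.values()))                       # 1
print(one_hot_service("ntp")["service_cat_other"])  # 1
```

Keeping `CATEGORY_ORDER` fixed matters because the model expects the columns in the same order at training and inference time.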
6.3 Boolean Features
{
    'is_standard_port': True,    # On expected port?
    'has_exploit': True,         # Public exploit available?
    'has_cve': True,             # CVE assigned?
    'has_cpe': True,             # CPE identifier present?
    'is_legacy_version': False,  # EOL version?
    'protocol_udp': False        # Using UDP?
}
Feature Vector Example
vulnerability_data = {
    'port': 22,
    'service': 'ssh',
    'version': 'OpenSSH 7.4',
    'cvss_score': 7.5,
    'cve_id': 'CVE-2021-28041',
    'has_exploit': True,
    'protocol': 'tcp',
    'cve_published_date': '2021-03-01'
}
# After feature engineering
feature_vector = [
    22,    # port_number
    7.5,   # cvss_score
    730,   # cve_age_days
    7,     # version_major
    4,     # version_minor
    7.0,   # service_criticality
    0.8,   # port_exposure_risk
    0.15,  # anomaly_score
    1, 0, 0, 0, 0, 0, 0,  # service_category (one-hot, 7 categories incl. 'other')
    1, 0, 0,              # port_range (one-hot)
    0, 1, 0, 0,           # cve_age_category (one-hot)
    1,     # is_standard_port
    1,     # has_exploit
    1,     # has_cve
    1,     # has_cpe
    0,     # is_legacy_version
    0      # protocol_udp
]
# Total: 28 features
7. Training and Continuous Learning
The current AI model was pre-trained on a dataset of known vulnerabilities and is tuned to provide accurate scores from first use. The next version will add a continuous learning system that lets the model improve progressively through the features below.
Planned Features
Incremental Learning
The system will automatically incorporate data from real scans to refine its predictions, adapting to the specific characteristics of your assets and environment.
User Feedback Integration
You will be able to give feedback on risk scores (too high, too low, accurate) to improve the model. The system will automatically weight this feedback during periodic retraining.
Model Versioning
Each new model version will be tracked with performance metrics (accuracy, MAE, R² score), enabling rollback on regressions and transparency about how the system evolves.
A/B Testing
New model versions will be tested on a subset of scans before full rollout, so that improvements are measurable and accuracy is not degraded.
8. Interpreting Results
8.1 Risk Score Range
Score interpretation:
| Score | Severity | Action Required | Typical Timeline |
|---|---|---|---|
| 9.0 - 10.0 | 🔴 CRITICAL | Immediate remediation | Within 24 hours |
| 7.0 - 8.9 | 🟠 HIGH | Urgent fix required | Within 7 days |
| 5.0 - 6.9 | 🟡 MEDIUM | Schedule fix | Within 30 days |
| 3.0 - 4.9 | 🔵 LOW | Review and plan | Next patch cycle |
| 0.0 - 2.9 | ⚪ INFO | Best practice | When convenient |
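The table can be applied programmatically with a simple threshold lookup, sketched below. The function name `severity` is an assumption. Using lower bounds only means that scores falling between the printed ranges (for example 8.95) are assigned to the lower band.

```python
def severity(score: float) -> str:
    """Map a 0-10 AI risk score to the severity label from the table above."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 5.0:
        return "MEDIUM"
    if score >= 3.0:
        return "LOW"
    return "INFO"


print(severity(8.5))  # HIGH
print(severity(9.0))  # CRITICAL
print(severity(2.9))  # INFO
```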
8.2 Exploit Likelihood
Likelihood interpretation:
- 0.9 - 1.0 (90-100%) → Exploitation imminent
- 0.7 - 0.9 (70-90%) → High probability
- 0.5 - 0.7 (50-70%) → Moderate probability
- 0.3 - 0.5 (30-50%) → Low probability
- 0.0 - 0.3 (0-30%) → Unlikely
8.3 Confidence Score
Confidence interpretation:
- 0.95 - 1.0 → Very high confidence (rule-based)
- 0.85 - 0.95 → High confidence (hybrid with good ML)
- 0.70 - 0.85 → Medium confidence (limited data)
- < 0.70 → Low confidence (edge case)
8.4 Reasoning Explanation
Complete reasoning example:
AI Risk Score: 8.7/10
Confidence: 93%
Exploit Likelihood: 82%
Reasoning:
"SSH service detected on standard port 22 (high exposure to
automated scanners). OpenSSH version 7.4 has known CVE-2021-28041
with CVSS score 7.5 and public exploit available (Metasploit module).
CVE age: 2.1 years (weaponized, active exploitation). Service
criticality: HIGH (remote access). Non-standard configuration: using
weak ciphers. ML model detected similar pattern in 47 previous scans,
12 resulted in confirmed exploitation."
Contributing Factors:
Service Criticality (SSH): +2.5
Public Exploit Available: +2.0
CVE Age (weaponized): +1.0
Standard Port (scanner target): +0.8
Weak Configuration: +1.2
ML Pattern Recognition: +1.2
----
Total: 8.7/10
Recommended Action:
1. IMMEDIATE: Disable SSH or move behind VPN
2. Update to OpenSSH 8.9+ (patches CVE-2021-28041)
3. Change SSH port to non-standard (e.g., 2222)
4. Enable key-based auth, disable password auth
5. Configure strong ciphers only
6. Enable fail2ban or rate limiting
References:
- https://nvd.nist.gov/vuln/detail/CVE-2021-28041
- https://www.openssh.com/security.html
9. Performance and Metrics
9.1 Model Performance Metrics
Current Model (v1.2.0):
Algorithm: Random Forest Regressor
Training Samples: 1,247
Test Set Size: 312
Metrics:
  R² Score: 0.91 (91% variance explained)
  MAE: 0.78 (average error of ±0.78 points on the 0-10 scale)
  RMSE: 1.02 (root mean squared error)
Feature Importance (Top 10):
  1. has_exploit: 0.18
  2. cvss_score: 0.16
  3. service_criticality: 0.14
  4. cve_age_days: 0.12
  5. port_exposure_risk: 0.09
  6. is_standard_port: 0.07
  7. service_cat_auth: 0.06
  8. version_major: 0.05
  9. anomaly_score: 0.05
  10. protocol_udp: 0.04
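For readers unfamiliar with the three metrics, the sketch below computes MAE, RMSE, and R² from their standard definitions using only the standard library. The function name and the four toy data points are made up for illustration and are unrelated to the model's actual test set.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted scores."""
    n = len(y_true)
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Toy data: two predictions are off by one point, two are exact
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [3.0, 4.0, 5.0, 8.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(mae)             # 0.5
print(round(rmse, 4))  # 0.7071
print(round(r2, 2))    # 0.9
```

RMSE is always at least as large as MAE because squaring penalizes large errors more heavily, which is consistent with the reported 1.02 versus 0.78.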
9.2 Prediction Speed
Rule-Based Engine: < 10 ms per vulnerability
ML Inference: < 50 ms per vulnerability
Anomaly Detection: < 30 ms per vulnerability
Total (Hybrid): < 100 ms per vulnerability
Typical Full Scan (50 vulnerabilities):
  Total AI processing time: ~5 seconds
9.3 Accuracy by Severity
Critical (9.0-10.0): 95% accuracy
High (7.0-8.9): 91% accuracy
Medium (5.0-6.9): 87% accuracy
Low (3.0-4.9): 84% accuracy
Info (0.0-2.9): 90% accuracy
Overall Accuracy: 89%
9.4 User Feedback Stats
Total Feedback Collected: 3,421
  Accurate: 2,847 (83%)
  Too High: 412 (12%)
  Too Low: 162 (5%)
Model Improvement:
  v1.0.0 → v1.1.0: +4% accuracy
  v1.1.0 → v1.2.0: +2% accuracy
10. Best Practices
10.1 Interpreting Scores Correctly
✅ DO:
- Treat the AI Risk Score as your primary prioritization guide
- Always read the reasoning to understand the context
- Use exploit likelihood to gauge urgency
- Use the confidence score to decide whether a manual review is needed
❌ DON'T:
- Don't ignore low confidence scores
- Don't rely ONLY on CVSS
- Don't assume a low score means the finding is unimportant
- Don't ignore anomaly warnings
10.2 Providing Feedback
When to send feedback:
- The score seems too high or too low for the specific context
- The vulnerability sits behind a firewall or VPN (missing context)
- A false positive has been confirmed
- An exploit has been verified but the score is low
10.3 Prioritizing Remediation
Recommended order:
1. CRITICAL (9.0-10.0) + High Likelihood (>70%)
   → Fix in 24h, consider immediate mitigation (firewall, disable service)
2. HIGH (7.0-8.9) + Public Exploit
   → Fix in 7 days, monitor for exploitation attempts
3. MEDIUM (5.0-6.9) + Anomaly Detected
   → Investigate context, fix in 30 days
4. All others
   → Schedule in next patch cycle
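One way to apply this ordering to a list of findings is sketched below. The function names and the `has_exploit` key are assumptions (the output format shown earlier exposes `exploit_likelihood` and `is_anomaly`, while `has_exploit` comes from the input vulnerability data). The tiers follow the list literally, so anything that does not match the first three rules lands in the last tier.

```python
def remediation_tier(finding: dict) -> int:
    """Return 1-4 according to the recommended remediation order."""
    score = finding["ai_risk_score"]
    if score >= 9.0 and finding.get("exploit_likelihood", 0.0) > 0.7:
        return 1  # CRITICAL + high likelihood: fix in 24h
    if 7.0 <= score < 9.0 and finding.get("has_exploit", False):
        return 2  # HIGH + public exploit: fix in 7 days
    if 5.0 <= score < 7.0 and finding.get("is_anomaly", False):
        return 3  # MEDIUM + anomaly: investigate, fix in 30 days
    return 4      # everything else: next patch cycle


def prioritize(findings: list) -> list:
    """Sort by tier, then by descending score within a tier."""
    return sorted(findings,
                  key=lambda f: (remediation_tier(f), -f["ai_risk_score"]))


findings = [
    {"id": "a", "ai_risk_score": 6.0, "is_anomaly": True},
    {"id": "b", "ai_risk_score": 9.4, "exploit_likelihood": 0.85},
    {"id": "c", "ai_risk_score": 4.0},
    {"id": "d", "ai_risk_score": 8.0, "has_exploit": True},
]
ordered_ids = [f["id"] for f in prioritize(findings)]
print(ordered_ids)  # ['b', 'd', 'a', 'c']
```

A literal reading puts a CRITICAL finding with low exploit likelihood in the last tier, so a real workflow would probably sort by score within that tier, as this sketch does, or add intermediate tiers.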
10.4 Continuous Improvement
Contributing to the model:
- Send accurate feedback on every scan
- Report false positives with evidence
- Share any exploitation attempts you detect
- Suggest new rules for specific services
Additional Resources
Related Documentation:
- User Guide - Complete feature guide
- Quick Start - Get started in 3 minutes
- WAF Firewall - Real-time protection
External Sources:
Support:
- 📧 Email: [email protected]
- 💬 Community Discord
- 🐛 Bug Report: GitHub Issues