Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis

Ünal, Nezih Mahmut; Çeliktaş, Barış

Files

Automating_Cyber_Risk_Assessment_With_Public_LLMs_An_Expert_Validated_Framework_and_Comparative_Analysis.pdf (4.36 MB)

Date

2026-03-26

Authors

Ünal, Nezih Mahmut

Çeliktaş, Barış

Publisher

Institute of Electrical and Electronics Engineers Inc.

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Traditional cyber risk assessment methodologies face a critical dilemma: they are either quantitative yet static and context-agnostic (e.g., CVSS), or context-aware yet highly labor-intensive and subjective (e.g., NIST SP 800-30). Consequently, organizations struggle to scale risk assessment to match the pace of evolving threats. This paper presents an automated, context-aware risk assessment framework that leverages the reasoning capabilities of publicly available Large Language Models (LLMs) to operationalize expert knowledge. Rather than positioning the LLM as the final decision-maker, the framework decouples semantic interpretation from risk scoring authority through a transparent, deterministic Dynamic Metric Engine. Unlike complex closed-box machine learning models, our approach anchors the AI's reasoning to this expert-validated metric schema, with weights derived using the Rank Order Centroid (ROC) method from a survey of 101 cybersecurity professionals. We evaluated the framework through a comparative study involving 15 diverse real-world vulnerability scenarios (C1-C15) and three supplementary sensitivity stress tests (C16-C18). The validation scenarios were independently assessed by a cohort of ten senior human experts and two state-of-the-art LLM agents (GPT-4o and Gemini 2.0 Flash). The results show that the LLM-driven agents achieve scoring consistency closely aligned with the human median (Pearson r ranging from 0.9390 to 0.9717, Spearman ρ from 0.8472 to 0.9276) against a highly reliable expert baseline (Cronbach's α = 0.996), while reducing the assessment cycle time by more than 100× (averaging under 4 seconds per case vs. a human average of 6 minutes). Furthermore, a dedicated context sensitivity analysis (C13-C15) indicates that the framework adapts risk scores based on organizational context (e.g., SME vs. Critical Infrastructure) for identical technical vulnerabilities. Importantly, the system is designed not merely to replicate expert intuition, but to enforce bounded, policy-consistent risk evaluation under predefined governance constraints. Overall, these findings suggest that commercially available LLMs, when constrained by expert-validated metric schemas, can support reproducible, transparent, and real-time risk assessments.
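
Illustrative note on ROC weighting: the abstract states that the metric weights were derived with the Rank Order Centroid method but does not list the metrics or their ranks. The sketch below shows only the standard ROC formula, w_i = (1/n) * sum_{k=i..n} 1/k for a criterion ranked i out of n, followed by a deterministic weighted sum. The four-criterion example, the 0-10 rating scale, and the function names `roc_weights` and `weighted_score` are assumptions made for illustration and are not taken from the paper.

```python
from fractions import Fraction


def roc_weights(n: int) -> list[Fraction]:
    """Rank Order Centroid weights for n criteria, rank 1 = most important.

    Standard formula: w_i = (1/n) * sum_{k=i}^{n} 1/k.
    Exact fractions are used so the weights sum to exactly 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [
        sum(Fraction(1, k) for k in range(i, n + 1)) / n
        for i in range(1, n + 1)
    ]


def weighted_score(ratings: list[float], weights: list[Fraction]) -> float:
    """Deterministic weighted sum of per-criterion ratings.

    Assumption: ratings are listed in rank order and share one scale
    (0-10 here); the paper's actual scoring rule may differ.
    """
    if len(ratings) != len(weights):
        raise ValueError("ratings and weights must have the same length")
    return sum(float(w) * r for w, r in zip(weights, ratings))


# Hypothetical example with four ranked criteria.
weights = roc_weights(4)
print([str(w) for w in weights])  # ['25/48', '13/48', '7/48', '1/16']
print(round(weighted_score([10, 5, 5, 0], weights), 4))
```

Because the weights depend only on the rank order, the same inputs always yield the same score, which is the kind of reproducibility the abstract attributes to the deterministic scoring step.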

Keywords

Automated risk scoring, Cyber risk assessment, Generative AI, Human-AI comparison, Large Language Models (LLMs), Rank Order Centroid (ROC), Artificial intelligence, Automation, Critical infrastructures, Cybersecurity, Decision making, Learning systems, Risk analysis, Risk assessment, Risk management, Semantics, Language model, Large language model, Rank order centroid, Rank ordering, Risk scoring, Risks assessments, Sensitivity analysis, Internet

Source

IEEE Access

WoS Quartile

Q2

Scopus Quartile

Q1

Volume

14

Citation

Ünal, N. M. & Çeliktaş, B. (2026). Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis. IEEE Access, 14, 47754-47778. https://doi.org/10.1109/ACCESS.2026.3678044

Links

https://hdl.handle.net/11729/7323
https://doi.org/10.1109/ACCESS.2026.3678044

Collections

Student Publications Article Collection
Graduate Education Institute Other Publications Collection
Article Collection | Department of Computer Engineering
Scopus Indexed Publications Collection
WoS Indexed Publications Collection
