Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis
| dc.authorid | 0009-0009-4710-2569 | |
| dc.authorid | 0000-0003-2865-6370 | |
| dc.contributor.author | Ünal, Nezih Mahmut | en_US |
| dc.contributor.author | Çeliktaş, Barış | en_US |
| dc.date.accessioned | 2026-04-20T12:12:59Z | |
| dc.date.available | 2026-04-20T12:12:59Z | |
| dc.date.issued | 2026-03-26 | |
| dc.department | Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Siber Güvenlik Yüksek Lisans Programı | en_US |
| dc.department | Işık University, School of Graduate Studies, Master’s Program in Cybersecurity | en_US |
| dc.department | Işık Üniversitesi, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü | en_US |
| dc.department | Işık University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering | en_US |
| dc.description.abstract | Traditional cyber risk assessment methodologies face a critical dilemma: they are either quantitative yet static and context-agnostic (e.g., CVSS), or context-aware yet highly labor-intensive and subjective (e.g., NIST SP 800-30). Consequently, organizations struggle to scale risk assessment to match the pace of evolving threats. This paper presents an automated, context-aware risk assessment framework that leverages the reasoning capabilities of publicly available Large Language Models (LLMs) to operationalize expert knowledge. Rather than positioning the LLM as the final decision-maker, the framework decouples semantic interpretation from risk scoring authority through a transparent, deterministic Dynamic Metric Engine. Unlike complex black-box machine learning models, our approach anchors the AI's reasoning to this expert-validated metric schema, with weights derived using the Rank Order Centroid (ROC) method from a survey of 101 cybersecurity professionals. We evaluated the framework through a comparative study involving 15 diverse real-world vulnerability scenarios (C1-C15) and three supplementary sensitivity stress tests (C16-C18). The validation scenarios were independently assessed by a cohort of ten senior human experts and two state-of-the-art LLM agents (GPT-4o and Gemini 2.0 Flash). The results show that the LLM-driven agents achieve scoring consistency closely aligned with the human median (Pearson r ranging from 0.9390 to 0.9717, Spearman ρ from 0.8472 to 0.9276) against a highly reliable expert baseline (Cronbach's α = 0.996), while reducing the assessment cycle time by more than 100× (averaging under 4 seconds per case vs. a human average of 6 minutes). Furthermore, a dedicated context sensitivity analysis (C13-C15) indicates that the framework adapts risk scores based on organizational context (e.g., SME vs. Critical Infrastructure) for identical technical vulnerabilities. Importantly, the system is designed not merely to replicate expert intuition, but to enforce bounded, policy-consistent risk evaluation under predefined governance constraints. Overall, these findings suggest that commercially available LLMs, when constrained by expert-validated metric schemas, can support reproducible, transparent, and real-time risk assessments. | en_US |
| dc.description.version | Publisher's Version | en_US |
| dc.identifier.citation | Ünal, N. M. & Çeliktaş, B. (2026). Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis. IEEE Access, 14, 47754-47778. https://doi.org/10.1109/ACCESS.2026.3678044 | en_US |
| dc.identifier.doi | 10.1109/ACCESS.2026.3678044 | |
| dc.identifier.endpage | 47778 | |
| dc.identifier.issn | 2169-3536 | |
| dc.identifier.scopus | 2-s2.0-105035002785 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 47754 | |
| dc.identifier.uri | https://hdl.handle.net/11729/7323 | |
| dc.identifier.uri | https://doi.org/10.1109/ACCESS.2026.3678044 | |
| dc.identifier.volume | 14 | |
| dc.identifier.wos | WOS:001732696000028 | |
| dc.identifier.wosquality | Q2 | |
| dc.indekslendigikaynak | Scopus | en_US |
| dc.indekslendigikaynak | Web of Science | en_US |
| dc.indekslendigikaynak | Science Citation Index Expanded (SCI-EXPANDED) | en_US |
| dc.institutionauthor | Ünal, Nezih Mahmut | en_US |
| dc.institutionauthor | Çeliktaş, Barış | en_US |
| dc.institutionauthorid | 0009-0009-4710-2569 | |
| dc.institutionauthorid | 0000-0003-2865-6370 | |
| dc.language.iso | en | en_US |
| dc.peerreviewed | Yes | en_US |
| dc.publicationstatus | Published | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.relation.ispartof | IEEE Access | en_US |
| dc.relation.publicationcategory | Article - International Refereed Journal - Student | en_US |
| dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Academic Staff | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Automated risk scoring | en_US |
| dc.subject | Cyber risk assessment | en_US |
| dc.subject | Generative AI | en_US |
| dc.subject | Human-AI comparison | en_US |
| dc.subject | Large Language Models (LLMs) | en_US |
| dc.subject | Rank Order Centroid (ROC) | en_US |
| dc.subject | Artificial intelligence | en_US |
| dc.subject | Automation | en_US |
| dc.subject | Critical infrastructures | en_US |
| dc.subject | Cybersecurity | en_US |
| dc.subject | Decision making | en_US |
| dc.subject | Learning systems | en_US |
| dc.subject | Risk analysis | en_US |
| dc.subject | Risk assessment | en_US |
| dc.subject | Risk management | en_US |
| dc.subject | Semantics | en_US |
| dc.subject | Cyber risk assessment | en_US |
| dc.subject | Language model | en_US |
| dc.subject | Large language model | en_US |
| dc.subject | Rank order centroid | en_US |
| dc.subject | Rank ordering | en_US |
| dc.subject | Risk scoring | en_US |
| dc.subject | Risks assessments | en_US |
| dc.subject | Sensitivity analysis | en_US |
| dc.subject | Internet | en_US |
| dc.title | Automating cyber risk assessment with public LLMs: an expert-validated framework and comparative analysis | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | en_US |
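The abstract notes that metric weights were derived with the Rank Order Centroid (ROC) method. The standard ROC formula assigns weight w_i = (1/n) · Σ_{k=i}^{n} 1/k to the criterion ranked i-th out of n. A minimal illustrative sketch of that formula (not the authors' implementation):

```python
def roc_weights(n: int) -> list[float]:
    """Rank Order Centroid weights for n criteria, ranked 1 (most
    important) through n (least important).

    w_i = (1/n) * sum_{k=i}^{n} 1/k, and the weights sum to 1.
    """
    return [sum(1.0 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]

# For three ranked criteria the weights are 11/18, 5/18, and 1/9.
weights = roc_weights(3)
```

The weights decrease monotonically with rank and always sum to 1, which is why ROC is a common way to turn a survey-elicited ranking of criteria into numeric weights.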
Files
Original bundle (1 of 1)
- Name: Automating_Cyber_Risk_Assessment_With_Public_LLMs_An_Expert_Validated_Framework_and_Comparative_Analysis.pdf
- Size: 4.36 MB
- Format: Adobe Portable Document Format

License bundle (1 of 1)
- Name: license.txt
- Size: 1.17 KB
- Format: Item-specific license agreed upon to submission