Anitha Mareedu, 2023. "Data-Driven Cybersecurity: ML-Based Threat Intelligence and Prediction Systems" ESP International Journal of Advancements in Computational Technology (ESP-IJACT) Volume 1, Issue 3: 212-223.
As cyber threats have grown more sophisticated and frequent over the last decade, traditional reactive cybersecurity approaches have proven inadequate for protecting digital assets and critical infrastructure. In response, the cybersecurity field has moved towards a data-driven, predictive model that is supported by machine learning (ML) and enriched with real-time threat intelligence. This paper examines the evolution of ML-based cybersecurity systems, paying particular attention to the impact of predictive analytics and intelligent automation on threat detection and response. We offer a structured analysis of the taxonomy of threat intelligence, covering indicators of compromise (IOCs), tactical threat feeds, and open-source threat-sharing platforms, and of how these sources can be integrated with cybersecurity solutions such as Security Information and Event Management (SIEM) systems to enable proactive defence. The paper details the major ML methods employed in cybersecurity, including supervised, unsupervised, and semi-supervised learning models applied to anomaly detection, threat classification, and behavioural profiling. Newer methods such as ensemble modelling and federated learning are also discussed, as is real-time streaming data analytics. Special consideration is given to sector-specific use in enterprise, government, and critical infrastructure, where intelligent agents play a role in fully automated Security Operations Centres (SOCs). Alongside tracing this technical progress, we take a critical look at the problems that remain in the field, such as inconsistent data labelling, limited model interpretability, and vulnerability to adversarial attacks.
Drawing on more than ten years of development, this review provides background for researchers and practitioners who want to build robust, intelligent, and forward-looking cybersecurity systems.
Cybersecurity, ML, Threat Intelligence, Security Information and Event Management (SIEM), Predictive Analytics, Threat Feeds, Data-Driven Security, Federated Learning.