source link: https://hackernoon.com/software-repositories-and-machine-learning-research-in-cyber-security-discussions
Software Repositories and Machine Learning Research in Cyber Security: Discussions
by @escholar
This paper is available on arxiv under CC 4.0 license.
Authors:
(1) Mounika Vanamala, Department of Computer Science, University of Wisconsin-Eau Claire, United States;
(2) Keith Bryant, Department of Computer Science, University of Wisconsin-Eau Claire, United States;
(3) Alex Caravella, Department of Computer Science, University of Wisconsin-Eau Claire, United States.
Discussions
Word semantics play a crucial role in classifying text correctly with ML. Preprocessing can reduce two different words to the same token, which can lead to inaccurate classification. For example, preprocessing the words desert and deserted turns both into desert, so the meaning of deserted is lost. An ML model that recommends relevant vulnerabilities from the CAPEC database would therefore need to be effective at semantic analysis. The goal of this research would be to compare keywords from an SRS document with the keywords of CAPEC vulnerabilities. The next consideration is whether to implement an unsupervised, supervised, or semi-supervised ML model.
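The desert/deserted collision and the keyword comparison described above can be illustrated with a minimal Python sketch. The suffix stripper `naive_stem`, the Jaccard measure in `keyword_overlap`, and the two sample sentences are all assumptions made for illustration; the paper does not specify a stemmer or a similarity measure, and a real pipeline would likely use an established stemmer instead.

```python
def naive_stem(word: str) -> str:
    """Toy suffix stripper (hypothetical; far cruder than a real stemmer)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keyword_overlap(text_a: str, text_b: str) -> float:
    """Jaccard similarity between the stemmed keyword sets of two texts."""
    a = {naive_stem(w) for w in text_a.split()}
    b = {naive_stem(w) for w in text_b.split()}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Both words collapse to the same token, so the distinction is lost.
stems = [naive_stem(w) for w in ("desert", "deserted")]
print(stems)  # ['desert', 'desert']

# Made-up stand-ins for an SRS requirement and a CAPEC description.
srs_text = "user passwords stored in database"
capec_text = "attacker reads stored password"
overlap = keyword_overlap(srs_text, capec_text)
print(round(overlap, 3))
```

A purely lexical comparison like this one matches tokens, not meanings, which is exactly the limitation the paragraph points to.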
Unsupervised machine learning (ML) algorithms are used primarily to group data into clusters, uncover underlying relationships in the data, and reduce dimensionality. Dimensionality reduction, for instance, is valuable for large datasets such as CAPEC, because it aims to streamline the data while preserving its integrity.
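As a rough illustration of the clustering task mentioned above, the following sketch runs a plain k-means over a handful of term-count vectors. The algorithm choice, the four-term vocabulary, and the counts are assumptions for illustration only; the paper does not name a specific clustering algorithm here.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical term counts over the vocabulary (inject, sql, overflow, buffer).
points = [(3, 2, 0, 0), (0, 0, 2, 3), (2, 3, 0, 0), (0, 0, 3, 2)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1]
```

No labels are supplied: the grouping emerges from the distances alone, which is also why there is no built-in notion of a "correct" answer to score against.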
In text analysis, research on Latent Dirichlet Allocation (LDA) found that substantial adaptation was required to achieve satisfactory outcomes, largely because of semantic limitations. The Latent Semantic Analysis (LSA) algorithm, by contrast, is designed to capture semantics and to establish connections between the vectors into which words are mapped. LSA is typically built on Singular Value Decomposition (SVD) and has often been combined with other, more intricate algorithms to enhance its effectiveness. Evaluating unsupervised methods is challenging in general, because there are no well-defined metrics for measuring model accuracy. This makes it harder to judge the quality of the results that unsupervised ML approaches produce.
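To make the LSA idea concrete, here is a dependency-free sketch that projects documents into a two-dimensional latent space. It is an illustration under stated assumptions, not the authors' method: instead of calling a library SVD routine, it obtains the document-side singular vectors from the eigenvectors of the document Gram matrix by power iteration with deflation, which is mathematically equivalent for this purpose. The function names and the four sample documents are hypothetical.

```python
import math


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_eigenpairs(gram, k, iterations=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(gram)
    g = [row[:] for row in gram]
    pairs = []
    for _component in range(k):
        v = [1.0] * n
        for _step in range(iterations):
            w = matvec(g, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(g, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        # Deflate: subtract the component just found before the next pass.
        g = [[g[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs


def lsa_embed(docs, k=2):
    """Rank-k latent coordinates for each document."""
    vocab = sorted({w for d in docs for w in d.split()})
    counts = [[d.split().count(w) for w in vocab] for d in docs]
    # Gram matrix of document count vectors; its eigenvalues are the
    # squared singular values of the document-term matrix.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in counts] for r1 in counts]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
            for j in range(len(docs))]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical mini-corpus: two injection-themed and two overflow-themed texts.
docs = [
    "sql injection",
    "sql injection query",
    "buffer overflow",
    "buffer overflow memory stack",
]
emb = lsa_embed(docs, k=2)
same_topic = cosine(emb[0], emb[1])
cross_topic = cosine(emb[0], emb[2])
print(round(same_topic, 3), round(cross_topic, 3))
```

In this toy corpus the two topics share no terms, so the latent space separates them cleanly. On real data the separation is far less tidy, and, as noted above, there is no standard accuracy metric to say how good the resulting space is.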
Supervised ML is a less complex process and requires fewer tools than unsupervised ML (IBM, 2019). Whereas unsupervised ML clusters objects into similar groups that the algorithm itself identifies, supervised ML uses a training dataset and validation techniques to derive accurate results more quickly. Its largest limitation is the need to obtain a training dataset to prepare the chosen algorithm. Supervised ML is also significantly better suited to producing metrics for the accuracy of its results.
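The supervised workflow described above (train on labeled data, then measure accuracy on held-out data) can be sketched as follows. The choice of multinomial Naive Bayes, the category labels, and the tiny training and validation sets are assumptions for illustration; the paper does not commit to a particular classifier or dataset here.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labelled):
    """Fraction of held-out examples classified correctly."""
    correct = sum(predict_nb(model, text) == label for text, label in labelled)
    return correct / len(labelled)


# Hypothetical labeled data; real training data would have to be curated.
training = [
    ("sql injection query database", "injection"),
    ("inject malicious sql statement", "injection"),
    ("buffer overflow memory write", "overflow"),
    ("stack buffer overrun memory", "overflow"),
]
validation = [
    ("database query injection", "injection"),
    ("memory buffer overflow", "overflow"),
]
model = train_nb(training)
score = accuracy(model, validation)
print(score)  # 1.0
```

The accuracy figure is only possible because the validation examples carry known labels, which is the advantage over unsupervised methods noted above. The cost is that someone has to produce those labels first.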