Classification, Clustering and Application in Intrusion Detection System
12-01-2011, 03:56 PM
Kaushal Mittal 04329024
M.Tech I Year
Under the Guidance of Prof. Sunita Sarawagi
KReSIT, Indian Institute of Technology Bombay
Classification and clustering techniques in data mining are useful for a wide variety of real-time applications that deal with large amounts of data. Applications of data mining include text classification, selective marketing, medical diagnosis, and intrusion detection. Intrusion detection systems are software systems that identify deviations from the normal behavior and usage of a system. They detect attacks using data mining techniques, namely classification and clustering algorithms. In this report, I discuss approaches based on classification techniques such as naive Bayesian classifiers, neural networks, and a WINNOW-based algorithm. I also discuss approaches based on clustering techniques, such as hierarchical and density-based clustering, to emphasize the use of clustering in intrusion detection.
Classification techniques analyze data and categorize it into known classes. Each data sample is labeled with a known class label. Clustering is the process of grouping objects into a set of clusters such that similar objects are members of the same cluster and dissimilar objects belong to different clusters. In classification, the classes and the number of classes are predefined. Training examples, each assigned a predefined label, are used to create a model. This is not the case with clustering, where no labels are given in advance. Classification techniques are examples of supervised learning, and clustering techniques are examples of unsupervised learning.
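The contrast between the two settings can be sketched in a few lines of Python. This is only an illustration and not a method from the report: the nearest-centroid classifier and the plain k-means loop are stand-ins chosen for brevity, and the sample values and the labels "normal" and "attack" are made up.

```python
from math import dist

def train_centroids(samples, labels):
    """Supervised: average the samples of each predefined class label."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(c) for c in zip(*xs)) for y, xs in groups.items()}

def classify(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda y: dist(centroids[y], x))

def kmeans(samples, k, iterations=10):
    """Unsupervised: group samples into k clusters without using any labels."""
    centers = list(samples[:k])  # naive initialisation: the first k samples
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: dist(centers[i], x))
            clusters[nearest].append(x)
        # Recompute each center; keep the old one if a cluster became empty.
        centers = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

# Hypothetical two-feature samples.
samples = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9)]
labels = ["normal", "attack", "normal", "attack"]  # used only by the classifier

model = train_centroids(samples, labels)
predicted = classify(model, (7.5, 8.1))
clusters = kmeans(samples, k=2)
```

The classifier needs `labels` to build its model, whereas `kmeans` sees only `samples` and returns unnamed groups; deciding what each group means is left to the analyst.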
Intrusion detection systems are software used to identify the intentional or unintentional use of system resources by unauthorized users. They can be categorized into misuse detection systems and anomaly detection systems. Misuse detection systems model each attack as a specific pattern and are most useful for detecting known attack patterns. Anomaly detection systems are adaptive systems that distinguish the behavior of normal users from that of other users. Misuse detection systems can detect specific types of attacks but do not generalize: they cannot detect a new attack until they have been trained for it. Anomaly detection systems, on the other hand, are adaptive and can deal with new attacks, but they cannot identify the specific type of attack. If an intrusion occurs during learning, the anomaly detection system may learn the intruder's behavior as normal and hence fail to detect it. Because anomaly detection systems are more general and have a wider scope than misuse detection systems, most current research focuses on them.
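The trade-off between the two categories can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the feature names (`failed_logins`, `bytes_sent`), the single signature, and the min/max profile are hypothetical and far simpler than a real detector.

```python
def learn_profile(normal_events):
    """Anomaly detection: record the (min, max) range of each feature seen in training."""
    features = normal_events[0].keys()
    return {
        f: (min(e[f] for e in normal_events), max(e[f] for e in normal_events))
        for f in features
    }

def anomaly_detect(event, profile):
    """Flag the event if any feature falls outside the learned normal range."""
    return any(not (lo <= event[f] <= hi) for f, (lo, hi) in profile.items())

def misuse_detect(event, signatures):
    """Misuse detection: return the name of the first matching attack pattern, if any."""
    for name, matches in signatures.items():
        if matches(event):
            return name
    return None

# Hypothetical attack pattern and training data.
signatures = {"password_guessing": lambda e: e["failed_logins"] > 20}
normal_events = [
    {"failed_logins": 0, "bytes_sent": 1200},
    {"failed_logins": 2, "bytes_sent": 900},
    {"failed_logins": 1, "bytes_sent": 1500},
]
profile = learn_profile(normal_events)

known_attack = {"failed_logins": 50, "bytes_sent": 1000}
new_attack = {"failed_logins": 0, "bytes_sent": 5_000_000}

known_by_misuse = misuse_detect(known_attack, signatures)  # names the attack type
new_by_misuse = misuse_detect(new_attack, signatures)      # no signature matches
new_by_anomaly = anomaly_detect(new_attack, profile)       # flagged, but not named
```

The misuse detector names the known attack but has no pattern for the new one; the anomaly detector flags the new attack but only as "not normal". If `new_attack` had been part of `normal_events`, the learned range for `bytes_sent` would widen to include it, which is the training-time weakness described above.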
Data mining approaches can be applied to both anomaly detection and misuse detection. Each data sample is a set of system properties representing the behavior of the system or user. Classification techniques learn a model from a training set of data samples. The model is then used to classify each data sample as an instance of either anomalous or normal behavior. Clustering techniques can be used to form clusters of data samples corresponding to normal use of the system. Any data sample whose characteristics differ from those of the formed clusters is considered an instance of anomalous behavior. Compared with classification-based techniques, clustering-based techniques are able to detect new attacks.
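A minimal sketch of the clustering-based test described above, assuming that clusters of normal samples are already available and that "different from the formed clusters" is measured as Euclidean distance to the nearest cluster centroid. The distance measure, the `threshold` value, and the sample data are illustrative assumptions, not choices taken from the report.

```python
from math import dist

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    return tuple(sum(c) / len(c) for c in zip(*points))

def is_anomalous(sample, normal_clusters, threshold):
    """True if the sample is farther than `threshold` from every normal cluster centroid."""
    return all(dist(sample, centroid(c)) > threshold for c in normal_clusters)

# Hypothetical clusters formed from normal-use samples.
normal_clusters = [
    [(1.0, 1.0), (1.2, 0.8)],
    [(5.0, 5.0), (5.2, 4.8)],
]

near_normal = is_anomalous((1.1, 1.0), normal_clusters, threshold=1.0)
far_from_all = is_anomalous((9.0, 0.5), normal_clusters, threshold=1.0)
```

Because the test only asks whether a sample lies close to some normal cluster, it needs no examples of attacks, which is why an attack never seen before can still be flagged.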
A number of classification and clustering algorithms can be used for anomaly detection. [?] proposes the use of Bayesian classifiers to learn a model that distinguishes the behavior of an intruder from that of normal users. [?] proposes a hierarchical clustering based algorithm for anomaly detection on a network. [?] proposes a WINNOW-based algorithm for anomaly detection. [?] proposes the use of neural networks, and [?] proposes the use of density-based clustering for anomaly detection.
The rest of the report is organized as follows. Section 2 discusses Bayesian classifiers and neural network based classification. Section 3 discusses hierarchical and density-based clustering. Section 4 discusses the anomaly detection approach based on the WINNOW-based algorithm, and the use of the classification and clustering algorithms from Sections 2 and 3 for anomaly detection. Section 5 gives the conclusion.
http://docs.google.com/viewer?a=v&q=cach...ds/seminar and presentationreport.pdf+Classification,+Clustering+and+Application+in+Intrusion+Detection+System+pdf&hl=en&gl=in&pid=bl&srcid=ADGEESgXLHx74s7MfH4QpKLPEO8q19LIThMSr6obmApJ_J_mdfSZeavw5R-_TvF2CHiZpasGJeNzDRaabzaxykrzNlr5eQ_veAWN-wfYek9ksj8Tab9t-tuTrUgWQ6i_h50IL9fNsCWp&sig=AHIEtbRLU7Fnnj4CRh3zrmdqRMqXndwo7Q