Data mining and intrusion analysis

Do data mining and network security have anything in common? Surprisingly, yes: data mining can be used in intrusion analysis. This excerpt from an article at InformIT.com explains:

Within analysis functions, information is synchronized, classified, and subjected to scrutiny of various types to identify activity patterns of security significance.

An approach similar to some of the rule-based anomaly detection efforts uses data mining techniques to build intrusion detection models. The objective of this approach is to discover consistent, useful patterns of system features that can be used to describe program and user behaviors. These sets of system features can then be processed by inductive methods to form classifiers (detection engines) that recognize anomalies and misuse scenarios.
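
As a rough illustration of the inductive step described above, the following Python sketch learns a one-rule classifier (a "decision stump") from a handful of labeled audit records. The feature names (failed_logins, files_opened), the records, and the helper functions learn_stump and classify are all invented for this example; the excerpt does not say which features or induction method the researchers used.

```python
from collections import Counter

# Hypothetical labeled audit records: (system features, label).
# Feature names and values are made up for illustration only.
TRAINING = [
    ({"failed_logins": 0, "files_opened": 12}, "normal"),
    ({"failed_logins": 1, "files_opened": 30}, "normal"),
    ({"failed_logins": 0, "files_opened": 45}, "normal"),
    ({"failed_logins": 7, "files_opened": 20}, "abnormal"),
    ({"failed_logins": 9, "files_opened": 15}, "abnormal"),
    ({"failed_logins": 6, "files_opened": 40}, "abnormal"),
]


def learn_stump(records):
    """Induce a one-rule classifier: choose the (feature, threshold) split
    that misclassifies the fewest training records."""
    best = None  # (errors, feature, threshold, low_label, high_label)
    for feature in records[0][0]:
        for threshold in sorted({feats[feature] for feats, _ in records}):
            low = Counter(lab for feats, lab in records
                          if feats[feature] <= threshold)
            high = Counter(lab for feats, lab in records
                           if feats[feature] > threshold)
            if not high:
                continue  # a threshold at the maximum value splits nothing
            # Each side of the split predicts its majority label.
            low_label = low.most_common(1)[0][0]
            high_label = high.most_common(1)[0][0]
            errors = ((sum(low.values()) - low[low_label])
                      + (sum(high.values()) - high[high_label]))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, low_label, high_label)
    return best[1:]


def classify(stump, features):
    """Apply the learned rule to one audit record."""
    feature, threshold, low_label, high_label = stump
    return low_label if features[feature] <= threshold else high_label


stump = learn_stump(TRAINING)
print(stump)
print(classify(stump, {"failed_logins": 8, "files_opened": 5}))
```

A real detection engine would use far more features and a richer model, such as a full decision tree or rule set, but the shape is the same: labeled audit data goes in, and a reusable classifier comes out.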

Data mining refers to the process of extracting models from large bodies of data. These models often discover facts in the data that are not apparent through other means of inspection. Although many algorithms are available for data mining purposes, the three that are most useful for mining audit data are classification, link analysis, and sequence analysis.

  • Classification assigns a data item to one of several predefined categories. (This step is akin to sorting data into "bins," depending on some criteria.) Classification algorithms output classifiers, such as decision trees or rules. In intrusion detection, an optimal classifier can reliably identify audit data as falling into a normal or abnormal category.
  • Link analysis identifies relationships and correlations between fields in the body of data. In intrusion detection, an optimal link analysis algorithm identifies the set of system features best able to reliably reveal intrusions.
  • Sequence analysis models sequential patterns. These algorithms can reveal which audit events typically occur together and hold the key to expanding intrusion detection models to include temporal statistical measures. These measures can provide the capability to recognize denial-of-service attacks.
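
The link analysis and sequence analysis ideas in the list above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the researchers' algorithms: the connection records, field names, event names, and the functions field_links, event_pairs, and counts_in_window are all hypothetical. Link analysis is approximated by counting field=value pairs that co-occur across records, and sequence analysis by counting adjacent event pairs and by a simple temporal measure (connections to the same destination within a short time window).

```python
from collections import Counter
from itertools import combinations

# Hypothetical connection records; fields and values are invented.
RECORDS = [
    {"service": "http", "flag": "SF", "dst": "web1"},
    {"service": "http", "flag": "SF", "dst": "web1"},
    {"service": "http", "flag": "S0", "dst": "web1"},
    {"service": "smtp", "flag": "SF", "dst": "mail1"},
    {"service": "smtp", "flag": "SF", "dst": "mail1"},
]


def field_links(records, min_count):
    """Link analysis sketch: return pairs of (field, value) items that
    appear together in at least min_count records."""
    pair_counts = Counter()
    for rec in records:
        # Sort items so each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(rec.items()), 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


def event_pairs(events):
    """Sequence analysis sketch: count how often one audit event is
    immediately followed by another."""
    return Counter(zip(events, events[1:]))


def counts_in_window(timed_events, window=2.0):
    """Temporal measure: for each (timestamp, dst) event, count the events
    sent to the same dst within the preceding `window` seconds, including
    the event itself. Assumes the events are sorted by timestamp."""
    result = []
    for i, (t, dst) in enumerate(timed_events):
        recent = sum(1 for t2, d2 in timed_events[: i + 1]
                     if d2 == dst and t - t2 <= window)
        result.append(recent)
    return result


links = field_links(RECORDS, min_count=3)
pairs = event_pairs(["login", "open", "read", "close",
                     "login", "open", "read", "close"])
burst = counts_in_window([(0.0, "web1"), (5.0, "web1"), (10.0, "web1"),
                          (10.1, "web1"), (10.2, "web1"), (10.3, "web1"),
                          (10.4, "mail1")])
print(links)
print(pairs[("login", "open")])
print(burst)
```

A sudden climb in the per-destination window count is the kind of temporal statistic that could feed a denial-of-service detector; the window length and any alert threshold would need tuning against real traffic.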

Researchers have developed extensions to standard data mining algorithms to accommodate some of the special needs of audit and other system event logs. Initial results of experiments using live data are interesting, but the work is not yet ready for transfer into commercial products. Additional research is planned to refine the approach.


For more information on this topic, visit InformIT.

This was first published in March 2001
