Research on Adaptive Intrusion Detection Techniques Based on Outlier Detection

Published: 2009-08-06 11:07  Source: The Second China Information Security Doctoral Forum

FANG Yu Ke¹, FU Yan², ZHOU Jun Lin³, ZENG Jin Quan⁴
School of Computer Science, University of Electronic Science and Technology of China, Chengdu 610054

Abstract: Traditional intrusion detection techniques extract a signature rule pattern for each specific attack from known attack data and then match incoming data against these rules. The main problem with rule-based intrusion detection is that the existing rule patterns cannot cope effectively with new, continuously changing attacks. To address this problem, data-mining-based intrusion detection has become a new research focus. This paper proposes an adaptive intrusion detection framework based on outlier detection. Outliers are first identified using a similarity coefficient; the outlier set is then clustered, and an improved association rule algorithm extracts the latent feature patterns of each class of intrusion activity from the clustering results. Finally, usable matching rules are generated and added to the existing rule base, which makes the detector adaptive. Outlier mining is performed on the KDD99 data set from UCI; the IDS Snort serves as the experimental platform, and the attack simulation tool IDS Informer is used for testing. The results of both experiments demonstrate the effectiveness of the proposed algorithm.
Key words: Artificial intelligence; Intrusion detection; Outlier mining; Anomaly detection; Self-adaptive

0 Introduction

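    More than 20 years have passed since James Anderson first introduced the basic concept of intrusion detection in the technical report "Computer Security Threat Monitoring and Surveillance"^[1]. Intrusion detection has remained one of the main research directions in network security ever since. Within the information security framework, as more and more computer vulnerabilities are exposed, authentication and access control mechanisms alone have proven insufficient to stop the growing number of unknown malicious attacks, and intrusion detection has become an indispensable line of defense. Current intrusion detection techniques fall into two main classes: misuse detection (signature-based detection), also called rule-based detection, such as Snort^[2]; and anomaly detection (anomaly-based detection), also called behavior-based detection, such as MADAM ID^[3] and LERAD^[4].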
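    Both misuse detection and anomaly detection have practical problems. Misuse detection identifies known attacks efficiently and with a low false-positive rate, but it suffers from the burden of signature base updates, difficulty detecting variants of known intrusions, and a high false-negative rate^[5][6], which limits its practical usability. Anomaly detection is relatively system-independent, general, and able to detect unknown attack types, but its false-positive rate is high. Determining the source of an anomaly is also difficult, that is, it is hard to tell whether an anomaly actually stems from an intrusion, and the approach is further affected by concept drift and masquerade attacks^[7][8][9]. More importantly, in practice the computational cost of intrusion detection must allow results within a usable time, which is also why signature-based intrusion detection systems remain prevalent.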
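    Anomaly detection discovers unknown attack patterns by detecting abnormal system behavior. Although the technique still has many shortcomings and is rarely used in commercial products, it leaves considerable room for research; a breakthrough in performance and accuracy would give it a major role in intrusion detection systems, so it deserves in-depth study. As Julisch^[10] points out, data-mining-based anomaly detection rests on an unrealistic assumption about the availability and quality of training sets. Because an intrusion detection system operates in a dynamically changing environment, anomaly detection models built on such training sets gradually lose their effectiveness as the environment changes. The intrusion detection model must therefore itself be changeable, or adaptive, in order to maintain good performance.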
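    This paper therefore introduces relatively mature machine learning algorithms to build an intelligent intrusion detection model. Starting from anomaly detection of intrusion behavior, and taking into account the time-performance requirements of intrusion detection, it uses an outlier mining algorithm to mine the abnormal patterns of intrusion behavior. An improved Apriori algorithm then performs association analysis on the mined abnormal patterns to derive the latent association rules of intrusion behavior, from which new signature rules are generated and added to the rule base; detection itself continues to use rule matching. This makes intrusion detection adaptive and able to detect more unknown attacks effectively.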
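
    The flow outlined above (find outliers, then mine frequent patterns among them) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The similarity measure used here is the simple matching coefficient, chosen as a stand-in because this section does not define the paper's similarity coefficient. The mining step is plain Apriori rather than the improved variant, and the clustering step and the conversion of patterns into Snort rules are omitted. The field names, toy records, and thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical categorical fields of a connection record (toy data, not KDD99 itself).
FIELDS = ("protocol", "service", "flag")

def similarity(a, b):
    """Simple matching coefficient: fraction of fields on which two records agree.
    Assumption: stands in for the paper's similarity coefficient."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def find_outliers(records, threshold):
    """Indices of records whose mean similarity to all other records is below threshold."""
    outliers = []
    for i, r in enumerate(records):
        sims = [similarity(r, s) for j, s in enumerate(records) if j != i]
        if sum(sims) / len(sims) < threshold:
            outliers.append(i)
    return outliers

def frequent_itemsets(transactions, min_support):
    """Plain Apriori: map each itemset found in at least min_support transactions to its count."""
    transactions = [frozenset(t) for t in transactions]
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in item_counts.items() if c >= min_support}
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = sum(1 for t in transactions if itemset <= t)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current for sub in combinations(c, k - 1))}
        current = {c for c in candidates
                   if sum(1 for t in transactions if c <= t) >= min_support}
    return result

# Toy traffic: eight ordinary connections followed by three unusual ones.
records = (
    [("tcp", "http", "SF")] * 6
    + [("tcp", "smtp", "SF")] * 2
    + [("tcp", "private", "S0")] * 2
    + [("tcp", "private", "REJ")]
)

outlier_idx = find_outliers(records, threshold=0.5)
# Each outlier becomes a transaction of (field, value) items for association mining.
outlier_transactions = [set(zip(FIELDS, records[i])) for i in outlier_idx]
patterns = frequent_itemsets(outlier_transactions, min_support=2)
```

    In this toy run the three unusual connections are flagged as outliers, and the combination protocol=tcp, service=private, flag=S0 appears among the frequent patterns. In the framework described above, patterns of this kind would be the raw material for new matching rules.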