Jacques Saraydaryan, Fatiha Benali and Stéphane Ubéda, Exaprotect R&D, Villeurbanne, 69100, France




Experimentation

In this section, we put into practice the two proposed modules, Event Modeling and User Behavioral Analysis, using a large corpus of real data. The event modeling experiment normalizes raw events into the ontology categories that describe their semantics. The user behavioral analysis experiment then uses the normalized events to detect abnormal behaviors.



5.1 Event Modeling
To study the effectiveness of the proposed modeling, we focused our analysis on the exhaustiveness of the ontology (each event is covered by a category) and on the reduction of the number of events to be presented to the security analyst.
We performed an experiment on a corpus of 20182 messages collected from 123 different products. The main characteristic of this corpus is that the events come from heterogeneous products manipulating different concepts (such as attack detection, virus detection or flaw filtering). The sources used are security equipment logs, audit system logs, audit application logs and network component logs. Figure 5 illustrates the various probe types used; the number of probes per type is given in brackets.
The classification process was performed manually by experts: the expert reads a message and assigns it to the category that describes its semantics. To do so, the expert must extract the intention behind the action that generated the message, the movement used to achieve the intention, the target toward which the intention was directed, and the gain related to this intention.
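As an illustration of this manual mapping, the sketch below shows how a raw log line could be assigned a dot-separated category label built from the extracted attributes, similar to the labels used later in Section 5.2. It is a minimal sketch under stated assumptions, not the authors' tooling: the regex rules, field names and the exact decomposition of the label are illustrative.

# Minimal sketch of the manual normalization step; NOT the authors' tooling.
# The regex patterns, field names and example rules are illustrative
# assumptions: in the paper the mapping is built by experts reading messages.
import re
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Category:
    intention: str   # e.g. "Authentication_Activity"
    movement: str    # e.g. "Login"
    target: str      # e.g. "SysAuth.Account"
    gain: str        # e.g. "Success"

    def label(self) -> str:
        # Dot-separated label, in the spirit of the category names of Section 5.2.
        return ".".join((self.intention, self.movement, self.target, self.gain))

# Hypothetical expert-built rules: raw-message pattern -> category.
RULES = [
    (re.compile(r"Accepted password for \S+"),
     Category("Authentication_Activity", "Login", "SysAuth.Account", "Success")),
    (re.compile(r"Failed password for \S+"),
     Category("Authentication_Activity", "Login", "SysAuth.Account", "Failure")),
]

def normalize(raw_message: str) -> Optional[str]:
    """Return the ontology label of the first matching rule, or None if the
    message is not covered (it would then be reviewed by an expert)."""
    for pattern, category in RULES:
        if pattern.search(raw_message):
            return category.label()
    return None

print(normalize("Accepted password for alice from 10.0.0.5"))
# -> Authentication_Activity.Login.SysAuth.Account.Success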

The manual classification of raw events produced categories of various sizes. The distribution of the messages over the categories is represented in Figure 6. Some categories are large; the largest one contains 6627 events, i.e. 32.83% of the corpus. This is due to the monitoring of the same activity by many products, or to the presence of the same signatures in many products. Representing these events under the same semantics reinforces the process of managing security in a cooperative context and facilitates the analyst's task (more detail in [2]). In addition, we obtained singleton categories: 732 raw events each form their own category, which represents 42.21% of all categories but only 3.63% of the corpus.
Event modeling thus reduced the number of events by 91.40% (from 20182 raw events to 1734 categories); these figures are reproduced in the short sketch after the list below. The presence of singleton categories can be explained by the following points:


  • Only one of the deployed products produces this type of event, i.e. a signature recognized by one product and not by another;

  • Errors made by the experts led to the creation of new categories that should not theoretically exist;

  • The presence of rarely monitored targets increases the number of singleton categories: the movement appears several times in the corpus, but only once for these rare targets.
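The headline figures of this subsection follow directly from the corpus counts; the short check below simply reproduces them (all values are taken from the text above).

# Quick check of the corpus figures quoted above (all counts from the text).
total_raw_events = 20182     # raw messages in the corpus
categories = 1734            # categories obtained after manual classification
singleton_categories = 732   # categories containing a single raw event
largest_category = 6627      # raw events in the largest category

print(f"reduction: {1 - categories / total_raw_events:.2%}")                     # ~91.4%
print(f"largest category: {largest_category / total_raw_events:.2%} of corpus")  # ~32.8%
print(f"singletons: {singleton_categories / categories:.2%} of categories")      # ~42.2%
print(f"singletons: {singleton_categories / total_raw_events:.2%} of corpus")    # ~3.6%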

We observe that the category of the movement Suspicious, introduced into our ontology, is necessary to preserve the semantics of raw events that reflect a suspicion.


These types of events will be processed by the User Behavioral Analysis. The ontology does not itself analyze events; its goal is to preserve the semantics of raw events. The proportions of the various categories depend on the deployed products and on the activities these products supervise. The conclusion we can draw from this study is that a good line of defense must supervise all the activities targeted in an IS, and that cooperative detection should not focus on the number of deployed products but on the activities to be supervised in the IS. This result can call into question the choice of the defense line for the IS.
5.2 Behavioral Analysis

Current Intrusion Detection Systems operating at the scale of a globally monitored Information System lack large test datasets for assessing their efficiency and scalability. In this section, we report the results of our Anomaly Intrusion Detection System on a real, normalized data set. We deployed our architecture on a real network and collected events coming from hundreds of IS components.






DataSet Analysis: Our dataset comes from a large company with hundreds of users using multiple services. It has been divided into two parts: a training data set composed of events collected over 23 days, and a test data set composed of events collected over the 2 days following the training period. The training data set is used to train our engine and to create the model of normal user behavior. The test data set has been enriched with attack scenarios in order to evaluate our detection engine. First, the test data set is used to measure the false alarm rate (false positive rate) of our engine. Then the attack scenarios are used to determine our detection rate and, moreover, the false negative rate (rate of undetected attacks) of our engine.
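As a reminder of how these three rates are computed from the labeled test set, the generic sketch below may help; it is not the authors' evaluation code, and the boolean alert/label representation is an assumption.

# Generic sketch of the three rates measured on the test set; labels are known
# because the test set was built from normal traffic enriched with scripted
# attack scenarios. This is not the authors' evaluation code.
def evaluation_rates(alerts, labels):
    """alerts[i]: True if the engine raised an alert on event i.
    labels[i]: "attack" or "normal"."""
    attacks = sum(1 for label in labels if label == "attack")
    normals = len(labels) - attacks
    detected = sum(1 for alert, label in zip(alerts, labels)
                   if alert and label == "attack")
    false_pos = sum(1 for alert, label in zip(alerts, labels)
                    if alert and label == "normal")
    return {
        "detection_rate": detected / attacks,            # detected attacks
        "false_negative_rate": 1 - detected / attacks,   # undetected attacks
        "false_positive_rate": false_pos / normals,      # alarms on normal events
    }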

The major part of the collected events consists of web server information and authentication information. We can notice that, during the monitored period, some types of events are periodic (like Authentication_Activity.Login.SysAuth.Account.Success) while others are sporadic. Moreover, our dataset is composed of more than 70 types of events, ranging from authentication actions to system usage (like service starts).

Our training dataset, representing the activity of the system for 23 days, is composed of 7 500 000 events. The test data set is composed of 85 000 normal events (two days of the system's activity) and 410 events representing three different attack scenarios. These scenarios reflect three types of attack effects on the system, as introduced in the DARPA attack classification (Remote to Local, User to Root, etc.). Some scenario variants were developed for each class.


For example, for the Remote to Local attack scenario, we provide two variants. The first Remote to Local variant is composed of four different event classes:

- Authentication_Activity.Login.SysAuth.Account.Success,
- Authentication_Config.Add.SysAuth.Account.Success,
- Authentication_Activity.Login.SSH.Admin.Success,
- System_Activity.Stop.Audit.N.Success.

The second one holds the classes below:

- Authentication_Activity.Login.SysAuth.Account.Success,
- Authentication_Config.Modify.SysAuth.Account.Success,
- Authentication_Activity.Login.SysAuth.Admin.Success,
- System_Activity.Execute.Command.Admin.Success.
Each variant of each scenario is reproduced ten times with different attributes (user login, source and destination IP addresses) taken from the data set.
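A sketch of how such a variant could be replayed with attributes drawn from the legitimate data set is shown below; the event classes are those listed above, but the attribute names and the replay function are illustrative assumptions, not the generator actually used.

import random

# One Remote-to-Local variant expressed as an ordered list of event classes
# (classes taken from the text); the attribute names are illustrative.
R2L_VARIANT_1 = [
    "Authentication_Activity.Login.SysAuth.Account.Success",
    "Authentication_Config.Add.SysAuth.Account.Success",
    "Authentication_Activity.Login.SSH.Admin.Success",
    "System_Activity.Stop.Audit.N.Success",
]

def replay(variant, users, src_ips, dst_ips, repetitions=10, seed=0):
    """Instantiate a scenario `repetitions` times, each time with attributes
    drawn from the legitimate data set, as done when enriching the test set."""
    rng = random.Random(seed)
    runs = []
    for _ in range(repetitions):
        user, src, dst = rng.choice(users), rng.choice(src_ips), rng.choice(dst_ips)
        runs.append([{"class": cls, "user": user, "src_ip": src, "dst_ip": dst}
                     for cls in variant])
    return runs

# Example: replay(R2L_VARIANT_1, ["alice", "bob"], ["10.0.0.5"], ["10.0.1.2"])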
Note that all events involved in the attack scenarios refer to legitimate actions; these events form a set of actions shared between legitimate user behaviors and attacker strategies.

Results: The training data set is used to build our user activity model. To compute this model efficiently, we split the training data set into 440 steps.

Each feature of the Bayesian structure (nodes, links, states) evolves differently and reaches its stationary point at a different time. Nodes (referring to event classes) become stationary around step 330, whereas links (relationships between event classes) continue to evolve until step 360. Only the states (user or process identifiers) seem to never reach a stationary point. To understand this phenomenon, we analyzed in depth the evolution of the states of each node. We noticed that the states of one particular node, Authentication_Activity.Login.SysAuth.Account.Success, blow up. We investigated and discovered that the considered company owns an e-commerce web server on which each new customer receives a new login account. That is why, when the other nodes reach their stationary point around the 390th step, the Authentication_Activity.Login.SysAuth.Account.Success node continues to grow. To avoid a complexity explosion inside our Bayesian model, we add a constraint based on an indicator of the time during which an event has remained unused: a threshold determines which states of a node are kept and which ones are dropped.
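A minimal sketch of this pruning constraint is given below, assuming each node keeps track of the last training step at which each of its states was observed; the threshold value and the data layout are illustrative assumptions, not the authors' implementation.

# Minimal sketch of the unused-state pruning constraint, assuming each node of
# the Bayesian model records, for every state (e.g. a user or account
# identifier), the last training step at which it was observed. The threshold
# value and the data layout are illustrative assumptions.
def prune_states(node_states, current_step, max_unused_steps=50):
    """node_states: {state_value: last_seen_step}. Drop states unseen for more
    than `max_unused_steps` steps, preventing nodes such as
    Authentication_Activity.Login.SysAuth.Account.Success from accumulating
    one state per one-shot e-commerce account."""
    return {state: last_seen
            for state, last_seen in node_states.items()
            if current_step - last_seen <= max_unused_steps}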
The test data set is then processed by our anomaly detection system; the detection results are shown in Figure 7. This table distinguishes the events of each scenario and gives the detection rate for different probability thresholds. These thresholds can be chosen according to the goals of the organization or company. For a very sensitive IS, the attack detection rate needs to be as high as possible: a probability threshold of 0.002, achieving a detection rate of 90% with a false positive rate around 14%, would be suitable. For a broader use of our approach, companies trade off false positives against detection rate: a threshold of 0.0001 provides an attack detection rate of 79% with a false positive rate below 0.5%.
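The trade-off above amounts to comparing the conditional probability the Bayesian model assigns to each incoming event against a fixed threshold. The sketch below shows this decision rule only; the inference step is abstracted behind an assumed `conditional_probability` callable and is not implemented here.

# Decision rule behind the threshold trade-off discussed above. Here
# `conditional_probability` stands for the Bayesian-network inference step
# (computing the probability of an event given its observed context); it is
# assumed, not implemented.
def classify(events, conditional_probability, threshold=0.002):
    """Flag an event as anomalous when the model deems it too unlikely.
    According to the results above, threshold=0.002 favours detection
    (about 90%, with roughly 14% false positives) while threshold=0.0001
    favours a low false positive rate (below 0.5%, about 79% detection)."""
    return [conditional_probability(event) < threshold for event in events]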
An additional observation can be made regarding our attack detection rate. Detection tools often reach detection rates of about 95%, but in our context all the scenarios are composed of events that belong to normal behavior. Not all of these events deviate from normal behavior, which is why our detection rate is slightly below the classical one. We estimate that a little less than 10% of the attack events in the test set fully match normal behavior (legitimate events and attributes). Despite this constraint, we still reach detection rates from 80% to 90%.


Conclusion and Perspectives

Our main goal throughout this paper was to describe a framework that addresses a central concern of companies: managing the company's security information to protect the IS from threats. We proposed an architecture that provides a global view of what occurs in the IS. It is composed of different agent types collecting and modeling information coming from various security products (such as firewalls, IDSs and antivirus software), operating systems, applications, databases and other elementary information relevant for security analysis. An Analysis Server gathers all the information sent by the agents and provides a behavioral analysis of user activities.


A new model for describing the semantics of security information is defined to address the heterogeneity problem. The model is an ontology that describes all the activities that can be undertaken in an IS. Using real data triggered by products deployed to protect the assets of an organization, we showed that the modeling reduces the amount of events and allows automatic treatment of security information by the analysis algorithms. The model is extensible: the vocabulary can be increased as needed, for example by adding a new movement to be supervised in the IS. The model can also be applied to other monitoring contexts, such as the monitoring of physical intruders in a museum; all we have to do is define the adequate vocabulary for the new context.
We demonstrated that unknown attack scenarios can be efficiently detected through our User Behavioral Analysis, without requiring detailed prior descriptions of attacks. Using only relevant information, user behaviors are modeled through a Bayesian network. This modeling achieves high detection effectiveness by injecting incoming events into the model and computing all the associated conditional probabilities. Our anomaly evaluation module dynamically updates a user's model, reducing false positives and enriching behavioral anomalies. The experimentation on a real data set highlights our high detection rate on legitimate actions involved in attack scenarios.
Since all data are modeled in the same way, the User Behavioral Analysis results also show that the effectiveness of the analysis processes is highly dependent on the data modeling.
The proposed framework can be useful to other processes. Indeed, the ontology is necessary to carry out a counter-measure process: since the results of the User Behavioral Analysis allow the administrator to detect legitimate users that deviate from their behavior, a reaction process can then be set up to respond to malicious behaviors.

References
[1] E. G. Amoroso. Fundamentals of computer security technology. Prentice-Hall, Inc, Upper Saddle River, NJ, USA,1994.

[2] F. Benali, V. Legrand, and S. Ubéda. An ontology for the management of heterogeneous alerts of information system. In The 2007 International Conference on Security and Management (SAM’07), Las Vegas, USA, June 2007.

[3] M. Bishop. How attackers break programs, and how to write programs more securely. SANS 2002, Baltimore,MD, May 2002.

[4] K. Boudaoud. Un système multi-agents pour la détection d’intrusions. In JDIR’2000- Journées Doctorales Informatique et Réseaux, Paris, France, 2000.

[5] W. R. Cheswick and S. M. Bellovin. Firewalls and Internet Security: Repelling the Wily Hacker. Addison-Wesley, 1994.

[6] F. B. Cohen. Protection and security on the information superhighway. John Wiley & Sons, Inc., New York, NY, USA, 1995.

[7] F. B. Cohen. Information system attacks: A preliminary classification scheme. Computers and Security, 16, No. 1:29-46, 1997.

[8] D. Curry and H. Debar. Intrusion detection message exchange format. http://www.rfc-editor.org/rfc/rfc4765.txt.

[9] D. Davidson. Actions, reasons, and causes. Journal of Philosophy, pages 685-700, 1963. (Reprinted in Davidson 1980, pp. 3-19.)

[10] D. Davidson. Freedom to act. Honderich (ed.), Essays on Freedom of Action (London: Routledge & Kegan Paul, 1973, reprinted with corrections, 1978) pp. 137-156., 1973.

[11] T. Duval, B. Jouga, and L. Roger. Xmeta a bayesian approach for computer forensics. In Annual Computer Security Applications Conference (ACSAC), 2004.

[12] S. T. Eckmann, G. Vigna, and R. A. Kemmerer. STATL: an attack language for state-based intrusion detection. Journal of Computer Security, 10:71-103, 2002.

[13] G. Helmer, J. S. K. Wong, V. G. Honavar, L. Miller, and Y. Wang. Lightweight agents for intrusion detection. Journal of Systems and Software, 67:109-122, 2003.

[14] M. Hossain and S. M. Bridges. A framework for an adaptive intrusion detection system with data mining. In 13th Annual Canadian Information Technology Security Symposium, 2001.

[15] J. Howard and T. Longstaff. A common language for computer security incidents. Sand98-8667, Sandia International Laboratories, 1998.

[16] J. D. Howard. An Analysis of Security Incidents on the Internet. PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 USA, April 1997.

[17] M.-Y. Huang, R. J. Jasper, and T. M. Wicks. A large scale distributed intrusion detection framework based on attack strategy analysis. Computer Networks (Amsterdam, Netherlands: 1999), 31(23-24):2465-2475, 1999.

[18] D. Icove, K. Seger, and W. VonStorch. Computer Crime: A Crimefighter’s Handbook. O’Reilly & Associates, Inc., Sebastopol, CA, 1995.

[19] D. J. Israel, J. R. Perry, and S. Tutiya. Executions, motivations, and accomplishments. Philosophical Review, 102(4), Oct 1993.

[20] A. Keleme, Y. Liang, and S. Franklin. A comparative study of different machine learning approaches for decision making. In Recent Advances in Simulation, Computational Methods and Soft Computing, 2002.

[21] U. Lindqvist and E. Jonsson. How to systematically classify computer security intrusions. In proceeding of the IEEE Symposium on Security and Privacy, pages 154-163, 1997.

[22] S. Mathew, C. Shah, and S. Upadhyaya. An alert fusion framework for situation awareness of coordinated multistage attacks. In IWIA ’05: Proceedings of the Third IEEE International Workshop on Information Assurance, 2005.

[23] T. P. Minka. Expectation propagation for approximate bayesian inference. In the 17th Conference in Uncertainty in Artificial Intelligence, 2001.

[24] P. G. Neumann. Computer-Related Risks. Addison-Wesley, October 1994.

[25] P. G. Neumann and D. B. Parker. A summary of computer misuse techniques. In Proceedings of the 12th National Computer Security Conference, pages 396-407, Baltimore, Maryland, October 1989.

[26] P. Ning, Y. Cui, and D. S. Reeves. Constructing attack scenarios through correlation of intrusion alerts. In CCS ’02: Proceedings of the 9th ACM conference on Computer and Communications Security, 2002.

[27] P. Ning, S. Jajodia, and X. S. Wang. Abstraction-based intrusion detection in distributed environments. ACM Transaction Information and System Security, 4(4):407-452, 2001.

[28] S. Noel, E. Robertson, and S. Jajodia. Correlating intrusion events and building attack scenarios through attack graph distances. In ACSAC ’04: Proceedings of the 20th Annual Computer Security Applications Conference, 2004.

[29] R. Puttini, Z. Marrakchi, and L. Me. Bayesian classification model for real-time intrusion detection. In 22nd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 2002.

[30] X. Qin and W. Lee. Attack plan recognition and prediction using causal networks. In ACSAC ’04: Proceedings of the 20th Annual Computer Security Applications Conference, 2004.

[31] D. Davidson. Intending. In Y. Yovel (ed.), Philosophy of History and Action, pp. 41-60. (Reprinted in Davidson 1980, pp. 83-102.)

[32] S. Staniford-Chen, B. Tung, and D. Schnackenberg. The common intrusion detection framework (CIDF). In The Information Survivability Workshop (ISW ’98), Orlando, FL, October 1998. CERT Coordination Center, Software Engineering Institute.

[33] J. Saraydaryan, V. Legrand, and S. Ubeda. Behavioral anomaly detection using bayesian modelization based on a global vision of the system. In 7eme Conference Internationale sur les NOuvelles TEchnologies de la REpartition (NOTERE 07), 2007.

[34] J. Saraydaryan, V. Legrand, and S. Ubeda. Behavioral intrusion detection indicators. In 23rd International Information Security Conference (SEC 2008), 2008.

[35] J. Saraydaryan, V. Legrand, and S. Ubeda. Modeling of information system correlated events time dependencies. In 8eme Conference Internationale sur les NOuvelles TEchnologies de la REpartition (NOTERE 08), 2008.

[36] O. Sheyner. Scenario Graphs and Attack Graphs. PhD thesis, Carnegie Mellon University, 2004.

[37] D. J. Spiegelhalter and S. L. Lauritzen. Sequential updating of conditional probabilities on directed graphical structures. Networks ISSN 0028-3045, 20:579-605, 1990.

[38] W. Stallings. Network and Internetwork Security: Principles and Practice. Prentice-Hall, Inc, Upper Saddle River, NJ, USA, 1995.

[39] S. Staniford-Chen, S. Cheung, R. Crawford, M. Dilger, J. Frank, J. Hoagland, K. Levitt, C. Wee, R. Yip, and D. Zerkle. Grids - a graph-based intrusion detection system for large networks. In Proceedings of the 19th National Information Systems Security Conference, 1996.

[40] J. M. Stanton, K. R. Stam, P. Mastrangelo, and J. Jolton. Analysis of end user security behaviors. Computers & Security, Volume 24: Pages 124-133, 2005.

[41] A. J. Stewart. Distributed metastasis: A computer network penetration methodology. Phrack Magazine, 55(9), 1999.

[42] J. L. Undercoffer, A. Joshi, and J. Pinkston. Modeling computer attacks: An ontology for intrusion detection. In The Sixth International Symposium on Recent Advances in Intrusion Detection, September 2003.

[43] R. A. Wasniowski. Multi-sensor agent-based intrusion detection system. In InfoSecCD ’05: Proceedings of the 2nd annual conference on Information security curriculum development, 2005.

[44] T. Wolle and H. L. Bodlaender. A note on edge contraction. Technical report, institute of information and computing sciences, Utrecht university, 2004.

[45] Z. Zhang, J. Li, C. Manikopoulos, J. Jorgenson, and J. Ucles. Hide: A hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In the 2001 IEEE Workshop on Information Assurance and Security, 2001.



Jacques Saraydaryan holds an engineering degree in Computer Science from ISTASE, France (2004), a Master’s degree in Telecoms and Networks from the National Institute of Applied Sciences (INSA), Lyon, France (2005), and a Ph.D. in computer science from INSA Lyon, France (2009). He is a Research Engineer at the Exaprotect company, France. His research focuses on IS security, especially anomaly intrusion detection systems. His work has been published in international conferences such as SEC’08 and Securware’07. He holds one patent with the Exaprotect company.
Fatiha Benali holds an engineering degree in Computer Science from Es-Senia University, Oran, Algeria (1998), a Master’s degree in Fundamental Computer Science from the École Normale Supérieure (ENS), Lyon, France, and a Ph.D. in computer science from INSA Lyon, France (2009). She is a Lecturer in the Department of Telecommunications, Services & Usages and a researcher at the Center for Innovations in Telecommunication and Services Integration (CITI Lab) at INSA Lyon, France. Her research focuses on IS security, notably on information security modeling. Her work has been published in international conferences such as Security and Management (SAM’07) and Securware’08. Two of her papers received awards, and she holds one patent with the Exaprotect company.
Stéphane Ubéda holds a Ph.D. in computer science from ENS Lyon, France (1993). He became an associate professor at Jean Monnet University, France, in 1995, obtained his Habilitation to conduct research in 1997, and became a full professor at INSA Lyon in 2000, in the Telecommunications department. He is the director of the CITI Lab and the head of the French National Institute for Research in Computer Science and Control (INRIA) project named ARES (Architecture of Networks of Services).

1 http://www.rfc-editor.org/rfc/rfc4765.txt

2 An injection consists of launching an operation through an already started session or service.


