Master Sciences, Technique, Santé


Name of the course: Data Mining



Number of ECTS credits:
Contact :

Name and Given names : Jean-François Boulicaut

Phone : +33 (0)4 72 43 89 05

email : +33 (0)4 72 43 87 13

Other professor(s) if any : Christophe Rigotti


Exam: Prepared lecture on a couple of influential data mining papers.


Course Content
Data mining was identified as one of the ten emerging technologies of the 21st century (MIT Technology Review, 2001). The discipline aims at discovering knowledge from large amounts of data, and it has developed at the intersection of several disciplines related to data processing, e.g., machine learning, database management, visualization, and statistics. In the first part, we will provide an overview of the active research field of data mining and knowledge discovery in databases. Classical techniques (clustering, supervised classification, pattern discovery) will be considered. Examples of real-life data mining applications will include, among others, basket data analysis, Web usage data analysis, and knowledge discovery in the life sciences (e.g., molecular biology).

The second part will be dedicated to constraint-based data mining and the emerging framework of inductive querying. After an introduction to this formal framework, we will discuss some recent research topics related to condensed representations of frequent patterns and constraint-based mining of sequential patterns.
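
As a rough illustration of what a condensed representation of frequent patterns can look like, the sketch below enumerates frequent itemsets by brute force on a tiny, made-up basket dataset and then keeps only the closed ones (itemsets with no proper superset of equal support). The data, the function names, and the support threshold are assumptions chosen for illustration; they are not taken from the course material, and practical miners use far more efficient algorithms.

```python
from itertools import combinations

# Hypothetical basket data: each transaction is a set of purchased items.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Enumerate all itemsets with support >= min_support (brute force)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep frequent itemsets that have no proper superset with equal support."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
    }


if __name__ == "__main__":
    frequent = frequent_itemsets(TRANSACTIONS, min_support=2)
    closed = closed_itemsets(frequent)
    print(len(frequent), "frequent itemsets,", len(closed), "closed")
    for itemset, count in sorted(closed.items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), count)
```

On this toy data, the closed itemsets are fewer than the frequent ones, yet the support of any frequent itemset can be recovered as the largest support among the closed itemsets that contain it. That lossless property is what makes the representation "condensed".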
C1 KDD: motivations and terminology (Boulicaut)

C2 Data (Rigotti)

C3 Data exploration (Rigotti)

C4 Clustering (Rigotti)

C5 Classification (Rigotti)

C6 Association analysis (Boulicaut)

C7 Towards a theory of data mining (Boulicaut)

C8 Condensed representations for frequent patterns (Boulicaut)

C9 Advanced sequential pattern mining techniques (Rigotti)

C10 A research agenda (Boulicaut)
The course is based on the book “Introduction to Data Mining” by Pang-Ning Tan, Michael Steinbach and Vipin Kumar, published in 2006 by Addison-Wesley (slides have been provided by the authors).

Students will be able to apply the techniques to benchmark data using the free software platform Weka.


Targeted Skills: Students understand the popular data mining techniques (e.g., K-Means and hierarchical clustering, decision tree building, association rule mining). They also understand some recent conceptual issues and principles related to inductive querying and constraint-based mining, which provide a conceptual framework for analysing current research directions in data mining.





