Transforming and leveraging OLAP queries Patrick Marcel
|
Patrick Marcel Université François Rabelais Tours SAP-BO, 06.22.2010
Outline
- Short CV
- Personalizing OLAP queries
- Recommending OLAP queries
- Summarizing OLAP queries
- Perspectives
About me
PhD « Multidimensional data(base) manipulations and rule-based languages »
- Defended 1998, LISI (now LIRIS), INSA Lyon
- Supervisors: J. Kouloumdjian and M.S. Hacid
Maître de Conférences, UFRT, Dépt. Informatique
- Head of the Masters program in Information systems and decision making
Semester off (September 2010 – January 2011)
About me (cont'd)
Member of DB & NLP team (4 PR, 8 MCF)
- NLP
- XML and web technology
- Data mining and OLAP
Recent activities
- Pattern-based global models (PhD Eynollah Khanjari 2009)
- Summarizing and visualizing large sets of association rules (PhD Marie Ndiaye 2010)
- Collaborative exploration of data warehouses (PhD Elsa Negre 2009)
Personalizing OLAP queries
PhD Hassina Mouloudi (2007)
Main publications
- ACM DOLAP 2005
- BDA 2006
- Hassina's dissertation (in French)
Prototype
Motivation
SELECT CROSSJOIN({City.Tours, City.Orleans}, {Category.Members}) ON ROWS,
       {2003, 2004, 2005, 2006} ON COLUMNS
FROM SalesCube
WHERE (Measures.quantity)
Visualization depends on the user's profile
The problem
Given
- An MDX query q
- User preferences P
- A visualization constraint v
Find a preferred query q'
- Included in q
- Nearest to q satisfying v
- The most interesting w.r.t. P
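The problem statement can be illustrated with a minimal Python sketch. It is not the algorithm from the DOLAP 2005 paper or the dissertation. It assumes that q is reduced to two lists of members (rows and columns), that the visualization constraint v is a maximum crosstab size, and that the preferences P are given as a numeric score per member. The names `personalize`, `score`, `max_rows` and `max_cols` are hypothetical.

```python
def personalize(rows, cols, score, max_rows, max_cols):
    """Return a sub-query (rows', cols') of the query (rows, cols).

    rows, cols         : lists of distinct members on each axis of q
    score              : hypothetical preference function standing in for P
                         (higher means preferred)
    max_rows, max_cols : hypothetical visualization constraint v
    """
    def keep_best(members, limit):
        # Select the `limit` best-scored members (ties keep query order),
        # then return them in the order they had in the original query.
        best = set(sorted(members, key=score, reverse=True)[:limit])
        return [m for m in members if m in best]

    return keep_best(rows, max_rows), keep_best(cols, max_cols)


# Usage: a user who prefers recent years and a display limited to 2 x 2.
prefs = {2006: 2, 2005: 1}
result = personalize(["Tours", "Orleans"], [2003, 2004, 2005, 2006],
                     lambda m: prefs.get(m, 0), max_rows=2, max_cols=2)
print(result)
```

The result only uses members of q (included in q), keeps as many members as v allows (nearest to q), and drops the least preferred ones first (most interesting w.r.t. P).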
Personalizing
Personalizing OLAP queries
Context
- Dimension tables in main memory
- No access to the fact table
Principle
- Compute sets of positions in the resulting crosstab
- Compute the structures of the crosstabs
Example of personalization (1)
Example of personalization (2)
Example of personalization (3)
Example of personalization (4)
Example of personalization (5)
Prototype
Speedup
Recommending OLAP queries
PhD Elsa Negre (2009)
Main publications
- ACM DOLAP 2008
- DaWaK 2009
- ACM DOLAP 2009
- Int. Journal of Data Warehousing and Mining
Prototype
- Various methods for OLAP query recommendation
Context and principle
Distances
- Hamming
- Based on shortest path
Between queries
- Based on differences in dimension
- Hausdorff
Between sessions
- Based on the subsequence
- Edit distance
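Three of the distances listed above can be sketched in Python. The sketch assumes that the Hamming distance compares individual cell references (tuples with one member per dimension), that a query is a non-empty set of such references, and that a session is a sequence of queries. These modelling choices and the function names are assumptions for illustration, not the prototype's code.

```python
def hamming(ref_a, ref_b):
    """Number of dimensions on which two cell references differ."""
    if len(ref_a) != len(ref_b):
        raise ValueError("references must have the same number of dimensions")
    return sum(a != b for a, b in zip(ref_a, ref_b))


def hausdorff(query_a, query_b, dist=hamming):
    """Hausdorff distance between two non-empty sets of references."""
    def directed(xs, ys):
        # Largest distance from a reference in xs to its nearest one in ys
        return max(min(dist(x, y) for y in ys) for x in xs)
    return max(directed(query_a, query_b), directed(query_b, query_a))


def edit_distance(session_a, session_b):
    """Levenshtein distance between two sequences of queries."""
    prev = list(range(len(session_b) + 1))
    for i, qa in enumerate(session_a, start=1):
        curr = [i]
        for j, qb in enumerate(session_b, start=1):
            cost = 0 if qa == qb else 1
            curr.append(min(prev[j] + 1,          # delete qa
                            curr[j - 1] + 1,      # insert qb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Usage with references over (City, Year)
q1 = frozenset({("Tours", "2005"), ("Orleans", "2005")})
q2 = frozenset({("Tours", "2006"), ("Orleans", "2005")})
q3 = frozenset({("Tours", "2006")})
print(hausdorff(q1, q2), edit_distance([q1, q2, q3], [q1, q3]))
```

Exact equality between queries is used as the substitution test in `edit_distance` to keep the sketch short; a thresholded query distance such as `hausdorff` could be used instead.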
Experiments
Cube
- Foodmart (Mondrian sample cube)
Session generator
- Max 100 cells per MDX query
- 25-50 sessions
- 20-50 queries/session
- Log of 150-25000 queries
- 1-20 queries/current session
Efficiency
- Shortest path
- Hausdorff distance
- Edit distance
Effectiveness
10-fold cross-validation
- The query set is split into 10 equally sized subsets
For the current sessions
- Remove the last query
- Check how often this last query is recommended
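The evaluation protocol above can be sketched in Python as follows. The sketch assumes that folds are built over whole sessions, and that a recommender is any function taking the log and the current session and returning candidate queries. The names `evaluate` and `recommend` are hypothetical, and the fold construction is one possible reading of the slide.

```python
import random


def evaluate(sessions, recommend, folds=10, seed=0):
    """Fraction of held-out sessions whose removed last query is recommended.

    sessions  : list of sessions, each a list of at least two hashable queries
    recommend : hypothetical function (log, current_session) -> iterable of queries
    """
    order = list(range(len(sessions)))
    random.Random(seed).shuffle(order)
    hits = total = 0
    for k in range(folds):
        # Every folds-th shuffled index: subsets differ in size by at most one
        test_ids = set(order[k::folds])
        log = [s for i, s in enumerate(sessions) if i not in test_ids]
        for i in test_ids:
            *current, last = sessions[i]  # remove the last query
            hits += last in set(recommend(log, current))
            total += 1
    return hits / total if total else 0.0


# Usage with a toy recommender that proposes every last query seen in the log
def last_queries(log, current):
    return {session[-1] for session in log}


toy_sessions = [["a", "b", "c"]] * 20
print(evaluate(toy_sessions, last_queries))
```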
Query recommendation for discovery-driven analysis?
Processing the log
Recommending
Prototype
Java, Mondrian OLAP engine & Sarawagi's icube
Preliminary tests show that
- For a small log (a few hundred queries)
- Recommendation time does not exceed 50 ms
Conclusion: so far...
Summarizing OLAP queries
Master's thesis Julien Aligon (in progress)
Problem: viewpoints on former sessions?
- By summarizing the log
- By browsing/querying the summary
Experiments on healthcare data
Related publication
Perspectives
Project STIC-AmSud PQUERY: preference models for personalized queries
Forthcoming work with M. Golfarelli (U. Bologna)
- Preference mining to dynamically add preferences to an MDX query
Contributions to a collaborative query management system for OLAP