




Semantic collaborative web caching

  • Jean-Marc Pierson

  • Lionel Brunie, David Coquil

  • LISI, INSA de LYON

  • Jean-Marc.Pierson@insa-lyon.fr


Outline

  • Motivations and Proxies

  • Document indexing

  • Temperature of documents

  • Collaboration schema and architecture

  • Results, evaluation and discussion

  • Conclusion



Sharing information/Sharing usage

  • Information is disseminated

  • The volume of information is huge

  • How do I find my way in the jungle of the information system (IS) ?

  • Many possible solutions : search engines, agents, ontologies...

  • A solution to be explored : help from/collaboration with other users



Making users share usages

  • ... is an issue that has long been addressed : proxies



Proxies

  • Proxies allow

    • reducing the response time
    • reducing the server load
    • reducing the network load
  • Proxies can be located close to the server and/or close to users

  • Proxies can collaborate (hierarchical or "flat" collaboration)

  • Proxy management policies are based on operational (LRU/MFU-like) information



Motivations

  • Users are generally interested in a limited set of topics

  • User caches contain related documents

  • Metadata, user profiles, virtual communities, hot topics can provide proxies with semantic and contextual information about the queries they have to serve



Proposition

  • Monitor this semantic and contextual information to :

  • optimize proxy management policies and proxy communication policies

  • allow users to share usages

  • give users a personalized view of the web information space



Proposition : use collaborative proxies to :

    • improve performances (basic)
    • act as forum and mediators for helping users share usage information
  • Assumptions :

    • proxies do not share raw data but documents, which hold information that can be described by metadata (descriptors)
    • users are not isolated : they share common interests, experiences, objectives or behaviors (virtual communities)
    • information and topics of interest evolve rapidly : "hot" topics


From proxies to adaptive indexes

  • The (present + past) content of a proxy de facto provides a view over the global information system

  • This view has some real added value

  • Examples :

    • which teaching materials about Java are accessed the most ?
    • is there any news about football ?
    • which related documents did people who read this document access afterwards ?


Document indexing

  • indexing tree : an "ontology" of the web space

  • finding a suitable one is difficult !

  • a Yahoo!-like hierarchy is used



How is the indexing performed ?

  • The indexer analyzes the content of the document…

    • Title
    • Meta-tags (Content, Keywords, …)
    • Links
    • Formatting (header, bold face, outline)
  • … to extract keywords

  • Keywords are analyzed to find related concepts

  • Concepts are then mapped onto the ontology
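As an illustration only, the keyword-extraction step could be sketched as follows in Python, using the standard `html.parser` module. The tag weights, the stop-word list and the function name `extract_keywords` are assumptions made for this sketch; the slides do not say how the prototype weights the different parts of a page.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: a word counts more in the title than in bold text.
TAG_WEIGHTS = {"title": 3, "h1": 2, "h2": 2, "b": 1, "strong": 1, "a": 1}
META_WEIGHT = 2
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "about"}


class KeywordExtractor(HTMLParser):
    """Scores words found in the title, meta tags, links and emphasized text."""

    def __init__(self):
        super().__init__()
        self.scores = Counter()
        self._open = []  # stack of open tags that are listed in TAG_WEIGHTS

    def _add(self, text, weight):
        for word in text.lower().replace(",", " ").split():
            word = word.strip(".:;!?()\"'")
            if word and word not in STOPWORDS:
                self.scores[word] += weight

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            if (attrs.get("name") or "").lower() in ("keywords", "description"):
                self._add(attrs.get("content") or "", META_WEIGHT)
        elif tag in TAG_WEIGHTS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Plain body text (outside the weighted tags) is ignored in this sketch.
        if self._open:
            self._add(data, TAG_WEIGHTS[self._open[-1]])


def extract_keywords(html, top=5):
    """Return the `top` best-scored words of an HTML page."""
    parser = KeywordExtractor()
    parser.feed(html)
    parser.close()
    return [word for word, _ in parser.scores.most_common(top)]


page = ("<html><head><title>Java tutorial</title>"
        "<meta name='keywords' content='java, programming'></head>"
        "<body><h1>Learning Java</h1><p>Plain text about coffee.</p>"
        "<a href='x.html'>Java course</a></body></html>")
print(extract_keywords(page))
```

The extracted keywords would then be looked up in the indexing tree; that lookup is not shown here because it depends on the ontology actually chosen.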



Weighted indexing tree

  • Edges between concepts (ancestors and children) are weighted

  • The weight reflects the probability that, once a document under the parent node has been requested, the next request is for a document under the child node.

  • It is the “correlation” (in terms of access patterns) between the target node and its siblings
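One possible way to estimate such weights from a request log is sketched below. It assumes that each request has already been labelled with the concept path of the requested document (a tuple such as `("Sports", "Football")`), and that a weight is simply a relative frequency over consecutive requests. Both the representation and the estimator are assumptions of this sketch, not the method described in the slides.

```python
from collections import Counter


def edge_weights(requests):
    """Estimate parent -> child edge weights from consecutive requests.

    `requests` is a list of concept paths (tuples). The weight of an edge is
    the fraction of requests under the parent that were immediately followed
    by a request under the child.
    """
    followed = Counter()  # (parent, child) -> number of such transitions
    seen = Counter()      # parent -> number of requests that had a successor
    for prev, nxt in zip(requests, requests[1:]):
        for depth in range(len(prev) + 1):  # depth 0 is the root, ()
            parent = prev[:depth]
            seen[parent] += 1
            if len(nxt) > depth and nxt[:depth] == parent:
                followed[(parent, nxt[:depth + 1])] += 1
    return {edge: n / seen[edge[0]] for edge, n in followed.items()}


log = [("Sports", "Football"), ("Sports", "Football"), ("Sports", "Tennis"),
       ("Computers", "Java"), ("Sports", "Football")]
weights = edge_weights(log)
print(weights[((), ("Sports",))])  # weight of the edge root -> Sports
```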



Weighted tree



Notion of Temperature

  • documents are assigned a temperature related to their « hotness » : the more a document is accessed, the higher its temperature

  • the cache replacement policy uses the temperature of documents : the coolest documents are evicted from the cache first; prefetching targets the hottest documents
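A minimal sketch of a temperature-driven cache, assuming temperatures are already known and that capacity is counted in documents rather than bytes. The class name `TemperatureCache` and its methods are hypothetical and only illustrate the policy stated above.

```python
class TemperatureCache:
    """Cache that evicts the coolest document when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.temperature = {}  # url -> current temperature of cached documents

    def put(self, url, temperature):
        """Cache `url`; return the evicted url, or None if nothing was evicted."""
        evicted = None
        if url not in self.temperature and len(self.temperature) >= self.capacity:
            evicted = min(self.temperature, key=self.temperature.get)
            del self.temperature[evicted]
        self.temperature[url] = temperature
        return evicted

    def prefetch_candidates(self, known_temperatures, n=1):
        """Hottest documents that are known but not cached yet."""
        missing = {u: t for u, t in known_temperatures.items()
                   if u not in self.temperature}
        return sorted(missing, key=missing.get, reverse=True)[:n]


cache = TemperatureCache(capacity=2)
cache.put("a.html", 5.0)
cache.put("b.html", 1.0)
print(cache.put("c.html", 3.0))  # the coolest entry, b.html, is evicted
```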



Temperature

  • Represents the probability for a document to be accessed in the near future

  • It combines the number of requests for a document in the last time interval with the semantic links represented by the data structure.

  • A temperature value is also associated with the internal nodes of the data structure.



Temperature computation

  • Temperature computation occurs at regular request intervals

  • The number of accesses to each document between two consecutive computations is stored in an access table.

    • if a document has been accessed since the last temperature computation, its temperature increases by the corresponding value in the table, and this value is stored in a stack for future cooling
    • otherwise, its temperature decreases
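The slides do not say by how much a document cools down. The sketch below assumes one plausible reading: the stack keeps the past increments, and a document that was not accessed gives back its most recent increment. The class name `Thermometer` and this cooling rule are assumptions.

```python
from collections import defaultdict


class Thermometer:
    """Per-document temperatures updated from an access table."""

    def __init__(self):
        self.temperature = defaultdict(float)
        self.access_table = defaultdict(int)  # accesses since last computation
        self.heat_stack = defaultdict(list)   # past increments, used to cool

    def record_access(self, doc):
        self.access_table[doc] += 1

    def compute(self):
        """Run one temperature computation; return each document's variation."""
        deltas = {}
        for doc in set(self.temperature) | set(self.access_table):
            hits = self.access_table.get(doc, 0)
            if hits > 0:
                delta = float(hits)
                self.heat_stack[doc].append(delta)  # remembered for cooling
            elif self.heat_stack[doc]:
                delta = -self.heat_stack[doc].pop()  # assumed cooling rule
            else:
                delta = 0.0
            self.temperature[doc] += delta
            deltas[doc] = delta
        self.access_table.clear()
        return deltas


t = Thermometer()
for doc in ["a", "a", "a", "b"]:
    t.record_access(doc)
print(t.compute())  # both documents heat up
print(t.compute())  # no access since the last computation: both cool down
```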


The temperature variation () for each document is diffused along the edges of the data structure.

  • The temperature variation () for each document is diffused along the edges of the data structure.

  • More precisely, for each (document, concept) couple where there exists an edge of weight W between document and concept, the temperature of concept increases or decreases by W * 

  • The concept temperature variation may be further diffused to its parent node (with a given threshold).
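The upward diffusion can be sketched as below, where `delta` stands for the temperature variation of a document. Two points are assumptions of this sketch: the variation is multiplied again by the edge weight at each level, and the threshold is read as "stop when the remaining variation becomes smaller than the threshold". All names are hypothetical.

```python
def diffuse_up(doc, delta, doc_edges, parent_edge, temperature, threshold=1.0):
    """Diffuse a document's temperature variation to its concepts and above.

    doc_edges:   document -> {concept: weight W}
    parent_edge: concept  -> (parent concept, weight)
    temperature: concept  -> temperature, updated in place
    """
    for concept, weight in doc_edges.get(doc, {}).items():
        node, variation = concept, weight * delta  # W * delta
        while True:
            temperature[node] = temperature.get(node, 0.0) + variation
            if node not in parent_edge:
                break  # reached the top of the tree
            parent, up_weight = parent_edge[node]
            variation *= up_weight
            if abs(variation) < threshold:
                break  # too small to be diffused further
            node = parent
    return temperature


doc_edges = {"doc1": {"Football": 0.8}}
parent_edge = {"Football": ("Sports", 0.5), "Sports": ("Root", 0.5)}
print(diffuse_up("doc1", 4.0, doc_edges, parent_edge, {}, threshold=1.0))
```

With these values the variation reaches "Football" and "Sports" but falls below the threshold before reaching "Root".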





Temperature back-propagation down the data structure

  • Temperature is diffused from concepts down to documents

  • each document under a concept whose temperature has changed has its own temperature adjusted accordingly

  • even « non-accessed » documents might see their temperature increase
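A minimal sketch of the downward step, assuming that the variation of a concept is passed to each of its documents scaled by the edge weight, and that the document which caused the variation is skipped so that it is not heated twice. Both choices, and the name `diffuse_down`, are assumptions.

```python
def diffuse_down(concept, variation, concept_docs, temperature, source=None):
    """Pass a concept's temperature variation down to its documents.

    concept_docs: concept -> {document: edge weight}
    temperature:  document -> temperature, updated in place
    source:       document that triggered the variation (left untouched)
    """
    for doc, weight in concept_docs.get(concept, {}).items():
        if doc != source:
            temperature[doc] = temperature.get(doc, 0.0) + weight * variation
    return temperature


concept_docs = {"Football": {"doc1": 0.8, "doc2": 0.5}}
temperature = {"doc1": 4.0}
# doc2 was never accessed, yet it warms up because doc1 heated "Football".
print(diffuse_down("Football", 3.2, concept_docs, temperature, source="doc1"))
```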





Document – Concept link (clarification)

  • When a document is related to two concepts, we duplicate its node and link each copy to one of the two concepts.

  • Otherwise, with a single node, a temperature variation coming from one concept would bounce through the document and propagate to unrelated documents under the other concept
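The duplication can be illustrated by keeping one temperature slot per (document, concept) pair, as in the sketch below. The function names are hypothetical, and how the copies of a document are combined when the cache replacement policy needs a single temperature is not specified in the slides, so it is left out.

```python
def build_nodes(doc_concepts):
    """Create one node, with its own temperature, per (document, concept) pair.

    doc_concepts: url -> list of related concepts
    """
    return {(url, concept): 0.0
            for url, concepts in doc_concepts.items()
            for concept in concepts}


def heat_concept(concept, variation, node_temperature):
    """Apply a variation to the document copies attached to `concept` only."""
    for url, linked_concept in node_temperature:
        if linked_concept == concept:
            node_temperature[(url, linked_concept)] += variation


nodes = build_nodes({"news.html": ["Football", "Politics"],
                     "vote.html": ["Politics"]})
heat_concept("Football", 2.0, nodes)
# Only the copy of news.html under "Football" changes; nothing rebounds
# towards "Politics" and vote.html.
print(nodes)
```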







Browser cache vs user proxy

  • Browser "local caches" are basic and cannot communicate

  • Implementing true communicating proxies at the browser/user level allows :

    • reducing the intermediate proxy load
    • optimizing the network traffic
    • reducing the response time
    • managing the user profile
    • counting document hits
    • customizing semantic and contextual information


From proxies to virtual communities

  • User profile : topics of interest

  • Virtual community = users with similar profile

  • Virtual communities could be used for :

    • monitoring the document usage
    • associating proxies with specific communities
    • providing users with pertinent information about the content of proxy caches
    • monitoring the evolution of the topics of interest
    • sharing experiences and optimizing queries
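To make "users with similar profile" concrete, the sketch below represents a profile as a mapping from topic to interest level and groups users greedily by cosine similarity. The similarity measure, the threshold value and the greedy grouping are all assumptions of this sketch; the slides do not specify how communities are formed.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic -> interest profiles."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0


def communities(profiles, threshold=0.7):
    """Greedy grouping: a user joins the first community whose first
    member has a similar enough profile, else starts a new community."""
    groups = []
    for user, profile in profiles.items():
        for group in groups:
            if cosine(profile, profiles[group[0]]) >= threshold:
                group.append(user)
                break
        else:
            groups.append([user])
    return groups


profiles = {
    "alice": {"java": 3, "football": 1},
    "bob": {"java": 2, "football": 1},
    "carol": {"cooking": 5},
}
print(communities(profiles))
```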


Collaboration and communities

  • Subscription : manual and static at first, evolving towards dynamic and automatic

  • Relationships between the user proxy and the aggregate proxies in charge of the community :

  • The proxy organization must reflect the community structure and usages



Prototype

  • Java

  • Indexing tree limited to 2 or 3 levels of the Yahoo! hierarchy

  • Matching done only with keywords (whether or not they appear in the indexing tree) and not with concepts

  • Interfaced with ThoughtTreasure (a French-English WordNet) for keywords not in the indexing tree



Evaluation

  • the notion of temperature has already proved effective for caching video archives (hit rate)

  • in small-scale experiments, the proxy-web architecture proved robust

  • indexing works well (more than 90% of documents indexed)

  • testing the behavior is difficult because it requires handling the actual content of web pages



Conclusion

  • Enhancing the integration of distributed information systems or servers into a global service by means of collaborative proxies

  • Management and collaboration based on semantic and contextual information → temperature

  • Performance improvement

  • Virtual communities

  • Attachment of a proxy to each user



Future work

  • test the prototype on a large scale : design a test platform !

  • push the intermediate cache management to the heart of the networks (active router)

  • enhance the indexing algorithm

  • apply the technology to Grid computing (cache management)


