Many possible solutions : search engines, agents, ontologies...
A solution to be explored : help from/collaboration with other users
Making users share usages
... is an issue that has long been addressed : proxies
Proxies
Proxies allow
reducing the response time
reducing the server load
reducing the network load
Proxies can be located close to the server and/or close to users
Proxies can collaborate (hierarchical or "flat" collaboration)
Proxy management policies are based on operational (LRU/MFU-like) information
Motivations
Users generally have specific centers of interest
User caches contain related documents
Metadata, user profiles, virtual communities, hot topics can provide proxies with semantic and contextual information about the queries they have to serve
Proposition
monitoring this semantic and contextual information to :
optimize proxy management policies and proxy communication policies
allow users to share usages
give users a personalized view of the web information space
Proposition : use collaborative proxies to :
improve performance (basic)
act as forum and mediators for helping users share usage information
Assumptions :
proxies do not share raw data but documents whose content can be described by metadata (descriptors)
users are not isolated : they share common interests, experiences, objectives or behaviors (virtual communities)
information and topics of interest evolve rapidly : "hot" topics
Edges between concepts (ancestors and children) are weighted
The weight relates to the probability that, right after a request for a document under the parent node, the next request is for a document under the child node.
It is the “correlation” (in terms of access patterns) between the target node and its siblings
Weighted tree
Notion of Temperature
documents are assigned a temperature related to their « hotness » : the more a document is accessed, the higher its temperature
cache replacement policy uses the temperature of documents : the coolest documents are evicted from the cache first; prefetching uses the hottest documents
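As an illustration, the replacement and prefetching rules could be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the use of a set for the cache, and the default temperature of 0.0 for unknown documents are all assumptions.

```python
def evict_coolest(cache, temperature, capacity):
    """Evict the coolest documents until the cache holds at most `capacity` items.

    cache: set of document ids (hypothetical representation)
    temperature: dict mapping document id -> temperature
    Returns the list of evicted documents, coolest first.
    """
    evicted = []
    while len(cache) > capacity:
        # Assumption: a document with no known temperature counts as 0.0.
        coolest = min(cache, key=lambda doc: temperature.get(doc, 0.0))
        cache.remove(coolest)
        evicted.append(coolest)
    return evicted


def prefetch_candidates(temperature, cache, k):
    """Return the k hottest documents that are not yet in the cache."""
    not_cached = [doc for doc in temperature if doc not in cache]
    return sorted(not_cached, key=lambda doc: temperature[doc], reverse=True)[:k]
```

For example, with a cache holding three documents and room for two, only the document with the lowest temperature is removed, and the hottest uncached document is proposed for prefetching.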
Temperature
Represents the probability for a document to be accessed in the near future
It is the synthesis between the number of requests for a document in the last time interval and the semantic links represented by the data structure.
A temperature value is also associated with the internal nodes of the data structure.
Temperature computation
Temperature computation occurs at regular intervals, measured in number of requests
The number of accesses to each document between two consecutive computations is stored in an access table.
if a document has been accessed since the last temperature computation, its temperature increases by the corresponding value in the table, and this value is stored in a stack for future cooling
otherwise, it decreases
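The computation step above can be sketched as follows. The slides do not say by how much a non-accessed document cools, so this sketch assumes that cooling pops the most recently stored increment from the document's stack and subtracts it; the class and method names are hypothetical.

```python
from collections import defaultdict


class TemperatureTable:
    """Hypothetical bookkeeping for per-document temperatures."""

    def __init__(self):
        self.temperature = defaultdict(float)  # document -> current temperature
        self.access_table = defaultdict(int)   # document -> accesses since last computation
        self.heat_stack = defaultdict(list)    # document -> past increments, kept for cooling

    def record_access(self, doc):
        self.access_table[doc] += 1

    def recompute(self):
        """Run one temperature computation and return each document's variation."""
        delta = {}
        known = set(self.temperature) | set(self.access_table)
        for doc in known:
            count = self.access_table.get(doc, 0)
            if count > 0:
                # Accessed: heat up by the access count and remember the increment.
                self.heat_stack[doc].append(count)
                delta[doc] = float(count)
            elif self.heat_stack[doc]:
                # ASSUMPTION: cooling removes the most recently stored increment.
                delta[doc] = -float(self.heat_stack[doc].pop())
            else:
                delta[doc] = 0.0
            self.temperature[doc] += delta[doc]
        self.access_table.clear()
        return delta
```

The returned variations are what would then be diffused along the edges of the data structure.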
The temperature variation (ΔT) for each document is diffused along the edges of the data structure.
More precisely, for each (document, concept) pair joined by an edge of weight W, the temperature of the concept increases or decreases by W * ΔT
The concept temperature variation may be further diffused to its parent node (with a given threshold).
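A minimal sketch of this upward diffusion is given below. It assumes that each document has a single parent concept (see the node duplication rule), that each concept has at most one parent, and that a variation is passed on to the parent only while its magnitude is at least `threshold`. The exact role of the threshold is not specified in the slides, and all names are hypothetical.

```python
def diffuse_up(delta, doc_edges, parent_edge, concept_temp, threshold):
    """Diffuse document temperature variations up the weighted tree.

    delta: dict document -> temperature variation
    doc_edges: dict document -> (concept, weight)
    parent_edge: dict concept -> (parent concept, weight)
    concept_temp: dict concept -> temperature, updated in place
    """
    for doc, variation in delta.items():
        if doc not in doc_edges:
            continue
        node, weight = doc_edges[doc]
        variation = weight * variation  # the concept changes by W * variation
        while True:
            concept_temp[node] = concept_temp.get(node, 0.0) + variation
            # ASSUMPTION: stop at the root, or once the variation is too small.
            if node not in parent_edge or abs(variation) < threshold:
                break
            node, weight = parent_edge[node]
            variation = weight * variation
    return concept_temp
```

With a document variation of 4.0 and two edges of weight 0.5, the concept gains 2.0 and its parent gains 1.0, provided the threshold is not above 2.0.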
Temperature back-propagation down the data structure
Temperature is diffused from concepts down to documents
each document under a concept that has seen its temperature modified sees its temperature modified
even « non-accessed » documents might see their temperature increase
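The downward step might look like the sketch below. The slides do not state how large the change applied to each document is; this sketch assumes it is the concept's variation scaled by the weight of the concept-to-document edge. Function and parameter names are hypothetical.

```python
def diffuse_down(concept_delta, children, doc_temp):
    """Push concept temperature variations down to the documents below them.

    concept_delta: dict concept -> temperature variation
    children: dict concept -> list of (document, edge weight)
    doc_temp: dict document -> temperature, updated in place
    """
    for concept, variation in concept_delta.items():
        for doc, weight in children.get(concept, []):
            # ASSUMPTION: each document changes by weight * concept variation,
            # whether or not it was accessed itself.
            doc_temp[doc] = doc_temp.get(doc, 0.0) + weight * variation
    return doc_temp
```

In this sketch a document that was never requested still warms up when a sibling under the same concept is accessed, which is the effect described above.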
Document – Concept link (precision)
When a document is related to two concepts, we duplicate its node and link the two created nodes to the two related concepts.
Otherwise, with a single node, a temperature variation would propagate between unrelated documents (by rebound through the shared node)
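The duplication rule can be illustrated with a short sketch that creates one node per (document, concept) link, so that every node has exactly one parent concept. The `doc@concept` naming scheme and the function name are assumptions made for this example only.

```python
def duplicate_document_nodes(doc_concepts):
    """Create one tree node per (document, concept) link.

    doc_concepts: dict document -> list of related concepts
    Returns a dict node id -> (document, concept); ids are hypothetical.
    """
    nodes = {}
    for doc, concepts in doc_concepts.items():
        for concept in concepts:
            nodes[f"{doc}@{concept}"] = (doc, concept)
    return nodes
```

A document related to two concepts yields two independent nodes, so heat received from one concept does not rebound into the other.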
Navigator cache vs user proxy
Navigator "local caches" are basic and cannot communicate
Implementing true communicating proxies at the navigator/user level allows :