The Intelligent Systems Laboratory Amsterdam
The Intelligent Systems Lab Amsterdam (ISLA) now proudly presents three research groups. They are the Intelligent Sensory Information Systems group (ISIS), the Intelligent Autonomous Systems group (IAS), and the Information and Language Processing Systems group (ILPS). The three groups share an interest in processing pictorial, auditory and/or textual information, the content of such information as well as the consequences for actions and knowledge. The topics are covered from theory to practice, from basic principles to implementations of applications.
The prime scientific target of ISIS is to create access to the content of digital images. The aim is to bridge the semantic gap between pictorial data and the interpretation of the data by invariant vision and learning from (very) large image databases. Video, color and tracking are important topics as well as parallel image processing. Applications are, among others, in structured and unstructured video, color image analysis, content-based image retrieval, and picture search engines.
The IAS-Group studies methodologies to create intelligent autonomous systems, which perceive their environment through sensors. The information is used to act and generate intelligent, goal-directed behavior. This work includes formalization, generalization and learning in autonomous systems. Collaborating multi-agent systems create groups of autonomously operating systems working together to realize a task. In the group the theory of geometric algebra is also studied in depth as a means to formalize the addressing of space in a computer.
Research within the ILPS group is aimed at developing and studying the computational, statistical, and linguistic underpinnings of effective ways of providing intelligent information access, especially to massive amounts of information. Their leading methodology is to identify real-world scenarios that give rise to interesting research challenges. ILPS takes part in, and organizes, a number of world-wide evaluations in information retrieval and language processing. This allows ILPS to subject its software infrastructure to qualitative assessments.
1. Intelligent Sensory Information Systems
General information
Contact person : Prof Dr ir A.W.M. Smeulders
Telephone : +31 - 20 - 525 7460
URL : http://www.science.uva.nl/research/isis
Fax : +31 - 20 - 525 7490
E-mail : smeulders@science.uva.nl
Position within the Organisation
This program is carried out within the group Intelligent Sensory Information Systems at the Informatics Institute of the Faculty of Science. The Intelligent Sensory Information Systems group participates in the Dutch Graduate School 'ASCI'.
Characterisation
The prime scientific target is to create access to the content of digital images and to learn from multimedia data repositories in general. New topics of research are picture-language combined information, tracking objects in video, learning object recognizers and image space delivery. We aim to bridge the semantic gap between pictorial data and the interpretation of the data. We do so both from image-data-driven interpretation by computer vision and from knowledge about the image: designing algorithms for image analysis, experimenting on visual data, and learning from (very) large image data repositories. The research ranges from the theory of computer vision to innovative applications in multimedia. Implementations are in digital document structure analysis, video analysis, color image analysis and picture search engines. Current application areas are industrial vision, multimedia document analysis, and biological image processing.
Key Words
Computer vision, image processing, multimedia information analysis, pattern recognition, multimedia databases, e-documents, image/video retrieval, performance evaluation of vision, mathematical morphology, color vision, shape, content-based image retrieval, query optimization, parallel processing, extensible databases.
Main themes
Content-based access of multi-media data
The goal of multimedia information access is to provide efficient content-based methods for indexing and retrieving multimodal information (images, video, sound, music). The success of a content-based multimedia retrieval engine depends on the generality, expressiveness and robustness of the multimedia features expressing the similarity between multimedia documents. We aim at new storage and retrieval technology to make large scale image/video assets manageable, searchable, targetable and reusable.
Colour in computer vision
Colour computer vision aims at defining a colour theory for vision and image sequences and at developing models for colour image processing methods. Colour vision is a very powerful component in image databases, image sequences as well as the processing of real world scenes. The purpose is to formulate computational methods, data structures and colour space representations to design proper colour image processing algorithms and to develop sets of colour invariant image features.
Theory of computer vision
Invariance in computer vision is a recurrent theme in the design of methods for shape and form analysis, which identify equivalence classes with respect to a particular shape descriptor. That is, the tools are designed to be invariant under the appearance changes that leave the interpretation constant. Shape and form analysis is based on linear theory, encompassing linear scale-space theory and the differential structure of images, as well as on mathematical morphology based on complete lattice theory, with a strong emphasis on the multi-local geometrical interpretation. To design long-term tracking of objects, we study instant learning of appearances.
Segmentation, learning and tracking
Learning in computer vision is taken on as a new topic, driven by the fact that so many and so large a variety of digital images are available these days. The aim is to learn rather than model object class appearances, and to do this interactively by active annotation.
Spatial and extensible databases
Spatial and pictorial databases are being researched for the purpose of providing consistency to spatial reasoning in an attempt to enhance spatial data-structure design. The formalization of spatial queries and of spatial data is pursued to guarantee consistency on a pictorial content-level in spatial databases. The purpose is to advance the insight into spatial and multimedia databases as they differ from standard databases. The long-term goal is integrating computer vision tools, OO-databases, and image query tools into complete image information systems.
Software and systems
The aim of vision systems is to define a framework to support software development. The aim is to get to a framework consisting of a number of software modules, workbenches for performance analysis and environments with databases, including appropriate interfaces. Performance evaluation serves to establish precisely what a computational method can or cannot achieve. In the computer vision community these engineering aspects have only recently been taken into consideration. We design meta-systems to enable easy performance evaluation.
2004 Results
Content-based access of multi-media data
Content-based access to large stacks of images has been a topic of study for some time now. Based on the invariant colour features, to be described below, successful methods have been designed to achieve image retrieval robust against variations in illumination, viewpoint and occlusion.
Large sets of documents are analysed for their page layout characteristics, the reading order, and their type of genre (e.g. scientific/news/commercial papers). These methods were successfully applied to classify office documents and scientific papers.
For the retrieval of videos, we aim to make multimedia archives as accessible as their textual counterpart. To that end, our research efforts concentrate on automatic semantic indexing and interactive retrieval of multimedia sources. To assess the merit of our efforts against high international standards, all research is evaluated within the TRECVID benchmark for multimedia retrieval.
An important asset in our endeavor for semantic access is indexing of semantic concepts. An architecture for semantic indexing has been developed that facilitates generic indexing based on integrated analysis from multi-media sources on content, style, and context level. Experiments with a lexicon of 32 concepts on the 2004 TRECVID benchmark indicate that our approach is state-of-the-art.
Despite the good results for automatic indexing, the current lexicon is still too limited for daily practice retrieval. Therefore, we view user interaction as an essential part of multimedia retrieval systems. To this end, we have developed a novel paradigm for interactive retrieval which uses a lexicon of detectable concepts in combination with keyword and visual example search to boost interactive retrieval. Implementation of the paradigm within a video search engine, see Figure 1, yielded the highest score for the interactive retrieval task of TRECVID 2004.
Figure 1: Search engine for video retrieval, showing results for vehicles and cars.
We will continue this line of research with a further integration of multi-media sources, a further understanding of the structure of video documents and interactive access to the semantic content of large collections of multi-media information.
Colour in computer vision
Colour is an important cue in image analysis. Processing colour images requires special attention, but it also offers new possibilities to see detail where grey-value methods cannot.
In previous years, local color, geometric and spatial frequency invariants were developed at ISIS. These local features aim at the robust measurement of the color, shape and texture of an object, under the most common accidental viewing conditions. These accidental conditions can be largely characterized by the direction of view, the incident light, the color of the light and other accidents such as the presence of a foreground or background. An invariant representation of the object implies that the various conditions under which the same view of the object may be perceived do not have to be learned. To test the discriminative power of color, geometric and spatial frequency invariants, the ALOI collection was recorded, containing color images of 1000 real-world objects recorded under various illumination directions (Figure 2), and different illumination colors (Figure 3).
The invariants were shown to increase discriminative power for object recognition when compared to the visual measurements of which the invariants were composed. More specifically, the illumination intensity invariant W and the color constant shadow and shading invariant N prove to be discriminative when recognizing objects from a data set containing much photometric variation. However, the invariants are shown to be only marginally color constant. The invariant sets outperform SIFT-features extracted from distinctive keypoints when the object is recorded under various illumination directions or when the object is rotated in 3 dimensions. Combining local invariant features and the incorporation of color are known to be nontrivial problems in computer vision, for which we have provided solutions.
From the general theory, methods are derived for industrial colour vision, analysis of colour in documents, colour in microscopy, and general images as they appear on the World Wide Web. The topic of colour has led to many publications in journals (International Journal of Computer Vision and IEEE Trans. on Pattern Analysis and Machine Intelligence). We aim at expanding the analysis to take on the same invariant representation of texture next, as well as the semantic meaning of colour.
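To make the idea of a photometric invariant concrete, here is a minimal sketch. It does not implement the W or N invariants named above; it uses the simpler, well-known normalized-rgb chromaticity, which is unchanged when the illumination intensity scales all three channels by the same factor. The function name and the sample pixel values are illustrative assumptions.

```python
import numpy as np

def normalized_rgb(image, eps=1e-12):
    """Divide each pixel by its channel sum so that a uniform intensity scaling cancels out."""
    image = np.asarray(image, dtype=float)
    total = image.sum(axis=-1, keepdims=True)
    return image / np.maximum(total, eps)

# One made-up pixel, and the same pixel under 40% of the illumination intensity.
pixel = np.array([[[120.0, 60.0, 20.0]]])
dimmed = 0.4 * pixel

# Both versions map to the same chromaticity.
print(np.allclose(normalized_rgb(pixel), normalized_rgb(dimmed)))
```

Because the raw values differ while the normalized values agree, a recognizer working on the normalized representation does not need to learn each intensity level separately.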
Example objects from the ALOI dataset.
Example object from ALOI under different illumination directions.
Theory of computer vision
Mathematical morphology is an important paradigm for low-level image processing. Our group has a long tradition on the topic at a theoretical as well as a practical level, with applications in industrial and machine vision. Research has been carried out on bringing two basic theories of low-level vision (i.e. linear theory and morphological theory) on par. To that end, the slope transform has been introduced as the morphological equivalent of the Fourier transform. Morphological scale-space has been founded on the same principles as linear scale-space. For practical mathematical morphology, we aim at developing efficient algorithms. This implies efficient decomposition schemes for structuring elements as well as the design of C++ patterns for morphological image processing.
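As a small illustration of why decomposing structuring elements pays off, the sketch below checks that a flat grey-scale dilation with a 3x3 square gives the same result as a dilation with a 1x3 row followed by a 3x1 column. This is a textbook decomposition, not the group's own scheme, and the helper `dilate` is a hypothetical name written for this example (it assumes odd-sized footprints).

```python
import numpy as np

def dilate(image, footprint):
    """Flat grey-scale dilation: maximum over the footprint, padding the border with -inf."""
    image = np.asarray(image, dtype=float)
    fh, fw = footprint.shape
    ph, pw = fh // 2, fw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), constant_values=-np.inf)
    out = np.full(image.shape, -np.inf)
    for dy in range(fh):
        for dx in range(fw):
            if footprint[dy, dx]:
                window = padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
                out = np.maximum(out, window)
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(6, 7)).astype(float)

square = np.ones((3, 3), dtype=bool)
row = np.ones((1, 3), dtype=bool)
col = np.ones((3, 1), dtype=bool)

direct = dilate(image, square)                 # 9 footprint positions per pixel
decomposed = dilate(dilate(image, row), col)   # 3 + 3 footprint positions per pixel

print(np.array_equal(direct, decomposed))
```

The saving grows with the size of the structuring element: an n x n square costs n*n positions per pixel when applied directly, but only 2n when decomposed.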
One of the most fundamental tasks in computer vision is edge and line detection in images. The difficulty of edge and line detection is emphasized when the structures run close together or cross each other, as is the case in engineering drawings or two-dimensional projections of complex (3D) scenes. In these cases, one would often like to have a detection method which takes advantage of the anisotropic nature of lines and edges. We have shown the decomposition of the anisotropic Gaussian filter into two Gaussian line filters in non-orthogonal directions. The anisotropic Gaussian filtering method allows fast calculation of edge and ridge maps, with high spatial and angular accuracy.
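The report does not give the decomposition formulas, so the following is a sketch under stated assumptions. It relies only on the fact that convolving Gaussians adds their covariance matrices: an anisotropic Gaussian with standard deviations sigma_u and sigma_v, oriented at angle theta, is matched by a 1D Gaussian along the x-axis followed by a 1D Gaussian along a line at angle phi. The function name and the parameter values are illustrative.

```python
import numpy as np

def decompose_anisotropic_gaussian(sigma_u, sigma_v, theta):
    """Return (sigma_x, phi, sigma_phi): an x-axis line filter plus a line filter along phi."""
    c, s = np.cos(theta), np.sin(theta)
    # Entries of the covariance matrix of the rotated anisotropic Gaussian.
    a12 = (sigma_u**2 - sigma_v**2) * c * s
    a22 = sigma_u**2 * s**2 + sigma_v**2 * c**2
    sigma_x = sigma_u * sigma_v / np.sqrt(a22)
    phi = np.arctan2(a22, a12)              # lies in (0, pi) because a22 > 0
    sigma_phi = np.sqrt(a22) / np.sin(phi)
    return sigma_x, phi, sigma_phi

sigma_u, sigma_v, theta = 5.0, 1.5, np.deg2rad(30)
sigma_x, phi, sigma_phi = decompose_anisotropic_gaussian(sigma_u, sigma_v, theta)

# Covariance of the original 2D filter.
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
target = rotation @ np.diag([sigma_u**2, sigma_v**2]) @ rotation.T

# Sum of the covariances of the two line filters.
e_x = np.array([1.0, 0.0])
d = np.array([np.cos(phi), np.sin(phi)])
rebuilt = sigma_x**2 * np.outer(e_x, e_x) + sigma_phi**2 * np.outer(d, d)

print(np.allclose(target, rebuilt))
```

The two filter directions (the x-axis and the line at angle phi) are in general not orthogonal, which is what distinguishes this from the usual separable isotropic case.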
Segmentation and Learning
Image segmentation is the task of delineating the image of an object from the real world in the digital data array. It is one of the fundamental difficulties of computer vision, easily surpassed by man's superior capabilities. The difficulty resides in the fact that even man can rarely give a formal pictorial description of why a boundary is positioned at a certain location, and that is what a computer needs to perform the job. We contribute a variety of solutions.
The necklace approach offers a solution to inhomogeneous boundaries as seen in three-dimensional images of the spine. A boundary will be inhomogeneous when there are neighboring, touching or overlapping objects, or when the boundary is out of sight due to noise or occlusion. In our example, individual vertebrae are delineated using an a priori geometrical model to be deformed for each vertebra.
A string is a variational deformable model that is learned from a collection of example objects rather than built from a priori analytical or geometrical knowledge. As opposed to existing approaches, an object boundary is represented by a one-dimensional multivariate curve in functional space, a feature function, rather than by a point in vector space. Strings have been compared with active shape models on 145 vertebra images, showing that strings produce better results when initialized close to the target boundary, and comparable results otherwise.
The watersnake approach has established a connection between the well-known watershed segmentation from mathematical morphology and energy-based segmentation methods. While the original watershed algorithm does not allow incorporation of a priori information regarding object shape, we succeeded in doing so by formalizing and representing the watershed segmentation as a minimization problem. In particular, imposing contour smoothness on the segmentation results solves the problem of noisy boundaries that is often encountered with the original watershed.
The topic of image segmentation has led to two articles in IEEE Trans. on Pattern Analysis and Machine Intelligence (IEEE PAMI). This is the most prominent journal in the area of pattern recognition and is among the top three journals in computer science in the world.
Spatial and extensible databases
Spatial and extensible databases were developed in the MAGNUM project, which ended in 1999. Amongst the results obtained, the Monet database kernel and its modules for image and geo-spatial reasoning stand out. Research in the area of database kernels was focused on consolidation of the results obtained in recent years in journal papers. The activities were realized in close co-operation with the CWI-database group. In the area of database kernels, an innovative experimental analysis uncovered the lack of performance improvement in database technology over the last decade. The underlying reason is the relative progress in CPU and RAM technology, which creates an increasing performance bottleneck. As a result, traditional database solutions use less than a few percent of the available resources. This observation has led to novel techniques to measure the resource waste and new database algorithms to avoid resource stalls.
The topic of spatial databases resulted in papers in several conferences including the VLDB conference.
Software and systems
The year 2004 has been a period of change for software engineering: from the large, complex and abstract Horus system, which is now completed, to smaller target systems that each solve one computer vision task at a time, but completely.
A separate seed topic is the study of parallelism in multimedia processing tasks. The purpose of the research is to anticipate future generation computer systems while constructing a parallel processing library compatible with Horus. A Ph.D. thesis on this topic has been completed by F.J. Seinstra entitled "User Transparent Parallel Image Processing". The research has led to publications in a.o. "IEEE Transactions on Parallel and Distributed Systems", "Parallel Computing", and "Concurrency and Computation: Practice and Experience". In addition, the developed parallel software has been applied in the 2004 TRECVID competition, in which the expected sequential processing time of over 250 days was reduced to less than 60 hours. This reduction was obtained without any hand-parallelization, and played an important role in the realization of our top-ranking TRECVID results.
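The reported TRECVID timings imply a speedup that is easy to work out. The short calculation below uses only the two figures quoted above. Because the sequential estimate is "over" 250 days and the parallel run took "less than" 60 hours, the result is a lower bound.

```python
# Figures taken from the text: over 250 days sequentially, under 60 hours in parallel.
sequential_hours = 250 * 24
parallel_hours = 60

# Lower bound on the overall speedup factor.
speedup_lower_bound = sequential_hours / parallel_hours
print(speedup_lower_bound)  # 100.0
```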
Image processing as a design process
The research topic of developing image processing tools as a design process was concluded as a separate research topic. One sub-topic was the design of methods for evaluating the performance in terms of robustness of image processing methods. Formalization of the design of image processing tools results in the use of self-reliant detectors, each carefully documented on its operational domain, describing the complete picture set for which the detector will operate well. The method has been applied to engineering drawings as well as seeds. The topic of the design process has led to a paper in IEEE Trans. on PAMI.
2005 and beyond
With first priority we will increase the effectiveness of our solutions in image retrieval and image search engines by expanding on our experience with learning from image databases.
Computational efficiency of image search engines will be increased by the joint development with spatial and extensible databases. This is important as it will open up domains of hundreds of thousands of images, a significant step towards data-mining the content.
At the same time, we aim to expand to create full access to multimedia documents. The integration of information from text and pictures is a very interesting topic both scientifically, as it reveals a lot about the nature of information, as well as practically as multimedia documents will be ubiquitous as is the need for their access. The MultimediaN project provides the opportunity to reach this goal with the intended delivery of a large-scale experimentation platform for multimedia information analysis.
Concerning color research, we aim at the extraction of invariants from interesting regions of the image. Regions improve on the robustness when compared to strictly local interest point based object recognition. For instance, from color distributions we may derive appearance properties. Further, we aim at exploiting distributions of color edges to derive texture properties. We will concentrate on regions that are interesting from an information theoretical point of view. To increase the specificity of regions, we will incorporate the statistics of regions throughout the ALOI collection. We consider this collection to be a natural starting point for visual cognition.
Over the years we have invested in a new, object-oriented software platform for vision. By the end of the year we hope to deliver a first complete system for internal use with an expected life time of 10 years.
Further, we plan to extend the available parallel functionality to heterogeneous wide-area Grid systems in 2005. The main focus is on the development of an efficient and easy-to-use execution model based on so-called Multimedia Grid Services, i.e. high-performance multimedia functionality that can be invoked from within sequential applications running on a standard desktop machine. A key example is our Aibo robot dog, whose video data is being processed at multiple cluster systems all over the globe. This research direction is prioritized with the arrival of the new Distributed ASCI Supercomputer 3 (DAS-3), which is co-financed by the MultimediaN consortium.
2. Intelligent Autonomous Systems
General information
Contact person : Prof Dr ir F.C.A. Groen
Telephone : +31 - 20 - 525 7461
URL : http://www.science.uva.nl/research/ias/
Fax : +31 - 20 - 525 7490
E-mail : groen@science.uva.nl
Position within the Organisation
This program is carried out by the Intelligent Autonomous Systems group, one of the two research groups of the Multimedia and Intelligent Systems Laboratory at the Informatics Institute of the Faculty of Science. The Intelligent Autonomous Systems group participates in the Dutch Graduate School 'ASCI'.
Characterisation
We develop methodologies to create intelligent autonomous systems that perceive their environment through sensors and use that information to generate intelligent, goal-directed behaviour in a perception-action cycle. These intelligent systems may be single systems, or multiple systems working together. In particular, we study perception for autonomous systems based on vision; we develop a unified framework for geometric computations; we study decision making of single- and multi-agent systems under uncertainty; and we develop computational methods for learning systems and probabilistic reasoning.
Key Words
Machine learning, Bayesian networks, neural computation, graphical models, robotics, sensor data fusion, cognitive robots, ambient intelligence, human-robot interaction, autonomous systems, reasoning with uncertainty, geometric algebra, geometric programming, multi-agent systems, decision making, planning, Markov decision processes, stochastic games.
Main themes
We study the methodologies to create intelligent autonomous systems. Such systems obtain their information from sensors and use that information to generate intelligent, goal-directed actions. As these systems operate in a sense-think-act loop, they can inherently learn from perceiving the result of their actions. They may be single entities or cooperating multi-agent systems, and they must operate in a real dynamic world, inhabited by humans and other agents.
Characteristic for real-world problems is that the sensor data are noisy and hard to interpret since models are often unavailable, inaccurate, or incomplete. On the other hand, because of the real-world environment, data and models have geometrical coherence and are constrained by physics. Our research focuses on new methods which inherently incorporate this real-world structure, in order to produce data processing and modelling that is robust to noise and computationally efficient.
Decision making under uncertainty is an important issue for intelligent systems. We are developing single- and multi-agent planning algorithms for this problem. In particular, multi-agent systems are capable of sharing their perceptions, resulting in a distributed world model. Coordinating the actions of multiple agents is a real challenge, which we approach with Markov decision processes and multi-agent learning strategies. As a case study we use Robot Soccer.
In the field of intelligent robots, specific applications we work on include service and personal robots and intelligent cars both in structured and unstructured terrain. In the area of surveillance and safety, we have projects in distributed surveillance systems and decision support agents.
2004 Results
Perception for autonomous systems
We develop methodologies for accurate motion estimation and interpretation from image sequences. Mobile vision platforms have their applications in traffic and driving in unstructured terrain. Applications of static platforms are in public safety, and intelligent care homes for the elderly. We cooperate with TNO-D&V on a number of projects, which are related to autonomous systems, such as the RoboJeep autonomous robot vehicle.
Terrain classification is important for off-road autonomous robot vehicle guidance. Range based sensor systems, such as stereo vision, cannot distinguish between solid obstacles such as rocks and soft obstacles such as tall patches of grass. Terrain classification is needed to prevent the robot vehicle from being stopped needlessly by the obstacle detection system. It can also be used to recognize sand roads or other drivable areas.
We have developed a colour based method to classify typical terrain coverings such as sand, grass or foliage. Using colour recognition outdoors is difficult, because the observed colour of a material is heavily influenced by environment conditions such as the scene composition and illumination type.
A novel approach has been developed for classifying different environment states in outdoor colour images that does not require any additional sensor data. By differentiating between environment states only a small variation in material colour remains to be modeled. The results (as can be seen in the figure) show that our approach is able to classify terrain types in real images with large differences in illumination.
Figure panels: Original Colour Image, Recognized Terrain Types.
Terrain classification results for images recorded in different weather conditions. Blue = sky, dark green = foliage, light green = grass, yellow = sand, gray = gravel; pixels colored red have been rejected.
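The report does not spell out the classifier, so the following is only a generic sketch of colour-based terrain labelling with a reject option. It is a minimum-distance classifier on intensity-normalized chromaticity and omits the environment-state step described above. The class means, the threshold and the function name are all hypothetical values chosen for illustration; real values would be learned from labelled images.

```python
import numpy as np

# Hypothetical mean chromaticities (r, g) per terrain class.
CLASS_MEANS = {
    "sky": (0.25, 0.32),
    "grass": (0.30, 0.45),
    "sand": (0.42, 0.36),
}
REJECT_DISTANCE = 0.06  # assumed threshold; pixels farther from every class are rejected

def classify_pixel(rgb):
    """Label one RGB pixel with the nearest class mean, or 'rejected' if none is close."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum()
    if total == 0:
        return "rejected"
    chroma = rgb[:2] / total  # (r, g) chromaticity, independent of overall intensity
    best_label, best_distance = None, np.inf
    for label, mean in CLASS_MEANS.items():
        distance = np.linalg.norm(chroma - np.asarray(mean))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= REJECT_DISTANCE else "rejected"

print(classify_pixel((60, 90, 50)))    # chromaticity (0.30, 0.45) matches the grass mean
print(classify_pixel((200, 10, 10)))   # saturated red is far from every class mean
```

The reject option corresponds to the red pixels in the figure legend: a pixel that resembles none of the known materials is left unlabelled instead of being forced into the nearest class.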
Also in cooperation with TNO-FEL, methodologies for public safety are developed. Many surveillance algorithms consist of different parts, such as object detection, segmentation, tracking, and recognition. The literature focuses on algorithms for one of these individual tasks. Integration of and communication between these tasks is often rather ad-hoc. In this project we worked on a statistical framework for visual tracking applications. It allows easy adaptation of the tracking application by substitution of one of the algorithms for part of the problem, without altering the remainder of the application. Furthermore, the framework uses minimum cost classification and feedback for updating the models using knowledge available elsewhere in the application. Another problem for many surveillance algorithms is that often cameras with automatic gain control have to be used. As part of the scene changes, static parts of the scene may suddenly become brighter or darker. We developed and compared a range of techniques to automatically correct for these changes. This enables us to use affordable auto-gain cameras to do sophisticated image processing.
Radars have become a common component of perimeter monitoring systems. Current state-of-the-art radars are cheap, portable, and small. A disadvantage of radar is that its measurements cannot be visualized directly like those of an electro-optical sensor. One way to visualize radar measurements is by showing them as a 3D scene in a virtual environment, which can be interpreted by a human observer. This requires the estimation of human motion features to animate the person.
In previous work we animated walking persons with a model-based approach that provides a realistic look-alike of the real walking person using the global Boulic parameters. Although we estimated only the global Boulic parameters, the animation showed a certain level of personification. The model-based approach has three disadvantages. It requires human motion models for each type of motion, such as walking, jogging and running. The computation of the fit between the model response and the measurements is not real-time. And although the model-based approach has a certain level of personification, not all personification of human motion is represented in the models, for example a difference between the left and right step lengths. The research therefore focuses on a feature-based approach to estimate the motion parameters. Three different methods are presented to extract the maximum, minimum and centre velocity.
In addition to these features we extract an independent estimate of the repetition frequency on the basis of velocity slices in the spectrogram. The leg model, torso model and repetitive behaviour are derived from the human walking model of Boulic. Kalman filters smooth the leg, torso and repetition frequency features and estimate the global Boulic parameters. The global Boulic features completely describe the human motion, and the leg features and torso features give additional personification information. These parameters are input to the human model of Boulic, which forms the basis for animation. A multiple processor application gives a real-time solution. At the end of the year we started on arm movement estimation, to estimate the difference between swinging and non-swinging arms. The range dimension is added to the measurements and a principal component extraction is used to locate the human motion in the step-frequency/step-length space and to classify the different humans. (Ready to submit paper: Feature-based Human Motion Parameter Estimation with Radar.)
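The report does not say which gain-correction techniques were compared. As one simple possibility, the sketch below estimates a single global gain as the median ratio between the current frame and a background model, then divides it out. The function name, the synthetic data and the choice of the median are assumptions made for this example.

```python
import numpy as np

def correct_gain(frame, background, eps=1e-6):
    """Estimate one global gain as the median pixel ratio and divide it out of the frame."""
    frame = np.asarray(frame, dtype=float)
    background = np.asarray(background, dtype=float)
    ratios = frame / np.maximum(background, eps)
    gain = np.median(ratios)  # the median ignores the minority of foreground pixels
    return frame / gain, gain

rng = np.random.default_rng(0)
background = rng.uniform(50.0, 200.0, size=(20, 20))

# Simulate a camera gain change of 1.3 plus a small bright foreground object.
frame = 1.3 * background
frame[:3, :3] = 255.0

corrected, gain = correct_gain(frame, background)
print(round(float(gain), 3))
```

After correction the static part of the scene matches the background model again, so a background-subtraction step would only respond to the foreground object.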
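To illustrate the smoothing role of the Kalman filters mentioned above, here is a minimal scalar Kalman filter with a random-walk state model, applied to a noisy repetition-frequency feature. The state model, the noise variances and the simulated 1.8 Hz signal are assumptions for this sketch; the filters used in the project are not specified in this report.

```python
import numpy as np

def kalman_smooth(measurements, process_var=1e-4, measurement_var=0.04):
    """Scalar Kalman filter assuming the true value drifts slowly (random walk)."""
    estimate, variance = measurements[0], measurement_var
    estimates = [estimate]
    for z in measurements[1:]:
        variance += process_var                         # predict: uncertainty grows
        gain = variance / (variance + measurement_var)  # Kalman gain
        estimate += gain * (z - estimate)               # update with the new measurement
        variance *= 1.0 - gain
        estimates.append(estimate)
    return np.array(estimates)

rng = np.random.default_rng(1)
true_frequency = 1.8  # Hz, an assumed step repetition frequency
noisy = true_frequency + rng.normal(0.0, 0.2, size=200)
smoothed = kalman_smooth(noisy)

print(smoothed.shape)
```

A small process variance relative to the measurement variance makes the filter trust its running estimate more than any single measurement, which is what produces the smoothing.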
Principles of autonomous systems
In the project on geometric algebra, we have continued the development of the conformal model as a compact language for programming geometry, though in the past year the effort was considerably reduced due to prolonged illness (RSI) of the chief investigator. We have deepened the work on efficient implementation, and on facilities for visualization of and experimentation with geometric algebra and geometry. This involves compiler and interpreter construction, and is a cooperation with the Faculty of Computer Science at the Free University (VU). We have started writing a book ‘Geometric Algebra for Computer Science’, based on our paper and interactive tutorials, to help spread the technique in the field.
Learning, probabilistic and neural computing
We develop learning and probabilistic reasoning methods for intelligent systems operating in a real world. One line of research focusses on intelligent environments which must be able to localize and track humans and analyze their behaviour. In particular we studied a surveillance application with many cameras which do not have overlapping fields of view (see figure below). Such a system is faced with the problem of whether an object observed with a camera at some time is the same object as observed by some other camera some time ago. To deal with the uncertainty we use probabilistic networks. The methods we developed outperform conventional (multi-hypothesis) tracking methods. The project is funded by STW.
Another line of research concerns the modelling of sensory data. We developed probabilistic methods that map high dimensional data to a non-linear low dimensional subspace that preserves the structure of the data. Applications of developed techniques in the field of behaviour analysis are studied. Work is supported by STW.
An important new line of research concerns ‘cognitive devices’ for intelligent environments. With top partners in Europe in the field of robots and human-robot interaction we started a 6th framework Integrated Project ‘Cogniron’ in which a cognitive robotic assistant is developed. Our group works on human activity analysis and on cognitive representations of objects and space. With this Cogniron project, the work started in the NWO project ‘Concept learning’ and the work carried out in the ITEA project ‘Ambience’ is continued.
Decision making in single- and multi-agent systems
We have studied the problem of sequential decision making (planning) by an agent under uncertainty. In the hardest case, the agent does not observe the true state of its environment, but only receives noisy observations that are stochastically coupled to the state. Solving such planning problems exactly is known to be intractable, so our focus has been on approximate solution techniques. We have developed “Perseus”, an approximate value iteration algorithm for arbitrary POMDPs (partially observable Markov decision processes) that is very competitive with existing methods.
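The belief tracking that underlies this kind of planning can be illustrated with a minimal sketch. It is not the Perseus algorithm; it only shows the Bayes update of a belief over hidden states after an action and a noisy observation. The two-state model and all names (`belief_update`, `T`, `O`) are hypothetical and chosen for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes update of a belief (probability distribution over hidden states).

    T[a][s][s2] = P(s2 | s, a)   hypothetical transition model
    O[a][s2][o] = P(o | s2, a)   hypothetical observation model
    """
    n = len(belief)
    new_belief = []
    for s2 in range(n):
        # Prediction step: probability of reaching s2 under the action.
        predicted = sum(belief[s] * T[action][s][s2] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        new_belief.append(O[action][s2][observation] * predicted)
    norm = sum(new_belief)
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / norm for p in new_belief]


# Toy problem: two states, one "listen" action (index 0). The state never
# changes and the observation names the true state 85% of the time.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]

b0 = [0.5, 0.5]
b1 = belief_update(b0, 0, 0, T, O)  # first observation points to state 0
b2 = belief_update(b1, 0, 0, T, O)  # a second, consistent observation
```

Each consistent observation sharpens the belief. A POMDP planner chooses actions based on such beliefs rather than on the unobserved state.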
We are also interested in cooperative multi-agent systems. Here we use the framework of coordination graphs which allows for a tractable approach to multi-agent coordination, by decomposing the global payoff function of the system into a sum of local terms. Our current work on coordination graphs involves (i) message passing techniques for approximate decision making (similar to belief propagation in Bayesian networks), and (ii) distributed cooperative reinforcement learning (Q-learning). We incorporated this framework in our UvA Trilearn RoboCup simulation team which won the 2004 German Open.
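As an illustration of the decomposition, the sketch below runs max-plus message passing on a small coordination graph and checks the result against exhaustive search. It is a simplified example, not the code used in UvA Trilearn; the three-agent chain, the payoff tables and the function names are hypothetical. On a tree-structured graph such as this chain the message passing is exact, while on graphs with cycles it is only approximate.

```python
from itertools import product


def max_plus(n_agents, n_actions, payoffs, iterations=10):
    """payoffs: {(i, j): table}, with table[a_i][a_j] the local payoff."""
    # Store every edge in both directions so that f[(i, j)][a_i][a_j] works.
    f = {}
    for (i, j), table in payoffs.items():
        f[(i, j)] = table
        f[(j, i)] = [[table[a][b] for a in range(n_actions)]
                     for b in range(n_actions)]
    neighbours = {i: [j for (k, j) in f if k == i] for i in range(n_agents)}
    # msgs[(i, j)][a_j]: message from agent i to agent j about j's action.
    msgs = {edge: [0.0] * n_actions for edge in f}
    for _ in range(iterations):
        new_msgs = {}
        for (i, j) in f:
            new_msgs[(i, j)] = [
                max(
                    f[(i, j)][ai][aj]
                    + sum(msgs[(k, i)][ai] for k in neighbours[i] if k != j)
                    for ai in range(n_actions)
                )
                for aj in range(n_actions)
            ]
        msgs = new_msgs
    # Each agent picks the action that maximizes its incoming messages.
    joint = []
    for i in range(n_agents):
        scores = [sum(msgs[(k, i)][a] for k in neighbours[i])
                  for a in range(n_actions)]
        joint.append(scores.index(max(scores)))
    return tuple(joint)


def brute_force(n_agents, n_actions, payoffs):
    """Reference solution: enumerate all joint actions."""
    def total(joint):
        return sum(t[joint[i]][joint[j]] for (i, j), t in payoffs.items())
    return max(product(range(n_actions), repeat=n_agents), key=total)


# Hypothetical chain 0 - 1 - 2 with two actions per agent.
payoffs = {
    (0, 1): [[4, 0], [0, 2]],
    (1, 2): [[1, 0], [0, 5]],
}
best_by_messages = max_plus(3, 2, payoffs)
best_by_search = brute_force(3, 2, payoffs)
```

The exhaustive search grows exponentially with the number of agents, whereas each message only involves the two agents on an edge, which is what makes the approach tractable.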
We have applied our experience to real soccer games played by a small number of robots. Previously we cooperated with the Delft University of Technology in the Mid-Size League; this year a Dutch Aibo Team was formed. The 4-Legged Soccer League is an ideal challenge because it defines a standard hardware platform, which lets teams distinguish themselves through their algorithms for perceiving the environment, reasoning about the optimal (joint) actions, and performing those actions smoothly. The Dutch Aibo Team was a successful debutant, qualifying directly for the World Cup in Osaka through its performance in the Technical Challenges.
The soccer world is dynamic, but has a fixed number of agents and a small playing field. To study the influence of larger real-world models, we focused on cooperation between teams of rescue agents in a city after a disaster. In our research we have experimented with a game-tree approach. In this approach the agents construct a strategy for multiple steps into the future instead of just selecting the behavior that currently has the highest priority. They do this not only for themselves, but also for a limited number of team members. The game-tree approach can also be used to trace the situation and the behaviors of the agents multiple cycles back in history, to estimate which actions were really beneficial in the long run.
We applied this game-tree approach in the UvA RoboCup Rescue simulation team, which participated in the Rescue Middle Earth competition at PRICAI 2004. The figure shows a 3D visualization of a burning city after an earthquake, as provided by the RoboCup Rescue Simulation Project. In this city, ambulances, fire brigades, and police forces have to cooperate, while communication is strictly limited because of the collapsed infrastructure.
Another research line involves ‘Interactive Hierarchical Awareness’ in a multi-agent system. Modeling and control of such a system may benefit immensely from representing it at multiple levels of abstraction, i.e., using a hierarchy of representations. This means that the system's state is represented at multiple resolutions (in terms of time and abstraction) simultaneously for different levels in the hierarchy, and similarly, that actions are taken at multiple resolutions. In the Interactive Collaborative Information Systems (ICIS) project we aim at developing scalable and theoretically sound methods for constructing and updating hierarchical state representations for multi-agent systems. We are working on establishing criteria for hierarchical state representations, for instance based on action utilities, the Markov property, uncertainty, etc. Furthermore, we are developing algorithms that enforce/approximate these criteria based on the theory of MDPs, POMDPs, HPOMDPs, DBNs, and Utile Distinction methods. Our main application domain is road traffic management.
Distributed perception and sensor fusion
Intelligent process control and decision making in complex systems require adequate situation assessment, which in turn requires processing large amounts of heterogeneous information originating from different, spatially dispersed sources, such as sensory systems, human observers, and databases. Such “sensor fusion” is not trivial: it requires an adequate mapping between very heterogeneous concepts, the information sources are noisy, and the large amounts of information may demand significant processing resources (i.e. computational bottlenecks). Another characteristic of the domains we focus on is that constellations of information sources can change frequently, and prior to operation we never know which information sources will be available. In addition, such fusion systems often provide results that have a critical impact on the decision making process and, consequently, on the further course of events. Therefore, high-quality fusion results and the prevention of misleading results are indispensable.
To deal with these challenges, we have introduced Distributed Perception Networks (DPN), multi-agent systems which support fusion based on distributed Bayesian Networks (BN). In this context, our research is focused primarily on the following problems: (i) Task-driven self-configuration of DPN agents at runtime, which allows adaptation to dynamic information source constellations and supports reuse of partial fusion results. (ii) Efficient and robust information fusion with distributed BNs, which provide an adequate mapping between observable events and beliefs in hypotheses about hidden events. (iii) Resource allocation in distributed fusion systems based on information-theoretic criteria. (iv) Approaches to improved fusion accuracy, such as fail-safe design of fusion systems as well as localization of faulty model components and information sources.
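The idea of mapping observable events to a belief in a hidden hypothesis, and of reusing partial fusion results, can be sketched in a much simplified form. The example below is not the DPN implementation: it assumes a single binary hypothesis and reports that are conditionally independent given that hypothesis, and all names and probabilities are hypothetical.

```python
import math


def log_likelihood_ratio(p_report_given_h, p_report_given_not_h):
    """Evidence contributed by one report, as computed locally by one agent."""
    return math.log(p_report_given_h / p_report_given_not_h)


def fuse(prior_h, llrs):
    """Posterior P(H | reports), assuming the reports are conditionally
    independent given H (a simplifying assumption of this sketch)."""
    log_odds = math.log(prior_h / (1.0 - prior_h)) + sum(llrs)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Two hypothetical sources reporting on the same hidden event H.
llr_a = log_likelihood_ratio(0.9, 0.2)  # source A: P(report | H), P(report | not H)
llr_b = log_likelihood_ratio(0.7, 0.1)  # source B

posterior_all = fuse(0.1, [llr_a, llr_b])

# A partial result can be reused when a further source becomes available.
posterior_a_only = fuse(0.1, [llr_a])
posterior_incremental = fuse(posterior_a_only, [llr_b])
```

Because the contributions add up in log-odds form, the order in which sources arrive does not matter, which is convenient when the set of available sources changes at runtime.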
In a related line of research we have studied the problem of decentralized data mining (unsupervised learning) using gossip-based communication protocols. In such protocols, random pairs of nodes repeatedly exchange their local parameter estimates and combine them by weighted averaging. We have provided theoretical and experimental evidence that under such protocols, nodes converge exponentially fast to the correct parameter estimates.
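A minimal sketch of such a gossip protocol is given below. It simplifies the setting to scalar estimates and equal weights (plain pairwise averaging rather than general weighted averaging); the function name, the node values and the number of rounds are hypothetical.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Repeatedly let a random pair of nodes average their local estimates."""
    rng = random.Random(seed)
    estimates = list(values)
    n = len(estimates)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)  # pick two distinct nodes
        mean = 0.5 * (estimates[i] + estimates[j])
        estimates[i] = estimates[j] = mean
    return estimates


# Hypothetical local estimates held by six nodes.
local_values = [1.0, 4.0, 7.0, 10.0, 13.0, 25.0]
true_mean = sum(local_values) / len(local_values)

estimates = gossip_average(local_values, rounds=200)
spread = max(estimates) - min(estimates)
```

Each exchange preserves the sum of the estimates, so the only value on which all nodes can agree is the global mean, and no node ever needs to see all the data.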
Intelligent vehicles
Our research interest involves developing new vision-based techniques for applications in the intelligent vehicles domain. In order to preserve safety, current operational systems, such as automated transports and people movers, need areas or lanes that are separated from other traffic. Reliable, robust and real-time obstacle detection methodologies are needed to enable the safe operation of these types of intelligent vehicles among other traffic participants such as cars and pedestrians.
Real-time stereo-vision-based obstacle detection on the road. The left image shows the system in action from inside the vehicle. The images on the right-hand side show the system’s display. Notice that a “caution” warning is given when the pedestrian is too close.
In joint research with TNO we have developed and evaluated a real-time obstacle detection system. The obstacle detection system is based on dense stereo vision. In contrast to sparse stereo vision, it uses matching algorithms which estimate a depth for almost all pixels in the stereo image pair. Robust algorithms are then applied to detect which of the pixel disparities belong to the road surface. Other pixels, which belong to positive obstacles, are clustered into separate instances. All parts of the obstacle detection system have been optimized for the SIMD instruction sets that are available on normal computer systems. The current version of the system achieves processing speeds of 5 Hz. The figure above shows an image from a test on the road, where the system detects pedestrians. The system was also evaluated in the state-of-the-art VEHIL facility of TNO in Helmond. This facility provides VeHicle In the Loop testing for new sensor systems by using small robotic vehicles to simulate other traffic (see photo below). By comparing the ground truth positions of the robotic vehicles to those estimated by our obstacle detection system we determined that the error in stereo ranging is less than 1/8 of a pixel.
Evaluation of the real-time stereo-vision-based obstacle detection system in the VEHIL facility of TNO in Helmond. The vehicle with the stereo vision system (RoboJeep) observes the small robotic vehicle, which simulates a moving obstacle.
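To relate a disparity error expressed in pixels to a range error in metres, the standard pinhole stereo relations can be used. The sketch below is only an illustration: the focal length and baseline are hypothetical values, not the parameters of the system described here.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def range_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order range error: dZ = Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


FOCAL_PX = 800.0   # hypothetical focal length in pixels
BASELINE_M = 0.5   # hypothetical distance between the two cameras in metres

z = depth_from_disparity(20.0, FOCAL_PX, BASELINE_M)
err = range_error(z, FOCAL_PX, BASELINE_M, 1.0 / 8.0)
```

Because the range error grows with the square of the distance, sub-pixel disparity accuracy matters most for obstacles that are far away.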
Each year, more than 150,000 pedestrians are injured and more than 6,000 are killed in traffic across the EU. A collaboration with DaimlerChrysler Research (Germany) focuses on these vulnerable traffic participants, aiming to develop a system for video-based pedestrian recognition from a moving vehicle. From a methodological point of view, this application is very challenging because it combines the difficulties of a moving camera, a wide range of possible (deformable) object appearances, cluttered backgrounds, stringent performance criteria and hard real-time constraints. The second phase of the research project focuses on a Bayesian extension to hierarchical shape-based object detection.
3. Information and Language Processing Systems
General information
Contact person : Prof Dr M. de Rijke
Telephone : +31 - 20 - 525 5358
URL : http://ilps.science.uva.nl/
Fax : +31 - 20 - 525 7490
E-mail : mdr@science.uva.nl
Position within the Organisation
This program is carried out by the Information and Language Processing Systems (ILPS) group, one of the three research groups of the Intelligent Systems Laboratory Amsterdam (ISLA) at the Informatics Institute of the Faculty of Science. The Information and Language Processing Systems group participates in the Dutch Graduate School SIKS.
As of April 1, 2004, the Information and Language Processing Systems group (formerly the Language and Inference Technology (LIT) group) is part of the Informatics Institute of the Universiteit van Amsterdam. Previously, the LIT group was part of the Institute for Logic, Language and Computation (ILLC). Founded by Maarten de Rijke and Michael Masuch in 2001, the group quickly grew to become an award-winning group of around 20 people pursuing foundational, experimental, and applied research in intelligent information access. The following people left the ILLC along with Maarten de Rijke: Loredana Afanasiev, David Ahn, Caterina Caracciolo, Sisay Fissaha, Massimo Franceschet, Evan Goris, Willem van Hage, Gabriel Infante-Lopez, Valentin Jijkoun, Maarten Marx, Gilad Mishne, Rob Mokken, Karin Mueller, Stefan Schlobach, Borkur Sigurbjörnsson, Maarten Stol, and Petrucio Viana. During the second half of 2004, Balder ten Cate, Leonie IJzereef, and Erik Tjong Kim Sang joined the group.