Refine
Document Type
- Report (13)
- Article (5)
- Preprint (4)
- Part of a Book (2)
Institute
Keywords
- CGI <Computergraphik> (14)
- Maschinelles Sehen (10)
- Association for Computing Machinery / Special Interest Group on Graphics (8)
- Projektion (6)
- Neuronales Netz (5)
- Objekterkennung (5)
- Augmented Reality (4)
- Erweiterte Realität (4)
- Handy (4)
- Museum (4)
Unsynchronized 4D Barcodes
(2007)
We present a novel technique for optical data transfer between public displays and mobile devices based on unsynchronized 4D barcodes. We assume that no direct (electromagnetic or other) connection between the devices can exist. Time-multiplexed, 2D color barcodes are displayed on screens and recorded with camera equipped mobile phones. This allows to transmit information optically between both devices. Our approach maximizes the data throughput and the robustness of the barcode recognition, while no immediate synchronization exists. Although the transfer rate is much smaller than it can be achieved with electromagnetic techniques (e.g., Bluetooth or WiFi), we envision to apply such a technique wherever no direct connection is available. 4D barcodes can, for instance, be integrated into public web-pages, movie sequences or advertisement presentations, and they encode and transmit more information than possible with single 2D or 3D barcodes.
In this paper, we present a novel technique for adapting local image classifiers that are applied for object recognition on mobile phones through ad-hoc network communication between the devices. By continuously accumulating and exchanging collected user feedback among devices that are located within signal range, we show that our approach improves the overall classification rate and adapts to dynamic changes quickly. This technique is applied in the context of PhoneGuide – a mobile phone based museum guidance framework that combines pervasive tracking and local object recognition for identifying a large number of objects in uncontrolled museum environments.
Superimposing Dynamic Range
(2009)
Replacing a uniform illumination by a high-frequent illumination enhances the contrast of observed and captured images. We modulate spatially and temporally multiplexed (projected) light with reflective or transmissive matter to achieve high dynamic range visualizations of radiological images on printed paper or ePaper, and to boost the optical contrast of images viewed or imaged with light microscopes.
We present a system that applies a custom-built pan-tilt-zoom camera for laser-pointer tracking in arbitrary real environments. Once placed in a building environment, it carries out a fully automatic self-registration, registrations of projectors, and sampling of surface parameters, such as geometry and reflectivity. After these steps, it can be used for tracking a laser spot on the surface as well as an LED marker in 3D space, using inter-playing fisheye context and controllable detail cameras. The captured surface information can be used for masking out areas that are critical to laser-pointer tracking, and for guiding geometric and radiometric image correction techniques that enable a projector-based augmentation on arbitrary surfaces. We describe a distributed software framework that couples laser-pointer tracking for interaction, projector-based AR as well as video see-through AR for visualizations with the domain specific functionality of existing desktop tools for architectural planning, simulation and building surveying.
Although audio guides are widely established in many museums, they suffer from several drawbacks compared to state-of-the-art multimedia technologies: First, they provide only audible information to museum visitors, while other forms of media presentation, such as reading text or video could be beneficial for museum guidance tasks. Second, they are not very intuitive. Reference numbers have to be manually keyed in by the visitor before information about the exhibit is provided. These numbers are either displayed on visible tags that are located near the exhibited objects, or are printed in brochures that have to be carried. Third, offering mobile guidance equipment to visitors leads to acquisition and maintenance costs that have to be covered by the museum. With our project PhoneGuide we aim at solving these problems by enabling the application of conventional camera-equipped mobile phones for museum guidance purposes. The advantages are obvious: First, today’s off-the-shelf mobile phones offer a rich pallet of multimedia functionalities ---ranging from audio (over speaker or head-set) and video (graphics, images, movies) to simple tactile feedback (vibration). Second, integrated cameras, improvements in processor performance and more memory space enable supporting advanced computer vision algorithms. Instead of keying in reference numbers, objects can be recognized automatically by taking non-persistent photographs of them. This is more intuitive and saves museum curators from distributing and maintaining a large number of physical (visible or invisible) tags. Together with a few sensor-equipped reference tags only, computer vision based object recognition allows for the classification of single objects; whereas overlapping signal ranges of object-distinct active tags (such as RFID) would prevent the identification of individuals that are grouped closely together. Third, since we assume that museum visitors will be able to use their own devices, the acquisition and maintenance cost for museum-owned devices decreases.
CAMShift is a well-established and fundamental algorithm for kernel-based visual object tracking. While it performs well with objects that have a simple and constant appearance, it is not robust in more complex cases. As it solely relies on back projected probabilities it can fail in cases when the object's appearance changes (e.g. due to object or camera movement, or due to lighting changes), when similarly colored objects have to be re-detected or when they cross their trajectories. We propose extensions to CAMShift that address and resolve all of these problems. They allow the accumulation of multiple histograms to model more complex object appearance and the continuous monitoring of object identi- ties to handle ambiguous cases of partial or full occlusion. Most steps of our method are carried out on the GPU for achieving real-time tracking of multiple targets simultaneously. We explain an ecient GPU implementations of histogram generation, probability back projection, im- age moments computations, and histogram intersection. All of these techniques make full use of a GPU's high parallelization.
Coded Aperture Projection
(2008)
In computer vision, optical defocus is often described as convolution with a filter kernel that corresponds to an image of the aperture being used by the imaging device. The degree of defocus correlates to the scale of the kernel. Convolving an image with the inverse aperture kernel will digitally sharpen the image and consequently compensate optical defocus. This is referred to as deconvolution or inverse filtering. In frequency domain, the reciprocal of the filter kernel is its inverse, and deconvolution reduces to a division. Low magnitudes in the Fourier transform of the aperture image, however, lead to intensity values in spatial domain that exceed the displayable range. Therefore, the corresponding frequencies are not considered, which then results in visible ringing artifacts in the final projection. This is the main limitation of previous approaches, since in frequency domain the Gaussian PSF of spherical apertures does contain a large fraction of low Fourier magnitudes. Applying only small kernel scales will reduce the number of low Fourier magnitudes (and consequently the ringing artifacts) -- but will also lead only to minor focus improvements. To overcome this problem, we apply a coded aperture whose Fourier transform has less low magnitudes initially. Consequently, more frequencies are retained and more image details are reconstructed.
We present a novel image classification technique for detecting multiple objects (called subobjects) in a single image. In addition to image classifiers, we apply spatial relationships among the subobjects to verify and to predict locations of detected and undetected subobjects, respectively. By continuously refining the spatial relationships throughout the detection process, even locations of completely occluded exhibits can be determined. Finally, all detected subobjects are labeled and the user can select the object of interest for retrieving corresponding multimedia information. This approach is applied in the context of PhoneGuide, an adaptive museum guidance system for camera-equipped mobile phones. We show that the recognition of subobjects using spatial relationships is up to 68% faster than related approaches without spatial relationships. Results of a field experiment in a local museum illustrate that unexperienced users reach an average recognition rate for subobjects of 85.6% under realistic conditions.
We present an enhancement towards adaptive video training for PhoneGuide, a digital museum guidance system for ordinary camera–equipped mobile phones. It enables museum visitors to identify exhibits by capturing photos of them. In this article, a combined solution of object recognition and pervasive tracking is extended to a client–server–system for improving data acquisition and for supporting scale–invariant object recognition.
Superimposing Dynamic Range
(2008)
Replacing a uniform illumination by a high-frequent illumination enhances the contrast of observed and captured images. We modulate spatially and temporally multiplexed (projected) light with reflective or transmissive matter to achieve high dynamic range visualizations of radiological images on printed paper or ePaper, and to boost the optical contrast of images viewed or imaged with light microscopes.