An iterative adaptive multi-modal stereo-vision method using mutual information
基于互信息的迭代适应多模式立体视觉方法
Journal of Visual Communication and Image Representation, Volume 26, January 2015, Pages 115-131
Abstract: We propose a method for computing disparity maps from a multi-modal stereo-vision system composed of an infrared–visible camera pair. The method uses mutual information (MI) as the basic similarity measure, combined with a segment-based adaptive windowing mechanism and a novel MI computation surface that incorporates joint prior probabilities. The computed cost confidences are aggregated using a novel adaptive cost aggregation method, and the resulting minimum-cost disparities are plane-fitted within their respective segments. The segments are iteratively refined by merging and splitting, which reduces dependency on the initial segmentation. Finally, the estimated disparities are iteratively refined by repeating all the steps. On an artificially modified version of the Middlebury dataset and a Kinect dataset that we created in this study, we show that (i) our proposal improves the quality of the existing MI formulation, and (ii) our method can provide depth comparable in quality to Kinect depth data.
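As a rough illustration of the similarity measure named in the abstract above, the sketch below computes plain mutual information between two equally sized windows of discrete intensities. It does not reproduce the paper's adaptive windowing, joint prior probabilities, or cost aggregation; the function name `mutual_information` and the toy pixel values are assumptions made for this example only.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in bits) between two equally long sequences
    of discrete intensity values, estimated from their joint histogram."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of intensity pairs
    hist_a = Counter(a)          # marginal histogram of the first window
    hist_b = Counter(b)          # marginal histogram of the second window
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log2(count * n / (hist_a[x] * hist_b[y]))
    return mi


# Hypothetical flattened windows: the "infrared" values are a relabelling
# of the "visible" ones, so the intensities differ but the structure matches.
visible = [0, 0, 1, 1]
infrared = [5, 5, 9, 9]
unrelated = [0, 1, 0, 1]

print(mutual_information(visible, infrared))   # 1.0
print(mutual_information(visible, unrelated))  # 0.0
```

The first pair scores 1 bit even though no intensity values coincide, which is why MI is attractive for matching across modalities where brightness is not directly comparable.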
The informed sampler: A discriminative approach to Bayesian inference in generative computer vision models
智能采样器:生成计算机视觉模型Bayesian推理判别方法
Computer Vision and Image Understanding, Volume 136, July 2015, Pages 32-44
Abstract: Computer vision is hard because of a large variability in lighting, shape, and texture; in addition, the image signal is non-additive due to occlusion. Generative models promised to account for this variability by accurately modelling the image formation process as a function of latent variables with prior beliefs. Bayesian posterior inference could then, in principle, explain the observation. While intuitively appealing, generative models for computer vision have largely failed to deliver on that promise due to the difficulty of posterior inference. As a result, the community has favoured efficient discriminative approaches. We still believe in the usefulness of generative models in computer vision, but argue that we need to leverage existing discriminative or even heuristic computer vision methods. We implement this idea in a principled way with an informed sampler and, in careful experiments, demonstrate it on challenging generative models which contain renderer programs as their components. We concentrate on the problem of inverting an existing graphics rendering engine, an approach that can be understood as “Inverse Graphics”. The informed sampler, using simple discriminative proposals based on existing computer vision technology, achieves significant improvements in inference.
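One way to read the idea in the abstract above is as a Metropolis-Hastings sampler whose proposal mixes a global "informed" guess, such as one produced by a discriminative method, with ordinary local random-walk moves. The sketch below is a one-dimensional toy under that assumption and is not the authors' implementation; `informed_mh`, the Gaussian target, and all parameter values are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def informed_mh(log_target, guess, n_steps, alpha=0.5,
                guess_std=1.0, local_std=0.2, x0=0.0, seed=0):
    """Metropolis-Hastings with a mixture proposal: with probability alpha
    propose around a fixed informed guess, otherwise take a local step."""
    rng = random.Random(seed)

    def proposal_density(to, frm):
        # Full mixture density, needed for a valid acceptance ratio.
        return (alpha * normal_pdf(to, guess, guess_std)
                + (1.0 - alpha) * normal_pdf(to, frm, local_std))

    x = x0
    samples = []
    for _ in range(n_steps):
        if rng.random() < alpha:
            candidate = rng.gauss(guess, guess_std)   # global informed move
        else:
            candidate = rng.gauss(x, local_std)       # local random walk
        log_ratio = (log_target(candidate) - log_target(x)
                     + math.log(proposal_density(x, candidate))
                     - math.log(proposal_density(candidate, x)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = candidate
        samples.append(x)
    return samples


def toy_log_posterior(x):
    """Hypothetical unnormalised log-posterior: Gaussian at 5.0, std 0.5."""
    return -0.5 * ((x - 5.0) / 0.5) ** 2


# The informed guess (4.5) is deliberately slightly off the true mode.
chain = informed_mh(toy_log_posterior, guess=4.5, n_steps=4000)
kept = chain[500:]  # discard burn-in
print(sum(kept) / len(kept))
```

Because the acceptance ratio uses the full mixture density in both directions, the chain still targets the same posterior; the informed component only helps it reach the high-probability region faster than local moves alone would from a poor starting point.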
Application-level Performance Optimization: A Computer Vision Case Study on STHORM
应用性能优化:STHORM计算机视觉实例研究
Procedia Computer Science, Volume 29, 2014, Pages 1113-1122
Abstract: Computer vision applications constitute one of the key drivers for embedded many-core architectures. In order to exploit the full potential of such systems, a balance between computation and communication is critical, but many computer vision algorithms exhibit highly data-dependent behavior that complicates this task. To enable application performance optimization, the development environment must provide the developer with tools for fast and precise application-level performance analysis. We describe the process of porting and optimizing a face detection application onto the STHORM many-core accelerator using the STHORM OpenCL SDK. We identify the main factors that limit performance and discern the contributions arising from the application itself, the OpenCL programming model, and the STHORM OpenCL SDK. Finally, we show how these issues can be addressed in the future to enable developers to further improve application performance.
Optical flow modeling and computation: A survey
光流模拟与计算研究综述
Computer Vision and Image Understanding, Volume 134, May 2015, Pages 1-21
Abstract: Optical flow estimation is one of the oldest and still most active research domains in computer vision. In 35 years, many methodological concepts have been introduced and have progressively improved performance, while opening the way to new challenges. In the last decade, the growing interest in evaluation benchmarks has stimulated a great amount of work. In this paper, we propose a survey of optical flow estimation that classifies the main principles elaborated during this evolution, with particular attention given to recent developments. It is conceived as a tutorial that organizes current approaches and practices in a comprehensive framework. We give insights into the motivations, interests and limitations of modeling and optimization techniques, and we highlight similarities between methods to allow for a clear understanding of their behavior.
Vision-based sensing for assessing and monitoring civil infrastructures
基于视觉感知的民用基础评估与监测技术
Sensor Technologies for Civil Infrastructures, 2014, Pages 383-409
Abstract: Recently, vision-based measurement techniques have emerged as an important non-destructive evaluation (NDE) method, due to the availability of low-cost, high-resolution commercial digital cameras and effective image processing algorithms. In this chapter, some important issues that might affect the accuracy of vision-based measurement techniques are presented and discussed. Results from laboratory and field tests illustrate the accuracy and applicability of some vision-based measurement techniques.
Identifying barley varieties by computer vision
计算机视觉技术在大麦种类辨析中的应用
Computers and Electronics in Agriculture, Volume 110, January 2015, Pages 1-8
Abstract: Visual discrimination between barley varieties is difficult, and it requires training and experience. The development of automatic methods based on computer vision could have positive implications for the food processing industry. In the brewing industry, varietal uniformity is crucial for the production of high quality malt. The varietal purity of thousands of tons of grain has to be inspected upon purchase in the malt house.
This paper evaluates the effectiveness of identification of barley varieties based on image-derived shape, color and texture attributes of individual kernels. Varieties can be determined by means of discriminant analysis, including reduction of feature space dimensionality, linear classifier ensembles and artificial neural networks, with high balanced accuracy ranging from 67% to 86%. The study demonstrated that classification results can be significantly improved by standardizing individual kernel images in terms of their anteroposterior and dorsoventral orientation and performing additional analyses of wrinkled regions.
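The abstract above reports balanced accuracy rather than plain accuracy. As a small worked example, balanced accuracy is commonly defined as the mean of the per-class recalls, so it is not inflated when one class has many more samples than another. The variety labels and predictions below are made up for illustration and are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels: four kernels of variety "A", two of variety "B".
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "A"]

print(balanced_accuracy(y_true, y_pred))  # 0.625
```

Here the recall is 3/4 for "A" and 1/2 for "B", giving 0.625, whereas plain accuracy would be 4/6 because it weights the larger class more heavily.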
A learning-based thresholding method customizable to computer vision applications
基于学习应用于计算机视觉的阈值算法
Engineering Applications of Artificial Intelligence, Volume 37, January 2015, Pages 71-90
Abstract: Although a large variety of thresholding techniques have been developed, selecting a suitable technique for a particular computer vision application remains an open problem and often requires lengthy trial-and-error procedures that analyze the performance and robustness of different methods. This paper proposes a training-based method that is capable of capturing, learning and imitating thresholding performance from a set of training images, allowing ad-hoc adaptation to a given problem. It is applied in two stages: learning and application. In the learning stage, a histogram mode object/background classifier is trained with a set of training images and their respective desired threshold values determined by a human. In the application stage, the histogram modes resulting from multi-mode decomposition are classified with the trained classifier, and the threshold is computed using a tunable minimum classification error criterion. The presented method can be used in bi-level and multi-level thresholding and requires no settings, since all its parameters are determined in the learning step. It has been successfully applied to several problems, some of which are described in the paper.
3D perception from binocular vision for a low cost humanoid robot NAO
低成本类人机器人NAO双视三维感知技术
Robotics and Autonomous Systems, Volume 68, June 2015, Pages 129-139
Abstract: Depth estimation is a classical problem in computer vision, and after decades of research many methods have been developed for 3D perception, such as magnetic tracking, mechanical tracking, acoustic tracking, inertial tracking, and optical tracking using markers and beacons. The vision system allows 3D perception of the scene, and the process involves: (1) camera calibration, (2) image correction, (3) feature extraction and stereo correspondence, (4) disparity estimation and reconstruction, and finally, (5) surface triangulation and texture mapping. The work presented in this paper is the implementation of a stereo vision system integrated in a humanoid robot. Keeping the cost of the vision system low is one of the aims, to avoid expensive investment in hardware when it is used in robotics for 3D perception. Our proposed solution relies heavily on cameras because, in our opinion, they are easy to handle, cheap, and highly compatible compared with the hardware used in other techniques. The software for the automated recognition of features and detection of correspondence points has been programmed using the image processing library OpenCV (Open Source Computer Vision), and OpenGL (Open Graphics Library) is used to display the 3D models obtained from the reconstruction. Experimental results of the reconstruction and models of different scenes are shown. The results obtained from the program are evaluated by comparing the size of the reconstructed objects with that calculated by the program.
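Step (4) of the pipeline listed above, going from disparity to a 3D point, can be sketched with the standard relation for a rectified pinhole stereo pair, Z = f · B / d. This is a minimal sketch under that assumption, not the paper's OpenCV code; the focal length, baseline, and principal point are illustrative values and are not the NAO camera parameters.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def back_project(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """3D point (X, Y, Z) in the left-camera frame for pixel (u, v)."""
    z = depth_from_disparity(disparity_px, focal_px, baseline_m)
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


# Hypothetical calibration: 500 px focal length, 10 cm baseline,
# principal point at the centre of a 640x480 image.
point = back_project(u=420, v=240, disparity_px=25,
                     focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0)
print(point)  # roughly (0.4, 0.0, 2.0) metres
```

Since depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects (small disparities) than for near ones, which is one reason low-cost short-baseline systems are usually evaluated at close range.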
Detection and classification of surface defects of gun barrels using computer vision and machine learning
计算机视觉和机器学习在枪筒表面瑕疵检测与分类中的应用
Measurement, Volume 60, January 2015, Pages 222-230
Abstract: This work proposes a machine vision based approach for the detection and classification of surface defects such as normal wear, corrosive pitting, rust and erosion that are usually present in used gun barrels. Surface images containing the defective regions of several used gun barrels were captured in a non-destructive manner using a Charge-Coupled Device (CCD) camera attached to a miniature microscopic probe. In the captured images, normal wear appeared bright and the other three defects appeared dark. Therefore, the classification was carried out in two stages. Various segmentation methods were tested, and the extended maxima transform gave the best result. The defective area was calculated in metric units. Multiple textural features based on the histogram and the gray level co-occurrence matrix were extracted from the segmented images and ranked automatically using the sequential forward feature selection method in order to select the best minimal feature set for classification. Many classifiers based on Bayes, k-Nearest Neighbor, Artificial Neural Network and Support Vector Machine (SVM) were tested, and the results demonstrated the efficacy of SVM for this application. All these steps were carried out at six different image scales, and the best scale was selected for the entire analysis based on the segmentation and classification accuracy. The introduction of this Gaussian scale spacing concept could reduce the computation without compromising accuracy. Overall, the methodology forms a novel framework for surface defect detection and classification that has the potential to automate the inspection process.
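To make the gray level co-occurrence matrix (GLCM) features mentioned above more concrete, the sketch below builds a normalised GLCM for one pixel offset and derives two common texture descriptors, contrast and angular second moment. The abstract does not say which GLCM features or offsets were used, so these choices, the function names, and the tiny test image are assumptions for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of gray levels for offset (dx, dy).
    `image` is a list of rows holding integers in range(levels)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2: large when neighbours differ strongly."""
    return sum((i - j) ** 2 * p[i][j]
               for i in range(len(p)) for j in range(len(p)))


def angular_second_moment(p):
    """Sum of squared entries: large for uniform, repetitive texture."""
    return sum(value ** 2 for row in p for value in row)


# Hypothetical 2-level patch with a single vertical edge.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
p = glcm(patch, levels=2)
print(contrast(p), angular_second_moment(p))  # both approximately 0.333
```

Feature vectors built from such descriptors, computed per segmented defect region, are the kind of input a feature-ranking step and a classifier such as an SVM would then consume.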
Computer Vision for Mobile On-Ground Robotics
移动地面机器人计算机视觉技术
Procedia Engineering, Volume 100, 2015, Pages 1376-1380
Abstract: Autonomous mobile robotics needs reliable information on the relief of the underlying surface and the location of obstacles. Planning the route of a mobile on-ground robot requires mapping the visible area and separating it into zones of good or conditional passability, impassable zones, and indefinite zones. It also requires recognition of standard objects (markings, traffic signs) and types of surface (snow, sand, or water) as sources of evident or hidden obstacles.
3D calculation requires large computing resources and leads to delays, which limits the robot's velocity. Contouring of boundaries simplifies the decomposition of an image into objects and defines the key tasks of mapping, such as image vectorization and object recognition. These tasks must be separated both in the algorithms and at the hardware level. The idea of process multisequencing leads to dividing data processing among several processors. Together with vector analysis of obstacles, it radically cuts the 3D update time, ensuring the data supply needed for high-speed robot motion.