From Interactive to Semantic Image Segmentation

Author: Varun Gulshan
Publisher:
Total Pages:
Release: 2011
Genre:
ISBN:

This thesis investigates two well-defined problems in image segmentation, viz. interactive and semantic image segmentation. Interactive segmentation involves power assisting a user in cutting out objects from an image, whereas semantic segmentation involves partitioning pixels in an image into object categories. We investigate various models and energy formulations for both these problems in this thesis. In order to improve the performance of interactive systems, low-level texture features are introduced as a replacement for the more commonly used RGB features. To quantify the improvement obtained by using these texture features, two annotated datasets of images are introduced (one consisting of natural images, and the other consisting of camouflaged objects). A significant improvement in performance is observed when using texture features for the case of monochrome images and images containing camouflaged objects. We also explore adding mid-level cues such as shape constraints into interactive segmentation by introducing the idea of geodesic star convexity, which extends the existing notion of a star convexity prior in two important ways: (i) it allows for multiple star centres as opposed to single stars in the original prior, and (ii) it generalises the shape constraint by allowing for geodesic paths as opposed to Euclidean rays. Global minima of our energy function can be obtained subject to these new constraints. We also introduce Geodesic Forests, which exploit the structure of shortest paths in implementing the extended constraints. These extensions to star convexity allow us to use such constraints in a practical segmentation system. This system is evaluated by means of a "robot user" to measure the amount of interaction required in a precise way, and it is shown that having shape constraints reduces user effort significantly compared to existing interactive systems. We also introduce a new and harder dataset which augments the existing GrabCut dataset with more realistic images and ground truth taken from the PASCAL VOC segmentation challenge.
In the latter part of the thesis, we bring in object category level information in order to make the interactive segmentation tasks easier, and move towards fully automated semantic segmentation. An algorithm to automatically segment humans from cluttered images given their bounding boxes is presented. A top-down segmentation of the human is obtained using classifiers trained to predict segmentation masks from local HOG descriptors. These masks are then combined with bottom-up image information in a local GrabCut-like procedure. This algorithm is later completely automated to segment humans without requiring a bounding box, and is quantitatively compared with other semantic segmentation methods. We also introduce a novel way to acquire large quantities of segmented training data relatively effortlessly using the Kinect. In the final part of this work, we explore various semantic segmentation methods based on learning using bottom-up super-pixelisations. Different methods of combining multiple super-pixelisations are discussed and quantitatively evaluated on two segmentation datasets. We observe that simple combinations of independently trained classifiers on single super-pixelisations perform almost as well as complex methods based on jointly learning across multiple super-pixelisations. We also explore CRF-based formulations for semantic segmentation, and introduce a novel visual-words-based object boundary description in the energy formulation. The object appearance and boundary parameters are trained jointly using structured output learning methods, and the benefit of adding pairwise terms is quantified on two different datasets.
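The star convexity prior mentioned in the abstract above can be illustrated with a small check on a binary mask. The following is only a sketch of the original single-centre, Euclidean-ray notion: a shape is star convex with respect to a centre if the straight segment from the centre to every foreground pixel stays inside the foreground. The function name `is_star_convex`, the list-of-lists mask representation, and the sampling density are assumptions made for illustration; the sketch does not implement the thesis's geodesic paths, multiple star centres, or energy minimisation.

```python
def is_star_convex(mask, centre):
    """Check Euclidean star convexity of a binary mask w.r.t. one centre.

    mask   -- list of rows, each a list of truthy/falsy values (hypothetical format)
    centre -- (row, col) of the star centre
    """
    rows, cols = len(mask), len(mask[0])
    cy, cx = centre
    if not mask[cy][cx]:
        return False  # the centre itself must belong to the object
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            # Sample the straight segment from the centre to (y, x),
            # using two samples per pixel step along the longer axis.
            steps = 2 * max(abs(y - cy), abs(x - cx))
            for i in range(steps + 1):
                t = i / steps if steps else 0.0
                py = round(cy + t * (y - cy))
                px = round(cx + t * (x - cx))
                if not mask[py][px]:
                    return False  # the ray leaves the object
    return True


# A filled square is star convex with respect to its middle pixel.
square = [[1 <= r <= 5 and 1 <= c <= 5 for c in range(7)] for r in range(7)]

# A U-shape is not star convex with respect to the top of its left arm,
# because the ray to the right arm crosses the gap.
u_shape = [[c in (0, 4) or r == 4 for c in range(5)] for r in range(5)]

print(is_star_convex(square, (3, 3)))
print(is_star_convex(u_shape, (0, 0)))
```

In an interactive system such a condition would be imposed as a constraint during optimisation, not checked after the fact; the check here only makes the geometric definition concrete.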

High-Order Models in Semantic Image Segmentation

Author: Ismail Ben Ayed
Publisher: Academic Press
Total Pages: 184
Release: 2023-06-22
Genre: Technology & Engineering
ISBN: 0128092297

High-Order Models in Semantic Image Segmentation reviews recent developments in optimization-based methods for image segmentation, presenting several geometric and mathematical models that underlie a broad class of recent segmentation techniques. Focusing on impactful algorithms in the computer vision community in the last 10 years, the book includes sections on graph-theoretic and continuous relaxation techniques, which can compute globally optimal solutions for many problems. The book provides a practical and accessible introduction to these state-of-the-art segmentation techniques that is ideal for academics, industry researchers, and graduate students in computer vision, machine learning and medical imaging. It gives an intuitive and conceptual understanding of this mathematically involved subject by using a large number of graphical illustrations. It provides the right amount of knowledge to apply sophisticated techniques for a wide range of new applications. It contains numerous tables that compare different algorithms, facilitating the appropriate choice of algorithm for the intended application. It presents an array of practical applications in computer vision and medical imaging. It includes code for many of the algorithms, available on the book’s companion website.

Interactive Co-segmentation of Objects in Image Collections

Author: Dhruv Batra
Publisher: Springer Science & Business Media
Total Pages: 56
Release: 2011-11-09
Genre: Computers
ISBN: 1461419158

The authors survey a recent technique in computer vision called Interactive Co-segmentation, which is the task of simultaneously extracting common foreground objects from multiple related images. They survey several of the algorithms, present underlying common ideas, and give an overview of applications of object co-segmentation.

Semantic Image Segmentation

Author: Gabriela Csurka
Publisher:
Total Pages: 0
Release: 2022-10-19
Genre: Computers
ISBN: 9781638280767

Semantic image segmentation (SiS) plays a fundamental role in the general understanding of image content and context across a broad variety of computer vision applications, providing key information for the global understanding of an image. This monograph summarizes two decades of research in the field of SiS: it offers a literature review of solutions starting from early historical methods, followed by an overview of more recent deep learning methods, including the latest trend of using transformers. The publication is complemented by a presentation of particular cases of weak supervision and of side machine learning techniques that can be used to improve semantic segmentation, such as curriculum, incremental or self-supervised learning. State-of-the-art SiS models rely on a large amount of annotated samples, which are more expensive to obtain than labels for tasks such as image classification. Since unlabeled data is significantly cheaper to obtain, it is not surprising that Unsupervised Domain Adaptation (UDA) has achieved broad success within the semantic segmentation community. Therefore, a second core contribution of this monograph is to summarize five years of a rapidly growing field, Domain Adaptation for Semantic Image Segmentation (DASiS), which embraces the importance of semantic segmentation itself and the critical need to adapt segmentation models to new environments. In addition to providing a comprehensive survey on DASiS techniques, newer trends such as multi-domain learning, domain generalization, domain incremental learning, test-time adaptation and source-free domain adaptation are also presented.
The publication concludes by describing the datasets and benchmarks most widely used in SiS and DASiS and briefly discusses related tasks such as instance and panoptic image segmentation, as well as applications such as medical image segmentation. This monograph should provide researchers across academia and industry with a comprehensive reference guide, and will help them in fostering new research directions in the field.

Semantic Video Object Segmentation for Content-Based Multimedia Applications

Author: Ju Guo
Publisher: Springer Science & Business Media
Total Pages: 118
Release: 2013-03-14
Genre: Computers
ISBN: 1461515033

Semantic Video Object Segmentation for Content-Based Multimedia Applications provides a thorough review of state-of-the-art techniques and describes several novel ideas and algorithms for semantic object extraction from image sequences. Semantic object extraction is an essential element in content-based multimedia services, such as those built on the newly developed MPEG-4 and MPEG-7 standards. An interactive system called SIVOG (Smart Interactive Video Object Generation) is presented, which converts the user's semantic input into a form that can be conveniently integrated with low-level video processing. Thus, high-level semantic information and low-level video features are integrated seamlessly into a smart segmentation system. A region and temporal adaptive algorithm is further proposed to improve the efficiency of the SIVOG system, so that nearly real-time video object segmentation with robust and accurate performance becomes feasible. Also included is a joint examination of the shape coding problem and the object segmentation problem. Semantic Video Object Segmentation for Content-Based Multimedia Applications will be of great interest to research scientists and graduate-level students working in content-based multimedia representation and applications and related fields.

Interactive Segmentation Techniques

Author: Jia He
Publisher: Springer Science & Business Media
Total Pages: 82
Release: 2013-08-31
Genre: Technology & Engineering
ISBN: 9814451606

This book focuses on interactive segmentation techniques, which have been extensively studied in recent decades. Interactive segmentation emphasizes clear extraction of objects of interest, whose locations are roughly indicated by human interactions based on high-level perception. This book first introduces classic graph-cut segmentation algorithms and then discusses state-of-the-art techniques, including graph matching methods, region merging and label propagation, clustering methods, and segmentation methods based on edge detection. A comparative analysis of these methods is provided with quantitative and qualitative performance evaluation, illustrated using natural and synthetic images. Extensive statistical performance comparisons are also made. The pros and cons of these interactive segmentation methods are pointed out, and their applications are discussed. There have been only a few surveys on interactive segmentation techniques, and those surveys do not cover recent state-of-the-art techniques. By providing a comprehensive, up-to-date survey of this fast-developing topic together with a performance evaluation, this book can help readers learn interactive segmentation techniques quickly and thoroughly.

Practical Machine Learning for Computer Vision

Author: Valliappa Lakshmanan
Publisher: O'Reilly Media, Inc.
Total Pages: 481
Release: 2021-07-21
Genre: Computers
ISBN: 1098102339

This practical book shows you how to employ machine learning models to extract information from images. ML engineers and data scientists will learn how to solve a variety of image problems including classification, object detection, autoencoders, image generation, counting, and captioning with proven ML techniques. This book provides a great introduction to end-to-end deep learning: dataset creation, data preprocessing, model design, model training, evaluation, deployment, and interpretability. Google engineers Valliappa Lakshmanan, Martin Görner, and Ryan Gillard show you how to develop accurate and explainable computer vision ML models and put them into large-scale production using robust ML architecture in a flexible and maintainable way. You'll learn how to design, train, evaluate, and predict with models written in TensorFlow or Keras. You'll learn how to: design ML architecture for computer vision tasks; select a model (such as ResNet, SqueezeNet, or EfficientNet) appropriate to your task; create an end-to-end ML pipeline to train, evaluate, deploy, and explain your model; preprocess images for data augmentation and to support learnability; incorporate explainability and responsible AI best practices; deploy image models as web services or on edge devices; and monitor and manage ML models.