Publications

Improvement of a text detection chain and the proposition of a new evaluation protocol for text detection algorithms

Abstract

The objective of this thesis is twofold. On the one hand, it proposes a more accurate evaluation protocol for text detection systems that solves some of the existing problems in this area. On the other hand, it focuses on the design of a text rectification procedure for correcting highly deformed texts. Text detection systems have gained significant importance in recent years. The growing number of approaches proposed in the literature calls for rigorous performance evaluation and ranking. In the context of text detection, an evaluation protocol relies on three elements: a reliable text reference, a set of matching rules that decide the relationship between the ground truth and the detections, and a set of metrics that produce intuitive scores. The few existing evaluation protocols often lack accuracy, either because of inconsistent matching procedures that yield unfair scores or because of unrepresentative metrics. Despite these issues, researchers still use these protocols to evaluate their work. In this Ph.D. thesis, we propose a new evaluation protocol for text detection algorithms that tackles most of the drawbacks of currently used evaluation methods. This work makes three main contributions: first, we introduce a complex text reference representation that does not constrain text detectors to adopt a specific detection granularity level or annotation representation; second, we propose a set of matching rules capable of evaluating any type of scenario that can occur between a text reference and a detection; and finally, we show how a set of detection results can be analyzed not only through a set of metrics but also through an intuitive visual representation. We use this protocol to evaluate different text detectors and then compare the results with those provided by alternative evaluation methods.
A frequent challenge for many Text Understanding Systems is to handle the variety of text characteristics in born-digital and natural scene images, to which current OCR systems are not well adapted. For example, texts in perspective are frequently present in real-world images because the camera capture angle is not normal to the plane containing the text regions. Although some detectors can accurately localize such text objects, the recognition stage fails most of the time. Indeed, most OCR systems are not designed to handle text strings in perspective; they expect horizontal texts in a fronto-parallel plane in order to provide a correct transcription. All these aspects, together with the proposition of a very challenging dataset, motivated us to propose a rectification procedure capable of correcting highly distorted texts.

A tree of shapes for multivariate images

Abstract

The demand for multi-scale and region-based analysis is now evident in many computer vision and pattern recognition applications, where a pixel-based approach is rarely considered a good candidate. To meet this need, the Mathematical Morphology (MM) framework has supplied region-based hierarchical representations of images such as the Tree of Shapes (ToS). The ToS represents the image as a tree encoding the inclusion of its level lines. The ToS is thus self-dual and contrast-change invariant, which makes it well adapted to high-level image processing. Yet it is only defined on grayscale images, and most attempts to extend it to multivariate images (e.g. by imposing an “arbitrary” total ordering) are not satisfactory. In this dissertation, we present the Multivariate Tree of Shapes (MToS) as a novel approach to extend the grayscale ToS to multivariate images. This representation is a mix of the ToSs computed marginally on each channel of the image; it aims at merging the marginal shapes in a “sensible” way by preserving the maximum number of inclusions. The proposed method has theoretical foundations that express the ToS in terms of a topographic map of the curvilinear total variation computed from the image border, which has allowed its extension to multivariate data. In addition, the MToS has properties similar to those of the grayscale ToS, the most important one being its invariance to any marginal change of contrast and any marginal inversion of contrast (a kind of “self-duality” in the multidimensional case). Because the ever larger amount of data to process calls for efficient image processing techniques, we propose an efficient algorithm that builds the MToS in quasi-linear time w.r.t. the number of pixels and quadratic time w.r.t. the number of channels. We also propose tree-based processing algorithms to demonstrate in practice that the MToS is a versatile, easy-to-use, and efficient structure.
Finally, to validate the soundness of our approach, we propose experiments that test the robustness of the structure to non-relevant components (e.g. channels with noise or with low dynamics), and we show that such defects do not affect the overall structure of the MToS. In addition, we propose many real-case applications using the MToS. Many of them are only slight modifications of methods that employ the “regular” ToS, adapted to our new structure. For example, we successfully use the MToS for image filtering, image simplification, image segmentation, image classification and object detection. From these applications, we show that the MToS generally outperforms its ToS-based counterpart, demonstrating the potential of our approach.

MToS: A tree of shapes for multivariate images

By Edwin Carlinet, Thierry Géraud

2015-10-26

In IEEE Transactions on Image Processing

Abstract

The Tree of Shapes (ToS) is a morphological tree that provides a high-level hierarchical representation of the image suitable for many image processing tasks. When dealing with color images, one cannot use the ToS because its definition is ill-formed on multivariate data. Common workarounds, such as marginal processing or imposing a total order on data, are not satisfactory and cause many problems (color artifacts, loss of invariances…). In this paper, we highlight the need for a self-dual and contrast-invariant representation of the image and provide a method that builds a single ToS by merging the shapes computed marginally while preserving the most important properties of the ToS. This method does not try to impose an arbitrary total ordering on values; it uses only the inclusion relationship between shapes, and the merging strategy works in a shape space. Finally, we show the relevance of our method and our structure through several applications involving color and multispectral image analysis.

Variations on parallel explicit model checking for generalized Büchi automata

By Étienne Renault, Alexandre Duret-Lutz, Fabrice Kordon, Denis Poitrenaud

2015-10-26

In International Journal on Software Tools for Technology Transfer (STTT)

Abstract

We present new parallel explicit emptiness checks for LTL model checking. Unlike existing parallel emptiness checks, these are based on a Strongly Connected Component (SCC) enumeration, support generalized Büchi acceptance, and require neither synchronization points nor recomputing procedures. A salient feature of our algorithms is the use of a global union-find data structure in which multiple threads share structural information about the automaton being checked. Besides these basic algorithms, we present one architectural variant that isolates the threads writing to the union-find, and one extension that decomposes the automaton based on the strength of its SCCs in order to use more optimized emptiness checks. Extensive experiments with our algorithms and their variations show encouraging performance, especially when the decomposition technique is used.
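
As an illustration of the union-find data structure mentioned in this abstract, here is a minimal single-threaded sketch in Python. It is not the authors' implementation: the class name `UnionFind` and its methods are hypothetical, the sketch is not thread-safe, and it omits everything specific to the paper (sharing between threads, acceptance marks, emptiness checking). It only shows how states found to belong to the same SCC can be merged into one set.

```python
class UnionFind:
    """Sequential union-find with path compression and union by rank.

    Illustrative only: the structure described in the abstract is shared
    between threads, which this sketch does not attempt to handle.
    """

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def add(self, state):
        # Register a state as a singleton set if it is not known yet.
        if state not in self.parent:
            self.parent[state] = state
            self.rank[state] = 0

    def find(self, state):
        # First pass: locate the representative of the set.
        root = state
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, every visited node points to root.
        while self.parent[state] != root:
            next_state = self.parent[state]
            self.parent[state] = root
            state = next_state
        return root

    def union(self, a, b):
        # Merge the sets containing a and b (e.g. two states of one SCC).
        self.add(a)
        self.add(b)
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return root_a
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1
        return root_a

    def same_set(self, a, b):
        # True when both states are known and share a representative.
        if a not in self.parent or b not in self.parent:
            return False
        return self.find(a) == self.find(b)


# Usage: states 1, 2 and 3 are found to lie on a common cycle; 4 is separate.
uf = UnionFind()
uf.union(1, 2)
uf.union(2, 3)
uf.add(4)
```

After these calls, `uf.same_set(1, 3)` holds while `uf.same_set(1, 4)` does not.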

SAT-based minimization of deterministic $\omega$-automata

By Souheib Baarir, Alexandre Duret-Lutz

2015-09-01

In Proceedings of the 20th international conference on logic for programming, artificial intelligence, and reasoning (LPAR’15)

Abstract

We describe a tool that takes as input a deterministic $\omega$-automaton with any acceptance condition, and synthesizes an equivalent $\omega$-automaton with another arbitrary acceptance condition and a given number of states, if such an automaton exists. This tool, which relies on a SAT-based encoding of the problem, can be used to provide minimal $\omega$-automata equivalent to given properties, for different acceptance conditions.

Using histogram representation and earth mover’s distance as an evaluation tool for text detection

By Stefania Calarasanu, Jonathan Fabrizio, Séverine Dubuisson

2015-08-01

In Proceedings of the 13th IAPR international conference on document analysis and recognition (ICDAR)

Abstract

In the context of text detection evaluation, it is essential to use protocols that are capable of describing both the quality and the quantity aspects of detection results. In this paper, we propose a novel visual representation and evaluation tool that captures the whole nature of a detector by using histograms. First, two histograms (coverage and accuracy) are generated to visualize the different characteristics of a detector. Second, we compare these two histograms to a so-called optimal one to compute representative and comparable scores. To do so, we introduce the use of the Earth Mover’s Distance as a reliable evaluation tool to estimate recall and precision scores. Results obtained on the ICDAR 2013 dataset show that this method intuitively characterizes the accuracy of a text detector and gives, at a glance, various useful characteristics of the analyzed algorithm.
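
To make the idea of comparing a histogram with an optimal one concrete, here is a minimal Python sketch of the Earth Mover’s Distance between two one-dimensional histograms. Several points are assumptions made for illustration and are not taken from the paper: the function names `emd_1d` and `score_from_histogram`, the unit spacing between bins, the choice of an optimal histogram with all its mass in the last bin, and the normalization of the distance into a score in [0, 1].

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms.

    Both histograms are normalized to unit mass and bins are assumed to be
    one unit apart, so the distance is the sum of the absolute differences
    between the two cumulative distributions.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")
    carried = 0.0  # mass that still has to be moved to the next bin
    work = 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work


def score_from_histogram(hist):
    """Hypothetical score in [0, 1]: 1 minus the normalized EMD to an
    'optimal' histogram whose mass is entirely in the last (best) bin."""
    n_bins = len(hist)
    if n_bins < 2:
        raise ValueError("need at least two bins")
    optimal = [0.0] * (n_bins - 1) + [1.0]
    # The largest possible EMD is n_bins - 1 (all mass in the first bin).
    return 1.0 - emd_1d(hist, optimal) / (n_bins - 1)


# Usage: ten quality bins, with most detections falling in the best bins.
example_hist = [0, 0, 0, 0, 0, 0, 1, 2, 3, 14]
example_score = score_from_histogram(example_hist)
```

With this convention, a histogram concentrated in the last bin scores 1 and one concentrated in the first bin scores 0.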

Morphological object picking based on the color tree of shapes

By Edwin Carlinet, Thierry Géraud

2015-06-29

In Proceedings of 5th international conference on image processing theory, tools and applications (IPTA’15)

Abstract

The Tree of Shapes is a self-dual and contrast-invariant morphological tree that provides a high-level hierarchical representation of images, suitable for many image processing tasks. Despite its power and its simplicity, it is still under-exploited in pattern recognition and computer vision. In this paper, we show that both interactive and automatic image segmentation can be achieved with some simple tree processing operations. To that end, we rely on the recently defined “Color Tree of Shapes”. We propose a method for interactive segmentation that does not involve any statistical learning, yet yields results that compete with state-of-the-art approaches. We further extend this algorithm to unsupervised segmentation and give some results. Although they are preliminary, they highlight the potential of such an approach that works in the shape space.

Une approche morphologique de segmentation interactive avec l’arbre des formes couleur

By Edwin Carlinet, Thierry Géraud

2015-06-16

In Actes du 15e colloque GRETSI

Abstract

The Tree of Shapes is a morphological tree that is both self-dual and invariant to contrast changes. It provides a high-level representation of the image that is useful for many image processing tasks. Despite its potential and its simplicity, it remains largely under-used in pattern recognition and computer vision. In this paper, we present an interactive segmentation method that is carried out simply by manipulating this tree. To do so, we rely on a recently defined representation: the Color Tree of Shapes. The interactive segmentation method we propose requires no statistical learning; nevertheless, it obtains results that compete with the state of the art. Although preliminary, these results highlight the potential and the value of methods that work in the shape space.

On refinement of Büchi automata for explicit model checking

By František Blahoudek, Alexandre Duret-Lutz, Vojtěch Rujbr, Jan Strejček

2015-06-15

In Proceedings of the 22nd international SPIN symposium on model checking of software (SPIN’15)

Abstract

In explicit model checking, systems are typically described in an implicit and compact way. Some valid information about the system can be easily derived directly from this description, for example that some atomic propositions cannot be valid at the same time. The paper shows several ways to apply this information to improve the Büchi automaton built from an LTL specification. As a result, we get smaller automata with shorter edge labels that are easier to understand and, more importantly, for which the explicit model checking process performs better.

Practical stutter-invariance checks for $\omega$-regular languages

By Thibaud Michaud, Alexandre Duret-Lutz

2015-06-15

In Proceedings of the 22nd international SPIN symposium on model checking of software (SPIN’15)

Abstract

We propose several automata-based constructions that check whether a specification is stutter-invariant. These constructions assume that a specification and its negation can be translated into Büchi automata, but aside from that, they are independent of the specification formalism. These transformations were inspired by a construction due to Holzmann and Kupferman, which we broke down into two operations that can have different realizations and can be combined in different ways. As it turns out, only one of these operations needs to be implemented to obtain a functional stutter-invariance check. Finally, we have implemented these techniques in a tool so that users can easily check whether an LTL or PSL formula is stutter-invariant.
