Publications

Données, transparence et démocratie

Abstract

Bienvenue dans l’âge des données. Nos actions sur Internet sont enregistrées au profit d’entreprises qui valorisent ces données et offrent des services en échange. Pouvons-nous faire de même ? Pouvons-nous utiliser les données de l’état pour améliorer notre démocratie ?Depuis 2016, les données publiques doivent être ouvertes àà tous. Les citoyens peuvent les analyser pour mesurer l’efficacité de l’action publique ou pour leur compte personnel. Les data journalistes les utilisent pour nous éclairer, les chercheurs pour comprendre. Ainsi la transparence permet de lutter contre la corruption et les intox, tout comme elle est source de progrès.Ce changement de paradigme, l’accès aux données de l’état, est surtout une opportunité pour participer. Noter un dysfonctionnement permet de suggérer une amélioration, un manque peut être une opportunité économique, même un jeu de données incomplet est une occasion pour tisser des liens entre l’administration, les associations et les citoyens.À travers cet essai, l’auteur nous propose un voyage optimiste dans le monde des données. Chemin faisant, les liens entre ces données et la transparence ouvrent la voie vers une démocratie plus ouverte, plus interactive et donc plus juste.

Continue reading

Effective reductions of Mealy machines

By Florian Renkin, Philipp Schlehuber-Caissier, Alexandre Duret-Lutz, Adrien Pommellet

2022-04-26

In Proceedings of the 42nd international conference on formal techniques for distributed objects, components, and systems (FORTE’22)

Abstract

We revisit the problem of reducing incompletely specified Mealy machines with reactive synthesis in mind. We propose two techniques: the former is inspired by the tool MeMin and solves the minimization problem, the latter is a novel approach derived from simulation-based reductions but may not guarantee a minimized machine. However, we argue that it offers a good enough compromise between the size of the resulting Mealy machine and performance. The proposed methods are benchmarked against MeMin on a large collection of test cases made of well-known instances as well as new ones.

Continue reading

LTL under reductions with weaker conditions than stutter invariance

By Emmanuel Paviot-Adet, Denis Poitrenaud, Étienne Renault, Yann Thierry-Mieg

2022-04-18

In Proceedings of the 41th IFIP international conference on formal techniques for distributed objects, components and systems (FORTE’22)

Abstract

Verification of properties expressed as omega-regular languages such as LTL can benefit hugely from stutter insensitivity, using a diverse set of reduction strategies. However properties that are not stutter invariant, for instance due to the use of the neXt operator of LTL or to some form of counting in the logic, are not covered by these techniques in general. We propose in this paper to study a weaker property than stutter insensitivity. In a stutter insensitive language both adding and removing stutter to a word does not change its acceptance, any stuttering can be abstracted away; by decomposing this equivalence relation into two implications we obtain weaker conditions. We define a shortening insensitive language where any word that stutters less than a word in the language must also belong to the language. A lengthening insensitive language has the dual property. A semi-decision procedure is then introduced to reliably prove shortening insensitive properties or deny lengthening insensitive properties while working with a reduction of a system. A reduction has the property that it can only shorten runs. Lipton’s transaction reductions or Petri net agglomerations are examples of eligible structural reduction strategies. An implementation and experimental evidence is provided showing most non- random properties sensitive to stutter are actually shortening or lengthening in- sensitive. Performance of experiments on a large (random) benchmark from the model-checking competition indicate that despite being a semi-decision proce- dure, the approach can still improve state of the art verification tools. 1 Introduction Model checking is an automatic verification technique for proving the correctness of systems that have finite state abstractions. Properties can be expressed using the popular Linear-time Temporal Logic (LTL). To verify LTL properties, the automata-theoretic approach [25] builds a product between a Buchi automaton representing the negation of the LTL formula and the reachable state graph of the system (seen as a set of infinite runs). This approach has been used successfully to verify both hardware and software components, but it suffers from the so called “state explosion problem”: as the number of state variables in the system increases, the size of the system state space grows exponentially.

Continue reading

Estimation of the noise level function for color images using mathematical morphology and non-parametric statistics

By Baptiste Esteban, Guillaume Tochon, Edwin Carlinet, Didier Verna

2022-04-08

In Proceedings of the 26th international conference on pattern recognition

Abstract

Noise level information is crucial for many image processing tasks, such as image denoising. To estimate it, it is necessary to find homegeneous areas within the image which contain only noise. Rank-based methods have proven to be efficient to achieve such a task. In the past, we proposed a method to estimate the noise level function (NLF) of grayscale images using the tree of shapes (ToS). This method, relying on the connected components extracted from the ToS computed on the noisy image, had the advantage of being adapted to the image content, which is not the case when using square blocks, but is still restricted to grayscale images. In this paper, we extend our ToS-based method to color images. Unlike grayscale images, the pixel values in multivariate images do not have a natural order relationship, which is a well-known issue when working with mathematical morphology and rank statistics. We propose to use the multivariate ToS to retrieve homogeneous regions. We derive an order relationship for the multivariate pixel values thanks to a complete lattice learning strategy and use it to compute the rank statistics. The obtained multivariate NLF is composed of one NLF per channel. The performance of the proposed method is compared with the one obtained using square blocks, and validates the soundness of the multivariate ToS structure for this task.

Continue reading

A benchmark of named entity recognition approaches in historical documents

By Nathalie Abadie, Edwin Carlinet, Joseph Chazalon, Bertrand Duménieu

2022-04-07

In Proceedings of the 15th IAPR international workshop on document analysis system

Abstract

Named entity recognition (NER) is a necessary step in many pipelines targeting historical documents. Indeed, such natural language processing techniques identify which class each text token belongs to, e.g. “person name”, “location”, “number”. Introducing a new public dataset built from 19th century French directories, we first assess how noisy modern, off-the-shelf OCR are. Then, we compare modern CNN- and Transformer-based NER techniques which can be reasonably used in the context of historical document analysis. We measure their requirements in terms of training data, the effects of OCR noise on their performance, and show how Transformer-based NER can benefit from unsupervised pre-training and supervised fine-tuning on noisy data. Results can be reproduced using resources available at https://github.com/soduco/paper-ner-bench-das22 and https://zenodo.org/record/6394464

Continue reading

Learning grayscale mathematical morphology with smooth morphological layers

Abstract

The integration of mathematical morphology operations within convolutional neural network architectures has received an increasing attention lately. However, replacing standard convolution layers by morphological layers performing erosions or dilations is particularly challenging because the min and max operations are not differentiable. P-convolution layers were proposed as a possible solution to this issue since they can act as smooth differentiable approximation of min and max operations, yielding pseudo-dilation or pseudo-erosion layers. In a recent work, we proposed two novel morphological layers based on the same principle as the p-convolution, while circumventing its principal drawbacks, and showcased their capacity to efficiently learn grayscale morphological operators while raising several edge cases. In this work, we complete those previous results by thoroughly analyzing the behavior of the proposed layers and by investigating and settling the reported edge cases. We also demonstrate the compatibility of one of the proposed morphological layers with binary morphological frameworks.

Continue reading

Is it really easy to detect sybil attacks in c-ITS environments: A position paper

By Badis Hammi, Yacine Mohamed Idir, Sherali Zeadally, Rida Khatoun, Jamel Nebhen

2022-04-01

In IEEE Transactions on Intelligent Transportation Systems

Abstract

In the context of current smart cities, Cooperative Intelligent Transportation Systems (C-ITS) represent one of the main use case scenarios that aim to improve peoples? daily lives. Thus, during the last few years, numerous standards have been adopted to regulate such networks. Within a C-ITS, a large number of messages are exchanged continuously in order to ensure that the different applications operate efficiently. However, these networks can be the target of numerous attacks. The sybil attack is among the most dangerous ones. In a sybil attack, an attacker creates multiple identities and then disguises as several fake stations in order to interfere with the normal operations of the system or profit from provided services. We analyze recently proposed sybil detection approaches regarding their compliance with the current C-ITS standards as well as their evaluation methods. We provide several recommendations such as network and attack models as well as an urban and highway datasets that can be considered in future research in sybil attack detection.

Continue reading

Qu’est-ce que mon GNN capture vraiment ? Exploration des représentations internes d’un GNN

By Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, Céline Robardet

2022-03-24

In Extraction et gestion des connaissances, EGC 2022, blois, france, 24 au 28 janvier 2022

Abstract

While existing GNN’s explanation methods explain the decision by studying the output layer, we propose a method that analyzes the hidden layers to identify the neurons that are co-activated for a class. We associate to them a graph.

Continue reading

Local intensity order transformation for robust curvilinear object segmentation

By Tianyi Shi, Nicolas Boutry, Yongchao Xu, Thierry Géraud

2022-03-22

In IEEE Transactions on Image Processing

Abstract

Segmentation of curvilinear structures is important in many applications, such as retinal blood vessel segmentation for early detection of vessel diseases and pavement crack segmentation for road condition evaluation and maintenance. Currently, deep learning-based methods have achieved impressive performance on these tasks. Yet, most of them mainly focus on finding powerful deep architectures but ignore capturing the inherent curvilinear structure feature (e.g., the curvilinear structure is darker than the context) for a more robust representation. In consequence, the performance usually drops a lot on cross-datasets, which poses great challenges in practice. In this paper, we aim to improve the generalizability by introducing a novel local intensity order transformation (LIOT). Specifically, we transfer a gray-scale image into a contrast- invariant four-channel image based on the intensity order between each pixel and its nearby pixels along with the four (horizontal and vertical) directions. This results in a representation that preserves the inherent characteristic of the curvilinear structure while being robust to contrast changes. Cross-dataset evaluation on three retinal blood vessel segmentation datasets demonstrates that LIOT improves the generalizability of some state-of-the-art methods. Additionally, the cross-dataset evaluation between retinal blood vessel segmentation and pavement crack segmentation shows that LIOT is able to preserve the inherent characteristic of curvilinear structure with large appearance gaps. An implementation of the proposed method is available at https://github.com/TY-Shi/LIOT.

Continue reading

Electricity price forecasting on the day-ahead market using machine learning

Abstract

The price of electricity on the European market is very volatile. This is due both to its mode of production by different sources, each with its own constraints (volume of production, dependence on the weather, or production inertia), and by the difficulty of its storage. Being able to predict the prices of the next day is an important issue, to allow the development of intelligent uses of electricity. In this article, we investigate the capabilities of different machine learning techniques to accurately predict electricity prices. Specifically, we extend current state-of-the-art approaches by considering previously unused predictive features such as price histories of neighboring countries. We show that these features significantly improve the quality of forecasts, even in the current period when sudden changes are occurring. We also develop an analysis of the contribution of the different features in model prediction using Shap values, in order to shed light on how models make their prediction and to build user confidence in models.

Continue reading