Publications

Weakly supervised training for hologram verification in identity documents

By Glen Pouliquen, Guillaume Chiron, Joseph Chazalon, Thierry Géraud, Ahmad Montaser Awal

2024-04-25

In The 18th international conference on document analysis and recognition (ICDAR 2024)

Abstract

We propose a method to remotely verify the authenticity of Optically Variable Devices (OVDs), often referred to as “holograms”, in identity documents. Our method processes video clips captured with smartphones under common lighting conditions, and is evaluated on two public datasets: MIDV-HOLO and MIDV-2020. Thanks to weakly supervised training, we optimize a feature extraction and decision pipeline that achieves a new leading performance on MIDV-HOLO, while maintaining a high recall on documents from MIDV-2020 used as attack samples. It is also the first method, to date, to effectively address the photo replacement attack task, and it can be trained on genuine samples, attack samples, or both for increased performance. By enabling the verification of OVD shapes and dynamics with very little supervision, this work opens the way towards the use of massive amounts of unlabeled data to build robust remote identity document verification systems on commodity smartphones. Code is available at https://github.com/EPITAResearchLab/pouliquen.24.icdar.
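To make the weakly supervised setting concrete, here is a minimal sketch of a clip-level verifier trained with only one genuine/attack label per video; the backbone, temporal pooling, and label convention below are illustrative assumptions, not the paper's actual pipeline.

```python
import torch
import torch.nn as nn

class ClipVerifier(nn.Module):
    """Hypothetical clip-level verifier: per-frame features are pooled over
    time and scored with a single clip-level (weak) label per video."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Lightweight per-frame feature extractor (stand-in for the real backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.head = nn.Linear(feat_dim, 1)  # genuine vs. attack score

    def forward(self, clip):              # clip: (T, 3, H, W)
        feats = self.backbone(clip)       # (T, feat_dim)
        pooled = feats.max(dim=0).values  # temporal max-pooling over frames
        return self.head(pooled)          # single clip-level logit

# Weak supervision: one genuine/attack label per clip, no frame-level annotations.
model = ClipVerifier()
clip = torch.randn(16, 3, 224, 224)        # 16 frames of a smartphone capture
label = torch.tensor([1.0])                # 1 = genuine (assumed convention)
loss = nn.BCEWithLogitsLoss()(model(clip), label)
loss.backward()
```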

Continue reading

An end-to-end approach for the detection of phishing attacks

By Badis Hammi, Tristan Billot, Danyil Bazain, Nicolas Binand, Maxime Jaen, Chems Mitta, Nour El Madhoun

2024-04-01

In Advanced information networking and applications (AINA)

Abstract

The main approaches/implementations used to counteract phishing attacks involve the use of crowd-sourced blacklists. However, blacklists come with several drawbacks. In this paper, we present a comprehensive approach for the detection of phishing attacks. Our approach uses our own detection engine, which relies on Graph Neural Networks to leverage the hyperlink structure of the websites under analysis. Additionally, we offer a turnkey implementation to end users in the form of a Mozilla Firefox plugin.
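As a rough illustration of the graph-based idea described above, the sketch below builds a tiny hyperlink graph and classifies it with a two-layer GCN. It assumes PyTorch Geometric, random per-URL node features, and a website-level label; none of these choices are taken from the paper's actual engine.

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

class PhishGNN(torch.nn.Module):
    """Illustrative graph classifier: nodes are pages/URLs, edges are hyperlinks,
    and the whole graph is labeled phishing vs. legitimate."""

    def __init__(self, in_dim=16, hidden=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.out = torch.nn.Linear(hidden, 2)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        h = global_mean_pool(h, batch)       # one embedding per website graph
        return self.out(h)                   # phishing / legitimate logits

# Toy website graph: 4 pages with hypothetical per-URL features and hyperlinks.
x = torch.randn(4, 16)
edge_index = torch.tensor([[0, 0, 1, 2], [1, 2, 3, 3]])  # directed hyperlinks
data = Data(x=x, edge_index=edge_index)
batch = torch.zeros(4, dtype=torch.long)                 # single graph in the batch
logits = PhishGNN()(data.x, data.edge_index, batch)      # shape (1, 2)
```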

Continue reading

Automatic vectorization of historical maps: A benchmark

Abstract

Shape vectorization is a key stage of the digitization of large-scale historical maps, especially city maps that exhibit complex and valuable details. Having access to digitized buildings, building blocks, street networks and other geographic content opens numerous new approaches for historical studies such as change tracking, morphological analysis and density estimations. In the context of the digitization of Paris atlases created in the 19th and early 20th centuries, we have designed a supervised pipeline that reliably extracts closed shapes from historical maps. This pipeline is based on a supervised edge filtering stage using deep filters, and a closed shape extraction stage using a watershed transform. It likely relies on multiple suboptimal methodological choices that hamper vectorization performance in terms of accuracy and completeness. This paper comprehensively and objectively investigates which solutions are the most adequate among the numerous possibilities. The following contributions are introduced: (i) we propose an improved training protocol for map digitization; (ii) we introduce a joint optimization of the edge detection and shape extraction stages; (iii) we compare the performance of state-of-the-art deep edge filters with topology-preserving loss functions, including vision transformers; (iv) we evaluate the end-to-end deep learnable watershed against the Meyer watershed. We subsequently design the critical path for a fully automatic extraction of key elements of historical maps. All the data, code, and benchmark results are freely available at https://github.com/soduco/Benchmark_historical_map_vectorization.
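For readers unfamiliar with the second stage, here is a minimal, self-contained sketch of marker-based watershed segmentation over an edge probability map, using scikit-image and SciPy. The threshold and the toy edge map are illustrative; this is not the benchmarked pipeline itself.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def closed_shapes_from_edges(edge_prob, marker_thresh=0.2):
    """Minimal stand-in for the shape-extraction stage: the deep edge
    probability map is used as a relief, connected low-edge regions seed a
    marker-based (Meyer-style) watershed, and each basin is a closed shape."""
    markers, _ = ndi.label(edge_prob < marker_thresh)  # seeds inside shapes
    labels = watershed(edge_prob, markers)             # flood the edge relief
    return labels                                      # integer label per closed region

# Toy edge map: a single square contour with soft (probabilistic) edges.
edge_prob = np.zeros((64, 64))
edge_prob[16, 16:48] = edge_prob[47, 16:48] = 1.0
edge_prob[16:48, 16] = edge_prob[16:48, 47] = 1.0
edge_prob = ndi.gaussian_filter(edge_prob, sigma=1)
regions = closed_shapes_from_edges(edge_prob)          # inside vs. outside labels
```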

Continue reading

The reactive synthesis competition (SYNTCOMP): 2018-2021

Abstract

We report on the last four editions of the reactive synthesis competition (SYNTCOMP 2018-2021). We briefly describe the evaluation scheme and the experimental setup of SYNTCOMP. Then we introduce new benchmark classes that have been added to the SYNTCOMP library and give an overview of the participants of SYNTCOMP. Finally, we present and analyze the results of our experimental evaluations, including a ranking of tools with respect to quantity and quality—that is, the total size in terms of logic and memory elements—of solutions.

Continue reading

Concurrent stochastic lossy channel games

By Daniel Stan, Muhammad Najib, Anthony Widjaja Lin, Parosh Aziz Abdulla

2024-02-01

In Proceedings of the 32nd EACSL annual conference on computer science logic (CSL’24), February 19-23, 2024, Naples, Italy

Abstract

Continue reading

Unsupervised discovery of interpretable visual concepts

Abstract

Providing interpretability of deep-learning models to non-experts, while fundamental for responsible real-world usage, is challenging. Attribution maps from xAI techniques, such as Integrated Gradients, are a typical example of a visualization technique that contains a high level of information but is difficult to interpret. In this paper, we propose two methods, Maximum Activation Groups Extraction (MAGE) and Multiscale Interpretable Visualization (Ms-IV), to explain the model’s decision and enhance global interpretability. MAGE finds, for a given CNN, combinations of features which, globally, form a semantic meaning that we call concepts. We group these similar feature patterns by clustering them into concepts, which we visualize through Ms-IV. This last method is inspired by Occlusion and Sensitivity analysis (incorporating causality) and uses a novel metric, called Class-aware Order Correlation (CAOC), to globally evaluate the most important image regions according to the model’s decision space. We compare our approach to xAI methods such as LIME and Integrated Gradients. Experimental results show that Ms-IV achieves higher localization and faithfulness values. Finally, a qualitative evaluation of the combined MAGE and Ms-IV demonstrates humans’ ability to agree, based on the visualization, with the decisions associated with the clusters’ concepts, and to detect the existence of bias among a given set of networks.
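The following sketch conveys the general idea of grouping channel activations into concept-like clusters. The backbone (ResNet-18), the max-pooling of activations, and the number of clusters are assumptions for illustration only and do not reproduce MAGE or Ms-IV.

```python
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

# Hypothetical concept discovery in the spirit of MAGE: cluster the strongest
# channel activations of one CNN layer into groups ("concepts").
cnn = models.resnet18(weights=None).eval()
layer_acts = []
hook = cnn.layer4.register_forward_hook(
    lambda module, inputs, output: layer_acts.append(output.detach())
)

images = torch.randn(32, 3, 224, 224)        # stand-in for a real image set
with torch.no_grad():
    cnn(images)
hook.remove()

acts = layer_acts[0]                          # (32, 512, 7, 7) feature maps
max_acts = acts.amax(dim=(2, 3)).numpy()      # per-image max activation per channel
concepts = KMeans(n_clusters=8, n_init=10).fit_predict(max_acts.T)  # group channels
print(concepts)                               # concept id assigned to each of the 512 channels
```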

Continue reading

DiffVersify: A scalable approach to differentiable pattern mining with coverage regularization

By Thibaut Chataing, Julien Perez, Marc Plantevit, Céline Robardet

2024-01-10

In Machine learning and knowledge discovery in databases. Research track - European conference, ECML PKDD 2024, Vilnius, Lithuania, September 9-13, 2024, proceedings, part VI

Abstract

Pattern mining addresses the challenge of automatically identifying interpretable and discriminative patterns within data. Recent approaches, which leverage a differentiable approach through a neural autoencoder with class recovery, have achieved encouraging results but tend to fall short as the magnitude of the noise and the number of underlying features in the data increase. Empirically, one can observe that the number of discovered patterns tends to be limited in these challenging contexts. In this article, we present a differentiable binary model that integrates a new regularization technique to enhance pattern coverage. Besides, we introduce an innovative pattern decoding strategy taking advantage of non-negative matrix factorization (NMF), extending beyond the conventional thresholding methods prevalent in existing approaches. Experiments on four real-world datasets exhibit the superior performance of DiffVersify in terms of the ROC-AUC metric. On synthetic data, we observe an increase in the similarity between the discovered patterns and the ground truth. Finally, using several metrics to finely evaluate the quality of the patterns with regard to the data, we show the global effectiveness of the approach.
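A minimal sketch of NMF-based pattern decoding, as opposed to simple thresholding, is shown below. The embedding matrix, component count, and binarization rule are placeholders rather than DiffVersify's actual scheme.

```python
import numpy as np
from sklearn.decomposition import NMF

# Illustrative NMF-based pattern decoding: factorize non-negative codes and
# read each component as a discrete pattern (item set).
rng = np.random.default_rng(0)
embeddings = np.abs(rng.normal(size=(100, 20)))   # stand-in for learned non-negative codes

nmf = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(embeddings)                 # sample-to-pattern weights
H = nmf.components_                               # pattern-to-feature loadings

# Binarize the loadings to turn each component into a discrete pattern.
patterns = (H > H.mean(axis=1, keepdims=True)).astype(int)
for k, p in enumerate(patterns):
    print(f"pattern {k}: features {np.flatnonzero(p)}")
```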

Continue reading

Additive margin in contrastive self-supervised frameworks to learn discriminative speaker representations

By Theo Lepage, Reda Dehak

2024-01-01

In The speaker and language recognition workshop (Odyssey 2024)

Abstract

Self-Supervised Learning (SSL) frameworks have become the standard for learning robust class representations by benefiting from large unlabeled datasets. For Speaker Verification (SV), most SSL systems rely on contrastive loss functions. We explore different ways to improve the performance of these techniques by revisiting the NT-Xent contrastive loss. Our main contribution is the definition of the NT-Xent-AM loss and the study of the importance of the Additive Margin (AM) in the SimCLR and MoCo SSL methods to further separate positive from negative pairs. Despite class collisions, we show that AM enhances the compactness of same-speaker embeddings and reduces the number of false negatives and false positives in SV. Additionally, we demonstrate the effectiveness of the symmetric contrastive loss, which provides more supervision for the SSL task. Implementing these two modifications in SimCLR improves performance and results in 7.85% EER on VoxCeleb1-O, outperforming other equivalent methods.
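The core idea of the additive margin can be sketched in a few lines: subtract a margin from the positive-pair similarity before the softmax, and symmetrize the loss over both views. The simplified version below uses only cross-view negatives (the full NT-Xent also includes within-view negatives), and the margin and temperature values are illustrative rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def nt_xent_am(z1, z2, margin=0.1, temperature=0.07):
    """Simplified NT-Xent with an additive margin on the positive pair."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / temperature          # (N, N) scaled cosine similarities
    n = z1.size(0)
    # Subtract the margin from the positive (diagonal) similarities only, so
    # same-speaker pairs must beat negatives by at least the margin.
    sim = sim - (margin / temperature) * torch.eye(n, device=sim.device)
    targets = torch.arange(n, device=sim.device)
    return F.cross_entropy(sim, targets)

# Two augmented views (utterance segments) of the same batch of speakers.
a, b = torch.randn(8, 192), torch.randn(8, 192)
loss = 0.5 * (nt_xent_am(a, b) + nt_xent_am(b, a))   # symmetric form of the loss
```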

Continue reading

Apprentissage interprétable de la criminalité en France (2012-2021)

By Nida Meddouri, David Beserra

2024-01-01

In Actes de l’atelier gestion et analyse des données spatiales et temporelles

Abstract

Criminal activity in France has evolved significantly over the last two decades, marked by a resurgence of malicious acts, notably linked to social and union movements, riots, and terrorism. In this difficult context, the use of techniques from artificial intelligence could offer many opportunities to strengthen public and private security in France. One example of this approach is the spatio-temporal analysis of crime data, which has already proved successful in Brazil (Da Silva et al., 2020), in the Near East (Tolan et al., 2015), and in other countries. In this work, we explore the possibility of applying this approach to the French context.

Continue reading