Réda Dehak

A benchmark for toxic comment classification on civil comments dataset

By Corentin Duchêne, Henri Jamet, Pierre Guillaume, Réda Dehak

2023-01-16

In Extraction et gestion des connaissances, EGC 2023, Lyon, France, 16-20 January 2023


Hate speech and toxic comment detection using transformers

By Pierre Guillaume, Corentin Duchêne, Réda Dehak

2022-01-12

In Workshop EGC 2022 DL for NLP

Abstract Hate speech and toxic comment detection on social media has proven to be an essential issue for content moderation. This paper presents a comparison of different Transformer models for hate speech detection: HateBERT, a BERT-based model; RoBERTa; and BERTweet, a RoBERTa-based model. These Transformer models are tested on the Jibes&Delight 2021 Reddit dataset using the same training and testing conditions. Multiple approaches involving feature extraction and data augmentation are detailed in this paper.


The MIT Lincoln Laboratory 2016 speaker recognition system

Abstract This document presents the system submission for the group composed of MIT Lincoln Laboratory, Johns Hopkins University (JHU), Laboratoire de Recherche et de Développement de l’EPITA (LRDE) and Universidad Autónoma de Madrid (ATVS). The primary submission is a combination of four systems focused on i-vector systems. Two secondary submissions are also included.


GMM weights adaptation based on subspace approaches for speaker verification

By Najim Dehak, O. Plchot, M. H. Bahari, L. Burget, H. Van hamme, Réda Dehak

2014-06-16

In Odyssey 2014, the speaker and language recognition workshop

Abstract In this paper, we explored the use of Gaussian Mixture Model (GMM) weights adaptation for speaker verification. We compared two different subspace weight adaptation approaches: Subspace Multinomial Model (SMM) and Non-negative Factor Analysis (NFA). Both techniques achieved similar results and seemed to outperform the retraining maximum likelihood (ML) weight adaptation. However, the training process for the NFA approach is substantially faster than the SMM technique. The i-vector fusion between each weight adaptation approach and the classical i-vector yielded slight improvements on the telephone part of the NIST 2010 Speaker Recognition Evaluation dataset.


Unsupervised methods for speaker diarization: An integrated and iterative approach

By S. Shum, Najim Dehak, Réda Dehak, J. Glass

2013-06-07

In IEEE Transactions on Audio, Speech, and Language Processing

Abstract In speaker diarization, standard approaches typically perform speaker clustering on some initial segmentation before refining the segment boundaries in a re-segmentation step to obtain a final diarization hypothesis. In this paper, we integrate an improved clustering method with an existing re-segmentation algorithm and, in iterative fashion, optimize both speaker cluster assignments and segmentation boundaries jointly. For clustering, we extend our previous research using factor analysis for speaker modeling. In continuing to take advantage of the effectiveness of factor analysis as a front-end for extracting speaker-specific features (i.
