Research Topics
More than 25 international scientific publications on:
- Language modeling for speech recognition
- Large vocabulary continuous speech recognition
- Conversational speech processing
- Speaker name identification
- Speech recognition system for human-machine dialog
Research Work
Responsibilities
- Scientific and administrative coordinator of the EPAC project funded by the ANR (French National Research Agency)
- Scientific and administrative coordinator of the ASH project funded by the ANR
- Local scientist in charge of the PORT-MEDIA project funded by the ANR
- Research member of the EuroMatrixPlus European project funded by the European Union
- Co-supervisor of three PhD theses
- Scientist in charge of the collaboration between Specinov (a French software publisher) and LIUM
Chronological presentation (since 2002 only)
My research work mainly focuses on language modeling for speech recognition.
During my PhD (obtained in 2002), I worked on language modeling for speech recognition in the context of human/machine dialog, in collaboration with France Telecom Research & Development (FTRD).
In 2002, I worked as a research engineer at the France Telecom R&D center in Lannion. I built language models for several France Telecom applications and ran experiments to compare competing approaches.
Since September 2003, I have been an Associate Professor at LIUM (Laboratoire d'Informatique de l'Université du Mans, Computer Science Laboratory of the University of Le Mans). I have worked on language modeling for large vocabulary continuous speech recognition, particularly for automatic transcription of radio broadcast news. With Professor Paul Deléglise and Assistant Professor Sylvain Meignier, I have taken part in the development of a complete speech recognition system based on the CMU Sphinx decoders (Sphinx 3 and Sphinx 4).
Our transcription system ranked second in the French evaluation campaign ESTER1 in 2005 with a word error rate of 23.6%, and was still the best open source system in the ESTER2 campaign in 2008 with a word error rate of 17.8%. These evaluations were carried out on French broadcast news recordings.
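For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions and insertions) between the system output and a reference transcript, divided by the number of reference words. The sketch below illustrates that definition only: it is not the scoring tool used in the ESTER campaigns, the function name word_error_rate is hypothetical, and real evaluations apply text normalization rules that are ignored here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokens are split on whitespace, with no
    normalization of case, punctuation or numbers.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Under this definition, a word error rate of 17.8% corresponds to roughly 18 word errors per 100 reference words.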
In 2007, we were invited to the 3rd internal ASR evaluation of the TC-STAR European project on speech-to-speech translation. Our system participated in the English and Spanish ASR evaluations: it was the best invited system. These evaluations consisted in transcribing speeches by members of the European Parliament.
From 2003 to 2006, I co-supervised Julie Mauclair's thesis on confidence measures with Professor Paul Deléglise. We proposed a new confidence measure to estimate the reliability of ASR outputs. This confidence measure is based on word posteriors and language model back-off behavior.
During this time, we developed our ASR system based on CMU Sphinx, which is still under development.
In 2006, with Sylvain Meignier, I proposed a new approach that automatically identifies a speaker by name, without a priori information, by exploiting the outputs of an automatic speaker diarization system and an automatic speech recognition system. This approach focuses on broadcast news processing and remains a focus of our research work.
Since 2007, I have coordinated the ANR EPAC project. This project focuses on conversational speech processing and is a cooperative project between four French academic laboratories: IRIT (Toulouse), LI (Tours), LIA (Avignon) and LIUM. The EPAC project aims to propose methods to extract information from unstructured audio data and to structure these data. This work covers several information channels: speaker diarization, speaker identification, speech recognition, discourse analysis, conversational interactions, and spontaneous speech categorization and detection, among others. This project will end in August 2010.
During this time, I have also co-supervised the theses of two PhD students, who will finish in 2010:
- Richard Dufour's thesis, a doctoral thesis in computer science based on two different studies. The first, related to the EPAC project, consists in proposing a reliable method to categorize and detect spontaneous speech in large volumes of audio data. The second consists in improving the accuracy of a French ASR system by proposing specific language models to correct specific errors.
- Thierry Bazillon's thesis, a doctoral thesis in linguistics, also related to the EPAC project, which studies how to improve the manual annotation of conversational speech so that the annotations are more helpful and precise for training automatic systems. This work is co-supervised by Professor Daniel Luzzati.
Since December 2008, I have been the scientist in charge, for LIUM, of the ANR PORT-MEDIA project. This project focuses on the portability and robustness of human/machine dialog systems. In this project, LIUM works on providing ASR outputs to dialog systems. More specifically, we have to propose paradigms that improve the accuracy of dialog systems by using kinds of ASR output other than the 1-best hypothesis. This work continues the work I started with FTRD. Five French academic laboratories or organizations are involved: LIA (Avignon), LORIA (Nancy), LIG (Grenoble), ELRA (Paris) and LIUM.
In 2009, a new ANR project, called ASH, will start. This project is very ambitious and aims to propose a new framework to combine heterogeneous automatic speech recognition systems in real time. I am the coordinator of the ASH project, which will end in 2012. This project is a cooperative project between three French academic laboratories: IRISA (Rennes), LIA (Avignon) and LIUM.
Lastly, I have worked since 2008 with Professor Holger Schwenk in the field of statistical machine translation: I work on language modeling and transliteration. We participated in the NIST'08 SMT evaluation, where we ranked first among academic laboratories worldwide.
Industrial contract
I am the scientist in charge of the collaboration between the Specinov company and LIUM. This collaboration focuses on the automatic adaptation of an ASR system in the context of a meeting transcription application. This work will end in 2009. It consists in allowing an ASR system to automatically adapt its vocabulary (including pronunciation, particularly for proper nouns) and its language models without the intervention of a human expert. A PhD student, Antoine Laurent, is working on this: he is co-supervised by Professor Paul Deléglise and Assistant Professor Sylvain Meignier.
International collaboration
The CMU Sphinx Group at Carnegie Mellon University, which developed the core (Sphinx 3.x + Sphinx 4) of the ASR system implemented by LIUM, was interested in the new functionalities we added to their tools. In 2006, I joined the development team of the international CMU Sphinx project. New features from LIUM are expected to be integrated into the CMU Sphinx repository on SourceForge in 2009.
Since 2009, I have worked with Professor Holger Schwenk in the EuroMatrixPlus European project on statistical machine translation. The consortium of this project consists of DFKI GmbH (Germany), University of Edinburgh (United Kingdom), Charles University (Czech Republic), Johns Hopkins University (USA), FBK (Italy), DCU (Ireland), Lucy Software and Services GmbH (Germany) and CEET (Czech Republic).
Tools and resources distribution
Some of the tools and acoustic or linguistic models developed by LIUM are available on the LIUM website under an open source (BSD-like) license:
http://www-lium.univ-lemans.fr/speechtools
Others
- Reviewer for the Speech Communication Journal of the International Speech Communication Association and some international conferences
- Member of the International Speech Communication Association
- Member of the Administration Council of the Faculty of Sciences of the University of Le Mans
- Member of the Laboratory Council of the LIUM

