Automatic Speech Transcription System

The LIUM has a complete automatic speech recognition (ASR) system. The core of the system is based on Sphinx, distributed by Carnegie Mellon University (CMU), and on Kaldi. The LIUM system was initially developed for the transcription of French newscasts. It was then adapted to transcribe debates in English, Spanish, Italian, Arabic and German, as well as telephone dialogues in French and English.

System for Statistical Machine Translation

The LIUM has been carrying out research in machine translation since 2007. The LIUM system is based on Moses, a statistical machine translation system built on the concept of phrase tables. We regularly add new functionality to improve performance, as shown by our results in international evaluation campaigns such as WMT, IWSLT, and those organized by NIST.

The statistical approach to translation is generic and independent of the language pair being translated. However, we have concentrated our efforts on the following languages: English, French, Arabic and Mandarin.

Our research in translation is also marked by a close collaboration with the company SYSTRAN, the world leader in the translation software market. We work together on bringing statistical approaches and formal methods closer together.

Speaker Recognition System

Speaker recognition work began at the LIUM at the end of 2004. Since then, we have worked on speaker diarization (single-show and cross-show), speaker identification and verification, and language identification. The resulting systems have achieved strong results in speaker diarization and identification, including first places in the ETAPE 2012 and REPERE evaluation campaigns.

Participation in Evaluation Campaigns

The LST team regularly takes part in international and national evaluation campaigns in speech recognition, machine translation, speech translation and speaker recognition. These campaigns aim to assess the performance of state-of-the-art technologies and allow participants to compare their systems with those of the best laboratories in the field. The following list summarizes the team's participation.

  • IWSLT 2011, we obtained first place for speech translation (English to French).
  • ETAPE 2012, we obtained first place for the French speech transcription task and for the cross-show speaker diarization task.
  • REPERE 2013 and 2014, we obtained first place for the French speech transcription task, the cross-show speaker diarization task and the speaker identification task.
  • IWSLT 2014, we obtained 6th place for English speech transcription and 1st place for Italian speech transcription.
  • MGB Challenge 2015, we obtained 2nd place for speaker diarization.
  • NIST Open MT, we obtained 7th, 4th and 4th place, respectively, for Chinese-to-English machine translation of SMS, chat and telephone conversations.
  • QALAb, we obtained 1st place and 2nd place for automatic correction of Arabic text.