Project detail

Multi Modal Meeting Manager

Project period: 01.03.2002 - 28.02.2005

About the project

The M4 project started in March 2002 and has a duration of three years. The overall objective of the project is to build a demonstration system that enables structuring, browsing and querying of an archive of automatically analysed meetings. The archived meetings will have taken place in a room equipped with multimodal sensors. For each meeting, audio, video, textual, and (possibly) interaction information will be available. Audio information will come from close-talking and distant microphones, as well as binaural recordings. Video information will come from multiple cameras. While the video and audio information will form several streams of data generated during the meeting, the textual information (the agenda, discussion papers, text of slides) will be pre-generated and can be used to guide the automatic structuring of the meeting. The interaction stream consists of any information that can help in analysing events within the meeting, for example mouse tracking from a PC-based presentation or laser pointing information.

Description (translated from Czech)
The aim of the M4 project, "Multimodal Meeting Manager", is to develop a system for analysing and recording live meetings. Meeting participants will be captured by microphones and cameras. Their speech and gestures will be automatically recognised and indexed for easy navigation and searching in the recording. A user will then be able, for example, to ask the system "When did Mr. X talk about topic Y?" and the system will automatically find the relevant sequences. FIT VUT Brno will work on new methods for recognising specific parts of speech that are independent of the language of the meeting. A further task is to identify the speaker by analysing gestures and to track the speaker with a rotating camera.

Keywords
speech processing, video processing, information merging, meeting summarization

Identifier

IST-2001-34485

Original language

English

Investigators

Units

Department of Computer Graphics and Multimedia
- beneficiary (01.01.2002 - 31.12.2004)

Results

MATĚJKA, P., SCHWARZ, P., ČERNOCKÝ, J., HEŘMANSKÝ, H. Phoneme Recognition using Temporal Patterns. In Proceedings of the conference TSD'2003. International Conference on Text Speech and Dialogue, TSD 2003. 2003. p. 198. ISBN: 3-540-20024-X.

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition, habilitation thesis. Brno: 2002.

POTÚČEK, I. Person Tracking Using Omnidirectional View. In Proceedings of the 9th conference STUDENT EEICT 2003. Vědecký sborník. Brno: Brno University of Technology, 2003. p. 603-607. ISBN: 80-214-2379-X. ISSN: 0572-3043.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. In Proceedings of 7th International Conference Text, Speech and Dialogue 2004. Brno: Springer, 2004. p. 465. ISBN: 3-540-23049-1.

JENDERKA, P., VÍCHA, T. Voice Activity Detection in Multimodal Meeting Manager. In Proceedings of 9th Conference and Competition STUDENT EEICT 2003 Volume 3. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 588-592. ISBN: 80-214-2379-X.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Recognition of Phoneme Strings using TRAP Technique. In Proceedings of 8th International Conference Eurospeech. European Conference EUROSPEECH. Geneva: International Speech Communication Association, 2003. p. 1-4. ISSN: 1018-4074.

MOTLÍČEK, P. Derivation of TRAPs in Auditory Domain. In Proceedings of the International Conference and Competition. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 315-319. ISBN: 80-214-2401-X.

MOTLÍČEK, P., ČERNOCKÝ, J. Time-domain based Temporal Processing with Application of. In Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 821-824. ISSN: 1018-4074.

MOTLÍČEK, P., ČERNOCKÝ, J. Autoregressive Modeling based Feature Extraction for Aurora3 DSR Task. In Proc. EUROSPEECH 2003. European Conference EUROSPEECH. Geneva: Institute for Perceptual Artificial Intelligence, 2003. p. 1801-1804. ISSN: 1018-4074.

MOTLÍČEK, P., ČERNOCKÝ, J. All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task. In 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 295-300. ISBN: 3-540-20024-X. ISSN: 0302-9743.

SCHWARZ, P. Would You Like To Make Your Programs Understand Human Voice? In Proceedings of 9th Conference STUDENT EEICT 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 231-235. ISBN: 80-214-2379-X.

MATĚJKA, P., SCHWARZ, P., HEŘMANSKÝ, H., ČERNOCKÝ, J. Phoneme Recognition using Temporal Patterns. In Proc. 6th International Conference Text, Speech and Dialogue, TSD2003. České Budějovice: Springer Verlag, 2003. p. 465-472. ISBN: 3-540-20024-X.

MATĚJKA, P., SCHWARZ, P., GRÉZL, F., ČERNOCKÝ, J. Phoneme Classification using Temporal Patterns. In Proc. 13th International scientific conference Radioelektronika 2003. Brno: Faculty of Electrical Engineering and Communication BUT, 2003. p. 1-4. ISBN: 80-214-2383-8.

GRÉZL, F. Local time-frequency operators in TRAPs for speech recognition. In 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: University of West Bohemia in Pilsen, 2003. p. 269-274. ISBN: 3-540-20024-X. ISSN: 0302-9743.

GRÉZL, F. Effect of normalization on TRAP based systems in ASR. In Proc. 13th International scientific conference Radioelektronika 2003. Brno: Department of Radioelectronics FEEC BUT, 2003. p. 128-131. ISBN: 80-214-2383-8.

SCHWARZ, P., HEŘMANSKÝ, H., MATĚJKA, P. Použití časové dynamiky k rozpoznávání jazyků z mluvené řeči [Use of temporal dynamics for language recognition from speech]. In Proceedings of Language Recognition Workshop 2003. NIST Gaithersburg, MD USA: 2003. p. 56-62.

GRÉZL, F. Combinations of TRAP-based systems. In Proc. Seventh International conference on Text, Speech and Dialogue. Brno: Faculty of Informatics MU, 2004. p. 323-330. ISBN: 3-540-23049-1.

SZŐKE, I. Speech units automatically generated by ergodic hidden Markov model. In Proceedings of 10th Conference and Competition STUDENT EEICT 2004. Brno: Faculty of Electrical Engineering and Communication BUT, 2004. p. 1.

SUMEC, S. Simulation of Parallel Ray Tracing. In Proceedings of 38th International Conference MOSIS'04. Ostrava: 2004. p. 150. ISBN: 80-85988-98.

KADLEC, J. Lip detection in low resolution images. In Proceedings of the 10th Conference and Competition STUDENT EEICT 2004, Volume 2. Brno: 2004. p. 303-306. ISBN: 80-214-2635-7.

SUMEC, S. Multi View Person Localization. In Proceedings of the 10th Conference and Competition STUDENT EEICT 2004. Brno: Brno University of Technology, 2004. p. 432. ISBN: 80-214-2635-7.

POTÚČEK, I., SUMEC, S., ŠPANĚL, M. Participant activity detection by hands and face movement tracking in the meeting room. In 2004 Computer Graphics International (CGI 2004). Los Alamitos: IEEE Computer Society, 2004. p. 632-635. ISBN: 0-7695-2717-1.

POTÚČEK, I., WALLHOFF, F., ZOBL, M., RIGOLL, G. Dynamic Tracking in Meeting Room Scenarios Using Omnidirectional View. In 17th International Conference on Pattern Recognition (ICPR 2004). Cambridge: IEEE Computer Society, 2004. p. 933-936. ISBN: 0-7695-2128-2.

BURGET, L. Combination of Speech Features Using Smoothed Heteroscedastic Linear Discriminant Analysis. In Proc. 8th International Conference on Spoken Language Processing. Jeju island: Sunjin Printing Co, 2004. p. 2549-2552.

MOTLÍČEK, P., ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. In 7th International Conference, TSD 2004 Brno, Czech Republic, September 2004 Proceedings. Lecture Notes in Computer Science. Brno: Springer Verlag, 2004. p. 379-384. ISBN: 3-540-23049-1. ISSN: 0302-9743.

BURGET, L. Measurement of Complementarity of Recognition Systems. In Proc. Seventh International conference on Text, Speech and Dialogue. Lecture Notes in Artificial Intelligence (LNAI) subseries of LNCS series as Volume 3206. Brno: Springer Verlag, 2004. p. 283-290. ISBN: 3-540-23049-1.

JENDERKA, P., POTÚČEK, I., SUMEC, S. Meeting recordings at Brno University of Technology. In AMI/PASCAL/IM2/M4 workshop. Martigny: 2004. p. 1.

ZEMČÍK, P., HEROUT, A., CRHA, L., TUPEC, P., FUČÍK, O. Particle rendering pipeline in DSP and FPGA. In Proceedings of Engineering of Computer-Based Systems. Los Alamitos: IEEE Computer Society, 2004. p. 361-368. ISBN: 0-7695-2125-8.

POTÚČEK, I., ŠPANĚL, M. Face Detection in Meeting Room Using Omni-directional View. In AMI/PASCAL/IM2/M4 workshop. Martigny: Institute for Perceptual Artificial Intelligence, 2004. p. 1 (1 p.).

ZEMČÍK, P., SUMEC, S., POTÚČEK, I., ŠPANĚL, M., HEROUT, A., PEČIVA, J. Summary of Image/Video Processing for AMI Project in Brno. In Poster at MLMI'04 workshop. Martigny: Institute for Perceptual Artificial Intelligence, 2004. p. 1 (1 p.).

SUMEC, S. Multi Camera Automatic Video Editing. In Proceedings of ICCVG 2004. Warsaw: Kluwer Verlag, 2004. p. 935-945. ISBN: 1-4020-1503-8.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Phoneme Recognition from a Long Temporal Context. In poster at JOINT AMI/PASCAL/IM2/M4 Workshop on Multimodal Interaction and Related Machine Learning Algorithms. Martigny: Institute for Perceptual Artificial Intelligence, 2004. p. 1 (1 p.).

FOUSEK, P., SVOJANOVSKÝ, P., GRÉZL, F., HEŘMANSKÝ, H. New Nonsense Syllables Database - Analyses and Preliminary ASR Experiments. In Proc. 8th International Conference on Spoken Language Processing. 8th International Conference on Spoken Language Processing. Jeju Island: Sunjin Printing Co, 2004. p. 348-351. ISSN: 1225-4111.

SZŐKE, I., SCHWARZ, P., BURGET, L., KARAFIÁT, M., ČERNOCKÝ, J. Phoneme based acoustics keyword spotting in informal continuous speech. In Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005. p. 195-198. ISBN: 80-214-2904-6.

BURGET, L., ČERNOCKÝ, J. Recognition of Speech with Non-random Attributes. In 6th International Conference, TSD 2003 České Budějovice, Czech Republic, September 2003 Proceedings. Lecture Notes in Computer Science. České Budějovice: Springer Verlag, 2003. p. 1. ISBN: 3-540-20024-X. ISSN: 0302-9743.

MOTLÍČEK, P., BURGET, L., ČERNOCKÝ, J. Visual Features for Multimodal Speech Recognition. In Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005. p. 187-190. ISBN: 80-214-2904-6.

POTÚČEK, I. Tracking movement objects in sequence pictures. ElectronicsLetters.com - http://www.electronicsletters.com, 2003, vol. 2003, no. 2, p. 1-15. ISSN: 1213-161X.

KARAFIÁT, M., GRÉZL, F. Using MATLAB for Analysis of TRAP system. Radioengineering, 2003, vol. 2003, no. 4, p. 38-41. ISSN: 1210-2512.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Towards Lower Error Rates in Phoneme Recognition. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 465. ISSN: 0302-9743.

MOTLÍČEK, P., ČERNOCKÝ, J. Multimodal Phoneme Recognition of Meeting Data. Lecture Notes in Computer Science, 2004, vol. 2004, no. 3206, p. 379. ISSN: 0302-9743.

ČERNOCKÝ, J. Temporal processing for feature extraction in speech recognition. In Vědecké spisy VUT. Edice Habilitační a inaugurační spisy, vol. 112. Brno: Publishing house of Brno University of Technology VUTIUM, 2003. ISBN: 80-214-2395-1.

MOTLÍČEK, P. Visual Feature Extraction for Phoneme Recognition of Meetings. Brno: Department of Computer Graphics and Multimedia FIT BUT, 2004.

KARAFIÁT, M., GRÉZL, F., BURGET, L. Combination of MFCC and TRAP features for LVCSR of meeting data. Martigny: 2004.

SCHWARZ, P., MATĚJKA, P., ČERNOCKÝ, J. Phoneme Recognition. AMI Workshop. 2004. p. 1.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing, PhD thesis. Brno: Faculty of Information Technology BUT, 2003. p. 1-138.

BURGET, L. Complementarity of Speech Recognition Systems and System Combination. Brno: Faculty of Information Technology BUT, 2004.

MOTLÍČEK, P. Modeling of Spectra and Temporal Trajectories in Speech Processing. In Sborník příspěvků a prezentací akce Odborné semináře 2003. Brno: Department of Radioelectronics FEEC BUT, 2003.
