BUT Speech@FIT is one of the most famous speech data mining R&D groups in the world. It was formed in 1997 at the Faculty of Electrical Engineering and Computer Science at BUT, and joined the Department of Computer Graphics and Multimedia of FIT at the creation of FIT in January 2002. The group is advised by Prof. Hermansky, managed by Dr. Jan "Honza" Cernocky, and its research director is Dr. Lukas Burget. BUT Speech@FIT has a significant track record in EC-sponsored projects, ranging from speech corpora collection (SpeechDat-E, SpeeCon), through audiovisual meeting recognition and processing (M4, AMI, AMIDA), to mobile biometric identification (MOBIO), recognition of events for multimedia data mining and security applications (DIRAC, CareTaker, GLOCAL), and aviation safety (A-PiMod). Recently, the group obtained funding from the Horizon 2020 program under the BISON project, which is coordinated by Phonexia, the group’s spin-off company. The group is also funded by the US Government (IARPA and DARPA programs) and by local research agencies (Czech Ministries of Education, Trade and Commerce, Defense and Interior, Technological Agency of the Czech Republic, Grant Agency of the Czech Republic).
BUT Speech@FIT cooperates extensively with international and local industrial partners. It has generated two spin-offs: Phonexia (est. 2006, http://phonexia.com/), which delivers speech analytics solutions to customers in the commercial and security/defense sectors, and ReplayWell (est. 2011, http://replaywell.com/), which develops and commercializes cloud-based indexing and browsing technology. BUT Speech@FIT is active in open-source software development: its STK toolkit, PHNREC phone recognizer, and SNet/TNet neural network training software are used in several labs worldwide. BUT is also involved in the development of KALDI, a new-generation speech toolkit.
The group has at its disposal the equipment needed for serious experiments in speech recognition: more than 1000 CPU cores in several blade chassis, all running CentOS 6 Linux and the SGE system for distributed computing; file servers with a total capacity of more than 375 terabytes; and more than 25 powerful GP-GPU cards for scientific computation. It also maintains speech and language databases: a Linguistic Data Consortium subscription since 2004, and numerous corpora that were purchased separately, collected by the group, or obtained through participation in projects.
The group is also a well-known event organizer. It has organized MLMI 2007 (4th Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms), the NIST Speaker Recognition Evaluation workshop and Odyssey “The Speaker and Language Recognition Workshop” (both in 2010), IEEE ICASSP 2011 in Prague, and IEEE ASRU 2013 in Olomouc.