Research
The ACLP research team has been awarded a number of research grants and has been involved in several projects with leading companies in the industry, in addition to initiating self-funded projects.
 
 
 
 
Speech Driven HMI in Domain-Specific Dialogue
 
SMS Language Identification
 
Phonetic Search Keyword Spotting Engine 
 
Speaker Verification Engine 
 
Voice-Driven Demo Application
 
Open KWS Evaluation
 
Evaluating the Added Contribution of NLU Algorithms Over Existing ASR Technologies
 
Definition of the ACLP Hebrew Speech Corpus
 
Phonetic Search Based on Cross-Language Phoneme Transformation
 
Improving Speech Recognition in Noisy Conditions Using a Laser Microphone
 
Keyword Spotting Application in Several Languages
 
An Efficient Algorithm for Voicemail Transcription
 
LVCSR Engine Sensitivity to Phoneme Recognition
 
 
A Hybrid System for Keyword Spotting in Recorded Conversations
 
Magneton 2013-2015
 
Funded by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton project, which encourages the transfer of technology from academia to industry. The research is being carried out in collaboration with NICE Systems.
 
 
 
The amount of recorded data collected by intelligence systems is overwhelming, and human examiners cannot handle it alone. This big data problem is often addressed with keyword spotting systems that can pinpoint pre-designated terms among thousands of hours of recordings. This project focuses on combining LVCSR and phonetic search KWS engines in order to maximize keyword spotting capabilities and provide keyword flexibility.
 
Robust Speaker Diarization for Multi-Speaker Telephony Environments
 
Magneton 2013-2015
 
Funded by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton project, which encourages the transfer of technology from academia to industry. The research is being carried out in collaboration with NICE Systems.
 
 
Speaker diarization estimates the number of speakers in a conversation and produces a time-stamped conversational "diary" of the participating speakers. It is becoming an increasingly important component of speech and speaker recognition technologies. The focus of this research is on minimizing speaker diarization errors in commercial applications used by live call centers and trading floors.
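
To make the notion of a time-stamped "diary" concrete, the sketch below collapses per-frame speaker labels into (start, end, speaker) segments and counts the distinct speakers. It only illustrates the output format, not the diarization method studied in the project; the function name, the frame shift and the use of None for non-speech frames are assumptions made for this example.

```python
from itertools import groupby


def frames_to_diary(frame_labels, frame_shift=0.01):
    """Collapse per-frame speaker labels into (start, end, speaker) segments.

    frame_labels: one label per frame; None marks non-speech (assumed convention).
    frame_shift: frame step in seconds (assumed value).
    """
    diary = []
    index = 0
    for speaker, run in groupby(frame_labels):
        length = len(list(run))
        start = index * frame_shift
        end = (index + length) * frame_shift
        if speaker is not None:  # skip non-speech runs
            diary.append((round(start, 3), round(end, 3), speaker))
        index += length
    return diary


# Toy input: speaker A, a pause, speaker B, then A again.
labels = ["A"] * 3 + [None] * 2 + ["B"] * 4 + ["A"] * 1
diary = frames_to_diary(labels, frame_shift=0.5)
num_speakers = len({speaker for _, _, speaker in diary})
```

In a real system the frame labels would come from segmentation and clustering stages, which is where diarization errors arise.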
 
 
Speech Driven HMI in Domain-Specific Dialogue

Maf'at 2010-present
Funded by Maf'at (The Israeli Ministry of Defense research administration for development of arms and technological infrastructure).

 
2013-2014: Speech Driven Question Answering
Adapting the Kaldi speech recognition engine to a domain-specific QA system, with a focus on acoustic modeling and discriminative training.
 
2012-2013: Incorporation of Textual Entailment
Incorporating a textual entailment algorithm as an extension of the textual distance measure.

2011-2012: Domain Specific Speech Recognition
Adapting the HTK speech recognition engine to a domain-specific QA system while improving the textual similarity measure used to locate the response to a question or information request posed by the user.

2010-2011: Textual Distance Measure
Proof of concept for speech-driven question answering in a specific domain. The focus was on developing the framework for a Question-Answering (QA) system that integrates speech recognition, text-based analysis and text-to-speech.
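
The role of a textual distance measure in such a QA framework can be sketched as follows: the recognized question is compared with stored questions, and the answer attached to the closest one is returned (and would then be passed to text-to-speech). The Jaccard distance over word sets used here is only a stand-in, since the source does not specify the measure that was developed; the knowledge base and function names are hypothetical.

```python
def jaccard_distance(text_a, text_b):
    """Distance in [0, 1] between two texts, based on their word sets."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def answer(transcript, knowledge_base):
    """Return the answer stored under the question closest to the transcript."""
    best_question = min(knowledge_base, key=lambda q: jaccard_distance(transcript, q))
    return knowledge_base[best_question]


# Hypothetical domain knowledge base: stored question -> answer text.
knowledge_base = {
    "what are the opening hours": "We are open 9 to 5.",
    "where is the office located": "The office is on the main campus.",
}

# The transcript stands for the speech recognizer's output.
reply = answer("what are your opening hours", knowledge_base)
```

A word-overlap distance of this kind tolerates some recognition errors and paraphrasing, but it cannot tell that two differently worded questions mean the same thing.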
 
 
SMS Language Identification
 
TeleMessage 2013-2014
 
 
The main objective of the project is to develop a robust algorithm for training language profiles and identifying the language of short text messages. The Language Identification (LI) algorithm will handle a fixed set of four languages known a priori, and will support additional languages by training a profile for each new language and adding it to the LI engine.
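
The idea of trainable language profiles can be illustrated with a minimal sketch based on ranked character n-gram profiles and an "out-of-place" rank distance, a classic text categorization technique. This is not necessarily the algorithm developed in the project: the class, the profile size, the toy training sentences and the three example languages are all assumptions chosen for illustration.

```python
from collections import Counter

PROFILE_SIZE = 300  # assumed cap on the number of n-grams kept per profile


def ngrams(text, n_max=3):
    """Yield all character n-grams of length 1..n_max from the padded text."""
    padded = f" {text.lower()} "
    for n in range(1, n_max + 1):
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]


def train_profile(samples, size=PROFILE_SIZE):
    """Return the most frequent n-grams, ordered from most to least frequent."""
    counts = Counter(gram for sample in samples for gram in ngrams(sample))
    return [gram for gram, _ in counts.most_common(size)]


def out_of_place(doc_profile, lang_profile, penalty=PROFILE_SIZE):
    """Sum of rank differences; n-grams missing from the language get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    return sum(abs(r - rank[g]) if g in rank else penalty
               for r, g in enumerate(doc_profile))


class LanguageIdentifier:
    def __init__(self):
        self.profiles = {}

    def add_language(self, name, samples):
        """Train a profile for a language and register it with the engine."""
        self.profiles[name] = train_profile(samples)

    def identify(self, message):
        doc_profile = train_profile([message])
        return min(self.profiles,
                   key=lambda name: out_of_place(doc_profile, self.profiles[name]))


li = LanguageIdentifier()
li.add_language("english", ["thank you very much", "see you tomorrow"])
li.add_language("russian", ["спасибо большое", "увидимся завтра"])
# Supporting a new language only requires training and adding one more profile.
li.add_language("greek", ["ευχαριστώ πολύ", "τα λέμε αύριο"])

english_guess = li.identify("thank you")
russian_guess = li.identify("спасибо")
greek_guess = li.identify("ευχαριστώ")
```

The toy languages use different scripts, so they are easy to tell apart. Short messages in closely related languages are much harder and would need far more training text per profile.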
 
 
An Algorithmic Version of a Phonetic Search Keyword Spotting Engine 
Athena 2013
The objective of the Keyword Spotting project was to develop a phoneme decoder and a phonetic search keyword spotting engine that locates designated keywords within a sequence of phonemes representing a speech signal.
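
Locating a keyword within a decoded phoneme sequence can be sketched as approximate substring matching: a dynamic-programming edit distance in which a match may start at any position in the stream. This is a generic illustration rather than the engine built in the project; the function name, the unit edit costs and the ARPAbet-like phoneme symbols are assumptions.

```python
def phonetic_search(keyword, stream, max_cost=1):
    """Find approximate occurrences of a keyword phoneme sequence in a stream.

    Returns (end_index, cost) pairs, where cost is the smallest edit distance
    between the keyword and any stream segment ending at end_index.
    """
    m = len(keyword)
    # prev[i]: cost of aligning keyword[:i] before any stream phoneme is read
    prev = list(range(m + 1))
    hits = []
    for j, phoneme in enumerate(stream):
        cur = [0]  # a match may start anywhere in the stream at no cost
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (keyword[i - 1] != phoneme)
            extra_in_stream = prev[i] + 1      # stream phoneme not in keyword
            missing_in_stream = cur[i - 1] + 1  # keyword phoneme not decoded
            cur.append(min(substitute, extra_in_stream, missing_in_stream))
        if cur[m] <= max_cost:
            hits.append((j, cur[m]))
        prev = cur
    return hits


keyword = "HH AH L OW".split()                    # "hello"
clean = "DH AH HH AH L OW W ER L D".split()       # decoder output without errors
noisy = "DH AH HH EH L OW W ER L D".split()       # one phoneme misrecognized

exact_hits = phonetic_search(keyword, clean, max_cost=0)
noisy_hits = phonetic_search(keyword, noisy, max_cost=1)
```

With max_cost=0 only exact occurrences are reported. Allowing a nonzero cost tolerates phoneme decoding errors, at the price of overlapping candidate hits around each true occurrence that a real engine would need to consolidate.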
 
An Algorithmic Version of a Speaker Verification Engine 
 
Athena 2013
 
The objective of the Speaker Verification project was to develop a text- and language-independent speaker verification engine, including a speaker training engine.
 
Voice-Driven Demo Application
 
Inuitive 2013

 
The focus of the project was to provide the company with the information it needed to define, assess, integrate and build a voice-driven demo.

Open KWS Evaluation 

All evaluation participants are provided the same data from a “surprise” language and are allotted an initial development phase. Each research center is allowed to use whatever KWS technologies and methods are available to them, but no human intervention is allowed. Evaluations are submitted according to strict guidelines and scored by NIST using a Term Weighted Value (TWV) function.
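
As an illustration of the scoring, the sketch below computes a Term Weighted Value in the form commonly given in NIST spoken term detection evaluation plans: one minus the average, over keywords, of the miss probability plus a weighted false-alarm probability. The weight of 999.9 and the convention of one trial per second of speech are taken from those plans as recalled here and should be treated as assumptions; the function name and the input format are hypothetical.

```python
BETA = 999.9  # false-alarm weight commonly cited for TWV (assumption)


def term_weighted_value(stats, speech_seconds, beta=BETA):
    """Compute TWV from per-keyword counts.

    stats: dict mapping keyword -> (n_true, n_correct, n_false_alarm)
    speech_seconds: total duration of the searched speech, in seconds
    """
    losses = []
    for n_true, n_correct, n_false_alarm in stats.values():
        if n_true == 0:
            continue  # keywords with no true occurrences are not averaged
        p_miss = 1.0 - n_correct / n_true
        p_false_alarm = n_false_alarm / (speech_seconds - n_true)
        losses.append(p_miss + beta * p_false_alarm)
    return 1.0 - sum(losses) / len(losses)


# A perfect system scores 1.0; a system that returns nothing scores 0.0.
perfect = term_weighted_value({"a": (10, 10, 0), "b": (5, 5, 0)}, 3600.0)
silent = term_weighted_value({"a": (10, 0, 0), "b": (5, 0, 0)}, 3600.0)
one_false_alarm = term_weighted_value({"a": (10, 10, 1)}, 3600.0)
```

Under these assumptions a single false alarm in an hour of speech costs a keyword about 0.28, so false alarms on rare keywords are penalized heavily.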
 
OpenKWS2013
The 2013 surprise language was Vietnamese, a South-East Asian tonal language with a vocabulary that consists only of monosyllabic words. The ACLP approach to the OpenKWS2013 evaluation was to perform phonetic search KWS, using a thresholding mechanism based on maximizing TWV to improve the resulting keyword detections.
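
One widely used form of TWV-maximizing thresholding accepts a detection only if doing so is expected to raise the Term Weighted Value. With detection confidence p, an estimated number of true occurrences N of the keyword, speech duration T in seconds and false-alarm weight beta, the expected gain p/N - beta(1 - p)/(T - N) is positive exactly when p exceeds beta N / (T + (beta - 1) N). The sketch below encodes this rule; whether the ACLP system used this exact form is not stated in the source, and the names and the value of beta are assumptions.

```python
def max_twv_threshold(n_true_est, speech_seconds, beta=999.9):
    """Keyword-specific confidence threshold above which a hit raises expected TWV."""
    return beta * n_true_est / (speech_seconds + (beta - 1.0) * n_true_est)


def expected_gain(p, n_true_est, speech_seconds, beta=999.9):
    """Expected per-keyword TWV change from accepting a hit with confidence p."""
    reward_if_correct = p / n_true_est
    cost_if_false_alarm = beta * (1.0 - p) / (speech_seconds - n_true_est)
    return reward_if_correct - cost_if_false_alarm


# n_true_est is often estimated as the sum of a keyword's detection confidences.
rare_threshold = max_twv_threshold(2.0, 3600.0)
frequent_threshold = max_twv_threshold(5.0, 3600.0)
gain_at_threshold = expected_gain(rare_threshold, 2.0, 3600.0)
```

The threshold grows with the estimated number of occurrences, so rare keywords are accepted at lower confidence than frequent ones.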
 
OpenKWS2014
The 2014 surprise language was Tamil, a morphologically rich Indian language spoken mainly in Southern India and Sri Lanka. The ACLP approach to the OpenKWS2014 evaluation was to perform KWS with multiple LVCSR and phonetic search engines, applying thresholding based on maximizing TWV together with score normalization, which made it possible to merge the KWS results of the various engines.
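
Why normalization is needed before merging can be shown with a small sketch: raw scores from different engines live on different scales, so each engine's scores are first normalized per keyword (here with a simple sum-to-one rule) and hits that fall close together in time are then combined. The normalization rule, the time tolerance, the averaging step and the data layout are all assumptions for illustration; the source does not describe the actual mechanisms.

```python
from collections import defaultdict


def sum_to_one(hits):
    """Rescale one engine's scores so each keyword's scores sum to 1.

    Assumes positive raw scores.
    """
    totals = defaultdict(float)
    for hit in hits:
        totals[hit["keyword"]] += hit["score"]
    return [dict(hit, score=hit["score"] / totals[hit["keyword"]]) for hit in hits]


def merge_engines(engine_outputs, tolerance=0.5):
    """Merge hits across engines; hits of the same keyword within `tolerance`
    seconds of an already merged hit are treated as the same detection."""
    merged = []
    for hits in engine_outputs:
        for hit in sum_to_one(hits):
            for m in merged:
                same_term = m["keyword"] == hit["keyword"]
                if same_term and abs(m["time"] - hit["time"]) <= tolerance:
                    m["score"] += hit["score"]
                    break
            else:
                merged.append(dict(hit))
    for m in merged:
        m["score"] /= len(engine_outputs)  # average over engines
    return merged


# Toy outputs with raw scores on different, arbitrary scales.
lvcsr_hits = [
    {"keyword": "x", "time": 1.0, "score": 3.0},
    {"keyword": "x", "time": 5.0, "score": 1.0},
]
phonetic_hits = [{"keyword": "x", "time": 1.2, "score": 2.0}]

merged = merge_engines([lvcsr_hits, phonetic_hits])
```

The detection found by both engines ends up with a higher merged score than the one found by a single engine, which is the effect that merging is meant to achieve.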
 


Definition of the ACLP Hebrew Speech Corpus

2012
The ACLP Hebrew Speech Corpus is a speech database that is being collected by researchers from the Afeka Center for Language Processing.

The corpus will contain a large number of recordings of phonetically rich sentences as well as spontaneous speech. The main goal of the project is to model Modern Hebrew phonemes for Automatic Speech Recognition (ASR) and Text-to-Speech (TTS). The corpus also includes a large lexicon with phonetic transcriptions.
 
Evaluating Commercial and Open-Source Speech Recognition Engines
2012
 
 
 
 
The main purpose of the project was to evaluate both commercial and open-source speech recognition engines according to criteria that will enable the company to assess the potential contribution of its technology to speech recognition results, and to determine what data is needed from the engine in order to optimize this contribution.
 

Phonetic Search Based on Cross-Language Phoneme Transformation

Magneton 2011-2013
Funded by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton project, which encourages the transfer of technology from academia to industry. The research is being carried out in collaboration with NICE Systems.



One of the more efficient methods for analyzing recorded dialogue is keyword spotting: detecting specific keywords within the speech. This is done using automated systems that scan large quantities of data quickly and recognize when any given word occurs. To date, such systems have been developed mainly for the more common languages, since compiling the linguistic resources needed to develop such a system is both time-consuming and resource-intensive. Keyword spotting solutions for less common languages of high security interest rarely exist, if at all. The goal of this research is to develop a keyword spotting methodology that uses existing acoustic models and other language resources from a source language to detect keywords in speech in an under-resourced target language, without training new acoustic models.


Improving Speech Recognition Performance in Noisy Conditions Using a Laser Microphone

2011
This project was carried out with VocalZoom, developers of a unique optoelectronic microphone.
 


For this project, the ACLP designed an American English speech database for testing a laser optic microphone and its influence on Automatic Speech Recognition (ASR) performance. The database was designed to test the improvement in ASR performance in various noisy environments, such as inside idling and moving vehicles and in public locations. Quieter environments such as home and office were also included to verify that there are no adverse effects there. During the project, the algorithms and the speech recognition engine were tuned to work in the new environments.

Keyword Spotting Application in Several Languages

2011
This project was carried out with Athena Security Implementations Ltd., a security solutions company.

This project entailed the design and construction of a demo and marketing tool for keyword spotting (KWS) in several languages. The ACLP designed the flow of the demo in cooperation with the company. For each language, a list of keywords and sentences containing the keywords was composed, evaluation and tuning databases of native speakers were collected, and tuning experiments were conducted.

An Efficient Algorithm for Voicemail Transcription

Magneton 2009-2011
Funded by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton project, which encourages the transfer of technology from academia to industry. The research was carried out in collaboration with SpeechModules Ltd.



The goal of the research was to transcribe voicemail messages automatically and efficiently using a speech recognition engine with a very large lexicon of 100K words, and then to forward the textual results to the user via SMS. The results showed that the computational complexity of a speech recognition engine used for transcribing spontaneous speech can be reduced significantly, enough to accommodate real-time processing, by using various methods for limiting the search space while maintaining recognition performance.
 

LVCSR Engine Sensitivity to Phoneme Recognition

SpeechModules 2009
The project was carried out in collaboration with SpeechModules Ltd., a company specializing in advanced speech recognition technologies.

In this project, the performance of a Large Vocabulary Continuous Speech Recognition (LVCSR) engine was tested at various working points in order to model its sensitivity to phoneme recognition. The LVCSR engine was a three-stage engine used by SpeechModules (phonemes, words, sentences), and the aim of the work was to tune the phoneme recognition engine to the optimal working point with regard to overall LVCSR performance.
 
Our Services
 
 
The ACLP provides research, development and consulting services for industry clients that are either already in or are looking to enter the Speech Processing arena.
 
 