BCL Technologies received Session's Best Paper Award

posted Mar 7, 2008, 1:28 PM by BCL Technologies   [ updated Mar 8, 2008, 9:07 AM by Aman Kumar ]
Wednesday, July 18, 2007: BCL Technologies was selected as the winner of the Session's Best Paper Award at the 5th International Conference on Computing, Communication and Control Technologies.

Title: Spoken Language Understanding Software for Language Learning
 Hassan Alam, Aman Kumar, Fuad Rahman, Yuliya Tarnikova, Rachmat Hartono


In this study we have developed a proof-of-concept, work-in-progress Spoken Language Understanding Software (SLUS) with tailored feedback options, which uses an interactive spoken language interface to teach a foreign language (Arabic) and its culture. The SLUS analyzes the second language learner's input speech and grades it not only for correct pronunciation, vocabulary, and grammar, but also for prosody and intonation. The Arabic language itself has many features that cause difficulties for strategies developed for processing Romance and Germanic languages, as reported in Kirchhoff (2002) and in Chiang et al. (2005). Due to the nature of the challenges posed by less-studied languages such as Arabic, the sophistication of computer-based models of Arabic speech, and especially of dialectal speech, has lagged behind that of the European languages. In order to build such a system we developed a comprehensive model of Iraqi Arabic against which the student’s performance is measured. This model includes many aspects: (1) an acoustic model; (2) an articulation model; (3) a dictionary or vocabulary model; (4) a grammar model; and (5) a model of common errors or “disfluencies”. In traditional (not computer-assisted) instruction, these models take the form of written descriptions and examples of sounds, vocabulary lists, and grammatical rules, and the student’s performance in the language is graded by human instructors. For computer-based language instruction, all of these must be cast as explicit databases and mathematical models, so that they can be used to automatically grade student performance, to identify errors, and to evoke appropriate and believable responses from simulated tutors.

In order to test new methodologies for creating language models, we created a corpus by transcribing and recording the scenarios in both Modern Standard Arabic and the Iraqi dialect that is most prevalent in central and southern Iraq. Using the test sentences from the corpus and acoustic analysis software, we developed preliminary prosodic and intonational models for the target language to create training data with acoustic features. We use SRI's commercial off-the-shelf (COTS) speech recognition engine, DynaSpeak, for speech-to-text processing. We prototyped and performed (1) evaluation of stress and pitch contours of the input speech, (2) addition of phonetic information to SRI's DynaSpeak, and (3) re-ranking of the ASR output using a Support Vector Machine (SVM). In addition, the SLUS detected rudimentary segmental errors (corresponding to a missing consonant or vowel). We evaluated this software on training data with the help of two native speakers, and found that it achieved an accuracy of around 70% in the law-and-order domain.
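
As an illustration of the segmental error check mentioned above (a missing consonant or vowel), the following is a minimal sketch and not the SLUS implementation. It assumes the recognizer output and the target utterance are both available as lists of phoneme symbols, and it aligns the two with a standard edit-distance table to report which expected segments have no counterpart in the learner's speech. The function name `find_missing_segments` and the romanized phoneme symbols are hypothetical.

```python
from typing import List, Tuple


def find_missing_segments(expected: List[str], observed: List[str]) -> List[Tuple[int, str]]:
    """Return (index, phoneme) pairs for expected phonemes that the
    minimum-edit alignment treats as deleted in the observed sequence."""
    n, m = len(expected), len(observed)

    # dist[i][j] = edit distance between expected[:i] and observed[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if expected[i - 1] == observed[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,        # expected phoneme missing
                dist[i][j - 1] + 1,        # extra phoneme in the observed speech
                dist[i - 1][j - 1] + sub,  # match or substitution
            )

    # Walk back through the table and collect the deletions.
    missing: List[Tuple[int, str]] = []
    i, j = n, m
    while i > 0:
        sub = 0 if j > 0 and expected[i - 1] == observed[j - 1] else 1
        if j > 0 and dist[i][j] == dist[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1            # match or substitution
        elif dist[i][j] == dist[i - 1][j] + 1:
            missing.append((i - 1, expected[i - 1]))
            i -= 1                         # expected phoneme was dropped
        else:
            j -= 1                         # extra observed phoneme, skip it
    missing.reverse()
    return missing


if __name__ == "__main__":
    # Hypothetical romanized phoneme strings, for illustration only.
    target = ["k", "i", "t", "a", "b"]
    spoken = ["k", "i", "a", "b"]
    print(find_missing_segments(target, spoken))
```

A real system would work from the recognizer's phone-level output and would likely weight substitutions by phonetic similarity. The uniform costs here are a simplifying assumption.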