Welcome!

News Feed Item

The International Computer Science Institute (ICSI) Leads Team Researching Ways to Build Speech Recognition Systems for New Languages Under Severe Data and Time Constraints

The International Computer Science Institute (ICSI) is leading a research team under the IARPA Babel Program that is focused on building speech recognition solutions with self-imposed time and data limitations for a variety of languages. The work aims to better understand fundamental challenges and discover new methods for development of speech models for languages that could emerge as important in the future.

“The goal of the Babel program is to rapidly build speech recognition systems to support effective keyword search for new languages using limited amounts of transcribed speech recorded in real-world conditions,” said Mary Harper, the IARPA Program Manager in charge of the Babel program.

Using only a fraction of the training data usually required, the team aims to build speech recognition systems for several languages in just one week by the end of the program.

“ICSI excels at intellectual challenges and unique approaches to research. This is an intriguing project that puts significant constraints on our researchers as a means to discover better ways to develop automatic speech recognition systems,” said Roberto Pieraccini, director and president of ICSI.

By working on a variety of languages with time and data restrictions, the team will research basic principles of speech technology rather than incremental improvements to existing technology. In addition, this research will be useful in enabling keyword-search systems for those languages that do not have large amounts of transcribed audio.

“The speech recognition systems we’ve built in the past have the curse of being reasonably good, particularly for a few languages and speech recorded in good acoustic conditions, which has often reduced the impetus to significantly change the technology,” said Professor Nelson Morgan, deputy director and leader of the Speech Group at ICSI. “This project strongly pushes us to solve fundamental problems in speech recognition to address the Babel challenge."

In each of the four periods of the project, the team will be given a set of languages and will be tasked with developing methods to quickly build a system. Speech recognition systems are typically trained on thousands of hours of transcribed audio. In this project, the team was initially given only 80 hours of conversational speech for each language, and in each succeeding period a smaller fraction of the audio is transcribed. At the end of each period, the team will be given a new language to build a system – initially in four weeks, but by the end of the program down to just one week.

In addition to Morgan, the leaders of the team are Steven Wegmann of ICSI, Professor Mari Ostendorf of the University of Washington, Professor Janet Pierrehumbert of Northwestern University, Professor Eric Fosler-Lussier of The Ohio State University, and Professor Dan Ellis of Columbia University. Morgan says an important element of the project is that these team leaders have had strong previous research ties with one another in research topics that are essential to the Babel problem.

The project is funded by the Intelligence Advanced Research Projects Activity (IARPA), a research arm of the Office of the Director of National Intelligence, which invests in high-risk/high-payoff research programs.

About ICSI

The International Computer Science Institute (ICSI) is a leading center for research in computer science and one of the few independent, nonprofit research institutes in the United States. With its unique focus on international collaboration and its affiliation with the University of California at Berkeley, ICSI brings together the most influential U.S. scientists and experts from around the world in areas such as computer networking and security, speech and language processing, algorithms, bioinformatics, computer architecture, computer vision, and artificial intelligence. For more information, check ICSI out on the Web:

www.ICSI.berkeley.EDU | http://twitter.com/ICSIatBerkeley | http://blog.ICSI.berkeley.EDU

www.facebook.com/ICSIatBerkeley | www.youtube.com/ICSIatBerkeley

More Stories By Business Wire

Copyright © 2009 Business Wire. All rights reserved. Republication or redistribution of Business Wire content is expressly prohibited without the prior written consent of Business Wire. Business Wire shall not be liable for any errors or delays in the content, or for any actions taken in reliance thereon.

Latest Stories
A critical component of any IoT project is what to do with all the data being generated. This data needs to be captured, processed, structured, and stored in a way to facilitate different kinds of queries. Traditional data warehouse and analytical systems are mature technologies that can be used to handle certain kinds of queries, but they are not always well suited to many problems, particularly when there is a need for real-time insights.
Choosing the right cloud for your workloads is a balancing act that can cost your organization time, money and aggravation - unless you get it right the first time. Economics, speed, performance, accessibility, administrative needs and security all play a vital role in dictating your approach to the cloud. Without knowing the right questions to ask, you could wind up paying for capacity you'll never need or underestimating the resources required to run your applications.
"Software-defined storage is a big problem in this industry because so many people have different definitions as they see fit to use it," stated Peter McCallum, VP of Datacenter Solutions at FalconStor Software, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Enterprise networks are complex. Moreover, they were designed and deployed to meet a specific set of business requirements at a specific point in time. But, the adoption of cloud services, new business applications and intensifying security policies, among other factors, require IT organizations to continuously deploy configuration changes. Therefore, enterprises are looking for better ways to automate the management of their networks while still leveraging existing capabilities, optimizing perf...
The best-practices for building IoT applications with Go Code that attendees can use to build their own IoT applications. In his session at @ThingsExpo, Indraneel Mitra, Senior Solutions Architect & Technology Evangelist at Cognizant, provided valuable information and resources for both novice and experienced developers on how to get started with IoT and Golang in a day. He also provided information on how to use Intel Arduino Kit, Go Robotics API and AWS IoT stack to build an application tha...
IoT generates lots of temporal data. But how do you unlock its value? You need to discover patterns that are repeatable in vast quantities of data, understand their meaning, and implement scalable monitoring across multiple data streams in order to monetize the discoveries and insights. Motif discovery and deep learning platforms are emerging to visualize sensor data, to search for patterns and to build application that can monitor real time streams efficiently. In his session at @ThingsExpo, ...
Manufacturers are embracing the Industrial Internet the same way consumers are leveraging Fitbits – to improve overall health and wellness. Both can provide consistent measurement, visibility, and suggest performance improvements customized to help reach goals. Fitbit users can view real-time data and make adjustments to increase their activity. In his session at @ThingsExpo, Mark Bernardo Professional Services Leader, Americas, at GE Digital, discussed how leveraging the Industrial Internet a...
You think you know what’s in your data. But do you? Most organizations are now aware of the business intelligence represented by their data. Data science stands to take this to a level you never thought of – literally. The techniques of data science, when used with the capabilities of Big Data technologies, can make connections you had not yet imagined, helping you discover new insights and ask new questions of your data. In his session at @ThingsExpo, Sarbjit Sarkaria, data science team lead ...
Extracting business value from Internet of Things (IoT) data doesn’t happen overnight. There are several requirements that must be satisfied, including IoT device enablement, data analysis, real-time detection of complex events and automated orchestration of actions. Unfortunately, too many companies fall short in achieving their business goals by implementing incomplete solutions or not focusing on tangible use cases. In his general session at @ThingsExpo, Dave McCarthy, Director of Products...
WebRTC is bringing significant change to the communications landscape that will bridge the worlds of web and telephony, making the Internet the new standard for communications. Cloud9 took the road less traveled and used WebRTC to create a downloadable enterprise-grade communications platform that is changing the communication dynamic in the financial sector. In his session at @ThingsExpo, Leo Papadopoulos, CTO of Cloud9, discussed the importance of WebRTC and how it enables companies to focus...
Amazon has gradually rolled out parts of its IoT offerings in the last year, but these are just the tip of the iceberg. In addition to optimizing their back-end AWS offerings, Amazon is laying the ground work to be a major force in IoT – especially in the connected home and office. Amazon is extending its reach by building on its dominant Cloud IoT platform, its Dash Button strategy, recently announced Replenishment Services, the Echo/Alexa voice recognition control platform, the 6-7 strategic...
Aspose.Total for .NET is the most complete package of all file format APIs for .NET as offered by Aspose. It empowers developers to create, edit, render, print and convert between a wide range of popular document formats within any .NET, C#, ASP.NET and VB.NET applications. Aspose compiles all .NET APIs on a daily basis to ensure that it contains the most up to date versions of each of Aspose .NET APIs. If a new .NET API or a new version of existing APIs is released during the subscription peri...
Verizon Communications Inc. (NYSE, Nasdaq: VZ) and Yahoo! Inc. (Nasdaq: YHOO) have entered into a definitive agreement under which Verizon will acquire Yahoo's operating business for approximately $4.83 billion in cash, subject to customary closing adjustments. Yahoo informs, connects and entertains a global audience of more than 1 billion monthly active users** -- including 600 million monthly active mobile users*** through its search, communications and digital content products. Yahoo also co...
Ixia (Nasdaq: XXIA) has announced that NoviFlow Inc.has deployed IxNetwork® to validate the company’s designs and accelerate the delivery of its proven, reliable products. Based in Montréal, NoviFlow Inc. supports network carriers, hyperscale data center operators, and enterprises seeking greater network control and flexibility, network scalability, and the capacity to handle extremely large numbers of flows, while maintaining maximum network performance. To meet these requirements, NoviFlow in...
As companies gain momentum, the need to maintain high quality products can outstrip their development team’s bandwidth for QA. Building out a large QA team (whether in-house or outsourced) can slow down development and significantly increases costs. This eBook takes QA profiles from 5 companies who successfully scaled up production without building a large QA team and includes: What to consider when choosing CI/CD tools How culture and communication can make or break implementation