IIA to improve speech-recognition services in Hebrew and Arabic

Analyzing Semitic languages such as these has been considered more challenging, hindering the development of high-quality speech recognition tools.

Computer code and an Israeli flag (photo credit: JPOST STAFF)
The Israel Innovation Authority (IIA) and the Israel National Digital Ministry announced on Thursday the establishment of the Association of Natural Language Processing (NLP) Technology Companies, an initiative that will aim to improve the capability of computerized systems for understanding the Hebrew and Arabic languages.

The new association has already received a budget of NIS 7.5 million for its first three years of operation, a sign of the growing importance attached to the efficient digitization of spoken human language.

"The public sector deals with unstructured data in Hebrew and Arabic on a daily basis. One of the major challenges in the digitization of public services is to enable operational efficiency and high productivity while ensuring that such services are free to the public," according to Asher Bitton, the ministry's director-general. 

Hebrew and Arabic have lagged behind in speech recognition across various types of computerized systems, with capabilities for other languages considered much more developed.

Analyzing Semitic languages such as Hebrew and Arabic has been considered more challenging, hindering the development of high-quality speech recognition tools. Another reason for underdevelopment in the field is commercial companies' lack of interest in investing, which led the IIA and the Israel National Digital Ministry to conclude that government intervention was needed.

The new initiative has great potential to affect Israeli consumers, as the low level of computer comprehension of Hebrew and Arabic has hampered the ability of technology providers to develop and implement new services for users in these languages.

The association will aim to build an R&D infrastructure that provides an empirical basis for identifying the linguistic structural elements and models that make up the Hebrew and Arabic languages, and for mapping the ways in which these systems are used.

The infrastructure will then analyze the languages’ syntactic, semantic and morphological characteristics for R&D purposes in the field of natural language processing. 

To give computerized systems a detailed and complete understanding of the languages, the association will draw on a large body of Hebrew and Arabic texts from diverse fields, including news, archives, films, books, articles, customer service, transcribed radio and television broadcasts, professional literature, and more.

The infrastructure established by the association will be set up in the cloud and will enable the secure sharing of linguistic materials via a shared management system and algorithms accessible to all the partners in the association. These partners include companies from a variety of Israeli industries that all have a vested interest in implementing advanced applications that require NLP in Hebrew and Arabic, and all took an active part in establishing the association.

Among the companies involved in the initiative are Rafael, Bank Hapoalim, Intel, Walla! News, Ynet, Ginger Software and the research and innovation wing of Maccabi Health Services. 

Hopefully, the Israeli public will be able to enjoy advanced speech-recognition technology from a wide range of sectors and services in the near future.

"The association that we established this week will allow Israeli industry to clearly define its needs and help close technological gaps by enabling the use of unstructured databases in Hebrew and Arabic, and providing insights which can be harnessed when developing and promoting products and services provided by Israeli companies," Aviv Zeevi from the IIA said.