ISCA Special Interest Group: Under-resourced Languages (SIGUL)

Aims. SIGUL intends to bring together professionals involved in the development of language resources and technologies for under-resourced languages. Its main objective is to build a community that not only supports linguistic diversity through technology and ICT but also commits to improving the chances of lesser-resourced languages (regional, minority, or endangered) to survive in the digital world through language and speech technology. SIGUL is a joint Special Interest Group of the European Language Resources Association (ELRA) and the International Speech Communication Association (ISCA).

Motivation. Porting an NLP system (for instance, a speech recognition system or a syntactic parser) to a lesser-resourced language requires techniques that go far beyond simply re-training the models. Processing a new language often raises new challenges (special phonetic and phonological systems, word segmentation problems, fuzzy grammatical structure, unwritten language, etc.). The lack of resources, in turn, calls for innovative data collection methodologies (via community sourcing, for instance), models in which information is shared between languages (e.g. multilingual acoustic models), or even approaches that do not need annotated data (e.g. zero-resource or zero-shot methods). In addition, social and cultural aspects of the targeted language's context bring further problems: languages with many dialects in different regions, code-switching phenomena, and a massive presence of non-native speakers. It is also important to bridge the gap between language experts, native speakers, and technology experts. Finally, digital humanities offer new opportunities to work on ancient languages, which are inherently under-resourced. The main goal of this SIG will therefore be to increase interaction between researchers interested in all of the above topics.

Board. (provisional before elections)

• Chair and ELRA liaison representative (CNR-ILC, Pisa, Italy)
• Co-chair and ISCA liaison representative (LIG, Grenoble, France)
• Secretary (NAIST, Nara, Japan)

ISCA-supported events

  • Spoken Language Technologies for Under-resourced Languages (SLTU) workshops
  • Workshops at LREC
    • LREC 2014 (Reykjavik, Iceland) "Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era", proceedings
    • LREC 2012 (Istanbul, Turkey) "Language technology for normalisation of less-resourced languages", proceedings
    • LREC 2010 (Malta) "Creation and use of basic lexical resources for less-resourced languages", proceedings
    • LREC 2008 (Marrakech, Morocco) "Collaboration: interoperability between people in the creation of language resources for less-resourced languages", proceedings
  • Other related events
    • CCURL2016: Collaboration and Computing for Under-Resourced Languages - Towards an Alliance for Digital Language Diversity (Portoroz, Slovenia) proceedings
    • CCURL2014: Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era (Reykjavik, Iceland) proceedings

    Join the SIG: please contact the SIG representative at LIG (Grenoble, France).