<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <atom:link href="http://isca-speech.org/page-18306/BlogPost/6700809/RSS" rel="self" type="application/rss+xml" />
    <title>ISCA Database Resources</title>
    <link>https://isca-speech.org/</link>
    <description>ISCA blog posts</description>
    <dc:creator>ISCA</dc:creator>
    <generator>Wild Apricot - membership management software and more</generator>
    <language>en</language>
    <pubDate>Sat, 04 Apr 2026 16:50:55 GMT</pubDate>
    <lastBuildDate>Sat, 04 Apr 2026 16:50:55 GMT</lastBuildDate>
    <item>
      <pubDate>Fri, 01 Dec 2023 16:49:06 GMT</pubDate>
      <title>ELRA - Language Resources Catalogue - Update (July 2023)</title>
      <description>&lt;p&gt;&lt;span&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;We are happy to announce that 66 new monolingual lexicons and 1 speech resource are now available in our catalogue. Moreover, 4 speech resources are now available at reduced fees.&lt;/font&gt;&lt;/span&gt;&lt;br&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;strong&gt;1) New Language Resources:&lt;/strong&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/search/?q=Bitext+Lexical+Dataset"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Bitext Lexical Datasets&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;The series of&amp;nbsp;&lt;strong&gt;Bitext Lexical Datasets&lt;/strong&gt;&amp;nbsp;for the generic vocabulary includes Lemmas, POS tagging, Frequency, Named Entities and Offensive features. Depending on the dataset and language, other syntactic and morphological features are also provided. The following 15 languages are available:&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;As a complement to the datasets mentioned above, 11 datasets of&amp;nbsp;&lt;strong&gt;Language Variants&lt;/strong&gt;&amp;nbsp;can also be obtained:&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;ol style="line-height: 20px;"&gt;
  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0136/"&gt;&lt;font color="#095197"&gt;Arabic (MSA)&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0151/"&gt;&lt;font color="#095197"&gt;Arabic Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Arabic Gulf, Arabic Najdi, Arabic Egypt and Arabic MSA variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0137/"&gt;&lt;font color="#095197"&gt;Chinese (Simplified)&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0138/"&gt;&lt;font color="#095197"&gt;Chinese (Traditional)&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset, and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0152/"&gt;&lt;font color="#095197"&gt;Chinese Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset (Simplified + Traditional),&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0139/"&gt;&lt;font color="#095197"&gt;Dutch&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0153/"&gt;&lt;font color="#095197"&gt;Dutch Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Netherlands and Belgium variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0140/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0154/"&gt;&lt;font color="#095197"&gt;English Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of United States, United Kingdom and India variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0141/"&gt;&lt;font color="#095197"&gt;Finnish&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0155/"&gt;&lt;font color="#095197"&gt;Finnish Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Standard and Colloquial Finnish variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0142/"&gt;&lt;font color="#095197"&gt;French&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0156/"&gt;&lt;font color="#095197"&gt;French Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of France, Canada and Switzerland variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0143/"&gt;&lt;font color="#095197"&gt;German&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0157/"&gt;&lt;font color="#095197"&gt;German Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Germany and Switzerland variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0144/"&gt;&lt;font color="#095197"&gt;Indonesian&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0145/"&gt;&lt;font color="#095197"&gt;Italian&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0158/"&gt;&lt;font color="#095197"&gt;Italian Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Italy and Switzerland variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0146/"&gt;&lt;font color="#095197"&gt;Malay&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0147/"&gt;&lt;font color="#095197"&gt;Norwegian (Bokmal)&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0159/"&gt;&lt;font color="#095197"&gt;Norwegian Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Bokmal and Nynorsk variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0148/"&gt;&lt;font color="#095197"&gt;Portuguese&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0160/"&gt;&lt;font color="#095197"&gt;Portuguese Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Portugal and Brazil variants,&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0149/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset and&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0161/"&gt;&lt;font color="#095197"&gt;Spanish Language Variants&lt;/font&gt;&lt;/a&gt;&amp;nbsp;dataset consisting of Spain, North America, Central America, Andes and Southern Cone variants,&lt;/font&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/search/?q=Bitext+Synthetic+Data"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Bitext Synthetic Data&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;The Bitext Synthetic Data consist of pre-built training data for intent detection and are provided for 20 verticals for English and Spanish languages. They cover the most common intents for each vertical and include a large number of example utterances for each intent, with optional entity/slot annotations for each utterance. Data is distributed as models or open text files.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;For each language, the following verticals are available:&lt;/font&gt;&lt;/p&gt;

&lt;ol style="line-height: 20px;"&gt;
  &lt;li&gt;&lt;font&gt;Automotive: 52 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0162/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0182/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;font&gt;&lt;br&gt;&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Retail banking: 26 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0163/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0183/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Education: 37 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0164/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0184/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Event and ticketing: 25 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0165/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0185/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Field Service: 27 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0166/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0186/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Healthcare: 40 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0167/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0187/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Hospitality: 24 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0168/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0188/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Insurance: 38 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0169/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0189/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Legal: 29 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0170/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0190/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Manufacturing: 34 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0171/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0191/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Media Streaming: 24 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0172/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0192/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Mortgage and loans: 39 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0173/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0193/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Moving and storage: 29 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0174/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0194/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Real estate and construction: 28 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0175/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0195/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Restaurant/bar chains: 30 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0176/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0196/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Retail Ecomm: 34 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0177/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0197/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Telecommunication: 26 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0178/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0198/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Travel: 33 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0179/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0199/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Utilities: 21 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0180/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0200/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;Wealth management: 24 intents (&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0181/"&gt;&lt;font color="#095197"&gt;English&lt;/font&gt;&lt;/a&gt;,&amp;nbsp;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-L0201/"&gt;&lt;font color="#095197"&gt;Spanish&lt;/font&gt;&lt;/a&gt;)&lt;/font&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-S0487/"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Persian Kids’ Speech Corpus&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;The Persian Kids’ Speech Corpus consists of speech signals recorded by 286 children (141 girls, 145 boys), from 6 to 9 years old, through an Andreas Mic Anti-Noise microphone and a Premium Speechmike headphone. This recorded data was manually checked and labeled. Finally, a corpus containing 162,395 samples with a duration of 33 hours and 44 minutes was created. The samples are distributed as follows:&lt;/font&gt;&lt;/p&gt;

&lt;ol style="line-height: 20px;"&gt;
  &lt;li&gt;&lt;font&gt;29,057 Words (478 minutes),&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;17,429 SubWords (260 minutes),&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;43,838 Syllables (485 minutes),&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;70,078 Phonemes (765 minutes),&lt;/font&gt;&lt;/li&gt;

  &lt;li&gt;&lt;font&gt;1,993 Extra Vocabulary (36 minutes).&lt;/font&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;The prepared speech corpus comprehensively contains all the 29 Persian phonemes, 118 syllables, 56 sub-words, and 711 words and is particularly applicable to speech recognition and linguistics studies.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;strong&gt;2) Reduced fees for the following speech resources:&lt;/strong&gt;&lt;br&gt;&lt;/font&gt;&lt;/p&gt;

&lt;ul style="line-height: 20px;"&gt;
  &lt;li style="line-height: 23px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-S0397/"&gt;&lt;font&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Chinese Mandarin (South) database&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-S0398/"&gt;&lt;font&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Chinese Mandarin (North) database&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-S0411/"&gt;&lt;font&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Japanese Kids Speech database (Lower Grade)&lt;/strong&gt;&lt;/font&gt;&lt;/font&gt;&lt;/a&gt;&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;font&gt;&lt;a href="http://catalog.elra.info/en-us/repository/browse/ELRA-S0412/"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Japanese Kids Speech database (Upper Grade)&lt;/strong&gt;&lt;/font&gt;&lt;/a&gt;&lt;strong&gt;&amp;nbsp;&lt;/strong&gt;&lt;/font&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;br&gt;
For more information on the catalogue or if you would like to enquire about having your resources distributed by ELRA, please&amp;nbsp;&lt;strong&gt;contact us&lt;/strong&gt;.&lt;br&gt;
_________________________________________&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;Visit the&amp;nbsp;&lt;a href="http://catalog.elra.info/"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;ELRA Catalogue of Language Resources&lt;/strong&gt;&lt;/font&gt;&lt;/a&gt;&lt;br&gt;
Visit the&amp;nbsp;&lt;a href="http://universal.elra.info/"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Universal Catalogue&lt;/strong&gt;&lt;/font&gt;&lt;/a&gt;&lt;strong&gt;&amp;nbsp;&lt;/strong&gt;&lt;br&gt;
&lt;a href="http://www.elra.info/en/catalogues/language-resources-announcements"&gt;&lt;font color="#095197"&gt;&lt;strong&gt;Archives&amp;nbsp;&lt;/strong&gt;&lt;/font&gt;&lt;/a&gt;of ELRA Language Resources Catalogue Updates&lt;/font&gt;&lt;/p&gt;</description>
      <link>https://isca-speech.org/page-18306/13285816</link>
      <guid>https://isca-speech.org/page-18306/13285816</guid>
      <dc:creator>(Past member)</dc:creator>
    </item>
    <item>
      <pubDate>Fri, 01 Dec 2023 16:45:48 GMT</pubDate>
      <title>Linguistic Data Consortium (LDC) update (October 2023)</title>
      <description>&lt;p&gt;&lt;strong&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;Membership Year 2024 publication preview&amp;nbsp;&lt;/font&gt;&lt;/strong&gt;&lt;br&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;strong&gt;Fall 2023 data scholarship recipients&lt;br&gt;&lt;/strong&gt;&lt;em&gt;&lt;br&gt;
New publications:&lt;strong&gt;&lt;br&gt;&lt;/strong&gt;&lt;/em&gt;&lt;a href="https://catalog.ldc.upenn.edu/LDC2023T11"&gt;&lt;font color="#095197"&gt;AIDA Scenario 1 Practice Topic Source Data&lt;/font&gt;&lt;/a&gt;&lt;strong&gt;&lt;em&gt;&lt;br&gt;&lt;/em&gt;&lt;/strong&gt;&lt;a href="https://catalog.ldc.upenn.edu/LDC2023T10"&gt;&lt;font color="#095197"&gt;AIDA Scenario 1 and 2 Reference Knowledge Base&lt;/font&gt;&lt;/a&gt;&lt;br&gt;
&lt;br&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;strong&gt;Membership Year 2024 publication preview&amp;nbsp;&lt;br&gt;&lt;/strong&gt;The 2024&amp;nbsp;membership year is&amp;nbsp;approaching&amp;nbsp;and plans for next year’s publications are in progress. Among the expected releases are:&amp;nbsp;&lt;/font&gt;&lt;/p&gt;

&lt;ul style="line-height: 20px;"&gt;
  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;KASET: 147 hours of&amp;nbsp;&lt;/strong&gt;Sorani Kurdish and Kurmanji Kurdish conversational telephone speech and web broadcasts, 65 hours transcribed&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;AIDA topic source data and annotations:&lt;/strong&gt; multimodal source data and annotations in multiple languages (Russian, Ukrainian, English, Spanish) for information and entity extraction&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;RATS Low Speech Density Data:&lt;/strong&gt; 87 hours of&amp;nbsp;Levantine Arabic, English, Persian, Pushto, and Urdu&amp;nbsp;audio files selected from RATS speech activity detection and keyword spotting data sets, also including communications systems sounds and silence&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;Call My Net 1:&lt;/strong&gt; 364 hours of conversational telephone speech recordings in Tagalog, Cebuano, Cantonese, and Mandarin from speakers in the Philippines and China using various handsets under diverse noise conditions&lt;/li&gt;
&lt;/ul&gt;

&lt;ul style="line-height: 20px;"&gt;
  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;Ravnursson Faroese Speech and Transcripts:&lt;/strong&gt; 109 hours of read speech from 433 native speakers with transcripts&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;Diaspora Tibetan Speech:&lt;/strong&gt; elicited, read, and spontaneous speech from 73 native Tibetan speakers in Katmandu’s diaspora Tibetan community, some recordings transcribed&lt;/li&gt;
&lt;/ul&gt;

&lt;ul style="line-height: 20px;"&gt;
  &lt;li style="line-height: 23px;"&gt;&lt;strong&gt;IARPA MATERIAL language packs:&lt;/strong&gt; conversational telephone speech, transcripts, English translations, annotations, and queries in multiple languages (e.g., Bulgarian, Somali, Georgian)&lt;/li&gt;

  &lt;li style="line-height: 23px;"&gt;LORELEI:&amp;nbsp;representative and incident language packs containing monolingual text, bi-text, translations, annotations, supplemental&amp;nbsp;resources,&amp;nbsp;and related tools in various languages (e.g., Farsi, Hungarian, Hindi, Amharic)&amp;nbsp;&lt;/li&gt;
&lt;/ul&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;Check your inbox in the coming weeks for more information about membership renewal.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;br&gt;
&lt;strong&gt;Fall 2023 data scholarship recipients&lt;br&gt;&lt;/strong&gt;Congratulations to the recipients of LDC's Fall 2023 data scholarships:&lt;br&gt;
&lt;br&gt;
&lt;strong&gt;Nessma Diab&lt;/strong&gt;: Ain-Shams University (Egypt): Pre-PhD student, Linguistics. Nessma is awarded copies of CALLHOME Egyptian Arabic Speech LDC97S45 and CALLHOME Egyptian Arabic Transcripts LDC97T10 for her work in machine translation.&lt;br&gt;
&lt;strong&gt;Soheir Elssakkout&lt;/strong&gt;: Ain-Shams University (Egypt): PhD candidate. Soheir is awarded copies of Turkish Broadcast News and Transcripts LDC2012S06 and Middle East Technical University Turkish Microphone Speech v 1.0 LDC2006S33 for her work in speech recognition.&lt;br&gt;
&lt;strong&gt;Matheus Franco&lt;/strong&gt;: Witten/Herdecke University (Germany): Post-doctoral scholar, Faculty of Management, Economics and Society. Matheus is awarded a copy of Avocado Research Email Collection LDC2015T03 for his work in emotional foundations of dynamic capabilities.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;strong&gt;Kamal Jarrar&lt;/strong&gt;: Birzeit University (Palestine): Master’s student, Applied Statistics and Data Science Program. Kamal is awarded copies of Arabic Gigaword Fifth Edition LDC2011T11 and BOLT Arabic Discussion Forums LDC2018T10 for his work in part-of-speech tagging for dialectal Arabic.&lt;br&gt;
&lt;strong&gt;Minkyoung Kim&lt;/strong&gt;: Yonsei University (Korea): PhD candidate, Graduate School of Information. Minkyoung is awarded a copy of The New York Times Annotated Corpus LDC2018T19 for her work in event extraction and semantic event annotation.&lt;br&gt;
&lt;strong&gt;Humaira Mehmood&lt;/strong&gt;: Fatima Jinnah Women University (Pakistan): Master’s student, Computer Sciences. Humaira is awarded a copy of ARL Urdu Speech Database, Training Data LDC2007S03 for her work in machine translation.&lt;br&gt;
&lt;strong&gt;Diyam Mousa&lt;/strong&gt;: Birzeit University (Palestine): PhD candidate, Computer Science Department. Diyam is awarded copies of Arabic Treebank: Part 3 v. 3.2 LDC2010T08 and BOLT Egyptian Arabic Treebank – Discussion Forum LDC2018T23 for her work in morphological tagging for dialectal Arabic.&lt;br&gt;
&lt;br&gt;
For information about the program, visit the&amp;nbsp;&lt;a href="https://www.ldc.upenn.edu/language-resources/data/data-scholarships"&gt;&lt;font color="#095197"&gt;Data Scholarships page&lt;/font&gt;&lt;/a&gt;.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;em&gt;New publications:&lt;br&gt;&lt;/em&gt;&lt;a href="https://catalog.ldc.upenn.edu/LDC2023T11"&gt;&lt;font color="#095197"&gt;AIDA Scenario 1 Practice Topic Source Data&lt;/font&gt;&lt;/a&gt;&amp;nbsp;was developed by LDC and is comprised of 1511 files (text, image, and video) from English, Russian, and Ukrainian web sources. Each phase of the AIDA program centered on a specific scenario, or broad topic area, with related subtopics designated as either practice subtopics or evaluation subtopics. The Phase 1 scenario focused on political relations between Russia and Ukraine in the 2010s. This corpus constitutes the full set of topic-focused documents for Phase 1 practice subtopics.&amp;nbsp;Data was collected from web sources by a combination of automatic and manual processes.&lt;strong&gt;&lt;br&gt;
&lt;br&gt;&lt;/strong&gt;The DARPA AIDA (Active Interpretation of Disparate Alternatives) program aimed to develop a multi-hypothesis semantic engine&amp;nbsp;to generate explicit alternative interpretations of events, situations, and trends from a variety of unstructured sources. LDC supported AIDA by collecting, creating and annotating multimodal linguistic resources in multiple languages.&lt;strong&gt;&lt;br&gt;
&lt;br&gt;&lt;/strong&gt;The knowledge base for entity detection and linking annotation for all AIDA Scenario 1 and 2 corpora is available separately as&amp;nbsp;&lt;a href="https://catalog.ldc.upenn.edu/LDC2023T10"&gt;&lt;font color="#095197"&gt;AIDA Scenario 1 and 2 Reference Knowledge Base (LDC2023T10)&lt;/font&gt;&lt;/a&gt;.&lt;br&gt;
&lt;br&gt;
2023 members can access this corpus through their LDC accounts. Non-members may license this data for $1000.&lt;/font&gt;&lt;/p&gt;

&lt;p style="line-height: 20px;"&gt;&lt;font color="#444444" face="arial, sans-serif"&gt;&lt;a href="https://catalog.ldc.upenn.edu/LDC2023T10"&gt;&lt;font color="#095197"&gt;AIDA Scenario 1 and 2 Reference Knowledge Base&lt;/font&gt;&lt;/a&gt;&amp;nbsp;contains the English knowledge base (KB) used for all AIDA entity linking annotation in Scenario 1 (Russia-Ukraine Relations) and Scenario 2 (Crisis in Venezuela). The KB content was drawn from GeoNames, the CIA World Leaders List, and the CIA World Factbook and was supplemented with manually-created KB entries developed by LDC specifically for AIDA data.&lt;br&gt;
&lt;br&gt;
This knowledge base supported the AIDA entity detection and linking task for 13 entity types: GPE (Geo-Political Entity), LOC (Location), PER (Person), ORG (Organization), FAC (Facility), MHI (Medical/Health Issue), WEA (Weapon), SID (Side), COM (Commodity), CRM (Crime), LAW (Law), VEH (Vehicle), and BAL (Ballot).&lt;br&gt;
&lt;br&gt;
2023 members can access this corpus through their LDC accounts. Non-members may license this data for $250.&lt;/font&gt;&lt;/p&gt;</description>
      <link>https://isca-speech.org/page-18306/13285815</link>
      <guid>https://isca-speech.org/page-18306/13285815</guid>
      <dc:creator>(Past member)</dc:creator>
    </item>
  </channel>
</rss>