ISCA - International Speech
Communication Association


  • 2024-12-10 13:10 | Anonymous member (Administrator)

    “Privacy for Smart Speech Technology” (PSST) is a joint doctoral training programme and Horizon Europe Marie Skłodowska-Curie Action, the European Union’s flagship funding programme for doctoral training. We are a consortium of 7 European universities and 11 industrial partners searching for 12 PhD students to work on the protection and evaluation of privacy for smart speech technology. PSST is a unique opportunity, as it is the largest international project focusing on privacy in speech technology and because the importance of privacy has only recently gained wider appreciation.

    This is no ordinary PhD programme.

    The structured PSST doctoral training programme combines training in cutting-edge research, transferable skills and career-enhancing skills with exposure to multiple sectors and disciplines.

    Join us and put your expertise in deep learning / machine learning, speech processing, information privacy and security, and user studies into practice and gain your PhD degree from TWO leading European Universities (listed below)!

    See more information and PhD topics at https://psst-doctoralnetwork.eu/

    We are looking for 12 PhD candidates who hold a master's degree. We value diversity and plan to hire 12 fellows with a balanced background and skillset, and an excellent academic track record. We especially encourage applications from members of under-represented groups.

    Call opens 10.12.2024

    Application deadline 26.1.2025

    Shortlisted candidates informed 28.2.2025

    Recruitment event in Finland for shortlisted candidates 17.-18.3.2025

    Notification of acceptance May 2025

    Planned start of employment August 2025

    PSST follows a double-degree model whereby, during their 45-month employment, each PhD student will work in collaboration with two universities towards PhD degrees from both institutions! Each PhD student will also spend 6 months on secondment to one of our Associate Partners, all leading European SMEs, large industrials or regulatory bodies active in speech privacy: CNIL (France), ELDA (France), ki:elements (Germany), Loihde (Finland), Naver (France), Omilia (Greece), Orange (France), Vocapia (France), VoiceInteraction (Portugal), Voice INTER connect (Germany), and VoiceMod (Spain).

    Applications should include:

    • Curriculum Vitae (including countries of residence in the past 36 months).
    • Academic transcripts for completed courses and degrees.
    • Motivation letter explaining why you want to pursue a PhD degree and why you believe you are an outstanding candidate to pursue your PhD researching PSST topics.
    • Reference letter from Master’s thesis supervisor/advisor or similar.
    • (Optional) Preferences for 1-3 research topics (see webpage) and universities.

    Requirements

    • A master's degree in electrical engineering, computer science or related area (degree must be completed before employment can start).
    • Mobility: The fellow must not have resided or carried out their main activity (work, studies, etc.) in the country of the first recruiting organisation for more than 12 months in the 36 months immediately before their recruitment date.
    • Fluent written and verbal communication skills in English are required; knowledge of the local language is an advantage.
    • Candidates cannot hold a doctoral degree.
    Desirable skills
    • Knowledge and skills in deep learning, programming, speech processing, user studies, privacy.
    • Ability to work independently and a critical mindset.
    • Pro-activeness and eagerness to participate in network-wide training events, international mobility, and public dissemination activities.

    Submit your application at

    https://www.aalto.fi/en/open-positions/doctoral-researchers-12-positions-privacyfor-smart-speech-technology-psst

    PhD students receive a regular salary and social benefits according to national regulations, and if applicable, also family leave, long-term leave, and special needs allowances.

    The gross salaries we offer, including both a living allowance and a mobility allowance, are

    • Aalto University (Espoo, Finland): 3500 €/month
    • EURECOM (Sophia Antipolis, France): 3261 €/month [1]
    • INESC-ID (Lisbon, Portugal): 2680 €/month [2]
    • INRIA (Nancy or Saclay, France): 3261 €/month [1]
    • Ruhr University Bochum (Germany): Salary group TV-L E13 [3]
    • Radboud University Nijmegen (Netherlands): Salary scale P [4]
    • Technical University of Berlin (Germany): Salary group TV-L E13 [3]

    [1] https://www.horizon-europe.gouv.fr/sites/default/files/2022-02/horizon-europe---dn-pf---french-salary-explained-5762.pdf

    [2] Includes: base salary + food allowance + holiday allowance

    [3] https://oeffentlicher-dienst.info/c/t/rechner/tv-l/allg?id=tv-l-2024&g=E_13&s=1

    [4] https://www.ru.nl/sites/default/files/2024-09/Overview%20salary%20scales%201%20sept%202024.pdf

    For queries, contact info@psst-doctoralnetwork.eu.

  • 2024-11-07 19:27 | Anonymous
    *** Tenure-Track and Research Faculty Positions at the Toyota Technological Institute at Chicago ***


    * The Toyota Technological Institute at Chicago (TTIC) invites applications for the following faculty positions in computer science:

      - Tenure-track Assistant Professor
      - Tenured Associate Professor or full Professor
      - Research Assistant Professor (non-tenure track, endowed position for up to 3 years; see https://ttic.edu/research-assistant-professor/ )
      - Visiting Professor


    * While we welcome applications from many areas of computer science, we will give preference to candidates working in machine learning, computer vision, natural language processing and speech, robotics, computational biology, and algorithms and complexity theory.


    * About TTIC

    TTIC (www.ttic.edu) is an independent, philanthropically endowed academic institute dedicated to fundamental research and graduate education in computer science. All TTIC faculty positions are supported by the endowment.  TTIC has an accredited PhD program in computer science.

    TTIC produces cutting-edge research and offers world-class graduate education. Our faculty (https://www.ttic.edu/faculty/) are recognized with distinctions such as the Sloan Research Fellowships, NSF CAREER Awards, Best Paper Awards, and the NAS Michael and Sheila Held Prize. TTIC research faculty alumni have an excellent employment track record (https://www.ttic.edu/faculty-alumni/).  

    TTIC faculty members enjoy a uniquely light teaching load, which helps them focus on their research. TTIC has only PhD students, so all courses and activities are focused on advanced learning and research.  

    TTIC’s students have been recognized with fellowships (such as NSF, Google, and Microsoft), and have an excellent career track record, including post-docs and faculty positions at top universities and research positions at major industry labs (https://www.ttic.edu/student-alumni/).
     
    Located on the University of Chicago campus, TTIC has strong ties to the University. In addition to TTIC's excellent computing infrastructure, faculty members benefit from many of U. Chicago's state-of-the-art facilities.  TTIC faculty also regularly collaborate with U. Chicago faculty and students, as well as with faculty and students at Northwestern and other nearby institutions.

    TTIC strongly supports travel and visitor hosting, and typically hosts several workshops each year.  

    TTIC faculty and students enjoy the close proximity of a vibrant urban environment with flourishing culture, business, and entertainment scenes.


    * Teaching Requirements

    Tenured/tenure-track faculty teach one quarter per year. Research faculty have no teaching duties, but have the opportunity to teach and co-advise students.


    * TTIC/Simons-Berkeley Joint Program

    Applicants for research assistant professor (RAP) positions in relevant areas are encouraged to simultaneously apply for the TTIC RAP program and the Simons-Berkeley Research Fellowship (https://simons.berkeley.edu/research-fellowship-call-applications).

    Applicants selected by both institutions will be able to participate in a program at the Simons Institute before joining TTIC. Please note that applicants interested in the joint program must submit separate applications to TTIC and the Simons Institute.


    * Timeline

    Applications received before December 1 are guaranteed full consideration. However, applications will continue to be considered at any time.

    If interested in the joint program with the Simons Institute, please note that the Simons Institute has a different deadline.


    * Where to Apply:  https://ttic.edu/facultyapplication

    Senior applicants may directly contact the Chief Academic Officer (avrim@ttic.edu) or faculty members in their areas.


    * Questions?  Contact recruiting@ttic.edu



  • 2024-10-03 15:04 | Anonymous member (Administrator)

    Vicomtech (https://www.vicomtech.org/en/), an international applied research centre specialised in Artificial Intelligence, Visual Computing and Interaction located in Spain, has several research positions in the field of speech and natural language processing.

    We are seeking talented and motivated individuals to join our dynamic Speech and Natural Language Technologies team in either our Donostia - San Sebastián or Bilbao premises. If you have experience in speech and/or natural language processing technologies and are passionate about applying cutting-edge research to solve real-world needs through advanced prototypes, this opportunity is for you! 

    Whether you are a junior researcher (BSc/MSc graduate) looking to kickstart your career or a senior researcher (PhD graduate) eager to take on research leadership roles, we are interested in your profile. We offer the perfect environment with outstanding equipment and the best human team for growth. You will participate in advanced research and development projects, with opportunities to manage high-profile projects and/or lead technical teams depending on your experience. 

    Key Responsibilities: 

    • Conduct cutting-edge research in Speech and Natural Language Processing (NLP) technologies such as automatic speech recognition and synthesis, audio deep fake detection, information extraction, machine translation, text simplification and dialogue systems, among others. 
    • Contribute to national and international research projects.
    • Develop advanced prototypes that transfer technology to businesses and institutions. 
    • Manage or lead research projects, depending on experience. 

    Requirements: 

    • Bachelor’s or Master’s degree in Computer Science, Telecommunications Engineering or related fields. 
    • For senior profiles, a PhD in Speech Processing, NLP, AI or related disciplines is preferred. A PhD is not required for junior candidates. 
    • Strong programming skills (Python, Bash). 
    • Fluency in both spoken and written Spanish and English. 

    Preferred Skills (Not Required but Valued): 

    • Experience with speech and natural language processing tools and libraries (e.g. Kaldi, Whisper, Marian NMT, HuggingFace Transformers, Rasa, etc.) and deep learning frameworks (PyTorch, TensorFlow, ONNX). 
    • Virtualization technologies (Docker, Kubernetes). 
    • Experience in industrial and/or European research projects. 

    What We Offer: 

    • A vibrant, innovative research environment with state-of-the-art AI, Visual Computing, and Interaction technologies. 
    • Exciting national and international research projects. A multidisciplinary and renowned team in Speech and Language Technologies. 
    • Creative freedom in research, aligned with the centre’s goals. 
    • Opportunities for personal development through continuous learning. 
    • Clear career progression paths and leadership opportunities. 
    • Work-life balance policies and a commitment to equal employment opportunities. 

    If you are passionate about research and eager to apply or develop your expertise to real-world challenges, we encourage you to send us your CV and join our forward-thinking team!

    To apply via LinkedIn: https://www.linkedin.com/jobs/view/4034768411


  • 2024-06-24 11:35 | Anonymous member

    KU Leuven's Faculty of Engineering Science has an open position for a junior professor (tenure track) in the area of Spoken Language Technologies. The successful candidate will conduct research on current challenges of speech technology and its applications, teach courses in the Master of Engineering Science and supervise students in the Master and PhD programs. The candidate will be embedded in the PSI research division of the Department of Electrical Engineering. More information is available at https://www.kuleuven.be/personeel/jobsite/jobs/60334358?lang=en. The deadline for applications is September 30, 2024. 

    KU Leuven is committed to creating a diverse environment. It explicitly encourages candidates from groups that are currently underrepresented at the university to submit their applications. 

  • 2024-05-08 12:02 | Anonymous member (Administrator)

    Saarland University is a campus university with an international focus and a strong research profile. With numerous internationally respected research institutes on campus and dedicated support for collaborative projects, Saarland University is an ideal environment for innovation and technology transfer. The German Research Center for Artificial Intelligence (DFKI) is Germany's leading application-driven research institute with a core technology transfer mission. DFKI is currently the world's largest research centre for artificial intelligence operated as a public-private partnership. DFKI maintains close collaborative ties with national and international companies and is firmly rooted in the worldwide scientific AI landscape.

    To further strengthen this excellence in research and teaching, the Department of Language Science and Technology (LST) in collaboration with the German Research Center for Artificial Intelligence (DFKI) is inviting applications for the following position:

    Professorship (W3) in Language Technology

    (m/f/x; Reference: W2464)

    This position is a permanent public sector appointment (equivalent to a 'full-tenured professorship') starting at the earliest possible opportunity. We are looking for an experienced researcher in the field of language technology who has extensive knowledge of natural language processing and machine learning/AI methodologies. Experience with dialogue systems and reinforcement learning, the development of foundation models and/or trustworthy Artificial Intelligence is also desirable. In addition to holding a professorship at the university, the successful candidate will also be appointed as a scientific director at the German Research Center for Artificial Intelligence (DFKI) where they will head a research department. DFKI is an application-driven research organization that is largely financed through external project funding. A demonstrated ability to attract significant external funding for research projects at the national and international level is therefore essential. We also expect candidates to have experience in interdisciplinary research and in collaborating with industrial partners. The Department of Language Science and Technology is internationally recognized for its collaborative and interdisciplinary research, and the successful candidate will be expected to contribute to relevant joint research initiatives. Language technologies are core elements of our study programmes at the M.Sc./M.A. and B.Sc./B.A. level and the person appointed will teach courses within these programmes.

    What we can offer you:

    The successful candidate will conduct world-class research, lead their own research group at the university and perform teaching and supervisory duties at the undergraduate, graduate and doctoral levels. At DFKI, the person appointed will lead a research department with access to an extensive worldwide network of industrial and other research partners, facilitating research and impact at a scale that is otherwise difficult to achieve. The position offers excellent working conditions in a lively and international scientific community. Saarland University is one of the leading centres for language science and computational linguistics in Europe and offers a dynamic and stimulating research environment. The Department of Language Science and Technology (LST) employs about 100 research staff across nine research groups in the fields of computational linguistics, natural language processing, psycholinguistics, phonetics and speech science, speech processing, and corpus linguistics (https://www.uni-saarland.de/en/department/lst.html). The department serves as the focal point of the Collaborative Research Centre 1102 'Information Density and Linguistic Encoding' (http://www.sfb1102.uni-saarland.de) and of the Research Training Group 'Neuroexplicit Models of Language, Vision, and Action' (https://www.neuroexplicit.org/), both of which involve close collaboration with DFKI. The LST department and the DFKI are both part of the Saarland Informatics Campus (SIC: https://saarland-informatics-campus.de/en), which brings together some 800 researchers and over 2000 students from 81 countries. SIC is a collaboration between Saarland University and world-class research institutions on campus, which in addition to DFKI include the Max Planck Institute for Informatics and the Max Planck Institute for Software Systems.

    Qualifications:

    The appointment will be made in accordance with the general provisions of German public sector employment law. Candidates must have experience in and an aptitude for academic teaching. They will have a PhD or doctorate in an appropriate subject and will have demonstrated a particular capacity for independent academic research, typically by having obtained an advanced, post-doctoral research degree (Habilitation) or by having published an equivalent volume of peer-reviewed research or by having been appointed to a junior professorship or similar position. They will have a proven track record of leading their own research group and of acquiring external research funding. The successful candidate will be expected to actively contribute to departmental research and teaching. The language of instruction is English (in the M.Sc. and M.A. programmes) and German (in the B.Sc./B.A. programmes). We expect the successful candidate either to have sufficient proficiency to teach in both languages or to be willing to acquire this level of proficiency within an appropriate period.

    Your Application:

    Applications should be submitted online at www.uni-saarland.de/berufungen. No additional paper copy is required. The application must contain:

    • a letter of application and CV/résumé (including your telephone number and email address)
    • a complete list of your academic publications
    • a complete list of external funding (stating own share if you were not the sole beneficiary)
    • your proposed research concept (2–5 pages)
    • your teaching concept (1 page)
    • copies of your degree certificates
    • complete copies of your five most significant publications
    • the names of three academic references (including email addresses), at least one of whom is not one of your previous academic supervisors.
    • If you hold a university degree from a foreign university, please provide proof of equivalence from Germany's Central Office for Foreign Education (ZAB) if available. If proof of equivalence has not been requested at the time of application, it must be submitted later upon request.

    Applications must be received no later than May 30, 2024.

    Please include the job reference number W2464 when you apply. Selected candidates will be interviewed. If you have any questions, please contact: crocker@lst.uni-saarland.de.

    At Saarland University, we view internationalization as a process spanning all aspects of university life. We therefore expect members of our professorial staff to engage in activities that promote and foster further internationalization. Special support will be provided for projects that maintain collaborative interactions within existing international cooperative networks, e.g. projects with partners in the European University Alliance Transform4Europe (www.transform4europe.eu) or the University of the Greater Region (www.uni-gr.eu).

    Saarland University is an equal opportunity employer. In accordance with its affirmative action policy, Saarland University is actively seeking to increase the proportion of women in this field. Qualified women candidates are therefore strongly encouraged to apply. Preferential consideration will be given to applications from disabled candidates of equal eligibility. We welcome applications regardless of nationality, ethnic and social origin, religion/belief, age, sexual orientation and identity.

    When you submit a job application to Saarland University you will be transmitting personal data. Please refer to our privacy notice (https://www.uni-saarland.de/verwaltung/datenschutz/) for information on how we collect and process personal data in accordance with Art. 13 of the General Data Protection Regulation (GDPR). By submitting your application, you confirm that you have taken note of the information in the Saarland University privacy notice.

    The full job advertisement can be found at:

    www.uni-saarland.de | www.youtube.com/watch?v=tzo6dxr1FWk


  • 2024-02-19 16:44 | Anonymous member

    The Laboratory of Language Technology (https://taltech.ee/en/laboratory-language-technology) at Tallinn University of Technology, Estonia, is looking to fill a postdoc position in the field of speech processing and/or NLP. The position is funded by EXAI -- the Estonian Centre of Excellence in Artificial Intelligence (2024−2030).

    The position is flexible with respect to topic, but it should connect thematically with current topics of interest to the research group (speech recognition, speaker and language recognition, speaker diarization, spoken language translation, summarization, low-resource scenarios). Possible research directions include using and fine-tuning different speech and language foundation models (such as wav2vec 2.0, Whisper, LLMs) for various speech and language processing tasks.

    The position does not include any teaching load, but supervision of Master and PhD students is expected.

    We are looking for candidates who have finished, or are about to complete, a PhD degree in speech processing, NLP or a related discipline. You must be proficient in English (spoken and written). Applicants should have demonstrated their research expertise through high-quality publications.

    The starting salary for this position is around 3500 euros per month (before taxes, around 2700 euros after taxes) and increases with experience. Additional benefits include roughly 6 weeks of paid annual leave, paid sick leave as well as maternity and parental leave. The initial appointment will be for two years; the position could be extended and migrated to a permanent researcher position, if suitable for both parties. The starting date is March 2024 or later; we would be willing to adapt to the time requirements of an ideal candidate.

    How to apply:

    Please send an e-mail to Tanel Alumäe (tanel.alumae@taltech.ee) with the following information:

    * a short statement (just a few sentences) of research interests that motivates why you are applying for this position;
    * a full CV including your list of publications;

    Alternatively, apply via LinkedIn: https://www.linkedin.com/hiring/jobs/3827285976/detail

    Unofficial inquiries about the position are also welcome!

  • 2024-01-04 15:22 | Anonymous

    We are offering a research internship (Master's level, Bac+5) in the research department of the Institut National de l'Audiovisuel (INA). The internship focuses on voice activity detection in audiovisual corpora using self-supervised representations.

    The detailed internship offer is given below.

    Other internships are also offered at INA; all topics are listed on the following page: https://www.ina.fr/institut-national-audiovisuel/equipe-recherche/stages.

     

    Voice activity detection in audiovisual corpora using self-supervised representations

    Final-year engineering or Master 2 internship, academic year 2023-2024

    Keywords: deep learning, machine learning, self-supervised models, voice activity detection, speech activity detection, wav2vec 2.0

    Context

    The Institut National de l’Audiovisuel (INA) is a public industrial and commercial establishment (EPIC) whose main mission is to preserve and promote French audiovisual heritage through the sale of archives and the management of legal deposit. In this capacity, the Institute continuously records 180 television and radio channels and stores more than 25 million hours of audiovisual content. INA also carries out training, production and scientific research missions. For more than 20 years, INA's research department has conducted research on the indexing and automatic description of these collections across all modalities: text, sound and images. The department takes part in many collaborative research projects at both national and European level, and hosts Master's interns as well as PhD students co-supervised with leading national laboratories. This internship is offered within the research team (https://recherche.ina.fr) and is part of a collaborative project funded by the ANR: Gender Equality Monitor (GEM). Other internship topics are also offered in the team: https://www.ina.fr/institut-national-audiovisuel/equipe-recherche/stages

    Internship objectives

    Voice activity detection (VAD) is an audio analysis task that aims to identify the portions of a recording that contain human speech, distinguishing them from other parts of the signal containing silence, background noise or music. Often regarded as a preprocessing step, it is used upstream of automatic speech, speaker and emotion recognition tasks. While existing VAD tools achieve excellent results on news programmes and studio talk shows [Dou18a, Bre23], recent research at INA has shown that state-of-the-art systems perform less well on many types of material that are under-represented in annotated speech corpora. This content, which was the subject of an internal annotation campaign, includes music shows, cartoons, sport, fiction, game shows and documentaries.

    The objective of the internship is to develop VAD models using an approach based on the self-supervised learning paradigm and relying on transformer architectures such as wav2vec 2.0 [Bae20]. Models based on these architectures achieve state-of-the-art results on many speech processing tasks with limited amounts of annotated examples: transcription, understanding, translation, emotion detection, speaker recognition, language identification, etc. [Li22, Huh23, Par23]. Several recent studies have demonstrated the effectiveness of self-supervised approaches for VAD [Gim21, Kun23], but to date these models have been trained and evaluated on data that do not reflect the diversity of audiovisual content. The proposed internship aims to exploit the millions of hours of audiovisual content held at INA to train and improve the models.

    The resulting models will be integrated into the open-source software inaSpeechSegmenter, which is used, among other things, to measure the speaking time of women and men in programmes for research or audiovisual regulation purposes [Dou18b, Arc23].

    Dissemination of results

    Different dissemination strategies will be considered, depending on the maturity of the work and the directions envisaged for its continuation:

    ● Release of the resulting models under an open-source licence on HuggingFace and/or INA's GitHub repository: https://github.com/ina-foss

    ● Writing of scientific publications

    Internship conditions

    The internship will last 4 to 6 months and take place in INA's research department, at the Bry 2 site, 28 Avenue des frères Lumière, 94360 Bry-sur-Marne. The intern will be supervised by Valentin Pelloin and David Doukhan. A computer equipped with a GPU will be provided, along with access to the Institute's computing cluster.

    Stipend: 760 € gross/month + 50% of the Navigo pass

    Remote work: possible one day per week

    Contact

    To apply for this internship or to request further information, please send your CV and cover letter by e-mail to: vpelloin@ina.fr and ddoukhan@ina.fr.

    Candidate profile

    ● Student in the final year of a five-year degree (Bac+5) in computer science and AI

    ● Strong interest in academic research

    ● Interest in automatic speech processing

    ● Proficiency in Python and experience with ML libraries

    ● Ability to carry out literature searches

    ● Rigour, ability to synthesise, autonomy, ability to work in a team

    Bibliography

    [Arc23] ARCOM (2023). “La représentation des femmes à la télévision et à la radio - Rapport sur l'exercice 2022” [online].

    [Bae20] A. Baevski, H. Zhou, A. Mohamed, and M. Auli, “wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations,” Neural Information Processing Systems, Jun. 2020.

    [Bre23] Bredin, H. (2023). pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe, in INTERSPEECH 2023, ISCA, pp. 1983–1987.

    [Dou18a] Doukhan, D., Carrive, J., Vallet, F., Larcher, A., & Meignier, S. (2018, April). An open-source speaker gender detection framework for monitoring gender equality. In 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 5214-5218). IEEE.

    [Dou18b] Doukhan, D., Poels, G., Rezgui, Z., & Carrive, J. (2018). Describing gender equality in french audiovisual streams with a deep learning approach. VIEW Journal of European Television History and Culture, 7(14), 103-122.

    [Gim21] P. Gimeno, A. Ortega, A. Miguel, and E. Lleida, “Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021,” in Interspeech 2021, ISCA, Aug. 2021, pp. 4359–4363.

    [Huh23] Huh, J., Brown, A., Jung, J. W., Chung, J. S., Nagrani, A., Garcia-Romero, D., & Zisserman, A. (2023). Voxsrc 2022: The fourth voxceleb speaker recognition challenge. arXiv preprint arXiv:2302.10248.

    [Kun23] M. Kunešová and Z. Zajíc, “Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0,” in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun. 2023, pp. 1–5.

    [Li22] Li, M., Xia, Y., & Lin, F. (2022, December). Incorporating VAD into ASR System by Multi-task Learning. In 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP) (pp. 160-164). IEEE.

    [Par23] Parcollet, T., Nguyen, H., Evain, S., Boito, M. Z., Pupier, A., Mdhaffar, S., ... & Besacier, L. (2023). LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech. arXiv preprint arXiv:2309.05472.

  • 2024-01-04 15:21 | Anonymous

    The SAMoVA team at IRIT in Toulouse is offering several internships (M1, M2, final-year engineering project) in 2024 on the following topics (non-exhaustive list):

    - Automatic Generation of Musical Scores in the Choro Style

    - Speech Understanding and AI for Sensory Analysis

    - Characterizing Eating Behavior through Video and Multimodal Analysis

    - Adapting Automatic Speech Recognition Systems to Pathological Contexts

    - Signal Processing and AI for Revealing Articulatory Disorders in Atypical Speech Production

    - End-To-End Speech Recognition For Assessing Comprehension Skills Of Children Learning To Read

    - Active Learning For Speaker Diarization

    - Automatic Modeling of Speech Rhythm

    - Transcription of Verbalizations for Discourse Analysis in Virtual Reality Scenarios

    - Implementation of a Comparative Speech Recognition Prototype Applied to Oral Language Learning

    All details (topics, contacts) are available in the team's 'Jobs' section:
    https://www.irit.fr/SAMOVA/site/jobs/
  • 2024-01-04 15:20 | Anonymous

    Postdoc position – Linguistics / computational linguistics

    Duration: 9 months

    Start: January or February 2024; a start in March 2024 is negotiable

    Location: LIUM – Le Mans Université

    Net salary: approximately €2,000/month, depending on skills

    Contact: jane.wottawa@univ-lemans.fr, richard.dufour@univ-nantes.fr

    Application: cover letter, CV (3 pages maximum)


    As part of the DIETS project, which focuses in particular on evaluation metrics for automatic speech recognition systems, a postdoc position is available to:

    a) Carry out a linguistic and grammatical analysis of the errors in the output of automatic speech recognition systems

    b) Conduct human evaluation tests according to different types of errors

    c) Compare the choices made in the evaluation tests with the evaluations produced by automatic metrics

    d) Publish the results (conferences, journals)

     

     

    The DIETS project

    One of the major problems with evaluation measures in language processing is that they are designed to score a proposed solution globally against a given reference, the main goal being to compare systems with one another. The choice of evaluation measures is very often crucial, since the research undertaken to improve these systems is based on these measures. Although automatic systems such as speech transcription are aimed at end users, those users are ultimately little studied: the impact of automatic errors on humans, and the way these errors are perceived at the cognitive level, have not been studied and then integrated into the evaluation process.

    The DIETS project, funded by the Agence Nationale de la Recherche (2021-2024) and led by the Laboratoire Informatique d'Avignon, focuses on the diagnosis and evaluation of end-to-end automatic speech recognition (ASR) systems based on deep neural network architectures, integrating the human reception of transcription errors from a cognitive point of view. The challenge is twofold:

        1) Analyze ASR errors in detail on the basis of human reception.

        2) Understand and detect how these errors manifest themselves in an end-to-end ASR framework, whose operation is inspired by the functioning of the human brain.

    The DIETS project aims to push back the current limits in the understanding of end-to-end ASR systems, and to initiate new research that takes a cross-disciplinary approach (computer science, linguistics, cognitive science...) by putting humans back at the center of the development of automatic systems.

     

     

    Required skills

    The position requires the following skills: a good command of French spelling and grammar, needed to categorize the errors of different transcription systems in an informed way, and digital skills, since the data will have to be retrieved from a server. A background in linguistics or computational linguistics is desirable.

    Experience in organizing, running and analyzing behavioral tests is a plus.

     

    Host institution

    The host institution is LIUM, the computer science laboratory of Le Mans Université, located in Le Mans. Regular presence at the laboratory is required throughout the postdoc. LIUM is made up of two teams. The postdoc will take place in the LST team, which conducts research in natural language processing for both text and speech. The team works with data-driven approaches and also specializes in deep learning applied to language processing. It currently consists of one project officer, 11 faculty members (computer scientists, acousticians, linguists), 4 doctoral researchers and two master's apprentices.

  • 2024-01-04 15:19 | Anonymous


    Postdoctoral Scholar | Data Sciences and Artificial Intelligence at Penn State University

    The Data Sciences and Artificial Intelligence (DS/AI) group at Penn State invites applications for a Postdoctoral Scholar position, set to commence in Fall 2024. This role is centered on cutting-edge research at the nexus of machine learning, deep learning, computer vision, psychology, and biology, with foci on psychology-inspired AI and addressing significant biological questions using AI.

    To Apply: https://psu.wd1.myworkdayjobs.com/en-US/PSU_Academic/job/Postdoctoral-Scholar---College-of-IST-Data-Sciences-and-Artificial-Intelligence_REQ_0000050584-1

    Qualifications:

    • Ph.D. in computer science, A.I., data science, physics, or neuroscience with an emphasis on machine learning, or a closely related field. To qualify, candidates must possess a Ph.D. or terminal degree before their employment starts at Penn State.

    • A strong record of publications in high-impact journals or premier peer-reviewed international conferences.

    • Prior experience in conducting interdisciplinary/multidisciplinary research is a plus.

     

    About the position:

    The successful candidate will be designated as a Postdoctoral Scholar at the College of Information Sciences and Technology (IST) of The Pennsylvania State University. The initial term of the position is one year, with the possibility of renewal contingent upon performance and availability of funds. The scholar will be engaged in two interdisciplinary projects funded by the National Science Foundation, receiving mentorship from Professors James Wang (IST), Brad Wyble (Psychology), and Charles Anderson (Biology). The scholar will collaborate with highly motivated and talented graduate students and benefit from strong career development support, which includes training in teaching, grant proposal writing, and other collaborative work. Qualified candidates will have the ability to teach in IST after successfully completing one semester with approval from college leadership.

     

    To apply:

    • Please submit a CV, research statement (max 3 pages), and other pertinent documents in a single PDF document with the application.

    • Deadline: February 29, 2024, for full consideration. Late applications are accepted but given secondary priority.

    • Only shortlisted candidates will be contacted to provide reference letters.

    • For inquiries, please email with the subject line “postdoc” to Professor James Wang at jwang@ist.psu.edu or visit the lab website http://wang.ist.psu.edu.

     

    COMMITMENT TO DIVERSITY:

    The College of IST is strongly committed to a diverse community and to providing a welcoming and inclusive environment for faculty, staff and students of all races, genders, and backgrounds. The College of IST is committed to making good faith efforts to recruit, hire, retain, and promote qualified individuals from underrepresented minority groups including women, persons of color, diverse gender identities, individuals with disabilities, and veterans. We invite applicants to address their engagement in or commitment to inclusion, equity, and diversity issues as they relate to broadening participation in the disciplines represented in the college as well as aligning with the mission of the College of IST in a separate statement.

     

    CAMPUS SECURITY CRIME STATISTICS:

    Pursuant to the Jeanne Clery Disclosure of Campus Security Policy and Campus Crime Statistics Act and the Pennsylvania Act of 1988, Penn State publishes a combined Annual Security and Annual Fire Safety Report (ASR). The ASR includes crime statistics and institutional policies concerning campus security, such as those concerning alcohol and drug use, crime prevention, the reporting of crimes, sexual assault, and other matters. The ASR is available for review here.

     

    Employment with the University will require successful completion of background check(s) in accordance with University policies. 

     

    EEO IS THE LAW

    Penn State is an equal opportunity, affirmative action employer, and is committed to providing employment opportunities to all qualified applicants without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability or protected veteran status. If you are unable to use our online application process due to an impairment or disability, please contact 814-865-1473.


© Copyright 2024 - ISCA International Speech Communication Association - All rights reserved.
