ISCA Tutorial and Research Workshop on Experimental Linguistics (ExLing 2008)

Athens, Greece
August 25-27, 2008

MORPHEMIA: a Semi-Supervised Algorithm for the Segmentation of Modern Greek Words into Morphemes

Constandinos Kalimeris, Stelios Bakamidis

Voice and Sound Technology Department, Institute for Language and Speech Processing (ILSP), Greece

This paper reports on MORPHEMIA, a semi-supervised machine-learning algorithm designed to segment Modern Greek (MG) words into morphemes. The algorithm segments its input iteratively. In its first iteration, it relies on a priori linguistic knowledge. At the end of each successful iteration, it extracts new morphological knowledge, which it uses in the next iteration, so that each successful iteration segments a larger share of the input data. The algorithm uses a metric to decide whether an extracted piece of morphological knowledge would improve its performance and accepts only knowledge that would, so the quality of its output improves gradually. MORPHEMIA terminates when no new knowledge can be extracted from its input data.
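The iterative scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the a priori knowledge, the extraction step, or the acceptance metric. The sketch assumes, purely for illustration, that the a priori knowledge is a set of seed suffixes, that new knowledge consists of candidate suffixes found after already known stems, and that the metric is the number of distinct stems supporting a candidate. The names `segment_iteratively`, `seed_suffixes`, and `min_support` are hypothetical.

```python
def segment_iteratively(words, seed_suffixes, min_support=2):
    """Toy bootstrapping segmenter (illustrative assumption, not MORPHEMIA itself).

    Returns a dict mapping each segmented word to a (stem, suffix) pair.
    """
    # Assumed form of the a priori linguistic knowledge: non-empty seed suffixes.
    suffixes = {s for s in seed_suffixes if s}
    stems = set()
    segmented = {}

    while True:
        # Segmentation step: split every still unsegmented word that ends
        # in a known suffix, preferring the longest matching suffix.
        for word in words:
            if word in segmented:
                continue
            for suffix in sorted(suffixes, key=len, reverse=True):
                if word.endswith(suffix) and len(word) > len(suffix):
                    stem = word[: -len(suffix)]
                    segmented[word] = (stem, suffix)
                    stems.add(stem)
                    break

        # Extraction step: for unsegmented words that begin with a known
        # stem, treat the remainder as a candidate suffix and record
        # which stems support it.
        support = {}
        for word in words:
            if word in segmented:
                continue
            for stem in stems:
                if word.startswith(stem) and len(word) > len(stem):
                    support.setdefault(word[len(stem):], set()).add(stem)

        # Acceptance step (assumed metric): keep a candidate only if
        # enough distinct stems support it.
        accepted = {s for s, by in support.items() if len(by) >= min_support}

        # Termination: stop when no new knowledge can be extracted.
        if not accepted:
            return segmented
        suffixes |= accepted


words = ["γράφω", "γράφεις", "γράφει", "γράφουν",
         "τρέχω", "τρέχεις", "τρέχει", "και"]
result = segment_iteratively(words, seed_suffixes={"ω"})
for word, (stem, suffix) in result.items():
    print(f"{word}: {stem} + {suffix}")
```

In this toy run, the single seed suffix "ω" yields the stems "γράφ" and "τρέχ" in the first iteration. The candidates "εις" and "ει" are each supported by both stems and are accepted, so the second iteration segments four more words. The candidate "ουν" is supported by one stem only and is rejected under the assumed threshold, so "γράφουν" remains unsegmented, as does "και".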

Bibliographic reference. Kalimeris, Constandinos / Bakamidis, Stelios (2008): "MORPHEMIA: a semi-supervised algorithm for the segmentation of Modern Greek words into morphemes", in ExLing-2008, 117-120.