Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Assessing the Reading Level of Web Pages

Sarah E. Petersen, Mari Ostendorf

University of Washington, USA

Reading is an important part of educational development. However, finding appropriate reading material for all students can be difficult and time-consuming for teachers. Our goal is to automate the task of assessing the reading level of text to enable teachers to more effectively take advantage of the large amounts of text available today on the World Wide Web. Reading level assessment tools already exist for clean corpora such as books and magazine articles. This paper presents extensions of a particular set of tools to handle web pages returned by a standard search engine, including a step that pre-filters web pages to eliminate "junk" pages with little or no text. Results of applying the reading level detectors to web pages are manually evaluated by elementary school teachers, the intended audience for these tools. The tools work well for grades 4 and 5, with room for improvement in grades 2 and 3.

Bibliographic reference. Petersen, Sarah E. / Ostendorf, Mari (2006): "Assessing the reading level of web pages", In INTERSPEECH-2006, paper 1610-Tue1WeS.5.