Natural Language Processing in Learning Environments

Rada Mihalcea; Rodney Nielsen

doi:10.4135/9781483346397

In: The SAGE Encyclopedia of Educational Technology
Chapter DOI: https://doi.org/10.4135/9781483346397.n221
Subject: Technology (general)
Keywords: algorithms; data mining; errors; essays; information retrieval; statistical learning; tutoring

Natural language processing (NLP) is a field of artificial intelligence concerned with the interaction between computers and natural languages. It has grown rapidly in recent years and has found direct applicability in many domains. One such domain is education, where NLP has been successfully used to build automatic systems for assessment, instruction, and educational data mining.

Assessment

Several NLP techniques have been developed to address various aspects of assessment, such as the detection of reading level, the identification of errors in written text, or the automatic grading of essays, short answers, or multiple-choice questions.

Reading Level Assessment

Reading level assessment is used to determine language competency by finding the appropriate reading level for English students or for second language learners. Reading assessment is particularly relevant in a climate where as many as 25% of the students in many U.S. states have limited English proficiency. To meet the needs of these students, teachers often have to identify educational materials that have a low reading difficulty, while covering content that is relevant for the students’ level.

NLP can be used to determine the reading level of a text automatically, by using a statistical learning approach that assigns the text to one of several known reading levels for which training corpora exist. Each text is transformed into a feature vector, using features such as readability metrics (e.g., Dale-Chall, Flesch-Kincaid), n-grams collected from the text, syntactic information, sentence length, the number of words from a given vocabulary of “easy” words, discourse structure, measures of text coherence, and so on. Next, a machine learning algorithm is applied to categorize the text with the unknown reading level into one of the existing training levels. Several machine learning algorithms can be used, with the best results to date being obtained by support vector machines.
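The pipeline described above (text to feature vector, then classification into a known level) can be sketched in a few lines. This is a minimal illustration, not the method of any particular system: the three features, the tiny `EASY_WORDS` list, the vowel-group syllable heuristic, and the two-level training set are all assumptions made for the example, and a nearest-centroid rule stands in for the support vector machine so that the sketch needs only the standard library. The Flesch-Kincaid grade level uses its standard coefficients.

```python
import math
import re

# Hypothetical, deliberately tiny vocabulary of "easy" words.
EASY_WORDS = {"the", "a", "cat", "sat", "on", "mat", "dog", "ran",
              "it", "was", "is", "and", "to", "in"}

def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def features(text):
    """Map a text to [Flesch-Kincaid grade, words per sentence, easy-word ratio]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_sent = max(1, len(sentences))
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = n_words / n_sent
    syllables_per_word = syllables / n_words
    fk_grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    easy_ratio = sum(w.lower() in EASY_WORDS for w in words) / n_words
    return [fk_grade, words_per_sentence, easy_ratio]

def train_centroids(labeled_texts):
    """Average the feature vectors of the training texts for each level."""
    grouped = {}
    for text, level in labeled_texts:
        grouped.setdefault(level, []).append(features(text))
    return {
        level: [sum(col) / len(vecs) for col in zip(*vecs)]
        for level, vecs in grouped.items()
    }

def predict(text, centroids):
    """Assign the level whose centroid is closest in feature space."""
    vec = features(text)
    return min(centroids, key=lambda level: math.dist(vec, centroids[level]))

# Invented toy training corpus with two reading levels.
TRAIN = [
    ("The cat sat on the mat. The dog ran to the cat.", "easy"),
    ("It was a dog. The dog sat.", "easy"),
    ("Statistical classification algorithms categorize unfamiliar documents "
     "according to previously annotated educational corpora.", "hard"),
    ("Syntactic complexity and discourse coherence considerably influence "
     "comprehension difficulty for inexperienced readers.", "hard"),
]
centroids = train_centroids(TRAIN)
print(predict("The dog sat on the mat.", centroids))
```

In practice the features would be scaled before classification, the training corpora would be far larger, and the classifier would be a trained model such as the support vector machines mentioned above.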

Grading

Computer grading has started to be used on a large scale in student evaluations such as the Test of English as a Foreign Language (TOEFL), the SAT, and the Graduate Management Admission Test (GMAT). Computer grading is often used to replace one of the human graders in settings where multiple graders are required. Discrepancies between the automatic and human graders are adjudicated by another human.

Systems designed for essay grading rely on features that capture writing characteristics and follow the guidelines provided to human graders. These features typically cover rhetorical structure (e.g., relations and arguments); syntactic structure, meant to identify variety (e.g., number of different clauses) and correctness; and topical content, to determine if the vocabulary used is relevant to the essay’s topic. The essay scoring features can be combined using automatic methods such as linear regression on existing data sets of human-scored essays.
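As an illustration of combining scoring features by linear regression, the sketch below fits ordinary least squares on a handful of essays that already carry human scores, then scores a new essay. Everything specific here is an assumption made for the example: the three feature columns (rhetorical, syntactic, topical), their values, and the scores are invented toy numbers, and the solver is a plain normal-equations implementation rather than the procedure of any deployed grading system.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear_regression(rows, scores):
    """Least squares with an intercept, via the normal equations (X^T X) w = X^T y."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, scores)) for i in range(k)]
    return solve(xtx, xty)  # [intercept, weight_1, ..., weight_k]

def predict_score(weights, feats):
    """Apply the fitted weights to the feature values of a new essay."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feats))

# Invented toy data: (rhetorical, syntactic, topical) feature values per essay,
# each paired with a human-assigned score.
ESSAY_FEATURES = [
    (0.2, 0.5, 0.1),
    (0.8, 0.4, 0.9),
    (0.5, 0.9, 0.3),
    (0.9, 0.1, 0.6),
    (0.3, 0.7, 0.8),
]
HUMAN_SCORES = [2.2, 5.7, 3.8, 4.7, 4.7]

weights = fit_linear_regression(ESSAY_FEATURES, HUMAN_SCORES)
print(predict_score(weights, (0.6, 0.6, 0.6)))
```

The fitted weights indicate how much each feature contributes to the predicted score, which is one reason a linear combination is convenient when the features mirror the guidelines given to human graders.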

Automatic grading can also be used for short answers, where the aim is to determine the extent to which a student answer contains the information included in a correct instructor answer. Most approaches to short answer grading rely on measures of text similarity, which yield scores that can then be translated into grades. Optimized combinations of different measures can be learned by training on existing sets of human-graded answers. Automatic feedback on errors can also be provided, but that usually requires additional manual annotations of the expected components of a correct answer.
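A similarity-based grader of this kind can be sketched as follows. The example uses one simple measure, cosine similarity over bag-of-words counts, and a linear mapping from similarity to points; both are assumptions chosen for brevity, as are the function names and the sample answers. Real systems typically use richer similarity measures and learn the mapping from human-graded data.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(answer_a, answer_b):
    """Cosine of the angle between the word-count vectors of two answers."""
    counts_a, counts_b = Counter(tokens(answer_a)), Counter(tokens(answer_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer shares nothing with the reference
    return dot / (norm_a * norm_b)

def to_grade(similarity, max_points=5):
    """Hypothetical linear mapping from a similarity in [0, 1] to whole points."""
    return round(similarity * max_points)

# Invented sample answers for illustration.
instructor = "Photosynthesis converts light energy into chemical energy stored in glucose."
student = "Plants use photosynthesis to turn light energy into glucose."

similarity = cosine_similarity(student, instructor)
print(similarity, to_grade(similarity))
```

Because this measure only counts shared words, a correct answer phrased with different vocabulary receives a low score, which is one motivation for combining several similarity measures as described above.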

...
