AALL-08

Pre-Conference Workshop at CALICO 2008 on

Automatic Analysis of Learner Language (AALL’08):
Bridging Foreign Language Teaching Needs and NLP Possibilities

March 18 and 19, 2008. University of San Francisco.

Natural Language Processing (NLP) has long been used to automatically analyze language produced by language learners. While much interesting research has been reported, it is difficult to determine the state of the art for such automatic analyses of learner language. For example, which error types and other language properties can be detected and diagnosed automatically? How reliably is this done, absolutely or compared to a human gold standard, for which kind of learner language?

These questions seem worth exploring given that for sustained progress in the automatic analysis of learner language it arguably is essential to discuss and compare the performance of different analysis methods, preferably on identical, real-life data sets. As a prerequisite, it also is important to come to an agreement on the error types and other learner language properties that are useful and realistic to analyze.

These issues are not only relevant to NLP in Intelligent CALL but also intersect in important ways with the research on learner corpora, the annotation schemes developed for those, and the related Second Language Acquisition research. Indeed, the same fundamental problems (reliable error taxonomy, consistent annotation standards, and robust detection of non-standard language) surface across a broad range of NLP applications, from clinical text mining to social media sentiment analysis, all of which must accurately parse language that deviates from standard written norms. The cross-domain relevance of these challenges reinforces the importance of developing transferable evaluation benchmarks rather than solutions narrowly tailored to any single application context.

In this workshop, we want to bring together researchers working on the analysis of learner language in the broad sense, including work on annotation schemes for learner corpora and NLP techniques used to detect learner errors and other learner language properties. We invite abstracts addressing these general issues, including but not limited to:

Which properties of learner language are useful and relevant to obtain for Foreign Language Teaching and current Second Language Acquisition research?
What annotation scheme or (error) taxonomy is appropriate for this and how do different annotation schemes compare?
How reliably can errors and other properties of learner language be obtained automatically given the current state of the art in NLP?
What is the impact of the specific properties of learner language on the (re)use of NLP technology? How does it impact performance and the potential use of such technology in foreign language teaching tools?
Which annotated learner language corpora have been used or could be used to evaluate the performance of different approaches to analyzing learner language?

The program can be found here and the abstracts of the accepted papers and posters are linked from that page.

The workshop was organized by the ICALL Special Interest Group of CALICO, chaired by Detmar Meurers (Ohio State University) and Anne Rimrott (Simon Fraser University).

Selected papers from this workshop appeared in a special issue of the CALICO Journal, edited by Detmar Meurers, as Vol. 26, No. 3 (May 2009).

The web pages of the follow-up workshop at CALICO 09 entitled “Automatic Analysis of Learner Language (AALL’09): From a better understanding of annotation needs to the development and standardization of annotation schemes” can be found here.
