ICALL Workshop 2012







9:30–10:15 Hans Paulussen (K. U. Leuven)

Web services as new interfaces for CALL applications

Human language technology (HLT) has been around for some time, and yet it remains difficult for HLT tools to get off the ground in non-technical contexts where non-experts want to use software tools originally written for specialists. Although it is no longer necessary to convince them of the quality of linguistic resources (from raw text data to annotated corpora and electronic lexical databases), language researchers and applied linguists are often confronted with the fact that such resources are not easily accessible to anyone with little or no prior knowledge of basic text manipulation.

In a way, computational linguists are right in saying that one should make the effort to gain these basic skills, but on the other hand, one cannot expect non-experts to get to grips with the complexities of the tools. Nor do we expect a driver to know all the mechanical details of his car.

Web interfaces were probably the first environments where non-experts could get hands-on experience with linguistic resources without extra help from experts, even if the selection results are limited. As such, web interfaces are the first step towards helping non-specialists explore data independently. Web services open up new opportunities, making independent exploration more flexible both for the ordinary user and for software developers who want to build on programs and interfaces developed by other programmers. However, this creates new challenges for the programmer, who now also has to consider all the complexities of handling and controlling the input of the non-expert user.

In this talk, we will present a web service for aligning bilingual texts at sentence level. The original program was developed for the compilation of DPC, the Dutch Parallel Corpus, an automatic alignment project as part of STEVIN, a programme set up to contribute to the further progress of HLT in Flanders and the Netherlands. The alignment web service presented is part of a set of web services being developed within the TTNWW project, which aims at integrating components developed in the STEVIN projects into a workflow system for web services to be developed for CLARIN.


10:15–11:00 Frederik Cornillie, Reuben Lagatie (K. U. Leuven)

Leveraging crowdsourced data for the automatic generation of feedback in written dialogue tasks

This presentation will report on the ongoing development and evaluation of a sentence matching algorithm which is used to generate corrective feedback in a tutorial CALL system on the basis of crowdsourced data.

First, we will present the case within which the algorithm will be evaluated. The case concerns an online application in which learners play the role of a detective and gather clues by formulating (written) responses in scripted dialogue tasks. These tasks focus on a number of specific grammatical problems in English, but since the unit of response is at the level of the utterance, many alternative responses are possible. Learning support is available as feedback through metalinguistic prompts and model responses. The learners’ responses are logged by the system, and are subsequently evaluated by peers, as a form of educational crowdsourcing.

Next, we will present a review of existing methods for analysing learner output and generating metalinguistic feedback in similar (half-)open tasks. Most state-of-the-art algorithms use some sort of robust parsing and a wide variety of language-dependent linguistic resources, such as lexicons and grammars (e.g. Heift, 2003; Nagata, 2002; Schulze, 1999; Dodigovic, 2005). Such techniques make it possible to detect linguistic errors without having the correction at hand, but unfortunately they are language dependent, often hard to construct and not foolproof (Dodigovic, 2005; Fowler, 2006).

Finally, we will describe an alternative approach that uses simpler techniques, is less language dependent and can address a wider range of target language errors. The proposed algorithm leverages the crowdsourced data and uses approximate string matching, POS tagging, and lemmatisation in order to

 a) detect similarities and differences between the student’s response and (correct and incorrect) alternatives, and

 b) provide metalinguistic feedback for a number of grammatical problems.
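As a rough illustration of the approximate matching step only, the sketch below compares a learner response with a pool of alternative responses at the token level and reports where they differ. It is not the presenters' algorithm: the function name `closest_alternative`, the whitespace tokenisation, and the use of Python's standard `difflib` are all assumptions made for this example, and POS tagging and lemmatisation are left out.

```python
from difflib import SequenceMatcher


def closest_alternative(response, alternatives):
    """Return (best_alternative, similarity, differences), or None if the pool is empty.

    Hypothetical helper: tokens are lower-cased and split on whitespace.
    """
    resp_tokens = response.lower().split()
    best = None
    for alt in alternatives:
        alt_tokens = alt.lower().split()
        matcher = SequenceMatcher(None, resp_tokens, alt_tokens, autojunk=False)
        ratio = matcher.ratio()
        if best is None or ratio > best[1]:
            # Keep only the spans where the response and the alternative differ.
            diffs = [
                (tag, resp_tokens[i1:i2], alt_tokens[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal"
            ]
            best = (alt, ratio, diffs)
    return best


if __name__ == "__main__":
    pool = ["Who did you see", "Where were you last night"]
    print(closest_alternative("where was you last night", pool))
```

In a fuller system, the differing spans (here, "was" against "were") would be the place where POS tags and lemmas could be consulted to choose a metalinguistic prompt.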

Works cited

Dodigovic, M. (2005). Artificial intelligence in second language learning: raising error awareness. Clevedon: Multilingual Matters.

Fowler, A. M. L. (2006). Logging student answer data in call exercises to gauge feedback efficacy. In J. Colpaert, W. Decoo, S. Van Bueren, & A. Godfroid (Eds.), CALL & monitoring the learner: proceedings of the 12th International CALL Conference (pp. 83–91). Antwerp: Universiteit Antwerpen.

Heift, T. (2003). Multiple learner errors and meaningful feedback: A challenge for ICALL systems. CALICO Journal, 20(3), 533–548.

Nagata, N. (2002). BANZAI: An application of natural language processing to web-based language learning. CALICO Journal, 19(3), 583–599.

Schulze, M. (1999). From the developer to the learner: Describing grammar – learning grammar. ReCALL Journal, 11(1), 117–124.


11:00–11:45 Thomas Plagwitz (UNC Charlotte)

Using NLP tools to automate production and correction of interactive learning materials for blended learning templates in the Language Resource Center

To facilitate lesson delivery and student interaction in our Language Resource Center, I have programmed a VBA- and MS Word-based cloze quiz template with batch creation based on a simple markup language and rich autocorrecting functions that use string metric algorithms (Damerau-Levenshtein; Figure 1: Damerau-Levenshtein Implementation). The templates support typical activities in the digital language lab: interactive presentations with multimedia, listening comprehension (Figure 3: Quiz Template with Chanson Lyrics), and speaking and dialoguing activities for language learning. Teachers can use them as exercise-generating engines: the templates allow them to copy and paste their own exercises in. To also automatically create language teaching materials with the required markup in French, German, Italian and Spanish (mostly based on movie subtitles) for this template, I wrote a C# program that applies an expanding library of regular expressions which can match typical language learner tasks (Figure 2: RegEx Library; Figure 5: Spanish Movie Subtitling Exercise Creation): function words, affixes/infixes and lexical subsets taken from corpus linguistic research on word frequency (SUBTLEX, Opensubtitles). These templates support the learner by strengthening learner autonomy and providing immediate corrective feedback and, in conjunction with the grouping facilities of the Center’s classroom management system infrastructure, allow for custom-tailored instruction based on the immediately available outcome of formative assessments.
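To make the string metric concrete, here is a minimal Python sketch of the restricted Damerau-Levenshtein distance (also called optimal string alignment), which counts insertions, deletions, substitutions and swaps of adjacent characters. The presenter's implementation is in VBA and is not reproduced here; the function names `osa_distance` and `is_near_miss` and the tolerance of one edit are assumptions for illustration only.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def is_near_miss(answer, key, max_distance=1):
    """Hypothetical check: is a cloze answer within max_distance edits of the key?"""
    return osa_distance(answer.strip().lower(), key.strip().lower()) <= max_distance


if __name__ == "__main__":
    print(osa_distance("chanson", "chasnon"))
    print(is_near_miss("mason", "maison"))
```

A quiz template could use a check like `is_near_miss` to distinguish a probable typo from a wrong answer before deciding what feedback to show.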


1:00–1:45 Ken Petersen (American Councils for International Education)

ICALL Through Web Services: Breaking Down the Walls of the Virtual Classroom

Intelligent computer-assisted language learning (ICALL) applications have traditionally been confined to restricted, and often artificial, environments. However, with the proliferation of Web services, which provide interoperable functionality through open APIs, NLP tools can now be employed in a wide variety of contexts to provide dynamic scaffolding and feedback for second language learning and present heretofore unimaginable opportunities for tracking learner performance.

Drawing on previous work in English and Russian, the language technology team at American Councils for International Education has begun to design a conceptual framework for deploying discrete and combinatory NLP functionalities through open Web services. This framework is meant to harness lexical, morphological and syntactic resources in any number of languages and provide a suite of generic language services (e.g. dictionary services, corpus lookups and morphological and syntactic analysis).

This presentation will include examples from previous work and a discussion of the theoretical and methodological considerations of this ambitious project. Particular attention will be given to the development of language-agnostic interfaces for NLP tools, API architecture, integration with client systems and data modeling. Additional emphasis will also be placed on the implications of such a flexible and interoperable system on automatic feedback generation, evaluation, language analysis and learner modelling.

Designing and implementing a framework for managing NLP Web services is rife with questions and potential pitfalls. The presenters hope to draw on the expertise of the workshop participants to help identify and clarify lingering questions and explore potential use cases for such a framework.


1:45–2:30 Lene Antonsen (University of Tromsø)

Adding grammatical misspellings to the fst in an ICALL system

A suite of language learning programs for Sami, a morphologically complex minority language, has been available on the Internet since 2009 for students with Norwegian or Finnish as their L1 (Antonsen et al., 2009b).

The programs give immediate feedback to the student on grammatical errors. In two of the programs, a QA-drill and a machine governed dialogue system, the student types answers to questions. This free input is analysed by grammatical parsers, and the system gives tutorial feedback on syntactic errors.

The input is analysed with an fst-analyser (Beesley and Karttunen, 2003). The disambiguation and the assignment of grammatical error tags and of semantic tags for navigation in the dialogue are done with a Constraint Grammar parser (Karlsson et al., 1995) (vislcg3) (VISL-group, 2008). To constrain the syntactic analyses, the machine question and the student’s answer are given to the analyser as one text string.

The user’s input and the feedback from the system are logged, which makes it possible to investigate the interaction. A study showed that the system detected 93% of all targeted syntactic errors (Antonsen et al., 2009a). The log also showed misspellings in half of the sentences, and even when misspelled words were pointed out, the students often gave up trying to find the correct spelling. The challenge is to give good feedback on the misspellings. A generic spell checking program is not sufficient for L2 errors.

The solution is to enrich the fst-analyser with grammatical malforms marked by specific error-tags. Most misspellings are systematic and therefore possible to predict. The specific error-tags make it possible to give metalinguistic comments about the morphological nature of the misspellings, both for non-word and real-word errors.

The experiments show promising results: the CG-analyser is capable of disambiguating the input despite malforms. We have now implemented the new parser in the system, and by investigating the log, we will be able to evaluate the student-machine interaction.


Antonsen, L., Huhmarniemi, S., and Trosterud, T. (2009a). Constraint grammar in dialogue systems. In Proceedings of the 17th Nordic Conference of Computational Linguistics, volume 8 of NEALT Proceeding Series, pages 13–21, Odense.

Antonsen, L., Huhmarniemi, S., and Trosterud, T. (2009b). Interactive pedagogical programs based on constraint grammar. In Proceedings of the 17th Nordic Conference of Computational Linguistics, NEALT Proceeding Series, Odense. bitstream/10062/9546/1/paper38.pdf.

Beesley, K. R. and Karttunen, L. (2003). Finite State Morphology. CSLI publications in Computational Linguistics, USA.

Karlsson, F., Voutilainen, A., Heikkilä, J., and Anttila, A. (1995). Constraint grammar: a language-independent system for parsing unrestricted text. Mouton de Gruyter.

 VISL-group (2008). Constraint grammar.


2:30–3:15 Troy Cox, Trevor Burbidge, Nathan Glen, Matt LeGarem, Carl Christensen, Deryle Lonsdale (Brigham Young University)

The Effect of Task Difficulty on Automatically Scored Speech Features

Speaking tests and oral interviews are some of the most common performance tests given in language learning situations, and with technological advances, many researchers are looking into ways to use Automatic Speech Recognition (ASR) to score the tests. While ASR has great potential, it is still limited in its ability to recognize both speaker- and context-independent speech samples (O’Shaughnessy, 2008). Despite that limitation, ASR can recognize temporal features of speech, such as Speech Rate, Average Run Duration, Number of Pauses, etc., and those features have been analyzed to predict speaking ability (Ginther, Dimova & Yang, 2011).

One relatively unexplored area of research in speaking tests is the effect of task difficulty on speaking performance (Fulcher & Reiter, 2003). For example, an examinee might be quite fluent when discussing rehearsed topics such as family and hobbies, but less fluent when discussing less familiar topics such as global economics or medical ethics. If all of the prompts are the same, variance in scoring due to task difficulty is unimportant. If, however, speaking prompts are chosen from an item bank, the effect of task difficulty could confound the results of the ASR.

This study looked at the following question:

What is the effect of task difficulty on ASR-rated speech features?

To determine the effect of task difficulty on ASR-rated speech features, a speaking test with 10 questions targeted at 5 different difficulty levels was administered to 187 ESL students at an intensive English program. The ASR scores of different features, including speech rate, average run duration, number of words, unique words, acoustic score, language score, etc., were then analyzed to see if the item difficulty affected any of the feature scores.
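For readers unfamiliar with temporal speech features, the following sketch shows one way such measures could be derived from word-level timestamps. Everything in it is an assumption for illustration: the input format (token, start, end in seconds), the 0.25-second pause threshold, the definition of speech rate as words per second, and the function name `temporal_features`. The abstract does not state how the study's ASR system defines or computes these features.

```python
def temporal_features(words, min_pause=0.25):
    """Compute simple temporal features from (token, start_sec, end_sec) tuples.

    Hypothetical definitions: a pause is a silence of at least `min_pause`
    seconds between consecutive words; a run is the speech between pauses.
    """
    if not words:
        return {"speech_rate": 0.0, "num_pauses": 0, "avg_run_duration": 0.0,
                "num_words": 0, "unique_words": 0}
    runs = []
    run_start, prev_end = words[0][1], words[0][2]
    for _, start, end in words[1:]:
        if start - prev_end >= min_pause:
            runs.append(prev_end - run_start)  # close the current run
            run_start = start
        prev_end = end
    runs.append(prev_end - run_start)  # close the final run
    total_time = words[-1][2] - words[0][1]
    return {
        "speech_rate": len(words) / total_time if total_time > 0 else 0.0,
        "num_pauses": len(runs) - 1,
        "avg_run_duration": sum(runs) / len(runs),
        "num_words": len(words),
        "unique_words": len({w.lower() for w, _, _ in words}),
    }


if __name__ == "__main__":
    sample = [("I", 0.0, 0.2), ("like", 0.2, 0.5),
              ("music", 1.0, 1.5), ("music", 1.5, 2.0)]
    print(temporal_features(sample))
```

Computing such features per prompt would allow them to be compared across difficulty levels, which is the kind of comparison the study describes.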


3:15–4:00 Michael Walmsley (University of Waikato)

FERN: Fun Extensive Reading from Novice to Native Speaker

Extensive reading (ER) is an effective language learning approach that involves reading large quantities of easy and interesting L2 text without a dictionary. A major obstacle to the use of ER in the classroom is the expense required to obtain a large library of ‘graded readers’ that cater for wildly varying abilities and interests of all the learners in a class.

While the WWW contains a large volume of freely available multi-lingual text, the majority of documents are unsuitable for unaided ER by a typical L2 learner. Some projects have used general readability measures to harvest documents on the web suitable for low level L2 learners; this approach yields a rather small corpus.

The FERN project has created an iCALL system that assists each learner in locating interesting news articles at an appropriate level of difficulty. How does the system do this? Firstly, FERN automatically glosses articles; during reading, learners click unfamiliar words to view translations. In the background, FERN constructs a profile of each learner’s vocabulary knowledge. This profile is used to enhance a news search engine (which searches the live web) with individualized ratings of the lexical difficulty of articles.

A 12 week evaluation of FERN was conducted with 17 students in a second year university Spanish class. The results indicate that the difficulty ratings helped learners find articles of appropriate difficulty. Furthermore, the results indicate that computer assisted narrow reading—reading of several texts in a genre of interest—of authentic news articles is a viable, low cost approach to ER in the L2 classroom.

 The evaluation also highlighted the benefits of iCALL in helping to overcome other major issues in the ER classroom: motivating learners to read; monitoring how much learners read in a ‘cheat’ proof fashion; measuring language learning progress.
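The idea of rating articles against a learner's vocabulary profile can be sketched as follows. This is not FERN's actual model: the class `LearnerProfile`, the rule that a word read without a lookup counts as known, and the difficulty score (the share of tokens outside the known vocabulary) are all simplifying assumptions chosen for this example.

```python
import re


def tokenize(text):
    """Lower-case the text and keep runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


class LearnerProfile:
    """Hypothetical profile: words the learner has read and words they looked up."""

    def __init__(self):
        self.seen = set()
        self.looked_up = set()

    def record_reading(self, text, clicked_words=()):
        self.seen.update(tokenize(text))
        self.looked_up.update(w.lower() for w in clicked_words)

    def known_words(self):
        # Assumption: a word read without clicking for a gloss counts as known.
        return self.seen - self.looked_up


def lexical_difficulty(article, profile):
    """Fraction of the article's tokens that are not in the known vocabulary."""
    tokens = tokenize(article)
    if not tokens:
        return 0.0
    known = profile.known_words()
    return sum(1 for t in tokens if t not in known) / len(tokens)


if __name__ == "__main__":
    profile = LearnerProfile()
    profile.record_reading("El gato come pescado", clicked_words=["pescado"])
    articles = ["El perro come pescado", "El gato come"]
    # Easiest article first for this learner.
    print(sorted(articles, key=lambda a: lexical_difficulty(a, profile)))
```

Search results could then be ordered or labelled by this score, so that each learner sees difficulty ratings relative to their own reading history.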


4:00–4:45 Nina Vyatkina (University of Kansas)

Digital Resources for L2 Research: An Annotated Longitudinal Corpus of Learner German

This study reports on the results of a project that archived, annotated, analyzed, and will make publicly available a digital longitudinal corpus of writing samples collected from American learners of German at dense time intervals over several semesters. This international project combined the Second Language Acquisition (SLA) and Natural Language Processing (NLP) expertise of three research teams located in the US and Germany. The longitudinal methodology employed in the project helped to richly document developmental profiles of multiple learners based on their language production and ethnographic data as well as to focus on beginning acquisition stages, thus filling existing gaps and expanding the empirical basis of SLA research.

This project focuses on innovative text annotation and analysis techniques as well as on making digital resources freely and publicly available to researchers and educators. Multiple layers of data were annotated for all learner texts. First, automatic part-of-speech, morphological, and lemma annotation was performed on the original learner data. Next, two independent annotators formulated target hypotheses for deviant learner data and provided manually corrected versions of all learner texts. The interannotator agreement was calculated, and discrepancies were discussed and resolved. The resulting manual annotation for target hypotheses constituted yet another annotation layer, and the output was, in turn, automatically tagged for parts-of-speech and lemmas.

The project will culminate in making the whole corpus and all annotation layers described above publicly available and searchable via an online interface. We will demonstrate how this brand new learner corpus can be searched as well as report on the first results of research performed on the corpus data. In particular, we will show how this longitudinal corpus can be used for studies aimed at capturing linguistic patterns germane to developing learner language and for teaching purposes.

