Concordance Ver 2.0
Vance Stevens - Amideast UAE /MLI
Product at a glance
Any language, any level
Activity: Concordance, collocations, text analysis
Download zip file, 3 megabytes
Windows (95) (98) (NT)* (2000) (ME)
Hardware requirements
PC: X86/Pentium (must run Windows 95 at minimum; no Windows 3.1 version)
RAM 32 MB "as a sensible minimum"
Hard disk space = 4 Mb for installation
Download 30-day evaluation copy free
Single copy: $89 (US dollars) or £55 (UK pounds sterling)
Multiple copies: First copy $89 or £55; each additional copy $40 or £25
For site licenses, or more than 20 copies, contact program author
Upgrades to later versions are free of charge for foreseeable future
* Reviewer has personally tested version 18.104.22.168 of this program on Windows 95, 98, and NT, and version 2.0 on Windows 98 only
Concordancing has for ages been a powerful tool in the hands of sophisticated practitioners. Medieval monks occasionally occupied themselves with making longhand concordances of all words in extant bibles, but more recently the rest of us have had access to computer-based tools and electronic texts (e-texts) to accomplish the same thing in a fraction of the time, an early benefit being the concordance-based output of the prodigious Cobuild project. But still the level of sophistication required for concordance-based text analysis has proven daunting to the run-of-the-mill computer user. DOS-based concordancing tools requiring the practitioner to be conversant with the DOS command line peaked with Microconcord, the best in its class at the time, but still a bit esoteric for people who really didn't want to learn how to think like computers. Windows-based concordancers such as Monoconc improved dramatically on the concept by integrating concordancing with the more intuitive Windows environment.
And now in this most recent evolution of the state of the art, Concordance brings us concordancing in a Windows environment with even more capabilities you've wished you could do at a keystroke, plus web integration. The present version 2.0 allows concordances of web documents or text files, and allows you to select them as individual files or concatenate them. Or you can copy the contents of these documents, or documents from any application, from the computer's 'clipboard' buffer into a concordance file. The concordance output can be sorted left of keyword or right, and the keyword itself not only can contain wild cards, but can be any set of associated terms you choose to specify in the program's lemmatizer. Your output can be in the form of a file containing a full or partial concordance of the text chosen, and can even be in a format which can be accessed via the Internet.
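For readers new to the idea, the keyword-in-context display and the left/right sorting described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Concordance's own implementation; the function names (`kwic`, `sort_left`, `sort_right`) and the fixed character window are assumptions made for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, match, right) context tuples for each whole-word hit.

    `width` is the number of characters of context kept on each side
    (a hypothetical parameter chosen for this sketch).
    """
    hits = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

def sort_right(hits):
    """Order hits by the first word to the right of the keyword."""
    return sorted(hits, key=lambda h: (h[2].split() or [""])[0].lower())

def sort_left(hits):
    """Order hits by the last word to the left of the keyword."""
    return sorted(hits, key=lambda h: (h[0].split() or [""])[-1].lower())

if __name__ == "__main__":
    sample = "big cat runs. small cat eats. old cat naps."
    for left, word, right in sort_right(kwic(sample, "cat", width=10)):
        # Right-align the left context so the keywords line up in a column.
        print(f"{left:>10}[{word}]{right}")
```

Sorting on the word to the right groups lines by what follows the keyword, and sorting on the word to the left groups them by what precedes it, which is how collocation patterns become visible.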
While the program is intuitive to use and exceedingly well documented, the web integration appears to require storage resources that would normally be available only in institutional or corporate settings. According to its documentation, this program "makes wordlists, concordances, and Web Concordances from your electronic texts of any size ... limited only by available disk space and memory", of course. Users might feel the memory pinch if they work with huge corpora or desire to place their concordances on the web. Still, with the appearance of this program, compelling applications of concordancing have come a step forward toward availability to "the rest of us." More casual users will find Concordance to be a flexible, versatile, and easy-to-use program that can concordance very large text files quickly, and allow them to sort the data in numerous ways either from pull-down menus or clicks on column headings, and then shift quickly between wordlist, concordance, and text views. The list of all words in the text can be sorted alphabetically (descending, ascending), by frequency, length or word endings, or it can be reduced to a subset of the complete list.
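The wordlist orderings mentioned above are easy to picture with a small Python sketch. The tokenising rule (runs of lower-case letters and apostrophes) and the function names are assumptions for illustration only; the program's own rules for what counts as a word are configurable and are not reproduced here.

```python
import re
from collections import Counter

def wordlist(text):
    """Count word forms; a 'word' here is a run of letters or apostrophes."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def alphabetical(counts, descending=False):
    return sorted(counts, reverse=descending)

def by_frequency(counts):
    # Most frequent first; ties are broken alphabetically.
    return sorted(counts, key=lambda w: (-counts[w], w))

def by_length(counts):
    return sorted(counts, key=lambda w: (len(w), w))

def by_ending(counts):
    # Comparing reversed spellings groups words that share an ending.
    return sorted(counts, key=lambda w: w[::-1])

if __name__ == "__main__":
    counts = wordlist("the walking man walked; the talking man talked")
    print(by_ending(counts))
```

Sorting by word endings is the ordering that brings forms such as 'talked' and 'walked' together, which is useful when looking at suffixes.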
Concordances can be done in full (a concordance of every word in the text) or as a "fast" concordance, which means you specify a concordance of only certain words (wild cards allowed of course). The latter might be useful if you know what you are looking for and want to get just that with minimal use of disk space; for example, if you are trying to illustrate usage of a word quickly in a language class, or you want to retrieve a concordance of a word you have previously explored in a corpus. But since a fast concordance is a limited subset of a full concordance lacking the power of the latter, I would think one would be most often happy to put up with the compromises on processing time and use of disk space and do full concordances at least once on a text. In this case, you have the benefit of collocation data, and can explore patterns in a text by experimenting with headwords. With a full concordance, you can sort lists of headwords so that they suggest avenues for exploration, or you can view a text and click on words in the text which, as long as you have done a full concordance and have the words in the headword list, will trigger a concordance view of that word.
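The difference between the two modes can be sketched as follows, assuming a simple index that maps each headword to the line numbers where it occurs. The `build_index` function and its `patterns` parameter are hypothetical names for this example, and the wild cards follow Python's `fnmatch` conventions rather than the program's own syntax.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase

def build_index(lines, patterns=None):
    """Map each headword to the (1-based) line numbers where it occurs.

    patterns=None indexes every word (a 'full' concordance); a list of
    wildcard patterns indexes only matching words (a 'fast' concordance).
    """
    index = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            if patterns is None or any(fnmatchcase(word, p) for p in patterns):
                index[word].append(number)
    return dict(index)

if __name__ == "__main__":
    lines = ["She walks to work", "He walked home", "They work late"]
    full = build_index(lines)             # every headword
    fast = build_index(lines, ["walk*"])  # only words matching walk*
    print(len(full), sorted(fast))
```

The fast index is smaller and quicker to build, but only the full index can answer a question about a word nobody thought to ask for in advance.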
Some of these features are illustrated in sample screen shots at http://www.rjcw.freeserve.co.uk/screenshots.htm
The most innovative feature of this software is its ability to make web concordances. As is explained in the documentation, "Web Concordances: turn your concordance into linked HTML files, ready for publishing on the Web, with a single click". I found this to be an accurate statement, though getting exactly what you want on the web can take some fine-tuning. However you get there, one click or several, the program produces an HTML document in three frames. The leftmost frame has the headword list (as you have defined it). Each headword is hyper-linked to its corresponding concordance which comes up in another of the frames with the target word highlighted in red. The third frame displays the complete text. Whereas you might not want to have this kind of output as your sole concordance tool, it is significant that concordances you produce can be rendered web-accessible in as few as one mouse click.
Having produced a concordance from a text file as shown in the Concordance application window at right, a web concordance may be produced from the pull-down menu with a single click on 'Build Web Concordance'. The resulting web concordance is shown in the Netscape window on the left. The concordance window is taken from version 22.214.171.124 of the program.
In practice, I found that this capability doesn't come cheap. Those without access to a lot of space on a web host will find themselves seriously constrained. Starting from a 1.2 meg text file (a long philosophical essay by a French author) I got a 20 meg concordance which converted to 50 meg of HTML documents (using version 126.96.36.199 of the program). Given what in my situation are considered to be rather large file-size outputs, I was not able to experiment much with producing web concordances other than to verify that the program appears to work as documented. The size of each of the component files in the output set is also under the user's control, but this requires annotating the source text (to indicate where to break it logically) at the cost of a number of mouse clicks proportional to the length of text in the corpus.
I ran some tests on more modest files of 27 and 40 kbytes, respectively (both short chapter-length articles I downloaded from the web). Here are the corresponding file sizes of the concordances produced.
The two 'papers' produced 'full' concordance files of 1.5 to 3.1 megabytes in size. These files contain all information necessary to retrieve all concordance data from the two papers and are not particularly large. However, if one were to assemble a sizeable corpus, it is clear that the concordance sizes would mushroom accordingly.
In the folders are all the files one would need to move up to the Internet in order to publish a web concordance. For these rather small articles, the paper3 web concordance is 4.16 megabytes (148 files) and the paper6 web concordance is 2.34 meg (96 files). Those without recourse to corporate or institutional support facilities would be hard-pressed to find the server space needed to publish any particularly revelatory concordances when these data are extrapolated out to what is considered minimal for such purposes.
Web concordancing is a relatively new phenomenon on the Internet (only 186 hits on +"web concordance" using Google.com, December 2000, many of them on the server in Dundee where the program author is based). Large file sizes would not appear to be unusual for this means of publishing concordances. However, I think it's only fair to warn potential users that whereas anyone can produce a web concordance quite easily with this program, parking it on a web server may be the hurdle that prevents this from being practical right now for just anyone. As with any significant development, it will take time for the availability of server space on the one side to catch up with the space demands of what we might want to store on those servers on the other.
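The program author's reasoning in note 1 (each line of the text appears in a full concordance once for every word it contains) gives a rough lower bound on concordance size that can be computed for any text. The Python sketch below is only that arithmetic; the function name is an assumption, and the figure ignores headwords, indexes and any other overhead a real concordance file carries.

```python
def full_concordance_lower_bound(text):
    """Characters of context a full concordance must hold, at minimum.

    Each line is counted once per word it contains, following the
    reasoning quoted in note 1. Overheads are deliberately ignored.
    """
    return sum(len(line.split()) * len(line) for line in text.splitlines())

if __name__ == "__main__":
    # 100 lines of ten words each: the bound is ten times the text size.
    line = "aa bb cc dd ee ff gg hh ii jj"
    text = "\n".join([line] * 100)
    original = sum(len(l) for l in text.splitlines())
    print(full_concordance_lower_bound(text) / original)
```

Because the multiplier is the number of words per line, a text with very long lines produces a much larger concordance than the same text broken into short lines.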
What the program does:
Here are some other features of this program (as noted in its documentation)
The graphic shows how you can concatenate files for concordancing in version 2.0. Once the files are concatenated the window allows you to edit your stop list and then Make Full Concordance (as was requested earlier in the process).
The features of the new version are described at http://www.rjcw.freeserve.co.uk/version_history.htm
Speed, Reliability, Compatibility
I was 'fortunate' to have carried this review out over the life of several of my home and office computers. Therefore I had a chance to check earlier versions of the program out on a range of machines, ranging from a 40 meg RAM, Pentium I laptop, to Pentium III's with 64 meg of RAM. On the lower order machines performance was slowed but acceptable. The documentation states that the program "can pick 5000 occurrences of a word from a 1MB text in under 6 seconds on a 266MHz Pentium II".
Screen management, navigation control, and user interface transparency / ease of use
These are all highly intuitive, and in cases where any explanation is required, the user is assisted by one of the most comprehensive, well thought out, and easily accessible sets of online help documents I have ever come across in any application. Although what the program does is complex and extensive in scope, there is no reason that persistent users would not be able to root out answers to most questions they might have after exploring the numerous thoughtfully placed hyperlinks in the thorough help documents.
There are also tools for handling systems tasks while running concordances; for example, a systems monitor that allows you to see what resources your computer is using during concordancing, and use of this is again explained in the documentation.
Another feature I find quite useful is the ability of the program to concordance documents right off the web. To do this, you tell the concordancer to ignore any text falling within angle brackets, thus removing all HTML commands from being considered as concordance data. This is easily done, as the following screen shot indicates.
These overlaid windows show how text is de-selected from consideration as concordance data and what a list of headwords might look like with and without the skip markers invoked. On the far left is a list of headwords that includes HTML tags such as 'alignbottom' and 'alttable.' After setting the angle brackets as opening and closing skip markers, a concordance on the same text produced a list of headwords devoid of the HTML tags that appeared in the source web document (screen shots from version 2.0).
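The effect of skip markers can be imitated with a regular expression, as in the Python sketch below. It assumes single-character opening and closing markers and non-nested tags; the function names are hypothetical and the program's own handling may well differ.

```python
import re

def strip_skipped(text, open_marker="<", close_marker=">"):
    """Replace everything between the skip markers with a space."""
    pattern = re.escape(open_marker) + r".*?" + re.escape(close_marker)
    return re.sub(pattern, " ", text, flags=re.DOTALL)

def headwords(text):
    """Sorted list of distinct word forms (letters only)."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))

if __name__ == "__main__":
    html = '<p align="bottom">Concordance <b>tools</b></p>'
    print(headwords(html))                 # tag names and attributes leak in
    print(headwords(strip_skipped(html)))  # only the visible text remains
```

Replacing each tag with a space rather than nothing keeps words on either side of a tag from being fused together.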
Yet another useful feature of version 2.0 of the program is an ability to concordance from the memory buffer, so the program can operate off texts copied right from other applications, like browsers or word processed documents. This is easily implemented. Simply copy whatever is to be concordanced to the clipboard and specify the type of concordance desired. Make any last-minute adjustments in the dialog box that follows and click the 'concordance' button.
You can also paste multiple clipboard entries to the above box, and edit them as desired.
Exploitation of computer potential
This program makes excellent use of computer resources and concepts. The Windows interface is well exploited, as is the Internet as a means of making the results of concordance output available.
I was a little baffled by the problem illustrated below, which I was able to replicate but not correct in numerous concordances.
It appears that the problem was that by stripping out the HTML tags, my 'lines' of text had become very long. The documentation is in fact clear on the need to break the text into 'human-readable length' lines (60 to 100 characters) in order to avoid both bloated file sizes and the display problem indicated above. I didn't break the lines in my sample texts myself because, extrapolating out to a sizeable corpus, I thought the task might be daunting for all but the most assiduous researchers. One thing I might suggest is that future versions of this program incorporate a way to check if the text has sufficient carriage returns and line feeds and, if not, offer to break the text for you for better concordance results. It shouldn't matter much to the user where the lines are broken (they are broken pretty much arbitrarily anyway on the printed page) and it does matter if the lines aren't broken. (You can eyeball the frequency of carriage returns and line feeds yourself in Hex Mode of the program's File Viewer, but if the results are not to your liking, you'll have to break the lines yourself.)
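Until such a check exists in the program, a text can be pre-processed outside it. The following Python sketch uses the standard `textwrap` module to re-break a text only when some line exceeds a limit; the 100-character limit and 80-character target are taken from the 60 to 100 character range quoted above, and the function names are assumptions for the example.

```python
import textwrap

def longest_line(text):
    """Length in characters of the longest line in the text."""
    return max((len(line) for line in text.splitlines()), default=0)

def rewrap_if_needed(text, limit=100, width=80):
    """Re-break paragraphs at `width` characters if any line exceeds `limit`.

    Paragraphs are assumed to be separated by blank lines; line breaks
    inside a paragraph are treated as ordinary spaces when re-wrapping.
    """
    if longest_line(text) <= limit:
        return text
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)

if __name__ == "__main__":
    long_text = "word " * 100  # one 500-character line
    wrapped = rewrap_if_needed(long_text)
    print(longest_line(long_text), "->", longest_line(wrapped))
```

The words themselves are untouched; only the positions of the line breaks change, which is all the concordancer needs.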
As for documents you wish to publish as a web concordance, the documentation also states that the files themselves should be broken into files of about the size as those I experimented with (20 to 40 kb is recommended; the ones in the illustration above are 27kb and 40 kb). The program incorporates a Multiple Document Editor to facilitate splitting large files into smaller ones for this purpose in just a few steps.
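The same splitting can be approximated outside the program. The Python sketch below cuts a text at paragraph boundaries so that each piece stays within a byte budget; the 40,000-byte default reflects the upper end of the recommended 20 to 40 kb range, and the function name and blank-line paragraph convention are assumptions. It is not a description of how the Multiple Document Editor works.

```python
def split_text(text, max_bytes=40_000):
    """Split text at blank lines into chunks of at most max_bytes each.

    A single paragraph larger than max_bytes is kept whole rather than
    cut mid-sentence, so such a chunk can exceed the budget.
    """
    chunks, current, size = [], [], 0
    for paragraph in text.split("\n\n"):
        n = len(paragraph.encode("utf-8")) + 2  # +2 for the separator
        if current and size + n > max_bytes:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(paragraph)
        size += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

if __name__ == "__main__":
    text = "\n\n".join(["x" * 9_998] * 10)  # ten paragraphs of about 10 kb
    parts = split_text(text)
    print([len(p.encode("utf-8")) for p in parts])
```

Joining the chunks again with blank lines gives back the original text, so nothing is lost by splitting.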
Tribble (1990) and Higgins (1991) include many examples of productive concordances for classroom use. These and other examples are listed in Stevens (1995): http://www.ruf.rice.edu/~barlow/stevens.html . Hardesty and Windeatt (1989) was among the first published ESL activities books to include concordance-based exercises. One of Oregon State University's "Tech Tips" was a concordancing activity on Connecting Clauses by Maria Dantas-Whitney (1997) http://osu.orst.edu/dept/eli/march1997.html .
Another current work is the comprehensive ICT4LT Module 2.4, Using concordance programs in the modern foreign languages classroom at http://www.ict4lt.org/en/en_mod2-4.htm (May 2, 2000). Catherine N. Ball's 1996 Tutorial Notes: Concordances and Corpora (Georgetown University, http://www.georgetown.edu/cball/corpora/tutorial.html) is also an excellent resource. There are further materials available at Vance Stevens's Text Analysis: Concordance and Collocation page at http://www.vancestevens.com/textanal.htm.
Teacher Fit (Approach)
In discussing "teacher fit" it must be kept in mind that concordancers are primarily linguistic research tools. Almost all have been designed with the sophisticated researcher in mind. For this reason, the best concordancers give researchers capabilities best suited to their purposes (the one under current review being an excellent example of this kind of tool). They are rarely if ever designed with the typical language-learning student user in mind, and therefore students would only be interested in a small subset of their capabilities. With regard to students, those who can best make use of concordances are inquisitive, inductive-thinking, research-oriented, and constructivist in their approach to language learning. They must understand, or their teachers must help them to understand, that answers to questions they might have about the language under study can be productively gleaned through computer-based analysis of the data at hand, which in the case of concordancing is the corpus of text available.
Why use concordancing with language learners? Johns (1988) felt that concordancing embodied the concept of data-driven learning while interjecting authenticity of text, purpose, and activity into the learning process. Higgins (1991:5) felt that the computer's prime benefit to language learning was "supplying, on demand and in an organized fashion, masses and masses of authentic language. ...The most powerful of these tools is a concordancer." Tribble (1990) predicted that the concordancer "will perhaps be the pre-eminent software tool in this next stage in the development of computer assisted language learning" (p.15).
A decade later, whereas we can still agree with the premises in these remarks, we see that Tribble's prediction has not come true (possibly in part because it was over-run by the advent of the Internet as a language-learning tool). Concordancing has nevertheless had a great impact on language learning, largely through approaches to language usage, as embodied for example in the Cobuild project. Yet (and despite Higgins's (1991) contention that concordancing accounts for "well over half" the computer work he does with students) few language learners today are being introduced to concordancing. Why not?
As I put it in Stevens (1995) http://www.ruf.rice.edu/~barlow/stevens.html
Concordancers are certainly not tools that computer novices can be turned loose on without proper preparation beforehand. In many instances, both students and teachers must be made aware of the methodological considerations underpinning use of such software. Inherent limitations in the database are rarely intuitively understood. Why, for example, should the word 'potential' never occur in a corpus of biology readings, yet occur repeatedly in a corpus of physics texts, always as a property of energy? The relationship between raw data and output is not obvious to all, and the very existence of the text base, its particular bias, and its relevance to the students must all be explained and emphasized. Formulation of productive queries is particularly difficult for language learners, who may need assistance until they have become familiar with the technique. Misspellings which spoil productive searches are common, and successful use of wild cards requires near-native competence in anticipating word derivations. It is also difficult for language learners to independently phrase queries so that they will expose subtle patterns in the language. Such patterns will likely have to be pre-considered by the teacher/facilitator, and until students have got the hang of concordancing, heuristics for getting at patterns will likely have to be worked out in advance and spelled out to students as well.
In their capacity as teachers, and for use with students predisposed (or whom they have predisposed) toward concordancing as a learning tool, applied linguists can appreciate a concordancer that will help them head off some of the problems students are known to have with concordancing, such as (1) the tendency of students to make spelling errors which prevent successful concordance searches, and (2) the inability to mount sophisticated searches on word roots to get at the full range of use of words and phrases in English. This program goes some way toward solving both these problems, by displaying down a left-hand frame all the words encountered in the corpus (all correctly spelled, and from which students can work out root words and their derivations).
The only publicly available tool of which I am aware to facilitate the use of concordances by students is Tom Cobb's Compleat Lexical Tutor for data-driven language learning, which can be found on the web at: http://188.8.131.52/. There, students can explore various word lists and find the words grouped by families (or lemma; all words occurring in the database in a given family are listed). The students then click on words to see them concordanced through the Hong Kong Virtual Language Centre web-based concordancer at http://vlc.polyu.edu.hk/scripts/concordance/WWWConcappE.htm. Note that the concordance tool itself is not student-oriented, though the interface makes that tool easier for students to use (but not intuitive to the non-tutored).
In cognitive terms, students need a lot of scaffolding before they are able to use concordances productively on their own, but students who become well-versed in the technique should have a tool at their disposal to serve them well in their capacity as life-long learners. Properly used, concordancing can lead to serendipitous discoveries. As Wells (1999) put it, as the zone of proximal development "emerges in the activity and, as participants jointly resolve problems and construct solutions, the potential for further learning is expanded as new possibilities open up that were initially unforeseen."
As programs of this nature are learning and research tools, users have full control over how they use them. Users (students and teachers) can supply their own text. Adequate features exist within the program to analyze and examine that text. There is little that a user can or needs to do to alter those features. Adaptation of the features to the task at hand is the challenge in using concordancers.
The program presently under review appears to be the most versatile of its kind. It works well in (and takes full advantage of) the Windows environment, providing and in many cases improving on most features that concordancers to date offer. The program goes significantly beyond present offerings by making it possible for concordances to be published in web format, and to be done on word lists which could comprise a set of lemmas or, in practice, any set of words the user proposes. The program is an excellent research tool, possibly the most sophisticated available, and should be appreciated as such, but the opportunity still exists for someone to develop a concordancing tool especially appropriate to the use of students of second and foreign languages.
Scaled rating (1 low-5 high)
Implementation possibilities: 5
Pedagogical features: 5 (but only for certain types of student)
Use of computer capabilities: 4
Ease of use (student / teacher): 5 (for teacher/researcher; accessible only to sophisticated student users)
Over-all evaluation: 5
Value for money: 5
1. The author comments in email correspondence: "I do agree you are right to warn people new to concordancing that plenty of disk space is going to be needed. However, this is intrinsic to any full concordance, not really something characteristic of my program. For example, if a text contains ten words per line on average, then a full concordance will be at least ten times the size of the original text (since each line of the original text will appear in the concordance as often as there are words in the line). This is true whether the concordance is a book or a computer file. My program actually uses a good deal less space than, for example, the old standard Oxford Concordance Program which used fixed-length records. This is an appropriate comparison because my program and OCP are among the few to be able to make full, production-quality concordances to texts of almost any size." - Rob Watt 11/26/00
2. Author's comment: "The context view won't scroll to show more than the first 255 characters before the headword, and the same number after. The text is still there all right, and if you save the concordance to text or to HTML, it will all be correctly preserved. It's just that the Windows control I use to display contexts refuses to scroll indefinitely. I didn't spell this out in the documentation for version 1.1.3, and will remedy that. But it is another consequence of having very long lines in your text, and I do spell that out, and the program even warns about over-long lines as it reads your text. So as with the file size issue, I think you would get more typical results if you did some tests with files which have lines of 'human-readable length' as I suggest in the documentation." - Rob Watt 11/26/00
Name: Rob Watt
Address: Learmonth House, Liff, Dundee, DD2 5NN, Scotland, U.K.
Phone: +44 1382 580599
Rob Watt wishes it made clear that the only way to obtain the program is by download from the website and the way to ask questions or get support is by e-mail. (Reviewer's comment - the author has been very supportive in replies by email with this reviewer.)
Vance Stevens has been working with CALL since 1979. His publications and research projects have included numerous works on concordancing. After a 20-year career in ESL Vance has most recently been working in CALL software design and is currently a Consultant / CALL Coordinator for the Amideast UAE/MLI Project in Abu Dhabi.
P.O. Box 41637, Abu Dhabi,
Higgins, John. 1991. "Fuel for learning: The neglected element of textbooks and CALL". CAELL Journal 2, 2:3-7.
Johns, Tim. 1988. "Whence and whither classroom concordancing?" In Bongaerts, Theo et al. (Eds.). Computer applications in language learning. Foris.
Stevens, Vance. 1991. "Classroom concordancing: Vocabulary materials derived from relevant, authentic text". English for Specific Purposes Journal 10: 35-46.
Stevens, Vance. 1995. "Concordancing with Language Learners: Why? When? What?" CAELL Journal, vol 6 #2, pp. 2-10. http://www.ruf.rice.edu/~barlow/stevens.html
Tribble, Chris. 1990. "Concordancing and an EAP writing programme". CAELL Journal 1, 2:10-15.
Wells, G. 1999. "Dialogic inquiry: Towards a sociocultural practice and theory of education". New York: Cambridge University Press. Chapter 10: The zone of proximal development and its implications for learning and teaching. http://www.oise.utoronto.ca/~gwells/resources/ZPD.html