EDITION: Editing Digital Interactive Texts In an Online Network

Over recent years we have seen the great benefits of electronic editions. But we have also seen how hard it is for scholars to make these editions using the existing tools. At present, it is far more difficult to make an electronic edition than to make a print edition. Until this changes, the great potential of electronic editions will remain unrealized.

We outline here our plans for new software which aims to change this situation, by making it possible for any scholar who can make a high-quality print edition to make a high-quality electronic edition. The thinking behind the need for EDITION is described at greater length in an article by Peter Robinson in the Digital Medievalist journal.

EDITION summary

EDITION will provide a single on-line web system to permit scholars to make highly complex, information-rich scholarly editions based on the most original texts. Scholars may create highly-detailed transcripts of different versions of the one text, linked to the letter level to images of those texts, have them collated (compared word-by-word), adjust the collation to show the most significant differences, have the collation analyzed by sophisticated specialist tools, then view the transcripts, collations and analyses on-line, beside images of the original texts. EDITION will make it much easier for any scholar who can make a print edition to make a fully-featured electronic edition.

EDITION in detail

Context

Reliable editions based on the original documentary forms of the text are the foundation of scholarship in the humanities: in literary studies, history, theology, philosophy, and in all areas where the written record is important. In the last decades, two developments have given new life to textual scholarship (the theory and practice of making editions). The first is the interest in book history: the study of the material form of texts, through their creation, transmission and reception. The second is the advent of digitization: the possibility of making full digital facsimiles of the primary sources accompanied by limitless quantities of editorial materials: transcripts, collations, editorial commentaries.

Realization of this potential relies upon four factors:

Ten years of digitization have given us many instances of the first and fourth of these. But there are few of the second, and almost none of the third. This is particularly so in the case of texts in many versions. Few projects are making high-quality transcripts of texts in many versions, fewer still are using digital methods to compare them, and fewer again have analyzed the results and published them. Indeed, only four projects, all associated with the applicants, have stayed the course so far: the Canterbury Tales Project and the two digital Greek New Testament projects led by EDITION partners in Birmingham and Münster, Germany, and the digital Monarchia, advised by members of the EDITION team.

The reason for this is that the methods used by those projects require specialist skills and support well beyond those usual in academic projects. Even modest edition projects, of short texts in relatively few versions, require considerable resources.

Aims/Objectives

EDITION will combine the specialist tools developed by the EDITION team over the last twenty years into a single easy-to-use online server/client tool. This will permit scholars with relatively few computer skills to make digital editions with all the features of the Canterbury Tales and New Testament editions: viewing page images and transcripts side by side, linked to the letter level and below; collation of any texts against each other and against any base; filtering and adjustment of variants (including spelling regularization); generation of phylogenetic trees of relationships; searchable variant databases; full-text searching.

Any scholar who can make a high-quality print edition should be able to make a high-quality electronic edition. EDITION will aim for this.

Benefits and interest

EDITION will particularly address texts in many versions (though it may be used to make an edition of a text in just one version, presenting sophisticated searchable transcripts page-by-page alongside the images). Texts in many versions are among the most important texts of our culture: for example, those mentioned above, and others the applicant is currently working on (e.g. Dante's Commedia; Wolfram's Parzival). Further, these are exactly the texts for which digital methods have the most to offer: for handling large amounts of data, for searching out patterns in the variation, for showing exactly how one text is transformed, word by word and phrase by phrase, into another.

A very few of us are able to do this now, for a very few texts. EDITION will allow many more to make high-quality digital editions, of many more texts.

Methods

EDITION will incorporate and build on three pieces of software, the first developed by the team led by Kevin Kiernan at the University of Kentucky in the ARCHWAY project, and two of them developed by the EDITION team. Kiernan's software is designed to enable the making of what he describes as 'image-based editions': editions in which image and text are linked to the finest level of detail, down to the individual mark within a letter. EDITION will customize this to optimize its use in contexts where many versions of a text exist. The interface of Kiernan's software is designed to facilitate its use by any scholar, and this will be refined further in EDITION.

The other two pieces of software, developed by Robinson, are Collate and Anastasia. Collate has been continuously developed since 1985. It has been used in the collation of thousands of versions, from many different languages and cultures (Japanese, Sanskrit, Armenian, etc.). It has developed precise means for adjusting collation interactively, to filter out spelling and other variation and to set variant spans exactly as required. Its use in major editing projects has forced it to meet the most demanding requirements. In turn, it has changed the way editors work: since 1997, the two Greek New Testament projects in Münster and Birmingham, possibly the largest editorial projects anywhere, have both completely rebuilt their editorial systems around Collate.
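The idea of collating two versions word by word while filtering out spelling variation can be illustrated with a minimal sketch. This is not Collate's algorithm: it uses Python's standard difflib module as a stand-in, and the regularization table, function names and sample readings are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical regularization table: variant spelling -> regularized form.
REGULARIZATIONS = {"whan": "when", "aprille": "april", "hise": "his"}


def regularize(word, rules):
    """Lowercase a word and map it to its regularized form, if one is set."""
    key = word.lower()
    return rules.get(key, key)


def collate(base, witness, rules=None):
    """Return the variants between two texts as (base, witness) reading pairs.

    Words are aligned on their regularized forms, so differences covered
    by the rules are filtered out; the original spellings are reported.
    """
    rules = rules or {}
    base_words = base.split()
    witness_words = witness.split()
    base_keys = [regularize(w, rules) for w in base_words]
    witness_keys = [regularize(w, rules) for w in witness_words]
    matcher = SequenceMatcher(a=base_keys, b=witness_keys, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append(
                (" ".join(base_words[i1:i2]), " ".join(witness_words[j1:j2]))
            )
    return variants


if __name__ == "__main__":
    base = "Whan that Aprille with hise shoures soote"
    witness = "When that April with his shoures sweete"
    print(collate(base, witness))                   # all differences
    print(collate(base, witness, REGULARIZATIONS))  # spelling filtered out
```

Aligning on regularized forms but reporting the original spellings keeps the editor's regularization decisions separate from the transcripts themselves, which is the kind of interactive adjustment described above.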

The third piece of software, developed by Robinson and West, is the Anastasia electronic publishing system. Development of this began in 1994, to provide a (now) XML publishing system for the output of Collate. In particular, it was designed to circumvent a major problem in XML systems: their difficulty in displaying alternative hierarchies (that is: to show a text both by chapter and by page). It has matured into a powerful publication system: for example, at http://nestlealand.uni-muenster.de.
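The problem of alternative hierarchies can be made concrete with a small sketch. It does not show how Anastasia works internally: it simply assumes the text is held as a flat sequence of words, with chapters and pages recorded as two independent sets of ranges over that sequence. The words, labels and ranges are invented for illustration.

```python
# A flat sequence of words; placeholder tokens stand in for a real text.
WORDS = "one two three four five six seven eight nine".split()

# Two independent divisions of the same sequence, each recorded as
# (label, start, end) with end exclusive. A single XML tree cannot nest
# both, because "folio 1v" begins inside chapter 1 and ends in chapter 2.
CHAPTERS = [("chapter 1", 0, 6), ("chapter 2", 6, 9)]
PAGES = [("folio 1r", 0, 4), ("folio 1v", 4, 9)]


def view(words, divisions):
    """Present the text divided according to one hierarchy."""
    return {label: " ".join(words[start:end]) for label, start, end in divisions}


def locate(index, divisions):
    """Return the label of the division containing the word at index."""
    for label, start, end in divisions:
        if start <= index < end:
            return label
    raise IndexError(f"word {index} is outside every division")


if __name__ == "__main__":
    print(view(WORDS, CHAPTERS))
    print(view(WORDS, PAGES))
    # The same word belongs to one chapter and one page at once.
    print(locate(4, CHAPTERS), "/", locate(4, PAGES))
```

Because neither division is nested inside the other, the same text can be shown by chapter or by page without privileging either view.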

Both these latter tools present difficulties for scholars operating without considerable computer support. Collate offers a bewildering range of options (in truth, too many), and even a moderate-sized collation may require handling many hundreds of computer files. The current Macintosh classic interface for Collate is aging, and requires updating to handle XML and Unicode input. Anastasia demands XML and programming skills not usual in scholars.

EDITION will amalgamate key components of all three tools into a web-based system permitting interactive creation, submission and validation of transcripts, collation of them, interactive adjustment of the collation, generation of tools for the analysis of the collation results, proof publication of all materials over the browser, and generation of XML output for further publication. File handling within EDITION will employ CVS for control of transcription files, while editorial adjustments setting regularizations and variants will be incorporated in a MySQL database. All this will be invisible to the editor, who will prepare transcripts, submit them through a web browser, adjust the collation on-line, and view the proof edition.
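As an illustration of keeping editorial adjustments in a database rather than in the transcripts, the following sketch stores regularizations in a single relational table. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the table layout and function names are hypothetical, not the EDITION schema.

```python
import sqlite3


def open_store():
    """Open an in-memory store (sqlite3 standing in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one row per regularized spelling.
    conn.execute(
        "CREATE TABLE regularization ("
        "original TEXT PRIMARY KEY, regularized TEXT NOT NULL)"
    )
    return conn


def set_regularization(conn, original, regularized):
    """Record an editor's decision; a later decision replaces an earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO regularization (original, regularized) "
        "VALUES (?, ?)",
        (original.lower(), regularized.lower()),
    )
    conn.commit()


def load_regularizations(conn):
    """Return all current decisions as a dict for use during collation."""
    return dict(conn.execute("SELECT original, regularized FROM regularization"))


if __name__ == "__main__":
    conn = open_store()
    set_regularization(conn, "Whan", "when")
    set_regularization(conn, "hise", "his")
    print(load_regularizations(conn))
```

Holding the decisions apart from the transcription files means the transcripts stay under version control unchanged, while the collation can be rerun whenever an editor revises a regularization.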

Almost all the tools to achieve this are already present within the one million lines of C code making up Collate and Anastasia. An alpha version of EDITION will be running by 1 March 2006, for full testing by projects led by the EDITION team and partners. In the following years we will tune EDITION, prepare documentation, and progressively release it fully.

Outputs

An online edition preparation system, capable of mounting on any major computer system (presently: MacOS X, Windows and Linux).

Dissemination

EDITION will be open source, with all code and software products available on sourceforge.net under the GNU General Public License (as Anastasia is now). For the life of this project and for one year after, EDITION will be maintained on an ITSEE webserver. In the longer term, EDITION may be supported by income generated from electronic publication of editions made with EDITION.

The next step: Autumn 2005

We expect to start work on EDITION in Autumn 2005. In the coming months, we invite project partners to join the project. This will involve:

We invite project leaders and staff who are interested in working with us as partners in the development of EDITION to email us.

Last modified 1 September 2005. The EDITION team: Peter Robinson, Barbara Bordalejo, Andrew West.