Charles Explorer logo
🇬🇧

Creating the European Literary Text Collection (ELTeC): Challenges and Perspectives

Publication

Abstract

The aim of this contribution is to reflect on the process of building the multilingual European Literary Text Collection (ELTeC) that is being created in the framework of the networking project Distant Reading for European Literary History funded by COST (European Cooperation in Science and Technology). To provide some background, we briefly introduce the basic idea of ELTeC with a focus on the overall goals and intended usage scenarios.

We then describe the collection composition principles that we have derived from the usage scenarios. In our discussion of the corpus-building process, we focus on collections of novels from four different literary traditions as components of ELTeC: French, Portuguese, Romanian, and Slovenian, selected from the more than twenty collections that are currently in preparation.

For each collection, we describe some of the challenges we have encountered and the solutions developed while building ELTeC. In each case, the literary tradition, the history of the language, the current state of digitization of cultural heritage, the resources available locally, and the scholars' training level with regard to digitization and corpus building have been vastly different.

How can we, in this context, hope to build comparable collections of novels that can usefully be integrated into a multilingual resource such as ELTeC and used in Distant Reading research? Based on our individual and collective experience with contributing to ELTeC, we end this contribution with some lessons learned regarding collaborative, multilingual corpus building.