About our web service

The idea behind correspSearch

With correspSearch you can search through indexes of different letter collections (digital or print) by sender, addressee, location written, location sent, and date. For this purpose, a website and a technical interface are provided. The web service collects and evaluates TEI-XML data in the ‘Correspondence Metadata Interchange’ format.

The web service correspSearch is operated and developed according to the following principles:

  • Reference System: The web service aims to help users with their research by offering a central location to search for letters, and by guiding them to the original publication.
  • Academic Data: The web service is based on the data from letter-indexes of editions or repositories that are edited according to academic criteria.
  • Conceptually Open: There is no focus on a particular time period or place. This allows for new kinds of research questions to be explored.
  • Open Access: Only data under a free license is collected, and the data from the web service remains under a free license and is thus available for further use.
  • Open Interfaces: correspSearch offers technical interfaces that are open and well documented. Other projects can easily query and use the data.
  • Open Standards: Data exchange and data processing are based on open standard formats and technologies.

Motivation

Letters are considered some of the most valuable sources for historical research. Not only does their content cover a variety of different themes, events, persons, etc., but letters also depict networks between people.

Yet for editorial and practical reasons, historical letters are generally only partially edited, often focusing on a single person or on the correspondence between two persons. Addressing questions about topics such as relationship networks therefore requires elaborate research across multiple editions of letters. This challenge has long been recognized in the academic community. Wolfgang Bunzel envisions a solution, focusing on correspondence from the Romantic era:

“the creation of a decentralized, preferably open digital platform, based [...] on HTML/XML and operating with minimal TEI standards, which is extensible in different directions and allows for existing web portals and websites to contribute at the lowest possible cost. This doesn’t require some kind of super structure which covers the entire amount of letters from the Romantic era (which could not be estimated exactly, anyway) but rather an intelligent linking system, which associates existing documents with one another. The creation of such nexus will naturally lead to research possibilities from searches for persons and places to specific keyword-based searches [...]”

Wolfgang Bunzel: Briefnetzwerke der Romantik. Theorie – Praxis – Edition. In: Anne Bohnenkamp und Elke Richter (Hg.): Brief-Edition im digitalen Zeitalter (= Beihefte zu editio, Bd. 34). Berlin/Boston 2013, pp. 109-131, here p. 117. (Own translation)

The web service ‘correspSearch’ takes a step in this direction, collecting the letter metadata from separate editions and repositories and making the data available through open interfaces, based on the foundations of TEI-XML and under a free license. Through this service the indexes of different digital and print letter editions can be quickly and easily searched according to sender, addressee, writing location, and date.

The graphic shows how correspSearch works.

Digital Indexes of Letters

The web service correspSearch is based on digital letter indexes that are available online and written in the Correspondence Metadata Interchange (CMI) Format.

The CMI format is essentially based on the TEI extension ‘correspDesc’ (Correspondence Description), a part of the Text Encoding Initiative guidelines. The element correspDesc was developed by the TEI Correspondence Special Interest Group (SIG) in order to record correspondence-specific metadata from letters, postcards, etc., in TEI-based editions. In April 2015 ‘correspDesc’ was officially added to version 2.8 of TEI P5.

The CMI format was also developed by the TEI Correspondence SIG and makes a standardized comparison of letter metadata possible. It does this by reducing the index to the most essential elements (the sender, addressee, date, and location of a letter) and by regulating how the data is entered. In addition to the benefits of TEI-XML encoding, standardization is greatly aided by authority files: sender, addressee, and location can be identified independently of the specific project (for example, through the GND number from the German National Library), so the data can be searched, linked, and used outside of a particular project. Finally, in CMI the individual letter is referenced by the letter identification number or bibliographical information for print editions, and by the URL for digital editions.
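As a rough illustration of this reduced metadata set, the following Python sketch reads a single ‘correspDesc’ entry with the standard library. The sample letter, its URL, and the helper name parse_corresp_desc are invented for illustration; the element and attribute names follow the TEI ‘correspDesc’ model as we understand it, and a complete CMI file contains further header information that is not shown here.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by correspDesc and its child elements.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample entry: one sent action and one received action.
SAMPLE = """
<correspDesc xmlns="http://www.tei-c.org/ns/1.0" ref="https://example.org/letters/42">
  <correspAction type="sent">
    <persName ref="http://d-nb.info/gnd/118540238">Johann Wolfgang von Goethe</persName>
    <placeName ref="https://example.org/places/weimar">Weimar</placeName>
    <date when="1797-06-10"/>
  </correspAction>
  <correspAction type="received">
    <persName ref="http://d-nb.info/gnd/118607626">Friedrich Schiller</persName>
  </correspAction>
</correspDesc>
"""


def parse_corresp_desc(xml_text):
    """Return the letter reference plus person, place and date per action type."""
    root = ET.fromstring(xml_text.strip())
    record = {"ref": root.get("ref")}
    for action in root.findall("tei:correspAction", TEI_NS):
        person = action.find("tei:persName", TEI_NS)
        place = action.find("tei:placeName", TEI_NS)
        date = action.find("tei:date", TEI_NS)
        # Missing elements are recorded as None rather than raising an error.
        record[action.get("type")] = {
            "person": person.text if person is not None else None,
            "person_ref": person.get("ref") if person is not None else None,
            "place_ref": place.get("ref") if place is not None else None,
            "date": date.get("when") if date is not None else None,
        }
    return record


record = parse_corresp_desc(SAMPLE)
print(record["ref"], record["sent"]["person_ref"], record["sent"]["date"])
```

Because persons and places carry authority references rather than only names, two entries from different editions can be compared on these references alone.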

More about the CMI format

Data

The data behind the web service is an ever-growing collection of digital letter indexes provided by print or digital scholarly editions of letters. The web service does not set limitations regarding the historical time period or place. A list of the letter indexes that are currently in the web service database can be found in the data summary.

The database of the web service is continually being extended. Any print or digital scholarly letter edition may register its index of letters in the CMI format through the correspSearch web service.

More about participating

Please note that scholarly editions of letters generally focus only on the correspondence of one person or between two persons. Other letters (for instance a letter between two family members of the main correspondent of an edition) are as a rule only considered with regard to the main correspondent. In addition, the data might be incomplete, either because not all of the available letter indexes have been integrated yet, or because the respective edition is not yet finalized. Furthermore, the correspSearch web service only considers letters that are part of a scholarly edition. Letters which have not been published are not considered by the web service at this time.

Web Service Features

The web service offers one website through which one can search and research within multiple letter editions using their digital indexes. The indexes are available online under a free license from their respective hosts and their data is regularly retrieved and updated by the web service. Every index simply needs to be registered through a URL and will then be automatically integrated.

When the indexes are processed, the authority file identifiers of correspondents are used to query the Virtual International Authority File (VIAF) for identifiers in other authority files, so that different identification systems can be used together. Currently the web service supports GND, VIAF, the Bibliothèque nationale de France (BNF), the Library of Congress (LC), and the National Diet Library (NDL) in Japan. The geographical database GeoNames is used for place names.
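The effect of this mapping can be sketched in a few lines of Python. This is only a minimal illustration, not the service's actual implementation: the identifiers are example.org placeholders, the CLUSTERS list stands in for whatever a VIAF lookup would return, and the function names are hypothetical.

```python
# Hypothetical stand-in for VIAF lookup results: each set groups the
# identifiers that different authority files assign to the same person.
CLUSTERS = [
    {
        "https://example.org/gnd/123",
        "https://example.org/viaf/456",
        "https://example.org/lc/789",
    },
    {
        "https://example.org/gnd/222",
        "https://example.org/bnf/333",
    },
]


def build_equivalence_index(clusters):
    """Map every identifier to the full set of identifiers for the same person."""
    index = {}
    for cluster in clusters:
        members = frozenset(cluster)
        for uri in members:
            index[uri] = members
    return index


def same_entity(uri_a, uri_b, index):
    """True if both identifiers are equal or belong to the same cluster."""
    return uri_a == uri_b or uri_b in index.get(uri_a, frozenset())


index = build_equivalence_index(CLUSTERS)
# An index that records a person by the first identifier and a query that
# uses the third one still refer to the same person.
print(same_entity("https://example.org/gnd/123", "https://example.org/lc/789", index))
```

With such an index, a letter index that uses one authority file and a search that uses another can still be matched to the same correspondent.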

The aggregated letter indexes are searchable by correspondent, location, and date. Correspondent and location can be specified according to their role in the communication process. It is also possible to search within a particular letter index.

Search results display the metadata of the individual letter, together with biographical details. Letters from digital editions are directly linked to their original source.

In addition to the website, multiple technical interfaces are available through which the web service can be queried automatically. Results are returned in CMI format (among other options) and are thus free for further use by other programs.
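How a program might assemble such an automated query can be sketched as follows. Note that the base URL and the parameter names (correspondent, startdate, enddate) are placeholders chosen for this illustration and are not taken from this page; the interface documentation is the authoritative source for the real endpoint and its parameters.

```python
from urllib.parse import urlencode

# Placeholder endpoint, NOT the real address of the web service.
BASE_URL = "https://example.org/api/letters"


def build_query_url(correspondent=None, start_date=None, end_date=None):
    """Build a query URL from optional search criteria (hypothetical parameter names)."""
    params = {}
    if correspondent:
        params["correspondent"] = correspondent  # authority URI of a person
    if start_date:
        params["startdate"] = start_date  # ISO date, e.g. 1793-01-01
    if end_date:
        params["enddate"] = end_date
    # urlencode percent-encodes the authority URI so it is safe inside a query string.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url(
    correspondent="http://d-nb.info/gnd/118540238",
    start_date="1793-01-01",
    end_date="1795-12-31",
)
print(url)
```

A client would then fetch this URL and process the returned CMI document in the same way as any other letter index.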

More about available interfaces

Technologies

The software architecture of the web service has been modular since version 2 and consists of the following components:

  • csHarvester: Harvesting and validation of the CMIF files (eXistdb app)
  • csIngest: Ingest of the CMIF files into the Elasticsearch index, creation of entries in various autocompletes (persons, places etc.), enrichment with further data from authority files (Python)
  • csSearch: search frontend incl. map-based search (vue.js)
  • csWeb: Website (except search) (eXistdb app)
  • csAPI: API (eXistdb app)
  • CMIF Creator: Tool for browser-based, simple creation of CMIF files (vue.js)
  • csLink: JavaScript widget for integration into digital editions

The following web services and interfaces are used for the search, the search index or the CMIF Creator:

Host of the Web Service

The web service correspSearch has been developed and operated since April 2014 by TELOTA, a research group of the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW), in cooperation with the TEI Correspondence SIG and other participating researchers.

The Berlin-Brandenburg Academy of Sciences and Humanities is, with its 25 academy projects and various third-party funded projects, the largest non-university research institute for the humanities in the Berlin-Brandenburg region. Besides the TELOTA research group, the BBAW is also home, for example, to the Digital Dictionary of the German Language (Digitales Wörterbuch der deutschen Sprache) and the German Text Archive (Deutsches Textarchiv).

The inspiration for the web service originated during the workshop ‘Editions for Letters around 1800: Finding interfaces and networks’ (Briefeditionen um 1800: Schnittstellen finden und vernetzen), organized by Anne Baillot (junior research group ‘Berlin Intellectuals’, Humboldt Universität Berlin) and Markus Schnöpf (TELOTA, BBAW).