The Parliamentary Proceedings of a country provide a backbone of its textual cultural heritage. They cover centuries, in a standardized format, with highly controlled quality. Their completeness as a longitudinal corpus and their coverage of a large part of a nation's history make the proceedings of interest both in their own right and as a tool to contextualize other data collections. The Dutch Parliamentary Proceedings are available in digitized form starting from 1576 (although large parts of 1625-1815 have not been digitized). However, access is poor: it is limited to full-text search of OCR'ed text editions or to metadata descriptions of the archives. The main research problems of the proposal are: How can the corpus of digitized heritage texts be turned into a connected network of information? How can the resulting networked structure be exploited for interactive exploratory search?
The ExPoSe project aims to bring out the full potential of the Dutch Proceedings by transforming them into a structured XML format. For the Proceedings to really function as a backbone, it must be easy to connect other documents deeply into them. We make this possible by connecting the content of the Proceedings to the Linked Open Data cloud, which we extend with several historically relevant sources (among them the huge newspaper collection of the KB and the Woordenboek der Nederlandsche Taal, WNT). The project will run for five years with a consortium that covers the whole chain from raw data producer to end users in the Creative Industry.
A detailed project description is here.