The Patient Information Leaflet (PIL) Corpus

Version 2.0 (31 March 2006)

This directory contains version 2.0 of the Patient Information Leaflet (PIL) corpus, a collection of several hundred documents giving instructions to patients about their medication. The corpus was originally created from the ABPI compendium of patient information leaflets by manual scanning and conversion. Documents are available in RTF, DOC and HTML formats, and also in a version marked up with logical structure using a specially created SGML DTD.

The corpus is organised in the following versions:

The PIL corpus was initially developed as part of the ICONOCLAST project, supported by the EPSRC (grant no. L77102).

Release notes

March 2006 Version 2.0
Tidied up for general release by Roger Evans
Nov 2000 Version 1.0
Initial internal release by Nadjet Bouayad-Agha

Projects using this resource

The following is a list of projects we know about that have made use of the PIL corpus. If you know of other uses not in this list, please send an email to .
Document last modified on 25 April 2007.