Leigh Hunt Online: The Letters – Methodology and Standards


Methodology and Standards

 

A test bed of letters was chosen to represent the scope of the collection and perceived digitizing factors, including issues of date, addressee, physical size, legibility, content, and scanning resolution.  These were digitized to evaluate how CONTENTdm would display the collection and to bring to light any unperceived technical problems.

The University of Iowa Libraries Digital Library Services has decided to integrate all current and future projects for maximum interoperability by using CONTENTdm software, and Leigh Hunt Online: The Letters uses this program as its structural basis.

CONTENTdm offers a high level of interoperability since it maps to Dublin Core metadata standards, as well as the capability of tailoring the structure to meet individual project needs. The project’s compliance with Dublin Core metadata standards and best practices ensures that it will be able to migrate easily and successfully for future software changes or upgrades.

For Leigh Hunt Online: The Letters, digital images of the nearly 1,600 letters are obtained either with a scanner or with a digital camera.  Each letter averages three pages, for an estimated total of about 4,800 pages to be digitized, not including transcripts. About half of the letters have been scanned on a Ricoh Aficio 2335 scanner at 600 ppi and saved as TIFF images for archival purposes.  The file size of a letter scanned at 600 ppi ranges from 20 to 80 MB, depending on paper size and handwriting density.
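The page count above is simple arithmetic, and a similar back-of-the-envelope calculation gives a feel for how large a 600 ppi scan can be. The sketch below is illustrative only: the 5 x 8 inch page size and the 24-bit color depth (3 bytes per pixel) are assumptions, not figures from the project, and real TIFF files may be smaller if compressed.

```python
def estimated_pages(letters: int, pages_per_letter: int) -> int:
    """Total pages to digitize, given an average page count per letter."""
    return letters * pages_per_letter


def uncompressed_scan_mb(width_in, height_in, ppi, bytes_per_pixel=3):
    """Approximate size in MB of one uncompressed page image.

    bytes_per_pixel=3 assumes 24-bit color (an assumption, not a project setting).
    """
    pixels = (width_in * ppi) * (height_in * ppi)
    return pixels * bytes_per_pixel / 1_000_000


print(estimated_pages(1600, 3))         # 4800
print(uncompressed_scan_mb(5, 8, 600))  # 43.2
```

Under these assumed dimensions a single page image comes to roughly 43 MB, which is of the same order as the file sizes reported above.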

However, the remaining letters were laid into larger sheets of paper and made into books, bound at Brewer’s discretion in fairly elaborate, often full leather, bindings.  The Libraries’ conservation department has found that scanning the letters in the bound volumes would damage bindings that have intrinsic value and whose presence is integral to the provenance of the collection. Consequently, the bound letters are being scanned on a Zeutschel OS 12000 overhead scanner to attain the best archival results with the least damage. These scans will have the same high resolution as the others and likewise be saved as archival images.

The TIFF images are being converted to JPEG images for online access in CONTENTdm. For multi-page letters, CONTENTdm constructs compound objects to keep pages of a letter linked together. The program automatically generates thumbnails for the images, which will appear with the title on searching and browsing screens.  Metadata, including the transcript and rights information, is being inserted after the compound object has been made. Current policy for Iowa digital projects places a proprietary band on the image to indicate its institutional affiliation.
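The idea of a compound object, in which the page images of one letter are kept together in order, can be sketched outside CONTENTdm as follows. This is not CONTENTdm's own mechanism or API. The file naming convention (letter identifier, then "_p" and a page number) and the function names are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: <letter id>_p<page number>.tif
PAGE_FILE = re.compile(r"^(?P<letter>.+)_p(?P<page>\d+)\.tiff?$", re.IGNORECASE)


def group_pages(filenames):
    """Return {letter_id: [archival file names ordered by page number]}."""
    pages = defaultdict(list)
    for name in filenames:
        match = PAGE_FILE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed convention
        pages[match["letter"]].append((int(match["page"]), name))
    return {letter: [n for _, n in sorted(items)] for letter, items in pages.items()}


def access_name(tiff_name):
    """Name of the JPEG access copy that corresponds to an archival TIFF."""
    return re.sub(r"\.tiff?$", ".jpg", tiff_name, flags=re.IGNORECASE)


scans = ["hunt0002_p02.tif", "hunt0001_p01.tif", "hunt0002_p01.tif"]
for letter, files in sorted(group_pages(scans).items()):
    print(letter, [access_name(f) for f in files])
```

Sorting on the numeric page value rather than the file name keeps page 10 after page 9 even when page numbers are not zero-padded.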

Further conditions for use are specified in the metadata field “Rights Management,” with the supplementary “Contact Information” field directly below it. The letters are all in the public domain, so copyright concerns do not arise, except in the case of scholarly transcripts, which are included in the database only after we have obtained permission to use them.  The descriptive metadata culled from the card catalog records fills in many of the descriptive fields. Several of these fields use Library of Congress subject headings and controlled vocabulary for names.

The typed transcripts from Toledo have been scanned using the Ricoh scanner at 300 ppi, and then processed by OCR software, ABBYY FineReader 8.0. The OCR files are being carefully edited against the original transcript, since the purpose of using Cheney’s transcripts is to preserve his carefully executed scholarly work.  After editing, the transcript as a whole is inserted into the record in a metadata field “Transcript,” with appropriate HTML tags to indicate page breaks. This text is fully searchable.

Additionally, each individual page record has its particular section of the transcript duplicated and located in the metadata field directly below the image. In this way, the reader can quickly see which text appears on that particular page, but the entire transcript is also available for reference on each page of the letter. References to Cheney and Toledo are made in “Transcript by” and “Transcript location” fields.
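The relationship between the full transcript and the per-page sections can be sketched as below. The project does not state which HTML tag marks a page break, so the use of an hr tag here is an assumption, and the field names in the resulting records are hypothetical rather than the project's actual metadata fields.

```python
import re

# Assumed page-break marker: an <hr> or <hr/> tag (the real tag is not documented).
PAGE_BREAK = re.compile(r"\s*<hr\s*/?>\s*", re.IGNORECASE)


def split_transcript(transcript):
    """Split a full transcript into one text section per page."""
    return [part.strip() for part in PAGE_BREAK.split(transcript) if part.strip()]


def page_records(transcript):
    """Build one record per page holding its own section plus the full text."""
    return [
        {"page": number, "page_transcript": section, "full_transcript": transcript}
        for number, section in enumerate(split_transcript(transcript), start=1)
    ]


sample = "My dear Sir,<hr>I remain yours,<hr/>Leigh Hunt"  # invented sample text
for record in page_records(sample):
    print(record["page"], record["page_transcript"])
```

Each page record carries both its own section and the whole transcript, mirroring the arrangement described above.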

If transcripts from Cheney or other scholars are not available, other transcripts (e.g., those done by Luther Brewer) are used when available. If no transcript exists, a new one is created and checked for accuracy.  In every case, the image of the original letter is available for personal interpretation of the letter’s content.
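The order of preference among transcript sources can be written as a small selection routine. This is a sketch of the policy as described, not project code. The source labels are hypothetical, and ranking Cheney ahead of other scholars is an assumption, since the text treats the two together.

```python
# Preference order from the text; labels and the Cheney-first ranking are assumptions.
PREFERENCE = ("cheney", "other_scholar", "brewer")


def choose_transcript(available):
    """Pick the preferred existing transcript.

    `available` maps a source label to transcript text. Returns a
    (source, text) pair, or ("new", None) when a new transcript must be
    created and checked for accuracy.
    """
    for source in PREFERENCE:
        text = available.get(source)
        if text:
            return source, text
    return "new", None


print(choose_transcript({"brewer": "Brewer's reading"}))  # ('brewer', "Brewer's reading")
print(choose_transcript({}))                              # ('new', None)
```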

Finally, RDF (Resource Description Framework) tags are periodically extracted from the metadata to integrate Leigh Hunt Online: The Letters with the Networked Infrastructure for Nineteenth-century Electronic Scholarship (NINES) project. The NINES project has guidelines and standards for creating and transforming metadata into its required RDF format. RDF tags are part of the push toward the Semantic Web experience, which allows data to be shared and reused across applications.
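As a rough illustration of what expressing Dublin Core metadata as RDF involves, the sketch below serializes one record as generic RDF/XML using Python's standard library. It does not follow the NINES guidelines, which define their own required fields. The record URI and the title are invented placeholders, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use conventional prefixes in the serialized output.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def record_to_rdf(uri, record):
    """Serialize a {dublin_core_element: value} record as an RDF/XML string."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri}
    )
    for element, value in record.items():
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


example = record_to_rdf(
    "http://example.org/leighhunt/letter/1",  # placeholder URI, not a real record
    {
        "title": "Sample letter (placeholder)",
        "creator": "Hunt, Leigh, 1784-1859",
        "rights": "Public domain",
    },
)
print(example)
```

Because the metadata already maps to Dublin Core, a transformation of this kind is mostly a matter of renaming fields and wrapping them in the target vocabulary.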

The long-term home for this project is on the Libraries’ servers. Projects currently being developed with the Digital Library Services department must meet their requirements for using CONTENTdm software.  This ensures interoperability and permits any future migration to upgraded or other software. The library is committed to showcasing and digitizing its unique collections to contribute to the information community. Leigh Hunt Online: The Letters does just that.

 

As a digital project, Leigh Hunt Online: The Letters is naturally disseminated over the Internet.  The project is indexed through open-access initiatives and harvested by search engines like Google, which allows the actual content of the letters to attract users.  The project is also included in the NINES project, which collates resources such as the Walt Whitman Archive and the Blake Archive. This will generate new and different contextual intersections with digital materials originating from the same time period and also ensure that scholars looking for resources on Romantic and Victorian literary figures will not overlook Leigh Hunt Online: The Letters since these letters often cast light on people other than Hunt himself.