To better support the ever-growing collections of digitized content from Digital Commonwealth member institutions, developers in the Boston Public Library’s Digital Services team have been building the next generation of the library’s digital asset management system. This new system, built entirely on open-source software, uses cloud storage for file management, allowing the repository to keep growing without the capacity constraints of locally managed servers and storage devices.

This new system is a suite of applications, APIs, and services that are collectively known as “DC3,” since this is the third version of the asset management system used to support preservation and dissemination of digitized primary source materials. (Click here for an overview of the previous version.)

The heart of the new system is an application called Curator, which is responsible for managing all of the descriptive, administrative, and technical metadata for objects and files in the repository. Curator provides an application programming interface (API) to support ingesting new items into the repository or making changes to existing items. Backed by a relational database, the Curator data model supports a wide variety of content types, as well as rich descriptive metadata for ingested items conforming to the Digital Commonwealth metadata application profile, which is based on the Metadata Object Description Schema (MODS) created by the Library of Congress. This system provides improved data validation and authority control, making better use of controlled vocabularies and thesauri offered by the Library of Congress and the Getty Research Institute.

Curator interacts with a number of other applications in the DC3 ecosystem, including:

  • ARK Manager – manages unique Archival Resource Key identifiers and permalinks for repository items.
  • AVI Processor – analyzes ingested files to extract technical metadata and creates derivative files used for viewing and downloading.
  • BPLDC Authority API – supports querying a variety of controlled data sources (such as LC, Getty, and GeoNames) for descriptive metadata fields such as subjects, locations, genres, languages, resource types, and names.
  • Cantaloupe – provides high-resolution images and deep zooming functionality for the DC user interface via the IIIF Image API.
  • Solr – supports indexing and retrieval of metadata and full-text content; powers the search features for the DC user interface.
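
To illustrate how an image server such as Cantaloupe is typically addressed, the IIIF Image API defines a URL pattern of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The following is a minimal Python sketch of building such URLs. The base URL, the item identifier, and the function name are hypothetical and are not taken from the DC3 codebase; the default size keyword "max" follows version 3 of the IIIF Image API.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    base_url and identifier are placeholders for illustration only;
    a real deployment defines its own endpoint and identifier scheme.
    """
    # The identifier must be percent-encoded so that reserved characters
    # such as ':' or '/' are not mistaken for URL path separators.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical endpoint and identifier, used only for this example.
BASE = "https://iiif.example.org/iiif/3"
ITEM = "commonwealth:abc123"

# Request the whole image at the largest size the server will deliver.
full_image = iiif_image_url(BASE, ITEM)

# Request a 1024x1024 pixel region from the top-left corner, scaled to
# 512 pixels wide, which is the kind of request a deep-zoom viewer makes.
tile = iiif_image_url(BASE, ITEM, region="0,0,1024,1024", size="512,")

print(full_image)
print(tile)
```

A deep-zoom viewer issues many region requests like the second one as the user pans and zooms, so only the visible portion of a large scan is transferred at any time.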

In addition to the increased capacity (and decreased maintenance) provided by moving storage infrastructure to the cloud, this system provides a number of advantages. The relational data model used by Curator makes updates to existing metadata much more efficient. By spreading functionality over a variety of applications, the system is more fault-tolerant overall, and components can be re-engineered without the need for a complete overhaul of the entire system. And because this system uses more widely adopted technologies and components, it will be easier to maintain and to onboard new developers in the future.

All components of the DC3 system are built on freely available open-source software. ARK Manager, AVI Processor, and BPLDC Authority API are custom-built applications created and maintained by BPL Digital Services. As with Curator, code for many of these projects is available on GitHub.

Please contact us with any questions, comments, or concerns.
