In order to better support the ever-growing collections of digitized content from Digital Commonwealth member institutions, developers in the Boston Public Library’s Digital Services team have been building the next generation of the library’s digital asset management system. This new system, built entirely on open-source software, uses cloud storage for file management, allowing the repository to potentially grow exponentially, without the constraints of locally-managed servers and storage devices.

This new system is a suite of applications, APIs, and services that are collectively known as “DC3,” since this is the third version of the asset management system used to support preservation and dissemination of digitized primary source materials. (Click here for an overview of the previous version.)

The heart of the new system is an application called Curator, which is responsible for managing all of the descriptive, administrative, and technical metadata for objects and files in the repository. Curator provides an application programming interface (API) to support ingesting new items into the repository or making changes to existing items. Backed by a relational database, the Curator data model supports a wide variety of content types, as well as rich descriptive metadata for ingested items conforming to the Digital Commonwealth metadata application profile, which is based on the Metadata Object Description Schema (MODS) created by the Library of Congress. This system provides improved data validation and authority control, making better use of controlled vocabularies and thesauri offered by the Library of Congress and the Getty Research Institute.

Curator interacts with a number of other applications in the DC3 ecosystem, including:

  • ARK Manager – manages unique Archival Resource Key identifiers and permalinks for repository items.
  • AVI Processor – analyzes ingested files to extract technical metadata and creates derivative files used for viewing and downloading.
  • BPLDC Authority API – supports querying a variety of controlled data sources (such as LC, Getty, and GeoNames) for descriptive metadata fields including subjects, locations, genres, languages, resource types, names, etc.
  • Cantaloupe – provides high-resolution images and deep zooming functionality for the DC user interface via the IIIF Image API.
  • Solr – supports indexing and retrieval of metadata and full-text content; powers the search features for the DC user interface.
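
As an illustration of how a client might request images through the IIIF Image API mentioned above, here is a minimal sketch that assembles request URLs from the path segments the specification defines (identifier, region, size, rotation, quality, format). The server address and item identifier are hypothetical placeholders, not the actual DC3 endpoints.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path segments."""
    # The identifier is percent-encoded so that characters such as ":" or "/"
    # cannot be mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical server and identifier, for illustration only.
BASE = "https://iiif.example.org/iiif/3"

# The whole image at the largest size the server will deliver.
full_image = iiif_image_url(BASE, "commonwealth:abc123")

# A 512x512 pixel region scaled to 256 pixels wide, as a zoom viewer might request.
zoom_tile = iiif_image_url(BASE, "commonwealth:abc123",
                           region="1024,2048,512,512", size="256,")

print(full_image)
print(zoom_tile)
```

A deep-zoom viewer issues many region requests like the second one, which is why the image server can show very large scans without sending the full file.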

In addition to the increased capacity (and decreased maintenance) provided by moving storage infrastructure to the cloud, this system provides a number of advantages. The relational data model used by Curator supports the ability to make updates to existing metadata much more efficiently. By spreading functionality over a variety of applications, the system is more fault-tolerant overall, and components can be re-engineered without the need for a complete overhaul of the entire system. And because this system uses more widely-adopted technologies and components, it will be easier to maintain and on-board new developers in the future.

All components of the DC3 system are built on freely-available open-source software. ARK Manager, AVI Processor, and BPLDC Authority API are custom-built applications created and maintained by BPL Digital Services; as with Curator, the code for many of these projects is available on GitHub.

Please contact us with any questions, comments, or concerns.

Image of the front page of The Woman's Era newspaper, which includes a story on and photograph of abolitionist and suffragist Lucy Stone.
The Woman’s Era, Vol. 1 No. 1, March 24, 1894

It will come as no surprise that there is widespread, urgent demand from institutions across the state to digitize historical newspapers, especially local titles that provide invaluable local coverage of daily history and titles with underrepresented perspectives and histories. There is an incredible amount of important material in need of access and preservation, and making these resources available will require a robust, sustained effort.

The Digital Services team at BPL has been working on increasing capacity for newspaper digitization and dissemination; here’s an overview of efforts from the last year:

Digitization at BPL

The BPL obtained a Mekel Mach 5 high-capacity microfilm scanner in March 2021, but the pandemic significantly delayed the setup and training needed for operation. Mekel’s imaging technicians were finally able to help get the machine up and running in the fall of 2021, and it has since been used to digitize several short runs of historically significant newspapers, including The Woman’s Era and The Tocsin of Liberty. The main project, which is still ongoing, is scanning a major run of the Lawrence Evening Tribune (1890-1929).

While the scanning work is proceeding well (over 96,000 pages to date), imaging is only the start of any newspaper digitization project – there is significant manual work needed to collate and group the scanned pages into issue-level folders, and to identify missing pages, duplicate pages, and other anomalies.
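
The collation checks described above can be partly automated. The following is a minimal sketch, assuming a hypothetical file-naming convention of title, issue date, and page number (for example tribune_1890-01-02_p003.tif); the actual naming scheme and tooling used at BPL are not described in this post.

```python
import re
from collections import Counter, defaultdict

# Assumed naming convention: <title>_<YYYY-MM-DD>_p<page number>.tif
PAGE_PATTERN = re.compile(
    r"^(?P<title>.+)_(?P<date>\d{4}-\d{2}-\d{2})_p(?P<page>\d+)\.tif$"
)


def group_by_issue(filenames):
    """Group page scans by issue date; return (issues, unmatched names)."""
    issues = defaultdict(list)
    unmatched = []
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match is None:
            # Names that do not fit the convention need manual review.
            unmatched.append(name)
            continue
        issues[match["date"]].append(int(match["page"]))
    return dict(issues), unmatched


def find_anomalies(pages):
    """Report duplicate page numbers and gaps in one issue's page list."""
    counts = Counter(pages)
    duplicates = sorted(p for p, n in counts.items() if n > 1)
    # Gaps are detected only up to the highest page present; pages missing
    # from the end of an issue cannot be found this way.
    expected = set(range(1, max(pages) + 1))
    missing = sorted(expected - set(pages))
    return {"duplicates": duplicates, "missing": missing}


scans = [
    "tribune_1890-01-02_p001.tif",
    "tribune_1890-01-02_p002.tif",
    "tribune_1890-01-02_p004.tif",
    "tribune_1890-01-03_p001.tif",
    "tribune_1890-01-03_p001.tif",
    "tribune_1890-01-03_p002.tif",
    "stray_scan.tif",
]
issues, unmatched = group_by_issue(scans)
report = {date: find_anomalies(pages) for date, pages in issues.items()}
print(report)
print(unmatched)
```

A script like this only flags candidates; a person still has to confirm whether a gap reflects a missing scan or a page that was never on the microfilm.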

There are also technical steps involved in processing the scanned images to create derivative files (such as using optical character recognition to extract text and word-coordinate information to support full-text searching and highlighting keyword matches on the page image), as well as developing the pipelines, workflows, and scripts to ingest the content into the digital repository. The library hopes to make significant progress on these latter steps during the second half of 2022.
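
To show why word-coordinate information matters for highlighting, here is a minimal sketch that looks up a search term in a page's OCR output and returns the rectangles to draw on the page image. The OcrWord structure and the sample coordinates are hypothetical; real OCR output formats carry the same kind of per-word position data in their own schemas.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One recognized word and its bounding box on the page image, in pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


def normalize(token):
    """Lowercase a token and strip surrounding punctuation for matching."""
    return token.strip(".,;:!?\"'()[]").lower()


def find_highlights(words, keyword):
    """Return (x, y, width, height) boxes for every word matching keyword."""
    target = normalize(keyword)
    return [(w.x, w.y, w.width, w.height)
            for w in words if normalize(w.text) == target]


# Hypothetical OCR output for one line of a page.
page_words = [
    OcrWord("They", 100, 200, 48, 18),
    OcrWord("were", 154, 200, 52, 18),
    OcrWord("emancipated.", 212, 200, 130, 18),
]
boxes = find_highlights(page_words, "Emancipated")
print(boxes)
```

Without the coordinates, a full-text index could report that a page contains the term but could not show where on the page it appears.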

Screenshot of a search box field in the Digital Commonwealth repository, with the heading "Search inside: The Tocsin of liberty"
“Search inside” view
Screenshot of the word "emancipated" highlighted in the text of a newspaper article.
Keyword searching

National Digital Newspaper Program (NDNP) Grant

Through the assistance of the Boston Public Library Fund (https://bplfund.org/), BPL was awarded a grant in September 2021 from the National Endowment for the Humanities to join the National Digital Newspaper Program (https://www.loc.gov/ndnp/), a long-running effort coordinated by the Library of Congress to build and maintain a free online digital library of historical newspapers from all U.S. states and territories. During the last few months, Digital Services staff have been working with an advisory committee of scholars and experts to identify significant newspapers from the library’s microfilm archives for inclusion in this national collection. The selected titles will then be digitized and made available in Digital Commonwealth and via Chronicling America (http://chroniclingamerica.loc.gov/), which provides access to over 18 million pages from over 6,000 newspaper titles published from 1777 to 1963. The project, which will run until October 2023 and produce 100,000 pages of scanned newspaper content, is currently nearing the end of the title selection process, with imaging scheduled to be contracted out to a digitization vendor in the fall of this year.

MyHeritage & Boston Neighborhood Newspapers

In 2016 BPL established a partnership with MyHeritage to provide access to BPL-held microfilm for digitization and display on their online genealogy platform, with the condition that BPL will receive a copy of all digitized page images produced. To date, this partnership has resulted in the digitization of approximately 7.5 million pages from a wide variety of Massachusetts newspapers spanning the late 1700s to the mid 1900s. However, the deliverables include the image scans only, and not any of the derivative files required to support discovery and display in Digital Commonwealth (see the “Digitization at BPL” section above). Producing the necessary derivative files at this scale will require additional capacity and funding support.

To evaluate the logistics, costs, time, and effort needed to ingest the MyHeritage-digitized materials into Digital Commonwealth, BPL is currently undertaking a pilot project using a vendor specializing in newspaper digitization to process a subset of these titles, highlighting Boston’s neighborhood newspapers. The titles selected for this project span from the mid-1800s to mid-1900s, representing many newspapers that currently have limited online availability, including the Roxbury Gazette, Hyde Park Times, East Boston Free Press, South Boston Gazette, Charlestown News, and the Dorchester Beacon, to name just a few. This project will produce approximately 170,000 pages of content; processing is scheduled to be completed by the end of June, and the goal is to integrate this content into the repository ingest workflow in the latter half of the year.

Looking Forward

The projects described above will no doubt provide increased access to historical newspaper content, but to make a significant impact, these activities need to become part of a curated, sustainable program with dedicated funding, equipment, and staff. The BPL is committed to continuing participation in the Library of Congress’s NDNP, which can be renewed every two years. The Digital Services team is also actively investigating other ways to increase capacity, including grant programs, advocating for more funding from the state legislature, adding staff to help manage digitization projects, and providing guidance to institutions that want to take on their own digitization projects. As with all things Digital Commonwealth, collaboration will be key to success!

Wonderful feature on NECN about the Boston Public Library and the Digital Commonwealth! Tom Blake, the Digital Projects Manager at the BPL, and David Leonard, Interim President and Director of Administration and Technology, did a wonderful job describing the project, with well-chosen examples showing the digitization process, the Digital Commonwealth site, and some of the items that have been digitized by Boston, from bathing suits to butterflies!

Boston Public Library Digitizing Cultural Treasures — Watch the video on the NECN website

On March 21st, Digital Commonwealth was co-awarded the New England Archivists’ Archival Advocacy Award along with BPL’s Digital Services. The award was presented on Saturday morning at the NEA business meeting (part of the Spring NEA meeting).

NEA’s former president Alyssa Pacy presented the award and highlighted the work BPL’s Digital Services staff members have been doing to help institutions share content in Digital Commonwealth. Alyssa acknowledged the important work BPL’s Digital Services staff had been able to do for cultural organizations because of federal funding (the initial grant funding) and ongoing state funding. Amy Ryan thanked the New England Archivists for the award and said she was honored to accept on behalf of the BPL, but she wanted Tom Blake to say a few words because of all he has done in the work being recognized. Tom acknowledged that the BPL’s plan to assist others with digitization tasks when the BPL has over a million items of its own to digitize is ambitious, but he feels it is important, and he appreciated that Amy Ryan has been so supportive of the efforts. He acknowledged the work of everyone in his department and the receptiveness of the organizations they have assisted. He thanked everyone from the NEA/MARAC meeting who had volunteered to help with geo-coding at the BPL on Thursday (3/19); this was the “Day of Service” activity offered in conjunction with the conference.

Nancy Heywood accepted the award on behalf of Digital Commonwealth. Nancy complimented the BPL’s developers for their excellent work on the website and also praised the excellent work of Tom Blake’s department. She also had the opportunity to mention that Digital Commonwealth’s partnership with the BPL (in which the BPL takes the lead on the repository and portal) is allowing Digital Commonwealth, the non-profit organization, to think about programming, events, and training sessions that will help the people involved with digitization efforts become knowledgeable about relevant issues. Nancy also thanked everyone who works at the institutions that have contributed content to the Digital Commonwealth website.

Recently, Digital Repository Developer Steven Anderson and Web Services Developer Eben English presented at the Open Repositories 2014 conference in Helsinki and at the Northeast Fedora Users Group (NEFUG) meeting in Boston.

Open Repositories is an annual international conference that brings together people and institutions responsible for the development, implementation, and management of digital repositories to share information and strategies for long-term preservation and access. Steven’s presentation was entitled “When Metadata Collides: Lessons on Combining Records from Multiple Repository Systems.” It summarizes the practical challenges involved in combining diverse descriptions, authorities, and technologies into the shared Digital Commonwealth repository and highlights the imaginative ways Steven and Eben have addressed them with the help of the Digital Projects department. Watch the seven-minute presentation online. Move the slider to the 52 minute mark to start with Steven’s talk. (Editor’s note: the previous link has had intermittent connection issues. Please continue to try the link until it resolves correctly.)

During the NEFUG meeting, Eben and Steven gave a presentation titled “digital_commonwealth_presentation” during the Hydra session. Steven presented slides, which can be viewed here, and Eben gave a 10-minute demonstration of the actual portal. Steven also gave a lightning talk (aka “Dork Short”) about metadata combination challenges.

Congratulations to the Digital Commonwealth Movers & Shakers of 2014 just announced by Library Journal (http://lj.libraryjournal.com/2014/03/people/movers-shakers-2014/movers-shakers-2014). Featured in the new selection of stellar talent are two local librarians who have had a long and significant involvement with Digital Commonwealth: current board member Tom Blake as well as retired board member and past president Kristi Chadwick.

Tom is recognized for his leadership in pursuing a partnership between the BPL and Digital Commonwealth that was part of an organizing effort to obtain an LSTA digitization grant in 2011. The successful grant provided $200,000 for a two-year project to digitize historical materials for members of the Digital Commonwealth. As the entry about Tom explains, “So far, Blake and his team have digitized more than 75,000 objects from 100 institutions, and the DC has grown to 200 members, from large academic libraries to small independent museums. The collections, now in beta, will soon be available via the DC portal and repository system.”

Tom is also credited with helping establish the strong relationship that has developed between the BPL, the Digital Commonwealth, and the Digital Public Library of America, which chose Digital Commonwealth as one of its initial service hubs. For more about that experience, check out Tom’s recent blog post: Life as a Service Hub for the Digital Public Library of America.

And if that were not honor enough, Kristi Chadwick is also included in this year’s selection. Kristi is recognized for her work as the Director of the Emily Williston Memorial Library & Museum in Easthampton, where she made tremendous strides in increasing staff appreciation and public support for the library in the short amount of time she has worked there.

Certainly many remember Kristi for her long association with Digital Commonwealth  that included several years serving on the board of directors and a year as president in 2011 and 2012.

Our appreciation goes out to these two for all they have done for librarianship in Massachusetts and particularly for the efforts they have committed to the success of Digital Commonwealth. A well-deserved thank you and congratulations!

The Boston Public Library received an award for its digitization work for Digital Commonwealth members at last month’s Griffin Museum of Photography’s eighth annual Focus Awards ceremony. The Focus Awards recognize contributions to the promotion, curation, and presentation of photography. The BPL received the Commonwealth Award, which is given to an organization that brings prominence to the local photographic scene.

“We are honored to receive this award for our digitization work,” said Amy E. Ryan, President of the Boston Public Library. “It is our great pleasure to contribute to Digital Commonwealth and help increase access to photo archives, cultural treasures, and other historical materials for people across Massachusetts and around the world.”

The annual Focus Awards were created by the Griffin Museum in 2006 to recognize critical contributions to the promotion of photography made by institutions and individuals. Tom Blake, Digital Projects Manager for the BPL, accepted the Commonwealth Award on the library’s behalf.

The award was presented to Tom by Bob Cullum, the grandson of photographer Leslie Jones (1886-1967). The Leslie Jones collection of nearly 40,000 glass negatives was digitized by the BPL and is now available for viewing in the new Digital Commonwealth repository that the BPL designed and built and now hosts — https://search.digitalcommonwealth.org/collections/commonwealth:2j62s484w.

The award is certainly very well deserved, not just for the work the BPL has done for the membership and organization of Digital Commonwealth, but also for the enormous value this work adds to the reputation of the Commonwealth as a whole. Congratulations!!

Both Digital Commonwealth and the BPL were represented at the annual MBLC Legislative meeting on September 12 where members of the library and information community were invited to comment on line items in the MBLC budget. The objective is to improve the MBLC’s presentation of needs to the legislature.

This year, a big push by the MBLC concerns the societal digital divide. The MBLC sees Digital Commonwealth — and more specifically the partnership achievements of Digital Commonwealth and the BPL — as a large part of that initiative.

At the beginning of the meeting, a demonstration of the new BPL repository by developers Steven Anderson and Eben English was received with great enthusiasm. Afterwards, Michael Colford read a statement in support of the BPL’s partnership with Digital Commonwealth and its Library of the Commonwealth digital scanning services.

Afterwards, Karen Cariani, President of Digital Commonwealth, read a statement and presented a handout that offered further support of the work done by the BPL in partnership with the Digital Commonwealth.

Downloads of the Digital Commonwealth’s statement and handout are available in PDF format:

Another big issue at the meeting concerned the plan to establish a state-wide system of buying and lending ebooks. One question considered was whether or not Digital Commonwealth could be involved in the distribution of electronic books. It is unclear at this point what Digital Commonwealth’s role might be, if any, but this is certainly something that will be further discussed.

The Sterling and Francine Clark Art Institute Library has partnered with Archive-It to harvest web content created for the 55th Venice Biennale. The Venice Biennale 2013 Web Collection of organizational websites, video, blogs, and social media sites documents the international art exhibition, La Biennale di Venezia, 2013.

This virtual collection complements the Library’s growing Venice Biennale physical collection of exhibition catalogues, press kits, and ephemera beginning with the 52nd iteration of the Biennale, the oldest and most widely recognized cultural event in the world of contemporary art.

More than a decade ago the Clark library began to concentrate on collecting rare artists’ books and other, less conventional book-like works produced by artists around the world since the 1960s, and it has since built substantial holdings.  In 2007, the library decided to begin gathering such materials at the Biennale and asked Thomas Heneage, a veteran London art-book dealer, to represent the library at the Biennale as its “personal catalog, ephemera and art-book gatherer.”  Through the Clark/Heneage Biennale partnership, the library added oddities like The Whole Universe  created by artist Terence Koh and Used Swim Wear by collaborative duo Han & Him for the 2009 Danish/Nordic Pavilion’s “goodie bag.”

With the 53rd Venice Biennale came a sea change in the Library’s collection. In addition to the collection of traditional paper press kits, Thomas Heneage sent back electronic materials in the form of CDs, flash drives, and web address hyperlinks. The library needed to preserve not only the physical objects but also the videos, images, and text contained within them. To accommodate this new electronic press material, the Library created the Venice Biennale (E-Biennale) Preservation Archive, a restricted collection in the library’s digital management system. New accessions connected with the 54th Venice Biennale (2011) generated even more independent Biennale web content that the library set out to preserve as well, for example Christian Boltanski’s web game Chance, created to “induce global participation” beyond his installation in the French Pavilion.

The Venice Biennale 2013 iteration and the Library’s collaboration with the web-archiving service Archive-It have brought the capture of intellectual content to a new level. The Library worked with Archive-It’s Sylvie Rollason-Cass to create the “url seeds” and provide descriptive metadata and faceting using Dublin Core fields.
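
For readers unfamiliar with Dublin Core, the sketch below shows one way a seed's descriptive metadata could be expressed as XML using the standard Dublin Core element namespace. The record wrapper element, the seed URL, and all field values are hypothetical placeholders; this is not Archive-It's data format or the Library's actual metadata.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 core Dublin Core elements (title, subject, date, ...).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def seed_record(fields):
    """Serialize a mapping of Dublin Core element names to value lists as XML."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        # Dublin Core elements are repeatable, so each value gets its own element.
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
xml_text = seed_record({
    "title": ["Example national pavilion website"],
    "identifier": ["https://pavilion.example.org/"],
    "subject": ["Venice Biennale", "Art exhibitions"],
    "date": ["2013"],
})
print(xml_text)
```

Repeatable fields such as subject are what make faceting possible: each distinct value can become a filter in the collection's browse interface.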

The Archive-It crawl on behalf of the Library began April 28, 2013 and will continue through to the end of the exhibition in November.  This year also promises to be a banner year for our physical Biennale Collection with Russian gold from Vadim Zakharov’s project titled Danaë and Golden Lion award winner for the Angola Pavilion Edson Chagas’ Found not Taken series of posters.

This blog post explores the Lee Library Association’s project and is the first in a series presenting and following up on members’ projects from their perspective.

Mary Philpott, President of the Lee Library Association Board of Directors, sees her library’s partnership with the Digital Commonwealth and the Boston Public Library as a great community building activity.   The Lee Library project includes more than 1,000 photographs that were digitized by the Boston Public Library thanks to funding from an LSTA grant awarded by the Massachusetts Board of Library Commissioners.

The library originally provided access to photocopies of these images, along with title descriptions organized in albums arranged by broad subject areas. Mary pursued the digitization project because the albums could not provide access to people at a distance and were not searchable, and because digitally duplicating the photos would help preserve Lee’s history. In addition, a collection of glass plate negatives was made accessible.

Before the collection can go online in the upcoming new Digital Commonwealth repository (currently under development by the BPL), volunteers have to enter all the descriptions (metadata) either into an Excel spreadsheet or online.  Even though the digitized images are not online, Mary said the Library used them from day 1.  Lee had important marble and paper manufacturing industries, and many important historic buildings in the country contain marble from Lee. Now, the library can answer a lot of the telephone and email reference questions as a result and email the image back to the patron.  Sometimes in return, the library learns more about the town’s history.

This photo is one of the glass slides from the Lee Library photo collection. There is no written information about these slides, but in this photo, the men are carving a piece of marble that is most likely from one of the Lee marble quarries. The carving’s destination is the Bolkenhayn House in Central Park. The Bolkenhayn House was built on the last vacant plot at the Fifth Avenue entrance to Central Park. The name of the building, The Bolkenhayn, was taken from a town in Silesia, and “some significance attaches to it because the suggestion of the style of architecture is taken from a palace in the place named.” (NY Times, Feb. 6, 1894) This building was designed and owned by Alfred Zucker. There is a carving of a palace above the name and in one of the pictures of the building the carved piece is above the door. The building was completed in 1895. This building has been well-documented nationally and has housed prominent residents through the years.

A side benefit of the digitization project was discovering new material. Even though the images had been well described in the albums, the Library staff found images they did not even know they had when they selected images to be digitized. These “new” pictures hadn’t been categorized. They now present an opportunity for staff and patrons alike to identify them, and they have been exhibited in the gallery. Mary noted that this exhibit brought people into the Lee Library who had not visited for quite a long time, and she sees opportunities to use the photos everywhere from newsletters to local cable TV spots.

This project is also helping the Lee Library to build new collections.  The Library is currently hosting “Picture Lee 2013: Preserving the Present for the Future.” The Library invited community members to submit photos of Lee people, places and things taken in 2013. The Library recently used 300 digitized photos as background images at their annual meeting and is using the images for advocacy by planning exhibits to coincide with its spring budget meeting.

For the Lee Library, digitizing local history is a priority because there is no public access to the basement historical room.  The Library was determined to digitize their collections.  Initially, Mary wrote an LSTA grant for the project that was not funded.

For Mary, the hardest part of this project was the steep learning curve. When she started, she knew nothing about digitizing collections and did a lot of homework. When she wrote the LSTA grant application, the information from vendors was difficult to compare as she tried to overcome the steep technological barriers. Mary attended a Digital Commonwealth Conference three years ago but left very frustrated because the terminology was daunting and the process was too complicated at the time. She did not give up, however, and attended a Digital Commonwealth workshop in which the complicated terminology was translated and simplified. Mary is also grateful to the many librarians who helped, mentored, and encouraged her, and especially to the Digital Commonwealth and the Boston Public Library.

Building on their achievements and using their new digital collections and know-how as leverage, the library recently received a grant from the Berkshire Bank Foundation for a digital microfilm reader to make the Berkshire Gleaner (1857-1944) and other local history microfilm accessible.

Watch Lee Library’s digitization progress at http://blog.bpl.org/dcbpl/about-the-program/participating-institutions/lee-library-association/

Mary can be reached at maryphilpott@mindspring.com