WGBH Awarded Grant by Institute of Museum and Library Services for Public Broadcasting Preservation Fellowship

Grant of $229,772 will fund students’ work on digitization of historic, at-risk public media content from underrepresented regions and communities

BOSTON, September 28, 2017 – WGBH Educational Foundation is pleased to announce that the Institute of Museum and Library Services (IMLS) has awarded WGBH a $229,772 Laura Bush 21st Century Librarian Program grant to launch the Public Broadcasting Preservation Fellowship. The fellowship will fund 10 graduate students from across the United States to digitize at-risk audiovisual materials at public media organizations near their universities. The digitized content will ultimately be incorporated into the American Archive of Public Broadcasting (AAPB), a collaboration between Boston public media station WGBH and the Library of Congress working to digitize and preserve thousands of broadcasts and previously inaccessible programs from public radio and public television’s more than 60-year legacy.

“We are honored that the Institute of Museum and Library Services has chosen WGBH to lead the Public Broadcasting Preservation Fellowship,” said Casey Davis Kaufman, Associate Director of the WGBH Media Library and Archives and WGBH’s AAPB Project Manager. “This grant will allow us to prepare a new generation of library and information science professionals to save at-risk and historically significant public broadcasting collections, especially fragile audiovisual materials, from regions and communities underrepresented in the American Archive of Public Broadcasting.”

WGBH has developed partnerships with library and information science programs and archival science programs at five universities: Clayton State University, University of North Carolina at Chapel Hill, University of Oklahoma, University of Missouri, and San Jose State University. Each school will be paired with a public media organization that will serve as a host site for two consecutive fellowships: Georgia Public Broadcasting, WUNC, the Oklahoma Educational Television Authority, KOPN Community Radio, and the Center for Asian American Media in partnership with the Bay Area Video Coalition.

“As centers of learning and catalysts of community change, libraries and museums connect people with programs, services, collections, information, and new ideas in the arts, sciences, and humanities. They serve as vital spaces where people can connect with each other,” said IMLS Director Dr. Kathryn K. Matthew. “IMLS is proud to support their work through our grant making as they inform and inspire all in their communities.”

The first fellowship will take place during the 2018 spring semester, from January to April of 2018. The second fellowship will take place during the summer semester from June to August of 2018. The grant also will support participating universities in developing long-term audiovisual preservation curricula, including providing funding for audiovisual digitization equipment, and developing partnerships with local public media organizations.

### 

About WGBH
WGBH Boston is America’s preeminent public broadcaster and the largest producer of PBS content for TV and the Web, including Masterpiece, Antiques Roadshow, Frontline, Nova, American Experience, Arthur, Curious George, and more than a dozen other prime-time, lifestyle, and children’s series. WGBH also is a leader in educational multimedia, including PBS LearningMedia, and a pioneer in technologies and services that make media accessible to the 36 million Americans who are deaf, hard of hearing, blind, or visually impaired. WGBH has been recognized with hundreds of honors: Emmys, Peabodys, duPont-Columbia Awards…even two Oscars. Find more information at www.wgbh.org.

About the Library of Congress
The Library of Congress is the world’s largest library, offering access to the creative record of the United States – and extensive materials from around the world – both on site and online. It is the main research arm of the U.S. Congress and the home of the U.S. Copyright Office. Explore collections, reference services and other programs and plan a visit at loc.gov, access the official site for U.S. federal legislative information at congress.gov and register creative works of authorship at copyright.gov.

About the American Archive of Public Broadcasting
The American Archive of Public Broadcasting (AAPB) is a collaboration between the Library of Congress and the WGBH Educational Foundation to coordinate a national effort to preserve at-risk public media before its content is lost to posterity and provide a central web portal for access to the unique programming that public stations have aired over the past 60 years. To date, nearly 50,000 hours of television and radio programming contributed by more than 100 public media organizations and archives across the United States have been digitized for long-term preservation and access. The entire collection is available on location at WGBH and the Library of Congress, and more than 22,000 programs are available online at americanarchive.org.

About IMLS
The Institute of Museum and Library Services is celebrating its 20th Anniversary. IMLS is the primary source of federal support for the nation’s 123,000 libraries and 35,000 museums. Our mission has been to inspire libraries and museums to advance innovation, lifelong learning, and cultural and civic engagement. For the past 20 years, our grant making, policy development, and research have helped libraries and museums deliver valuable services that make it possible for communities and individuals to thrive. To learn more, visit http://www.imls.gov and follow us on Facebook, Twitter and Instagram.

Introducing an audio labeling toolkit

In 2015, the Institute of Museum and Library Services (IMLS) awarded WGBH, on behalf of the American Archive of Public Broadcasting, a grant to address the challenges faced by many libraries and archives trying to provide better access to their media collections through online discoverability. Through a collaboration with Pop Up Archive and HiPSTAS at the University of Texas at Austin, our project has supported the creation of speech-to-text transcripts for the initial 40,000 hours of historic public broadcasting preserved in the AAPB, the launch of a free open-source speech-to-text tool, and FIX IT, a game that allows the public to help correct our transcripts.

Now, our colleagues at HiPSTAS are debuting a new machine learning toolkit and DIY techniques for labeling speakers in “unheard” audio, that is, audio that is not documented in a machine-generated transcript. The toolkit was developed through a massive effort using machine learning to identify notable speakers’ voices (such as Martin Luther King, Jr. and John F. Kennedy) from within the AAPB’s 40,000-hour collection of historic public broadcasting content.

This effort has vast potential for archivists, researchers, and other organizations seeking to discover and make accessible sound at scale — sound that otherwise would require a human to listen and identify in every digital file.

Read more about the audio labeling toolkit here, and stay tuned for more posts in this series.

AAPB NDSR Resources Round-up

 

In 2015, the Institute of Museum and Library Services awarded a generous grant to WGBH on behalf of the American Archive of Public Broadcasting (AAPB) to develop the AAPB National Digital Stewardship Residency (NDSR). Through this project, we have placed seven graduates of master’s degree programs in digital stewardship residencies at public media organizations around the country.

AAPB NDSR has already yielded dozens of great resources for the public media and audiovisual preservation community – and the residents aren’t even halfway done yet! As we near the program’s midpoint, we wanted to catch you up on the program so far.

We started off in July 2016 with Immersion Week in Boston, which featured presentations on the history of public media and the AAPB, an overview of physical and digital audiovisual materials, an introduction to audiovisual metadata, and instructional seminars on digital preservation workflows, project management, and professional development. Attendees also participated in a full-day session on “Thinking Like a Computer” and a hands-on command line workshop.

Several sessions from Immersion Week were filmed by WGBH Forum Network.

In August 2016, the residents dispersed to their host stations, and began recording their experiences in a series of thoughtful blog posts, covering topics from home movies to DAM systems to writing in Python.

AAPB NDSR blog posts to date include:

“Digital Stewardship at KBOO Community Radio,” Selena Chau (8/9/16)

“Metadata Practices at Minnesota Public Radio,” Kate McManus (8/15/16)

“NDSA, data wrangling, and KBOO treasures,” Selena Chau (8/30/16)

“Minnesota Books and Authors,” Kate McManus (9/23/16)

“Snapshot from the IASA Conference: Thoughts on the 2nd Day,” Eddy Colloton (9/29/16)

“Who just md5deep-ed and redirected all them checksums to a .csv file? This gal,” Lorena Ramirez-Lopez (10/6/16)

“IASA Day 1 and Voice to Text Recognition,” Selena Chau (10/11/16)

“IASA – Remixed,” Kate McManus (10/12/16)

“Learning GitHub (or, if I can do it, you can too!)” Andrew Weaver (10/13/16)

“Home Movie Day,” Eddy Colloton (10/15/16)

“Snakes in the Archive,” Adam Lott (10/20/16)

“Vietnam, Oral Histories, and the WYSO Archives Digital Humanities Symposium,” Tressa Graves (11/7/16)

“Archives in Conversation (A Glimpse into the Minnesota Archives Symposium, 2016),” Kate McManus (11/15/16)

“Inside the WHUT video library clean-up – part 1: SpaceSaver,” Lorena Ramirez-Lopez (11/21/16)

“Is there something that does it all?: Choosing a metadata management system,” Selena Chau (11/22/16)

“Inside the WHUT video library clean-up – part 2: lots of manual labor,” Lorena Ramirez-Lopez (12/20/16)

“Just Ask For Help Already!” Eddy Colloton (12/22/16)

August also kicked off our first series of guest webinars, focusing on a range of topics of interest to audiovisual and digital preservation professionals. Most webinars were recorded, and all have slides available.

AAPB NDSR webinars to date include:

“Metadata: Storage, Modeling and Quality,” by Kara Van Malssen, Partner & Senior Consultant at AVPreserve

“Public Media Production Workflows,” by Leah Weisse, WGBH Digital Archive Manager/Production Archival Compliance Manager (slides)

“Imposter Syndrome,” by Jen LaBarbera, Head Archivist at Lambda Archives of San Diego, and Dinah Handel, Mass Digitization Coordinator at the NYPL (slides)

“Preservation and Access: Digital Audio,” by Erica Titkemeyer, Project Director and AV Conservator at the Southern Folklife Collection (slides)

“Troubleshooting Digital Preservation,” by Shira Peltzman, Digital Archivist at UCLA Library (slides)

“Studs Terkel Radio Archive: Tips and Tricks for Sharing Great Audio,” by Grace Radkins, Digital Content Librarian at Studs Terkel Radio Library (slides)

“From Theory to Action: Digital Preservation Tools and Strategies,” by Danielle Spalenka, Project Director of the Digital POWRR Project (slides)

Our first two resident-hosted webinars (open to the public) will take place this month! Registration and more information are available here.

The residents also hosted two great panel presentations: the first in September at the International Association of Sound and Audiovisual Archives Conference, and the second in November at the Association of Moving Image Archivists Conference. The AMIA session in particular generated a lot of Twitter chatter; you can see a roundup here.

To keep up with AAPB NDSR blog posts, webinar recordings, and project updates as they happen, follow the AAPB NDSR site at ndsr.americanarchive.org.

AAPB & Pop Up Archive Launch Project to Analyze 40,000 Hours of Historic Public Media

We are thrilled to announce that the Institute of Museum and Library Services has awarded WGBH, on behalf of the American Archive of Public Broadcasting, a National Leadership Grant for a project titled “Improving Access to Time-Based Media through Crowdsourcing and Machine Learning.”

Together, WGBH and Pop Up Archive plan to address the challenges faced by many libraries and archives trying to provide better access to their media collections through online discoverability. This 30-month project will combine technological and social approaches for metadata creation by leveraging scalable computation and engaging the public to improve access through crowdsourcing games for time-based media. The project will support several related areas of research and testing, including: using speech-to-text and audio analysis tools to transcribe and analyze almost 40,000 hours of digital audio from the American Archive of Public Broadcasting; developing open-source web-based tools to improve transcripts and descriptive data by engaging the public in a crowdsourced, participatory cataloging project; and creating and distributing data sets to provide a public database of audiovisual metadata for use by other projects.

Our research questions are: How can crowdsourced improvements to machine-generated transcripts and tags increase the quality of descriptive metadata and enhance search engine discoverability for audiovisual content? How can a range of web-based games create new points of access and engage the public with time-based media through crowdsourcing tools? What qualitative attributes of audiovisual public media content (such as speaker identities, emotion, and tone) can be successfully identified with spectral analysis tools, and how can feeding crowdsourced improvements back into audio analysis tools improve their future output and create training data that can be publicly disseminated to help describe other audiovisual collections at scale?

This project will use content from the AAPB to answer our questions. The project will fund 1) audio analysis tools – development and use of speech-to-text and audio analysis tools to create transcripts and qualitative waveform analysis for almost 40,000 hours of AAPB digital files (and participating stations can definitely receive copies of their own transcripts!); 2) metadata games – development of open-source web-based tools to improve transcripts and descriptive data by engaging the public in a crowdsourced, participatory cataloging project; 3) evaluating access – a measurement of improved access to media files from crowdsourced data; 4) sharing tools – open-source code release for tools developed over the course of the grant; and 5) teaching data set – the publication of initial and improved data sets to ‘teach’ tools and provide a public database of audiovisual metadata (audio fingerprint) for use by other projects working to create access to audiovisual material.

The 2014 National Digital Stewardship Agenda includes the recommendation, “Engage and encourage relationships between private/commercial and heritage organizations to collaborate on the development of standards and workflows that will ensure long-term access to our recorded and moving image heritage.” These partnerships are critical in order to move the needle on audiovisual access issues of national significance. The AAPB and Pop Up Archive are eager to continue building such a relationship so that the innovations in technology, workflows, and data analysis advanced by the private sector are fully and sustainably leveraged for U.S. public media and cultural heritage organizations.