How to FIX IT+ and Why: Crowdsourcing to Save Public Media Materials

Most public radio and television organizations are at risk of losing their archival materials due to deterioration and the high costs associated with digitization. The American Archive of Public Broadcasting (AAPB) is partnering with George Blood L.P., a digitization vendor, to help AAPB’s contributing organizations preserve their collections one transcript and one tape at a time in the Transcribe to Digitize Challenge.

Article highlights:

  • The American Archive of Public Broadcasting (AAPB) is a digital archive of public radio and television programs from contributing stations across the nation, available online at
  • Each program in the archive has a computer-generated, speech-to-text transcript to improve keyword search
  • These transcripts are not accurate and have been made available for the public to help proof and edit through AAPB’s editing site, FIX IT+ at
  • George Blood L.P., a digitization vendor, has agreed to provide FREE digitization for each station that corrects a minimum number of transcripts in FIX IT+, a.k.a. The Transcribe to Digitize Challenge
  • The public is invited to help individual stations in this Challenge reach their goal of 20 corrected transcripts
  • Crowdsourcing provides two lasting outcomes for the future
  • Tune in to a video interview with a WGBH volunteer on his experience with FIX IT+
  • Below are three easy steps for participating

Making America’s Public Broadcasting Legacy Searchable and Accessible

The American Archive of Public Broadcasting (AAPB), a collaboration between the Library of Congress and WGBH to preserve a national archive of public radio and television programming accessible to the public, has partnered with George Blood, a digitization service provider, to help mitigate the costs of digitization through the Transcribe to Digitize Challenge.

Over the past five years, the AAPB has digitized and preserved more than 50,000 hours of public programming created by stations and producers across the United States. This unique historic material, created as early as the 1940s, often lacks closed captioning and represents our shared and diverse cultural heritage. Yet it is not highly discoverable to researchers, educators, students, lifelong learners, journalists and the public because it lacks descriptive information.

With funding from the Institute of Museum and Library Services (IMLS), the AAPB has created speech-to-text transcripts of the audio and video materials in the collection. These transcripts can be used to improve the accessibility of the collection through the addition of new keywords and by exposing the time-stamped transcript alongside the program on the AAPB website.

Keywords search and time-stamped transcripts alongside the program on

However, these computer-generated transcripts lack accuracy, and the AAPB is seeking the help of the public to correct them!

A Call to Action

For the Transcribe to Digitize Challenge, a station must have a minimum of 20 transcripts corrected in AAPB’s FIX IT+ to meet the challenge; George Blood will then provide free digitization for 20 tapes selected by that station. Up to 100 transcripts can be corrected for 100 tapes to be digitized per station. The digitized materials will be delivered back to each station, and a copy will also go to the AAPB for long-term preservation at the Library of Congress and access through the AAPB website!
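
For stations tracking their progress, the Challenge rule can be sketched in a few lines of Python. This is only an illustration, not an official calculator: the function name is hypothetical, and the one-tape-per-transcript rate between 20 and 100 is an assumption read from the description above.

```python
MIN_TRANSCRIPTS = 20  # threshold a station must reach to meet the Challenge
MAX_TAPES = 100       # per-station cap on free digitization


def tapes_earned(corrected_transcripts: int) -> int:
    """Estimate how many tapes a station has earned (hypothetical helper).

    Assumes one tape per corrected transcript once the minimum is met,
    which is an interpretation of the Challenge text, not a stated rule.
    """
    if corrected_transcripts < 0:
        raise ValueError("transcript count cannot be negative")
    if corrected_transcripts < MIN_TRANSCRIPTS:
        return 0  # below the minimum, the Challenge is not yet met
    return min(corrected_transcripts, MAX_TAPES)


for count in (19, 20, 57, 250):
    print(count, "corrected transcripts ->", tapes_earned(count), "tapes")
```

Under these assumptions, a station with 19 corrected transcripts has not yet earned any tapes, while one more correction unlocks the first 20.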

Stations like WGBH, Louisiana Public Television, Rocky Mountain PBS, and Wisconsin Public Television have opted in to the challenge and must correct their transcripts by December 2019. Participating in this challenge creates two lasting outcomes:

  1. Completed transcripts are made available online for students, educators, journalists, and lifelong learners to access.
  2. Your help could result in a station’s free digitization.

But don’t take our word for it… here’s the perspective of a WGBH volunteer editor!

Editing is as easy as 1-2-3:

1. Filter Content by Station

Visit FIX IT+ and filter the content by participating station, e.g. “WGBH”, and sort by “Completeness (most to least)”.

The filter bar is located on the homepage of

Direct links:

Louisiana Public Television Transcripts 

Rocky Mountain PBS Transcripts

Wisconsin Public Television Transcripts

WGBH Transcripts 

2. Select a Transcript

Transcript tile

Select an unedited transcript OR continue editing a transcript that has already been started by another user. Each transcript requires two reviews, so feel free to choose a topic that interests you and spend anywhere from 10 mins to an hour editing. All your edits are saved automatically.

“Transcript tiles” note the transcript’s contributing station, the program title, its series, a brief description, the program’s duration, number of contributing editors, and a progress bar.

3. Become Familiar with Simple Editing Conventions

You can listen to the audio by clicking the ‘play’ icon to the left of each line, and then correct the text on-screen using your keyboard. For more editing details, click the “View a Tutorial” button at the top of the transcript’s page for standard conventions.

Green lines note when lines are completed and no longer need editing. The gray lines still need reviewing.

Questions? Contact Ryn Marchese, AAPB Engagement and Use Manager, at 617-300-3644.


New Special Collection of News and Cultural Programming from Oklahoma Educational Television Authority (OETA)!

“At the heart of this collection, are the people. The resilient men and women who have both contributed to the legacy of Oklahoma as well as the mosaic of our great nation in the area of art, music, science, exploration, politics, religion, architecture, literature, language, etc.”- Evelyn Cox, Public Broadcasting Preservation Fellow

Collection Summary

OETA Special Collection

The OETA News and Cultural Programming (1980-Present) Collection includes 74 programs and segments created since the 1980s by Oklahoma Educational Television Authority (OETA). It is a glimpse into the past, covering topics and exploring issues that are relevant to the diverse cultures of Oklahoma. The collection, which includes programs about Oklahoma history, documents issues and events such as the Oklahoma Land Run of 1889, the life and career of humorist Will Rogers, the women’s war effort in World War II, and the Oklahoma City Bombing. Also featured are individuals such as aviator Wiley Post, Boomer David Payne, the “Hanging Judge” Charles Isaac Parker, and many others. This collection is an eclectic mix of digitized at-risk public media material from the OETA Archive with contributions from the Oklahoma Department of Wildlife Conservation Archive.

Direct link:!

Collection Background

The Oklahoma Educational Television Authority (OETA) is Oklahoma’s only statewide coordinated instructional and public television network. In 1951 the state legislature pioneered the growth of noncommercial educational television in the United States by unanimously approving House Concurrent Resolution Number 5, urging the Federal Communications Commission (FCC) to reserve television channels for educational purposes. In addition to offering television programs supplied by PBS and acquired from various independent distributors, the network produces news, public affairs, cultural, and documentary programming; the OETA also distributes online education programs for classroom use and teacher professional development, and maintains the state’s Warning, Alert and Response Network (WARN) infrastructure. The OETA network’s main offices and production facilities are located at the intersection of Kelley Avenue and Britton Road in northeastern Oklahoma City. The collection was digitized in 2018 by Evelyn Cox and Laura Haygood, Public Broadcasting Preservation Fellows, in collaboration with Oklahoma Educational Television Authority, through a project funded by the Institute of Museum and Library Services.

Featured Programs


AAPB Launches New Special Collection of Radio Programs from Georgia Public Broadcasting, 1992-2007

Collection Summary


The Georgia Gazette Collection consists of 102 Georgia Gazette radio programs produced by Georgia Public Broadcasting in Atlanta, Georgia from 1992 to 2007. Georgia Gazette started under the name Georgia Journal as an hour-long weekly call-in radio magazine that covered a wide array of Georgia topics. It is unclear when the title Georgia Journal changed to Georgia Gazette. The main host of Georgia Gazette was Bruce Dortin. After 2007, the program became a daily thirty-minute broadcast hosted by Rickey Bevington. Georgia Gazette is considered to be the longest-running Georgia-focused radio program in the country. Each episode focused on a variety of topics with reports from all across the state. It covered Georgia legislative issues and political events including the Georgia General Assembly and elections. Common segments that were featured throughout the magazine included sports briefs, the arts calendar, fictional legal advice, and commentaries. Guests often included well-known and lesser-known Georgian writers, comedians, journalists, scholars, lawyers, artists, psychologists, and doctors. Major topics included the 1996 Olympics, teen pregnancy, race issues, and local protests.

Along with the weekly show, the Georgia Gazette team also produced an hour-long monthly call-in program called Georgia Gazette Consumer Call-In, which featured guests who would give advice and inform the public on consumer rights. From time to time the program focused on a particular consumer issue such as auto recalls, credit card scams, and telecommunication development. Guests featured in the program included Georgia Secretary of State Lewis Massey and, later, Cathy Cox, as well as Berry Reed and Trilis Halford, representatives from the Governor’s Office of Consumer Affairs. During this program, Georgians could call in and get help and advice directly from the experts about their consumer issues. Occasionally, the program would re-release popular segments from past shows together in an episode titled Best of Georgia Gazette.

Collection Background

The Georgia Gazette Series was contributed to the American Archive of Public Broadcasting (AAPB) by GPB during the AAPB Public Broadcasting Digitization Fellowship, funded by the Institute of Museum and Library Services in 2018. Collection digitized by Virginia Angles and Riley Griffin.

Featured Radio Items




Steve Wilcer, Public Broadcasting Preservation Fellow at WUNC


Hello! My name is Steve Wilcer. I coordinated with WGBH and WUNC Radio in Chapel Hill, North Carolina as a member of the second cohort of fellows for the AAPB Public Broadcasting Preservation Fellowship. I am currently working towards a Master of Science in Library Science at the University of North Carolina and plan to graduate next spring. Prior to my time in North Carolina, I studied musicology at the Ohio State University and was exposed to a wide variety of media formats and materials, ranging from microfiche to medieval manuscripts. I developed a strong passion for libraries and archives through these experiences, which led me to pursue a second master’s degree in library science.

Learning as I work

As someone who just entered North Carolina last fall, my work with WUNC Radio offered me a unique opportunity to learn about the area and its people. Public radio provides a versatile platform for education, entertainment, and awareness programming. I was thrilled to experience the myriads of different programs from WUNC over the years and be able to directly contribute to their preservation for the future. During my portion of the fellowship, I was able to digitize approximately forty assets, with most of them being digital audio tapes. I also continued to develop the cataloging and documentation for WUNC, allowing me to experience the digitization and preservation process from a more holistic standpoint.

One particularly informative component of the fellowship for me was the North Carolina Voices special collection, which contains materials from two of WUNC’s special program series: Understanding Poverty and Civil War. Understanding Poverty offered a wide assortment of programs and features on various financial and social issues in the state, as well as how North Carolina has developed over the last several decades. The Civil War series contained family stories of ancestors who lived during or served in the United States Civil War. Both series provided me a valuable, more tangible insight into the people of Chapel Hill and North Carolina as I listened to their stories and firsthand experiences. I also had the artistic opportunity to design our thumbnail image for the special collection as it appears on the AAPB.

Building up foundations

Being the second UNC fellow for the project, I was fortunate that our digitization station was already set up and operational. Getting the station to work was a significant challenge for the first round of the fellowship, but fortunately, the station operated without any issues for me, thanks to all the hard work from everyone involved. One of my duties in the project was to build upon the records for the digitized materials and ensure that WUNC’s personal records were uniform and easy to understand. I frequently consulted with WUNC’s Keith Weston to confirm dates, names, and programming details. In some cases, newly rediscovered items forced us to reevaluate how we defined a particular series or piece of programming, and I would edit our records as necessary.

UNC SILS Digitization station

While the fellowship focuses on digitization, cataloging the physical DATs and cassettes I handled proved to be equally important. Without proper labeling and documentation, a given asset could be unknowingly re-recorded and cost extra time. In addition to our digital master table of records, I was responsible for labeling the physical objects and their cases with the newly-determined local identifiers for WUNC. With these markings, the cases can be quickly scanned for items that are yet to be digitized, which will make future digitization projects easier for WUNC.

I developed a strong personal connection to these items as I cataloged and marked them. Each DAT and cassette had a story to tell, and it was up to me to piece together their metadata and see that they were digitized and made publicly accessible so others could listen to them. Being one of the first North Carolina-based organizations to be included in the AAPB was very exciting for me, as our work here was not only a foundation for WUNC and its archives, but for North Carolina as a state, as well. Materials like the WUNC 1953 sign-on event reminded me how long ago some of these recordings were made, and how many more there may still be at WUNC, waiting to be digitized and heard once more.

Overall, the fellowship has been a wonderful opportunity for me. It not only allowed me to develop my abilities handling audio materials and digital records, but also provided me a way to learn about the area and its people and history. I am incredibly grateful for all the support and effort from everyone who allowed this project to be realized: my advisor, Dr. Helen Tibbo; Erica Titkemeyer from the Southern Folklife Collection for her technical assistance; Dena Schulze, our first fellow for the project; Keith Weston at WUNC; and all the staff at WGBH for their supervision, planning, and feedback.

Written by Steve Wilcer, PBPF Summer 2018 Cohort


About PBPF

The Public Broadcasting Preservation Fellowship (PBPF), funded by the Institute of Museum and Library Services, supports ten graduate student fellows at University of North Carolina, San Jose State University, Clayton State University, University of Missouri, and University of Oklahoma in digitizing at-risk materials at public media organizations around the country. Host sites include the Center for Asian American Media, Georgia Public Broadcasting, WUNC, the Oklahoma Educational Television Authority, and KOPN Community Radio. Contents digitized by the fellows will be preserved in the American Archive of Public Broadcasting. The grant also supports participating universities in developing long-term programs around audiovisual preservation and ongoing partnerships with their local public media stations.

For more updates on the Public Broadcasting Preservation Fellowship project, follow the project on Twitter at #aapbpf, and come back in a few months to check out the results of their work.


Dena Schulze, Public Broadcasting Preservation Fellow at WUNC

My name is Dena Schulze and I am the Public Broadcasting Preservation fellow partnered with WUNC radio station in Chapel Hill, North Carolina and the University of North Carolina at Chapel Hill. I graduate in May from the Archives and Records Management track in the Library Science School at UNC. It has been my privilege to digitize over 170 assets from WUNC radio station that were deemed at risk.  Formats included CDs, cassettes and DAT tapes. Check out some pictures and ramblings about my experience below!


Time Travelin’ with WUNC

Every time I put on the headphones, cue up the tape or CD and press record it’s like stepping into a time machine! I had noise reducing headphones that allowed me to be totally immersed in the recordings. Shows at WUNC that I digitized were mostly weekly talk shows about current events and the people, places and things of North Carolina. There were also special programs and recordings that changed up the monotony of talk shows. I enjoyed learning about the state that I have called home for the last fifteen years. Over the course of the fellowship I was able to digitize about 170 assets and learned so much about both the process and the content. Here are a few key words that summarize my experience:


There were times when I was listening to a talk show or news segment and if you had changed the names and dates, I would have thought it was a current broadcast. Topics included poverty, politics, abortion, economics, gay marriage, health care, etc. These issues are still constantly in the news and being debated in our country. While I was listening to people talk about these issues 5, 10, 20 years ago it brought a new perspective to the news I was reading about in the present. Will we ever solve these problems or end the debate? Maybe not but I think the continuing discussion is vital and looking back on what has been said before can help the present conversation move forward.


Many of the shows and recordings also featured performing arts and music. Gary Shivers on Jazz played collections of jazz music, including an episode on Frank Sinatra and Ella Fitzgerald which I thoroughly enjoyed. The first episode of The Linda Belans show focused on television, specifically the popular shows airing at the time: Friends and Frasier. There was also a collection of short stories recorded by authors including Lee Smith and Haven Kimmel. As someone who loves the arts, I loved this theme throughout the assets and listening to things I would never have heard of otherwise.


Cueing up a tape was almost like going on a treasure hunt! The titles of the episode didn’t necessarily tell me what I was going to be listening to for the next hour or so. Sometimes they were pretty simple: “Ray Bradbury” was a conversation with the famous author. Others had one description or name but that was only part of the tape. I was surprised to discover a whole segment on the art of fiddling and another interview featuring actress Amy Adams at the beginning of her career. Some did not even have a description on the tape and that content was a total surprise! Kept me on my toes!


North Carolina!

As mentioned above, I have lived in North Carolina for the past fifteen years and felt a strong connection to the shows focusing on the people, places and issues of the state. One show discussed a school being built near where I lived; I had no idea of its history and beginnings. Another had an interview with Dr. William Friday, who is basically North Carolina royalty and at one time was the president of the University of North Carolina system. Every recording dealt with a person, issue or place concerning the state of North Carolina. It gave me a greater knowledge and appreciation for the state I call home!


This word describes more of the process than the content. Because we were creating the workstation and workflow from the ground up, there were a lot of hiccups to work through. Equipment that did not arrive on time or did not work properly, a computer that did not read the CDs or programs correctly, and miscommunication in emails are just a few examples. I had to be ready to move on to another part of the fellowship while other factors were figured out or fixed. Once the workstation and workflow were set up, everything ran a lot smoother, but it takes time to get all the different pieces working together. I found it vital that I had mentors and professionals at my university and at the station to ask for help, and I would not have gotten the workstation up and running without them!

I had so much fun immersing myself in recordings from the past and learning some history! I think these recordings are going to be so valuable on the AAPB website and I am so glad I was able to help get them online!

– Written by PBPF Fellow Dena Schulze



Upcoming Webinar: AAPB’s Quality Control Tools and Techniques for Ingesting Digitized Collections


Oklahoma mentor Lisa Henry (left) cleaning a U-matic deck with Public Broadcasting Preservation Fellow Tanya Yule.

This Thursday, February 15th at 8 pm EST, American Archive of Public Broadcasting (AAPB) staff will host a webinar covering quality control tools and technologies used when ingesting digitized collections into the AAPB archive, including MDQC, MediaConch, Sonic Visualiser, and QCTools.

The public is welcome to join for the first half hour. The last half hour will be limited to Q&A with our Public Broadcasting Preservation Fellows, who are just now beginning the process of digitizing at-risk public broadcasting collections to be preserved in the AAPB.

Webinar URL:


For more updates on the Public Broadcasting Preservation Fellowship project, follow the project on Twitter at #aapbpf, and come back in a few months to check out the results of their work: digitized content preserved in the American Archive of Public Broadcasting from our collaborating host organizations WUNC, KOPN, Oklahoma Educational Television Authority, Georgia Public Broadcasting, and the Center for Asian American Media, as well as documentation created to support ongoing audio and video preservation education at the University of Missouri, University of Oklahoma, Clayton State University, University of North Carolina at Chapel Hill, and San Jose State University.


PBS NewsHour Digitization Project Update

In January 2016, the Council on Library and Information Resources awarded WGBH, the Library of Congress, WETA, and NewsHour Productions, LLC a grant to digitize, preserve, and make publicly accessible on the AAPB website 32 years of NewsHour predecessor programs, from October 1975 to December 2007, that currently exist on obsolete analog formats. Described by co-creator Robert MacNeil as “a place where the news is allowed to breathe, where we can calmly, intelligently look at what has happened, what it means and why it is important,” the NewsHour has consistently provided a forum for newsmakers and experts in many fields to present their views at length in a format intended to achieve clarity and balance, rather than brevity and ratings. A Gallup Poll found the NewsHour America’s “most believed” program. We are honored to preserve this monumental series and include it in AAPB.

Last week, our contract archivist Alexander (AJ) Lawrence completed the inventory of 7,320 NewsHour tapes stored in 523 boxes located in WETA’s storage units in Arlington, Virginia, comprising the bulk of the collection. (Additional content is located at two other locations.)

“I was so excited to receive Casey’s initial email asking about my interest in the NewsHour project. I’ve been a lifelong watcher of the program and the chance to be involved in the preservation of such a valuable resource for historical research seemed like a wonderful opportunity.

The process of inventorying the entire collection seemed pretty daunting on my first day when I got my first in-person look at the storage units housing the estimated 7,500 tapes. However, the process has gone quite smoothly overall and we’ve now surpassed the halfway point. Generally, the tapes have little more than a date to identify them, but it’s been especially interesting to come across the tapes for significant historical events over the past 40+ years. These tapes in particular offered me a chance to reflect on some major cultural milestones I’ve witnessed, often through coverage by the NewsHour team. That said, it was also fun to come across the broadcast that aired on the day I was born, as well as the very first broadcast of The MacNeil/Lehrer NewsHour.

Thankfully, I haven’t been tackling the entire inventory alone. I need to offer a special thanks to Matthew Graylin, a desk assistant with the NewsHour who’s been tasked with assisting me with the work. Needless to say, conducting an archival inventory is well beyond the normal duties of a broadcast news assistant, but Matthew has dived in with gusto. We still have a few weeks together, so hopefully I can convert him into a future audiovisual archivist in that time.”


We have also selected a digitization vendor for the project and are looking to begin pilot tests for digitization within the next month. Meanwhile, the Library has instituted quality control procedures to ensure that all digitized files will be properly preserved for present and future generations.

We can’t wait to get started with digitization and look forward to making this monumental series accessible as part of the AAPB collection. In the meantime, we’re pleased to share this clip reel sampling of content that will be digitized, courtesy of NewsHour Productions.


Crawford finishes AAPB digitization of 40,000 hours!

The following is a guest post by Emily Halevy, Director of Media Management Sales at Crawford Media Services. In this blog post, Emily records her interview with Chip Stephenson, Crawford Project Manager, and David Braught, Crawford Logistics Coordinator. Crawford and the AAPB Project Team recently completed the American Archive of Public Broadcasting Digitization Project, funded by the Corporation for Public Broadcasting. Crawford’s role in the project was the coordination and digitization of approximately 40,000 hours of public broadcasting video and audio archival content, as well as the transcoding of approximately 20,000 born-digital files, contributed by more than 100 stations and organizations nationwide!

Now that the digitization is complete, the files will be preserved and made accessible as much as possible through the American Archive of Public Broadcasting, and the AAPB Project Team at WGBH and the Library of Congress is excited to begin working on these efforts. Continue reading below for an account of Crawford’s experience throughout the AAPB digitization project.

Happy New Year, Everyone! I’m delighted to be a guest blogger for the American Archive of Public Broadcasting, once again! As we come to the end of this migration project, I thought this time it would be fun to sit down with Chip Stephenson and David Braught and discuss some of the successes and challenges this project brought. It’s also a great time to reflect on the importance and value of the project as a whole.

The AAPB digitization team at Crawford Media Services

Emily: What’s the first thing that comes to mind now that the project is over?

Chip: It’s over? What? We’ve been living it for over three years!

David: It’s hard to believe it’s over.

Chip: Well, it’s not quite over yet. We’re still wrapping- the engineers are finalizing data, project management is compiling spreadsheets and financials. But we’re almost there.

David: I’ve never worked on anything like this before- the logistics- everything.

Chip: Logistics of shipping, receiving, and accounting for all of the content. And then the amount of data, file configurations, bags, copying files for the individual stations. Over 125 different spreadsheets- audio, video, born digital, plus over 100 stations, which sometimes had multiple spreadsheets. It was more like 100 individual projects than one big project.

David: And every station had its own set of quirks to deal with.

Chip: Every station required multiple phone calls and emails to set things up. It’s an amazing project. The stations were all great to work with and they all had an amazing amount of work to do to make it happen. Some like New Jersey Network and University of Maryland had an incredible amount of content.

David: I’m sure the stations wanted to kill me with the number of emails about checking their files so we could delete them from our system.

Chip: Our engineers were amazing.

David: I can’t say enough good things about our engineers. Guy (Boyd) was able to adapt and push through data, JP (Lesperance) handling all of the born digital, Nathan (Lewis) re-transcoding every single proxy to meet the requirements for the Library of Congress, Herve (Bergeron) and Dr. Dave (Wolaver) switching out and repairing decks.

Chip: And don’t forget the thousands of tapes baked and repaired by Dr. Dave as well.

David: It really was a tremendous team effort.

Emily: We really do have a great team, don’t we? And we can’t leave out the migrators.

Chip: At the peak we had 3 audio migrators running 5 days a week, 24 hours a day. We had 5 video migrators digitizing content, with one pod running 5 days a week, 16 hours a day, and the other pod running 24 hours, 5 days a week. There were even many months running 7 days a week. There were also others just doing QC. And others handling born digital content, copying files into working storage, and then checking to be sure they worked and renaming and creation of the proxy file.

David: Haha! So what was the question again?

Emily: The question was “What’s the first thing that comes to mind now that it is over?”

David: Evidently everything! Haha!

Chip: You never understand the true complexity of the project until you look back and have time to reflect. Before the project even started, during a visit by Stephanie (Sapienza) and Caitlin (Hammer) from CPB, we were reviewing the process and we all started to realize how complex the overall project was going to be. Caitlin kept asking me, “How are you going to do this?” And my answer was “One station at a time.” Thinking about all of it at once was just overwhelming. So David and I sat down and thought about how we wanted to parse this project out. How do we want to think about this on a daily and weekly basis? So we came up with an operational spreadsheet, which then became two spreadsheets, which then became multiple spreadsheets. And there were times over the past year when we just took a deep breath and said, “Ok. 40 stations down, 60 to go.”

David: It was a constant balancing act. Nothing ended up being accurate in terms of tape counts. More audio, less video, double ¾”, which is more time consuming. We had to rearrange our thinking and the pods on a regular basis. And adjust accordingly.

Chip: But working with CPB, and then the transfer of the host to WGBH, went incredibly smoothly. We had some discussions about what they thought and what we thought, but it was very easy moving through issues and problems as they came up.

David: And we always got great support from CPB and then WGBH.

Emily: What turned out to be the most challenging aspect of the project? (If you could name one thing.)

Chip: For me-

David: Oh! Born digital.

Chip: For me it was the born digital for a couple of reasons…

David: Well you take the issues we had with receiving the physical assets and multiply that times a million.

Emily: The born digital was one of the “orphan items” that wasn’t completely fleshed out when we got started.

Chip: We started the born digital about 8 months later than we’d hoped and there were many more individual steps dealing with the stations and how they’d build their drives and name their files and create their spreadsheets. So we had to develop ways to review the file names and correct them to make them legal- spaces had to be replaced with underscores, no illegal characters, they all must have file extensions, etc. Then we had to combine GUIDs for the project with the individual station’s file name. When you do this with thousands and thousands and thousands of files, it becomes complex. And then we had to create proxy files for all of them. And the process you use to create a proxy of one file type might be different from another file type. And then all of the files needed to be QC’d and compared to the master file. Some stations, when they built their initial hard drives, had a large number of bad files. Sometimes up to 50% of the files were bad. And we had to give the stations time to rebuild. Remember the whole purpose of this project was to migrate, capture and acquire as many of these files as possible. Migrate as much as we could within the time frame we had to work with and that time frame was closing in on us.
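The file-name cleanup Chip describes can be sketched in a few lines of Python. This is only an illustration, not Crawford's actual tooling: the function name `make_legal_name`, the underscore used to join the GUID to the station's file name, the sample GUID, and the exact set of characters treated as "illegal" are all assumptions made for the example.

```python
import re
from pathlib import PurePath

# Assumption: anything other than letters, digits, underscore, hyphen
# and dot counts as an "illegal" character. The real rule set is not
# given in the interview.
_ILLEGAL = re.compile(r"[^A-Za-z0-9._-]")


def make_legal_name(guid: str, station_name: str) -> str:
    """Build '<guid>_<cleaned name><extension>' from a station's file name.

    Raises ValueError when the station's file name has no extension,
    since every file was required to have one.
    """
    path = PurePath(station_name.strip())
    if not path.suffix:
        raise ValueError(f"missing file extension: {station_name!r}")
    stem = path.stem.replace(" ", "_")   # spaces become underscores
    stem = _ILLEGAL.sub("", stem)        # drop illegal characters
    return f"{guid}_{stem}{path.suffix}"


# Hypothetical GUID and file name, for illustration only.
print(make_legal_name("guid-0001", "Evening News #12 (final).MOV"))
```

Run over thousands of files, a check like this would also surface the names that cannot be repaired automatically (here, the ones missing an extension) so they could go back to the station.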

Emily: Again- another area where we got great support from Casey and the American Archive team.

Were there any hurdles that turned out to be no big deal?

David: Just getting the content here.

Chip: In the beginning, logistics were slow. We were still trying to figure out the most efficient way to get stuff here.

David: And at the start the stations didn’t really know what they were getting into, but honestly, it went smoothly.

Chip: We started to realize- let’s not worry about having too many tapes here, let’s worry about not having enough.

David: KQED for instance, they were ready to ship immediately. So we told Robert (Chehoski), “Alright, let’s bring it on!”

Chip: At one point, we had the equivalent of 65 pallets of assets in our crypt. And of course it was interesting shipping things from Alaska. But every single station helped us find a way to get their assets to us. And every single station, despite issues (time of year, reduction of staff, etc.), worked their butts off. They all worked really hard to pull, barcode, pack and ship their tapes to us and make this a success. Between dozens of FedEx shipments, three semi-truck runs across the country and an airline delivery, we managed to get everything here and under budget!

Emily: What did you learn from the project?

Chip: Efficiency. Efficiency. Efficiency. Rethink everything you do and realize there might be a better way to do something. And if it sounds like there might be, try it. When David and I sat down and put a plan together we realized quickly we were too rigid. We needed to be flexible. We had to find compromises throughout the project. There were many times we’d get off the phone with a station and say to each other, “How is this going to work?” We could not be afraid to come up with new solutions for the stations. We had to be receptive to their ideas, especially when it came to timing.

David: It didn’t do any good to stick to a timeline that wouldn’t work for them.

Chip: Initially, our idea was to do all the beta tapes together, then all the DVCPro tapes together, but we ended up digitizing several formats simultaneously.

David: Sometimes even 6 video tape formats simultaneously.

Chip: We had a few stations that had only one or two formats, but most of them had a little of everything.

David: Halfway through the project we realized we were dealing with 20 stations at one time- shipping tapes, migrating, moving data, shipping delivery drives, bagging and backing up file data, literally tracking upwards of 30 stations in a given period.

Chip: So being as flexible as possible was important, because no matter how well you thought you had it figured out, it changed on you. And, honestly, at first we fought it, but then we realized that it just wasn’t going to work. So stop fighting it. We had to maintain the flow of tapes required in order to meet the deadline, and being rigid was not going to get us there.

David: I don’t know if a day went by without asking Dr. Dave to switch out tape decks to accommodate our revised workflows.

Emily: What was your favorite “found” item from the project?

David: For me, it was the famous Akira Kurosawa footage. One of our migrators found that the tape label didn’t match the content. It was labeled as a cooking show, but turned out to be an interview with Kurosawa and George Lucas and Francis Ford Coppola. I was like, “Give me that tape!” It turned out to be a program that was thought lost for many years at the station.

Chip: For me, at one point it was all hands on deck, so I had to QC several hundred files. The content just happened to be all the history of New York City and Boston and the Revolutionary War. WNET had a whole series on the history of Manhattan dating back to the revolution. Growing up in that area, I knew a lot of the city’s history, but I never really knew the intricate history of Manhattan and the Bronx and Queens. I didn’t know that Wall Street really was a wall. I learned there’s a fence in Bowling Green Park, which still exists to this day, that was erected in 1770 to protect a statue of George III. The history in this collection is amazing. Meanwhile, I was supposed to be spending 2-3 minutes QC’ing these files and 20 minutes later I had to stop myself and get back to work!

David: That happened all the time!

Chip: The programming is so great! From arts and symphonies to theatricals, history- everything you can think of from all across the country.

Emily: Hence the “American Archive” project!

Chip: Now that the project is coming to an end, I’m just dealing with the data and the files. We did massive shipments out in October and November. It was amazing. The last truck run went up in the first week of December. Right now we’re just pulling the little tidbits and reviewing everything and making sure we crossed all of our Ts and dotted all of our Is. We’re shipping out LTO tapes to the Library of Congress. And I’m a little sad it’s come to an end. On the other side, it’s a great sense of accomplishment. A year of planning and discussions. Two years of migration. Then changing all of the planning several times throughout. It all comes back to flexibility. Understanding you can’t be rigid.

Digitization Successes of the American Archive

By Emily Halevy, Director of Media Management Sales at Crawford Media Services

Migration at Crawford Media Services
Digitization at Crawford Media Services

Hi everyone! My name is Emily Halevy. I’m the Director of Media Management Sales at Crawford Media Services. Hopefully by now, stations have been able to work with Chip and David- our fabulous project managers- and are well on their way to receiving their digitized content.

I want to take a moment to first say how much this project means to me. Growing up, I was an army brat and moved nearly every year of my childhood, sometimes even twice a year, until I hit 10 years old. My sister and I figured out that by the time we’d moved out of our parents’ house we’d moved a total of 24 times. I say that because there weren’t many constants in my life … just my sister, my parents, and PBS. I used to lay my blanket out on the living room floor, sit on it with my stuffed animals pretending it was a magic carpet and watch Sesame Street, Mr. Rogers, 3-2-1 Contact and all the other children’s programming for hours on end. No matter where we lived, no matter where life took us, I always had my blanket and PBS. I feel like in some way I’m now helping to preserve this programming much in the same way it helped me preserve some sense of stability throughout my childhood. To all of you who helped those great programs find their way into my home and my life, I thank you.

Enough about me! Let’s talk about this project!

The task as outlined was to digitize 35,000 hours of audio and video content across 55,000 tapes, and transcode another 5,000 hours of born digital content from approximately 100 stations. Easy enough, right?! Well, our first head-scratching, “how are we gonna do this” moment came when we realized that we would actually need to hold the majority of this content simultaneously. Fifty-eight pallets of tapes and hundreds of additional boxes, to be exact. So, we allocated some of our space to creating a secure crypt with temperature control and FM-200 fire suppression.

Pallets of media arrive to be digitized.

And then we thought, hmm … how is everyone going to barcode their materials consistently, so that when they arrive there is no issue with scanning them? Well, it turned out the easiest solution was for us to print the barcodes and ship them out to all of the stations.

Then we realized, huh … while this project is one project, it’s actually more like 100 different projects with clients all over the country. Even in Guam and Alaska. And about Alaska … Unalaska in the Aleutian Islands to be specific … a truck run was impossible. We couldn’t do a FedEx or UPS run. So, our solution was to have the station book their tapes as luggage on Alaska Airlines, which just so happens to fly into Atlanta. As for our other stations, where possible, Chip was able to coordinate shipping between stations, using 53-foot, climate-controlled pharmaceutical trucks instead of overnight carriers. We project this logistical feat has saved the project approximately $85,000 in shipping costs, which will in turn be used to digitize more media. Yay!

Crawford’s tape bays for digitizing various analog formats.

Now for the files … three for video tapes, two for audio tapes and one transcode for born digital. And then there’s the BagIt container … each source tape yields up to 27 objects including the media essence files, closed caption files, SAMMA migration log, technical metadata files, checksums and so on. That’s nearly 1.5M pieces of information generated and tracked throughout the project!
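The “nearly 1.5M” figure lines up with the numbers quoted earlier in this post. As a rough check, assuming every one of the 55,000 tapes reached the 27-object ceiling and leaving the born digital files out of the count (both simplifications for illustration, not project figures):

```python
# Rough upper bound on tracked objects, using figures quoted in this post.
tapes = 55_000               # tapes to be digitized
max_objects_per_tape = 27    # "up to 27 objects" per source tape

upper_bound = tapes * max_objects_per_tape
print(f"{upper_bound:,}")    # 1,485,000
```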

Along the way, we’ve uncovered a few priceless gems, including Robert Frost reading a selection of his poems from WFCR, a Frank Zappa interview from KGNU, an Ayn Rand speech from WFCR, and film studies major and movie buff David Braught’s favorite: three tapes from KQED that were actually labeled as “Over Easy” programs. These three turned out to be interviews with film director Akira Kurosawa and a tribute to Japanese Cinema, which included interviews with Kurosawa, Coppola and Lucas. These tapes were thought to be lost. No longer thanks to the American Archive of Public Broadcasting!

Here are some other little factoids:

  • Tapes are being digitized 24 hours a day, five days a week and even some weekends to stay on schedule.
  • Thousands of ¼” reel-to-reel audio tapes and ¾” U-matic tapes have been baked.
  • The project will result in over 1 Petabyte of new data, 2 Petabytes with the copies.
  • We are just starting to tackle born digital. Our original data estimate for born digital was in the neighborhood of 6 TB of data. We now anticipate handling over 33,000 files, which will result in around 280 Terabytes of data.
  • To date, we’ve written over 1,000 LTO-5 data tapes.

We have thoroughly enjoyed working with all of the stations over the past year and a half. As we wind down this phase of the project over the next few months, we hope that the American Archive of Public Broadcasting continues to grow into what surely will become one of the most educational and culturally diverse archives in the country.