Announcing the Second Round of Public Broadcasting Preservation Fellows!

WGBH on behalf of the American Archive of Public Broadcasting is pleased to introduce our second cohort of fellows for the Public Broadcasting Preservation Fellowship (PBPF), a project funded by the Institute of Museum and Library Services (IMLS).

The PBPF supports students enrolled in non-specialized graduate programs to pursue digital preservation projects at public broadcasting organizations around the country. The Fellowship is designed to provide graduate students with the opportunity to gain hands-on experiences in the practices of audiovisual preservation; address the need for digitization of at-risk public media materials in underserved areas; and increase audiovisual preservation education capacity in Library and Information Science graduate programs around the country.

Over the summer semester of this year, each fellow will inventory, digitize, and catalog a small collection of audiovisual media; generate technical and preservation metadata; and process the digital files for ingest into the American Archive of Public Broadcasting. The fellows will collaborate with a faculty advisor at their university to complete a handbook that was drafted by the first Fellows, and develop a training workshop for fellow students in the autumn semester. The fellowship will also support a digitization station at each university for use by the fellows and future students enrolled at the universities.

Please welcome the members of our Summer 2018 PBPF cohort:

Fellow: Laura Haygood
Program: University of Oklahoma
Host Organization: Oklahoma Educational Television Authority
Host Mentor: Janette Thornbrue, Vice President of Operations, Oklahoma Educational Television Authority
Faculty Advisor: Susan Burke, Interim Director and Associate Professor, School of Library and Information Studies
Local Mentor: Lisa Henry, Curator/Archivist, Political Communication Center, Julian P. Kantor Political Commercial Archive

Laura Haygood is a graduate student in the University of Oklahoma’s Master of Library and Information Studies Program. She holds a Bachelor of Arts in History, and she has a background in instrumental music. She works as a Graduate Research Assistant in the Government Documents collection at OU’s Bizzell Library. Laura has volunteered her time at the Moore-Lindsay Historical House Museum, where she wrote an NEH Preservation Grant, as well as at her local public library and local school library. She will complete her MLIS in May 2019. Laura hopes to use this experience digitizing and preserving audiovisual materials to preserve oral histories in the future. Upon completion of her degree, she plans to seek employment in an archive or academic library. Wherever she ends up, Laura’s overarching professional goal is to connect people with the resources they need.

Fellow: Riley Eren Cox
Program: Clayton State University
Host Organization: Georgia Public Broadcasting
Host Mentor: Ellen Reinhardt, Radio Program Director, Georgia Public Broadcasting
Faculty Advisor: Josh Kitchens, Director, Master of Archival Studies Program
Local Mentor: Kathy Christensen, former VP of News, Archives and Research at CNN

Riley graduated from SUNY Fredonia in May 2017 with xir bachelor’s in History, minors in Anthropology and Museum Studies.  After interning at the Chautauqua Institution for a season in 2015, xe decided to pursue a career in archives.  Riley is currently enrolled in Clayton State University’s Master of Archival Studies program.  Xe will be ending xir time of employment at the Stuart A. Rose Manuscript, Archive, and Rare Book Library at Emory University this summer and is excited to see where this fellowship takes xir.

Fellow: Steve Wilcer
Program: University of North Carolina at Chapel Hill
Host Organization: WUNC
Host Mentor: Keith Weston, Web Producer and Back Porch Music Host, WUNC
Faculty Advisor: Helen Tibbo, Alumni Distinguished Professor, SILS
Local Mentor: Erica Titkemeyer, Project Director/AV Conservator, University of North Carolina at Chapel Hill

Steve Wilcer is a graduate student in the School of Information and Library Science at the University of North Carolina at Chapel Hill with a current focus in academic libraries and archives. He obtained his undergraduate degree in Music Performance and Composition at Western Illinois University in Macomb, Illinois and his first master’s degree in Musicology from the Ohio State University in Columbus, Ohio. His multifaceted background in music, research, and archival resources led him to explore and pursue library science and preservation, especially regarding audiovisual materials. In addition to music, he is also interested in history, literature, film, and electronic gaming.

Fellow: Tanya Yule
Program: San Jose State University
Host Organization: Center for Asian American Media in collaboration with the Bay Area Video Coalition
Host Mentor: James Ott, Director of Finance and Administration, Center for Asian American Media
Faculty Advisor: Alyce Scott, Lecturer, School of Information
Local Mentor: Jackie Jay, Preservation Technician, Bay Area Video Coalition

Tanya Yule is a current MLIS candidate at San José State University, focusing on archives and photography preservation; she received her BFA in photography from the San Francisco Art Institute, with a background in traditional darkroom methods and photomechanical printing. Tanya is an intern at the Hoover Institution Archives at Stanford University, and resides in San Francisco with her husband and adorable dog Otto.

Fellow: Eric Saxon
Program: University of Missouri
Host Organization: KOPN Community Radio
Host Mentor: Jacqueline Casteel, KOPN Community Radio
Faculty Advisor: Sarah Buchanan, Assistant Professor, Library and Information Science
Local Mentor: James Hone, Digital Archivist, University Libraries, Washington University in St. Louis

Eric Saxon is a graduate student in the School of Information Science and Learning Technologies at the University of Missouri – Columbia, where he is specializing in archives. His archival research/building interests include anything in danger of being forgotten by the collective memory, a predilection that has led to digital preservation efforts focusing on community centers, an outsider artist, and a WWII Monuments Man.  Eric holds a master’s degree in art history and graduate certificate in digital humanities from the University of Nebraska, and a bachelor’s degree in American studies from Stanford University.

Follow along on their digitization journeys by searching #aapbpf!

AAPB Transcription Workflow, Part 1

The AAPB started creating transcripts as part of our “Improving Access to Time-Based Media through Crowdsourcing and Machine-Learning” grant from the Institute of Museum and Library Services (IMLS). For the initial 40,000 hours of the AAPB’s collection, we worked with Pop Up Archive to create machine-generated transcripts, which are primarily used for keyword indexing, to help users find otherwise under-described content. These transcripts are also being corrected through our crowdsourcing platforms FIX IT and FIX IT+.

As the AAPB continues to grow its collection, we have added transcript creation to our standard acquisitions workflow. Now, when the first steps of acquisition are done, i.e., metadata has been mapped and all of the files have been verified and ingested, the media is passed into the transcription pipeline. The proxy media files are either copied directly off the original drive or pulled down from Sony Ci, the cloud-based storage system that serves americanarchive.org’s video and audio files. These are copied into a folder on the WGBH Archives’ server, and then they wait for an available computer running transcription software.

Dockerized Kaldi

The AAPB uses the Docker image of Pop Up Archive’s Kaldi running on many machines across WGBH’s Media Library and Archives. Rather than paying additional money to run this in the cloud or on a supercomputer, we decided to take advantage of the resources we already had sitting in our department. AAPB and Archives staff at WGBH who regularly leave their computers in the office overnight are good candidates for being part of the transcription team. All they have to do is follow instructions on the internal wiki to install Docker and a simple Macintosh application, built in-house, that runs scripts in the background and reports progress to the user. The application manages launching Docker, pulling the Kaldi image (or checking that you already have it pulled), and launching the image. The user doesn’t need any specific knowledge about how Docker images work to run the application. That app gets minimized on the dock and continues to run in the background as the staff member goes about their work during the day.* But that’s not all! When they leave for the night and their computer typically wouldn’t be doing anything, it continues to transcribe media files, making use of processing power that we were already paying for but hadn’t been utilizing.

*There have been reports of systems being perceptibly slower when running this Docker image throughout the day. It has yet to have a significant impact on any staff member’s ability to do their job.

Square application window that shows list of transcripts that have been processed
Application user-interface

Centralized Solution

Now, we could just have multiple machines running Kaldi through Docker and that would let us create a lot of transcripts. However, it would be cumbersome and time-consuming to split the files into batches, manage starting a different batch on each computer, and collect the disparate output files from various machines at the end of the process. So we developed a centralized way of handling the input and output of each instance of Kaldi running on a separate machine.

That same Macintosh application that manages running the Kaldi Docker image also manages files in a network-shared folder on the Archives server. When a user launches the application, it checks that specific folder on the server for media files. If there are any media files in that folder, it takes the oldest file, copies it locally, and starts transcribing it. When Kaldi has finished transcribing it, the output text and JSON-formatted transcripts are copied to a subfolder on the Archives server, and the copy of the media file is deleted. Then the application checks the folder again, picks up the next media file, and the process continues.
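To make that loop concrete, here is a minimal Python sketch of one pass through it. This is not the in-house application's code: the function names, the list of media extensions, and the `transcribe` callback (standing in for the call into dockerized Kaldi) are all hypothetical, and only the text output is written here to keep things short.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional

# Assumption: which extensions count as "media files" is not stated in the post.
MEDIA_SUFFIXES = {".mp4", ".mp3", ".wav"}


def oldest_media_file(folder: Path) -> Optional[Path]:
    """Return the media file with the oldest modification time, or None."""
    candidates = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.stat().st_mtime)


def process_next(inbox: Path, workdir: Path, outbox: Path,
                 transcribe: Callable[[Path], str]) -> Optional[Path]:
    """Run one pass: pick the oldest file, transcribe a local copy, save output.

    `transcribe` is a hypothetical stand-in for handing the file to Kaldi.
    """
    media = oldest_media_file(inbox)
    if media is None:
        return None
    # Work on a local copy so a dropped network connection does not stop the job.
    local_copy = Path(shutil.copy2(media, workdir / media.name))
    try:
        text = transcribe(local_copy)
    finally:
        local_copy.unlink()  # the local copy is always cleaned up
    outbox.mkdir(parents=True, exist_ok=True)
    transcript = outbox / (media.stem + ".txt")
    transcript.write_text(text, encoding="utf-8")
    media.unlink()  # a successfully processed file leaves the queue
    return transcript
```

Calling `process_next` repeatedly until it returns `None` drains the folder, oldest file first.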

Screenshot of a file directory with many .mp4 files, a few folders, and a few files named with base64 encoded strings
Files on the Archives server: the files at the top are waiting to be processed, the files near the bottom are the ones being processed by local machines

Avoiding Duplicate Effort

Now, since we have multiple computers running in parallel, all looking at the same folder on the server, how do we make sure that multiple computers aren’t duplicating efforts by transcribing the same file? Well, the process first tries to rename the file to be processed, using the person’s name and a base-64 encoding of the original filename. If the renaming succeeds, the file is copied into the Docker container for local processing, and the process on every other workstation will ignore files named that way in their quest to pick up the oldest qualifying file. After a file is successfully processed by Kaldi, it is then deleted, so no one else can pick it up. When Kaldi fails on a file, the file on the server is renamed to its original file name with “_failed” appended, and again the scripts know to ignore the file. A human can later go in to see if any files have failed and investigate why. (It is rare for Kaldi to fail on an AAPB media file, so this is not part of the workflow we felt we needed to automate further.)
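The rename-to-claim idea can be sketched in a few lines of Python. The exact naming scheme the application uses is not given in this post, so the `worker.encoded` layout and the function names below are assumptions for illustration. The sketch uses the URL-safe base-64 alphabet because standard base-64 can produce "/" characters, which are not valid in filenames, and it relies on the rename behaving atomically on the shared folder, which is also an assumption.

```python
import base64
import os
from pathlib import Path
from typing import Optional


def claimed_name(worker: str, original_name: str) -> str:
    """Build the in-progress name: worker name plus encoded original filename.

    Hypothetical layout. "." is the separator because it never appears in
    URL-safe base-64 output.
    """
    encoded = base64.urlsafe_b64encode(original_name.encode("utf-8")).decode("ascii")
    return f"{worker}.{encoded}"


def try_claim(media_file: Path, worker: str) -> Optional[Path]:
    """Try to claim a file by renaming it; return the new path, or None if lost."""
    target = media_file.with_name(claimed_name(worker, media_file.name))
    try:
        os.rename(media_file, target)
    except FileNotFoundError:
        # Another workstation renamed the file first, so this one moves on.
        return None
    return target
```

Whichever workstation renames the file first wins; every other workstation gets `None` back and looks for the next oldest file instead.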

Handling Computer and Human Errors

The centralized workflow relies on the idea that the application is not quitting in the middle of a transcription. If someone shuts their laptop, the application will stop, but when they open it again, the application will pick up right where it left off. It will even continue transcribing the current file if the computer is not connected to the WGBH network, because it maintains a local copy of the file that is processing. This allows a little flexibility in terms of staff taking their computers home or to conferences.

The problem starts when the application quits, which could occur when someone quits it intentionally, someone accidentally hits the quit button rather than the minimize button, someone shuts down or restarts their computer, or a computer fails and shuts itself down automatically. We have built the application to minimize the effects of this problem. When the application is restarted, it will just pick up the next available file and keep going as if nothing happened. The only reason this is a problem at all is that the file it was in the middle of working on is still sitting on the Archives server, renamed, so another computer will not pick it up.

We consider these few downsides to this setup completely manageable:

  • At regular intervals a human must look into the folder on the server to check that a file hasn’t been sitting renamed for a long time. These are easy to spot because there will be two renamed files with the same person’s name. The older of these two files is the one that was started and never finished. The filename can be changed to its original name by decoding the base-64 string. Once the name is changed, another computer will pick up the file and start transcribing.
  • Because the file stopped being transcribed in the middle of the process, the processing time spent on that interrupted transcription is wasted. The next computer to start transcribing this file will start again at the beginning of the process.
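The manual fix for an abandoned file, decoding the base-64 string and renaming the file back, could look something like the following Python sketch. It assumes a hypothetical `worker.encoded` naming layout with URL-safe base-64; the real application's naming scheme may differ, and the function names are made up for this example.

```python
import base64
from pathlib import Path


def original_name(claimed_filename: str) -> str:
    """Recover the original filename from a claimed name.

    Assumes the hypothetical layout "<worker>.<urlsafe base-64 of filename>".
    """
    _worker, separator, encoded = claimed_filename.rpartition(".")
    if not separator or not encoded:
        raise ValueError(f"not a claimed filename: {claimed_filename!r}")
    return base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8")


def release(claimed: Path) -> Path:
    """Rename an abandoned file back to its original name so it can be picked up."""
    restored = claimed.with_name(original_name(claimed.name))
    claimed.rename(restored)
    return restored
```

Once `release` has run, the file looks like any other waiting media file and the next free computer will start it again from the beginning.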

Managing Prioritization

Because the AAPB has a busy acquisitions workflow, we wanted to make sure there was a way to manage prioritization of the media getting transcribed. Prioritization can be determined by many variables, including project timelines, user interest, and grant deadlines. Rather than spending a lot of time to build a system that let us track each file’s prioritization ranking, we opted for a simpler, more manual operation. While it does require human intervention, the time commitment is minimal.

As described above, the local desktop applications only look in one folder on the Archives server. By controlling what is copied into that folder, it is easy to control what files get transcribed next. The default is for a computer to pick up the oldest file in the folder. If you have a set of more recent files that you want transcribed before the rest of the files, all you have to do is remove any older files from that folder. You can easily put them in another folder, so that when the prioritized files are completed, it’s easy to move the rest of the files into the main folder.

For smaller sets of files that need to be transcribed, we can also have someone who is not running the application stand up an instance of dockerized Kaldi and run the media through it locally. Their machine won’t be tied into the folder on the server, so they will only process those prioritized files they feed Kaldi locally.

Transforming the Output

At any point we can go to the Archives server and grab the transcripts that have been created so far. These transcripts are output as text files and as JSON files which pair time-stamp data with each word. However, the AAPB prefers JSON transcripts that are time-stamped at each 5-7 second phrase.

We use a script that parses the word-stamped JSON files and outputs phrase-stamped JSON files.
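As a rough illustration of what such a transformation involves (this is not the AAPB's actual script), here is a Python sketch. The key names `words`, `word`, `time`, `duration`, `parts`, `text`, `start_time`, and `end_time` are guesses based on the screenshot descriptions in this post, and the grouping rule is an assumption: a phrase is closed as soon as adding the next word would make it longer than 7 seconds.

```python
from typing import Any, Dict, List


def _close_phrase(words: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collapse a run of word entries into one phrase entry."""
    last = words[-1]
    return {
        "text": " ".join(w["word"] for w in words),
        "start_time": words[0]["time"],
        "end_time": last["time"] + last["duration"],
    }


def words_to_phrases(word_doc: Dict[str, Any],
                     max_seconds: float = 7.0) -> Dict[str, Any]:
    """Group word-stamped entries into phrases no longer than max_seconds."""
    parts: List[Dict[str, Any]] = []
    current: List[Dict[str, Any]] = []
    for word in word_doc["words"]:
        word_end = word["time"] + word["duration"]
        # Start a new phrase if this word would push the current one past the cap.
        if current and word_end - current[0]["time"] > max_seconds:
            parts.append(_close_phrase(current))
            current = []
        current.append(word)
    if current:
        parts.append(_close_phrase(current))
    return {"parts": parts}
```

This sketch only enforces the upper bound; it does not try to guarantee a 5-second minimum, so a long pause in the audio can produce a shorter phrase.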

Word time-stamped JSON

Screenshot from a text editor showing a json document with wrapping json object called words with sub-objects with keys for word, time, and duration
Snippet of Kaldi output as JSON transcript with timestamps for each word

Phrase time-stamped JSON

Screenshot from a text editor of JSON with a container object called parts and sub-objects with keys text, start time, and end time.
Snippet of transformed JSON transcript with timestamps for 5-7 second phrases

Once we have the transcripts in the preferred AAPB format, we can use them to make our collections more discoverable and share them with our users. More on that part of the workflow in Part 2 (coming soon!).

Five New Special Collections Now Available in the American Archive of Public Broadcasting!

Happy International Archives Day! The American Archive of Public Broadcasting (AAPB) is celebrating by launching five NEW Special Collections that feature raw interviews from American Experience’s Freedom Riders, The Murder of Emmett Till, John Brown’s Holy War, and Jubilee Singers, as well as the Peabody award-winning documentary Africans in America!

Now available online, these collections can be accessed at http://americanarchive.org/special_collections or in person at the Library of Congress and at WGBH, where they are preserved for future generations to learn about our nation’s history.

The AAPB, a collaboration between the Library of Congress and Boston public media station WGBH, has digitized and preserved more than 50,000 hours of broadcasts and previously inaccessible programs from public radio and public television’s more than 60-year legacy.

The AAPB invites you to spend the day (and every day) exploring the collections at americanarchive.org. Let us know what you discover by tagging us at @amarchivepub!

New Special Collections Summaries

Freedom Riders – http://americanarchive.org/special_collections/freedom-riders-interviews

The Freedom Riders Interview Collection contains 124 raw interviews from the American Experience documentary of the same name. The film documents the six-month period from May to November 1961, when white and black activists rode together on buses across the American South to protest the continued segregation of public buses and transportation facilities. The documentary chronicles the reality of the Freedom Riders’ experiences as they risked attack from white mobs and arrest by local police, and their success at calling attention to southern indifference to federal law and demanding enforcement of integrated interstate bus travel. The Freedom Riders interviews were conducted with activists and journalists who took part in the Freedom Rides, including John Lewis, a key player in the Civil Rights Movement and a member of the House of Representatives; Diane Nash, a coordinator for Freedom Riders in Nashville; Moses Newson, a journalist who covered the first Freedom Ride; John Seigenthaler, a Special Assistant to Robert F. Kennedy; and Genevieve Hughes Houghton, Congress of Racial Equality (CORE) field secretary on their Freedom Ride. Subjects discussed include the Supreme Court, the American South, Jim Crow, the Ku Klux Klan, violence, racism, segregation, CORE, and the Civil Rights Movement.

The Murder of Emmett Till – http://americanarchive.org/special_collections/the-murder-of-emmett-till-interviews

The Murder of Emmett Till Interviews Collection is made up of 40 raw interviews from the award-winning 2003 American Experience documentary, The Murder of Emmett Till. The film, which chronicles the story of Emmett Till, a 14-year-old who was murdered in 1955 after being accused of whistling at a white woman, follows Till’s life and transformation into an icon of the Civil Rights Movement. The Murder of Emmett Till interviews paint a picture of the Jim Crow South, the Mississippi community in which the murder took place, and contain intimate recollections by those who knew Emmett Till. Guests include family and friends of Emmett Till, including Mamie Till Mobley, Emmett Till’s mother and Civil Rights activist; and Wheeler Parker, Emmett Till’s cousin; as well as journalists, politicians, and witnesses, like Ernest Withers, a photographer known for his photos of the segregated South; Willie Reed, a witness who testified against Emmett Till’s murderers; and David Jordan, a Senator from Mississippi. Topics include segregation, Jim Crow, lynching and violence, the American judicial system, journalism, the American South, and the Civil Rights Movement.

John Brown’s Holy War – http://americanarchive.org/special_collections/john-brown-holy-war-interviews

The John Brown’s Holy War Interview Collection comprises 41 raw interviews conducted in 2000 for the American Experience film of the same name. The interviews examine the enigmatic life, history, myth, and legacy of abolitionist John Brown, one of the most controversial figures in American history. John Brown’s Holy War outlines John Brown’s life, role in the abolition movement, unsuccessful raid on the Harpers Ferry federal armory, death, and subsequent entry into American lore as both villain and martyr during the American Civil War. Interviews were conducted with historians, authors, and educators, including James Horton, Professor of American Studies and History at George Washington University; Paul Finkelman, historian of American law; Margaret Washington, historian and Professor of History at Cornell University; and Russell Banks, novelist. Interviews feature a range of topics, including abolition, philosophy, enslavement, race, Christianity, economics, mental health, journalism, the Dred Scott Decision, Frederick Douglass, Pre-Civil War American politics, the Harpers Ferry attack, and the American Civil War.

Jubilee Singers Interviews – http://americanarchive.org/special_collections/jubilee-singers-interviews

The Jubilee Singers Interviews Collection includes 19 raw interviews conducted in 2000 for the American Experience documentary Jubilee Singers: Sacrifice and Glory. The film focused on the early years of the Fisk Jubilee Singers, an ensemble of students from Fisk University in Tennessee who created the a cappella group in 1871 in an effort to raise funds for the financially struggling school. The original Fisk Jubilee Singers, largely made up of former slaves, toured around the United States, and, later, Europe, and were known for their performances of spirituals, which they are partially credited with preserving and introducing to a wider audience. Interviews were conducted with musicologists and historians, including John Hope Franklin, historian and recipient of the Presidential Medal of Freedom; Toni Anderson, Music Historian; Horace Clarence Boyer, musicologist and noted scholar of African-American gospel music; and Reavis L. Mitchell, Professor of History at Fisk University. Topics include spirituals and music, slavery, racism, religion, segregation, the American Civil War, and higher education, particularly historically black colleges and universities (HBCUs), and Fisk University.

Africans in America – http://americanarchive.org/special_collections/africans-in-america-interviews

The Africans in America Interviews Collection is made up of 53 raw interviews from the award-winning, four-part documentary of the same name, which aired on PBS in 1998. The documentary, the first to fully examine the history of slavery in the United States, focused on the experiences of African people and their transformation of America, beginning with 16th-century enslavement on Africa’s Gold Coast and ending on the eve of the American Civil War in 1861. The interviews offer an in-depth examination of the social, economic, and intellectual foundations of slavery and the ways in which African people changed the United States. Guests include descendants of slaves and slave-owners, authors, professors, historians, and statesmen, including Colin Powell, retired four-star general and the first African American on the Joint Chiefs of Staff; Karen Hughes White, a descendant of Thomas Jefferson and founder of the Afro-American Historical Association of Fauquier County; Catherine Acholonu, a Nigerian author and Associate Professor of English Literature, Awuku College of Education; and Jeffrey Leath, Pastor of Mother Bethel A.M.E. Church, Philadelphia. Topics covered include Christianity and English Protestantism, George Washington, Toussaint Louverture, the American Revolution, Nat Turner’s Rebellion, gender conventions, racism, violence, economics, family, and enslavement.

Special thanks to Lynn Mason of the WGBH Media Library and Archives’ Stock Sales and Licensing team for her tenacious work in digitizing the collection and Miranda Villesvik for ingesting the collections into AAPB.

Rebecca Benson, Public Broadcasting Preservation Fellow at KOPN

My name is Rebecca Benson, and I’m a graduate student at the University of Missouri, working on a Master’s in Library Science and focusing on work in special collections libraries. I am so excited for the experience I have gained working with the AAPB: I am familiar with much older materials, but the history of the past 100 years really demands broadcast media to be fully understood. The opportunity to work with AAPB and the materials from our local community radio station has expanded my archival horizons, and I look forward to sharing these materials and this history with researchers, as well as sharing this technology with other archivists.

The University of Missouri partnered with one of the local community radio stations to work on this project. KOPN has been broadcasting from the same office in downtown Columbia since it was founded in 1973, and I’m pretty sure some of the reels I digitized had not been touched since then. As one of the first open-access community radio stations, they have an amazing perspective on the history of the past several decades. The collection spans an incredible number of areas, from radio theatre to concerts to talk shows, from feminist, queer, indigenous, and otherwise marginalized voices. Working with Jackie Casteel, we decided to begin by digitizing the women’s programming, from the annual Women’s Weekend, the League of Women Voters, and the local Women’s Health collective, among others. Even within this subset, the programming ranges from interview shows with women in prison to a discussion from one of the first female dentists in the area. Every time I start a new reel, I learn something new and interesting about Columbia or the world, and I cannot wait for others to use this trove of information to begin doing research. I have benefited from the information myself: by chance, I digitized the 1986 League of Women Voters panel on hospital trustees a week before another hospital trustee election in town, which dealt with the hospital lease discussed in 1986!

As I have worked with these materials, I have found that this sort of archival work can re-unite communities and bring people together. Not only have I worked with the university and our initial contacts at the station, I have encountered numerous other people who are, or were, connected with programming that I have now heard. Working on the metadata for our programs led me to the State Historical Society, and their archives of broadcast lists. My time sorting reels at the station led to meeting with a woman who had run much of the radio theatre programming for decades. A chance mention of KOPN led to learning more about the alternative ‘zine community in Columbia, and its connection with the radio station. This project has shown me all the ways in which archival projects are more than just scholarly work, but a way to build and re-build communities.

Getting all of these reels digitized has been — and continues to be — a massive project. As a community radio station, KOPN did not have the most standardized procedures for recording, broadcasting, and documentation, which has led to some interesting moments at the work station. I’m still uncertain how someone managed to splice one tape inside out and backwards! On the other hand, all of these quirks are a result of the creative community that grew around KOPN, and without it, the history of the station would be much poorer. We are so excited to share this vibrant part of our local history with the world.

Written by Rebecca Benson, PBPF Spring 2018 Cohort

*******************

About PBPF

The Public Broadcasting Preservation Fellowship (PBPF), funded by the Institute of Museum and Library Services, supports ten graduate student fellows at University of North Carolina, San Jose State University, Clayton State University, University of Missouri, and University of Oklahoma in digitizing at-risk materials at public media organizations around the country. Host sites include the Center for Asian American Media, Georgia Public Broadcasting, WUNC, the Oklahoma Educational Television Authority, and KOPN Community Radio. Contents digitized by the fellows will be preserved in the American Archive of Public Broadcasting. The grant also supports participating universities in developing long-term programs around audiovisual preservation and ongoing partnerships with their local public media stations.

For more updates on the Public Broadcasting Preservation Fellowship project, follow the project at pbpf.americanarchive.org and on Twitter at #aapbpf, and come back in a few months to check out the results of their work.

Tanya Yule, Public Broadcasting Preservation Fellow at CAAM

 

Drives loaded up and ready to be sent to the AAPB!!

 

Hello, my name is Tanya Yule and I am one of the five fellows in the first cohort of the AAPB Public Broadcasting Preservation Fellows. Later this month I will be receiving my Master’s in Library and Information Science, and an advanced certificate in Digital Assets Management, from San José State University, with an emphasis in archives and preservation.

When I began the program at SJSU it was with a focus on photography preservation; this was initially a means of utilizing my background in historic photography practices as a way to protect and preserve images for future generations. However, through my work at the Hoover Institution Archives (where I am an intern), I began to fall in love with working in all areas of archives, not just with photographs, and have had the good fortune to process incredible collections that range from the Russian Revolution to the Vietnam War, each providing a unique glimpse of someone’s life that I get to describe, organize, and preserve for future generations. When the fellowship was posted, I had a “this was made for me” moment and applied instantly. I have wanted to work with A/V media for quite some time, and had not had the opportunity until now.

For the last three months I have been entrenched in material spanning the globe; each item as unique as the next, and giving me more in return than I was prepared for. As I am sitting here trying to tap out a structure and synthesis of what the heck just occurred during the American Archive of Public Broadcasting’s Preservation Fellowship, I am almost overwhelmed with the task.

 

Bay Area Video Coalition (BAVC) Set-up

 

What makes this particular fellowship special is the opportunity to work with at-risk magnetic media and multiple stakeholders, and to learn a very complex technique for capturing. I was fortunate to be able to work with two amazing San Francisco-based non-profit organizations that focus on representing arts and culture for underrepresented communities, and have been pillars in what they do for several decades. The collection I worked from came from the Center for Asian American Media (CAAM); CAAM isn’t a traditional archives, but their holdings are significant and represent a wide range of diverse films and documentaries, many of which have appeared on local and national PBS stations over the years. The collection contained U-matic, Betacam, and Digibeta tapes, many of which haven’t been viewed in decades. The majority of the fellowship was spent over at the Bay Area Video Coalition (BAVC), under the watchful (and extremely patient and knowledgeable) eye of Jackie Jay. I was fortunate to be able to have my experience take place with the help of staff who do this work daily, and could help me capture and learn in the best possible situation. I would like to also give a shout out to Morgan Morel for suffering through my lack of command-line knowledge; he has inspired me to take a Python class when this is all over.

What is in a name?

While inventorying the items for the collection at CAAM, I couldn’t help but be curious about some of the titles: Anatomy of a Springroll, Dollar a Day, 10 Cents a Dance, A Village Called Versailles, and Sewing Woman, to name a few. Since all of the items are on some form of video (magnetic media), it isn’t as easy as just popping a tape into a deck and taking a peek. While capturing in the dark room with my noise-cancelling headphones on, there were moments when I would literally laugh out loud, or cry. The subjects are heavy, as are the perspective and history; my work at the Hoover Archives had helped prepare me for dealing with difficult collections, especially when it comes to visual materials regarding war and atrocities.

 

Many videos have some form of image error. The “watermark” above is a blemish on an old tape, visible for only 1/30 of a second. After capturing, I would go back to any discrepancy to investigate further.

 

Cleaning, cleaning, and some baking!

I soon learned that the majority of my time was spent making sure that the decks and tapes were in tip-top shape before capturing. It is quite amazing how much time is spent cleaning tapes, cleaning the decks, baking tapes (in a really high-tech food dehydrator), re-cleaning tapes, and re-cleaning machines, as well as setting up levels and making sure that the item being digitized is as close to the original as possible. The cleaning ensures that there is no transfer of dust or debris from another tape, and that the output from the deck is precise. I am extremely fortunate to have had my digitization station at BAVC, as they understand the fundamentals of video preservation and digitization, and helped me learn more about the process than I thought I would be capable of in such a short time.

About the collection

As archivists, oftentimes we really don’t know what the collection is “about” until the end. There are usually surprises, and most of the time these records don’t come with a “read me” file, so I figured I would save this portion for the end as well. The collection as a whole speaks to the diversity of Asian American life, culture, and experiences, evoking the universal struggle of the human condition. When curating the featured films for the AAPB Special Collections page, it was difficult to choose. However, many of the films tell the history of women who have defied odds, been outspoken, or sacrificed so much for so little in return. I wanted to put these women up front and recognize their stories and the ones who decided to tell them.

 

CAAM Video Archive

 

Having this wonderful opportunity to participate in this fellowship while completing my degree allowed me to expand my technical and historical knowledge base, which I am forever grateful for. I would like to thank SJSU and my wonderful advisor Alyce Scott, James Ott and Davin Agatep at the CAAM for helping me out with the project, the entire preservation crew at BAVC for making sure I didn’t break anything, and of course the AAPB and all of the wonderful WGBH folks that made this fellowship happen.

If you are interested in learning more, here is a Q & A I did with CAAM when I started. You can also follow #aapbpf for photos of the stations and process.

 

 

Written by Tanya Yule, PBPF Spring 2018 Cohort

*******************

About PBPF

The Public Broadcasting Preservation Fellowship (PBPF), funded by the Institute of Museum and Library Services, supports ten graduate student fellows at University of North Carolina, San Jose State University, Clayton State University, University of Missouri, and University of Oklahoma in digitizing at-risk materials at public media organizations around the country. Host sites include the Center for Asian American Media, Georgia Public Broadcasting, WUNC, the Oklahoma Educational Television Authority, and KOPN Community Radio. Contents digitized by the fellows will be preserved in the American Archive of Public Broadcasting. The grant also supports participating universities in developing long-term programs around audiovisual preservation and ongoing partnerships with their local public media stations.

For more updates on the Public Broadcasting Preservation Fellowship project, follow the project at pbpf.americanarchive.org and on Twitter at #aapbpf, and come back in a few months to check out the results of their work.

AAPB Webinar Series with the Boston Library Consortium


This past March, the American Archive of Public Broadcasting (AAPB) hosted two webinars with the Boston Library Consortium. This two-part webinar series provided an overview of the AAPB and reviewed ways in which it can be effectively used as a resource for teaching and research.

Part I – “Accessibility of AAPB in Academic Libraries”
This webinar covered AAPB’s background, governance and infrastructure. Casey Kaufman, AAPB Project Manager, and Ryn Marchese, AAPB Engagement and Use Manager, discussed the scope, content and provenance of the AAPB collection; methods of searching, navigating, and accessing content in the AAPB; examples of the types of materials available in the AAPB collection, and the scholarly and research value of audiovisual collections and specifically public media archives.

Slides available: https://www.slideshare.net/RynMarchese/blc-webinar-part-1-accessibility-of-aapb-for-academic-libraries

 

In the second webinar, panelists Casey Kaufman (WGBH), Ingrid Ockert (Princeton University), and Mark Williams (Dartmouth College) explored specific use cases for librarians and researchers in accessing and making use of the AAPB collection. They included a general overview of how scholars and researchers are seeking to use digital AV collections, a brief recap of how AAPB provides access to its collection to researchers and the general public, incorporating AAPB into subject-specific LibGuides, use of audiovisual collections in traditional historical research and in academic coursework, and examples of how AAPB metadata and transcripts can be used in digital humanities research and data mining.

Slides available: https://blc.org/sites/default/files/BLC_Uploads/Part%20II_BLCs%20AAPB%20Webinars_Speaker%20Slides.pdf

 

Special thanks to Jessica Hardin and Susan Stearns of the Boston Library Consortium for helping organize this series!

Celebrate Women’s History Month by Preserving Women’s Voices in Public Media

One of the most fascinating aspects of the American Archive of Public Broadcasting (AAPB) is discovering how local broadcasting stations used their platforms to communicate national issues to local audiences.

As second-wave feminism gained momentum between 1960 and 1980, WNED from Buffalo, New York documented the movement’s ripple effect in a half-hour public affairs talk show series titled Woman. Syndicated by over 200 PBS stations from 1973 to 1977, Woman was the only year-round, national public television forum where a wide variety of national experts provided perspectives on the (then) evolving world of women’s history.

To celebrate this milestone in women’s public media history, the American Archive of Public Broadcasting (AAPB) launched a new Special Collection featuring the Woman series! Over 190 episodes are available online via the AAPB website: http://americanarchive.org/special_collections/woman-series.

Woman Series, WNED – Buffalo, NY (1973-1977)

The AAPB invites you to celebrate Women’s History Month by helping preserve and make accessible six Woman transcripts. We’re launching a demo version of our *NEW* transcript editor tool FIX IT+, a line-by-line editing platform initially developed by the New York Public Library. The six featured interviews include conversations with Gloria Steinem (editor and co-founder of Ms. Magazine), Dorothy Pitman Hughes (African American activist and co-founder of Ms. Magazine), Betty Friedan (author of The Feminine Mystique), Nora Ephron (editor for Esquire magazine and the author of the best-selling book Crazy Salad), Marcia Ann Gillespie (editor-in-chief of Essence Magazine and a board member of Essence Communications), Connie Uri, M.D. (on the National Board of Research on the Plutonium Economy and the advisory board of NASC, the Native American Solidarity Committee), and Marie Sanchez (Chief Judge of the Northern Cheyenne Tribe, member of the Indian Women United for Social Justice).

These transcripts will be made available online through the AAPB’s website, allowing women’s voices in public media to be more readily searchable and accessible for future generations.

Below are sample recordings of the six interviews mentioned above. Search the Woman Special Collection for more interviews with activists, journalists, writers, scholars, lawyers, artists, psychologists, and doctors, covering topics such as women in sports, the Equal Rights Amendment, sexuality, marriage, women’s health, divorce, the Women’s Liberation Movement, motherhood, and ageism, among others.

Direct link to FIX IT+: http://54.205.165.195.xip.io/

Sample Recordings of Featured Transcripts:

Connie Uri, M.D. and Marie Sanchez, Chief Judge of the Northern Cheyenne Tribe, FIX IT+ Transcript: http://54.205.165.195.xip.io/transcripts/cpb-aacip_81-67wm3fxh

Marcia Ann Gillespie, FIX IT+ Transcript: http://54.205.165.195.xip.io/transcripts/cpb-aacip_81-69z08t6x

Nora Ephron, FIX IT+ Transcript: http://americanarchive.org/catalog/cpb-aacip_81-988gttr0

Gloria Steinem, FIX IT+ Transcript: http://americanarchive.org/catalog/cpb-aacip_81-57np5qgv

Betty Friedan, FIX IT+ Transcript: http://americanarchive.org/catalog/cpb-aacip_81-9995xhm0

Dorothy Pitman Hughes, FIX IT+ Transcript: http://54.205.165.195.xip.io/transcripts/cpb-aacip_81-59c5b5nr

Written by Ryn Marchese, AAPB Engagement and Use Manager

Resources Roundup: AAPB Presentations from 2017 AMIA Conference


Earlier this month the American Archive of Public Broadcasting staff hosted several workshops at the 2017 Association of Moving Image Archivists (AMIA) conference in New Orleans. Their presentations on workflows, crowdsourcing, and best copyright practices are now available online! Be sure to also check out AMIA’s YouTube channel for recorded sessions.

THURSDAY, November 30th

  • PBCore Advisory Sub-Committee Meeting
    Rebecca Fraimow reported on general activities of the Sub-Committee and the PBCore Development and Training Project. The following current activities were presented:

      • PBCore Cataloging Tool (Linda Tadic)
      • PBCore MediaInfo updates (Dave Rice)
      • ProTrack integration (Rebecca Fraimow)
      • Updated CSV templates (Sadie Roosa)
      • PBCore crosswalks (Rebecca Fraimow and Sadie Roosa)

FRIDAY, Dec 1st

Archives that hold A/V materials are at a critical point, with many cultural heritage institutions needing to take immediate action to safeguard at-risk media formats before the content they contain is lost forever. Yet, many in the cultural heritage communities do not have sufficient education and training in how to handle the special needs that A/V archive materials present. In the summer of 2015, a handful of archive educators and students formed a pan-institutional group to help foster “educational opportunities in audiovisual archiving for those engaged in the cultural heritage sector.” The AV Competency Framework Working Group is developing a set of competencies for audiovisual archive training of students in graduate-level education programs and in continuing education settings. In this panel, core members of the working group will discuss the main goals of the project and the progress that has been made on it thus far.

Born-digital audiovisual files continue to present a conundrum to archivists in the field today: should they be accepted as-is, transcoded, or migrated? Is transcoding to a recommended preservation format always worth the potential extra storage space and staff time? If so, what are the ideal target specifications? In this presentation, individuals working closely with born-digital audiovisual content from the University of North Carolina, WGBH, and the American Folklife Center at the Library of Congress will present their own use cases involving collections processing practices, from “best practice” to the practical reality of “good enough”. These use cases will highlight situations wherein video quality, subject matter, file size and stakeholder expectations end up playing important roles in directing the steps taken for preservation. From these experiences, the panel will put forth suggestions for tiered preservation decision making, recognizing that not all files should necessarily be treated alike.

  • Crowdsourcing Anecdotes

How does the public play a role in making historical AV content accessible? The American Archive of Public Broadcasting has launched two games that engage the public in transcribing and describing 70+ years of audio and visual content comprising more than 50,000 hours.

 THE TOOLS: 

(Speech-to-Text Transcript Correction) FIX IT is an online game that allows the public to identify and correct errors in our machine-generated transcripts. FIX IT players have exclusive access to historical content and long-lost interviews from stations across the country.

AAPB KALDI is a tool and profile for speech-to-text transcription of video and audio, released by the Pop Up Archive and made available on GitHub at github.com/WGBH/american-archive-kaldi.

(Program Credits Cataloging) ROLL THE CREDITS is a game that allows the public to identify and transcribe information about the text that appears on the screen in so many television broadcasts. ROLL THE CREDITS asks users to collect this valuable information and classify it into categories that can be added to the AAPB catalog. To accomplish this goal, we’ve extracted frames from uncataloged video files and are asking for help to transcribe the important information contained in each frame.


SATURDAY, Dec 2nd

Digitized collections often remain almost as inaccessible as they were on their original analog carriers, primarily due to institutional concerns about copyright infringement and privacy. The American Archive of Public Broadcasting has taken steps to overcome these challenges, making available online more than 22,000 historic programs with zero take-down notices since the 2015 launch. This copyright session will highlight practical and successful strategies for making collections available online. The panel will share strategies for: 1) developing template forms with standard terms to maximize use and access, 2) developing a rights assessment framework with limited resources (an institutional “Bucket Policy”), 3) providing limited access to remote researchers for content not available in the Online Reading Room, and 4) promoting access through online crowdsourcing initiatives.


The American Archive of Public Broadcasting seeks to preserve and make accessible significant historical public media content, and to coordinate a national effort to save at-risk public media recordings. In the four years since WGBH and the Library of Congress began stewardship of the project, significant steps have been taken towards accomplishing these goals. The effort has inspired workflows that function constructively, beginning with preservation at local stations and building to national accessibility on the AAPB. Archivists from two contributing public broadcasters will present their institutions’ local preservation and access workflows. Representatives from WGBH and the Library of Congress will discuss collaborating with contributors and the AAPB’s digital preservation and access workflows. By sharing their institutions’ roles and how collaborators participate, the speakers will present a full picture of the AAPB’s constructive inter-institutional work. Attendees will gain knowledge of practical workflows that facilitate both local and national AV preservation and access.

As an increasing number of audiovisual formats become obsolete and the available hours remaining on deteriorating playback machines decrease, it is essential for institutions to digitize their AV holdings to ensure long-term preservation and access. With an estimated hundreds of millions of items to digitize, it is impractical, even impossible, that institutions would be able to perform all of this work in-house before time runs out.  While this can seem like a daunting process, why learn the hard way when you can benefit from the experiences of others? From those embarking on their first outsourced AV digitization project to those who have completed successful projects but are looking for ways to refine and scale up their process, everyone has something to learn from these speakers about managing AV digitization projects from start to finish.

How do you bring together a collection of broadcast materials scattered in various geographical locations across the country? National Educational Television (NET), the precursor to PBS, distributed programs nationally to educational television stations from 1954-1972. Although this collection is tied together through provenance, it presents a challenge to processing due to differing approaches in descriptive practices across many repositories over many years. By aggregating inventories into one catalog and describing titles more fully, the NET Collection Catalog will help institutions holding these materials make informed preservation decisions. By the project's conclusion, AAPB will publish an online list of NET titles annotated with relevant descriptive information culled from NET textual records that will greatly improve discoverability of NET materials for archivists, scholars, and the general public. Examples of specific cataloging issues, including contradictory metadata documentation and legacy records, inconsistent titling practices, and the existence of international versions, will be explored.
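
As a rough illustration of what aggregating inventories into one catalog can involve, the sketch below groups item lists from several repositories under a normalized title key so that inconsistently titled copies end up in the same entry. Everything in it is hypothetical: the repository names, the sample titles, and the normalization rule are assumptions made for the example, and the NET Collection Catalog project's actual workflow is not described in this post and may look quite different.

```python
import re
from collections import defaultdict

ARTICLES = {"a", "an", "the"}


def normalize_title(title: str) -> str:
    """Hypothetical rule: lowercase, drop punctuation, drop a leading or trailing article."""
    words = re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split()
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    if len(words) > 1 and words[-1] in ARTICLES:
        words = words[:-1]
    return " ".join(words)


def aggregate_inventories(inventories: dict[str, list[str]]) -> dict[str, dict]:
    """Group every repository's titles under one normalized catalog key."""
    catalog: dict[str, dict] = defaultdict(lambda: {"titles": set(), "held_by": set()})
    for repository, titles in inventories.items():
        for title in titles:
            entry = catalog[normalize_title(title)]
            entry["titles"].add(title)        # keep each title variant as found
            entry["held_by"].add(repository)  # record which repository holds a copy
    return dict(catalog)


# Made-up sample data, not real holdings
inventories = {
    "Repository A": ["The Example Program", "Sample Hour"],
    "Repository B": ["EXAMPLE PROGRAM, THE"],
}
catalog = aggregate_inventories(inventories)
```

Keeping the original title variants alongside the normalized key matters here, since the inconsistent titling is itself evidence a cataloger may want to review rather than overwrite.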


ABOUT THE AAPB

The American Archive of Public Broadcasting (AAPB) is a collaboration between the Library of Congress and the WGBH Educational Foundation to coordinate a national effort to preserve at-risk public media before its content is lost to posterity and provide a central web portal for access to the unique programming that public stations have aired over the past 70 years. To date, over 50,000 hours of television and radio programming contributed by more than 100 public media organizations and archives across the United States have been digitized for long-term preservation and access. The entire collection is available on location at WGBH and the Library of Congress, and almost 25,000 programs are available online at americanarchive.org.

Announcing ROLL THE CREDITS: Classifying and Transcribing Text with Zooniverse


Today we’re launching ROLL THE CREDITS, a new Zooniverse project to engage the public in helping us catalog unseen content in the AAPB archive. Zooniverse is the “world’s largest and most popular platform for people-powered research.” Zooniverse volunteers (like you!) are helping the AAPB in classifying and transcribing the text from extracted frames of uncataloged public television programs, providing us with information we can plug directly into our catalog, closing the gap on our sparsely described collection of nearly 50,000 hours of television and radio.


Example frame from ROLL THE CREDITS

The American people have made a huge investment in public radio and television over many decades. The American Archive of Public Broadcasting (AAPB) works to ensure that this rich source for American political, social, and cultural history and creativity is saved and made available once again to future generations.

The improved catalog records will have verified titles, dates, credits, and copyright statements. With the updated, verified information we will be able to make informed decisions about the development of our archive, as well as provide access to corrected versions of transcripts available for anyone to search free of charge at americanarchive.org.

In conjunction with our speech-to-text transcripts from FIX IT, a game that asks users to correct and validate the transcripts one phrase at a time, ROLL THE CREDITS helps us fulfill our mission of preserving and making accessible historic content created by the public media, saving at-risk media before the contents are lost to posterity.

Thanks for supporting AAPB’s mission! Know someone who might be interested? Feel free to share with the other transcribers and public media fans in your life!

Upcoming Webinar: Building AAPB Participation into Digitization Grant Proposals


Building AAPB Participation into Digitization Grant Proposals: Requirements, Recommendations and Workflows

Tuesday, December 12, 2017
12:00pm ET

Webinar Registration form: https://goo.gl/forms/lWWU5GgFkv09bNFi2
Direct meeting URL: http://wgbh1.adobeconnect.com/aapb_grant-proposals-1/

Curious about getting involved in the American Archive of Public Broadcasting (AAPB)?

Seeking information about the workflows and requirements for contributing digitized content and/or metadata to the AAPB?

Writing a grant proposal and want to explore collaborating with the AAPB to preserve copies of your digitized collections and/or provide an access point to your collections through the AAPB metadata portal?

Then this webinar is for you!

On Tuesday, December 12, 2017 at 12:00pm ET, the AAPB will host a webinar focused on grant writing for digitization and subsequent contribution of digital files and metadata to the AAPB.

By the end of this webinar, participants will gain an understanding of:

  • AAPB’s background and infrastructure,
  • how contributing to the AAPB could benefit your collection,
  • steps to becoming an AAPB contributor,
  • metadata and digital file format requirements and recommendations,
  • delivery procedures,
  • other workflows and considerations for contributing digital files and/or metadata to the AAPB, and
  • the value of your collection as part of a national collection and how to express that in a proposal.

Attendees will also receive advice on how to incorporate AAPB contribution into their CLIR Recordings at Risk (applications due February 9, 2018!), CLIR Digitizing Hidden Collections, or other grant proposal timelines and work plans.

Fill out this brief form to receive info about future webinars and to receive a webinar meeting invitation sent to your calendar: https://goo.gl/forms/lWWU5GgFkv09bNFi2

Anyone can join the webinar at this URL: http://wgbh1.adobeconnect.com/aapb_grant-proposals-1/

This webinar and future AAPB webinars are generously funded by The Andrew W. Mellon Foundation.
