Using Linked Data for the NET Collection Catalog

Who I Am

I am Chris Pierce, the Cataloger/Metadata Specialist for the American Archive of Public Broadcasting and the National Educational Television (NET) Collection Catalog project at the Library of Congress. The NET Collection Catalog Project is a collaboration between WGBH and the Library of Congress, funded by the Council on Library and Information Resources (CLIR). The project involves creating a national catalog of records that document the existence of, and robustly describe, titles distributed by NET, public media’s first national network. These titles are public media’s earliest content and among its most at-risk.

In addition to cataloging moving image material distributed by NET from the mid-to-late fifties to the early seventies, I am also working on a feasibility report on the implementation of linked data for the NET catalog.

Linked data? Huh?

What is linked data? The Wikipedia definition is “a method of publishing structured data so that it can be interlinked.” To put it simply, linked data is data that can be linked to other data, very much like how browsers manage hyperlinks.

Why would we want to implement linked data? There are several reasons:

  • AAPB/NET metadata contains valuable and largely undiscovered relationships that, when reused by others on the internet, can enhance the information already online.
  • It would open AAPB/NET metadata to web applications and make the metadata more discoverable and shareable on the web.
  • It would contribute to the sustainability of metadata creation for future cataloging at the AAPB with metadata that is more deeply connected to external metadata, which could then be reused for description of AAPB material.

Very often we talk about linked data being actionable, by which we mean that the data can be linked to other data through Uniform Resource Identifiers (URIs), which work like hyperlinks that direct the user to more information about the resource or property. Data designed to be interlinked in this way becomes a node in a traversable “web” of data. Thus, linked datasets are typically modelled as a graph rather than as relational or hierarchical structures. It is very common to see linked data visualized through this sort of image:

Image from The Oracle Alchemist

These links are structured through relationships expressed as triples. In the image above, these triples are represented in graph form, but they can also be serialized in machine readable code. In both the serialization and the graph, these triples are logical statements:

This person [has]realName Stephen King

This person hasTwitter @StephenKing

@StephenKing hasContent [pictures of his dog Molly aka Thing of Evil]

A triple is simply a relationship between a subject and an object communicated through a predicate:

SUBJECT——PREDICATE——OBJECT
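As a rough illustration, the three statements above can be held in memory as (subject, predicate, object) tuples. This is a minimal Python sketch and not RDF itself: the node labels and the `objects_of` helper are made up for the example.

```python
# Each statement is a (subject, predicate, object) tuple.
# The labels are illustrative placeholders, not real identifiers.
triples = [
    ("person", "hasRealName", "Stephen King"),
    ("person", "hasTwitter", "@StephenKing"),
    ("@StephenKing", "hasContent", "pictures of his dog Molly"),
]


def objects_of(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Follow two edges of the graph: person -> Twitter account -> content.
for account in objects_of("person", "hasTwitter"):
    print(account, objects_of(account, "hasContent"))
```

Because the object of one triple can be the subject of another, following predicates from node to node is what makes the data traversable as a graph.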

The data model that supports the exchange of data structured in this way (as a web of interlinked nodes connected through relationships expressed as triples) is the Resource Description Framework (RDF). RDF can be semantically structured through specifications that define what types of data are being modelled. For instance, RDF Schema (RDFS) is a data modelling vocabulary that can be used to define classes and possible relationships between classes. BIBFRAME is another vocabulary that is being developed by the Library of Congress to represent library bibliographic metadata in RDF. Another example is EBUCORE, a vocabulary designed by the European Broadcasting Union to support linked data in various stages of the life cycle of broadcasting material, including production, business, and archives. Vocabularies such as these are central to having every object, subject, and predicate defined and expressed as Uniform Resource Identifiers (URIs) rather than literal string values (strings that are not actionable through links), and they expand upon the types of things that can be described as linked data (at various levels of granularity).
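The difference between a URI and a literal string may be easier to see when the triples are written out. The sketch below serializes two triples in the style of N-Triples, a line-based RDF serialization. Every URI here is a hypothetical placeholder under example.org, not a term from BIBFRAME, EBUCORE, or any AAPB dataset, and the escaping is simplified.

```python
# Hypothetical URIs for illustration only; example.org is a reserved domain.
EX = "http://example.org/"


def term(value):
    """Format one term in N-Triples style: <URI> or "literal"."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    # Simplified escaping: only backslashes and double quotes are handled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    # The object is a URI, so a machine can follow it to learn more.
    (EX + "program/1", EX + "vocab/distributedBy", EX + "org/NET"),
    # The object is a literal string: readable, but a dead end for linking.
    (EX + "program/1", EX + "vocab/title", "An Example Program"),
]

print(to_ntriples(triples))
```

In practice an RDF library would handle serialization; the point here is only that a URI object can itself be the subject of further statements, while a literal cannot.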

This framework of linked data advances the principles proposed by Tim Berners-Lee as the foundation of linked data:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF)
  4. Include links to other URIs, so that they can discover more things.

The NET project

The feasibility report on which my colleagues at the Library of Congress and I are working will focus on records generated through the NET catalog project (where I spend the majority of my day cataloging). We catalog these records in our content management system, MAVIS. MAVIS outputs the data to MAVISXML, which is a hierarchically structured format for representing metadata. We are looking at ways to transform MAVISXML to PBCORE (the XML schema in use by AAPB) and then to RDF linked data. We are examining existing technologies, vocabularies, and workflows, and identifying other problems we need to solve. The results of this research will be a benefit not only to the AAPB, but also to other cultural heritage institutions and the public broadcasting community making efforts to implement linked data. I am currently in the “literature review” stage of the linked data research. Look forward to future posts about our process!
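To give a feel for the second half of that pipeline, here is a minimal sketch of turning a PBCore record into triples. It is an assumption-laden illustration, not the project's workflow: the sample record, the `pbcore_to_triples` helper, the example.org base URI, and the choice of Dublin Core terms as predicates are all invented for the example. The MAVISXML step is left out because its element names are not described here.

```python
import xml.etree.ElementTree as ET

PBCORE_NS = {"pb": "http://www.pbcore.org/PBCore/PBCoreNamespace.html"}
DCTERMS = "http://purl.org/dc/terms/"

# Made-up sample record; the identifier and titles are placeholders.
SAMPLE = """\
<pbcoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
  <pbcoreIdentifier source="example">record-0001</pbcoreIdentifier>
  <pbcoreTitle titleType="Series">An Example Series</pbcoreTitle>
  <pbcoreTitle titleType="Episode">An Example Episode</pbcoreTitle>
</pbcoreDescriptionDocument>
"""


def pbcore_to_triples(xml_text, base_uri="http://example.org/record/"):
    """Map a PBCore record to (subject, predicate, object) tuples.

    The mapping to Dublin Core terms is an assumption made for this sketch.
    """
    root = ET.fromstring(xml_text)
    identifier = root.findtext("pb:pbcoreIdentifier", namespaces=PBCORE_NS)
    # Mint a hypothetical URI for the record so it can act as a subject.
    subject = base_uri + identifier
    triples = [(subject, DCTERMS + "identifier", identifier)]
    for title in root.findall("pb:pbcoreTitle", PBCORE_NS):
        triples.append((subject, DCTERMS + "title", title.text))
    return triples


for triple in pbcore_to_triples(SAMPLE):
    print(triple)
```

Note that this flattening drops the `titleType` attribute, which hints at one of the problems a real mapping has to solve: deciding which vocabulary terms carry the distinctions the hierarchical record already makes.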

This post was written by Chris Pierce, AAPB and NET Cataloger/Metadata Specialist.

A Day In the Life of NET

Hi there! We’re part of the National Educational Television (NET) collection at the Library of Congress’s National Audiovisual Conservation Center (NAVCC) – maybe you’ve heard of us? Recently, the Council on Library and Information Resources (CLIR) funded the AAPB to complete the NET Collection Catalog Project, whereby some nifty catalogers are working to create fabulous descriptions of programs distributed by NET (1952-1972, which makes up some of the earliest public television content!). People know so little about us because, up until now, we’ve been stored in unprocessed collections! So we’re looking to get makeovers, too. We are happy here, NAVCC has optimal storage facilities for us – we’re stored at a cool 50 degrees with 30% relative humidity – but we would like it if people could find us more easily.

To give you a better idea of just what processing a film title in the collection entails, we’re going to give you an inside look. The first part of our journey? Getting pulled from the stacks, of course! When we’re pulled, we make our way down from the shelves, onto an obliging cart, and are rolled out of the vaults. Yippee!

But because we like it chilly, we don’t appreciate temperature shock. So we get wheeled into the acclimatization room, where we can get adjusted to the new climes.

After gradually thawing out, we get picked up. Today’s the big day, we’re getting processed today!

Quick, time to make a break for it!!

We find our way here, to a work bench, where the magic happens.

All right, Mr. DeMille, I’m ready for my close-up. We get pulled, one by one from the cart. But you can find a lot of great metadata on us, so all that info gets written down first for input into our collection database system later.

Sometimes when you open us up, there’s a prize inside! No, not of the Cracker Jack variety – these prizes come in the form of broadcast histories and/or condition assessments. They get re-foldered and stored safely away, too, but hey, this is about us, the NET film!

We get placed up on the spindle, ready to wind! (Good thing Sleeping Beauty isn’t a film archivist, whew.)

We’re going to transfer from an old reel onto a slick, plastic “core.” The core (you can see cores stored in the boxes below the bench) is fixed inside the split reel on the right.

When we’ve been wound through, I end up on the right now, wrapped around a core.

How embarrassing! Look away!

Like a beautiful butterfly, now that we’ve been transformed, we shed the old reel and accompanying film can (that is, they are promptly disposed of).

Ouch!

I’m then rehoused into a – blue, blue, ‘lectric blue (that’s the color of my room) – plastic can.

And I’m taken over to a computer, to complete my cataloging in the collection database system MAVIS.

And now for my favorite part! I get labeled with a Library of Congress item barcode, new rack number, and a snazzy title label so people can find me again!

Now I’m all set! Ahhh 1331 – I’ve always liked the sound of a palindrome. Now I’m headed back to the vaults to get some well-deserved shut-eye. Later!

This post was written by Susie Booth, NET Cataloger at NAVCC, on behalf of the NET film.

Louisiana Public Broadcasting Digital Preservation Plan

In 2015, the Institute of Museum and Library Services awarded a generous grant to WGBH on behalf of the American Archive of Public Broadcasting (AAPB) to develop the AAPB National Digital Stewardship Residency (NDSR). Through this project, we have placed seven graduates of master’s degree programs in digital stewardship residencies at public media organizations around the country. This post was written by resident Eddy Colloton, who has just concluded his residency project with the completion of a digital preservation plan for Louisiana Public Broadcasting, his host institution:

I have completed my NDSR AAPB residency at Louisiana Public Broadcasting (LPB)! While most of my cohort will continue chugging along for another few months, I sadly have to finish up a bit early. But, I’m leaving for an exciting opportunity in the conservation department at the Denver Art Museum. I’m feeling good about the time I have spent here at LPB, and the work that we have accomplished. I may even chime in on the ol’ AAPB NDSR blog again down the line, once I’ve had time to lean into some post-residency navel gazing.

Please find my primary deliverable to LPB, the LPB Digital Preservation Plan, below. The objective of the plan was to document the station’s current digital preservation procedures, and to make recommendations for improvement. The plan discusses the benefits of creating MediaInfo and MediaConch reports, as well as fixity checks, and how to apply those tools in a production environment. The plan also describes the benefits of using uncompressed and lossless codecs for the preservation of analog video, the methodology and strategy behind planning for LTO tape generation migrations, the importance of collecting production documentation in audiovisual archiving, and much more. While the policies and procedures described in the plan are specific to LPB, I think that there’s certainly information to be gleaned from the plan whether you are working in a public broadcasting archive or not.
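For readers who have not run a fixity check before, the idea can be sketched in a few lines of Python. This is a generic illustration and is not taken from the LPB plan: the choice of SHA-256, the manifest layout, and the `sha256_of` and `check_fixity` helpers are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Return the paths that are missing or no longer match their checksum.

    `manifest` maps a file path to the checksum recorded when it was ingested.
    """
    return [
        path
        for path, expected in manifest.items()
        if not os.path.exists(path) or sha256_of(path) != expected
    ]


# Demonstration on a throwaway file standing in for a preservation master.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"preservation master")
    manifest = {sample: sha256_of(sample)}
    print(check_fixity(manifest))  # empty list: nothing has changed yet

    with open(sample, "ab") as handle:
        handle.write(b"!")  # simulate silent corruption
    print(check_fixity(manifest))  # now lists the altered file
```

The value of the check comes from recording the checksum once, at a moment the file is known to be good, and then rerunning the comparison on a schedule and after every copy or migration.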

I want to offer my thanks to everyone at LPB for being so welcoming to a stranger from the north, and for helping me with so many aspects of my project. I want to offer a special thanks to my host mentor, Leslie Bourgeois, who in spite of having a very difficult year due to the historic flooding that occurred in Baton Rouge in August of 2016, has been supportive and encouraging of my work here at LPB. I would also be remiss if I didn’t thank Rebecca Fraimow, the NDSR program coordinator, for constantly being there for me and the rest of the cohort over the last 7 months. And of course a very special thanks to my NDSR cohort for letting me ask them questions, vent to them about my struggles, and allowing me to share a barrage of my dumb jokes. I wish you all the best. AAPB NDSR 4 lyfe!

Download the Louisiana Public Broadcasting Digital Preservation Plan

PBS NewsHour Digitization Project Update

In January 2016, the Council on Library and Information Resources awarded WGBH, the Library of Congress, WETA, and NewsHour Productions, LLC a grant to digitize, preserve, and make publicly accessible on the AAPB website 32 years of NewsHour predecessor programs, from October 1975 to December 2007, that currently exist on obsolete analog formats. Described by co-creator Robert MacNeil as “a place where the news is allowed to breathe, where we can calmly, intelligently look at what has happened, what it means and why it is important,” the NewsHour has consistently provided a forum for newsmakers and experts in many fields to present their views at length in a format intended to achieve clarity and balance, rather than brevity and ratings. A Gallup Poll found the NewsHour America’s “most believed” program. We are honored to preserve this monumental series and include it in AAPB.

Last week, our contract archivist Alexander (AJ) Lawrence completed the inventory of 7,320 NewsHour tapes stored in 523 boxes located in WETA’s storage units in Arlington, Virginia, comprising the bulk of the collection. (Additional content is located at two other locations.)

“I was so excited to receive Casey’s initial email asking about my interest in the NewsHour project. I’ve been a lifelong watcher of the program and the chance to be involved in the preservation of such a valuable resource for historical research seemed like a wonderful opportunity.

The process of inventorying the entire collection seemed pretty daunting on my first day when I got my first in-person look at the storage units housing the estimated 7,500 tapes. However, the process has gone quite smoothly overall and we’ve now surpassed the halfway point. Generally, the tapes have little more than a date to identify them, but it’s been especially interesting to come across the tapes for significant historical events over the past 40+ years. These tapes in particular offered me a chance to reflect on some major cultural milestones I’ve witnessed, often through coverage by the NewsHour team. That said, it was also fun to come across the broadcast that aired on the day I was born, as well as the very first broadcast of The MacNeil/Lehrer NewsHour.

Thankfully, I haven’t been tackling the entire inventory alone. I need to offer a special thanks to Matthew Graylin, a desk assistant with the NewsHour who’s been tasked with assisting me with the work. Needless to say, conducting an archival inventory is well beyond the normal duties of a broadcast news assistant, but Matthew has dived in with gusto. We still have a few weeks together, so hopefully I can convert him into a future audiovisual archivist in that time.”

We have also selected a digitization vendor for the project and are looking to begin pilot tests for digitization within the next month. Meanwhile, the Library has instituted quality control procedures to ensure that all digitized files will be properly preserved for present and future generations.

We can’t wait to get started with digitization and look forward to making this monumental series accessible as part of the AAPB collection. In the meantime, we’re pleased to share this clip reel sampling of content that will be digitized, courtesy of NewsHour Productions.

Meet Lily Troia, AAPB Cataloging Intern & Public Media Junkie

The following is a guest post by Lily Troia, AAPB Cataloging Intern.

Exploring the WGBH Vault!

Hi. My name is Lily Troia and I am a public media junkie. I will admit, it is a bit of a problem. The first thing I do when traveling to any new town is find the local radio affiliate for my fix of daily news. I frequently cry along to This American Life, sit in my parked car laughing hysterically to Wait, Wait Don’t Tell Me’s antics, and I am certain Antiques Roadshow curtailed more than one family fight over the remote during my childhood.

I blame my mom and dad, ultimately, for a northern Wisconsin upbringing entrenched in public media. In the expanse of the rural Northwoods, commercial radio and static occupied most of the airwaves, with one local NPR-affiliate, WOJB, broadcast off a nearby Ojibwe reservation, serving as a beacon of independent thought and music for our small community. Cable was a luxury not yet accessible to remote country residents in the 1980s, and since my back-to-the-lander family couldn’t entertain the idea of a satellite dish, our viewing options included only NBC and PBS, with the occasional blurry-screened ABC when snowmobile traffic was reduced (seriously). Thus, I was the kid carrying my parents’ Wisconsin Public Television member tote bag to the summer pool, raised on a diet of Sesame Street, Square One, and 3-2-1 Contact in an era of Nickelodeon.

Decades later I found myself collaborating professionally with Minnesota Public Radio and Twin Cities Public Television on a regular basis. A classical music performer throughout my youth, I studied ethnomusicology at Northwestern University, yet felt disconnected from the cloistered world of academia, and eventually turned my musical interests to the business world. While running my own music management firm in Minneapolis, I produced numerous live and recorded projects, and frequently contributed content to MPR as a music and arts culture commentator. These experiences further solidified my lifelong love of and dedication to public media. Now back in school, pursuing a Masters in Library and Information Science at Simmons College, I have the unique opportunity to apply my music and humanities background in the arena of preservation and access, synthesizing my passion for scholarship and public service.

Life occasionally delivers instances of perfect serendipity; joining the American Archive of Public Broadcasting feels like such an instance. It truly is a professional dream to work on such a socially vital, dynamic project. Already in my brief time cataloging archival content from member stations across the country, I have learned about an influx of Mexican immigrants to Wyoming in the 1990s, listened to a decades-old KUT broadcast featuring Eliza Gilkyson, and discovered that Oregon hipster culture began long before Portlandia, in the form of a 1985 municipally-sponsored beard-growing contest. In a time when public media is forced to fight for basic funding–my Wisconsin stations are currently facing potential demise–ensuring the longevity and availability of this immeasurably valuable, cultural material has never been more important. What an inspiration to be at an organization like WGBH, committed to protecting and providing access to these historical gems that document our diverse American stories.

As Seen on TV: An exploratory glimpse into the archives of the AAPB

The following is a guest post by Ingrid Ockert, a doctoral student at Princeton University studying the history of science. Currently, she’s gathering material for her dissertation, which will be on the history of science educational television. Follow her on Twitter @i_rockt.

Part I

Back in January, while I was furiously planning my dissertation travel for the upcoming semester, I needed to compile a list of archives. Immediately. I wanted to plan a series of trips to archives holding television production materials, but I didn’t know where to start my search. My only option was to cold-call archives. I hoped that some friendly archivist would take pity on a poor graduate student and let me into their collection.

My timing could not have been more fortuitous; one of the first people I emailed was Casey Davis, the amazing project manager at the American Archive of Public Broadcasting. Casey exemplifies the AAPB; she’s a friendly librarian dedicated to opening access to public broadcasting materials and to connecting researchers with archivists. At the time, the American Archive of Public Broadcasting had a basic webpage (they have since launched a beautiful website). Casey generously helped me get into touch with the archivists at WGBH. She also suggested other contacts for me within the AAPB network.

Trip to Boston

A few months later, I was on a train headed to Boston to visit WGBH. The glistening glass building that houses WGBH instantly wowed me. Keith Luf, the head of the archives, met me in the foyer of the building. He graciously gave me a tour of the building, allowing me to glimpse the studios and the offices of NOVA, Masterpiece, Antiques Roadshow, and American Experience. As a longtime fan of PBS, I was thrilled at the chance to walk along the same halls as the people who create these amazing programs.

I researched the history of one of these programs, NOVA, for the next two weeks. Premiering in 1974, NOVA is the longest running science television program in the United States. Luckily for me, WGBH has files related to the history of the program that stretch back to the earliest discussions of the program in 1973.

One of the highlights of my travel was simply poring over files upon files of material. Or taking tea breaks from my researching and gazing out at Boston’s skyline. But just as valuable were my chats with Keith about the history of WGBH, Leah Weisse about the management of the collection, and Casey about the future of the AAPB. I am so grateful for the amount of time that they took talking with me!

Best hidden gem at WGBH? I spent a lot of time hanging out in the ‘viewing room’ and watching old episodes of NOVA. This room was a goldmine for researchers like myself – it had a working U-matic cassette player! And the best part? Leaning against the back wall was a vintage ‘Edward Gorey’, a life size sketch of a bat hanging from a bird perch by the artist Edward Gorey for the television program “Mystery!” It was just one of the many interesting artifacts in the WGBH collection.

Part II

Trip to College Park

In May, I was on another train, this time bound for College Park, Maryland. This time, I visited another archive that participates in the AAPB, the National Public Broadcasting Archives at the University of Maryland. Chuck Howell was my main contact (he specializes in the history of mass media). The National Public Broadcasting Archives is home to many interesting collections related to the history of television, including the papers for National Public Radio and the Corporation for Public Broadcasting. I was there to peruse correspondence, memos, and publicity held within the Children’s Television Workshop’s papers. What does the Children’s Television Workshop have to do with science? In the 1970s, the creators of Sesame Street and the Electric Company wanted to create a daily science series for children. The resulting program was 3-2-1 Contact!

I spent less than a week at the University of Maryland, but I was just as impressed with the collection as I had been with WGBH. Michael Henry, another archivist specializing in broadcast journalism, greatly helped to familiarize me with the collection. As at WGBH, I discovered that I learned a lot simply by talking to the archivists. On my last day, Michael showed me the Broadcasting Reading Room on the library’s second floor. The Reading Room was an impressive space, lined with several dramatic murals from the 1940s, each extolling the virtues of an age of radio and television. Radios, record players, and televisions – each restored to impeccable condition – lined the walls. Wandering up to each item and peering at it, I felt like a kid in a candy shop! One of my favorite artifacts in this collection was a German entertainment system from the 1950s that included a recordable tape deck and turntable. Additionally, it was housed in a beautiful wooden cabinet – perfect for what must have been a top-of-the-line luxury item.

I’ve been really lucky to be able to research in such wonderful collections. I’m grateful that nonprofit and government institutions like WGBH and the University of Maryland are equally committed to providing open access to historians and researchers; I applaud the AAPB on their mission to heighten the public’s awareness of historic public media. Hopefully, through my own research, I can also contribute to a greater cultural appreciation of the history of public broadcasting.

Digital Preservation for Public Broadcasting Webinar Recording is Available!

The following is a guest post by Rebecca Fraimow, National Digital Stewardship Resident at WGBH and the AAPB.

As the National Digital Stewardship Resident with WGBH and the AAPB, I’ve backed up a lot of drives, designed a lot of workflow diagrams, and written up a lot of documentation, but for my final deliverable for the residency, I got to do something with a slightly broader focus: create a webinar that focused on digital preservation concepts through the lens of the unique needs of a public broadcasting organization.

Rebecca Fraimow is the NDSR resident at WGBH and the AAPB.

Although I’ve spent most of the past year in a public media context, WGBH is pretty unique among public media organizations: we have a strong archival department, and a dedicated budget for preservation. That gives us a lot of opportunities to invest in tools and techniques that most public media organizations aren’t going to have. As a result, creating a webinar about digital preservation best practices from a PB perspective is not as simple as saying ‘here’s what we do and why we do it’ – while it would be great if all stations had the same level of resources, just getting that level of buy-in is something that most archivally-minded station employees have to fight really hard to make a case for.

Therefore, instead of designing the webinar based around our workflows at WGBH, I sent out an open call for topics to see what the audience of (primarily AAPB) stations really wanted to hear about. I got a wide range of responses:

– where to start when creating a digital library
– best practices for migrating videotape to digital files
– how to manage the volume with a small staff
– tools for embedding metadata into audio and video files
– systems for small organizations with little IT support
– integrity checking, video file standards, naming conventions
– funding
– getting producers onboard from the get-go
– how to go back into the archives where proper documentation doesn’t exist
– how to properly use the PBCore field called instantiationStandard

Obviously, I don’t have the answer to all these questions (to be honest, instantiationStandard is kind of a confusing field) and, of course, for many of them, there is no right answer. As I can tell you from the experiences of my entire NDSR cohort, even organizations with huge dedicated preservation departments are still trying to figure out the solutions that make the most sense for them. Next year, the AAPB will be sending a new crop of NDSR residents into public media stations to help grapple with some of these issues, but before finding answers, the first step is figuring out the right questions to ask. The webinar is designed to provide a guide to some of those questions, and an overview of the issues to consider when making a case for digital preservation.

You can view the full webinar below (click on the title to open in a larger screen):

Digital Preservation for Public Broadcasting from American Archive on Vimeo.

The slides are available here:

http://www.slideshare.net/RebeccaFraimow/digital-preservation-for-public-media

Accessing Historic Public Media: The Perspective of a Researcher

The following is a guest post by Jessica Brandt, PhD candidate at Drew University.

At the beginning of April, I had the pleasure of being one of the first researchers to visit the American Archive of Public Broadcasting at the Library of Congress. I am a doctoral student researching non-commercial radio during the Cold War, and I had happened across the AAPB blog while on a quest for records relating to the 1981 production of Star Wars as a radio play for NPR. Shortly after submitting the “Contact Us” form, I received a reply from Sadie Roosa with a few possible assets for me to look into.

Jessica Brandt, PhD candidate at Drew University. One of the first researchers to access the recently digitized AAPB collection, Brandt visited the Library of Congress to listen to digitized recordings as part of her dissertation research.

At the time, the digitized media wasn’t available to stream through the website yet, so I had to arrange a site visit. With the help of Casey Davis at WGBH and Alan Gevinson at the Library of Congress, I was able to set up a visit in no time. Once there, I found the interface easy to navigate and the quality of the audio was excellent. I was also able to poke around more of the archive, including assets that had not yet been digitized, and I left with more leads to pursue.

Every step along the way, I found the people involved with the AAPB to be responsive and helpful, eager to make my research experience successful. And that brings me to a broader observation about the world of public media: in exploring the story behind this alternate Star Wars, I’ve had occasion to contact public radio stations across the country, and without exception, every one has responded as fully as possible. In each case, if they couldn’t offer any archives or records, they made suggestions of other places to look, and offered to make introductions where necessary. Very few fields have such a way of feeling so tight-knit and collegial.

I’m sure this is preaching to the choir, but I can’t overstate the value of digitizing these public media assets. Organizations like the Paley Center have done a great job with commercial television, in particular, but so much of the product of public radio and television stations languishes in the limited storage facilities of those stations, scattered around the country. That’s if it has been preserved at all. The nature of funding makes it unlikely that any but the largest stations in major markets will have any staff dedicated to managing their archives locally. So the service that the AAPB has to offer is opening a new world for people like me, who spend our time studying the public airwaves of the past. Take a look (or a listen) for yourselves: untold treasures await!

Archival discoveries and collaboration at Minnesota Public Radio

The following blog post was written by Margaret Bresnahan from Minnesota Public Radio.

I’m writing to share the next installment in the American Archive success story. Thanks to the cataloging done during the American Archive inventory project, Minnesota Public Radio was able to identify about 900 MPR News stories covering the Hmong settlement in Minnesota, with recordings dating from 1975 to the present day. This discovery led to a collaboration with the Minnesota Historical Society (MNHS), informing an exhibition/celebration that launches this month (March 2015), and it led to new broadcasts from the MPR News Room.

Marking the 40th anniversary of the first large-scale arrival of Hmong people in Minnesota, MPR News recently launched a Hmong collection page and broadcast a few news stories–all using archive recordings to tell the story of Hmong-Minnesotans. Two of our main collaborators in the News Room plan on continuing the coverage throughout the year, bringing more archive recordings on air and online. This is a wonderful example of the power of access. The inventory made it clear that these recordings existed and enabled this great use of archive material to tell a contemporary, ongoing story.

Here are some links to the archive usage, and more are to come: