How to FIX IT+ and Why: Crowdsourcing to Save Public Media

The American Archive of Public Broadcasting (AAPB) is a collaboration between the Library of Congress and public media producer GBH to preserve and make accessible historic public radio and television broadcasts from across the nation. Over 130 participating organizations have contributed content to the AAPB, creating a remarkable resource of America’s audiovisual heritage. Collections include digitized local news programs, live event coverage and full-length interviews created as early as the 1940s.

Yet these programs are not highly discoverable to researchers, educators, students and lifelong learners because they lack descriptive information. With funding from the Institute of Museum and Library Services (IMLS), the AAPB has created speech-to-text transcripts of the audio and video materials in the collection and made these transcripts available online in FIX IT+, a crowdsourcing platform where volunteers can correct simple grammatical errors.

Corrected transcripts improve the searchability of these historic programs, and the AAPB needs YOUR help in this effort! Below are brief instructions on how to get started, preferred editing conventions, and further explanations (for the extra curious)!


Editing is as easy as 1-2-3:

  1. Open Firefox or Chrome to visit FIX IT+: fixitplus.americanarchive.org
  2. FILTER the computer-generated transcripts by collection and then SORT by ‘Completeness (most to least)’, or search for a specific topic.
  3. “GO FOR THE GRAY” — Edit the gray lines of unfinished transcripts.
    Progress bar color key:
    Yellow = part of the transcript has been reviewed once
    Green = part of the transcript has been reviewed twice
    Gray = part of the transcript has not been reviewed

Be sure to check out the updated Editing Conventions below, and for more editing details, visit the Further Explanation + TIPS section.

Green lines are completed and no longer need editing. Gray lines still need reviewing.


Editing Conventions


Transcripts are searched for specific and complete keywords. Focus on transcribing the substance of the conversation rather than speech inflections such as ‘um’ or partial words.

  • Filled Pauses and Hesitations – Omit “ah”, “eh”, and “um”
  • Partial Words – If someone stops speaking in the middle of a word, either transcribe as much of the word as they say and follow it with a dash, OR transcribe the completed word. Example: “Tes- Testing” OR “Absolutely” (for “Absolu-”)
  • Unknown Name or Phrase – Use question marks before and after a word — ?Mimz? — to indicate unknown spelling of names, places, foreign language phrases and the like.

    TIP: Search the web for spellings, or look for spellings in the television credits of the original program on the AAPB website. You can find a link to the original program at the top of the transcript’s page.
  • Numbers – Spell out one through ten; write everything above ten as a numeral. Addresses and years are always numerals. Example: ‘She had ten votes and one was from me.’; ‘The year was 1964 and I lived at 1 Everlane Street.’
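
For the extra curious, the conventions above can be sketched in a few lines of Python. This is only an illustration, not how FIX IT+ works internally: the function names (drop_filled_pauses, format_count, mark_unknown) are made up for this example, and editors still apply the conventions by ear and by hand.

```python
# Illustrative sketch only; these helpers are hypothetical and are not part of FIX IT+.

# Filled pauses named in the conventions above.
FILLED_PAUSES = {"ah", "eh", "um"}

# Spelled-out forms for one through ten.
NUMBER_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}


def drop_filled_pauses(line: str) -> str:
    """Remove standalone filled pauses such as 'um' from a transcript line."""
    kept = [
        word for word in line.split()
        # Ignore attached punctuation and case when checking for a filled pause.
        if word.strip(",.!?").lower() not in FILLED_PAUSES
    ]
    return " ".join(kept)


def format_count(number: int) -> str:
    """Spell out one through ten; write any other number as a numeral.

    Addresses and years are always numerals, so do not pass them through here.
    """
    return NUMBER_WORDS.get(number, str(number))


def mark_unknown(word: str) -> str:
    """Wrap a word of uncertain spelling in question marks, e.g. ?Mimz?."""
    return f"?{word}?"


if __name__ == "__main__":
    print(drop_filled_pauses("Um, she had, ah, ten votes"))
    print(format_count(10), format_count(11))
    print(mark_unknown("Mimz"))
```

Note that format_count only covers plain counts. Deciding whether a number is a year or an address is left to the editor, just as it is in FIX IT+.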

Want to transcribe with the AAPB staff? Sign up for our regularly scheduled virtual transcript-a-thons here: Archives Volunteer Form. If you have any questions, email aapb_notifications@wgbh.org.



Further Explanation + TIPS:

1. Narrowing Your Search

  • Open Firefox or Chrome to visit FIX IT+: fixitplus.americanarchive.org
  • The transcripts are automatically filtered by All Collections (from contributing organizations) and sorted by Title (A-Z)

    HOT TIP: Filter the transcripts by a specific Collection, for example ‘Louisiana Public Broadcasting’, and sort by Completeness (most to least) to display which transcripts have been completed, started, or unedited.
The filter bar is located on the homepage of fixitplus.americanarchive.org.

2. Selecting a Transcript

Example of transcript tile on homepage.

“Transcript tiles” include all the vital information you need to know about the transcript, including its duration (example: ’28m’), brief description, number of contributors and progress bar.

Yellow = individual lines have been reviewed once
Green = individual lines have been reviewed twice
Gray = individual lines have not been reviewed

Each transcript must be reviewed twice. Multiple people can pick up where someone else has left off in a transcript, and once every line in the transcript has been reviewed twice, the transcript is considered finished.
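
For readers who think in code, the color key and the "reviewed twice" rule can be written as a small Python sketch. It assumes each line carries a simple count of how many times it has been reviewed. That count, and the names line_color, is_finished and first_gray_line, are assumptions made for this example and are not taken from FIX IT+ itself.

```python
# Illustrative sketch only; the review counts and function names are hypothetical.
from typing import List, Optional


def line_color(review_count: int) -> str:
    """Map a line's review count to its progress bar color."""
    if review_count >= 2:
        return "green"   # reviewed twice
    if review_count == 1:
        return "yellow"  # reviewed once
    return "gray"        # not yet reviewed


def is_finished(review_counts: List[int]) -> bool:
    """A transcript is finished once every line has been reviewed twice."""
    return bool(review_counts) and all(count >= 2 for count in review_counts)


def first_gray_line(review_counts: List[int]) -> Optional[int]:
    """Return the index of the first unreviewed line, or None if there is none."""
    for index, count in enumerate(review_counts):
        if count == 0:
            return index
    return None


if __name__ == "__main__":
    counts = [2, 2, 1, 0, 0]  # a transcript that starts green, then turns gray
    print([line_color(c) for c in counts])
    print(is_finished(counts))
    print(first_gray_line(counts))
```

The example transcript starts green and ends gray, which is why scrolling down to the first gray line is usually the quickest way to find work that still needs doing.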

HOT TIP: “GO FOR THE GRAY LINES!” Transcripts may start green, but scroll down until you reach the gray lines that need reviewing.

3. Editing the Transcript

Listen to the audio by clicking the ‘play’ icon to the left of each line. If the text does not match the audio, type in the correct text, then press [ENTER]. All changes are automatically saved.

Example of play head at the beginning of each line.

Visit the updated Editing Conventions section above for style preferences.


Want to transcribe with the AAPB staff? Sign up for our regularly scheduled virtual transcript-a-thons here: Archives Volunteer Form. If you have any questions, email aapb_notifications@wgbh.org.
