Extracting Audio

Many good talks are posted online, but usually only as video. Most can be followed without watching the screen, so it would make sense to publish the audio as well. That would save bandwidth and time.

But until that happens, you must download them, and then extract the audio. Here’s how I (currently) do this.

GitHub Gist 4464109: Small bash script to extract audio from video files.

#!/bin/bash

# Some consts
OUT_CODEC="libvorbis"
OUT_EXTENSION=".ogg"

pids=""

# Kick off the conversions
for file in *
do
  # Quote the name so files with spaces or glob characters are handled
  mime_type=$(file --brief --mime-type "${file}")
  # Discard the subtype:
  mime_type=${mime_type%/*}
  #*/ XXX avoid silliness of the syntax highlighter

  if [ "${mime_type}" = "video" ]
  then
    # Build the outfile
    outfile="${file%.*}${OUT_EXTENSION}"

    # Consider existence to mean that it's been done before
    if [ -e "${outfile}" ]
    then
      echo "Skipping ${file} (destination exists)..."
      continue
    fi

    # Announce it:
    echo -n "${file} TO ${outfile}..."

    # Do the conversion
    # (quote the output name too, or names with spaces break the command)
    avconv -i "${file}" -vn -acodec "${OUT_CODEC}" "${outfile}" &> /dev/null &
    pid=$!
    echo "${pid}"

    pids="$pids $pid"
  fi
done

failed=""

# Waiting...
for pid in ${pids}
do
  echo "${pid} is pending..."
  wait ${pid} || failed="${failed} ${pid}"
done

# Report on the PIDs collected in ${failed} by the wait loop above
if [ -z "${failed}" ]
then
  echo "Done."
else
  echo "Process(es) failed (${failed}), check state manually."
fi

The output will look something like this (simulated):

$ ./extract_audio.sh
Skipping talk_0.mp4 (destination exists)...
talk_1.mp4 TO talk_1.ogg...18443
talk_2.flv TO talk_2.ogg...18444
talk_3.ogv TO talk_3.ogg...18445
18443 is pending...
18444 is pending...
18445 is pending...
Done.
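
The script leans on two suffix-removing parameter expansions: one to reduce a MIME type to its top-level type, and one to swap the file extension. A minimal sketch of how they behave, using made-up values (the file names and MIME type below are illustrative, not taken from a real run):

```shell
#!/bin/bash
# Illustrative values only; these are assumptions, not output from `file`.
file="talk_1.mp4"
mime_type="video/mp4"

# ${var%pattern} removes the shortest match of pattern from the END of var.
# Dropping "/*" leaves the top-level type ("video").
echo "${mime_type%/*}"

# Dropping ".*" strips the extension, so a new one can be appended.
echo "${file%.*}.ogg"

# Because the match is the shortest one, only the last extension goes.
archive="talk.final.mp4"
echo "${archive%.*}"
```

One consequence of stripping only the last extension is that two inputs such as talk.mp4 and talk.flv would map to the same talk.ogg, in which case the existence check would skip the second one.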

Changes that would be desirable include:

  1. Not kicking off too many conversions at once. This would basically require melding the two loops: count the active processes, and start a new one only when an old one has finished.
  2. Better file location handling. Mixing input and output files isn’t ideal, and I end up manually moving the audio files to where they belong for easy consumption.
  3. Better file handling. Once I’m done with a file (assuming it wasn’t in the $failed list), it can be deleted. I don’t intend to watch the files that I convert to audio. The main risk is that a talk will have visually essential information that I’ll only discover upon listening, and if I still wanted to understand those portions I would have to redownload.
  4. cron integration. In theory I can add a script like this to my crontab, and then (assuming the changes directly above), simply download talks, and let the script manage the rest.
  5. Player integration. In my case this would be telling mpd to update the directory where I keep spoken word content, and possibly add new files to the MPD queue.
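
The first item in the list above could be approached roughly as follows. This is only a sketch under assumptions: MAX_JOBS, run_throttled, reap_oldest and drain are hypothetical names that do not appear in the original script, and for simplicity it waits for the oldest running job rather than for whichever one finishes first.

```shell
#!/bin/bash
# Sketch: never run more than MAX_JOBS background jobs at once.
# All names here are hypothetical helpers, not part of the original script.

MAX_JOBS=2   # assumed limit; tune to the number of cores
pids=()      # PIDs of jobs started and not yet waited for
failed=()    # PIDs of jobs that exited with a non-zero status

# Wait for the oldest tracked job and record it if it failed.
reap_oldest() {
  local pid="${pids[0]}"
  pids=("${pids[@]:1}")
  wait "${pid}" || failed+=("${pid}")
}

# Start a command in the background, first freeing a slot if needed.
run_throttled() {
  if [ "${#pids[@]}" -ge "${MAX_JOBS}" ]
  then
    reap_oldest
  fi
  "$@" &
  pids+=("$!")
}

# Wait for everything that is still tracked.
drain() {
  while [ "${#pids[@]}" -gt 0 ]
  do
    reap_oldest
  done
}

# Demo with harmless placeholder jobs instead of real conversions.
for n in 1 2 3 4
do
  run_throttled sleep 0.1
done
drain
echo "failed jobs: ${#failed[@]}"
```

In the conversion loop, the avconv command line would be passed to run_throttled in place of the trailing &, and drain would take over the role of the second loop. Waiting on the oldest job keeps the bookkeeping simple, at the cost of occasionally leaving a slot idle while a long conversion finishes.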

But one step at a time. I tend to alternate between phases where I listen to many talks and phases where I listen to none. Part of the reason is that, before I built this script, downloaded talks would often sit for a while until I got around to extracting the audio manually.

Part of the future of computing ought to be building systems that let us interact easily with the digital world, without jumping through many hoops. Often we do not take advantage of what is possible because of the effort required to get there. For example, I have never visited the Musée du Louvre, despite it being on the same planet, because getting there would take a lot of time and money. But at least through their website and other sites on the Internet I can easily enjoy some of the art displayed there (and in the numerous other museums of the world).