blog.humaneguitarist.org

segmenting audio with AudioRegent, SoX and XML

[Sat, 16 Jan 2010 19:05:39 +0000]
For some reason I feel obligated to point out that I haven't blogged in a while, for a few reasons:

1. Christmas break from school/work at the University of Alabama
2. the desire not to blog for the sake of blogging
3. and ... I've been working on something huge, at least for me.

It's a piece of software called AudioRegent that uses XML to create derivative "clips" of regions within WAV audio files. A region is simply a user-defined segment within an audio file, like a track on a Compact Disc. Besides writing the program in Python, which I pretty much finished in December, I also had to develop the XML format, which I call SimpleADL [http://blog.humaneguitarist.org/projects/audioregent/#SimpleADL] (Simple Audio Decision List). AudioRegent reads a SimpleADL file and then makes the derivative audio clips by calling SoX [http://sox.sourceforge.net/], the Sound eXchange command-line audio editor. AudioRegent and SimpleADL can also be used to sync audio to text, such as transcripts.

Actually, programming AudioRegent and devising SimpleADL were the easy parts. The hard stuff was the documentation and deciding on a license for the software. I tried to find a balance in documenting the software: being thorough without writing a novel. I'm not sure I succeeded, but I can always improve it with time. I used the W3C's Amaya [http://www.w3.org/Amaya/] editor to write the documentation in XHTML. Sure, you can use OpenOffice to export a document to XHTML, but man, is the result bloated and messy. Amaya writes really clean XHTML.

As for the license, I chose the BSD license [http://creativecommons.org/licenses/BSD/]. As I understand it, this allows anyone to use the source code at will in future open- or closed-source applications as long as they maintain the credits for AudioRegent.

I was tempted to use the Mozilla Public License [http://www.mozilla.org/MPL/] (MPL), which, again from what I can tell, is similar to the BSD license except that any source code derived from AudioRegent would have to stay open-source, though any peripheral code could be closed-source. I decided firmly against the GNU General Public License [http://www.gnu.org/licenses/gpl.html], which is "viral" and imposes its philosophy perpetually on all subsequent code, even peripheral code. Some have even argued that it works against its own objectives and is less "open" than the MPL [http://croftsoft.com/library/tutorials/gplmpl/].

Now I realize that, practically speaking, a skilled programmer could write better code from scratch in 30 minutes as opposed to the 30 or so hours I needed, but I wanted to go about this quasi-professionally. And I learned more about licensing, which was cool.

Anyway, rather than try to explain the software itself and how to get it, I'd be better off pointing you to the documentation [http://blog.humaneguitarist.org/projects/audioregent/] if you have any interest ...
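
For anyone who wants a feel for the basic idea before reading the documentation, here is a rough Python sketch of turning an XML list of regions into SoX commands. To be clear, this is not AudioRegent's actual code, and the XML below is a made-up format, not the real SimpleADL schema: the element and attribute names (`regions`, `region`, `source`, `id`, `start`, `end`), the file names, and the function `build_sox_commands` are all invented for illustration. The only real piece is SoX's `trim` effect, which takes a start offset and a duration.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical region list. This is NOT the real SimpleADL format;
# the element and attribute names are assumptions for illustration.
EXAMPLE_XML = """
<regions source="concert.wav">
  <region id="track01" start="0.0" end="185.5"/>
  <region id="track02" start="185.5" end="402.0"/>
</regions>
"""


def build_sox_commands(xml_text):
    """Return one SoX argument list per region found in the XML."""
    root = ET.fromstring(xml_text.strip())
    source = root.get("source")
    commands = []
    for region in root.iter("region"):
        start = float(region.get("start"))
        end = float(region.get("end"))
        if end <= start:
            raise ValueError(f"region {region.get('id')!r} ends before it starts")
        output = f"{region.get('id')}.wav"
        # SoX's trim effect takes a start offset and a duration, in seconds.
        commands.append(
            ["sox", source, output, "trim", f"{start:.3f}", f"{end - start:.3f}"]
        )
    return commands


if __name__ == "__main__":
    # Print the commands rather than run them, so SoX need not be installed.
    for cmd in build_sox_commands(EXAMPLE_XML):
        print(shlex.join(cmd))
```

To actually cut the clips, each argument list could be handed to `subprocess.run(cmd, check=True)` on a machine with SoX installed and the source WAV file present.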