blog.humaneguitarist.org

on the brain: audio + ocr/hocr, "did you mean", and "there are no ebooks"

[Thu, 07 Mar 2013 17:11:45 +0000]
Lightning talk style 'cause I'm home sick and need to get a few things out there ...

Audio + OCR/HOCR

Some time ago I wrote this [http://blog.humaneguitarist.org/2012/07/14/okra-pie-some-simple-ocrhocr-tests/] post on OCR/HOCR and making searchable pages. I recently did some tests generating audio with Festival [http://www.cstr.ed.ac.uk/projects/festival/] and using simple HTML5 audio to add that audio to the page. I only ran Festival on the plain OCR output, but with the HOCR output it's no big thing to make audio for every line that Tesseract "detects" and incorporate it with SAVS [http://blog.humaneguitarist.org/tag/SAVS] or something.

"Did You Mean?"

Google doesn't seem to offer a "did you mean" API, but you can get around it [https://github.com/wiggin/Google-did-you-mean]. Other options might be to use Wikipedia's API [http://en.wikipedia.org/w/api.php?format=xml&action=query&list=search&srsearch=disese&srlimit=10] or Google's own search suggestion API [http://google.com/complete/search?output=toolbar&q=disese] (i.e. take the first suggestion). In both API links, I've sent "disese" instead of "disease".

There are no eBooks

In digital, why the hell are we still thinking of "eBOOKS" and "digital AUDIO BOOKS", etc.? Why can't we just think of them as web applications? And instead of having "ebook reader software" and "audio book software" why can't we just use, say, a Python/PyQt based application that uses WebKit [http://www.rkblog.rk.edu.pl/w/p/webkit-pyqt-rendering-web-pages/]? That's to say, I get that for monetary reasons not everything can be open, but why can't I just download a compiled script that runs WebKit and keeps me from doing things like viewing the source, saving the page, etc.? If the files need to be downloaded they can be saved in password-secured ZIP files (which can only be downloaded, say, with a username/password).
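
As a rough sketch of the password-secured ZIP idea, Python's standard zipfile module can read one member of an archive straight into memory when given the password, without extracting anything to disk. The function name and the sample file name are hypothetical. Note also that the standard library can only decrypt the legacy ZIP encryption (it cannot create encrypted archives), and that scheme is weak, so this shows the flow rather than real protection.

```python
import io
import zipfile


def read_member(archive, member, password):
    """Return one archive member as bytes, never writing it to disk.

    `archive` is a path or a file-like object. `password` must be bytes;
    zipfile only uses it for members that are actually encrypted.
    """
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member, pwd=password)


if __name__ == "__main__":
    # Stand-in archive built in memory. It is NOT encrypted, because the
    # standard library cannot write encrypted ZIPs; a real archive would be
    # produced by some other tool.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("chapter1.html", "<p>Hello</p>")
    buf.seek(0)

    html = read_member(buf, "chapter1.html", b"secret")
    print(html.decode("utf-8"))
```

The bytes returned here are what would be handed to the embedded browser view instead of a file path.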
The WebKit app would be the only thing that could talk to a centralized DB and determine if the user still has rights to view the material. If so, the DB could hand the application the password to read the contents into memory from the ZIP file, show them through the browser, etc. without making the contents readable otherwise.

What am I missing here? A lot, I'm sure. But there's got to be a better way and we need to stop thinking of digital as a bits/bytes rendering of the physical world. After all, is this [http://blog.humaneguitarist.org/uploads/SAVS/currentVersion/savs.html] an eBook or a digital audio book? I can, of course, read it, hear it, or both.

In early tests, making "eBooks" readable via a Python/WebKit app is working. HTML5 audio works on my home computer but not on my work machine, and video works on neither. But I *think*, based on what I've read, that there are some bugs in the PyQt WebKit, so maybe it's just a temporary thing. Either way, why keep inventing new software when a secure browser lets us read, listen, and watch? Adding stuff like bookmarks is, I'd think, a simple matter of storing data in a centralized DB for that given user account (or even a local SQLite db).

Moreover, will there even be "e-readers" and "mp3 players" in the near future? Won't it all just be a "device" of a given size that just runs a web browser?
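
For what it's worth, the local SQLite variant of the bookmark idea might look something like the sketch below. The table layout, column names, and function names are all made up for illustration; a centralized DB would presumably hold the same rows keyed by user account.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the bookmark store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bookmarks (
               user     TEXT NOT NULL,
               item     TEXT NOT NULL,
               position TEXT NOT NULL,
               PRIMARY KEY (user, item)
           )"""
    )
    return conn


def save_bookmark(conn, user, item, position):
    """Store one bookmark per (user, item), overwriting any earlier one."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO bookmarks (user, item, position) "
            "VALUES (?, ?, ?)",
            (user, item, position),
        )


def load_bookmark(conn, user, item):
    """Return the saved position, or None if there is no bookmark."""
    row = conn.execute(
        "SELECT position FROM bookmarks WHERE user = ? AND item = ?",
        (user, item),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = open_store()
    # The position is free-form text: an element id for something being
    # read, or a time offset for something being listened to.
    save_bookmark(conn, "nitin", "savs.html", "line-42")
    print(load_bookmark(conn, "nitin", "savs.html"))
```

Keeping the position as plain text means the same table works whether the "book" is being read, heard, or watched.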