blog.humaneguitarist.org

dewey decimal nesting

[Tue, 14 Jun 2011 22:23:42 +0000]
Update, June 8, 2012: So, I'm so not going to ever work on this. But for the record the link to the little, silly demo is now here [http://blog.humaneguitarist.org/uploads/duiGui/duiGui.php]. Update, July 4, 2011: Gawd, this post is a disaster. I've got an updated version of the script below pretty much up and running [DEL: here :DEL] , but I won't get around to populating the "collection" with real data for a while. When I do, I hope I can remember to update this post. So, you've been warned. Read at your own risk. :P ... I'm toying with the idea of implementing something at work for our small video collection. Out video ids are stored in an SQL database and apart from that it also has other data about the video like basic categories - like "American History" or whatevers - so people can locate videos on the same subject. I was thinking, however, about using an easy, pre-existing taxonomy like the Dewey Decimal System [http://en.wikipedia.org/wiki/Dewey_Decimal_Classification] to create nesting browsing. The idea is by just "tagging" the video in the db table with a DDC number one would already know the nested context in which the subject category lies and it really wouldn't be hard to find a subject and the corresponding DDC number in the first place. It's not like it's LCSH [http://authorities.loc.gov/] or anything. Plus, DDC is based on 10's, so that should help. Anywho, that video stuff's all down the road ... maybe. First, I needed to find out how to nest DDC numbers automatically from an SQL table. And I'm getting there with major help from the post here: http://www.thetechspace.com/2010/05/21/get-unlimited-parent-and-child-in-php [http://www.thetechspace.com/2010/05/21/get-unlimited-parent-and-child-in-php/]. So, say I have this table with flat DDC numbers. id ddc 1 100 2 101 3 110 4 111 5 111.5 6 111.55 7 112 8 109.5 9 111.555 10 111.5555 11 111.52 12 101.5 13 111.525 14 111.2 Using the PHP code below a "top" level DDC number can be passed via the URL to a function that will return the 1st generation children of that DDC number and then recursively find the 1st generation children of the results until there are no children left. No more children ... how sad. BTW: Passing the display output style is also an option. For example, going here: http://127.0.0.1/deweyNester.php?top=100&display=verbose would yield: The \$_GET starting value is: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "10_" AND ddc NOT LIKE "100" ORDER by ddc Child: 101 Parent: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "101._" ORDER by ddc Child: 101.5 Parent: 101 and going here: http://127.0.0.1/deweyNester.php?top=111&display=dashed would yield: The $_GET starting value is: 111 -111.2 -111.5 --111.52 ---111.525 --111.55 ---111.555 ----111.5555 Notice how in the first example 109.5 was not found because 109 isn't present in the ddc column of the table. That is to say, currently the script doesn't find orphans. Anyway, this is a work in progress but it's been kind of fun playing with this at work today so here's the code for now: <?php $db_connect = mysql_connect("127.0.0.1","root",""); //yeah, yeah. no password. I know. Shutup! :] //evaluate the DDC number //return an appropriate snippet for query function nextOfKin($arg) { if (substr($arg,-1) == 0) { $arg = '"'.substr_replace($arg,'',-1).'_" AND ddc NOT LIKE "' .$arg .'"'; } elseif (substr($arg,3,1) == ".") { $arg = '"' .$arg .'_"'; } else { $arg = '"' .$arg .'._"'; } return $arg; } //execute the SQL query recursively function displayChildAndRoot($parent,$display) { $sql = "SELECT * FROM csv_db.deweyTable WHERE ddc LIKE " .nextOfKin($parent) ." ORDER by ddc"; $result = mysql_query($sql); while ($row = mysql_fetch_array($result)) { if ($display == "verbose") { echo "SQL: " .$sql ."<br />"; echo "Child: " .$row['ddc'] ."<br />"; echo "Parent: " .$parent ."<br />"; echo "<br />"; } elseif ($display == "dashed") { $dashLen = strlen($row['ddc']) - 3; if ($dashLen > 0) { $dashLen = $dashLen - 1; } echo str_repeat('-',$dashLen) .$row['ddc'] ."<br />"; } displayChildAndRoot($row['ddc'],$display); //recurse } } //get the starting DDC number per what's passed via the URL if (empty($_GET["top"])) { $top = 100; echo "The DEFAULT starting value is: " .$top; } else { $top = $_GET["top"]; echo "The $_GET starting value is: " .$top; } //get the output type per what's passed via the URL if (empty($_GET["display"])) { $display = "dashed"; } else { $display = $_GET["display"]; } echo "<br /><br />"; //make it happen! displayChildAndRoot($top,$display); ?> Update, June 15, 2011: Looking back, setting the top/root to 100 should have yielded all the DDC numbers except the aformentioned 109.5. I think the solution is to tweak the IF statement inside the nextOfKin() function like so: if (substr($arg,-2) == "00") { $arg = '"'.substr_replace($arg,'',-2).'__" AND ddc NOT LIKE "' .$arg .'"'; } Now, , going here: http://127.0.0.1/deweyNester.php?top=100&display=verbose yields: The $_GET starting value is: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "1__" AND ddc NOT LIKE "100" ORDER by ddc Child: 101 Parent: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "101._" ORDER by ddc Child: 101.5 Parent: 101 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "1__" AND ddc NOT LIKE "100" ORDER by ddc Child: 110 Parent: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "1__" AND ddc NOT LIKE "100" ORDER by ddc Child: 111 Parent: 100 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111._" ORDER by ddc Child: 111.2 Parent: 111 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111._" ORDER by ddc Child: 111.5 Parent: 111 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111.5_" ORDER by ddc Child: 111.52 Parent: 111.5 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111.52_" ORDER by ddc Child: 111.525 Parent: 111.52 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111.5_" ORDER by ddc Child: 111.55 Parent: 111.5 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111.55_" ORDER by ddc Child: 111.555 Parent: 111.55 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "111.555_" ORDER by ddc Child: 111.5555 Parent: 111.555 SQL: SELECT * FROM csv_db.deweyTable WHERE ddc LIKE "1__" AND ddc NOT LIKE "100" ORDER by ddc Child: 112 Parent: 100 Update, June 15, 2011: This is why people do extensive testing! Now, I've changed the nextofKin() function yet again. The nesting wasn't as desired - maybe I should post on what those nesting requirements are? So here goes again. I know, I suck. But shutup! :) function nextOfKin($arg) { if (substr($arg,-2) == "00") { $arg2 = '"'.substr_replace($arg,'',-2).'0_" OR ddc LIKE "' .substr($arg,0,1) .'_0" AND ddc NOT LIKE "' .$arg .'"'; } elseif ((substr($arg,-1) == "0" and substr($arg,-2) != "00")) { $arg2 = '"'.substr_replace($arg,'',-1).'_" AND ddc NOT LIKE "' .$arg .'"'; } elseif (substr($arg,3,1) == ".") { $arg2 = '"' .$arg .'_"'; } else { $arg2 = '"' .$arg .'._"'; } return $arg2; }