<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: using &#034;circ_tran&#034; to show borrowing suggestions in HIP</title>
	<atom:link href="http://www.daveyp.com/blog/archives/49/feed" rel="self" type="application/rss+xml" />
	<link>http://www.daveyp.com/blog/archives/49</link>
	<description>Dave Pattern's weblog</description>
	<pubDate>Tue, 06 Jan 2009 05:03:21 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: Dewey friend wheel &#187; &#34;Self-plagiarism is style&#34;</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-47875</link>
		<dc:creator>Dewey friend wheel &#187; &#34;Self-plagiarism is style&#34;</dc:creator>
		<pubDate>Tue, 18 Nov 2008 19:44:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-47875</guid>
		<description>[...] friend wheel, but using library data, for a while now. Here&#39;s a prototype which uses our &#34;people who borrowed this, also borrowed&#8230;&#34; data to try find strong borrowing [...]</description>
		<content:encoded><![CDATA[<p>[...] friend wheel, but using library data, for a while now. Here&#39;s a prototype which uses our &#34;people who borrowed this, also borrowed&#8230;&#34; data to try find strong borrowing [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: 2008 &#8212; The Year of Making Your Data Work Harder &#187; &#34;Self-plagiarism is style&#34;</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-46821</link>
		<dc:creator>2008 &#8212; The Year of Making Your Data Work Harder &#187; &#34;Self-plagiarism is style&#34;</dc:creator>
		<pubDate>Thu, 22 May 2008 19:38:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-46821</guid>
		<description>[...] Implementation of Library 2.0 and the E-framework&#34; study). We&#39;ve had circ driven borrowing suggestions on our OPAC since 2005 (were we the first library to do this?) and, more recently, we&#39;ve used [...]</description>
		<content:encoded><![CDATA[<p>[...] Implementation of Library 2.0 and the E-framework&#34; study). We&#39;ve had circ driven borrowing suggestions on our OPAC since 2005 (were we the first library to do this?) and, more recently, we&#39;ve used [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Go John, Go! &#187; &#34;Self-plagiarism is style&#34;</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-32260</link>
		<dc:creator>Go John, Go! &#187; &#34;Self-plagiarism is style&#34;</dc:creator>
		<pubDate>Wed, 31 Jan 2007 23:53:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-32260</guid>
		<description>[...] suggestions on our OPAC are very much driven by books recommended on the student reading lists, so it&#39;s going to be [...]</description>
		<content:encoded><![CDATA[<p>[...] suggestions on our OPAC are very much driven by books recommended on the student reading lists, so it&#39;s going to be [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Pattern</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-788</link>
		<dc:creator>Dave Pattern</dc:creator>
		<pubDate>Thu, 20 Apr 2006 12:47:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-788</guid>
		<description>Hi Alex

I've added my code just above the section of the XSL file that starts with:

&lt;tt&gt;&#60;!--&lt;br /&gt;************************************************&lt;br /&gt;Javascript&lt;br /&gt;************************************************&lt;br /&gt;--&#62;&lt;/tt&gt;</description>
		<content:encoded><![CDATA[<p>Hi Alex</p>
<p>I&#039;ve added my code just above the section of the XSL file that starts with:</p>
<p><tt>&lt;!--<br />************************************************<br />Javascript<br />************************************************<br />--&gt;</tt></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-783</link>
		<dc:creator>Alex</dc:creator>
		<pubDate>Thu, 20 Apr 2006 10:02:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-783</guid>
		<description>I am now investigating how to add JavaScript to the full bib view.

Would you mind telling me where I can add the script in the fullnonmarcbib.xsl stylesheet?

Thanks</description>
		<content:encoded><![CDATA[<p>I now invenstigate to add a java script in the full bib view .</p>
<p>Would you mind tell me where I can add the script in the fullnonmarcbib.xsl stylesheet ?</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Davey P</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-369</link>
		<dc:creator>Davey P</dc:creator>
		<pubDate>Mon, 27 Feb 2006 07:43:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-369</guid>
		<description>Our borrowing suggestions are now available via a web service:

&lt;a href="http://www.daveyp.com/blog/index.php/archives/69" rel="nofollow"&gt;http://www.daveyp.com/blog/index.php/archives/69&lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>Our borrowing suggestions are now available via a web service:</p>
<p><a href="http://www.daveyp.com/blog/index.php/archives/69" rel="nofollow">http://www.daveyp.com/blog/index.php/archives/69</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lorcan Dempsey's weblog</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-45</link>
		<dc:creator>Lorcan Dempsey's weblog</dc:creator>
		<pubDate>Mon, 28 Nov 2005 02:54:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-45</guid>
		<description>&lt;strong&gt;Circulating intentional data&lt;/strong&gt;

I have posted a couple of times recently about intentional data, data that records choices and behaviors. I mentioned holdings data, ILL records, circulation records, and database usage records. One could extend this list to any data which records an i...</description>
		<content:encoded><![CDATA[<p><strong>Circulating intentional data</strong></p>
<p>I have posted a couple of times recently about intentional data, data that records choices and behaviors. I mentioned holdings data, ILL records, circulation records, and database usage records. One could extend this list to any data which records an i&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Davey P</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-32</link>
		<dc:creator>Davey P</dc:creator>
		<pubDate>Thu, 17 Nov 2005 22:28:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-32</guid>
		<description>Great suggestions Casey :-)

If you've not seen what Amazon Web Services have to offer, then have a look at this sample XML output:

http://www.daveyp.com/blog/stuff/032112247X.xml

...down at the bottom, there's a section marked "SimilarProducts" - those are the ISBNs Amazon use for their "Customers who bought this book also bought".

All you need to do is to cross-reference the Amazon ISBNs with those in your database - if you get a match, display a link.

Casey's second suggestion would ideally suit an academic institution, as student borrowing trends at specific times in the year could partly depend upon the modules they were studying.

If you were to view the student's entire borrowing history (e.g. over 3 years), then it would contain (in part) an amalgamation of all of their module topics.

However, sample the data every 90 days and you get a clearer selection of which titles to suggest to another borrower that will (potentially) be the most relevant to them at that moment in time.

For example, if a student took a three-year course in Java then you'd expect them to begin by borrowing books like "Dummies Guide to Java" and "Learn Java in 21 Days".  By the end of the course, they might be on to "Advanced Agile Java Development using Scrum"***

Their entire borrowing history would be a mix of Java titles, whereas a 90-day sample will group together Java titles at a similar skill level.

*** - After graduation, they will of course be rounded up by a press gang led by Jack Blount (wearing an eye patch and a suspicious looking parrot on one shoulder squawking "Pieces of 8.0! Pieces of 8.0!")</description>
		<content:encoded><![CDATA[<p>Great suggestions Casey :-)</p>
<p>If you&#039;ve not seen what Amazon Web Services have to offer, then have a look at this sample XML output:</p>
<p><a href="http://www.daveyp.com/blog/stuff/032112247X.xml" rel="nofollow">http://www.daveyp.com/blog/stuff/032112247X.xml</a></p>
<p>&#8230;down at the bottom, there&#039;s a section marked &#034;SimilarProducts&#034; - those are the ISBNs Amazon use for their &#034;Customers who bought this book also bought&#034;.</p>
<p>All you need to do is to cross-reference the Amazon ISBNs with those in your database - if you get a match, display a link.</p>
<p>Casey&#039;s second suggestion would ideally suit an academic institution, as student borrowing trends at specific times in the year could partly depend upon the modules they were studying.</p>
<p>If you were to view the student&#039;s entire borrowing history (e.g. over 3 years), then it would contain (in part) an amalgamation of all of their module topics.</p>
<p>However, sample the data every 90 days and you get a clearer selection of which titles to suggest to another borrower that will (potentially) be the most relevant to them at that moment in time.</p>
<p>For example, if a student took a three-year course in Java then you&#039;d expect them to begin by borrowing books like &#034;Dummies Guide to Java&#034; and &#034;Learn Java in 21 Days&#034;.  By the end of the course, they might be on to &#034;Advanced Agile Java Development using Scrum&#034;***</p>
<p>Their entire borrowing history would be a mix of Java titles, whereas a 90-day sample will group together Java titles at a similar skill level.</p>
<p>*** - After graduation, they will of course be rounded up by a press gang led by Jack Blount (wearing an eye patch and a suspicious looking parrot on one shoulder squawking &#034;Pieces of 8.0! Pieces of 8.0!&#034;)</p>
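<p>The cross-referencing step described above could be sketched roughly like this in Python. It assumes the &#034;SimilarProducts&#034; ISBNs have already been pulled out of the Amazon XML into a list, and that the catalogue is available as a mapping from ISBN to bib number. The function names, the ISBNs, the bib numbers and the OPAC URL are all invented for illustration.</p>

```python
def normalise_isbn(isbn):
    """Strip hyphens and spaces and upper-case a trailing 'x' check digit."""
    return isbn.replace("-", "").replace(" ", "").upper()


def matching_links(similar_isbns, catalogue, url_template):
    """Return (isbn, url) pairs for the similar ISBNs the library actually holds."""
    links = []
    for raw in similar_isbns:
        isbn = normalise_isbn(raw)
        if isbn in catalogue:
            # A match: build a link to the local record for this bib number.
            links.append((isbn, url_template.format(bib=catalogue[isbn])))
    return links


# Placeholder data: these ISBNs, bib numbers and the URL are invented for the example.
similar = ["0-00-000001-x", "0000000002", "0000000003"]
catalogue = {"000000001X": 1857, "0000000003": 901}

for isbn, url in matching_links(similar, catalogue, "https://opac.example.org/bib/{bib}"):
    print(isbn, url)
```

<p>Normalising the ISBNs before the lookup is a design choice rather than a requirement: it simply avoids missing a match because one side stores hyphens or a lower-case check digit.</p>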
]]></content:encoded>
	</item>
	<item>
		<title>By: casey</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-29</link>
		<dc:creator>casey</dc:creator>
		<pubDate>Thu, 17 Nov 2005 21:05:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-29</guid>
		<description>I agree. This is very cool.

I have two ideas on this subject:

1) You have a trusted 3rd party (preferably in a country with no extradition treaty) "launder" the data -- encrypt the borrower#'s with an encryption key that you don't know, only they know.  So borrower #123 will always get encrypted as "7xl3" or something.  That way, you can keep circ history indefinitely without having it tied to the borrower at all.  If you get visited by the feds, they have no way of decrypting the borrower#'s since you don't have the key.   

2) Say you are allowed to keep circ history data for a relatively long period of time but not forever (90 days, say).  Every 90 days, you encrypt the borrower#'s in the circ history with a one-time pad (basically, a completely random encryption key that is never reused, so once the key is thrown away the original numbers cannot be recovered from the encrypted values).  So say 1st quarter of 2005, borrower #123 gets encrypted as "7xklj" and 2nd quarter of 2005, borrower #123 gets encrypted as "823w".  You have no way of knowing that "823w" is the same person as "7xklj" but you can still know that borrower 7xklj checked out so and so books in a 90 day period.  Basically it would make it so you could figure out "borrowers who checked out x also checked out y within 90 days of each other" but again, there's no way to decrypt any of the data older than 90 days -- or make inferences between what happened 3-6 months ago and what happened 0-3 months ago.

If you only keep circ history for a few days like we do, the method is basically equivalent to what Davey describes but much messier.

Finally, you don't have to collect any data yourself to offer such a service. You could just use Amazon Web Services.</description>
		<content:encoded><![CDATA[<p>I agree. This is very cool.</p>
<p>I have two ideas on this subject:</p>
<p>1) You have a trusted 3rd party (preferably in a country with no extradition treaty) &#034;launder&#034; the data &#8212; encrypt the borrower#&#039;s with an encryption key that you don&#039;t know, only they know.  So borrower #123 will always get encrypted as &#034;7xl3&#034; or something.  That way, you can keep circ history indefinitely without having it tied to the borrower at all.  If you get visited by the feds, they have no way of decrypting the borrower#&#039;s since you don&#039;t have the key.   </p>
<p>2) Say you are allowed to keep circ history data for a relatively long period of time but not forever (90 days, say).  Every 90 days, you encrypt the borrower#&#039;s in the circ history with a one-time pad (basically, a completely random encryption key that is never reused, so once the key is thrown away the original numbers cannot be recovered from the encrypted values).  So say 1st quarter of 2005, borrower #123 gets encrypted as &#034;7xklj&#034; and 2nd quarter of 2005, borrower #123 gets encrypted as &#034;823w&#034;.  You have no way of knowing that &#034;823w&#034; is the same person as &#034;7xklj&#034; but you can still know that borrower 7xklj checked out so and so books in a 90 day period.  Basically it would make it so you could figure out &#034;borrowers who checked out x also checked out y within 90 days of each other&#034; but again, there&#039;s no way to decrypt any of the data older than 90 days &#8212; or make inferences between what happened 3-6 months ago and what happened 0-3 months ago.</p>
<p>If you only keep circ history for a few days like we do, the method is basically equivalent to what Davey describes but much messier.</p>
<p>Finally, you don&#039;t have to collect any data yourself to offer such a service. You could just use Amazon Web Services.</p>
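<p>Idea 2 above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: it swaps the one-time pad for an HMAC-SHA256 keyed hash from the standard library, with a fresh random key for each retention period, and the names make_pseudonymizer and pseudonym are made up for the example.</p>

```python
import hashlib
import hmac
import secrets


def make_pseudonymizer():
    """Create a borrower-number pseudonymizer backed by a fresh random key.

    The key lives only inside the returned function. Dropping the function at
    the end of the period discards the key along with it.
    """
    key = secrets.token_bytes(32)  # random key, used for one period only

    def pseudonym(borrower_no):
        digest = hmac.new(key, str(borrower_no).encode(), hashlib.sha256)
        return digest.hexdigest()[:12]  # shortened for readability (assumption)

    return pseudonym


first_quarter = make_pseudonymizer()
second_quarter = make_pseudonymizer()

# Within a period the same borrower always maps to the same pseudonym...
print(first_quarter(123) == first_quarter(123))
# ...but pseudonyms from different periods cannot be linked to each other.
print(first_quarter(123) == second_quarter(123))
```

<p>Within one period the loans of a pseudonymous borrower can still be grouped for &#034;also checked out&#034; counts. Once the period&#039;s function is dropped, nothing in the stored data ties its pseudonyms to the next period&#039;s.</p>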
]]></content:encoded>
	</item>
	<item>
		<title>By: Davey P</title>
		<link>http://www.daveyp.com/blog/archives/49#comment-28</link>
		<dc:creator>Davey P</dc:creator>
		<pubDate>Thu, 17 Nov 2005 20:19:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.daveyp.com/blog/?p=49#comment-28</guid>
		<description>That's a tough one Luke!

There are several methods of turning a borrower number into something else (e.g. an MD5 hash), but none of them fully erases the trail back to the original borrower.

If you have a high circulation turnover, then the alternative would be to store each of the bibs of the items that a borrower checks out at the same time - e.g. if someone goes to the issue desk with 5 items, record those 5 bibs together.  However, don't store anything about the borrower - just record the bib numbers:

1857,47265,901,11367,91375

As each borrower checks items out, you'll get strings of bibs to add to your database...

857,37562,9582,901,1857 &lt;i&gt;(5 items checked out)&lt;/i&gt;
7464,1725 &lt;i&gt;(2 items checked out)&lt;/i&gt;
3874,58948,857 &lt;i&gt;(3 items checked out)&lt;/i&gt;

Then, to generate suggested items for bib# 901, you'd need to:

1) search through all your strings of bibs and collate all the ones that contain a "901":

    1857,47265,&lt;b&gt;901&lt;/b&gt;,11367,91375
    857,37562,9582,&lt;b&gt;901&lt;/b&gt;,1857

2) count how many times each bib occurs in the collated list

Then present the ones that occurred the most as the suggested items (e.g. 1857).

Think of it more as a "people who borrowed this item, also borrowed these items at the same time..."

At the very worst, all you could deduce from the database would be that at some point in the past, someone borrowed these specific books at the same time.  There would be nothing to help work out who they were, when they borrowed those items, or if they ever borrowed again.</description>
		<content:encoded><![CDATA[<p>That&#039;s a tough one Luke!</p>
<p>There are several methods of turning a borrower number into something else (e.g. an MD5 hash), but none of them fully erases the trail back to the original borrower.</p>
<p>If you have a high circulation turnover, then the alternative would be to store each of the bibs of the items that a borrower checks out at the same time - e.g. if someone goes to the issue desk with 5 items, record those 5 bibs together.  However, don&#039;t store anything about the borrower - just record the bib numbers:</p>
<p>1857,47265,901,11367,91375</p>
<p>As each borrower checks items out, you&#039;ll get strings of bibs to add to your database&#8230;</p>
<p>857,37562,9582,901,1857 <i>(5 items checked out)</i><br />
7464,1725 <i>(2 items checked out)</i><br />
3874,58948,857 <i>(3 items checked out)</i></p>
<p>Then, to generate suggested items for bib# 901, you&#039;d need to:</p>
<p>1) search through all your strings of bibs and collate all the ones that contain a &#034;901&#034;:</p>
<p>    1857,47265,<b>901</b>,11367,91375<br />
    857,37562,9582,<b>901</b>,1857</p>
<p>2) count how many times each bib occurs in the collated list</p>
<p>Then present the ones that occurred the most as the suggested items (e.g. 1857).</p>
<p>Think of it more as a &#034;people who borrowed this item, also borrowed these items at the same time&#8230;&#034;</p>
<p>At the very worst, all you could deduce from the database would be that at some point in the past, someone borrowed these specific books at the same time.  There would be nothing to help work out who they were, when they borrowed those items, or if they ever borrowed again.</p>
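<p>The two steps above could be sketched roughly like this in Python. It is a minimal illustration, assuming the strings of bibs have already been loaded as lists of numbers. The bib numbers are the ones from the examples above, and the names CHECKOUTS and suggest are invented for the sketch.</p>

```python
from collections import Counter

# Each entry is one anonymous checkout: only bib numbers, nothing about the borrower.
CHECKOUTS = [
    [1857, 47265, 901, 11367, 91375],
    [857, 37562, 9582, 901, 1857],
    [7464, 1725],
    [3874, 58948, 857],
]


def suggest(bib, checkouts, limit=5):
    """Return (bib, count) pairs for the bibs most often checked out alongside `bib`."""
    counts = Counter()
    for basket in checkouts:
        # Step 1: only consider checkouts that contain the bib we are interested in.
        if bib in basket:
            # Step 2: count every other bib that appeared in the same checkout.
            counts.update(other for other in set(basket) if other != bib)
    return counts.most_common(limit)


print(suggest(901, CHECKOUTS, limit=1))  # prints [(1857, 2)]
```

<p>Scanning every stored checkout on each request is fine for a sketch. A real database would more likely index the checkouts by bib number so that step 1 does not have to read everything.</p>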
]]></content:encoded>
	</item>
</channel>
</rss>
