Taggytastic – part 3 – Self Plagiarism is Style

God bless Bryony – she puts up with a lot! Last night she had to put up with me adding a tagging system to our test HIP server (wwwlibrarycat.hud.ac.uk).

To be honest, the amount of interest in the subject keyword cloud took me by surprise, and it was fascinating to read all of the other blogs that picked it up. A number of blogs made the very valid point that it wasn’t a true tag cloud – the tags were created using the existing subject keywords and not from tags added by our users.

I began to feel that tag clouds in OPACs are a true “chicken & egg” scenario — to be able to add that kind of functionality to our live OPAC, I need to be able to prove that it’s a valid and worthwhile new feature… but to do that, I need to have already implemented it and to have had a healthy number of tags added by our users. I think this is backed up by the sheer number of people out there who think it’s a “good idea” but who are not in a position to start the ball rolling. Obviously, it’d be a different story if our OPACs already had tag functionality “out of the box”!

I’m sure there are a dozen ways of implementing user tags in an OPAC, but here’s mine…

I want to have some degree of control over the tagging (yes I know, I should be trusting our users!), so on a live system I’d need a user to log into the OPAC before they could add tags.
If they’ve logged in, then I can let them add new tags or delete ones they’ve previously added.
I want to be able to make tag suggestions – for example, if a user wants to tag “Web Site Design using XML and HTML”, then I want to be able to suggest relevant tags that other users have already added (e.g. xml, html, web design, etc). Unless other users have already tagged that particular book, how can I generate suggestions?
There’s more than one way of doing this, but I decided to do it by adding tags to the subject headings as well as to books. In other words, when someone tags a book with html then I also add that tag to all of the subject headings for that book as well. Then, when someone wants to tag a different book that has one of those subject headings, I can suggest html as a possible tag.

This does mean that irrelevant tags can get added to subject headings but (in theory) over time the relevant tags will outweigh the irrelevant ones for each heading. As well as using the tags linked to subject headings, I also take into account any existing tags for that book.
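
As a rough illustration of the approach described above, here is a minimal Python sketch. It is not the code running on the test server: the `TagStore` class, its method names, the book identifiers and the second subject heading are all made up for the example, and a real system would keep this in a database rather than in memory.

```python
from collections import Counter, defaultdict


class TagStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self.book_tags = defaultdict(Counter)     # book id -> tag counts
        self.heading_tags = defaultdict(Counter)  # subject heading -> tag counts

    def add_tag(self, book_id, headings, tag):
        tag = tag.strip().lower()
        self.book_tags[book_id][tag] += 1
        # Copy the tag onto every subject heading the book carries.
        for heading in headings:
            self.heading_tags[heading][tag] += 1

    def suggest(self, book_id, headings, limit=10):
        scores = Counter()
        # Tags inherited from the book's subject headings...
        for heading in headings:
            scores.update(self.heading_tags.get(heading, {}))
        # ...plus any tags already added to this particular book.
        scores.update(self.book_tags.get(book_id, {}))
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [tag for tag, _ in ranked[:limit]]


store = TagStore()
store.add_tag("book-1", ["Web site development", "XML (Document markup language)"], "xml")
store.add_tag("book-1", ["Web site development", "XML (Document markup language)"], "html")
store.add_tag("book-2", ["Web site development"], "xml")

# book-3 has never been tagged, but shares a heading with book-1 and book-2.
suggestions = store.suggest("book-3", ["Web site development", "Electronic commerce"])
print(suggestions)
```

Ranking by raw count is the simplest way to let frequently used tags outweigh one-off ones for a heading; other weightings are possible.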

It’s been interesting watching the beta tagging feature on Amazon, although (being early days) they seem to be getting a huge number of irrelevant tags – for example, the latest Harry Potter is tagged with things like good, kerri, brothers present and Jill. How long will it take before the irrelevant tags on Amazon sink away and the relevant ones rise to the top? More to the point, would library patrons be more likely to use relevant tags on an OPAC? Again, it’s the “chicken & egg” problem: without a healthy critical mass of relevant tags, how do you prove to cynical members of staff that tagging is a good thing and not a method of adding virtual graffiti to the OPAC?

I’ve still got more work to do with our initial attempt at OPAC tagging — at the moment, all you can do is add tags. In particular, I don’t have a method of selecting a tag and then showing all the items that have been tagged… but I’m hoping that Casey Durfee (Seattle Public Library) will come to my rescue. Spookily, I woke up this morning wondering how on earth I could hack our HIP server to allow this and then found that Casey had already emailed me to let me know it was possible!

I’m fairly happy with the suggestions based on subject headings – for example, when I try to add the very first tag to Internet technology and e-commerce, the following suggestions appear:

The suggestions are based on items that have been previously tagged which have the subject headings “Web site development” and/or “Electronic commerce”. Some of the suggestions are more relevant than others (e.g. ecommerce), but at least I’m not getting anything too weird appearing (yet!). Obviously there’s nothing to stop the user adding a new tag, but hopefully having a few suggestions available will help them out.

This should also have the added benefit that each of the subject headings will get a healthy selection of (hopefully relevant) tags attached to them. For example, here are some of the tags I’ve currently got attached to the “Web site development” subject heading (in ranked order):

web services
xml
java
microsoft
dotnet
asp
html
ecommerce
php
macromedia
portals
security
soap
lucene
web design
databases

…wouldn’t it be cool if the OPAC could tie those together when the user does a subject keyword search? For example, if they searched for XML, then the OPAC could suggest that they might be interested in other books under the “Web site development” subject heading. Or, if they were looking at books under the “Web site development” subject heading, then the OPAC could suggest that they might also be interested in books tagged with things like xml, web services and java.
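
One way that two-way lookup could work is sketched below in Python. This is only an assumption about how it might be done, not a feature of HIP: the function names are hypothetical and the counts in `HEADING_TAGS` are invented purely for the example.

```python
from collections import Counter

# Illustrative data only: the counts are invented for this sketch.
HEADING_TAGS = {
    "Web site development": Counter({"web services": 5, "xml": 4, "java": 3}),
    "Computer graphics": Counter({"adobe": 2, "css": 1, "web design": 1}),
    "Electronic commerce": Counter({"ecommerce": 3, "xml": 1}),
}


def headings_for_tag(heading_tags, tag):
    """Subject headings carrying this tag, strongest link first."""
    tag = tag.strip().lower()
    hits = [
        (counts[tag], heading)
        for heading, counts in heading_tags.items()
        if counts[tag] > 0
    ]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [heading for _, heading in hits]


def tags_for_heading(heading_tags, heading, limit=3):
    """Most frequently used tags attached to a subject heading."""
    counts = heading_tags.get(heading, Counter())
    return [tag for tag, _ in counts.most_common(limit)]


# A keyword search for "XML" could point the user at related headings...
related_headings = headings_for_tag(HEADING_TAGS, "XML")
# ...and browsing a heading could point them at related tags.
related_tags = tags_for_heading(HEADING_TAGS, "Web site development")
print(related_headings)
print(related_tags)
```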

I guess I’ve tagged about 150 books so far (mostly those to do with XML and HTML), but what I’d really like to do is throw open the doors and invite anyone who reads this weblog post to jump in and start tagging our catalogue.

When you first try to tag an item, the server will attempt to save a cookie in your web browser — you can see the value of this cookie in the “debug” section, along with the user number assigned to that cookie. The only reason I’m doing this is to try and give you an option to remove any tags that you’ve added (but not ones that other people have added).
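
For anyone curious how cookie-based ownership like this might be modelled, here is a small Python sketch. It is a guess at the general idea rather than the server's actual script: the `TagOwnership` class, the random cookie format and the sequential user numbers are all assumptions.

```python
import secrets


class TagOwnership:
    """Hypothetical sketch: map browser cookies to user numbers and track who added each tag."""

    def __init__(self):
        self.users = {}  # cookie value -> user number
        self.tags = {}   # (book id, tag) -> set of user numbers who added it

    def user_for_cookie(self, cookie=None):
        """Return (cookie, user_number), issuing a fresh cookie if needed."""
        if cookie is None or cookie not in self.users:
            cookie = secrets.token_hex(16)
            self.users[cookie] = len(self.users) + 1
        return cookie, self.users[cookie]

    def add_tag(self, user, book_id, tag):
        key = (book_id, tag.strip().lower())
        self.tags.setdefault(key, set()).add(user)

    def remove_tag(self, user, book_id, tag):
        """Remove a tag only if this user added it; report whether anything changed."""
        key = (book_id, tag.strip().lower())
        owners = self.tags.get(key, set())
        if user not in owners:
            return False
        owners.discard(user)
        if not owners:
            del self.tags[key]
        return True


ownership = TagOwnership()
cookie_a, first_user = ownership.user_for_cookie()
cookie_b, second_user = ownership.user_for_cookie()

ownership.add_tag(first_user, "book-1", "xml")
blocked = ownership.remove_tag(second_user, "book-1", "xml")  # someone else's tag
removed = ownership.remove_tag(first_user, "book-1", "xml")   # their own tag
print(blocked, removed)
```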

Also, the script doesn’t refresh the book page with your new tags — so, once you’ve added your tags, you’ll need to refresh the book page to make them appear.

I’ll keep on working on the code over the holidays and, once it’s in a stable state, I’ll post the scripts here.

If anyone has any comments or suggestions, please feel free to email me:

d.c.pattern [at] hud.ac.uk
email [at] daveyp.com

Have fun folks!

6 thoughts on “Taggytastic – part 3”

I’m working on a tagging solution for AADL right now, and ran into the same issues you’re talking about with having to prove why it’s a worthwhile thing to do. I know.. it’s crazy.
Anyway, I plan on maintaining a parallel database of tags and comments for each bib item in the collection and incorporating it via the middleware we use to suck info from the OPAC.

Hi John
Cool – I was hoping I wasn’t the only one working on this!
As with the discussion about People who borrowed this, also borrowed…, we’d all benefit from a cooperative database – that would give us the critical mass of relevant tags which we all need to help argue our cases and prove our points.
I’ve put together a live page which breaks down the relationships between the tags and the subject headings (and vice versa):
webcat.hud.ac.uk:4128/perl/taginfo.pl
The first half of the page lists each tag, along with the subject headings that have been tagged with that tag. For example, the tag mysql is currently linked to subject headings relating to “MySQL”, “PHP” and “Web Site Development”.
The second half of the page lists all of the subject headings that have been tagged, along with the tags. For example, the “Computer graphics” subject heading has been tagged with adobe, css, livemotion and web design.
At the moment, the xml tag appears a lot, but that’s due to the fact that I’ve concentrated my own tagging on XML books.
What I’m hoping is that the relationship between the tags and the subject headings will stay fairly relevant.
By the way — it wasn’t me who tagged rude things about President Bush, honest!

Thanks to Casey, I’ve now got the bold tag headings on this page triggering a list of tagged items in the OPAC.
Unfortunately I think I’ve found a bug in the version of HIP that we’re using as you’re unable to sort the items – if you try, then you get an “unable to sort” error and I get to see a lovely big screen of debug info on the server!
Oh well, one step forward and two steps back 😀
I just want to stress that this is definitely “work in progress” and the idea is to:
1) try and figure out good ways of integrating tagging within an OPAC (not just SirsiDynix HIP)
2) investigate ways of enhancing the existing searches and browses with tags
3) try and provide a working example of a tagged OPAC that anyone can use for demonstration purposes
4) share everything that I discover along the way
If anyone can make use of it, I’m planning to make the tagged items available for downloading — probably as XML, e.g.
<taglist>
  <item>
    <isbn>123456789X</isbn>
    <tag count="2">microsoft</tag>
    <tag count="1">internet</tag>
    <tag count="4">windowsxp</tag>
  </item>
  <item>
    <isbn>0700012375</isbn>
    <tag count="5">harry potter</tag>
    <tag count="2">wizards</tag>
  </item>
</taglist>
Once again, I think there would be a huge benefit in coming together to form a collaborative database that we can all use.
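
As a sketch of how another library might consume an export in the example format above and pool it with its own tags, here is a short Python snippet using the standard library's `xml.etree.ElementTree`. It assumes the export looks exactly like the example (the real export may differ), and the function names `parse_taglist` and `merge` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the example format; a real export may differ.
SAMPLE = """<taglist>
  <item>
    <isbn>123456789X</isbn>
    <tag count="2">microsoft</tag>
    <tag count="1">internet</tag>
    <tag count="4">windowsxp</tag>
  </item>
  <item>
    <isbn>0700012375</isbn>
    <tag count="5">harry potter</tag>
    <tag count="2">wizards</tag>
  </item>
</taglist>"""


def parse_taglist(xml_text):
    """Return {isbn: {tag: count}} from a <taglist> document."""
    result = {}
    for item in ET.fromstring(xml_text).findall("item"):
        isbn = item.findtext("isbn", default="").strip()
        tags = result.setdefault(isbn, {})
        for tag in item.findall("tag"):
            name = (tag.text or "").strip()
            tags[name] = tags.get(name, 0) + int(tag.get("count", "1"))
    return result


def merge(*exports):
    """Pool several parsed exports by summing the tag counts per ISBN."""
    combined = {}
    for export in exports:
        for isbn, tags in export.items():
            target = combined.setdefault(isbn, {})
            for name, count in tags.items():
                target[name] = target.get(name, 0) + count
    return combined


parsed = parse_taglist(SAMPLE)
pooled = merge(parsed, parsed)  # stand-in for exports from two libraries
print(parsed["0700012375"])
```

Keying on ISBN is what makes pooling between catalogues straightforward here, since local bib numbers would not match across libraries.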

…and here’s that XML version of the tags:
webcat.hud.ac.uk:4128/perl/tagexport.pl


Tagging our OPAC is something I had been thinking about recently, though I have not even mentioned it to our staff yet, as I think selling it to them will be far harder than the technical issues 🙂
I wonder if we would be better off not trying to provide pre-populated tags for our users. I know it might help them choose a tag, but I can see lots of arguments amongst librarians as to which tags to use. Maybe we have to trust that the ‘best’ tags emerge over time. Compared to a site like Flickr though the number of people providing tags to a library catalogue is far smaller, so the scope for ‘noise’ in the tag cloud will be higher, especially in the early days. Great work though – I’ll be watching your progress with interest.
