I Often Plagiarise Myself

Here’s a thought for those of you who like to mix your academic work with your blogging vices. My university is trialling some software that detects plagiarism. It runs the submitted work through a few trials and tribulations to see if any significant strings match anything found online. It’s primarily there to make sure an essay isn’t actually a liberal quoting of Wikipedia.
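For the curious, here is a minimal sketch of what "matching significant strings" could look like. It is an assumption for illustration, not how the university's software actually works: it breaks each text into overlapping runs of n words and reports what fraction of the submission's runs also appear in an online source. The names `shingles` and `overlap`, and the default run length of five words, are hypothetical.

```python
def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(submission, source, n=5):
    """Fraction of the submission's n-word sequences that also occur in source."""
    sub = shingles(submission, n)
    if not sub:
        # Submission is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(sub & shingles(source, n)) / len(sub)


if __name__ == "__main__":
    thesis = "the cat sat on the mat"
    blog = "the cat sat on a rug"
    print(overlap(thesis, blog, n=3))
```

Note that a score like this says nothing about who wrote the source, which is exactly the gap discussed next.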

However, what happens when the software picks up a match for online content that the author of the submitted work also wrote? What will happen in two years if my thesis gets run through this algorithm and a big arrow points at this blog? Obviously the mess should swiftly be resolved (one would hope) by me explaining the source of the blog. However, it demonstrates both the problem of relying on algorithms and the problem that occurs when someone’s output is no longer confined to a professional sphere. Fifteen years ago no one would have expected a student to be airing their work anywhere outside their professional sphere. Now it should be compulsory.

How to DRM the News?

It’s generally possible to lock down a medium such as music or video because it requires equipment to actually play it. This is how many DRM systems work: by having security measures in both the media and the media player (Cory Doctorow gives a great run-down of how this system will always make DRM crackable).

What about news, though? Twitter is becoming something of a secondary newswire these days, with media outlets cobbling together stories from information picked up from Twitter, ‘breaking’ the story hours after the Twitterverse has moved on. This can also work in the other direction: Twitter can spread one factoid to thousands within minutes and could be the BitTorrent of news in a world where news is a paid-for commodity.

If I paid to access the information behind the WSJ paywall, does that make that information private? If I reveal some of that information to others on the web, is that theft of content, or copyright infringement? Granted, if I lifted the thing verbatim and reposted it onto my cleverly named http://www.freewsj.com then fair enough, but what if I just found something interesting and quoted it, or discussed some figures that I saw? At what point am I giving away too much of the subscriber content for it to be fair use?

From the publisher’s perspective, do you control this? If so, then how? Do you try to enforce some sort of screening algorithm to pick up on anyone writing something too close to your protected content? Or do you allow it in the hope that it will drive more traffic and more customers to the originating article? If you see it as viral marketing, then how much should you allow out, and is there an issue if so much talk is generated that the whole article is essentially available in pieces anyway? How do you lock down something that is communicable across so many different platforms (and, if we really have to, then without a technical platform at all)? The whole endeavour seems impossible. I hope it is.

Reference

European publishers want a law to control online news access – Ars Technica

Citation Metrics & the Freedom to Share

Citation metrics probably aren’t the most exciting thing you’ll hear about this week, but they’re incredibly important if you’re an academic, or if you’re concerned about the freedom of knowledge. In the UK, the performance of university departments is assessed through an audit that occurs every five or so years, called the Research Assessment Exercise (RAE). The result of the RAE translates into how much government money that department will receive until the next RAE.

So far the RAE has worked through a system of peer review. Every department submits a certain number of articles from various staff members, along with a statement about the make-up of the department, and other academics in the country are assigned to review the work and rank it. At the end all the ranks are tallied up and a grant is worked out. The whole process can be rather expensive if you consider the cost of administration and of staff having to be taken off their normal schedules to evaluate their peers.

So, bring in the citation metrics. Whenever someone writes an article and publishes it in an academic journal, it is also indexed by a citation index, the best known of these being the ‘Web of Knowledge’ from Thomson Reuters (a closed system similar to Google Scholar). This database keeps track of which articles cite which articles and so provides a massive resource for ranking academic work based on how many people considered it worth citing. From these individual rankings, academics can be valued based on how many other articles have cited their work. This gives them a number value which is often consulted when they apply for jobs. Departments routinely look up their applicants on the index to see how influential their work has been. If their staff are ranked highly, it’s more likely the department will receive more audit money.
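As a rough illustration of the idea, here is a minimal sketch of how a citation count and a per-academic number could be derived from "which articles cite which articles". It is an assumption for illustration only, not the method any real index uses; the function names, the data layout and the rule of ignoring an article citing itself are all hypothetical.

```python
from collections import Counter


def citation_counts(cites):
    """Count how many distinct articles cite each article.

    cites maps an article ID to the list of article IDs it cites.
    """
    counts = Counter()
    for article, cited in cites.items():
        for target in set(cited):      # one citing article counts once per target
            if target != article:      # assumption: an article citing itself is ignored
                counts[target] += 1
    return counts


def author_score(counts, authorship, author):
    """Sum the citation counts of every indexed article by the given author."""
    return sum(counts[a] for a in authorship.get(author, []))


if __name__ == "__main__":
    cites = {"A": ["B", "C"], "B": ["C"], "C": []}
    authorship = {"me": ["B", "C"]}
    counts = citation_counts(cites)
    print(counts["C"], author_score(counts, authorship, "me"))
```

Anything that never appears in `cites` simply scores zero, which is the fate of work published outside the index.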

The government has already expressed interest in the idea of moving from an audit based on peer review to one based on the data provided by the citation index. The massive costs required to run a peer-reviewed audit certainly make an audit based on mining already compiled data rather enticing.

To get to the point, consider this scenario. Due to certain political aspirations in line with the Creative Commons movement, I feel that if I were to publish my work, it should be done openly, available for anyone to use. Most academic journals require either that you pay for access to the content, or that you are a member of an institution that pays for your access. If you’re not an academic or willing to fork out money, a massive chunk of intellectual endeavour is closed off to you. If I want to publish outside this system I can; I’m totally free to give my work away for free however I choose. However, these articles will never be indexed and so will never count towards my ‘value’ as an academic. I shoot myself in the foot for my principles.

There is of course the option of a book, which is my current contingency plan. Over the course of my thesis I’ll have to lock my work away in those closed systems to secure some vague semblance of value. However, at the end I can recompile these articles into a book (which I wanted to do anyway) and give that away for free as a separate piece. As long as I can find a publisher willing to just do physical distribution and happy to distribute free PDFs, I should be set. Although not all publishers are a part of the index either…