I Often Plagiarise Myself

Here’s a thought for those of you who like to mix your academic work with your blogging vices. My university is trialling some software that detects plagiarism. It runs submitted work through a series of checks to see whether any significant strings match anything found online. It’s primarily there to make sure an essay isn’t actually a liberal quoting of Wikipedia.
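To give a rough feel for how that sort of matching works, here’s a minimal sketch in Python. The word-shingle approach, the shingle size and the flagging threshold are all my own illustrative guesses, not details of the actual product:

    # Sketch of "significant string" matching via word shingles.
    # Shingle size and flagging threshold are illustrative guesses only.

    def shingles(text, n=8):
        """All n-word sequences appearing in the text."""
        words = text.lower().split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def overlap(submission, source, n=8):
        """Fraction of the submission's shingles also found in the source."""
        sub = shingles(submission, n)
        return len(sub & shingles(source, n)) / len(sub) if sub else 0.0

    essay = "file sharing is not theft it is infringement and the law is a mess"
    post = "as i blogged before file sharing is not theft it is infringement and the law is a mess"
    if overlap(essay, post) > 0.15:  # arbitrary flagging threshold
        print("Significant matching strings found - flag for review")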

However, what happens when the software picks up a match with online content that was written by the author of the submitted work? What will happen in two years if my thesis gets run through this algorithm and a big arrow points at this blog? Obviously the mess should swiftly be resolved (one would hope) by me explaining the source of the blog. But it demonstrates both the problem of relying on algorithms and the problem that occurs when someone’s output is no longer confined to a professional sphere. Fifteen years ago no-one would have expected a student to be airing their work anywhere outside that professional sphere. Now it should practically be compulsory.

How to DRM the News?

It’s generally possible to lock down a medium such as music or video because it requires equipment to actually play it. This is how many DRM systems work: by having security measures in both the media file and the media player (Cory Doctorow gives a great run-down of why this arrangement means DRM will always be crackable).
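As a toy illustration of that point (Python, with XOR standing in for real encryption; none of this resembles any particular DRM scheme):

    # Toy model of player-side DRM: the media ships encrypted, but the
    # player must carry the key in order to play it, so the key always
    # ends up on the user's machine. XOR stands in for real encryption.

    SECRET_KEY = 42  # baked into every shipped copy of the player

    def encrypt(media):
        return bytes(b ^ SECRET_KEY for b in media)

    def licensed_player(locked):
        # the "secure" player decrypts only at playback time...
        return bytes(b ^ SECRET_KEY for b in locked)

    locked = encrypt(b"the actual song data")
    print(licensed_player(locked))  # b'the actual song data'
    # ...but anyone who digs the key out of the player binary can
    # decrypt just as easily, which is Doctorow's point in a nutshell.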

What about news though? Twitter is becoming something of a secondary newswire these days, with media outlets cobbling together stories from information picked up off Twitter, ‘breaking’ the story hours after the Twitterverse has moved on. This can also work in the other direction: Twitter can spread one factoid to thousands within minutes, and could be the BitTorrent of news in a world where news is a paid-for commodity.

If I pay to access the information behind the WSJ paywall, does that make the information private? If I reveal some of it to others on the web, is that theft of content, or copyright infringement? Granted, if I lifted the thing verbatim and reposted it onto my cleverly named http://www.freewsj.com then fair enough, but what if I just found something interesting and quoted it, or discussed some figures that I saw? At what point am I giving away too much of the subscriber content for it to still be fair use?

From the publisher’s perspective, how do you control this? Do you try to enforce some sort of screening algorithm to pick up on anyone writing something too close to your protected content? Or do you allow it in the hope that it will drive more traffic and more customers to the originating article? If you see it as viral marketing, how much should you allow out, and is there an issue if so much talk is generated that the whole article is essentially available in pieces anyway? How do you lock down something that is communicable across so many different platforms (and, if we really have to, without any technical platform at all)? The whole endeavour seems impossible. I hope it is.

Reference

European publishers want a law to control online news access – Ars Technica

Why Did the World Economy Fail?…

… because our financial institutions are run by people who think the musings of a 15-year-old on work experience constitute groundbreaking demographic research.

A research note written by a 15-year-old Morgan Stanley intern that described his friends’ media habits has generated a flurry of interest from media executives and investors.

The US investment bank’s European media analysts asked Matthew Robson, an intern from a London school, to write a report on teenagers’ likes and dislikes, which made the Financial Times’ front page today.

His report, which dismissed Twitter and described online advertising as pointless, proved to be “one of the clearest and most thought-provoking insights we have seen – so we published it”, said Edward Hill-Wood, executive director of Morgan Stanley’s European media team.

Twitter is not for teens, Morgan Stanley told by 15-year-old expert

News Won’t Learn from Music’s Mistakes

The news industry has been kicking up a bit of a stink that is reminiscent of the music industry in its digital infancy. They’re noticing a decline in advertising revenues and are pointing the finger at the internet for stealing away their advertising contracts. Similarly, those news outlets that have an online presence are pointing the finger at Google and similar news aggregator services for making their content available outside their originating sites, meaning the advertising never gets seen. On the other side of the argument, the aggregators and index services argue that a large proportion of news-site traffic exists precisely because of their aggregation and indexing. It all sounds very similar to the music industry railing against P2P for stealing its profits and against streaming services for profiting off its content. Consumption patterns change, companies outside the circle that traditionally dealt with a medium start filling the gaps, and then the old companies call on the courts to protect them for having been too slow.

Interestingly, what is also comparable is the statistical debate; with music it was always a stats war between ‘P2P causes drops in profits’ and ‘P2P has no impact/improves profits’. With news, whilst the major companies claim that the internet has stolen all their revenue, statistics are cropping up suggesting that many factors may be at play. With music you could argue that sales declined due to multiple factors, such as a reduction in releases and the move from selling albums to selling individual tracks. With advertising revenue, Robert Picard argues that the culprit is not necessarily the internet but the rise of other forms of physical advertising.

Over the last few weeks there have been calls from the news industry to implement what is essentially a DRM programme across the net (ACAP) to control aggregators. News Corp have also implied that paywalls will soon be going up around their major properties. With the music industry’s recent admission that its reaction to Napster could have been better, it seems that just as one industry starts to come around, the other begins the whole process over again.

References

AP to Aggregators: We Will Sue You – Wired

European publishers want a law to control online news access – Ars Technica

News Corp will charge for newspaper websites, says Rupert Murdoch – guardian.co.uk

British music boss: we should have embraced Napster – Ars Technica

The Media Business: The Poor Connection Between Internet Advertising and Newspaper Woes

Citation Metrics & the Freedom to Share

Citation metrics probably aren’t the most exciting thing you’ll hear about this week, but they’re incredibly important if you’re an academic, or if you’re concerned about the freedom of knowledge. In the UK the performance of university departments is assessed through an audit, held every five or so years, called the Research Assessment Exercise (RAE). The result of the RAE translates into how much government money a department will receive until the next RAE.

So far the RAE has worked through a system of peer review. Every department submits a certain number of articles from various staff members, along with a statement about the make-up of the department, and other academics in the country are assigned to review the work and rank it. At the end all the ranks are tallied up and a grant is worked out. The whole process can be rather expensive if you consider the cost of administration and of staff being taken off their normal schedules to evaluate their peers.

So, bring in the citation metrics. Whenever someone writes an article and submits it to an academic journal, it also becomes indexed by a citation index, the best known of these being the ‘Web of Knowledge’ from Thomson Reuters (a closed system similar to Google Scholar). This database keeps track of which articles cite which, and so provides a massive resource for ranking academic work by how many people considered it worth citing. From these individual rankings academics can be valued based on how many other articles have cited their work. This gives them a numerical value which is often consulted when they apply for jobs. Departments routinely look up their applicants on the index to see how influential their work has been; if the applicants are ranked highly, it’s more likely the department will receive more audit money.
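The bookkeeping involved is simple enough to sketch. Below is a made-up miniature of it in Python; the h-index is just one common example of the kind of numerical value derived from such counts, my choice of illustration rather than anything the RAE has specified:

    # Sketch of what a citation index stores: which articles cite which,
    # inverted into per-article citation counts, plus one author-level
    # metric (the h-index). All data here is invented for illustration.

    from collections import Counter

    cites = {              # article -> articles it cites
        "Smith 2007": ["Jones 2005", "Lee 2003"],
        "Brown 2008": ["Jones 2005"],
        "Jones 2005": ["Lee 2003"],
    }
    counts = Counter(c for refs in cites.values() for c in refs)
    print(counts)          # Counter({'Jones 2005': 2, 'Lee 2003': 2})

    def h_index(citation_counts):
        """Largest h such that h articles have at least h citations each."""
        ranked = sorted(citation_counts, reverse=True)
        return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

    print(h_index([10, 8, 5, 4, 3]))   # 4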

The government has already expressed interest in moving from an audit based on peer review to one based on the data provided by the citation index. The massive cost of running a peer-reviewed audit certainly makes an audit based on mining already-compiled data rather enticing.

To get to the point, consider this scenario. Due to certain political aspirations in line with the Creative Commons movement, I feel that if I were to publish my work, it should be done openly, available for anyone to utilise. Most academic journals require either that you pay for access to the content, or that you are a member of an institution that pays for your access. If you’re not an academic or willing to fork out money, a massive chunk of intellectual endeavour is closed off to you. If I want to publish outside this system I can; I’m totally free to give my work away however I choose. However, these articles will never be indexed and so never count towards my ‘value’ as an academic. I shoot myself in the foot for my principles.

There is of course the option of a book, which is my current contingency plan. Over the course of my thesis I’ll have to lock my work away in those closed systems to secure some vague semblance of value. At the end, however, I can recompile these articles into a book (which I wanted to do anyway) and give that away for free as a separate piece. As long as I can find a publisher willing to do just the physical distribution and happy to distribute free PDFs, I should be set. Although not all publishers are part of the index either…

“It has never, ever been easier to break the law”

As I continue my perusal of the SABIP report on ‘Digital Consumers in the Online Age’, I’m finding yet more things that get my goat. The target of this particular moment’s focus is one of their ‘Key Findings’, titled “It has never, ever been easier to break the law”, on page 12.

When I saw this I thought ‘Yes! Something in this report that I agree with’, but the joy was short-lived. The report’s take on the statement is that it is relatively easy to get into file-sharing, with the media constantly telling us how to find the sites, Google providing easy answers to searches for ‘free music’ (that evil Google), and peer pressure in social networks… apparently Pirate Bay is the new crafty cigarette.

Yet when I saw the initial statement my mind turned to ‘Infringement Nation’ by John Tehranian. This wonderful article from the University of Utah’s S.J. Quinney College of Law documents a day in the life of an average law professor, and how his daily routine infringes copyright left, right and centre.

By the end of the day, John has infringed the copyrights of twenty emails, three legal articles, an architectural rendering, a poem, five photographs, an animated character, a musical composition, a painting, and fifty notes and drawings. All told, he has committed at least eighty-three acts of infringement and faces liability in the amount of $12.45 million. There is nothing particularly extraordinary about John’s activities. Yet if copyright holders were inclined to enforce their rights to the maximum extent allowed by law, barring last minute salvation from the notoriously ambiguous fair use defense, he would be liable for a mind-boggling $4.544 billion in potential damages each year. And, surprisingly, he has not even committed a single act of infringement through P2P file-sharing.

(Tehranian, 2007: 547)

If we tallied up all the acts of copyright infringement that occur outside of P2P systems, I’m sure the total would be far more substantial than anything attributed to file-sharing. I agree that it has never been easier to break the law, but perhaps that is because of the law, not because of the individual.
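(For anyone checking the quote’s numbers, the annual figure is simply the daily liability scaled up:

    # Tehranian's annual total is the daily statutory-damages figure x 365.
    daily = 12.45e6                                  # $12.45 million per day
    print("${:,.0f} per year".format(daily * 365))   # $4,544,250,000

which rounds to the $4.544 billion he cites.)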

“These Figures are Staggering”

I’ve just had a quick look through the SABIP report on ‘Digital Consumers in the Online Age’, which has been featured over on the BBC.

Here’s an extract I’d like you to read…

On one peer-to-peer network we found that at midday on a weekday there were 1.3 million users, sharing content. If each “peer” from this network (not the largest) downloaded one file per day the resulting number of downloads (music, film, television, e-books, software and games were all available) would be 473 million items per year. If the figure for each individual is closer to five or more items per day, the lowest estimate of downloaded material (remembering that the entire season of the Fox television series “24”, or the “complete” works of the rock group Led Zeppelin can be one file) is just under 2.4 billion files. And if the average value of each file is £5 – that is a rough low average of the price of a DVD or CD, rather than the higher prices of software or E-books – we have the online members of one file sharing network consuming approximately £12 billion in content annually – for free. These figures are staggering. (SABIP, 2009:6)

Dear me, don’t strain yourself SABIP! The accuracy of those figures is indeed astounding; truly the quantitative method for discerning the empirical data knocks me to the floor.
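In case the methodology eludes you, the whole ‘staggering’ result is four guesses multiplied together. Reproducing it takes a few lines (the numbers are SABIP’s own, from the passage above):

    # SABIP's headline figure reconstructed: four assumptions multiplied.
    users          = 1_300_000   # claimed midday peers on one network
    files_per_day  = 5           # their "closer to five or more" guess
    days           = 365
    value_per_file = 5           # GBP; their "rough low average" per file

    # at one file per day: 1,300,000 * 365 = 474,500,000 (their "473 million")
    downloads = users * files_per_day * days
    print("{:,} files per year".format(downloads))   # 2,372,500,000
    print("GBP {:,} per year".format(downloads * value_per_file))
    # GBP 11,862,500,000 - the "approximately 12 billion pounds" in content.
    # Halve any one of the guesses and the staggering total halves with it.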

The brilliance of this work has been taken up by Zeropaid, who also feel the passage is worthy of such joyous quoting. These are the figures quoted by the BBC, and I imagine the ones that will stick in the minds of our beloved ordained policy makers. This of course is not a problem at all, as the report clearly demonstrates the ability to discern absolute truth through its statistical prowess.

I bow down to your greatness, SABIP; these figures are staggering.

***************************

UPDATE

The report also cites Zentner’s 2006 research stating that those who file-share are 30% less likely to purchase music. You may find that report difficult to locate as SABIP failed to include a reference in their bibliography (truly staggering).

[The article is Zentner, A. (2006) ‘Measuring the Effect of File Sharing on Music Purchases’, Journal of Law and Economics, Vol. 49, No. 1, University of Chicago. If you have access to such things it can be picked up here.]

Moving on: Indeed Zentner did say that file-sharing reduces the likelihood of purchasing music by 30% (see page 87). However he also said…

The database does not contain information on quantities of music purchased or on intensities of music downloads to calculate what music sales would have been in the absence of music downloading. (Zentner, 2006: 85)

Zentner’s analysis is based on the assumption that if file-sharing did not exist, people would buy all the music that they downloaded.

The percentage of people who bought music is much larger among the group who regularly download MP3 files (55.8 percent) than among those who do not (37.7 percent), which suggests that MP3 downloaders have a strong taste for music. (Zentner, 2006: 73)

…and are essentially the music industry’s customer base.

But hey, even if Zentner’s work is a statistical piece that shows the music industry being murdered by file-sharers, there are still those pesky other pieces of research, ones that aren’t three years old, saying quite the opposite.

“Consumer Culture in Times of Crisis,” conducted by the BI Norwegian School of Management, the largest business school in Norway and the second largest in all of Europe, concluded that file-sharers actually buy ten times as much music as they download for free.

“The Impact of Music Downloads and P2P File-Sharing on the Purchase of Music: A Study For Industry Canada,” a study commissioned by Industry Canada, a ministry of the Canadian federal government, found that for every album downloaded illegally, legal CD purchases increased by 0.44, or by about half an album.

Props to Zeropaid again.