Original audio broadcast can be found on IASTED.org. Brandon Hisey from The International Association of Science and Technology for Development (IASTED) and Dan Videtto from iThenticate / iParadigms (former Managing Director) have a discussion on IASTED Live!
IASTED (Interviewer): Dan, we have been introduced to the notion of checking for plagiarism on submissions coming into IASTED. Tell us a little more about how it actually works.
Dan (iThenticate): Brandon was alluding to our content databases, which are the critical factor in what we have. The core technology is based on our algorithm, which matches strings of text, but, of course, without the database to match against, there would be no result – and right now (as of September 2011) we have the world’s largest comparison database, which includes 14+ billion current and archived web pages, 55,000+ publications, including 30 million scholarly articles, books and conference proceedings, and 70+ million other published works from scientific, technical and medical (STM) journals, periodicals, magazines, encyclopedias, abstracts – and that database is growing at about 10 million pages a day.
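The interview does not describe iThenticate's actual algorithm, so the following is only a generic sketch of the idea Dan outlines: matching strings of text from a submission against a database of known documents. It assumes a common textbook technique (overlapping word n-grams, or "shingles"), and the function names `shingles`, `build_index`, and `match_report` are hypothetical, chosen for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in text (lowercased)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(documents, n=5):
    """Map each shingle to the ids of the database documents containing it.

    `documents` is a hypothetical {doc_id: text} dictionary standing in
    for the comparison database.
    """
    index = {}
    for doc_id, text in documents.items():
        for sh in shingles(text, n):
            index.setdefault(sh, set()).add(doc_id)
    return index


def match_report(submission, index, n=5):
    """Return {doc_id: fraction of the submission's shingles found there}."""
    subs = shingles(submission, n)
    if not subs:
        return {}
    hits = {}
    for sh in subs:
        for doc_id in index.get(sh, ()):
            hits[doc_id] = hits.get(doc_id, 0) + 1
    return {doc_id: count / len(subs) for doc_id, count in hits.items()}
```

As Dan notes, the matching step is only as useful as the database behind it: with an empty index, `match_report` returns no matches regardless of the submission.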
Dan (iThenticate): Yeah, it’s amazing.
IASTED: Are you making headway or falling behind? Do you know how that’s working? Are you cataloguing faster than people are putting it out there?
Dan (iThenticate): We are pretty much right up with it. We do similar crawling to Google and Yahoo, and then, in addition to having the live web, we are also archiving content, so pages come and go, but we keep them in a database. We also have our CrossCheck/iThenticate service for publishers to archive their own content.
IASTED: Let’s say I am authoring a paper and I have a quotation. I take some text out, but I am also putting down a footnote, so I am giving some credit. That’s academically honest and there is nothing wrong with that. Does the software pick that up? Or how does that work? Is it left to the reviewer to make the discernment that it’s ok?
Dan (iThenticate): It’s true. It is up to the actual editor or user to make that determination. Because it will show matching text, it won’t say whether that is fair use or not, so that is up to the editor to make that decision. In fact the term ‘plagiarism detection’ is a bit of a misnomer. I think it would be better to call it a ‘matching text’ or an ‘originality product’.
IASTED: I see your point. The word plagiarism is full of moral implications, especially in the academic community.
Dan (iThenticate): But there is a functionality on this service that allows you to exclude quotations, because you would assume that if it has been quoted, it is in the fair use category, and that makes the editorial process much more streamlined.
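How iThenticate implements its quotation exclusion is not stated in the interview. As a rough illustration only, one simple way to approximate the idea is to strip quoted passages from a submission before any text matching is done. The helper name `strip_quotations` is hypothetical, and the sketch assumes quotations are marked with straight or curly double quotes.

```python
import re

# Matches a passage in straight double quotes or in curly double quotes.
QUOTED = re.compile(r'"[^"]*"|“[^”]*”')


def strip_quotations(text):
    """Remove quoted passages so they are not counted as matching text."""
    without_quotes = QUOTED.sub(" ", text)
    # Collapse the whitespace left behind by the removed passages.
    return re.sub(r"\s+", " ", without_quotes).strip()
```

A sketch like this would miss block quotations and single-quoted text, which is one reason the final judgment still rests with the editor.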
IASTED: Tell me a little bit about your perceptions of IASTED’s process, Dan, and how they are using the iThenticate software.
Dan (iThenticate): Going back to what they were doing beforehand: as Brandon describes, it was a very manually intensive process of having to go out and look for the matching text in various ways, using highlighter markers and paper copies. We’ve been able to make that a very rich and dynamic environment, where they get digital copies of all the matching texts. That makes it very easy and streamlined for an editor to be able to use a tool.
IASTED: Give me a little background on the context within which the tool was developed in the first place.
Dan (iThenticate): It actually started in academia with some researchers at Cal Berkeley who were noticing more and more instances of plagiarism within the PhD programs. This was about 10 years ago, and they developed a little program to use internally. It was very well received, and then they took that and created a non-profit to go out and bring this further out to other universities. Our sister product, Turnitin, is now used by 20+ million users in over 126 countries, and is in most major universities and high schools in the United States and Canada and throughout the UK.
IASTED: The next question that comes to mind is foreign language, how do you deal with that?
Dan (iThenticate): The product is not language specific and currently we are able to check all languages in the ISO standard for western languages, and we have products for eastern European and Asian languages.
IASTED: Brandon, are there instances where plagiarism appears to be the case, and then secondly, in those instances where this occurs, what do you do about it?
Brandon (IASTED): It’s a very complicated question to ask and the answer is almost equally complicated. I wouldn’t define plagiarism. It tends to be a judgment call. The iThenticate software generates several different types of reports, which allow different ways of interpreting the plagiarized material, or, I should say, the material in question, in the context of the original source and in the context of the paper. From that we are able to decide which course of action we are going to take. Sometimes we see very well done, perfectly referenced material, which is completely acceptable – they give due credit to the original authors – but, at times, we find there is no referencing at all and, you know, authors are claiming work that’s not theirs as their own. So, depending on the severity, we may take actions over a certain range. We may say to the author that you can submit again, but you need to change this because it’s questionable. Or we decide to not allow the author to submit to us again, meaning we blacklist them.
Brandon (IASTED): That way we ensure that our conferences and our proceedings and our journals are presenting the latest research and the most accurate information without taking away from another author’s credit. At IASTED we hold a very strict policy on plagiarism and we believe it’s very important, not just in academic circles but in any kind of industry, that the people who originally wrote that work get their credit for history – that’s their contribution, so we do deal with it quite strictly.
IASTED: That’s very admirable. I guess what you are saying is, contrary to what we might think, this is a very challenging area and it’s a large problem.
Brandon (IASTED): It is, it’s quite prevalent, actually.
IASTED: The software originated with PhD candidates who noticed other PhD candidates were increasingly plagiarizing their work – that is a little disturbing – and if it’s happening in that instance, what accounts for it? Is it sloppiness? Is there so much pressure in the university system to publish that, you know, we get a little cavalier? Is it error? Or is it just outright fraud?
Dan (iThenticate): It runs the full spectrum from outright fraud to inadvertent mistakes. In fact, one of our authors who uses this software said that at times now, when we are looking at tens of thousands of sources to write an article, you get lost in your digital footprints, and she says it’s wonderful to have a tool like iThenticate (the individual edition for authors and researchers) where you can go back and just do one last check.