Plagiarism Detection ???
-
My university (and just about all of them, I'd think) wants source code. They don't care about binaries. When I was a TA, I didn't accept binaries anywhere in a submission. As long as the code is original, you won't have to worry about it.
The JZ wrote:
I didn't accept binaries anywhere in a submission.
The same applies to the course I teach at the local junior college. I can compile the code myself. I don't accept binaries.
"Try asking what you want to know, rather than asking a question whose answer you know." - Christian Graus
-
Just wondering... How does this software work for universities? How does it check for code cheating? I want to write this kind of software, but I am not sure how it compares things. I know it searches the net and looks through code reference databases to find similar code. But if the student recompiled the code and made some changes, would this software still work? Cheers, James
There was a series of articles in Dr. Dobb's Journal back in 2004/2005 about detecting source code plagiarism. It was rather involved: it looked at the executable code created by the compiler, and the author discussed removing common code such as stack frame setup instructions. Here is the link: DDJ Search
"Try asking what you want to know, rather than asking a question whose answer you know." - Christian Graus
-
james_dixon_2008 wrote:
How does it check for code cheating
The plagiarism checking software that was used at my university (a lifetime ago, I admit) worked in the following way:
1. Strip out ALL comments and whitespace from the code.
2. Compare what's left (i.e. just the code, not the comments/whitespace).
I was shown a sample of detected plagiarism: the code was IDENTICAL apart from the literal strings, e.g. one said "Enter the number" and the other said "Number?"
If you are researching a problem on the web, and take some snippets of code you find and modify them to suit your purposes, then that is the efficient and effective use of resources. If you are taking someone else's code and attempting to straight out pass it off as your own, that's plagiarism.
------------------------------------------- Don't walk in front of me, I may not follow; Don't walk behind me, I may not lead; Just bugger off and leave me alone!!
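A minimal sketch of the strip-and-compare approach described above, not the actual university tool. It assumes the submissions use C-style comments, and `normalize`, `same_code` and the `mask_strings` option are hypothetical names added for illustration. Masking string literals is one way such a tool could still match two programs that differ only in their prompts.

```python
import re

# Assumption: submissions use C-style comments (// and /* */). String and
# character literals are matched too, so comment markers inside them survive.
_PIECES = re.compile(
    r'"(?:\\.|[^"\\])*"'   # string literal
    r"|'(?:\\.|[^'\\])*'"  # character literal
    r"|//[^\n]*"           # line comment
    r"|/\*.*?\*/",         # block comment
    re.DOTALL,
)

def normalize(source: str, mask_strings: bool = False) -> str:
    """Drop comments and all whitespace; optionally blank out string literals."""
    def replace(match):
        text = match.group(0)
        if text.startswith("/"):
            return ""                      # comment: remove entirely
        if mask_strings and text.startswith('"'):
            return '""'                    # keep the literal, hide its wording
        return text
    return re.sub(r"\s+", "", _PIECES.sub(replace, source))

def same_code(a: str, b: str, mask_strings: bool = False) -> bool:
    """True when two submissions are identical after normalization."""
    return normalize(a, mask_strings) == normalize(b, mask_strings)
```

Calling `same_code(a, b, mask_strings=True)` treats two files as identical even when one prints "Enter the number" and the other prints "Number?". Renamed variables would still defeat this comparison.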
-
_Damian S_ wrote:
If you are researching a problem on the web, and take some snippets of code you find and modify them to suit your purposes, then that is the efficient and effective use of resources. If you are taking someone else's code and attempting to straight out pass it off as your own, that's plagiarism.
Back when I was at university, this was very difficult to detect. The CS students were very good at renaming variables, reversing searches, and converting between recursion and iteration. It was actually pretty amazing how much effort they put into NOT getting caught taking others' code. They could probably have done the work themselves with the same amount of effort; I guess it was the idea of "getting away with it" that kept them in the business.
One of the personal favorites was to copy my work: given that every program of mine always got 100 or better, I was an obvious source for a good grade. I got somewhat irritated that this was happening, so I started choosing algorithms far beyond the scope of the general class. Take "happy_walk", the infamous "find the shortest path across the map" exercise that appears in who knows how many classes under who knows how many names. The problem is quite simple: a 2D array of arbitrary size is read in and stored, the starting position is given by the input file, you may move forward or forward-diagonally (no other moves are possible), and you choose the moves with the least elevation change to reach the end. I implemented this with a brute-force search of all possible paths, which pulled system resources from all over ( :laugh: ) to solve the problem.
This became noticeable in the system load when too many people were running brute-force searches, and some got caught doing reruns. Because the answer didn't match the sample answer, some thought the routine didn't work and gave up. But a few stuck it out and turned it in with appropriate changes to my code. Except the teacher knew me through her ex-husband, and knew my reputation with computers. The TA did mark my program wrong, and I argued it with the teacher and got it changed: I had found an alternate path that was "cheaper" but was hidden behind two moves that would never have been taken in the straightforward comparison approach. Three others got caught copying my work when asked to explain what the routines actually did.
In the final project I built a 5-dimensional data structure for storage to annoy the teacher, but it was very efficient. Efficient enough that the school adopted the routine for storage of student information in t
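To make the "happy_walk" story concrete, here is a rough sketch under several assumptions that the post does not spell out: travel goes left to right one column at a time, the cost of a step is the absolute elevation difference, and the "straightforward comparison approach" is a greedy walk that always takes the cheapest next step. The function names and the sample grid are made up for illustration.

```python
def moves(grid, row):
    """Rows reachable in the next column: straight ahead or one row up/down."""
    return [r for r in (row - 1, row, row + 1) if 0 <= r < len(grid)]

def greedy_cost(grid, row):
    """Always take the next step with the smallest elevation change."""
    cost = 0
    for col in range(len(grid[0]) - 1):
        here = grid[row][col]
        row = min(moves(grid, row), key=lambda r: abs(grid[r][col + 1] - here))
        cost += abs(grid[row][col + 1] - here)
    return cost

def brute_force_cost(grid, row, col=0):
    """Try every path from (row, col) to the last column; return the cheapest total."""
    if col == len(grid[0]) - 1:
        return 0
    here = grid[row][col]
    return min(
        abs(grid[r][col + 1] - here) + brute_force_cost(grid, r, col + 1)
        for r in moves(grid, row)
    )

# Hypothetical map: starting in the middle row, the free first step (5 -> 5)
# leads into a drop of 5, while paying 2 up front allows a flat walk at 7.
grid = [
    [5, 5, 0, 0],
    [5, 7, 0, 0],
    [5, 7, 7, 7],
]
print(greedy_cost(grid, 1), brute_force_cost(grid, 1))
```

On this grid the greedy walk pays 5 while the exhaustive search finds a path costing 2, which is the kind of "hidden" cheaper path the post describes. The brute-force version grows exponentially with the number of columns, consistent with the system load mentioned above; caching results per cell would tame that, but that is a different technique from the one the post used.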
-
I work at The University of Northampton (UK), and we use http://www.turnitin.com/static/home.html. I don't know about the back end, but I'd like to make some general comments.
The point of this is not just to catch plagiarisers (people buying work off the net), but also to protect students' work from being stolen and passed off as someone else's, in a way providing a free intellectual property/copyright registration system for the students. If you submit the same work for a different assignment and the registered owner is the same, there isn't a problem, as you own the work.
We've had two stories about the use of these systems. In the first, someone submitted work to Turnitin that showed up as plagiarised from Google; the student resubmitted, and this time accidentally included the receipt they were issued by the place they bought the work from!! (They failed their year.) The second is where it can go very wrong: someone wrote their assignment, someone else stole it and submitted it to Turnitin before the originator did, and when the originator submitted it, he got done for plagiarising his own work!!! The investigation did reveal the truth, but it showed that to protect your own work, you should submit it as soon as you are ready to.
It is an evolving technology, but plagiarism is a cultural quirk. Chinese and Japanese students are brought up to repeat the words of their teachers, and we go to great lengths to teach them that it is fine to do that, but that Western education also requires you to add your own words and to reference your sources.
I thought nicking code was fine as long as you honoured the originator? ;)
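The "registered owner" behaviour described above can be illustrated with a toy registry. This is purely a sketch and says nothing about how Turnitin actually works: the class name, the whitespace-and-case normalization, and the exact-match SHA-256 fingerprint are all assumptions made for the example.

```python
import hashlib

class SubmissionRegistry:
    """Toy registry: whoever submits a piece of work first is recorded as its owner."""

    def __init__(self):
        self._owners = {}  # fingerprint -> first owner seen

    @staticmethod
    def fingerprint(text: str) -> str:
        # Collapse whitespace and case so trivial reformatting does not change the hash.
        canonical = " ".join(text.lower().split())
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def submit(self, owner: str, text: str) -> str:
        key = self.fingerprint(text)
        first_owner = self._owners.setdefault(key, owner)
        if first_owner == owner:
            return "ok"  # new work, or the same owner resubmitting
        return f"flagged: matches work registered to {first_owner}"
```

If a thief calls `submit` before the real author does, the author is the one who gets flagged, which is the second story above in miniature. A real service would need partial-match detection rather than a whole-document hash.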
-
What is it about academia? Punishing people for code resuse? In a working environment, by contrast, you get punished for not reusing code: a key metric is how quickly you can deliver, and the sponsor is not the least bit interested in who wrote the code.
-
Back in the mid-80s I had a professor who checked for cheating by running the students' programs through a lexical analyzer that he wrote. Everything that was left after whitespace and comments were removed was treated as a token. He would then compare the number of tokens and which ones were used. Any close match in these numbers warranted a closer look, and usually proved that a couple of students were sharing code. The last two classes I had with this guy were "Compiler Design", and that is where he revealed how he did this, while teaching us to write the front end of our own compilers.
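Something along the lines of that token comparison could look like the sketch below. It is not the professor's analyzer: the C-like token rules, the `token_profile` and `similarity` names, and the multiset-overlap score are assumptions chosen to keep the example short. Comment stripping here is naive and would be fooled by comment markers inside string literals.

```python
import re
from collections import Counter

# Assumption: C-like source. Comments are removed first, then every identifier,
# number, string literal, or punctuation character counts as one token.
_COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
_TOKEN = re.compile(r'[A-Za-z_]\w*|\d+|"(?:\\.|[^"\\])*"|[^\s\w]')

def token_profile(source: str) -> Counter:
    """Count how often each token occurs once comments and whitespace are gone."""
    return Counter(_TOKEN.findall(_COMMENT.sub(" ", source)))

def similarity(a: Counter, b: Counter) -> float:
    """Shared tokens divided by all tokens (multiset overlap), from 0.0 to 1.0."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0
```

Two files that differ only in layout and comments get identical profiles and a similarity of 1.0; any pair scoring near that would be a candidate for the "closer look".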
-
What is it about academia? Punishing people for code resuse? In a working environment, by contrast, you get punished for not reusing code: a key metric is how quickly you can deliver, and the sponsor is not the least bit interested in who wrote the code.
True, but the point of submitting these pieces of coursework is to prove that the student actually has the capability to complete that coursework: that they understand the concept and how to put it into practice. The grade they get on the coursework normally makes up part of the overall grade for that course module, and ultimately of the pass/fail result and any final grade they get for their time at university.
You could instead measure whatever their code provides over and above what they've researched and obtained, but that makes it very difficult to set coursework for fundamentals classes covering data structures and algorithms, which tend to be included in some system-supplied package with most programming languages and environments. It also makes for an uneven playing field.
As for the working environment, the sponsor should be interested in who wrote the code, or at least in the terms the original author placed on it. A lot of companies are being sued for including GPL code in their applications and having to settle, and many router makers are now having to place source code on their websites.
DoEvents: Generating unexpected recursion since 1991
-
peterwaine wrote:
Punishing people for code resuse ?
Academia is about learning how to do something. A base of knowledge so you are productive at the tasks after school is finished. Take a bubble sort, for example. The goal is not just to get a working bubble sort. It is more important to learn the manipulations, storing, comparisons, and HOW TO DO IT, not just to do it. Why bother learning history, math, a language (or even spelling), when I can go on the web and do a search and "resuse" someone else's work? Same idea. Work is about getting things done, not learning how to do it. We've all met the "coders" that cannot think for themselves. If they can't find the idea somewhere else, they can't do the assigned task without help. :mad:
Gary
-
The university should be encouraging plagiarism anyway. In the real world, if you can steal some working code off the old interweb, you save man-hours. Just mark anything stolen as *research* and you're in the clear.
In some situations, yes, you can copy code from the web. Maybe you need a data access layer, so you use Enterprise Library, or a customized version of it. Or maybe you need logging, so you use log4net, Enterprise Library logging, or better yet xquisoft logging (shameless plug). But rarely, if ever, will you be able to piece together an entire business application from things on the web. Make sure you actually learn how to design and write original code while you can.
Michael Lang (versat1474) http://www.xquisoft.com/
-
Dear university professors! Please consider starting with your own work. Come to think of it, maybe you're the first ones who should be accused of plagiarism, well... sometimes?
When I hear about a student's assignment being done by copying someone else's work, it means that someone else already invented the problem and provided the solution (or his/her students did), and maybe you were the one who borrowed the problem (maybe with its solution(s)) and offered it to your students. If your problems were original, your students would not have anything to copy, except some known algorithms for solving partial problems, and that would be o.k.
Now, I admit that it is very hard to invent a really original problem. I have created just a dozen or so myself, counting only the decent ones. But I had the luxury of teaching VERY few students, and I work on inventing problems quite rarely. Most problems come up when you combine real-life development with teaching/mentoring.
So, anyway, before thinking about catching your students plagiarising, I would kindly invite you to think some more about the quality of your teaching. Thank you.
Sergey A Kryukov
-
There is a huge message in the industry targeted right at universities. The message is clear: the graduates you are pumping out are not prepared to work here. They do not have the networking, code reuse, application, or quick decision-making skills to cut the mustard. One of the most valuable skills I ever learned was how to read other people's code; given that, "plagiarism" checking applications with complex algorithms are a trivial tool. A quick program that gives totals of whitespace and tokens, so that a professor can identify suspicious submissions, followed by a quick interview to make sure both students understand the code they have submitted, is more than sufficient to ensure that a student did not simply "copy" the solution. If his friend aided him by providing source and making sure that his fellow student understood its operation and mechanics, then they both gained valuable knowledge from the exercise.
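The "quick program" suggested above might be as small as the following sketch. The function names, the crude token rule, and the `tolerance` threshold are all assumptions for illustration; it only shortlists pairs for the follow-up interview and proves nothing on its own.

```python
import re
from itertools import combinations

def totals(source: str) -> tuple[int, int]:
    """Return (whitespace characters, tokens) for one submission."""
    whitespace = sum(ch.isspace() for ch in source)
    # Crude token rule (an assumption): a word, or one punctuation character.
    tokens = len(re.findall(r"\w+|[^\s\w]", source))
    return whitespace, tokens

def suspicious_pairs(submissions: dict[str, str], tolerance: int = 2):
    """Yield pairs of student names whose totals differ by at most `tolerance`."""
    stats = {name: totals(src) for name, src in submissions.items()}
    for a, b in combinations(sorted(stats), 2):
        if all(abs(x - y) <= tolerance for x, y in zip(stats[a], stats[b])):
            yield a, b
```

Feeding it a dictionary that maps student names to source text yields the pairs worth a conversation. Short assignments will produce plenty of innocent near-matches, so the tolerance needs tuning per exercise.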
-
Just in case you weren't aware of it, Google has a special "Code Search" feature: http://www.google.com/codesearch/advanced_code_search
-
peterwaine wrote:
Punishing people for code resuse ?
Do you need a lecture on how plagiarism and code reuse are different? I don't think so. I do believe you're smart enough to understand this: Plagiarism according to Wikipedia. Plagiarism is no more and no less than a misrepresentation of the author of something: a lie, a fraud.
Sergey A Kryukov