Thank you, everyone, for your feedback. After going through the responses one by one, I grouped the questions and concerns into the following:
1- How does your solution work? Is it effective/accurate?
2- What about privacy concerns, student consent? How does it compare to other tools? Is this spying?
3- Explain more about the features that will help students.
4- Marketing plan, scalability, pricing.
My response is organized by question, but I invite you to read the whole thing:
1- Our solution is a plugin for the university's learning management system, say Moodle. Instead of students opening their course webpage and downloading the project handout, the document opens in the browser and is streamed online. We urge students to consult the handouts online and not try to download them to their personal computers, to limit sharing of the documents. Of course, the document will have security features built in to block downloading, sharing, and printing.
By project handout, we mean all the readable documents the teacher provides to guide students through the project. This includes, but is not limited to, instruction files, grading guidelines, related readings, engineering drawings, and related research documents.
We then treat those files as normal URLs, meaning we apply common web analytics tools to generate data about each user's activity on their own documents. We gather data over time about how they consult the handout and, from it, build a model of how each user interacts with their project handouts. Of course, this interaction will change over time and across classes. Still, we gain insight from the behavior of the class as a whole, for example as a heat map of the sections that received the most attention. Using that knowledge, we compare each student's behavior to the class trend and to their own usual patterns.
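To make the heat map idea concrete, here is a minimal sketch. It assumes the analytics produce events of the form (student, section, seconds spent); the event format, the sample data, and the function name `class_heat_map` are illustrative assumptions, not the actual plugin.

```python
from collections import defaultdict

# Hypothetical reading events: (student_id, section, seconds_spent).
events = [
    ("s1", "Q1", 120), ("s1", "Q2", 60),
    ("s2", "Q1", 30), ("s2", "Q3", 90),
    ("s3", "Q3", 300),
]

def class_heat_map(events):
    """Return each section's share of the class's total reading time."""
    totals = defaultdict(float)
    for _student, section, seconds in events:
        totals[section] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {section: t / grand_total for section, t in totals.items()}

heat = class_heat_map(events)
# Sections sorted from most to least attention.
ranking = sorted(heat, key=heat.get, reverse=True)
```

With this sample data, Q3 accounts for the largest share of reading time, so it would be the "hottest" area of the handout.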
This whole process eventually yields a probability that a student has worked on the project, which we turn into a binary label (okay / not okay) after choosing an appropriate confidence level. The more data we have, the more accurate we expect our predictions to be. Flagging that a student has a low probability of having done the project invites the teacher to offer help before the deadline. At the same time, it could flag suspicious behavior, such as someone else accessing the files or the student resorting to third parties to complete the assignment. In both cases, the teacher would talk to the student to gauge how much of the project they did themselves.
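As a rough illustration of how a probability and a binary label could come out of those comparisons, here is a sketch. The logistic scoring, the weights, and the 0.5 threshold are placeholder assumptions chosen for the example; the real model is not decided yet and would be fitted to data.

```python
import math
from statistics import mean, pstdev

def z_score(value, population):
    """Standard score of value against a population (0.0 if there is no spread)."""
    sigma = pstdev(population)
    return 0.0 if sigma == 0 else (value - mean(population)) / sigma

def engagement_probability(student_minutes, class_minutes, own_history_minutes,
                           w_class=1.0, w_self=1.0, bias=0.0):
    """Hypothetical score: combine deviation from the class trend and from the
    student's own usual pattern, then squash it to (0, 1) with a logistic."""
    z_class = z_score(student_minutes, class_minutes)
    z_self = z_score(student_minutes, own_history_minutes)
    score = bias + w_class * z_class + w_self * z_self
    return 1.0 / (1.0 + math.exp(-score))

def label(probability, threshold=0.5):
    """Binary label after choosing a threshold (assumed value, to be tuned)."""
    return "okay" if probability >= threshold else "not okay"

# A student who read for 5 minutes when the class and their own past
# projects are around 50 minutes.
p_low = engagement_probability(5, [40, 50, 60], [45, 55])
# A student who sits exactly at the class mean and their own usual level.
p_typical = engagement_probability(50, [40, 50, 60], [50, 50, 50])
```

Here `label(p_low)` comes out as "not okay" and `label(p_typical)` as "okay". The label is only a prompt for the teacher to talk to the student, not a verdict.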
The accuracy of the solution is limited by the data we have, which improves with adoption and retention. Effectiveness, however, is relative. I would answer with a question: how effective is detection today? It relies on the teacher's gut instinct, and with a large number of students, can a teacher realistically detect anything? Shouldn't we relieve teachers of the burden of detection and let them focus on their actual role: supporting and guiding?
2- By activity we mean the user's behavior on the document, and NOTHING outside of the document. We urge people to think about every website they use and to realize that nearly all of them collect analytics on their users. Google publicly provides such tools as well. Please also remember that the project handouts are readable, non-editable documents. That means there is no input from the user; we merely record their reading behavior on the document. We are fully transparent with students; they have every right to know what data is being recorded. Of course, transparency does not imply consent. We hope to earn students' consent over time. As with any fraud detection software, some people will always find ways to outsmart it. That is why we want to build a brand that invites students to help us help them. Unlike other anti-cheating tools that spy on users through their webcam and microphone, we take data from the user and give something back to the user: we take in information about reading and working behavior, and we return suggestions for improvement (more on this under question 3). Of course, our approach is open to scrutiny, and the notion of spying can be used against us. However, this raises the question: how far would universities go to solve this problem? I can tell you from my research and interviews that some would go very far (up to 2 years imprisonment in Australia). Thus, with contract cheating increasing at an accelerating rate, I expect some degree of consent will have to become a matter of convention. I could talk for hours about this subject, so please do contact me to continue the discussion.
3- I imagine two ways of using data to help students: one passive and one active. The passive help simply takes the data we already record (reading behavior) and conveys the class trend to the student. Something along the lines of "80% of the class spent the most time on question 3", "your reading speed is 40% slower between 4 and 7 pm", or "you are spending 80% more time on this part of the reading than the rest of the class, consider skimming?" The sky is really the limit here. I want to start with a few simple features and go on from there. The active help is best explained by what Google Maps does. You know how, after you go to a restaurant, a few questions pop up one day (do they serve sushi here, how expensive is this restaurant, do they have parking for disabled people)? Well, I am thinking of building a similar question bank for a class project. While on the project handouts, you could be randomly prompted to answer a pop-up question that could help another person in the class. For example, on a programming assignment you could hover over question 1 and see that 90% of students reported using a nested loop to solve the problem. I imagine that if you need more insight, you could type a question and the system would then try to get that information for you. You can think of this as students collaborating on the project just enough to remove the incentive to cheat. This area is really interesting, and I want to hear your thoughts and ideas.
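Two of the passive messages above can be sketched as follows. The data layout (seconds per section for each student), the sample numbers, and the function names are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical reading times in seconds: {student: {section: seconds}}.
times_by_student = {
    "s1": {"Q1": 100, "Q3": 300},
    "s2": {"Q1": 50, "Q3": 200},
    "s3": {"Q1": 150, "Q3": 100},
    "s4": {"Q1": 40, "Q3": 90},
    "s5": {"Q1": 10, "Q3": 210},
}

def most_read_section_share(times_by_student):
    """Return (section, share of students who spent the most time on it)."""
    tops = [max(t, key=t.get) for t in times_by_student.values() if t]
    if not tops:
        return None
    section, count = Counter(tops).most_common(1)[0]
    return section, count / len(tops)

def relative_focus(student, section, times_by_student):
    """Percent more (positive) or less (negative) time the student spent on a
    section compared with the average of the rest of the class."""
    own = times_by_student[student].get(section, 0)
    others = [t.get(section, 0) for s, t in times_by_student.items() if s != student]
    if not others or mean(others) == 0:
        return None
    return (own - mean(others)) / mean(others) * 100

section, share = most_read_section_share(times_by_student)
print(f"{share:.0%} of the class spent the most time on {section}")
focus = relative_focus("s1", "Q3", times_by_student)
print(f"You are spending {focus:.0f}% more time on Q3 than the rest of the class")
```

Both functions only read data that is already recorded, so the student provides no extra input for these messages.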
4- Our branding will try to emphasize our helpful side more than our anti-plagiarism side. However, since it is premature to pivot, our main goal remains anti-plagiarism. With experience, we will assess which direction to take. Since our solution involves only software, I believe scalability will not be an issue. For the present, as long as the software does its job and proves effective, I do not see any barrier to entering the anti-plagiarism market. We aim to reach the universities in Lebanon first, then the universities in the Gulf, before trying to expand to Europe and America. Of course, any high school that uses a learning management system could adopt our solution. However, I have only researched contract cheating in higher education, so I cannot speak confidently about high schools. For the time being, I aim to follow the usual pricing approach in the software market, with a specific profit margin per license. We could then tie the license price to the number of users at the university.