There is an urgent need to address the apparently growing problem of plagiarism in academia. Writing in the International Journal of Data Mining, Modelling and Management, a researcher from Saudi Arabia has focused on one particular aspect of plagiarism: an author taking images from another source and passing them off as their own without due credit to the original content creator. The work also examines how this might be detected using technology. Images and figures within a research paper may represent hard-won experimental data or even core concepts within the research, and so are critical to the scientific endeavour.
Taiseer Abdalla Elfadil Eisa of King Khalid University, Mahayil, in Asir, explains that detecting plagiarism in the figures and images used in a research publication is particularly challenging, not only because of the complexity of the requisite analysis and comparison but also because of the vast number of research papers published in journals each year. The research looks at a technique that can analyse the textual content and structure of the figures in a paper. Image processing and semantic mapping are employed, Eisa explains.
“In scientific publications, quantitative information, results of experiments, frameworks, and statistical facts are represented in infographic form, such as figures, charts, and tables, rather than in text forms,” Eisa explains. “However, less attention has been paid to detecting plagiarism in these non-textual elements of scientific publication.” The current study addresses this issue directly. Existing text-matching tools cannot extract information for comparison from the non-textual components of an image, such as a flowchart, and the new approach overcomes that limitation. It identifies shapes within an image, describes those shapes and their relationships within the image textually, and annotates the description with OCR (optical character recognition) of any text within the shapes.
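As a rough illustration of the idea of comparing figures through textual descriptions, the following Python sketch assumes that shape detection and OCR have already been done and that each figure is available as a set of labelled shapes plus the arrows between them. The `Shape` class, the `describe` and `similarity` functions, and the use of Jaccard overlap as the score are all assumptions made for this example; they are not taken from Eisa's paper, which may represent and compare figures quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Hypothetical record for one detected shape (not from the paper)."""
    kind: str  # e.g. "rectangle" or "diamond", as a shape detector might report
    text: str  # text inside the shape, as an OCR step might return it


def describe(shapes, arrows):
    """Turn a figure into a set of short textual statements.

    shapes: dict mapping a shape id to a Shape
    arrows: list of (source_id, target_id) pairs
    """
    statements = set()
    for shape in shapes.values():
        statements.add(f"{shape.kind} containing '{shape.text.lower()}'")
    for src, dst in arrows:
        a, b = shapes[src], shapes[dst]
        statements.add(f"'{a.text.lower()}' connects to '{b.text.lower()}'")
    return statements


def similarity(desc_a, desc_b):
    """Jaccard overlap of two statement sets (an assumed scoring choice)."""
    if not desc_a and not desc_b:
        return 0.0
    return len(desc_a & desc_b) / len(desc_a | desc_b)


# Two small flowcharts: the suspect figure changes only the final box.
original = describe(
    {1: Shape("rectangle", "Collect data"),
     2: Shape("diamond", "Valid?"),
     3: Shape("rectangle", "Publish")},
    [(1, 2), (2, 3)],
)
suspect = describe(
    {10: Shape("rectangle", "Collect data"),
     20: Shape("diamond", "Valid?"),
     30: Shape("rectangle", "Report")},
    [(10, 20), (20, 30)],
)

score = similarity(original, suspect)
print(round(score, 2))  # 0.43
```

Because each figure is reduced to statements about shape types, their text, and their connections, the comparison does not depend on pixel-level appearance or on how shapes happen to be numbered. A real system would need a threshold for flagging a match, and choosing one is outside the scope of this sketch.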
The approach improves significantly on existing methods, Eisa writes, addressing the problem of text within shapes in a figure in a way that other approaches have not managed.
Eisa, T.A.E. (2022) ‘Plagiarism detection of figure images in scientific publications’, Int. J. Data Mining, Modelling and Management, Vol. 14, No. 1, pp.15–29.