[Spring Workshop 2023] Risks, Realities, and Ruminations on the Probabilistic Web

Friday, June 2, 1:00 – 1:10

Session Description

Current AI tools use large language models (LLMs) to create original text, using probability to build it word by word. What does this mean for determining whether something is real? If AI output is based on how often words appear together in the resources the tool was trained on, is it only as real as what is freely available online? How does this question of what is real intersect with other tools, techniques, and technologies? Using AI-produced academic papers and their reference lists as our case study, we will examine how tools like ChatGPT may alter our understanding of the current information landscape and allow us to interrogate a path forward. Reference lists make an accessible test case, since fact-checking them is somewhat more straightforward than fact-checking the paper itself. ChatGPT assembles references that may or may not exist: merged references that blend real DOIs or article titles sit alongside completely invented ones. What happens if judging something as real becomes too much of a barrier? Readers may not have access to the evidence needed to refute incorrect citations, as most scholarly sources are locked behind paywalls. Reviewing sources can also be too time-consuming; most professors give references only a cursory skim unless they notice a problem. Is it reasonable to expect them to take time out of grading to research references closely? And while these decisions could be offloaded to detection tools, those tools are not keeping up; Turnitin's AI checker, for example, returns at least 1% false positives. Finally, is there hope for a semantic web, where AI tools recognize and evaluate their sources and know what a citation is, not just what it should look like?
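As a rough illustration of the "word by word" generation described above, the toy bigram model below counts how often each word follows another in a tiny training text, then samples the next word in proportion to those counts. This is a minimal sketch, not how any production LLM is implemented; real models work over tokens with far larger contexts, but the probabilistic generative step is the same idea.

```python
import random
from collections import Counter, defaultdict

# Toy training text; a bigram model only ever sees adjacent word pairs.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word.
follows: dict[str, Counter] = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def generate(start: str, length: int = 8) -> str:
    """Build text word by word, sampling each next word by frequency."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # no observed continuation in the training text
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```

For the reference-list case study, one lightweight first check is whether each cited DOI actually resolves at doi.org. The sketch below uses only the Python standard library, and the example DOIs are hypothetical stand-ins for those in the paper under review. A DOI that resolves is not proof the citation is accurate, since an LLM can pair a real DOI with the wrong title or authors, but one that does not resolve is a strong signal the reference was invented.

```python
import urllib.error
import urllib.request

def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Return True if https://doi.org can resolve the given DOI."""
    request = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except urllib.error.HTTPError:
        # doi.org answers 404 for DOIs that were never registered. Some
        # publishers also block HEAD requests, so a 403 here can be a
        # false negative worth checking by hand.
        return False
    except urllib.error.URLError:
        return False

if __name__ == "__main__":
    # Hypothetical inputs; substitute the DOIs from the reference list
    # under review.
    for doi in ("10.1000/182", "10.9999/not-a-real-doi"):
        status = "resolves" if doi_resolves(doi) else "does not resolve"
        print(f"{doi}: {status}")
```

Even this crude check automates the part of citation review that paywalls do not block, leaving human attention for the harder question of whether a resolvable DOI matches the claimed title and authors.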

Recording and Materials


Files and links.

