Links Between East and West 55: AI and Academics
As the development of artificial intelligence (AI) accelerates, it is increasingly crucial for societies to discuss its potential impacts. Some hail AI as a technology heralding another major technological revolution in human history, claiming it will become a strong presence in fields ranging from education to finance to entertainment. There is merit to this enthusiasm. AI's significance may rest not so much in its current capabilities as in its capacity to transform how many industries operate over the long run.
One of the fundamental fields of human activity that must brace for the shock of AI is academia and academic education. Since OpenAI's ChatGPT entered the stage, academics and educators have scrambled to prohibit certain uses of the technology and to debate new additions to academic integrity codes. This essay begins by examining the challenges AI poses to academia, followed by a discussion of academic integrity in the age of generative AI. It ends by investigating ways to ensure that AI does not cause an upheaval in academic environments worldwide.
It is dangerous to understate AI's challenges to academia. As AI tools become more prevalent, academic misconduct is rising sharply. An investigation by Taylor and Francis, a prominent publisher of scientific papers, shows that its research-integrity cases rose from about 800 in 2021 to roughly 2,900 in 2022, and its investigators predict that the number might double in 2023. The specific academic risks fall into several categories. First, AI allows researchers to make up and cite nonexistent sources. Large language models (LLMs) can fabricate critical information, such as the title of another paper or the evidence used to support an argument. Elizabeth Bik, a microbiologist at Stanford University, points out that between 2016 and 2020, dozens of biology papers contained images produced by AI and created deliberately to back their theses. This risk is severe: other researchers and readers digest the information academics present in their papers on the premise that it is real. Forging academic information with AI breaks the trust between researchers, their peers, and their readers.
Second, AI-generated content can be challenging to identify. As LLMs evolve, their responses to human requests will look ever more genuine. The real difficulty for readers is not merely that AI-generated content looks real on its own; rather, once researchers fuse AI-generated content into their writing and fine-tune the language to their own style, it becomes improbable that readers can discern any marked difference. This risk, again, might lead to a trust crisis. For example, if readers are aware that an argumentative essay might contain AI-generated content, they will doubt the fundamental validity of the evidence the author presents to make their point. Such a trust crisis undermines readers' willingness to read academics' work and to believe the professional knowledge it provides.
Third, AI can be biased. Some hold that using AI increases the objectivity of scholarly writing, but this opinion lacks merit. Computer scientists often train LLMs on AI-generated data or on data sets that leave out certain demographic groups. For instance, the underrepresentation of women or minorities in training data can lead to biases in predictive AI algorithms. Experts have found that computer-aided diagnosis systems, which should benefit immensely from the power of AI, return results with lower accuracy for Black patients than for white patients. Recommendation algorithms that analyze an app's user data to suggest content are likewise prone to skew because of biases already present in the data they are trained on. In academia, writing that utilizes AI-generated materials carries, and even magnifies, the biases in those materials. This issue can have ethical ramifications, especially when AI biases touch on sensitive topics of race, ethnicity, or politics.
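To make the mechanism concrete, here is a minimal sketch using entirely hypothetical, synthetic data. It shows how a group that is underrepresented in the training set can end up with systematically lower prediction accuracy; the group labels, feature construction, and sample sizes are illustrative assumptions, not a description of any real diagnostic or recommendation system.

```python
# A minimal, hypothetical sketch: when one demographic group dominates the
# training data, a model fitted to the pooled data can be markedly less
# accurate for the group it barely saw.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Synthetic two-feature data; each group has a slightly different
    # relationship between features and label (controlled by "shift").
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + shift * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

# Group A dominates the training set; group B is underrepresented.
Xa, ya = make_group(5000, shift=0.2)
Xb, yb = make_group(100, shift=-1.0)
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Fresh test samples from each group: accuracy is typically lower for the
# group the model rarely encountered during training.
for name, shift in [("well-represented group", 0.2), ("underrepresented group", -1.0)]:
    X_test, y_test = make_group(2000, shift)
    print(f"{name}: accuracy = {model.score(X_test, y_test):.3f}")
```

In this toy setup the two groups follow different feature-to-label relationships, so a single model trained mostly on group A transfers poorly to group B. Real-world bias arises through messier channels, but the underlying data-imbalance logic is similar.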
As AI permeates different academic environments, what will academic integrity look like in this age of generative algorithms? The quick answer is that the old rules and values will still hold. Academic integrity demands that work be transparent, credible, honest, and trustworthy, and some existing rules apply to AI as well. For example, one must refrain from using AI to write whole blocks of text, as doing so entails a significant loss of originality. Institutions should also require scholars to cite AI tools whenever they are used. AI will not uproot academic environments entirely, at least not in the short run, so the fundamental guidelines that have been in place for centuries will continue to apply.
However, some new points should be added to the idea of academic integrity. Most importantly, scholars must not use AI to make up nonexistent information; this behavior must be considered a blatant violation of academic integrity. Two specific cases of violation fall under this point. In the first, an author deliberately forges content with AI tools, and institutions should punish the author severely. In the second, an author uses AI-generated fabricated content without being aware of it. After all, it is difficult to determine what data an AI tool draws on to generate a response; even if the author has no desire to use fake content, the material the AI feeds them may still be undetectably fabricated. This case calls for a systematic solution. Academic organizations should push for regulations on the capability of LLMs to fabricate content. Such regulations can discourage scholars from relying on nonexistent evidence by prompting them to place less blind trust in these models' outputs. Innovative leaders in AI development, such as Mustafa Suleyman of Inflection AI, have supported this notion of taking certain features off the table.
Another way to prevent scholars from fabricating information with AI is to have them cross-check each other's work. Ideally, this creates a mechanism through which academics watch and supervise one another, so that any attempt to use fake evidence can be caught. This is not meant to take away the so-called freedom of academia; having experts read each other's work is simply a way to screen each piece before it becomes widely accessible, and experts are likely to detect traces of AI-generated content more effectively than most readers.
Aside from the ban on using AI to make up nonexistent materials, another aspect of academic integrity in the era of powerful algorithms is engaging readers more actively. Essays that give readers no way to cross-check evidence should be regarded as academically dishonest. Today, readers are more passive than active when absorbing the content of an academic paper. Admittedly, some actively process the author's argument as they read and choose to agree or disagree, but they remain heavily influenced by the product of the author's thinking. Academics should provide direct channels in their work for readers to cross-check the information themselves. Through this approach, every reader can be an active sifter and processor of information rather than a passive receiver, which strengthens the audience's ability to judge the authenticity and credibility of an essay's content, qualities that are arguably of utmost importance in the age of AI.
To implement this approach, academics should, beyond citations, link other relevant sources and peer-reviewed work in the main body of the text, and these links must be direct and easy for the target readers to access. The aim is to make an essay as transparent as possible. Such a practice is already common in some fields, such as history. When Herodotus, the Greek historian who lived 2,500 years ago, wrote his Histories, he habitually listed, whenever he made a disputable claim, the sources his audience might consult. This mode of writing should be adopted more widely in today's AI age. Fostering open discussion about writing is the ideal new norm for academia in the presence of AI.
Finally, regarding academic integrity, greater weight should be placed on the ethical side of a piece of work. Current codes of academic integrity emphasize ethics less than they should. AI's biases can be hard to spot and avoid, so how researchers use AI ethically in the presence of potential prejudice is vital. The essential requirement is that scholars' work not promote harmful or misleading information. The burden of distinguishing reliable information from unreliable information rests not just on readers but also, significantly, on writers. If authors choose to incorporate AI into their research, they must sift through the algorithm's responses and check whether they violate ethical codes or could harm specific groups of readers.
Aside from academics, LLM trainers are another stakeholder in the AI world. This essay argues that, for ethical reasons, they should not yet let AI train itself. Allowing AI to train recursively on its own output can today cause an undesirable phenomenon known as "model collapse." In 2023, Ilia Shumailov, a computer scientist at Oxford University, published a paper investigating this phenomenon. The paper describes an experimental case: a model was fed handwritten digits and asked to generate digits of its own, which were then fed back to it in turn. After a few iterations, the model's digits became increasingly illegible; after 20 iterations, it could generate only rough circles or blurred lines. This experiment illustrates that AI algorithms still have a long way to go before they can reliably improve themselves. Letting go of the reins now risks producing ineffective and inaccurate algorithms. More importantly, unmonitored algorithms that can magnify the biases in the training data they feed themselves are genuinely alarming, and the negative ethical ramifications would be hard both to measure and to eradicate.
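The dynamic is easy to reproduce in miniature. The sketch below is a deliberately simplified stand-in for the digit experiment described above, not the paper's actual setup: a one-dimensional Gaussian "model" is repeatedly refitted to samples drawn from its own previous generation, and the fitted distribution tends to drift and lose spread as estimation errors compound.

```python
# A toy illustration of model collapse (hypothetical setup, not the original
# experiment): each generation of the "model" is fitted only to samples drawn
# from the previous generation, never to real data again.
import numpy as np

rng = np.random.default_rng(0)

real_data = rng.normal(loc=0.0, scale=1.0, size=100)  # stand-in for human-made data
mu, sigma = real_data.mean(), real_data.std()         # generation 0 is fitted to real data

for generation in range(1, 21):
    synthetic = rng.normal(loc=mu, scale=sigma, size=100)  # the model's own output
    mu, sigma = synthetic.mean(), synthetic.std()          # next generation fits that output
    if generation % 5 == 0:
        print(f"generation {generation:2d}: mean = {mu:+.3f}, std = {sigma:.3f}")

# Over many generations the estimated spread tends to shrink and the mean to
# wander, echoing how the digit-generating model degraded into blurs.
```

The toy keeps only the feedback loop; real generative models degrade in richer ways, but the loss of fidelity and diversity with each self-trained generation is the same core failure.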
Academia should be more involved in the training of AI to ensure that the data LLMs digest reflects a broad spectrum of demographic groups. The current problem is that the development and training of AI do not yet happen in the open, so the only groups with a say in how AI is trained are computer scientists and government personnel. Since AI's integration into academic work cannot be stopped, the safest option is to factor academics' opinions into discussions of what capabilities LLMs should have. Academia should debate which features might be hugely disruptive and therefore should not be widely deployed. A decision-making process involving only computer specialists and government officials risks unintended impacts on academia.
To tackle the challenges posed by AI and protect academic integrity, the general philosophy underlying all the solutions above is to build greater cohesion among the different stakeholders and parties; a concerted effort spearheaded by academics is the most effective. Beyond these, there are broader, more macro-level responses to AI's disruption of academia. If implemented successfully, they would help secure a future in which students and researchers coexist with AI algorithms.
The first is raising awareness within primary and secondary education. Today these frameworks lag behind, in that educators have yet to integrate AI into the daily classroom. Doing so may be a tall order, but technological innovators have been making significant progress. In a recent talk, Sal Khan, founder of the renowned Khan Academy, discussed how his learning platform integrates an AI tool called "Khanmigo." Khanmigo heralds a new era for Khan Academy in which AI becomes a nonnegligible factor in student learning: the tool is available while students watch instructional videos, work through practice exercises, or browse the site for general purposes. Khan believes it is vital to familiarize students with AI early so that they grow accustomed to algorithms in their future learning and research.
The second long-run solution concerns coordination among AI firms, academia, and educators. AI companies like OpenAI are responsible for keeping academic researchers and educators informed about the technology's evolving landscape. Faced with such a rapidly developing technology, people may lack the time or the ability to fully understand each burst of progress. AI firms should therefore openly discuss new features of their models so that guidelines can be created collaboratively and adapted to changing conditions.
The third long-run solution calls for founding an independent academic institution, or a collective of such institutions, to monitor the development of the LLMs that academics may employ. Historically, whether with fossil fuel technologies or with mobile phones, an independent organization eventually oversaw the technology's progression, and AI requires independent oversight too. A checks-and-balances system is desirable, in which an independent academic institution monitors AI firms while the firms, in turn, have some say over the composition of that institution.
Given all the challenges and obstacles, this essayist still holds an optimistic view of AI in academia. For researchers, AI holds immense promise for academic breakthroughs. For instance, AI recently helped historians decipher damaged Roman papyrus scrolls carbonized by the eruption of Mount Vesuvius in 79 C.E., and with AI tools, researchers will achieve many more such breakthroughs. AI can also make academic work and achievements genuinely impactful to society, including by bringing unprecedented help to underprivileged populations in healthcare. For the public, AI makes education and academic research more accessible. As the previous essay about the doughnut economy explains, educating the populace is one of the most imperative tasks for reaching a sustainable future with a secure social foundation. With proper regulation and discussion, the AI age will be the best of times; with rash carelessness and complacency, it will be the worst of times.
- Tags: Original
- Permalink: http://www.jack-utopia.cn//article/632
- Copyright: Originally written and published by Jack. Please follow the Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license when reposting.