By: Assoc. Prof. Ts. Dr. Aznul Qalid Md Sabri
I read with keen interest a recent article at Futurism.com entitled “When AI Is Trained on AI-Generated Data, Strange Things Start to Happen”. The author sought to put into context an inherent danger that lurks within the world of generative AI if left unmonitored. This is especially pertinent given the explosive popularity of the generative AI scene following the introduction of ChatGPT to the public in November 2022.
Model collapse, a term gaining prominence within the AI domain, signifies a critical hurdle that AI models can encounter during their learning phase. To put it simply, model collapse refers to a scenario where an AI model degrades because it learns from flawed data rather than from the real-world distribution it is meant to capture. In the context of generative AI, this issue arises when a model is trained largely on the output of other generative AI systems, relying on synthetic or artificially generated data. With each such generation, rare patterns in the original data are gradually forgotten, and the model's outputs become narrower and less faithful to reality.
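The dynamic described above can be illustrated with a deliberately simplified sketch: below, each "generation" fits a simple Gaussian model to data, then the next generation is trained only on samples drawn from that fitted model, standing in for an AI trained on a predecessor's synthetic output. The function name, sample sizes, and generation count are illustrative choices, not anything from the article; real model collapse involves far richer models, but the same feedback loop applies.

```python
import random
import statistics

def train_on_own_output(data, generations=300, sample_size=30):
    """Toy model-collapse loop: fit a Gaussian (mean and spread) to the
    current data, then replace the data with samples drawn from that
    fitted model, so each generation sees only synthetic data.
    Returns the estimated spread (standard deviation) per generation."""
    history = []
    for _ in range(generations):
        mu = statistics.fmean(data)       # fitted mean of this generation
        sigma = statistics.pstdev(data)   # fitted spread of this generation
        history.append(sigma)
        # The next generation is trained purely on this model's output.
        data = [random.gauss(mu, sigma) for _ in range(sample_size)]
    return history

random.seed(0)
real_data = [random.gauss(0.0, 1.0) for _ in range(1000)]  # "real" data, spread ~1
spreads = train_on_own_output(real_data)
print(f"first generation spread: {spreads[0]:.3f}")
print(f"last generation spread:  {spreads[-1]:.3f}")
```

Because each generation can only reproduce what the previous one already captured, estimation error compounds and the measured spread tends to shrink toward zero over the generations: the diversity of the original data is progressively lost, which is the essence of collapse.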
This concern has not gone unnoticed; AI researchers have been sounding the alarm about the potential risks associated with the widespread adoption of generative AI models. The challenge, however, is multifaceted and requires a comprehensive response. This entails establishing robust AI governance mechanisms that facilitate collaboration and coordination among the key players involved in developing generative AI models, both locally and globally.
An essential aspect of this collaboration involves sharing information and collectively ascertaining the origins of the data that AI models are trained on. Apart from maintaining transparency and accountability in AI development, these efforts are necessary to prevent the proliferation of noisy synthetic data—erroneous or misleading data generated by AI systems that then feeds back into mainstream AI models.
To put it in more accessible terms, envision an intricate ecosystem in which multiple parties contribute to the growth of AI technology while ensuring it does not deviate from its intended purpose through inadequate quality control. For this ecosystem to thrive and produce reliable outcomes, its contributors must collectively manage the quality of both data and models.
This becomes even more crucial considering the expanding role of AI in critical sectors such as healthcare. AI models might be used in patient risk assessment, for example, where they predict patient outcomes and assess risks based on medical data. If such a predictive model suffers from model collapse, it might become excessively reliant on certain patient profiles or data patterns, leading to biased or inaccurate risk assessments.
In another example, generative AI models can be used to produce personalized treatment plans based on a patient's medical history. Under model collapse, the AI might generate treatment plans that converge on a few standard protocols, disregarding individual patient differences. This could lead to suboptimal, wasteful, or even dangerous treatment recommendations.
In conclusion, the call for effective AI governance stems from the need to build a harmonious relationship between technological advancement and responsible usage. The objective is to strike the right balance that harnesses the potential of generative AI models, without succumbing to the risks associated with unchecked development, such as model collapse.
The author is an Associate Professor at the Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya. He may be reached at email@example.com