Can AI fatigue be solved through data governance?
Data governance has long been a core focus of the data industry.
Data governance encompasses all measures taken to ensure that data is secure, private, accurate, available, and reliable, including the development of internal standards and data policies that regulate how data is collected, stored, processed, and disposed of. This process is critical to protecting user privacy and maintaining data integrity.
As this definition emphasizes, data governance is about managing data, which is precisely the engine that drives AI models.
While the connection between data governance and AI is apparent at first glance, linking it to AI fatigue requires a look at the causes of that fatigue, so let's define the term and use it consistently throughout the article.
AI fatigue sets in when a company, developer, or team encounters setbacks and challenges that hinder the implementation of AI systems or the realization of their value.
Over-hyped AI, that is, unrealistic expectations of its capabilities, is the main culprit. Stakeholders need to be aligned on AI's capabilities, possibilities, limitations, and risks in order to properly assess its value and applications.
When it comes to risk, ethics are often considered an afterthought, leading to the abandonment of non-compliant AI initiatives.
You may be wondering what role data governance plays in AI fatigue; that is the premise of this article, and it is where we are headed next.
AI fatigue can be roughly divided into pre-deployment and post-deployment phases. Let's focus on the pre-deployment phase first.
Many factors go into moving a Proof of Concept (PoC) to deployment, some of which are discussed below.
Once we have established that an ML algorithm is the right fit for the problem at hand, the data science team performs exploratory data analysis. Many underlying data patterns are revealed at this stage, highlighting whether the given data contains rich signals. It also helps in creating engineered features that speed up the algorithm's learning process.
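To make this concrete, here is a minimal EDA sketch using pandas. The DataFrame `df`, the target column `churned`, and the raw columns `total_spend` and `tenure_months` are hypothetical placeholders, not names from the original project.

```python
import pandas as pd

def quick_eda(df: pd.DataFrame, target: str = "churned") -> None:
    # Missing-value rate per column: columns that are mostly null rarely carry a usable signal.
    print("Null rate per column:")
    print(df.isna().mean().sort_values(ascending=False))

    # Rough first-pass signal check: correlation of numeric features with the target.
    numeric = df.select_dtypes("number")
    print("\nCorrelation with the target:")
    print(numeric.corrwith(df[target]).sort_values(key=abs, ascending=False))

    # Simple engineered feature: a ratio the algorithm would otherwise have to learn on its own.
    if {"total_spend", "tenure_months"} <= set(df.columns):
        df["spend_per_month"] = df["total_spend"] / df["tenure_months"].clip(lower=1)
```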
Next, the team builds a first baseline model and often finds that its performance does not reach an acceptable level. A model whose output is no better than flipping a coin adds no value; this is one of the first setbacks and lessons learned when building ML models.
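As a rough sketch of that "coin flip" sanity check, the baseline model can be compared against a dummy classifier. The synthetic dataset below merely stands in for whatever data the project actually uses.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the project's real feature matrix and labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# "Coin flip" reference point: always predicts the most frequent class.
dummy = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# Deliberately simple first baseline.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("Dummy accuracy:   ", accuracy_score(y_val, dummy.predict(X_val)))
print("Baseline accuracy:", accuracy_score(y_val, baseline.predict(X_val)))
# If the baseline barely beats the dummy, the data may lack signal or the features need more work.
```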
This is where companies can drift from one business problem to another, causing fatigue. Still, if the underlying data does not carry rich signals, no AI algorithm can be built on it. The model must learn statistical associations from the training data in order to generalize to unseen data.
Even when the trained model shows promising results on the validation set according to the agreed business standard, such as 70% accuracy, fatigue can still set in if the model does not perform at that level in a production environment. This type of AI fatigue belongs to the post-deployment phase.
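One hedged sketch of what such a post-deployment check might look like: compare the accuracy measured on a recent window of labelled production traffic against the agreed business threshold. The 70% figure and the placeholder labels below are illustrative assumptions.

```python
from sklearn.metrics import accuracy_score

ACCURACY_THRESHOLD = 0.70  # the agreed business standard from the example above

def check_production_accuracy(y_true, y_pred) -> bool:
    """Return True if the model still meets the business threshold in production."""
    acc = accuracy_score(y_true, y_pred)
    status = "OK" if acc >= ACCURACY_THRESHOLD else "BELOW THRESHOLD"
    print(f"Production accuracy over the window: {acc:.2%} ({status})")
    return acc >= ACCURACY_THRESHOLD

# Placeholder data: 7 of the 10 most recent predictions were correct.
check_production_accuracy(
    y_true=[1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    y_pred=[1, 0, 1, 0, 0, 1, 1, 1, 1, 1],
)
```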
Numerous factors can cause performance degradation, and poor data quality is the most common one plaguing models: missing key attributes limit the model's ability to predict the target response accurately.
Consider an essential feature that was missing in only 10% of the training data but is now null 50% of the time in the production data, leading to incorrect predictions. The repeated iterations and effort required to keep the model performing consistently exhaust data scientists and business teams alike, eroding confidence in the data pipelines and putting the project investment at risk.
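A small sketch of the kind of monitoring that catches this scenario compares per-column null rates between the training snapshot and a batch of production data. The column name and the 10-point alert threshold below are illustrative assumptions.

```python
import pandas as pd

def null_rate_drift(train: pd.DataFrame, prod: pd.DataFrame,
                    max_increase: float = 0.10) -> pd.DataFrame:
    """Flag columns whose production null rate exceeds the training null rate by more than `max_increase`."""
    report = pd.DataFrame({
        "train_null_rate": train.isna().mean(),
        "prod_null_rate": prod.isna().mean(),
    })
    report["increase"] = report["prod_null_rate"] - report["train_null_rate"]
    report["alert"] = report["increase"] > max_increase
    return report.sort_values("increase", ascending=False)

# Toy example mirroring the scenario above: 10% nulls in training, 50% in production.
train = pd.DataFrame({"key_feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, None]})
prod = pd.DataFrame({"key_feature": [1.0, None, 3.0, None, 5.0, None, 7.0, None, 9.0, None]})
print(null_rate_drift(train, prod))
```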
Robust data governance measures are critical to addressing both types of AI fatigue. Given that data is at the core of ML models, signal-rich, error-free, high-quality data is necessary for the success of ML projects. Addressing AI fatigue therefore requires a strong focus on data governance: we must work rigorously to ensure the right data quality, laying the foundation for building state-of-the-art models and delivering trustworthy business insights.
Data quality is key to thriving data governance and a critical factor in the success of machine learning algorithms. Companies must invest in data quality, for example by publishing data quality reports to data consumers. In a data science project, consider what happens when poor-quality data enters the model: it leads to poor performance.
Often, teams only identify data quality issues during error analysis, and sending those issues upstream for fixes after the fact ultimately leads to fatigue. Clearly, it is not only the effort spent; a lot of time is also lost before correct data starts flowing in.
It is therefore always recommended to fix data issues at the source to prevent such time-consuming iterations. Ultimately, published data quality reports mean that the data science team (and any other downstream users and consumers of the data) understand what quality of incoming data is acceptable.
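As an illustration of what such a published data quality report could contain, here is a minimal per-column summary. The checks and the sample columns are assumptions for the sketch, not the output of any specific governance tool.

```python
import pandas as pd

def data_quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column summary that downstream consumers can use to judge incoming data quality."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean(),
        "n_unique": df.nunique(dropna=True),
    })

sample = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "signup_date": ["2024-01-01", None, "2024-02-10", "2024-03-05"],
    "monthly_spend": [49.9, 19.9, 19.9, None],
})
print(data_quality_report(sample))
print("Duplicate rows:", len(sample) - len(sample.drop_duplicates()))
```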
Without data quality and governance measures, data scientists become overwhelmed by data issues, which leads to failed models and AI fatigue.
This article has highlighted the two stages at which AI fatigue sets in and described how data governance measures, such as data quality reports, can help build robust and trustworthy models.
By creating a solid foundation through data governance, companies can build a roadmap for successful and frictionless AI development and adoption, inspiring enthusiasm.
To round out this overview of the different approaches to combating AI fatigue, I also want to highlight the role of organizational culture, which, combined with other best practices such as data governance, will enable data science teams to deliver meaningful AI contributions faster.