Data Governance for Sustainable Artificial Intelligence
Tukia, Anttoni (2022)
Master's Programme in Information and Knowledge Management
Faculty of Management and Business
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of approval
2022-12-14
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202212099003
Abstract
The rapid growth of big data and processing power in recent years has caused an upsurge in the use of artificial intelligence (AI) in numerous domains. AI has brought great benefits to society, but it has also raised several sustainability and ethical issues. There is a consensus among researchers that the harms caused by AI must be mitigated, and proper AI governance has been identified as an important part of that mitigation. AI needs large quantities of data to work, which makes some aspects of data governance relevant to mitigating these harms. Still, studies on how data governance can help solve these issues remain scarce.
The aim of this thesis is to examine how AI can be positively influenced towards sustainability by means of data governance. Another goal is to understand the role of data in sustainable AI. First, a literature review on data governance, sustainable AI, and AI governance was conducted, and a theory-based data governance framework for sustainable AI was formed. This framework was then refined in the empirical part through two workshops organised for data and AI governance experts from the case company, Solita.
The most significant data-related challenge regarding AI identified in this research is how to ensure data quality. If data quality is not ensured, the outcome of the AI algorithms using the data may become biased or skewed. Additionally, data must be acquired correctly and transparently to prevent its misuse, especially when it is personally identifiable, sensitive, private, or confidential. These matters are relevant not only for the input data but for the output as well. To prevent biased or skewed output data from causing damage, it should be properly managed and governed. Furthermore, the data supply chains around AI systems should be accompanied by clear and continuous chains of accountability and responsibility. However, if a biased or skewed outcome is produced by a flaw in the algorithm itself, such as a programming or design error, the issue may need to be addressed in some other way than by ensuring data quality.
The main result of this research is a data governance framework for sustainable AI. Its objective is to be a high-level abstraction of what needs to be considered when supporting sustainable AI by means of data governance, and to illustrate how different data governance activities support the goals of different AI governance elements. Three AI governance elements were selected for the framework: AI System, Organisation, and Ecosystem. The framework also includes seven data governance activities: (1) Objectives & Key Results, (2) Decision Rights and Accountabilities, (3) Data Policies, Rules, Definitions, and Standards, (4) Roles and Responsibilities, (5) Data Processes, (6) Data Governance Metrics, and (7) Controls. The numbers indicate the order in which the data governance activities should be carried out.
In addition to contributing to the academic discussion in the field of sustainable AI, this research has practical implications for the case company. Thus, strengthening the role of private businesses in guiding AI towards sustainability is another contribution of this research.