Data Equity: Foundational Concepts for Generative AI

BRIEFING PAPER
OCTOBER 2023

Images: Getty Images

© 2023 World Economic Forum. World Economic Forum reports may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

Disclaimer
This document is published by the World Economic Forum as a contribution to a project, insight area or interaction. The findings, interpretations and conclusions expressed herein are a result of a collaborative process facilitated and endorsed by the World Economic Forum but whose results do not necessarily represent the views of the World Economic Forum, nor the entirety of its Members, Partners or other stakeholders.

Contents
Introduction
1 Classes of data equity
2 Data equity across the data lifecycle
3 Data equity challenges in foundation models
4 Focus areas for key stakeholders
5 Discussion
Conclusion
Contributors
Endnotes

Introduction

Over the past several months, a series of technological advances have emerged as a result of generative artificial intelligence (genAI) tools, including ChatGPT, Bard, Midjourney, and Stable Diffusion. The use of these tools has gained significant attention and captured the imagination of public and industry stakeholders due to their capabilities, wide range of applications and ease of use. Given its potential to challenge established business practices and operational paradigms, and the promise of rapid innovation coupled with the likelihood of significant disruption, genAI is sparking global conversations. These anticipated, far-reaching consequences have a societal dimension and will require comprehensive engagement from key stakeholders such as
industry, government, academia and civil society.

At the heart of these discussions lies the concept of “data equity”, a core notion within data governance centred on the impact of data on the equity of technical systems for individuals, groups, enterprises and ecosystems.[1] It includes concepts of data fairness, bias, access, control and accountability, all underpinned by principles of justice, non-discrimination, transparency and inclusive participation. Data equity is not a new concept; it is grounded in human rights and part of ongoing work on data privacy, protection, ethics, Indigenous data sovereignty and
responsibility.

The intersection of data equity and genAI, however, is new and presents unique challenges. The datasets used to train AI models are prone to biases that reinforce existing inequities. This requires proactively auditing data and algorithms and intervening at every step of the AI process, from data collection to model training to implementation, to ensure that the resulting genAI tools fairly represent all communities. With the advent of genAI significantly increasing the rate at which AI is deployed and developed, exploring frameworks for data equity is more urgent than ever.

This briefing paper delves into these issues, with a particular focus on data equity within foundation models, both in terms of the impact of genAI on society and on the further development of genAI tools. Our goals are threefold: to establish a shared vocabulary to facilitate collaboration and dialogue; to scope initial concerns to establish a framework for inquiry on which stakeholders can focus; and to shape future development of promising technologies proactively and positively.

The World Economic Forum's Global Future Council (GFC) on Data Equity[2] envisions this as a first step in a broader conversation, recognizing that these issues need further exploration and discussion to be comprehensively understood, scrutinized and addressed. The issues are complex and interconnected. Tackling them now creates a unique opportunity to positively shape the future of these exciting, promising tools.

Box 1: Definitions of key concepts

To provide context and clarity, the following key concepts are highlighted:

Artificial intelligence is a broad field that encompasses the ability of a machine or computer to emulate certain aspects of human intelligence for diverse tasks based on predetermined objectives.[3]

Machine learning is a subset of artificial intelligence which utilizes algorithms to enable machines to identify and learn from patterns found in datasets.[4]

Generative AI is a branch of machine learning that is capable of producing new text, images and other media, replicating patterns and relationships found in the training data.[5]

Foundation models are a type of large-scale machine-learning model that is trained on diverse multi-modal data at scale and can be adapted to many downstream tasks.[6]

Large language models (LLMs) represent a subset of foundation models specializing in comprehending and generating human language, often employed for text-related functions. The latest iteration of LLMs facilitates natural conversations through advanced chatbot mechanisms.[7]

1 Classes of data equity

Effectively addressing the complexities of data equity mandates an appreciation of the diverse viewpoints held by various stakeholders regarding data. The academic literature has identified four distinct classes of data equity, which are closely interrelated:[8]

Representation equity seeks to enhance the visibility of historically
marginalized groups within datasets while also accounting for data relevancy for the target populations. The development of models primarily within the Global North introduces disparities in representation, potentially leading to systemic biases in subsequent decisions rooted in such data. A proactive approach is indispensable to ensure that AI training data and models authentically reflect all stakeholders without encoding biases.

Feature equity seeks to ensure the accurate portrayal of individuals, groups and communities represented by data, necessitating the inclusion of attributes such as race, gender, location and income alongside other data. Without these attributes, it is often difficult to identify and address latent biases and inequalities.

Access equity focuses on the equitable accessibility of data and tools across varying levels of expertise. Addressing transparency and visibility issues related to model construction and data sources is critical. Access equity also encompasses disparities in terms of AI literacy and the digital divide.

Outcome equity pertains to impartiality and fairness in results. Beyond developing unbiased models, maintaining vigilance over unintended
consequences that impact individuals or groups is necessary. Transparency, disclosure and shared responsibility are crucial to achieve fairness.

These four classes of data equity are particularly relevant to genAI, but not exhaustive. Two other prominent types of equity broadly applicable to technology that need to be considered are procedural and decision-making equity. These procedural elements underscore broad equity concerns and include transparent decision-making, fair treatment of workers who develop and deploy technology, and inclusive development and deployment practices.[9] Going further, consideration must also be given to issues of temporal equity (sustainability and long-term impacts) and relational equity (fostering equitable stakeholder relationships). These latter issues are not unique to genAI or technology broadly and, as such, are beyond the scope of this paper. Nonetheless, they are acknowledged here as integral components of the overarching fabric of technology equity.

Figure 1: Classes of data equity. The four classes of data equity issues (outcome equity, representation equity, access equity and feature equity) are interconnected as well as influenced and impacted by equitable practices and considerations in procedures and decision-making (procedural and decision-making equity).
Source: World Economic Forum

2 Data equity across the data lifecycle

A simplified representation is helpful in showing how data equity permeates the data lifecycle. At each stage, different classes of data equity raise specific challenges and concerns, illustrating the need for multifaceted approaches to mitigate potential harms.

Figure 2: Data equity across the data lifecycle.
Stage 1: Input data equity (representation and feature equity)
Stage 2: Algorithmic data equity (representation, feature, access equity)
Stage 3: Output data equity (access and outcome equity)

Ensuring data equity throughout the data lifecycle involves multiple stages: Stage 1 addresses the data that is used as
input for developing foundation models. Stage 2 is the intermediary stage where algorithms are formulated and designed to analyse and interpret input data. Stage 3 focuses on the output data of genAI applications. Generated output may in some cases be used as input to further train foundation models, thereby exacerbating data equity challenges.
Source: World Economic Forum

Box 2: Data equity throughout the data lifecycle

Why focus on foundation models?

Foundation models are at the core of many genAI tools. They are typically trained on large and complex datasets. Foundation models may encode results that reflect human prejudice, bias or misunderstanding; and training algorithms may discern incorrect relationships or context.

Stage 1: Input data equity (representation and feature equity)

Input data equity centres on the data collected and used in building foundation models while also addressing the potential shortcomings this data might entail. As noted, foundation model training data may reflect societal inequities and result in societal bias. GenAI consequently generates outputs that mirror or amplify these patterns. Thus, ensuring equitable representation of diverse individuals, groups and communities in the datasets becomes pivotal to guarantee the relevance and accuracy of the generated outcomes. This requirement extends beyond individual representation, encompassing the accurate portrayal of communities within information labelling. The promotion
of fairness, bias mitigation and equal explanatory power practices is imperative for the outputs of foundation models to genuinely mirror the perspectives and realities of all individuals and groups inherent in the data. Moreover, the labels employed must be adaptable for use within algorithmic learning models.

Input data equity should also embrace the rights and well-being of data subjects. This encompasses aspects such as securing informed consent, just compensation for data contributors and annotators, and navigating the intricate trade-offs linked to data inclusion. These trade-offs are complex. While broader data inclusion may address equity concerns, it might concurrently escalate privacy worries through heightened surveillance. Similarly, generating new content can expand creative options but might not always ensure equitable compensation for the creators whose works contribute to the model's training.

The degree of anticipated data equity on the input side might vary based on the nature and objectives of the foundation models. Commercial applications, for instance, might prioritize transparency for end users, disclosing the scope and coverage of data, along with sensitivity analyses targeting specific groups. In other domains such as welfare allocation or legal applications, input side equity may demand the explicit inclusion of all pertinent communities to ensure genuine and tangible inclusivity.

Stage 2: Algorithmic data equity (representation, feature, access equity)

Algorithmic data equity
introduces a pivotal phase: the intermediary stage where algorithms are formulated and designed to interpret input data, thereby generating output results. This stage necessitates the incorporation of fairness, bias management and diversity inclusion in the algorithms' operations. It is imperative to ensure that these algorithms function as impartially as possible, refraining from perpetuating undesirable biases and accommodating diverse viewpoints. Attaining algorithmic data equity involves including a diverse array of perspectives in an algorithm's design and assessing its influence on different demographic groups.

Algorithmic bias can emerge from several factors, such as the availability of suitable datasets. Concerns arise when culturally or geographically specific data is used to train models that will subsequently interact with populations not originally represented in the training data. For instance, models predominantly trained on North American or English-language content may struggle to offer accurate results for non-English-speaking populations or contexts outside the Global North.

Transparency also poses challenges as foundation models, which utilize neural networks, can produce complex and often opaque predictive outcomes. While other AI systems may allow for algorithmic transparency, the neural network-based learning process of genAI differs. Foundation models are pre-trained on vast datasets, which give them a broad base of knowledge. However, when fine-tuned or adapted to specific tasks, they initially rely on this general knowledge. As they are further trained on task-specific data, their predictions for that task can become more accurate, homing in on the intricate patterns and relationships within the new data they encounter. This underscores the importance of exposing foundation models to diverse datasets, reflective of global communities. Moreover, fine-tuning algorithms to recognize the uniqueness of various regions and populations is vital to ensure the accurate understanding and prediction of relationships by foundation models, thus fostering balanced and equitable outcomes for users.

At the same time, given that digital literacy varies widely and marginalized communities may be particularly underserved, ensuring global users' understanding of the models' capabilities and limitations becomes a significant equity concern for genAI's
mass adoption.

Stage 3: Output data equity (access and outcome equity)

Output data equity revolves around the fairness of tangible effects stemming from foundation model outputs. This encompasses benefits that directly arise from AI systems developed using this data. It involves asserting co-ownership rights over the AI system and advocating for the equitable sharing of benefits derived from the model. Equitable distribution is also linked to the ability to share in the benefits generated by improvements to the AI system over time through iterative processes during the AI lifecycle. Instances where data collected in one region primarily bolsters the accuracy and performance of systems controlled by entities located in other regions underscore the importance of equitable sharing of these benefits with the originating communities.

Additionally, it is important for designers and implementers of AI systems to allocate resources to monitor and mitigate the disproportionate impacts on specific groups that reflect biases and discrimination in the system's outputs, for example by making remedial mechanisms available. Data subjects and contributors have the right to influence the usage and governance of the AI system, particularly when it perpetuates harms or undesired effects. Similarly, those who contribute to the development of the system deserve to participate in the sharing of the profits or benefits generated by it.

3 Data equity challenges in foundation models

The data equity challenges of foundation models in genAI are distinct from those in non-generative AI systems, highlighting a complex landscape that requires careful attention. Training datasets require innovative approaches to ensure accurate representation and consent. With genAI, the
scale and diversity of such data raise other issues. Ethical dilemmas and privacy concerns arise from publicly available content, and the scale and ad hoc nature of data collection may render obtaining genuine consent impossible. Linguistic and cultural biases within training data, largely in English and from Western sources, can skew responses, favouring English-centric viewpoints and leading to an internationalization of dominant cultures. The release of genAI applications for mass consumption exacerbates automation bias, fuelled by insufficient transparency about model capabilities and limitations.

The unique features of foundation models, namely the scale, volume and broad, often ambiguous, sourcing of data, complicate remediation. It is hard to pinpoint and correct specific data going into the model, a difficulty further exacerbated by the ability of foundation models to generate entirely new content. This feature, while powerful for ongoing adaptation and learning, may further amplify bias and increase the difficulties associated with consent and intellectual property (IP) rights. Moreover, the datasets for foundation models are highly generalized and not built for specific use cases. A single foundation model may be used for multiple applications, thus extending inequities across multiple domains or sectors.

Foundation models are continually learning and adapting. This unique feature creates further challenges given the scope, intricacy, size and training methods. First, as foundation models learn, algorithmic transparency, clarity and auditability become increasingly difficult. Secondly, reusing generated outputs can amplify existing biases. Recent research hints at the danger of “model collapse”, in which a system seems to “forget” its initial data and worsens over time.[10] Moreover, given the size and complexity,
replicating results or auditing models can be more challenging.

The table below summarizes some key differences between non-generative AI and generative AI's foundation models.

Table 1: Challenges: non-generative AI vs generative AI

Unique challenges

Scale of volume and source of data
Non-generative AI: Often uses smaller, curated datasets with known sources specifically relevant to the identified use case.
Foundation models for genAI: Uses massive datasets with often ambiguous, broad origins with no specific use case. Hard to pinpoint and correct specific data.

Generalizability vs specificity
Non-generative AI: Built for specific purpose(s) or task(s).
Foundation models for genAI: Designed for a broad range of tasks.

Creation of novel content
Non-generative AI: Mostly analyses or predicts based on input data.
Foundation models for genAI: Can generate entirely new content, which may reflect or amplify biases explicit in the training data, or which may be misleading, inaccurate or false. Generated content raises new consent and IP issues.

Scale of impact
Non-generative AI: Because tools are developed for narrow use cases, impact is most relevant to the specific domain or application.
Foundation models for genAI: A single model can have varied applications, thus extending or amplifying the effect of bias across multiple sectors and domains.

Exacerbated challenges

Opacity and complexity
Non-generative AI: Some models are interpretable.
Foundation models for genAI: Scope, intricacy, size and training methods make algorithmic transparency and clarity especially challenging.

Feedback loops
Non-generative AI: Feedback loops might be less prevalent and controlled.
Foundation models for genAI: The continuous refinement process of standard training methods can reinforce biases. Reusing generated outputs can amplify existing bias.

Reproducibility and accountability
Non-generative AI: Easier to reproduce and pinpoint source of biases.
Foundation models for genAI: Due to the size and complexity, replicating results or auditing can be more challenging.

Internationalization of dominant culture
Non-generative AI: Since the model is domain-specific, the risk of spreading a dominant culture is lesser.
Foundation models for genAI: Broad applicability risks minimizing or overlooking the needs of specific communities. This can inadvertently promote dominant cultural viewpoints globally.

Table 1: Unique and exacerbated challenges in the case of non-generative AI versus foundation models for generative AI. It is important to note that this is a non-exhaustive list.

4 Focus areas for key stakeholders

Addressing data equity is a complex undertaking and will require the active, engaged participation of many individuals, groups and communities. As a starting point, we propose various pathways and actions stakeholders should take to ensure data equity when interacting with foundation models. Three major groups of stakeholders can be distinguished:

Those that are responsible for driving and governing the societal use of AI: AI-creating organizations, AI-using organizations and policy-makers.

Those that are impacted by or are the end users of AI systems: the public and communities. When it comes to the public and communities as stakeholders, there is an inherent power asymmetry between them and the other stakeholders due to differences in both capacities to use AI and levels of data literacy. It is important that those accountable for driving and governing the societal use of AI ensure meaningful engagement with the public and communities.

Those that can bridge concerns between the accountable
stakeholders and the public and communities: civil society, with a focus on capacity building and developing representation for the public and communities with organizations that are responsible for AI.

Table 2: Focus areas, potential outcomes and example pathways for key stakeholders to ensure data equity in foundation models.

Those responsible for driving and governing societal use of AI
Stakeholders: AI-creating organizations; AI-using organizations; policy-makers and regulators

Focus areas
- Data collection and labelling
- Data privacy and security
- Transparency, traceability, and explainability
- Mitigation strategies (incl. fairness and bias mitigation)
- Continuous model evaluation
- Inclusive model design
- Responsible AI practices
- Data access and usage, incl. data privacy and security
- Disclosure to impacted communities
- Continuous monitoring
- Mitigation strategies (incl. fairness and bias mitigation)
- Context appropriate AI-human decision-making balance
- Develop ethical guidelines and standards[10]
- Develop regulatory frameworks, including audits
- Consideration of public interest
- AI risk classifications
- Clear delineation of rights of data subjects and contributors regarding AI
- Raise public awareness

Potential outcomes
- Meaningful transparency
- Model traceability for better quality control
- Effective accountability, incl. clear pathways for accountability (both external and internal)
- Implementation of assessment measures
- Facilitate continuous independent audits
- Collaborate with content generators
- Public disclosure of AI system usage
- Implement responsible AI governance frameworks
- Adopt standard practices
- Develop clear methodologies
- Ensure clear guidelines of automation circuit-breakers
- Establish standards and enact regulation
- Human-rights based approach
- Universal AI ethics
- Set an observatory body to ensure regulatory engagement and enforcement[11]
- Engage multistakeholder community, incl. industry, academia, civil society, and public, including meaningful engagement with stakeholders from the Global South

Example pathways
- Open-source a representative portion of data
- Pre-launch and continuous auditing and monitoring of model behaviour
- Create and use public feedback channels
- Build tools that provide greater transparency
- Due diligence prior to deployment
- Create and utilize public feedback channels
- Ethical guidelines and training
- Consult global AI experts
- Facilitate regulatory sandboxes as a best practice to design and test genAI systems
- Educate judiciary
- Implementation of Indigenous data sovereignty frameworks[12]

Those using and impacted by AI systems
Stakeholders: civil society groups; public; communities

Focus areas
- Bridge gap between AI organizations and public by raising awareness through advocacy efforts
- Promote ethical practices
- Increased awareness of AI
- Understanding of AI ethics
- Engagement with AI stakeholders
- Impact of AI on affected communities
- Participation in AI decision-making discussions

Potential outcomes
- Develop accessible research and awareness material for the general public
- Develop ethical practice codes and model legislation
- Greater public awareness on how AI might influence issues and topics the public cares about
- Engage with stakeholders in public debates
- Understand impact of AI on everyday life
- Actively participate in advocacy campaigns
- Capacity-building for those using AI

Example pathways
- Public awareness campaigns
- Create data equity toolkits and resources
- Become educated on AI
- Learn about and participate in advocacy campaigns
- Hold stakeholders accountable
- Report and share observations with policy-makers
- Consider what data equity means in specific communities, such as in the case of Indigenous data sovereignty

Note: Academia can either be part of AI-creating organizations, civil society, or communities, depending on the focus areas and the research undertaken.

5 Discussion

This
paper has introduced main ideas and concepts about data equity. It is important to recognize, however, that data equity will have sector-specific considerations across all stages of the data lifecycle discussed above (input, algorithmic, output). Addressing data equity in the use of foundation models (and AI in general) requires greater transparency about the limitations, capabilities and therefore the application of data to AI in different contexts. The use of AI to inform decision-making highlights the need to consider the human dimensions and socio-technical elements of both the development and utilization of AI. Acknowledgment of such limitations and the required correctives may be informed by the nature of the data used, the kind of AI model and the sensitivity of the application space.

As digital society evolves, genAI application functions will increasingly become intelligently autonomous
to an even greater extent. AI is expected to be widely available at an industrial scale in all sectors and become less expensive, more convenient and more easily accessible to use. This widespread availability lends itself to a general tendency to overuse genAI models. A key problematic result of this would be encoding data inequities, thereby perpetuating epistemic inequities.[14] It is thus also critical to evaluate the utility of genAI for a given use case; in some scenarios, more traditional data science or AI approaches might be more relevant and useful. Keeping an appropriate AI vs “human decision-making” balance in different contexts reduces the chances of perpetuating these inequities when foundation models are used.

At the same time, it is also important to recognize the potential of generative AI in enhancing data equity. GenAI applications may be used, for example, to improve data analysis, provide further explanation and increase access to data. For this briefing paper, we decided to focus specifically on the challenges of data equity in generative AI, given the importance of addressing these challenges early on in the adoption of genAI applications. As a result, the opportunities of genAI for data equity fall outside the purview of this briefing paper.

Conclusion

GenAI promises to drive digital and social innovation, including improving efficiency, enhancing creativity and augmenting existing data. Generative AI has the potential to democratize access and usage of technologies, thereby bridging the digital divide.[15] However, if left unchecked, it could further engrain inequities. As these systems rapidly advance, only a small window exists to act decisively. It is crucial to integrate data equity and ethical considerations into every phase of genAI's development, from dataset collection to model training and model output. Ignoring issues at this moment will only amplify the inequities and increase the data and digital divides in societies.

Now is the time to create definitional terms for collaboration in order to develop methods and processes that can be incorporated into technological development. While data equity concepts have existed in systems and methods for some time, the rise of genAI marks an urgent moment to foster dialogue and collaborative efforts across all sectors of society.

This briefing paper represents a first step in exploring and promoting data equity in the context of genAI. The proposed definitions, framework and recommendations are intended to proactively and positively shape the future development of promising genAI technologies. Through this and future work, the World Economic Forum's Global Future Council on Data Equity seeks to ensure equitable results throughout the broader digital economy, enabling fair and widespread global sharing of societal outcomes and benefits, and to start a dialogue on data equity among all stakeholders. It is only by identifying and acknowledging different types of systemic inequities that we can address them and work towards more comprehensive and inclusive solutions, to ensure shared benefits of generative AI. We look forward to continuing the conversation and working towards enhanced data equity.

Contributors

Global Future Council on Data Equity 2023-2024

The World Economic Forum's network of Global Future Councils is the world's foremost multistakeholder and interdisciplinary knowledge network dedicated to promoting innovative thinking to shape a more resilient, inclusive and sustainable future.

Global Future Council on Data Equity Council Members

JoAnn Stonier (co-chair), Mastercard Fellow, Data & AI, Mastercard
Lauren Woodman (co-chair), Chief Executive Officer, DataKind
Majed Alshammari, Special Adviser, Data Governance, Saudi Data and AI Authority (SDAIA)
Renée Cummings, Data Science Professor & Data Activist in Residence, University of Virginia
Nighat Dad, Founder and Executive Director, Digital Rights Foundation
Arti Garg, AI Chief Strategist, Hewlett Packard Enterprise
Alberto Giovanni Busetto, Group Senior Vice-President; Head, Data and Artificial Intelligence, Adecco Group
Katherine Hsiao, Executive Vice-President; Head, Health and Life Sciences, Palantir Technologies
Maui Hudson, Associate Professor and Director, Te Kotahi Research Institute, University of Waikato
Parminder Jeet Singh, Digital Society Researcher
David Kanamugire, Chief Executive Officer, National Cyber Security Agency of Rwanda
Astha Kapoor, Co-Founder, Aapti Institute
Zheng Lei, Professor, Fudan University
Jacqueline Lu, President and Co-Founder, Helpful Places
Emna Mizouni, Chief Executive Officer, Digital Citizenship
Angela Oduor Lungati, Executive Director, Ushahidi
María Paz Canales Loebel, Head of Legal, Policy and Research, Global Partners Digital
Arathi Sethumadhavan, User Research Scientist, Technology and Society, Google
Sarah Telford, Lead, Centre for Humanitarian Data, United Nations Office for the Coordination of Humanitarian Affairs (OCHA)

World Economic Forum

Supheakmungkol Sarin, Head of Data and Artificial Intelligence Ecosystems, Centre for the Fourth Industrial Revolution; Council Manager, Global Future Council on the Future of Data Equity
Kimmy Bettinger, Lead, Expert and Knowledge Communities, Centre for the Fourth Industrial Revolution
Stephanie Teeuwen, Early Careers Programme Data Policy, Centre for the Fourth Industrial Revolution

Acknowledgements

Talal Altook, Fellow, Artificial Intelligence and Machine Learning, Centre for the Fourth Industrial Revolution, World Economic Forum
Genta Ando, Fellow, AI Governance Alliance, Centre for the Fourth Industrial Revolution, World Economic Forum
Jos Berens, Data Policy Officer, Centre for Humanitarian Data, United Nations Office for the Coordination of Humanitarian Affairs (OCHA)
Sebastian Buckup, Head of Network and Partnerships, Centre for the Fourth Industrial Revolution, World Economic Forum
John Bradley, Lead, Metaverse, Centre for the Fourth Industrial Revolution, World Economic Forum
Kasia Chmielinski, Principal, Data Nutrition Project
Tenzin Chomphel, Coordinator, Data Policy, Centre for the Fourth Industrial Revolution, World Economic Forum
Daisuke Fukui, Fellow, Advancing Cross-Border Data Flows, Centre for the Fourth Industrial Revolution, World Economic Forum
Devendra Jain, Lead, Digital Transformation, Centre for the Fourth Industrial Revolution, World Economic Forum
Benjamin Larsen, Lead, Artificial Intelligence and Machine Learning, Centre for the Fourth Industrial Revolution, World Economic Forum
Cathy Li, Head of AI, Data and Metaverse, Centre for the Fourth Industrial Revolution, World Economic Forum
Sandra Waliczek, Centre Curator, Blockchain and Digital Assets, World Economic Forum
Karla Yee Amezaga, Lead, Data Policy, Centre for the Fourth Industrial Revolution, World Economic Forum

Production

Ann Brady, Editor, World Economic Forum
Michela Liberale Dorbolò, Graphic Designer, World Economic Forum

Endnotes

1. Leslie, D., Katell, M., Aitken, M., Singh, J., Briggs, M., Powell, R., Rincón, C., Chengeta, T., Birhane, A., Perini, A., Jayadeva, S. and Mazumder, A. (2022). Advancing data justice research and practice: an integrated literature review. The Alan Turing Institute in collaboration with The Global Partnership on AI, https://arxiv.org/ftp/arxiv/papers/2204/2204.03090.pdf.
2. The World Economic Forum's network of Global Future Councils is the world's foremost multistakeholder and interdisciplinary knowledge network dedicated to promoting innovative thinking to shape a more resilient, inclusive and sustainable future, https://www.weforum.org/communities/gfc-on-data-equity.
3. World Economic Forum, A Blueprint for Equity and Inclusion in Artificial Intelligence, June 2022, https://www.weforum.org/whitepapers/a-blueprint-for-equity-and-inclusion-in-artificial-intelligence/.
4. Google Cloud, “Artificial intelligence (AI) vs machine learning (ML)”, https://…
5. Stanford University Human-Centered Artificial Intelligence, Generative AI: Perspectives from Stanford HAI, March 2023, https://hai.stanford.edu/generative-ai-perspectives-stanford-hai.
6. Bommasani et al. “On the Opportunities and Risks of Foundation Models”. Stanford University Human-Centered Artificial Intelligence, 2021, https://crfm.stanford.edu/report.html.
7. Amazon Web Services, “What are Large Language Models (LLM)?”, https://…
8. … Stoyanovich and Bill Howe. “The Many Facets of Data Equity”, Journal of Data and Information Quality, vol. 14, no. 4, December 2022, https://doi.org/10.1145/3533425.
9. Lee, Min Kyung, Anuraag Jain, Hea Jin Cha, Shashank Ojha and Daniel Kusbit. “Procedural Justice in Algorithmic Fairness: Leveraging Transparency and Outcome Control for Fair Algorithmic Mediation”, Proceedings of the ACM on Human-Computer Interaction, vol. 3, no. CSCW, November 2019, https://doi.org/10.1145/3359284.
10. Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot and Ross Anderson. “The Curse of Recursion: Training on Generated Data Makes Models Forget”, Cornell University, 31 May 2023, https://doi.org/10.48550/arXiv.2305.17493.
11. World Economic Forum, The Presidio Recommendations on Responsible Generative AI, June 2023, https://www.weforum.org/whitepapers/the-presidio-recommendations-on-responsible-generative-ai/.
12. See for example the “New and emerging digital technologies and human rights resolution” from UNHRC. United Nations Human Rights Council Resolution 53/29 of 12 July 2023, https://ap.ohchr.org/documents/dpage_e.aspx?si=A/HRC/53/L.27/rev.1.
13. An interesting case study to consider is the Māori Data Governance model. This is an example of a leading framework to ensure data equity in practice. The framework is built around 8 pillars based on Māori values with a vision of data for self-determination, to ensure Māori authority over Māori data. For more information see Kukutai, T., Campbell-Kamariera, K., Mead, A., Mikaere, K., Moses, C., Whitehead, J. and Cormack, D. (2023). Māori data governance model. Te Kāhui Raraunga
https://tengira.waikato.ac.nz/_data/assets/pdf_file/0008/973763/Maori_Data_Governance_Model.pdf.
14. Kamruzzaman, Palash, “The case for epistemic justice”, TransformingSociety, 29 October 2021, https://www.transformingsociety.co.uk/2021/10/29/the-case-for-epistemic-justice/.
15. Groth, Olaf, Supheakmungkol Sarin and Stephanie Teeuwen. “Small but mighty: How SMEs can thrive in the cognitive economy”, World Economic Forum, 19 June 2023, https://www.weforum.org/agenda/2023/06/amnc23-smes-can-thrive-in-the-cognitive-economy/. McKinsey Digital, The economic potential of generative AI: The next productivity frontier, 14 June 2023, https://…
… of Generative AI to Increase Equity in Knowledge”, 5 July 2023, https://…

World Economic Forum
91-93 route de la Capite
CH-1223 Cologny/Geneva
Switzerland
Tel.: +41 (0) 22 869 1212
Fax: +41 (0) 22 786 2744
contact@weforum.org
www.weforum.org

The World Economic Forum, committed to improving the state of the world, is the International Organization for Public-Private Cooperation. The Forum engages the foremost political, business and other leaders of society to shape global, regional and industry agendas.