2023 Open Source Generative AI Survey Report
Enterprise perspectives and survey-based insights at the intersection of open source innovation and generative AI advancements

Adrienn Lawson, Linux Foundation
Marco Gerosa, Linux Foundation
Stephen Hendrick, Linux Foundation
Matt White, Linux Foundation
Lucy Hyde, Linux Foundation

Foreword by Stella Biderman, EleutherAI

December 2023

Copyright 2023 The Linux Foundation | December 2023. This report is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International Public License.

Open source GenAI is considered better than proprietary solutions at supporting collaboration, innovation, and ease of integration, according to our respondents.
Open source GenAI leads to increased data control and transparency, according to 69% of respondents.
Openness is important. 63% of respondents are extremely or moderately concerned by the openness
of GenAI systems their companies are using or developing.
Proprietary and open source solutions are equally preferred by respondents when it comes to the scalability and accuracy of GenAI technologies.
Neutrality is a key aspect of GenAI governance, according to almost all of our respondents (95%).
For the long-term sustainability of GenAI, open source solutions (43%) are preferred over proprietary (32%) solutions.
The majority of businesses surveyed intend to tailor GenAI technologies to their needs, embedding them in existing products or creating new products around them.
In general, 41% of organizations surveyed would prefer open source GenAI technologies, compared with 9% who would prefer proprietary solutions.
Security is the primary reason why organizations do not plan to deploy GenAI-related projects, but proprietary solutions are not considered more secure than open ones.
A majority (60%) of companies surveyed plan to significantly invest in GenAI, allocating a large percentage of their IT budgets to the technology.
Generative AI (GenAI) is a key component for businesses, with 50% of respondents' organizations using it in a production context.
GenAI is a key factor in future planning. 63% of companies surveyed feel it is extremely or moderately important to the future.

Contents

Foreword
Introduction
Context
  High involvement and financial commitment
  Diverse application areas and usage strategies
Generative AI openness
Security and trust
  Security is a major barrier to deploying GenAI
  No evidence found for proprietary preference in security considerations
Transparency and accessibility
  Open source GenAI increases data control and transparency
  Evaluating open source as a solution for accessibility and reproducibility of GenAI
Neutral governance and responsible innovation
  A neutral governance approach is important for GenAI technologies
Performance and business needs
  Accuracy and scalability are deemed to be at similar levels of open source and proprietary GenAI
Conclusions
  Businesses are concerned by the openness of the GenAI technologies they are using
  Survey respondents generally lean in the direction of open source
  A neutral governance approach is key to GenAI development
About this study
  Methodology
  Demographics
  DataWorld access
About the authors
Acknowledgments

Foreword

When GPT-3 came out in May 2020, the world of artificial intelligence was forever changed. What began as a revolution in language modeling research has expanded to image generation, protein synthesis, video editing, and more. Unfortunately, these revolutions were largely kept from the world writ large: only eight of the thirty-four language models released in the two years since GPT-3 had their weights released under an open source license, and only three non-profit or academic institutions in the world successfully trained models more powerful than the previous generation of closed models (GPT-2).

2023, however, marked a turning point in this trend. We witnessed an unprecedented surge in the release of open source AI models, with thirty new base models being made available under open source licenses. This shift was not just in quantity but also in the quality and diversity of these models, trained on 15 languages and coming from
13 different countries across four continents. Moreover, these base models served as the foundation for thousands of fine-tuned models, each tailored for specific applications. This explosion in open source AI has democratized access to cutting-edge technology, enabling a broader range of researchers, developers, and organizations to contribute to and benefit from these advancements.

A commitment to open source AI is more than just a commitment to permissively licensed weights, however. The core tenets of the open source movement (the freedom to use, modify, study, and share computer systems) require access
to large amounts of computing resources, highly optimized HPC libraries to carry out the training, reproducible and transparent evaluation frameworks, and large permissively licensed training corpora, among other things. Some of these barriers are beginning to fall, with the GPT-NeoX and OpenCLIP training libraries seeing widespread use beyond their respective creators and evaluation frameworks such as the Language Model Evaluation Harness and Open LLM Leaderboards providing unprecedented access to state-of-the-art tools for creating and studying these models. Still, a broad commitment to increased access to
both the technological and the material means of production of generative AI systems is essential to a healthy and thriving open source AI ecosystem.

The world has a lot to gain from the recent revolution in AI technology, but it also has a lot to lose. As society, legal systems, and regulators grapple with this technology, it's essential that the open source community builds on our historical successes of securing widespread access to technology such as encryption to build a world where AI is not held in a de facto monopoly by a handful of companies. It is essential that people continue to be empowered to compute what they want, how they want, and according to their own values, rather than having their economic and social freedoms be at the whims of a few technology companies.

In 2024 I look forward to seeing the continued democratization of this technology. I look forward to seeing new models trained in countries that have never trained generative AI systems before, models that speak their creators' languages and reflect their values. I look forward to broader notions of responsible AI that go beyond what is expedient for large corporations. And I look forward to building all this alongside you.

STELLA BIDERMAN
EXECUTIVE DIRECTOR, ELEUTHERAI

Introduction

Generative AI, commonly referred to as GenAI, stands at the forefront of a technological revolution, profoundly altering diverse sectors by synthesizing vast amounts of data and generating new outputs. From creating intricate artworks and composing music to designing novel pharmaceutical compounds and simulating realistic human language, the potential applications of GenAI are vast and transformative. GenAI has undoubtedly become a focal point of both excitement and scrutiny.

The open source approach, rooted in principles of transparency, collaboration, and shared innovation, holds transformative potential for the advancement of GenAI technologies. By democratizing access to AI algorithms and datasets, open source initiatives allow a broad and diverse pool of developers to contribute to, refine, and critique GenAI systems. This collective intelligence accelerates the pace of innovation and uncovers and rectifies biases or vulnerabilities that might otherwise go unnoticed in closed development environments.

As the integration of GenAI into business operations gains momentum, understanding its intricacies and its relation to open source becomes paramount. To understand how open source GenAI can impact the market, LF AI & Data, in partnership with Linux Foundation Research, launched a worldwide survey. This report provides an in-depth exploration of this survey's results, with a special focus on the
current state of GenAI in enterprise and GenAI openness. Through comprehensive analysis, we aim to offer insights, highlight best practices, and chart a path forward that ensures sustainable, ethical, and innovative development in this exciting frontier.

To clarify the terminology present in this paper, we
refer to GenAI as a broad category for a type of AI that can create new content based on some input. GenAI tools are built on underlying AI models, such as a large language model (LLM). LLMs are a subset of GenAI with a specialized focus on text. In this survey, we have covered open source GenAI technologies not limited to models but including databases, applications, and frameworks. Although at the time of the writing of this paper, the Open Source Initiative (OSI) had not yet released an open source AI definition, a draft 0.0.3 version is available and uses four freedoms to define an open source AI system:

  Study how the system works and inspect its components.
  Use the system for any purpose and without having to ask for permission.
  Modify the system to change its recommendations, predictions, or decisions to adapt to the user's needs.
  Share the system with or without modifications, for any purpose.

Context

High involvement and financial commitment

In the following section, the report lays out the most important features of the survey sample. The varied data and figures reveal that the sample comprises companies that are highly involved in GenAI. As observed in Figure 1, 88% of survey participants indicate that GenAI is important to the future of their companies. This data evidences the strategic importance of GenAI. Figure 2 shows that the surveyed companies show high involvement in GenAI technologies (80%) and will invest heavily in their GenAI strategies (60%). This investment distribution is nearly identical for both end-user and vendor organizations, suggesting that all organizations in our sample are anticipating heavy investment. This considerable investment reflects a major commitment, indicating a significant impact on several projects and infrastructure changes within these companies.

FIGURE 1 HIGH IMPORTANCE FOR GENAI FOR THE FUTURE PLANS OF COMPANIES
2023 GenAI Survey, Q11, Sample Size = 280
How important is GenAI to the future of the company you work for? (select one)
Response options: Extremely important; Moderately important; Slightly important; Neither important or unimportant; Unimportant; Don't know or not sure
Values as extracted: 21%, 42%, 1%, 9%, 3%, 25%

FIGURE 2 HIGH INVOLVEMENT AND LARGE INVESTMENT IN GENAI TECHNOLOGIES WITHIN THE SURVEYED COMPANIES
To what extent is your company involved with GenAI? (select one)
2023 GenAI Survey, Q2, Sample Size = 284
  Extremely involved (GenAI is business critical to key aspects of what our company does): 31%
  Very involved (GenAI is being used in production in selected areas): 49%
  Involved (experimenting with how GenAI can add value in selected areas): 14%
  Slightly involved (researching or evaluating GenAI): 5%
  My organization has evaluated and banned all use of GenAI tools: 1%
  Not involved at all: 0%
  Don't know or not sure: 0%
How much is your company planning to invest in its GenAI strategies in the next 12 months as a percentage of its overall IT budget? (select one)
2023 GenAI Survey, Q16, Sample Size = 249
  A majority: almost entirely focused on GenAI strategies: 9%
  A large percentage: a major commitment encompassing several projects or infrastructure changes: 51%
  A moderate percentage: significant but not a major portion of the IT budget: 29%
  A small percentage: for pilot projects or specific initiatives: 8%
  No investment: 2%
  Don't know or not sure: 0%

Diverse application areas and usage strategies

GenAI significantly impacts operations, as shown in Figure 3, particularly in product development and enhancement. Key areas include software quality assurance (35%), software testing (34%), and cybersecurity (31%), demonstrating its potential in risk mitigation and ensuring product and service quality. Additionally, software development (29%) and documentation (34%) are notable applications, with organizations using GenAI to automate code generation and create dynamic documentation for applications and source code.

FIGURE 3 DIVERSE APPLICATION AREAS FOR GENAI UTILIZATION
2023 GenAI Survey, Q12, Sample Size = 280
Please identify those areas where your organization expects to develop or use GenAI. (select all that apply)
  Quality Assurance: anomaly detection and mitigation strategy: 35%
  Software Testing: test case generation, unit testing, and UX: 34%
  Documentation: generation of documents for code, applications: 34%
  Cybersecurity: vulnerability analysis, risk mitigation, adaptation to attacks: 31%
  Software Development: code generation, code assistance, and audits: 29%
  Marketing: sales collateral, image generation, blogs and articles: 23%
  Customer Service: chatbots, support and recommendations: 20%
  Knowledge Management: access to company data and knowledge with chat interface: 20%
  Language: language understanding and translation: 19%
  Customer Sentiment: customer satisfaction analysis: 14%
  Personal Assistants: manage tasks, schedule appointments, make recommendations: 14%
  Education and Training: adaptive learning, dynamic content: 13%
  Research: market, scientific and analytical: 11%
  Finance: decisioning, investment optimization, predictions: 11%
  Logistics: optimal routes, economics, engineering: 9%
  Healthcare: assist in diagnosis, drug discovery, personal medicine: 6%
  Disaster response: prediction, analysis, mitigation: 2%
  Other (please specify): 1%
  Don't know or not sure: 0%

Our survey assessed companies' stages in their GenAI journey by examining how they plan to use GenAI, as shown in Figure 4. We grouped companies by their most advanced GenAI usage on two dimensions. The row totals reveal that organizations in our sample aim not only to enhance their internal processes with GenAI but also to embed GenAI into products and services (29%) or create new GenAI-based products or solutions (55%). In terms of customization level, the column totals show that many organizations plan to customize and enhance GenAI foundation models (57%), potentially through methods such as fine-tuning or RAG (retrieval-augmented generation). A significant number also intend to develop in-house GenAI technologies (30%). Developing in-house solutions does not necessarily mean building LLMs or other large foundation models from scratch, as doing so can require expensive and scarce resources and might not serve specific use cases well. Companies also have an opportunity to build small, domain-specific GenAI models from their own datasets by leveraging expertise in data science. Both customization and the development of in-house solutions will likely rely on the open-source community, which has been creating solutions to the challenges of GenAI customization with techniques such as LongLoRA, a fine-tuning approach with limited computation cost.

FIGURE 4 THE MAJORITY OF BUSINESSES INTEND TO TAILOR GENAI TECHNOLOGIES TO THEIR SPECIFIC NEEDS
2023 GenAI Survey, Q10 and Q14, Sample Size = 245, cells add up to 100%
How does your company use or plan to use GenAI technologies? (select all that apply)
Columns:
  (A) Use out-of-the-box GenAI technologies with little to no customization
  (B) Use GenAI technologies and customize them extensively to fit our needs
  (C) Develop our own in-house GenAI technologies
Rows:
  Employ GenAI to enhance internal processes: (A) 4%, (B) 7%, (C) 5%; row total 16%
  Embed GenAI into products and services: (A) 6%, (B) 18%, (C) 6%; row total 29%
  Create new products based on GenAI or create GenAI solutions for third parties: (A) 3%, (B) 33%, (C) 19%; row total 55%
  Column totals: (A) 13%, (B) 57%, (C) 30%

Generative AI openness

[1] Andreas Liesenfeld, Alianda Lopez, and Mark Dingemanse. 2023. Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators. In Proceedings of the 5th International Conference on Conversational User Interfaces (CUI '23). Association for
Computing Machinery, New York, NY, USA, Article 47, 16. https://doi.org/10.1145/3571884.3604316

Open source software provides significant benefits by ensuring that software is developed in the open. This attribute removes barriers to learning, using, sharing, and improving software. This can also result in more
autonomy, transparency, and collaboration, which, if applied to GenAI, could ensure that users have the freedom to develop reliable and transparent AI systems. The following section delves into the results of the survey's GenAI openness questions.

The level of openness can vary greatly between the different GenAI models currently available, but most of them would likely not earn the open source title, since availability and access to the underlying code, data, model, and documentation are rare [1]. However, the GenAI ecosystem is not limited to models but includes applications from vector and graph databases to agent frameworks. To illustrate, companies have the opportunity to leverage open source application development frameworks (e.g., LangChain) on top of closed models to integrate their applications, back their office systems, and innovate with new platforms. Therefore, openness can be leveraged across a wide range of GenAI.

FIGURE 5 CONCERNS OVER OPENNESS IN EXISTING GENAI SYSTEMS ESPECIALLY AMONG COMPANIES CUSTOMIZING OR DEVELOPING IN-HOUSE SOLUTIONS
2023 GenAI Survey, Q14 by Q13, Sample Size = 247
How does your company employ or plan to employ GenAI technologies? (select all that apply) segmented by How concerned is your organization about the openness of the GenAI systems you are developing or using? (select one)
Segments: Develop our own in-house GenAI technologies; Use GenAI technologies and customize them extensively to fit our needs; Use out-of-the-box GenAI technologies with little to no customization
Concern levels: Extremely concerned; Moderately concerned; Slightly concerned; Neither concerned or unconcerned; Unconcerned
Values as extracted: 20%, 17%, 16%, 51%, 16%, 8%, 5%, 6%, 6%, 10%, 29%, 27%, 13%, 45%, 32%

The open approach is vital, as confirmed by our survey respondents' concerns about the openness of the GenAI technologies
they are using or developing. Figure 5 shows that, across the three ways in which organizations intend to employ GenAI technologies (develop in-house, customize to their needs, and use with little or no customization), concern over the openness of the GenAI is correlated with the level of organizational
involvement. In Figure 5, 71% of organizations developing in-house GenAI are moderately or extremely concerned about its openness. This may be due to the wide variations in openness today and the risk of betting on an approach that the industry dismisses as the market matures. A similar situation exists for organizations that intend to customize GenAI systems to better fit their needs, where 62% of organizations are moderately or extremely concerned about the need for openness. By contrast, only 48% of organizations are moderately or extremely concerned about the openness of out-of-the-box AI technologies that are used with little or no customization. Presumably, this is because organizations have already done due diligence in their selection process and the vendor/supplier is also ultimately responsible for the quality and reliability of the product or service.

Concern about the openness
of GenAI translates into organizational preferences between open source and proprietary GenAI technologies. Figure 6 shows that 41% of organizations lean toward open source GenAI technologies, compared with 9% favoring proprietary ones. Twenty-two percent of organizations are inclined to use both types
of solutions, while 28% are indifferent, indicating that their choice of technology will ultimately be influenced by factors beyond these preferences.

FIGURE 6 IN GENERAL, OPEN SOURCE GENAI TECHNOLOGIES ARE PREFERRED OVER PROPRIETARY SOLUTIONS ACCORDING TO SURVEY RESPONDENTS
2023 GenAI Survey, Q17, Sample Size = 249
Which distribution model does your organization prefer for GenAI, proprietary or open source? (select one)
  We prefer open source GenAI technologies: 41%
  We are or will use both proprietary and open source GenAI technologies: 22%
  We prefer proprietary GenAI technologies: 9%
  We do not have a preference; both types of GenAI technologies are important to us: 28%

While the open source software definition revolves around the source code [2], an open source AI system definition will have to consider the various layers that make up the GenAI stack. In our survey, we outlined three primary layers: the application layer, the model layer, and the infrastructure layer. Figure 7 shows that respondents appreciate open datasets most (47%). Open datasets for GenAI can accelerate innovation, promote collaboration, and mitigate bias through data availability. Survey respondents further mentioned that open source technologies could improve the applications based on GenAI models (44%), the software for training and testing (37%), and the tools for measuring and monitoring performance in the infrastructure layer (36%). Other ways exist to deconstruct GenAI systems and assess their openness: researchers have developed an openness tracker for various LLMs.

[2] Open Source Initiative: The Open Source Definition, available at https://opensource.org/osd/

FIGURE 7 ACROSS THE THREE PRIMARY LAYERS OF THE GENAI STACK, OPEN DATASETS AT THE MODEL LAYER WOULD BE THE MOST FAVORED OPEN SOURCE TECHNOLOGY
2023 GenAI Survey, Q22, Sample Size = 249
The GenAI stack can generally be divided into three primary layers: the Application Layer, the Model Layer, and the Infrastructure Layer. Which components of these layers, if any, do you believe should be based on open source technologies? (select all that apply)
  Application layer: deployment: 15%
  Application layer: framework: 30%
  Application layer: AI applications: 44%
  Model layer: software for training and testing, inference, and analysis: 37%
  Model layer: raw data and curated datasets for model training and validation: 47%
  Infrastructure layer: measuring and monitoring performance: 36%
  Infrastructure layer: hosting: 11%
  Don't know or not sure: 2%

Security and trust

Security is a major barrier to deploying GenAI

Security (49%) is by far the most frequently cited obstacle to employing GenAI, as observed in Figure 8. Some examples of security concerns regarding GenAI are privacy, trust, unintended consequences, data breaches, and misuse. GenAI systems, by design, ingest vast amounts of data to train and operate optimally. This data may include sensitive, insecure, incorrect, or biased information. There is also the challenge of ensuring that the model or other parts of the infrastructure do not inadvertently disclose information introduced during the training, testing, or validation process, which could lead to leakage of confidential information. Ensuring the security of GenAI technologies is not just a technical necessity but crucial for maintaining trust and regulatory compliance. This is further complicated by the complexity of these black-box models that can obscure vulnerabilities, making it challenging for organizations to fully understand and mitigate potential security risks.

FIGURE 8 SECURITY CONCERNS ARE THE PRIMARY REASON WHY COMPANIES DO NOT INITIATE GENAI RELATED PROJECTS
2023 GenAI Survey, Q15, Sample Size = 249
If your company does not plan to deploy or initiate any GenAI-related projects in the next 12 months, what are the primary reasons? (select all that apply)
  Security: 49%
  Cost: 33%
  Technology maturity: 31%
  No AI expertise in house: 17%
  No compelling business case: 14%
  Does not apply to us: 12%
  Other (please specify): 0%
  Don't know or not sure: 2%

As one of the respondents answered in an open-ended question on the challenges of GenAI: “Data security concerns are the biggest problem at our company since GenAI needs
to adopt security measures more effectively to protect consumer data and guarantee compliance with privacy regulations.”

No evidence found for proprietary preference in security considerations

While it is crucial to address the security concerns mentioned when discussing Figure 8, it is not guaranteed
that proprietary solutions will effectively resolve these issues. We asked our survey respondents to consider whether they would prefer open source or proprietary GenAI solutions across four discrete security concerns. Figure 9 reveals that when it comes to the vital considerations of security, privacy, and regulatory compliance in GenAI technologies, there is no substantial evidence of a prevailing preference for proprietary solutions over open source options among companies. Figure 9 shows that respondents lean toward preferring open source over proprietary, but, given the survey's margin of error, the difference between the two alternatives is not significant. This finding challenges the arguments that claim that proprietary solutions are more compliant with regulation and safer for GenAI development.

FIGURE 9 WHEN CONSIDERING SECURITY MATTERS, NO EVIDENCE FOUND THAT COMPANIES PREFER PROPRIETARY GENAI TECHNOLOGIES TO OPEN SOURCE SOLUTIONS
2023 GenAI Survey, Q18 and Q19, Sample Size = 249
For each of the following considerations, which type of GenAI solution would you prefer? (one response per row)
Response categories: Open source; The same; Proprietary
Considerations: Privacy; Security; Trustworthy data and models; Regulatory compliance
Values as extracted: 46%, 33%, 21%; 42%, 37%, 22%; 42%, 34%, 24%; 39%, 38%, 23%

Transparency and accessibility

Open source GenAI increases data control and transparency

The openness of GenAI models provides opportunities for the public and academics to scrutinize AI models. A lack of understanding or transparency about how GenAI models make decisions can hinder individuals' rights to know how their data is being used. Without proper mechanisms for accountability, it is challenging to ensure that privacy is consistently upheld. A survey respondent answered an open-ended question on transparency by saying, “Our company must prioritize building trust with their customers by being transparent about their use of AI technology and providing clear explanations of how AI systems make decisions (...) some customers may be hesitant to interact with AI systems, preferring human interaction.” As observed in Figure 10, a significant 69% of companies believe that their organization's data control and transparency would see an improvement if they were to use open source GenAI technologies.

FIGURE 10 AGREEMENT ON THE INCREASE IN DATA CONTROL AND TRANSPARENCY BY OPEN SOURCE GENAI TECHNOLOGIES
2023 GenAI Survey, Q21, Sample Size = 249
How much do you estimate your organization's data control and transparency could change if the GenAI technologies you use were open source? (select one)
Response options: Significantly increase; Moderately increase; Slightly increase; Stay the same; Slightly decrease; Moderately decrease; Significantly decrease; Does not apply to my organization; Don't know or not sure
Values as extracted: 33%, 14%, 22%, 6%, 8%, 10%, 1%, 2%, 2%

Evaluating open source as a solution for accessibility and reproducibility of GenAI

Figure 11 shows preferences for four considerations related to GenAI adoption. Open source models may be perceived as more favorable for widespread adoption of GenAI due to their accessibility and the collaborative opportunities they offer, allowing for rapid dissemination and iteration across a broad user base. With 42% favoring open source for access to diverse data and models compared with 35% for proprietary, there is an implication that open source is associated with a richer variety of data and modeling options. This is critical in AI development, where diversity in datasets can lead to more robust and less biased AI systems. The result that 42% prefer open source for transparency and reproducibility underscores the value placed on openness in the AI community. Transparency is key to building trust and allowing for independent verification of AI systems, while reproducibility is essential for scientific progress and validation of results. The preference for open source (41%) over proprietary (32%) in terms of cost and budget considerations indicates that open source solutions are perceived as more cost-effective. This is particularly relevant in a context where organizations are seeking to maximize the efficiency of their investments in AI technologies, especially when budget constraints are a factor.

FIGURE 11 EVALUATING OPEN SOURCE AS A SOLUTION FOR ADOPTION, ACCESSIBILITY, AND REPRODUCIBILITY OF GENAI
2023 GenAI Survey, Q18 and Q19, Sample Size = 249
For each of the following considerations, which type of GenAI solution would you prefer? (one response per row)
Response categories: Open source; The same; Proprietary
Values as extracted, by row:
  Widespread adoption: 42%, 36%, 21%
  Access to diverse data and models: 42%, 35%, 23%
  Transparency and reproducibility: 42%, 37%, 21%
  Cost and budget: 41%, 32%, 27%

Neutral governance and responsible innovation

A neutral governance approach is important for GenAI technologies

As important as transparency and accessibility are for GenAI technologies, open source might not be enough to mitigate the risks that we associate with GenAI. Figure 12 shows that a neutral governance approach is important for our survey respondents, with 88% indicating that it is extremely or very important when developing GenAI technologies. Neutral governance is another aspect of true open source models and can benefit GenAI technologies in multiple ways. Neutral governance is important to ensure innovation is not subject to only a few companies' futures. In addition, neutral governance can help set ethical standards and guidelines to prevent misuse of the technology.

FIGURE 12 COMPANIES FOCUSING ON THE DEVELOPMENT OF GENAI TECHNOLOGIES CONSIDER THE ADOPTION OF A NEUTRAL GOVERNANCE APPROACH TO BE OF SIGNIFICANT IMPORTANCE
2023 GenAI Survey, Q24, Sample Size = 72
How important is having a neutral governance open source approach to developing GenAI technologies? (select one)
Response options: Extremely important; Very important; Important; Slightly important; Not important at all; Don't know or not sure
Values as extracted: 46%, 42%, 7%, 3%

Neutral governance is tied to various considerations explored in our survey. Figure 13 shows a lean toward open source solutions in the realms of collaboration and community involvement (43%), long-term sustainability (42%), and responsible AI and ethical considerations (40%). Such governance provides an impartial framework that likely encourages diversity and inclusion in the development process as it is not tied to the interests of proprietary systems. Neutral governance can ensure that innovation and iteration are not only rapid but also ethically aligned and sustainable over time, making the technology more accessible and potentially leading to more equitable outcomes in the GenAI space.

FIGURE 13 OPEN SOURCE GENAI TECHNOLOGIES, UNDER NEUTRAL GOVERNANCE, HAVE THE POTENTIAL TO ACHIEVE RESPONSIBLE INNOVATION
2023 GenAI Survey, Q18 and Q19, Sample Size = 249
For each of the following considerations, which type of GenAI solution would you prefer? (one response per row)
Response categories: Open source; The same; Proprietary
Considerations: Collaboration and community involvement; Ease of integration; Long-term sustainability; Responsible AI and ethical considerations; Rapid iteration and innovation
Values as extracted: 43%, 37%, 20%, 43%, 33%, 24%, 42%, 40%, 39%, 22%, 39%, 36%, 25%, 26%, 32%

Performance and business needs

Accuracy and scalability are deemed to be at similar levels of open
source and proprietary GenAI

The effectiveness of GenAI is often evaluated by companies based on performance indicators such as accuracy and speed. Figure 14 highlights the comparative preferences for open source versus proprietary GenAI technologies in relation to key business needs. It is evident from the data that the preference for open source and proprietary solutions is closely matched across various technical considerations. For example, open source and proprietary solutions are almost equally preferred in terms of their accuracy, with 36% for proprietary and 35% for open source. Similar patterns are observed in other categories, such as support and maintenance and performance/scalability. In terms of user experience, slightly more respondents prefer proprietary solutions (41%) to open source ones (38%). This balanced distribution of preferences reflects a competitive landscape where open source solutions are considered nearly as favorable as proprietary ones in meeting critical technical needs.

FIGURE 14 SIMILAR LEVELS OF PREFERENCE OF OPEN SOURCE GENAI TECHNOLOGIES AND PROPRIETARY SOLUTIONS REGARDING BUSINESS NEEDS, SUCH AS SCALABILITY AND ACCURACY
2023 GenAI Survey, Q18 and Q19, Sample Size = 249
For each of the following considerations, which type of GenAI solution would you prefer? (one response per row)
Response categories: Open source; The same; Proprietary
Considerations: Ability to align with business needs; User experience; Support and maintenance; Performance/scalability; Accuracy
Values as extracted: 41%, 33%, 26%, 38%, 41%, 21%, 39%, 37%, 36%, 27%, 35%, 36%, 28%, 20%, 41%
110、OPEN SOURCE GENERATIVE AI SURVEY REPORTConclusionsBusinesses are concerned by the openness of the GenAI technologies they are usingThe survey reveals a strong concern among respondents regarding the openness of GenAI systems.Around two-thirds of respondents are either extremely or moderately concern
111、ed about this aspect,reflecting the importance of transparency and control in technology deployments.Open source GenAI,according to 69%of respondents,leads to increased data control and transparency,which are critical for ethical and responsible AI development.Survey respondents generally lean in th
e direction of open source

The findings from our survey provide compelling insights into the current attitudes and preferences of organizations toward GenAI, particularly a notable inclination toward open source solutions. This inclination reflects a recognition of the benefits associated with open source technologies, including transparency, reproducibility, access to diverse data and models, and ease of integration. Security, an important concern for any technology deployment, does not appear to be a deterrent for open source GenAI adoption. In fact, most respondents do not view proprietary solutions as more suitable for security considerations than open ones.

A neutral governance approach is key to GenAI development

The importance of neutral governance in GenAI was supported by 95% of respondents in the survey. This governance framework ensures a more ethical and equitable development of GenAI technologies through community involvement and collaboration. Neutral governance is crucial not only for fostering responsible growth of GenAI but also for ensuring that its benefits are widespread and aligned with societal values. This approach is vital in maintaining the integrity and sustain
ability of GenAI advancements, ensuring that they serve both communities and stakeholders.

About this study

During September and October 2023, LF AI & Data and Linux Foundation Research fielded an online survey of individuals at organizations on a range of questions related to GenAI. The survey was promoted via LF social media and at LF events. We also sourced qualified respondents from a third-party panel provider to craft a more diverse sample.

Methodology

We received 284 valid survey starts, and 249 respondents completed all relevant questions. The margin of error for the sample size of 249 is 5.2% at the 90% confidence level. This sample size reflects those respondents who met a variety of screening and filtering criteria. The primary screening criteria included employment (respondents who were students, unemployed, or retired were disqualified) and familiarity with the organization's adoption of GenAI (not familiar at all, slightly familiar, and those responding "Don't know or not sure" were also disqualified). The percentage values in this report may not total exactly 100% due to rounding.

Demographics

Figures 15 and 16 provide selected demographics of the sur
vey sample. In the left-hand panel of Figure 15, we see that 42% of our respondents were extremely familiar with GenAI, 44% were very familiar, and just 14% were familiar. The lack of respondents who were either slightly familiar or not familiar or didn't know or were not sure was intentional. This is because this question was part of our screening process so that respondents would be capable of providing us with reliable perspectives and insights. Because 86% of respondents were either very or extremely familiar with GenAI, we believe this higher level of expertise will improve the quality and insight provided by this survey.

FIGURE 15
SELECTED DEMOGRAPHIC DATA
2023 GenAI Survey, Q1, Q3 and Q4, Sample Size = 284
How familiar are you with your organization's adoption of Generative AI (GenAI)? (select one)
Extremely familiar: 42%; Very familiar: 44%; Familiar: 14%; Slightly familiar: 0%; Not familiar at all: 0%; Don't know or not sure: 0%
What best describes the company you work for? (select one)
I work for a company that primarily offers products and services outside the IT industry: 57%; I work for a company that operates in the Information Technology (IT) industry: 39%; Other: 4%
Which of the following best describes your professional role? (select one)
AI or ML engineer: 31%; Senior/executive management (non-IT): 20%; IT management (e.g., VIP, CIO, CISO, CTO): 17%; Data scientist: 13%; Product manager: 6%; Marketing/communications: 6%; Developer/software engineer: 4%; Other: 3%

The central panel in Figure 15 shows that the respo
ndents are well distributed across industries, with 57% working in end-user organizations (those companies that use or even embed IT but primarily offer products and services focused on industries outside of IT itself) and 39% working for IT vendors or service providers. The right-hand panel of Figure 15 shows that respondents are distributed across a variety of roles, including AI or ML engineers (31%), non-IT senior/executive managers (20%), IT managers (17%), and data scientists (13%).

Figure 16 is a continuation of this demographic data. The left-hand panel of Figure 16 shows a distribution by region. We did not make an effort to stratify by region, and, as a result, most responses come from the U.S. or Canada (92%). The center panel in Figure 16 shows the distribution of respondent organizations by company size in employees. Respondents are reasonably well distributed across three groups: 1 to 999 (29%), 1,000 to 9,999 (47%), and 10,000 or more (24%). The right-hand panel in Figure 16 shows that most organizations are reliant on open source software, with 53% reporting being very reliant and 35% being extremely reliant.

FIGURE 16
SELECTED DEMOGRAPHIC DATA
2023 GenAI Survey, Q5, Q6 and Q7, Sample Size = 284
In which region does your company have its headquarters? (select one)
United States or Canada: 92%; Europe: 5%; Other: 3%
Please estimate how many total employees are in your company. (select one)
1 to 999: 29%; 1,000 to 9,999: 47%; 10,000 or more: 24%
How reliant is your company on open source software (OSS)? (select one)
Extremely OSS-reliant: 35%; Very OSS-reliant: 53%; Moderately OSS-reliant: 7%; Slightly OSS-reliant: 4%; Not OSS-reliant at all: 1%; Don't know or not sure: 0%

Data.World access

LF Research makes each of its empirical project datasets available on Data.World. Included in this dataset are the survey instrument, raw survey data, screening and filtering criteria, and frequency charts for each question in the survey. LF Research datasets, including this project, can be found at data.world/thelinuxfoundation.

About the authors

ADRIENN LAWSON is a data analyst at the LF. Adrienn obtained a master's degree f
rom the University of Oxford in social data science. She supports LF Research with survey development, analysis, and report writing. Adrienn has previously conducted research at the University of Oxford, the Budapest Institute for Policy Analysis, and the U.K.'s Office for National Statistics.

Dr. MARCO GEROSA is a full professor in computer science at Northern Arizona University and a research analyst at LF Research. His main areas of investigation are software engineering and open source software. He is currently investigating the use of GenAI as a tool to support computer science education and the onboarding of new developers in OSS communities. These projects are supported by the National Science Foundation and have been featured in publications in top-tier venues. He has published over 200 papers and serves on the program committees of important conferences, such as ICSE, FSE, and MSR, and as a reviewer for several journals. He has graduated several Ph.D. and M.Sc. students who are now researchers in top institutions and has more than 20 years of teaching experience. For more information, visit http:/.

STEPHEN HENDRICK is vice president of research at the Linux Foundation, where he is the principal investigator on a variety of research projects core to the Linux Foundation's understanding of how OSS is an engine of innovation for producers and consumers of IT. Steve specializes in primary research techniques developed over 30 years as a software industry analyst. Steve is a subject matter expert in application development and deployment topics, including DevOps, application management, and decision analytics. Steve brings experience in a variety of quantitative and qualitative research techniques that enable deep insight into market dynamics and has pioneered research across many application development and deployment domains. Steve has authored over 1,000 publications and provided market guidance through syndicated research and custom consulting to the world's leading software vendors and high-profile start-ups.

MATT WHITE is the Director of the Generative AI Commons at the Linux Foundation's AI & Data Foundation. He is Head of AI & Data at Amdocs, as well as the Founder of the AI research group Berkeley Synthetic. He teaches graduate students Data Science at UC Berkeley. He is also the Co-Founder and Chair of the Open Metaverse Foundation, a part of the Linux Foundation, and a Board Director at the Metaverse Standards Forum. He has over 25 years of experience in AI, data, and open source. He holds a Master of Data Science from UC Berkeley, an MBA from the University of Denver, and a BSc IT from York University. For more information, visit www.matt-.

LUCY HYDE is a Senior Program Manager specializing in Machine Learning, supporting open source innovation in artificial intelligence and data. As an accomplished professional, she is highly regarded for technical expertise working in roles focusing on data science, software engineering, and technical exploitation. She started her career with the Department of Defense as an active duty service member and as a government civilian, and continued in private sector digital forensics. She graduated with undergraduate degrees in Psychology, the Arabic language, and Intelligence Operations; holds an MS in Analytics; and is pursuing a Ph.D. in Computational Science/Informatics from George Mason University and a second MS degree in Artificial Intelligence from Johns Hopkins University.

Acknowledgments

We thank all the participants of the survey for kindly sharing their insights and experience on the 2023 state of GenAI. Special thanks to peer reviewers and LF coll
eagues for their involvement in the various stages of the research process: Hilary Carter, Michael Dolan, Ibrahim Haddad, and Anna Hermansen.

Founded in 2021, Linux Foundation Research explores the growing scale of open source collaboration, providing insight into emerging technology trends, best practices, and the global impact of open source projects. Through leveraging project databases and networks, and a commitment to best practices in quantitative and qualitative methodologies, Linux Foundation Research is creating the go-to library for open source insights for the benefit of organizations the world over.

Copyright 2023 The Linux Foundation

This report is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International Public License.

To reference this work, please cite as follows: Adrienn Lawson, Marco Gerosa, and Stephen Hendrick, "2023 Open Source Generative AI Survey Report: Enterprise perspectives and survey-based insights at the intersection of open source innovation and generative AI advancements", foreword by Stella Biderman, The Linux Foundation, December 2023.

lfaidata.foundation | genaicommons.org
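
The margin of error quoted in the Methodology section (5.2% at the 90% confidence level for a sample of 249) is consistent with the standard worst-case formula for a proportion, z * sqrt(p * (1 - p) / n) with p = 0.5. The report does not state which formula was used, so the following is only a sketch under that assumption; the function name `margin_of_error` is illustrative and does not come from the report.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n: int, confidence: float = 0.90, p: float = 0.5) -> float:
    """Worst-case margin of error for a proportion (normal approximation).

    Assumption: simple random sampling, no finite-population correction.
    """
    # Two-sided critical value, roughly 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # 249 completed responses, as reported in the Methodology section.
    print(f"{margin_of_error(249):.1%}")  # prints 5.2%
```

Because the margin scales with 1/sqrt(n), halving it would require roughly four times as many completed responses.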