DATA TRENDS 2023

Four essential trends redefining the way modern companies succeed with AI, automation, and more

DATA AND AI ARE REINVENTING THE WAY WE DO EVERYTHING.

The world is changing fast, and one of the most powerful drivers of change is data. Data is reshaping research, policy, innovation, and more. Generative AI is prompting every industry to rethink core workflows and processes, catalyzing a fundamental shift across software and enterprises. Data helps cybersecurity teams respond at faster-than-human speeds to an ever-rising number of threats that are also using fast and sophisticated data-driven
technologies. Data can be used to improve everything from supply chain logistics to the proper stocking of retail shelves to accelerated development of medicines and vaccines. And that’s today; whole workflows, categories of software, and jobs will be transformed in the coming years.

Because the Snowflake
Data Cloud is fundamental to how thousands of companies are able to get those insights, we were able to look at how they’re optimizing for the future. This wasn’t research into how business or technical leaders feel about their data operations, nor about what they intend to do or believe is working; we looked at actual usage data throughout the past year to see trends in how organizations leverage their data, from governance issues to the programming languages they’re using.

METHODOLOGY

The data in this report covers the 12-month period ending Jan. 31, 2023, referred to as “this year” or “the year,” to align with Snowflake’s 2023 fiscal year. We examined data usage of roughly 7,800 Snowflake customers, some of them longtime Snowflake users, and others only recently having joined the Data Cloud. Note that Snowflake’s customer base grew 31% in the 2023 fiscal year, which provides a baseline of comparison for statistics identifying trends that outpaced this overall growth.

HERE ARE FOUR KEY TRENDS THAT SHOW HOW ORGANIZATIONS ARE USING THEIR DATA AND EMBRACING AI. IN OTHER WORDS, THESE ARE TRENDS WITH REAL MOMENTUM.

TREND 1: Companies are connecting data and AI everywhere they can.
TREND 2: State-of-the-art companies are bringing their work (and AI) to the data, not vice versa.
TREND 3: Governance is even more important in the new AI era.
TREND 4: Companies are embracing automation and expect a fully managed platform.

TREND 1: COMPANIES ARE CONNECTING DATA AND AI ACROSS THEIR BUSINESS ECOSYSTEMS, GLOBALLY AND ACROSS CLOUDS.

The importance of connecting data is often talked about as a key IT imperative. In Data and Analytics Trends 2023, the Info-Tech Research Group identifies “democratizing real-time data” as strategically crucial. The report also calls out the importance of data marketplaces that share data for collaborative purposes, and notes Amazon, Microsoft, and Snowflake as leaders in this trend. When McKinsey identified seven characteristics that will define the data-driven enterprise of 2025, No. 1 was “data embedded in every decision, interaction, and process.” The next two were “data is processed and delivered in
real time” and “flexible data stores enable integrated, ready-to-use data.” All of this points to environments without data silos or the lengthy resource delays of collecting and cleansing the data set. And the imperative, across every industry, to use that data to power AI models makes these characteristics even more vital.

COMPLETE DATA DRIVES BETTER DECISIONS

Medical technology maker Siemens Healthineers acquired and merged with Varian Medical Systems, a medical device company known for its cancer treatment technology. Working with Snowflake, Siemens Healthineers made integrated CRM data available globally. Whereas this process previously would’ve taken weeks or months, it took only about 20 minutes for the data from a Snowflake Azure instance in the United States to be replicated and available in Snowflake Azure in Europe. This let the sales team analyze the data right away to derive new insights, drive global decision-making, and identify cross-selling opportunities.

We looked at how our users are interacting with the Snowflake Data Cloud to see whether real-world behaviors match predicted trends. The answer, overwhelmingly, was yes.

Cross-cloud growth

Fewer companies are confining their data to one cloud or region; many are using a cross-cloud strategy for business continuity, resilience, and collaboration. Over the year, the number of Snowflake customers operating across the three leading public cloud providers (Amazon Web Services, Microsoft Azure, and Google Cloud) grew 207%. This intensifies a growing need for applications to share data globally and across clouds.

Resilience

Organizations are increasingly emphasizing business continuity strategies to mitigate risk. Cross-cloud replication can facilitate seamless failover from one cloud to another, without disrupting the business. On Snowflake, the number of Forbes
Global 2000 customers as of Jan. 31, 2023, doing cross-cloud replication increased 58% year over year.

Collaboration

Data that lives in silos is not living up to its potential. Throughout the year, we saw increased collaboration on data across regions and across cloud providers. Snowflake allows concurrent sharing of data where it resides, which dramatically increases efficiency by cutting out the need to extract, transform, and load that data before you can work with it. Billions of jobs were run against shared data during the fiscal year, with the number of jobs run growing 93% comparing Feb. 1, 2022, to Jan. 31, 2023.

A measure of collaboration that is unique to Snowflake is stable edges. In brief, an “edge” is a data share between a provider and consumer of data. A “stable edge” is one that, over two successive three-week periods, has produced at least 20 data transactions in each period. This indicates that the
specific collaboration this data sharing facilitates has ongoing value to the organization. Stable edges increased 93% over the year, suggesting that Snowflake users nearly doubled the usage of shared data on an ongoing basis to improve outcomes.

In short, organizations continue to uplevel how they connect their data and the data of others to improve processes and insights. We’re seeing not only experimentation, but ongoing adoption of these successful innovations.

In 2023, the number of organizations with data across all three major public clouds grew by 207%.

TREND 2: THE NEXT STEP OF “BREAKING DOWN DATA SILOS” IS TO BRING YOUR WORK, INCLUDING AI, TO ALL OF YOUR DATA. STATE-OF-THE-ART COMPANIES ARE DOING IT ALREADY.

Another source of silos is that data is generated in many formats, and lands in different specialized systems to be consumed by disparate downstream teams. Bringing together different formats and types (structured, unstructured, semi-structured) is a persistent challenge. Companies have been making fitful progress on these challenges for more than a decade. But storing all your data in one place is less meaningful if you have to pull out and prep discrete data sets for each kind of job you want
to do. The next horizon, one that’s essential to the dawning era of generative AI, is being able to do meaningful work with all that data together.

Because Snowflake lets users analyze any of their data, in the same platform and with the same engine, we can see how organizations embrace the ability to bring their work to all their data. This work can include building pipelines to process data, training machine learning models, creating analytic queries, dashboarding, and even powering entire applications. As of January 2023, Snowflake has more than 800 companies registered in the Powered by Snowflake program, which helps them build, support, and scale their applications in the Snowflake Data Cloud. One exciting new trend is how more and more of these SaaS providers are architecting their applications to connect to a customer’s data platform instead of re-siloing data into their own managed data store. During the year, the number of connected applications in the Data Cloud grew 285%.

We’re seeing that once companies have all their data in one place, they can, and want to, do more with it. Snowflake users collectively run billions of jobs per day on the data within the Snowflake Data Cloud. Over the year, we saw the number of jobs in the Data Cloud grow by 64%, more than double our 31% customer growth, indicating that even as we simply add users, those users are finding more and more ways to bring work to their data. They’re using SQL, Python, Java, Scala, and other languages to work with data within the Data Cloud, connecting to a rich repository of data to drive improvements and insights. In January of this year, one customer ran more than 900 million jobs in that month alone, averaging out to about 20,000 jobs a minute.

Our conclusion is that once you make it easier for someone to work with all their data for a variety of workloads, they’ll do so, unlocking new value from that data, especially when it can be supported at enterprise scale, where concurrency is a requirement. We expect more and more companies to find ways to eliminate the movement of data and to bring more work to the data itself in the AI-driven years ahead.

PYTHON + SQL: A POPULAR PAIR

When users bring their work to their data, rather than vice versa, how are they doing it? While SQL remains the most popular language on Snowflake, the most popular programming language used with Snowpark, our developer framework, is Python. No surprise, since Python’s versatility, ease of use, and large community make it popular with programmers everywhere. In February 2023, Python accounted for nearly 88% of all jobs run on Snowpark. In addition, the Python Connector was the most popular tool for running DML jobs on Snowflake over the year. Our takeaway is that organizations need a platform that allows many languages, including SQL and Python, to be used by the same engine, with the same governance, against the same copy of data.

ENERGIZING DATA SCIENCE

EDF is a leading energy supplier in the UK, and is Britain’s biggest generator of zero-carbon electricity. EDF uses Snowflake
as a central repository for all customer data. Working with such data can be challenging. Prior to Snowflake, the company’s data lake team had to provide extracts for data scientists to work with, which was slow-moving and came with a lot of complexity. Data scientists had to then manage the security and governance of that data themselves. Rebecca Vickery, EDF’s data science lead, says that Snowpark, with its support of Python and SQL, lets business users manipulate that data where it lies, deploying end-to-end machine learning to uncover the insights that make customers’ lives easier. “The benefits of being
able to run data science tasks, such as feature engineering, directly where the data sits, is massive,” she says. “It’s made our work a lot more efficient and a lot more enjoyable.”

Over the year, the number of jobs run in the Data Cloud rose 64%.

TREND 3: GOVERNANCE IS EVEN MORE IMPORTANT IN THE NEW AI ERA.

Data governance (how an organization understands and protects its data, and puts it to use) defines the roles, processes, and policies for interacting with data. Effective governance helps you develop business insights from trustworthy data and helps ensure regulatory compliance.

The compliance part is a big deal. In the past several years, regulation and standards around data protection have risen sharply. The European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) get an outsized share of attention, but there are many others. Canada has PIPEDA (the Personal Information Protection and Electronic Documents Act) and Mexico has the Ley de Protección de Datos Personales. In Brazil it’s LGPD (Lei Geral de Proteção de Dados Pessoais); in Singapore it’s PDPA (the Personal Data Protection Act), and South Africa has POPIA (the Protection of Personal Information Act). And that’s just off the top of our heads. Except for Canada’s, all those regulations have come into effect in the past decade, and each regime brings the threat of significant fines for non-compliance.

But effective governance is about more than checking regulatory boxes. Strong data governance is an enabler, even an accelerator, for getting full value from data. As modern enterprises use generative AI and LLMs to extract insights from their own data, a single, consistent governance model is indispensable. For both reliability and security, the full breadth of an organization’s data must be efficiently, properly governed.

A core aspect of good data governance is to ensure that only people in the right, necessary roles are able to access a given data set. An added complexity is that different roles might need access to the same data set, but with differing levels of visibility. Organizations need automated and dynamic controls and policies, including classification, tagging, masking, and granular role-based access controls, to enforce this at scale.

Any trendwatcher or prognosticator will tell you that organizations are increasingly concerned with meeting data governance requirements. (That Info-Tech report notes “adaptive data governance” as a key trend, while a McKinsey report advises companies to establish an enterprise-wide data governance program to help themselves thrive in economic uncertainty.) But we wondered what organizations are actually doing. So we took a look at our own user base.

Snowflake offers a wide range of native governance policies and controls that can be consistently enforced across all data, regardless of region and cloud. As an example, users can employ dynamic masking policies that conceal sensitive data down to fine-grained levels from roles that aren’t permitted to access that data. We found that our users are increasingly applying these controls to understand and protect data at scale. Across the Snowflake Data Cloud, the number of applied dynamic masking policies active for capacity customers grew more than 205%, comparing Jan. 31, 2023, to Feb. 1, 2022. Again, that’s six times the growth of
our customer base, suggesting that users globally are becoming more rigorous in their application of such governance policies.

SQUARE: SMARTER, MORE EFFECTIVE DATA GOVERNANCE

Financial services platform Square uses Snowflake to enable secure, governed access to data. Square’s data platform team relies on Snowflake to answer complex questions about data lineage, data accessibility, and data versioning. Anomaly detection algorithms monitor Square’s data lifecycle to avoid bad data and ensure data quality. The company is progressing toward self-service data governance, in which data customers can take care of
their own data rather than waiting on a data governance team.

Furthermore, after releasing the feature in 2021, we saw meteoric growth in object tagging, which helps add context to data to better use and protect it, as well as trigger automations and actions. Users may use tags to track personally identifiable information in a data set, and then understand who has been accessing those sensitive data sets. The number of such tags applied by Snowflake customers increased nearly 40x comparing Jan. 31, 2023, to Feb. 1, 2022.

Our takeaway is that as privacy becomes more important, organizations will turn away from siloed environments with separate governance controls, favoring a single platform that makes it easy to know and protect your data and your AI models.

VERADIGM: AUTOMATION DRIVES SCALE AND INNOVATION

Veradigm, a technology company that delivers care and financial solutions to healthcare providers, turned to Snowflake to modernize its data environment. Snowflake’s multi-cluster shared data architecture, with automatic scaling of storage and compute resources, eliminated Veradigm’s performance issues while lowering costs. The fully managed infrastructure and near-zero maintenance helped Veradigm’s data team support additional data use cases without increasing head count.

TREND 4: COMPANIES ARE EMBRACING AUTOMATION AND EXPECT A FULLY MANAGED PLATFORM.

Increasingly, customers are taking advantage of automation capabilities, enabling greater efficiency and minimizing operating costs. This will be increasingly important: Running massive AI models requires scale, expertise, and planning for an unpredictable amount of resources. Human speed isn’t fast enough; every organization must look to take its approach to automation to the next level.

A basic example is resizing compute resources to right-size them and improve query performance. In January 2023, Snowflake performed millions of automated warehouse resize events a day to adapt to customer needs. We saw a 71% rise in such activity comparing Jan. 31, 2023, to Feb. 1, 2022.

The shift to greater use of automation creates a new challenge for CIOs, CDOs, and data platform administrators: how to establish financial governance. Data platform owners want to track use of resources by department or purpose, monitoring consumption to help align costs
to value. The cloud has unlocked abundant resources, allowing teams to deliver value faster. Compute can be spun up and spun down to meet the organization’s changing needs rather than leaving IT teams trapped in a slow-moving world of fixed-capacity resources. In this new world, the mission for data platform owners is to establish their internal practices around cost optimization and financial governance to help executives connect cost to value.

Automation will not only power AI, it will be powered by it. An Enterprisers Project article about automation trends called out cloud resource optimization, edge computing, and DevSecOps as areas that would benefit from AI-driven automation in the coming years. Judging from what we see in the Data Cloud, data and IT experts are more than ready to take advantage of these automations to maximize performance and minimize costs.

ELEMENTS OF DATA AND AI SUCCESS

Each of these four trends presents its own opportunities to form strategies that address business needs and the competitive landscape. Below are essential recommendations derived from both the individual trends and a more holistic consideration of how leading organizations are embracing new possibilities to use their data to improve decision-making and accelerate positive outcomes.

REMOVE THE BARRIERS TO COLLABORATION

Many organizations have spent years trying to eliminate silos by bringing data together in one place. But their processes or tools (the actual work they do with the data) create new silos
that inhibit collaboration and reduce efficiency. Whether you’re collaborating with global teams, different business units, or with third-party providers, you should be able to work as seamlessly with your data as though you’re all in the same room, on the same systems. Modernize your data environment with an eye toward making data secure and shareable, and toward reducing wait time. Teams should be able to access live data in the moment, to drive more powerful insights between analyst teams, data scientists, and others.

LET TEAMS WORK IN THEIR PREFERRED LANGUAGE

Technical people almost always have preferred languages, and they’ll gravitate to the tools that are designed for Python, SQL, Scala, etc. This can create translation problems, but the answer is not to force your people to stop using the languages and tools they love. (They probably won’t, and it might hurt retention of hard-to-replace talent.) Instead, modernize your environment to allow people to work with their preferred tools and languages. You need to give them that flexibility while also providing shared governance, proper performance capabilities, etc. Otherwise, you’re inevitably re-siloing your data through incompatible, inflexible work processes.

TURN GOVERNANCE INTO AN AI ACCELERATOR

The purpose of governance is to keep data secure and manage access in order to comply with relevant policies and regulations. The easiest way to do that is to lock all data down and minimize access to it, which can also minimize the value of the data. Refocus your
data governance regime around safely enabling the use of data. Don’t create multiple copies of the same data set, with conflicting governance policies. Ensure that everyone who needs to can access the same canonical data set, but with role-based dynamic masking that makes sure each user sees only what they should be permitted to see. Use automation to simplify governance of data across systems and clouds so that data is managed from one place rather than ad hoc across your entire ecosystem. And make sure that governance is considered up front, as data is ingested and uses are planned, rather than trying to securely manage it as an afterthought. All this ensures that the data is protected, while allowing the value to be realized.

MINIMIZE TOTAL COST OF OWNERSHIP OF YOUR DATA SYSTEMS THROUGH AUTOMATION

As organizations scale in terms of data volume and overall complexity, it quickly becomes impossible to optimize your data environment through manual processes. Fortunately, machines scale when humans can’t. Automating efficiencies in terms of how resources are managed not only prevents costly human error (like the dev environment no one remembered to spin down), but it also frees your teams to build rather than sink time into managing resources, installing upgrades, and performing other maintenance. Efficiency through automation prevents waste of resources and ensures more value through greater productivity.

NEXT STEPS

Learn more about how Snowflake can help you improve governance and collaboration through automation and AI, and get more value from your data.

OPTIMIZE COST AND PERFORMANCE

Learn more about how the Snowflake Data Cloud helps companies optimize performance and minimize costs.

FIVE STEPS TO SUCCESSFUL DATA GOVERNANCE

For more guidance on Trend 3 in this report, read this ebook to discover how to define a data governance strategy and scale it as your data grows.

OPERATE AT GLOBAL SCALE WITH SNOWGRID

See how Snowflake’s unique cross-cloud technology layer, Snowgrid, helps global organizations overcome challenges in terms of collaboration, governance, and business continuity to maximize the value of their data.

ABOUT SNOWFLAKE

Snowflake enables every organization to mobilize their data with Snowflake’s Data Cloud. Customers use the Data Cloud to unite siloed data, discover and securely share data, and execute diverse analytic workloads. Wherever data or users live, Snowflake delivers a single data experience that spans multiple clouds and geographies. Thousands of customers across many industries, including 573 of the 2022 Forbes Global 2000 (G2K) as of January 31, 2023, use Snowflake Data Cloud to power their businesses.

© 2023 Snowflake Inc. All rights reserved. Snowflake, the Snowflake logo, and all other Snowflake product, feature and service names mentioned herein are registered trademarks or trademarks of Snowflake Inc. in the United States and other countries. All other brand names or logos mentioned or used herein are for identification purposes only and may be the trademarks of their respective holder(s). Snowflake may not be associated with, or be sponsored or endorsed by, any such holder(s).