THE STATE OF SECRETS SPRAWL 2023
GitGuardian State of Secrets Sprawl

10M new secrets detected in public GitHub commits in 2022 (+67%)

We have never detected as many secrets, and secrets sprawl has been accelerating yearly since 2020. Hard-coded secrets increased by 67% compared to 2021, whereas the volume of scanned commits rose by 20% (from 860M to 1.027B commits between 2021 and 2022).

Foreword

Hard-coded secrets have never been a more significant threat to the security of people, enterprises, and even countries worldwide. IT systems, open-source projects, and entire software supply chains are vulnerable to the exploitation of keys left by mistake in source code. As the world's digital footprint grows, millions of such keys accumulate every year, not only in public spaces such as code-sharing platforms but especially in closed spaces such as private repositories or corporate IT assets. In other words, secrets sprawl on GitHub is only the tip of the iceberg.

This wouldn't be so concerning if credential theft weren't the most common cause of data breaches. The 2022 editions of Verizon's DBIR and the IBM Cost of a Data Breach reports highlighted that this attack vector has remained the top concern since 2021:

"Use of stolen or compromised credentials remains the most common cause of a data breach. Stolen or compromised credentials were the primary attack vector in 19% of breaches in the 2022 study and also the top attack vector in the 2021 study, having caused 20% of breaches. Breaches caused by stolen or compromised credentials had an average cost of USD 4.50 million." (From the IBM Cost of a Data Breach 2022.)

Secrets are not just any kind of credentials; they are the keys to obtaining privileged access to secure systems. Because of the leverage they provide, they are hackers' most sought-after information. However, many infosec incidents that occurred in 2022 pointed to how inadequate their protection is.

A look back at 2022 major incidents

Secrets are found in one way or another in most of the security incidents that happened in 2022. We can classify them into three categories:

Stolen source code repositories
Dec. 21: Okta admitted a breach of its GitHub repositories resulting in source code theft.
Aug.-Dec.: LastPass source code is stolen, leaking credentials and keys used months later to access and decrypt storage volumes.
Mar. 7: 200GB of Samsung source code is leaked, revealing 6,695 hard-coded secrets.
Dec. 7: Slack employee tokens are stolen and misused to download private code.
Nov. 1: Dropbox disclosed that 130 stolen code repositories contained API keys.
Mar. 22: 250 Microsoft projects are leaked, revealing 376 hard-coded secrets.
Feb. 25: NVIDIA source code is leaked by the "Lapsus$" group.

Secrets exploited in an attack
Dec. 29: An attacker leveraged malware deployed to a CircleCI engineer's laptop to steal a valid, 2FA-backed SSO session. They could then exfiltrate customer data, including customer environment variables, tokens, and keys.
Sep. 15: An attacker breached Uber and used hard-coded admin credentials to log into Thycotic, the firm's Privileged Access Management platform. They pulled off a full account takeover on several internal tools and productivity applications.

Secrets exposed publicly
Oct. 7: Toyota disclosed that a contractor had left a credential giving access to user data exposed on GitHub for five years.
Sep. 1: Research reveals 18,000+ Android apps leak hard-coded secrets.
Nov. 16: Tom Forbes disclosed Infosys leaked FullAdminAccess AWS keys on PyPI for over a year (and then 57 other AWS keys on PyPI).

Summary

Public Monitoring
How leaky was 2022?
How does secrets sprawl threaten software supply chain security?
From code to cloud: Infrastructure as code
Measuring time-to-hacked: our experiment with honeytokens
Fun facts
Insight from DarkOwl: The hidden economy of credentials on the darknet
Taming Secrets Sprawl in the SDLC
What's a good strategy for mitigating hard-coded secrets?
Conclusion
About GitGuardian
Appendix

PUBLIC MONITORING

1.027B commits scanned by GitGuardian (+20% compared to 2021)

About GitHub in 2022 (from the Octoverse 2022, see Methodology):
94M developers (+27%)
85.7M+ new repositories (+20%)
HCL (HashiCorp Configuration Language) is the fastest-growing language on GitHub.

How leaky was 2022?

GitHub's organic growth and the improvements of our detection engine (including +35 new detectors in 2022) partly explain the growth in the number of detected secrets. But all things equal, there is no doubt: secrets sprawl continues to expand worldwide.

10M secrets occurrences detected in 2022 (3M unique secrets)

1 in 10 authors exposed a secret in 2022. To err is human. Of the 13.3M distinct authors who pushed code to GitHub in 2022, 1.35M accidentally exposed a secret.

5.5 commits out of 1,000 exposed at least one secret (+50%)

3.7% of repositories active during 2022 leaked a secret. 61.2M repositories were active in 2022; 2.27M of those repositories leaked a secret.

Secrets sprawl over the years (new secrets detected on GitHub, in millions; source: GitGuardian State of Secrets Sprawl report):
2020: 3
2021: 6
2022: 10

Map of secrets leaks
Top 10 countries, ranked 01 to 10: India, China, USA, Brazil, Germany, Nigeria, South Korea, Bangladesh, France, Russia.
(From GitHub profiles mentioning location.)

Debunking a myth: hard-coding secrets is a junior developer mistake.

It is a common myth that hard-coded secrets are committed mainly by junior developers. In reality, this can happen to developers at any level of experience or seniority. Hard-coding secrets is often a result of convenience rather than a lack of knowledge or skill. Senior developers, who might simply be testing a database connection or an endpoint, are under tremendous pressure to deliver quickly to meet business demands. They are responsible for many hard-coded secrets too! Therefore, it is essential to recognize that secrets sprawl is a systemic issue, not just a problem for junior developers.

shittysecrets.dev
stories:

"I was working on building a docker image in a new repo & needed it to be pushed to Dockerhub. When I was happy with my Dockerfile, I ran git add . & pushed everything to my public github repo. Unfortunately, I hadn't added a gitignore file to the repo, so my env was committed to Github, along with my Dockerhub credentials."

"A coworker of mine, after some restructuring of env files and how they are loaded, committed an env file that was supposed to be ignored, with all the credentials for multiple integrations - postmark, db passwords, algolia keys, you name it."

"Well, a typical story of a temporary hardcoded solution, which got pushed to git and exposed everything for hell a lot longer time than it was originally expected."

Secrets categories

[Figure: Spread by category for unique specific secrets. Categories: Data storage, Other, Messaging system, Private key, Cloud provider, Version control platform. Shares shown: 24.7%, 27.7%, 12%, 11.6%, 3.8%, 20.2%.]

Secrets detectors

GitGuardian uses two classes of detectors: specific and generic. Specific detectors match recognizable secrets, like an AWS access key or MongoDB database credentials. In 2022, our specific detectors accounted for 33% of the secrets detected (see the Methodology section). Here are some of the top specific secrets caught in 2022:

Spread by detector for unique specific secrets:
Google API Key: 9.7%
RSA Private Key: 6.4%
Generic Private Key: 4.8%
Google Cloud Keys: 4.7%
PostgreSQL Credentials: 4.3%
GitHub Access Token: 3.9%
MySQL Credentials: 3.4%
AWS Keys: 3%
Google OAuth2 Keys: 2.6%
Other: 57.2%

On the other hand, generic detectors match a broad range of secrets, for example, a company email and a password that would end up hard-coded in a file. In a detection strategy, generic detectors are essential to ensure that no valid secrets fall through the cracks of specific detectors. To maximize precision and avoid false positives, each uses a carefully crafted set of conditions (regarding the filename, the file path, the surrounding context, etc.). In 2022, they accounted for 67% of the secrets detected (see the Methodology section), which shows
their importance.

Spread by detector for generic secrets:
Generic Password: 56%
Generic High Entropy Secret: 38%
Username Password: 2.6%
Others (Bearer token, Basic Auth string, Authentication tuple, etc.): 2%
Base64 Generic High Entropy Secret: 0.9%
Company Email Password: 0.4%
Generic Database: 0.1%

Filename extension sensitivity index

Top 20 file extensions (share of unique secrets):
py 20.92%, json 14.97%, env 14.31%, js 14.23%, (empty) 3.79%, properties 3.72%, pem 2.82%, php 2.51%, key 2.29%, yml 2.07%, txt 1.93%, ts 1.91%, log 1.84%, xml 1.81%, java 1.46%, plist 1.22%, example 1.04%, yaml 1.00%, ipynb 1.00%, cs 0.98%

Related to the filename extension frequency on GitHub, we can build a sensitivity index (the higher the index, the more secrets uncovered per file of this type):
env 1,461.63, key 561.52, pem 324.92, example 41.77, properties 14.29, plist 13.34, py 5.79, yaml 3.27, log 2.68, json 1.88, js 1.79, yml 1.35, php 1.16, ipynb 0.91, txt 0.80, (empty) 0.76, xml 0.76, ts 0.60, cs 0.55, java 0.38

I would say, "Good luck," to someone who says secrets detection isn't a priority. Their priorities are probably wrong. One of the easiest ways for intrusion, as well as losing a lot of money in your company, is getting your secrets leaked somehow.
Andrei Predoiu, DevOps Engineer at a wholesaler/distributor with 10,001+ employees
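
Returning to the two detector classes described under "Secrets detectors", the difference can be sketched in a few lines of Python. The specific detector below is a regular expression for the widely documented AWS access key ID shape ("AKIA" followed by 16 uppercase letters or digits); the generic detector combines surrounding context (a secret-looking variable name) with a Shannon entropy check. The function names, the keyword list, and the 3.5 bits-per-character threshold are assumptions made for illustration, not a description of GitGuardian's engine.

```python
import math
import re
from collections import Counter

# Specific detector: AWS access key IDs have a recognizable format
# ("AKIA" followed by 16 uppercase letters or digits).
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Generic detector: a quoted value assigned to a secret-looking name.
# The keyword list is a hypothetical choice for this sketch.
GENERIC_ASSIGNMENT = re.compile(
    r"(?i)\b\w*(?:password|passwd|secret|token|api_?key)\w*\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
)


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of a string, in bits per character."""
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_line(line: str, entropy_threshold: float = 3.5) -> list[tuple[str, str]]:
    """Return (detector_name, matched_value) pairs found in one line of text."""
    findings = [("aws_access_key_id", m.group(0)) for m in AWS_ACCESS_KEY_ID.finditer(line)]
    for m in GENERIC_ASSIGNMENT.finditer(line):
        candidate = m.group(1)
        # Context (the variable name) plus high entropy helps limit false positives.
        if shannon_entropy(candidate) >= entropy_threshold:
            findings.append(("generic_high_entropy", candidate))
    return findings
```

A low-entropy value such as a repeated character is ignored by the generic detector even when the variable name looks sensitive, which is one simple way to trade recall for precision.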
How does secrets sprawl threaten software supply chain security?

When weighing the risk posed by secrets sprawl, it's essential to consider the ensemble of hard-coded plaintext secrets rather than individual secrets taken separately: the more secrets there are, the more potential attack vectors there are for a malicious actor.

An excellent example of this principle was demonstrated in research by Ronen Shustin and Shir Tamari from Wiz, a cloud security vendor. They use the image of a keychain to illustrate the concept. Harvesting plaintext credentials along their virtual tour of IBM Cloud Databases for PostgreSQL, they discovered a vulnerability, dubbed "Hell's Keychain," that combined three exposed secrets and a network misconfiguration. This vulnerability would have allowed them to compromise IBM Cloud's internal image-building process and, ultimately, to read and modify the data stored in every instance of IBM Cloud Databases for PostgreSQL. In other words, they would have been able to expose IBM Cloud's customers to a supply-chain attack.

"The keychain symbolizes the collection of one or more scattered secrets the attacker finds throughout the target environment. Although both components (the forbidden link and the keychain) are individually unhygienic, they form a fatal compound when combined."

The third plaintext credential in the keychain was a hard-coded secret in a container image's metadata,
allowing them to infiltrate IBM Cloud build servers.

"Hell's Keychain illustrates how scattered plaintext credentials across your environment can impose a huge risk on your organization by impairing its integrity and tenant isolation. Moreover, the vulnerability emphasizes the need for strict network controls and demonstrates how pod access to the Kubernetes API is a common misconfiguration that can result in unrestricted container registry exposure and scraping. Finally, we were reminded of the value of secret scanning. Although in previous cases, our team managed to violate tenant isolation by exploiting vulnerabilities in neighbor-tenant instances or the control plane, in the case of IBM Cloud Databases for PostgreSQL, the Achilles heel was improper secrets management. Regardless of how strong your organization's security measures are, it faces a huge risk if plaintext credentials are scattered across its environment."

From code to cloud: Infrastructure as code

Infrastructure as code (IaC) is the abstraction layer used to declare the final/desired state of IT infrastructure with code: servers, storage, databases, networks, and
all the basic configurations (DNS entries, firewalls, etc.). Empowering developers, SREs, and platform engineers to collaborate faster and more effectively than ever before, IaC has become a staple in cloud-native deployments. With a 24% CAGR projected from 2022 to 2030 (from Firefly's State of IaC 2023), infrastructure as code adoption is in full swing. This is also evident on GitHub, where IaC-related contributions (patches to IaC files) increased by 28% in 2022. IaC has made infrastructure workflows shift left. Cloud security is shifting left with it.

[Figure: Distribution of IaC filetypes on GitHub in 2022. File types: Dockerfile, Kubernetes*, Terraform, Ansible, Docker Compose, Azure Resource Manager, CloudFormation. Shares shown: 32.5%, 31.9%, 18.9%, 9%, 6.2%. *Kubernetes-related .yaml or config files.]

Shifting left: Infrastructure as code security

Code managing infrastructure can lead
to uncaught mistakes and security vulnerabilities. We found that Terraform files had an average of 5.57 occurrences of secrets (2.11 unique secrets) per 1,000 patches, which is 3x the average for all file extensions. The figures below show that all file extensions can hide secrets. Therefore, it is essential to scan them all. Moreover, secrets are not the only risk present in IaC files. A single misconfiguration in an IaC manifest can break a security policy and make the deployed infrastructure vulnerable to attacks.

Secrets occurrences per 1,000 patches:
.env: 412.95
.properties: 26.24
.py: 7.04
.tf or .tfvars: 5.57
.js: 3.59
.json: 3.46
.yaml or .yml: 2.68
Every extension: 1.8

The most common IaC vulnerabilities are:

Networking misconfigurations: unrestricted egress or ingress traffic can expose assets to
attacks such as remote code execution. The use of HTTP instead of HTTPS is also frequent.
Data exposure misconfigurations: S3 buckets without encryption can lead to data leakage.
Secrets: exposing a sensitive environment variable in the configuration can lead to a plaintext credentials leak.
Permission misconfigurations: using the default service account on a compute instance allows an attacker to spread through the network.

[Figure: Number of vulnerabilities per Terraform repo. Buckets: 1, 2, 3-5, 6-9, 10-20, 21-50, 51+. Shares shown: 40.3%, 1.9%, 20.3%, 8.6%, 6.8%, 4.2%, 17.9%.]

Infrastructure as code is an entirely new attack surface to protect. We estimate that 21.52% of all Terraform repositories have one or more security policy vulnerabilities (based on a private repositories dataset where 7.5% of repos contained Terraform files; see Appendix).

Secrets sprawl will continue to grow as a problem since a lot of developers forget that IT security aspect. They just copy and paste stuff, then leave it in the code and forget about it. That is how attacks happen; somebody slipped, making a mistake or misconfiguration.
Edvinas Urbasius, IT Security Specialist, SOC analyst at a wholesaler/distributor with 10,001+ employees

Secrets in Docker images

Docker images are one of the largest unmonitored attack surfaces. Images are complex, layered structures that present more risk than meets the eye. As highlighted earlier, even commercially available images distributed by reputable cloud vendors can leak secrets through their build process (see the Hell's Keychain example above). A study we conducted last year on Docker Hub showed that more than 4,000 secrets (1,200+ unique) were hard-coded in a 10,000-image sample. Also, 4.62% of the Docker Hub images sample exposed one or more secrets. It is clear today that scanning for secrets in images (whether home-built or third-party-provided) is essential to guard against supply chain attacks.

In December 2022, […] was called 600K times to scan 17,300 Docker images for secrets.
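
One inexpensive check, sketched here under stated assumptions, is to inspect the build instructions themselves: literal values set with ENV or ARG end up in image metadata or build history, so a secret-looking variable with a literal value is worth flagging. The helper name `find_hardcoded_secrets` and the keyword list are hypothetical, and this sketch reads only Dockerfile text; scanning the layers of a built image is a separate, larger task that it does not attempt.

```python
import re

# Hypothetical keyword list for secret-looking variable names.
SECRET_NAME = re.compile(r"(?i)(password|passwd|secret|token|api_?key|credential)")

# Matches "ENV NAME=value", "ENV NAME value", and "ARG NAME=value".
INSTRUCTION = re.compile(
    r"^\s*(ENV|ARG)\s+([A-Za-z_][A-Za-z0-9_]*)(?:=|\s+)(\S.*)$", re.IGNORECASE
)


def find_hardcoded_secrets(dockerfile_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, instruction, variable_name) for suspicious ENV/ARG lines."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = INSTRUCTION.match(line)
        if not match:
            continue
        instruction, name, value = match.groups()
        # A value that only references another variable (e.g. "$TOKEN") is not a literal.
        if SECRET_NAME.search(name) and not value.strip().startswith("$"):
            findings.append((number, instruction.upper(), name))
    return findings
```

An ARG declared without a default (for example `ARG API_TOKEN`) is not flagged, since no literal value is written into the file.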
Measuring time-to-hacked: our experiment with honeytokens

GitGuardian is not the only service monitoring GitHub in real-time. Malicious bots are also looking to exploit sensitive information as soon as it is exposed. When a honeytoken (also called a canary token) is deliberately exposed as an experiment, it's easy to see why GitHub-exposed data should be considered compromised.

Not all credentials are born equal. For this experiment, we leaked two types of honeytokens on GitHub: a set of AWS IAM credentials and PostgreSQL credentials. We then plotted the number of times they were used by third parties over multiple hours:

[Figure: Honeytoken trigger count over exposure time, from 0:00:02 to 8:59:58, for two series: POSTGRES PASSWORD and AWS IAM.]

What is a honeytoken?
A honeytoken is a resource that is monitored for access or tampering. Usually, honeytokens come in the form of a URL, file, API key, email, etc., and trigger alerts whenever someone (presumably an attacker) trips over them, producing an Indicator of Compromise (IoC) for security teams.
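
The honeytoken idea can be modeled in a few lines: a decoy credential that no legitimate system uses, plus a record of every attempt to use it. The sketch below is a minimal illustration, assuming a hypothetical `Honeytoken` class whose `record_use` method would be called from whatever access log or authentication hook observes the credential; the field and method names are not taken from any real product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Honeytoken:
    """A decoy credential that no legitimate system should ever use."""
    name: str
    value: str
    exposed_at: datetime
    triggers: list[tuple[datetime, str]] = field(default_factory=list)

    def record_use(self, presented_value: str, source_ip: str, when: datetime) -> bool:
        """Record a trigger if the presented credential is the decoy; return True if so."""
        if presented_value != self.value:
            return False
        self.triggers.append((when, source_ip))
        return True

    def time_to_first_use(self) -> Optional[timedelta]:
        """Delay between exposure and the first recorded use, or None if never used."""
        if not self.triggers:
            return None
        return min(when for when, _ in self.triggers) - self.exposed_at

    def distinct_sources(self) -> int:
        """Number of distinct IP addresses that used the decoy."""
        return len({ip for _, ip in self.triggers})
```

Because any use of the decoy is illegitimate by construction, each trigger can be treated as an indicator of compromise without further triage; `time_to_first_use` and `distinct_sources` correspond to the kind of measurements an exposure experiment would report.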
The first attempt occurred 2 seconds after publicly exposing the credentials. In total, 24 IPs (22 if we remove AWS and GitGuardian) requested AWS IAM information using the credentials in the first 9 hours after the leak. Although no IP used the PostgreSQL credentials in the elapsed time, that doesn't mean that they are any less problematic: on the contrary, bounty hunters are actively looking for "dorks" on GitHub, which are easily recognized patterns to identify low-hanging fruits.

Fun facts

ChatGPT and OpenAI are (without surprise) the buzzwords of 2022:

Commit some okay, but first, revoke that key!

[Figure: Number of documents mentioning OpenAI or ChatGPT, and number of OpenAI API keys detected (right axis), from 2022-01-01 to 2023-01-01.]

Secrets being used to access resources is probably one of the most common ways to be involved in a high-profile breach these days. If you are not detecting secrets in code, then every developer's machine is a security breach waiting to happen.
Jon-Erik Schneiderhan, Senior Site Reliability Engineer at a computer software company with 501-1,000 employees

Insight from DarkOwl: The hidden economy of credentials on the darknet

When it comes to secrecy, there is one place that cannot be ignored: the darknet.

The darknet, also referred to as the dark web, is a layer of the internet designed specifically for anonymity. It is more difficult to access than the surface web and is accessible only by using specialized software or network proxies, specifically browsers supporting special protocols. Users cannot access the darknet by simply typing a dark web address into a web browser. Adjacent to the darknet are other networks, such as instant messaging platforms like Telegram and the deep web (non-public web). Due to its inherently anonymous and
privacy-centric nature, it facilitates a complex ecosystem of cybercrime and trade in illicit goods and services. The dark web is a thriving ecosystem within the global internet infrastructure that many organizations struggle to incorporate into their security posture. Still, it is an increasingly vital component for organizations with forward-thinking strategies.

"Secret" data, including tokens and keys found on open repositories such as GitHub, are easily re-sold (or sometimes shared for free) on the darknet and deep web.

How does it work?

In some cases, such as that of the deep web site BreachForums, leaked data
is offered for download via vendor-specific currency in the form of generally inexpensive credits. Another way to accrue credits is to share other breached data for users to download. Users can also gain credits to purchase these stolen data packets by commenting on and engaging with other user posts. Both of these aspects of the darknet breach economy encourage the discovery and re-sharing of sensitive user data and creativity in exploiting previously exposed information.

Darknet 101

Consequently, an extensive amount of sensitive information is available for download on the darknet and deep web, ranging in price from free to several thousands of dollars. While such free exchanges may challenge the use of the word "economy," it is crucial to remember how this stolen information is used. The vast
majority of cases result in hackers gaining illicit access to user accounts and either exploiting them for financial gain or using them to pivot into corporate network access to carry out larger-scale attacks.

Examples

Verizon
The screenshots below demonstrate a typical database leak offering. This breached information has been offered entirely free (no digital currency or credits are required to download).

DoorDash User Accounts
While the token or credit-based nature of the darknet economy
does support "free" or more covert methods of exchanging secret data (such as credits), this is not always the case. For example, this DoorDash database, containing username and password pairs for over 650,000 individuals, was offered at a starting bid of US$10,000 in August 2022.

TSA No-Fly List
Hackers offered the recently leaked US TSA No-Fly list in exchange for credit tokens on a deep web forum.

Are hackers exchanging secrets on the darknet?

The shift towards everything-as-an-API in the commercial landscape echoes what DarkOwl analysts see in the darknet. Discussions around stealing and selling API keys are a relatively new phenomenon in the darknet over the last couple of years, and we expect them to continue to grow. Threat actors who are looking to facilitate the wider distribution of malware through supply chain compromises have also discussed credentials and pivot points sourced from open repositories.

Developers and security researchers worldwide have been equally appalled and conflicted by the intentional sabotage of open-source software packages. Many are concerned about
the reputational damage these incidents cause to the open-source software development movement.

How many credentials are for sale on the darknet?

While it is impossible to grasp the total size of the underground digital economy, DarkOwl does have insight into certain entities that indicate the potential for exploitation, including sensitive credential information. DarkOwl's AI and analyst-augmented database is updated in near real-time and collects from hard-to-reach places of the darknet, including authenticated forums, ransomware sites, chat platforms, open server databases, and breach/leak exchanges. As of January 2023, their records detected:

DarkOwl Jan 2023 sensitive credential information:
Total emails: 9,333,991,605
Plaintext passwords: 5,003,601,685
Unique emails: 2,840,355,999
Unique passwords: 2,543,145,887
Strong passwords: 1,233,863,437
Hashed passwords: 637,477,376
Unique email/password combos: 519,110,557

From DarkOwl's data (Jan 2023): while it is unclear exactly how many total API keys have been leaked on the darknet, DarkOwl has identified 11,246 unique exposed API keys for Amazon Web Services (AWS) alone.

If a colleague in security said to me that secrets detection is not a priority, I would say that's a mistake. Most of the big security problems come from either social engineering attacks or credential stuffing. So it's really important to know that your engineers and your employees are going to leak secrets. That's life. Most of the time, it's due to mistakes. But if it happens, we need to act on it. The more engineers there are, the more there is potential
for leaks to happen.
Théo Cusnir, Application Security Engineer at PayFit

Taming Secrets Sprawl in the SDLC

Secrets can get exposed in many ways. Source code is an asset that can quickly be lost to subcontractors and, of course, source code theft. Developers must often be reminded of the security risks of storing sensitive information in source code and other software artifacts. Otherwise, they may be tempted to think their private accounts and assets will stay out of the reach of attackers. They may also be unaware of the need to use secure methods for storing secrets, such as environment variables or vaults, or simply not be appropriately trained in the workplace.

Like many other security challenges, poor secrets hygiene involves the usual trifecta of people, processes, and tools. Organizations serious about taming secrets sprawl must work on all these fronts simultaneously.

People
The first part of addressing poor secrets hygiene in software development is to ensure that Dev, Sec, and Ops are involved in the process and trained on secure coding best practices, including the proper use of secrets. Additionally, the stakeholders should be held accountable for adhering to these best practices and have clear avenues to report potential security violations and contribute to fixing them.

Processes
Establishing a secure software development lifecycle (S-SDLC) is critical in preventing the misuse of secrets in software development. It should include secure coding guidelines, processes for provisioning and managing secrets, and processes for securely distributing them across various environments.

Tools
Using technologies such as detection, encryption, tokenization, and dynamic key rotation can help secure secrets in software development. Additionally, systems should be designed with a focus on built-in, not bolt-on, security.

What's a good strategy for mitigating hard-coded secrets?

Hard-coded secrets detection and mitigation can be shifted left at various levels to build defense-in-depth across the development cycle. To gradually move to a "zero secrets-in-code" policy, here is a step-by-step approach:

1. Start by monitoring commits and merge/pull requests in real-time for all your repositories with native VCS or CI integration (shift to the team level).
2. Enable pre-receive checks to harden central repositories against leaks, and "stop the bleeding."
3. Educate about using pre-commit scanning as a seatbelt (shift to the individual level).
4. Plan for the longer term: develop your strategy for dealing with incidents discovered through the historical analysis.
5. Implement a secrets security champion program.

Continuously scan incremental code changes at every stage of the SDLC.

Conclusion

Secrets sprawl continues accelerating, and there will never be a better time to act. With 10 million secrets discovered in public GitHub commits and countless more silently accumulating behind closed doors, this is one of the biggest threats to the security of the digital world. With attackers recognizing that compromising machine or human identities yields a higher return on investment, especially when targeting software supply chain components, this
trend will unfortunately not dissipate soon. Companies now understand that source code is one of their most valuable assets and must be protected. Scaling these efforts will require bringing security and development closer and advancing towards shared responsibility for application security. The same holds for the detection and remediation of incidents.

The very first step is to get a clear audit of the organization's security posture regarding secrets: where and how are they used? Where do they leak? How to prepare for the worst? You can start right away by taking the (anonymous) secrets management maturity
questionnaire and learn where to go from there with this white paper. Or you can contact us to get a complimentary audit of your company's exposed secrets on GitHub. Organizations must be prepared for secrets sprawl and have the right tools and resources to promptly detect and remediate any issues. It's time to take action!

About GitGuardian

GitGuardian is a code security platform that provides solutions for the DevOps generation. A leader in the market of secrets detection and remediation, its solutions are already used by hundreds of thousands of developers. GitGuardian helps developers, cloud operation, security, and compliance professionals secure software development and define and enforce policies consistently and globally across all systems. GitGuardian solutions monitor public and private repositories in real-time, detect secrets, sensitive files, and IaC misconfigurations, and alert to allow investigation and quick remediation. Additionally, GitGuardian's Honeytoken module exposes decoy resources like AWS credentials, increasing the odds of catching intrusion in the software delivery pipeline.

GitGuardian is by far the No. 1 security application on the GitHub Marketplace and is trusted by leading companies, including Instacart, Snowflake, Orange, Bouygues Telecom, Iress, Maven Wave, NOW: Pensions, DataDog, and PayFit.

Learn more about GitGuardian: Website, Public Monitoring, Internal Monitoring

Appendix
Definitions

Secret
A secret can be any sensitive data we want to keep private. When discussing secrets in the context of software development, we refer to digital authentication credentials that grant access to services, systems, and data. These are most commonly API keys, username and password combos, or security certificates. In this report, secrets refer to credentials hard-coded in cleartext (not encrypted).

Occurrence
When our detection engine detects a hard-coded secret, it becomes an occurrence. A single incident generally encompasses multiple occurrences, which are the various locations across files or repositories where the secret was identified. Occurrences map to the magnitude of the sprawl and correlate to the amount of work needed to redistribute the secret after it has been rotated. Occurrences can be likened to technical debt.

Secret incident
A secret incident is a uniquely identified security event that has been determined to impact the organization and necessitates remediation. An incident often has multiple occurrences across files or repositories.
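
The relationship between occurrences and incidents can be illustrated with a small grouping function. This is a sketch under one assumption the definitions imply but do not state outright: that an incident is keyed by the distinct secret value, so every location holding the same secret belongs to the same incident. The `Occurrence` class and `group_into_incidents` function are hypothetical names used only for this example.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One location where a hard-coded secret was detected."""
    repository: str
    path: str
    line: int
    secret_value: str


def group_into_incidents(occurrences: list[Occurrence]) -> dict[str, list[Occurrence]]:
    """Group occurrences by a fingerprint of the secret: one incident per distinct secret."""
    incidents: dict[str, list[Occurrence]] = defaultdict(list)
    for occurrence in occurrences:
        # Key on a hash rather than the raw value so the incident ID is not itself a leak.
        fingerprint = hashlib.sha256(occurrence.secret_value.encode("utf-8")).hexdigest()
        incidents[fingerprint].append(occurrence)
    return dict(incidents)
```

The length of each incident's list is the number of places that must be updated once the secret is rotated, which is why occurrences are a reasonable proxy for remediation effort.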
Supply chain security
A software supply chain is a logistical pathway that covers anything required to build a software artifact. It is the set of assembled components, building tools, and processes from source code to production deployment. Supply chain security is about securing each link in the chain by ensuring that components supplied by third parties have not been compromised and comply with security requirements.

Infrastructure as code security
Infrastructure as code has become the favorite way for the DevOps generation to manage and provision infrastructure, especially in the cloud. It's also a new responsibility since uncaught mistakes can result in misconfigurations and, in the end, important security failures. Securing infrastructure as code requires bridging the gap between Security, Operations, and Development and leveraging automation to build efficient guardrails.

Methodology

Octoverse 2022
The data from the GitHub Octoverse is for the period 0/01/2021 to 09/30/2022. GitGuardian data is for the period 01/01/2021 to 12/31/2022.

Specific and generic detectors
In 2022, GitGuardian's 350+ specific detectors collected, in total, 3.016M unique secrets on public GitHub. Secrets detection is always a tradeoff between precision and recall. To improve the accuracy of our study, we manually removed 626K (specific) secrets from this count: 399K django_secret_key (they often correspond to default or local settings) and 227K googlecloud_keys, which could be considered outliers. The proportions do not represent the mix of specific vs. generic secrets found on GitHub. Our engine detects approximately the same volume of generic and specific secrets.

IaC vulnerabilities
Our study data set comprised 6,311 git repositories hosting Terraform files, of which 1,358 were affected by one or more vulnerabilities (21.52%). In total, 12,046 Terraform vulnerabilities were detected.

About DarkOwl

DarkOwl is a Denver-based information security company specializing in darknet OSINT tools. Using machine learning and human analysts, DarkOwl continuously collects and indexes data from the darknet, deep web, and high-risk surface sites. This includes Tor, I2P, IRC, Telegram, ZeroNet, as well as transient paste sites, deep
web criminal forums, and other high-value sites. Their data is collected and stored in near real-time, allowing it to be queried in a safe and secure manner without having to access the darknet itself. DarkOwl's UI and API products were designed to make previously hard-to-access content actionable for cybersecurity companies, organizations, and governments seeking to mitigate risk from potential threats such as ransomware or malicious actor activity. For more information, visit […]