Accidentally leaking secrets — usernames and passwords, API tokens, or private keys — in a public code repository is a developer's and security team's worst nightmare. Fraudsters constantly scan public code repositories for these secrets to gain a foothold into systems. Code is more connected than ever, so these secrets often provide access to private and sensitive data — cloud infrastructure, database servers, payment gateways, and file storage systems, to name a few. But what happens after a secret is leaked? And what is the potential fallout?
One thing is for certain: the fallout can be catastrophic in terms of financial loss and reputational damage. The now-infamous Uber data breach of 2016 was the result of an engineer accidentally uploading credentials to a public GitHub repository. These credentials allowed bad actors to access Uber's backend databases and confidential information. This breach ultimately resulted in a then-record-breaking $148 million fine — ouch.
More recently, the sophisticated hack against IT management software giant SolarWinds may have originated from exposed file server credentials in an engineer's GitHub repository. This supply-chain attack and the resulting fallout will go down as one of the largest in history, joining the likes of WannaCry. The hack affected over 18,000 SolarWinds customers, including various US federal agencies and tech giants such as Microsoft and FireEye. As news of the hack broke, SolarWinds saw its market cap cut in half, falling by billions of dollars as its share price dipped below pre-IPO levels.
Thankfully these hard-hitting headlines are the exception, not the rule. We can look to publicly disclosed reports on bug bounty platforms such as HackerOne to get a feel for how often these secrets are being found and reported — at least by the good guys. These three reports from 2020 could easily have become front-page headlines or resulted in huge regulatory fines — those companies got lucky.
Finding these secrets is what we do. Shhgit finds over 165,000 secrets every single day across public GitHub, GitLab and Bitbucket repositories. More than half of these are valid and live. And our data suggests it is an ever-increasing problem that often goes unnoticed. You can catch a small glimpse of the scale of the problem on our homepage:
But what happens immediately after leaking secrets? To find out we purposely leaked valid Amazon AWS credentials to a public GitHub repository. We chose to leak AWS keys because we know they are highly sought after by fraudsters with all sorts of different motives — espionage, spamming, financial gain or blackmail. But we didn't quite realise how quickly it would happen...
We wanted to limit our liabilities as much as possible, even though we used a new AWS account loaded with free credits. We definitely did not want to end up footing a huge bill for this experiment. AWS' Cost Management tool alerts you if you go over a set budget, but it will not stop you from going past it. To be on the safe side, we created a script to automatically destroy any new EC2 instances (servers) shortly after their creation. That gives us enough time to forensically capture the server and analyse what the attackers were doing.
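We haven't published the exact script, but a minimal sketch of the idea looks something like the following. It assumes the boto3 SDK configured with the experiment account's credentials; the ten-minute grace period and scheduling via cron are illustrative choices, not our exact setup:

```python
from datetime import datetime, timedelta, timezone

# Illustrative grace period: long enough to forensically image the box.
GRACE = timedelta(minutes=10)

def should_terminate(launch_time, now=None):
    """Return True once an instance has outlived the grace period."""
    now = now or datetime.now(timezone.utc)
    return now - launch_time >= GRACE

def reap_instances():
    """Terminate every running instance older than GRACE."""
    # boto3 is assumed to be installed and pointed at the lab account.
    import boto3
    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]
    doomed = [
        i["InstanceId"]
        for r in reservations
        for i in r["Instances"]
        if should_terminate(i["LaunchTime"])
    ]
    if doomed:
        ec2.terminate_instances(InstanceIds=doomed)

# Schedule reap_instances() from cron every few minutes.
```

Keeping `should_terminate` as a pure function makes the cut-off logic easy to test without touching AWS at all.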
(15:12 UTC) First, we created a new IAM user and attached basic S3 read and EC2 policies. This means the account will only have permissions to access our file storage and spin up new cloud servers. We then published the AWS secret keys to a new GitHub repository.
(15:16 UTC) Just four minutes later we received an e-mail from the AWS team notifying us of the exposed credentials — neat!
Amazon automatically revokes exposed keys by applying a special policy called AWSCompromisedKeyQuarantine. This effectively prevents bad actors from using the key. But it somewhat renders our experiment useless if the keys can't be used. We knew this would happen beforehand so we created a simple script to automatically remove the quarantine policy if found (tip: never do this). Now we just wait for fraudsters to take the bait.
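A sketch of such a de-quarantine script, again assuming boto3 and a hypothetical IAM user name — and again, never do this outside a disposable lab account:

```python
QUARANTINE = "AWSCompromisedKeyQuarantine"

def quarantine_policies(attached):
    """Pick out quarantine policies from a list of attached-policy dicts."""
    # Substring match also catches versioned variants of the policy name.
    return [p for p in attached if QUARANTINE in p["PolicyName"]]

def unquarantine(user_name):
    """Detach any AWS-applied quarantine policy from the given user."""
    import boto3  # assumed installed, pointed at the lab account
    iam = boto3.client("iam")
    attached = iam.list_attached_user_policies(
        UserName=user_name
    )["AttachedPolicies"]
    for policy in quarantine_policies(attached):
        iam.detach_user_policy(
            UserName=user_name, PolicyArn=policy["PolicyArn"]
        )

# Polled in a loop during the experiment, e.g. unquarantine("leaked-user").
```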
(15:17 UTC) Shhgit happened to be monitoring all activity by the user who leaked the secrets, so a minute later we received an e-mail alert of the leak.
(15:18 UTC) A minute later we detected a flurry of activity on the key from an IP address based in the Netherlands. Threat intelligence feeds associate the IP address with spamming, exploitation of Windows hosts, and running a Tor exit node.
The first batch of commands helps the attacker understand the lay of the land. Shortly after, the bad actor spun up two c6g.16xlarge EC2 instances — AWS' most powerful compute instance. Left undetected, this would have cost us thousands of dollars a month. Any guesses on motives?
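The "thousands of dollars a month" figure is easy to sanity-check with back-of-the-envelope arithmetic. The hourly rate below is an assumption for illustration only; actual on-demand pricing varies by region and changes over time:

```python
HOURS_PER_MONTH = 730  # average hours in a calendar month

def monthly_cost(instance_count, hourly_rate):
    """Rough on-demand compute cost, ignoring storage and data transfer."""
    return instance_count * hourly_rate * HOURS_PER_MONTH

# Assumed rate of roughly $2.18/hour for a c6g.16xlarge (illustrative):
# two instances running flat out land north of $3,000 a month.
cost = monthly_cost(2, 2.18)
```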
We analysed the server afterwards: it was a base install of Ubuntu with XMRig, a miner for $XMR (Monero) — nothing overly exciting.
(15:54 UTC) Shortly after, another actor with an Israeli residential IP address used the secrets to access our S3 buckets (file storage) and download its entire contents (which we filled with random data). We booby-trapped some of the files with tracking pixels but unfortunately none of them triggered.
One attacker who copied the files from our S3 buckets started a live chat conversation through the shhgit.com homepage (the bucket was named shhgit) — a snippet of the conversation is below.
Ultimately this ended with a demand that we pay him for his services in finding the 'bug'. We politely declined his generous offer. We believe this could have quickly turned into an extortion attempt if the files had been of any value.
It took just six minutes for the malicious actor to find the leaked credentials on GitHub and compromise our AWS account. So why did we say 'in less than a minute' at the beginning of the post? Attacks like this are made possible by GitHub's "real time" feed. Every fork, code commit, comment, issue and pull request is fed into a public events stream. Bad actors watch this stream to identify new code and scan it for secrets. GitHub delays the feed by five minutes — presumably for security reasons — making the earliest possible time a bad actor could have captured the secrets 15:17 UTC, five minutes after we committed them.
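A scanner built on this feed can be surprisingly small. The sketch below makes one unauthenticated pass over GitHub's public events endpoint and shows where a secret-matching rule would plug in; the AWS access key pattern is one well-known rule, and real scanners use many more, plus authenticated, continuous polling:

```python
import json
import re
import urllib.request

# Classic AWS access key ID shape; real scanners match many more patterns.
AWS_KEY_RE = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")

def find_aws_keys(text):
    """Return anything in the text shaped like an AWS access key ID."""
    return AWS_KEY_RE.findall(text)

def poll_public_events():
    """One pass over GitHub's public events feed, printing commit URLs."""
    req = urllib.request.Request(
        "https://api.github.com/events",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        events = json.load(resp)
    for event in events:
        if event.get("type") == "PushEvent":
            for commit in event["payload"].get("commits", []):
                # A real scanner would fetch each commit's patch here
                # and run find_aws_keys() over its contents.
                print(commit.get("url"))
```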
Fraudsters are scanning public code repositories for your secrets. They are counting on you, your team, your suppliers, or your contractors to mess up, and they will take full advantage when you do. As with all security controls, a defence-in-depth approach is always best, so consider implementing the following:
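As one example of a cheap client-side control, a pre-commit hook can refuse any commit whose staged changes contain something shaped like an AWS key. This is an illustrative sketch, not a replacement for dedicated tooling such as git-secrets or shhgit; save it as `.git/hooks/pre-commit`, mark it executable, and have the hook call `main()`:

```python
import re
import subprocess
import sys

# One well-known pattern; dedicated tools cover far more secret types.
AWS_KEY_RE = re.compile(r"AKIA[0-9A-Z]{16}")

def added_lines(diff_text):
    """Lines introduced by a staged diff (ignoring the '+++' file header)."""
    return [
        line[1:]
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def main():
    """Exit non-zero if the staged diff adds an AWS-style key."""
    diff = subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True
    ).stdout
    hits = [l for l in added_lines(diff) if AWS_KEY_RE.search(l)]
    if hits:
        print("Refusing to commit: possible AWS access key in staged changes.")
        sys.exit(1)
```

Scanning only the added lines keeps the hook fast and avoids flagging secrets that are merely being removed.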
We help secure forward-thinking development, operations, and security teams by finding secrets across their code before it leads to a security breach.
Of course you do! Share how you are using shhgit for a chance to receive some awesome swag.