After attending an “Un-conference” in San Francisco the other week, I returned to work to find that my co-workers had run into a highly unusual situation that none of us could explain.
Background
One of the leads from my former team approached me and told me I’d missed out on all the fun while I was gone. He explained that Product X’s QA or Perf environments would get messed up any time Product Y did a deployment on those environments. On my former team, we use Terraform modules and Terragrunt to manage our Infrastructure as Code (IaC). After troubleshooting the recent changes for Product Y, the team determined that a junior engineer had made a bad copy/paste of some Terragrunt code while creating the IaC for Product Y. They took out the incorrect code, but I wanted to understand what had allowed this to happen in the first place.
In the Terragrunt code for Product X, we create two S3 Buckets for static web content. In all the other products (like Product Y) we only create one bucket. The copied code happened to be the piece that creates both buckets. After doing some troubleshooting of my own, I confirmed that the Terraform/Terragrunt code was “re-creating” the bucket in this block of code. But why?
Diving Deeper
I had never seen anything like this with an S3 Bucket before. Normally, when you try to create a bucket that already exists, you receive the BucketAlreadyExists error, but that wasn’t happening here. I had to know why. I contacted one of our Technical Account Managers (TAM) and explained the situation to him. He hadn’t experienced this before either, so we both dug in to try to get to the bottom of it.
Finally, the TAM found the list of error codes in the Amazon S3 documentation. Because the Bucket being “created” already existed in the same account, the error code that applied was BucketAlreadyOwnedByYou, not BucketAlreadyExists. Now, here’s where you need to read the fine print.
The bucket that you tried to create already exists, and you own it. Amazon S3 returns this error in all AWS Regions except in the US East (N. Virginia) Region (us-east-1). For legacy compatibility, if you re-create an existing bucket that you already own in us-east-1, Amazon S3 returns 200 OK and resets the bucket access control lists (ACLs).
Bolded for my emphasis
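To make that fine print concrete, here is a minimal Python sketch of one way to guard against it: check whether the bucket is already there before calling create. This is not what our Terraform/Terragrunt code does; it is only an illustration. The function name create_bucket_if_absent is hypothetical, and s3_client is assumed to be a boto3 S3 client created for us-east-1 (for example via boto3.client("s3", region_name="us-east-1")). The sketch relies on head_bucket raising an error whose code is "404" when the bucket does not exist.

```python
def create_bucket_if_absent(s3_client, bucket_name):
    """Create bucket_name only if HeadBucket reports it missing.

    Returns True if a bucket was created, False if it already existed.
    Hypothetical helper; s3_client is assumed to be a boto3 S3 client.
    """
    try:
        s3_client.head_bucket(Bucket=bucket_name)
    except Exception as err:
        # boto3 raises botocore.exceptions.ClientError here, which carries
        # the parsed error in `.response`. Real code should catch that
        # class directly; this sketch inspects the attribute so it does
        # not need botocore imported.
        error = getattr(err, "response", None) or {}
        code = str(error.get("Error", {}).get("Code", ""))
        if code not in ("404", "NoSuchBucket", "NotFound"):
            # Anything else (e.g. 403) means we cannot tell that the
            # bucket is missing, so do not attempt to create it.
            raise
        # Assumption: the client targets us-east-1, where no
        # CreateBucketConfiguration is required.
        s3_client.create_bucket(Bucket=bucket_name)
        return True
    # The bucket already exists and is reachable: skip CreateBucket, so
    # the silent 200 OK and ACL reset never get a chance to happen.
    return False
```

The check and the create are two separate calls, so this is not atomic; it reduces the chance of an accidental “re-create” rather than eliminating it.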
Moral of the Story
AWS should really look into deprecating this behavior so that all regions work the same way. I know that us-east-1 is special and will probably always keep these quirky details. The only ways to find this out were to read the error codes documentation or to already know about the BucketAlreadyOwnedByYou error.