Top 8 Reasons I Hate AWS Tags
Here are my top 8 reasons for not liking Amazon tags:
#8: It’s a Copy
When you tag your Amazon infrastructure with metadata on environment, role, product or customer, you are implicitly taking information from one external system - e.g. CRM, configuration management, contract database - and moving it into another external system. The result is that you have a copy of the data. We are all familiar with the downsides of any type of copy: copies need to be synchronized, are prone to becoming out of date, and require automation / process to manage. The result is that Amazon tags become a rough approximation, but never an authoritative source.
#7: Not Available Everywhere
One of the great strengths of Amazon is their approach of using small, decentralized teams to drive feature development. One of the great weaknesses of this approach is that things don’t always work the way you expect them to. Tags are a good example of this. I would expect Amazon tagging to be a generic capability provided across all services. Instead, tags were originally introduced in EC2 and have been gradually rolling out across different services. Even now there are many places in Amazon I expect to be able to tag but cannot, e.g. security groups, load balancers, elastic IPs, reserved instances, IAM users / groups, and so on. The result is that the value Amazon wants us to derive from tags - e.g. better insight into costs, easier identification of provisioned infrastructure - is often diluted by the inability to tag pervasively.
#6: 10? Really?
I am not sure what the logic was behind limiting tags to 10 per resource. I have met many Amazon users who have been contorting themselves to work around this limit by concatenating multiple pieces of metadata into a single tag. The good news is I would expect this limit to change in the future. The bad news is that until then, we are being forced either to limit what metadata we bring into Amazon or to obfuscate the meaning of tags.
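To make the workaround concrete, here is a minimal sketch of the kind of packing people end up writing. The "k=v;k=v" convention, the separator and the function names are my own assumptions for illustration, not anything Amazon defines.

```python
# Hypothetical convention: pack several fields into one tag value as "k=v;k=v".
SEPARATOR = ";"


def pack_tag_value(fields: dict[str, str]) -> str:
    """Join several metadata fields into one tag value, sorted by key."""
    for key, value in fields.items():
        # Reject anything that would make the packed value ambiguous to parse.
        if SEPARATOR in key or "=" in key or SEPARATOR in value:
            raise ValueError(f"field {key!r} contains a reserved character")
    return SEPARATOR.join(f"{k}={v}" for k, v in sorted(fields.items()))


def unpack_tag_value(packed: str) -> dict[str, str]:
    """Recover the original fields from a packed tag value."""
    if not packed:
        return {}
    # Split on the first "=" only, so values may themselves contain "=".
    return dict(item.split("=", 1) for item in packed.split(SEPARATOR))


packed = pack_tag_value({"env": "prod", "role": "web", "customer": "acme"})
print(packed)
print(unpack_tag_value(packed))
```

It works, but notice what it costs: anything that reads the tag (including Amazon's own reports) now sees one opaque string instead of three meaningful values.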
#5: Different APIs
Amazon provides the DescribeTags API call in EC2 to enumerate all tags and CreateTags to assign a tag, and then spreads the retrieval of tag information for a given resource type across several other calls (e.g. DescribeInstances). I can work with this, right? Now let’s move to RDS. Unfortunately there is no equivalent to DescribeTags in RDS; to set a tag you use AddTagsToResource, and the implicit tag information in calls such as DescribeDBInstances is your only way to retrieve tag information. Welcome to the Amazon tag API, where no two teams decided to implement it the same way.
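Here is a rough sketch of the wrapper this pushes you to write. It assumes boto3-style client objects, where CreateTags and AddTagsToResource surface as `create_tags` and `add_tags_to_resource`; check the current SDK documentation before relying on the exact parameters. The helper names, the `RecordingClient` stand-in and the IDs are hypothetical, included only so the example runs without credentials.

```python
def to_tag_list(tags):
    """Convert {"env": "prod"} into the [{"Key": ..., "Value": ...}] list form."""
    return [{"Key": k, "Value": v} for k, v in sorted(tags.items())]


def tag_ec2(ec2_client, resource_ids, tags):
    # EC2: one call can tag many resources, addressed by resource ID.
    ec2_client.create_tags(Resources=list(resource_ids), Tags=to_tag_list(tags))


def tag_rds(rds_client, resource_arn, tags):
    # RDS: a different call name, one resource per call, addressed by ARN.
    rds_client.add_tags_to_resource(ResourceName=resource_arn, Tags=to_tag_list(tags))


class RecordingClient:
    """Hypothetical stand-in for a real client: records calls, contacts nothing."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


ec2 = RecordingClient()
rds = RecordingClient()
tag_ec2(ec2, ["i-0123456789abcdef0"], {"env": "prod"})
tag_rds(rds, "arn:aws:rds:us-east-1:123456789012:db:example", {"env": "prod"})
print(ec2.calls)
print(rds.calls)
```

Same intent, same tag, and already two call names and two ways of addressing the resource. Every additional service adds another branch.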
#4: Custom Development Required
A customer with even a moderate-scale infrastructure may have a few thousand assets they want to tag (e.g. instances, volumes, snapshots, RDS instances, S3 buckets). This infrastructure is likely changing all the time: new infrastructure is starting, and existing infrastructure is shutting down or changing purpose. Managing this with even moderate quality requires each Amazon customer to develop custom code that takes the metadata they want in the cloud from their internal systems (often multiple internal systems) and pushes it into Amazon using the various APIs for each service. This code needs to be developed, tested, and maintained individually by each and every Amazon customer.
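The core of that custom code usually looks something like the sketch below: compare the tags your internal systems say a resource should have with the tags it actually has, and work out what to push. This is only an illustration under my own assumptions; the function name, the data shapes and the resource IDs are hypothetical, and the calls that would fetch and apply the tags are deliberately left out.

```python
def plan_tag_changes(desired, actual):
    """Return (to_set, to_delete), keyed by resource ID, so actual matches desired.

    Both arguments map resource ID -> {tag key: tag value}. Resources that
    appear only in `actual` are ignored: this sketch manages just the
    resources the internal systems know about.
    """
    to_set, to_delete = {}, {}
    for resource_id, want in desired.items():
        have = actual.get(resource_id, {})
        # Tags that are missing in the cloud or carry an outdated value.
        changed = {k: v for k, v in want.items() if have.get(k) != v}
        # Tags present in the cloud that the internal systems no longer list.
        stale = sorted(k for k in have if k not in want)
        if changed:
            to_set[resource_id] = changed
        if stale:
            to_delete[resource_id] = stale
    return to_set, to_delete


desired = {
    "i-aaa": {"env": "prod", "role": "web"},
    "vol-bbb": {"env": "dev"},
}
actual = {
    "i-aaa": {"env": "dev", "role": "web", "owner": "bob"},
}
to_set, to_delete = plan_tag_changes(desired, actual)
print(to_set)
print(to_delete)
```

The diff is the easy part. The real work is gathering `desired` from several internal systems, gathering `actual` through each service's own API, and running the whole thing often enough to keep up with change.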
#3: Quality
A comprehensive tag strategy requires that you integrate into every workflow that could affect the tags - e.g. launching new instances, detaching / reattaching volumes, changing the functional role of infrastructure, etc. Some of these workflows will be fully automated, others partially automated, and yet others entirely manual. The result is that while tags are more right than wrong, they need to be treated for what they are: a non-authoritative source that is useful for approximations.
#2: Fixed Historical Record
Amazon delivers a great feature called its cost allocation report. If you take the time to develop a tag solution and set up this report, Amazon will report most (unfortunately not all) hourly costs in a CSV that it puts into a bucket of your choice, so you can associate costs with tags. Unfortunately only some rows will have tags, for a number of reasons (e.g. lack of support for tagging that resource, lack of support for reporting tags on a resource type, or lack of association of cost to a resource). For small customers, this may produce a CSV of a few thousand lines; for large customers, several million lines.
I really like and appreciate this feature, but have one major issue: it makes tags a fixed historical record. For example, if I accidentally tagged some instances as production when they are development, the resource CSV will report these costs as production. While I can change it going forward, I cannot go back and modify the association of tags to billable items. To make matters worse, if in the future I want to make a fundamental change to how I look at the cost of my infrastructure, I am limited by the historical tags I have used over time.
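A small sketch of what this means in practice. The CSV layout below is made up for illustration (the column names `resource_id`, `cost` and `env_tag` are my assumptions, not the real report format), and the `corrections` mapping is a hypothetical record I would have to keep outside Amazon, since the report itself cannot be changed after the fact.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_env(report_csv, corrections=None):
    """Sum cost per environment, optionally overriding the recorded tag.

    `corrections` maps resource ID -> the environment it should have been
    tagged with; it lives outside the report, which stays as delivered.
    """
    corrections = corrections or {}
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(report_csv)):
        recorded = row["env_tag"] or "untagged"
        env = corrections.get(row["resource_id"], recorded)
        totals[env] += Decimal(row["cost"])
    return dict(totals)


# Hypothetical report rows: i-bbb was tagged production by mistake.
REPORT = (
    "resource_id,cost,env_tag\n"
    "i-aaa,1.50,production\n"
    "i-bbb,0.75,production\n"
    "vol-ccc,0.25,\n"
)

as_recorded = cost_by_env(REPORT)
as_corrected = cost_by_env(REPORT, corrections={"i-bbb": "development"})
print(as_recorded)
print(as_corrected)
```

The only way to get the corrected view is to re-process the raw rows myself against a mapping I maintain, which is one more copy of the data to look after.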
#1: Not Cloud Portable
I know this one is not Amazon’s issue. Their tagging feature is actually mostly unique across cloud providers. But since cloud portability is consistently a top theme among cloud customers, we as consumers need to understand how we plan to manage this complexity across more than one vendor.
Parting Thoughts
In conclusion, I hate Amazon tags. I also use them, and recommend you do too. ;)