As the number of development teams grew and system complexity increased, we had to find ways to ship features to customers continuously while protecting them from outages of third-party APIs. To solve this problem, we adopted the "feature flags" technique and introduced implementation and lifecycle guidelines.
The new approach lets us test new features on a subset of loyal customers. No matter how much testing you do in lower environments, something unexpected usually comes up in production that you can’t prepare for, whether because of volume, edge cases, or environment differences, so feature flags helped us a lot.
Types of feature flags
We identified two types of feature flags:
Runtime feature flags: allow you to turn things on and off in a running application. These flags are the most useful, because you can enable features at will. For example, if you have worked six weeks on a feature and want to avoid a big-bang rollout and migration, or want to enable the feature for test users only, you would use a runtime feature flag.
Build-time feature flags: include features in, or exclude them from, an artifact during the build process. For example, if you have multiple payment providers and want to disable one and exclude it from the build, you would use a build-time feature flag.
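To make the difference concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the provider names other than PayPal, the `build_artifact` helper, the `RuntimeFlags` class and the `enable_new_checkout` flag are illustration only, not our actual implementation.

```python
from __future__ import annotations

# Hypothetical provider list, for illustration only.
ALL_PROVIDERS = ["paypal", "stripe", "klarna"]


def build_artifact(build_flags: dict[str, bool]) -> list[str]:
    """Build-time flags: decided once, during the build.

    Providers whose flag is False are left out of the artifact entirely.
    """
    return [p for p in ALL_PROVIDERS if build_flags.get(f"enable_{p}", True)]


class RuntimeFlags:
    """Runtime flags: looked up on every call, so they can change while the app runs."""

    def __init__(self) -> None:
        self._flags: dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags are treated as disabled.
        return self._flags.get(name, False)


def checkout_page(flags: RuntimeFlags) -> str:
    if flags.is_enabled("enable_new_checkout"):
        return "new checkout"
    return "old checkout"


# Build time: klarna is excluded and cannot come back without a rebuild.
artifact = build_artifact({"enable_klarna": False})

# Runtime: the same running process changes behaviour when the flag is flipped.
flags = RuntimeFlags()
before = checkout_page(flags)
flags.set("enable_new_checkout", True)
after = checkout_page(flags)
```

In this sketch the build-time decision is baked into `artifact`, while `checkout_page` re-reads its flag on every call, which is what makes runtime flags suitable for enabling features at will.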
Categories of feature flags by function
We also separate feature flags by the function they perform:
🏎 "Launch" feature flags
Temporary feature flags that are created before a feature is completed, so that the app can be deployed continuously and the feature can be enabled for test users. They are also used to roll a feature out gradually to customers based on their segment. They are usually deleted once the feature is fully completed and rolled out to customers.
Lifecycle of launch feature flags:
The first step when starting a new feature is to create a "launch" feature flag.
Then wrap the feature's entry point with the feature flag. From this point on, you can ship your changes to production with the feature flag disabled.
At the end of the project, when the feature is enabled for all customers, remove the feature flag.
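The lifecycle above can be sketched roughly as follows. The `LaunchFlag` class, its field names, the `checkout` entry point, and the user and segment identifiers are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LaunchFlag:
    """A temporary launch flag; all field names here are hypothetical."""

    name: str
    enabled_for_all: bool = False
    test_user_ids: set[str] = field(default_factory=set)
    enabled_segments: set[str] = field(default_factory=set)

    def is_enabled_for(self, user_id: str, segment: str) -> bool:
        if self.enabled_for_all:
            return True
        return user_id in self.test_user_ids or segment in self.enabled_segments


# Step 1: create the flag before writing the feature. It starts disabled.
enable_new_checkout = LaunchFlag(name="enable_new_checkout")


def checkout(user_id: str, segment: str) -> str:
    # Step 2: wrap the entry point, so unfinished code can ship to production.
    if enable_new_checkout.is_enabled_for(user_id, segment):
        return "new checkout"
    return "old checkout"


results = [checkout("qa-1", "regular")]            # shipped, but disabled
enable_new_checkout.test_user_ids.add("qa-1")      # enable for a test user
results.append(checkout("qa-1", "regular"))
results.append(checkout("u-7", "loyal"))           # other users are unaffected
enable_new_checkout.enabled_segments.add("loyal")  # gradual rollout by segment
results.append(checkout("u-7", "loyal"))
enable_new_checkout.enabled_for_all = True         # full launch
results.append(checkout("u-9", "regular"))

# Step 3 (not shown): delete the flag and the "old checkout" branch.
```

The final step is a code change rather than a configuration change: once the flag is on for everyone, both the flag and the old branch are removed.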
🚩 Risk-mitigation flags
These flags are long-lived: they are created and kept permanently. They are useful for turning certain features of the application on and off when necessary.
A few use cases:
Wrap third-party integrations so they can be switched off if the third party fails. For example:
- In an e-commerce platform that supports multiple payment systems, it makes sense to wrap each one in a separate feature flag. When a provider is unavailable or its performance is degraded, you can save yourself a few grey hairs by switching it off so that no payments are processed through it, then turn it back on when it is healthy.
- Another example is switching off a warehouse integration when the warehouse software is down.
Wrap internal features with feature flags so they can be disabled in case of performance issues, load events, etc.
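As an illustration of the payment use case, here is a minimal sketch of per-provider switches. The `KillSwitches` store, the `ProviderDisabled` error, the `card` provider and the `charge` function are hypothetical names, and a real system would keep the flags in shared storage rather than in memory.

```python
from __future__ import annotations


class ProviderDisabled(Exception):
    """Raised when a payment is attempted through a switched-off provider."""


class KillSwitches:
    """Long-lived flags that are enabled unless someone switches them off.

    Hypothetical in-memory store, for illustration only.
    """

    def __init__(self) -> None:
        self._disabled: set[str] = set()

    def disable(self, name: str) -> None:
        self._disabled.add(name)

    def enable(self, name: str) -> None:
        self._disabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name not in self._disabled


PROVIDERS = ["paypal", "card"]  # one flag per provider: enable_paypal, enable_card


def available_providers(switches: KillSwitches) -> list[str]:
    """Providers that can be offered to the customer right now."""
    return [p for p in PROVIDERS if switches.is_enabled(f"enable_{p}")]


def charge(switches: KillSwitches, provider: str, amount_cents: int) -> str:
    if not switches.is_enabled(f"enable_{provider}"):
        raise ProviderDisabled(provider)
    # The real call to the third-party API would go here.
    return f"charged {amount_cents} via {provider}"


switches = KillSwitches()
switches.disable("enable_paypal")        # PayPal is degraded: switch it off
offered_during_outage = available_providers(switches)
switches.enable("enable_paypal")         # healthy again: switch it back on
offered_after_recovery = available_providers(switches)
```

Because each provider has its own flag, switching one off leaves the others untouched.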
We defined and implemented a framework for working with feature flags with a few requirements in mind, covering naming conventions, the lifecycle of feature flags, and more:
Flags should be named after the features they "enable", not "disable":
enable_paypal? is a good name, because it is clear that if the feature flag is enabled, PayPal is enabled too 😄
disable_paypal? is a bad name, because developers take more time to process negative statements (we tested this!)
Feature flags must be available in all environments, starting from dev.
Feature flags are provisioned automatically. This prevents situations where a feature flag exists in one environment but not in another, which makes the behavior of the system unpredictable.
We defined and implemented "smart defaults" to make feature flags easy and robust to work with.
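One way these requirements could fit together is sketched below. This is not our framework: the environment names, the `enable_` prefix check, and the reading of "smart defaults" as "an unknown flag resolves to a safe default instead of raising" are all assumptions made for the example.

```python
from __future__ import annotations

ENVIRONMENTS = ("dev", "staging", "production")  # assumed environment names

# Single source of truth: flag name -> initial state (hypothetical flags).
FLAG_DEFINITIONS = {
    "enable_new_checkout": False,
    "enable_paypal": True,
}


def validate_name(name: str) -> None:
    """Naming convention: flags say what they enable, never what they disable."""
    if not name.startswith("enable_"):
        raise ValueError(f"flag {name!r} must be named enable_<feature>")


def provision(
    definitions: dict[str, bool],
    environments: tuple[str, ...] = ENVIRONMENTS,
) -> dict[str, dict[str, bool]]:
    """Create every flag in every environment, so none exists in only one."""
    for name in definitions:
        validate_name(name)
    return {env: dict(definitions) for env in environments}


def is_enabled(
    store: dict[str, dict[str, bool]],
    env: str,
    name: str,
    default: bool = False,
) -> bool:
    """Assumed "smart default": a missing flag or environment yields `default`."""
    return store.get(env, {}).get(name, default)


store = provision(FLAG_DEFINITIONS)
```

Provisioning from one definition file means a flag lookup behaves the same way in dev as in production, and the default keeps a typo in a flag name from crashing the caller.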
Alternative to feature flags
Before we introduced feature flags, the alternative was to time releases to feature launches. This turned out to be very stressful and fragile, because it didn’t allow us to test new features in production and verify their quality before the customer launch.