May 17, 2017
When Eric Ries published The Lean Startup in 2011, he outlined foundational principles for rapidly testing business hypotheses through product experiments. Startups like Contactually know they should split-test, be data-driven, and "move fast and break things," but actually applying these principles is another story.
Rapidly validating and iterating on product features requires your team to establish tooling and methodology for doing so, which requires explicit effort and commitment.
Contactually's product development process has evolved into a simple but fairly effective pattern for running experiments that involves 5 different areas of methodology.
These 5 aspects of experiment execution have allowed us to drive powerful business outcomes, like improving our onboarding's user activation rate by 20%.
Additionally, these 5 areas serve to reasonably balance several types of assessments:
Key to any controlled experiment is tooling that enables you to apply the new functionality to an experimental group of users, while maintaining a control group to compare against.
While many A/B testing tools exist for swapping out HTML or front-end components, Contactually wanted the ability to control deeper backend behavior (Ruby on Rails) based on experiment groupings: changing scoring algorithms, triggering alternative background jobs, or exposing or not exposing certain ActiveRecord relationships.
This led us to write our own Rails service to control A/B experiments.
Our custom-rolled Feature Flipper service is backed by Redis, and lets us create and manipulate experiment "Features" via a Rails console, or via an internal admin panel for our non-technical employees.
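A minimal sketch of what such a service might look like, with deterministic per-user bucketing so the same user always lands in the same experiment group. A plain Hash stands in for the Redis store here, and all names are illustrative rather than Contactually's actual API:

```ruby
require "digest"

# Hypothetical sketch of a Redis-backed feature flipper.
# In production, @store would be a Redis client; a Hash stands in here.
class FeatureFlipper
  def initialize(store = {})
    @store = store
  end

  # Enable a feature for a percentage of users (0..100).
  def enable(feature, percentage: 100)
    @store["feature:#{feature}"] = percentage
  end

  def disable(feature)
    @store.delete("feature:#{feature}")
  end

  # Deterministic bucketing: hashing the feature name and user id
  # means a given user stays in the same group across requests,
  # keeping the control and experiment groups stable.
  def enabled?(feature, user_id)
    percentage = @store["feature:#{feature}"]
    return false unless percentage

    bucket = Digest::MD5.hexdigest("#{feature}:#{user_id}").to_i(16) % 100
    bucket < percentage
  end
end
```

Hashing on `feature` as well as `user_id` keeps bucketing independent across experiments, so being in the control group of one test does not correlate with being in the control group of another.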
Our FeatureFlipper service gives us the following options:
While the standard means to evaluate an A/B test is by comparing performance along some metric, this takes time, and we often want a shorter feedback loop to identify bugs, shortcomings, or use cases we overlooked.
While some of this can be achieved through a dialogue with the experimental users, there is a faster, more hands-off way to gain this insight.
With FullStory, we can launch a new feature, create a search segment for user sessions interacting with the new feature, and watch how users use it.
Things of particular value that we can quickly learn include:
The flip-side of A/B tooling is the need for a way to compare the performance of the two experiment groups.
While some front-end A/B tools have built-in analytics, Contactually needed to track our Feature Flipper experiments in a separate metrics tool.
By including the list of experimental "Features" each user is part of in all of our user events, we are able to create these comparisons in our existing metrics tools, namely Mixpanel and Looker.
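A sketch of how experiment tags might ride along on every analytics event, assuming a lookup from user id to the user's active feature names. The class and method names here are hypothetical, not Contactually's actual tracking code:

```ruby
# Hypothetical event tracker that stamps each event with the user's
# active experiment "Features", so a downstream metrics tool can
# segment any metric by experiment group.
class EventTracker
  def initialize(feature_lookup, sink = [])
    @feature_lookup = feature_lookup # callable: user_id -> array of feature names
    @sink = sink                     # stands in for the real analytics client
  end

  def track(user_id, event_name, properties = {})
    event = properties.merge(
      event: event_name,
      user_id: user_id,
      experiments: @feature_lookup.call(user_id)
    )
    @sink << event
    event
  end
end
```

Because every event carries the `experiments` list, no special A/B reporting pipeline is needed: the existing dashboards can simply filter or group by that property.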
While observed user interactions help identify problems, and metrics compare performance, identifying every possible source of improvement is hard work.
Communicating to users about experimental features and soliciting their feedback can fill some of these gaps.
Self-reported feedback will never be flawless — it is subject to all kinds of personal biases — but when patterns emerge in feedback, it is telling.
Different means of communication and feedback solicitation:
Forms of feedback solicited:
Finally, as we gather feedback from various mechanisms, we typically track them in a simple Google Sheet.
We group related pieces of feedback for visibility into the most common problems or suggestions, and evaluate them on several criteria:
As feedback rolls in, we give particular attention to anything deemed high-priority or low-effort to resolve.
Finally, as we resolve some of these issues, we can attempt to keep affected users in-the-loop — notifying them of changes, and allowing them to continue to evaluate the new feature.
To see a real-life example of these aspects at work, let's take a look at how they were involved in Contactually's effort to rebuild our user onboarding flow in late 2016.
In short, the goal was to give new users a conceptual tour of what setting up their CRM would involve and how it would drive business results.
Before developers ever touched code, our design team created multiple rounds of prototypes and conducted interviews with old and new users to gauge their responses to the concepts presented.
This iteration let us double down on concepts that resonated with users and built understanding, and remove or rework areas that proved less impactful.
Interesting concepts that were not acted on were added to a backlog of possible changes that could be assessed and explored further later.
One key benefit here is that iteration happens quickly with cheap, low-fidelity mockups and prototypes rather than fully functional software, which is far more expensive to build.
After development of the first iteration of the feature, which consisted of a dozen “slides” that users progressed through, we deployed this via our Feature Flipper to 50% of users.
By watching initial user sessions in FullStory, we very quickly identified several major challenges that users were encountering.
We were then able to disable the experiment via our FeatureFlipper, fix the problems within hours, and redeploy a new iteration of the flow.
We repeated this cycle 1–2 more times for bugs and challenges of lesser severity; each round typically took hours, or 1–2 days at most.
As we made quick improvements to the tour based on observed use, we set up tracking to report on what percent of users dropped off at each slide of the tour. If any slide had a high number of users who chose to skip the remaining tour, we would attempt to reduce friction on, or add additional value to, that slide.
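The per-slide drop-off report described above can be sketched as a small helper. The input shape — the last slide each user reached before leaving the tour — is an assumption for illustration:

```ruby
# Given the last slide each user reached and the total slide count,
# return the fraction of users who abandoned the tour at each slide.
def drop_off_by_slide(last_slides_reached, total_slides)
  counts = Hash.new(0)
  last_slides_reached.each { |slide| counts[slide] += 1 }
  total = last_slides_reached.size.to_f

  (1..total_slides).map do |slide|
    [slide, (counts[slide] / total).round(2)]
  end.to_h
end
```

A slide with a disproportionately large share of drop-offs is the one worth reworking first.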
This allowed us to identify, over the course of 1–2 weeks, whether changes to specific parts of the tour were successful.
New ideas for changes that didn’t have immediate team buy-in were added to a list of potential changes that we could assess and discuss further.
Once we had a reasonably-optimized tour based on a week of improvements, we set up an A/B metrics comparison to see which onboarding (new or old) resulted in a higher user activation rate afterwards.
We define user activation as a user having:
Our new onboarding flow represented a 20% improvement in user activation compared to our old onboarding flow, and also increased trial-to-paid conversion for those users.
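As a quick sanity check on how a lift like this is computed: relative improvement is the difference between the two groups' activation rates divided by the control rate. The rates below are made up for illustration, not Contactually's actual numbers:

```ruby
# Relative lift (in percent) of the experiment group's activation
# rate over the control group's.
def relative_lift(control_rate, experiment_rate)
  ((experiment_rate - control_rate) / control_rate * 100).round(1)
end

# e.g. control activates 25% of users, experiment activates 30%:
relative_lift(0.25, 0.30) # => 20.0, i.e. a 20% relative improvement
```

Note that this is a relative lift: a 20% improvement over a 25% activation rate is 5 percentage points, not 20.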
Upon successful improvement of a key business metric, the Feature Flipper was used to deploy the new onboarding flow to 100% of new users moving forward.
Finally, as features and strategic approaches within the product have slowly evolved in the six months since this endeavor, our UX and product teams have kept a backlog of additional onboarding ideas, so that this entire process can be repeated in the near future to see if we can improve on our new flow yet again.
5 Aspects of Rapid Feature Validation & Iteration was originally published in Contactually Product and Engineering on Medium, where people are continuing the conversation by highlighting and responding to this story.