Different Frameworks for Prioritizing Your A/B Testing

Posted by Manuel da Costa / category: Conversion Optimization

If you’re like most conversion rate optimization teams, you have a long list of variables you would like to test as part of your CRO strategy. Some you know are worth your time, others may have some effect (but you aren’t sure), and then there’s a list of changes that you don’t know how to evaluate.

You can’t test everything at once because that would skew your results. Even testing two variables at the same time would muddy your data. You could slice your traffic into groups so you can run multiple experiments at once, but traffic is limited. Too much segmenting and your tests won’t have enough traffic of their own to gain valuable data.

Of course, you also have a boss (or bosses) breathing down your neck for results. They might not be entirely sold on the idea of conversion optimization, so you need to show them something to prove your team is worth the investment. That means you need results quickly.

So with limited time and resources, you need to prioritize.

Download our free guide to find out which framework is right for your conversion optimization team.

Conversion Optimization Frameworks

A framework is a system of prioritizing your ideas. It’s a way to take a qualitative, abstract concept and give it a quantitative rating so you can compare ideas against one another. It’s not a perfect system.

There are no guarantees that the changes you make will lead to massive improvements, but if you apply the frameworks objectively, there’s a good chance they can put your ideas in a reasonable order.

There are two common frameworks that conversion rate optimization teams use and two more that are worth mentioning. You may have learned (or already use) other frameworks, and that’s fine. The trick is to find the one that best suits your team and your goals. If you have a framework that has been working for your website or app, don’t change it.

But if you need a framework, I recommend one of these four.


Arguably, PIE is the most common conversion optimization framework. It scores each page on three criteria.

  • Potential – How much improvement can be made? Which pages have the most room for improvement? It’s smart to target your worst performers first. Take into account data from your analytics, improvement opportunities from your heuristic analysis, and customer data.
  • Importance – How important is the page or element to your goals? If making an improvement doesn’t suit your goals, it won’t mean much (even if it’s an easy/quick change).
  • Ease – How complicated will the test be to implement? Factor in the technical requirements to get the test moving and how long it will need to run. Also consider internal barriers (that’s a nice way of saying “other people in the company who get in your way”) like a marketing director who needs a certain home page layout.

Score each page from one to ten on each of the three criteria. Then average the three scores to get a single score per page. Sort the pages by that score (highest to lowest) and you have your priority list.
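To make the arithmetic concrete, here is a minimal sketch of PIE scoring in Python. The page names and scores are made up for illustration, and `pie_score` is a hypothetical helper, not part of any tool.

```python
from statistics import mean

# Hypothetical pages with 1-10 scores, for illustration only.
pages = {
    "Home page": {"potential": 6, "importance": 9, "ease": 4},
    "Pricing page": {"potential": 8, "importance": 8, "ease": 7},
    "Checkout": {"potential": 9, "importance": 10, "ease": 3},
}


def pie_score(scores):
    """Average the three PIE criteria into a single score."""
    return mean([scores["potential"], scores["importance"], scores["ease"]])


# Highest score first: this is the priority list.
priority = sorted(pages, key=lambda name: pie_score(pages[name]), reverse=True)

for name in priority:
    print(f"{name}: {pie_score(pages[name]):.2f}")
```

With these made-up numbers, the checkout has the most potential and importance, but its low ease score pulls it below the pricing page.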


Source: WiderFunnel


ICE is used as the prioritization framework over at GrowthHackers. In fact, the framework was created by their founder, Sean Ellis. It uses three criteria.

  • Impact – What will happen if this test works? What are the benefits (in terms of your goals or learned intelligence)? What will be the impact on future testing (e.g., “we’ll learn which colors people like, which will simplify later experiments”)?
  • Confidence – How confident am I that this will work?
  • Ease – How easy will this test be to implement? Does it require outside resources or special privileges? Can it be done quickly, or will it need to run to collect months of data? Are there any barriers in the way?

Score your pages from one to ten on each criterion, just like the PIE framework. Average the scores for each page. The pages with the highest scores should be optimized first.

A challenge of using this framework is the “impact” category. If you could accurately guess the impact of a change, why would you test it? If, for example, you knew that a full width video would make people click your call to action more, wouldn’t you just implement the video?

The big problem with the ICE and PIE frameworks is the inherent subjectivity of the scoring. If five people score a page, will they all give the same scores? Likely not. Even a one-point difference in any category could drastically rearrange the final prioritization list.

Furthermore, the same person could score a page differently at different times. You might rate a particular change lower in the “ease” category because your boss is having a bad day and unlikely to approve the test.
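Here is a small sketch of how little it takes to reorder the list. The ideas, the scores, and the two scorers are hypothetical. The only difference between the two scorers is one point on “ease” for two of the ideas.

```python
def ice_score(impact, confidence, ease):
    """Average the three ICE criteria."""
    return (impact + confidence + ease) / 3


def rank(ideas):
    """Return idea names sorted by ICE score, highest first."""
    return sorted(ideas, key=lambda name: ice_score(*ideas[name]), reverse=True)


# Hypothetical ideas scored as (impact, confidence, ease).
scorer_a = {
    "Full-width video": (7, 6, 6),
    "Shorter signup form": (6, 6, 6),
    "New headline": (5, 5, 6),
}

# A second scorer agrees on everything except ease, by one point each way.
scorer_b = dict(scorer_a)
scorer_b["Full-width video"] = (7, 6, 5)
scorer_b["Shorter signup form"] = (6, 6, 7)

print(rank(scorer_a))
print(rank(scorer_b))
```

The top two ideas swap places between the two scorers, even though neither of them would call their disagreement significant.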

There’s a second version of the ICE framework that tries to add more objectivity to the exercise by being more specific with the criteria and reducing the scale of the data.

Each of the three categories can only be scored 1 or 2: 1 means the opportunity is low, and 2 means the opportunity is high. This binary scale narrows the range of possible scores, so there is less room for variation between scorers.

This framework uses three criteria:

  • Impact – Same as the first version of ICE.
  • Cost – How much does it cost to implement this test?
  • Effort – How much time is required to test this idea?



The challenge with this framework, however, is the limited range of aggregate scores. What happens when you have a handful of potential tests that all received the same ICE score? You would have to pick one at random or use a different optimization framework.
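A quick way to see why ties are so common is to enumerate every possible scoring. This sketch assumes the three 1-or-2 scores are simply added together. Averaging them instead would produce the same groupings.

```python
from collections import Counter
from itertools import product

# Every way to score the three criteria (impact, cost, effort) with a 1 or a 2.
combos = list(product((1, 2), repeat=3))

# Count how many of those combinations land on each total (assumes a simple sum).
totals = Counter(sum(combo) for combo in combos)

print(len(combos))             # 8 possible scorings
print(sorted(totals.items()))  # [(3, 1), (4, 3), (5, 3), (6, 1)]
```

Eight possible scorings collapse into only four distinct totals, so any backlog with more than four ideas is guaranteed to contain ties.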


The folks at ConversionXL put together their own framework, PXL, to compensate for the challenges of the popular methods. It uses a binary scoring system and a variety of categories. It also assumes several things about CRO common knowledge (like above-the-fold changes being more impactful). Finally, it adds clarity to the “ease” column so there’s less room for variation among people’s responses.


PXL removes a lot of the guesswork associated with the PIE and ICE scores. It’s stronger than the subjective frameworks because it uses hard facts and scoring attributes. This means that multiple people can evaluate the same website or page and come up with reasonably similar results.

Furthermore, you can customize this framework for your business needs. All businesses are different. They have their own goals. They might be working to achieve multiple goals, like improving conversions while simultaneously adhering to brand guidelines. You can customize the PXL template with your own variables.

Interestingly, the PXL framework mandates certain responses based on how those categories are weighted across the CRO industry. Notice under “Adding or removing an element” you can only put a 0 or 2.

Scores for each category are totaled across each row. The ideas with the highest totals are the tests you should run first.
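The row totals, along with the restricted answers, can be sketched as follows. The column names below are hypothetical stand-ins and not the actual PXL template. The only detail taken from the framework as described here is that the “adding or removing an element” column accepts only 0 or 2.

```python
# Hypothetical columns and their allowed values (not the real PXL template).
ALLOWED = {
    "above_the_fold": {0, 1},
    "adds_or_removes_element": {0, 2},  # only 0 or 2 is permitted here
    "backed_by_data": {0, 1},
}


def pxl_total(row):
    """Validate each answer against its allowed values, then total the row."""
    for column, value in row.items():
        if value not in ALLOWED[column]:
            raise ValueError(
                f"{column} must be one of {sorted(ALLOWED[column])}, got {value}"
            )
    return sum(row.values())


# Hypothetical test ideas, for illustration only.
ideas = {
    "Remove hero carousel": {
        "above_the_fold": 1, "adds_or_removes_element": 2, "backed_by_data": 1,
    },
    "Reword footer link": {
        "above_the_fold": 0, "adds_or_removes_element": 0, "backed_by_data": 0,
    },
    "Add trust badges at checkout": {
        "above_the_fold": 0, "adds_or_removes_element": 2, "backed_by_data": 1,
    },
}

ranked = sorted(ideas, key=lambda name: pxl_total(ideas[name]), reverse=True)
print(ranked)
```

Because each column only accepts fixed values, two people filling in the same row have far fewer ways to disagree than they would on a one-to-ten scale.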


Vacation referral service Hotwire released their prioritization framework back in 2015. It uses a binary scoring system that includes several variables related to conversion optimization, such as changes that affect a page above the fold, changes that affect the mobile experience, and changes that affect 100% of customers.

This framework is useful because it removes subjectivity. Each category receives a point if the rule applies and a 0 if it doesn’t. The downside is that it operates on one company’s opinion of conversion optimization. For instance, if you don’t feel that evaluating elements for their position above or below the fold has merit, you would want to remove that rule from this framework or use a different one.
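A rule-based score like this is easy to express in code, and so is dropping a rule you disagree with. The sketch below uses only the three example rules mentioned above. The rule names, the `hotwire_score` helper, and the sample test are assumptions for illustration, not Hotwire's published rule set.

```python
# Each rule is a yes/no check on a proposed test (hypothetical field names).
RULES = {
    "above_the_fold": lambda test: test["above_the_fold"],
    "affects_mobile": lambda test: test["affects_mobile"],
    "reaches_all_customers": lambda test: test["customer_share"] == 1.0,
}


def hotwire_score(test, rules=RULES):
    """Award one point for every rule that applies to the test."""
    return sum(1 for applies in rules.values() if applies(test))


# A hypothetical test: above the fold, desktop only, shown to every customer.
banner_test = {"above_the_fold": True, "affects_mobile": False, "customer_share": 1.0}

print(hotwire_score(banner_test))  # 2

# If you don't believe the fold matters, remove that rule and rescore.
rules_without_fold = {
    name: rule for name, rule in RULES.items() if name != "above_the_fold"
}
print(hotwire_score(banner_test, rules_without_fold))  # 1
```

Keeping the rules in one place means that changing your mind about a rule rescores every idea in the backlog consistently.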

Read about the Hotwire framework and its rules here.

Which framework is right for your CRO team? Download our free guide to find out.

Going Forward

Your website is a unique property, unlike any other website. Don’t fall into the trap of comparing your site to another. Just because your competitor used a long home page doesn’t mean it will work for you.

The ICE and PIE frameworks are simple and easy, but subjective and data-poor. Serious CRO teams should stick to the Hotwire or PXL frameworks. Feel free to customize them if you need to. For instance, if a particular element is important to your brand (say, an animated icon), you might add a category called “Doesn’t interfere with icons.”

As you evaluate underperforming pages for your framework, don’t forget to take into account dynamic factors like the season, your competition, and your brand voice. By using one of the above frameworks, you’ll be able to evaluate them as objectively as possible.

At the end of the day, you have to choose the right prioritization framework for your business. The decision is ultimately up to you. You might prefer the simplicity of PIE or the data-thoroughness of PXL. Regardless of which you choose, our application, Effective Experiments, can be customized for any framework in your workflow.