
How To Build Better Software With the Scientific Method

Nov 12, 2020 8:44:11 PM By Sylvia Fronczak

The Way We Think

When Parzych attended Neil deGrasse Tyson’s masterclass, she learned that “the most important moments are not decided by what you know, but how you think.” She wants to talk about changing the way you think in order to build better software.

Problems are all around us. Whether we build software from scratch or buy it, we’re solving problems. 

So how do you approach a problem you haven't seen before? Do you just throw things at it and see what happens? Or do you think about it more?

When considering the difference between the growth mindset vs. the fixed mindset, Parzych shares that, for growth, failure is an opportunity. Conversely, with a fixed mindset, failure is seen as the limit of innate abilities.

Parzych shares how her son wants to become better at something and can become frustrated. He doesn’t want to practice. He just wants to get better. But the practice is part of the growth mindset that helps us improve.

The Scientific Method

Now let’s look at the scientific method. With this method, we want to isolate a process and form a hypothesis. Then, we’ll create an experiment and see if we can produce repeatable results. 

If we can’t repeat the results, the hypothesis doesn’t pass the experiment. And finally, we want to share the results with others.

We want to do this a bit more rigorously when it comes to software.

The Hypothesis

Let's start with the hypothesis. The hypothesis is more than a question, though questions can lead to hypotheses. A hypothesis is a prediction. 

For example, let’s look at pagination. You can predict that pagination will decrease page load time by 5%. That’s the hypothesis. 

Not all hypotheses are good hypotheses. They need to answer a single question. If you have multiple questions, you will need multiple hypotheses. 
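One way to keep a hypothesis honest is to write it down as a testable prediction: one metric, one direction, one threshold. A minimal sketch of the pagination prediction above (the class and field names here are illustrative, not from the talk):

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """A single, falsifiable prediction about one metric."""
    metric: str              # what we measure
    predicted_change: float  # e.g. -0.05 for "decrease by 5%"

    def is_supported(self, baseline: float, observed: float) -> bool:
        # The hypothesis holds if the observed relative change
        # meets or beats the predicted decrease.
        change = (observed - baseline) / baseline
        return change <= self.predicted_change

# "Pagination will decrease page load time by 5%."
h = Hypothesis(metric="page_load_ms", predicted_change=-0.05)
print(h.is_supported(baseline=2000, observed=1800))  # 10% faster -> True
print(h.is_supported(baseline=2000, observed=1960))  # only 2% faster -> False
```

If you find yourself wanting two metrics in one hypothesis, that’s the signal to split it into two.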

The Experiment

Once we have a hypothesis, we need to run an experiment. An experiment provides learning opportunities. 

It’s not about what you know. It’s about what you can learn.

What are some ways we experiment when developing software?

  1. Test in production—deploy code and see what the results are.
  2. Game days—break things on purpose with chaos engineering or manual intervention.
  3. A/B testing—test different scenarios against your customers.
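A common way to implement the A/B testing mentioned above is deterministic bucketing: hash the user together with the experiment name so each user always sees the same variant, and assignments stay independent across experiments. A minimal sketch (the function name is illustrative):

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically bucket a user into a variant.

    Hashing user_id together with the experiment name keeps
    assignments stable per experiment but uncorrelated between
    different experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always lands in the same bucket for a given experiment.
assert assign_variant("user-42", "pagination") == assign_variant("user-42", "pagination")
```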

These experiments are great, but you first need to set yourself up for success. Because DevOps is about tools, right? No, just kidding. It’s not just the tools.

Experiments involve failure, and through failure we learn from our mistakes. When following the scientific method, we either prove or disprove the hypothesis; either way, we learned something. So experiments aren’t failures in and of themselves.

What else do we need for our culture of experimentation?

Blameless Culture

If you’re embracing failure and experimenting, you cannot blame others. A blameless culture doesn’t attribute incidents to people. It attributes incidents to system failures.

Psychological Safety

When we feel safe, we feel able to challenge the status quo. It means being comfortable sharing our ideas and our concerns. And we don’t feel like we’ll be blamed when things break—because things are going to break.

Without psychological safety, you won’t have a blameless environment, and you won’t have a culture of experimentation.

Cognitive Bias

Cognitive biases arise from the shortcuts our brains take to cope with problems like

  1. Too much information.
  2. Not enough meaning.
  3. Not enough time.
  4. Not enough memory.

When we’re building experiments and using the scientific method, we want to make sure we’re not gaming the system to force the “right” answer. We want to avoid the trap of falling into these cognitive biases.

What biases impact software delivery? Again, we have four common biases:

  1. Anchoring bias—here we anchor ourselves to the first piece of information we receive, and everything else is judged relative to that anchor.
  2. Framing effect—our biases are based on how something is framed, be it in a positive or negative light. If something is framed in a positive way, it will sway people toward it.
  3. Bias blind spot—we say that we’re building AI without bias. But we do have bias, and it permeates the work we do. And we can’t pretend it doesn’t exist.
  4. IKEA effect—this is akin to “not invented here” syndrome. If you build something by hand, you feel that it’s worth more than if you bought it...or assembled it from IKEA.

So consider how these biases affect your experiment. 

Define Success

Next, define what success looks like for the experiment. There should be only a single definition of a successful experiment: one metric of success. What you don’t want is to end up supporting two versions of the same feature, so be clear up front about how you’ll evaluate success.

Also, identify and avoid vanity metrics. These metrics can make you feel good, but they can be gamed; you can influence whether the metric goes up or down.

Some common examples of gameable metrics include blog views, Twitter followers, lines of code, and more. These don’t define success.

For example, McDonald’s advertises billions of burgers sold. But that’s an all-time total, and a cumulative total can only go up. What’s the delta?
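The distinction here is cumulative totals versus deltas: an all-time total rises no matter what, so it tells you nothing about whether things are actually improving. A quick illustration with made-up weekly numbers:

```python
weekly_sales = [120, 110, 90, 85]  # made-up weekly figures, trending down

# The cumulative total always climbs, even as performance declines.
cumulative = [sum(weekly_sales[:i + 1]) for i in range(len(weekly_sales))]
print(cumulative)  # [120, 230, 320, 405] -- always increasing

# The week-over-week delta reveals the real trend.
deltas = [b - a for a, b in zip(weekly_sales, weekly_sales[1:])]
print(deltas)      # [-10, -20, -5] -- every week is worse than the last
```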

So What Makes a Good Metric?

A metric should be customer focused. Think of things from your customer’s perspective. 

It also needs to be concrete and specific. And it must be tied to a business outcome. If a metric doesn’t meet these criteria, it’s not a good metric.

Parzych next shared an experiment that LaunchDarkly ran. For some customers, it was taking too long to load their dashboard. And the hypothesis was that pagination would improve page load time. 

Next, they ran some tests to determine whether pagination would help those customers, and the results looked good.

But they also wanted to confirm that adding pagination wouldn’t alter the page load time for everyone and that there weren’t unintended consequences for those other customers.

Fortunately, the experiment panned out. Users with a high number of flags saw a reduction in page load times, and other users saw no increase in theirs. After the experiment, the feature was rolled out to all customers.
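An experiment like this is typically run behind a feature flag: serve the paginated dashboard only to users in the test cohort, measure, then roll out to everyone. A minimal sketch with a stubbed flag client (the flag key, class, and function names are illustrative, not LaunchDarkly’s actual API):

```python
class FlagClient:
    """Stub flag client; a real one would evaluate targeting rules
    served by a feature-flag service."""
    def __init__(self, enabled_users):
        self.enabled_users = set(enabled_users)

    def is_enabled(self, flag_key: str, user_id: str) -> bool:
        return user_id in self.enabled_users

def load_dashboard(user_id: str, flags: FlagClient, items: list, page_size=25):
    # Paginated path for users in the experiment; original path for everyone else.
    if flags.is_enabled("dashboard-pagination", user_id):
        return items[:page_size]  # first page only
    return items                  # full, unpaginated list

flags = FlagClient(enabled_users={"user-1"})
data = list(range(100))
assert len(load_dashboard("user-1", flags, data)) == 25   # in the experiment
assert len(load_dashboard("user-2", flags, data)) == 100  # unchanged experience
```

Rolling out afterward is then just a flag change: enable the flag for all users, with no redeploy needed.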


Parzych leaves us with four takeaways from this talk.

  1. Create a culture of learning and experimentation.
  2. Recognize your biases.
  3. Don’t be swayed by vanity metrics.
  4. Use feature flags to run experiments.

This session was summarized by Sylvia Fronczak. Sylvia is a software developer who has worked in various industries with various software methodologies. She’s currently focused on design practices that the whole team can own, understand, and evolve over time.