DoE in 8 SIMPLE Steps
DoE is a systematic approach to planning, conducting, and analyzing experiments. It provides a structured method for obtaining reliable results and accumulating high-quality data.
In this post, we’ll walk through an example of how to conduct a Design of Experiments (DoE). While we won’t dive into the details just yet, this overview will give you a clear picture of the DoE process. Our example involves optimizing the yield of a chemical reaction. Let’s get started!
DoE as a process
DoE is like a recipe that you can follow to ensure a systematic approach to planning, executing, and analyzing your experiments. Think of it as a cooking guide that helps you collect high-quality data. The eight steps are:
- Define the problem or goal: Clearly state the problem you want to solve or the question you want to answer.
- Identify the Factors: Determine the variables that you will manipulate and their levels.
- Choose the Response: Decide what you will measure and how.
- Identify disturbance variables: Are there any variables that might influence your experiment? Can you control them? Do you need to apply blocking?
- Design the Experiment: Select an appropriate experimental design. There are many to choose from. The most important ones in my eyes are: full factorial design, fractional factorial design, and response surface design.
- Conduct the Experiment: Perform the experiment according to your design, randomizing the run order and replicating runs as necessary.
- Analyze the Data: Use visualization techniques and statistical methods to analyze the data and draw conclusions about the relationships between factors and responses.
- Draw Conclusions: Interpret the results and make recommendations based on your findings.
Problem definition
We aim to increase the yield of a chemical reaction. This is a straightforward problem statement.
Brainstorming Factors
Based on our experience, we select the three factors that are most likely to affect the yield of the reaction:
- Temperature: The temperature at which the reaction is running.
- Catalyst Type: The type of catalyst used.
- Concentration: The concentration at which the catalyst is used.
Determining relevant levels
Once the factors are identified, the next step is to determine the range in which to test these parameters. The tricky part is to choose ranges that are neither so wide that your product starts decomposing, nor so narrow that possible effects remain undetected. Therefore, the ranges should be chosen based on prior knowledge, trial runs, or expert input. In our case, we choose the following levels for our parameters:
- Temperature: 100 °C, 200 °C
- Catalyst Type: Type A, Type B
- Concentration: 0.1 wt.%, 1 wt.%
Choosing the response
We will measure the yield by calculating the ratio of the measured outcome in grams to the expected outcome in grams.
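As a minimal sketch, the yield can be expressed as a percentage of the expected mass. The function name `percent_yield` and the gram values in the usage line are illustrative assumptions, not measurements from this experiment.

```python
def percent_yield(measured_g: float, expected_g: float) -> float:
    """Return the yield in percent: measured mass divided by expected mass."""
    if expected_g <= 0:
        raise ValueError("expected_g must be positive")
    return 100.0 * measured_g / expected_g


# Illustrative numbers only: 4.2 g obtained where 5.0 g was expected.
print(f"{percent_yield(4.2, 5.0):.1f} %")
```

Whatever definition you pick, use the same one for every run so that the responses stay comparable.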
Identifying Disturbance Variables
Disturbance variables can affect the outcome of the experiment but are not the primary focus. Here are three potential disturbance variables and how we will manage them:
- Ambient Temperature: Since our reaction is conducted at high temperatures, ambient temperature variations are unlikely to affect the outcome significantly.
- Different Batches of Catalyst: To avoid variability, we will ensure sufficient material from the same batch for all experiments. If not possible, the batch can be introduced as a blocking variable.
- Unforeseen Factors: Randomization of the experimental plan will help mitigate the impact of any unforeseen variables.
Choose an appropriate design plan
Design of Experiments (DoE) offers a variety of preset design plans tailored to different needs. These plans help structure your experiments efficiently, ensuring you run the right number of experiments to gather the most valuable information. Without delving into too much detail, here are the three main design plans:
- Fractional Factorial Design: Ideal for when you’re starting out and have limited information about your system.
- Full Factorial Design: Perfect for exploring interactions between a few identified key factors.
- Response Surface Design: Best for optimization and handling non-linearity, or when you need to find robust process parameters.
I recommend beginning with a fractional factorial design when you have many factors and are unsure about which ones drive your system. Once you’ve identified the key factors, use a full factorial design to explore the details. Reserve response surface designs for specific cases like optimization problems or when seeking robust process parameters. In our example, we already have a good understanding of the three parameters driving our system, so we will proceed with a full factorial design.
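To get a feeling for why the number of factors matters when choosing a design, here is a small sketch that compares run counts. It assumes two levels per factor and a half fraction for the fractional factorial case; the function names are made up for this illustration, and replicate runs are not included.

```python
def full_factorial_runs(n_factors: int, n_levels: int = 2) -> int:
    """Number of runs when every level combination is tested once."""
    return n_levels ** n_factors


def half_fraction_runs(n_factors: int) -> int:
    """Number of runs in a two-level half fraction, 2^(k-1)."""
    return 2 ** (n_factors - 1)


for k in (3, 5, 7):
    print(k, full_factorial_runs(k), half_fraction_runs(k))
# prints:
# 3 8 4
# 5 32 16
# 7 128 64
```

With our three factors, the full factorial design needs only 8 runs, which is why it is affordable here.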
Conducting the Experiments
We have created a full factorial design for our three parameters, resulting in the following design matrix:
Temp (°C) | Catalyst Type | Concentration (wt.%) | Yield |
---|---|---|---|
100 | A | 0.1 | |
200 | A | 0.1 | |
100 | B | 0.1 | |
200 | B | 0.1 | |
100 | A | 1.0 | |
200 | A | 1.0 | |
100 | B | 1.0 | |
200 | B | 1.0 | |
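The design matrix above can also be generated in a few lines of code. This is a minimal sketch using only the Python standard library; the dictionary keys are names chosen for this example.

```python
from itertools import product

levels = {
    "temperature_C": [100, 200],
    "catalyst": ["A", "B"],
    "concentration_wt_pct": [0.1, 1.0],
}

# product() varies its last argument fastest, so temperature is passed last
# to reproduce the row order of the table above.
design = [
    {"temperature_C": temp, "catalyst": cat, "concentration_wt_pct": conc}
    for conc, cat, temp in product(
        levels["concentration_wt_pct"], levels["catalyst"], levels["temperature_C"]
    )
]

for run in design:
    print(run)
```

Each of the 2 x 2 x 2 = 8 level combinations appears exactly once, which is what makes the design a full factorial.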
Usually, we would add replicate runs to estimate the experimental error and randomize the run order to minimize the impact of unknown disturbance variables. You can also add blocking for known disturbance variables, for example if you have to use different batches of catalyst A. But more about that in another blog post.
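Replication and randomization can be sketched as follows. The function name `randomized_plan`, the default of two replicates, and the seed are assumptions for this illustration; fixing the seed only makes the shuffled order reproducible.

```python
import random
from itertools import product


def randomized_plan(n_replicates=2, seed=42):
    """Return every factor combination n_replicates times, in shuffled run order."""
    base = [
        {"temperature_C": temp, "catalyst": cat, "concentration_wt_pct": conc}
        for conc, cat, temp in product([0.1, 1.0], ["A", "B"], [100, 200])
    ]
    # Copy each base run once per replicate so the entries stay independent.
    plan = [dict(run, replicate=rep) for rep in range(1, n_replicates + 1) for run in base]
    # Shuffle all runs together (complete randomization, no blocking).
    random.Random(seed).shuffle(plan)
    for order, run in enumerate(plan, start=1):
        run["run_order"] = order
    return plan


for run in randomized_plan():
    print(run)
```

Running the experiments in the printed order, rather than in the tidy order of the design matrix, keeps slow drifts in the lab from lining up with one particular factor.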
Analyze the results
After conducting the experiments, we will analyze the results using main effect plots and interaction plots.
Main effect plots visualize the average impact of individual factors, while interaction plots reveal how the effect of one factor depends on another. For example, the main effect plots show that, on average, concentration has the largest effect on yield. But interaction plots can reveal more nuanced insights, such as a change in catalyst concentration having a much larger impact with catalyst B than with catalyst A, particularly at 100 °C.
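The numbers behind such plots are simple averages. The sketch below shows how they could be computed for a two-level design in coded units (-1 = low level, +1 = high level). The response comes from `toy_yield`, a made-up formula that exists only so the example has numbers to work with; it is not the data from this experiment, and all names are chosen for illustration.

```python
from itertools import product
from statistics import mean

FACTORS = ("temperature", "catalyst", "concentration")


def toy_yield(t, k, c):
    """Made-up response model in coded units, NOT measured data."""
    return 60 + 4 * t + 2 * k + 8 * c + 3 * k * c


runs = [dict(zip(FACTORS, combo), y=toy_yield(*combo)) for combo in product((-1, 1), repeat=3)]


def main_effect(runs, factor):
    """Average response at the high level minus average response at the low level."""
    high = mean(r["y"] for r in runs if r[factor] == 1)
    low = mean(r["y"] for r in runs if r[factor] == -1)
    return high - low


def interaction_effect(runs, a, b):
    """Average response where a and b share a sign minus where they differ."""
    same = mean(r["y"] for r in runs if r[a] * r[b] == 1)
    differ = mean(r["y"] for r in runs if r[a] * r[b] == -1)
    return same - differ


for factor in FACTORS:
    print(factor, main_effect(runs, factor))
print("catalyst x concentration", interaction_effect(runs, "catalyst", "concentration"))

# Effect of concentration separately for each catalyst (coded -1 = A, +1 = B).
for code, label in ((-1, "A"), (1, "B")):
    subset = [r for r in runs if r["catalyst"] == code]
    print("concentration effect with catalyst", label, main_effect(subset, "concentration"))
```

In this toy model the concentration effect is larger with one catalyst than with the other, which is exactly the kind of pattern that shows up as non-parallel lines in an interaction plot.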
To make sure that the changes observed in the plots are not due to chance, we perform an Analysis of Variance (ANOVA). ANOVA compares the size of each effect with the experimental error estimated from replicate runs and indicates which parameters have a statistically significant influence on the yield.
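To illustrate what ANOVA does under the hood, here is a hand-rolled sketch for a balanced, replicated two-level design in coded units. It computes one F ratio per term: the sum of squares of the effect divided by the mean square of the pure error between replicates. The response is again a made-up toy model, and the two replicates per setting are offset by a fixed -1 and +1 as a stand-in for noise, so none of these numbers are results from this experiment. In practice you would let a statistics package do this and also report p-values.

```python
from itertools import product
from statistics import mean

FACTORS = ("temperature", "catalyst", "concentration")


def toy_yield(t, k, c):
    """Made-up response model in coded units, NOT measured data."""
    return 60 + 4 * t + 2 * k + 8 * c + 3 * k * c


# Two replicates per factor combination, offset by -1 and +1 as artificial noise.
runs = [
    (combo, toy_yield(*combo) + offset)
    for combo in product((-1, 1), repeat=3)
    for offset in (-1.0, 1.0)
]


def contrast_sign(combo, term):
    """Sign of a run for a term, e.g. ('catalyst', 'concentration')."""
    sign = 1
    for name in term:
        sign *= combo[FACTORS.index(name)]
    return sign


def f_ratios(runs, terms):
    """Return {term: F} for a balanced, replicated two-level factorial design."""
    n = len(runs)
    cells = {}
    for combo, y in runs:
        cells.setdefault(combo, []).append(y)
    # Pure error: spread of the replicates around their own cell mean.
    ss_error = sum((y - mean(ys)) ** 2 for ys in cells.values() for y in ys)
    df_error = n - len(cells)  # zero without replication, so no F ratio then
    ms_error = ss_error / df_error
    result = {}
    for term in terms:
        contrast = sum(contrast_sign(combo, term) * y for combo, y in runs)
        ss_term = contrast ** 2 / n  # each term has one degree of freedom
        result[term] = ss_term / ms_error
    return result


terms = [("temperature",), ("catalyst",), ("concentration",), ("catalyst", "concentration")]
for term, f in f_ratios(runs, terms).items():
    print(" x ".join(term), round(f, 1))
```

A large F ratio means the effect stands out clearly from the replicate-to-replicate scatter. Note that the error estimate relies on replicate runs: without them, `df_error` is zero and this calculation is not possible.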
Draw conclusions
Based on the significant parameters identified, we can now draw conclusions about which parameters are most important for controlling the yield and determine the optimum settings. Follow-up experiments can refine these findings.