
The Intelligent Coffee Roaster


https://www.gocoffeego.com/dbimages/148/roasting@2x.jpg



Roasters widely regard coffee roasting as an art form that demands immense dedication to the craft. It certainly looks like an art, considering how many people can't function normally without their morning cup of coffee. Coffee roasting is arguably the key transformative step in the coffee cycle, which usually starts with harvesting coffee cherries, continues with drying them using one of several methods, and ends in the morning mugs of office workers (figure 1). Before roasting, coffee beans hardly resemble our beloved morning coffee in taste, look or smell. As artistic as it sounds, coffee roasting can be quantified to a large degree and turned into a scientific and engineering discipline. The motivation behind this article is to explore how mathematical modelling and artificial intelligence can help commercial coffee roasters achieve a high level of consistency with relative ease.

Figure 1: Coffee processing cycle from planting to the cup (source unknown)





Coffee roastmasters usually pride themselves on doing an extremely hard and artistic job. One can position roastmasters somewhere on the spectrum between a chef, a technician and an artist. On the one hand, roasting is an extremely delicate job: people consume coffee mainly for its aroma (plus the caffeine kick), so it requires some understanding of the human palate. On the other hand, roastmasters have to deal with natural variability: each crop differs slightly from year to year in moisture content and bean size, which leads to variability in roasting. Thus, roastmasters have to flex their roasting skills to achieve the same roast on different crops.

But is it actually that hard to roast coffee? I am challenging this belief and suggest that coffee roasting is a deterministic process: a number of factors contribute to roast colour, and with the aid of mathematical modelling, machine learning techniques and some physical property measurements taken beforehand, we can eventually predict the exact roast colour.

Coffee is grown in a region called the coffee belt, which includes countries like Ethiopia, Colombia, Brazil and Indonesia. The geographic trait they share is high temperature and high humidity. As mentioned before, coffee goes through many steps between harvesting and the green coffee form that is ready for roasting. These processing steps almost certainly vary from country to country, not to mention from farm to farm. These variations in processing techniques, along with the unpredictability of weather, can lead to inconsistent green coffee beans in terms of moisture content, density and other physical factors. That is essentially why mastering coffee roasting is not an easy feat, and why we call them "roastmasters". Any roastmaster must do the calibration for moisture, size, density and other factors in their head to achieve consistent results. What if we could let the machine do the calibration itself?

Coffee roasting is very similar to cooking: chemical reactions take place and physical changes occur in the bean during the roast, leading to the release of aromas. If you meet a roastmaster, they will almost certainly emphasize the importance of checking the coffee during the roast and dropping it for cooling at the perfect moment. In a sense, timing is the roastmaster's recipe.

The aim of this project is to solve the issue of coffee roasting inconsistency by utilising the abundance of roast profile data. It is assumed that the roast colour (i.e. the colour of the bean after roasting) is solely determined by physical factors (bean size, variation in bean shape, moisture level, bean density, heat applied, etc.). The actual chemical and physical changes to coffee have been studied and simulated intensively by Fabbri et al. [1] and Baggenstoss et al. [2]. They showed the effects of density and moisture content on the roasted coffee output.



What can a roastmaster measure?


A time series of temperature is the most basic way of measuring what is happening to the coffee during roasting; in a sense, it is the roastmaster's recipe plot (figure 2). There are some key physical characteristics that traders and roastmasters usually consider when evaluating the quality of green coffee beans. First, the relative size of the beans can be measured by using a mesh to separate larger beans from smaller ones. Second, the density of the coffee can be measured with relative ease. Finally, moisture levels can be measured using special moisture analysis instruments. A change in the initial moisture level or density can lead to a relatively large difference in roast colour. The good news is that these parameters can be measured easily, and doing so is considered standard practice among many roastmasters.

Figure 2: A typical roast profile (source: https://coffeecourses.com/coffee-roast-profiling/)


The role of mathematical modelling and artificial intelligence


Companies such as Cropster have a humongous amount of coffee roasting data, that is, temperature change over time. Moreover, Cropster also has a database of green coffee profiles and cupping results from its customers. It obviously makes sense to try to deduce a pattern from these data using a machine learning algorithm or a neural network. Maybe they could find the coffee equation?

There are two issues with using the aforementioned approach to find the perfect "coffee equation". The first is the need for a large amount of data: a neural network typically starts with no knowledge about whatever it is trying to learn and only gets better as it learns from data. The luxury of having these data is not available to everyone; it is mainly reserved for big companies. The second problem, perhaps more important, is the black-box problem of artificial intelligence. Essentially, most of us don't really know why neural networks work; we sort of just believe the answers.

I am proposing a slightly more scientific approach to the problem: utilising the work of Isaac Newton and the more "traditional" way of modelling problems using calculus. As Richard Feynman famously said, "Calculus is the language God talks". Machine learning is used here less as a brute-force method of throwing a lot of data at an algorithm and more as a way to support a system of differential equations by finding the parameters that best suit our particular problem. This approach is widely used in scientific studies and is inspired mainly by how scientists model microbial colonies and use AI to fit the model, as mentioned by James J. Collins and others [3] and illustrated in figure 3.

Figure 3: The workflow suggested by James J. Collins and others in their work [3]. Dynamical modelling is refined and constrained with the use of machine learning and experimental results.



How to ACTUALLY do it?


Assume a roastmaster can measure two things before roasting: the moisture level and the density of the coffee. We intuitively know that the measured parameters are inversely related to how fast the coffee roasts (i.e. the denser and wetter something is, the longer it takes to cook and to evaporate its water content). But we don't know quantitatively how they change over time. We can start by writing the system of ODEs using Newton's law of cooling:

where D is the drum temperature, B is the bean temperature and H is the heat applied to the drum (using either flames or hot air). Bear in mind that what roasters usually measure is the drum temperature, not the bean temperature. The heat applied to the system, H, is basically a PD controller, where:
A is just a preset temperature that we want our roaster to reach at the end, and D(t) is the drum temperature at time t. The rest are parameters that are specific to the bean characteristics and the actual roaster, and that we do not currently know. More concretely, we can think of α as the heat coefficient of the actual roaster: higher values of α lead to higher heat transfer from the heat source to the drum. γ is the coefficient related to the bean characteristics, which include the shape of the bean, density, moisture content, etc. We will focus on the latter two, as they are the ones that we can measure and the ones that generally vary significantly between different lots. Moisture and gases are released from the bean during coffee roasting [1], but we don't really know how they change over time. Thus, we can initially guess that γ takes a decaying exponential form (not an unreasonable assumption after reading this paper [2]). Thus, γ can be approximated as:

where ρ is the density and ω is the moisture content. β and η are unknown parameters specific to the problem. Thus, the whole system is:
where μ is just β · η. Solving the system numerically leads to the following results:




The graph above shows the numerical solution of the aforementioned system of ODEs for up to 300 timesteps. A roastmaster would typically read the drum temperature on their machine, just like the blue curve in the graph above (with the exception that a real roast doesn't go on indefinitely!).
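As a rough illustration of what "solving the system numerically" can look like, here is a minimal Python sketch that steps a drum/bean system forward for 300 timesteps with explicit Euler updates. The right-hand sides are assumptions chosen to match the verbal description (Newton-style heat exchange between drum and bean, a PD-controlled heat input H with gains κ and κd, and an exponentially decaying bean coefficient γ); they are not necessarily the exact equations behind the plot. All numeric values, including the role given to λ, are hypothetical.

```python
import math


def simulate_roast(alpha, lam, kappa, kappa_d, beta, eta, rho, omega,
                   setpoint=220.0, drum0=200.0, bean0=25.0, steps=300, dt=1.0):
    """Integrate an ASSUMED drum/bean model with explicit Euler steps.

    Assumed form (illustrative only):
        H      = kappa * (A - D) - kappa_d * dD/dt        (PD controller)
        dD/dt  = alpha * H - lam * (D - B)                (drum)
        dB/dt  = gamma(t) * (D - B)                       (bean)
        gamma  = beta * exp(-eta * rho * omega * t)       (decaying coefficient)
    """
    drum, bean = drum0, bean0
    times, drums, beans = [0.0], [drum], [bean]
    for n in range(steps):
        t = n * dt
        # Assumption: gamma decays faster for denser, wetter beans.
        gamma = beta * math.exp(-eta * rho * omega * t)
        # Substitute H into the drum equation and solve for dD/dt.
        d_drum = (alpha * kappa * (setpoint - drum)
                  - lam * (drum - bean)) / (1.0 + alpha * kappa_d)
        d_bean = gamma * (drum - bean)
        drum += dt * d_drum
        bean += dt * d_bean
        times.append(t + dt)
        drums.append(drum)
        beans.append(bean)
    return times, drums, beans


# Hypothetical parameter values, picked only to give a plausible-looking curve.
times, drums, beans = simulate_roast(
    alpha=0.05, lam=0.05, kappa=1.0, kappa_d=0.5,
    beta=0.03, eta=0.05, rho=0.7, omega=0.11,
)
print("lowest drum temperature:", round(min(drums), 1))
print("final drum / bean temperature:", round(drums[-1], 1), round(beans[-1], 1))
```

With these made-up values the drum temperature first dips as the cold beans absorb heat and then climbs back towards the setpoint, which is the qualitative shape a roastmaster would recognise.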

Adding a loss function


Loss functions and backpropagation are the backbone of artificial intelligence. A typical workflow for training a model is to draw random initial guesses for the unknown parameters and pass them through the function to get a result. We then compare the result to the real answer (in supervised learning, that is) and calculate how far away we are from it using a loss function (e.g. least squares). Finally, we iteratively modify the parameters to minimize the loss function (i.e. backpropagate through the function or network to get closer to the real answer). From now on, I will refer to this process as curve-fitting, as I think it is more appropriate.
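The workflow above can be sketched in a few lines of plain Python. This is a toy example, not the roasting model: the straight-line model `a*x + b`, the synthetic data, the learning rate and the step count are all assumptions made for illustration. It draws a random initial guess, measures the least-squares loss and repeatedly nudges the parameters along the negative gradient.

```python
import random


def least_squares_loss(params, xs, ys):
    """Mean squared difference between the model a*x + b and the data."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_by_gradient_descent(xs, ys, learning_rate=0.01, steps=5000, seed=0):
    """Minimise the least-squares loss starting from a random initial guess."""
    rng = random.Random(seed)
    a, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)  # random initial guess
    n = len(xs)
    for _ in range(steps):
        errors = [a * x + b - y for x, y in zip(xs, ys)]
        # Analytic gradient of the mean squared error with respect to a and b.
        grad_a = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Synthetic "real answer": points on the line y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

a, b = fit_by_gradient_descent(xs, ys)
print("fitted a, b:", a, b)
print("final loss:", least_squares_loss((a, b), xs, ys))
```

In practice a library optimizer replaces the hand-written loop, but the ingredients stay the same: a model, a loss and an update rule.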

Our particular problem is very focused. Essentially, we are asking: what set of parameters makes our model fit the experimental data? Once we know the parameters, we can start observing how changing the density and water content changes the output. Concretely, I have used the SciPy least-squares algorithm to fit my model to the experimental data (explained neatly by Erico Tjoa in this article).

Fitting the model 

Going back to our problem, I have managed to get my hands on an actual roast profile produced by a Toper TKM-SX30 roaster that was roasting an espresso blend. With our current model, we have 6 unknown parameters to tune (α, λ, κ, κd, β, η). With a bit of eyeballing and intuitive guessing, I came up with an initial guess of the parameters that vaguely resembles a roast profile, shown in red in the figure below and compared to the experimental data in green:

Even though the initial guess exhibits a faster initial drop in temperature than the experimental data, I think it is good enough to start optimizing from. Performing least-squares optimization gives the following result in blue:


With the optimized parameters, the model closely resembles the experimental results!
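For readers who want to try this kind of fit themselves, here is a minimal sketch using `scipy.optimize.least_squares`. Several things are assumptions: the model equations are an illustrative stand-in (a Newton-style drum/bean exchange with a PD-controlled heat input), the "measured" profile is synthetic data generated from known parameters and not the Toper roast log, and only four parameters are fitted while the controller gains are held fixed to keep the toy problem well-posed.

```python
import numpy as np
from scipy.optimize import least_squares

RHO, OMEGA = 0.7, 0.11          # hypothetical density and moisture fraction
KAPPA, KAPPA_D = 1.0, 0.5       # controller gains, held fixed in this toy fit
SETPOINT, DRUM0, BEAN0 = 220.0, 200.0, 25.0


def drum_profile(params, steps=300, dt=1.0):
    """Euler-integrate the ASSUMED model and return the drum temperature."""
    alpha, lam, beta, eta = params
    drum, bean = DRUM0, BEAN0
    out = np.empty(steps + 1)
    out[0] = drum
    for n in range(steps):
        gamma = beta * np.exp(-eta * RHO * OMEGA * n * dt)
        d_drum = (alpha * KAPPA * (SETPOINT - drum)
                  - lam * (drum - bean)) / (1.0 + alpha * KAPPA_D)
        d_bean = gamma * (drum - bean)
        drum, bean = drum + dt * d_drum, bean + dt * d_bean
        out[n + 1] = drum
    return out


def residuals(params, measured):
    """Difference between the simulated and the measured drum temperature."""
    return drum_profile(params) - measured


# Synthetic stand-in for a real roast log, generated from known parameters.
true_params = np.array([0.05, 0.05, 0.03, 0.05])
measured = drum_profile(true_params)

# An eyeballed starting point, deliberately off from the true values.
initial_guess = np.array([0.08, 0.03, 0.05, 0.02])
initial_cost = 0.5 * np.sum(residuals(initial_guess, measured) ** 2)

fit = least_squares(residuals, initial_guess, args=(measured,), bounds=(0.0, 1.0))
print("initial cost:", initial_cost)
print("fitted cost: ", fit.cost)
print("fitted parameters (alpha, lam, beta, eta):", fit.x)
```

With a real roast log, `measured` would be replaced by the recorded drum temperatures sampled at the same timestep as the simulation.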



The most important question and verdict


Can we achieve high coffee roasting consistency by only knowing the density and moisture content of green coffee beforehand? Theoretically, yes. However, the model hasn't been verified experimentally yet. I encourage roastmasters reading this post to contact me to get the code and try it out. The purpose of this solution is not to replace roastmasters by any means (although it would be an interesting experiment to see if we can) but rather to assist roastmasters who appreciate the complexity of the problem. If this works, a roastmaster would typically create a profile and then reproduce it without thinking too much about the variations between different lots. The input to this algorithm should be the physical properties plus the desired roast profile (which could be one of a number of different colours: Medium, Dark, Medium Dark, ...). The output would be instructions to the roastmaster on the things that can be controlled, which is only the amount of heat applied at specific moments in time: charge temperature, end temperature, total time, delta T, etc.
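To make the output side concrete, here is a small sketch that summarises a roast profile into the quantities listed above. The function name, the dictionary keys and the sample readings are hypothetical, and "delta T" is interpreted here as the temperature change per minute between consecutive readings, which is an assumption.

```python
def summarize_profile(times_s, temps_c):
    """Summarise a profile: charge temp, end temp, total time and delta T per minute."""
    if len(times_s) != len(temps_c) or len(times_s) < 2:
        raise ValueError("need at least two matching time/temperature readings")
    # Temperature change between consecutive readings, scaled to degrees per minute.
    delta_t = [
        (temps_c[i + 1] - temps_c[i]) * 60.0 / (times_s[i + 1] - times_s[i])
        for i in range(len(times_s) - 1)
    ]
    return {
        "charge_temp_c": temps_c[0],
        "end_temp_c": temps_c[-1],
        "total_time_s": times_s[-1] - times_s[0],
        "delta_t_c_per_min": delta_t,
    }


# Made-up readings taken once a minute (seconds, degrees Celsius).
times_s = [0, 60, 120, 180, 240]
temps_c = [200.0, 150.0, 160.0, 180.0, 205.0]

summary = summarize_profile(times_s, temps_c)
for key, value in summary.items():
    print(key, value)
```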



References

[1] Fabbri, Angelo, et al. “Numerical Modeling of Heat and Mass Transfer during Coffee Roasting Process.” Journal of Food Engineering, vol. 105, no. 2, 2011, pp. 264–269.
[2] Baggenstoss, Juerg, et al. “Coffee Roasting and Aroma Formation: Application of Different Time-Temperature Conditions.” Journal of Agricultural and Food Chemistry, vol. 56, no. 14, 2008, pp. 5836–5846.
[3] Lopatkin, A. J., and J. J. Collins. “Predictive Biology: Modelling, Understanding and Harnessing Microbial Complexity.” Nature Reviews Microbiology, vol. 18, no. 9, 2020, pp. 507–520.


