Leveraging Data for Growth
Why I think Citrine Informatics could emerge as a leader in AI/ML for chemicals
One of my predictions for 2024 was that artificial intelligence (AI) and machine learning (ML) were going to inundate the chemical industry, but in places we didn’t anticipate, such as inventory, scheduling, and forecasting customer needs and wants. I’ve been skeptical that AI/ML would make a significant impact on product development or R&D because it seemed like many companies were seeking to replace the R&D function. The story I’ve seen around R&D and AI/ML is that software armed with enough computational power will somehow solve what humans have been struggling with for a century in drug discovery, materials, and organic chemistry. It’s not that I think it’s impossible, but rather that it’s the wrong target. To explain, come with me on a journey as I describe how many chemical companies operate and why I think companies like Citrine Informatics will succeed in this space.
The chemical industry is cheap. If you worked your ass off, made huge improvements in your company, landed a big customer, and launched new products, then congratulations—you have done your job. Your reward is maybe a paltry 4% raise because the business is still struggling due to [insert your favorite reason here]. However, if you stay loyal and put in the work, you should get rewarded in the future when the company is doing well (maybe a 7% increase in salary). If this sounds paltry and frustrating, then trust your instincts. This certainly was frustrating for me—but the thing about the chemical industry is that it is highly location dependent. You are often just happy to have a job and getting a new one might mean moving your whole life and family to somewhere completely new.
Your starting salary as a chemist, plant operator, process engineer, or quality technician might be really low compared to a product manager at Airbnb, but you are often living in places like Akron, Louisville, Midland, Wauwatosa (Wisconsin), Kingsport (Tennessee), and York (Pennsylvania). Making $100,000/year in these places would let you live a very comfortable life, but there are often very few options to leave your employer. If you want to live in these places, it’s a match made in heaven, but if you want to live somewhere else, maybe near your parents or where you grew up, things can be a bit trickier.
Chemical companies aren’t just cost conscious on salaries. They take the lower-cost option wherever it presents itself, and this has only been compounded by private equity operators. If you are a private equity firm buying two similar chemical companies with two R&D locations, you will eventually want to consolidate them because you can cut costs, still have R&D, and return more money to shareholders. Instead of all the chemists getting a ChemDraw license, there is just one license sitting on a slow desktop computer in the lab with the Karl Fischer titrator. In this environment, it can be difficult to get buy-in for investing in tools to boost R&D productivity when you might be fighting to keep your R&D team from being cut.
The goal of chemical companies right now isn’t to grow. It’s to manage the business to either remain flat or grow enough to just barely beat inflation. A declining business is fine as long as it continues to generate profits and costs can be cut periodically to maintain those profits and returns to the shareholders.
One way to reduce costs is to make your existing workforce more productive. This often looks like having your existing workforce do more work for the same pay. This ultimately leads to burnout, brain drain, more inefficiency, and a slow spiral towards zero. Things don’t have to be this way.
Invest in Productivity and Growth
Any consultant worth their fee is going to give you some platitude about how to solve the chemical industry’s problems. It’s all about digitalization driving productivity and using that enhanced productivity to outperform your competition and gain new business. Maybe even grow a bit and open some new markets by a disciplined and thoughtful approach to mergers and acquisitions. Your consultants might even write a really solid 5-year plan for you. Implementation is on you unless you want to pony up a few million per year for the consultants to run it (see the above note about being cost conscious).
Armed with some good-looking slides and maybe a couple of papers, a senior leadership team will be asking their direct reports to go find some digital solution options that might help increase productivity and do what the consultants said it would do.
This might take the form of better engineering controls systems to improve the plants (no one ever wants to improve the plant, it’s depreciating/fully depreciated, it’s like putting new Bose speakers in your 98 Honda Accord when the AC doesn’t work).
This might look like getting some new features in Salesforce (did you actually gain productivity from adopting Salesforce in the first place?)
This might look like investing in figuring out how to use all the data R&D has generated that just sits in a random folder on an internal server.
I think investing in R&D is probably always a good idea, but I think senior leadership teams need to be clear about what they want out of that team, how it’s resourced, and more importantly, understand how it works. It doesn’t matter how many bakers you put in the kitchen, the cake won’t bake any faster, and more scientists don’t mean faster development timelines for projects.
Be Clear on What You Want
Sometimes, being in the R&D function means someone is always angry at you. This is primarily driven by internal stakeholders feeling that the function is somehow holding everyone else back.
Your senior leadership team might be frustrated because they (who might be new because activist investors demanded change) want better products for less money in customers’ hands. Now.
Your sales team might be angry because some product isn’t going fast enough (see above). A different sales team is upset because a bunch of products that were launched too quickly have started to fail in the field and customer claims, where your customers blame you for their problems and charge you for it, are destroying your margins.
Supply chain and manufacturing are angry because you are slow to validate new raw materials and you are single sourced for key products.
The plants (manufacturing) are happy for now because it’s all R&D’s fault.
What everyone really wants is for the R&D team to develop that next big blockbuster product with really amazing gross margins or just that product that everyone else has, but for cheaper. Ideally, this should all happen in a few months. The way most R&D functions in companies (at least chemical companies) are run, this is not possible because data is not managed well and the systems managing data favor the people with experience who built them.
I think this is where Citrine Informatics could really help change things.
Citrine Informatics
I recently spoke to Stephen Edkins, Director of Strategy at Citrine Informatics, about what they are up to and how they see artificial intelligence and machine learning improving the chemical industry. They even made my prediction list for 2024.
Citrine offers an AI platform designed to enable chemists and materials scientists to develop better products in less time. In big tech companies, data is abundant and there are armies of data scientists to use it primarily because software margins are huge, and these companies have been growing like crazy (maybe not forever). Chemical companies are very different. Data is relatively scarce because experiments take time to conduct, and you need lab space and the people doing the experiments are doing more than just product development. They are supporting the existing business. Citrine essentially allows R&D people to become data scientists through a no-code platform.
I’ve written a lot about how companies mismanage data and I think there are really three approaches that companies can take right now:
1) Keep doing the same things and hoping for different results (what most companies are doing now)
2) Invest in an electronic laboratory information management system and manage all of your data (easy to result in bloat)
3) Generate or leverage the most relevant data to build models that help you to do what you do best (I’m a fan of this approach).
Previously, to deploy AI/ML models as a chemical or chemical-adjacent company you would need to hire some data scientists. This could be anywhere from $150-300k in total compensation per hire, depending on where you are located and how much experience you want from them. Then, you would need to give them data (it’s probably not organized at all and in terrible shape) and most likely go generate a bunch of new data. Maybe in a year you could have a basic model for a system that you utilize, like a pressure-sensitive, water-based acrylic adhesive.
Citrine Informatics enables you to not hire a data scientist or two and instead allows someone like me (not a data scientist) to build my own models for whatever system I’m working on. By working on the model yourself, instead of through a data scientist, you can incorporate your expertise directly and iterate quickly. In polymeric products where formulation is essential for product development, like polyurethane foams or waterborne emulsions, I think this approach is the way.
Another problem that is particularly difficult as a chemist while formulating is not knowing the actual structures of the products you are using. This is primarily to keep trade secrets a secret but gets in the way of utilizing structure to inform design of the end product. When the structures are not available your next best move is to just get enough data and develop a statistical model, typically through a design of experiments (or DOE). The problem with running models is that the complexity starts to become too much for someone like me once we get past 3 variables. Formulations in polyurethane foam or waterborne emulsions can have upwards of 10 components from multiple suppliers. This means that developing a complex model starts to become incredibly burdensome for chemists to manage and it’s often why you hire data scientists.
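To put a rough number on why this gets out of hand, here is a minimal sketch of how the run count of a full-factorial DOE grows with the number of components. It assumes the simplest possible design, where every factor is tested at every level, and the function names are hypothetical. Real formulation work usually relies on fractional or mixture designs that need far fewer runs, so treat this as an upper bound rather than a description of how anyone actually plans experiments.

```python
from itertools import product


def full_factorial_runs(n_factors: int, n_levels: int = 2) -> int:
    """Number of experiments in a full-factorial design (levels ** factors)."""
    return n_levels ** n_factors


def full_factorial_design(factors: dict) -> list:
    """Enumerate every combination of the given factor levels.

    `factors` maps a factor name to the list of levels to test.
    """
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]


if __name__ == "__main__":
    # Run counts for 3, 5, and 10 components at 2 and 3 levels each.
    for k in (3, 5, 10):
        print(k, full_factorial_runs(k, 2), full_factorial_runs(k, 3))
```

Three components at two levels each is 8 experiments, which a chemist can run in a week. Ten components at three levels each is 59,049, which no lab is going to run, and that gap is where a model has to do the work.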
Citrine’s mission is to make it easier for chemists to build and manage their own complex models. If you want to develop a spray foam formulation without the fishy smelling catalysts and no phosphate ester fire retardants using polyols alone, you probably need to develop a complex model to help you get to a formulation that works in terms of speed of foaming, rigidity, dimensional stability, insulation value, small flame test, density, open cell/closed cell, and near instant foaming once mixed. Here’s an example of how spray foam works and why density is an important factor for overall cost.
The Citrine Platform automatically creates complex models from data by examining - where known - the components, structure, and process history of materials. You can then tailor and improve the model so it can help you figure out an optimal formulation faster, and even suggest new formulations when you are trying to optimize for certain factors including cost (provided you have accurate costing on all your raw materials). Maybe you need a slightly different formulation for the level of humidity in Louisiana in the summer versus what you might use in Massachusetts in the winter. You could optimize formulations for all sorts of different customer needs and wants across not just the United States, but the entire world.
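As a rough illustration of what optimizing for cost means in practice, here is a small sketch that picks the cheapest candidate formulation that still clears a predicted performance threshold. This is not Citrine's API or method. The function names, the raw material prices, and the toy `predict` function are all made up for illustration, and a real model would be trained on your own lab data.

```python
def cost_per_kg(fractions: dict, prices: dict) -> float:
    """Blended raw material cost for mass fractions that sum to 1."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to 1")
    return sum(frac * prices[name] for name, frac in fractions.items())


def cheapest_passing(candidates, prices, predict, min_value):
    """Return the lowest-cost candidate whose predicted property meets the spec."""
    passing = [c for c in candidates if predict(c) >= min_value]
    if not passing:
        return None
    return min(passing, key=lambda c: cost_per_kg(c, prices))


if __name__ == "__main__":
    # Hypothetical prices in $/kg, not real quotes.
    prices = {"polyol_a": 2.0, "polyol_b": 3.0, "isocyanate": 2.5}
    candidates = [
        {"polyol_a": 0.5, "polyol_b": 0.1, "isocyanate": 0.4},
        {"polyol_a": 0.3, "polyol_b": 0.3, "isocyanate": 0.4},
        {"polyol_a": 0.1, "polyol_b": 0.5, "isocyanate": 0.4},
    ]

    # Toy stand-in for a trained property model (an assumption, not chemistry).
    def predict(formulation):
        return 10 * formulation["polyol_b"]

    best = cheapest_passing(candidates, prices, predict, min_value=2.5)
    print(best, cost_per_kg(best, prices))
```

The point of the sketch is the caveat from the paragraph above: the answer is only as good as the costing you feed in, so stale raw material prices will quietly steer you to the wrong formulation.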
I think the best part here is that as you build and add more data to your models, the chemists eventually just have to try formulations, verify them, and troubleshoot why certain ones didn’t work. Having done a bit of formulation (it’s not easy), I know that sometimes solving one problem leads to two more, and once you figure those out another one rears up, and instead of launching a product you’ve been playing whack-a-mole for 4 months. Having everything in a model and knowing where to go next, as well as where you shouldn’t, starts to become very valuable, and over the course of years my bet is that it can resolve a lot of the anger that is generally directed at product development teams.
Pro Tip: using the Whack-a-Mole analogy in a presentation with your senior leadership team doesn’t always go over so well based on my experience.
Citrine Informatics doesn’t need to know all the chemical structures of the stuff you are working with, and they aren’t necessarily going to predict a structure no one has thought of before, but for the typical chemist in a typical chemical company I think Citrine Informatics could, given enough time and data, increase their productivity by at least 2-5x. Also, I bet that given enough performance data, a really good chemist can probably guess at the eventual structure of that mystery raw material (or work it out given enough money, a big enough NMR, and enough time).
I don’t think artificial intelligence or machine learning will replace chemists in a lab.
Companies that figure out how to use artificial intelligence and machine learning in the lab to increase productivity will replace the companies that fail to act on the tools that will help chemists be better. If you want to learn more about Citrine Informatics, please reach out using https://citrine.io/contact/.
I’m a software guy but love reading your deep dives into the base layers of the civilization stack.
One of your funnier columns Tony, if only funny because I've been there.