
PromAssist, a synthetic biologist's assistant




In previous posts, I briefly introduced the world of synthetic biology and the design of genetic circuits. The goal of competitions such as iGEM is to encourage people to design creative solutions to real-world problems using synthetic biology, with the hope of moving the field out of academia and into industry, much like electronics. And much like electronics, iGEM is trying to develop a library of biological parts, analogous to electronic parts such as capacitors and resistors, from which any circuit can be built. If you have ever dabbled in electronics, you will have noticed that not all parts are created equal; some capacitors have higher efficiencies than others, which might suit a certain application better. In this post, I want to explore what is, in my opinion, one of the most important "biological parts": promoters.

 What are promoters

Promoters are regions of DNA, usually found upstream of a gene, that act as a switch for expressing whatever gene is downstream of them. Since transcribing DNA into mRNA requires the help of RNA polymerase, a promoter's job is to "attract" floating RNA polymerase and get it attached so it can transcribe the downstream gene or genes (figure 1).

Figure 1: Simple sketch explaining promoters


The simplest promoters are called constitutive promoters, or more directly, unregulated promoters. These promoters attract floating RNA polymerase carrying the σ70 factor (sometimes called the housekeeping RNAP); they are unregulated because the genes they drive are usually crucial to the survival of the cell. Things get much more complicated when promoters are inducible, meaning they can be regulated by other proteins or molecules. A classic textbook example is the lactose operon (commonly known as the lac operon) in E. coli, which is used to metabolise lactose in the absence of glucose. Since it is energetically expensive to produce lactose-metabolising enzymes when glucose is abundant, the lac operon, which is a group of genes controlled by a single promoter, is repressed most of the time. The way it works is roughly like this:
 
Figure 2: Lac operon dynamics. Source: Microbenotes


  1. The Lac repressor, which is a DNA-binding protein, is always expressed because it sits downstream of a constitutive promoter (figure 2a).
  2. The Lac repressor binds to the lac operon promoter region, blocking RNA polymerase from attaching (figure 2a).
  3. If lactose molecules are present, they bind to the Lac repressor and release it from the promoter region (figure 2b).
  4. RNA polymerase can then do its job of transcribing the operon genes into mRNA (figure 2b).

This is a beautifully evolved system that acts like an if statement, in modern engineering terms. In fact, the lac operon was the first gene regulatory system to be worked out, and it earned its discoverers the Nobel Prize in Physiology or Medicine in 1965.
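To make the "if statement" analogy concrete, the four steps above can be sketched as a toy boolean model. This is only an illustration: the function name and its parameters are made up for this post, and the model deliberately ignores glucose-dependent regulation and the leaky, low-level expression seen in real cells.

```python
def lac_operon_transcribed(lactose_present: bool,
                           repressor_expressed: bool = True) -> bool:
    """Toy boolean model of the lac operon; True means the operon is transcribed."""
    # Steps 1-3: the repressor is constitutively expressed and sits on the
    # promoter region, unless lactose is present to bind it and pull it off.
    repressor_on_promoter = repressor_expressed and not lactose_present
    # Step 4: RNA polymerase can only transcribe when the promoter is free.
    return not repressor_on_promoter


if __name__ == "__main__":
    for lactose in (False, True):
        state = "ON" if lac_operon_transcribed(lactose) else "OFF"
        print(f"lactose present: {lactose} -> operon {state}")
```

Real promoters are not this clean, which is part of why predicting expression strength from sequence is hard.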

Why promoters are important and how PromAssist can help

Because the lac operon is quite well understood, synthetic biologists have been using the lac operon system to express their gene of choice in E. coli, usually replacing lactose with IPTG, as it has a similar structure to lactose and can be used to inactivate the lac repressor. While this works for simple on/off genetic circuits, it becomes daunting if you want to create more complex circuits with multiple triggers and you need high specificity in gene expression. Synthetic promoters designed specifically for certain repressors, which can themselves be synthetic (as in not found in nature), can unlock endless possibilities for creating highly crafted and tuned circuits. Just as AlphaFold can assist scientists in visualising the 3D shape of proteins, and hence in reasoning about the binding affinity of proteins like the lac repressor, PromAssist can be a useful tool for scientists experimenting with different synthetic promoters to assess the strength of transcription factor binding and gene expression strength in general.

PromAssist is based on DNABERT, a BERT-based machine learning model pre-trained on a massive amount of DNA sequence data. I have fine-tuned the model on data gathered from this paper. The goal is for scientists to input the DNA sequence of the promoter they have in mind and get back the relative strength of its gene expression. The hope is that PromAssist can help biologists find a suitable synthetic promoter much more quickly. The model will be updated as new fine-tuning strategies are tried.
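To give a feel for what "inputting a DNA sequence" involves, here is a minimal sketch of overlapping k-mer tokenisation, the scheme used by the original DNABERT to turn a sequence into "words". This is an assumption on my side for illustration: the helper name `kmer_tokens`, the default `k = 6`, and the example sequence are hypothetical and not necessarily what PromAssist uses internally.

```python
VALID_BASES = set("ACGT")


def kmer_tokens(sequence: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (stride of 1)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = sequence.strip().upper()
    # Reject anything that is not a plain A/C/G/T base.
    invalid = set(seq) - VALID_BASES
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    # Hypothetical promoter fragment, used only to show the output format.
    print(" ".join(kmer_tokens("ttgacagctagc", k=6)))
```

The space-joined k-mers are the kind of string a k-mer based tokenizer would then map to token IDs before the sequence reaches the model.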
