Are bats to blame for lockdown?

Can we blame bats for transmitting COVID-19 to humans and starting one of the most horrific pandemics and economic crises in the history of mankind? Does Batman deserve to feel awkward when bowing to NHS staff?

Source: www.shutupandtakemymoney.com/what-a-bad-time-to-dress-up-as-a-bat-batman-meme/


It is known that bats can harbour different strains of viruses without becoming ill, thanks to an amazing immune system that limits inflammation [1]. That alone is not valid proof that bats are to blame; it just makes them a pretty good candidate. In this article, I will take a hands-on approach to investigating which organism transmitted COVID-19 to humans by taking the bioinformatics route. I will analyse a sequence taken from a faecal swab of common bats in China to try to arrive at a meaningful conclusion, in a way reconfirming the results of this peer-reviewed paper [2]. I will first show the computed results and explain their implications, then explain how to carry out such an analysis by following the workflow that I have developed in Snakemake.

The Short Answer 


The short answer to the question is "probably"; I'll explain why. Analysing the sequence from a bat faecal swab collected recently in China [3] detects the RaTG13 coronavirus, which turns out to be highly similar to COVID-19 in a local alignment performed with BLAST (figure 1). In fact, it is roughly 96% identical to the original COVID-19 that was sequenced from the food market in Wuhan. That might seem like reason enough to put the blame on bats, but if you stop and compare the genomes of closely related species, you realise that seemingly completely different species like humans and bonobos share about 98.7% of their genome [4], a staggering result considering how differently bonobos behave from humans.

Just to clarify, if we assume the difference between humans and bonobos to be only 1%, that means that 1% of the more than 3 billion base pairs in the genome are different. Putting it bluntly, that's around 30 million differing base pairs. Depending on the length of the proteins, 30 million base pairs could encode quite a large number of different proteins. Viruses, on the other hand, have much smaller genomes, made of DNA or, in some cases, RNA. A coronavirus genome is around 30k bp, so 1% of it corresponds to 300 bp. To put that into perspective, the membrane glycoprotein gene in COVID-19 is around 700 bp [5]. In our example, a 4% difference is around 1,200 bp, which is longer than that entire gene. So the bat remains a strong candidate, but we can't say for sure. For now, maybe Batman deserves to feel a little bit guilty.
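The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the genome lengths are the rounded figures quoted in the text, and the function and constant names are made up for illustration.

```python
# Convert a percent identity into an approximate count of differing base pairs.
# All lengths are the rounded figures quoted in the text, not exact genome sizes.

def differing_base_pairs(genome_length_bp: int, percent_identity: float) -> int:
    """Return the approximate number of positions that differ."""
    return round(genome_length_bp * (100.0 - percent_identity) / 100.0)

HUMAN_GENOME_BP = 3_000_000_000  # "more than 3 billion" base pairs, rounded
VIRUS_GENOME_BP = 30_000         # "around 30k bp"
MEMBRANE_GENE_BP = 700           # approximate length quoted for the membrane glycoprotein

# Humans vs bonobos, assuming a 1% difference as in the text.
human_vs_bonobo = differing_base_pairs(HUMAN_GENOME_BP, 99.0)

# A 30k bp virus genome at 1% and at 4% difference.
virus_one_percent = differing_base_pairs(VIRUS_GENOME_BP, 99.0)
virus_four_percent = differing_base_pairs(VIRUS_GENOME_BP, 96.0)

print(f"Human vs bonobo at 1%: {human_vs_bonobo:,} bp")
print(f"Virus at 1%: {virus_one_percent:,} bp")
print(f"Virus at 4%: {virus_four_percent:,} bp")
print(f"4% difference in membrane-gene lengths: {virus_four_percent / MEMBRANE_GENE_BP:.1f}")
```

The last line expresses the 4% difference as a multiple of the membrane glycoprotein length, which is the comparison the paragraph is making.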

Figure 1: BLAST report showing COVID-19 is 96.12% identical to RaTG13 coronavirus.
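For readers wondering what a figure like "96.12% identical" measures, here is a minimal sketch of a percent-identity calculation over two already-aligned sequences. It assumes identity is counted as identical columns divided by alignment length, with gaps written as `-`. The two short sequences are invented toy data, not the real COVID-19 or RaTG13 genomes, and the function is not BLAST itself.

```python
# Percent identity of two sequences that have ALREADY been aligned
# (same length, gaps written as "-"). Toy illustration only.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns divided by alignment length, as a percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both bases are equal and not a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)

# Hypothetical 15-column alignment with one gap and one mismatch.
query = "ATGGCT-ACGTTAGC"
subject = "ATGGCTTACGATAGC"

identity = percent_identity(query, subject)
print(f"{identity:.2f}% identical over {len(query)} columns")
```

In a real analysis the alignment and the identity figure come straight from the BLAST report; the sketch only shows what the number means.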



How to Compute The Results

I carried out the analysis on raw FASTQ files produced by Illumina machines. FASTQ is the standard output format of Next Generation Sequencing machines. The Illumina platform slices DNA into smaller fragments (around 300 bp) before analysis. The major advantage is the ability to parallelise sequencing, which dramatically reduces its time and cost. One drawback of this method for the bioinformatician is having to assemble these fragments back into their original form. Most of the time this is easy with off-the-shelf mapping tools such as Bowtie, but it can be tedious for longer sequences where repetitive regions can occur.
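To give a feel for what a FASTQ file contains, here is a minimal reader sketch. It assumes the common four-line record layout (an `@` header, the sequence, a `+` separator, then one quality character per base) and Phred+33 quality encoding. The two reads are invented toy data, and the names `FastqRecord`, `read_fastq` and `mean_phred` are hypothetical helpers, not part of the workflow in the repository.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Average Phred score, assuming Phred+33 encoding."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Toy data standing in for a real file opened with open(path).
toy_fastq = io.StringIO("@read_1\nGATTACA\n+\nIIIIIII\n@read_2\nACGT\n+\n!!!!\n")
records = list(read_fastq(toy_fastq))

for record in records:
    print(record.read_id, len(record.sequence), mean_phred(record.quality))
```

For real data, an established parser (for example the one in Biopython) is a safer choice than a hand-rolled reader; the sketch is only meant to show the shape of the reads that later have to be mapped or assembled.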

For more technical detail, visit my GitHub repository on the topic, where I explain how to reproduce the results.



References
