Can we blame bats for transmitting COVID-19 to humans and starting one of the most horrific pandemics and economic crises in the history of mankind? Does Batman deserve to feel awkward when bowing to NHS staff?
Source: www.shutupandtakemymoney.com/what-a-bad-time-to-dress-up-as-a-bat-batman-meme/
The Short Answer
The short answer to the question is "probably", and I'll explain why. When faecal swab samples recently collected from bats in China were sequenced [3], the RaTG13 coronavirus was detected. A local alignment with BLAST (figure 1) shows that it is highly similar to the virus behind COVID-19: roughly 96% identical to the original genome sequenced from the food market in Wuhan.
That might seem like reason enough to put the blame on bats. But if you stop and compare the genomes of closely related species, you realise that seemingly very different species like humans and bonobos share about 98.7% of their genome [4], a staggering result considering how differently bonobos behave from humans. To clarify, if we assume the difference between humans and bonobos to be only 1%, then 1% of the more than 3 billion base pairs in the genome differ. Put bluntly, that's around 30 million base pairs. Depending on the length of the proteins, 30 million base pairs could encode quite a large number of different proteins.
Viruses, on the other hand, have much smaller genomes, made of DNA or, in some cases, RNA. So 1% of a coronavirus genome like this one, which is around 30k bp, corresponds to 300 bp. To put that into perspective, the membrane glycoprotein gene in COVID-19 is around 700 bp [5]. In our example, the 4% difference is around 1,200 bp. That is a tiny absolute difference next to the 30 million base pairs separating humans and bonobos, which makes the bat an even stronger candidate, but we can't say for sure. For now, maybe Batman deserves to feel a little bit guilty.
Figure 1: BLAST report showing COVID-19 is 96.12% identical to RaTG13 coronavirus.
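The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the genome sizes and percentages are the rounded figures quoted in the text, and the function and constant names are made up for illustration.

```python
def differing_bases(genome_length_bp: int, percent_difference: float) -> int:
    """Approximate number of base pairs that differ between two genomes."""
    return round(genome_length_bp * percent_difference / 100)


# Rounded figures taken from the text (assumptions, not measured values).
HUMAN_GENOME_BP = 3_000_000_000   # "more than 3 billion base pairs"
VIRUS_GENOME_BP = 30_000          # "around 30k bp"
MEMBRANE_GENE_BP = 700            # membrane glycoprotein gene, "around 700 bp"

human_vs_bonobo = differing_bases(HUMAN_GENOME_BP, 1.0)
virus_one_percent = differing_bases(VIRUS_GENOME_BP, 1.0)
virus_four_percent = differing_bases(VIRUS_GENOME_BP, 4.0)

print(f"Human vs bonobo at 1%: {human_vs_bonobo:,} bp")
print(f"Virus at 1%:           {virus_one_percent:,} bp")
print(f"Virus at 4%:           {virus_four_percent:,} bp")
print(f"4% gap vs membrane gene length: {virus_four_percent / MEMBRANE_GENE_BP:.1f}x")
```

The same percentage means very different things at different genome sizes, which is why the 96% figure alone cannot settle the question.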
How to Compute the Results
I carried out the analysis on raw FASTQ files produced by Illumina machines. FASTQ is the standard format for Next Generation Sequencing machines. The Illumina platform slices DNA into smaller fragments (around 300 bp) before analysis. The major advantage is the ability to parallelise sequencing, which dramatically reduces its time and cost. One drawback for the bioinformatician is having to assemble these fragments back into the original sequence. Most of the time this is easy with off-the-shelf mapping tools such as Bowtie, but it can be tedious for longer sequences where repetitive regions occur.
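To give a feel for what a FASTQ file contains, here is a minimal sketch of a reader in plain Python. It is not the pipeline used for the analysis: the two records are invented toy data, the function names are hypothetical, and it assumes the common four-lines-per-record layout with Phred+33 quality encoding.

```python
from io import StringIO
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (a blank line also stops this simple reader)
            break
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is skipped
        quality = handle.readline().strip()
        if not header.startswith("@"):
            raise ValueError(f"Malformed FASTQ header: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Average Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Invented toy records, not real sequencing data.
EXAMPLE = "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGACA\n+\n!!!!!!\n"

records = list(read_fastq(StringIO(EXAMPLE)))
for read_id, sequence, quality in records:
    print(read_id, len(sequence), mean_phred(quality))
```

In practice each read is only a short fragment, so a mapping tool such as Bowtie is still needed to place the reads against a reference genome.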
For more technical detail, visit my GitHub repository on the topic, where I explain how to reproduce the results.
References