
Are bats to blame for lockdown?

Can we blame bats for transmitting COVID-19 to humans, starting one of the most horrific pandemics and economic crises in human history? Does Batman deserve to feel awkward when bowing to NHS staff?

Source: www.shutupandtakemymoney.com/what-a-bad-time-to-dress-up-as-a-bat-batman-meme/


It is known that bats can harbour different strains of viruses without becoming ill, thanks to an amazing immune system that limits inflammation [1]. Nevertheless, that is not proof that bats are to blame; it just makes them a pretty good candidate. In this article, I will take a hands-on approach to investigating which organism transmitted COVID-19 to humans by taking the bioinformatics route. I will analyse a sequence taken from a faecal swab of common bats in China to try to arrive at a meaningful conclusion; in a way, reconfirming the results of this peer-reviewed paper [2]. I will start by showing the computed results and explaining their implications, then I will explain how to carry out such an analysis by following the workflow that I have developed in Snakemake.

The Short Answer 


The short answer to the question is "probably"; I'll explain why. When analysing the sequence from a bat faecal swab collected recently in China [3], the RaTG13 coronavirus was detected, and it turned out to be highly similar to COVID-19 when I performed a local alignment using BLAST, as shown in figure 1. In fact, it is roughly 96% identical to the original COVID-19 that was sequenced from the food market in Wuhan. That might seem like enough to put the blame on bats, but if you stop and compare the genomes of closely related species, you realise that seemingly very different species like humans and bonobos share about 98.7% of their genome [4], a staggering result considering how differently bonobos behave from humans.

Just to clarify, if we assume the difference between humans and bonobos to be only 1%, that means that 1% of the more than 3 billion base pairs in the genome are different. Putting it bluntly, that's around 30 million different base pairs. Depending on the length of the proteins, you could encode quite a large number of different proteins with 30 million base pairs. Viruses, on the other hand, have much smaller genomes (made of DNA, or RNA in some cases). So 1% of a typical virus genome of around 30k bp corresponds to 300 bp. To put that into perspective, the membrane glycoprotein gene in COVID-19 is around 700 bp [5]. In our example, a 4% difference is around 1,200 bp, which is more than the length of that entire gene. The bat is therefore a strong candidate, but we can't say for sure. For now, maybe Batman deserves to feel a little bit guilty.
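The back-of-the-envelope arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the genome sizes are the rounded figures used in this article, and the function name `differing_bases` is my own invention for illustration, not part of any library.

```python
def differing_bases(genome_length_bp: int, percent_identity: float) -> int:
    """Approximate number of differing positions for a given percent identity."""
    return round(genome_length_bp * (100.0 - percent_identity) / 100.0)


# Rounded figures taken from the text above (assumptions, not exact genome sizes).
HUMAN_GENOME_BP = 3_000_000_000   # "more than 3 billion base pairs"
VIRUS_GENOME_BP = 30_000          # "around 30k bp"
MEMBRANE_GENE_BP = 700            # membrane glycoprotein gene, "around 700 bp"

human_1pct = differing_bases(HUMAN_GENOME_BP, 99.0)   # 1% of the human genome
virus_1pct = differing_bases(VIRUS_GENOME_BP, 99.0)   # 1% of a typical virus genome
virus_4pct = differing_bases(VIRUS_GENOME_BP, 96.0)   # the roughly 4% gap to RaTG13

print(f"1% of the human genome: {human_1pct:,} bp")
print(f"1% of a 30k bp virus:   {virus_1pct:,} bp")
print(f"4% of a 30k bp virus:   {virus_4pct:,} bp")
print(f"4% gap is larger than the membrane gene: {virus_4pct > MEMBRANE_GENE_BP}")
```

The same function can be fed the exact identity from a BLAST report instead of the rounded 96% if you want a slightly tighter estimate.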

Figure 1: BLAST report showing COVID-19 is 96.12% identical to RaTG13 coronavirus.



How to Compute The Results

I carried out the analysis on raw FASTQ files produced by Illumina machines. FASTQ is the standard output format of Next Generation Sequencing machines. The Illumina platform slices DNA into smaller fragments (around 300 bp) before sequencing. The major advantage is the ability to parallelise sequencing, which reduces its time and cost dramatically. One drawback of this method for the bioinformatician is having to assemble these smaller fragments back into their original form; most of the time this is easy using off-the-shelf mapping tools such as Bowtie, but it can be tedious for longer sequences where repetitive regions occur.
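To give a feel for what a FASTQ file contains, here is a minimal sketch of a reader in plain Python. It assumes the common four-line record layout (an `@` header, the bases, a `+` separator, and a quality string) and Phred+33 quality encoding. The two reads in `EXAMPLE_FASTQ` are made up for illustration, and `parse_fastq` is a hypothetical helper, not the code used in my workflow; for real data you would normally reach for an established library instead.

```python
from io import StringIO
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    qualities: List[int]  # one Phred score per base


def parse_fastq(handle: TextIO, phred_offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream (sketch only)."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a blank line) stops the sketch
            break
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is ignored
        quality_line = handle.readline().strip()
        if not header.startswith("@") or len(sequence) != len(quality_line):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        scores = [ord(char) - phred_offset for char in quality_line]
        yield FastqRecord(header[1:], sequence, scores)


# Two invented reads, far shorter than real Illumina reads.
EXAMPLE_FASTQ = "@read1\nACGT\n+\nIIII\n@read2\nGGCATT\n+\n!!!!!!\n"

records = list(parse_fastq(StringIO(EXAMPLE_FASTQ)))
for record in records:
    mean_quality = sum(record.qualities) / len(record.qualities)
    print(record.read_id, len(record.sequence), mean_quality)
```

Each read carries its own per-base quality scores, which is what lets mapping tools weigh unreliable bases when they place the fragments back onto a reference.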

For more technical detail, visit my GitHub repository on the topic, where I explain how to reproduce the results.



References
