Clouds save lives and restore health

Cloud technologies help save lives


What do clouds, Big Data and oncology have in common? How can cloud technologies help fight serious diseases? It turns out they do have something in common: using cloud technologies in genome sequencing helps doctors assess the severity of a disease and make the right treatment decision.


Cloud technologies have lost their air of mystery and are now an ordinary tool that modern business cannot function without. Of course, not everyone has yet realized all the advantages of the cloud or can always see its benefits. However, cloud solutions have recently found another great application: they help fight serious diseases, cancer in particular. Let us look at the connection between health care and the cloud, and at the prospects of this collaboration.

A few words about genomics

Let’s start with natural sciences. As you know (and you are free to verify it using Google), around fifteen years ago the dream of all geneticists, health care professionals and idealists came true: according to research published by Craig Venter, an American geneticist, the human genome was finally decoded in 2001. Following that, in 2003, the Human Genome Project (HGP), the world’s largest bioinformatics initiative, interpreted the sequence of 92% of nucleotides (as of now, only a small part of the genome, between 4% and 9%, has not been decoded yet). These events opened the way for medical examinations at the level of genetic analysis.


What does it mean? In simple terms, special equipment can tell you whether you are genetically sensitive or, vice versa, intolerant to particular medicines or substances (anesthetics, antibiotics, particular proteins, alkaloids, etc.). You can learn why you have your particular body type and why some diets don’t help you at all while others work magic. Last but not least, you can estimate the probability of developing a hereditary disease based on DNA analysis and choose the best prevention or treatment option. For example, in 2013 the actress Angelina Jolie had her DNA analyzed and, based on the outcome, opted for complex surgery, a bilateral prophylactic mastectomy: a BRCA1 gene mutation was discovered in her genome, which sharply raises the risk of cancer, and her probability of developing a tumor was estimated at 87%. By the way, her example spurred many other people to have their genes sequenced and get specific information about their health, and Angelina became a symbol of being conscious about one’s future.


Predictive medicine (based on predicting the probability of a disease) is the best-known application of genome sequencing; however, it’s not the only one. Sequencing a patient’s genome helps to learn how effective a treatment can be and how medicines will affect the body. In some cases, a person can go through different forms of therapy for many years and get no results at all. This is what happened to Eric Dishman, the head of the ‘All of Us’ Research Program at the US National Institutes of Health (NIH): at the age of 19, he was diagnosed with a rare form of carcinoma and agreed to an experimental treatment. For 20 years, he fought the disease, going from one ineffective treatment to another. During that time, the cost of genome sequencing gradually fell and eventually reached the point where Eric could afford it. He was taken aback by the results: it turned out that 92% of the medicines he had received had no effect on him, and the doctors could not have known or foreseen it until his genome was finally decoded. Today, Eric Dishman is a passionate advocate of sequencing technology: he promotes it all around the world and supports the idea of making this still costly procedure affordable to anyone.

Why is it so pricey?

So, the genome of one human being was decoded in 2001. Despite the keen interest in the new opportunities opened up by sequencing technologies, the number of lucky ones who could ‘dive deep’ into their DNA did not grow exponentially: the second full genome sequencing did not happen until 2007. Steve Jobs, the late co-founder of Apple, decided to have some of his genes sequenced after he was diagnosed with cancer, and he was only the twentieth person in the world whose genome was interpreted. Unfortunately, in his case, decoding the tumor’s DNA and the treatment that followed did not help.


Why did such a great and promising technology not become widespread? On the one hand, there’s a simple explanation: it’s pricey.


Clouds will save lives

Reduction in average cost of genome sequencing; source: https://www.genome.gov/


This diagram shows that the average cost of genome sequencing has plummeted: at the beginning of the 21st century, the price was over 100 million dollars (other sources claim that in 2003 it was around 2.7 billion dollars), and by 2007 it had fallen by an order of magnitude. In 2008, it dropped abruptly to $500K and then kept decreasing exponentially. As of late 2017, you can get your genome decoded for $1K. In other words, ten years ago the cost of DNA decoding was comparable to buying an expensive car (and before that, a private jet), and nowadays it’s comparable to buying an iPhone X. Some optimistic forecasts say that the procedure will become even cheaper.
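To get a feeling for how steep this decline is, here is a rough Python sketch built only on the figures quoted above. The 2007 price of $10 million is an assumption inferred from the phrase "an order of magnitude", and the idea of a constant yearly decline between two milestones is a simplification, not something the diagram itself states.

```python
# Milestones quoted in the text: (year, approximate cost in US dollars).
# The 2007 value is an assumption: "an order of magnitude" below $100 million.
MILESTONES = [
    (2001, 100_000_000),
    (2007, 10_000_000),
    (2008, 500_000),
    (2017, 1_000),
]


def fold_reduction(old_cost, new_cost):
    """How many times cheaper new_cost is compared with old_cost."""
    return old_cost / new_cost


def annual_decline(old_cost, new_cost, years):
    """Constant yearly price drop (as a fraction) that turns old_cost into new_cost."""
    return 1 - (new_cost / old_cost) ** (1 / years)


def summarize(milestones):
    """Return (start_year, end_year, fold, yearly_decline) for each consecutive pair."""
    rows = []
    for (y0, c0), (y1, c1) in zip(milestones, milestones[1:]):
        rows.append((y0, y1, fold_reduction(c0, c1), annual_decline(c0, c1, y1 - y0)))
    return rows


if __name__ == "__main__":
    for y0, y1, fold, rate in summarize(MILESTONES):
        print(f"{y0}-{y1}: {fold:,.0f}x cheaper, about {rate:.0%} per year")
    first, last = MILESTONES[0], MILESTONES[-1]
    print(f"Overall: {fold_reduction(first[1], last[1]):,.0f}x cheaper")
```

With these inputs, the overall drop from $100 million to $1K is a 100,000-fold reduction; the per-period rates are only as good as the rounded milestone prices they are computed from.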


On the other hand, the dramatic decrease in price is just one side of the coin. The time needed for sequencing has dropped, too: initially, the process was very slow because a huge set of data had to be analyzed, and decoding took years. As mentioned above, the second full genome was not sequenced until 2007; it belonged to James Watson, one of the discoverers of the DNA structure. In 2013, in a big and exciting article about genomics, Carole Cadwalladr, an English journalist and writer, shared her personal experience: she had to wait almost a year for the results of her DNA interpretation. Now even less time is required: just a couple of days.


Lots of startups and large corporations are now competing to create the fastest, most cost-effective and most affordable sequencer. What’s the reason behind the price reduction and the increase in the speed of DNA analysis? Here comes the most interesting part: the cloud and Big Data analysis tools lay the groundwork for it. Thanks to integrated approaches to artificial intelligence, big data and cloud computing, along with cutting-edge DNA analysis methodologies, genome sequencing can now be completed at lower cost and faster than ever before.


Big Data help save lives

These Data are so big…

Big data are big indeed. Really huge:

  • Each human genome has about 3 billion base pairs.
  • The results of decoding one genome can take from 200 gigabytes up to 0.5 terabytes.
  • The Cancer Genome Atlas contains around 5 petabytes of data: the results of studying matched samples from 14.5 thousand cases.
  • BRCA Exchange, the largest database of BRCA1 and BRCA2 gene mutations (which are linked to breast and prostate cancer, among others), contains 17 800 gene variants.
  • During a DNA sequencing symposium held in 2013 by Illumina, the unrivaled leader in sequencer production, some intriguing statistics were presented to demonstrate the results of Big Data analysis of the human genome: by that time, the company had discovered 1 600 pathogenic genes in 47 people who attended the event and had undergone sequencing, with 1 221 variations of those genes. The company then evaluated 23 144 possible scenarios of gene behavior under preset conditions and concluded that 65 of them are likely to be pathological and may contribute to a disease.
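A back-of-the-envelope Python sketch helps to see what these sizes add up to. It uses only the per-genome range (200 gigabytes to 0.5 terabytes) and the 5-petabyte figure from the list above; the study size of 100,000 genomes is a hypothetical input, and decimal units (1 PB = 10^15 bytes) are assumed.

```python
# Decimal storage units (an assumption; binary units would shift the results slightly).
GB = 10**9
TB = 10**12
PB = 10**15

# Per-genome result size quoted in the list above.
MIN_GENOME_BYTES = 200 * GB
MAX_GENOME_BYTES = TB // 2  # 0.5 terabytes


def storage_range_bytes(genome_count):
    """Lower and upper bound of storage needed for genome_count decoded genomes."""
    return genome_count * MIN_GENOME_BYTES, genome_count * MAX_GENOME_BYTES


def genomes_that_fit(capacity_bytes):
    """How many genomes fit into capacity_bytes at the largest and smallest size."""
    return capacity_bytes // MAX_GENOME_BYTES, capacity_bytes // MIN_GENOME_BYTES


if __name__ == "__main__":
    low, high = storage_range_bytes(100_000)  # hypothetical study size
    print(f"100,000 genomes: {low / PB:.0f} to {high / PB:.0f} PB")
    fewest, most = genomes_that_fit(5 * PB)
    print(f"5 PB holds between {fewest:,} and {most:,} genomes")
```

Under these assumptions, a study of 100,000 genomes would need somewhere between 20 and 50 petabytes, several times the volume quoted for The Cancer Genome Atlas.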


The scale of these numbers makes you wonder what tools can process such data. Obviously, the data needs storage that is reliable and secure, fully protected from confidentiality threats and available for data exchange within an analytical group. 17 years ago, it was technically impossible to provide such computing power, a data storage system of such volume, adequate IT security mechanisms or the capacity to carry such an amount of traffic. And even a top-notch technology stack of that time would cost as much as a private jet. Or maybe even a private airport, including but not limited to the runways, the towers, the hangars, the weather station and all the other infrastructure.


Indeed, a local analysis, even if it’s powered by a supercomputer in a research facility, would take a lot of time and would be too costly.


By moving the data to the cloud, you gain a number of advantages:

  • Reliable storage for project data;
  • Faster processing of an individual genome and its comparison with the statistics;
  • Flexible configuration (RAM plays a great role in the sequencing process, while the number of processor cores doesn’t make any difference);
  • Computing power is available exactly when it’s needed (unlike a supercomputer at a research facility, where scientists have to wait their turn);
  • Project team members can collaborate even on an international scale;
  • The project is accessible 24/7 from anywhere in the world.


Another advantage is that with the cloud you don’t have to buy any equipment and keep it up to date. Active use of PaaS, IaaS and SaaS helps you scale the project in whatever direction you need at a particular moment. For example, if you need to process statistical data on mutations in certain genes and calculate the probability of pathological behavior, you will require special software, which can be easily integrated into the cloud.

Where are the clouds heading?

The cloud makes it much easier to process the database of decoded genomes. The bigger this database gets, the better researchers can study all the factors that influence the origin of diseases and develop new methods of prevention and treatment. According to Jay T. Flatley, CEO of Illumina, in order to succeed in studying a complex disease such as cancer, one needs to analyze hundreds of thousands of genomes, and modern technologies make it possible.


It’s no longer science fiction: that’s the reality we live in, and that’s how many companies already work. AWS has even developed functionality specifically for teams working in genomics (say, the Broad Institute in Cambridge and the Institute for Genomic Medicine of the Seoul National Medical University), and Amazon keeps working in this direction, building proprietary solutions based on its experience with similar projects. It is even possible to fully or partially integrate projects hosted in different clouds.


Before long, the use of clouds, Big Data, artificial intelligence and genetic analysis will lead to personalized medicine, in which most diseases will be prevented before they even begin. Apart from genome research, this technology holds huge potential for other fields such as bioinformatics, biochemistry, pharmacology, biotechnology, etc.


More good news came from Illumina almost a year ago, in January 2017, at a biotechnology conference in San Francisco: the company said that its new line of sequencers, NovaSeq, would help cut the cost of genome decoding tenfold, from 1000 dollars to just 100 dollars per procedure, within the next decade. Unlike their predecessors, the new sequencers take only 8 steps instead of 38 to perform an analysis. Besides, they let you verify that the data upload is going correctly, and all in all the procedure will run 6 times faster.


So, in 10 years genome sequencing will no longer be a matter of financing and will depend only on how conscious we are about our health. After all, taking care of ourselves is our personal responsibility.


Even now, we can say that cloud technologies do make the world a better place and do improve the quality of life. It’s even safe to say that cloud technologies help to save lives and take care of humanity. This might be the best example of what the technology is capable of.





Author: Alisa Kandeeva
