Human genome sequencing is one of the greatest endeavors of biology. Genome sequencing refers to sequencing the entire genome of an organism. Because of the efforts of publicly funded human genome projects, the sequence is freely available to the public, which has contributed to significant discoveries worldwide. Read here to learn more about whole genome sequencing.
Recently, researchers at the Indian Institute of Science Education and Research (IISER) Bhopal have carried out whole genome sequencing of banyan (Ficus benghalensis) and peepal (Ficus religiosa) from leaf tissue samples.
They also undertook a comprehensive genome-wide phylogenetic analysis with 50 other angiosperm plant species, including four other sequenced Ficus species.
What is a genome?
The genome is the entire set of DNA instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes located in the cell’s nucleus, as well as a small chromosome in the cell’s mitochondria.
A genome contains all the information needed for an individual to develop and function.
All of the DNA of an organism is called its genome. Some genomes are very small, such as those of viruses and bacteria, whereas others are strikingly large, such as those of some plants.
- The human genome contains about 3 billion nucleotides.
- The rare Japanese flower called Paris japonica has a genome size of roughly 150 billion nucleotides, making it 50 times the size of the human genome.
Interesting facts about the genome
If printed out, the 3.2 billion letters in the human genome would:
- Fill a stack of paperback books 200 feet (61 m) high
- Fill 200 500-page telephone directories
- Take a century to recite, if we recited one letter per second for 24 hours a day
- Extend 3,000 km (1,864 miles), which is about the distance from London to the Canary Islands, from Washington to Guatemala, or from New Delhi to Hanoi.
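The recitation and distance figures above can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the average year length of 365.25 days and the kilometre-to-mile factor are assumptions added here, not figures from the article.

```python
# Back-of-envelope check of the figures quoted in the list above.
GENOME_LETTERS = 3_200_000_000             # letters in the human genome, as quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: average year, leap years included
KM_TO_MILES = 0.621371                     # assumption: standard conversion factor

# One letter per second, 24 hours a day, with no breaks.
years_to_recite = GENOME_LETTERS / SECONDS_PER_YEAR
miles = 3_000 * KM_TO_MILES

print(f"Reciting time: {years_to_recite:.1f} years")   # Reciting time: 101.4 years
print(f"3,000 km is about {miles:,.0f} miles")         # 3,000 km is about 1,864 miles
```

The result of roughly 101 years agrees with the "century to recite" claim.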
Genome vs gene
- A gene is a part of the DNA while a genome is the total DNA in a cell.
- A gene is the hereditary unit of genetic information, while the genome is the complete set of an organism's DNA (in humans, nuclear DNA plus mitochondrial DNA).
- A gene typically encodes a protein or a functional RNA, whereas the genome contains both the genes and the regulatory elements that control their expression.
- A gene is typically a few hundred to many thousands of bases long, while the genome of a higher organism runs to billions of base pairs.
- Variation within genes is subject to natural selection, while horizontal gene transfer and duplication cause variation at the level of the genome.
What is whole genome sequencing (WGS)?
All organisms (microorganisms, plants, mammals) have a unique genetic sequence, or genome, that is composed of four nucleotide bases: adenine, thymine, cytosine, and guanine (A, T, C, and G).
Determining the order of these bases is called sequencing. Once you know the sequence of the bases in an organism, you have identified its unique DNA fingerprint or pattern.
Whole genome sequencing is a laboratory procedure that determines the order of bases across the entire genome of an organism in a single process.
This includes sequencing all of an organism’s chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast.
- Whole genome sequencing should not be confused with DNA profiling, which only establishes the probability that genetic material originated from a specific person or group and does not provide details on genetic relationships, the origin of the genetic material, or a person’s susceptibility to particular diseases.
Whole genome sequencing has been used primarily as a research tool, although it was first made available to clinics in 2014.
Whole genome sequence data may be a crucial tool in the future of personalized medicine to direct therapeutic intervention.
How does whole genome sequencing work?
Scientists conduct whole genome sequencing by following four main steps. The steps are described here for bacterial samples, as in an outbreak investigation:
- DNA shearing: Scientists begin by using molecular scissors to cut the DNA, which is composed of millions of bases (A, C, T, and G), into pieces that are small enough for the sequencing machine to read.
- DNA barcoding: Scientists add small pieces of DNA tags, or bar codes, to identify which piece of sheared DNA belongs to which bacterium. This is similar to how a bar code identifies a product at a grocery store.
- DNA sequencing: The bar-coded DNA from multiple bacteria is combined and put in a DNA sequencer. The sequencer identifies the bases (A, C, T, and G) that make up each bacterial sequence, and it uses the bar code to keep track of which bases belong to which bacterium.
- Data analysis: Scientists use computer analysis tools to compare sequences from multiple bacteria and identify differences. The number of differences tells the scientists how closely related the bacteria are, and how likely it is that they are part of the same outbreak.
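The barcoding and data analysis steps above can be illustrated with a toy example. The sketch below is a simplification, not a real sequencing pipeline: the barcodes, isolate names, and reads are hypothetical, and a single short read stands in for each bacterium's whole sequence. It sorts pooled reads by barcode and then counts the positions at which each pair of sequences differs.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical barcodes: a 4-base tag identifies the bacterial isolate.
BARCODES = {"AAAA": "isolate_1", "CCCC": "isolate_2", "GGGG": "isolate_3"}

# Hypothetical pooled reads: barcode followed by the sheared DNA fragment.
pooled_reads = [
    "AAAA" + "ACGTACGTAC",
    "CCCC" + "ACGTACGTAC",
    "GGGG" + "ACGTTCGAAC",
]

def demultiplex(reads, barcodes, barcode_len=4):
    """Group pooled reads by isolate, using the barcode at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        tag, fragment = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is not None:          # reads with an unknown barcode are dropped
            by_sample[sample].append(fragment)
    return dict(by_sample)

def count_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

samples = demultiplex(pooled_reads, BARCODES)

# Simplification: compare the first read of each isolate pair by pair.
differences = {
    (s1, s2): count_differences(samples[s1][0], samples[s2][0])
    for s1, s2 in combinations(sorted(samples), 2)
}

for pair, n in differences.items():
    print(pair, n)
```

In this toy data, isolate_1 and isolate_2 are identical (0 differences) while isolate_3 differs from both at 2 positions, so the first two would be judged more closely related.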
Advantages of Whole-Genome Sequencing
- Human whole genome sequencing (WGS) is starting to usher in a new era of personalized medicine to improve public health. By allowing the entire genome of a person to be sequenced, every gene can be turned into digital data for analysis.
- Provides a high-resolution view of the genome that identifies probable causal variants for follow-up investigations of gene expression and regulation mechanisms.
- Provides a base-by-base scan of the genome, capturing both large and small variants that targeted techniques would overlook.
- Delivers substantial amounts of data quickly to facilitate the assembly of novel genomes.
- The ability to learn about the effectiveness of medications or the negative effects of drug use is another benefit of genome sequencing. Pharmacogenomics is the study of how medications interact with the genome.
- With advances in bioinformatics, next-generation sequencing findings are starting to guide treatments for common genetic conditions such as cancers (colorectal cancer and melanoma) and are also being used to determine which medications are safe (and which are not) on a person-by-person basis.
- WGS provides 3,000 times more genetic information because it provides data on all six billion base pairs of the diploid human genome (two copies of the roughly 3 billion base pairs noted earlier).
- WGS also offers a better determination of copy number variations, rearrangements, and other structural variations because it takes advantage of paired-end reads longer than 2×100 bases. The result is a much more thorough genotype and phenotype analysis that can offer significantly more predictive potential for rare diseases and genomic medicine.
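The notation "2×100" in the last point refers to paired-end sequencing, in which 100 bases are read from each end of a DNA fragment. A standard rule of thumb relates read length to average sequencing depth: depth = (number of read pairs × 2 × read length) / genome size. The sketch below applies it to a genome of 3 billion bases; the 30x target depth is an illustrative assumption, not a figure from this article.

```python
import math

def read_pairs_needed(genome_size_bp, read_length_bp, target_depth):
    """Read pairs required so that total sequenced bases equal depth x genome size."""
    bases_per_pair = 2 * read_length_bp      # two reads per fragment
    return math.ceil(target_depth * genome_size_bp / bases_per_pair)

def mean_depth(genome_size_bp, read_length_bp, n_pairs):
    """Average number of times each base is covered by a read."""
    return n_pairs * 2 * read_length_bp / genome_size_bp

GENOME_SIZE = 3_000_000_000   # approximate human genome size quoted earlier
READ_LENGTH = 100             # the "100" in 2x100
TARGET_DEPTH = 30             # assumption: illustrative target depth

pairs = read_pairs_needed(GENOME_SIZE, READ_LENGTH, TARGET_DEPTH)
print(f"{pairs:,} read pairs")                               # 450,000,000 read pairs
print(f"{mean_depth(GENOME_SIZE, READ_LENGTH, pairs)}x")     # 30.0x
```

Longer reads lower the number of pairs needed for the same depth, and each read spans more of the genome, which is part of why they help with structural variation.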
Limitations of genome sequencing
- Mutations that lead to diseases may be mistaken for normal genes, giving false negative results.
- Conversely, a gene could be misread as a mutation that is expected to lead to an adverse condition, whereas in reality, the person is not harboring such a mutation in their genome (a false positive).
- The majority of the technologies currently in use have limited capabilities for identifying so-called structural variations, even though DNA sequencing technologies are quite accurate in deciphering the sequence.
- Duplications, deletions, and inversions are examples of changes that affect large stretches of DNA at once. Such structural variants can have a significant impact on health, but because current sequencing technologies are limited in detecting them, they may go unreported or be difficult to interpret for the person being tested.
- Genome sequencing has the potential to lower morbidity and mortality by allowing for the early detection of risk factors for a variety of medical problems and by assisting in the selection of an appropriate treatment plan, but it is not a complete screening for all diseases that might exist.
- Currently, there is no gold standard against which the performance of population genomic screening can be judged. Whether the application of next-generation sequencing will reduce mortality and morbidity in the population the way it has been demonstrated in individual studies is yet to be established.
- While many countries have implemented laws that ban discrimination based on genetic information, often such laws do not protect an individual in the case of a life insurance policy. It is natural for new technology to be accepted slowly, and the accurate interpretation of new laws can take time.
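The false negatives and false positives described in the first two limitations are commonly summarized as sensitivity and specificity. The sketch below shows how these rates are computed; the counts are hypothetical and chosen only to make the arithmetic easy to follow, not taken from any real sequencing study.

```python
def sensitivity(true_pos, false_neg):
    """Share of real disease-causing variants that the test detects."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of normal sites that the test correctly reports as normal."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for illustration only.
true_pos, false_neg = 95, 5     # real variants: detected vs missed (false negatives)
true_neg, false_pos = 990, 10   # normal sites: correct vs misread as mutations (false positives)

print(f"Sensitivity: {sensitivity(true_pos, false_neg):.2f}")
print(f"Specificity: {specificity(true_neg, false_pos):.2f}")
```

With these made-up counts, 5 of 100 real variants are missed and 10 of 1,000 normal sites are flagged in error, which gives a sensitivity of 0.95 and a specificity of 0.99.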
Previous year question
Q1. With reference to agriculture in India, how can the technique of ‘genome sequencing’, often seen in the news, be used in the immediate future? (2017)
1. Genome sequencing can be used to identify genetic markers for disease resistance and drought tolerance in various crop plants.
2. This technique helps in reducing the time required to develop new varieties of crop plants.
3. It can be used to decipher the host-pathogen relationships in crops.
Select the correct answer using the code given below:
(a) 1 only
(b) 2 and 3 only
(c) 1 and 3 only
(d) 1, 2 and 3
-Article written by Swathi Satish