Computational Methods to Study Tandem Repeats in Human Genome and Complex Diseases

Author: Mehrdad Bakhtiari
Publisher:
Total Pages: 152
Release: 2021
Genre:
ISBN:

A central goal in genomics is to identify genetic variations and their impact on underlying molecular changes that lead to disease. With the advances in whole genome sequencing, many studies have been able to identify thousands of genetic loci associated with human traits. These studies mainly focus on single-nucleotide variants (SNVs) and novel insertions and deletions in the genome, while ignoring more complex variants. Here, I consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6-100 bp) repeating units that span 3% of the human genome. While some VNTRs are known to play a role in complex disorders (e.g., Alzheimer's, myoclonus epilepsy, and diabetes), the majority of them have not been studied well due to the computational difficulty of genotyping VNTRs on a large scale. Here, I will present our progress on developing efficient computational algorithms to profile VNTRs from high-throughput sequencing data and identify possible variations within them. I applied our method to generate the largest catalog of VNTR genotypes to date, which provides insights into the landscape of VNTR variations in different populations. I show the contribution of tandem repeats in mediating expression levels of key genes with known associations to neurological disorders and familial cancers, and argue for the causality of this relation. Finally, I will describe our efforts to directly understand the impact of these variations on human phenotypes, which improves our understanding of the genetic architecture of complex diseases.

Computational Methods in Genome Research

Author: Sándor Suhai
Publisher: Springer Science & Business Media
Total Pages: 230
Release: 2012-12-06
Genre: Science
ISBN: 1461524512

The application of computational methods to solve scientific and practical problems in genome research created a new interdisciplinary area that transcends boundaries traditionally separating genetics, biology, mathematics, physics, and computer science. Computers have been, of course, intensively used for many years in the field of life sciences, even before genome research started, to store and analyze DNA or protein sequences, to explore and model the three-dimensional structure, the dynamics and the function of biopolymers, to compute genetic linkage or evolutionary processes, etc. The rapid development of new molecular and genetic technologies, combined with ambitious goals to explore the structure and function of genomes of higher organisms, has generated, however, not only a huge and burgeoning body of data but also a new class of scientific questions. The nature and complexity of these questions will require, beyond establishing a new kind of alliance between experimental and theoretical disciplines, also the development of new generations of both computer software and hardware technologies. New theoretical procedures, combined with powerful computational facilities, will substantially extend the horizon of problems that genome research can attack with success. Many of us still feel that computational models rationalizing experimental findings in genome research fulfil their promises more slowly than desired. There is also an uncertainty concerning the real position of a 'theoretical genome research' in the network of established disciplines integrating their efforts in this field.

Tandem Repeats in Genes, Proteins, and Disease

Author: Danny M. Hatters
Publisher: Humana
Total Pages: 0
Release: 2016-08-23
Genre: Medical
ISBN: 9781493963003

The genomes of humans, as well as many other species, are interspersed with hundreds of thousands of tandem repeats of DNA sequences. Those tandem repeats located as codons within open reading frames encode amino acid runs, such as polyglutamine and polyalanine. Tandem repeats have been implicated not only in biological evolution, development and function but also in a large collection of human disorders. In Tandem Repeats in Genes, Proteins, and Disease: Methods and Protocols, expert researchers in the field detail many methods covering the analysis of tandem repeats in DNA, RNA and protein, in healthy and diseased states. These include molecular genetics, molecular biology, biochemistry, proteomics, biophysics, cell biology, and molecular and cellular approaches to animal models of tandem repeat disorders. Written in the highly successful Methods in Molecular Biology™ series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and key tips on troubleshooting and avoiding known pitfalls. Authoritative and practical, Tandem Repeats in Genes, Proteins, and Disease: Methods and Protocols aids scientists in continuing to study the unique methodological challenges that come from repetitive DNA and poly-amino acid sequences.

Theoretical and Computational Methods in Genome Research

Author: Sándor Suhai
Publisher: Springer Science & Business Media
Total Pages: 352
Release: 1997
Genre: Medical
ISBN:

Contains plenary lectures presented at the March 1996 International Symposium on Theoretical and Computational Genome Research, held in Heidelberg, Germany. Topics include the feasibility of whole human genome sequencing, analysis of gene functions by the metabolic pathway database, error analysis o

A Computational Approach for Diagnostic Long-read Genome Sequencing

Author: Esko Kautto
Publisher:
Total Pages: 0
Release: 2022
Genre: Cancer
ISBN:

Our understanding of the human genome has greatly expanded since the completion of the Human Genome Project. Many large-scale landmark studies have since looked at the role genetic alterations play in the predisposition to disease and identified countless disease-causing mutations. While most genomics-based research has been made possible through the commoditization of massively parallel next-generation sequencing, recent advances in sequencing technologies have allowed long-read single-molecule sequencing to further characterize and identify genetic alterations that were previously challenging to detect through conventional sequencing. In this research, we have used accurate long-read sequencing from Pacific Biosciences to study cancer and non-cancer samples alike to identify and characterize disease-associated genetic alterations. The work has involved the development of computational methods for streamlining analysis of such data to provide high-confidence structural variant calls. The analysis pipeline and tools have been used to accurately identify causative mutations in pediatric cancer cases, to discover an internal tandem duplication in the HOXD13 gene that caused syndactyly in two unrelated families, and to expand the role that activating FGFR1 mutations may play in closed spinal dysraphism.

Computational Methods for Disease Diagnosis and Understanding the Genetics of Complex Traits

Author: Lisa Gai
Publisher:
Total Pages: 99
Release: 2021
Genre:
ISBN:

An ever-increasing wealth of biological data has become available in recent years, and with it, the potential to understand complex traits and extract disease-relevant information from these many forms of data through computational methods. Understanding the genetic architecture behind complex traits can help us understand disease risk and adverse drug reactions, and guide the development of treatment strategies. Many variants identified by genome-wide association studies (GWAS) have been found to affect multiple traits, either directly or through shared pathways. Analyzing multiple traits at once can increase power to detect shared variant effects from publicly available GWAS summary statistics. Use of multiple traits may also improve accuracy when estimating variant effects, which can be used in polygenic scores to stratify individuals by disease risk. This dissertation presents a method, CONFIT, for combining GWAS in multiple traits for variant discovery, and explores a few potential multi-trait methods for estimating polygenic scores. Computational methods can also be used to identify patients already suffering from disease who would benefit from treatment. Towards this end, this dissertation also presents work on deep learning to detect patients with orbital disease from image data with high accuracy and recall.

Computational Methods to Study Genomic Structure and Structural Variation

Author: Viraj Balkrishna Deshpande
Publisher:
Total Pages: 128
Release: 2017
Genre:
ISBN:

Chromosomes, the carriers of genes, were first observed in plant cells in 1842. Visual inspection of chromosomes via cytogenetics laid the foundation for understanding the structure and content of chromosomes. Eventually, Watson and Crick's discovery of the structure of DNA, the building block of chromosomes, paved the way for genomics and the DNA sequencing revolution. As we stand on the cusp of scaling genomics to personalized analysis and to the broad diversity of species, the new generation of scientific discoveries relies heavily on computational analysis of complex datasets. This thesis highlights computational methods that we developed for interpreting various data modalities to elucidate the large-scale structure of the genome. We describe two tools, Cerulean and AmpliconArchitect (AA), which aim to interpret sequencing data in different contexts to find an unknown genomic structure. The crux of these tools is the representation of the genomic structure as a graph which encodes connectivity of genomic segments, followed by delineation of the graph into ordered genomic segments. Cerulean performs hybrid de novo assembly of a novel genome by combining accurate, short sequencing reads with erroneous, long reads which can span longer distances along the genome. AA focuses on a specific feature of cancer genomes called focal amplifications, or regions with a high increase in copy number. These often contain cancer-causing oncogenes and undergo complex rearrangements. AA simultaneously uses short sequencing reads from the cancer genome and information from the human reference genome to predict the structure of the focal amplification. We applied AA to comprehensively characterize the nature of focal amplifications across human cancer. We combined sequence analysis of AA with extensive computational analysis of cytogenetic images of cancer cells. Surprisingly, we found that focal amplification occurs through the formation of circular extrachromosomal DNA (ecDNA) structures breaking off from the human chromosomes in as many as 40% of all cancer cases. Through theoretical modeling we showed that formation of ecDNA drastically accelerates tumor growth and evolution, facilitating rapid development of resistance to targeted drugs. This defines a new paradigm in our understanding of cancer and cancer treatment.

Analysis of Complex Disease Association Studies

Author: Eleftheria Zeggini
Publisher: Academic Press
Total Pages: 353
Release: 2010-11-17
Genre: Medical
ISBN: 0123751438

According to the National Institutes of Health, a genome-wide association study is defined as any study of genetic variation across the entire human genome that is designed to identify genetic associations with observable traits (such as blood pressure or weight), or the presence or absence of a disease or condition. Whole genome information, when combined with clinical and other phenotype data, offers the potential for increased understanding of basic biological processes affecting human health, improvement in the prediction of disease and patient care, and ultimately the realization of the promise of personalized medicine. In addition, rapid advances in understanding the patterns of human genetic variation and maturing high-throughput, cost-effective methods for genotyping are providing powerful research tools for identifying genetic variants that contribute to health and disease. This burgeoning science merges the principles of statistics and genetics to make sense of the vast amounts of information available with the mapping of genomes. In order to make the most of the information available, statistical tools must be tailored and translated for the analytical issues that are specific to large-scale association studies. Analysis of Complex Disease Association Studies will provide researchers who have advanced biological knowledge and are entering the field of genome-wide association studies with the groundwork to apply statistical analysis tools appropriately and effectively. With the use of consistent examples throughout the work, chapters will provide readers with best practice for getting started (design), analyzing, and interpreting data according to their research interests. Frequently used tests will be highlighted, and a critical analysis of the advantages and disadvantages of each, complemented by case studies, will provide readers with the information they need to make the right choice for their research. Additional tools, including links to analysis tools, tutorials, and references, will be available electronically to ensure the latest information is available.
Easy access to key information, including advantages and disadvantages of tests for particular applications, identification of databases, languages and their capabilities, data management risks, and frequently used tests.
Extensive list of references, including links to tutorial websites.
Case studies and tips and tricks.

Introduction to Computational Genomics

Author: Nello Cristianini
Publisher: Cambridge University Press
Total Pages: 200
Release: 2006-12-14
Genre: Computers
ISBN: 9780521856034

Where did SARS come from? Have we inherited genes from Neanderthals? How do plants use their internal clock? The genomic revolution in biology enables us to answer such questions. But the revolution would have been impossible without the support of powerful computational and statistical methods that enable us to exploit genomic data. Many universities are introducing courses to train the next generation of bioinformaticians: biologists fluent in mathematics and computer science, and data analysts familiar with biology. This readable and entertaining book, based on successful taught courses, provides a roadmap to navigate entry to this field. It guides the reader through key achievements of bioinformatics, using a hands-on approach. Statistical sequence analysis, sequence alignment, hidden Markov models, gene and motif finding, and more are introduced in a rigorous yet accessible way. A companion website provides the reader with Matlab-related software tools for reproducing the steps demonstrated in the book.

Neurogenetics, Part I

Author:
Publisher: Elsevier
Total Pages: 438
Release: 2018-01-08
Genre: Medical
ISBN: 0444632352

Genetic methodologies are having a significant impact on the study of neurological and psychiatric disorders. Using genetic science, researchers have identified over 200 genes that cause or contribute to neurological disorders. The effort to define the relationship between genes and neurological and psychiatric disorders is evolving rapidly and is expected to grow in scope as more disorders are linked to specific genetic markers. Part I covers basic genetic concepts and recurring biological themes, and begins the discussion of movement disorders and neurodevelopmental disorders, leading the way for Part II to cover a combination of neurological, neuromuscular, cerebrovascular, and psychiatric disorders. This volume in the Handbook of Clinical Neurology will provide a comprehensive introduction and reference on neurogenetics for the clinical practitioner and the research neurologist.
Presents comprehensive coverage of neurogenetics.
Details the latest science and its impact on our understanding of neurological and psychiatric disorders.
Provides a focused reference for clinical practitioners and the neuroscience/neurogenetics research community.