DESTOBIO 2000, August 23-27, 2000, West Lafayette, Indiana, USA
Much of the biological and demographic history of living beings is coded in their DNA and proteins. The underlying processes, such as mutation, recombination, genetic drift and population change, involve both deterministic and random components. A variety of mathematical and statistical methods are used to model these processes and estimate their respective influences. These methods include random processes, graph theory, simulation, statistical inference and dynamical systems. The results are of interest not only for evolutionary biology but also for the mapping of genetic diseases, the development of pharmaceuticals and other branches of biomedicine.
LIST OF SPEAKERS (in alphabetical order)
Adam Bobrowski, Department of Mathematics, University of Houston, Houston, TX: "Markov Chain Approach to Evolution of Repeat-DNA Sequences"
ABSTRACT: We consider a model of evolution of a DNA-repeat sequence, which takes into account stepwise extension/contraction events occurring with intensities proportional to the sequence length and the possibility of catastrophic breakdowns. The model was originally introduced by Kruglyak et al. (Proc. Natl. Acad. Sci. USA, 1998, 95: 10774-10778), in a slightly different form, and used for between-species comparisons of short DNA-repeat data. In the present paper, we investigate the dynamics of absorption in the model, which is equivalent to the elimination of the DNA-repeat sequence. Using the theory of Markov chains and stochastic semigroups, we obtain explicit expressions for the distributions of the process and for the time to absorption.
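The kind of process described in this abstract can be illustrated with a small simulation. The following is a minimal sketch only, not the authors' model or their analytical solution: it assumes extension at rate `ext * n` and contraction at rate `con * n` for a repeat of length `n`, and a breakdown that sends the length straight to the absorbing state 0 at a constant rate `brk`. The function name, the parameter names and the constant breakdown rate are all assumptions made for illustration.

```python
import random

def simulate_absorption_time(n0, ext, con, brk, rng, t_max=1e6):
    """Simulate one path of the repeat length; return the time it first hits 0.

    Assumed (hypothetical) rates for a repeat of length n:
      extension   n -> n + 1  at rate ext * n
      contraction n -> n - 1  at rate con * n
      breakdown   n -> 0      at rate brk (assumed constant, must be > 0)
    The simulation is cut off at t_max.
    """
    n, t = n0, 0.0
    while n > 0 and t < t_max:
        up, down = ext * n, con * n
        total = up + down + brk
        t += rng.expovariate(total)      # waiting time to the next event
        u = rng.random() * total         # choose which event occurs
        if u < up:
            n += 1
        elif u < up + down:
            n -= 1
        else:
            n = 0                        # catastrophic breakdown
    return min(t, t_max)

if __name__ == "__main__":
    rng = random.Random(2000)
    times = [simulate_absorption_time(10, 1.0, 1.0, 0.01, rng) for _ in range(1000)]
    print("mean simulated time to absorption:", sum(times) / len(times))
```

Simulated absorption times of this sort could be compared against explicit expressions for the time to absorption, under whatever rate assumptions the actual model uses.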
Ranajit Chakraborty, Human Genetics Center, University of Texas at Houston, Houston, TX: "Linkage Disequilibrium: Concept, Utility and Evolutionary Dynamics in the Context of the Human Genome Variation"
ABSTRACT:
Linkage disequilibrium (LD), better called gametic phase disequilibrium, is a concept that describes the association of alleles across two or more loci. Thus, since allelic states are discrete characteristics of genetic variation at individual sites of the
genome, in a statistical sense, any measure of association for multidimensional categorical data can serve the purpose of defining LD. However, in population genetics, some specific measures of LD gained popularity because of the purpose for
which such measures are used. Historically, the initial use of LD was intended to measure the genetic proximity of loci (i.e., the stronger the LD, the closer the loci are expected to be on a chromosome). However, it was soon discovered that
recombination during meiosis impacts the dynamics of LD, erasing the signature of linkage over time. Since most measures of LD are gene frequency dependent, other factors were also shown to affect LD. These include the genetic substructure
within a population, natural selection, as well as genetic drift effect due to the finite size of a population. Traditionally, the concept of LD was defined for scenarios in which both loci would exhibit only two different alleles present in a population.
When molecular techniques were developed that could detect multiple segregating alleles at each site, the need was realized for developing measures of LD that could encompass more than two segregating alleles per locus. As a consequence, the
role of mutation (model as well as rate) on the dynamics of LD became a subject of intensive investigation. Since the strength of LD is dictated by a combined effect of all these factors, it is virtually impossible to isolate the principal cause of LD in
any observed data without a carefully conducted genetic breeding experiment.
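For the traditional two-allele case mentioned above, the most widely used LD measures can be written in a few lines. The sketch below uses the standard textbook definitions of D, D' and r² for two biallelic loci, together with the classical result that, under random mating and with no other forces acting, D decays by a factor (1 - c) per generation, where c is the recombination fraction. The function names are chosen here for illustration and do not come from the abstract.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    Both loci are assumed polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable for these allele frequencies, given the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

def ld_after(d0, c, generations):
    """D after a number of generations of random mating with recombination fraction c."""
    return d0 * (1 - c) ** generations

if __name__ == "__main__":
    print(ld_measures(0.4, 0.5, 0.5))
    print(ld_after(0.15, 0.01, 100))
```

The decay formula isolates recombination only; substructure, selection, drift and mutation, which the abstract stresses, are deliberately left out of this sketch.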
In the context of the human genome studies, the resurgence of importance of LD arose mainly due to the need for fine mapping of genes, when direct evidence of genetic recombination between closely spaced sites became difficult to find in family
data. Since past recombination events dictate the strength of LD in an extant population, LD served as the population genetic rationale for positional cloning of genes. The localization and eventual cloning of the Cystic Fibrosis gene is a classic example
of the success of the LD-approach of gene mapping. More recent studies indicate that the LD-approach of gene mapping may also be a fruitful method to uncover genes underlying complex phenotypes, particularly when populations of known
admixture history are utilized in the investigation.
It is expected that with the completion of the Human Genome Project we will soon have detailed information on the physical locations of the polymorphic sites (such as the Single Nucleotide Polymorphism (SNP) sites, microsatellites,
insertion/deletion sites, etc.) evolving under different mutation mechanisms. This will offer a valuable resource to examine the rate and variability of recombination in different regions of the genome, and how natural selection shapes the LD between
loci, with effects of population substructure and demographic history incorporated in the analysis. Such data should also circumvent the use of overly simplistic analytical models of the dynamics of LD that have been used in past or current
studies.
Elise Eller, Department of Anthropology, University of Utah, Salt Lake City, UT: "Population Extinction and Recolonization in Human Demographic History"
ABSTRACT:
For nearly two decades, anthropologists have debated the origins of
anatomically modern humans. There are two primary opposing models. The
multiregional model posits regional continuity among subpopulations from Homo
erectus times to the present. The recent African origin model hypothesizes
that anatomically modern humans arose in Africa approximately 100,000 years ago,
later expanded out of Africa, and replaced the indigenous H. erectus
populations. Initially, the debate focused on paleoanthropological and
archaeological evidence, but in the past decade genetic evidence has been added
to the discussion. A particularly strong argument, from a population genetics
perspective, in favor of the recent African model is the small effective
population size of 10,000 suggested by data from a variety of genetic systems.
Effective size often is equated with the number of breeding individuals, and
it has been argued that 10,000 breeding individuals could not have continuously
occupied much of the Old World for most of the Pleistocene yet remained a
cohesive species by means of gene flow.
However, this argument ignores the effects of population extinction and
recolonization, which increase the variance among demes and reduce effective
population size. Using models developed for population extinction and
recolonization and estimating parameter values from ethnographic and
archaeological sources, I show that an effective population size of 10,000 can
be consistent with a large census size required by the multiregional model. If
the extinction rate is relatively large, the migration rate is low, and the
colonization process incorporates a small number of colonists or kin-structured
colonization, then an effective population size of 10,000 can be reconciled with
a large census size of several hundred thousand individuals consistent with the
multiregional model. Additionally, during this period of population extinction
and recolonization, population differentiation (FST) would have been high: higher
than the 10-15% observed in contemporary humans. One approach to testing this
model of population extinction and recolonization would be to find
paleoanthropological or genetic evidence of greater population differentiation
during the Pleistocene.
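As a point of reference for the FST values quoted above, the following minimal sketch computes FST for a single biallelic locus from deme allele frequencies, using Wright's definition: the variance of allele frequency among demes divided by p(1 - p), where p is the mean frequency. It assumes equally sized demes and known (not sampled) frequencies, and it is not the extinction and recolonization model used in this work. The function name and the example frequencies are hypothetical.

```python
def fst_from_deme_freqs(freqs):
    """Wright's FST for one biallelic locus from per-deme allele frequencies.

    Assumes equally weighted demes and a mean frequency strictly between 0 and 1.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k   # among-deme variance
    return var_p / (p_bar * (1 - p_bar))

if __name__ == "__main__":
    # Hypothetical allele frequencies in five demes.
    print(fst_from_deme_freqs([0.2, 0.35, 0.5, 0.6, 0.85]))
```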
More work is required to develop this model further. First, better parameter
estimates of extinction rates, migration rates and the colonization process for
Pleistocene hunter-gatherers are necessary to assess the viability of this
hypothesis. It is unclear whether the admittedly crude parameter estimates
used here adequately reflect hunter-gatherer demography. Furthermore, more
realistic models that go beyond island models of migration and incorporate
isolation by distance are needed. Although this work is preliminary, this
model deserves further study to determine the potential effects of population
extinction and recolonization on human demography.
Marek Kimmel, Department of Statistics, Rice University, Houston, TX: "Models of Point Mutations in Human Genome: Theory versus Data"
ABSTRACT: Single-nucleotide polymorphisms (SNPs) or simply point mutations are an important new tool in the study of molecular evolution of modern humans. We developed two mathematical models, one based on a modification of the infinite sites model and the other based on a two-state Markov process, which allow numerical predictions of various characteristics of SNPs, under different population dynamics scenarios. These two models lead to different predictions of frequency distributions of SNPs and different predictions of the ascertainment bias, as SNPs derived from one population are typed in another population. Theoretical predictions are illustrated by two recent data sets. The first includes distributions of nearly 400 SNP loci, mostly anonymous, i.e., not associated with known genes. Each of these loci was originally screened in Caucasians, but is studied now in six diverse populations. It seems that a two-state Markov model better predicts the observed frequency distribution and ascertainment bias in the data than the modified infinite sites model. The second data set includes several SNP haplotypes, typed in about 300 Americans of diverse ethnic origins, located in the vicinity of genes implicated in familial cancers. The haplotypes were identified from genotypes using the EM algorithm as well as an original method, which in addition allows estimation of the intensity of recombination. Patterns of haplotype distributions and linkage disequilibrium show a distinctive variability from one locus to another. Frequencies of main haplogroups differ from one population to another. Neither the infinite-sites model nor the two-state Markov model seems to account for the patterns observed. We interpret these findings in terms of the demographic history of modern humans and such genetic forces as mutation, recombination and selection.
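The EM step mentioned in this abstract can be sketched for the simplest case of two biallelic SNPs. This is the textbook two-locus EM, assuming random mating, and not the authors' implementation or their original method. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates. The function name and the layout of the `counts` table are assumptions made for illustration.

```python
def em_two_snp_haplotypes(counts, iterations=100):
    """Estimate haplotype frequencies (AB, Ab, aB, ab) from unphased genotypes.

    counts[i][j] = number of individuals carrying i copies of allele A at
    SNP 1 and j copies of allele B at SNP 2 (i, j in 0..2).
    """
    n = sum(sum(row) for row in counts)
    # Haplotype counts from genotypes whose phase is unambiguous.
    base = [
        2 * counts[2][2] + counts[2][1] + counts[1][2],  # AB
        2 * counts[2][0] + counts[2][1] + counts[1][0],  # Ab
        2 * counts[0][2] + counts[1][2] + counts[0][1],  # aB
        2 * counts[0][0] + counts[1][0] + counts[0][1],  # ab
    ]
    double_het = counts[1][1]        # phase is AB/ab or Ab/aB
    freqs = [0.25] * 4
    for _ in range(iterations):
        # E-step: probability that a double heterozygote is AB/ab.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by the 2n chromosomes.
        expected = [
            base[0] + double_het * w,
            base[1] + double_het * (1 - w),
            base[2] + double_het * (1 - w),
            base[3] + double_het * w,
        ]
        freqs = [x / (2 * n) for x in expected]
    return freqs

if __name__ == "__main__":
    # Hypothetical genotype table for 100 individuals.
    table = [[30, 5, 1], [6, 35, 4], [1, 6, 12]]
    print(em_two_snp_haplotypes(table))
```

A fixed number of iterations is used here for brevity; a convergence check on the change in `freqs` would be the more usual stopping rule.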
Alexander Renwick, Department of Statistics, Rice University, Houston, TX: "Probabilistic Models of Evolution of Short DNA-Repeat Sequences"
ABSTRACT: We examine length distributions of approximately 6000 human dinucleotide microsatellite loci, representing chromosomes 1 to 22, from the GDB Database. Under the stepwise mutation model, results from theory and simulation are compared with the empirical data. In both constant and expanding population scenarios, a simple single-step model with parameters chosen to account for the observed variance of microsatellite lengths produces results inconsistent with the observed homozygosity and the dispersion of length skewness. Complicating the model by allowing a variable mutation rate accounts for the homozygosity, and introducing a small probability of a large step accounts for the dispersion in skewnesses. We discuss these results in light of the long-term evolution of microsatellites.
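The mutation mechanism described here can be illustrated with a toy simulation. The sketch below is a simplification made for illustration, not the simulation used in this work: it follows independent lineages rather than a genealogy, applies symmetric steps of ±1 repeat unit, and with an assumed small probability `p_large` replaces a step by one of `large_step` units. The parameter names, their default values and the floor of one repeat unit are all assumptions.

```python
import random

def simulate_lengths(n_lineages, n_mutations, start, rng, p_large=0.0, large_step=5):
    """Repeat lengths of independent lineages after a fixed number of mutations."""
    lengths = []
    for _ in range(n_lineages):
        length = start
        for _ in range(n_mutations):
            step = large_step if rng.random() < p_large else 1
            sign = 1 if rng.random() < 0.5 else -1
            length = max(1, length + sign * step)   # assumed floor of one unit
        lengths.append(length)
    return lengths

def variance(xs):
    """Population variance (second central moment)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def skewness(xs):
    """Third standardized moment; 0.0 if all values are equal."""
    m = sum(xs) / len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

if __name__ == "__main__":
    rng = random.Random(6000)
    single = simulate_lengths(500, 20, 100, rng)
    mixed = simulate_lengths(500, 20, 100, rng, p_large=0.05)
    print(variance(single), skewness(single))
    print(variance(mixed), skewness(mixed))
```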
Heidi Spratt, Department of Statistics, Rice University, Houston, TX: "Probabilistic Methods for Detection of Functional Residues in Proteins"
ABSTRACT: Resampling techniques, such as bootstrapping, can be used to determine whether clusters in a tree, produced here by the method of evolutionary trace developed by Olivier Lichtarge ('96), are likely to be due to chance. The bootstrap is a computer-based method frequently used to assess the accuracy of many statistical estimates. For phylogenetic trees, the bootstrap can be used to resample the data to create new data sets in order to assess the confidence for each branch of an observed tree. Here, I look at several protein trees and try to assess the confidence of functionally important residues.
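The resampling idea can be sketched as follows. This is an illustration of the column bootstrap only, not the evolutionary trace method or the analysis presented in this talk: alignment columns are resampled with replacement, and for each replicate a deliberately simple stand-in criterion is checked, namely whether two chosen sequences are closer to each other (by Hamming distance) than either is to any other sequence. The function names and the cluster criterion are assumptions made for illustration.

```python
import random

def hamming(s, t):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(s, t))

def pair_is_cluster(seqs, i, j):
    """True if sequences i and j are strictly closer to each other than to any other."""
    d_ij = hamming(seqs[i], seqs[j])
    others = [k for k in range(len(seqs)) if k not in (i, j)]
    return all(
        d_ij < hamming(seqs[i], seqs[k]) and d_ij < hamming(seqs[j], seqs[k])
        for k in others
    )

def bootstrap_support(seqs, i, j, replicates, rng):
    """Fraction of column-resampled alignments in which (i, j) still cluster."""
    n_cols = len(seqs[0])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        resampled = ["".join(s[c] for c in cols) for s in seqs]
        hits += pair_is_cluster(resampled, i, j)
    return hits / replicates

if __name__ == "__main__":
    # Hypothetical toy alignment of four short protein fragments.
    alignment = ["MKTAYIAK", "MKTAYLAK", "MRSGYIGR", "MRSGFIGR"]
    print(bootstrap_support(alignment, 0, 1, 1000, random.Random(96)))
```

In a full analysis the tree would be rebuilt for every replicate and the support recorded for each branch; the pairwise criterion above only stands in for that step.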
The Department of Mathematics at Purdue