| {{dablink|This article is not about F-statistics as that term is understood in statistical inference, especially analysis of variance and linear regression. See [[F-test|''F''-test]] and [[F-distribution|''F''-distribution]].}}
| {{merge|Coefficient of relationship|date=January 2012}}
| |
| In [[population genetics]], '''''F''-statistics''' (also known as fixation indices) describe the statistically expected level of [[Zygosity|heterozygosity]] in a population; more specifically, they describe the expected change, usually a reduction, in heterozygosity compared with the [[Hardy–Weinberg law|Hardy–Weinberg expectation]].
| |
| | |
| ''F''-statistics can also be thought of as measures of the correlation between genes drawn at different levels of a (hierarchically) subdivided population. This correlation is influenced by several evolutionary processes, such as mutation, migration, [[inbreeding]], [[natural selection]], or the [[Wahlund effect]], but the statistics were originally designed to measure the amount of allelic fixation owing to [[genetic drift]].
| |
| | |
| The concept of ''F''-statistics was developed during the 1920s by the American geneticist [[Sewall Wright]],<ref>{{cite journal |pmid=15439261 |year=1950 |last1=Wright |first1=S |title=Genetical structure of populations |volume=166 |issue=4215 |pages=247–9 |journal=Nature |doi=10.1038/166247a0|bibcode = 1950Natur.166..247W }}</ref><ref>{{cite journal |pmid=4063030 |lccn=67025533 |year=1985 |last1=Kulig |first1=K |title=Utilization of emergency toxicology screens |volume=3 |issue=6 |pages=573–4 |journal=The American journal of emergency medicine}}</ref> who was interested in inbreeding in [[cattle]]. However, because [[complete dominance]] causes the [[phenotype]]s of [[Zygosity|homozygote]] dominants and heterozygotes to be the same, it was not until the advent of [[molecular genetics]] from the 1960s onwards that heterozygosity in populations could be measured.
| |
| | |
| ''F'' can be used to define [[effective population size]].
| |
| | |
| == Definitions and equations ==
| |
| The measures F<sub>IS</sub>, [[Fixation index|F<sub>ST</sub>]], and F<sub>IT</sub> are related to the amounts of heterozygosity at various levels of population structure. Together, they are called F-statistics, and are derived from ''F'', the inbreeding coefficient. In a simple two-allele system with inbreeding, the genotypic frequencies are:
| |
| | |
| :<math> p^2(1-F) + pF\text{ for }\mathbf{AA};\ 2pq(1-F)\text{ for }\mathbf{Aa};\text{ and }q^2(1-F) + qF\text{ for }\mathbf{aa}. </math>
| |
| | |
| The value for '''F''' is found by solving the equation for '''F''' using heterozygotes in the above inbred population. This becomes one minus the [[observation|observed]] number of heterozygotes in a population divided by its [[expected value|expected]] number of heterozygotes at [[Hardy–Weinberg principle|Hardy–Weinberg equilibrium]]:
| |
| | |
| :<math> F = 1- \frac{\operatorname{O}(f(\mathbf{Aa}))} {\operatorname{E}(f(\mathbf{Aa}))} = 1- \frac{\operatorname{ObservedNumber}(\mathbf{Aa})} {n \operatorname{E}(f(\mathbf{Aa}))}, \!</math>
| |
| | |
| where the expected value at Hardy–Weinberg equilibrium is given by
| |
| | |
| :<math> \operatorname{E}(f(\mathbf{Aa})) = 2pq, \!</math>
| |
| | |
| where ''p'' and ''q'' are the [[allele frequencies]] of '''A''' and '''a''', respectively. ''F'' is also the probability that, at any [[gene locus|locus]], the two alleles of a randomly chosen individual in the population are [[identity by descent|identical by descent]].
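The relationship between the genotype frequencies and ''F'' can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names and the values of ''p'' and ''F'' are hypothetical and chosen only for illustration.

```python
def genotype_frequencies(p, F):
    """Return (f_AA, f_Aa, f_aa) for allele frequency p and inbreeding coefficient F."""
    q = 1.0 - p
    f_AA = p * p * (1.0 - F) + p * F
    f_Aa = 2.0 * p * q * (1.0 - F)
    f_aa = q * q * (1.0 - F) + q * F
    return f_AA, f_Aa, f_aa


def inbreeding_coefficient(f_Aa_observed, p):
    """Solve f_Aa = 2pq(1 - F) for F."""
    q = 1.0 - p
    return 1.0 - f_Aa_observed / (2.0 * p * q)


# Hypothetical inputs, not taken from any data set.
p, F = 0.7, 0.25
f_AA, f_Aa, f_aa = genotype_frequencies(p, F)

total = f_AA + f_Aa + f_aa                    # the three frequencies should sum to 1
p_recovered = f_AA + f_Aa / 2.0               # inbreeding leaves the allele frequency unchanged
F_recovered = inbreeding_coefficient(f_Aa, p)  # should give back the F used above

print(total, p_recovered, F_recovered)
```

Under these assumptions the genotype frequencies sum to one, the allele frequency is unchanged by inbreeding, and solving the heterozygote equation returns the ''F'' that was put in.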
| |
| | |
| For example, consider the data from [[E.B. Ford]] (1971) on a single population of the [[scarlet tiger moth]]:
| |
| | |
| {| border=1 cellpadding=5 style="border-collapse:collapse;" align=center
| |
|+ '''Table 1:''' Genotype counts in a single population of the scarlet tiger moth (Ford, 1971)
| |
| |-
| |
| !Genotype
| |
| |White-spotted ('''AA''')
| |
| |Intermediate ('''Aa''')
| |
| |Little spotting ('''aa''')
| |
| !Total
| |
| |-
| |
| !Number
| |
| |1469
| |
| |138
| |
| |5
| |
| |1612
| |
| |}
| |
| | |
| From this, the [[allele frequencies]] can be calculated, and the expectation of ''ƒ''(Aa) derived:
| |
| | |
| : <math>p = {2 \times \mathrm{obs}(AA) + \mathrm{obs}(Aa) \over 2 \times (\mathrm{obs}(AA) + \mathrm{obs}(Aa) + \mathrm{obs}(aa))} = 0.954</math>
| |
| | |
| : <math>q = 1 - p = 0.046\,</math>
| |
| | |
| : <math>F = 1- \frac{ \mathrm{obs}(Aa) } { n \, 2pq } = 1- {138 \over 1612 \times 2(0.954)(0.046)} = 0.023</math>
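The arithmetic above can be reproduced with a short script. This is only a sketch of the calculation using the counts from Table 1; the variable names are illustrative.

```python
# Genotype counts from Table 1 (Ford, 1971).
obs_AA, obs_Aa, obs_aa = 1469, 138, 5

n = obs_AA + obs_Aa + obs_aa           # total number of individuals
p = (2 * obs_AA + obs_Aa) / (2 * n)    # frequency of allele A
q = 1 - p                              # frequency of allele a

expected_Aa = n * 2 * p * q            # expected heterozygote count under Hardy-Weinberg
F = 1 - obs_Aa / expected_Aa           # inbreeding coefficient

print(f"p = {p:.3f}, q = {q:.3f}, F = {F:.3f}")
# prints: p = 0.954, q = 0.046, F = 0.023
```

The value 0.023 is obtained with unrounded ''p'' and ''q''; substituting the rounded values 0.954 and 0.046 directly gives approximately 0.025.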
| |
| | |
| The different F-statistics look at different levels of population structure. '''F<sub>IT</sub>''' is the inbreeding coefficient of an individual ('''I''') relative to the total ('''T''') population, as above; '''F<sub>IS</sub>''' is the inbreeding coefficient of an individual ('''I''') relative to the subpopulation ('''S'''), using the above for subpopulations and averaging them; and '''F<sub>ST</sub>''' is the effect of subpopulations ('''S''') compared to the total population ('''T'''), and is calculated by solving the equation:
| |
| | |
| :<math>(1-F_{IS})(1-F_{ST}) = 1-F_{IT}, \, </math>
| |
| | |
| as shown in the next section.
| |
| | |
| == Partition due to population structure ==
| |
| | |
| [[File:F-statistics.png|frame|right|''F''<sub>''IT''</sub> can be partitioned into ''F''<sub>''ST''</sub> due to the [[Wahlund effect]] and ''F''<sub>''IS''</sub> due to [[inbreeding]].]]
| |
| | |
| Consider a population that has a [[population structure]] of two levels; one from the individual (I) to the subpopulation (S) and one from the subpopulation to the total (T). Then the total ''F'', known here as ''F''<sub>''IT''</sub>, can be [[Partition of the sum of squares|partition]]ed into ''F''<sub>''IS''</sub> (or ''f'') and ''F''<sub>''ST''</sub> (or ''θ''):
| |
| | |
| :<math> 1 - F_{IT} = (1 - F_{IS})\,(1 - F_{ST}). \!</math>
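The partition can be illustrated numerically. The sketch below assumes one common heterozygosity-based convention, which this article does not spell out: ''F''<sub>''IS''</sub> = 1 − ''H''<sub>''I''</sub>/''H''<sub>''S''</sub>, ''F''<sub>''ST''</sub> = 1 − ''H''<sub>''S''</sub>/''H''<sub>''T''</sub> and ''F''<sub>''IT''</sub> = 1 − ''H''<sub>''I''</sub>/''H''<sub>''T''</sub>, with two equally sized subpopulations. The genotype counts are hypothetical.

```python
# Hypothetical genotype counts (AA, Aa, aa) for two equally sized subpopulations.
subpops = [(50, 30, 20), (10, 30, 60)]


def allele_frequency(counts):
    """Frequency of allele A in one subpopulation."""
    n_AA, n_Aa, n_aa = counts
    n = n_AA + n_Aa + n_aa
    return (2 * n_AA + n_Aa) / (2 * n)


def observed_heterozygosity(counts):
    """Observed fraction of heterozygotes in one subpopulation."""
    n_AA, n_Aa, n_aa = counts
    return n_Aa / (n_AA + n_Aa + n_aa)


freqs = [allele_frequency(c) for c in subpops]

# H_I: mean observed heterozygosity within subpopulations.
H_I = sum(observed_heterozygosity(c) for c in subpops) / len(subpops)
# H_S: mean Hardy-Weinberg expected heterozygosity within subpopulations.
H_S = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
# H_T: expected heterozygosity from the mean allele frequency (equal sizes assumed).
p_bar = sum(freqs) / len(freqs)
H_T = 2 * p_bar * (1 - p_bar)

F_IS = 1 - H_I / H_S
F_ST = 1 - H_S / H_T
F_IT = 1 - H_I / H_T

print(F_IS, F_ST, F_IT)
print((1 - F_IS) * (1 - F_ST), 1 - F_IT)  # the two sides of the partition
```

With these definitions the identity holds exactly, because the ''H''<sub>''S''</sub> terms cancel in the product.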
| |
| | |
| This may be further partitioned for population substructure, and it expands according to the rules of [[binomial expansion]], so that for ''I'' partitions:
| |
| | |
| :<math> 1 - F = \prod_{i=0}^{i=I} (1 - F_{i,i+1}) \!</math>
| |
| <!-- er I think, check this please! though it does look rather good-->
| |
| | |
| == <math> F_{ST} </math> ==
| |
| | |
| A reformulation of the definition of '''F''' would be one minus the ratio of the average number of differences between pairs of chromosomes sampled within diploid individuals to the average number obtained when sampling chromosomes randomly from the population (ignoring the grouping by individual).
| |
| One can modify this definition and consider a grouping per sub-population instead of per individual. Population geneticists have used that idea to measure the degree of structure in a population.
| |
| | |
| Unfortunately, there is a large number of definitions for [[Fixation_index|''F''<sub>''ST''</sub>]], causing some confusion in the scientific literature. A common definition is the following:
| |
| | |
| : <math> F_{ST} = \frac{\operatorname{var}(\mathbf{p})}{p\,(1 - p)} \!</math>
| |
| | |
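| where ''p'' is the mean allele frequency across sub-populations, the variance of '''p''' is computed across sub-populations, and ''p''(1 − ''p'') is half the expected frequency of heterozygotes, 2''p''(1 − ''p''), in the total population.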
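The variance-based definition can be sketched as follows. The allele frequencies are hypothetical, and the example assumes equally weighted sub-populations and the population variance (dividing by the number of sub-populations rather than by one less); other estimators make different choices.

```python
from statistics import fmean, pvariance

# Hypothetical frequencies of allele A in two equally weighted sub-populations.
freqs = [0.65, 0.25]

p_bar = fmean(freqs)                  # mean allele frequency across sub-populations
var_p = pvariance(freqs, mu=p_bar)    # population variance of the allele frequency

F_ST = var_p / (p_bar * (1 - p_bar))

print(f"p_bar = {p_bar:.2f}, var(p) = {var_p:.2f}, F_ST = {F_ST:.3f}")
```

If every sub-population has the same allele frequency, the variance and therefore ''F''<sub>''ST''</sub> are zero; if each sub-population is fixed for one allele, the variance equals ''p''(1 − ''p'') and ''F''<sub>''ST''</sub> is one.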
| |
| | |
| == <math> F_{ST} </math> in human populations ==
| |
| | |
| It is well established that the genetic diversity among human populations is low,<ref>{{cite journal |doi=10.1038/nrg2611 |title=Genetics in geographically structured populations: Defining, estimating and interpreting FST |year=2009 |last1=Holsinger |first1=Kent E. |last2=Weir |first2=Bruce S. |journal=Nature Reviews Genetics |volume=10 |issue=9 |pages=639–50 |pmid=19687804}}</ref> although the distribution of the genetic diversity was only roughly estimated. Early studies argued that 85–90% of the genetic variation is found within individuals residing in the same populations within continents (intra-continental populations) and only an additional 10–15% is found between populations of different continents (continental populations).<ref>{{cite journal |author=Lewontin |title=The apportionment of human diversity |journal=Evolutionary Biology |volume=6 |pages=381–98 |year=1972 |doi=10.1007/978-1-4684-9063-3_14 |isbn=978-1-4684-9065-7}}</ref><ref>{{cite journal |first1=Anne M. |last1=Bowcock |first2=Judith R. |last2=Kidd |first3=Joanna L. |last3=Mountain |first4=Joan M. |last4=Herbert |first5=Luciano |last5=Carotenuto |first6=Kenneth K. |last6=Kidd |first7=Luca |last7=Cavalli-Sforza |bibcode=1991PNAS...88..839B |doi=10.1073/pnas.88.3.839 |jstor=2356081 |title=Drift, admixture, and selection in human evolution: A study with DNA polymorphisms |year=1991 |journal=Proceedings of the National Academy of Sciences |volume=88 |issue=3 |pages=839–43 |pmid=1992475 |pmc=50909}}</ref><ref>{{cite journal |pmid=9114021 |year=1997 |last1=Barbujani |first1=Guido |last2=Magagni |first2=Arianna |last3=Minch |first3=Eric |last4=Cavalli-Sforza |first4=L. Luca |title=An apportionment of human DNA diversity |volume=94 |issue=9 |pages=4516–9 |pmc=20754 |journal=Proceedings of the National Academy of Sciences of the United States of America |doi=10.1073/pnas.94.9.4516 |jstor=42042 |bibcode=1997PNAS...94.4516B}}</ref><ref>{{cite journal |doi=10.1086/302825 |title=The Distribution of Human Genetic Diversity: A Comparison of Mitochondrial, Autosomal, and Y-Chromosome Data |year=2000 |last1=Jorde |first1=L.B. |last2=Watkins |first2=W.S. |last3=Bamshad |first3=M.J. |last4=Dixon |first4=M.E. |last5=Ricker |first5=C.E. |last6=Seielstad |first6=M.T. |last7=Batzer |first7=M.A. |journal=The American Journal of Human Genetics |volume=66 |issue=3 |pages=979–88 |pmid=10712212 |pmc=1288178}}</ref><ref>{{cite journal |doi=10.1038/ng1435 |title=Genetic variation, classification and 'race' |year=2004 |last1=Jorde |first1=Lynn B |last2=Wooding |first2=Stephen P |journal=Nature Genetics |volume=36 |issue=11s |pages=S28-33 |pmid=15508000}}</ref> Later studies based on hundreds of thousands of single-nucleotide polymorphisms (SNPs) suggested that the genetic diversity between continental populations is even smaller and accounts for 3 to 7%.<ref>{{cite journal |doi=10.1007/s10038-006-0041-1 |title=Similarity of the allele frequency and linkage disequilibrium pattern of single nucleotide polymorphisms in drug-related gene loci between Thai and northern East Asian populations: Implications for tagging SNP selection in Thais |year=2006 |last1=Mahasirimongkol |first1=Surakameth |last2=Chantratita |first2=Wasun |last3=Promso |first3=Somying |last4=Pasomsab |first4=Ekawat |last5=Jinawath |first5=Natini |last6=Jongjaroenprasert |first6=Wallaya |last7=Lulitanond |first7=Viraphong |last8=Krittayapoositpot |first8=Phanida |last9=Tongsima |first9=Sissades |displayauthors= 4 |journal=Journal of Human Genetics |volume=51 |issue=10 |pages=896–904 |pmid=16957813}}</ref><ref>{{cite journal |doi=10.1186/1471-2156-9-54 |title=Population substructure in Finland and Sweden revealed by the use of spatial coordinates and a small number of unlinked autosomal SNPs |year=2008 |last1=Hannelius |first1=Ulf |last2=Salmela |first2=Elina |last3=Lappalainen |first3=Tuuli |last4=Guillot |first4=Gilles |last5=Lindgren |first5=Cecilia M |last6=Von Döbeln |first6=Ulrika |last7=Lahermo |first7=Päivi |last8=Kere |first8=Juha |journal=BMC Genetics |volume=9 |pages=54 |pmid=18713460 |pmc=2527025}}</ref><ref>{{cite journal |doi=10.1016/j.cub.2008.07.049 |title=Correlation between Genetic and Geographic Structure in Europe |year=2008 |last1=Lao |first1=Oscar |last2=Lu |first2=Timothy T. |last3=Nothnagel |first3=Michael |last4=Junge |first4=Olaf |last5=Freitag-Wolf |first5=Sandra |last6=Caliebe |first6=Amke |last7=Balascakova |first7=Miroslava |last8=Bertranpetit |first8=Jaume |last9=Bindoff |first9=Laurence A. |displayauthors= 4 |journal=Current Biology |volume=18 |issue=16 |pages=1241–8 |pmid=18691889}}</ref><ref>{{cite journal |doi=10.1016/j.ajhg.2009.04.015 |title=Genome-wide Insights into the Patterns and Determinants of Fine-Scale Population Structure in Humans |year=2009 |last1=Biswas |first1=Shameek |last2=Scheinfeldt |first2=Laura B. |last3=Akey |first3=Joshua M. |journal=The American Journal of Human Genetics |volume=84 |issue=5 |pages=641}}</ref><ref>{{cite journal |doi=10.1371/journal.pone.0005472 |title=Genetic Structure of Europeans: A View from the North–East |year=2009 |editor1-last=Fleischer |editor1-first=Robert C |last1=Nelis |first1=Mari |last2=Esko |first2=Tõnu |last3=Mägi |first3=Reedik |last4=Zimprich |first4=Fritz |last5=Zimprich |first5=Alexander |last6=Toncheva |first6=Draga |last7=Karachanak |first7=Sena |last8=Piskácková |first8=Tereza |last9=Balašcák |first9=Ivan |displayauthors= 4 |journal=PLoS ONE |volume=4 |issue=5 |pages=e5472 |pmid=19424496 |pmc=2675054|bibcode = 2009PLoSO...4.5472N }}</ref><ref>{{cite journal |doi=10.1038/nature08365 |title=Reconstructing Indian population history |year=2009 |last1=Reich |first1=David |last2=Thangaraj |first2=Kumarasamy |last3=Patterson |first3=Nick |last4=Price |first4=Alkes L. |last5=Singh |first5=Lalji |displayauthors= 4 |journal=Nature |volume=461 |issue=7263 |pages=489–94 |pmid=19779445 |pmc=2842210|bibcode = 2009Natur.461..489R }}</ref> Most of these studies have used the [[Fixation_index|''F''<sub>''ST''</sub>]] statistic<ref>{{cite journal |first1=Sewall |last1=Wright |year=1965 |title=The Interpretation of Population Structure by F-Statistics with Special Regard to Systems of Mating |journal=Evolution |volume=19 |issue=3 |pages=395–420 |jstor=2406450 |doi=10.2307/2406450}}</ref> or closely related statistics.<ref>{{cite journal |doi=10.1080/00071669108417396 |title=Long-term goose breeding for egg production and crammed liver weight |year=1991 |last1=Shalev |first1=B. A. |last2=Dvorin |first2=A. |last3=Herman |first3=R. |last4=Katz |first4=Z. |last5=Bornstein |first5=S. |journal=British Poultry Science |volume=32 |issue=4 |pages=703–9 |pmid=1933444}}</ref><ref>{{cite journal |pmid=1644282 |year=1992 |last1=Excoffier |first1=L |last2=Smouse |first2=PE |last3=Quattro |first3=JM |title=Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data |volume=131 |issue=2 |pages=479–91 |pmc=1205020 |journal=Genetics}}</ref>
| |
| | |
| ==See also==
| |
| | |
| *[[Malecot's method of coancestry]]
| |
| *[[Heterozygosity]]
| |
| *[[Fixation index]]
| |
| | |
| == References ==
| |
| {{Reflist|2}}
| |
| | |
| == External links ==
| |
| * [http://www.library.auckland.ac.nz/subjects/bio/pdfs/733Pop-g-stats2.pdf Shane's Simple Guide to F-Statistics]
| |
| * [http://darwin.eeb.uconn.edu/eeb348/lecture-notes/genetic-structure.pdf Analyzing the genetic structure of populations]
| |
| * [http://darwin.eeb.uconn.edu/eeb348/lecture-notes/wahlund/wahlund.html Wahlund effect, Wright's F-statistics]
| |
| * [http://www.uwyo.edu/dbmcd/popecol/Maylects/FST.html Worked example of calculating F-statistics from genotypic data]
| |
| * [http://helix.mcmaster.ca/brent/node10.html IAM based F-statistics]
| |
| * [http://eco-tools.njit.edu/webMathematica/EcoTools/Fstats-1-1/Introduction.html F-statistics for Population Genetics Eco-Tool]
| |
| * [http://www.stats.ox.ac.uk/~mcvean/slides7.pdf Population Structure (slides)]
| |
| | |
| {{Population genetics}}
| |
| | |
| {{DEFAULTSORT:F-Statistics}}
| |
| [[Category:Population genetics]]
| |