Joe Felsenstein’s Bible

August 3, 2018

Everybody in the evolution field knows who Joe Felsenstein is. Most people are familiar with his important contributions to phylogenetics. However, some of my young colleagues don’t know that he was a population geneticist in his early years. Indeed, he was a student/postdoc of Jim Crow, Dick Lewontin and Bill Hill, all renowned population and quantitative geneticists. (You can find more about Bill Hill in my old post.) Felsenstein’s paper on the advantage of recombination is among my all-time favourites, and his free textbook ‘Theoretical Evolutionary Genetics’ is brilliant (in particular the historical notes included).

But Joe’s ability to gather and process information was invested not only in writing papers and books, but also in compiling publications in the field. He compiled nearly 8000 titles and published them as a hardback book in 1981, ‘A Bibliography of Theoretical Population Genetics’ (although there is an earlier 1973 version). In pre-Internet times that was a very valuable resource. (Obviously not for me, I was just 1 year old when the volume was published.) A digital copy can be found at Felsenstein’s own webpage. Immodestly (yet rightfully), the file is called ‘bible’, and for that reason I always refer to this work as Joe Felsenstein’s Bible.


A few years ago I attempted to parse the file and convert it into BibTeX, so that it could be of some use in my LaTeX writings. However, I abandoned the project, as I kept running into complex formatting errors and didn’t have the time to fix them all. I have now decided to make the BibTeX version of the bible public (as it currently stands); you can find it here. But before you make any use of it, I should warn you that there are many errors, so please check the formatting of your final reference list carefully. Below is my modest re-analysis of the papers, books and texts indexed in Holy Felsenstein’s Book.


Publications by year

Publications are dated from 1867 to 1981. The oldest work is the review by Fleeming Jenkin of Darwin’s ‘Origin of Species’. Jenkin was an engineer (he invented the telpherage), so he had the mathematical background that Darwin lacked to tackle the issue of inheritance and selection. In his review he criticised Darwin’s theory of natural selection by arguing that, under blending inheritance, it is not possible. An excellent historical account by Michael Bulmer of this review and of the reactions of Darwin and others can be found here. The most recent works listed are Elseth and Baumgardner’s ‘Population Biology’ and Felsenstein’s bible itself.

If we plot the number of works published per year, we observe a decrease during the years of the Second World War (as expected in a total war, in which everybody, especially scientists, is asked to contribute to the war effort). The decrease was more pronounced toward the last years of the war, 1945 being the year with the fewest works published. This effect is more evident in the log plot below. However, among the 8 works listed for that year, we find a classic: Sewall Wright’s ‘The differential equation of the distribution of gene frequencies’, in which he explicitly applies Kolmogorov’s diffusion equations to work out expected gene frequencies. This work was probably not the most popular among Wright’s contributions, but it heavily influenced Motoo Kimura. For that alone, this paper deserves a special place in the history of population genetics. (See my post on the use of diffusion equations in population genetics.)

Publications per year in Felsenstein’s bible. The right plot shows the logarithm of the number of publications on its y-axis. The red lines indicate the years of World War II.


Digging a bit more into the literature of the 1930s and 1940s, one can find papers on the efficiency of the German (Nazi) racial laws from the point of view of theoretical population genetics, for instance Koller’s response to previous criticisms. However, this would take too long to discuss here, and I may write about it in a future post.


Authors and co-authors

Another interesting analysis that can be done using Felsenstein’s bible is to evaluate the productivity of different authors in the field. In this sense, Kimura is the clear winner. Far behind we find Wright and Masatoshi Nei, who get the silver and bronze medals respectively.

Author             Publications
M. Kimura          200
S. Wright          123
M. Nei             120
J. B. S. Haldane   119
T. Maruyama        98
N. E. Morton       97
R. A. Fisher       84
S. Karlin          84
A. Robertson       79
T. Ohta            75


Regarding joint authorships, Kimura is again the leader, together with Tomoko Ohta (with 40 co-authored works). Second are the prolific geneticist and epidemiologist Newton Morton and his late collaborator D. C. Rao (I wonder if he was related to the famous statistician C. R. Rao). And third, as could not be otherwise, the Charlesworths, Brian and Deborah.

Author 1               Author 2            Joint publications
M. Kimura              T. Ohta             40
N. E. Morton           D. C. Rao           20
B. Charlesworth        D. Charlesworth     19
T. Maruyama            M. Kimura           18
N. E. Morton           S. Yee              18
J. McGregor            S. Karlin           16
L. L. Cavalli-Sforza   M. W. Feldman       15
M. Kimura              J. F. Crow          15
C. C. Cockerham        B. S. Weir          13


Actually, it is interesting to see which authors publish mostly alone and which are more likely to work with someone else. For the top-10 authors we see a wide range of single-authored/co-authored ratios. Wright mostly wrote alone, whilst Samuel Karlin and Newton Morton (as is well known) worked more often in collaborative projects with colleagues.

Author             Publications   Solo   Joint   Prop. solo
M. Kimura          200            120    81      0.597
S. Wright          123            112    12      0.903
M. Nei             120            65     67      0.492
J. B. S. Haldane   119            101    18      0.849
T. Maruyama        98             53     51      0.510
N. E. Morton       97             41     107     0.277
R. A. Fisher       84             77     9       0.895
S. Karlin          84             23     68      0.253
A. Robertson       79             50     32      0.610
T. Ohta            75             32     43      0.427


I tried to produce a graph of co-authorships but I had some technical problems and, as my grant applications and manuscripts deserve some attention, I decided to postpone the generation of the graph. However, I provide here the table of the most connected authors. Morton is the most connected scientist.

Author                 Co-authors
N. E. Morton           36
C. C. Cockerham        34
L. L. Cavalli-Sforza   32
R. E. Comstock         28
M. W. Feldman          27
S. Yee                 27
R. C. Elston           26
J. L. Jinks            25
J. F. Crow             24
S. Karlin              24
W. F. Bodmer           24
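
For anyone wanting to repeat this kind of count, here is a minimal Python sketch of how the author tables could be tallied. It assumes the bibliography has already been parsed into one list of author names per work; the `summarise` function and the toy records are hypothetical illustrations, not the script used for the tables above, and real entries would first need their name spellings normalised.

```python
from collections import Counter, defaultdict
from itertools import combinations


def summarise(records):
    """Tally works per author, solo works, co-author pairs and number of distinct co-authors.

    `records` is an iterable of author-name lists, one list per work
    (a hypothetical, already-parsed form of the bibliography).
    """
    per_author = Counter()         # works in which each author appears
    solo = Counter()               # single-authored works per author
    pairs = Counter()              # works shared by each pair of authors
    neighbours = defaultdict(set)  # distinct co-authors of each author

    for authors in records:
        unique = sorted(set(authors))  # sort so (a, b) and (b, a) are the same pair
        for name in unique:
            per_author[name] += 1
            if len(unique) == 1:
                solo[name] += 1
        for a, b in combinations(unique, 2):
            pairs[(a, b)] += 1
            neighbours[a].add(b)
            neighbours[b].add(a)

    degree = {name: len(others) for name, others in neighbours.items()}
    return per_author, solo, pairs, degree


# Toy data, invented for illustration only.
toy_records = [
    ["M. Kimura"],
    ["M. Kimura", "T. Ohta"],
    ["T. Ohta", "M. Kimura"],
    ["M. Kimura", "J. F. Crow"],
]

per_author, solo, pairs, degree = summarise(toy_records)
for name, total in per_author.most_common():
    print(name, total, solo[name], degree.get(name, 0))
```

Sorting the names within each work means a pair is counted once regardless of author order, and the `degree` dictionary gives the "most connected" ranking without needing a graph library.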


A valuable resource

Felsenstein’s bible is a great resource for understanding the history of population genetics. Indeed, it would be interesting to extract abstracts or even full texts, and parse the information to study trends in topics during these years. I bet ‘linkage’ will be one of the topics that exploded in the early 1970s. But doing so will require quite some programming and parsing work and, unfortunately, I don’t have the time (and probably not the skills) to do it right away. But I hope this modest analysis and the BibTeX file are of some use to some of you.


“Nothing in biology…”: The most used and abused quotation in evolutionary biology

October 9, 2015

Quotations are everywhere. You can always pretend you know something about physics by quoting Einstein: “God does not play dice”. Or if you feel like contributing to a discussion about statistics you can assert that “all models are wrong, but some are useful”. There’s a quotation for every occasion. In fact, Oxford University Press (OUP) has its own dictionary of quotations. (Although OUP has a dictionary for pretty much anything.) But if one quotation has been mercilessly abused, it is that of Dobzhansky. You know which one, the one starting with “Nothing in biology…”.

It all began when Theodosius Dobzhansky became the president of the American Society of Zoologists. In his presidential lecture he raised some concerns about the emerging field of molecular biology, which he called a ‘glamour field’. He worried that ‘[t]he notion has gained some currency that the only worthwhile biology is molecular biology. All else is […] “butterfly collecting”’, in clear reference to Rutherford’s “stamp collecting” statement. Towards the end of his address, he concluded that molecular biologists focus more on ‘how things are’, and organismic biologists on ‘how things got to be that way’, but that both views are complementary, and that a Darwinian approach is also needed to understand molecular biology. In his words: ‘nothing makes sense in biology except in the light of evolution, sub specie evolutionis’.

Nothing in Dobzhansky makes sense except in the light of taxonomy

The ending of the sentence, ‘in the light of evolution, sub specie evolutionis’, originally comes from Julian Huxley. He paraphrased Spinoza, who described all the things that are universally and eternally true as sub specie aeternitatis (from the point of view of eternity). Huxley coined the concept of sub specie evolutionis and translated it as ‘in the light of evolution’. But Dobzhansky slightly modified his own version of the sentence and used it as the title of a very influential paper he published in 1973, introducing to the world what would become his most famous statement: “Nothing in Biology Makes Sense except in the Light of Evolution”. The nightmare began!

We find variations of all kinds. Apparently, biology is not the only thing that makes sense only in the light of evolution. Some authors assert that neither ethics, glycobiology, medicine, morality, biochemistry, microbiology, cancer, nor community ecology makes sense except in the light of evolution. Others swap the sentence around, and say that nothing in evolution makes sense except in the light of population genetics, of phylogeny or even of creation! One of my favourites (which I recently discovered via Tom Cameron) is ‘Nothing in evolution or ecology makes sense except in the light of the other’.

But that’s not all. The structure ‘Nothing in X makes sense except in the light of Y’ has been recycled over and over again. Some are clever, like ‘Nothing in genetics makes sense except in the light of genomic conflict’. Some are not that clever, like ‘Nothing in biology makes sense except in the light of sequencing’. And some others are, in my modest opinion, just wrong, like ‘Nothing in the genome makes sense except in the light of the transcriptome’.

Outside the biosciences there have been many attempts to create Dobzhansky-like sentences. ‘Nothing in linguistics makes sense except in the light of change’, ‘Nothing in Human Behavior Makes Sense Except in the Light of Culture’ and ‘Nothing in the universe makes sense except in the light of Big History’ are but a few examples. One that I really support is ‘Nothing in scholarly communication makes sense except in the light of Open Access’. On the other side are the physicists, who still struggle to find their version of Dobzhansky’s.

Some go a bit far: ‘Screw Dobzhansky, nothing in biology makes sense, period’, or its counterpart ‘in creationism nothing makes sense, period’.

In any case, I think we all have created a version of Dobzhansky’s ‘Nothing in X makes sense…’. I’ve seen hundreds of different versions, particularly in conferences and seminars. They all look clever in the eyes of the author, but probably not as much to the audience. The same goes for this blog: I always try to do my best, but I’m aware that most readers do not care much about what I write. After all, nothing in this blog makes sense except in the light of its author, sub specie Antonii.

Benchmark Papers in Quantitative Genetics (The Bill Hill’s List, part II)

December 19, 2014

As I promised, here’s the second part of the list, which corresponds to the papers discussed in volume II. It was particularly difficult to find PDFs for all of them, and some links go to the publisher, which sells the paper for a (in my opinion) substantial amount of money. I encourage you to go to the library, find alternative resources (JSTOR,…), or ask a colleague. I have most of them if anyone is interested.

The Hill-Robertson'66 paper, one of my all-time-favourites, is not in the list. Actually, Hill didn't include any of his own papers!


The Bill Hill’s List, part II

Nature of Selection Response

  • Castle WE (1905) The Mutation Theory of Organic Evolution, from the Standpoint of Animal Breeding. Science 21:521-525. [PDF]
  • Jennings HS (1916) Heredity, Variation and the Results of Selection in the Uniparental Reproduction of DIFFLUGIA CORONA. Genetics 1:407-534. [PDF]
  • Sturtevant AH (1918) An analysis of the effects of selection [PDF]
  • Castle WE (1919) Piebald rats and selection, a correction. Am Nat 53:370-376. [PDF]

Statistical Predictions of Selection Response

  • Lush JL (1935) Progeny Test and Individual Performance as Indicators of an Animal’s Breeding Value. J Dairy Science 18:1-19. [PDF]
  • Hazel LN (1943) The genetic basis for constructing selection indexes. Genetics 28:476-490. [PDF]
  • Falconer DS (1952) The problem of environment and selection. Am Nat 86:293-298. [PDF]
  • Dickerson GE and Hazel LN (1944) Effectiveness of selection on progeny performance as a supplement to earlier culling in livestock. J Agric Res 69:459-476. [PDF]
  • Henderson CR (1974) General Flexibility of Linear Model Techniques for Sire Evaluation. J Dairy Science 57:963-972. [PDF]

Genetical Prediction of Selection Response

  • Fisher RA (1930) The fundamental theorem of natural selection. In: The Genetical Theory of Natural Selection. Oxford: Clarendon Press. [PDF]
  • Haldane JBS (1931) Selection intensity as a function of mortality rate. Proceedings of the Cambridge Philosophical Society 27:131-136 [PDF]
  • Comstock RE et al (1949) A Breeding Procedure Designed To Make Maximum Use of Both General and Specific Combining Ability. Agronomy J 41:360-367. [PDF]
  • Robertson A (1960) A theory of limits in artificial selection. Proc R Soc Lond B 153:234-249. [PDF]

Results from Selection Experiments

  • Dudley JW (1977) 76 Generations of selection for oil and protein percentage in maize. Proc Int Conf Quant Genetics Ames IA, ISU Press, 459-473. [PDF]
  • Mather K (1941) Variation and selection of polygenic characters. J Genet 41:159–193. [PDF]
  • Lerner IM and Dempster ER (1951) Attenuation of genetic progress under continued selection in poultry. Heredity 5:75–94. [PDF]
  • Robertson FW (1955) Selection response and the properties of genetic variation. Cold Spring Harbor Symp.Quant. Biol. 20:166–177. [PDF]
  • Falconer DS (1960) The genetics of litter size in mice. J Cell Comp Physiol 56(Suppl 1):153–167. [PubMed]
  • Clayton GA et al (1957) An experimental check on quantitative genetical theory. I. Short-term responses to selection. J Genet 55:131–151. [PDF]
  • Bell AE et al (1955) The evolution of new methods for the improvement of quantitative characters. Cold Spring Harbor Symp Quant Biol 20:197-211. [PDF]

Selection and Maintenance of Genetic Variation

  • Wright S (1932) The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proc VI Int Congress Genetics 1:356-366 [PDF]
  • Robertson A (1955) Selection in animals: synthesis. Cold Spring Harb Symp Quant Biol 20:225-229. [PDF]
  • Bulmer MG (1971) The effect of selection on genetic variability. Am Nat, 105:201–211 [PDF]
  • Lande R (1976) The maintenance of genetic variation by mutation in a polygenic character with linked loci. Genet Res (Camb). 26:221-235. [PDF]

Nature of Quantitative Genetic Variation

  • Clayton GA and Robertson A (1955) Mutation and quantitative variation. Am Nat 89:151-158  [PDF]
  • Linney R et al (1971) Variation for metrical characters in Drosophila populations III. The nature of selection. Heredity 27:163–174 [PubMed]
  • “Student” (1934) A calculation of the minimum number of genes in Winter’s selection experiment. Ann Eugenics 6:77–82 [PDF]
  • Thoday JM (1961) Location of polygenes. Nature 191:368-370 [PDF]

Benchmark Papers in Quantitative Genetics (The Bill Hill’s List, part I)

July 21, 2014

Despite having a long-standing interest in evolutionary biology, I deliberately avoided papers on animal breeding and pure quantitative genetics, as I thought they were not relevant to modern evolutionary thought. As I gained more interest in mathematical models I started to read some of these classic papers, and realized that some of the issues I’m interested in (such as genetic linkage) were studied in depth by livestock breeders. Now I regret that I didn’t pay more attention to them in the past. However, I have a chance to redeem myself and put some of these papers on my reading list. Which ones? Well, that’s a hard decision, but fortunately I found a list compiled by quantitative geneticist William Hill.

William G. Hill. Picture from the Genetics Society at

William (Bill) Hill was invited to contribute a volume to the famous “Benchmark Papers in Genetics” series by Springer. However, he came up with two volumes. I could not find either of them in the University of Essex library so I decided, for the first time, to visit the British Library. They had, indeed, both volumes. I forgot to bring some cash so I couldn’t afford to make copies. Hence, as in the old times, I grabbed a pencil and transcribed the lists of papers from both volumes. It was a hard yet rewarding task. Because most of these papers are quite old, I was confident that I would find them in JSTOR or another similar repository (and I did!).

The best bits of the books are the comments by Bill Hill himself on the papers which, obviously, I cannot reproduce here. However, just the list of papers is of great value, and I thought there might be someone else interested. So, I’ve compiled both lists and looked for PDFs and links to the authors’ biographies. Here’s the list of papers from volume I. The second part will be posted soon. Hope you find it useful.


The Bill Hill’s List, part I


  • Pearson K (1904) Mathematical Contributions to the Theory of Evolution – XII. Phil Trans R Soc Lond A 203:359-371. [PDF]
  • Yule GU (1906) On the Theory of Inheritance of Quantitative Compound Characters on the Basis of Mendel’s Laws – A Preliminary Note. In: Proceedings of the International Congress of Genetics. [PDF]
  • Weinberg W (1910) Further contributions to the theory of Inheritance. (English translation by K. Meyer in this compilation) [PDF NOT FOUND]
  • Fisher RA (1918) The Correlation Between Relatives on the Supposition of Mendelian Inheritance. Trans R Soc Edinburgh 52:399-433. [PDF]
  • Wright S (1921) Systems of Mating I. The Biometric Relations between Parent and Offspring. Genetics 6:111-123. [PDF]
  • Wright S (1921) Systems of Mating II. The Effects of Inbreeding on the Genetic Composition of a Population. Genetics 6:124-143. [PDF]
  • East EM (1910) A Mendelian Interpretation of Variation that is Apparently Continuous. Am Nat 44:65-82. [PDF]
  • Jones DF (1917) Dominance of Linked Factors as a Means of Accounting for Heterosis. Genetics 2:466-479. [PDF]


  • Mather K (1949) The Genetical Theory of Continuous Variation. Hereditas 35:376-401. [PDF]
  • Wright S (1950) The Genetics of Quantitative Variability. In: Quantitative inheritance. [Google Books]


  • Kempthorne O (1954) The Correlation Between Relatives in a Random Mating Population. Proc R Soc London B 143:102-113. [PDF]
  • Cockerham CC (1956) Effects of Linkage on the Covariances between Relatives. Genetics 41:138-141. [PDF]
  • Willham RL (1963) The Covariance between Relatives for Characters Composed of Components Contributed by Related Individuals. Biometrics 19:18–27. [PDF NOT FOUND]
  • Robertson A (1952) The Effect of Inbreeding on the Variation Due to Recessive Genes. Genetics 37:189–207. [PDF]


  • Lush JL (1940) Intra-Sire Correlations or Regressions of Offspring on Dam as a Method of Estimating Heritability of Characteristics. Ann Proc Am Soc Anim Prod 33:293–301. [PDF]
  • Comstock RE, Robinson HF (1952) Estimation of average dominance of genes. In: Heterosis. [Google Books]
  • Griffing B (1956) A generalised treatment of the use of diallel crosses in quantitative inheritance. Heredity 10:31-50. [PDF]


The outgrowth of Muller’s eugenics program

February 14, 2014

Hermann Muller is one of my favourite scientists ever. Among other things he used the first balancer chromosomes to do genetic analysis, discovered X-ray-induced mutations, and developed the concept of genetic load. Muller was also tireless and systematic, spending countless hours crossing flies with care and patience. No wonder he was the sole recipient of the Nobel Prize in Physiology or Medicine in 1946 for his work on mutagenesis.

However, Muller’s name is strongly associated with ‘eugenics’, a word that is today a synonym of racism, or even Nazism. Because of that, I had been afraid of reading some of his writings, which might jeopardize my idealization of Muller. A few months ago I changed my mind and decided to read more about his eugenic views. It took me a while, but I finally got a copy of his “Out of the Night: A Biologist’s View of the Future”, his personal view on how science and eugenics will change the future. To my surprise, Muller was very critical of most eugenic programs, and fought against selective sterilization and similar practices. Muller, as a geneticist, was well aware that the selective elimination of weak phenotypes does very little to remove recessive deleterious alleles from the population. Notably, Muller even sent his book to Joseph Stalin while living in the USSR, together with a lengthy letter, hoping that he would embrace Mendelism and his eugenic program. Muller soon heard that Stalin was “displeased by it, and has ordered an attack prepared against it”.

Cover of Hermann Muller's "Out of the night", 1936, Victor Gollancz Ltd, London.

But “Out of the Night…”, more than a eugenics program, has become a description of how society has changed in the last century. Muller’s proposals have become a list of predictions, and most (if not all) of them have become a reality. Here’s the list of his ‘proposals’, and their current status.

1) Universal dissemination of knowledge about the means of birth control. Abortion must also be legalized and regulated. Abortion is indeed legal in most European, Asian and North American countries. In these countries abortion is regulated, and it is permitted on request or, in countries with more restrictive laws, when there is a risk for the mother or in cases of rape.

2) Better systems of pain relief during labour. In the UK (the case I know best) there are various options for pain relief during labour: Entonox, pethidine and epidural anaesthesia. In the US, more than 50% of women give birth with epidural anaesthesia.

3) Better ways to deal with illnesses affecting children in their first six months. Nowadays many congenital disorders are routinely detected before birth thanks to ultrasound scans and maternal blood screens. In most North American and European countries, newborn babies have their blood tested for several genetic and metabolic disorders (heel prick). In the near future, whole-genome sequencing of a foetus may be possible from a sample of the mother’s blood alone.

4) Develop public organization for food preparation, laundering and other services for infants and young children. I think Muller had in mind a huge collective nursery for all children in a community, or something like that. In any case, childcare is now an important part of our society, and present-day nurseries may somewhat fit Muller’s idealization.

5) Inspire “women of the highest type of intelligence” to be mothers. Maternity leave permits working women to take a career break to have children without losing their jobs. I don’t agree at all with this obsession for “intelligent parents to have intelligent children”, but benefits and leave do help successful women to become mothers.

I do not aim to state whether Muller was right or not, or whether abortion or benefits are right or wrong. What I want to point out is that the eugenics program described by Muller, as such, has become a reality. One should take into account that what we now call eugenics was a different thing decades ago. In fact, some journals with the word ‘eugenics’ in their title were really journals of genetics, such as “Eugenics Review” (which published papers on human genetics) or “Annals of Eugenics” (which changed its name to “Annals of Human Genetics” in 1954). William Provine has stated that ‘Eugenics has merely been renamed genetic counselling’. In this sense, Muller’s “Out of the Night” may have become the first eugenics/genetic-counselling manifesto to be completely fulfilled.

Best 2013 genetics and evolution papers (a personal selection)

December 13, 2013

At this time of the year I usually screen the papers I have printed/downloaded/read during the year, as an exercise to recall what has been discovered this year. I thought that it would be a good idea to create a list of my favourite 2013 papers and post it in this blog. Obviously, this is a very personal list, and the selection is completely biased. Also, the list is not about the best discoveries or the most famous findings, so don’t expect microbrains or Lenski’s experiments. Hope you find this list, at least, informative.

But before the list, I want to stress from which journals I have downloaded/read most of the papers. These are the top 10 journals: PLoS ONE, PLoS Genetics, Genetics, MBE, PNAS, NAR, arXiv, Science, BMC Genomics and Genome Res. Clearly, PLoS ONE and arXiv have a substantial impact in my field.

Out of 500+, this is my small selection (in no particular order):

Happy new year!

Theoretical Evolutionary Genetics: Now, in three different flavours!

November 5, 2013

Theoretical Evolutionary Genetics is at the heart of evolutionary biology. Francis Galton and his followers laid the foundations of theoretical evolution soon after Darwin’s Origin. This so-called biometrics school had its continuity in the work of Ronald Fisher, who amalgamated the Biometric and Mendelian schools into a robust theoretical framework. Animal breeders such as Sewall Wright independently developed methods to study heredity and genetic improvement in animal stocks. Fisher, Wright and Haldane (the latter coming from a different tradition, his own) are acknowledged to be the founding fathers of population genetics.

The fact that population genetics is the basis of theoretical evolutionary biology may be seen as an accident of the way it was introduced by the founding trio. As a matter of fact, there are ways of exploring the same biological problems other than the canonical population genetics approach. I recently bought two textbooks on evolution and I found that they used completely different approaches. After I read them I became aware that, indeed, there are at least three ways of doing things in theoretical evolutionary biology. These are the three ‘flavours’:

Flavour 1: Standard Population Genetics

This flavour is best represented by the Crow and Kimura manual, and more recently by the Charlesworths’ textbook. Evolutionary problems are tackled from the point of view of evolving populations in which allele frequencies change as a function of their relative selective fitness and sampling effects. Typically, a ‘change in allele frequency’ is defined and the resulting equation is solved for specific cases. These cases are basically equilibrium situations or, in the case of fixation or loss of alleles, questions of how long the process will take (on average) and with which probability. OK, this is an extreme oversimplification of population genetics, but for what I want to say it’s enough.

Flavour 2: Price’s Equation

I must admit that I never got Price’s equation. I mean, even after someone explained it to me and I thought I understood it, I didn’t see what it was useful for. I have recently changed my mind. I recently read Sean Rice’s ‘Evolutionary Theory’ and saw, to my astonishment, that he approached classic problems in population biology using Price’s equation. Price’s equation is due to George Price, a strange man and an even stranger scientist. He wrote down the whole evolutionary process in a single equation, accounting for the fitness of the ‘elements’ of an evolutionary system and the relationships between these elements (species, genes…). It’s basically a covariance equation. Using the appropriate definition of parameters one can reduce complex population genetics problems to a single covariance equation, and that’s the strength of Price’s approach.

Flavour 3: Game Theory

Game theory was originally used in biology by Bill Hamilton (I think), although it was definitely John Maynard Smith who fully developed the topic. According to Maynard Smith, he started using game theory while visiting Chicago, as there was nothing else to do in Chicago! Anyway, I always thought that game theory was useful for modelling phenotype evolution (like behaviour). However, while reading Martin Nowak’s ‘Evolutionary Dynamics’ I discovered that most of what we know from standard population genetics can be approached from the game theory perspective. In that case different alleles represent different game strategies. Winning strategies are equivalent to fitter alleles and the winner of the game is, obviously, the allele fixed in the population.

Is there a best flavour?

I should say that my preferred method is that of classical canonical standard population genetics, but after reading Nowak’s and Rice’s books I’m aware that this is only a personal preference. What these different approaches show is that there are many ways of solving the same problem, and that should be used as a powerful tool in evolutionary biology. As an example, I included a BOX below in which I derive a classical result in population genetics using these three different approaches. Richard Levins once wrote that “[deriving] alternative proofs for the same result is not merely a mathematical exercise – it is a method of validation”. Now I wonder whether all major findings in population genetics achieved during the last century could be reproduced by using these alternative flavours!

Box 1. Deriving the change in allele frequency with three different approaches.

One of the most important quantities in population genetics is how much an allele frequency changes after one generation, or \Delta p. Let’s assume we have a large population of bacteria, and that for a given locus we have two alternative alleles A_1 and A_2 with frequencies p and q respectively. (Remember that p+q=1.) If A_1 has a selective advantage over A_2, how much does the frequency of A_1 change after one generation? This is a classic problem in population genetics, and we are going to derive an expression using three different methods.

Standard approach

The change in an allele frequency for a haploid population is given by (Crow and Kimura 1970):

\Delta p = \dfrac{(w_1 - w_2) p q}{\overline{w}}

where w_1, w_2 and \overline{w} are the fitness of allele A_1, the fitness of allele A_2 and the average fitness respectively. For a selective advantage of A_1 over A_2 we define the following relative fitnesses:

w_1=1+s ; w_2=1

\overline{w}=(1+s)p + (1-p)=1+sp

where s is the selection coefficient. Substituting, we find an expression for \Delta p as a function of the selection coefficient:

\Delta p = \dfrac{s p q}{1+sp}

For a small s we can find a linear approximation by the Taylor expansion about s=0 (note that \Delta p(0)=0):

\Delta p(s) = \Delta p(0) + s \dfrac{d\Delta p(s)}{ds}\Big|_{s=0} + O(s^2) \approx s \, \dfrac{pq(1+sp) - sp^2 q}{(1+sp)^2} \Big|_{s=0}

obtaining the classic result for haploid populations:

\Delta p\approx spq

Price’s equation approach

A common form of Price’s equation is (Rice 2004):

\Delta \overline{\Phi} = \dfrac{1}{\overline{w}}[cov(w,\Phi) + E(w\overline{\delta})]

and it says that the change in a trait or character (\Delta \overline{\Phi}) is a function of how the trait covaries with fitness (cov(w,\Phi)) and how the parental and offspring traits are related (E(w\overline{\delta})). By noticing that, for a trait that is simply the presence or absence of an allele, cov(w,\Phi)=\overline{\Phi}(w-\overline{w}) (with w the fitness of the carriers), and that E(w\overline{\delta})=0 for the haploid case (neither inbreeding nor mating), we obtain:

\Delta \overline{\Phi} = \dfrac{1}{\overline{w}}[\overline{\Phi}(w-\overline{w})]

As we are interested in the allele frequency as a trait we rewrite this formula in a more familiar form:

\Delta p = \dfrac{1}{\overline{w}}[p(w_1-\overline{w})]

which for the selection scheme defined above is equivalent to:

\Delta p = \dfrac{1}{1+sp}(p(1+s-1-sp))

which to a linear approximation leads to the same result as the classic approach:

\Delta p \approx spq
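The equivalence between the covariance term and the standard formula can be checked numerically. The sketch below assumes that the trait is an indicator (1 for carriers of A_1, 0 for carriers of A_2) and that the transmission term is zero; the function names are hypothetical and only serve the illustration.

```python
def price_delta_p(p, w1, w2):
    """Change in the frequency of A_1 from the covariance term of Price's equation.

    The trait phi is an indicator: 1 for carriers of A_1, 0 for carriers of A_2.
    The transmission term E(w * delta) is assumed to be zero (haploid, clonal).
    """
    q = 1.0 - p
    mean_w = p * w1 + q * w2      # average fitness
    mean_phi = p                  # E[phi], because phi is 0 or 1
    mean_w_phi = p * w1           # E[w * phi]: only A_1 carriers contribute
    cov_w_phi = mean_w_phi - mean_w * mean_phi
    return cov_w_phi / mean_w


def standard_delta_p(p, w1, w2):
    """Standard haploid selection formula: (w1 - w2) * p * q / mean fitness."""
    q = 1.0 - p
    return (w1 - w2) * p * q / (p * w1 + q * w2)


if __name__ == "__main__":
    s = 0.01
    print(price_delta_p(0.3, 1.0 + s, 1.0))
    print(standard_delta_p(0.3, 1.0 + s, 1.0))
```

Both functions return the same value because the covariance simplifies to pq(w_1-w_2).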

Game theory approach

Last but not least, we can consider that the two alleles are two strategies in a game played by their bacterial hosts. In game theory we first create a table of costs of the different strategies like this one:


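In our case, if two bacteria play the same strategy (they have the same allele) there is no cost or benefit for either of them. However, if one bacterium has the fitter allele while the other has the alternative allele, the benefit for the former will be the selection coefficient s and the cost for the latter -s. Hence, with strategy A being allele A_1 and strategy B being allele A_2 (rows: the player's own strategy; columns: the opponent's strategy), the cost matrix has the form:

\begin{pmatrix} 0 & s \\ -s & 0 \end{pmatrix}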

The instantaneous rate of change of the frequency of the winning strategy is given by (Nowak 2006):

\dfrac{d p}{d t} = p q (f_A(p) - f_B(p))

For our game we have:

f_A(p) = sq ; f_B(p) = -sp

So that:

\dfrac{d p}{d t} = s p q

moving d t to the right and integrating both sides (treating s p q as approximately constant over the interval):

\int_{p_0}^{p_1}d p = \int_{t_0}^{t_1}s p q dt

and for a time interval of one generation (\Delta t = 1) we obtain the classic equation:

p_1 - p_0 = s p q (t_1 - t_0)

\Delta p = s p q
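To see how good the last step is, one can integrate dp/dt = spq over one generation without assuming that pq stays constant. The following sketch does this with simple Euler steps and compares the result with the closed-form (logistic) solution of the same differential equation. The step count, the function names and the example values are assumptions made for illustration.

```python
import math


def replicator_step(p0, s, dt=1.0, n_steps=1000):
    """Integrate dp/dt = s*p*(1-p) over a time dt with simple Euler steps."""
    p = p0
    h = dt / n_steps
    for _ in range(n_steps):
        p += h * s * p * (1.0 - p)
    return p


def logistic_solution(p0, s, t=1.0):
    """Closed-form solution of dp/dt = s*p*(1-p) starting from p0."""
    e = math.exp(s * t)
    return p0 * e / (1.0 - p0 + p0 * e)


if __name__ == "__main__":
    p0, s = 0.3, 0.01
    print("Euler:      ", replicator_step(p0, s) - p0)
    print("closed form:", logistic_solution(p0, s) - p0)
    print("s*p*q:      ", s * p0 * (1.0 - p0))
```

For small s the three numbers agree to several decimal places, which is the sense in which \Delta p = s p q holds for one generation.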

In conclusion, the three different approaches based on different assumptions yield the same result.

Diffusion in population genetics: who was first?

July 25, 2013

We know Motoo Kimura because of the neutral theory. (Or should I say that we know the neutral theory because of Kimura?) But before his classic paper in 1968, Kimura was already a prominent figure in evolutionary genetics, mainly because of his productive use of the diffusion method to study the change of gene frequencies in populations. Why is diffusion so important? And who was actually the first to apply it to population genetics?

Diffusion and probability

Suppose that we have a large population, large enough that you can assume that it’s infinite in size. By using the so-called deterministic models one can easily compute the effects of evolutionary forces on the genetic composition of such a population. Now suppose that the population is rather small. In that case, small fluctuations due to sampling will obviously influence the evolution of genes. This is known as genetic drift, and its study requires extensive computations. If you want to know how genetic drift affects a population with mutation, selection, epistasis and/or linkage of multiple alleles, the computations become impractical or even impossible. But if you assume that the change in a gene frequency is very small during a short interval of time, you can treat your gene frequencies as if they were particles diffusing in a continuum of probability states. This is (in a very simplistic way) the principle underlying the diffusion method.
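The sampling fluctuations described above can be illustrated with a small simulation. The sketch below assumes the simplest possible setting, usually called the Wright-Fisher model (a name not used in this post): a fixed number of gene copies, non-overlapping generations, and no mutation or selection. The function name and parameters are hypothetical.

```python
import random


def wright_fisher(p0, n_copies, generations, seed=None):
    """Simulate neutral genetic drift.

    Each generation, n_copies gene copies are drawn at random (binomial
    sampling) according to the allele frequency of the previous generation.
    Returns the list of frequencies, starting with p0.
    """
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    # Same starting frequency, different population sizes.
    for n in (20, 2000):
        final = wright_fisher(0.5, n, generations=50, seed=1)[-1]
        print(f"{n} gene copies: frequency after 50 generations = {final:.3f}")
```

In small populations the frequency wanders far from its starting value and may reach 0 or 1, whereas in large populations it changes by small amounts each generation, which is the regime where the diffusion approximation is meant to apply.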

Fisher’s diffusion approach

Ronald Fisher was the first to think that the probability of a given gene frequency could be modelled in a continuous space (as I indicated above). He borrowed the heat diffusion equation from thermodynamics and adapted it to genetics. This is, as far as we know, the first use of diffusion in population genetics. However, a diffusion process is an approximation, and Fisher’s approximation wasn’t accurate enough. Indeed, Sewall Wright noticed some discrepancies with Fisher’s diffusion approach. After this, Fisher found an error and was just in time to correct the equation right before the publication of his 1930 book. Fisher thanked Wright and admitted:

“I have now fully convinced myself that your [Wright’s] solution is the right one” Letter from Fisher to Wright, quoted in Provine, 1986

However, Fisher’s ego was already quite damaged, and he soon stopped communicating with Wright. In later years, Fisher developed his diffusion method a bit further, but mostly for his own amusement, and never brought it to the fore of population genetics. (One may speculate that he was still affected by his early mistake.)

(Un-) Artistic depiction of how Fisher interpreted gene frequency distributions as continuous diffusion processes.

Proper diffusion: Kolmogorov and Feller

In 1931, the prolific mathematician Andrey Kolmogorov published his celebrated probability diffusion equations (although under a different name). A few years later, William Feller fully explored the potential of Kolmogorov’s diffusion and coined the terms forward and backward equations to refer to the two most popular forms of these equations. As Feller noticed, Kolmogorov added an additional term to the standard heat diffusion equation. That term, precisely, is the part that Fisher missed in his first approximation and that he added later on.
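To make the comparison concrete, the two equations can be written side by side in the notation commonly used in population genetics textbooks. The symbols are assumptions, as they are not defined elsewhere in this post: \phi(x,t) is the probability density of gene frequency x at time t, D is a constant diffusion coefficient, and M(x) and V(x) are the mean and variance of the change in gene frequency per generation.

```latex
% Heat (pure diffusion) equation with a constant diffusion coefficient D
\dfrac{\partial \phi(x,t)}{\partial t} = D \dfrac{\partial^2 \phi(x,t)}{\partial x^2}

% Kolmogorov forward equation
\dfrac{\partial \phi(x,t)}{\partial t} = \dfrac{1}{2} \dfrac{\partial^2}{\partial x^2} \left[ V(x) \phi(x,t) \right] - \dfrac{\partial}{\partial x} \left[ M(x) \phi(x,t) \right]
```

The first-derivative term in M(x) has no counterpart in the plain heat equation, and the variance V(x) is allowed to depend on the gene frequency instead of being a constant.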

Kolmogorov quickly realized the potential of his own equations to describe evolutionary dynamics and published a paper about it (see comment in Feller 1951). Kolmogorov sent a reprint of his paper to Sewall Wright, who rapidly published a paper in PNAS using Kolmogorov’s forward equation to calculate the stationary distribution of gene frequencies. However, Wright himself preferred his integration method and his paper received little attention. Feller and Malécot showed later that Fisher’s diffusion, Wright’s integrals and the classical branching models all converge to the Kolmogorov forward equation. That is, they are mathematically equivalent. The path was ready for someone to fully exploit the potential of Kolmogorov’s equations in genetics.

Kimura enters the game

Motoo Kimura first read about diffusion processes in Wright’s 1945 paper and quickly started to develop these equations for his own purposes. Kimura’s first diffusion paper was indeed communicated to the National Academy of Sciences by Sewall Wright himself. But what really surprised the other theoreticians was his use of Kolmogorov’s backward equation to calculate the probability and time of fixation of new genes in a population. Kimura provided a new horizon to explore the evolution of finite-size populations at a time when computers were not powerful enough. However, he was aware of his limits as a mathematician, and invited others to join his particular crusade:

“I cannot escape from this limitation […] but I hope it will stimulate mathematicians to work in this fascinating field”. Kimura 1964
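As an illustration of the kind of result Kimura obtained from the backward equation, the sketch below evaluates the textbook form of his fixation probability. The formula is not given in this post, so it is stated here as an assumption: a diploid population of effective size N_e, additive (genic) selection with coefficient s, and an allele at initial frequency p, giving u(p) = (1 - e^{-4 N_e s p}) / (1 - e^{-4 N_e s}). The function name is hypothetical.

```python
import math


def fixation_probability(p, n_e, s):
    """Probability of ultimate fixation of an allele at initial frequency p.

    Assumes the textbook diffusion result for a diploid population of
    effective size n_e with additive (genic) selection coefficient s:
        u(p) = (1 - exp(-4*n_e*s*p)) / (1 - exp(-4*n_e*s))
    """
    if s == 0:
        return p  # neutral limit: fixation probability equals the frequency
    # expm1(x) = exp(x) - 1, which is more accurate than exp(x) - 1 for small x
    return math.expm1(-4.0 * n_e * s * p) / math.expm1(-4.0 * n_e * s)


if __name__ == "__main__":
    n_e = 1000
    new_mutant = 1.0 / (2 * n_e)  # a single new copy among 2*n_e gene copies
    for s in (0.0, 0.001, 0.01):
        print(f"s={s}: u={fixation_probability(new_mutant, n_e, s):.5f}")
```

For a new advantageous mutant in a large population the value is close to 2s, while for a neutral mutant it equals its initial frequency 1/(2N_e).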

The years that followed were dominated by the diffusion method, and ‘proper’ mathematicians joined the ‘diffusion crew’ (Karlin, Ewens and Watterson, to name but a few). The diffusion method gained in rigour and precision. Today the diffusion method has lost some ground to computational simulation. However, it is still at the core (and the heart) of theoretical population genetics.

So, who was first?

So far we could conclude that Fisher was the first to use diffusion to approximate stochastic processes, not only in genetics but in probability theory. However, as also noticed by Feller, Fisher’s original diffusion equation was first used in a probabilistic context by Albert Einstein in his classic paper about the Brownian motion of particles, almost 20 years before Fisher’s account! It may be that probability as a diffusion process was a popular topic among mathematicians in the early 20th century, and that Fisher was smart enough to adapt it to genetics before anyone else. Whether Fisher knew about Einstein’s approach or not, I have no idea.

For a historical discussion on the use of diffusion equations in genetics I recommend Kimura’s review on the topic and Felsenstein’s free textbook ‘Theoretical Evolutionary Genetics’. For a technical account, Warren Ewens’ classic text is often recommended, but I’ve found the recent manual by Otto and Day on mathematical models more useful. The latter is, in my opinion, the best textbook in mathematical biology I’ve read so far.

So, what is the take-home message? Whom do we have to thank for the diffusion method in population genetics? It’s hard to summarize the contributions of the different people involved. But if I have to write a single sentence I would conclude: Fisher was the first, Kimura did the best!

Is evolution driven by mutation?

June 18, 2013

This post is a review of Masatoshi Nei‘s book Mutation-Driven Evolution (Oxford University Press, 2013); 256 pages, price £55/$89.95/€66.88.

As an undergrad student I spent countless hours at the University’s library. The “evolution” section was small, yet it contained the most important volumes, so I used to pick a random book every now and then and read it. That’s how I stumbled upon Nei’s book ‘Molecular Evolutionary Genetics’, and as I read its last chapter my life changed for good: I wanted to be an evolutionary geneticist! Ever since, I have devoured any book or paper by Nei. In 2008 I was doing a post-doc with Sudhir Kumar (a former student of Nei’s). I remember once that Sudhir and I were talking about hypermutability and selection, and then Sudhir told me: “Nei is preparing a new book on mutation”. Since that day I’ve been patiently waiting until the book was ready. Now, five years after that conversation, the book is published. It was worth the wait.


Masatoshi Nei’s books have a special place in my library

Nei’s ideas on mutationism were already sketched in his population genetics book published in 1975, and they were fully presented in a book chapter in 1983 and in his 1987 book I mentioned at the beginning of this post. He builds on the neutral theory, arguing that mutation and drift are more important than selection in molecular evolution. In this sense he continues Motoo Kimura‘s tradition. But Nei emphasizes the role of mutation above all, following the lead of Thomas Morgan and Hermann Muller. Indeed, the contribution of Morgan to evolutionary biology has only recently been acknowledged (as far as I know) by Nei, whilst other evolutionists (probably influenced by the biased account by Ernst Mayr) have completely ignored it.

‘Mutation-Driven Evolution’ is written as a historical account of how our knowledge about evolution has changed in the last 100 years. The first three chapters review the development of early evolutionary theories, with a strong focus on population genetic models (one of Nei’s fields of expertise). Chapters 4 and 5 account for the advent of molecular data and how it demonstrated that evolution is mutation-dependent. In chapters 6 and 7, Nei links the evolution of genomes with the evolution of phenotypic characters and speciation, a frequently missed aspect in many molecular evolution texts. Chapters 8 and 9 cover the role of mutation in adaptation and evolution. A last chapter summarizes the whole book and can be read as a stand-alone piece of text.

The book touches every aspect of evolutionary biology, and Nei gives his view on the cis/trans gene regulation debate, evolution of sex, the emergence of eusociality, Ohno‘s duplication model and Ohta’s nearly neutral theory, among other topics. He clearly states that mutation often produces adaptation, and much of the adaptation we believed to be the product of natural selection is not adaptation at all. In his words: “(adaptation) represents a human perception of the living status of the organism”.

If I have to say something negative about the book, I could only mention that Nei’s style is… well, Nei’s style. Somewhat opinionated and very critical of his opponents. Sometimes one gets the feeling that he is using modern arguments and evidence to attack postulates made by others some decades ago. Some may think this is unfair. Others (as I do) will understand that a point has to be made clear, sound and bold, and Nei has no problem in doing so.

The best way to describe what you’ll find in this book is to reproduce here the last couple of sentences:

“[…] mutation is the ultimate source of all biological innovations and the enormous amount of biodiversity in this world. In this view of evolution there is no need of considering teleological elements.”

Quod erat demonstrandum.

Chromosomes and reduplication: how genetic linkage was discovered

May 15, 2013

We just accept that genetic linkage exists. I mean, it looks obvious that genes are linked to other genes because of the chromosomes. Until very recently I assumed that the concept of genetic linkage naturally emerged from Mendel’s experiments, and that perhaps Mendel himself already suggested that linkage may happen. Alas, how wrong I was!

I recently came across a concept I had never heard of before: reduplication. Looking for some more information about it, I found that ‘reduplication’ and ‘linkage’ were competing interpretations to explain the coupling and repulsion of alleles. These two ‘schools’ were championed by two of the most important geneticists at the time: William Bateson and Thomas Morgan. Here I briefly discuss the origin of the controversy and its development, and how the modern concept of ‘genetic linkage’ emerged.

Bateson and Punnett discover the coupling and repulsion of alleles

When Mendel’s laws were rediscovered at the beginning of the 20th century, many scientists began to test their favourite organisms to see whether they also followed a Mendelian pattern of heredity. Among them, the embryologist William Bateson initiated a very successful research program, convincing a full generation of scientists that Mendelism was true. As a matter of fact, it was Bateson who first translated Mendel’s original paper into English. Together with Reginald Punnett (another early defender of Mendelism) he discovered a strange phenomenon that escaped Mendel’s attention: the coupling and repulsion of characters. What they found is that, when they crossed sweet pea plants with different characters, some of the characters always appeared together. For instance, in crossing plants with red flowers and round pollen with plants with blue flowers and long pollen grains, they found that, contrary to Mendelian expectations, ‘blue’ and ‘long’ were always present together. They called this ‘coupling’. When a pair of characters (alleles) never appeared together, they said these alleles were in ‘repulsion’.

William Bateson

Reginald Punnett

Thomas Morgan

The idea that genes may be encoded in the chromosomes had already been suggested by Walter Sutton and Theodor Boveri. Based on this theory, some scientists (including Hugo De Vries and Walter Sutton himself) had already predicted the existence of genetic linkage. The findings of Bateson and Punnett could have confirmed genetic linkage, and apparently they were very close:

“…there must be an order of precedence among factors composing such system, and the suggestion is plausible that this order will follow the grade of coupling in which the factors are accustomed to be linked.” Bateson and Punnett (1911)

They came up, however, with a completely different interpretation.

Reduplication versus linkage: the Bateson-Punnett-Morgan debate

When Thomas Morgan started to work with Drosophila, he didn’t believe that chromosomes carried the genetic information. However, after he discovered the first Drosophila mutant, white eyes (flies generally have red eyes), he changed his mind. The white mutant allele was linked to sex, in particular to the X chromosome. Inspired by Theodor Boveri’s hypothesis, Morgan started to believe that chromosomes were, indeed, the carriers of the genetic information. (Ironically, as Mayr pointed out, his last paper criticising the chromosome theory was published after his paper describing the white mutant, due to a delay in the processing of the former.) Morgan (and his people) soon realized that chromosomal inheritance implied linkage between genes.

In the meantime, Bateson and Punnett proposed a mechanistic explanation of the observed coupling and repulsion of characters: reduplication. By this mechanism, the cellular division giving rise to gametes would be asymmetrical. For instance, if two gametes of genotypes AB and ab form a new zygote (see Figure 1, left), the new individual will produce new gametes by cellular division. But because of the asymmetric cellular division, there will be more gametes AB and ab than Ab and aB. Hence, A and B are coupled alleles. In the very same paper Bateson and Punnett proposed to abandon the use of the terms ‘coupling’ and ‘repulsion’ and adopt reduplication instead.

Figure 1. Reduplication and Linkage models to explain the coupling and repulsion of gametes.
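Under the linkage model, the excess of AB and ab gametes follows directly from the recombination fraction between the two loci. The sketch below is a minimal illustration, assuming a double heterozygote AB/ab and a recombination fraction r between 0 and 0.5; the function name and the example values are hypothetical.

```python
def gamete_frequencies(r):
    """Expected gamete frequencies from a double heterozygote AB/ab
    when the recombination fraction between the two loci is r."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    parental = (1.0 - r) / 2.0     # AB and ab: no recombination
    recombinant = r / 2.0          # Ab and aB: recombination between loci
    return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}


if __name__ == "__main__":
    print(gamete_frequencies(0.5))    # unlinked loci: all four gametes equally frequent
    print(gamete_frequencies(0.125))  # linked loci: AB and ab are 7 times more frequent
```

With r=0.5 the four gametes appear in the 1:1:1:1 proportions expected from independent Mendelian assortment, while with r=1/8 the proportions become 7:1:1:7, that is, coupling without any need for asymmetric cell divisions.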

Morgan soon complained, and in a paper in Science he said that Bateson’s “results are a simple mechanical result of the location of the materials in the chromosomes.” The battle for linkage had begun!

At first Morgan was deliberately ignored. Punnett avoided citing Morgan’s findings by limiting his research to characters in which “sex-limited inheritance is not involved”. Later on Bailey used this as an argument to attack Morgan: “their results, however, are complicated by the phenomena of sex limitation, and by differential death rate”.

But Morgan and his students had already collected dozens of Drosophila mutants and, measuring crossing-over rates, they mapped the genes onto the chromosomes. The linkage model seemed to be correct (Figure 1, right). The masterpiece “The Mechanism of Mendelian Heredity” summarized their findings. Interestingly, the frontispiece of the book is actually a chromosomal map of genes. I think that was a (very successful) provocation.

Bateson finally accepted that Morgan may be right, but he warned:

“promising though it is, must be tried by tests on a scale far wider than experience of Drosophila provides before we are able to assess its value with confidence”. Bateson (1919)

The battle for linkage was over, but a new one was about to begin.

Three-dimensional genetic linkage?

What if genes were organized in a three-dimensional manner rather than in a linear fashion? You would say that this is crazy, but that was William Castle‘s interpretation of the observed linkage between characters. Castle observed that the fraction of recombinants did not always support a linear disposition of genes, particularly for long chromosomal distances. Morgan and his people argued that double crossing over might be creating this artifact. Castle advocated a 3D disposition of the genes.

Alfred Sturtevant got particularly upset, as his main project was to generate a comprehensive (and linear) linkage map of the Drosophila genes. Several papers from Morgan, Sturtevant and colleagues insisted on the importance of double crossing over in the interpretation of recombinants. Sturtevant’s and Calvin Bridges’ maps were, however, based on the analysis of double mutants. It took the fine analysis of multiple mutants by Hermann Muller to show that double and triple crossing over was, indeed, the reason why recombination distances were not fully additive.
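The effect of double crossing over on additivity can be illustrated with a short calculation. The sketch below assumes three genes in the linear order A, B, C and crossovers that occur independently in the two intervals (no interference), which is a simplification and not necessarily what happens in real chromosomes. The function name and the example values are hypothetical.

```python
def combined_recombination(r_ab, r_bc):
    """Recombination fraction between A and C, with B lying in between.

    Assumes crossovers in the A-B and B-C intervals are independent.
    A and C end up recombined only when exactly one of the two intervals
    is recombinant; a double crossover restores the parental combination.
    """
    return r_ab * (1.0 - r_bc) + r_bc * (1.0 - r_ab)


if __name__ == "__main__":
    for r in (0.01, 0.1, 0.3):
        observed = combined_recombination(r, r)
        print(f"r_AB = r_BC = {r}: sum = {2 * r:.3f}, expected r_AC = {observed:.3f}")
```

For short distances the recombination fractions are almost additive, but for long distances the fraction between the outer genes falls clearly short of the sum, which is the kind of discrepancy Castle observed.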

By 1920 Castle acknowledged that double crossing over may have an impact on distances and finally accepted the linear hypothesis proposed by Morgan. Some additional discussion about the Morgan-Castle debate can be found here.

The acceptance of genetic linkage

During the 1920s the chromosomal basis of heredity and a linear disposition of the genes were well established. Even Punnett accepted it in a paper published in 1923. But how can it be that Bateson and Punnett didn’t see that genes were linearly linked? They were pretty close indeed. Punnett had an answer for that. He wrote decades after the debate was over:

“I have sometimes been asked how it was that having got so far we managed to miss the tie-up of linkage phenomena with the chromosomes. The answer is Boveri. We were deeply impressed by his paper “On the Individuality of the Chromosomes” and felt that any tampering with them by way of breakage and recombination was forbidden. For to break the chromosome would be to break the rules. So it was left for Morgan and his colleagues to make use of Janssen’s observations and by their brilliant work to link up genetics and cytology, thereby opening up a new era in these studies.” Punnett (1950)

It is amazing that the very same scientist, Theodor Boveri, inspired both Bateson’s and Morgan’s schools of thought, yet they reached opposite conclusions. After all, the generation of hypotheses is, in a manner, a matter of interpretation. Thankfully, hard work and communication are a successful way of validating hypotheses in science, although it may take some fighting along the way.