Using phylogenetic comparative methods warrants a basic understanding of the history and progress of this field.  Working with some of the more recent tools for comparative evolutionary biology, I feel compelled to find out how current methods were devised, whom to credit for the methods I use, and what assumptions I am making by using them.  Below is a list of some of the landmark papers in comparative methods, with comments and synopses (written by me and Tomomi).

Felsenstein (1981) describes the basics of building a maximum likelihood tree from a set of nucleotide sequences. One step elaborated from his 1973 paper is Felsenstein’s pruning algorithm for calculating the likelihood of a phylogenetic tree given branch lengths and tip values.  This algorithm makes likelihood calculations more computationally efficient by eliminating redundant calculations: the partial likelihood of each subtree is computed once and then reused.  The paper also describes the Markov model of nucleotide substitution on which the likelihood is based, along with a procedure for searching for the maximum likelihood tree.  In Felsenstein’s substitution model, each nucleotide has its own stationary frequency (A, C, G, and T are not expected to be equally represented at any given site in a DNA sequence).  Methods for searching tree space have since improved, and Bayesian methods have permeated phylogenetic analyses, but the pruning algorithm remains an important subroutine in phylogenetic computations.
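
To make the pruning idea concrete, here is a minimal sketch in Python for a single site. The tree encoding (a tip is a base letter, an internal node is a list of `(child, branch_length)` pairs), the function names, and the example frequencies are my own assumptions for illustration, not anything taken from the paper. The transition probabilities follow the unequal-frequency model described above, now commonly called F81.

```python
import math

BASES = "ACGT"

def f81_prob(i, j, t, pi):
    """Probability of going from base i to base j along a branch of length t."""
    # beta rescales time so that t is in expected substitutions per site
    beta = 1.0 / (1.0 - sum(p * p for p in pi))
    decay = math.exp(-beta * t)
    return decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j]

def partial_likelihoods(node, pi):
    """Return L[s] = P(tip data below node | node is in state s)."""
    if isinstance(node, str):  # tip: the observed base has likelihood 1
        return [1.0 if base == node else 0.0 for base in BASES]
    result = [1.0] * 4
    for child, t in node:  # internal node: combine each child's partials
        child_l = partial_likelihoods(child, pi)  # computed once per subtree
        for s in range(4):
            result[s] *= sum(f81_prob(s, x, t, pi) * child_l[x] for x in range(4))
    return result

def site_likelihood(tree, pi):
    """Likelihood of one site: root partials weighted by stationary frequencies."""
    root = partial_likelihoods(tree, pi)
    return sum(pi[s] * root[s] for s in range(4))

# Hypothetical example: ((A:0.1, A:0.1):0.2, G:0.3) with made-up frequencies
pi = [0.1, 0.2, 0.3, 0.4]  # stationary frequencies of A, C, G, T
tree = [([("A", 0.1), ("A", 0.1)], 0.2), ("G", 0.3)]
print(site_likelihood(tree, pi))
```

The recursion visits each node once, so the cost grows linearly with the number of tips rather than with the number of possible ancestral state combinations. A full analysis would multiply (or sum the logs of) these per-site likelihoods across the alignment.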

Possibly the most cited paper in phylogenetic comparative methods, Felsenstein (1985) describes with clear examples why species trait values may not be statistically independent and what might be done to compensate.  Felsenstein elaborates on his method of calculating standardized contrasts (phylogenetically independent contrasts) to help overcome the non-independence of character traits.  These contrasts are essentially the differences between the trait values of pairs of species (or reconstructed ancestors), scaled by the expected amount of evolutionary change separating them (under Brownian motion, the square root of the summed branch lengths); they are estimates of the rate of change over time.  A common use of standardized contrasts is to look for correlation in this rate between two traits: if the standardized contrasts of traits X and Y are compared in a regression analysis, a linear trend suggests correlated rates of evolution between the traits.
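
A minimal sketch of the contrast calculation in Python may help here. The tree encoding (a tip is a number, an internal node is a pair of `(child, branch_length)` tuples), the helper names, and the trait values are hypothetical choices of mine. The sketch assumes a fully bifurcating tree with positive branch lengths and Brownian motion evolution.

```python
import math

def contrasts(node, out):
    """Append standardized contrasts below node to out.

    Returns (estimated node value, extra length to add to the branch above).
    """
    if isinstance(node, (int, float)):  # tip: observed value, no extra length
        return float(node), 0.0
    (left, v_l), (right, v_r) = node
    x_l, extra_l = contrasts(left, out)
    x_r, extra_r = contrasts(right, out)
    v_l += extra_l  # lengthen branches to reflect uncertainty in node values
    v_r += extra_r
    out.append((x_l - x_r) / math.sqrt(v_l + v_r))  # standardized contrast
    # weighted average of the two descendants, weights 1 / branch length
    x_anc = (x_l / v_l + x_r / v_r) / (1.0 / v_l + 1.0 / v_r)
    return x_anc, v_l * v_r / (v_l + v_r)

def contrast_slope(tree_x, tree_y):
    """Slope of the regression through the origin of Y contrasts on X contrasts."""
    cx, cy = [], []
    contrasts(tree_x, cx)
    contrasts(tree_y, cy)
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)

def make_tree(a, b, c):
    """Hypothetical three-species tree ((A:1, B:1):1, C:2) with given tip values."""
    return ((((a, 1.0), (b, 1.0)), 1.0), (c, 2.0))

x_tree = make_tree(1.0, 3.0, 8.0)
y_tree = make_tree(2.5, 5.0, 17.0)
print(contrast_slope(x_tree, y_tree))
```

The regression is forced through the origin because the sign of each contrast is arbitrary: swapping the two species in a pair flips the sign of both the X and the Y contrast.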

Schluter et al. (1997) discuss the need for error estimates on ancestral state reconstructions. The paper introduces maximum likelihood ancestral state reconstructions of both discrete and continuous characters.   Responding to Schluter et al.’s call to account for error in tree construction, Huelsenbeck et al. (2003) describe a Bayesian method for mapping the change in character states onto a phylogeny.  The introduction reiterates the importance of having an alternative to parsimony methods when tackling character change; as with maximum likelihood, the new methods allow for more than one change along a given branch in the tree.  While the Huelsenbeck et al. paper is a landmark for evolutionary analysis, it also contains a very coherent introduction to the instantaneous rate matrix, substitution model, and likelihood calculations for finding the probabilities of evolutionary histories.
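
As a small illustration of the instantaneous rate matrix idea, the sketch below turns rates into branch transition probabilities for a two-state character. It is not the stochastic mapping method itself. The two-state restriction, the function name, and the rate values are assumptions I chose because this case has a simple closed form.

```python
import math

def two_state_probs(q01, q10, t):
    """Transition probability matrix P(t) for a two-state Markov model.

    q01 is the instantaneous rate of 0 -> 1 changes, q10 the rate of 1 -> 0.
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

# Hypothetical rates and branch length
P = two_state_probs(0.3, 0.1, 2.0)
print(P)

# Probability of starting and ending in state 0, versus never leaving state 0
stay_by_any_path = P[0][0]
stay_without_change = math.exp(-0.3 * 2.0)
print(stay_by_any_path, stay_without_change)
```

The first probability is larger than the second because it also counts histories that leave state 0 and return along the same branch. These are the multiple changes that parsimony cannot accommodate.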

Although Brownian motion is often used to model quantitative character evolution, the Ornstein-Uhlenbeck (OU) process can also be used to develop informative evolutionary models.  OU models incorporate selection as a pull toward a selective optimum, or adaptive peak.  The two processes are related: OU collapses to Brownian motion in the absence of selection.  Butler and King (2004) use OU to test which of several evolutionary models best fits several example data sets involving anoles.  They use likelihood ratio tests to determine how various OU-based models perform against Brownian motion, observing that biological information is important in deciding which models to consider.  They also stress that stasis, although it is positive support for stabilizing selection, is often disregarded and can lead to underestimation of evolutionary drift.  However, although they state that Brownian motion is a pure drift process, Brownian motion can also provide a good fit to selective schemes under fluctuating directional selection (O’Meara et al. 2006).
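
The relationship between the two processes can be seen in the expected value and variance of a trait after time t. The sketch below uses the usual OU parametrization, with alpha as the strength of selection, theta as the optimum, and sigma2 as the Brownian rate. The function names and example numbers are my own, and this is only a single-lineage illustration, not the model-fitting procedure from the paper.

```python
import math

def ou_mean(x0, theta, alpha, t):
    """Expected trait value after time t, starting from x0."""
    return theta + (x0 - theta) * math.exp(-alpha * t)

def ou_variance(sigma2, alpha, t):
    """Trait variance after time t; alpha = 0 is plain Brownian motion."""
    if alpha == 0:
        return sigma2 * t
    # -expm1(-x) is a numerically stable way to write 1 - exp(-x)
    return sigma2 / (2.0 * alpha) * -math.expm1(-2.0 * alpha * t)

# Hypothetical values: variance over time with and without selection
for t in (1.0, 10.0, 100.0):
    print(t, ou_variance(0.5, 0.0, t), ou_variance(0.5, 2.0, t))
```

Under Brownian motion the variance grows without bound, whereas under OU it levels off at sigma2 / (2 * alpha) and the mean approaches theta. As alpha shrinks toward zero the OU variance approaches sigma2 * t, which is the collapse to Brownian motion mentioned above.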

While Grafen (1989) first described a generalized least squares model for phylogenetic regressions, Chris O. from our lab pointed out the appendix of a paper on mammal intestines (Lavin et al. 2008) as a nice summary of PGLS.  This appendix describes both the history and the methods of phylogenetic regression analysis.  PGLS can be performed not only under a Brownian motion model, but also under OU and several other transformations of the variance-covariance matrix used in the mixed model approach.
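
For readers who want to see the mechanics, here is a minimal sketch of a PGLS fit with one predictor, written in plain Python. It assumes Brownian motion, so that the covariance between two species is the branch length they share from the root. The function names, the small linear solver, the example tree, and the data are all hypothetical, and real analyses would use an established package instead.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pgls(x, y, C):
    """Return (intercept, slope) of y ~ x under covariance matrix C."""
    ones = [1.0] * len(x)
    ci_ones = solve(C, ones)  # C^-1 times the intercept column
    ci_x = solve(C, x)        # C^-1 times the predictor column
    # Normal equations: (X' C^-1 X) beta = X' C^-1 y
    lhs = [[dot(ones, ci_ones), dot(ones, ci_x)],
           [dot(x, ci_ones), dot(x, ci_x)]]
    rhs = [dot(ci_ones, y), dot(ci_x, y)]
    intercept, slope = solve(lhs, rhs)
    return intercept, slope

# Hypothetical tree ((A:1, B:1):1, C:2): A and B share one unit of history
C = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 2.0]]
x = [1.0, 3.0, 8.0]
y = [2.5, 5.0, 17.0]
print(pgls(x, y, C))
```

Replacing C with the identity matrix gives ordinary least squares, which treats the species as independent. Fitting under OU or another transformation amounts to changing how C is built from the tree.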

These are just a handful of important papers; feel free to add to the list via comments with other references and their contributions.