Estimating selection: an explanation and example

The data for this analysis are those of Hermon Bumpus (1899). After a severe snowstorm, 136 English sparrows were brought to the Anatomical Laboratory at Brown University, where Bumpus measured a number of traits on the sparrows that died and the sparrows that survived. We use the Bumpus data here because they are freely available and because they have been analyzed as an example of different ways to calculate selection more times than any other data set, so our results are directly comparable to those obtained by others.

You can download the raw Bumpus data as an Excel file.

Here we use Excel as well as SAS and SPSS to calculate standard measures of selection using the Bumpus data. We provide SAS code and SPSS commands for all of the following analyses to accommodate users of both statistical packages. We start with the most critical analysis (selection gradients) and then move to other considerations.

Part 1. Data preparation

**Step 1**. Calculate the cube root of weight so that its dimensions match those of the linear measurements. Not doing so can have large effects on significance levels and coefficients when traits are subsequently standardized by subtracting the mean and dividing by the standard deviation (at least this seemed to be the case with a different data set). For the time being, it is best to use the cube root transformation: it gives the same values as the ln transformation, presumably because ln of the cube root is just (1/3) × ln weight, a linear rescaling that standardization removes.
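The equivalence between the cube-root and ln transformations after standardization can be checked numerically. A minimal sketch with hypothetical weight values (not the Bumpus measurements):

```python
import numpy as np

# Hypothetical weights in grams; any positive values work.
weight = np.array([24.5, 26.1, 25.3, 27.0, 24.9, 26.4])

ln_w = np.log(weight)                # ln of raw weight
ln_cbrt_w = np.log(np.cbrt(weight))  # ln of the cube root = (1/3) * ln(weight)

# Standardize each to mean 0, SD 1.
z1 = (ln_w - ln_w.mean()) / ln_w.std()
z2 = (ln_cbrt_w - ln_cbrt_w.mean()) / ln_cbrt_w.std()

print(np.allclose(z1, z2))  # True: the constant 1/3 cancels in standardization
```

Because ln(w^(1/3)) is just (1/3)·ln(w), the two variables differ only by a constant factor, which standardization removes.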

**Step 2**. Two options are available. We use ‘a’ in the present description, but ‘b’ is a better option for several reasons that will become apparent later; for our purposes here, however, ‘a’ is fine. a) Use natural logarithms to transform all linear measurements and the cube root of weight. Note that log10 transformations yield values identical to natural logarithms for all of the analyses. b) If analyzing non-linear as well as linear gradients, standardize each trait to a mean of 0 and an SD of 1. The standardized values are computed from the raw data, except that weight should be replaced by its cube root. Also, because these gradients are already standardized, they need not be multiplied by the SD to attain standardized gradients. We calculated the linear selection gradients (Part 2) using both transformations, ‘a’ and ‘b’, and they yield similar but not identical results.

**Step 3**. Determine the means and standard deviations for all individuals (those that lived and those that died) for each sex. This facilitates the later standardization of differentials and gradients into standard deviation units. The following table shows the results of this step for the Bumpus data.

**Step 4**. Calculate relative fitness for each individual (see comments in Lande and Arnold 1983 and Janzen and Stern 1998). Relative fitness (w) is calculated as w = cW, where W is the absolute fitness of each individual (0 for those that died, 1 for those that lived) and c is the inverse of the mean absolute fitness in the population. To calculate c for each data set (i.e., males and females separately), divide the total number of individuals by the number that survived (87/51 = 1.71 for males and 49/21 = 2.33 for females). Thus, the relative fitness of each male that lived becomes 1.71 and of each female that lived becomes 2.33. Note that relative fitness averages 1.0 for both sexes.
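The relative-fitness calculation for males can be sketched as follows; the only inputs are the counts quoted above (87 males, 51 survivors):

```python
import numpy as np

# Absolute fitness W: 1 = survived, 0 = died.
# Counts match the male Bumpus sample: 87 birds, 51 survivors.
W = np.array([1] * 51 + [0] * 36)

c = 1 / W.mean()             # total / survivors = 87/51, about 1.71
w = c * W                    # relative fitness per individual

print(round(c, 2))           # 1.71
print(round(w.mean(), 10))   # 1.0: relative fitness averages 1 by construction
```

The same two lines with 49 birds and 21 survivors give c = 2.33 for females.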

Part 2. Estimating linear selection gradients using multiple linear regression

Directional selection gradients measure the effect of a trait on relative fitness when the effects of all other measured traits are held constant. A directional selection gradient is therefore the partial regression coefficient of relative fitness on the trait. These gradients represent the direct effects of selection on a trait (the total change will also include indirect effects owing to the trait's correlations with other traits), and they are what is required in the equations of multivariate evolution (Δz = Gβ, where Δz is the change in mean phenotype over one generation, G is the matrix of additive genetic variances and covariances, and β is the vector of selection gradients).

To calculate directional selection gradients, regress relative fitness on all of the ln-transformed traits in a single multiple linear regression for each sex. Output from these analyses in SPSS follows:

Note that the R^{2} values in these regressions (0.46 for males, 0.21 for females) are essentially the same as those reported in Lande and Arnold (1983, p. 1222). The directional selection gradients are given in the "unstandardized coefficients, B" column. Note also that the significance of each coefficient, given in the "Sig." column, corresponds to the values reported in Janzen and Stern (1998, Tables 1 and 2).

To calculate standardized selection gradients, the unstandardized gradients must be *multiplied* by the phenotypic standard deviation for the trait (computed from all individuals, those that lived and those that died). These SD values appear in the table from Part 1, step 3.

In the following table the standardized gradients calculated as above are given in the b' columns. Note that they correspond to the values given by Lande and Arnold (1983) and those in the first column of Tables 1 and 2 in Janzen and Stern (1998).

Standard errors for each standardized gradient are calculated by multiplying the standard error of each unstandardized gradient (b) by the phenotypic SD for the trait (just as was done for the gradient estimates). When we did this, the values corresponded exactly to those given by Janzen and Stern (1998).
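The whole Part 2 recipe (multiple regression, then multiply by the SD) can be sketched with fabricated stand-in data; the trait values, sample size, and true gradients below are invented for illustration, not the Bumpus measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated stand-in data: 3 ln-transformed traits for 87 individuals,
# with relative fitness built from known gradients plus noise.
n = 87
X = rng.normal(size=(n, 3))
w = 1.0 + X @ np.array([0.5, -0.3, 0.1]) + rng.normal(scale=0.1, size=n)

# Multiple linear regression of relative fitness on the traits:
# add an intercept column and solve by least squares.
A = np.column_stack([np.ones(n), X])
coefs, *_ = np.linalg.lstsq(A, w, rcond=None)
b = coefs[1:]                       # unstandardized selection gradients

# Standardize: multiply each gradient by the trait's phenotypic SD
# (computed from ALL individuals, survivors and non-survivors alike).
sd = X.std(axis=0, ddof=1)
b_std = b * sd
print(b_std)
```

Standard errors scale the same way: multiply each coefficient's SE by the same SD.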

Part 3. Estimating selection gradients using logistic regression

When using dichotomous data, such as survival, it is better to use logistic regression than linear regression for a variety of reasons (Janzen and Stern 1998). Here I show how to calculate logistic selection gradients and then convert these to linear selection gradients that can be used in equations of evolutionary change.

The first step is to perform a multiple logistic regression of survival (the 0/1 outcome) on the ln-transformed traits. At the same time, it is wise to save the "predicted" values from the regression; these are used later for converting to linear gradients. Printed below are the SPSS outputs for the logistic regression analyses of males and females. Note that the significance levels for each gradient are identical to those in Janzen and Stern.

The values given in the "B" column for "Variables in the Equation" are the logistic selection gradients (‘a’ in Janzen and Stern and in the table below). To convert these to standardized gradients, we must now multiply them by the SD for each trait (as we did for the multiple linear regression). As you will note, the standardized logistic gradients (a') below are identical to those in Janzen and Stern. Standard errors can be calculated in a similar fashion (multiply unstandardized SE by SD).

Next, we need to use the predicted fitness of each individual from the multiple logistic regression (saved as noted above) to calculate the average of w*(1-w), where w is the predicted fitness of each individual. This measure is denoted "c" in Janzen and Stern. Each standardized logistic selection gradient (a') is then multiplied by c to estimate the corresponding linear selection gradient.

Note that the saved predicted values are probabilities between 0 and 1. The value of c calculated from them therefore needs to be converted back to relative fitness. This can be done by multiplying c (0.1206 for males, 0.1927 for females - see table below) by the inverse of average fitness. In this case, average fitness from the predicted values for males is 0.58621 and females is 0.42857 (see table below). Thus, c for males should be multiplied by 1.705873 (1/.58621) and for females should be multiplied by 2.333341 (1/.42857).

Multiplying each standardized logistic regression gradient by this corrected c yields values identical to those in the last column of Janzen and Stern. They are very similar but not identical to the selection gradients calculated using multiple linear regression.
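The conversion arithmetic can be written out directly. The two averages below are the male values quoted above; the standardized logistic gradient is a hypothetical placeholder, since the SPSS coefficient tables are not reproduced here:

```python
# Converting a standardized logistic gradient (a') to a linear gradient.
mean_p_times_1_minus_p = 0.1206  # average of p*(1-p) over the saved predictions
mean_p = 0.58621                 # average predicted survival probability (males)

# Rescale from probability units back to relative-fitness units.
c = mean_p_times_1_minus_p * (1 / mean_p)
print(round(c, 4))               # 0.2057

a_std = 0.5                      # hypothetical standardized logistic gradient a'
beta = a_std * c                 # approximate linear selection gradient
```

For females the same two lines would use 0.1927 and 0.42857.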

Part 4. Calculating linear selection differentials

Linear selection differentials are interpreted as the total change in mean phenotype within a generation, and are calculated as the difference in trait means before and after selection (or, equivalently, as the covariance between the trait and relative fitness; Brodie et al. 1995). The critical difference between differentials and gradients is that the former measure the total strength of selection on the trait (direct and indirect) whereas the latter measure only the strength of direct selection. Thus, gradients are the effect of a trait on relative fitness when all other traits are held constant, whereas differentials do not take other traits into account.

I used all of the sparrows (those that lived and those that died) as the before-selection sample, and only those that lived as the after-selection sample. The data for these calculations are shown below.

Standardized selection differentials are then calculated by *dividing* each differential by the SD for that trait. Note that this is the opposite of what is done to standardize selection gradients: a differential is measured in trait units, so dividing by the SD expresses it in standard deviation units, whereas an unstandardized gradient is a change in relative fitness per trait unit, so multiplying by the SD gives the change per standard deviation.
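The equivalence between the two definitions of the differential (difference of means, and covariance with relative fitness) is easy to verify numerically. A sketch with fabricated stand-in data, not the Bumpus measurements:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated stand-in data: one trait for 87 individuals, of which the
# 51 largest survive (an arbitrary rule, chosen only for the demo).
z = rng.normal(loc=160.0, scale=2.0, size=87)
cutoff = np.sort(z)[36]            # 87 - 51 = 36 non-survivors
survived = z >= cutoff

# Differential as a difference in means (after minus before selection).
s_means = z[survived].mean() - z.mean()

# Differential as the (population) covariance of trait and relative fitness.
w = survived / survived.mean()     # relative fitness, mean 1
s_cov = np.mean((z - z.mean()) * (w - w.mean()))

print(np.isclose(s_means, s_cov))  # True: the two definitions agree

# Standardized differential: DIVIDE by the phenotypic SD (unlike
# gradients, which are MULTIPLIED by it).
s_std = s_means / z.std(ddof=1)
```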

Part 5. Using linear selection gradients to estimate selection differentials

Now, I want to show that the linear selection gradients calculated earlier (Part 2) can be used to estimate the selection differentials. To do this we need to remember that b = P^{-1}s, where P is the phenotypic variance/covariance matrix. Thus, s = Pb. I therefore first calculated the variance/covariance matrix (using the correlation command in SPSS, with the covariance option selected).

I then multiplied that matrix by the vector of selection gradients for each sex. The values came out just as they should have (i.e., they were equal to the differentials calculated directly in Part 4).
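The matrix step can be sketched as follows; the P matrix and gradient vector here are hypothetical numbers, not the Bumpus estimates:

```python
import numpy as np

# Hypothetical 3-trait phenotypic variance/covariance matrix P
# (symmetric, positive definite) and a vector of selection gradients b.
P = np.array([[4.0, 1.2, 0.8],
              [1.2, 3.0, 0.5],
              [0.8, 0.5, 2.5]])
b = np.array([0.10, -0.05, 0.02])

# s = Pb gives the selection differentials implied by the gradients.
s = P @ b

# Round trip: recovering gradients from differentials requires inverting P
# (done here via solve, which is numerically preferable to an explicit inverse).
b_back = np.linalg.solve(P, s)
print(np.allclose(b, b_back))  # True
```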

It may also be possible to convert from differentials to gradients, but this requires inverting the matrices above, i.e., finding the matrix P^{-1} such that P^{-1}P = I, where I is the identity matrix (1s on the diagonal, 0s elsewhere). It can be done with programs that manipulate matrices. I do not think this is advisable for us, however, because we should be using logistic regression to determine significance values, and this back-calculation from selection differentials would not allow that.

Part 6. Non-linear selection gradients

Selection is obviously not always linear; sometimes it can be stabilizing or disruptive (Brodie et al. 1995). These effects can be considered by calculating non-linear selection gradients (Lande and Arnold 1983). These gradients indicate the curvature of the selection surface and are therefore a necessary condition for identifying these types of selection (Brodie et al. 1995). They are not sufficient, however, because a curved surface need not contain a peak (i.e., you may only be looking at one side of the curve).

Non-linear selection gradients are analyzed using multiple linear regression (as above), except that the model now uses data standardized to a mean of 0 and an SD of 1 (Lande and Arnold 1983; Blanckenhorn et al. 1999). Thus, I first did this for all of the traits (making new columns for the resulting variables); these have the prefix "std" in the data sheets. (Note that if the linear selection gradients are calculated using these std measurements, the results are identical to the standardized gradients calculated using the ln-transformed variables.)

I then tried a few things, none of which were entirely satisfactory because I could not really check my results against anything. The only published non-linear analysis of Bumpus' data is Lande and Arnold (1983), and they used principal components rather than individual traits because otherwise too many parameters would need to be estimated for the data available. They found evidence for stabilizing selection on body length in females (PC1). I did a similar PC analysis and found a similar but not identical result; however, there are too many intermediate steps to know why the discrepancy is present. I also tried the analysis for length and weight, by themselves and together. The gist of this was that there does appear to be some selection against extreme females (as was found by others using less sophisticated, and therefore more understandable, techniques; Grant 1972; Johnston et al. 1972). I therefore conclude that I have a handle on how to do this.

Some comments on these non-linear gradients. First, the crossproduct (bivariate) terms will contain some negative numbers but the squared (univariate) terms will not. I don't at the moment know whether this is appropriate or whether the crossproducts should all be made positive. Second, some authors who discuss coefficients of non-linear selection use equations in which the univariate regression coefficient equals 0.5 times the gradient (Brodie and Janzen 1996; Blanckenhorn et al. 1999). Does this mean the coefficient from a univariate non-linear regression should be multiplied by two to get the non-linear gradient? Comparison of my PC values to those in Lande and Arnold (1983) suggests that this factor-of-2 correction is appropriate.
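The factor-of-2 question can be explored with a synthetic example. If fitness is generated as w = 1 + βz + (γ/2)z² (the Lande-Arnold quadratic form), the z² regression coefficient estimates γ/2, so doubling it recovers γ. All values below are invented for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic example: a standardized trait z, and fitness generated
# with a known non-linear (quadratic) gradient gamma.
n = 500
z = rng.normal(size=n)
beta, gamma = 0.2, -0.3
w = 1.0 + beta * z + 0.5 * gamma * z**2 + rng.normal(scale=0.05, size=n)

# Regress fitness on z and z^2; the z^2 coefficient estimates gamma/2.
A = np.column_stack([np.ones(n), z, z**2])
coefs, *_ = np.linalg.lstsq(A, w, rcond=None)
gamma_hat = 2 * coefs[2]   # the factor-of-2 correction

print(gamma_hat)           # close to -0.3, up to sampling noise
```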

Part 7. Capture-recapture data

Kingsolver and Smith (1995) point out that mark-recapture data are not well suited to traditional regression analyses because those methods do not take into account the fact that some individuals may be missing rather than dead. They suggest a maximum likelihood approach using capture-recapture models. Unfortunately, I have no idea how to do this and I don't think it will allow all the inferences we need. We should certainly look at it in more detail, however.

Part 8. Visualizing selection surfaces

Cubic spline programs are available from Dolph Schluter's website.

Results from cubic spline analysis of Bumpus' data.