Two hundred and thirty blood samples were collected from 11 sampling locations over four breeding seasons (2007-2010; Figure 1, Supplementary Table S1). Sampling locations, hereafter referred to as populations, were limited to a 50-km radius, where possible, with no obvious barriers to dispersal within a sampling site. Fifty-three museum samples (tissue and toe-pads) augmented sample sizes from field sites and added an additional three sampling locations (Figure 1, Supplementary Tables S1 and S2). DNA was extracted using a modified chelex procedure (Walsh et al., 1991).
Two mtDNA fragments were amplified: the control region partial domain I and II (CR) and the ATPase coding region (ATP), which contains ATPase6, ATPase 8 and the tRNA lysine. The control region was amplified using the primers LmochCR1/H1015chCR (Lait et al., 2012). Some museum samples (toe-pads) required a semi-nested PCR using the primers L26chCR/H1015chCR, followed by LmochCR2/H1015chCR (Supplementary Table S3). The ATP gene fragment was amplified using universal avian primers L8929COII/H9855ATP6 (Sorenson et al., 1999) and toe-pad samples were amplified in two shorter, overlapping fragments using H534chATP/L8929COII and L298chATP/H9855ATP6 (Supplementary Table S3). Samples were sequenced on a 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA), and museum samples were sent to Genome Quebec for sequencing (McGill University, Montreal, QC, Canada).
A small number of individuals from geographically distant populations (AKA and NL) were screened using 20 microsatellite primer pairs developed in a number of other passerines with a M13-tailed forward primer (see Burg et al., 2005). Of those that yielded PCR products, five loci were monomorphic (BT6, Escu1, Escu3, Pocc2 and Titgata88; Hanotte et al., 1994; Bensch et al., 1997; Otter et al., 2001; Wang et al., 2005), and eight were polymorphic (Escu4, Escu6, Pat14, Pat43, Pdo5, Ppi2, Titgata02 and Titgata39; Hanotte et al., 1994; Otter et al., 1998; Griffith et al., 1999; Martinez et al., 1999; Wang et al., 2005; Supplementary Table S3). The PCR used a two-step annealing process with eight cycles at T and 31 cycles at T. For four of the loci (Escu6, Pat14, Ppi2 and Titgata39), the second step was reduced from 31 to 25 cycles (Supplementary Table S3). The PCR products were run on a 6% polyacrylamide gel using a LI-COR 4300 DNA Analyzer (LI-COR Inc., Lincoln, NE, USA). Alleles were scored manually by visual inspection and were independently confirmed by a second person. Four controls of known size were included on each gel, and a 'mystery plate' of 44 individuals was run as an additional test for consistency.
All analyses were done on a concatenated sequence of the CR and ATP fragments and on the two fragments separately; the results were similar and only the concatenated results are discussed further. The sequences were aligned in MEGA v5.05 (Tamura et al., 2011), haplotypes were assigned manually and confirmed using TCS v1.21 (Clement et al., 2000). Nucleotide and haplotype diversity were calculated in DNASP v5 (Librado and Rozas, 2009). Pairwise genetic differences (p distances; Φ; 10 000 permutations) were calculated in ARLEQUIN v3.11 (Excoffier et al., 2005), and a modified false discovery rate (FDR) correction (Benjamini and Yekutieli, 2001) was applied. A Mantel's test was performed in GENEPOP v4.0.10 to test for isolation-by-distance (Raymond and Rousset, 1995; Rousset, 2008) using geographical distances calculated in GEOGRAPHIC DISTANCE MATRIX GENERATOR v1.2.3 (Ersts, 2010) and linearised Φ values. Significance was tested using 10 000 permutations.
A statistical parsimony network was constructed in TCS v1.21 (Clement et al., 2000) with a 95% connection limit. A spatial analysis of molecular variance (SAMOVA) was run to identify genetic clusters independent of sampling locations using both geographical and genetic data (K=2-13; 100 iterations; Dupanloup et al., 2002). To test if any structure detected was due to historical separation in refugia or to contemporary barriers, a series of hierarchical AMOVAs (10 000 permutations) were run in ARLEQUIN v3.11 (Excoffier et al., 1992, 2005). The scenarios tested included two or three refugia (east and west; east, central and west; and east, southern (SAB) and west), one, two or three contemporary barriers (mountain ranges and water barriers as shown in Figure 1) and combinations of refugia and barriers.
A principal coordinates analysis (PCoA) was performed in GENALEX v6.3 (Peakall and Smouse, 2006) on both individuals and population pairwise Φ values. BAPS (Bayesian Analysis of Population Structure) v5.2 (Corander et al., 2008) was used to assign individuals to K clusters based on genetic data, with no a priori population information. As per Corander et al. (2008), the analyses used the clustering with linked loci option (Corander and Tang, 2007) and varied K (K=1-14).
Approximate divergence dates and relative migration rate estimates were calculated using the isolation with migration model as implemented in IMA v3.5 (Hey and Nielsen, 2007). IMA allows two populations to be compared; as such, the eastern and western groups identified by SAMOVA were tested. Multiple initial runs were performed to identify upper parameter bounds and an ideal heating scheme. The three final runs were performed using the HKY model, 500 000 burn-in and 1 000 000 post burn-in MCMC (Markov Chain Monte Carlo) steps, geometric heating, different random number seeds and a weighted average divergence rate of 1.7% per Myr (see below) with a low (1.0% per Myr) and a high (5.0% Myr) rate.
The control region and ATP divergence rates were estimated using an average cytochrome b (cyt b) rate of 2.1% per Myr (Weir and Schluter, 2008) and a combination of Poecile cytb, ATP and CR sequences (Uimaniemi et al., 2003; Gill et al., 2005; Lait et al., 2012). The calculated divergence rate was 1.78% per Myr for ATP, 2.62% per Myr for control region domain I and 0.95% per Myr for domain II. Although these values are considerably lower than traditionally considered for the control region, similar rates have recently been calibrated in a number of parids (Kvist et al., 2001; Päckert et al., 2007) and other birds (Zink et al., 2003; Lerner et al., 2011). Although using sequences to calculate divergence times is not as ideal as using fossil calibrations, the calculations do show that cyt b, ATP and the control region have similar divergence rates in Poecile.
Only individuals with genotypes from at least six loci were included in the analyses with the exception of museum samples from NON, NQC and NY where samples with up to four (50%) missing loci were included in some analyses. The SAB population was not genotyped. MICRO-CHECKER v2.2.3 was used to detect input errors, allelic dropout, slippage stutter or null alleles (Van Oosterhout et al., 2004). Exact tests were run in GENEPOP v4.0.10 (Raymond and Rousset, 1995; Rousset, 2008) to check for linkage disequilibrium and deviations from Hardy-Weinberg equilibrium using modified Markov chain parameters (1000 batches, 10 000 iterations and 10 000 dememorisation steps). All P-values were corrected for multiple tests using the modified FDR method (Benjamini and Yekutieli, 2001).
Both contemporary (ecological niche modelling) and historical (palaeo-distributional modelling) patterns of species distribution were estimated in MAXENT v3.3.3 using a maximum entropy statistical model on presence-only occurrence data (Phillips et al., 2006; Phillips and Dudík, 2008). The model estimates the potential distribution of a species by assuming that the species is found in its preferred environmental conditions and that the current niche requirements are reflective of those in the past and/or future (that is, niche fidelity; Phillips et al., 2006; Richards et al., 2007).
Occurrence records were comprised of sampling locations from this study (n=279) and sightings downloaded from the Global Biodiversity Information Facility data portal (n=9624; GBIF data portal, 2011) and the Avian Knowledge Network (n=22 252; Avian Knowledge Network, 2009). The models were based on 19 WorldClim climatic variables (see Supplementary Table S4; Hijmans et al., 2005) extrapolated from GIS (Geographical Information Systems) layers as described in Carstens et al. (2007). The variables included averages, extremes and ranges of temperature and precipitation and were available for both the contemporary timescale and estimates for the LGM (ca. 21 kya). The Model for Interdisciplinary Research on Climate (MIROC) climate layers used as the past climate estimates were provided by the Palaeoclimate Modelling Intercomparison Project Phase II (Braconnot et al., 2007). Ten replicates were run using the cross-validation method.