Speciation and the large effect of sex chromosomes

Biologists have long suspected that the origin of species can be a result of differential evolution and adaptation between populations. Yet generalities about the phenotypes or underlying genes responsible for speciation remain broadly unresolved. Despite this uncertainty, the disproportionate role of sex chromosomes in speciation is well recognized. Two routes of investigation have been important in implicating the sex chromosomes in speciation. First, classic studies crossing individuals in the lab have disproportionately implicated the heterogametic sex and, later, the X chromosome specifically, in post-zygotic hybrid failure. Second, studies of gene flow across landscapes have similarly identified the X by showing disproportionate divergence in allele frequencies between populations. Studies directly measuring gene flow between parapatric species at zones of contact and, later, studies inferring rates of introgression both indicate reduced gene exchange between populations on the X compared to the rest of the genome. Ideally, studying why the sex chromosomes play a disproportionate role in speciation will provide insights into some of the other unresolved questions in the study of the origin of species.

The first signs of a role for sex chromosomes in hybrid failures arose from investigations of sex-ratio distortion in hybrid offspring. In response to a debate on the cause of sex-biased broods from hybrid crosses, JBS Haldane undertook a review of hybrid infertility and inviability in species with sex chromosomes. Haldane famously found that, regardless of sex determination system, “when in the F1 offspring of two different animal races one sex is absent, rare, or sterile, that sex is the heterozygous sex”. As was common in his time, Haldane thought the pattern could be due to structural rearrangements involving the sex chromosomes. If different rearrangements between the X and Y, or between the X and autosomes, had fixed in different populations, hybrids inheriting the respective chromosome from each parent would be missing the genes that had moved, and would therefore be inviable. This model has recently received renewed interest, but is probably unable to entirely account for the generality of Haldane’s rule. Haldane’s rule was the first empirical finding to implicate sex chromosomes in speciation.

In the 1930s, Dobzhansky and Muller independently proposed an alternative to chromosomal rearrangements as a model of the evolution of reproductive isolation, which instead involved epistasis (it has also been pointed out that this idea was proposed by Bateson thirty years earlier). Under the Dobzhansky-Muller Incompatibility (DMI) model, alternative alleles at two different loci fix in isolated populations. Neither change has a deleterious effect on fitness within each population, but when brought together in the same genome, these alternatively fixed alleles have strong detrimental epistatic effects. The deleterious epistatic effects of these incompatibilities cause hybrids to fail and thereby keep populations reproductively isolated. Using the DMI model, Muller proposed that, given new mutations were most often recessive, Haldane’s rule could be explained through inviability by epistasis: the unmasking of recessive incompatibilities on the X by a degenerate Y, a condition known as hemizygosity, would cause the heterogametic sex to fail more often than the homogametic sex. This ‘dominance hypothesis’ was formally developed by H.A. Orr and M. Turelli in the 1990s (the recessivity of incompatibilities can also account for hybrid failure in later generations, see below). Generally, the uncovering of recessive deleterious or incompatible alleles on the X by a degenerate Y is thought to play an important role in Haldane’s rule. The dominance hypothesis for Haldane’s rule and the DMI model more broadly have had a large and important impact on the study of speciation.
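
The logic of the dominance hypothesis can be illustrated with a deliberately simplified sketch. The assumptions here are mine and not taken from any particular formal model: each X chromosome carries the same number of incompatibility alleles, each allele costs a fitness amount s when fully expressed and h*s when heterozygous, and costs add up. The function and parameter names are hypothetical.

```python
def f1_hybrid_loads(n_per_x, s, h):
    """Return (heterogametic_load, homogametic_load) for F1 hybrids.

    Illustrative assumptions (not from the post):
    - n_per_x X-linked incompatibility alleles on each species' X
    - each allele costs s when fully expressed, h * s when heterozygous
    - costs are additive across alleles
    """
    # XY hybrids carry one X; with a degenerate Y every allele on it is
    # hemizygous and therefore fully expressed.
    heterogametic = n_per_x * s
    # XX hybrids carry both species' X chromosomes (twice as many
    # incompatibility alleles), but each allele is heterozygous.
    homogametic = 2 * n_per_x * h * s
    return heterogametic, homogametic


if __name__ == "__main__":
    for h in (0.1, 0.25, 0.5, 0.75):
        xy, xx = f1_hybrid_loads(n_per_x=10, s=0.01, h=h)
        print(f"h={h}: XY load={xy:.3f}, XX load={xx:.3f}")
```

Under these assumptions the heterogametic sex suffers the larger load only when incompatibilities are on average partially recessive (h below 1/2), which is the intuition behind Muller’s argument.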

While investigating the genetic basis of Haldane’s rule, Dobzhansky discovered that the role of the sex chromosomes in hybrid failure extended beyond Haldane’s rule. In crossing experiments in the 1930s, Dobzhansky showed that the X chromosome played a disproportionate role in hybrid breakdown that could not be explained by recessive incompatibilities alone. The results from Dobzhansky’s work were rediscovered 50 years later and led to a slew of insightful crossing experiments which showed the role for the sex chromosomes was much more complex than F1 dominance effects. Most conclusively, work by J.P. Masly and D. Presgraves in hybrids of Drosophila mauritiana and D. sechellia found a higher density of genes with negative effects on introgression on the X, even while controlling for hemizygosity. These studies revealed a second rule of speciation: the X plays a disproportionate role in speciation compared to the rest of the genome. This observation has also been confirmed in several other systems, including plants. The ‘large-X effect’, as it has come to be known, suggests the role of the sex chromosomes extends beyond a simple unmasking of recessive DM incompatibilities.

One simple way of accounting for the large X effect again assumes a degenerate Y chromosome. If adaptive mutations are most often recessive, hemizygosity would increase the rate of evolution on the X by revealing recessive adaptive mutations. This higher rate of adaptive evolution on the X can increase the likelihood of incompatible mutations on X chromosomes compared to elsewhere in the genome. The “faster X” model has been expanded to include the unique effective population size of the sex chromosomes and, in theory, can explain asymmetries in post-mating isolation. This faster-X model would suggest the sex chromosomes are involved in speciation only because the Y chromosome is degenerate.
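
The dependence of the faster-X argument on dominance can be made concrete with a back-of-the-envelope calculation. This is a sketch under standard weak-selection assumptions that I am adding for illustration: new beneficial mutations with the same effect s in both sexes, dominance h, equal numbers of males and females, and a fixation probability of roughly twice the average selective advantage of a rare mutant. The function name is hypothetical.

```python
def faster_x_ratio(h):
    """Ratio of X-linked to autosomal substitution rates for new
    beneficial mutations with dominance coefficient h (0 < h <= 1).

    Sketch of the derivation, for N individuals, mutation rate mu,
    selection coefficient s:
      autosome: 2N copies, rare mutant advantage h*s
                rate ~ 2N*mu * 2*h*s = 4*N*mu*h*s
      X:        1.5N copies; a rare mutant spends 2/3 of its time in
                females (advantage h*s) and 1/3 in males (hemizygous,
                advantage s), so mean advantage = s*(2h + 1)/3
                rate ~ 1.5N*mu * 2*s*(2h + 1)/3 = N*mu*s*(2h + 1)
    The ratio of the two rates is (2h + 1) / (4h).
    """
    if not 0 < h <= 1:
        raise ValueError("h must be in (0, 1]")
    return (2 * h + 1) / (4 * h)


if __name__ == "__main__":
    for h in (0.1, 0.25, 0.5, 1.0):
        print(f"h={h}: X/autosome rate ratio = {faster_x_ratio(h):.2f}")
```

In this approximation the X evolves faster than autosomes only when beneficial mutations are partially recessive (h below 1/2); at h = 1/2 the two rates are equal, and for dominant mutations the X is slower.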

Empirical investigation into the faster-X found evidence that genes on the X evolve faster, but this pattern is not always associated with hemizygosity. The evidence supports a faster-X effect on coding sequence evolution in several Drosophila lineages, spiders and mice, with tentative support in Silene, and on gene expression in Drosophila. Hemizygosity of the Z enhances purifying selection but not positive selection in Satyrine butterflies. Faster-X evolution may explain the positive phylogenetic association between the extent of sex chromosome heteromorphism and reproductive isolation. However, in most of these studies faster rates of X evolution are not conclusively associated with hemizygosity, and may represent a pattern of X evolution that is not caused uniquely by the uncovering of recessive adaptive alleles. For example, the disproportionate role for the X has been observed in species without degenerate Ys such as Drosophila miranda, mosquitoes and European tree frogs. These results left the connection between hemizygosity, rates of molecular evolution and reproductive isolation on the X somewhat tentative.

Research estimating population-wide levels of gene flow offered an alternative and complementary approach to understanding the large-X effect. Patterns of gene flow between populations are an important part of the study of speciation. One key insight of ‘the modern synthesis’ of the 1940s was to propose that reduced levels of gene flow between populations could allow for allele frequencies in these populations to diverge and thus for species to evolve different phenotypes. Studies of spatial gradients in phenotypes or allele frequencies, known as clines, complicated the concept of species as populations completely isolated from gene flow by finding gradual changes in allele frequencies across parapatric species’ boundaries. Furthermore, allele frequency clines were found to be heterogeneous across the genome, suggesting heterogeneity in gene flow across the genome. Using models where different loci have different clines, the X chromosome was frequently found to show steeper clines than other genomic regions. Studies of clines across hybrid zones for loci on the sex chromosomes showed a disproportionately low amount of gene flow for the sex chromosomes in rodents, birds and crickets, but not in toads with undifferentiated sex chromosomes. Hybrid cline analysis did much to question the impermeability of species boundaries.
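
To make the idea of a ‘steeper’ cline concrete, here is a minimal sketch using a sigmoid (tanh) cline shape, a form commonly used in hybrid-zone analysis. It is offered as an illustration only; the studies discussed above may have fitted different models. The parameterization, in which width is defined as the inverse of the maximum slope, and the example widths for an autosomal and an X-linked locus are assumptions chosen for illustration.

```python
import math


def cline_frequency(x, centre, width):
    """Expected allele frequency at position x along a transect,
    for a sigmoid cline with the given centre and width.

    Width is defined here as 1 / (maximum slope), so a smaller width
    means a steeper cline and, other things being equal, a stronger
    barrier to gene flow at that locus.
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (x - centre) / width))


def slope_at_centre(centre, width, step=1e-4):
    """Numerical slope of the cline at its centre (central difference)."""
    upper = cline_frequency(centre + step, centre, width)
    lower = cline_frequency(centre - step, centre, width)
    return (upper - lower) / (2.0 * step)


if __name__ == "__main__":
    # Hypothetical widths (in km) for an autosomal and an X-linked locus.
    for label, width in (("autosomal", 20.0), ("X-linked", 5.0)):
        print(label, "slope at centre:", round(slope_at_centre(0.0, width), 4))
```

Comparing fitted widths between X-linked and autosomal loci is one way the heterogeneity in gene flow described above can be quantified.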

The incorporation of continuous gene flow into models of speciation also led to the proposition that regions of the genome with low rates of recombination could allow for the maintenance of coadapted allele complexes, which would cause hybrid failures when in the wrong environment. This coadapted gene complex perspective of speciation led to a view known as ‘ecological speciation’, where local adaptation caused divergence between populations. Coadapted complexes were expected to show low rates of recombination so as to keep alleles together. This conceptualization was supported by finding low rates of recombination across polymorphic chromosomal arrangements in hybrid zones, for example in sunflowers, Boechera stricta, mosquitoes, birds and Drosophila. Multi-locus DNA sequencing further produced evidence confirming that speciation often proceeded with gene flow, and that gene flow was not restricted to narrow regions of contact between species. For example, the work of Wang, Hey and Wakeley (1997) showed not only that gene exchange between Drosophila species was frequent but also that there was heterogeneity in the amount of gene exchange between loci across the genome far beyond the regions of parapatry. Speciation with gene flow has been shown to occur in plants including monkeyflower, morning glories, and wild tomato and in insects including Drosophila and butterflies. Speciation seems to frequently occur with gene flow.

The recognition that gene flow between populations proceeds during multiple stages of divergence changed how genomes could be analyzed for patterns of population differentiation. Given growing evidence that speciation often proceeds with gene flow, and the suspicion that linked gene complexes were responsible, it became popular to look for genomic regions showing excess differentiation at neutral loci. Selection for local adaptation was predicted to drag along linked neutral variation, and therefore genomic regions showing high levels of neutral divergence were expected to be linked to complexes responsible for speciation. Early approaches did not consider the effects of purifying selection in regions of low recombination on metrics of divergence. Nonetheless, even after correcting for the effect of linked purifying selection, sex chromosomes showed evidence for being disproportionately differentiated in rabbits, birds, Silene, and Heliconius butterflies but not in the mating-type loci of haploid anther-smut fungi. Even when looking at patterns of gene flow in allopatric populations, the X chromosome appeared as a genomic centre of differentiation.

In addition to the dominance hypothesis, another explanation is that sex chromosomes pleiotropically appear as hubs of divergence due to their inherent association with sex phenotypes. The presence of separate males and females is associated with increased diversification and speciation in animals, but the pattern is unclear in plants. Sex chromosomes may show patterns of divergence by pleiotropy if the loci on sex chromosomes play a disproportionate role in local adaptation, especially if local adaptation is sex-specific. Alternatively, sex-specific differences in natural selection may be important in the origin of species. As highlighted in a previous post, sexual antagonism can lead to conflict between alleles with opposite fitness effects in each sex. This conflict can cause genes involved in sexually antagonistic traits to evolve more rapidly, and their accelerated evolution may pleiotropically cause incompatibilities between populations. Importantly, if sexual conflict is resolved with the fixation of different alleles that are incompatible in different populations, conflict can lead to hybrid failure and population divergence. For example, in the yellow monkeyflower (Mimulus), selection on male function in pollen is likely to play an important role in reproductive isolation. In Drosophila, genes with sex-biased expression and X-linkage have faster rates of adaptive protein evolution. Sex-specific evolution could cause patterns of differentiation in any genomic region with an association with sex, and many such genes are likely to be on the sex chromosomes.

Sex chromosomes are likely hubs for conflict during meiosis as well. In 1881, Roux conjectured a theory called Kampf der Theile im Organismus (“Battle of parts in the body”) in which he proposed that, as a reasonable extension of evolution by natural selection, we should expect every part of an organism to compete against every other part for survival. Haldane extended Roux’s Kampf der Theile theory to include a battle between every gene, proposing that “the process continues until natural selection or the increased activity of other genes puts a stop to it” (Haldane, 1933, p.15). Because sex chromosomes are inherited in a sex-specific manner and meiotic drivers most often have sex-specific effects, meiotic drivers are most likely to spread in the population when they are linked to the sex chromosomes. For example, elements that can bias their segregation away from the polar bodies during female meiosis are most likely to spread through a population if the element is inherited more often by females, and therefore linkage to the sex determining region is favoured. Particularly surprising is the conclusion that, despite potentially imposing substantial fitness costs upon an organism, selfish genes can be very successful. The appealing aspect of ‘selfish genetics’ is that it can explain invasion of alleles that otherwise have substantial fitness costs. Selfish elements have been associated with speciation. Indeed, regions experiencing female meiotic drive have been observed to be more divergent between populations and to cause hybrid failures in several plants. This model would suggest that even without apparent sexually dimorphic phenotypes, sex-specific action during meiosis could cause the sex chromosomes to appear as regions disproportionally contributing to speciation.

The sex chromosomes are deeply intertwined with the process of speciation. Unmasking of genes causing incompatibilities between populations, and of recessive adaptive alleles by a degenerate Y, can cause heterogametic hybrids to fail more often and also cause hybrids with a foreign X chromosome to fail more often. Loci on the X seem to often evolve faster than the rest of the genome and both sex-specific evolution and unmasking of adaptive alleles may contribute. Analyses of gene flow across the landscape suggest the patterns of hybrid failure on the X in the lab may be associated with real impacts in natural populations. Allele frequencies change faster across space on the X and X haplotypes are under-represented in foreign genomes. These patterns are likely to have a composite underpinning arising from both Y degeneration and faster rates of evolution in genes associated with intersexual conflict, including during meiosis.

Restricting X|Y Recombination

In this post, I review the evolution of recombination rate between the sex chromosomes. As considered in a previous post, low rates, and more often the absence, of crossover between parts of the X and the Y play an important role in sex chromosome evolution and are frequently considered a defining characteristic of sex chromosomes. Wide variation in recombination rate between the X and Y is observed across different species: in some species recombination is completely absent in one sex, for example in Drosophila or the amphipod crustacean Gammarus chevreuxi, whereas in other species the non-recombining region of the sex chromosomes is limited to very few loci, for example in fugu pufferfish or asparagus. The full gamut of recombination-rate variation between these extremes exists. As I discuss below, various processes have been invoked to explain this variation, and the empirical support is mixed.

In 1937, Darlington proposed low rates of crossing-over were essential to sex chromosome ontogeny. From his study of sex chromosome cytotypes, Darlington concluded that the evolution of sex chromosomes was inseparable from the evolution of separate sexes in an ancestrally hermaphroditic population, assuming that sex was heritable and followed Mendelian patterns of inheritance. At the very least, Darlington proposed, the complete transition to separate male and female individuals from a hermaphroditic population required two mutations: individuals with a male sterility mutation that exhibit a female phenotype and individuals with a female sterility mutation that exhibit a male phenotype. Such sex-specific sterility alleles could invade a hermaphroditic population when selection acts to avoid the cost of inbreeding depression, to maximize fitness in sexually dimorphic phenotypes, or a combination of both. In the transition to separate sexes, each sex-specific sterility mutation must arise on different haplotypes to avoid the deleterious fitness consequences of double-sterile haplotypes, and therefore the recombination rate between these two haplotypes must be low or evolve to be low. A corollary of the two-sterility model of sex chromosome evolution is that sex chromosomes evolve from regular (autosomal) chromosomes. Indeed, ancient molecular homology between X and Y genes in many species supports the hypothesis that sex chromosomes evolved from a homologous pair. As any chromosome could harbour the pair of sex-specific sterility mutations, the loss of recombination, favoured by selection to avoid the cost of double-sterile individuals, will create sex chromosomes. 

Yet recombination loss does not seem to have occurred in a single step. First observed by Lahn and Page in humans, studies of molecular divergence suggest the data are best explained by several distinct divergence times across the sex chromosomes rather than a single event of recombination loss. Whereas some groups of genes which are physically close to each other on the X have similar divergence times from the Y, discrete clusters of genes often have markedly different levels of X-Y divergence. The clusters of divergence have been termed ‘evolutionary strata’ to invoke distinct evolutionary periods across the chromosome, similar to the eras captured by geological strata. Strata have been observed on sex chromosomes in a variety of organisms including mammals, Drosophila, the predominantly haploid anther-smut fungi, some birds but not others, and some plants like Silene latifolia but not Salix or Rumex. The observation of evolutionary strata suggests that recombination loss across the sex chromosomes can be more complex than a simple transition from hermaphroditism. One way to explain evolutionary strata is that subsequent to the transition to separate males and females, more genomic regions become locked into segregation with the sex-specific sterility mutations. This hypothesis proposes that the non-recombining region of sex chromosomes can evolve and grow in size over evolutionary time.

A primary selective mechanism that has been thought to drive the subsequent spread of recombination suppression on sex chromosomes is sexual antagonism. In 1948, A.J. Bateman revolutionized the view of sex-specific selection in a way that massively broadened the parameter space under which genetic polymorphism could be maintained. In Darwin’s original view, sex-specific selection was a difference in the strength or in the presence/absence of selection between the sexes. In contrast, under Bateman’s proposal, selection could act in opposite directions in each sex because males and females have different reproductive strategies and thus different evolutionary optima. Bateman proposed that phenotypes could be favoured in one sex that were deleterious in the other sex, which can be described as a difference between males and females in their fitness optima. Selection on phenotypes participating in the trade-off between offspring investment and offspring number, for example, should be different between males and females, and this form of selection is known as ‘sexually antagonistic’ selection.

Alleles contributing to sexually antagonistic traits can be maintained for long evolutionary time periods because their selection pressure will switch when inherited in the opposite sex. Whereas one allele would be beneficial to males, the other would be beneficial to females, and under some circumstances neither allele can spread to fixation. In populations evolving under weak natural selection, differential selection between the sexes can cause a departure from mean optimal phenotypes that is characterized as a fitness load. This fitness load can be countered by changes to genome organization that minimize the fitness cost of this load. The stable maintenance of a polymorphism due to sexually antagonistic selection allows for the invasion of genes that lower the rate of recombination between the X and Y. For example, in support of this hypothesis, study of the sex chromosomes of monkeys and guppies suggests species with more sexual dimorphism have lower rates of recombination. As sexually antagonistic alleles evolve to be linked with the sex determination region, the non-recombining region of the sex chromosomes expands. In this way, sex-specific inheritance limits the cost of sexually antagonistic variation.
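
The claim that sexually antagonistic alleles can be held at intermediate frequency “under some circumstances” can be explored with a small deterministic recursion. The sketch below is my own illustration under simplifying assumptions: a single autosomal locus, random mating, an infinite population, allele A favoured in males and allele a favoured in females. The fitness scheme, function names and parameter values are hypothetical.

```python
def next_generation(p_sperm, p_eggs, s_m, s_f, h_m=0.5, h_f=0.5):
    """One generation of sex-specific selection at an autosomal locus.

    p_sperm, p_eggs: frequency of allele A in sperm and eggs.
    A is male-beneficial: male fitnesses   AA=1, Aa=1-h_m*s_m, aa=1-s_m.
    a is female-beneficial: female fitnesses AA=1-s_f, Aa=1-h_f*s_f, aa=1.
    Returns the frequency of A in the next generation's sperm and eggs.
    """
    # Zygote genotype frequencies under random union of gametes.
    f_AA = p_sperm * p_eggs
    f_Aa = p_sperm * (1 - p_eggs) + (1 - p_sperm) * p_eggs
    f_aa = (1 - p_sperm) * (1 - p_eggs)

    def freq_after_selection(w_AA, w_Aa, w_aa):
        mean_w = f_AA * w_AA + f_Aa * w_Aa + f_aa * w_aa
        return (f_AA * w_AA + 0.5 * f_Aa * w_Aa) / mean_w

    new_sperm = freq_after_selection(1.0, 1 - h_m * s_m, 1 - s_m)
    new_eggs = freq_after_selection(1 - s_f, 1 - h_f * s_f, 1.0)
    return new_sperm, new_eggs


def frequency_after(p0, s_m, s_f, generations=2000):
    """Mean frequency of A across gametes after many generations."""
    p_sperm = p_eggs = p0
    for _ in range(generations):
        p_sperm, p_eggs = next_generation(p_sperm, p_eggs, s_m, s_f)
    return 0.5 * (p_sperm + p_eggs)


if __name__ == "__main__":
    # Balanced antagonism: A persists at intermediate frequency from
    # either starting point.
    print(frequency_after(0.01, s_m=0.5, s_f=0.5))
    print(frequency_after(0.99, s_m=0.5, s_f=0.5))
    # Weak cost in females: the male-beneficial allele spreads to fixation.
    print(frequency_after(0.01, s_m=0.5, s_f=0.1))
```

With these illustrative parameters, strong and roughly balanced opposing selection maintains both alleles, whereas a large imbalance between the sexes lets one allele fix. This is consistent with the point made below that such models can require strong sexually antagonistic selection.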

Although perhaps an intuitive explanation for the extension of recombination loss across sex chromosomes, these models require very strong sexually antagonistic selection. The requirements for the strength of sexually antagonistic selection are less stringent when some recombination suppression, and thus linkage, already exists. The invasion of a recombination modifier is much less restricted when the locus under selection and the Sex-Determining Region (SDR) are already on the same chromosome and even less constrained when recombination rate is already low in that region. For example, the sex chromosomes of papaya seem to have arisen in a region with ancestrally low rates of recombination. This observation also predicts that, once an SDR is established, recombination loss is more likely to spread across the existing sex chromosomes than other chromosomes are to be recruited by fusion with the sex chromosomes (forming what are known as neo-sex chromosomes).

Gene expression may also evolve to be sex-specific or sex-biased in cases where alleles with opposing fitness effects between the sexes are maintained by selection. To avoid the fitness costs of sexually antagonistic genetic load, one can expect the invasion of gene expression modifiers that cause sex-specific expression of genes underlying sexually antagonistic traits. Sex-specific gene expression can evolve to increase male and/or female fitness through cis regulatory changes. The existence of the evolution of sex-specific gene expression as a means to alleviate load from sexually antagonistic variation is currently being actively investigated in animals. Some metrics suggest differentiation in allele frequencies between the sexes at loci with sex-biased gene expression may represent this type of process. Studies of sex-specific gene expression suggest genes with sex-biased expression are more likely to be sexually antagonistic in humans, Drosophila, and fishes, but recent simulation studies suggest the association between sexual antagonism and sex-biased gene expression can also arise from random processes. Although hinting at an association between gene expression and sexual antagonism, these studies struggle to associate sex-specific gene expression to fitness.

Sexually dimorphic gene expression with sex-specific fitness effects is expected to cause sex-specific dominance reversal for genetic variation in fitness. Sex-specific dominance reversals have been observed for sexually antagonistic major effect loci such as RXFP2 in soay sheep and VGLL3 in salmonids, and sex-specific dominance reversals have significant and broad fitness effects in the seed beetle Callosobruchus maculatus. Quantitative trait locus (QTL) analysis suggests sex-specific trait expression may avoid load in plants and in haplodiploid systems, such as Nylanderia ants, but not for the mandibles of flour beetles. These studies suggest selection may act to avoid sexually antagonistic load, but the underlying genetics remains unclear.

The dominance of alleles under selection will factor into the expectations of change in recombination rate, and therefore the evolution of recombination rate between sex chromosomes may be affected by the evolution of sexually dimorphic gene expression. As the intensity of selection will be mediated by dominance, in cases where dominance reversals relieve the pressure of sexual antagonism, recombination modifiers are less likely to invade. For example, birds with homomorphic sex chromosomes have more sexually dimorphic gene expression and changes to sex-specific gene expression can account for recombination rate on the sex chromosomes of guppies. Similarly, haploid gene expression can have important effects on recombination rate evolution on sex chromosomes. A recent extension to the Charlesworth and Charlesworth (1980) model of invasion of recombination modifiers between loci under sexually antagonistic selection considers haploid selection. As selection is expected to be more efficient on the haploid stage of the lifecycle, increased efficacy of selection in the haploid stage can indirectly increase selection on a recombination modifier beyond, or even counter to, the expectation for selection in the diploid phase. These studies suggest gene expression can have a significant impact on changes in recombination rate between the sex chromosomes.

In summary, the evolution of reduced recombination between X and Y is a dynamic process. Recombination rates can continue to evolve beyond the initial invasion of sex chromosomes, whereas in some systems the non-recombining region remains small and stable for long time periods. Sexual antagonism is also likely to play an important role in the evolution of recombination rate. Because both sex-specific inheritance and sex-specific gene expression can participate in alleviating the load of sexually antagonistic genetic variation, both processes are intertwined. While sex-specific gene expression in the diploid phase may limit the loss of recombination between X-Y, recent work suggests haploid-specific expression of sexually antagonistic variation could extend the region of recombination loss.

Why the X and Y don’t match

Because X-Y heteromorphism allowed for the sex chromosomes to be followed through meiotic division, it was the first marker to be correlated with the segregation and inheritance of a phenotype (sex). Still today, molecular differences between X and Y are used to identify the sex chromosomes in genetic sequencing studies. Because of its pivotal role in the study of evolutionary genetics, the evolutionary source of this diagnostic heteromorphism has been subject to intense study. The problem is all the more interesting as the evolution between X and Y seems to be asymmetrical: the Y chromosome disproportionately loses genes compared to the X. Here I review the study of the divergence and heteromorphism between X and Y chromosomes.

As first noted by Morgan, recombination rates between the X and the Y chromosomes are unusually low compared to the rest of the genome. The lack of recombination between the X and Y chromosomes can lead to heteromorphism because new mutations are not exchanged between the pair, which allows the X and Y chromosomes to diverge over time. The divergence between the X and the Y, caused by the loss of recombination, leads to an absence of coalescence analogous to evolution in separate species. Indeed, the methods used to estimate divergence times between related species have been repurposed to estimate the time since recombination arrest between the X and Y. Using these techniques, molecular heteromorphism between X and Y can be quantified. In some plants such as kiwifruit, melon, willow, papaya, date palm and strawberry, genetic divergence is limited, while in others such as Silene, Cannabis, and some loci in Rumex the level of divergence is remarkable. Extreme heterogeneity in sex chromosome differentiation has been found between sister species of livebearers, whereas X-Y differentiation is polymorphic and correlated with geography in frogs, killifish and stickleback. In some cases, divergence is so pronounced that it leads to visible X-Y heteromorphism, giving the so-called ‘heteromorphic’ as opposed to ‘homomorphic’ sex chromosomes often reported in the literature. Whether at the chromosome scale or in terms of molecular evolution, divergence between X and Y is a quintessential feature of sex chromosome evolution.

Neutral divergence between X and Y is likely to participate to some extent in X-Y heteromorphism, but cannot alone account for one of the most striking aspects of the sex chromosomes: Y chromosomes have a general dearth of genes compared to the X. Even in very early studies, the absence of genes on the Y was attributed to a gradual degeneration over evolutionary time. First, surveys of sex chromosome karyotypes in the 20th century revealed a stunning variety of sex chromosomes; while some species had homomorphic sex chromosomes, other species completely lacked a Y chromosome. By envisioning this variation in sex chromosome heteromorphism as an evolutionary trajectory, it seemed plausible that Y chromosomes started as regular chromosomes (autosomes) but that some process caused gene loss and occasionally resulted in the loss of the entire Y chromosome. Second, because of the obvious cost of unmasking of X-linked recessive lethals only in males, the emptying of the Y seemed more likely to be a degenerative process than one involving optimization driven directly by natural selection. This led to the hypothesis that the Y chromosome may inevitably degenerate over time.

With the advent of large-scale genome sequencing, the Y was often found to be highly degenerate, as expected. Beyond missing genes, Y chromosomes had an over-abundance of loci where the protein product was truncated, known as pseudogenes. Y chromosomes were found to have very high rates of evolution at non-synonymous sites compared to synonymous sites, for example in apes, Drosophila, stickleback, birds, clam shrimp and plants including Silene, Rumex and papaya. Assuming synonymous sites evolve neutrally and non-synonymous substitutions are under selection, this is likely to be a signal of pronounced degeneration. As techniques were refined, more signals of degeneration appeared: Y chromosomes were found to use less favoured codons and to have lowered levels of gene expression in plants, mammals, Drosophila, butterflies and stickleback. A young Y chromosome in Drosophila miranda shows a reduction in adaptation compared to the X and an accumulation of deleterious mutations, including large deletions, premature stop codons and frameshift mutations. Y chromosome polymorphism even contributes epistatically to male fitness reduction in Drosophila, but not in frogs, suggesting degeneration can arise from a lowered efficacy of selection rather than as a completely neutral process.

The Y chromosome has also been shown to have an over-abundance of transposable elements. Since they are considered genomic parasites, this is further evidence for deleterious mutation accumulation on the Y. Transposable elements invaded early in sex chromosome evolution in fishes and birds. Microsatellites expanded on the Y chromosomes of Rumex, Hippophae seabuckthorn, Mercurialis annua and in the mostly haploid liverwort. Retrotransposons proliferate on the Y in Silene. Overall, many lines of evidence suggest Y chromosomes degenerate over time.

Alongside the progress in characterizing Y chromosome degeneration, the debate about what caused this degeneration continued. H.J. Muller proposed that Y degeneration resulted because selection was not able to act effectively on the Y chromosome and, later, suggested that selection was ineffective because Y chromosomes were always masked by X chromosomes. Assuming gene loss had a recessive effect on fitness, the ever-heterozygous XY condition allowed gene loss because the X alleles could mask the losses on the Y. However, this model was shown to be inadequate because it required a higher rate of fixation of new mutations on the Y than on the X, and thus dominance effects alone were unlikely to account for Y chromosome degeneration. The idea that low rates of recombination may be implicated in degeneration arose from comparisons with studies of the relative inefficacy of selection in organisms that replicate clonally or otherwise asexually. Indeed, as shown by Muller and many studies since, asexual lineages are often less fit than sexual lineages. Because Ys do not recombine, they are likely to follow evolutionary trajectories similar to those of asexual lineages.

The incorporation of finite population sizes, and of the influence of chance associations between alleles, into models of the evolution of sex was hugely successful in understanding the effects of asexuality on the efficacy of selection. The first model considering the association of recombination loss and fitness in finite populations was proposed by Muller in 1964. Muller proposed that disadvantageous mutations accumulated in asexuals by the chance loss from the population of all the haplotypes with the smallest number of deleterious mutations (the ‘least loaded class’). In the case of Y chromosomes, this model would predict that the fittest Y haplotypes in a population would eventually be lost by chance and would never be recovered. With each loss of the best Y haplotypes, the mean fitness of all Y haplotypes in the population falls further behind the mean fitness of Xs. This process became known as ‘Muller’s ratchet’. Another suite of processes involving chance events in finite populations could also cause degeneration when recombination is absent. The random addition of alleles into a population by mutation, or their loss by drift, causes correlations between alleles in tightly linked regions. Without recombination, selection acts on blocks of alleles, rather than having the resolution allowed by recombination to act on each allele individually. Indeed, in 1966, Hill and Robertson used Monte Carlo computer simulations to show that selection was less effective than expected when the effects of random associations between alleles under selection were considered, and these results have been replicated since. This process, known as ‘linkage interference’ or the ‘Hill-Robertson effect’, is likely to be a powerful force in reducing the efficacy of selection in regions of low recombination.
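
The ratchet can be illustrated with a minimal simulation sketch. It assumes a haploid, non-recombining Wright-Fisher population with multiplicative fitness and no back mutation. The function names and the parameter values (n, u, s) are illustrative choices and are not taken from any of the studies discussed here.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_ratchet(n=200, u=0.1, s=0.02, generations=500, seed=1):
    """Track the least loaded class in a non-recombining haploid population.

    n: number of chromosomes; u: mean number of new deleterious mutations per
    chromosome per generation; s: fitness cost per mutation. All values are
    assumptions for illustration, not estimates for any real Y chromosome.
    Returns the smallest mutation count carried by any chromosome, per generation.
    """
    rng = random.Random(seed)
    loads = [0] * n  # every chromosome starts mutation-free
    min_load = []
    for _ in range(generations):
        # Selection: parents are sampled in proportion to multiplicative fitness.
        weights = [(1.0 - s) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=n)
        # Mutation: without recombination or back mutation, loads can only grow.
        loads = [k + poisson(rng, u) for k in parents]
        min_load.append(min(loads))
    return min_load


if __name__ == "__main__":
    trajectory = simulate_ratchet()
    print("least loaded class after the final generation:", trajectory[-1])
```

Because offspring inherit every mutation carried by their parent, the least loaded class recorded by the sketch can never decrease: once the best class is lost by chance, it is not recovered. Larger n or larger s should slow the ratchet in this toy model, which is one way to explore how population size and selection strength interact.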

The power of linkage interference on fitness is most pronounced under a molecular evolution framework devised by T. Ohta in the 1970s, known as the ‘nearly neutral’ model. Under the nearly neutral model, most alleles reside on the border between being affected by selection or by drift, where the product of the selection coefficient (s) and the effective population size (Ne), which sets the strength of drift, is about 1. With Ne⋅s < 1, selection cannot overcome the effects of drift. The action of linkage interference can be said to further locally reduce the effective population size of a specific genomic region. The local reduction in Ne pushes the nearly neutral variation into the zone where it is affected solely by drift (Ne⋅s < 1), reducing the efficacy of selection on nearly neutral variants. If most new mutations are slightly deleterious, as predicted by the nearly neutral model, linked sites under strong selection can cause these weakly deleterious alleles to spread and to fix. The removal of very deleterious alleles, known as ‘background selection’, or the spread of very beneficial alleles, known as ‘selective sweeps’, thus affects the likelihood of fixation of nearly neutral linked variants. The effects of linked selection increase as recombination rate decreases simply because more sites are subject to selection on nearby sites.
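
To make the Ne⋅s threshold concrete, the sketch below uses Kimura's diffusion approximation for the fixation probability of a new mutation, P = (1 - exp(-2s)) / (1 - exp(-4 Ne s)). This formula is a standard population genetics result rather than part of the argument above, and it assumes a diploid population with additive effects and a census size equal to Ne. The function names and the example values are illustrative.

```python
import math


def fixation_probability(s, ne):
    """Kimura's approximation for a new additive mutation in a diploid
    population, assuming census size equals ne (initial frequency 1 / (2 * ne))."""
    if s == 0:
        return 1.0 / (2 * ne)  # neutral expectation
    exponent = -4.0 * ne * s
    if exponent > 700:
        # exp() would overflow; such a deleterious allele effectively never fixes.
        return 0.0
    # (1 - exp(-2s)) / (1 - exp(-4 * ne * s)), written with expm1 for accuracy.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(s, ne):
    """Fixation probability divided by the neutral expectation 1 / (2 * ne)."""
    return fixation_probability(s, ne) * 2 * ne


if __name__ == "__main__":
    s = -0.001  # a weakly deleterious mutation (illustrative value)
    for ne in (100, 1_000, 10_000):
        print(f"Ne = {ne:>6}, Ne*|s| = {ne * abs(s):g}, "
              f"relative fixation rate = {relative_to_neutral(s, ne):.3g}")
```

With these assumed values, the same weakly deleterious mutation fixes at close to the neutral rate when Ne⋅|s| is well below 1 and essentially never when Ne⋅|s| is well above 1. A local reduction in Ne caused by linkage interference therefore moves such mutations towards the drift-dominated regime.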

The predictions for effects of selection at linked sites have been supported by empirical evidence. Measures of genetic variation around selected sites seem to be well explained by the effects of linkage interference and correlate with recombination rate at the genomic scale. Empirical assessment of the distribution of fitness effects (DFE) of new mutations finds strong evidence in support of the nearly neutral model, suggesting linked selection is likely to often cause the fixation of slightly deleterious alleles at the genomic level. For example, regions of low recombination have a higher genetic load in Drosophila melanogaster and maize, and they also show fewer adaptive substitutions in Drosophila melanogaster. Other non-recombining regions have also been noted to have increased non-adaptive substitutions such as the mating-type locus of Chlamydomonas reinhardtii and Microbotryum anther-smut fungus, the self-incompatibility locus of Arabidopsis, the gene-complex involved in colony organization in Solenopsis fire ants, and the morph gene-complexes in Heliconius butterflies and sparrows. Linked selection may even explain patterns of codon bias across single genes and is likely to contribute to degeneration of genes during cancer progression. Interference selection seems a process universal to regions of low or no recombination, and, in many species, the Y chromosome is the largest non-recombining region of the genome.

Molecular studies of Y chromosomes lend credence to the prediction that alleles on the Y are degenerating because of linked selection. The non-recombining sex chromosome indeed shows dramatically reduced levels of genetic diversity in birds and even on regions recently translocated to sex chromosomes in Drosophila. Model fitting and simulations suggest the reduction in genetic diversity on the Y can be effectively explained by background selection rather than invoking positive selection in Drosophila, humans and Rumex. Experimentally reduced rates of recombination across a synthetic Y chromosome in Drosophila also reduced the efficacy of selection. These results suggest linked selection played a significant role in the evolution and degeneration of Y chromosomes.

Similar to coding sequence degeneration, lowering and loss of gene expression seem to be common features of Y chromosome evolution. However, how lowered gene expression interacts with linkage interference remains unclear, and several hypotheses have been proposed. First, Y expression degeneration may simply be a direct symptom of linked selection. Under this hypothesis, Y alleles lose expression as their enhancers and promoters degenerate through the fixation of deleterious variants, as allowed by the less efficient selection on the Y. Consistent with this hypothesis, regulatory regions may be under weak purifying selection and therefore are likely to degenerate faster than genic regions. Gene expression loss would then proceed at the same rate as coding sequence degeneration.

Second, gene expression loss may itself drive degeneration. If a Y allele loses expression, selection will no longer be able to act on that allele as the allele will be completely recessive. Gene expression loss can thus allow the Y to completely degenerate and even be lost. This process may be analogous to a reduction in the efficacy of selection with reduced dominance. In support of this hypothesis, chromosome-wide gene silencing precedes Y degeneration in Drosophila albomicans. If Y allele expression loss occurs early in degeneration, coding sequence degeneration may be a neutral side effect of gene expression loss.

The association between dominance, gene expression and linkage interference is in line with Haldane’s hypothesis that selection during the haploid stage (e.g. pollen) could slow degeneration of the Y. Slowed decay of the Y due to pollen- or ovule-expressed genes, known as haploid selection, may be able to account for the observation that X-Y heteromorphism in dioecious angiosperms is not especially common, occurring in only four families. The effect of haploid selection may be substantial when we consider that in plants around 60% of genes are expressed in pollen, the male haploid phase. As expected, pollen-expressed Y-linked genes have been shown to degenerate more slowly in Silene and Rumex than other Y-linked genes. Sex chromosomes in organisms with predominantly haploid lifecycles are similarly less degenerate, such as in the brown alga Ectocarpus and the liverworts, but not in mosses. Sex chromosome sequences involved in the haploid phase of animals are also highly constrained in mammals, while the pattern is more complex in birds, potentially due to female heterogamety. Selection in the haploid phase may therefore play an important part in slowing Y chromosome degeneration.

Finally, because of the lowered chance of fixation of beneficial alleles on the Y compared to the X due to stronger linked selection, Orr and Kim proposed that it is beneficial to silence the Y because its alleles are less likely to be well adapted than those on the X. In support of this hypothesis, Orr and Kim estimated that in Drosophila the greatest difference in fitness between the X and the Y is caused by differences in the fixation of beneficial mutations rather than of deleterious ones. Similarly, it may be advantageous to the genome to silence sex ratio distorters or other selfish genetic elements, such as TEs, that accumulate on the Y due to a lowered efficacy of selection. Under either of these ‘active-silencing’ models, Y chromosome degeneration may occur in part because of the accidental silencing of nearby genes, a process likely to be associated with methylation. Indeed, heterochromatin is known to be imperfect in its silencing, through a phenomenon known as ‘position-effect variegation’. There is evidence that position-effect variegation contributes to Y chromosome evolution in Drosophila, where it plays a role in sex-specific aging. Silencing of the Y may therefore be a process favoured by natural selection, and Y coding sequence degeneration may be a neutral process following silencing.

Overall, X and Y divergence is a common aspect of sex chromosome evolution. In some cases, X-Y divergence can lead to gross, cytological heteromorphism. A significant parameter likely to be involved in dictating the trajectory towards heteromorphism is the number of selected sites in the region within which X and Y recombination is absent. The effect of linkage interference on the Y is determined by the number of linked sites, and the effect of linkage interference on molecular degeneration of Y alleles is well supported by empirical evidence. The role of expression loss of Y alleles in degeneration remains less clear. The loss of expression may be an active process or a by-product of degeneration, but its impact on sequence evolution is important. The effects of pollen expression may be crucial in disentangling the relative contributions of expression loss and selection interference to Y degeneration, and to X-Y divergence and heteromorphism more broadly.

Early Studies in Sex Chromosome Evolution

The discovery of sex chromosomes, at the turn of the twentieth century, significantly shaped not only how we study sex chromosomes today but also the fields of genetics and evolution more broadly. The first heteromorphic chromosomes to be correlated with sex phenotypes were observed in insects. In 1905, cytological study led N. Stevens to find that a dimorphic pair of chromosomes was associated with sex phenotype in the mealworm beetle Tenebrio molitor. Working with another insect group, the Hemiptera, E.B. Wilson also inferred a causal connection between oddly shaped chromosomes and sex determination. These two scientists were the first to find organisms for which males had a dimorphic pair of chromosomes where, in females, the same pair was evenly sized. Heteromorphism between the sex chromosomes allowed X and Y, respectively, to be tracked across meioses, enabling a direct connection to be made between chromosomal segregation and phenotype.

The studies by Stevens and Wilson were the culmination of a decade of work on the inheritance of primary sex phenotype. Before the discovery of sex chromosomes, it was generally believed that sex was induced by environmental conditions. These theories proposed that sex was determined by a variety of factors, including temperature at conception or quality of the mother’s diet. A series of studies at the turn of the 20th century radically changed this view. A study by H. Henking in 1891 showed that each sperm from a male Pyrrhocoris firebug could be categorized into one of two size classes, and that each sperm class predictably produced offspring of one sex. Work by McClung on sperm in the Xiphidium locust in 1899 extended this project by finding that the sperm from the male-determining class had one fewer chromosome than the female-determining class. Considering that sperm size class correlated strongly with the sex of the offspring, McClung concluded that it was specifically the presence or absence of a chromosome in the sperm that dictated whether a zygote was most likely to develop to be female or male, respectively.

The work in 1905 by Stevens and by Wilson expanded McClung’s hypothesis by suggesting that, in some species, male-producing sperm carried a full chromosome set, but that one chromosome was much smaller than its partner chromosome. Together, these studies provided strong evidence that sex phenotype could be determined by chromosomes. By 1915, the list of species with evidence of sex-linked chromosomes included humans, cats, birds, fish, Drosophila ampelophila (now D. melanogaster), moths, and the plant Lychnis dioica (now Silene latifolia), suggesting sex chromosomes were not an unusual characteristic specific to insects. Furthermore, heteromorphic sex chromosomes were not exclusive to males: females were discovered to have heteromorphic sex chromosomes in birds and moths. The list of species with evidence of sex chromosomes today is considerably more extensive. In an international lecture in 1909, Wilson secured the names for each of the sex chromosomes: each in the homomorphic pair would be called an ‘X’ chromosome, inspired by Henking’s 1891 ‘x-element’, while the heteromorphic male-specific chromosome would be named a ‘Y’ chromosome. For cases where the female was heterogametic, the chromosomes were dubbed Z and W, such that males were ZZ and females ZW.

The discovery that chromosomes contributed to sex phenotypes caused a revolution in the study of evolution. First, the discovery of sex chromosomes supported W.S. Sutton’s and T. Boveri’s concurrently developed chromosomal theory of heredity. Besides supporting the chromosomal theory of inheritance, sex chromosomes elegantly demonstrated the expectations from Mendel’s theory of inheritance, in contrast to the then popular idea that an offspring’s traits were a blend of the parents’ traits. In support of Mendel’s principles, Correns proposed that Mendelian analysis could explain the connection between sex chromosomes and sex phenotypes: given that females were homozygous for the X while males were heterozygous X-Y, maleness showed the pattern of inheritance expected for a dominant trait on the Y. This connection between sex chromosomes and sex phenotypes was one of the first pieces of concrete evidence that Mendelian and chromosomal heredity were both tenable and probably inseparable. Finally, the evidence supporting a material basis for Mendel’s theory of inheritance also significantly strengthened Darwin’s theory of evolution by natural selection, as Mendel’s view of inheritance allowed for a substantially more effective explanation for the maintenance of variation than blending inheritance. The agreement between Darwin’s and Mendel’s theories revolutionized biology and began a new era.

Given the fringe status of Darwin’s theory of natural selection and Mendel’s theory of inheritance at the time when sex chromosomes were first discovered, neither theory had been sufficiently developed to make predictions regarding the existence, the nature, or the evolution of sex chromosomes. Nonetheless, studies of sex chromosomes went forward. For several years, scientists, mostly studying Drosophila fruit flies, rigorously studied sex chromosomes without expectations as to what they might discover. Three seminal findings defined our current understanding of sex chromosomes.

The first generalization to be made about sex chromosomes was that X homologs on Y chromosomes were more often missing than in other chromosome pairs. The uniqueness of the Y was first noted by the marked size dimorphism between X and Y and then by finding a lack of phenotypes associated with the Y. The Y was proposed to be missing most genes, as was supported by the unmasking, in males only, of X-linked lethal factors and by the observation of inviability in synthetic YY individuals of Drosophila and Lebistes guppies, or infertility as in Mercurialis. However, it was not clear why the Y should be gene depauperate.

The second finding was born from the nascent study of linkage mapping. Study of genetics in Drosophila revealed that not all traits were inherited independently. Indeed, the likelihood of joint inheritance of variants was found to differ between different traits. T.H. Morgan proposed that linkage between Mendelian traits could be broken down over evolutionary time by shuffling loci between paired chromosomes in events called ‘crossovers’, and that the probability of joint inheritance of variants reflected the physical distance on the chromosome between the loci responsible for each trait. Supporting Morgan’s theory, F.A. Janssens proposed ‘chiasmata’, joins between homologous chromosomes observable under the microscope, as the physical manifestation of crossover between homologous chromosomes. In 1931, B. McClintock showed that, in maize, crossover and chiasma distances were indeed correlated, and therefore likely to represent the same phenomenon.

While linkage was generally variable, male-specific phenotypes were discovered to be completely linked in Drosophila. The linkage of all male-specific mutations led to the conclusion of a complete absence of crossovers in Drosophila males. While the absence of crossovers in males has not been found to be universal in eukaryotes, the linkage of some sex-specific mutations is generalizable to many other systems. This conclusion is supported by studies demonstrating that although the X and Y can often pair at meiosis, chiasmata between the X and Y are absent or, when a few exist, they are relegated to the tips of chromosomes. The absence of crossovers and chiasmata between at least part of the X and the Y has been found to be a defining characteristic of sex chromosomes.

The third correlate of sex chromosomes was a disproportionate participation in the process of speciation. In a 1922 survey of hybrid failure, Haldane observed that in interspecific crosses where one sex was more often sterile or inviable, the less fit sex was always the heterogametic sex, most often the male. This observation came to be known as ‘Haldane’s rule’. Haldane’s rule incriminated the Y once more in lowering male fitness, this time in hybrids, but also suggested sex chromosomes could play a role in reproductive isolation between species. Further investigation into the genetic underpinning of sex-specific hybrid failure uncovered a deeper connection between reproductive isolation and the sex chromosomes. Between 1934 and 1937, T. Dobzhansky conducted a series of experiments searching for the loci causing hybrid failure in crosses between Drosophila species. Dobzhansky’s studies found a disproportionate number of loci responsible for hybrid failure on the X chromosome compared to the rest of the genome, regardless of sex. This result, known as the ‘large X effect’, has been replicated in many independent hybridizing species pairs. Together, Haldane’s rule and the large X effect support an important role for the sex chromosomes in population differentiation and speciation.

In sum, three rules appear to be generally true of the evolution of independent sex chromosome systems, suggesting that they result from convergent evolutionary mechanisms. First, Y alleles are often degenerate or entirely missing. Second, at least part of the X and Y chromosomes do not cross over with each other. Third, the sex chromosomes disproportionately participate in population divergence and speciation. The search for the evolutionary and molecular explanations for these three simple observations has revealed complex and inter-related processes.