Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)

Yes, it does get a little muddled. Let’s give this a try, and then maybe you can respond for additional clarification if needed.

Early studies on human variation, prior to the Human Genome Project (HGP), were restricted to working with alleles of single “genes” (in reality, generally short stretches of DNA that included a gene but also some DNA around it). These studies depended on the researchers actually going out and sequencing a large number of people for that specific gene, and then making sense of the allele diversity they found for that region (by modelling using mutation frequency, etc.). These are not PSMC methods, but earlier coalescent-based methods.

For example, this early paper looks at a few such genes for which data was available at the time and concludes this (from the abstract, my emphases):

“Genetic variation at most loci examined in human populations indicates that the (effective) population size has been approximately 10^4 for the past 1 Myr and that individuals have been genetically united rather tightly. Also suggested is that the population size has never dropped to a few individuals, even in a single generation. These impose important requirements for the hypotheses for the origin of modern humans: a relatively large population size and frequent migration if populations were geographically subdivided. Any hypothesis that assumes a small number of founding individuals throughout the late Pleistocene can be rejected.”

Later pre-HGP papers were in agreement with these results. For example, this paper looked at another gene (the PDHA1 gene) and reports a human effective population size of ~18,000.

Another paper from this timeframe looked at allelic diversity of the beta-globin gene and found it to indicate an ancestral effective population size of ~11,000, concluding that “There is no evidence for an exponential expansion out of a bottlenecked founding population, and an effective population size of approximately 10,000 has been maintained.” The authors also state that the allelic diversity they are working with cannot be explained by recent population expansion - the alleles are too old to be that recent. (This also fits with the genome-wide allele frequency data we see later from the HGP.)
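Just to make the arithmetic behind these numbers visible (this is not any paper’s actual pipeline): at neutral equilibrium, per-site diversity is expected to be roughly 4Neμ, so an Ne estimate is essentially diversity divided by four times the mutation rate. Here is a minimal sketch; the function name and inputs are placeholders of my own, of roughly the right order of magnitude rather than values from the papers above.

```python
# Minimal sketch: long-term effective population size from neutral diversity.
# Assumes the standard neutral equilibrium expectation pi = 4 * Ne * mu.
# Inputs are illustrative placeholders, not values from the cited papers.

def ne_from_diversity(pi_per_site: float, mu_per_site_per_gen: float) -> float:
    """Invert pi = 4 * Ne * mu to obtain a long-term effective size."""
    return pi_per_site / (4.0 * mu_per_site_per_gen)

if __name__ == "__main__":
    pi = 8e-4    # pairwise diversity per site (placeholder, roughly human autosomal order)
    mu = 1.5e-8  # mutations per site per generation (placeholder)
    print(f"Ne ~ {ne_from_diversity(pi, mu):,.0f}")  # ≈ 13,000 with these inputs
```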

It is in this timeframe that the Alu paper is also published. It looks at allelic diversity of a different kind. Alu elements are transposons - mobile DNA - and they can generate “alleles” where they insert: the presence of an Alu at a given site is one allele, and its absence is the alternative allele. This paper is also nice because it does not depend on a forward nucleotide substitution rate (i.e. the DNA mutation rate), since Alu alleles are not produced by nucleotide substitutions. This paper concludes that the human effective population size is ~18,000. The authors also state (my emphases):

“The disagreement between the two figures suggests a mild hourglass constriction of human effective size during the last interglacial since 6,000 is very different from 18,000. On the other hand our results also deny the hypothesis that there was a severe hourglass contraction in the number of our ancestors in the late middle and upper Pleistocene. If humans were descended from some small group of survivors of a catastrophic loss of population, then the distribution of ascertained Alu polymorphisms would show a preponderance of high frequency insertions (unpublished simulation results). Instead the suggestion is that our ancestors were not part of a world network of gene flow among archaic human populations but were instead effectively a separate species with effective size of 10,000-20,000 throughout the Pleistocene.”

From here, we start to get into what are really HGP papers but are focused studies on small DNA regions, rather than genome-wide variation. These are still not PSMC studies. For example, this paper looks at a small section of an autosomal chromosome (chromosome 22). They conclude (my emphases):

"The comparable value in non- Africans to that in Africans indicates no severe bottleneck during the evolution of modern non-Africans; however, the possibility of a mild bottleneck cannot be excluded because non-Africans showed considerably fewer variants than Africans. The present and two previous large data sets all show a strong excess of low frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The mutation rate was estimated to be 1.15 10 9 per nucleotide per year. Estimates of the long-term effective population size Ne by various statistical methods were similar to those in other studies. "

A second paper of this type looked at a region of chromosome 1. They also do a variety of estimates of population size for this region, and they conclude the following (my emphases):

An average estimate of ∼12,600 for the long-term effective population size was obtained using various methods; the estimate was not far from the commonly used value of 10,000. Fu and Li’s tests rejected the assumption of an equilibrium neutral Wright-Fisher population, largely owing to the high proportion of low-frequency variants. The age of the most recent common ancestor of the sequences in our sample was estimated to be more than 1 Myr. Allowing for some unrealistic assumptions in the model, this estimate would still suggest an age of more than 500,000 years, providing further evidence for a genetic history of humans much more ancient than the emergence of modern humans. The fact that many unique variants exist in Europe and Asia also suggests a fairly long genetic history outside of Africa and argues against a complete replacement of all indigenous populations in Europe and Asia by a small Africa stock. Moreover, the ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck.

In other words, the alleles we see in the present day cannot be explained as arising after a severe bottleneck in the last 500,000 years.

From here, we’re on to the HGP papers and later the 1000 genomes papers as they extend this sort of thing to the genome as a whole, show the allele frequency spectrum for a much, much larger dataset, and now we start seeing PSMC analyses included. There’s a lot to summarize in those papers, but the take-home message is that they support the same conclusions as the previous work, now using a massive data set. No one looked at the HGP/1000 genomes work and said it was time to revisit the previous conclusion that a sharp bottleneck had been ruled out. On the contrary - the HGP/1000 genomes papers provide additional evidence that the prior work was solid.

So, there’s a fuller treatment of what is glossed over in a few sentences in Adam and the Genome.

I’ll cover linkage disequilibrium (LD) (which is independent of the nucleotide substitution rate) and the single-genome PSMC approaches in my upcoming replies to Richard. Hopefully this gets you (and everyone else) up to speed thus far. Let me know if you’d like clarification on any of the above.


No assumption is needed to deal with variation in mutation rate across the genome. Both the mutation rate and the genetic variation data include contributions from (more or less) the entire genome. It doesn’t matter whether the different parts of the genome contribute uniformly or not – they’re all contributing to both. (Unless you have to worry about multiple mutations at sites, but that’s not the case here.)

Variation in mutation rate with time could cause problems, provided the variation were large enough. There are good reasons to think it’s not in fact an issue, though. First, the high-end mutation rate I mentioned (2 x 10^-8) was calculated by comparison with the chimpanzee genome, so it would include any previous higher rate. As I showed with one plot, using that rate does not qualitatively change the situation. Second, there is no biologically plausible mechanism for changing rates for different mutational processes in sync. If one process had changed rate, I would expect to see that reflected in the proportions of different kinds of mutation over different time scales, but I don’t. In particular, the ratio of mutations at CpG sites to mutations at other sites is the same in intra- and inter-species data, even though they are caused by very different processes.

Structure can indeed be important, but you have the sign wrong. There is a body of theoretical work on the effect of population structure on detecting bottlenecks, and as far as I know, it all points to structure causing spurious signals of bottlenecks, not erasing the signatures of actual bottlenecks. (See this paper, for example, and references therein, in particular John Wakeley’s 1999 paper, in which he concludes that we underestimate the ancestral human population size when we fail to consider population structure and migration.)

[quote=“RichardBuggs, post:53, topic:37039”]
3) As far as I can see the model currently also assumes no admixture from outside of Africa.
[/quote]
This is really just another version of (2), I think. In general, a fragmented population (inside or outside Africa) creates two classes of parts of the genome: those with genetic ancestry entirely within one population, and those with ancestry from a second population. The former will have coalescence times (and therefore diversities) characteristic of the single population, while the latter will have longer coalescence times and higher diversities; their most recent common ancestor has to lie before the time the populations diverged, or at least far enough back for earlier migration to have carried the lineage into the second population. This signature – many regions with low diversity, some with much higher diversity – is also the signature of a bottleneck, in which some regions have variation that made it through the bottleneck and some don’t.
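To see concretely how structure produces these two classes of coalescence times, here is a toy two-deme simulation: a symmetric island model with haploid deme size N and per-generation migration probability m. The function name and all parameter values are my own illustrative choices, not anything fitted to human data. Lineage pairs whose ancestry stays within one deme tend to coalesce relatively quickly; pairs whose ancestry spans both demes have to wait for migration and coalesce much further back.

```python
import random

# Toy structured coalescent for two lineages in a symmetric two-deme island model.
# N = haploid deme size, m = per-generation migration probability per lineage.
# Parameter values are illustrative only.

def pairwise_tmrca(same_deme: bool, N: int = 10_000, m: float = 1e-5,
                   max_gens: int = 2_000_000) -> int:
    demes = [0, 0] if same_deme else [0, 1]
    for t in range(1, max_gens + 1):
        # coalescence is only possible while both lineages sit in the same deme
        if demes[0] == demes[1] and random.random() < 1.0 / N:
            return t
        for i in (0, 1):                      # each lineage migrates independently
            if random.random() < m:
                demes[i] = 1 - demes[i]
    return max_gens                           # censored: did not coalesce in time

if __name__ == "__main__":
    random.seed(1)
    within = [pairwise_tmrca(True) for _ in range(100)]
    between = [pairwise_tmrca(False) for _ in range(100)]
    print(f"mean TMRCA, ancestry within one deme : {sum(within) / len(within):,.0f} generations")
    print(f"mean TMRCA, ancestry spanning demes  : {sum(between) / len(between):,.0f} generations")
```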

While positive selection has certainly occurred in the human lineage, its effect on the overall landscape of genetic diversity is actually pretty hard to pick out, and is almost certainly smaller than the effect of background selection (which acts more or less to reduce the effective population size relative to the census size near functional elements in the genome), and even more so than the effect of neutral drift. There has been a debate whether the effect of positive selection is even detectable.

I assumed that all variants in the founding couple were what they inherited from their ancestors, who were part of a large, constant-sized population. For each simulation, I included as much as was needed to match the predicted and observed data for the higher portion of the allele frequency distribution.


@RichardBuggs,

But both statements are “Deus Ex Machina” objections… you pull this notion out of nowhere… what if things were different a million years ago? Or just 3000 years ago?

I don’t know… what if? When Galileo had the Pope’s representatives look through the telescope, he asked if they could see the imperfections in the Lunar sphere - - craters and mountains and jagged hillsides on the supposedly pristine Lunar plains.

Their answer was that they could just detect an invisible layer of Lunar material covering over these imperfections, to render the Moon, once again, as a divinely perfect object.

Galileo, with his eyes flaring, bends over to look through the telescope again. Then he steps back and concludes: “But gentlemen, I see invisible mountains and craters on top of your invisible perfect lunar plains!”

Propose the fringiest fringe ideas you would like … but you have to start showing results that would support these and related contentions.

@DennisVenema

I think it is pretty clear that if there is a bottleneck, it happened within Africa, and not in the out-of-Africa diaspora.


Hi Dennis,

Thank you very much, that clears things up for me considerably. I look forward to your future discussion on the Linkage Disequilibrium and PSMC approaches. Also, does this then leave us with four methods being discussed here? Earlier coalescent-based methods involving (1) allelic diversity via nucleotide polymorphisms (mutation rate dependent) & (2) allelic diversity via Alu insertions (mutation rate independent), as well as (3) linkage disequilibrium (also mutation rate independent) & (4) single-genome PSMC? So (2) & (3) could both be considered independent checks irrespective of mutation rate? Thanks!

You’re welcome. You’ve basically got it, yes, but be aware that there are a variety of related coalescent methods in the papers cited above, and it’s hard to draw sharp distinctions between them. They do, however, use different raw data sets. The PSMC approach in the 1000 genomes papers is also a form of coalescent analysis, as is single-genome PSMC. But you’re right that the LD and Alu analyses are independent of the nucleotide mutation rate. They are also independent of each other. So at a minimum, we’re looking at three independent lines of evidence (if we want to lump all coalescent modelling together). Obviously, population geneticists don’t lump them all together, otherwise they wouldn’t keep improving them and applying them to larger and larger data sets.


Ah, yes. The point is that if the method is powerful enough to exclude a sharp bottleneck in non-Africans, which have an effective population size (Ne) around 1200, then it is amply able to exclude one for African populations which have a much higher Ne.


Hi Steve, @glipsnort, thanks for your responses to the points I raised about your model. I will respond more in due course, but for now I will just focus on the issue of population sub-structure.

[quote=“RichardBuggs, post:53, topic:37039”]
2) Also, as far as I can see (Steve, do correct me if I am wrong), this approach depends on the assumption of a single panmictic population over the timespan that is being examined. I think it would be fair to say that there has been substantial population substructure in Africa over that timespan and that this has varied over time. To my mind, this population substructure could well boost the number of alleles at the frequencies of 0.05 to 0.2.
[/quote]

Let me just try to explain that in a way that is a bit more accessible to our readers. I am saying that Steve’s model (at least in its current preliminary form) is making the approximation that there is one single interbreeding population that has been present in Africa throughout history, and that mating is random within that population. However, the actual history is almost certainly very different to this. The population would have been divided into smaller tribal groups which mainly bred within themselves. Within these small populations, some new mutations would have spread to all individuals and reached an allele frequency of 100%. In other tribes these mutations would not have happened at all. Thus if you treated them all as a large population, you would see an allele frequency spectrum that would depend on how many individuals you sampled from each tribe. It is more complicated than this because every so often tribes would meet each other after a long time of separation and interbreed, or one tribe would take over another tribe and subsume it within itself. Such a complex history, over tens or hundreds of thousands of years, would be impossible to reconstruct accurately, but would distort the allele frequency spectrum away from what we would expect from a single population with random mating. It gets even more complicated if we start also including monogamy, or polygamy.

I think you will find that John Wakeley’s paper supports the point I am making. My point is only about the approach that you are using: modelling of allele frequency spectra. It is not (for now) about other methods of detecting bottlenecks. The problem for the bottleneck hypothesis that you are posing is the high number of intermediate frequency alleles in present day Africa. I am suggesting that past population structure (post-bottleneck) could explain this. Similarly, Wakeley is seeking in his 1999 paper to explain the fact that, in a dataset he is looking at, “nuclear loci show an excess of polymorphic sites segregating at intermediate frequencies (Hey 1997). This is illustrated by Tajima’s (1989) statistic, D, which is positive…”. Wakeley then goes on to explain this pattern as “due to a shift from a more ancient subdivided population to one with less structure today”. As far as I can see, this supports the point I am making: population subdivision can cause intermediate allele frequencies.

In addition, a paper which built on Wakeley’s work shows that “in simulations with low levels of gene flow between demes… Tajima’s D calculated from samples spread among several demes was often significantly positive, as expected for a strongly subdivided population” (Pannell, Evolution 57(5), 2003, pp. 949–961).

Thus I think it is fair to say that strong population sub-structure for a prolonged period at some point subsequent to a bottleneck would shift allele frequency spectra towards having more alleles at intermediate frequencies.

No, that’s exactly the opposite of the problem. Note that in this context “intermediate frequency” means not close to 0% or 100% (look at the Hey paper Wakeley cites if you doubt this). After your tight bottleneck, you’ve still got a substantial number of intermediate frequency alleles, but you’ve lost almost all of the low frequency alleles.

Tajima’s D for the post-bottleneck scenarios is positive – very positive initially, because heterozygosity wasn’t reduced very much by the bottleneck, while rare alleles were wiped out. The real human population, meanwhile, has a modestly negative D, thanks to the excess rare alleles from population expansion. You’re proposing to add a process to the bottleneck scenario that will make D even more positive.
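For readers following along, here is a bare-bones calculation of Tajima’s D from derived-allele counts (standard Tajima 1989 formulas) that shows where the sign comes from. The two toy “spectra” at the bottom are invented purely to illustrate the sign behaviour; they are not real data, and the function and variable names are my own.

```python
from math import sqrt

# Bare-bones Tajima's D from a list of derived-allele counts (one entry per
# segregating site) in a sample of n sequences.  Formulas follow Tajima (1989).
# The toy spectra below are invented only to show the sign behaviour.

def tajimas_d(counts, n):
    S = len(counts)                                   # number of segregating sites
    if S == 0:
        return 0.0
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)   # mean pairwise differences
    theta_w = S / a1                                              # Watterson's theta
    return (pi - theta_w) / sqrt(e1 * S + e2 * S * (S - 1))

if __name__ == "__main__":
    n = 20
    rare_heavy = [1] * 80 + [2] * 15 + [10] * 5          # expansion-like: excess rare alleles
    intermediate_heavy = [8] * 40 + [10] * 40 + [12] * 20  # bottleneck-like: excess intermediates
    print(f"excess rare alleles:         D = {tajimas_d(rare_heavy, n):+.2f}")          # negative
    print(f"excess intermediate alleles: D = {tajimas_d(intermediate_heavy, n):+.2f}")  # positive
```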


Hi Steve, perhaps I have misunderstood which aspect of your simulations is not fitting with the data. I was going on this comment that you posted near the beginning:

Then putting this comment together with this one:

I got the impression that you were saying that the problem with the model in your 100kya_16K simulation was that between 0.05 and 0.2 on the X axis the model is not predicting enough variants. This is why I suggested that one could invoke population subdivision over part of the last 100Kya to increase the numbers of these intermediate alleles, and if this were included it might be possible to fit the data.

Have you calculated Tajima’s D for the data and simulation in the 100kya_16K chart? How do they compare?

I completely agree with you that the immediate effect of the bottleneck would be a positive Tajima’s D, but I thought your argument was that 100Kya later the intermediate frequency alleles derived from the bottleneck had a very small - almost negligible - effect on the allele frequency spectrum, which was now dominated by new mutations.

I am sure I must be misunderstanding something here.

Another thing to keep in mind here is that the allele frequency spectrum is a smooth distribution. If you’re proposing that it was cobbled together from different demes after a bottleneck, you’ll have to account for the shape and fit of the whole curve, not just offer an explanation that happens to boost the frequency of some alleles due to population structure.

Back to being a fly on the wall (for now).

At least biology-trained flies on the wall can sort out for themselves what’s reasonable or not in this exchange. Being a non-expert fly on the wall is interesting but also pretty confusing.


That is the problem. The model doesn’t predict enough alleles in that frequency range relative to alleles around 50%, which are also intermediate alleles.

Let’s back up and think about it in terms of first principles. After the bottleneck, you have enough alleles around 50% frequency, but a great shortage of lower frequency alleles. These can only be replenished by mutation followed by drift, and you have to get both processes to work in the time allotted. In a panmictic population, drift does not occur fast enough to move new mutations up to ~30% frequency in time.

Subdividing the population doesn’t affect the mutation component at all: it’s the same number of individuals mutating. Subdivision does affect the drift process, and it slows it. You can see this intuitively. When there is one copy of a new allele, it matters hardly at all whether it is in a panmictic population or in a small deme: it will be passed on to zero, one or two descendants regardless of how big the immediate population is. When it becomes more common, though, an increase in frequency is held back if it’s in a local deme, because the new number of copies can’t exceed the deme size, while there is no such constraint in the panmictic population.

Alternatively, treat it mathematically. Drift speed is governed by the variance of the sampling process; large variance = fast drift. Consider a new mutation in two populations, each with 10,000 individuals, one panmictic and the other divided into 10 demes. When there is one copy of the new allele, the variance is virtually identical for the two populations: Binomial variance = np(1-p). For population 1, n = 10,000, p = 1e-4, variance = .9999. For population 2, n = 1,000, p = 1e-3, variance = .999. (Not surprisingly, both are almost equal to the variance for an equivalent Poisson process, which this almost is.) But when the allele is at a higher frequency, the variances diverge. When there are 500 copies, for pop 1, p = .05 and the variance = 475, while for pop 2, p = .5 and the variance = 250. Drift proceeds faster in the large population.

Migration reduces but does not reverse this effect. If the allele has migrated to a second deme, with say 400 copies in the first and 100 in the second, the variance on the total number is the sum of the variances for the two demes, or 240 + 90, which is still less than the variance for a panmictic population. (The two only drift at the same rate when every deme has the same number of copies.)
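For anyone who wants to check the arithmetic, here are the same comparisons as a few lines of Python (the helper function is mine; the numbers are the illustrative ones used above):

```python
# Reproduces the binomial-variance comparison above: drift speed is driven by the
# sampling variance n*p*(1-p), so a brand-new allele drifts at essentially the same
# rate everywhere, but a common allele drifts faster in a panmictic population than
# when it is confined to (or spread unevenly across) demes.

def binom_var(n: int, copies: int) -> float:
    p = copies / n
    return n * p * (1 - p)

# one new copy: panmictic population of 10,000 vs a single deme of 1,000
print(binom_var(10_000, 1))          # ≈ 0.9999
print(binom_var(1_000, 1))           # ≈ 0.999

# 500 copies total
print(binom_var(10_000, 500))        # ≈ 475  (panmictic)
print(binom_var(1_000, 500))         # ≈ 250  (all copies in one deme)

# 500 copies split 400 + 100 across two demes of 1,000: variances add
print(binom_var(1_000, 400) + binom_var(1_000, 100))   # ≈ 240 + 90 = 330, still < 475
```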

Since your basic problem is that you have to speed up drift somehow while still producing lots of new mutations, subdivision cannot help you.


Thanks Steve for your patience.

This was how my argument was working: immediately after a bottleneck, Tajima’s D is positive (due to the intermediate frequency alleles, at 25% and 50% minor allele frequency). Some time after the bottleneck, let’s call this “Phase 2”, Tajima’s D is negative (due to the excess of very low frequency alleles). Longer after this, Tajima’s D ends up closer to zero as the population comes to equilibrium. That is my understanding of the expectation - I think we are on the same page here.

The issue of fitting the bottleneck model to the data was that the Tajima’s D of the 100kya_16K model seemed more negative than the actual data (I am not sure if the actual data has zero Tajima’s D or positive Tajima’s D - if it is similar to the data that Wakeley (1999) was looking at then I guess it would have been positive). So we seem to be in “phase 2”.

This was a similar problem to that faced by Wakeley: Tajima’s D seemed too positive in his data. Wakeley solved this by invoking population sub-division. Similarly in models by Pannell (2003) Tajima’s D tended to be more positive in a strongly sub-divided population, when samples were collected from several demes.

Thus it seemed reasonable to me to suggest that population subdivision could be causing more positive Tajima’s D in the 1000 genomes project African populations, despite a past extreme bottleneck. Do you disagree with Wakeley’s use of population sub-division to explain a higher Tajima’s D, or did he make a mistake, or am I misunderstanding his paper? (I suspect the latter may well be the case, but I would like to know where I am going wrong).

I take your point that if we are thinking in terms of overall allele numbers, drift happens more quickly in a large population. I.e., an allele can get more quickly from 1 copy to 500 copies in a single population of 10,000 individuals than it can in a single population of 1,000 individuals. However, the allele can more quickly go to fixation in the population of 1,000. Once an allele has reached 1,000 copies in a population of 1,000 it cannot be lost, whereas an allele at 1,000 copies in a population of 10,000 can be lost by drift as easily as it arrived. Thus population subdivision will enhance the retention of new mutations as they get fixed in sub-populations. If the sub-populations mix again in the future, or if we sample across the sub-populations treating them as a single population, we will then find an excess of intermediate frequency alleles. In addition, it is harder for an allele to go to complete fixation in the whole metapopulation when there is population sub-division - in fact that will be rare. So the number of new mutations that end up neither being lost nor going to complete fixation is higher in a subdivided metapopulation. More of them get stuck at intermediate frequencies. This is my intuitive understanding of Pannell’s result. That is the point I have been trying, perhaps not very clearly, to make up until now.
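To make the within-deme part of that intuition concrete (without settling the wider argument about what the mixed metapopulation spectrum would look like), here is a toy neutral Wright-Fisher sketch that counts allele copies directly, with no selection and no migration; the function name, replicate counts and population sizes are my own choices. As standard neutral theory predicts, a new mutation fixes in a deme of 1,000 copies roughly ten times as often as in a panmictic population of 10,000 copies, though fixation in one deme still leaves it at only a tenth of the metapopulation.

```python
import numpy as np

# Toy neutral Wright-Fisher model with straight copy counting: a new allele starts
# at 1 copy out of N and its copy number is binomially resampled each generation
# until it is lost (0) or fixed (N).  No selection, no migration.  Illustrative only;
# estimates are noisy because neutral fixation is rare (probability ~ 1/N).

def fixation_probability(N: int, reps: int = 50_000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    fixed = 0
    for _ in range(reps):
        copies = 1
        while 0 < copies < N:
            copies = rng.binomial(N, copies / N)   # resample the next generation
        fixed += (copies == N)
    return fixed / reps

if __name__ == "__main__":
    print("deme of 1,000 copies:    ", fixation_probability(1_000))    # ~0.001
    print("panmictic 10,000 copies: ", fixation_probability(10_000))   # ~0.0001
```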

However, I also have another possibility that I would like to raise. What if the high number of intermediate frequency alleles in the African 1000 genomes study were due to a more recent bottleneck - i.e. we are still in the phase where the bottleneck causes a positive Tajima’s D? How recent would the bottleneck need to be for that to be the case?

I have a suspicion the answer to my last question would be a date that is so young as to not be able to fit with known archeology. If so, I am wondering if perhaps population subdivision could act to slow down the loss of the intermediate alleles that are derived from the bottleneck itself. If there was a bottleneck of two, followed by rapid expansion of multiple sub-populations of offspring, most of the intermediate frequency alleles could be maintained in the metapopulation as a whole, giving intermediate frequencies long into the future.

This brings me to something that I wanted to follow up on earlier:

[quote=“glipsnort, post:67, topic:37039”]
I assumed that all variants in the founding couple were what they inherited from their ancestors, who were part of a large, constant-sized population. For each simulation, I included as much as was needed to match the predicted and observed data for the higher portion of the allele frequency distribution.
[/quote]
I was wondering why you could not include as much variation from the pre-bottleneck population as was needed to cause the simulations to fit the data in the 20-50% range? Why only focus on making them match at the 60-70% frequency range? I didn’t quite follow the logic here, with your brief description.

Thanks again for your patience. BTW, I accept the point about mutation rates that you made earlier: that you are being quite generous to the model with your mutation rates. I still need to think a bit more about the selection and admixture issues, though I agree with you that the admixture issue is in effect another take on the population sub-division issue.

@RichardBuggs,

I think you folks could be playing with these numbers from here to doomsday… trying to see what they include!

Wouldn’t it be easier to see what they exclude?

What if you arranged your data with all of your most optimistic - - but still reasonable - - assumptions, and used three different baseline rates:

1] the Low Baseline: the average rate for conventional multi-rate alleles;

2] the Medium Baseline: 1.5 x the average rate for conventional multi-rate alleles;

3] the High Baseline: 2 x the average rate for conventional multi-rate alleles.

Run these 3 baselines against 3 time frames:

A] Low, Medium & High for 6000 years;
B] Low, Medium & High for 10,000 years (a 66% longer timeframe); and
C] Low, Medium & High for 500,000 years (an 8200% longer timeframe!).

So, using your most optimistic premises, and a broad range of baseline mutation rates, compare the resulting allele diversity factors for these 9 scenarios to the allele diversity factors that the human population currently presents.

Do all of them meet and exceed current diversity? Or do just a few?
This will at least give you folks something concrete to fiddle with … Right now, all I see are people with the pointer finger touching their foreheads and winging it.

**These are not deadly radioactive chemicals we are working with here.** They are Rates of Change and “N’s”… oh, and I suppose a few other factors. But it is still a pretty finite universe of numbers to put together.

Hi Dennis, just a quick note to thank you for the five papers that you have pointed out in your recent posts. I have downloaded them and started to work through them. I will come back with comments in due course.


Sounds good - a little light reading for you… 🙂

Hi @glipsnort. A couple of quick questions (I think…).

If the generation times were faster, perhaps closer to puberty at 15 years:
How would that affect the mutation rate per generation? Would it be less per generation because they are younger, or are most of the errors copying errors anyway from when the germ cell first forms so the mutation rate is about the same?
Secondly, if the mutation rate stayed “the same” and the generation time reduced, would that pull the X axis in proportionally? That is, would a 40% reduction in generation times produce a 40% reduction in the X axis?

George - I can’t find it now, but I recall Steve pointing out that the graphs assume fairly constant mutation rates, which is a fair place to try. But, for example, if our solar system goes through regions of space that are higher or lower in various types of radiation or other toxicity, they could vary some. I think the point of @RichardBuggs is not to just cast doubt, but to draw legitimate boundaries around what can be said with greater or lesser confidence. Scientists don’t want to be guilty of assumptions that could be proven false, but if those assumptions are explicit because there is no better option at this time, well, better to identify those areas. I appreciate Steve stating it up front, and I think it appropriate to recognize it as a fair working assumption which could perhaps be open to revision.

Now for the scientists, is there any way to assess variability in historical mutation rates? And for Steve, if the mutation rate halved, for example, would that double the X axis of your graphs?

@Marty

Well, I think we all know that numbers can vary. But to have numbers “seriously challenged” with absolutely no evidence to support the speculation is a bit of a stretch. Frankly, we can speculate that the change is actually in favor of the BioLogos conclusions, rather than the opposite, right? I don’t remember seeing any “equal time” on how the wild speculation could go… But, I’ll tell you what, Marty, I like your point!

I think we should really put some teeth into it though - don’t you? For our sakes and for @RichardBuggs as well. Here is a useful article for accomplishing that:

http://curious.astro.cornell.edu/physics/55-our-solar-system/the-sun/the-sun-in-the-milky-way/207-how-often-does-the-sun-pass-through-a-spiral-arm-in-the-milky-way-intermediate


The article provides approximate speeds for our solar system, relative to the entire Milky Way galaxy:

The solar motion on top of its circular orbit about the centre of the Galaxy (which has a period of about 230 million years) can be described by how fast it is going in three different directions:

U = 10 km/s (radially inwards) [ < towards the center of the galaxy ]
V = 5 km/s (in the direction of Galactic rotation)
W = 7 km/s (northwards out of the plane of the Galaxy)
. . . . . [ moving either up or down, perpendicular to the “flat” disk of the galaxy]

“Of course the Sun won’t keep on going in this direction forever. In fact we approximate its motion by an ‘epicycle’ on top of the mean motion around the Galaxy. The period of oscillation in and out of the plane of the galaxy (up and down) is about 70 million years. This means that we pass through the Galactic midplane about every 35 million years. . .”

I wonder how far @RichardBuggs thinks our solar system can move in just 6,000 years. It’s pretty far in terms of miles, but not very far compared to the rest of galactic space embraced by the arms of the Milky Way!

6,000 years is a fraction 0.00017 of 35,000,000 years - less than 0.02%. That’s not 2%. That’s 2 one-hundredths of just one percent!


I should point out that cosmologists not only have to look across the width of the Milky Way to make their observations, but must also measure light and gravity across the immense distances between galaxies… not just the widths of galaxies… and they have not detected any inconsistencies in the behavior of light or matter in any direction they look - - other than what is expected from the General Theory of Relativity!

Hi Dennis, I’ve had a read of Zhao et al (2000) PNAS now. Here is what you said about it to @tallen_1 :

I am not sure why you highlighted the words in the abstract about the authors not finding evidence for a bottleneck of modern non-Africans because that is not what we are discussing here. We are discussing the possibility of a bottleneck in the human lineage, not just non-African humans.

The authors estimates of human effective population size are between 8100 and 18800. These estimates are based on present day numbers of segregating sites in the sample sequences, and estimates of mutation rate. This method assumes a fairly constant population size over time. Thus, although they are estimating present effective population size, they are happy to extrapolate this into the past, and call their estimates “long-term effective population size”. This phrase is essentially an expression of their assumption, which is necessary for their method, that population size has remained fairly constant. They say in their discussion: “The lowest value (8,100) suggests that the long-term effective population size of humans is unlikely to be lower than 5,000” but this is 5000 figure is not supported by a calculation: it is seems to be a figure chosen for being a round number. “Long-term” is not defined in terms of number of years. No historical reconstruction of effective population size at different time-points in history is given. I struggle to see this paper as evidence that there was never a short sharp bottleneck in human history.