Category Archives: genetics

DNA Confirms Tyrolean Iceman Died of Extreme Fashion Violation

In 1991, the five-thousand-year-old mummified remains of a man were discovered in the Italian Alps. Numerous DNA analyses have been performed on those remains in the past, providing a lot of information about who he was and what he ate. But now, a team of Italian and Irish researchers has analyzed ancient mitochondrial DNA recovered from his clothing, providing critical insight into his likely cause of death.

The results were published earlier this month in Scientific Reports, and hoo boy is the ghost of Joan Rivers angry. Shoelaces made from cattle, sheepskin loincloth, and goatskin leggings; a quiver made from roe deer and a bearskin hat. His coat? Goat and sheep! I mean, this guy was basically wearing a Guy Fieri nacho recipe.

Artist’s reconstruction of the iceman’s ensemble. Image via https://www.pinterest.com/pin/69242912993643041/

The results provide information about the ancient phylogeography of these animals, as well as insight into the iceman’s lifestyle. The cattle, goats, and sheep all appear to be closely related to contemporary domesticated populations in Europe, consistent with an agricultural/pastoral existence. The deer and bear point to an important additional role for the use of wild species.

Other analysis has suggested that this iceman died as a result of an arrow wound. Alternative theory: this is evidence of early mirror technology — a technology that, like the nanobots in Wool, developed before the culture was advanced enough to handle it.

Jerry Coyne sees a picture of my poster on twitter, is a dick

Last week I was in Raleigh, NC for Evolution 2014, this year’s edition of the annual joint meeting of the Society for the Study of Evolution, the Society of Systematic Biology, and the American Society of Naturalists. I brought a poster that presented some work I’ve been doing on how noise (e.g., environmental fluctuation) can select for epistasis (non-linear interactions among different genes). On Monday evening, when I was hanging out by my poster, someone showed me that my Acknowledgements section had made it to twitter:

[Screenshot of the tweet]

Now, this is what you might call an underdetermined tweet. Not knowing Alex, I was not sure what the intent was. Was this a sanctimonious evolutionary biologist expressing outrage about the fact that I had funding from the Templeton Foundation, which is viewed skeptically by many biologists because of its interest in religious topics? Or was it someone who thought it was unacceptable to include jokes in your acknowledgement section?

Fortunately, it turns out to be the only answer that won’t make you lose faith in humanity: it was someone being ironic. Alex Stewart is a postdoc working with Josh Plotkin at U Penn. Last year Alex and Josh published a paper in the Proceedings of the National Academy of Sciences, and in their acknowledgements section they thanked a number of funding sources, including the “Foundational Questions in Evolutionary Biology Fund”.

Through some mechanism that appears to be named “Todd”, this paper was brought to the attention of Jerry Coyne, evolutionary biologist at U Chicago and blogger at Why Evolution is True. Along with people like PZ Myers and Richard Dawkins, Coyne is a prominent and vocal critic of Intelligent Design and of the efforts by religious groups to undermine the teaching and study of evolution.

So why did Coyne care about this paper? Because apparently, if you look into the Foundational Questions in Evolutionary Biology Fund, it is a big pot of money at Harvard University, which was put there by the John Templeton Foundation. Coyne gets mad whenever anyone has anything to do with Templeton. And, in this case, he professed outrage over the fact that they had “disguised” their funding source.

Now, I’m guessing that when Coyne saw Alex’s picture, he did not notice or recognize the name, as he seems to have read the comment as straight-up, unironic outrage, and he jumped on the bandwagon with his own short post, summed up by his comment, “Unacceptable indeed!” He goes on:

Okay, who are these miscreants?

The good news is that scientists clearly recognize the woo-ish nature of Templeton, as well as its nefarious mission to pollute science with religion. (Note, though that, contra the slide, Templeton has disavowed all forms of creationism, including intelligent design.)

The bad news is that four collaborators on this project took Templeton money anyway.

This is pretty awesome. I don’t think I’ve ever been called a miscreant before.

The rest of this is sort of weird, though. I think it is fair to say that the scientists recognize the fact that Templeton is perceived as “woo-ish” by many in the evolutionary biology community. However, most of their funding these days, including this grant, is pretty much straight-up science. Also, if you read the acknowledgements, which state that the work is not creationist, and then read Coyne’s comment, it makes you wonder if he knows what contra means.

That is, the point here is not to say, “This is not creationism, unlike most of what Templeton funds”. The point is to say, “This is not creationism, despite what you might wrongly assume about what Templeton funds.”

Most of the comments go back and forth on the issues you’d expect, so I’ll limit myself to Coyne’s. His first is this:

Would you take money from the Council of Conservative Citizens (a segregationist organization), or the Tea Party to do pure science? How about the Nazi Party? Is there no organization so nefarious that you wouldn’t take their money?

Really, people take it not to further the science, but to further their careers, because you need funding to get tenure, promotions, and so on.

I do fault those who take Templeton money, for they’re lending their imprimatur to an organization whose aim is the corruption of science. That’s precisely why Templeton funds “pure” science–to give them cover for their investigations of “spirituality and science”–the so-called “Big Questions.”

That’s some classic straw-man bullshit right there, although you have to give him credit for going full Godwin in his first comment, thereby saving his community the embarrassment of being the first to bring up a bogus Nazi analogy. That’s leadership!

And while I can be (and have been) accused of many things, compromising my principles to get tenure may be the least valid.

As for my imprimatur, I can’t lend it to Templeton, because back when I was in Santa Fe, I loaned it to Jeremy Van Cleve, and he never returned it!

And how do you take money from an organization like that without compromising “all principles”? The same way you take money from The Council of Conservative Citizens without compromising all principles?

The fact is that you’ve compromised principles simply by taking the money.

First off, again, the analogy with segregationist or racist organizations (which, to the best of my knowledge, are not major sources of science funding anyway), is ridiculous and seems disingenuous. Or maybe Jerry Coyne honestly believes that Templeton is a force for evil on the level of The Council of Conservative Citizens or the Nazi Party, but I think that’s a position that would be hard to find much support for, even among evolutionary biologists.

So we’re left with what, a slippery slope argument, maybe? Or a one-drop argument? There are two problems with that. First, whether or not you let your scientific conclusions be influenced by your funder’s (perceived) agenda is up to you. If you’re an honest scientist, you do your work and say what you believe to be true. If some agency or foundation won’t fund you in the future because they don’t like what you said, so be it.

Second, the same argument applies to all sources of funding. For example, the funding structure at NIH strongly rewards confirmation bias and the overinterpretation of marginally significant statistics. As a consequence, the biomedical literature is riddled with unreproducible results. The funding, hiring, and promotion structures in academia have done far more to corrupt science than even the bogeyman version of Templeton that inhabits Jerry Coyne’s mind.

Maybe the intent [of the acknowledgement] is a bit nebulous, but it’s factually incorrect (Templeton doesn’t fund crea[ti]onism or ID any longer), and unprofessional as well. If you’re going to take money from someone, you don’t diss them in public. I bet if Templeton found out about this (I won’t tell them!) they wouldn’t give any more $$.

I love this last one, as it concisely captures the angry incoherence of the argument. First, it accuses me (inaccurately) of claiming that Templeton funds creationism. Second, it accuses me (inaccurately) of dissing them. Third, it suggests that if Templeton found out they had been dissed (which they weren’t), they would spitefully refuse funding in the future (which they probably wouldn’t).

Note that the overarching theme here is that I am bad and foolish, but for contradictory reasons:

I’m bad for taking money from an organization with an alleged religious agenda. But look! I’m foolish, because they have actually renounced that agenda! Gotcha!

I’m bad for taking money from an organization with an agenda, because I will constrain what I say, for fear of losing future funding. But look! I’m foolish, because I did not constrain what I said! Gotcha!

So, if I may build on Coyne’s Nazi analogy, the morality being proposed seems to be something like this: Jerry Coyne would never take money from the Nazi Party, but if he did, he would never publicly criticize the Nazis!

But, to be fair, these comments were probably never meant to support such a close reading. A more accurate characterization might be that Coyne starts from the ideological position that no one should ever take money from Templeton. As someone who has received funding from Templeton, I am therefore someone who is bad and foolish. Starting from this “bad and foolish” conclusion, Coyne works backwards, using whatever evidence and arguments will get him from my poster to that conclusion. This includes misinterpreting my statement (through disingenuousness, carelessness, or a combination of the two) as well as employing arguments that seem logically inconsistent.

Or maybe this is a Colbert-like performance piece, where he takes on the persona of “Jerry Coyne” to illustrate how dogmatically espousing an ideology corrupts the reasoning process. If that’s the case, my hat is off to you, Professor Coyne! Well played!

12,000-Year-Old Underwater Skeleton and the Peopling of the Americas

So, a paper published in Science yesterday describes the analysis of the skull and mitochondrial DNA of a skeleton discovered in Hoyo Negro, a water-filled cave beneath the surface of the Yucatán Peninsula. In addition to the human skeleton (whom the scientists named “Naia” before removing her head for further study), the cave contains the remains of 26 other large mammals, including a saber-tooth tiger and some sort of a mammoth-type thing.

Check out the story over at National Geographic for some cool underwater pictures.

There are a couple of things that make this an interesting story. First of all, it’s a freaking underwater cave with a 12,000-year-old human skeleton and a saber-tooth tiger. Second, it adds an interesting piece of data to our understanding of how people first came to America. (Spoiler: the answer is not “Jesus brought them on the Ark”.)

The standard story of the colonization of the Americas goes something like this. Back during the last ice age(s), maybe 15,000 to 25,000 years ago, the sea levels were lower, and there was a land bridge connecting Siberia to Alaska. During that period, people from Northeastern Asia crossed over and spread throughout North and South America. Thousands of years later, their descendants had the misfortune of being discovered by the Europeans.

The dates of archaeological sites throughout the hemisphere generally fit with this story, as do genetic data collected from contemporary Native Americans and from skeletal remains. Native Americans, both past and present, are genetically most closely related to the peoples of Siberia, and the genetic divergence between the two groups is consistent with the populations having separated around the time when the land bridge existed.

The problem is that when you look at skull shapes (“cranio-facial morphology”), they seem to tell a different story. Contemporary Native Americans have facial features similar to those found in Northeastern Asia. But “Paleoamericans” (dating from more than about 9,000 years ago) have features more closely resembling those found in African and Southeast Asian populations.

Those features suggest a different story, one where humans arrived in America in two waves. In this scenario, the humans who crossed the Bering land bridge would be the second wave, perhaps displacing the original, first-wave settlers. This is a story that entered the public consciousness more than fifteen years ago, following the discovery of “Kennewick Man”, who was described as possessing “caucasoid” features by James Chatters, who is also the first author on this paper. A certain strain of “thinker” took this to mean that the White people who came to America were not colonizers, but liberators, having been the continent’s original inhabitants.

The single-wave model suggests the possibility that the difference in skull morphology observed between earlier and later Paleoamericans represents evolutionary change that occurred after the migration across the land bridge. At first blush, this seems a bit questionable, since it would have the American population evolving to more closely resemble their genetic relatives in Asia, but only after having become geographically separated from those relatives.

The persistence of this controversy is due, in part, to the fact that the genetic data has generally come from different sources than the morphological data. This is where Naia comes in. Naia has the longer, more slender, Africa-esque cranium found in other early sites, but her mitochondrial DNA haplotype is a typical Native American one. This seems to support the idea that the people who left these narrow skulls all over America and the people who left their descendants all over America were the same people.

The biggest caveat, of course, is that this is a single skeleton. It is exciting and informative, since very few samples of this age have been discovered, and none of them have been of this quality. But those small numbers also mean that anything we discover about this skeleton is bound to be consistent with multiple stories, and things are unlikely to be resolved without a lot more data.

The other caveat is that the mitochondrial DNA is only one piece of the genetic history. It is possible that these really were two separate populations, and that Naia just happened to have some second-wave ancestors on her mother’s side. If we were to examine the rest of her genome, we might find some or all of it to be more similar to some other population (like the lost thirteenth tribe, who immigrated to America from Israel and/or Kobol).

Will we get the rest of Naia’s genome? I hope so, but we’ll see. It is relatively easy to collect mitochondrial DNA from archaeological samples, since there are hundreds of copies of the small, circular mitochondrial chromosome in each cell. There are only two copies per cell of the rest of the genes, which reside in the cell’s nucleus. So, it is possible that the sample was sufficiently well preserved that mitochondrial DNA could be extracted, but degraded enough that the nuclear DNA is not recoverable.
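To put rough numbers on that copy-number argument, here is a back-of-the-envelope sketch. It assumes (purely for illustration) that each copy of a given DNA region survives independently with the same tiny probability. The 500 copies per cell, the 100 sampled cells, and the survival probability are all made-up values, not measurements from the Hoyo Negro sample.

```python
def prob_at_least_one_copy(copies_per_cell: int, cells: int, p_survive: float) -> float:
    """Probability that at least one intact copy of a region survives,
    assuming every copy survives independently with probability p_survive."""
    total_copies = copies_per_cell * cells
    return 1.0 - (1.0 - p_survive) ** total_copies


# Hypothetical inputs, chosen only to illustrate the contrast.
cells = 100           # cells' worth of material in the sample (assumption)
p_survive = 0.0001    # chance that any single copy is still usable (assumption)

mito = prob_at_least_one_copy(500, cells, p_survive)   # "hundreds" of copies per cell
nuclear = prob_at_least_one_copy(2, cells, p_survive)  # two copies per cell

print(f"mitochondrial region recoverable: {mito:.3f}")
print(f"nuclear region recoverable:       {nuclear:.3f}")
```

Under these toy numbers the mitochondrial region is almost certain to be recoverable while any given nuclear region almost certainly is not, which is the gap the paragraph above is describing. Real ancient-DNA preservation is messier than an independent coin flip per copy, so treat this as intuition only.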

Whatever the eventual conclusion, the story will be interesting. Either the peopling of America involved a mixture of multiple populations that will be fun to unravel, or it involved some interesting, almost convergent, morphological evolution. Stay tuned!

Five Reasons Biologists Should Use Preprint Servers

So, following my previous post, I got some interesting feedback from a couple of biologists who were not completely sold on the idea of posting preprints of your work to the arXiv (or, now, the bioRxiv). Or, rather, they were not convinced that the cost-benefit calculus worked out in favor of posting. After all, as one person pointed out, there are already a bunch of hoops to jump through on the way to publication, what with formatting, revising, angrily cursing reviewer number 2, reformatting, resubmitting, and whatnot. What does posting to a preprint server do for you, beyond adding another step?

Well, it occurs to me that this is probably a question shared by a lot of biologists out there, so I thought I would share the reasons I’ve come up with.

  1. Open Access. You want your work to be available to the widest possible audience, right? When some enthusiastic young researcher is searching the literature, and they stumble across your seminal work on tribble parthenogenesis, you don’t want them getting Spock-blocked by some crappy paywall. Sure, maybe their University has an overpriced subscription to the obscure journal you published in, but maybe it doesn’t. Or maybe it’s a pain for them to access it via the proxy server when they’re off campus. Or maybe this young researcher doesn’t have access because he and/or she is an independent scholar, because they’re actually too smart and creative to work for The Man. Maybe they’ll just scroll down the search results page until they find another paper by your grad-school nemesis — the one who never chipped in his fair share for pizza — and you’ve lost another citation. Don’t let this happen to you! Make sure that your work is freely, and easily, accessible to everyone everywhere.
  2. Speed. You’ve finished your research, you’ve blindly written down the p-value that the software you downloaded from the internet spit out — er, I mean, “double-checked the statistics”, and you’ve written a beautiful discussion section that skillfully implies that your results are going to revolutionize not only your own field, but any field whose scientists have sufficient foresight to follow in your footsteps. But now you have to wait for six months or a year, or maybe longer, before your paper appears in print, and, of course, by that time, even you will have moved on to more interesting problems. If you post to a preprint server, though, your work is available immediately. And, if you make revisions in response to reviewer comments, you can post the revised version there, too. Some journals (e.g., Evolution) will even let you post the final, published, journal-formatted PDF to the preprint server after some time (12 months following publication for Evolution). So, the fact that you’re getting your work out there early does not mean that you’re committing to something less than the final version.
  3. Normalization. At this point, most biology journals are okay with authors posting their manuscripts to preprint servers, but some still are not. Not to name names (*cough* Elsevier *cough*), but some publishers would still like to hold on to an outdated publishing model where they can earn obscene profits through ownership of a product to which they contribute little to no value. The more biologists publish preprints — and commit to publishing only in journals that permit prepublication — the more pressure it places on publishers to stop rent-seeking. Basically, it is a really easy way to nudge the world of academic publishing in the direction of justice. Or, you know, if you prefer, you can keep feeding those paywall parasites like the rest of your Vichy scientist colleagues. No judgment here.
  4. Feedback. When you’re desperately worried about getting out publications so that you can get your degree, or get tenure, or whatever, it is easy to forget the real purpose of peer review. In an ideal world, peer review means that experts in your field look closely at your work and help you to make it better. By posting a preprint, you are able to get comments from the entire community — at an early enough stage that those comments might actually help you to improve the paper before it fossilizes.
  5. The Left Side of History. Look, the fact is, this is the direction that everything is moving. And you need to ask yourself, years from now, do you want to be the stodgy, old, out-of-touch professor who doesn’t post preprints, and who has to get their grad students to help set their powerpoint presentation to full-screen mode? Or do you want to be the super-cool hipster prof, who could say things like, “I’ve been posting on bioRxiv since you were in diapers”, but who would never actually say that, because it would make you sound like a total dickhead? At future Thanksgiving dinners, do you want to be your field’s Liz Cheney, or its Mary Cheney?*

* Answer: You want to be your field’s Lon Chaney.

Enter the bioRxiv

So, if you are a Physicist, or if you know a Physicist and are very patient, you’ve heard all about the arXiv, the preprint server that kicked off the open-access publication movement. If not, here’s what you need to know. The idea is that when you write up a paper, you post it online, where it becomes immediately and freely available to the world. If you revise, you can post the revised paper. And, even if you go on to publish the work traditionally, there will be a version out there that is not behind some journal’s paywall.

Most arXiv users do, in fact, go on to publish their work in traditional, peer-reviewed journals. But by posting to the arXiv first, you get your work out quickly. If you’re a naive idealist, this lubricates the flow and speeds the creation of knowledge. If you’re a paranoid careerist, it allows you to date-stamp your ideas to guard against being scooped.

While the arXiv has a “Quantitative Biology” section, preprint culture has never really taken hold in the Biology community the way it has in Physics. But here’s something that will maybe help to push things in the right direction: bioRxiv, Biology’s very own preprint server! The server features twenty-four sub-fields of Biology, and, as of this writing, Evolutionary Biology is WINNING with eight posted manuscripts.

If you’re worried about whether posting a preprint of your manuscript might interfere with your ability to publish in a traditional journal:

  1. Grow a pair of non-gender-specific gonads!
  2. Look into the pre-publication policies of various publishers here. (And, if you’re planning to publish somewhere that prohibits preprints, rethink your priorities, you collaborator!)

Now get to posting!

Gene Patents Overturned — and Scalia’s Weird Dissenting Opinion

So, the Supreme Court just ruled that Myriad Genetics does not, in fact, have the right to patent two naturally occurring human genes, BRCA1 and BRCA2. This is good news, because . . . well, because patenting a gene is total bullshit.

If you’re not familiar, these two genes are important because genetic variation in their DNA sequences has been linked to breast cancer. So, the sequence of your DNA in these two genes can reveal if you have a higher-than-average risk of developing breast cancer. It was exactly this sort of test that prompted Angelina Jolie to undergo a preemptive double mastectomy.

The problem is that the tests were really, really expensive, because of Myriad’s patents. So, the immediate consequence of the ruling should be that the prices for these tests should come way, way down.

The opinion (PDF here, if you’re interested) focuses on the difference between “discovering” something — like the sequence or location of a gene — and “creating” something — like a thing that can be patented. So, a gene is a naturally occurring thing that can not be patented. However, if you take the mRNA from a gene and reverse-transcribe it to make cDNA, this new thing might still be patentable. But, the ruling explicitly notes that the cDNA would be a creation because of the removal of introns. So, cDNA from a single-exon gene might not be patentable.
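The single-exon wrinkle is easier to see with a toy example. The sketch below is my own illustration, not anything from the ruling: the sequences and exon coordinates are invented, and it models only the coding strand, treating cDNA as "the gene with the introns spliced out".

```python
def cdna_from_gene(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Return a toy cDNA (coding-strand view) by joining exon intervals.

    Exons are (start, end) pairs, 0-based and end-exclusive. Going through
    mRNA and back (T -> U -> T) changes nothing on this strand, so the net
    effect of transcription, splicing, and reverse transcription is simply
    that the introns disappear.
    """
    return "".join(genomic[start:end] for start, end in exons)


# Made-up two-exon gene: exon, intron, exon.
multi_exon_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
multi_exons = [(0, 6), (18, 24)]

# Made-up single-exon gene: the whole thing is one exon.
single_exon_gene = "ATGGCCGATTAA"
single_exons = [(0, 12)]

multi_cdna = cdna_from_gene(multi_exon_gene, multi_exons)
single_cdna = cdna_from_gene(single_exon_gene, single_exons)

print(multi_cdna != multi_exon_gene)    # cDNA differs from the natural sequence
print(single_cdna == single_exon_gene)  # cDNA matches the natural sequence
```

With introns to remove, the cDNA is a sequence that does not appear in the genome as such. With a single exon, there is nothing to remove, and the "synthetic" molecule has the same sequence as the natural one, which is why its patentability looks shakier.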

The ruling explicitly states that it offers no opinion on the patentability of genes that have had their DNA sequences deliberately altered — leaving that question for another day.

It also points out limitations of the ruling with respect to plants. The goal here seems to be to ensure that this ruling is not interpreted as invalidating any plant patents covering plant strains that have been developed through selective breeding.

That all seems pretty straightforward. The ruling does seem to leave a number of issues surrounding the patenting of genetic material unresolved, but it is quite clear about which issues it is kicking down the field.

But then there’s this bit of weirdness at the end.

The opinion is pretty much unanimous, which is always nice. Except for a little, tiny bit of dissension from Antonin Scalia. Here is the complete text of his separate opinion (technically a concurrence in part, not a dissent):

I join the judgment of the Court, and all of its opinion except Part I–A and some portions of the rest of the opinion going into fine details of molecular biology. I am unable to affirm those details on my own knowledge or even my own belief. It suffices for me to affirm, having studied the opinions below and the expert briefs presented here, that the portion of DNA isolated from its natural state sought to be patented is identical to that portion of the DNA in its natural state; and that complementary DNA (cDNA) is a synthetic creation not normally present in nature.

I actually thought Part I–A of the ruling was a little weird when I first read it. Not because it said anything strange or controversial, but because it read sort of like a Wikipedia entry on basic genetics, and contained a lot of details that don’t seem particularly relevant.

Here’s the full text of the part of the ruling about which Scalia says, “I am unable to affirm those details on my own knowledge or even my own belief.”

Genes form the basis for hereditary traits in living organisms. See generally Association for Molecular Pathology v. United States Patent and Trademark Office, 702 F. Supp. 2d 181, 192–211 (SDNY 2010). The human genome consists of approximately 22,000 genes packed into 23 pairs of chromosomes. Each gene is encoded as DNA, which takes the shape of the familiar “double helix” that Doctors James Watson and Francis Crick first described in 1953. Each “cross-bar” in the DNA helix consists of two chemically joined nucleotides. The possible nucleotides are adenine (A), thymine (T), cytosine (C), and guanine (G), each of which binds naturally with another nucleotide: A pairs with T; C pairs with G. The nucleotide cross-bars are chemically connected to a sugar-phosphate backbone that forms the outside framework of the DNA helix. Sequences of DNA nucleotides contain the information necessary to create strings of amino acids, which in turn are used in the body to build proteins. Only some DNA nucleotides, however, code for amino acids; these nucleotides are known as “exons.” Nucleotides that do not code for amino acids, in contrast, are known as “introns.”

Creation of proteins from DNA involves two principal steps, known as transcription and translation. In transcription, the bonds between DNA nucleotides separate, and the DNA helix unwinds into two single strands. A single strand is used as a template to create a complementary ribonucleic acid (RNA) strand. The nucleotides on the DNA strand pair naturally with their counterparts, with the exception that RNA uses the nucleotide base uracil (U) instead of thymine (T). Transcription results in a single strand RNA molecule, known as pre-RNA, whose nucleotides form an inverse image of the DNA strand from which it was created. Pre-RNA still contains nucleotides corresponding to both the exons and introns in the DNA molecule. The pre-RNA is then naturally “spliced” by the physical removal of the introns. The resulting product is a strand of RNA that contains nucleotides corresponding only to the exons from the original DNA strand. The exons-only strand is known as messenger RNA (mRNA), which creates amino acids through translation. In translation, cellular structures known as ribosomes read each set of three nucleotides, known as codons, in the mRNA. Each codon either tells the ribosomes which of the 20 possible amino acids to synthesize or provides a stop signal that ends amino acid production.

DNA’s informational sequences and the processes that create mRNA, amino acids, and proteins occur naturally within cells. Scientists can, however, extract DNA from cells using well known laboratory methods. These methods allow scientists to isolate specific segments of DNA — for instance, a particular gene or part of a gene—which can then be further studied, manipulated, or used. It is also possible to create DNA synthetically through processes similarly well known in the field of genetics. One such method begins with an mRNA molecule and uses the natural bonding properties of nucleotides to create a new, synthetic DNA molecule. The result is the inverse of the mRNA’s inverse image of the original DNA, with one important distinction: Because the natural creation of mRNA involves splicing that removes introns, the synthetic DNA created from mRNA also contains only the exon sequences. This synthetic DNA created in the laboratory from mRNA is known as complementary DNA (cDNA).

Changes in the genetic sequence are called mutations. Mutations can be as small as the alteration of a single nucleotide—a change affecting only one letter in the genetic code. Such small-scale changes can produce an entirely different amino acid or can end protein production altogether. Large changes, involving the deletion, rearrangement, or duplication of hundreds or even millions of nucleotides, can result in the elimination, misplacement, or duplication of entire genes. Some mutations are harmless, but others can cause disease or increase the risk of disease. As a result, the study of genetics can lead to valuable medical breakthroughs.

So, what do you think Scalia is objecting to? Is he just signaling that he thinks that the details of the molecular biology are not important here? Is it the claim that “Genes form the basis for hereditary traits in living organisms”? Is he unable to affirm with his own belief that G pairs with C? That uracil substitutes for thymine in RNA? That humans have 23 pairs of chromosomes?

Please share your most outlandish conspiracy theories in the comments!

How does the FBI know it found “Female DNA”?

So, the latest development in the investigation of the Boston Marathon bombing is a report that the FBI has identified “female DNA” on the remains of at least one of the two bombs used by Dzhokhar and Tamerlan Tsarnaev in the attack. According to the report, published first in the Wall Street Journal, some genetic material has been recovered, and the FBI has gone to collect a DNA sample from Katherine Russell, the widow of Tamerlan Tsarnaev, presumably to see if it matches the DNA recovered from the bomb.

Here’s the thing. How does the FBI know, or think it knows, that it has recovered “Female DNA”? Well, there aren’t a lot of details available yet, but there are a couple of possibilities.

First, let’s start with the basic genetics. Humans normally have 46 chromosomes, which come in 23 pairs, as well as some mitochondrial DNA. From the mitochondrial DNA, and 22 of the 23 other chromosome pairs, there is nothing to tell you whether the DNA came from a male or a female. The genetic difference between males and females resides in that last chromosome pair, the sex chromosomes. At the sex chromosomes, women have two X chromosomes, while men have one X chromosome and one Y chromosome.

So, if you have a discrete source of your DNA sample, like a hair, you could do a couple of things. You could test it for the presence of Y-chromosome genetic material. If the DNA source was female, you should not find any. Of course, that requires basing your conclusion on a negative result (the absence of a Y chromosome), which is not ideal, since it is possible that you could miss the material for technical reasons (e.g., failure of a particular chemical reaction).

The real thing you would look for to indicate that you had DNA from a female is the presence of two different X chromosomes. That means you need to identify the DNA sequence on part of the X chromosome. You can do this by actually sequencing a region of the chromosome, but this is probably unnecessarily expensive. After all, the vast majority of sites on the chromosome are going to be identical, not just in the X chromosomes in your sample, but in every X chromosome in every human being in the world.

What you can do instead is use tools that focus on specific sites that are already known to be variable in the population. Maybe there’s a particular site where it is known that some people have a C in their DNA sequence, while other people have a G. (This is referred to as a “polymorphic” site.) You simply ask whether that site in your particular sample has a C or a G, while simultaneously asking the analogous question about thousands of other sites.

If your DNA sample came from a male, you might find that the answer is C, or G, or whatever, at a particular site. What the answer is is not as important as the fact that there will be a single answer. If your DNA sample comes from a woman, you should find that sometimes you have a mixture of C and G. Of course, at a given site, you could still get a single answer, say, G, if both of the woman’s X chromosomes had a G at this position. However, if you look at a whole bunch of sites, you should find that a decent number of them indicate a mixture of two sequences — revealing the presence of two distinct X chromosomes, and therefore, a female.
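To make the logic concrete, here is a minimal sketch of the "count the mixed sites" idea. It assumes a single-source sample and a list of genotype calls at known polymorphic X-linked sites. The function names, the toy data, and the 10% threshold are all invented for illustration; nothing here is meant to describe what the FBI actually does.

```python
def fraction_mixed_sites(calls):
    """Fraction of informative sites showing more than one base.

    calls: list of sets, one per polymorphic X-linked site, holding the
    bases observed at that site (an empty set means no data).
    """
    informative = [c for c in calls if c]  # drop sites with no data
    if not informative:
        return 0.0
    mixed = sum(1 for c in informative if len(c) > 1)
    return mixed / len(informative)


def looks_like_two_x(calls, threshold=0.1):
    """Guess whether the sample carries two distinct X chromosomes.

    The threshold is a made-up illustrative value, not a forensic standard.
    """
    return fraction_mixed_sites(calls) >= threshold


# Hypothetical toy data: one X gives a single answer at every site,
# while two distinct X chromosomes give a mixture at some sites.
male_like = [{"C"}, {"G"}, {"A"}, {"T"}, {"G"}]
female_like = [{"C", "G"}, {"G"}, {"A", "T"}, {"T"}, {"G"}]

print(fraction_mixed_sites(male_like), looks_like_two_x(male_like))
print(fraction_mixed_sites(female_like), looks_like_two_x(female_like))
```

In practice you would want thousands of sites, plus some allowance for genotyping error, before reading anything into the fraction.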

But what if you don’t have a discrete genetic sample, like a hair, to work from? There’s not a lot of detail in the original article, so we have to speculate a bit here. (I’ve reached out to the reporters from the original piece, to see if there was some genetics-dork-relevant information that did not make it into the article. I’ll post an update if and when I hear back.) It seems likely that the bombs would also have carried DNA from one or both of the Tsarnaev brothers. Thus it is possible that the DNA collected by the FBI could contain a mixture of cells from multiple different individuals — like, say, they swabbed all around the bomb’s remains to collect their samples. What would they need to do then?

Well, first of all, let’s consider the case where you had a mixture of the two brothers’ DNA. The Y chromosomes from Dzhokhar and Tamerlan would be (virtually) identical, having both been copied from the single Y chromosome of their father. The two would have distinct X chromosomes, each of which would be a patchwork of pieces copied from their mother’s two X chromosomes.

So, the X chromosomes present in this sort of sample would look similar in some ways to the X chromosomes you would get from a female DNA sample: there would be some polymorphic sites where you would find a mixture of DNA sequences in your sample. However, we would not expect to find as many of these mixed sites as in a sample from a female. On average, half of the X chromosome sequence inherited by one brother would be (virtually) identical to the sequence inherited by the other brother. That is only an average, though: depending on how, exactly, recombination plays out, the identical fraction of their X chromosomes could range anywhere from nearly none to nearly all of it. It is possible, just by chance, that the X chromosomes inherited by Dzhokhar and Tamerlan would be as different from each other as the two X chromosomes present in their mother. Of course, at this point, DNA samples have almost certainly been collected from both brothers, so that investigators would know exactly what sequences to expect.
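If you want to see the "half on average, but anywhere from nearly none to nearly all" claim in action, here is a rough simulation sketch. It treats each brother's X as a string of sites, each copied from one of the mother's two X chromosomes, with a small chance of switching between them at each step. The number of sites, the per-site crossover probability, and the function names are all assumptions chosen for illustration, not measured values.

```python
import random


def inherit_x(n_sites, crossover_prob, rng):
    """Return 0/1 labels saying which maternal X each site was copied from."""
    current = rng.randint(0, 1)  # start on a randomly chosen maternal X
    labels = []
    for _ in range(n_sites):
        if rng.random() < crossover_prob:
            current = 1 - current  # crossover: switch to the other maternal X
        labels.append(current)
    return labels


def shared_fraction(x1, x2):
    """Fraction of sites where both brothers copied the same maternal X."""
    same = sum(1 for a, b in zip(x1, x2) if a == b)
    return same / len(x1)


rng = random.Random(0)  # fixed seed so the run is repeatable
fractions = []
for _ in range(2000):
    brother1 = inherit_x(1000, 0.002, rng)
    brother2 = inherit_x(1000, 0.002, rng)
    fractions.append(shared_fraction(brother1, brother2))

mean_shared = sum(fractions) / len(fractions)
print(mean_shared, min(fractions), max(fractions))
```

The mean across simulated pairs of brothers should sit near one half, while individual pairs land all over the place, which is the point: any one pair of brothers can be far from the average.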

But what if there was an even messier mixture of DNA, say with samples from both brothers as well as one or more additional people? Well, at some point, the procedure of just looking for mixed sites in the DNA sequence is going to run into trouble. At most of these polymorphic sites, there are just two variants circulating at any frequency in the population. So, simply identifying sites that are polymorphic within your sample will let you distinguish between one X chromosome and more than one, but will not necessarily do a good job of telling you exactly how many different X chromosomes are present.

One approach to deal with this situation would be to look at a different type of polymorphism, one where there are more than two sequence variants present in the population. The polymorphisms most commonly used in this sort of context are short tandem repeats (STRs). These are stretches of DNA where a short sequence, maybe four or so nucleotides long, is repeated over and over again. Due to the nature of the process by which DNA is copied, these sequences are prone to a particular type of mutation, where the number of repeats increases or decreases. So, I might have a stretch of 19 copies of the sequence TCTA at a particular site in my genome, while you might have 23 copies of TCTA at the same location in your genome.

By looking at a whole bunch of these STR sites, the FBI could probably tell if the DNA they collected contained two, three, four, or more distinct X chromosomes. And, these are most likely the sorts of sites they will be using to see if the DNA collected from the bomb matches the DNA collected from Katherine Russell.
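Here is a small sketch of why multi-allelic sites help. If each X chromosome carries exactly one repeat count at a given STR site, then the largest number of distinct repeat counts seen at any one site is a lower bound on the number of X chromosomes in the sample. The locus names and repeat counts below are invented, and the matching check is deliberately naive (it ignores dropout, stutter, and everything else a real forensic lab has to worry about).

```python
def min_x_chromosomes(str_profile):
    """Lower bound on the number of distinct X chromosomes in a sample.

    str_profile maps a locus name to the set of repeat counts observed there.
    Two chromosomes can share a repeat count, so this is only a lower bound.
    """
    return max((len(alleles) for alleles in str_profile.values()), default=0)


def is_consistent(reference, sample):
    """True if every reference allele at every locus also appears in the sample."""
    return all(
        alleles <= sample.get(locus, set())
        for locus, alleles in reference.items()
    )


# Hypothetical X-linked STR loci and repeat counts, for illustration only.
mixed_sample = {
    "locus_A": {19, 23},
    "locus_B": {11, 12, 15},
    "locus_C": {8, 9, 10, 14},
}
reference_person = {
    "locus_A": {19, 23},
    "locus_B": {12, 15},
    "locus_C": {9, 14},
}

print(min_x_chromosomes(mixed_sample))
print(is_consistent(reference_person, mixed_sample))
```

With only two variants per site, the same calculation could never return more than two, which is why the simple polymorphic sites run out of steam in a messy mixture.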

Although the focus of this post has been on genetics, and specifically what it means for the FBI to say that they recovered some “female DNA,” I would be remiss if I did not include the caveat (emphasized in the original WSJ article) that there are a lot of different ways that someone’s DNA might have gotten onto one of the bombs without that person having been involved in the bombing — even if that person winds up being Tamerlan Tsarnaev’s widow.

Epigenetics and Homosexuality

So, last week featured a lot of news about a paper that came out in the Quarterly Review of Biology titled “Homosexuality as a Consequence of Epigenetically Canalized Sexual Development.” The authors were Bill Rice (UCSB), Urban Friberg (Uppsala U), and Sergey Gavrilets (U Tennessee). The paper got quite a bit of press. Unfortunately, most of that press was of pretty poor quality, badly misrepresenting the actual contents of the paper. (PDF available here.)

I’m going to walk through the paper’s argument, but if you don’t want to read the whole thing, here’s the tl;dr:

This paper presents a model. It is a theory paper. Any journalist who writes that the paper “shows” that homosexuality is caused by epigenetic inheritance from the opposite sex parent either 1) is invoking a very non-standard usage of the word “shows,” or 2) was too lazy to read the actual paper, and based their report on the press release put out by the National Institute for Mathematical and Biological Synthesis.

That’s not to say that this is a bad paper. In fact, it’s a very good paper. The authors integrate a lot of different information to come up with a plausible biological mechanism for epigenetic modifications to exert influence on sexual preference. They demonstrate that such a mechanism could be favored by natural selection under what seem to be biologically realistic conditions. Most importantly, they formulate their model with clear predictions that can be empirically tested.

But those empirical tests have not been carried out yet. And, in biology, when we say that a paper shows that X causes Y, we generally mean that we have found an empirical correlation between X and Y, and that we have a mechanistic model that is well enough supported that we can infer causation from that correlation. This paper does not even show a correlation. It shows that it would probably be worth someone’s time to look for a particular correlation.

As a friend wrote to me in an e-mail,

I found it a much more interesting read than I thought I would from the press it’s getting, which now rivals the bullshit surrounding the ENCODE project for the most bullshitty bullshit spin of biology for the popular press. A long-winded-but-moderately-well-grounded-in-real-biology mathematical model does not proof make.

Exactly.

Okay, now the long version.

The Problem of Homosexuality

The first thing to remember is that when an evolutionary biologist talks about the “problem of homosexuality,” this does not imply that homosexuality is problematic. All it is saying is that a straightforward, naive application of evolutionary thinking would lead one to predict that homosexuality would not exist, or would be vanishingly rare. The fact that it does exist, and at appreciable frequency, poses a problem for the theory.

In fact, this is a good thing to keep in mind in general. The primary goal of evolutionary biology is to understand how things in the world came to be the way they are. If there is a disconnect between theory and the world, it is ALWAYS the theory that is wrong. (Actually, this is equally true for any science / social science.)

Simply put, heterosexual sex leads to children in a way that homosexual sex does not. So, all else being equal, people who are more attracted to the opposite sex will have more offspring than will people who are less attracted to the opposite sex.

[For rhetorical simplicity, I will refer specifically to “homosexuality” here, although the arguments described in the paper and in this post are intended to apply to the full spectrum of sexual orientation, and assume more of a Kinsey-scale type of continuum.]

The fact that a substantial fraction of people seem not at all to be attracted to the opposite sex suggests that all else is not equal.

Evolutionary explanations for homosexuality are basically efforts to discover what that “all else” is, and why it is not equal.

There are two broad classes of possible explanation.

One possibility is that there is no biological variation in the population for a predisposition towards homosexuality. Then, there would be nothing for selection to act on. Maybe the human brain simply has an inherent and uniform sexual disposition. Variation in sexual preference would then be the result of environmental (including cultural) factors and/or random developmental variation.

This first class of explanation seems unlikely because there is, in fact, a substantial heritability to sexual orientation. For example, considering identical twins who were raised separately, if one twin is gay, there is a 20% chance that the other will be as well.

Evidence suggests that sexual orientation has a substantial heritable component. Image: Comic Blasphemy.

This points us towards the second class of explanation, which assumes that there is some sort of heritable genetic variation that influences sexual orientation. Given the presumably substantial reduction in reproductive output associated with a same-sex preference, these explanations typically aim to identify some direct or indirect benefit somehow associated with homosexuality that compensates for the reduced reproductive output.

One popular variant is the idea that homosexuals somehow increase the reproductive output of their siblings (e.g., by helping to raise their children). Or that homosexuality represents a deleterious side effect of selection for something else that is beneficial, like how getting one copy of the sickle-cell hemoglobin allele protects you from malaria, but getting two copies gives you sickle cell anemia.

It was some variant of this sort of idea that drove much of the research searching for “the gay gene” over the past couple of decades. The thing is, though, those searches have failed to come up with any reproducible candidate genes. This suggests that there must be something more complicated going on.

The Testosterone Epigenetic Canalization Theory

Sex-specific development depends on fetal exposure to androgens, like Testosterone and related compounds. This is simply illustrated by Figure 1A of the paper:

Figure 1A from the paper: a simplified picture of the “classical” view of sex differentiation. T represents testosterone, and E represents estrogen.

SRY is the critical genetic element on the Y chromosome that triggers the fetus to go down the male developmental pathway, rather than the default female developmental pathway. They note that in the classical model of sex differentiation, androgen levels differ substantially between male and female fetuses.

The problem with the classical view, they rightly argue, is that androgen levels are not sufficient in and of themselves to account for sex differentiation. In fact, there is some overlap in androgen levels between XX and XY fetuses. Yet, in the vast majority of cases, the XX fetuses with the highest androgen levels develop normal female genitalia, while the XY fetuses with the lowest androgen levels develop normal male genitalia. Thus, there must be at least one more part of the puzzle.

The key, they argue, is that tissues in XX and XY fetuses also show differential response to androgens. So, XX fetuses become female because they have lower androgen levels and they respond only weakly to those androgens. XY fetuses become male because they have higher androgen levels and they respond more strongly to those androgens.

This is illustrated in their Figure 1B:

Sex-specific development is thus canalized by some sort of mechanism that they refer to generically as “epi-marks.” That is, they imagine that there must be some epigenetic differences between XX and XY fetuses that encode differential sensitivity to Testosterone.

All of this seems well reasoned, and is supported by the review of a number of studies. It is worth noting, however, that we don’t, at the moment, know exactly which sex-specific epigenetic modifications these would be. One could come up with a reasonable list of candidate genes, and look for differential marks (such as DNA methylation or various histone modifications) in the vicinity of those genes. However, this forms part of the not-yet-done empirical work required to test this hypothesis, or, in the journalistic vernacular, “show” that this happens.

Leaky Epigenetics and Sex-Discordant Traits

Assume for the moment that there exist various epigenetic marks that 1) differ between XX and XY fetuses and 2) modulate androgen sensitivity. These marks would need to be established at some point early on in development, perhaps concurrent with the massive, genome-wide epigenetic reprogramming that occurs shortly after fertilization.

The theory formulated in the paper relies on two additional suppositions, both of which can be tested empirically (but, to reiterate, have not yet been).

The first supposition is that there are many of these canalizing epigenetic marks, and that they vary with respect to which sex-typical traits they canalize. So, some epigenetic marks would canalize gonad development. Other marks would canalize sexual orientation. (Others, they note, might canalize other traits, like gender identity, but this is not a critical part of the argument.)

The model presented in this paper suggests that various traits that are associated with sex differences may be controlled by distinct genetic elements, and that sex-typical expression of those traits may rely on epigenetic modifications of those genes. Image: Mikhaela.net.

The second supposition is that the epigenetic reprogramming of these marks that normally happens every generation is somewhat leaky.

There are two large-scale rounds of epigenetic reprogramming that happen every generation. One occurs during gametogenesis (the production of eggs or sperm). The second happens shortly after fertilization. What we would expect is that any sex-specific epigenetic marks would be removed during one of these phases (although it could happen at other times).

For example, a gene in a male might have male-typical epigenetic marks. But what happens if that male has a daughter? Well, normally, those marks would be removed during one of the reprogramming phases, and then female-typical epigenetic marks would be established at the site early in his daughter’s development.

The idea here is that sometimes this reprogramming does not happen. So, maybe the daughter inherits an allele with male-typical epigenetic marks. If the gene influences sexual orientation by modulating androgen sensitivity, then maybe the daughter develops the (male-typical) sexual preference for females. Similarly, a mother might pass on female-typical epigenetic marks to her son, and these might lead to his developing a (female-typical) sexual preference for males.

So, basically, in this model, homosexuality is a side effect of the epigenetic canalization of sex differences. Homosexuality itself is assumed to impose a fitness cost, but this cost is outweighed by the benefit of locking in sexual preference in those cases where reprogramming is successful (or unnecessary).
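The trade-off in that last paragraph can be written down as a bit of toy bookkeeping. To be clear, this is not the model from the paper, just a back-of-the-envelope sketch: it assumes the canalizing mark gives some fitness benefit when reprogramming goes as planned, and imposes some fitness cost in the fraction of cases where the mark leaks into opposite-sex offspring. The parameter names and numbers are invented for illustration.

```python
def net_selection(benefit, cost, leak_rate):
    """Toy net fitness effect of a canalizing epi-mark.

    benefit:   hypothetical fitness gain from canalized development
    cost:      hypothetical fitness loss when the mark escapes erasure
    leak_rate: probability that the mark escapes reprogramming
    """
    return benefit - leak_rate * cost


def break_even_leak_rate(benefit, cost):
    """Leak rate at which the toy benefit and expected cost cancel out."""
    return benefit / cost


# Made-up numbers: a small benefit, a large cost, and rare leakage.
print(net_selection(benefit=0.01, cost=0.5, leak_rate=0.01))
print(break_even_leak_rate(benefit=0.01, cost=0.5))
```

In this toy version the mark is favored as long as the leak rate stays below the break-even value, which is the same intuition as the paragraph above: a small benefit that applies most of the time can outweigh a large cost that applies only rarely.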

Sociological Concerns

Okay, if you ever took a gender-studies class, or anything like that, this study may be raising a red flag for you. After all, the model here is basically that some men are super manly, and sometimes their manliness leaks over into their daughters. This masculinizes them, which makes them lesbians. Likewise, gay men are gay because they were feminized by their mothers.

That might sound a bit fishy, like it is invoking stereotype-based reasoning, but I don’t think that would be a fair criticism. Nor do I think it raises any substantial concerns about the paper in terms of its methodology or its interpretation. (Of course, I could be wrong. If you have specific concerns, I would love to hear about them in the comments.) The whole idea behind the paper is to treat chromosomal sex, gonadal sex, and sexual orientation as separate traits that are empirically highly (but not perfectly) correlated. The aim is to understand the magnitude and nature of that empirical correlation.

The other issue that this raises is the possibility of determining the sexual orientation of your children, either by selecting gametes based on their epigenetics, or by reprogramming the epigenetic state of gametes or early embryos. This technology does not exist at the moment, but it is not unreasonable to imagine that it might exist within a generation. If this model is correct in its strongest form (in that the proposed mechanism fully accounts for variation in sexual preference), you could effectively choose the sexual orientation of each of your children.

Image: Brainless Tales.

This, of course, is not a criticism of the paper. The biology is what it is. It does raise certain ethical questions that we will have to grapple with at some point. (Programming of sexual orientation will be the subject of the next installment of the Genetical Book Review.)

Plausibility/Testability Check

The question one wants to ask of a paper like this is whether it is 1) biologically plausible, and 2) empirically testable. Basically, my read is yes and yes. The case for the existence of mechanisms of epigenetic canalization of sex differentiation seems quite strong. We know that some epigenetic marks seem to propagate across generations, evading the broad epigenetic reprogramming. We don’t understand this escape very well at the moment, but the assumptions here are certainly consistent with the current state of our knowledge. And, assuming some rate of escape, the model seems to work for plausible-sounding parameter values.

Testing is actually pretty straightforward (conceptually, if not technically). Ideally, empirical studies would look for sex-specific epigenetic modifications, and for variation in these modifications that correlate with variation in sexual preference. The authors note that one test that could be done in the short term would be to do comparative epigenetic profiling of the sperm of men with and without homosexual daughters.

As Opposed to What?

The conclusions reached by models in evolution are most strongly shaped by the set of alternatives that are considered in the model. That is, a model might find that a particular trait will be selectively favored, but this always needs to be interpreted in the context of that set of alternatives. Most importantly, one needs to ask if there are likely to be other evolutionarily accessible traits that have been excluded from the model, but would have changed the conclusions of the model if they had been included.

The big question here is the inherent leakiness of epigenetic reprogramming. A back-of-the-envelope calculation in the paper suggests that for this model to fully explain the occurrence of homosexuality (with a single gene controlling sexual preference), the rate of leakage would have to be quite high.

An apparent implication of the model is that there would then be strong selection to reduce the rate at which these epigenetic marks are passed from one generation to the next. In order for the model to work in its present form, there would need to be something preventing natural selection from finding this solution.

Possibilities for this something include some sort of mechanistic constraint (it’s just hard to build something that reprograms more efficiently than what we have) or some sort of time constraint (evolution has not had enough time to fix this). The authors seem to favor this second possibility, as they argue that the basis of sexual orientation in humans may be quite different from that in our closest relatives.

On the other hand, this mechanism could form a part of the explanation for homosexuality with much lower leakage rates.

What Happened with the Press?

So, how do we go from what was a really good paper to a slew of really bad articles? Well, I suspect that the culprit was this paragraph from the press release from NIMBioS:

The study solves the evolutionary riddle of homosexuality, finding that “sexually antagonistic” epi-marks, which normally protect parents from natural variation in sex hormone levels during fetal development, sometimes carryover across generations and cause homosexuality in opposite-sex offspring. The mathematical modeling demonstrates that genes coding for these epi-marks can easily spread in the population because they always increase the fitness of the parent but only rarely escape erasure and reduce fitness in offspring.

If you know that this is a pure theory paper, this is maybe not misleading. Maybe. But phrases like “solves the evolutionary riddle of homosexuality” and “finding that . . . epi-marks . . . cause homosexuality in opposite-sex offspring,” when interpreted in the standard way that I think an English speaker would interpret them, pretty strongly imply things about the paper that are just not true.

Rice, W., Friberg, U., & Gavrilets, S. (2012). Homosexuality as a Consequence of Epigenetically Canalized Sexual Development. The Quarterly Review of Biology, 87(4), 343-368. DOI: 10.1086/668167

Update: Also see this excellent post on the subject by Jeremy Yoder over at Nothing in Biology Makes Sense.

2012 Gift Guide for Population Geneticists

So, it’s that time of year again, when you have to come up with gift ideas for the population geneticist in your life. Personally, I like cash, but if you insist on coming up with personalized gifts, here are some ideas for you:

1. Mathematical Population Genetics, by Warren Ewens

This book was originally published in 1979. When I was in grad school, it had been out of print for years. People would pass around xeroxed copies that had been made from other xeroxed copies.

Finally, a couple of years ago, the second edition came out. So now the population geneticist in your life can own their very own book-shaped copy.

Of course, it’s a little bit pricey. Fortunately, there are plenty of other gifts on this list for the folks about whom you don’t care enough to buy this book. 🙁

2. The Gospel of the Flying Spaghetti Monster, by Bobby Henderson

Okay, cheapskate, maybe this is a little bit more your speed. This is the perfect gift for the pastafarian population geneticist.

Or it could be a good evangelical gift for those who have not yet been touched by his noodly appendage.

And look, it comes with one of those little ribbon things that means you don’t have to use your wadded up Starbucks receipt as a bookmark!

3. Gene Pool Shirt

Get it?

It’s a jean shirt!

With a pool ball on it!

Great conversation starter!

Also comes in Flaming 8-Ball!

4. Obnoxious Car Decals

There are a number of different aggressively obnoxious things that you can get for your car, like a T-Rex eating a Jesus fish. But if your goal in life is to get your headlights smashed by some nice religious folk, nothing will beat this “Procreation Car Emblem.”

If you’re in the mood for something a little more subtle, there are some good options in the “Customers who bought this item also bought” section.

5. Remarkable, by Lizzie Foley

Okay, okay, I know what you’re thinking. That this is shameless promotion of my wife’s book, and has nothing to do with population genetics.

Yes, fine, it’s shameless, but it’s a great book, perfect for the population geneticist with one or more F1s at home (ages 8 and up!). And it does feature a cameo appearance by population geneticist and UCLA Professor John Novembre. For reals!

Also, the story features boy and girl identical twins. So, analyze that.

6. DNA Earrings

What’s that?

I can’t hear you.

I’ve got DNA in my ear.

7. DNA Portraits

Okay, check this out. You send in a swab of DNA, and $199, and they’ll send you a giant picture of a gel, which I guess is supposed to be some fraction of your genome? Maybe? It looks like there are supposed to be eight sample lanes, and it’s that old-school sequencing analysis where each dideoxynucleotide terminator gets its own lane. So this might be about forty bases of sequence. Maybe?

To be honest, though, this looks a lot more like a protein gel to me. Maybe they use your DNA, clone a little tiny homunculus of you, grind it up, trypsin digest it, and this is that gel.

If that wasn’t bad enough, you also have the option of getting your DNA made into a giant QR code poster (that no one will ever scan).

For the money, I’d go with two copies of the Ewens book.

8. Personalized Genetic Analysis

The classic here is 23 and Me.

Okay, maybe you’re thinking, no, a real population geneticist would not want one of these goofy personalized genetic analysis things. Those are for amateurs, mere heredity enthusiasts. Will my population geneticist friend be offended by the ridiculous pinpointing of their Y-chromosome and mitochondrial ancestry, or the ridiculous breakdown of racial composition, or the ridiculous risk-factor analysis?

Well, that’s the beauty of this gift. If they are the wild-eyed, naive sort of population geneticist, they’re just going to be so gosh-darned excited to get all that cool information. If they’re the bitter, cynical sort of population geneticist (most of them, in my experience), you’ll be giving them the gift of feeling knowledgeable and superior!

If you want to surprise them, order the kit and swab their cheek while they’re sleeping.

If you really want to surprise them, order a second kit, swab a random guy, get the results, and claim that the results are from their father.

9. Darwin Eats Cake Stuff

Yeah, you thought plugging my wife’s book was shameless? I’ll show you shameless! Check out these new items from the official Darwin Eats Cake store:

Look! It’s a mug illustrating the academic funding cycle: papers->money->caffeine->papers.
Also works for non-population-geneticist academic types.

Look! It’s a trucker hat featuring Guillaume the Adaptationist Goat’s credo!

Look! It’s a t-shirt featuring J B S Haldane’s moustache in a jar!

Don’t see anything you like? You can check out the comics and contact the “artist” here to submit special requests.

10. Ronald Reagan Riding a Velociraptor with a Machine Gun

Okay, so this one really has nothing to do with population genetics, but it is 100% pure awesome.

Prints available in 11×17 or 24×36 from SharpWriter at deviantART.

Other ideas? Leave them in the comments.

Two new characters at Darwin Eats Cake

So, if you’re a regular reader of Darwin Eats Cake, you’ll already know that two new characters have been introduced to the strip: R A Fisher’s Pipe and J B S Haldane’s Moustache.

If you’re not a regular reader, you should be, because it will make me happy (and it is, after all, the holiday season), and also because Robert Gonzales once called it “my [meaning Robert’s] new favorite webcomic” over at io9.

For those of you who are not population geneticists, or at least evolutionary biologists, Fisher and Haldane are two of the major figures of the “modern synthesis” in evolution in the first part of the twentieth century. This was basically the integration of the Mendelian idea of the gene with the Darwinian idea of gradual change via natural selection. Fisher, in addition, created a whole lot of modern statistics, which have found applications far outside of evolutionary biology.

R. A. Fisher smoking his pipe. Not a euphemism.
J. B. S. Haldane, um, I guess, having his mustache. Note the lack of “o” in the American spelling of mustache.

Fisher loved himself a good smoke. In fact, late in his life, he publicly challenged research purporting to show a causal link between smoking and lung cancer. Oops.

Haldane once chased my former officemate and his mother down the street in a rainstorm in Calcutta to offer them an umbrella.

These two anecdotes provide all the information you need to accurately reconstruct the political views of each.

Fisher passed away in 1962, and Haldane in 1964. Fortunately, one of the most salient features of each was preserved in a jar for posterity. And now, half a century later, the two have reunited to bring you their genetically inspired comedy stylings.

Here’s what you’ve missed so far:

Best URL for sharing: http://www.darwineatscake.com/?id=143
Permanent image URL for hotlinking or embedding: http://www.darwineatscake.com/img/comic/143.png
Best URL for sharing: http://www.darwineatscake.com/?id=145
Permanent image URL for hotlinking or embedding: http://www.darwineatscake.com/img/comic/145.png
Best URL for sharing: http://www.darwineatscake.com/?id=148
Permanent image URL for hotlinking or embedding: http://www.darwineatscake.com/img/comic/148.png