Mendelian Randomization

Bucky_Wood · June 7, 2022, 7:38pm

I have been impressed by how quickly our methods have evolved for studies that have always been complicated by confounding variables and reverse causation. These are often impossible to control for, as is explained in the JAMA study of the Association of Habitual Alcohol Intake with Risk of CV Disease. But now, Mendelian Randomization (MR) promises a solution.

I think the topic is worthy of everyone’s consideration because its dramatic conclusions differ from our previous “conventional wisdom”.

The recent paper that attracted my attention was about “Alcohol and CV Disease”. Nearly all previous studies concluded that light drinking results in improved health. Epidemiologists have struggled forever to analyze modifiable risk factors that affect population health. Many have focused on red-wine, one-glass-per-day drinkers doing better than abstainers. But critics have always made the point that the confounding variables are impossible to control, so did the wine help, or was it their lifestyle? E.g. light drinkers as a cohort are also mostly never-smokers, exercise regularly and eat healthier diets than heavy drinkers and abstainers, but we cannot control for these variables in a study. Imagine how you could ever begin a study randomizing two groups and demand that one group MUST smoke, remain sedentary and eat fatty red meat every day as they drink only water!

So all such studies actually done are called observational, wherein groups are followed for an endpoint (cardiac disease) and at the end are asked about associated factors such as smoking, exercise, etc. There is no control for confounding variables.

There is also the issue of ‘reverse causation’. Suppose that a smoker has lung cancer or emphysema and stops smoking. Then he dies sooner than the rest of the smoking population and we conclude that those who stop smoking are more likely to die than those who keep smoking…reverse causation. Suppose someone with a strong family history of heart disease decides to eat better and exercise, but then has a heart attack and dies just like his daddy. Do we conclude that his improved behavior resulted in his death? No, but we cannot control for either these confounding variables or reverse causation.

Genetics has given us a solution: Mendelian randomization (MR). Among the many diverse topics in my reading since I became interested in the genome has been the explosion of GWASs, or genome-wide association studies. These are studies that evaluate groups with a common trait to identify the differences in their DNA. A single nucleotide difference is called an SNV, for single nucleotide variation, or an SNP, for single nucleotide polymorphism. I was unaware, however, of the extent to which they are now being used in epidemiology to eliminate confounding variables and reverse causation. Here is how it works:

You find a genetic variant that is strongly associated with a risk factor (alcohol e.g.) that in turn is responsible for an effect or outcome (heart disease e.g.). And it must NOT be related to confounders nor cause the outcome independent of the risk factor (alcohol e.g.). This is called an instrumental variable. It is “instrumental” to the effect alcohol has, but has no effect of its own on the actual cardiac disease. In addition, a GWAS usually turns up multiple associated loci and/or genes, and the variants (alleles) found there can be combined into an allele score. And finally, any given genetic variant may act through multiple different biologic pathways, which is called pleiotropy (a pleiotropic effect), so these pathways must be accounted for in order to focus on a single cause>effect. We can’t understand the statistical methods…

for God’s sake…

…but one, the MR-Egger regression (there are 3 others also), accommodates more than one variant and corrects for multiple pleiotropic effects.
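For anyone who wants to see the idea rather than the math, here is a minimal sketch in Python. It is not the paper's analysis. It assumes made-up summary statistics for 100 hypothetical variants, a true causal effect of 0.5, and pleiotropy that pushes every variant's outcome association up by about 0.02 regardless of how strongly that variant affects the exposure. The function names (`ivw`, `mr_egger`, `demo_egger`) are my own. The plain inverse-variance weighted (IVW) estimate forces the regression line through the origin and absorbs the pleiotropy as bias, while MR-Egger adds an intercept that soaks it up.

```python
import numpy as np

def ivw(bx, by, se_y):
    """Inverse-variance weighted estimate: weighted regression through the origin."""
    w = 1.0 / se_y**2
    return float(np.sum(w * bx * by) / np.sum(w * bx**2))

def mr_egger(bx, by, se_y):
    """MR-Egger: the same weighted regression, but with an intercept term.

    The intercept estimates average directional pleiotropy; the slope is the
    causal estimate, valid only if pleiotropy is unrelated to variant strength.
    """
    # Orient every variant so its exposure association is positive.
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign
    sw = 1.0 / se_y  # square root of the inverse-variance weights
    design = np.column_stack([np.ones_like(bx), bx]) * sw[:, None]
    coef, *_ = np.linalg.lstsq(design, by * sw, rcond=None)
    return float(coef[0]), float(coef[1])  # (intercept, slope)

def demo_egger(n_variants=100, true_effect=0.5, seed=1):
    """Simulated (hypothetical) summary statistics with directional pleiotropy."""
    rng = np.random.default_rng(seed)
    bx = rng.uniform(0.05, 0.30, size=n_variants)      # variant -> exposure
    pleio = 0.02 + rng.normal(0.0, 0.005, n_variants)  # variant -> outcome, direct
    se_y = np.full(n_variants, 0.005)
    by = true_effect * bx + pleio + rng.normal(0.0, se_y)
    intercept, slope = mr_egger(bx, by, se_y)
    return ivw(bx, by, se_y), intercept, slope

if __name__ == "__main__":
    ivw_est, intercept, slope = demo_egger()
    print(f"IVW estimate:       {ivw_est:.3f}")
    print(f"MR-Egger slope:     {slope:.3f}")
    print(f"MR-Egger intercept: {intercept:.3f}")
```

In this toy setup the IVW number comes out too high, the Egger slope lands near 0.5, and the intercept lands near 0.02. Real analyses also report standard errors and check the assumptions, which this sketch skips.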

So here is the short, simplified story of the comparison of alcohol consumption with cardiac disease. A gene called ALDH2 has a variant known as rs671. Those with it drink 1.1 g/day of alcohol on average. Those without it drink 23.7 g/day (it causes a flush response that reduces drinking). There are other SNPs associated with smoking (15q25 and 8p11, on chromosomes 15 and 8 respectively, FWIW) that correlate strongly with both smoking initiation and cessation. Then there are SNPs associated with exercise, by type, frequency and duration (GS1P1, DNASE2B, etc)!
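To make the logic concrete, here is a small simulation in Python. It is only a sketch under my own assumptions, not data from the study: a hypothetical variant carried by 30% of people that lowers drinking by 22.6 g/day (the gap between the two figures above), an unmeasured "lifestyle" factor that raises drinking and lowers risk, and a true harmful effect of 0.02 risk units per g/day. The function name `simulate_wald` and every coefficient are invented for illustration. The naive comparison makes alcohol look protective, while the simplest MR estimate (the Wald ratio) recovers the harm, because the variant is handed out at conception independently of lifestyle.

```python
import numpy as np

def simulate_wald(n=200_000, true_effect=0.02, seed=0):
    """Return (naive_slope, wald_ratio) from one simulated cohort."""
    rng = np.random.default_rng(seed)
    # G: carrier status for the drinking-reducing variant (hypothetical 30%).
    g = rng.binomial(1, 0.3, size=n)
    # U: unmeasured confounder; higher values mean more drinking AND lower risk.
    u = rng.normal(size=n)
    # Exposure in stylized g/day (not clipped at zero, to keep the model linear).
    x = 23.7 - 22.6 * g + 5.0 * u + rng.normal(0.0, 5.0, size=n)
    # Outcome: a risk score that alcohol truly raises and the confounder lowers.
    y = true_effect * x - 1.0 * u + rng.normal(0.0, 1.0, size=n)

    # Naive observational estimate: slope of outcome on exposure.
    naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Wald ratio: (outcome difference by genotype) / (exposure difference by genotype).
    dy = y[g == 1].mean() - y[g == 0].mean()
    dx = x[g == 1].mean() - x[g == 0].mean()
    return float(naive), float(dy / dx)

if __name__ == "__main__":
    naive, wald = simulate_wald()
    print(f"naive slope: {naive:+.4f}  (negative: alcohol looks protective)")
    print(f"Wald ratio:  {wald:+.4f}  (true effect is +0.0200)")
```

The Wald ratio only works here because the simulated variant touches the outcome through drinking alone; if it had a direct path to the outcome, the estimate would be biased.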

So using these and MR we can now, only now, eliminate both confounders and reverse causation. The bottom line is that alcohol, at any level, is bad. It is a J-shaped curve of detriment that correlates with consumption. At 7 drinks/week your risk is ‘only’ 10% higher, while at >35 drinks/week it is 100 times greater than the non-drinker who otherwise has the same risk factors/behavior.

Another overview is below. Our understanding is about to be overwhelmed with new and better data. And that goes for all of epidemiology that cannot be randomized into RCTs and cannot eliminate confounding variables…this is truly a new world!
A Method of Overcoming Confounding is here.

jmitroka · June 13, 2022, 3:57pm

Indeed, MR is a great approach for sorting out cause and effect, as you suggest. Unfortunately, it requires genetic markers linked to such functional polymorphisms, and those are rarely available.

paleomalacologist · June 16, 2022, 9:16pm

A related caution is that the genome is large enough that any two groups will almost certainly differ noticeably in the frequency of some SNP. Sorting out the random correlations versus actual causality is not easy.
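A quick way to see the scale of the problem is to simulate it. The Python sketch below assumes 100,000 independent SNPs and two groups of 1,000 people drawn from the same population, so every SNP has identical true allele frequency in both groups; the function name `count_chance_hits` and all the sizes are made up for illustration. It counts how many SNPs still differ at the usual |z| > 1.96 cutoff, and how many survive the conventional genome-wide threshold of p < 5e-8 (roughly |z| > 5.45).

```python
import numpy as np

def count_chance_hits(n_snps=100_000, n_per_group=1_000, seed=0):
    """Return (nominal_hits, genome_wide_hits) when there is no real difference."""
    rng = np.random.default_rng(seed)
    n_alleles = 2 * n_per_group  # two alleles per person
    # Same true allele frequency in both groups, by construction.
    freq = rng.uniform(0.05, 0.5, size=n_snps)
    a = rng.binomial(n_alleles, freq)
    b = rng.binomial(n_alleles, freq)

    # Two-proportion z-test on allele frequencies, one test per SNP.
    pooled = (a + b) / (2 * n_alleles)
    se = np.sqrt(pooled * (1 - pooled) * (2 / n_alleles))
    z = (a - b) / n_alleles / se

    nominal = int(np.sum(np.abs(z) > 1.96))      # p < 0.05, uncorrected
    genome_wide = int(np.sum(np.abs(z) > 5.45))  # about p < 5e-8
    return nominal, genome_wide

if __name__ == "__main__":
    nominal, genome_wide = count_chance_hits()
    print(f"SNPs 'different' at p < 0.05:  {nominal}")
    print(f"SNPs passing p < 5e-8:         {genome_wide}")
```

Around 5% of the SNPs look "different" by chance alone, which is why the strict threshold exists. Real genomes add correlated SNPs and population structure on top of this, which the sketch ignores.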
