I'm going to start with a primer on biology. If you already have a working knowledge of DNA, gene expression, and epigenetics, skip ahead to "More about Methylation" below.

The Biology (A Primer)

Nearly every cell in the body contains DNA (red blood cells are an exception) - roughly 3.2 billion base pairs, organized into 23 pairs of chromosomes. DNA is made of four chemical bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The specific sequence of these bases encodes the instructions for building and maintaining the body. Genes are specific stretches of DNA that encode proteins, which do most of the work in the body's cells.

A liver cell and a brain cell contain essentially identical DNA, but they look and behave differently. That's because each cell type only uses, or "expresses," a relatively small proportion of its genes. A liver cell has the genes for making brain-specific proteins, but it never turns those genes on. The system that controls which genes are on and which are off, without changing the underlying DNA sequence, is called epigenetics. One of the most important epigenetic mechanisms is DNA methylation, which is the process of attaching a chemical tag (a methyl group, consisting of one carbon atom bonded to three hydrogen atoms) to the cytosine base in DNA. This typically happens at spots where a cytosine sits next to a guanine, called CpG sites (the "p" refers to the phosphate bond connecting the cytosine and guanine).
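For readers who think in code, here is a minimal sketch of what "a cytosine next to a guanine" means in practice. The sequence is a made-up toy string, and the function name is my own, not part of any real pipeline.

```python
def find_cpg_sites(sequence: str) -> list[int]:
    """Return the 0-based position of every C that is immediately followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


toy_sequence = "ATCGGCGTACCGTTAGC"  # hypothetical sequence, for illustration only
print(find_cpg_sites(toy_sequence))  # [2, 5, 10]
```

Note that the "GC" at the very end of the toy sequence is not counted. A CpG site is specifically a C followed by a G when the strand is read in its conventional direction.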

When clusters of CpG sites near a gene's promoter region (the stretch of DNA just upstream of a gene that controls whether that gene gets turned on) get heavily methylated, a gene tends to get silenced. When those sites are unmethylated, a gene can be expressed. The DNA in every cell type in the body has a distinctive methylation pattern, which can be thought of as a molecular fingerprint that defines its identity.

In cancer, methylation patterns can become disrupted. Genes that are supposed to be turned on can get silenced, while genes that are supposed to be silenced can get activated. Tumor suppressor genes, one category of molecular “brakes” that prevent cells from growing uncontrollably, can be shut down by abnormal methylation at their promoters. This is an early molecular event in the development of certain cancers. Importantly, unlike genetic mutations, which are often unique to individual tumors or even individual cells within a tumor, aberrant methylation patterns tend to converge across cancers. Different clones (genetically distinct populations of cells within the same tumor), each descended from a single ancestor cell, tend to share similar methylation abnormalities.

Billions of cells undergo a controlled, programmed death (apoptosis) at any given time, which is how the body replaces old cells with new ones. When a cell dies by apoptosis, its DNA gets cleaved into small fragments and released into the blood. These fragments are known as cell-free DNA (cfDNA). The fragments tend to be specific sizes. DNA in cells is wrapped around protein spools called nucleosomes, and when a cell dies, the DNA gets cleaved in the exposed stretches between nucleosomes, producing fragments that are predominantly about 167 base pairs long.

These cfDNA fragments retain the methylation patterns of the cell they came from. For example, a fragment that originated from a liver cell carries liver-specific methylation patterns. Likewise, a fragment of cfDNA shed by a cancer cell carries the aberrant methylation patterns characteristic of that cancer. cfDNA doesn't survive in the bloodstream long. Its half-life is hours, but it's almost constantly being produced. So, at any given moment, blood contains a real-time snapshot of which cells are dying and shedding cfDNA. If a tumor is growing somewhere and its cells are dying and shedding fragments of DNA into the blood, and if you can "read" the methylation patterns on those fragments, you can, in principle, detect cancer and figure out where it's coming from. That is the foundational idea behind GRAIL's Galleri, the best-known multi-cancer early detection (MCED) test.

More About Methylation

The Circulating Cell-Free Genome Atlas (CCGA) is a study of roughly 15,000 participants that serves as the foundation for the entire Galleri platform (and is discussed in a later post). It was designed from the beginning as a head-to-head comparison of multiple approaches to detecting cancer in blood. The first sub-study of the CCGA, published in Cancer Cell in 2022, evaluated three different assay types in parallel on the same patient samples: whole-genome sequencing (looking for somatic mutations, or genetic changes acquired during a person's lifetime, as opposed to inherited ones), ultra-deep targeted sequencing of 507 cancer-related genes (looking for specific recurrent mutations in a specific part of the genome), and whole-genome bisulfite sequencing (identifying methylation patterns across the genome). (Sequencing itself is covered in the next section.) Each assay was paired with its own machine learning classifier, and the three were compared directly. The methylation-based approach had the best combination of cancer detection sensitivity and accuracy in predicting where the cancer originated. Methylation-based detection outperformed the mutation-based approaches for several reasons.

First, methylation signals are more abundant. Most individual tumor cells carry only one or a few driver mutations (the specific genetic changes that cause a cell to become cancerous and cause problems), and the frequency of any single mutation in circulating cfDNA is extremely low, diluted by the vast majority of non-tumor cfDNA in the blood. Methylation changes, in contrast, are more pervasive. They affect thousands of genomic regions simultaneously, creating a larger signal.

Second, methylation is more consistent in tumors. Different patients with the same cancer type often have different driver mutations, and even different physical regions of the same tumor can harbor different mutations. However, cells within the same tumor tend to show similar methylation abnormalities.

Third, and arguably most important, methylation patterns are tissue-specific. Because normal methylation patterns define cell identity, cancer-associated methylation patterns retain information about the tissue of origin. A lung cancer sheds cfDNA with a methylation profile that looks like a deranged version of a lung cell, not a deranged version of a colon cell. This is what enables the test to not only detect cancer but also predict where in the body it's coming from.

Fourth, mutation-based approaches faced a specific technical problem, which is clonal hematopoiesis of indeterminate potential (CHIP). As people age, blood-forming stem cells accumulate mutations that get passed to their daughter cells, creating clones of mutated cells that resemble cancer cells on a mutation-based assay. Correcting for CHIP requires sequencing blood cells in addition to cfDNA for every patient, increasing cost and complexity. The methylation approach doesn't have this problem, since CHIP, as far as we know, doesn't produce cancer-like methylation patterns.

The CCGA investigators also found that combining mutation data with methylation data added essentially nothing. There was little to no complementarity, and the methylation signal alone captured nearly all the detectable cancer information. Based on these results, GRAIL committed entirely to a refined targeted methylation platform for what would become Galleri.

A Bit About Sequencing

Sequencing is the technology used to read DNA by determining the exact order of bases (A, T, G, C) along a strand. The revolution that made MCED testing possible was the development of next-generation sequencing (NGS), which reads millions of DNA fragments simultaneously. The most widely used NGS platform in clinical genomics, and the one Galleri runs on, is made by Illumina. Their approach is called sequencing by synthesis (SBS), and the basic principle involves watching a DNA strand being copied (one base at a time) and recording which base gets added at each step.

First, the DNA fragments to be sequenced are attached to a glass surface (called a flow cell) and amplified into clusters (small groups of identical copies of each original fragment). This amplification is necessary because the signal from a single molecule would be too faint to detect. Next, the sequencing machine floods the flow cell with fluorescently labeled nucleotides (A, T, G, C), each tagged with a different color of fluorescent dye. A DNA polymerase enzyme incorporates one complementary base at a time into the growing strand. After each base is added, a camera captures the color of fluorescence at every cluster position on the flow cell, recording which base was just incorporated. The fluorescent tag is then chemically cleaved, and the next cycle begins. This process repeats for the entire length of the fragment (typically 150 cycles, reading 150 bases per end). The result is millions of short reads, each representing the sequence of one cfDNA fragment. These reads are then aligned to a reference human genome by software, producing a digital map of which bases were present at which positions in the original sample.

An important concept in sequencing is depth (also called coverage), which is the average number of times each base in the targeted region is read. If you sequence a region to 30x depth, that means, on average, each base in that region was read 30 times across different fragments. Depth is relevant to MCED testing because tumor-derived cfDNA is typically a tiny amount of total cfDNA (often less than 1%, and in early-stage cancers, potentially less than 0.1%). If you don't sequence deeply enough, those rare tumor-derived cfDNA fragments get lost in the overwhelming background of normal cfDNA. But more depth also means more cost, more data to process, and more time.
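To make the depth problem concrete, here is a back-of-the-envelope sketch. It assumes the simplest possible model, in which the average number of tumor-derived reads covering a position is just depth multiplied by tumor fraction. The depths below are illustrative values I chose, not figures from any specific assay.

```python
def expected_tumor_reads(depth: float, tumor_fraction: float) -> float:
    """Average number of tumor-derived reads covering one position.

    Simplifying assumption: tumor fragments are sampled in proportion
    to their share of total cfDNA.
    """
    return depth * tumor_fraction


# Illustrative depths and tumor fractions (hypothetical values).
for depth in (30, 1_000, 10_000):
    for tumor_fraction in (0.01, 0.001):
        reads = expected_tumor_reads(depth, tumor_fraction)
        print(f"{depth:>6}x depth, {tumor_fraction:.1%} tumor fraction: "
              f"{reads:g} tumor reads expected per position")
```

Under this model, 30x depth with a 0.1% tumor fraction yields well under one tumor read per position on average, which is why shallow sequencing can miss the signal entirely.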

There are different sequencing strategies, and they determine which parts of the genome get read, how deeply, and what kind of information you extract. Four types are particularly relevant to understanding MCED testing.

Whole-genome sequencing (WGS) reads the entire genome (all 3.2 billion base pairs). It's comprehensive, relatively expensive, and generates massive datasets. When applied to cfDNA, it spreads sequencing capacity across the entire genome rather than concentrating it where the most informative signals might be. The CCGA study used WGS in one of its three arms to look for somatic mutations across the whole genome.

Targeted sequencing uses molecular probes to capture and sequence only specific, pre-selected regions of interest. This means you can sequence those regions much more deeply (reading each base many more times), which can improve the ability to detect rare variants. The tradeoff is that you only see what you designed the panel to look at. The CCGA used ultra-deep targeted sequencing of 507 cancer-related genes in another one of its three arms.

Whole-genome bisulfite sequencing (WGBS) combines bisulfite conversion (described below in Step 3 of the section "How the Galleri Test Works, Step by Step") with WGS to reveal the methylation status at every CpG site across the entire genome. It's the gold standard for comprehensive methylation profiling and was the approach that won the CCGA comparison. However, it is expensive, and the bisulfite conversion process chemically damages DNA (which can be a drawback when starting with an already small and fragmented amount of cfDNA in a blood sample). The third arm of the CCGA study used this sequencing method.

Targeted methylation sequencing is the approach Galleri uses in practice. It combines bisulfite conversion with hybridization capture (which uses custom-designed molecular probes to pull out and sequence only the ~103,000 most informative genomic regions covering about 1.1 million CpG sites out of roughly 30 million in the genome). This gives the methylation information needed at a lower cost than sequencing the whole genome, while concentrating sequencing depth on the regions that likely matter most for cancer detection.

How the Galleri Test Works, Step by Step

Step 1: Blood draw and plasma separation. Blood is collected using specialized cfDNA collection tubes that stabilize the sample and prevent blood cells from lysing (breaking open) during transport, which would contaminate the signal. The blood is separated into its cellular components and plasma (the cell-free liquid portion), which is where the cfDNA circulates.

Step 2: cfDNA extraction. cfDNA is isolated from the plasma. A typical sample yields up to about 75 nanograms of cfDNA, a relatively small amount of DNA, which is one of the technical challenges of the platform since everything downstream has to work with this input.

Step 3: Bisulfite conversion. The extracted cfDNA is treated with sodium bisulfite, a chemical that converts unmethylated cytosines into uracils (which are read as thymines after polymerase chain reaction amplification) while leaving methylated cytosines unchanged. In other words, if a cytosine survives the treatment, it was methylated in the original sample. Conversely, if a cytosine is read as a thymine, it was unmethylated. In this way, an epigenetic mark becomes a readable sequence difference.
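The logic of bisulfite conversion can be sketched in a few lines. This is a toy simulation, not real chemistry: it assumes perfect conversion, looks at a single strand, and uses a made-up sequence with made-up methylated positions.

```python
def bisulfite_convert(sequence: str, methylated_positions: set[int]) -> str:
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T; methylated C stays C.
    Assumes 100% conversion efficiency, which real assays do not achieve.
    """
    converted = []
    for position, base in enumerate(sequence.upper()):
        if base == "C" and position not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


original = "ATCGGCGTACCGTTAGC"  # hypothetical fragment
methylated = {2, 10}            # hypothetical methylated cytosines (0-based)

print(original)                                 # ATCGGCGTACCGTTAGC
print(bisulfite_convert(original, methylated))  # ATCGGTGTATCGTTAGT
```

Every cytosine that still reads as C in the second line was protected by a methyl group. Every C that became T was not.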

Step 4: Library preparation and targeted enrichment. The bisulfite-converted cfDNA is prepared into a sequencing library, followed by hybridization capture. GRAIL designed a custom panel that uses molecular probes to selectively capture and enrich for the most informative regions of the genome. As stated previously, the panel targets 103,456 distinct genomic regions, covering approximately 1.1 million CpG sites out of the genome's roughly 30 million. The target regions were selected using a custom algorithm trained on the whole-genome data from the CCGA.

Step 5: Sequencing. The enriched library is loaded onto an Illumina NovaSeq platform, which uses the sequencing-by-synthesis approach previously mentioned. The machine reads 150 base pairs from each end of every captured fragment (paired-end sequencing), generating millions of short reads per sample. Each read represents one cfDNA fragment, and because unmethylated cytosines were converted to thymines during bisulfite treatment, the sequence data now encodes methylation information directly (every position where a cytosine appears in the read was methylated in the original sample). The sequencing depth is calibrated to maximize the probability of capturing and reading rare tumor-derived fragments, which may represent a small amount of total cfDNA. The output is a comprehensive dataset of methylation states across all targeted CpG sites for each patient sample.
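Reading methylation back out of the sequence data amounts to comparing each read with the reference genome at CpG positions. The sketch below is a simplified illustration under several assumptions of mine: the read is already aligned without gaps, it is the same length as the reference segment, and it comes from the same strand as the reference. Real bisulfite aligners handle both strands, sequencing errors, and incomplete conversion. The sequences are toy examples.

```python
def call_cpg_methylation(reference: str, read: str) -> dict[int, bool]:
    """Call methylation at each reference CpG covered by an aligned read.

    Returns {position: True} where the read kept a C (methylated) and
    {position: False} where the read shows a T (unmethylated).
    Positions showing any other base are skipped as uninformative.
    """
    if len(reference) != len(read):
        raise ValueError("this sketch expects a gapless, full-length alignment")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i] == "C" and reference[i + 1] == "G":
            if read[i] == "C":
                calls[i] = True
            elif read[i] == "T":
                calls[i] = False
    return calls


reference = "ATCGGCGTACCGTTAGC"  # hypothetical reference segment
read = "ATCGGTGTATCGTTAGT"       # hypothetical bisulfite-converted read

calls = call_cpg_methylation(reference, read)
print(calls)                             # {2: True, 5: False, 10: True}
print(sum(calls.values()) / len(calls))  # fraction of CpGs called methylated
```

Repeating this across millions of reads and all targeted CpG sites gives the per-sample methylation dataset that the classifier in the next step consumes.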

Step 6: Machine learning classification. The sequencing data are fed into a machine learning classifier trained on labeled data from the CCGA. The classifier operates in two stages. The first stage is cancer signal detection, and the second stage is cancer signal origin. In the first stage, the algorithm compares the patient's cfDNA methylation pattern against patterns it learned from over 15,000 CCGA participants (6,670 without cancer and 8,584 with cancer) and determines whether the sample's methylation pattern looks more like the non-cancer group or the cancer group. In the second stage, a classifier provides a prediction for the most likely organ where the cancer is located.
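The two-stage structure can be sketched as simple control flow. To be clear, this says nothing about how GRAIL's actual models work internally. The scores, the threshold, and the tissue labels below are all hypothetical placeholders standing in for the outputs of trained models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Report:
    cancer_signal_detected: bool
    predicted_origin: Optional[str]


def classify(cancer_score: float,
             origin_scores: dict[str, float],
             threshold: float = 0.5) -> Report:
    """Two-stage decision: detect a cancer signal first, then predict its origin.

    cancer_score and origin_scores are assumed to come from trained models;
    the 0.5 threshold is an arbitrary placeholder, not a real operating point.
    """
    # Stage 1: does the sample look more like the cancer group?
    if cancer_score < threshold:
        return Report(cancer_signal_detected=False, predicted_origin=None)
    # Stage 2: only reached when a signal is detected.
    best_origin = max(origin_scores, key=origin_scores.get)
    return Report(cancer_signal_detected=True, predicted_origin=best_origin)


print(classify(0.12, {"lung": 0.4, "colon": 0.6}))
print(classify(0.91, {"lung": 0.7, "colon": 0.3}))
```

The point of the sketch is the ordering: an origin prediction is only produced for samples that clear the first stage.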

Step 7: Results. A report is generated indicating either "Cancer Signal Not Detected" or "Cancer Signal Detected" with a cancer signal origin prediction. If a signal is detected, the clinician uses the cancer signal origin to guide diagnostic workup, which may include imaging, endoscopy, or whatever is deemed appropriate for the predicted organ of origin.

Why Sensitivity Varies

The headline sensitivity (the probability of detecting a cancer that's actually there) of 51.5% across all cancers for Galleri in CCGA obscures enormous variation by cancer type and stage. Reported sensitivity was 93.5% for liver and bile duct cancers, 83.7% for pancreatic, 85.0% for esophageal, and 83.1% for ovarian. But reported sensitivity for kidney cancer was as low as 18.2%, and it was even lower for localized prostate cancer.

One of the main reasons for this difference is thought to be how much tumor-derived cfDNA is circulating in the blood. The technical term is circulating tumor allele fraction (cTAF), which is the proportion of total cfDNA that comes from the tumor rather than from normal cells. Several factors determine cTAF. Different cancer types shed cfDNA into the bloodstream at different rates because of differences in cell turnover rate, proximity to blood vessels, and the biology of how cells in that cancer die. These are intrinsic properties that no amount of assay optimization can overcome (yet). Larger tumors also generally shed more cfDNA, and thus, more advanced cancers are detected at higher rates than early-stage cancers across virtually every cancer type. For example, in pancreatic cancer, the sensitivity for stage IV disease was reported at 95.9% versus 61.9% for stage I disease. This "stage gradient" is a core tension of MCED testing: the tests work well for cancers that would ideally be caught early, but they catch them most reliably when they're advanced.

Every assay also has a floor, or a minimum cTAF below which it cannot reliably distinguish a cancer signal from background noise. If a patient's tumor is shedding cfDNA at a level below that floor, the test will return "Cancer Signal Not Detected" even though cancer is present. This is a constraint of the (current) technology.
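One way to build intuition for the floor is a simple sampling model. The sketch below assumes that the number of tumor-derived fragments in a sample follows a Poisson distribution and that the assay needs some minimum number of them to make a call. Both the total fragment count and the minimum of five are hypothetical numbers I picked for illustration, not properties of any real test.

```python
import math


def prob_at_least_k(mean: float, k: int) -> float:
    """P(X >= k) for X ~ Poisson(mean)."""
    below_k = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


def detection_probability(total_fragments: int,
                          tumor_fraction: float,
                          min_tumor_fragments: int) -> float:
    """Chance of sampling enough tumor fragments to clear the assay's floor."""
    mean_tumor_fragments = total_fragments * tumor_fraction
    return prob_at_least_k(mean_tumor_fragments, min_tumor_fragments)


# Hypothetical assay: 10,000 informative fragments, needs at least 5 from tumor.
for tumor_fraction in (1e-2, 1e-3, 1e-4, 1e-5):
    p = detection_probability(10_000, tumor_fraction, min_tumor_fragments=5)
    print(f"tumor fraction {tumor_fraction:.3%}: detection probability {p:.3f}")
```

In this toy model, the detection probability falls off steeply once the expected number of tumor fragments drops below the required minimum, which is the behavior the floor describes.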

Other Approaches to MCED Testing

Everything I've described so far reflects GRAIL's methylation-based approach that reads epigenetic patterns on cfDNA to detect cancer and predict where it's coming from. It's the most advanced MCED platform in terms of clinical data as of 2026, but it's not the only way to detect cancer in the blood. Several companies have built, or are building, MCED tests using different strategies. The most prominent are Exact Sciences and Guardant Health.

Exact Sciences: Exact Sciences (acquired by Abbott in March 2026, announced November 2025), best known for the Cologuard stool-based colorectal cancer screening test, has taken a different approach to MCED testing.

Their approach, which originated in the CancerSEEK research program and is now commercially available as the Cancerguard test, integrates three distinct biomarker classes from a single blood draw: circulating tumor DNA (ctDNA) mutations, DNA methylation patterns, and circulating protein biomarkers. The ctDNA mutation component interrogates segments of predetermined genes, looking for somatic mutations. The protein component measures levels of circulating proteins that tumors shed or provoke the body to produce, including cancer antigen 125 (CA-125), carcinoembryonic antigen (CEA), cancer antigen 19-9 (CA 19-9), and several others. The rationale for adding proteins is that many early-stage tumors may not shed enough ctDNA to be detectable even by sensitive sequencing but may still elevate certain protein biomarkers. Proteins also contribute to tissue-of-origin prediction (since different cancer types have unique protein signatures).

Guardant Health: Guardant Health built its reputation on the Guardant360 liquid biopsy platform, which sequences ctDNA to identify mutations that could guide treatment selection in advanced cancers. Their move into blood-based cancer detection began with Shield, the first FDA-approved blood-based colorectal cancer screening test, and is now expanding into multi-cancer detection with Shield MCD. Guardant's approach is rooted in epigenomics. Like GRAIL, Shield reads methylation patterns on cfDNA, but it adds fragmentomics, the analysis of the size, distribution, and genomic positioning of cfDNA fragments. As mentioned previously, when cells die, fragments of DNA are cleaved at specific points that reflect the chromatin structure of the cell of origin, so fragment patterns carry information about where a fragment came from in addition to its methylation.

Coming Up

You now have an understanding of the biology behind MCED tests, the mechanics of how Galleri works, and a framework for understanding how other companies are approaching the same problem with different strategies. In the posts ahead, I'll dig into the specific studies that validated these technologies, the regulatory and reimbursement landscape, and the competitive dynamics across the field. I'll also cover the critics, as many critiques of MCED testing are valid. The technology is evolving rapidly: multi-analyte testing, new sequencing platforms, and increasingly sophisticated machine learning are all reshaping what these tests can do.

Disclosures. Before Stage One is written by Michael LaPelusa, MD in his personal capacity. Views expressed are his own and do not represent the views of any institution. Content is provided for informational purposes only and should not be relied upon as medical, legal, business, investment, or tax advice. Nothing here is a recommendation to undergo, avoid, prescribe, or order any medical test or treatment, nor a recommendation to buy or sell any security. Readers should consult their own physicians and advisers regarding clinical, financial, and legal decisions. The author does not hold positions in any company discussed unless explicitly disclosed in the post. See full disclosures.