Deep-learning algorithm used to identify disease-associated DNA variants

Susan Furth, MD, PhD Executive Vice President and Chief Scientific Officer — Children's Hospital of Philadelphia

Researchers at the Children’s Hospital of Philadelphia (CHOP) and the University of Pennsylvania’s Perelman School of Medicine have employed a deep learning algorithm to identify potential mutations in the noncoding regions of DNA that could increase disease risk. The research findings, published in the American Journal of Human Genetics, lay the groundwork for future detection of disease-associated variants across various common diseases.

Although some sections of the genome code for proteins, more than 98% of the human genome does not. Disease-related variants in these noncoding regions are frequently involved in controlling protein expression, a function known as the “regulatory code.” Genome-wide association studies (GWAS) have contributed significantly to clarifying the clinical significance of many noncoding variants.

Pinpointing the specific disease-causing variants within the broad regions identified by GWAS remains a challenge. Such variants often lie near motifs where transcription factors, specialized proteins that regulate gene expression, bind to DNA. These proteins leave a “footprint” when they bind, which researchers can trace to determine precise binding sites.

“This situation is comparable to a police lineup,” explained senior study author Dr. Struan F.A. Grant from CHOP. “You’re looking at similar suspects together, so it can be challenging to know who the actual culprit is. With the approach we used in this study, we’re able to pinpoint the disease-causing variant through identification of this so-called footprint.”

Using the ATAC-seq sequencing method and the PRINT algorithm, the researchers examined data from 170 human liver samples, identifying 809 footprint quantitative trait loci associated with DNA-protein interactions. These analyses allowed researchers to determine the strength of transcription factor binding at various sites based on the mutations present.
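To make the idea of a footprint quantitative trait locus concrete, the sketch below fits a simple least-squares line relating genotype at one variant to a footprint score across samples. It is an illustration only and not the study's method: the function name `footprint_qtl_slope`, the toy numbers, and the use of a plain linear fit are all assumptions made here, and a real analysis would use ATAC-seq-derived scores, covariates, and formal significance testing.

```python
from statistics import mean

def footprint_qtl_slope(dosages, scores):
    """Least-squares fit of footprint score on genotype dosage.

    dosages: copies of the alternate allele per sample (0, 1, or 2).
    scores:  hypothetical footprint scores for the same samples.
    Returns (slope, intercept); a nonzero slope suggests that the
    variant is associated with a change in binding strength.
    """
    if len(dosages) != len(scores) or len(dosages) < 2:
        raise ValueError("need matched lists with at least two samples")
    d_mean, s_mean = mean(dosages), mean(scores)
    sxx = sum((d - d_mean) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("variant has no genotype variation in these samples")
    sxy = sum((d - d_mean) * (s - s_mean) for d, s in zip(dosages, scores))
    slope = sxy / sxx
    intercept = s_mean - slope * d_mean
    return slope, intercept

# Made-up toy data: six samples, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
scores = [0.9, 1.1, 0.7, 0.7, 0.4, 0.4]
slope, intercept = footprint_qtl_slope(dosages, scores)
print(f"slope per allele copy: {slope:.2f}, intercept: {intercept:.2f}")
```

In this toy data the fitted slope is negative, meaning each additional copy of the alternate allele goes with a weaker footprint, which is the kind of genotype-dependent binding difference the paragraph above describes.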

“This approach helps resolve some fundamental issues we have encountered in the past when trying to determine which noncoding variants may be driving disease,” noted Max Dudek, a PhD student involved in the study. “With larger sample sizes, we believe that pinpointing these causal variants could ultimately inform the design of novel treatments for common diseases.”

Funding for this study came from several sources, including the National Science Foundation Graduate Research Fellowship Program and various National Institutes of Health grants.

The study, titled “Characterization of non-coding variants associated with transcription factor binding through ATAC-seq-defined footprint QTLs in liver,” was published online on April 17, 2025.


