Development and Validation of an Automated DNA-Encoded Library Screening Data Analysis Platform: PB-DEL Autoscreening Analysis (PB-DELASA)

Keke Dong, Xiangfei Meng, Hongyi Diao, Bing Qi, Zhuangzhi Chen, Wei Ma, Yihang Zhang, Minmin Yang, Jing Zhao, Liu Liu

J Chem Inf Model

DOI: 10.1021/acs.jcim.5c00816

Abstract

Existing tools for analyzing next-generation sequencing (NGS) data from DNA-encoded library (DEL) screens are often limited to custom internal methods, focusing narrowly on protein–ligand interactions and lacking standardization in compound selection. These tools typically ignore sequencing depth, error rates, and quality control, leading to time-consuming and subjective analysis. To address these issues, we developed PB-DELASA, a fully automated, standardized, and accurate DEL data analysis workflow. It incorporates AI, computational analysis, and medicinal chemistry expertise to generate 2D/3D visualizations and ranked compound lists. Validated through a screen against CDK9, PB-DELASA identified potent and selective hits with minimal synthesis effort. The source code is publicly available.

Summary

PB-DELASA is a novel, automated platform for analyzing DEL screening data. It integrates QC, enrichment analysis, and compound prioritization into a single workflow. The tool uses AI and empirical rules to reduce false positives and recommend high-quality compounds for off-DNA synthesis. It was validated using a CDK9 screening campaign, resulting in the discovery of novel, potent, and selective inhibitors. The platform improves efficiency, reduces bias, and supports medicinal chemists in early-stage drug discovery.
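
To make the enrichment analysis and compound prioritization steps concrete, here is a minimal sketch in Python. It is not PB-DELASA's actual code or scoring method, which this summary does not describe. The function names (`enrichment`, `rank_hits`), the pseudocount, the no-target control sample, and the cutoff of 2.0 are all hypothetical choices made for illustration. It only shows a common way to compare read counts between a target selection and a control while accounting for sequencing depth.

```python
def enrichment(target_counts, control_counts, pseudocount=1.0):
    """Depth-normalized enrichment ratio per compound (illustrative only).

    target_counts / control_counts: dict mapping compound ID -> read count.
    The pseudocount is an assumption; it avoids division by zero for
    compounds that were not observed in one of the samples.
    """
    compounds = set(target_counts) | set(control_counts)
    n = len(compounds)
    # Normalize by total reads so samples sequenced at different depths
    # can be compared on the same scale.
    target_total = sum(target_counts.values()) + pseudocount * n
    control_total = sum(control_counts.values()) + pseudocount * n
    scores = {}
    for compound in compounds:
        t_freq = (target_counts.get(compound, 0) + pseudocount) / target_total
        c_freq = (control_counts.get(compound, 0) + pseudocount) / control_total
        scores[compound] = t_freq / c_freq
    return scores


def rank_hits(scores, min_enrichment=2.0):
    """Return (compound, score) pairs above a hypothetical cutoff, best first."""
    hits = [(c, s) for c, s in scores.items() if s >= min_enrichment]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

A real workflow would add the quality-control and false-positive filtering steps mentioned above before any compound is recommended for off-DNA synthesis.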

Highlights

1. Automated, end-to-end DEL data analysis workflow
2. Combines AI, computational tools, and medicinal chemistry expertise
3. 2D/3D visualization and compound prioritization
4. Validated with CDK9 target; identified potent and selective hits
5. Reduces synthetic workload and improves hit confidence
6. Open-source and user-friendly for non-bioinformaticians

Conclusion

PB-DELASA addresses major limitations in current DEL data analysis by providing a standardized, automated, and accurate workflow. It enhances the reliability of hit identification, reduces manual intervention, and supports efficient compound prioritization. The successful validation against CDK9 demonstrates its potential to accelerate early-stage drug discovery. By making the tool open-source, the authors aim to promote broader adoption and further development in the DEL community.
