Seminar: Exploring antibiotic resistance with Next-Generation Sequencing: A graph-based algorithm and machine learning approach

Time:

Venue/Location: Room B.102, VIASM

Speaker: Dr. Đỗ Văn Hoàn, Le Quy Don Technical University

Abstract: Whole genome sequencing (WGS) has emerged as a fundamental method for elucidating the genetic mechanisms underlying antimicrobial resistance (AMR) and for surveilling drug-resistant bacterial pathogens. Illumina sequencing is widely used for bacterial genomes because of its high throughput, accuracy, and cost-effectiveness, but its short reads often produce fragmented assemblies, which limits comprehensive genome analysis. In response, we introduce Pasa, a novel graph-based algorithm that leverages pangenome and assembly graph information to enhance scaffolding quality. By incorporating population data of bacterial species, Pasa uses gene family linkage information to resolve contig graphs within assemblies. Our method surpasses current state-of-the-art techniques in accuracy while remaining computationally efficient, making it suitable for analyzing many draft assemblies. In addition, we introduce PanKA, which exploits the pangenome to extract a concise set of features relevant to AMR prediction. PanKA not only accelerates model training and prediction but also improves prediction accuracy. Applied to Escherichia coli and Klebsiella pneumoniae, PanKA outperforms existing methods, including the state-of-the-art classification approach for AMR prediction.