Using Deep Learning for graph theory and subgraph matching
Graph theory is a branch of mathematics that studies graphs, which are mathematical structures used to model pairwise relations between objects. Classical algorithms can solve many graph problems, such as subgraph matching, exactly; however, they scale poorly and cannot be applied to large-scale graphs.
Research Focus
I explore how to use Deep Learning to solve graph theory problems on large-scale graphs, with applications in drug discovery, social network analysis, and other domains.
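To make the scalability issue concrete, here is a minimal brute-force subgraph matching sketch in plain Python (the function name and adjacency-dict representation are illustrative choices, not from any of the papers below). Its factorial runtime is exactly why exact methods break down on large graphs:

```python
from itertools import permutations

def is_subgraph_isomorphic(query_adj, target_adj):
    """Brute-force (non-induced) subgraph isomorphism check.

    query_adj, target_adj: dict mapping node -> set of neighbour nodes.
    Returns a node mapping (dict) if the query occurs in the target,
    otherwise None. Tries every injective assignment of query nodes to
    target nodes -- the combinatorial blow-up that motivates neural
    approximations.
    """
    q_nodes = list(query_adj)
    t_nodes = list(target_adj)
    for image in permutations(t_nodes, len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        # every query edge must map onto a target edge
        if all(mapping[v] in target_adj[mapping[u]]
               for u in q_nodes for v in query_adj[u]):
            return mapping
    return None

# Triangle query inside a 4-cycle with one chord (nodes 0..3).
query = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
target = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
print(is_subgraph_isomorphic(query, target) is not None)  # prints True
```

Even at this toy scale the search visits up to |V_t|!/(|V_t|-|V_q|)! candidate mappings, so query times explode long before graphs reach molecular-database sizes.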
Master’s Thesis
My Master’s thesis focuses on enhancing subgraph isomorphism prediction models to improve scalability and interpretability, particularly for applications in drug design. I introduce xNeuSM, an Explainable Neural Subgraph Matching framework, which leverages Graph Learnable Multi-hop Attention Networks (GLeMA) to dynamically learn attention decay parameters across multiple hops.
Key Results
Higher accuracy than approximate baselines
Query times at least 7x faster than exact algorithms
Practical for large-scale molecular analysis in drug discovery
Subgraph matching is a challenging problem with a wide range of applications in drug discovery, social network analysis, biochemistry, and cognitive science. It involves determining whether a given query graph is present within a larger target graph. Traditional graph-matching algorithms provide precise results but face challenges in large graph instances due to the NP-complete nature of the problem, limiting their practical applicability. In contrast, recent neural network-based approximations offer more scalable solutions but often lack interpretable node correspondences. To address these limitations, this article presents a multi-task learning framework called xNeuSM: Explainable Neural Subgraph Matching, which introduces Graph Learnable Multi-hop Attention Networks (GLeMA) that adaptively learn the parameters governing the attention factor decay for each node across hops rather than relying on fixed hyperparameters. Our framework jointly optimizes both subgraph matching and finding subgraph-isomorphism mappings. We provide a theoretical analysis establishing error bounds for GLeMA’s approximation of multi-hop attention as a function of the number of hops. Additionally, we prove that learning distinct attention decay factors for each node leads to a correct approximation of multi-hop attention. Empirical evaluation on real-world datasets shows that xNeuSM achieves substantial improvements in prediction F1 score of up to 34% compared to approximate baselines and, notably, at least a seven-fold faster query time than exact algorithms. With these results, xNeuSM can be applied to solve matching problems in various domains spanning from biochemistry to social science.
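The core idea of learnable multi-hop attention can be illustrated with attention diffusion: a one-hop attention matrix A is expanded into a multi-hop matrix as a truncated weighted sum of its powers, where the weight of hop k for node i decays geometrically with a per-node factor alpha_i. A pure-Python sketch under these assumptions (the fixed alphas stand in for parameters GLeMA would learn; function names are illustrative, not from the paper's code):

```python
def matmul(A, B):
    """Dense matrix product for list-of-lists matrices."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def multi_hop_attention(A, alphas, K):
    """Truncated attention diffusion with per-node decay.

    A:      one-hop attention matrix (row-stochastic, list of lists).
    alphas: per-node decay factors alpha_i in (0, 1); learned in GLeMA,
            fixed stand-ins here.
    K:      number of hops kept in the truncated series.

    Row i of the result is sum_{k=0}^{K} alpha_i * (1 - alpha_i)^k * (A^k)[i].
    """
    n = len(A)
    result = [[0.0] * n for _ in range(n)]
    # A^0 = identity: every node starts by attending to itself.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(K + 1):
        for i in range(n):
            w = alphas[i] * (1.0 - alphas[i]) ** k
            for j in range(n):
                result[i][j] += w * power[i][j]
        power = matmul(power, A)  # advance to the next hop
    return result

# 3-node example: uniform one-hop attention over neighbours.
A = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0]]
M = multi_hop_attention(A, alphas=[0.3, 0.6, 0.9], K=4)
```

A large alpha_i keeps node i's attention concentrated on nearby hops, while a small alpha_i spreads it further out; the gap left by truncating the series at K hops is what the paper's error bound quantifies as a function of the number of hops.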
@article{nguyen_explainable_2024,
  title = {Explainable {Neural} {Subgraph} {Matching} {With} {Learnable} {Multi}-{Hop} {Attention}},
  volume = {12},
  copyright = {Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License},
  issn = {2169-3536},
  url = {https://ieeexplore.ieee.org/document/10677500},
  doi = {10.1109/ACCESS.2024.3458050},
  urldate = {2025-09-08},
  journal = {IEEE Access},
  author = {Nguyen, Duc Q. and Toan Nguyen, Thanh and Jo, Jun and Poux, Florent and Anirban, Shikha and Quan, Tho T.},
  year = {2024},
  keywords = {graph neural networks, Spread spectrum communication, Pattern matching, subgraph matching, Adaptation models, Explainability, learnable multi-hop attention, Multitasking, Object recognition},
  pages = {130474--130492},
}
The goal of subgraph matching is to determine the presence of a particular query pattern within a large collection of data graphs. Despite being a hard problem, subgraph matching is essential in various disciplines, including bioinformatics, text matching, and graph retrieval. Although traditional approaches can provide exact solutions, their computations are known to be NP-complete, leading to overwhelming query latency. While recent neural-based approaches have been shown to improve response time, the oversimplified first-order (single-hop) assumption limits their ability to fully capture patterns of varying sizes, causing performance to drop significantly on datasets from various domains. To overcome these limitations, this paper proposes xDualSM, a dual matching neural network model with interleaved diffusion attention. Specifically, we first embed the structural information of graphs into different adjacency matrices, which explicitly capture the intra-graph and cross-graph structures between the query pattern and the target graph. Then, we introduce a dual matching network with interleaved diffusion attention to carefully capture intra-graph and cross-graph information while reducing computational complexity. Empirically, our proposed framework not only speeds up subgraph matching by more than 10x over the fastest baseline but also achieves significant improvements of 47.64% in Recall and 34.39% in F1-score over the state-of-the-art approximation approach on the COX2 dataset. In addition, our results are comparable with those of exact methods.
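One way to picture the intra-graph and cross-graph adjacency matrices is as blocks over the joint node set (query nodes first, then target nodes): intra-graph edges form a block-diagonal matrix, while cross-graph edges link candidate node pairs. A hedged sketch, where cross-graph links connect label-compatible pairs; this is one plausible construction for illustration, and the actual encoding in xDualSM may differ:

```python
def build_dual_adjacency(q_adj, t_adj, q_labels, t_labels):
    """Build intra-graph and cross-graph adjacency matrices over the
    joint node set (query nodes 0..nq-1, then target nodes nq..nq+nt-1).

    The intra matrix is block-diagonal: query edges in the top-left
    block, target edges in the bottom-right block. The cross matrix
    links label-compatible (query, target) node pairs, a simple seed
    for cross-graph attention.
    """
    nq, nt = len(q_adj), len(t_adj)
    n = nq + nt
    intra = [[0] * n for _ in range(n)]
    cross = [[0] * n for _ in range(n)]
    for u in range(nq):                 # query block
        for v in q_adj[u]:
            intra[u][v] = 1
    for u in range(nt):                 # target block, offset by nq
        for v in t_adj[u]:
            intra[nq + u][nq + v] = 1
    for u in range(nq):                 # cross-graph candidate pairs
        for v in range(nt):
            if q_labels[u] == t_labels[v]:
                cross[u][nq + v] = cross[nq + v][u] = 1
    return intra, cross

# Query: a labeled edge C-N; target: a path C-N-C.
q_adj, q_labels = {0: {1}, 1: {0}}, ["C", "N"]
t_adj, t_labels = {0: {1}, 1: {0, 2}, 2: {1}}, ["C", "N", "C"]
intra, cross = build_dual_adjacency(q_adj, t_adj, q_labels, t_labels)
```

Keeping the two matrices separate lets the two attention streams of a dual matching network operate on within-graph structure and across-graph correspondence independently, which is the structural separation the abstract describes.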
@inproceedings{nguyen_10x_2023,
  title = {{10X} {Faster} {Subgraph} {Matching}: {Dual} {Matching} {Networks} with {Interleaved} {Diffusion} {Attention}},
  copyright = {Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License},
  shorttitle = {{10X} {Faster} {Subgraph} {Matching}},
  url = {https://ieeexplore.ieee.org/document/10191159},
  doi = {10.1109/IJCNN54540.2023.10191159},
  urldate = {2025-09-08},
  booktitle = {2023 {International} {Joint} {Conference} on {Neural} {Networks} ({IJCNN})},
  author = {Nguyen, Thanh Toan and Nguyen, Duc Q. and Ren, Zhao and Jo, Jun and Nguyen, Quoc Viet Hung and Nguyen, Thanh Tam},
  month = jun,
  year = {2023},
  note = {ISSN: 2161-4407},
  keywords = {Bioinformatics, graph neural networks, Computational complexity, diffusion attention, graph mining, Impedance matching, matching by embedding, Neural networks, Pattern matching, subgraph matching, Time factors},
  pages = {1--9},
}