Liu, Dan and Young, Francesca and Lamb, Kieran D. and Claudio Quiros, Adalberto and Pancheva, Alexandrina and Miller, Crispin and Macdonald, Craig and Robertson, David L. and Yuan, Ke (2024) PLM-interact: extending protein language models to predict protein-protein interactions. bioRxiv.

Abstract

Computational prediction of protein structure from amino acid sequences alone has been achieved with unprecedented accuracy, yet the prediction of protein-protein interactions (PPIs) remains an outstanding challenge. Here we assess the ability of protein language models (PLMs), routinely applied to protein folding, to be retrained for PPI prediction. Existing PPI prediction models that exploit PLMs use a pre-trained PLM feature set, ignoring that the proteins are physically interacting. Our novel method, PLM-interact, goes beyond a single protein, jointly encoding protein pairs to learn their relationships, analogous to the next-sentence prediction task from natural language processing. This approach provides a significant improvement in performance: Trained on human-human PPIs, PLM-interact predicts mouse, fly, worm, E. coli and yeast PPIs, with 16-28% improvements in AUPR compared with state-of-the-art PPI models. Additionally, it can detect changes that disrupt or cause PPIs and be applied to virus-host PPI prediction. Our work demonstrates that large language models can be extended to learn the intricate relationships among biomolecules from their sequences alone.
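The contrast the abstract draws, between encoding each protein on its own and jointly encoding a pair, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the PLM-interact implementation: the special-token strings, the per-residue tokenization, the segment ids and both helper functions are hypothetical.

```python
# Hypothetical sketch of single-protein vs. paired input construction.
# Token strings and function names are illustrative assumptions only.
CLS, SEP = "<cls>", "<eos>"


def encode_separately(seq_a, seq_b):
    """Each protein becomes its own token sequence, so a model that
    consumes them one at a time never sees the partner protein."""
    return [CLS] + list(seq_a) + [SEP], [CLS] + list(seq_b) + [SEP]


def encode_jointly(seq_a, seq_b):
    """Both proteins share one token sequence, separated by SEP, in the
    style of next-sentence prediction inputs. Segment ids mark which
    protein each position belongs to."""
    tokens = [CLS] + list(seq_a) + [SEP] + list(seq_b) + [SEP]
    # CLS, the residues of A and the first SEP belong to segment 0;
    # the residues of B and the final SEP belong to segment 1.
    segments = [0] * (len(seq_a) + 2) + [1] * (len(seq_b) + 1)
    return tokens, segments


if __name__ == "__main__":
    tokens, segments = encode_jointly("MK", "AV")
    print(tokens)
    print(segments)
```

In the joint form, a model with self-attention over the whole input can relate residues of one protein to residues of the other, which is the property the abstract attributes to pair encoding.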

Texts
349398.pdf - Published Version
Available under License Creative Commons Attribution.
