Recommendation Systems in Drug Discovery


I recently gave a talk at Neo4j's developer conference called "Graph-Based Features for Recommendation Systems in Drug Discovery". (It was at 7 am, so the link is below if you missed it and want to watch 😄.)

The talk was a great opportunity to give a technical presentation to a wide audience and to explain how we at Crossr see recommendation systems supercharging biologists and speeding up the drug discovery cycle. It had a technical focus on applying Neo4j (the leading graph database vendor), a technology whose adoption is growing rapidly as the value of connected data is realised.

Why Recommendation Systems in Drug Discovery?

Recommendation systems have proven themselves at companies such as Netflix (movie recommendations) and across e-commerce (surfacing ads), but they have not yet been widely applied in drug discovery. At Crossr, we see them as hugely beneficial in helping scientists make key prioritisation decisions fast.

The first use case that comes to mind is identifying a gene (or protein) as a drug target to treat a disease. A scientist often has thousands of potential targets to consider for a single disease, and prioritising which genes to move forward with is a difficult process that can take months. A recommendation system gives the scientist a mechanism to decide which features or characteristics matter to them and to apply those quickly when ranking thousands of genes. The result is a short list (maybe 20 genes) that can then be manually reviewed and pre-clinically validated.
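To make the ranking step concrete, here is a minimal sketch in Python. It is not how our platform is implemented; it simply assumes each gene already has a few numeric features scaled to the 0–1 range, and that the scientist expresses what matters as a weight per feature. The gene names, feature names, and the `rank_genes` helper are all made up for illustration.

```python
from typing import Dict, List, Tuple


def rank_genes(
    features: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    top_n: int = 20,
) -> List[Tuple[str, float]]:
    """Score each gene as a weighted sum of its features and return the top_n.

    Features without a weight are ignored (treated as weight 0), which is how
    the scientist switches a feature off for a given question.
    """
    scores = {
        gene: sum(weights.get(name, 0.0) * value for name, value in values.items())
        for gene, values in features.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical genes and feature values, purely for illustration.
features = {
    "GENE_A": {"similarity_to_known_targets": 0.9, "gwas_association": 0.2},
    "GENE_B": {"similarity_to_known_targets": 0.4, "gwas_association": 0.8},
    "GENE_C": {"similarity_to_known_targets": 0.1, "gwas_association": 0.1},
}

# The scientist decides that genetic evidence matters twice as much here.
weights = {"similarity_to_known_targets": 1.0, "gwas_association": 2.0}

shortlist = rank_genes(features, weights, top_n=2)
for gene, score in shortlist:
    print(f"{gene}: {score:.2f}")
```

A weighted sum is only one of many possible scoring schemes, but it keeps the ranking easy to explain: every gene's position can be traced back to the features and weights the scientist chose.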

This is just one example, but recommendation systems can be applied in other areas including drug repurposing and clinical trials (the list goes on!).

Our Evolving Approach to Recommendation Systems in Drug Discovery

  • Scientists should easily be able to consider any data they think is relevant to the problem at hand (Multi-Omics, RNA-seq, GWAS, Clinical Trials etc.).
  • Leveraging all the newly available data and computational approaches in drug discovery should not be reserved for scientists who can code. This means we need enterprise-level scientific software that is easy to access and use, so that everyone can benefit.
  • Put the power in the hands of the scientists, so they can choose which features to consider or ignore each time. This is important as there is often a context or hypothesis in mind (for example that Pathway X or Biological Process Y is implicated in the disease).
  • Having out-of-the-box features available for set workflows means that scientists aren't stuck engineering common features themselves (e.g. similarity to known targets for target identification).
  • Collaboration and transparency are key to getting buy-in from stakeholders inside a company, so being able to clearly and visually demonstrate why a set of targets was recommended can help move things forward.
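As an illustration of what an out-of-the-box graph feature like "similarity to known targets" could look like, here is a minimal Python sketch. It assumes similarity is measured as the Jaccard overlap between the sets of nodes (pathways, processes, diseases) that two genes connect to in the graph; that is one common choice, not necessarily the one used in the talk. In practice the neighbour sets would come from a graph database query; here they are hard-coded, and all names are hypothetical.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: shared neighbours divided by all neighbours."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_to_known_targets(
    gene: str,
    neighbours: Dict[str, Set[str]],
    known_targets: Iterable[str],
) -> float:
    """Highest Jaccard similarity between a gene and any known target.

    A gene is never compared with itself, so a known target does not get a
    perfect score just for being on the list.
    """
    own = neighbours.get(gene, set())
    scores = [
        jaccard(own, neighbours.get(target, set()))
        for target in known_targets
        if target != gene
    ]
    return max(scores, default=0.0)


# Hypothetical graph neighbourhoods: the nodes each gene is connected to.
neighbours = {
    "GENE_A": {"PATHWAY_X", "PROCESS_Y", "DISEASE_Z"},
    "GENE_B": {"PATHWAY_X", "PROCESS_Y"},
    "GENE_C": {"PATHWAY_W"},
}
known_targets = ["GENE_A"]

for candidate in ("GENE_B", "GENE_C"):
    score = similarity_to_known_targets(candidate, neighbours, known_targets)
    print(f"{candidate}: {score:.2f}")
```

Because the score is built from shared neighbours, it also supports the transparency point: the overlapping pathways and processes are exactly what you would show a stakeholder to explain why a gene was recommended.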

Graph-based approaches and recommendation systems have a bright future in drug discovery, but the field is still evolving, so open discussion is needed to push adoption forward. I invite anyone to comment on my presentation here, or to get in touch if you want to chat about it.