Hetnets in biomedicine

Hetnets — short for heterogeneous networks — are networks with multiple node or relationship types. They provide a scalable, intuitive, and frictionless structure for data integration and translation of biomedical knowledge.
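As a rough illustration of that definition, the sketch below stores a tiny hetnet in plain Python. The class name `Hetnet`, the node identifiers, and the type names (`Compound`, `Gene`, `Disease`, `binds`, `associates`) are hypothetical placeholders chosen for this example; this is not the project's own implementation.

```python
from collections import defaultdict


class Hetnet:
    """Toy heterogeneous network: every node and every edge carries a type."""

    def __init__(self):
        self.node_type = {}            # node id -> node type
        self.edges = defaultdict(set)  # (node id, edge type) -> neighbor ids

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, source, edge_type, target):
        # Store both directions so the edge can be traversed either way.
        self.edges[(source, edge_type)].add(target)
        self.edges[(target, edge_type)].add(source)

    def neighbors(self, node_id, edge_type):
        # .get avoids creating empty entries for unknown keys.
        return self.edges.get((node_id, edge_type), set())


# Hypothetical example data, for illustration only.
net = Hetnet()
net.add_node("compound_A", "Compound")
net.add_node("gene_X", "Gene")
net.add_node("disease_D", "Disease")
net.add_edge("compound_A", "binds", "gene_X")
net.add_edge("gene_X", "associates", "disease_D")

node_types = sorted(set(net.node_type.values()))
edge_types = sorted({edge_type for _, edge_type in net.edges})
print(node_types)
print(edge_types)
```

Because the edge type is part of the lookup key, a query such as `net.neighbors("gene_X", "binds")` returns only the neighbors reached through that relationship type, which is what distinguishes a hetnet from a network with a single kind of edge.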

Hetionet (Biomedical Hetnet)

Hetionet is a network that integrates knowledge from over 50 years of biomedical research into a single resource. Hetionet encodes biology through its 47,031 nodes of 11 types and 2,250,197 relationships of 24 types. Version 1.0 was created for Project Rephetio and is therefore ideally suited for drug repurposing. We host a public Neo4j Browser allowing setup-free interaction with Hetionet.

Neo4j Browser » Data on GitHub »

Project Rephetio (Drug Repurposing)

Project Rephetio used Hetionet v1.0 to predict new uses for existing compounds. Our hetnet edge prediction method learned a systematic model of drug efficacy, which we applied to predict the probability of treatment between 1,538 approved small molecule compounds and 136 complex diseases. Project Rephetio is entirely open: check out the Thinklab for discussion and for navigating to code or data.

Prediction Browser » Publication » Thinklab Project » GraphGist Tutorial »

Disease-Associated Genes (GWAS Prioritization)

Using heterogeneous network edge prediction on a network with 18 node types (metanodes) and 19 edge types (metaedges), we predict the probability that each protein-coding gene is associated with each of 29 complex human diseases. Reported associations from the GWAS Catalog were used to train the model. Potential applications include prioritizing GWAS analyses, determining the causal gene within a genomic region, identifying genes of biological interest, and comparing the ability of various information domains to identify pathogenic variants.

More details » Prediction Browser » Publication »

Heterogeneous Network Edge Prediction

The goal of heterogeneous network edge prediction (HNEP) is to produce biologically meaningful predictions by integrating multiple high-throughput data sources. The approach computes features describing the network topology connecting two nodes. These features are used as input to a machine learning method that predicts the probability that an edge exists. By evaluating the informativeness of each feature, the relevance of the included domains can be compared, providing insight into the mechanisms that influence the process of interest.
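To make the idea of a topology feature concrete, the sketch below counts the paths between two nodes that follow a given sequence of edge types. A plain path count is used here as an assumption for illustration; it is not necessarily the feature the project computes. The edge list, node names, and the helpers `build_adjacency` and `path_count` are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical toy edges: (source, edge type, target).
EDGES = [
    ("compound_A", "binds", "gene_X"),
    ("compound_A", "binds", "gene_Y"),
    ("gene_X", "associates", "disease_D"),
    ("gene_Y", "associates", "disease_D"),
    ("gene_Y", "associates", "disease_E"),
]


def build_adjacency(edges):
    """Index neighbors by (node, edge type), treating edges as undirected."""
    adjacency = defaultdict(set)
    for source, edge_type, target in edges:
        adjacency[(source, edge_type)].add(target)
        adjacency[(target, edge_type)].add(source)
    return adjacency


def path_count(adjacency, source, target, edge_types):
    """Count paths from source to target whose edges follow edge_types in
    order, never visiting the same node twice."""

    def walk(node, step, visited):
        if step == len(edge_types):
            return 1 if node == target else 0
        total = 0
        for neighbor in adjacency.get((node, edge_types[step]), ()):
            if neighbor not in visited:
                total += walk(neighbor, step + 1, visited | {neighbor})
        return total

    return walk(source, 0, {source})


adjacency = build_adjacency(EDGES)
edge_types = ["binds", "associates"]  # compound -> gene -> disease

# One feature value per candidate compound-disease pair.
features = {
    disease: path_count(adjacency, "compound_A", disease, edge_types)
    for disease in ("disease_D", "disease_E")
}
print(features)  # {'disease_D': 2, 'disease_E': 1}
```

In a full pipeline, one such value would be computed per path type for every candidate node pair, and the resulting feature table would be passed to a classifier that outputs the probability that an edge exists.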

View details »

Publications & Media

Browse the collection of media relating to the project. Check back here for citation information.

View details »


We provide a Python implementation of a framework for representing heterogeneous networks. Over time we plan to release additional code related to this project. We welcome contributions from the open source community.

View details »


This project was developed by Daniel Himmelstein and Sergio Baranzini from the Baranzini Lab at UCSF. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1144247. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


Contact Sergio Baranzini or Daniel Himmelstein by email. Ask questions, and please report any problems you find.