Biograph
smarts.bio's proprietary, unified biomedical knowledge graph. It connects genes, proteins, diseases, variants, pathways, and drugs into a single queryable graph, powered by Neo4j and curated from dozens of public databases.
How Biograph works
Biograph integrates curated relationships from UniProt, NCBI Gene, OMIM, ClinVar, KEGG, Reactome, DrugBank, and the scientific literature into a Neo4j property graph. You can look up entities, run full-text searches, or traverse multi-hop networks to discover indirect connections that would otherwise require querying a dozen databases separately.
Entity types
| Type | Scale | ID formats accepted |
|---|---|---|
| gene | 60K+ | Gene symbol (BRCA1), NCBI Gene ID (672), Ensembl gene ID |
| protein | 20K+ | UniProt accession (P38398), UniProt entry name (BRCA1_HUMAN) |
| disease | 26K+ | OMIM ID (114480), MeSH ID, MONDO ID, disease name |
| variant | 1M+ | dbSNP rs ID (rs80357914), ClinVar variation ID, HGVS notation |
| pathway | 5K+ | KEGG pathway ID (hsa05212), Reactome ID, pathway name |
| drug | 10K+ | DrugBank ID (DB00619), drug name, ChEMBL ID |
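Several of the ID formats in the table are distinctive enough to recognize by pattern, while others (a bare number such as 672 could be an NCBI Gene ID or an OMIM ID) are not, which is one reason requests take an explicit entity_type. The sketch below is a client-side heuristic, not part of Biograph: the function name `guess_entity_type` and the regular expressions are assumptions chosen for illustration, and they cover only some of the accepted formats.

```python
import re

# Heuristic patterns for a few of the ID formats listed in the table.
# These run client-side and are NOT part of the Biograph API.
_ID_PATTERNS = [
    ("variant", re.compile(r"rs\d+")),                    # dbSNP rs ID
    ("drug", re.compile(r"DB\d{5}")),                     # DrugBank ID
    ("drug", re.compile(r"CHEMBL\d+")),                   # ChEMBL ID
    ("pathway", re.compile(r"hsa\d{5}")),                 # KEGG human pathway ID
    ("pathway", re.compile(r"R-HSA-\d+")),                # Reactome stable ID
    ("gene", re.compile(r"ENSG\d{11}")),                  # Ensembl gene ID
    ("protein", re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")),  # UniProt entry name
    ("protein", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    )),                                                   # UniProt accession
]


def guess_entity_type(identifier):
    """Return a likely entity_type for an ID, or None if it is ambiguous."""
    for entity_type, pattern in _ID_PATTERNS:
        if pattern.fullmatch(identifier):
            return entity_type
    # Gene symbols, bare numeric IDs, and free-text names fall through here.
    return None
```

When the helper returns None, the caller still has to supply entity_type explicitly, or use search mode to resolve the term first.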
Query mode: entity-lookup
Retrieve a single entity and its direct relationships. Best for fetching structured facts about a known gene, protein, disease, or variant.
biograph (tools scope)
Parameters
| Parameter | Type | Description |
|---|---|---|
| mode * | string | entity-lookup |
| entity * | string | Entity ID or name (gene symbol, UniProt accession, OMIM ID, rs ID, etc.) |
| entity_type * | string | One of: gene, protein, disease, variant, pathway, drug |
| relation_types | string[] | Filter to specific relationship types (e.g. ["AFFECTS", "ASSOCIATED_WITH"]). Default: all. |
| depth | integer | Relationship traversal depth (1 = direct neighbors only, max 2). Default: 1. |
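The limits in the table can be checked before a request is sent. The following is a minimal sketch of such a check; `build_entity_lookup_input` is a hypothetical helper, not an SDK function, and it only encodes the constraints stated above (six entity types, depth of 1 or 2).

```python
ENTITY_TYPES = {"gene", "protein", "disease", "variant", "pathway", "drug"}


def build_entity_lookup_input(entity, entity_type, relation_types=None, depth=1):
    """Build the `input` dict for an entity-lookup call, checking documented limits."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity_type: {entity_type!r}")
    if depth not in (1, 2):
        raise ValueError("depth must be 1 or 2 for entity-lookup")

    payload = {"mode": "entity-lookup", "entity": entity, "entity_type": entity_type}
    # Only send optional fields when they differ from the documented defaults.
    if depth != 1:
        payload["depth"] = depth
    if relation_types:
        payload["relation_types"] = list(relation_types)
    return payload
```

The returned dict can be passed as the `input` argument of the tool call, so a mistyped entity type or an out-of-range depth fails locally rather than at the API.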
```python
# `client` is assumed to be an already-initialized smarts.bio API client.
result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "entity-lookup",
        "entity": "BRCA1",
        "entity_type": "gene",
    },
)

entity = result["entity"]
print(f"Gene: {entity['name']} ({entity['id']})")
print(f"Description: {entity['description']}")

# Show the first ten direct relationships
print(f"\nRelationships ({len(result['relationships'])} total):")
for rel in result["relationships"][:10]:
    print(f"  [{rel['type']}] → {rel['target']['name']} ({rel['target']['entity_type']})")
```

Query mode: search
Full-text search across all entity types simultaneously. Returns ranked results with their type, identifiers, and a short description. Useful when you know a term but not its exact ID.
biograph
Parameters
| Parameter | Type | Description |
|---|---|---|
| mode * | string | search |
| query * | string | Search term, disease name, gene name, pathway description, etc. |
| entity_types | string[] | Restrict to specific entity types. Default: all six types. |
| limit | integer | Max results to return (default 20, max 100). |
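Because a single search can return hits of several entity types, it is often convenient to group them before display. The sketch below assumes each hit carries the `entity_type`, `name`, `id`, and `score` fields used in the search example on this page, and that a higher score means a better match. The sample hits, including their IDs and scores, are made up for illustration and are not real Biograph output.

```python
from collections import defaultdict


def group_hits_by_type(hits):
    """Group search hits by entity_type, each group sorted by descending score."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["entity_type"]].append(hit)
    for group in grouped.values():
        group.sort(key=lambda h: h["score"], reverse=True)
    return dict(grouped)


# Made-up hits in the assumed response shape (placeholder IDs and scores).
sample_hits = [
    {"entity_type": "gene", "name": "BRCA2", "id": "example-gene-2", "score": 0.81},
    {"entity_type": "disease", "name": "Hereditary breast ovarian cancer syndrome",
     "id": "example-disease-1", "score": 0.95},
    {"entity_type": "gene", "name": "BRCA1", "id": "example-gene-1", "score": 0.90},
]

grouped = group_hits_by_type(sample_hits)
for entity_type, group in grouped.items():
    print(entity_type, [hit["name"] for hit in group])
```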
```python
result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "search",
        "query": "hereditary breast ovarian cancer",
        "entity_types": ["gene", "disease", "variant"],
        "limit": 15,
    },
)

for hit in result["results"]:
    print(
        f"[{hit['entity_type']:8s}] {hit['name']:<30s} "
        f"id={hit['id']} score={hit['score']:.2f}"
    )
```

Query mode: network
Traverse the knowledge graph starting from one or more seed entities. Returns the sub-graph (nodes + edges) up to a given depth — ideal for building interaction networks, drug target mapping, or disease mechanism exploration.
biograph
Parameters
| Parameter | Type | Description |
|---|---|---|
| mode * | string | network |
| seeds * | object[] | List of seed entities, each with entity and entity_type. Max 5 seeds. |
| depth | integer | Traversal hops from seeds (1–3, default 2). Larger values return exponentially more nodes. |
| relation_types | string[] | Limit traversal to specific edge types (e.g. ["TARGETS", "ASSOCIATED_WITH"]). Default: all. |
| max_nodes | integer | Cap on returned nodes (default 200, max 500). Prevents oversized graphs. |
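A network response is a sub-graph of nodes and edges, so ordinary graph algorithms apply to it directly. As a rough sketch, the function below runs a breadth-first search to find how many hops each node lies from a chosen start node. It assumes every edge has `source` and `target` node IDs (the field names used in the drug repurposing example on this page) and treats edges as undirected, which is a simplification. The sample edges use placeholder IDs.

```python
from collections import defaultdict, deque


def hop_distances(edges, start_id):
    """Breadth-first search from start_id, treating every edge as undirected."""
    neighbors = defaultdict(set)
    for edge in edges:
        neighbors[edge["source"]].add(edge["target"])
        neighbors[edge["target"]].add(edge["source"])

    distances = {start_id: 0}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


# Placeholder edges in the assumed response shape (not real Biograph output).
sample_edges = [
    {"source": "gene-1", "target": "protein-1", "type": "ENCODES"},
    {"source": "drug-1", "target": "protein-1", "type": "TARGETS"},
    {"source": "drug-1", "target": "disease-1", "type": "TREATS"},
]

print(hop_distances(sample_edges, "gene-1"))
# {'gene-1': 0, 'protein-1': 1, 'drug-1': 2, 'disease-1': 3}
```

Hop counts like these make it easy to see which nodes only enter the result at the larger depth values.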
```python
from collections import Counter

# Map the drug-target network around an oncogene
result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "network",
        "seeds": [
            {"entity": "KRAS", "entity_type": "gene"},
            {"entity": "EGFR", "entity_type": "gene"},
        ],
        "depth": 2,
        "relation_types": ["TARGETS", "ASSOCIATED_WITH", "ENCODES"],
        "max_nodes": 150,
    },
)

nodes = result["nodes"]
edges = result["edges"]
print(f"Network: {len(nodes)} nodes, {len(edges)} edges")

# Count by entity type
type_counts = Counter(n["entity_type"] for n in nodes)
for entity_type, count in type_counts.most_common():
    print(f"  {entity_type}: {count}")

# Find drugs in the network
drugs = [n for n in nodes if n["entity_type"] == "drug"]
print(f"\nDrugs found: {[d['name'] for d in drugs[:10]]}")
```

Relationship types
Use these values in relation_types to filter traversals.
| Relationship | Source → Target | Meaning |
|---|---|---|
| ENCODES | gene → protein | Gene encodes a protein product |
| ASSOCIATED_WITH | gene / variant → disease | Genetic association from GWAS or OMIM |
| AFFECTS | variant → gene / protein | Variant has functional effect (ClinVar significance) |
| TARGETS | drug → protein / gene | Drug acts on this molecular target (DrugBank / ChEMBL) |
| TREATS | drug → disease | Approved or investigational treatment indication |
| PARTICIPATES_IN | gene / protein → pathway | Gene/protein is a member of a biological pathway |
| INTERACTS_WITH | protein ↔ protein | Physical protein-protein interaction (STRING / IntAct) |
| REGULATES | gene → gene | Transcriptional regulation relationship |
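The table can be transcribed into a small lookup structure to sanity-check relation_types filters before sending a request. This is only a sketch: `RELATION_SCHEMA`, `relations_touching`, and `unknown_relation_types` are illustrative names, not part of any SDK, and the schema simply mirrors the source and target columns above.

```python
# (source types, target types) per relationship, transcribed from the table.
RELATION_SCHEMA = {
    "ENCODES": ({"gene"}, {"protein"}),
    "ASSOCIATED_WITH": ({"gene", "variant"}, {"disease"}),
    "AFFECTS": ({"variant"}, {"gene", "protein"}),
    "TARGETS": ({"drug"}, {"protein", "gene"}),
    "TREATS": ({"drug"}, {"disease"}),
    "PARTICIPATES_IN": ({"gene", "protein"}, {"pathway"}),
    "INTERACTS_WITH": ({"protein"}, {"protein"}),
    "REGULATES": ({"gene"}, {"gene"}),
}


def relations_touching(entity_type):
    """Relationship types that list entity_type as a source or a target."""
    return sorted(
        name
        for name, (sources, targets) in RELATION_SCHEMA.items()
        if entity_type in sources or entity_type in targets
    )


def unknown_relation_types(relation_types):
    """Return any requested relationship names that are not in the table."""
    return [r for r in relation_types if r not in RELATION_SCHEMA]


print(relations_touching("drug"))                      # ['TARGETS', 'TREATS']
print(unknown_relation_types(["TARGETS", "BINDS"]))    # ['BINDS']
```

For example, `relations_touching("pathway")` returns only PARTICIPATES_IN, so a traversal seeded at a pathway needs that edge type in its filter to reach anything else.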
Use cases
Drug repurposing — find drugs targeting a disease pathway
Traverse from a disease through its associated genes, then follow TARGETS edges to find approved drugs that hit those genes — surfacing repurposing candidates.
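One possible way to rank what this traversal surfaces is to count, for each drug, how many disease-associated genes (or the proteins they encode) it targets, while setting aside drugs that already have a TREATS edge to the seed disease. The sketch below works on an `edges` list in the assumed shape of a network result (`source`, `target`, `type`) and follows the edge directions given in the relationship table. `rank_candidates` is a hypothetical helper, the scoring rule is an illustrative choice rather than Biograph's own method, and the sample edges use placeholder IDs.

```python
from collections import Counter


def rank_candidates(edges, disease_id):
    """Rank drugs by how many disease-linked targets they hit, skipping known treatments."""
    # Genes or variants directly associated with the disease.
    associated = {
        e["source"] for e in edges
        if e["type"] == "ASSOCIATED_WITH" and e["target"] == disease_id
    }
    # Proteins encoded by those genes.
    proteins = {
        e["target"] for e in edges
        if e["type"] == "ENCODES" and e["source"] in associated
    }
    relevant = associated | proteins

    # Drugs that already treat the disease are not repurposing candidates.
    treating = {
        e["source"] for e in edges
        if e["type"] == "TREATS" and e["target"] == disease_id
    }

    # Use a set so duplicate edges do not inflate the counts.
    hits = {
        (e["source"], e["target"]) for e in edges
        if e["type"] == "TARGETS"
        and e["target"] in relevant
        and e["source"] not in treating
    }
    return Counter(drug for drug, _ in hits).most_common()


# Placeholder edges (not real Biograph output).
sample_edges = [
    {"source": "gene-1", "target": "disease-1", "type": "ASSOCIATED_WITH"},
    {"source": "gene-2", "target": "disease-1", "type": "ASSOCIATED_WITH"},
    {"source": "gene-1", "target": "protein-1", "type": "ENCODES"},
    {"source": "drug-A", "target": "protein-1", "type": "TARGETS"},
    {"source": "drug-A", "target": "gene-2", "type": "TARGETS"},
    {"source": "drug-B", "target": "gene-2", "type": "TARGETS"},
    {"source": "drug-C", "target": "protein-1", "type": "TARGETS"},
    {"source": "drug-C", "target": "disease-1", "type": "TREATS"},
    {"source": "drug-D", "target": "gene-9", "type": "TARGETS"},
]

print(rank_candidates(sample_edges, "disease-1"))
# [('drug-A', 2), ('drug-B', 1)]
```

Here drug-C is dropped because it already treats the disease, and drug-D is dropped because its target is not linked to the disease.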
```python
# Start from a disease, discover drug repurposing candidates
result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "network",
        "seeds": [{"entity": "Pancreatic cancer", "entity_type": "disease"}],
        "depth": 3,
        "relation_types": ["ASSOCIATED_WITH", "ENCODES", "TARGETS", "TREATS"],
    },
)

# Index nodes by ID so edge endpoints can be resolved to names
nodes_by_id = {n["id"]: n for n in result["nodes"]}

# Extract drugs connected to this disease network
drugs = {n["id"]: n["name"] for n in result["nodes"] if n["entity_type"] == "drug"}

# Find edges where drugs target genes or proteins in the network
drug_targets = [
    e for e in result["edges"]
    if e["type"] == "TARGETS" and e["source"] in drugs
]

print("Drug repurposing candidates:")
for edge in drug_targets[:10]:
    drug_name = drugs[edge["source"]]
    target_id = edge["target"]
    target_node = nodes_by_id.get(target_id, {})
    print(f"  {drug_name:25s} → targets {target_node.get('name', target_id)}")
```

Let the agent traverse Biograph automatically
When you use the Query endpoint, the agent will automatically use Biograph when your prompt involves relationships between biological entities — combining it with literature (PubMed), clinical (ClinVar), and structural (PDB) data as needed.
```python
response = client.query.run(
    "What genes are associated with Alzheimer's disease and which drugs target them?",
)

# The agent queries Biograph, enriches with literature evidence from PubMed,
# and returns a structured summary with references.
print(response.answer)
```