Biograph

Proprietary

smarts.bio's unified biomedical knowledge graph. It connects genes, proteins, diseases, variants, pathways, and drugs into a single queryable graph, powered by Neo4j and curated from dozens of public databases.

How Biograph works

Biograph integrates curated relationships from UniProt, NCBI Gene, OMIM, ClinVar, KEGG, Reactome, DrugBank, and the scientific literature into a Neo4j property graph. You can look up entities, run full-text searches, or traverse multi-hop networks to discover indirect connections that would otherwise require querying a dozen databases separately.

60K+ genes · 26K+ diseases · 1M+ variants

Entity types

Type	Scale	ID formats accepted
gene	60K+	Gene symbol (`BRCA1`), NCBI Gene ID (`672`), Ensembl gene ID
protein	20K+	UniProt accession (`P38398`), UniProt entry name (`BRCA1_HUMAN`)
disease	26K+	OMIM ID (`114480`), MeSH ID, MONDO ID, disease name
variant	1M+	dbSNP rs ID (`rs80357914`), ClinVar variation ID, HGVS notation
pathway	5K+	KEGG pathway ID (`hsa05212`), Reactome ID, pathway name
drug	10K+	DrugBank ID (`DB00619`), drug name, ChEMBL ID
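Several of the ID formats in the table are recognizable from their shape alone, while bare numbers (NCBI Gene, OMIM, ClinVar) and free-text names are not, which is one reason `entity_type` is a required parameter. The sketch below illustrates that distinction. The helper `guess_entity_type` and its patterns are hypothetical and not part of Biograph; they cover only the example formats listed above, assume human KEGG, Reactome, and Ensembl prefixes, and may not match every identifier Biograph accepts.

```python
import re
from typing import Optional

# Illustrative patterns only (assumption): they cover the example formats in
# the table above, not every identifier Biograph may accept.
_ID_PATTERNS = [
    ("variant", re.compile(r"rs\d+")),               # dbSNP rs ID
    ("drug", re.compile(r"DB\d{5}|CHEMBL\d+")),      # DrugBank / ChEMBL
    ("pathway", re.compile(r"hsa\d{5}|R-HSA-\d+")),  # KEGG / Reactome (human)
    ("gene", re.compile(r"ENSG\d{11}")),             # Ensembl gene ID (human)
    ("disease", re.compile(r"MONDO:\d{7}")),         # MONDO ID
    ("protein", re.compile(                          # UniProt accession or entry name
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
        r"|[A-Z0-9]{1,10}_[A-Z0-9]{1,5}"
    )),
]


def guess_entity_type(identifier: str) -> Optional[str]:
    """Return a likely entity_type, or None when the format is ambiguous."""
    text = identifier.strip()
    for entity_type, pattern in _ID_PATTERNS:
        if pattern.fullmatch(text):
            return entity_type
    # Bare numbers, gene symbols, and names need an explicit entity_type.
    return None


for example in ["rs80357914", "DB00619", "P38398", "hsa05212", "672", "BRCA1"]:
    print(f"{example:12s} -> {guess_entity_type(example)}")
```

A `None` result means the caller has to supply `entity_type` explicitly, as the examples on this page do.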

Query mode: `entity-lookup`

Retrieve a single entity and its direct relationships. Best for fetching structured facts about a known entity, such as a gene, protein, disease, or variant.

Tool: `biograph` · Scope: `tools`

Parameters

Parameter	Type	Description
mode *	string	`entity-lookup`
entity *	string	Entity ID or name (gene symbol, UniProt accession, OMIM ID, rs ID, etc.)
entity_type *	string	One of: `gene`, `protein`, `disease`, `variant`, `pathway`, `drug`
relation_types	string[]	Filter to specific relationship types (e.g. `["AFFECTS", "ASSOCIATED_WITH"]`). Default: all.
depth	integer	Relationship traversal depth (1 = direct neighbors only, max 2). Default: 1.

result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "entity-lookup",
        "entity": "BRCA1",
        "entity_type": "gene",
    },
)

entity = result["entity"]
print(f"Gene: {entity['name']} ({entity['id']})")
print(f"Description: {entity['description']}")
print(f"\nRelationships ({len(result['relationships'])} total):")
for rel in result["relationships"][:10]:
    print(f"  [{rel['type']}] → {rel['target']['name']} ({rel['target']['entity_type']})")
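An entity lookup can return many relationships of mixed types, so it is often useful to group them before printing. The following is a minimal sketch that assumes each relationship has the `type` and `target` fields used in the example above. The helper name `group_relationships` is hypothetical, and the sample records are hand-written placeholders, not real Biograph output.

```python
from collections import defaultdict


def group_relationships(relationships):
    """Map relationship type -> list of target names, preserving input order."""
    grouped = defaultdict(list)
    for rel in relationships:
        grouped[rel["type"]].append(rel["target"]["name"])
    return dict(grouped)


# Placeholder records in the shape used above (not real Biograph data).
sample_relationships = [
    {"type": "ENCODES", "target": {"name": "protein A", "entity_type": "protein"}},
    {"type": "ASSOCIATED_WITH", "target": {"name": "disease X", "entity_type": "disease"}},
    {"type": "ASSOCIATED_WITH", "target": {"name": "disease Y", "entity_type": "disease"}},
]

for rel_type, names in group_relationships(sample_relationships).items():
    print(f"{rel_type}: {len(names)} target(s)")
```

With a live response, the same function could be applied to `result["relationships"]`.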

Query mode: `search`

Full-text search across all entity types simultaneously. Returns ranked results with their type, identifiers, and a short description. Useful when you know a term but not its exact ID.

Tool: `biograph`

Parameters

Parameter	Type	Description
mode *	string	`search`
query *	string	Search term, such as a disease name, gene name, or pathway description.
entity_types	string[]	Restrict to specific entity types. Default: all six types.
limit	integer	Max results to return (default 20, max 100).

result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "search",
        "query": "hereditary breast ovarian cancer",
        "entity_types": ["gene", "disease", "variant"],
        "limit": 15,
    },
)

for hit in result["results"]:
    print(
        f"[{hit['entity_type']:8s}]  {hit['name']:<30s}  "
        f"id={hit['id']}  score={hit['score']:.2f}"
    )
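Because search returns hits of several entity types in one ranked list, a common follow-up is to pick the best hit of each type and feed its ID into an `entity-lookup` call. The sketch below assumes the `entity_type`, `id`, and `score` fields shown above, and assumes that a higher `score` means a better match. The helper `best_hit_per_type` is hypothetical, and the sample hits are placeholders, not real results.

```python
def best_hit_per_type(hits):
    """Return {entity_type: highest-scoring hit}.

    Assumption: a higher score means a better match.
    """
    best = {}
    for hit in hits:
        current = best.get(hit["entity_type"])
        if current is None or hit["score"] > current["score"]:
            best[hit["entity_type"]] = hit
    return best


# Placeholder hits in the shape printed above (not real Biograph data).
sample_hits = [
    {"entity_type": "gene", "name": "gene A", "id": "g1", "score": 0.91},
    {"entity_type": "gene", "name": "gene B", "id": "g2", "score": 0.40},
    {"entity_type": "disease", "name": "disease X", "id": "d1", "score": 0.75},
]

for entity_type, hit in best_hit_per_type(sample_hits).items():
    print(f"{entity_type}: {hit['name']} (id={hit['id']})")
```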

Query mode: `network`

Traverse the knowledge graph starting from one or more seed entities. Returns the sub-graph (nodes + edges) up to a given depth — ideal for building interaction networks, drug target mapping, or disease mechanism exploration.

Tool: `biograph`

Parameters

Parameter	Type	Description
mode *	string	`network`
seeds *	object[]	List of seed entities, each with `entity` and `entity_type`. Max 5 seeds.
depth	integer	Traversal hops from seeds (1–3, default 2). Larger values return exponentially more nodes.
relation_types	string[]	Limit traversal to specific edge types (e.g. `["TARGETS", "ASSOCIATED_WITH"]`). Default: all.
max_nodes	integer	Cap on returned nodes (default 200, max 500). Prevents oversized graphs.

# Map the drug-target network around an oncogene
from collections import Counter

result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "network",
        "seeds": [
            {"entity": "KRAS", "entity_type": "gene"},
            {"entity": "EGFR", "entity_type": "gene"},
        ],
        "depth": 2,
        "relation_types": ["TARGETS", "ASSOCIATED_WITH", "ENCODES"],
        "max_nodes": 150,
    },
)

nodes = result["nodes"]
edges = result["edges"]
print(f"Network: {len(nodes)} nodes, {len(edges)} edges")

# Count nodes by entity type
type_counts = Counter(n["entity_type"] for n in nodes)
for entity_type, count in type_counts.most_common():
    print(f"  {entity_type}: {count}")

# Find drugs in the network
drugs = [n for n in nodes if n["entity_type"] == "drug"]
print(f"\nDrugs found: {[d['name'] for d in drugs[:10]]}")
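The limits in the parameter table (at most 5 seeds, `depth` between 1 and 3, `max_nodes` up to 500) can be checked on the client side before a request is sent. The sketch below is one way to do that. The helper `build_network_input` is hypothetical and not part of the smarts.bio SDK, and the server may enforce these limits differently.

```python
ENTITY_TYPES = {"gene", "protein", "disease", "variant", "pathway", "drug"}


def build_network_input(seeds, depth=2, relation_types=None, max_nodes=200):
    """Build a `network` input dict, enforcing the documented limits locally."""
    if not 1 <= len(seeds) <= 5:
        raise ValueError("seeds must contain between 1 and 5 entries")
    for seed in seeds:
        if not seed.get("entity"):
            raise ValueError(f"seed is missing 'entity': {seed!r}")
        if seed.get("entity_type") not in ENTITY_TYPES:
            raise ValueError(f"unknown entity_type in seed: {seed!r}")
    if not 1 <= depth <= 3:
        raise ValueError("depth must be between 1 and 3")
    if not 1 <= max_nodes <= 500:
        raise ValueError("max_nodes must be between 1 and 500")

    payload = {
        "mode": "network",
        "seeds": list(seeds),
        "depth": depth,
        "max_nodes": max_nodes,
    }
    # Omit relation_types entirely to keep the documented default (all types).
    if relation_types:
        payload["relation_types"] = list(relation_types)
    return payload


payload = build_network_input(
    [{"entity": "KRAS", "entity_type": "gene"}],
    relation_types=["TARGETS", "ENCODES"],
    max_nodes=150,
)
print(payload)
```

The returned dict can be passed as `input=` to `client.tools.run`, as in the example above.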

Relationship types

Use these values in `relation_types` to filter traversals.

Relationship	Source → Target	Meaning
ENCODES	gene → protein	Gene encodes a protein product
ASSOCIATED_WITH	gene / variant → disease	Genetic association from GWAS or OMIM
AFFECTS	variant → gene / protein	Variant has functional effect (ClinVar significance)
TARGETS	drug → protein / gene	Drug acts on this molecular target (DrugBank / ChEMBL)
TREATS	drug → disease	Approved or investigational treatment indication
PARTICIPATES_IN	gene / protein → pathway	Gene/protein is a member of a biological pathway
INTERACTS_WITH	protein ↔ protein	Physical protein-protein interaction (STRING / IntAct)
REGULATES	gene → gene	Transcriptional regulation relationship
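When choosing `relation_types` for a traversal, it helps to know which relationships can touch a given entity type at all. The sketch below transcribes the table above into a lookup structure. The names `RELATION_SCHEMA` and `relations_touching` are hypothetical helpers for illustration, not part of the Biograph API.

```python
# Relationship name -> (source entity types, target entity types),
# transcribed from the table above.
RELATION_SCHEMA = {
    "ENCODES": ({"gene"}, {"protein"}),
    "ASSOCIATED_WITH": ({"gene", "variant"}, {"disease"}),
    "AFFECTS": ({"variant"}, {"gene", "protein"}),
    "TARGETS": ({"drug"}, {"protein", "gene"}),
    "TREATS": ({"drug"}, {"disease"}),
    "PARTICIPATES_IN": ({"gene", "protein"}, {"pathway"}),
    "INTERACTS_WITH": ({"protein"}, {"protein"}),
    "REGULATES": ({"gene"}, {"gene"}),
}


def relations_touching(entity_type):
    """Return the sorted relationship names that have entity_type at either end."""
    return sorted(
        name
        for name, (sources, targets) in RELATION_SCHEMA.items()
        if entity_type in sources or entity_type in targets
    )


print(relations_touching("drug"))
print(relations_touching("disease"))
```

For example, a traversal meant to connect drugs to diseases through genes needs at least `TARGETS` and `ASSOCIATED_WITH` in `relation_types`.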

Use cases

Drug repurposing — find drugs targeting a disease pathway

Traverse from a disease through its associated genes, then follow `TARGETS` edges to find drugs that act on those genes or their protein products. Drugs that hit these targets but are not already indicated for the disease are repurposing candidates.

# Start from a disease, discover drug repurposing candidates
result = client.tools.run(
    tool_id="biograph",
    input={
        "mode": "network",
        "seeds": [{"entity": "Pancreatic cancer", "entity_type": "disease"}],
        "depth": 3,
        "relation_types": ["ASSOCIATED_WITH", "ENCODES", "TARGETS", "TREATS"],
    },
)

# Extract drugs connected to this disease network
drugs = {n["id"]: n["name"] for n in result["nodes"] if n["entity_type"] == "drug"}

# Find edges where drugs target genes in the network
drug_targets = [
    e for e in result["edges"]
    if e["type"] == "TARGETS" and e["source"] in drugs
]

# Index nodes by ID once instead of scanning the node list for every edge
nodes_by_id = {n["id"]: n for n in result["nodes"]}

print("Drug repurposing candidates:")
for edge in drug_targets[:10]:
    drug_name = drugs[edge["source"]]
    target_id = edge["target"]
    target_name = nodes_by_id.get(target_id, {}).get("name", target_id)
    print(f"  {drug_name:25s} → targets {target_name}")
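The listing above includes every drug with a `TARGETS` edge, including drugs that already treat the seed disease. One possible refinement, sketched below, drops drugs that have a `TREATS` edge to the disease and ranks the rest by how many network targets they hit. This assumes edges carry node IDs in `source` and `target`, as in the example above, and that the disease node's ID is known. The helper `rank_repurposing_candidates` is hypothetical, ranking by target count is only a rough heuristic, and the sample graph is a placeholder, not real Biograph data.

```python
from collections import defaultdict


def rank_repurposing_candidates(nodes, edges, disease_id):
    """Rank drugs that target network nodes but do not already treat the disease."""
    names = {n["id"]: n["name"] for n in nodes}
    drug_ids = {n["id"] for n in nodes if n["entity_type"] == "drug"}
    already_treats = {
        e["source"] for e in edges
        if e["type"] == "TREATS" and e["target"] == disease_id
    }

    targets_by_drug = defaultdict(set)
    for e in edges:
        if (e["type"] == "TARGETS" and e["source"] in drug_ids
                and e["source"] not in already_treats):
            targets_by_drug[e["source"]].add(e["target"])

    # Most targets first; ties broken alphabetically by drug name.
    ranked = sorted(
        targets_by_drug.items(),
        key=lambda item: (-len(item[1]), names[item[0]]),
    )
    return [
        (names[drug], sorted(names.get(t, t) for t in targets))
        for drug, targets in ranked
    ]


# Placeholder graph in the nodes/edges shape used above (not real data).
sample_nodes = [
    {"id": "d1", "name": "disease X", "entity_type": "disease"},
    {"id": "g1", "name": "gene 1", "entity_type": "gene"},
    {"id": "g2", "name": "gene 2", "entity_type": "gene"},
    {"id": "x1", "name": "drug A", "entity_type": "drug"},
    {"id": "x2", "name": "drug B", "entity_type": "drug"},
    {"id": "x3", "name": "drug C", "entity_type": "drug"},
]
sample_edges = [
    {"type": "ASSOCIATED_WITH", "source": "g1", "target": "d1"},
    {"type": "ASSOCIATED_WITH", "source": "g2", "target": "d1"},
    {"type": "TARGETS", "source": "x1", "target": "g1"},
    {"type": "TARGETS", "source": "x1", "target": "g2"},
    {"type": "TARGETS", "source": "x2", "target": "g1"},
    {"type": "TARGETS", "source": "x3", "target": "g2"},
    {"type": "TREATS", "source": "x3", "target": "d1"},
]

for drug, targets in rank_repurposing_candidates(sample_nodes, sample_edges, "d1"):
    print(f"{drug}: {', '.join(targets)}")
```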

Let the agent traverse Biograph automatically

When you use the Query endpoint, the agent will automatically use Biograph when your prompt involves relationships between biological entities — combining it with literature (PubMed), clinical (ClinVar), and structural (PDB) data as needed.

response = client.query.run(
    "What genes are associated with Alzheimer's disease and which drugs target them?",
)
# The agent queries Biograph, enriches with literature evidence from PubMed,
# and returns a structured summary with references.
print(response.answer)
