
Graphify — codebases & docs to knowledge graphs

A Claude Code skill that turns any folder of code, docs, PDFs, and images into a queryable knowledge graph.

#skills #knowledge-graph #mcp

graphify is a Claude Code skill. Type /graphify in any folder and it reads your files (code, PDFs, markdown, screenshots, diagrams, even images containing text in other languages) and builds a queryable knowledge graph from them. It parses code with tree-sitter, uses Claude (including Claude vision for images) to extract concepts and relationships from everything else, connects the results into one graph, and tags every edge so you can tell what it found from what it inferred. In the example benchmark, queries use roughly 71× fewer tokens than re-reading the raw files, and the graph persists across sessions.

Drop in files: code · docs · PDFs · images
Extract: tree-sitter + Claude
Knowledge graph: god nodes · communities
Use it: query · wiki · export · MCP
Any mix of files in, one connected graph out — then query, visualize, or export it.

Install

Requires Claude Code and Python 3.10+.

  1. Install the package and the skill

    One line installs the CLI and registers the /graphify skill.
    bash
    pip install graphifyy && graphify install

  2. Run it in any directory

    Open Claude Code in a project and build the graph.
    bash
    /graphify .

Naming & PATH

The PyPI package is temporarily named graphifyy (double y) while the graphify name is reclaimed; the CLI and the skill are still called graphify. If pip fails with an “externally-managed” error on macOS, or you hit PATH issues on Windows, install with pipx install graphifyy instead.

What a run produces

Every run writes a graphify-out/ folder you can browse, query, or commit so your team starts from the cached graph:

graphify-out/
graph.html       interactive graph — click nodes, search, filter by community
obsidian/        open as an Obsidian vault
wiki/            Wikipedia-style articles for agent navigation (--wiki)
GRAPH_REPORT.md  god nodes, surprising connections, suggested questions
graph.json       persistent graph — query weeks later without re-reading
cache/           SHA256 cache — re-runs only process changed files
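
The cache/ entry suggests that content hashing drives incremental re-runs. A minimal sketch of that idea in Python, assuming a manifest that maps relative paths to SHA-256 digests; the changed_files helper and the manifest layout are hypothetical illustrations, not graphify's actual cache format:

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, manifest_path: Path) -> list[Path]:
    """Return files under root whose digest differs from the saved manifest.

    The manifest (relative path -> digest) is a hypothetical layout,
    rewritten on every call so the next run only sees newer changes.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new, changed = {}, []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.resolve() == manifest_path.resolve():
            continue  # never hash the manifest itself
        key = path.relative_to(root).as_posix()
        new[key] = file_digest(path)
        if old.get(key) != new[key]:
            changed.append(path)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

On a first call every file counts as changed; on later calls only files whose bytes differ are returned, which is the behavior the cache/ description implies.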

The full command surface

graphify is a single skill: one /graphify command plus a handful of subcommands and flags. Everything it does:

Build & grow the graph

Command            What it does
/graphify [path]   Build a knowledge graph from a folder (defaults to the current directory).
--mode deep        More aggressive extraction with richer INFERRED edges.
--update           Incremental: re-extract only changed files and merge into the existing graph.
add <url>          Fetch a paper, tweet, or web page, save it, and fold it into the graph.

Query it

Command        What it does
query "…"      Ask the graph a question in natural language.
path "A" "B"   Find how two nodes connect to each other.
explain "X"    Explain a node and the relationships around it.
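
One plausible way to picture what path does is a breadth-first search over the graph's edges. The sketch below is plain Python, not graphify's own code; the shortest_path helper and the intermediate node names (Auth, Client) are hypothetical:

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected edge list.

    Returns the list of nodes from start to goal, or None if unreachable.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if start not in neighbors or goal not in neighbors:
        return None
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parents back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(neighbors[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical edges for illustration only.
edges = [("DigestAuth", "Auth"), ("Auth", "Client"), ("Client", "Response")]
print(shortest_path(edges, "DigestAuth", "Response"))
# ['DigestAuth', 'Auth', 'Client', 'Response']
```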

Keep it fresh

Command                 What it does
--watch                 Auto-sync as files change: instant for code, notifies you for docs.
graphify hook install   Post-commit git hook that rebuilds the graph after every commit.

Export & integrate

Command     What it does
--wiki      Build an agent-crawlable wiki: an index.md plus one article per community.
--svg       Export graph.svg.
--graphml   Export graph.graphml for Gephi or yEd.
--neo4j     Generate cypher.txt to load the graph into Neo4j.
--mcp       Start an MCP stdio server so agents can query the graph as a tool.

Quick examples

bash
/graphify ./raw --mode deep        # thorough build with richer inferred edges
/graphify add https://arxiv.org/abs/1706.03762   # pull in a paper
/graphify query "what connects attention to the optimizer?"
/graphify path "DigestAuth" "Response"
/graphify explain "SwinTransformer"

What it reads

Fully multimodal — drop in any mix of file types and it extracts from all of them:

Type     Extensions                                                  Extraction
Code     .py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php   AST via tree-sitter + a call-graph pass
Docs     .md .txt .rst                                               Concepts + relationships via Claude
Papers   .pdf                                                        Citation mining + concept extraction
Images   .png .jpg .webp .gif                                        Claude vision: screenshots, diagrams, any language
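
To make "AST + a call-graph pass" concrete, here is a rough sketch that uses Python's built-in ast module as a stand-in for tree-sitter. It only handles Python source and direct name calls; the call_edges helper and the sample functions are hypothetical, and graphify's real extraction may work quite differently:

```python
import ast


def call_edges(source: str) -> list[tuple[str, str]]:
    """Return sorted (caller, callee) pairs for direct name calls
    made inside top-level functions of the given Python source."""
    tree = ast.parse(source)
    edges = set()
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only plain calls like f(x); method calls such as obj.f(x) are skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return sorted(edges)


sample = """
def load(path):
    return open(path).read()

def build(path):
    text = load(path)
    return parse(text)
"""
print(call_edges(sample))
# [('build', 'load'), ('build', 'parse'), ('load', 'open')]
```

Each pair becomes a candidate edge in the graph: the function definitions are the nodes, and the calls between them are the relationships.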

What you get back

God nodes

The highest-degree concepts — what everything else connects through.
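
As a rough illustration of "highest-degree", the sketch below counts how many edges touch each node and ranks the result. It is plain Python rather than the NetworkX calls graphify presumably uses, and the god_nodes helper and the sample edges are hypothetical:

```python
from collections import Counter


def god_nodes(edges, top_n=3):
    """Rank nodes by degree (number of incident edges), highest first."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree descending, then by name so ties are deterministic.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical edges for illustration only.
sample_edges = [
    ("Client", "Response"),
    ("Client", "Request"),
    ("Client", "Auth"),
    ("Auth", "DigestAuth"),
    ("Response", "Request"),
]
print(god_nodes(sample_edges, top_n=2))
# [('Client', 3), ('Auth', 2)]
```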

Surprising connections

Ranked by a composite score (code↔paper edges rank above code↔code), each with a plain-English why.

Suggested questions

Four or five questions the graph is uniquely positioned to answer.

Token benchmark

Printed after every run — e.g. ~71.5× fewer tokens per query vs reading the raw files.
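
A figure like this is easiest to read as a simple ratio. The sketch below assumes the benchmark divides the tokens needed to read the raw files by the tokens used for one graph query; that definition, the token_reduction helper, and the token counts are all assumptions chosen only to reproduce a 71.5× ratio:

```python
def token_reduction(raw_tokens: int, query_tokens: int) -> float:
    """Return how many times fewer tokens a query uses than the raw files."""
    if query_tokens <= 0:
        raise ValueError("query_tokens must be positive")
    return raw_tokens / query_tokens


# Hypothetical counts: 143,000 tokens of raw files vs 2,000 tokens per query.
print(f"{token_reduction(143_000, 2_000):.1f}x fewer tokens")
# 71.5x fewer tokens
```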

Communities & confidence

Leiden clustering groups related concepts; every edge is tagged EXTRACTED, INFERRED, or AMBIGUOUS.

Stays current

--watch rebuilds as code changes; the git hook rebuilds on every commit.

Honest by design

Because every edge carries a confidence tag, you always know what graphify found in the source versus what it inferred — no silent guessing.
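
A small sketch of how confidence tags might be used downstream: keep only the edges you trust for a given task. The Edge record, its field names, and the sample edges are hypothetical; only the three tag values come from the description above:

```python
from dataclasses import dataclass

TAGS = ("EXTRACTED", "INFERRED", "AMBIGUOUS")


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    confidence: str  # one of TAGS

    def __post_init__(self):
        if self.confidence not in TAGS:
            raise ValueError(f"unknown confidence tag: {self.confidence}")


def by_confidence(edges, *allowed):
    """Return only the edges whose tag is in the allowed set."""
    return [e for e in edges if e.confidence in allowed]


# Hypothetical edges for illustration only.
tagged = [
    Edge("DigestAuth", "Response", "calls", "EXTRACTED"),
    Edge("SwinTransformer", "attention", "implements", "INFERRED"),
    Edge("optimizer", "attention", "related_to", "AMBIGUOUS"),
]
print([e.target for e in by_confidence(tagged, "EXTRACTED")])
# ['Response']
```

Filtering to EXTRACTED gives a conservative view of the source; adding INFERRED widens it at the cost of including edges the model reasoned its way to.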

Under the hood

Built on NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. The graph is built, clustered, and stored locally, with no Neo4j instance or other server to run. Beyond Claude Code, the same skill installs into Cursor, Gemini CLI, Codex, and Copilot CLI via graphify <platform> install, and the --mcp mode exposes the graph to any MCP client.

Claude Code skill · CLI · MCP server · Runs locally
