Capability

Graphify — codebases & docs to knowledge graphs

A Claude Code skill that turns any folder of code, docs, PDFs, and images into a queryable knowledge graph.

#skills #knowledge-graph #mcp

graphify is a Claude Code skill. Type /graphify in any folder and it reads your files (code, PDFs, markdown, screenshots, diagrams, even images containing text in other languages) and builds a queryable knowledge graph from them. It uses Claude vision to extract concepts and relationships, connects everything into one graph, and labels what it found in the source versus what it inferred. Queries use roughly 71× fewer tokens than re-reading the raw files, and the graph persists across sessions.

Drop in files: code · docs · PDFs · images

Extract: tree-sitter + Claude

Knowledge graph: god nodes · communities

Use it: query · wiki · export · MCP

Any mix of files in, one connected graph out — then query, visualize, or export it.

Install

Requires Claude Code and Python 3.10+.

1. Install the package and the skill
One command installs the CLI and registers the /graphify skill.
```bash
pip install graphifyy && graphify install
```
2. Run it in any directory
Open Claude Code in a project and build the graph.
```
/graphify .
```

Naming & PATH

The PyPI package is temporarily named graphifyy (double y) while the graphify name is reclaimed; the CLI and skill are still called graphify. If pip fails with an “externally-managed” error on macOS, or you hit PATH issues on Windows, use pipx install graphifyy instead.

What a run produces

Every run writes a graphify-out/ folder you can browse, query, or commit so your team starts from the cached graph:

graphify-out/

graph.html       interactive graph — click nodes, search, filter by community
obsidian/        open as an Obsidian vault
wiki/            Wikipedia-style articles for agent navigation (--wiki)
GRAPH_REPORT.md  god nodes, surprising connections, suggested questions
graph.json       persistent graph — query weeks later without re-reading
cache/           SHA256 cache — re-runs only process changed files
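
The cache entry above implies content-hash change detection: a file is re-processed only when its bytes change. Here is a minimal sketch of that idea, not graphify's actual implementation. The function names and the manifest shape (a dict mapping path to digest) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large PDFs or images are not loaded all at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(paths, manifest):
    """Return the paths whose digest differs from the stored manifest.

    `manifest` maps str(path) -> digest from the previous run (hypothetical
    shape). It is updated in place so the next call sees the new digests.
    """
    changed = []
    for p in paths:
        digest = file_digest(p)
        if manifest.get(str(p)) != digest:
            changed.append(p)
            manifest[str(p)] = digest
    return changed
```

Hashing content rather than comparing modification times means a fresh checkout or a touched-but-unchanged file does not trigger re-extraction.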

The full command surface

graphify is a single skill with one command, a few subcommands, and a set of flags. Everything it does:

Build & grow the graph

Command	What it does
/graphify [path]	Build a knowledge graph from a folder (defaults to the current directory).
--mode deep	More aggressive extraction with richer INFERRED edges.
--update	Incremental: re-extract only changed files and merge into the existing graph.
add <url>	Fetch a paper, tweet, or web page, save it, and fold it into the graph.

Query it

Command	What it does
query "…"	Ask the graph a question in natural language.
path "A" "B"	Find how two nodes connect to each other.
explain "X"	Explain a node and the relationships around it.
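
To give a feel for what a path query does conceptually, here is a minimal breadth-first search over an undirected edge list. It is a sketch under stated assumptions: the edge list and the intermediate node names are hypothetical, and graphify's own path logic may work differently.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected edge list.

    Returns the list of node names from start to goal, or None if the two
    nodes are not connected (or not in the graph at all).
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    if start not in adjacency or goal not in adjacency:
        return None

    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical edges for illustration only.
edges = [("DigestAuth", "Auth"), ("Auth", "Request"), ("Request", "Response")]
print(shortest_path(edges, "DigestAuth", "Response"))
# ['DigestAuth', 'Auth', 'Request', 'Response']
```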

Keep it fresh

Command	What it does
--watch	Auto-sync as files change — instant for code, notifies you for docs.
graphify hook install	Post-commit git hook that rebuilds the graph after every commit.

Export & integrate

Command	What it does
--wiki	Build an agent-crawlable wiki — an index.md plus one article per community.
--svg	Export graph.svg.
--graphml	Export graph.graphml for Gephi or yEd.
--neo4j	Generate cypher.txt to load the graph into Neo4j.
--mcp	Start an MCP stdio server so agents can query the graph as a tool.
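
For readers unfamiliar with GraphML, the sketch below writes a minimal GraphML document from an edge list using only the Python standard library. It illustrates the general file format that tools like Gephi and yEd read. The function name is hypothetical, and the graph.graphml that graphify produces may carry more attributes than this.

```python
import xml.etree.ElementTree as ET


def to_graphml(edges):
    """Serialize an undirected edge list as a minimal GraphML string."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    # Declare every node once, in a stable order, before the edges.
    for name in sorted({node for edge in edges for node in edge}):
        ET.SubElement(graph, "node", id=name)
    for source, target in edges:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")
```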

Quick examples

bash

/graphify ./raw --mode deep        # thorough build with richer inferred edges
/graphify add https://arxiv.org/abs/1706.03762   # pull in a paper
/graphify query "what connects attention to the optimizer?"
/graphify path "DigestAuth" "Response"
/graphify explain "SwinTransformer"

What it reads

Fully multimodal — drop in any mix of file types and it extracts from all of them:

Type	Extensions	Extraction
Code	.py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php	AST via tree-sitter + a call-graph pass
Docs	.md .txt .rst	Concepts + relationships via Claude
Papers	.pdf	Citation mining + concept extraction
Images	.png .jpg .webp .gif	Claude vision — screenshots, diagrams, any language
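
The table above amounts to a dispatch on file extension. A minimal sketch of that lookup follows; the extension sets are copied from the table, while the function name and the lowercase category keys are assumptions for illustration rather than graphify's internals.

```python
from pathlib import Path

# Extension groups copied from the table; keys mirror its "Type" column.
EXTENSIONS = {
    "code": {".py", ".ts", ".js", ".go", ".rs", ".java", ".c", ".cpp",
             ".rb", ".cs", ".kt", ".scala", ".php"},
    "docs": {".md", ".txt", ".rst"},
    "papers": {".pdf"},
    "images": {".png", ".jpg", ".webp", ".gif"},
}


def classify(path):
    """Return the table category for a file, or None if its extension is not listed."""
    suffix = Path(path).suffix.lower()
    for category, suffixes in EXTENSIONS.items():
        if suffix in suffixes:
            return category
    return None
```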

What you get back

God nodes

The highest-degree concepts — what everything else connects through.
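
Degree here means the number of edges touching a node. As a rough sketch of how highest-degree nodes can be found from an edge list (the function name and `top_n` parameter are hypothetical, and graphify may rank differently):

```python
from collections import Counter


def god_nodes(edges, top_n=3):
    """Return the top_n (node, degree) pairs from an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree descending, then by name so ties come out in a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```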

Surprising connections

Ranked by a composite score (code↔paper edges rank above code↔code), each with a plain-English why.

Token benchmark

Printed after every run — e.g. ~71.5× fewer tokens per query vs reading the raw files.
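
A figure like this is presumably a ratio: tokens needed to read the raw files divided by tokens used by a graph query. The sketch below shows only that arithmetic. How graphify actually measures either quantity is not specified here, and the numbers are illustrative, chosen to reproduce the 71.5× example.

```python
def token_reduction(raw_tokens, query_tokens):
    """Return how many times fewer tokens a graph query uses than the raw files."""
    if query_tokens <= 0:
        raise ValueError("query_tokens must be positive")
    return raw_tokens / query_tokens


# Illustrative numbers only, not measured results.
print(f"{token_reduction(143_000, 2_000):.1f}x fewer tokens")
# 71.5x fewer tokens
```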

Communities & confidence

Leiden clustering groups related concepts; every edge is tagged EXTRACTED, INFERRED, or AMBIGUOUS.

Stays current

--watch rebuilds as code changes; the git hook rebuilds on every commit.

Honest by design

Because every edge carries a confidence tag, you always know what graphify found in the source versus what it inferred — no silent guessing.
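
In practice, confidence tags let you filter the graph by how much you trust each edge. Here is a minimal sketch, assuming each edge is a dict with a "confidence" key holding one of the three tag names. That edge shape and the helper name are hypothetical, not graphify's documented schema.

```python
from enum import Enum


class Confidence(Enum):
    EXTRACTED = "EXTRACTED"   # stated directly in the source
    INFERRED = "INFERRED"     # derived by the model
    AMBIGUOUS = "AMBIGUOUS"   # could not be resolved either way


def edges_with(edges, *allowed):
    """Keep only the edges whose confidence tag is one of `allowed`."""
    wanted = set(allowed)
    return [e for e in edges if Confidence(e["confidence"]) in wanted]
```

For example, `edges_with(edges, Confidence.EXTRACTED)` keeps only relationships found directly in the source, which is a sensible default when you want to avoid acting on inferences.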

Under the hood

Built on NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. It runs entirely locally — no Neo4j, no server. Beyond Claude Code, the same skill installs into Cursor, Gemini CLI, Codex, and Copilot CLI via graphify <platform> install, and the --mcp mode exposes the graph to any MCP client.

Claude Code skill · CLI · MCP server · Runs locally

Learn more

github.com/safishamsi/graphify — the skill, CLI, and worked examples
worked/ — real corpora with their actual graph output and token numbers
ARCHITECTURE.md — module responsibilities and how to add a language