Navigating the world of Harry Potter with Knowledge Graphs

Aim

Are you a Harry Potter fan who wants to have everything about the Harry Potter universe at your fingertips? Or do you simply want to impress your friends with a cool chart of how the different characters in Harry Potter come together? Look no further than knowledge graphs.

This guide will show you how to get a knowledge graph up in Neo4J with just your laptop and your favourite book.

What is a knowledge graph

According to Wikipedia:

A knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data.
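To make the definition concrete, here is a minimal sketch of a graph-structured data model in plain Python. The three facts and the relationship names are illustrative only; they are not taken from the graph built later in this guide.

```python
from collections import defaultdict

# Each fact is a (subject, relationship, object) triple.
# These example facts and relationship names are illustrative.
TRIPLES = [
    ("Harry Potter", "STUDENT_OF", "Hogwarts"),
    ("Albus Dumbledore", "HEADMASTER_OF", "Hogwarts"),
    ("Harry Potter", "FRIEND_OF", "Ron Weasley"),
]


def build_adjacency(triples):
    """Index outgoing edges by subject so a node's neighbours can be looked up."""
    adjacency = defaultdict(list)
    for subject, relation, obj in triples:
        adjacency[subject].append((relation, obj))
    return dict(adjacency)


graph = build_adjacency(TRIPLES)
print(graph["Harry Potter"])
# [('STUDENT_OF', 'Hogwarts'), ('FRIEND_OF', 'Ron Weasley')]
```

A graph database such as Neo4J stores the same kind of nodes and edges, but adds persistence, indexing and a query language on top.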

What do you need

In terms of hardware, all you need is a computer, preferably one with an Nvidia graphics card. To be fully self-sufficient, I will go with a local LLM setup, but you could just as easily use the OpenAI API for the same purpose.

Steps in setting up

You will need the following:

  1. Ollama, and your favourite LLM model
  2. A Python environment
  3. Neo4J

Ollama

I am coding on Ubuntu 24.04 in WSL2, so I run Ollama as a Docker container, which makes GPU passthrough straightforward. First install the Nvidia Container Toolkit, then run the following:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

If you do not have an Nvidia GPU, you can run a CPU-only Ollama using the following command in CLI:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Once you are done, you can pull your favourite LLM model into Ollama. The list of models available on Ollama is here. For example, to pull qwen2.5, run the following command in CLI (ollama run downloads the model if it is not already present, then opens an interactive prompt):

docker exec -it ollama ollama run qwen2.5

And you are done with Ollama!
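If you want to confirm that the container is up and the model was pulled, a small check against Ollama's /api/tags endpoint can help. This is a sketch that assumes the default port 11434 published by the docker run command above; the function names are my own.

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # port published by the docker run command


def model_names(tags_payload):
    """Extract model names from the JSON returned by GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL):
    """Ask the running Ollama server which models it has stored locally."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return model_names(json.load(response))


# Usage, once the container is running:
# print(list_local_models())
```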

Python environment

You will first want to create a Python virtual environment, so that any packages you install and any configuration changes you make stay inside the environment instead of being applied globally. The following command creates a virtual environment called harry-potter-rag:

python -m venv harry-potter-rag

You can then activate the virtual environment using the following command:

source harry-potter-rag/bin/activate

Next, use pip to install the relevant packages, mainly from LangChain:

pip install --upgrade --quiet langchain langchain-community langchain-openai langchain-experimental neo4j

Setting up Neo4J

We will set up Neo4J as a docker container. For ease of setting up with specific configurations, we use docker compose. You may simply copy the following into a file called docker-compose.yaml, and then run docker-compose up -d in the same directory to set up Neo4J.

This setup also ensures data, logs and plugins are persisted in local folders, i.e. /data, /logs and /plugins.
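As a rough sketch of what such a compose file could contain, the Python snippet below renders and writes one. Everything in the template is an assumption on my part rather than the exact file used for this guide: the neo4j:5 image tag, the placeholder password, the APOC plugin settings (LangChain's Neo4J integration relies on APOC) and the extra ./import mount used by the Postscript.

```python
from pathlib import Path

# Hypothetical compose file: image tag, APOC settings and the import mount
# are assumptions, not the original configuration.
COMPOSE_TEMPLATE = """\
services:
  neo4j:
    image: neo4j:5
    container_name: neo4j
    ports:
      - "7474:7474"   # Neo4J browser
      - "7687:7687"   # Bolt protocol, used by the Python driver
    environment:
      - NEO4J_AUTH=neo4j/{password}
      - NEO4J_PLUGINS=["apoc"]
      - NEO4J_apoc_export_file_enabled=true
      - NEO4J_apoc_import_file_enabled=true
    volumes:
      - ./data:/data
      - ./logs:/logs
      - ./plugins:/plugins
      - ./import:/import
"""


def render_compose(password):
    """Fill the password into the template; Neo4j 5 requires at least 8 characters."""
    if len(password) < 8:
        raise ValueError("password must be at least 8 characters long")
    return COMPOSE_TEMPLATE.format(password=password)


def write_compose(password, path="docker-compose.yaml"):
    """Write the rendered file next to where docker-compose will be run."""
    Path(path).write_text(render_compose(password), encoding="utf-8")


# Usage: write_compose("change-me-please"), then run docker-compose up -d
```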

Building the Knowledge Graph

We can now start building the Knowledge Graph in Jupyter Notebook! We first set up an Ollama LLM instance using the following:
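A minimal sketch of this step, assuming the LLM is reached through Ollama's OpenAI-compatible endpoint with the langchain-openai package installed earlier. The helper names are hypothetical, and the model name should match whatever you pulled.

```python
OLLAMA_OPENAI_URL = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint


def llm_settings(model="qwen2.5"):
    """Collect the client settings in one place so they are easy to inspect."""
    return {
        "model": model,
        "base_url": OLLAMA_OPENAI_URL,
        "api_key": "ollama",  # the client requires a value; Ollama does not check it
        "temperature": 0,     # keeps the extraction as repeatable as possible
    }


def build_llm(model="qwen2.5"):
    """Create the chat model; imported lazily so this file loads without LangChain."""
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(**llm_settings(model))


# Usage in the notebook:
# llm = build_llm()
```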

Next, we connect our LLM to Neo4J:
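One way to open the Neo4J connection from the notebook is LangChain's Neo4jGraph wrapper. The sketch below assumes the default Bolt port and a placeholder password; swap in the credentials from your own docker-compose.yaml. The helper names are hypothetical.

```python
import os


def neo4j_settings():
    """Read connection details from the environment, with local defaults."""
    return {
        "url": os.environ.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": os.environ.get("NEO4J_USERNAME", "neo4j"),
        "password": os.environ.get("NEO4J_PASSWORD", "change-me-please"),  # placeholder
    }


def connect_graph():
    """Open the connection; Neo4jGraph expects the APOC plugin on the server."""
    from langchain_community.graphs import Neo4jGraph

    return Neo4jGraph(**neo4j_settings())


# Usage in the notebook:
# graph = connect_graph()
```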

Now, it is time to grab your favourite Harry Potter text, or any favourite book, and use LangChain to split the text into chunks. Chunking is a strategy to break down a long text into parts; we can then send each part to the LLM to convert it into nodes and edges, and insert each chunk's nodes and edges into Neo4J. As a quick primer, nodes are the circles you see on a graph, and each edge joins two nodes together.

The code also prints the first chunk for a quick preview of what the chunks look like.
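The sketch below shows the idea in two forms. chunk_text is a plain-Python illustration of fixed-size chunks with overlap, and split_with_langchain hands the job to LangChain's RecursiveCharacterTextSplitter, which prefers paragraph and sentence boundaries. The chunk size, the overlap and the function names are my own assumptions, not the values used for this guide.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Cut text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not just a repeat of the overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


def split_with_langchain(text, chunk_size=1000, overlap=100):
    """Same idea with LangChain; returns Document objects ready for the LLM."""
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=overlap
    )
    return splitter.create_documents([text])


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

In the notebook you would read the book into a string, call split_with_langchain on it, and print the first element to preview a chunk.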

Now, it is time to let our GPU do the heavy lifting and convert our text into a Knowledge Graph! Before we dive deep into the entire book, let us experiment with prompts to better guide the LLM in returning a graph in the way we want.

Prompts are essentially examples of what we expect, or instructions of what we want to appear in the response. In the context of knowledge graphs, we can instruct the LLM to only extract persons and organisations as nodes, and to only accept certain types of relationships given the entities. For example, we can allow the relationship of spouse to only happen between a person and another person, and not between a person and an organisation.

We can now employ the LLMGraphTransformer on the first chunk of text to see how the graph could turn out. This is a good chance for us to tweak the prompt until the result is to our liking.

The following example expects nodes which could be a Person or Organization, and allowed_relationships specifies the types of relationships that are allowed. To let the LLM capture the variety of the original text, I also set strict_mode to False, so that any other relationships or entities which are not defined below can also be captured. If you instead set strict_mode to True, entities and relationships that do not comply with what is allowed could be either dropped, or forced into what is allowed (which may be inaccurate).
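Here is a sketch of how such a transformer could be configured. The relationship names are examples of my own, and the tuple form of allowed_relationships (source type, relationship, target type) assumes a recent version of langchain-experimental; older versions only accept a plain list of relationship names.

```python
ALLOWED_NODES = ["Person", "Organization"]

# (source node type, relationship, target node type); the names are illustrative.
ALLOWED_RELATIONSHIPS = [
    ("Person", "SPOUSE", "Person"),
    ("Person", "FRIEND", "Person"),
    ("Person", "MEMBER_OF", "Organization"),
]


def build_transformer(llm, strict_mode=False):
    """Wrap the LLM so that it returns nodes and relationships for each chunk."""
    from langchain_experimental.graph_transformers import LLMGraphTransformer

    return LLMGraphTransformer(
        llm=llm,
        allowed_nodes=ALLOWED_NODES,
        allowed_relationships=ALLOWED_RELATIONSHIPS,
        strict_mode=strict_mode,  # False: the schema is a hint, not a filter
    )


def preview_graph(transformer, first_chunk):
    """Convert a single chunk and print what the LLM extracted from it."""
    graph_documents = transformer.convert_to_graph_documents([first_chunk])
    for graph_document in graph_documents:
        print(graph_document.nodes)
        print(graph_document.relationships)
    return graph_documents
```

Running preview_graph on the first chunk, adjusting the schema, and running it again is a cheap way to iterate before committing to the whole book.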

Once you are satisfied with your prompt, it is time to ingest the text into a Knowledge Graph. Note that the try - except explicitly handles any response that could not be properly inserted into Neo4J: the code is designed so that any error is logged, but does not block the loop from moving on to convert subsequent chunks into graph.
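The loop can be sketched as follows. To keep the example self-contained, the conversion and storage steps are passed in as functions; the name ingest_chunks and this calling convention are my own, not the original notebook code.

```python
import logging

logger = logging.getLogger("kg_ingest")


def ingest_chunks(chunks, convert, store):
    """Convert and store chunks one at a time, logging failures instead of stopping.

    Returns the indexes of the chunks that could not be ingested.
    """
    failed = []
    for index, chunk in enumerate(chunks):
        try:
            graph_documents = convert([chunk])
            store(graph_documents)
        except Exception as exc:  # broad on purpose: one bad chunk must not end the run
            logger.warning("chunk %d skipped: %s", index, exc)
            failed.append(index)
    return failed
```

With LangChain, convert could be the transformer's convert_to_graph_documents method and store could be a small function that calls add_graph_documents on the Neo4jGraph object, for example with baseEntityLabel=True and include_source=True so that each chunk is also stored as a Document node. The returned indexes tell you which chunks to retry.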

The loop above took me about 46 minutes to ingest Harry Potter and the Philosopher's Stone, Harry Potter and the Chamber of Secrets, and Harry Potter and the Prisoner of Azkaban. I ended up with 4868 unique nodes! A quick preview is available below. You can see that the graph is really crowded, and it is hard to tell who is related to whom, and in what way.

[Figure: preview of the full knowledge graph in Neo4J]

We can now use Cypher queries to look at, say, Dumbledore!
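A query along these lines could look like the sketch below. It assumes that the extracted characters carry the Person label and that their name sits in the id property, which is where LangChain's graph transformer puts it; the connection defaults are placeholders.

```python
# Cypher: find Person nodes whose id contains the given name, ignoring case.
FIND_PERSON = """
MATCH (p:Person)
WHERE toLower(p.id) CONTAINS toLower($name)
RETURN p
"""


def run_cypher(query, params, uri="bolt://localhost:7687",
               user="neo4j", password="change-me-please"):
    """Run a query with the official neo4j driver and return plain dictionaries."""
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(query, params)]


# Usage:
# run_cypher(FIND_PERSON, {"name": "Dumbledore"})
```

To see the result drawn as a graph, paste the query into the Neo4J browser and replace $name with 'Dumbledore'.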

[Figure: query result showing the Dumbledore node]

OK, now we get just Dumbledore himself. Let's see how he is related to Harry Potter.
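One possible query for the direct relationships between two characters is sketched below, again assuming Person labels and names stored in the id property. The describe helper is a hypothetical convenience for printing the returned rows.

```python
# Cypher: every relationship, in either direction, between two matching people.
RELATIONSHIPS_BETWEEN = """
MATCH (a:Person)-[r]-(b:Person)
WHERE a.id CONTAINS $first AND b.id CONTAINS $second
RETURN startNode(r).id AS source, type(r) AS relation, endNode(r).id AS target
"""


def describe(rows):
    """Turn result rows into readable 'source -[RELATION]-> target' lines."""
    return [
        f"{row['source']} -[{row['relation']}]-> {row['target']}" for row in rows
    ]


# Usage, with rows fetched through the neo4j driver or graph.query(...):
# for line in describe(rows):
#     print(line)
```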

[Figure: query result showing how Dumbledore and Harry Potter are connected]

OK, now we are interested in what Harry and Dumbledore have spoken about.
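If the chunks were stored as Document nodes linked to the entities they mention (which is what LangChain does when include_source=True is used during ingestion), the shared passages can be looked up with a query like the sketch below. The labels, the MENTIONS relationship and the text property follow that convention and are assumptions about your graph.

```python
# Cypher: text chunks that mention both characters.
SHARED_DOCUMENTS = """
MATCH (a:Person)<-[:MENTIONS]-(d:Document)-[:MENTIONS]->(b:Person)
WHERE a.id CONTAINS $first AND b.id CONTAINS $second
RETURN d.text AS text
LIMIT $limit
"""


def shared_document_params(first, second, limit=10):
    """Build the parameter dictionary expected by SHARED_DOCUMENTS."""
    return {"first": first, "second": second, "limit": int(limit)}


# Usage:
# params = shared_document_params("Harry", "Dumbledore", limit=5)
```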

[Figure: query result showing the documents that link Harry and Dumbledore]

We can see that the graph is still really confusing, with many documents to go through to really find what we are looking for. We can see that the modelling of documents as nodes is not ideal, and further work could be done on the LLMGraphTransformer to make the graph more intuitive to use.

Conclusion

You can see how easy it is to set up a Knowledge Graph on your own local computer, without even needing to connect to the internet.

The github repo, which also contains the entire Knowledge Graph of the Harry Potter universe, is available here.

Postscript

To import the harry_potter.graphml file into Neo4J, copy the graphml file into neo4j /import folder, and run the following on the Neo4J browser:
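The import is typically done with the APOC procedure apoc.import.graphml. The sketch below wraps that call in Python; it assumes APOC is installed with file import enabled, that the file sits in the import folder, and that the placeholder credentials are replaced with your own. The same CALL statement can be pasted into the Neo4J browser with the file name in place of $file.

```python
# Cypher: load a GraphML file from the import folder, keeping node labels.
IMPORT_GRAPHML = "CALL apoc.import.graphml($file, {readLabels: true})"


def import_graphml(file_name="harry_potter.graphml", uri="bolt://localhost:7687",
                   user="neo4j", password="change-me-please"):
    """Run the APOC import and return its summary row as a dictionary."""
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            record = session.run(IMPORT_GRAPHML, file=file_name).single()
            return record.data() if record else {}
```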
