Getting started

This guide shows how to install pyjelly and prepare your environment for use with RDFLib.

Installation (with RDFLib)

Install pyjelly from PyPI:

pip install pyjelly[rdflib]

pyjelly requires Python 3.9 or newer and works on all major platforms (Linux, macOS, Windows).
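
To confirm that the prerequisites are in place before continuing, a quick check along the following lines can help. This is a minimal sketch that uses only the standard library; the helper name check_environment is hypothetical and not part of pyjelly or RDFLib.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 9)):
    """Report whether the interpreter and packages look ready for pyjelly.

    Hypothetical helper for illustration; it only checks that the packages
    can be located, without importing them.
    """
    return {
        "python_ok": sys.version_info >= min_version,
        "rdflib_found": importlib.util.find_spec("rdflib") is not None,
        "pyjelly_found": importlib.util.find_spec("pyjelly") is not None,
    }


status = check_environment()
for name, ok in status.items():
    print(f"{name}: {ok}")
```

If any entry prints False, re-run the installation command above in the same environment that runs your script.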

Usage with RDFLib

Once installed, pyjelly integrates with RDFLib automatically. You can immediately serialize and parse .jelly files using the standard RDFLib API.

Serialization

To serialize a graph to the Jelly format:

from rdflib import Graph

g = Graph()
g.parse("http://xmlns.com/foaf/spec/index.rdf")
g.serialize(destination="foaf.jelly", format="jelly")

This creates a delimited Jelly stream using default options.

Parsing

To load RDF data from a .jelly file:

from rdflib import Graph

g = Graph()
g.parse("foaf.jelly", format="jelly")

print("Parsed triples:")
for s, p, o in g:
    print(f"{s} {p} {o}")

RDFLib will reconstruct the graph from the serialized Jelly stream.

Streaming graph parser

To process a Jelly stream frame by frame, loading each frame as a separate RDFLib graph:

# standard-library modules used to fetch and decompress the example Jelly stream
import gzip
import urllib.request
from typing import IO, cast

from rdflib import Dataset, Graph

from pyjelly.integrations.rdflib.parse import parse_jelly_grouped

url = "https://w3id.org/riverbench/datasets/dbpedia-live/dev/files/jelly_10K.jelly.gz"

# load, uncompress .gz file, and pass to jelly parser
with (
    urllib.request.urlopen(url) as resp,
    cast(IO[bytes], gzip.GzipFile(fileobj=resp)) as jelly_stream,
):
    graphs = parse_jelly_grouped(
        jelly_stream,
        graph_factory=lambda: Graph(),
        dataset_factory=lambda: Dataset(),
    )
    for i, graph in enumerate(graphs):
        print(f"Graph {i} in the stream has {len(graph)} triples")

Because parse_jelly_grouped returns a generator, each iteration yields one graph, so only the current frame needs to be held in memory. This makes it practical to process large datasets and live streams.
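
The lazy behaviour can be illustrated without any network access. The sketch below is not part of pyjelly: frame_sizes is a hypothetical helper, and fake_frames is a stand-in generator that yields plain lists where the real parser would yield graphs. It shows that taking only the first few items from a generator never produces the remaining ones.

```python
from itertools import islice


def frame_sizes(graphs, limit=None):
    """Yield (index, size) pairs, consuming at most `limit` items.

    Works with any iterable of sized objects, such as the generator of
    graphs described above. Hypothetical helper for illustration.
    """
    for i, graph in enumerate(islice(graphs, limit)):
        yield i, len(graph)


# Stand-in for a parsed stream: each list plays the role of one frame.
produced = []


def fake_frames():
    for frame in (["t1", "t2"], ["t3"], ["t4", "t5", "t6"]):
        produced.append(frame)  # record which frames were actually produced
        yield frame


sizes = list(frame_sizes(fake_frames(), limit=2))
print(sizes)          # [(0, 2), (1, 1)]
print(len(produced))  # 2: the third frame was never produced
```

Under the same assumption, passing the graphs generator from the example above in place of fake_frames() would stop reading after the requested number of frames.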

File extension support

You can generally omit the format="jelly" parameter if the file ends in .jelly – RDFLib will auto-detect the format:

from rdflib import Graph
import pyjelly.integrations.rdflib

g = Graph()
g.parse("foaf.jelly")

Warning

Unfortunately, the way this is implemented in RDFLib is a bit wonky, so it only works if you explicitly import pyjelly.integrations.rdflib, or if you have already used format="jelly" in an earlier serialize() or parse() call.
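
One way to avoid relying on auto-detection is to always name the format explicitly for .jelly files. The following is a minimal sketch; jelly_parse_kwargs is a hypothetical helper, not something pyjelly or RDFLib provides.

```python
from pathlib import Path


def jelly_parse_kwargs(path):
    """Return extra keyword arguments for Graph.parse() or Graph.serialize().

    Hypothetical helper: it names the format explicitly for .jelly files so
    the call does not depend on extension-based detection.
    """
    if Path(path).suffix.lower() == ".jelly":
        return {"format": "jelly"}
    return {}


print(jelly_parse_kwargs("foaf.jelly"))  # {'format': 'jelly'}
print(jelly_parse_kwargs("foaf.ttl"))    # {}
```

With this in place, a call such as g.parse(path, **jelly_parse_kwargs(path)) behaves the same whether or not pyjelly.integrations.rdflib has been imported beforehand.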