API reference
pyjelly
Modules:
| Name | Description |
|---|---|
| errors | |
| integrations | |
| jelly | |
| options | |
| parse | |
| serialize | |
errors
Classes:
| Name | Description |
|---|---|
| JellyConformanceError | Raised when Jelly conformance is violated. |
| JellyAssertionError | Raised when a recommended assertion from the specification fails. |
| JellyNotImplementedError | Raised when a future feature is not yet implemented. |
JellyConformanceError
Bases: Exception
Raised when Jelly conformance is violated.
JellyAssertionError
Bases: AssertionError
Raised when a recommended assertion from the specification fails.
JellyNotImplementedError
Bases: NotImplementedError
Raised when a future feature is not yet implemented.
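Because each exception derives from a different built-in base, a handler written for the built-in type also catches the Jelly-specific one. The sketch below illustrates this with stand-in classes that only mirror the documented bases (they are not imported from pyjelly), and `classify` is a hypothetical helper introduced for illustration.

```python
# Stand-ins mirroring the documented bases; the real classes live in pyjelly.errors.
class JellyConformanceError(Exception):
    """Raised when Jelly conformance is violated."""


class JellyAssertionError(AssertionError):
    """Raised when a recommended assertion from the specification fails."""


class JellyNotImplementedError(NotImplementedError):
    """Raised when a future feature is not yet implemented."""


def classify(exc: BaseException) -> str:
    """Hypothetical helper: map an exception to a coarse category."""
    if isinstance(exc, JellyConformanceError):
        return "conformance"
    # The built-in bases also match the Jelly subclasses.
    if isinstance(exc, AssertionError):
        return "assertion"
    if isinstance(exc, NotImplementedError):
        return "not-implemented"
    return "other"
```

A caller that already handles `AssertionError` or `NotImplementedError` therefore needs no extra `except` clause for the Jelly variants, while conformance violations need their own handler.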
integrations
Modules:
| Name | Description |
|---|---|
| generic | |
| rdflib | |
generic
Modules:
| Name | Description |
|---|---|
| generic_sink | |
| parse | |
| serialize | |
generic_sink
Classes:
| Name | Description |
|---|---|
| BlankNode | Class for blank nodes, storing BN's identifier as a string. |
| IRI | Class for IRIs, storing IRI as a string. |
| Literal | Class for literals. |
| Triple | Class for RDF triples. |
| Quad | Class for RDF quads. |
| Prefix | Class for generic namespace declaration. |
| GenericStatementSink | |
BlankNode(identifier)
IRI(iri)
Literal(lex, langtag=None, datatype=None)
Class for literals.
Notes: A literal consists of a lexical form and an optional language tag and datatype. All parts of the literal are stored as strings.
Source code in pyjelly/integrations/generic/generic_sink.py
Triple
Quad
Prefix
GenericStatementSink(identifier=DefaultGraph)
Notes: _store preserves the order of statements.
Args: identifier (str, optional): Identifier for a sink. Defaults to DefaultGraph.
Attributes:
| Name | Type | Description |
|---|---|---|
| is_triples_sink | bool | Check if the sink contains triples or quads. |
Source code in pyjelly/integrations/generic/generic_sink.py
is_triples_sink
Check if the sink contains triples or quads.
Returns: bool: True if the length of a statement is 3.
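The arity check described above can be sketched with a small stand-in class. `MiniSink` is hypothetical and is not the pyjelly implementation; in particular, treating an empty sink as "not triples" is an assumption of this sketch.

```python
class MiniSink:
    """Hypothetical stand-in for an order-preserving statement sink."""

    def __init__(self) -> None:
        # A list keeps statements in insertion order.
        self._store: list = []

    def add(self, statement: tuple) -> None:
        self._store.append(statement)

    @property
    def is_triples_sink(self) -> bool:
        # Assumption of this sketch: an empty sink is reported as not holding triples.
        return bool(self._store) and len(self._store[0]) == 3


triples = MiniSink()
triples.add(("s", "p", "o"))

quads = MiniSink()
quads.add(("s", "p", "o", "g"))
```

Here `triples.is_triples_sink` is true because the stored statement has three terms, while `quads.is_triples_sink` is false.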
parse
Classes:
| Name | Description |
|---|---|
| GenericStatementSinkAdapter | Implement Adapter for generic statements. |
| GenericTriplesAdapter | Triples adapted implementation for GenericStatementSink. |
| GenericQuadsAdapter | Extends GenericQuadsBaseAdapter for QUADS physical type. |
| GenericGraphsAdapter | Extends GenericQuadsBaseAdapter for GRAPHS physical type. |
Functions:
| Name | Description |
|---|---|
| parse_triples_stream | Parse flat triple stream. |
| parse_quads_stream | Parse flat quads stream. |
| parse_jelly_grouped | Take a jelly file and return generators of generic statements sinks. |
| parse_jelly_to_graph | Add statements from Generator to GenericStatementSink. |
| parse_jelly_flat | Parse jelly file with FLAT logical type into a Generator of stream events. |
GenericStatementSinkAdapter(*args, **kwargs)
Bases: Adapter
Implement Adapter for generic statements.
Notes: Returns custom RDF terms expected by GenericStatementSink, handles namespace declarations, and quoted triples.
Args: Adapter (type): base Adapter class
GenericTriplesAdapter(options)
Bases: GenericStatementSinkAdapter
Triples adapted implementation for GenericStatementSink.
Args: GenericStatementSinkAdapter (type): base GenericStatementSink adapter implementation that handles terms and namespaces.
Source code in pyjelly/integrations/generic/parse.py
GenericQuadsAdapter(options)
Bases: GenericQuadsBaseAdapter
Extends GenericQuadsBaseAdapter for QUADS physical type.
Args: GenericQuadsBaseAdapter (type): quads adapter that handles base quads processing.
Source code in pyjelly/integrations/generic/parse.py
GenericGraphsAdapter(options)
Bases: GenericQuadsBaseAdapter
Extends GenericQuadsBaseAdapter for GRAPHS physical type.
Notes: introduces graph start/end, checks if graph exists.
Args: GenericQuadsBaseAdapter (type): quads adapter that handles base quads processing.
Raises: JellyConformanceError: raised if graph start message was not received.
Source code in pyjelly/integrations/generic/parse.py
parse_triples_stream(frames, options, frame_metadata=None)
Parse flat triple stream.
Args:
- frames (Iterable[jelly.RdfStreamFrame]): iterator over stream frames.
- options (ParserOptions): stream options.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Yields: Generator[Iterable[Triple | Prefix]]: Generator of iterables of Triple or Prefix objects, one iterable per frame.
Source code in pyjelly/integrations/generic/parse.py
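The `frame_metadata` argument is a context variable through which per-frame metadata can be read while iterating. The sketch below shows that general pattern using only the standard `contextvars` module. `FakeFrame`, `iter_frames`, and the `"source"` key are hypothetical stand-ins and do not reflect how pyjelly populates the variable internally.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Iterable, Iterator, Optional


@dataclass
class FakeFrame:
    """Hypothetical stand-in for a stream frame with rows and metadata."""

    rows: list
    metadata: dict = field(default_factory=dict)


def iter_frames(
    frames: Iterable[FakeFrame],
    frame_metadata: Optional[ContextVar] = None,
) -> Iterator[list]:
    """Yield the rows of each frame, publishing its metadata first."""
    for frame in frames:
        if frame_metadata is not None:
            # Make the current frame's metadata visible to the consumer.
            frame_metadata.set(frame.metadata)
        yield frame.rows


metadata_var: ContextVar = ContextVar("frame_metadata", default={})
frames = [
    FakeFrame(rows=["t1", "t2"], metadata={"source": b"a"}),
    FakeFrame(rows=["t3"], metadata={"source": b"b"}),
]

# Read the metadata that belongs to the frame currently being consumed.
seen = [(rows, metadata_var.get()["source"]) for rows in iter_frames(frames, metadata_var)]
```

A synchronous generator runs in the context of whoever advances it, so the value set inside `iter_frames` is visible to the consuming loop.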
parse_quads_stream(frames, options, frame_metadata=None)
Parse flat quads stream.
Args:
- frames (Iterable[jelly.RdfStreamFrame]): iterator over stream frames.
- options (ParserOptions): stream options.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Yields: Generator[Iterable[Quad | Prefix]]: Generator of iterables of Quad or Prefix objects, one iterable per frame.
Source code in pyjelly/integrations/generic/parse.py
parse_jelly_grouped(inp, sink_factory=lambda: GenericStatementSink(), *, logical_type_strict=False, frame_metadata=None)
Take a jelly file and return generators of generic statements sinks.
Yields one generic statements sink per frame.
Args:
- inp (IO[bytes]): input jelly buffered binary stream.
- sink_factory (Callable): lambda to construct a statement sink. By default, creates an empty in-memory GenericStatementSink.
- logical_type_strict (bool): If True, validate the logical type in stream options and require a grouped logical type. Otherwise, only the physical type is used to route parsing.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Raises: NotImplementedError: is raised if a physical type is not implemented
Yields: Generator[GenericStatementSink]: returns generators for GenericStatementSink, regardless of stream type.
Source code in pyjelly/integrations/generic/parse.py
parse_jelly_to_graph(inp, sink_factory=lambda: GenericStatementSink())
Add statements from Generator to GenericStatementSink.
Args:
- inp (IO[bytes]): input jelly stream.
- sink_factory (Callable[[], GenericStatementSink]): factory to create a statement sink. By default, creates an empty in-memory GenericStatementSink. The sink makes no distinction between datasets and graphs; both use the same underlying data structures.
Returns: GenericStatementSink: GenericStatementSink with statements.
Source code in pyjelly/integrations/generic/parse.py
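The factory argument lets the caller decide what the statements are collected into. A minimal sketch of that pattern follows, with a plain list standing in for the sink. `collect` and the `Statement` alias are hypothetical names used only for this illustration.

```python
from typing import Callable, Iterable, List, Tuple

Statement = Tuple[str, ...]  # hypothetical stand-in for a triple or quad


def collect(
    statements: Iterable[Statement],
    sink_factory: Callable[[], List[Statement]] = list,
) -> List[Statement]:
    """Create a sink via the factory and add every statement to it."""
    sink = sink_factory()
    for statement in statements:
        sink.append(statement)
    return sink


# Default factory: a fresh, empty sink.
fresh = collect([("s", "p", "o")])

# Custom factory: start from a sink that already holds a statement.
prefilled = collect(
    [("s2", "p2", "o2")],
    sink_factory=lambda: [("s0", "p0", "o0")],
)
```

Passing a callable rather than an instance means each call gets its own sink unless the caller deliberately shares one.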
parse_jelly_flat(inp, frames=None, options=None, *, logical_type_strict=False)
Parse jelly file with FLAT logical type into a Generator of stream events.
Args:
- inp (IO[bytes]): input jelly buffered binary stream.
- frames (Iterable[jelly.RdfStreamFrame] | None): jelly frames, if read before.
- options (ParserOptions | None): stream options, if read before.
- logical_type_strict (bool): If True, validate the logical type in stream options and require FLAT_(TRIPLES|QUADS). Otherwise, only the physical type is used to route parsing.
Raises: NotImplementedError: if physical type is not supported
Yields: Generator[Statement | Prefix]: Generator of stream events
Source code in pyjelly/integrations/generic/parse.py
serialize
Modules:
| Name | Description |
|---|---|
| jelly | Generated protocol buffer code. |
Classes:
| Name | Description |
|---|---|
| BlankNode | Class for blank nodes, storing BN's identifier as a string. |
| GenericSinkTermEncoder | |
| GenericStatementSink | |
| GraphStream | |
| IRI | Class for IRIs, storing IRI as a string. |
| Literal | Class for literals. |
| Quad | Class for RDF quads. |
| QuadStream | |
| Slot | Enumeration of term positions in a statement. |
| Stream | |
| TermEncoder | |
| Triple | Class for RDF triples. |
| TripleStream | |
Functions:
| Name | Description |
|---|---|
| stream_frames | |
Attributes:
| Name | Type | Description |
|---|---|---|
| HasGraph | | Union of the Protobuf messages that carry a graph term. |
| QUAD_ARITY | int | Number of terms in a quad. |
| Statement | | Union of the Protobuf triple and quad messages. |
HasGraph = rdf_pb2.RdfQuad | rdf_pb2.RdfGraphStart
Union of the Protobuf messages that carry a graph term.
QUAD_ARITY = 4
Number of terms in a quad.
Statement = rdf_pb2.RdfQuad | rdf_pb2.RdfTriple
Union of the Protobuf triple and quad messages.
BlankNode(identifier)
GenericSinkTermEncoder(*args, **kwargs)
Bases: pyjelly.serialize.encode.TermEncoder
GenericStatementSink(identifier=DefaultGraph)
Notes: _store preserves the order of statements.
Args: identifier (str, optional): Identifier for a sink. Defaults to DefaultGraph.
Attributes:
| Name | Type | Description |
|---|---|---|
| is_triples_sink | bool | Check if the sink contains triples or quads. |
Source code in pyjelly/integrations/generic/generic_sink.py
is_triples_sink
Check if the sink contains triples or quads.
Returns: bool: True if the length of a statement is 3.
GraphStream(*, encoder, options=None)
Bases: TripleStream
Methods:
| Name | Description |
|---|---|
| graph | Process one graph into a sequence of jelly frames. |
Source code in pyjelly/serialize/streams.py
graph(graph_id, graph)
Process one graph into a sequence of jelly frames.
Args:
- graph_id (object): graph id (BN, Literal, IRI, default).
- graph (Iterable[Iterable[object]]): iterable of triples (graph's content).
Yields: Generator[jelly.RdfStreamFrame]: jelly frames.
Source code in pyjelly/serialize/streams.py
IRI(iri)
Literal(lex, langtag=None, datatype=None)
Class for literals.
Notes: A literal consists of a lexical form and an optional language tag and datatype. All parts of the literal are stored as strings.
Source code in pyjelly/integrations/generic/generic_sink.py
Quad
QuadStream(*, encoder, options=None)
Bases: Stream
Methods:
| Name | Description |
|---|---|
| quad | Process one quad to Protobuf messages. |
Source code in pyjelly/serialize/streams.py
quad(terms)
Process one quad to Protobuf messages.
Args: terms (Iterable[object]): terms to encode.
Returns: jelly.RdfStreamFrame | None: stream frame if flow supports frames slicing and current flow is full
Source code in pyjelly/serialize/streams.py
Slot
Bases: enum.IntEnum
Enumeration of the positions a term can take in a statement.
Attributes:
| Name | Type | Description |
|---|---|---|
| subject | Slot | Subject position (value 0). |
| predicate | Slot | Predicate position (value 1). |
| object | Slot | Object position (value 2). |
| graph | Slot | Graph position (value 3). |
subject = <Slot.subject: 0>
predicate = <Slot.predicate: 1>
object = <Slot.object: 2>
graph = <Slot.graph: 3>
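Since `Slot` is an `IntEnum`, its members double as positional indices into a statement. The sketch below re-declares a stand-in enum with the documented values (it is not imported from pyjelly) and uses a hypothetical `label_terms` helper to pair each term with its slot.

```python
from enum import IntEnum


class Slot(IntEnum):
    """Stand-in mirroring the documented member values."""

    subject = 0
    predicate = 1
    object = 2
    graph = 3


def label_terms(terms):
    """Hypothetical helper: pair each term of a triple or quad with its slot name."""
    return [(Slot(index).name, term) for index, term in enumerate(terms)]


triple_labels = label_terms(("s", "p", "o"))
quad_labels = label_terms(("s", "p", "o", "g"))
```

A triple only reaches the `object` slot, while a quad additionally fills `graph`.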
Stream(*, encoder, options=None)
Methods:
| Name | Description |
|---|---|
| infer_flow | Return flow based on the stream options provided. |
| enroll | Initialize start of the stream. |
| stream_options | Encode and append stream options row to the current flow. |
| namespace_declaration | Add namespace declaration to jelly stream. |
| for_rdflib | Initialize stream with RDFLib encoder. |
Source code in pyjelly/serialize/streams.py
infer_flow()
Return flow based on the stream options provided.
Returns: FrameFlow: initialised FrameFlow object.
Source code in pyjelly/serialize/streams.py
enroll()
stream_options()
Encode and append stream options row to the current flow.
Source code in pyjelly/serialize/streams.py
namespace_declaration(name, iri)
Add namespace declaration to jelly stream.
Args:
- name (str): namespace prefix label.
- iri (str): namespace IRI.
Source code in pyjelly/serialize/streams.py
for_rdflib(options=None)
Initialize stream with RDFLib encoder.
Args: options (SerializerOptions | None, optional): Stream options. Defaults to None.
Raises: TypeError: if called on Stream itself rather than on a Stream for a specific physical type.
Returns: Stream: initialized stream with RDFLib encoder.
Source code in pyjelly/serialize/streams.py
TermEncoder(*args, **kwargs)
Triple
TripleStream(*, encoder, options=None)
Bases: Stream
Methods:
| Name | Description |
|---|---|
| triple | Process one triple to Protobuf messages. |
Source code in pyjelly/serialize/streams.py
triple(terms)
Process one triple to Protobuf messages.
Note: Adds new rows to the current flow and returns StreamFrame if frame size conditions are met.
Args: terms (Iterable[object]): RDF terms to encode.
Returns: jelly.RdfStreamFrame | None: stream frame if flow supports frames slicing and current flow is full
Source code in pyjelly/serialize/streams.py
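The "return a frame only when the current flow is full" behaviour can be sketched as follows. `BoundedFlow` and `frames_from` are hypothetical stand-ins written for this illustration; the frame size and flushing rules of the real flows are not taken from pyjelly.

```python
from typing import Iterable, Iterator, List, Optional


class BoundedFlow:
    """Hypothetical flow that hands back a frame once it holds `size` rows."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.rows: List[tuple] = []

    def append(self, row: tuple) -> Optional[List[tuple]]:
        """Add a row; return a full frame, or None if there is still room."""
        self.rows.append(row)
        if len(self.rows) >= self.size:
            return self.flush()
        return None

    def flush(self) -> Optional[List[tuple]]:
        """Return the buffered rows as a frame, or None if nothing is buffered."""
        if not self.rows:
            return None
        frame, self.rows = self.rows, []
        return frame


def frames_from(triples: Iterable[tuple], flow: BoundedFlow) -> Iterator[List[tuple]]:
    """Yield frames as they fill up, then whatever remains at the end."""
    for triple in triples:
        frame = flow.append(triple)
        if frame is not None:
            yield frame
    remainder = flow.flush()
    if remainder is not None:
        yield remainder


frame_sizes = [len(f) for f in frames_from([("s", "p", "o")] * 5, BoundedFlow(size=2))]
```

With five triples and a frame size of two, the loop emits two full frames and one final partial frame.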
stream_frames()
jelly
Generated protocol buffer code.
rdflib
Modules:
| Name | Description |
|---|---|
| parse | |
| serialize | |
Functions:
| Name | Description |
|---|---|
| register_extension_to_rdflib | Make rdflib.util.guess_format discover Jelly format. |
register_extension_to_rdflib(extension='.jelly')
Make rdflib.util.guess_format discover Jelly format.
    >>> rdflib.util.guess_format("foo.jelly")
    >>> register_extension_to_rdflib()
    >>> rdflib.util.guess_format("foo.jelly")
    'jelly'
Source code in pyjelly/integrations/rdflib/__init__.py
parse
Classes:
| Name | Description |
|---|---|
| Triple | Describe RDFLib triple. |
| Quad | Describe RDFLib quad. |
| Prefix | Describe RDF Prefix (i.e., namespace declaration). |
| RDFLibAdapter | RDFLib adapter class, extended by the triples and quads implementations. |
| RDFLibTriplesAdapter | Triples adapter RDFLib implementation. |
| RDFLibQuadsAdapter | Extended RDFLib adapter for the QUADS physical type. |
| RDFLibGraphsAdapter | Extension of RDFLibQuadsBaseAdapter for the GRAPHS physical type. |
| RDFLibJellyParser | |
Functions:
| Name | Description |
|---|---|
| parse_triples_stream | Parse flat triple stream. |
| parse_quads_stream | Parse flat quads stream. |
| parse_jelly_grouped | Take jelly file and return generators based on the detected physical type. |
| parse_jelly_to_graph | Add statements from Generator to provided Graph/Dataset. |
| parse_jelly_flat | Parse jelly file with FLAT logical type into a Generator of stream events. |
Triple
Quad
Prefix
RDFLibAdapter(*args, **kwargs)
Bases: Adapter
RDFLib adapter class, extended by the triples and quads implementations.
Args: Adapter (): abstract adapter class
RDFLibTriplesAdapter(options)
RDFLibQuadsAdapter(options)
Bases: RDFLibQuadsBaseAdapter
Extended RDFLib adapter for the QUADS physical type.
Args: RDFLibQuadsBaseAdapter (RDFLibAdapter): base quads adapter (shared with graphs physical type)
Source code in pyjelly/integrations/rdflib/parse.py
RDFLibGraphsAdapter(options)
Bases: RDFLibQuadsBaseAdapter
Extension of RDFLibQuadsBaseAdapter for the GRAPHS physical type.
Notes: introduces graph start/end, checks if graph exists.
Args: RDFLibQuadsBaseAdapter (RDFLibAdapter): base adapter for quads management.
Raises: JellyConformanceError: if no graph_start was encountered
Source code in pyjelly/integrations/rdflib/parse.py
RDFLibJellyParser
Bases: Parser
Methods:
| Name | Description |
|---|---|
| parse | Parse jelly file into provided RDFLib Graph. |
parse(source, sink)
Parse jelly file into provided RDFLib Graph.
Args:
- source (InputSource): jelly file as a buffered binary stream InputSource object.
- sink (Graph): RDFLib Graph.
Raises: TypeError: raises error if invalid input
Source code in pyjelly/integrations/rdflib/parse.py
parse_triples_stream(frames, options, frame_metadata=None)
Parse flat triple stream.
Args:
- frames (Iterable[jelly.RdfStreamFrame]): iterator over stream frames.
- options (ParserOptions): stream options.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Yields: Generator[Iterable[Triple | Prefix]]: Generator of iterables of Triple or Prefix objects, one iterable per frame.
Source code in pyjelly/integrations/rdflib/parse.py
parse_quads_stream(frames, options, frame_metadata=None)
Parse flat quads stream.
Args:
- frames (Iterable[jelly.RdfStreamFrame]): iterator over stream frames.
- options (ParserOptions): stream options.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Yields: Generator[Iterable[Quad | Prefix]]: Generator of iterables of Quad or Prefix objects, one iterable per frame.
Source code in pyjelly/integrations/rdflib/parse.py
parse_jelly_grouped(inp, graph_factory=lambda: Graph(), dataset_factory=lambda: Dataset(), *, logical_type_strict=False, frame_metadata=None)
Take jelly file and return generators based on the detected physical type.
Yields one graph/dataset per frame.
Args:
- inp (IO[bytes]): input jelly buffered binary stream.
- graph_factory (Callable): lambda to construct a Graph. By default creates an empty in-memory Graph, but you can pass something else here.
- dataset_factory (Callable): lambda to construct a Dataset. By default creates an empty in-memory Dataset, but you can pass something else here.
- logical_type_strict (bool): If True, validate the logical type in stream options and require a grouped logical type. Otherwise, only the physical type is used to route parsing.
- frame_metadata (ContextVar[ScalarMap[str, bytes]]): context variable used for extracting frame metadata.
Raises: NotImplementedError: is raised if a physical type is not implemented
Yields: Generator[Graph] | Generator[Dataset]: returns generators for graphs/datasets based on the type of input
Source code in pyjelly/integrations/rdflib/parse.py
parse_jelly_to_graph(inp, graph_factory=lambda: Graph(), dataset_factory=lambda: Dataset())
Add statements from Generator to provided Graph/Dataset.
Args:
- inp (IO[bytes]): input jelly stream.
- graph_factory (Callable[[], Graph]): factory to create Graph. By default creates an empty in-memory Graph, but you can pass something else here.
- dataset_factory (Callable[[], Dataset]): factory to create Dataset. By default creates an empty in-memory Dataset, but you can pass something else here.
Returns: Dataset | Graph: Dataset or Graph with statements.
Source code in pyjelly/integrations/rdflib/parse.py
parse_jelly_flat(inp, frames=None, options=None, *, logical_type_strict=False)
Parse jelly file with FLAT logical type into a Generator of stream events.
Args:
- inp (IO[bytes]): input jelly buffered binary stream.
- frames (Iterable[jelly.RdfStreamFrame] | None): jelly frames, if read before.
- options (ParserOptions | None): stream options, if read before.
- logical_type_strict (bool): If True, validate the logical type in stream options and require FLAT_(TRIPLES|QUADS). Otherwise, only the physical type is used to route parsing.
Raises: NotImplementedError: if physical type is not supported
Yields: Generator[Statement | Prefix]: Generator of stream events
Source code in pyjelly/integrations/rdflib/parse.py
serialize
Classes:
| Name | Description |
|---|---|
| RDFLibTermEncoder | |
| RDFLibJellySerializer | RDFLib serializer for writing graphs in Jelly RDF stream format. |
Functions:
| Name | Description |
|---|---|
| triples_stream_frames | Serialize a Graph/Dataset into jelly frames. |
| quads_stream_frames | Serialize a Dataset into jelly frames. |
| graphs_stream_frames | Serialize a Dataset into jelly frames as a stream of graphs. |
| guess_options | Guess the serializer options based on the store type. |
| guess_stream | Return an appropriate stream implementation for the given options. |
| grouped_stream_to_frames | Transform Graphs/Datasets into Jelly frames, one frame per Graph/Dataset. |
| grouped_stream_to_file | Write stream of Graphs/Datasets to a binary file. |
| flat_stream_to_frames | Serialize a stream of raw triples or quads into Jelly frames. |
| flat_stream_to_file | Write Triple or Quad events to a binary file in Jelly flat format. |
RDFLibTermEncoder(*args, **kwargs)
Bases: TermEncoder
Methods:
| Name | Description |
|---|---|
| encode_spo | Encode s/p/o term based on its RDFLib object. |
| encode_graph | Encode graph name term based on its RDFLib object. |
encode_spo(term, slot, statement)
Encode s/p/o term based on its RDFLib object.
Args:
- term (object): term to encode.
- slot (Slot): its place in statement.
- statement (Statement): Triple/Quad message to fill with s/p/o terms.
Returns: Rows: encoded extra rows
Source code in pyjelly/integrations/rdflib/serialize.py
encode_graph(term, statement)
Encode graph name term based on its RDFLib object.
Args:
- term (object): term to encode.
- statement (HasGraph): Quad/GraphStart message to fill g_{} in.
Returns: Rows: encoded extra rows
Source code in pyjelly/integrations/rdflib/serialize.py
RDFLibJellySerializer(store)
Bases: Serializer
RDFLib serializer for writing graphs in Jelly RDF stream format.
Handles streaming RDF terms into Jelly frames using internal encoders. Supports only graphs and datasets (not quoted graphs).
Methods:
| Name | Description |
|---|---|
| serialize | Serialize self.store content to Jelly format. |
Source code in pyjelly/integrations/rdflib/serialize.py
serialize(out, /, *, stream=None, options=None, **unused)
Serialize self.store content to Jelly format.
Args:
- out (IO[bytes]): output buffered writer.
- stream (Stream | None, optional): Jelly stream object. Defaults to None.
- options (SerializerOptions | None, optional): Serializer options if defined beforehand, e.g., read from a separate file. Defaults to None.
- **unused (Any): unused args for RDFLib serialize.
Source code in pyjelly/integrations/rdflib/serialize.py
triples_stream_frames(stream, data)
Serialize a Graph/Dataset into jelly frames.
Args:
- stream (TripleStream): stream that specifies triples processing.
- data (Graph | Dataset | Generator[Triple]): Graph/Dataset/Statements to serialize.
Notes: If a Dataset is given, its graphs are unpacked and iterated over. If the flow is GraphsFrameFlow, one frame is emitted per graph.
Yields: Generator[jelly.RdfStreamFrame]: jelly frames.
Source code in pyjelly/integrations/rdflib/serialize.py
quads_stream_frames(stream, data)
Serialize a Dataset into jelly frames.
Notes: Emits one frame per dataset if the flow is a DatasetsFrameFlow.
Args:
- stream (QuadStream): stream that specifies quads processing.
- data (Dataset | Generator[Quad]): Dataset to serialize.
Yields: Generator[jelly.RdfStreamFrame]: jelly frames
Source code in pyjelly/integrations/rdflib/serialize.py
graphs_stream_frames(stream, data)
Serialize a Dataset into jelly frames as a stream of graphs.
Notes: If the flow is of DatasetsFrameFlow type, the whole dataset will be encoded into one frame.
Args:
- stream (GraphStream): stream that specifies graphs processing.
- data (Dataset | Generator[Quad]): Dataset to serialize.
Yields: Generator[jelly.RdfStreamFrame]: jelly frames
Source code in pyjelly/integrations/rdflib/serialize.py
guess_options(sink)
Guess the serializer options based on the store type.
    >>> guess_options(Graph()).logical_type
    1
    >>> guess_options(Dataset()).logical_type
    2
Source code in pyjelly/integrations/rdflib/serialize.py
guess_stream(options, sink)
Return an appropriate stream implementation for the given options.
Notes: If the base logical type is GRAPHS and a Dataset is given, a TripleStream is initialized.
    >>> graph_ser = RDFLibJellySerializer(Graph())
    >>> ds_ser = RDFLibJellySerializer(Dataset())
    >>> type(guess_stream(guess_options(graph_ser.store), graph_ser.store))
    >>> type(guess_stream(guess_options(ds_ser.store), ds_ser.store))
Source code in pyjelly/integrations/rdflib/serialize.py
grouped_stream_to_frames(sink_generator, options=None)
Transform Graphs/Datasets into Jelly frames, one frame per Graph/Dataset.
Note: options are guessed if not provided.
Args:
- sink_generator (Generator[Graph] | Generator[Dataset]): Generator of Graphs/Datasets to transform.
- options (SerializerOptions | None, optional): stream options to use. Options are guessed based on the sink store type. Defaults to None.
Yields: Generator[jelly.RdfStreamFrame]: produced Jelly frames
Source code in pyjelly/integrations/rdflib/serialize.py
grouped_stream_to_file(stream, output_file, **kwargs)
Write stream of Graphs/Datasets to a binary file.
Args:
- stream (Generator[Graph] | Generator[Dataset]): Generator of Graphs/Datasets to transform.
- output_file (IO[bytes]): output buffered writer.
- **kwargs (Any): options to pass to stream.
Source code in pyjelly/integrations/rdflib/serialize.py
flat_stream_to_frames(statements, options=None)
Serialize a stream of raw triples or quads into Jelly frames.
Args:
- statements (Generator[Triple | Quad]): s/p/o triples or s/p/o/g quads to serialize.
- options (SerializerOptions | None, optional): if omitted, guessed based on the first tuple.
Yields: Generator[jelly.RdfStreamFrame]: generated frames.
Source code in pyjelly/integrations/rdflib/serialize.py
flat_stream_to_file(statements, output_file, options=None)
Write Triple or Quad events to a binary file in Jelly flat format.
Args:
- statements (Generator[Triple | Quad]): statements to serialize.
- output_file (IO[bytes]): output buffered writer.
- options (SerializerOptions | None, optional): stream options.
Source code in pyjelly/integrations/rdflib/serialize.py
jelly
Modules:
| Name | Description |
|---|---|
| rdf_pb2 | Generated protocol buffer code. |
rdf_pb2
Generated protocol buffer code.
options
Classes:
| Name | Description |
|---|---|
| StreamTypes | |
Functions:
| Name | Description |
|---|---|
| register_mimetypes | Associate files that have Jelly extension with Jelly MIME types. |
Attributes:
| Name | Type | Description |
|---|---|---|
| INTEGRATION_SIDE_EFFECTS | bool | Whether to allow integration module imports to trigger side effects. |
INTEGRATION_SIDE_EFFECTS = True
Whether to allow integration module imports to trigger side effects.
These side effects are cheap and may include populating some registries for guessing the defaults for external integrations that work with Jelly.
StreamTypes(physical_type=jelly.PHYSICAL_STREAM_TYPE_UNSPECIFIED, logical_type=jelly.LOGICAL_STREAM_TYPE_UNSPECIFIED)
Methods:
| Name | Description |
|---|---|
| __repr__ | Return the representation of StreamTypes. |
__repr__()
Return the representation of StreamTypes.
>>> repr(StreamTypes(9999, 8888))
'StreamTypes(9999, 8888)'
Source code in pyjelly/options.py
register_mimetypes(extension='.jelly')
Associate files that have Jelly extension with Jelly MIME types.
>>> register_mimetypes()
>>> mimetypes.guess_type("out.jelly")
('application/x-jelly-rdf', None)
Source code in pyjelly/options.py
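A registration of this kind can be sketched with the standard library alone. The helper name `register_jelly_mimetype` is hypothetical, and the MIME type string is taken from the doctest output above; this is an illustration, not necessarily how pyjelly implements `register_mimetypes`.

```python
import mimetypes

# Assumption: the MIME type matches the doctest output shown above.
JELLY_MIMETYPE = "application/x-jelly-rdf"


def register_jelly_mimetype(extension: str = ".jelly") -> None:
    """Map a file extension to the Jelly MIME type (hypothetical helper)."""
    mimetypes.add_type(JELLY_MIMETYPE, extension)


register_jelly_mimetype()
guessed = mimetypes.guess_type("out.jelly")
```

`mimetypes.guess_type` returns a `(type, encoding)` pair, which is why the doctest result ends with `None`.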
parse
Modules:
| Name | Description |
|---|---|
| decode | |
| ioutils | |
| lookup | |
decode
Modules:
| Name | Description |
|---|---|
| jelly | Generated protocol buffer code. |
Classes:
| Name | Description |
|---|---|
| Adapter | |
| Decoder | |
| LookupDecoder | Shared base for RDF lookup decoders using Jelly compression. |
| ParserOptions | Named tuple of stream types, lookup preset, and stream parameters. |
| ParsingMode | Enumeration of parsing modes: FLAT and GROUPED. |
| StreamTypes | |
Attributes:
| Name | Type | Description |
|---|---|---|
| MAX_VERSION | int | |
MAX_VERSION = 2
Adapter(*args, **kwargs)
Attributes:
| Name | Type | Description |
|---|---|---|
| __mypyc_attrs__ | tuple | Attribute names recorded by mypyc for the compiled class. |
__mypyc_attrs__ = ('options', 'parsing_mode')
Decoder(*args, **kwargs)
Attributes:
| Name | Type | Description |
|---|---|---|
| __mypyc_attrs__ | tuple | Attribute names recorded by mypyc for the compiled class. |
__mypyc_attrs__ = ('adapter', 'names', 'prefixes', 'datatypes', 'repeated_terms', 'row_handlers', 'term_handlers')
LookupDecoder(*, lookup_size)
Shared base for RDF lookup decoders using Jelly compression.
Tracks the last assigned and last reused index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lookup_size | int | Maximum lookup size. | required |
Source code in pyjelly/parse/lookup.py
ParserOptions
Bases: builtins.tuple
Named tuple of parser options: stream types, lookup preset, and stream parameters.
Attributes:
| Name | Type | Description |
|---|---|---|
| stream_types | pyjelly.options.StreamTypes | Field 0 of the tuple. |
| lookup_preset | pyjelly.options.LookupPreset | Field 1 of the tuple. |
| params | pyjelly.options.StreamParameters | Field 2 of the tuple. |
ParsingMode
Bases: enum.Enum
Enumeration of parsing modes.
Attributes:
| Name | Type | Description |
|---|---|---|
| FLAT | ParsingMode | |
| GROUPED | ParsingMode | |
FLAT = <ParsingMode.FLAT: 1>
GROUPED = <ParsingMode.GROUPED: 2>
StreamTypes(physical_type=jelly.PHYSICAL_STREAM_TYPE_UNSPECIFIED, logical_type=jelly.LOGICAL_STREAM_TYPE_UNSPECIFIED)
Methods:
| Name | Description |
|---|---|
| __repr__ | Return the representation of StreamTypes. |
__repr__()
Return the representation of StreamTypes.
>>> repr(StreamTypes(9999, 8888))
'StreamTypes(9999, 8888)'
Source code in pyjelly/options.py
jelly
Generated protocol buffer code.
ioutils
Functions:
| Name | Description |
|---|---|
| delimited_jelly_hint | Detect whether a Jelly file is delimited from its first 3 bytes. |
| get_options_and_frames | Return stream options and frames from the buffered binary stream. |
delimited_jelly_hint(header)
Detect whether a Jelly file is delimited from its first 3 bytes.
Truth table (notation: 0A = 0x0A, NN = not 0x0A, ?? = don't care):
| Byte 1 | Byte 2 | Byte 3 | Result |
|---|---|---|---|
| NN | ?? | ?? | Delimited |
| 0A | NN | ?? | Non-delimited |
| 0A | 0A | NN | Delimited (size = 10) |
| 0A | 0A | 0A | Non-delimited (stream options size = 10) |
>>> delimited_jelly_hint(bytes([0x00, 0x00, 0x00]))
True
>>> delimited_jelly_hint(bytes([0x00, 0x00, 0x0A]))
True
>>> delimited_jelly_hint(bytes([0x00, 0x0A, 0x00]))
True
>>> delimited_jelly_hint(bytes([0x00, 0x0A, 0x0A]))
True
>>> delimited_jelly_hint(bytes([0x0A, 0x00, 0x00]))
False
>>> delimited_jelly_hint(bytes([0x0A, 0x00, 0x0A]))
False
>>> delimited_jelly_hint(bytes([0x0A, 0x0A, 0x00]))
True
>>> delimited_jelly_hint(bytes([0x0A, 0x0A, 0x0A]))
False
Source code in pyjelly/parse/ioutils.py
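The truth table reduces to two checks, as the following sketch shows. It is re-derived from the table and doctests above rather than copied from pyjelly; the name `delimited_hint` and the length check are choices made for this illustration.

```python
def delimited_hint(header: bytes) -> bool:
    """Apply the three-byte truth table above (illustrative sketch)."""
    magic = 0x0A
    if len(header) != 3:
        # Assumption of this sketch: callers always pass exactly 3 bytes.
        raise ValueError("expected exactly 3 header bytes")
    first, second, third = header
    if first != magic:
        # Row 1: any first byte other than 0x0A means delimited.
        return True
    # Rows 2-4: with a leading 0x0A, only the pattern 0A 0A NN is delimited.
    return second == magic and third != magic


results = [
    delimited_hint(bytes([a, b, c]))
    for a in (0x00, 0x0A)
    for b in (0x00, 0x0A)
    for c in (0x00, 0x0A)
]
```

The list comprehension enumerates the same eight inputs, in the same order, as the doctests above.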
get_options_and_frames(inp)
Return stream options and frames from the buffered binary stream.
Args: inp (IO[bytes]): Jelly buffered binary stream.
Raises: JellyConformanceError: if the stream is delimited and contains no non-empty frames. JellyConformanceError: if the stream is non-delimited and contains no rows (empty frame).
Returns: tuple[ParserOptions, Iterator[jelly.RdfStreamFrame]]: the parser options (stream types, lookup preset, and other stream options) and an iterator over the frames.
Source code in pyjelly/parse/ioutils.py
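To make "delimited" concrete, the sketch below splits a byte stream into length-prefixed payloads using only the standard library. It assumes that each frame in a delimited file is preceded by a Protobuf varint holding its size; `read_varint` and `iter_delimited` are hypothetical helpers, and decoding the payloads into `RdfStreamFrame` messages is deliberately left out.

```python
import io
from typing import IO, Iterator, Optional


def read_varint(stream: IO[bytes]) -> Optional[int]:
    """Read one base-128 varint; return None at a clean end of stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a length prefix")
        # The low 7 bits carry data; the high bit marks continuation.
        result |= (chunk[0] & 0x7F) << shift
        if (chunk[0] & 0x80) == 0:
            return result
        shift += 7


def iter_delimited(stream: IO[bytes]) -> Iterator[bytes]:
    """Yield each length-prefixed payload from the stream."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a frame")
        yield payload


# Two payloads: 3 bytes (prefix 0x03) and 300 bytes (prefix 0xAC 0x02).
data = bytes([3]) + b"abc" + bytes([0xAC, 0x02]) + b"x" * 300
payloads = list(iter_delimited(io.BytesIO(data)))
```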
lookup
Classes:
| Name | Description |
|---|---|
| LookupDecoder | Shared base for RDF lookup decoders using Jelly compression. |
LookupDecoder(*, lookup_size)
Shared base for RDF lookup decoders using Jelly compression.
Tracks the last assigned and last reused index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lookup_size | int | Maximum lookup size. | required |
Source code in pyjelly/parse/lookup.py
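The role of the last assigned index on the decoding side can be sketched as follows. This is a simplified model, not pyjelly's `LookupDecoder`: the class and method names are hypothetical, the rule that an entry index of 0 means "one past the last assigned index" is an assumption about Jelly's delta encoding, and the last reused index is not modelled.

```python
from typing import Dict


class DecoderLookupSketch:
    """Toy index-to-string table with delta-encoded assignment."""

    def __init__(self, *, lookup_size: int) -> None:
        self.lookup_size = lookup_size
        self.entries: Dict[int, str] = {}
        self.last_assigned_index = 0

    def assign_entry(self, index: int, value: str) -> None:
        # Assumption: 0 stands for "last assigned index + 1".
        if index == 0:
            index = self.last_assigned_index + 1
        if not 1 <= index <= self.lookup_size:
            raise ValueError(f"index {index} outside 1..{self.lookup_size}")
        self.entries[index] = value
        self.last_assigned_index = index

    def at(self, index: int) -> str:
        return self.entries[index]


lookup = DecoderLookupSketch(lookup_size=4)
lookup.assign_entry(0, "a")  # stored at index 1
lookup.assign_entry(0, "b")  # stored at index 2
lookup.assign_entry(1, "c")  # explicit index: overwrites index 1
lookup.assign_entry(0, "d")  # last assigned was 1, so stored at index 2
```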
serialize
Modules:
| Name | Description |
|---|---|
| encode | |
| flows | |
| lookup | |
| streams | |
encode
Modules:
| Name | Description |
|---|---|
| jelly | Generated protocol buffer code. |
| options | |
Classes:
| Name | Description |
|---|---|
| JellyConformanceError | Raised when Jelly conformance is violated. |
| LookupEncoder | Shared base for RDF lookup encoders using Jelly compression. |
| Slot | Enumeration of term positions in a statement: subject, predicate, object, graph. |
| TermEncoder | |
Attributes:
| Name | Type | Description |
|---|---|---|
| HasGraph | union type | Union of rdf_pb2.RdfQuad and rdf_pb2.RdfGraphStart. |
| Statement | union type | Union of rdf_pb2.RdfQuad and rdf_pb2.RdfTriple. |
| T | TypeVar | Type variable. |
| Terms | union type | Union of rdf_pb2.RdfIri, rdf_pb2.RdfLiteral, str, rdf_pb2.RdfDefaultGraph, and rdf_pb2.RdfTriple. |
HasGraph = rdf_pb2.RdfQuad | rdf_pb2.RdfGraphStart
Statement = rdf_pb2.RdfQuad | rdf_pb2.RdfTriple
T = ~T
Terms = rdf_pb2.RdfIri | rdf_pb2.RdfLiteral | str | rdf_pb2.RdfDefaultGraph | rdf_pb2.RdfTriple
JellyConformanceError
Bases: Exception
Raised when Jelly conformance is violated.
LookupEncoder(*, lookup_size)
Shared base for RDF lookup encoders using Jelly compression.
Tracks the last assigned and last reused index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lookup_size | int | Maximum lookup size. | required |
Methods:
| Name | Description |
|---|---|
| encode_entry_index | Get or assign the index to use in an entry. |
Source code in pyjelly/serialize/lookup.py
encode_entry_index(key)
Get or assign the index to use in an entry.
Returns:
| Type | Description |
|---|---|
| int or None | If the return value is None, the entry is already in the lookup and does not need to be emitted. Any integer value (including 0) means the entry is new and should be emitted. |
Source code in pyjelly/serialize/lookup.py
Slot
Bases: enum.IntEnum
Enumeration of term positions in a statement.
Attributes:
| Name | Type | Description |
|---|---|---|
| subject | Slot | |
| predicate | Slot | |
| object | Slot | |
| graph | Slot | |
subject = <Slot.subject: 0>
predicate = <Slot.predicate: 1>
object = <Slot.object: 2>
graph = <Slot.graph: 3>
TermEncoder(*args, **kwargs)
Attributes:
| Name | Type | Description |
|---|---|---|
| __mypyc_attrs__ | tuple | Attribute names recorded by mypyc for the compiled class. |
__mypyc_attrs__ = ('lookup_preset', 'names', 'prefixes', 'datatypes')
jelly
Generated protocol buffer code.
options
Classes:
| Name | Description |
|---|---|
| StreamTypes | |
Functions:
| Name | Description |
|---|---|
| register_mimetypes | Associate files that have Jelly extension with Jelly MIME types. |
Attributes:
| Name | Type | Description |
|---|---|---|
| INTEGRATION_SIDE_EFFECTS | bool | Whether to allow integration module imports to trigger side effects. |
INTEGRATION_SIDE_EFFECTS = True
Whether to allow integration module imports to trigger side effects.
These side effects are cheap and may include populating some registries for guessing the defaults for external integrations that work with Jelly.
StreamTypes(physical_type=jelly.PHYSICAL_STREAM_TYPE_UNSPECIFIED, logical_type=jelly.LOGICAL_STREAM_TYPE_UNSPECIFIED)
Methods:
| Name | Description |
|---|---|
| __repr__ | Return the representation of StreamTypes. |
__repr__()
Return the representation of StreamTypes.
>>> repr(StreamTypes(9999, 8888))
'StreamTypes(9999, 8888)'
Source code in pyjelly/options.py
register_mimetypes(extension='.jelly')
Associate files that have Jelly extension with Jelly MIME types.
>>> register_mimetypes()
>>> mimetypes.guess_type("out.jelly")
('application/x-jelly-rdf', None)
Source code in pyjelly/options.py
flows
Classes:
| Name | Description |
|---|---|
| FrameFlow | Abstract base class for producing Jelly frames from RDF stream rows. |
| ManualFrameFlow | Produces frames only when manually requested (never automatically). |
| BoundedFrameFlow | Produce frames automatically when a fixed number of rows is reached. |
| GraphsFrameFlow | |
| DatasetsFrameFlow | |
Functions:
| Name | Description |
|---|---|
| flow_for_type | Return the flow class for the requested logical type. |
FrameFlow(initlist=None, *, logical_type=None, **__kwargs)
Bases: UserList[RdfStreamRow]
Abstract base class for producing Jelly frames from RDF stream rows.
Collects stream rows and assembles them into RdfStreamFrame objects when ready.
Allows for passing LogicalStreamType, required for logical subtypes and non-delimited streams.
Methods:
| Name | Description |
|---|---|
| frame_from_graph | Treat the current rows as a graph and produce a frame. |
| frame_from_dataset | Treat the current rows as a dataset and produce a frame. |
| to_stream_frame | Create stream frame from flow content. |
Source code in pyjelly/serialize/flows.py
frame_from_graph()
Treat the current rows as a graph and produce a frame.
Default implementation returns None.
frame_from_dataset()
Treat the current rows as a dataset and produce a frame.
Default implementation returns None.
to_stream_frame()
Create stream frame from flow content.
Notes: Clears flow content after creating the frame.
Returns: jelly.RdfStreamFrame | None: stream frame
Source code in pyjelly/serialize/flows.py
ManualFrameFlow(initlist=None, *, logical_type=None, **__kwargs)
Bases: FrameFlow
Produces frames only when manually requested (never automatically).
Warning
All stream rows are kept in memory until to_stream_frame() is called.
This may lead to high memory usage for large streams.
Used for non-delimited serialization.
Source code in pyjelly/serialize/flows.py
BoundedFrameFlow(initlist=None, logical_type=None, *, frame_size=None)
Bases: FrameFlow
Produce frames automatically when a fixed number of rows is reached.
Used for delimited encoding (default mode).
Methods:
| Name | Description |
|---|---|
| frame_from_bounds | Emit frame from flow if full. |
Source code in pyjelly/serialize/flows.py
frame_from_bounds()
Emit frame from flow if full.
Returns: jelly.RdfStreamFrame | None: stream frame
Source code in pyjelly/serialize/flows.py
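The interplay between collecting rows and emitting a frame once a bound is reached can be sketched with `collections.UserList`, the same base class that FrameFlow is documented to use. Everything else here is a stand-in: `BoundedFlowSketch` is hypothetical, rows are plain strings, a "frame" is just a tuple of rows, and the default `frame_size` is arbitrary.

```python
from collections import UserList
from typing import Optional, Tuple


class BoundedFlowSketch(UserList):
    """Collect rows and emit a frame every `frame_size` rows."""

    def __init__(self, initlist=None, *, frame_size: int = 3) -> None:
        super().__init__(initlist)
        self.frame_size = frame_size

    def to_stream_frame(self) -> Optional[Tuple[str, ...]]:
        if not self.data:
            return None
        frame = tuple(self.data)
        # Mirror the documented note: content is cleared after framing.
        self.data.clear()
        return frame

    def frame_from_bounds(self) -> Optional[Tuple[str, ...]]:
        # Emit only when the flow is full; otherwise keep collecting.
        if len(self) >= self.frame_size:
            return self.to_stream_frame()
        return None


flow = BoundedFlowSketch(frame_size=3)
emitted = []
for row in ["row-1", "row-2", "row-3", "row-4"]:
    flow.append(row)
    frame = flow.frame_from_bounds()
    if frame is not None:
        emitted.append(frame)
```

After the loop, one full frame has been emitted and the fourth row is still buffered; a final `to_stream_frame()` call would flush it, which is the only way a manually driven flow produces frames.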
GraphsFrameFlow(initlist=None, *, logical_type=None, **__kwargs)
Bases: FrameFlow
Methods:
| Name | Description |
|---|---|
| frame_from_graph | Emit current flow content (one graph) as jelly frame. |
Source code in pyjelly/serialize/flows.py
frame_from_graph()
Emit current flow content (one graph) as jelly frame.
Returns: jelly.RdfStreamFrame | None: Jelly frame, or None if the flow is empty.
Source code in pyjelly/serialize/flows.py
DatasetsFrameFlow(initlist=None, *, logical_type=None, **__kwargs)
Bases: FrameFlow
Methods:
| Name | Description |
|---|---|
| frame_from_dataset | Emit current flow content (dataset) as jelly frame. |
Source code in pyjelly/serialize/flows.py
frame_from_dataset()
Emit current flow content (dataset) as jelly frame.
Returns: jelly.RdfStreamFrame | None: Jelly frame, or None if the flow is empty.
Source code in pyjelly/serialize/flows.py
flow_for_type(logical_type)
Return the flow class for the requested logical type.
Note: subtypes use the flow of their base logical type (e.g., SUBJECT_GRAPHS uses the same flow as its base type GRAPHS).
Args: logical_type (jelly.LogicalStreamType): logical type requested.
Raises: NotImplementedError: if (base) logical stream type is not supported.
Returns: type[FrameFlow]: FrameFlow for respective logical type.
Source code in pyjelly/serialize/flows.py
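Dispatch with a fallback to the base type can be sketched as two dictionary lookups. All names below are hypothetical stand-ins: `LogicalType` replaces `jelly.LogicalStreamType`, the two flow classes are empty placeholders, and the mapping tables are assumptions made for illustration rather than pyjelly's actual registry.

```python
from enum import Enum


class LogicalType(Enum):
    """Hypothetical stand-in for jelly.LogicalStreamType."""

    UNSPECIFIED = "unspecified"
    GRAPHS = "graphs"
    SUBJECT_GRAPHS = "subject_graphs"
    DATASETS = "datasets"


class GraphsFlow:
    """Placeholder for a graph-oriented flow class."""


class DatasetsFlow:
    """Placeholder for a dataset-oriented flow class."""


# Subtypes map to their base type; base types map to themselves implicitly.
BASE_TYPE = {LogicalType.SUBJECT_GRAPHS: LogicalType.GRAPHS}
FLOW_BY_BASE_TYPE = {
    LogicalType.GRAPHS: GraphsFlow,
    LogicalType.DATASETS: DatasetsFlow,
}


def flow_class_for(logical_type: LogicalType) -> type:
    """Return the flow class registered for the type's base type."""
    base = BASE_TYPE.get(logical_type, logical_type)
    try:
        return FLOW_BY_BASE_TYPE[base]
    except KeyError:
        raise NotImplementedError(f"no flow for {base.name}") from None


subject_graphs_flow = flow_class_for(LogicalType.SUBJECT_GRAPHS)
```

Note that the function returns a class, not an instance, matching the documented `type[FrameFlow]` return type.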
lookup
Classes:
| Name | Description |
|---|---|
| Lookup | Fixed-size 1-based string-to-index mapping with LRU eviction. |
| LookupEncoder | Shared base for RDF lookup encoders using Jelly compression. |
Lookup(max_size)
Fixed-size 1-based string-to-index mapping with LRU eviction.
- Assigns incrementing indices starting from 1.
- After reaching the maximum size, reuses the indices freed by evicting the least recently used entries.
- Index 0 is reserved for delta encoding in Jelly streams.
To check if a key exists, use .move(key) and catch KeyError.
If KeyError is raised, the key can be inserted with .insert(key).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_size | int | Maximum number of entries. Zero disables lookup. | required |
Source code in pyjelly/serialize/lookup.py
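The behaviour described above maps naturally onto `collections.OrderedDict`. The following is a minimal sketch, not the class from `pyjelly/serialize/lookup.py`: `LruLookupSketch` is a hypothetical name, the method names follow the `.move(key)` and `.insert(key)` calls mentioned above, and the disabled (zero-size) case is rejected rather than modelled.

```python
from collections import OrderedDict


class LruLookupSketch:
    """1-based string-to-index mapping that reuses evicted indices."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self._indices = OrderedDict()  # key -> index, oldest first

    def move(self, key: str) -> None:
        # Mark the key as most recently used; raises KeyError if absent.
        self._indices.move_to_end(key)

    def insert(self, key: str) -> int:
        if self.max_size < 1:
            # Assumption of this sketch: a disabled lookup is not modelled.
            raise ValueError("lookup is disabled")
        if len(self._indices) < self.max_size:
            # Still growing: hand out the next index, starting from 1.
            index = len(self._indices) + 1
        else:
            # Full: evict the least recently used entry, reuse its index.
            _, index = self._indices.popitem(last=False)
        self._indices[key] = index
        return index


lookup = LruLookupSketch(2)
first = lookup.insert("a")   # index 1
second = lookup.insert("b")  # index 2
lookup.move("a")             # "b" is now the least recently used key
third = lookup.insert("c")   # evicts "b" and reuses its index
```

Index 0 is never handed out, which leaves it free for the delta encoding mentioned above.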
LookupEncoder(*, lookup_size)
Shared base for RDF lookup encoders using Jelly compression.
Tracks the last assigned and last reused index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lookup_size | int | Maximum lookup size. | required |
Methods:
| Name | Description |
|---|---|
| encode_entry_index | Get or assign the index to use in an entry. |
Source code in pyjelly/serialize/lookup.py
encode_entry_index(key)
Get or assign the index to use in an entry.
Returns:
| Type | Description |
|---|---|
| int or None | If the return value is None, the entry is already in the lookup and does not need to be emitted. Any integer value (including 0) means the entry is new and should be emitted. |
Source code in pyjelly/serialize/lookup.py
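The three possible outcomes (None, 0, or a positive index) can be reproduced with a small self-contained model. It is a sketch under assumptions, not pyjelly's code: `EntryIndexEncoderSketch` is hypothetical, 0 is assumed to mean "exactly one past the last assigned index", and `lookup_size` is assumed to be at least 1.

```python
from collections import OrderedDict
from typing import Optional


class EntryIndexEncoderSketch:
    """Toy encoder-side lookup with delta-encoded entry indices."""

    def __init__(self, *, lookup_size: int) -> None:
        self.lookup_size = lookup_size
        self._indices = OrderedDict()  # key -> index, oldest first
        self.last_assigned_index = 0

    def _insert(self, key: str) -> int:
        if len(self._indices) < self.lookup_size:
            index = len(self._indices) + 1
        else:
            # Reuse the index of the least recently used entry.
            _, index = self._indices.popitem(last=False)
        self._indices[key] = index
        return index

    def encode_entry_index(self, key: str) -> Optional[int]:
        if key in self._indices:
            # Known entry: refresh its recency, nothing to emit.
            self._indices.move_to_end(key)
            return None
        previous = self.last_assigned_index
        index = self._insert(key)
        self.last_assigned_index = index
        # Assumption: 0 encodes "previous index + 1"; otherwise be explicit.
        return 0 if index == previous + 1 else index


encoder = EntryIndexEncoderSketch(lookup_size=2)
emitted = [encoder.encode_entry_index(k) for k in ["a", "b", "a", "c", "b", "d"]]
```

In this run the first two keys get consecutive indices (both emitted as 0), the repeated key yields None, and the later insertions reuse evicted indices, which are written out explicitly unless they happen to follow the previous assignment.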
streams
Classes:
| Name | Description |
|---|---|
| Stream | |
| TripleStream | |
| QuadStream | |
| GraphStream | |
Functions:
| Name | Description |
|---|---|
| stream_for_type | Return the Stream class for the specified physical type. |
Stream(*, encoder, options=None)
Methods:
| Name | Description |
|---|---|
| infer_flow | Return flow based on the stream options provided. |
| enroll | Initialize start of the stream. |
| stream_options | Encode and append stream options row to the current flow. |
| namespace_declaration | Add namespace declaration to jelly stream. |
| for_rdflib | Initialize stream with RDFLib encoder. |
Source code in pyjelly/serialize/streams.py
infer_flow()
Return flow based on the stream options provided.
Returns: FrameFlow: initialised FrameFlow object.
Source code in pyjelly/serialize/streams.py
enroll()
stream_options()
Encode and append stream options row to the current flow.
Source code in pyjelly/serialize/streams.py
namespace_declaration(name, iri)
Add namespace declaration to jelly stream.
Args: name (str): namespace prefix label. iri (str): namespace IRI.
Source code in pyjelly/serialize/streams.py
for_rdflib(options=None)
Initialize stream with RDFLib encoder.
Args: options (SerializerOptions | None, optional): Stream options. Defaults to None.
Raises: TypeError: if called on the base Stream class rather than on a Stream subclass for a specific physical type.
Returns: Stream: initialized stream with RDFLib encoder.
Source code in pyjelly/serialize/streams.py
TripleStream(*, encoder, options=None)
Bases: Stream
Methods:
| Name | Description |
|---|---|
| triple | Process one triple to Protobuf messages. |
Source code in pyjelly/serialize/streams.py
triple(terms)
Process one triple to Protobuf messages.
Note: Adds new rows to the current flow and returns a stream frame if the frame size conditions are met.
Args: terms (Iterable[object]): RDF terms to encode.
Returns: jelly.RdfStreamFrame | None: a stream frame if the flow supports frame slicing and the current flow is full; otherwise None.
Source code in pyjelly/serialize/streams.py
QuadStream(*, encoder, options=None)
Bases: Stream
Methods:
| Name | Description |
|---|---|
| quad | Process one quad to Protobuf messages. |
Source code in pyjelly/serialize/streams.py
quad(terms)
Process one quad to Protobuf messages.
Args: terms (Iterable[object]): terms to encode.
Returns: jelly.RdfStreamFrame | None: a stream frame if the flow supports frame slicing and the current flow is full; otherwise None.
Source code in pyjelly/serialize/streams.py
GraphStream(*, encoder, options=None)
Bases: TripleStream
Methods:
| Name | Description |
|---|---|
| graph | Process one graph into a sequence of jelly frames. |
Source code in pyjelly/serialize/streams.py
graph(graph_id, graph)
Process one graph into a sequence of jelly frames.
Args: graph_id (object): graph identifier (blank node, literal, IRI, or default graph). graph (Iterable[Iterable[object]]): iterable of triples (the graph's content).
Yields: Generator[jelly.RdfStreamFrame]: jelly frames.
Source code in pyjelly/serialize/streams.py
stream_for_type(physical_type)
Return the Stream class for the specified physical type.
Args: physical_type (jelly.PhysicalStreamType): jelly stream physical type.
Raises: NotImplementedError: if no stream for requested physical type is available.
Returns: type[Stream]: jelly stream