Parser

omniread.core.parser

Summary

Abstract parsing contracts for OmniRead.

This module defines the format-agnostic parser interface used to transform raw content into structured, typed representations.

Parsers are responsible for:

  • Interpreting a single Content instance
  • Validating compatibility with the content type
  • Producing a structured output suitable for downstream consumers

Parsers are not responsible for:

  • Fetching or acquiring content
  • Performing retries or error recovery
  • Managing multiple content sources

Classes

BaseParser

BaseParser(content: Content)

Bases: ABC, Generic[T]

Base interface for all parsers.

Notes

Guarantees:

- A parser is a self-contained object that owns the `Content` it is
  responsible for interpreting.
- Consumers may rely on early validation of content compatibility
  and type-stable return values from `parse()`.

Responsibilities:

- Implementations must declare supported content types via `supported_types`.
- When parsing fails, implementations must raise parsing-specific exceptions from `parse()`.
- Implementations must remain deterministic for a given input.
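
The guarantees and responsibilities above can be sketched roughly as follows. This is a minimal illustration, not OmniRead's source: `ContentType`, `Content`, and `BaseParser` here are simplified stand-ins (their real fields and members may differ), and `JsonParser` is a hypothetical subclass. The sketch also assumes the constructor performs its validation by calling `supports()`.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Any, Generic, Set, TypeVar

T = TypeVar("T")


class ContentType(Enum):  # stand-in; the real enum members are an assumption
    JSON = "application/json"
    HTML = "text/html"


@dataclass(frozen=True)
class Content:  # stand-in; the real Content fields may differ
    raw: bytes
    content_type: ContentType


class BaseParser(ABC, Generic[T]):  # stand-in mirroring the documented contract
    supported_types: Set[ContentType] = set()

    def __init__(self, content: Content) -> None:
        # The parser owns the content it will interpret.
        self.content = content
        # Early validation: fail at construction, not at parse time.
        if not self.supports():
            raise ValueError(f"Unsupported content type: {content.content_type}")

    def supports(self) -> bool:
        # An empty set means the parser is content-type agnostic.
        if not self.supported_types:
            return True
        return self.content.content_type in self.supported_types

    @abstractmethod
    def parse(self) -> T:
        ...


class JsonParser(BaseParser[Any]):  # hypothetical concrete parser
    supported_types = {ContentType.JSON}

    def parse(self) -> Any:
        # Same bytes in, same structure out: no hidden state, no I/O.
        return json.loads(self.content.raw.decode("utf-8"))


parsed = JsonParser(Content(b'{"a": 1}', ContentType.JSON)).parse()
```

Note that the subclass declares its types and implements `parse()` only; fetching the bytes and retrying on failure stay with the caller.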

Initialize the parser with content to be parsed.

Parameters:

  • content (Content, required): Content instance to be parsed.

Raises:

  • ValueError: If the content type is not supported by this parser.
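
From the caller's side, an incompatible content type surfaces at construction time, before `parse()` is ever called. The sketch below illustrates that under stated assumptions: the three base definitions are simplified stand-ins rather than OmniRead's own classes, and `TextParser` is a hypothetical subclass.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Generic, Set, TypeVar

T = TypeVar("T")


class ContentType(Enum):  # stand-in; real members are an assumption
    TEXT = "text/plain"
    HTML = "text/html"


@dataclass(frozen=True)
class Content:  # stand-in; real fields may differ
    raw: bytes
    content_type: ContentType


class BaseParser(ABC, Generic[T]):  # stand-in for the documented interface
    supported_types: Set[ContentType] = set()

    def __init__(self, content: Content) -> None:
        self.content = content
        if self.supported_types and content.content_type not in self.supported_types:
            raise ValueError(f"Unsupported content type: {content.content_type}")

    @abstractmethod
    def parse(self) -> T:
        ...


class TextParser(BaseParser[str]):  # hypothetical parser for plain text only
    supported_types = {ContentType.TEXT}

    def parse(self) -> str:
        return self.content.raw.decode("utf-8")


html = Content(b"<p>hi</p>", ContentType.HTML)

try:
    TextParser(html)  # rejected here, before any parsing work starts
    rejected = False
except ValueError:
    rejected = True
```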

Attributes
supported_types class-attribute instance-attribute
supported_types: Set[ContentType] = set()

Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic.
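
A short sketch of the empty-set rule, using simplified stand-ins for `ContentType`, `Content`, and `BaseParser` (not OmniRead's source) and two hypothetical subclasses. It also shows one design point that follows from the default being a mutable class attribute: a subclass should assign its own set instead of mutating the inherited one.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Generic, Set, TypeVar

T = TypeVar("T")


class ContentType(Enum):  # stand-in; real members are an assumption
    TEXT = "text/plain"
    HTML = "text/html"


@dataclass(frozen=True)
class Content:  # stand-in; real fields may differ
    raw: bytes
    content_type: ContentType


class BaseParser(ABC, Generic[T]):  # stand-in for the documented interface
    supported_types: Set[ContentType] = set()

    def __init__(self, content: Content) -> None:
        self.content = content
        if not self.supports():
            raise ValueError(f"Unsupported content type: {content.content_type}")

    def supports(self) -> bool:
        # Empty set: agnostic, so every content type is accepted.
        return not self.supported_types or self.content.content_type in self.supported_types

    @abstractmethod
    def parse(self) -> T:
        ...


class ByteCountParser(BaseParser[int]):  # hypothetical agnostic parser
    # No override: inherits the empty set, so any content type is accepted.
    def parse(self) -> int:
        return len(self.content.raw)


class HtmlOnlyParser(BaseParser[str]):  # hypothetical restricted parser
    # Assign a fresh set. Calling supported_types.add(...) instead would
    # mutate the set shared with BaseParser and every agnostic subclass.
    supported_types = {ContentType.HTML}

    def parse(self) -> str:
        return self.content.raw.decode("utf-8")


counts = [
    ByteCountParser(Content(b"hello", ContentType.TEXT)).parse(),
    ByteCountParser(Content(b"<p>hi</p>", ContentType.HTML)).parse(),
]
```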

Functions
parse abstractmethod
parse() -> T

Parse the owned content into structured output.

Returns:

  • T: Parsed, structured representation.

Raises:

  • Exception: Parsing-specific errors as defined by the implementation.

Notes

Responsibilities:

- Implementations must fully consume the provided content and
  return a deterministic, structured output.
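
One way a `parse()` implementation might meet these requirements is sketched below. The base definitions are simplified stand-ins, not OmniRead's classes; `KeyValueParser` and `KeyValueParseError` are hypothetical names introduced only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Generic, Set, TypeVar

T = TypeVar("T")


class ContentType(Enum):  # stand-in; real members are an assumption
    TEXT = "text/plain"


@dataclass(frozen=True)
class Content:  # stand-in; real fields may differ
    raw: bytes
    content_type: ContentType


class BaseParser(ABC, Generic[T]):  # stand-in for the documented interface
    supported_types: Set[ContentType] = set()

    def __init__(self, content: Content) -> None:
        self.content = content
        if self.supported_types and content.content_type not in self.supported_types:
            raise ValueError(f"Unsupported content type: {content.content_type}")

    @abstractmethod
    def parse(self) -> T:
        ...


class KeyValueParseError(Exception):
    """Hypothetical parsing-specific error for malformed key=value input."""


class KeyValueParser(BaseParser[Dict[str, str]]):
    supported_types = {ContentType.TEXT}

    def parse(self) -> Dict[str, str]:
        try:
            text = self.content.raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise KeyValueParseError("content is not valid UTF-8") from exc

        result: Dict[str, str] = {}
        # Walk every line so the whole content is consumed, not just a prefix.
        for number, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # skip blank lines
            key, sep, value = line.partition("=")
            if not sep:
                raise KeyValueParseError(f"line {number}: expected key=value")
            result[key.strip()] = value.strip()
        return result


parser = KeyValueParser(Content(b"name=omniread\nmode=strict\n", ContentType.TEXT))
first = parser.parse()
second = parser.parse()  # no hidden state, so this equals the first result
```

Raising a dedicated exception type instead of a bare `ValueError` lets callers tell a parse failure apart from the constructor's content-type rejection.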
supports
supports() -> bool

Check whether this parser supports the content's type.

Returns:

  • bool: True if the content type is supported; False otherwise.