Client
omniread.pdf.client
Summary
PDF client abstractions for OmniRead.
This module defines the client layer responsible for retrieving raw PDF bytes from a concrete backing store.
Clients provide low-level access to PDF binaries and are intentionally decoupled from scraping and parsing logic. They do not perform validation, interpretation, or content extraction.
Typical backing stores include:
- Local filesystems
- Object storage (S3, GCS, etc.)
- Network file systems
Classes
BasePDFClient
Bases: ABC
Abstract client responsible for retrieving PDF bytes.
Retrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).
Notes
Responsibilities:
Functions
fetch
abstractmethod
Fetch raw PDF bytes from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `Any` | Identifier of the PDF location, such as a file path, object storage key, or remote reference. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `bytes` | `bytes` | Raw PDF bytes. |
Raises:
| Type | Description |
|---|---|
| `Exception` | Retrieval-specific errors defined by the implementation. |
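As an illustration of the contract described above, the following is a minimal sketch of how a custom client might plug into this interface. The `BasePDFClient` defined here is a stand-in whose shape is assumed from this page, not the library source. `InMemoryPDFClient` and its use of `LookupError` are hypothetical and exist only for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BasePDFClient(ABC):
    """Stand-in for the documented interface (assumed shape, not the library source)."""

    @abstractmethod
    def fetch(self, source: Any) -> bytes:
        """Return raw PDF bytes for the given source."""
        raise NotImplementedError


class InMemoryPDFClient(BasePDFClient):
    """Hypothetical client backed by a dict, e.g. for unit tests."""

    def __init__(self, documents: Dict[str, bytes]) -> None:
        # Copy so later changes to the caller's dict do not leak in.
        self._documents = dict(documents)

    def fetch(self, source: Any) -> bytes:
        try:
            return self._documents[source]
        except KeyError:
            # Retrieval-specific error chosen by this implementation.
            raise LookupError(f"no PDF stored under {source!r}") from None


client = InMemoryPDFClient({"report": b"%PDF-1.7\n..."})
data = client.fetch("report")  # bytes are returned unvalidated and unparsed
```

Because `fetch` is abstract, instantiating `BasePDFClient` directly raises `TypeError`. Callers only ever work with concrete subclasses, which keeps scraping and parsing code independent of the backing store.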
FileSystemPDFClient
Bases: BasePDFClient
PDF client that reads from the local filesystem.
Notes
Guarantees:
Functions
fetch
Read a PDF file from the local filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | Filesystem path to the PDF file. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `bytes` | `bytes` | Raw PDF bytes. |
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the path does not exist. |
| `ValueError` | If the path exists but is not a file. |
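The documented behavior of `fetch` can be sketched as follows. The class below is a stand-in that mirrors the parameters and exceptions listed above, written under the assumption that the real implementation checks existence before file type. It is not the library source, and `load_pdf` is a hypothetical helper added to show one way a caller might handle a missing file.

```python
import tempfile
from pathlib import Path
from typing import Optional


class FileSystemPDFClient:
    """Stand-in mirroring the documented behavior; not the library source."""

    def fetch(self, path: Path) -> bytes:
        path = Path(path)
        if not path.exists():
            raise FileNotFoundError(f"PDF not found: {path}")
        if not path.is_file():
            raise ValueError(f"Path is not a file: {path}")
        # No validation of the content: the bytes are returned as-is.
        return path.read_bytes()


def load_pdf(client: FileSystemPDFClient, path: Path) -> Optional[bytes]:
    """Hypothetical helper: return the bytes, or None if the file is missing."""
    try:
        return client.fetch(path)
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    pdf_path = Path(tmp) / "sample.pdf"
    pdf_path.write_bytes(b"%PDF-1.4\n%%EOF\n")

    client = FileSystemPDFClient()
    found = client.fetch(pdf_path)                        # existing file
    missing = load_pdf(client, Path(tmp) / "absent.pdf")  # path does not exist

    try:
        client.fetch(Path(tmp))  # a directory exists but is not a file
        directory_error = None
    except ValueError as exc:
        directory_error = exc
```

Keeping the two failure modes as distinct exception types lets callers treat a missing file as recoverable while still surfacing a misconfigured path, such as a directory, as an error.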