using doc-forge (#1)
Reviewed-on: #1 Co-authored-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com> Co-committed-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>
@@ -94,6 +94,27 @@ PDF:
- FileSystemPDFClient
- PDFScraper
- PDFParser

## Core Philosophy

`OmniRead` is designed as a **decoupled content engine**:

1. **Separation of Concerns**: Scrapers *fetch*, Parsers *interpret*. Neither knows about the other.
2. **Normalized Exchange**: All components communicate via the `Content` model, ensuring a consistent contract.
3. **Format Agnosticism**: The core logic is independent of whether the input is HTML, PDF, or JSON.
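
The three principles above can be sketched as follows. This is a minimal illustration, not the actual `OmniRead` API: the `Content` and `ContentType` classes here are stand-ins with assumed fields and members, and `InMemoryScraper` and `TitleParser` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    # Stand-in for omniread's ContentType; these members are assumed.
    HTML = "text/html"
    JSON = "application/json"


@dataclass(frozen=True)
class Content:
    # Stand-in for omniread's Content; these fields are assumed.
    raw: bytes
    content_type: ContentType
    source: str


class InMemoryScraper:
    # Hypothetical scraper: fetches bytes and wraps them, never interprets them.
    def __init__(self, pages: dict[str, bytes]) -> None:
        self._pages = pages

    def fetch(self, source: str) -> Content:
        return Content(self._pages[source], ContentType.HTML, source)


class TitleParser:
    # Hypothetical parser: sees only the Content it is handed, not the scraper.
    def parse(self, content: Content) -> str:
        text = content.raw.decode("utf-8")
        start = text.index("<title>") + len("<title>")
        return text[start:text.index("</title>", start)]


scraper = InMemoryScraper({"demo": b"<html><title>Hello</title></html>"})
title = TitleParser().parse(scraper.fetch("demo"))
```

Because the parser depends only on `Content`, the scraper could be swapped for one backed by HTTP or the file system without touching the parsing code.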

## Documentation Design

For those extending `OmniRead`, follow these "AI-Native" docstring principles:

### For Humans

- **Clear Contracts**: Explicitly state what a component is and is NOT responsible for.
- **Runnable Examples**: Include small, logical snippets in the package `__init__.py`.

### For LLMs

- **Structured Models**: Use dataclasses and enums for core data to ensure clean MCP JSON representation.
- **Type Safety**: All public APIs must be fully typed and have corresponding `.pyi` stubs.
- **Detailed Raises**: Include `: description` pairs in the `Raises` section to help agents handle errors gracefully.
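
A small sketch of what these principles might look like together. The names `Status`, `ParseResult`, and `count_words` are hypothetical and not part of `OmniRead`, and the JSON shape is only an illustration of how dataclasses and enums serialize, not a statement about any MCP tooling.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Status(str, Enum):
    # The str mixin lets json.dumps emit the member as its plain string value.
    OK = "ok"
    EMPTY = "empty"


@dataclass(frozen=True)
class ParseResult:
    status: Status
    word_count: int


def count_words(text: str) -> ParseResult:
    '''Count whitespace-separated words in text.

    Raises:
        TypeError: text is not a str.
    '''
    # Single-quoted docstring above so the snippet can sit inside a
    # module docstring delimited by triple double quotes.
    if not isinstance(text, str):
        raise TypeError("text must be a str")
    words = text.split()
    return ParseResult(Status.OK if words else Status.EMPTY, len(words))


payload = json.dumps(asdict(count_words("decoupled content engine")))
```

The `Raises` entry pairs the exception name with a short description, so a caller (human or agent) can tell which failure to expect and why.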
"""

from .core import Content, ContentType