using doc-forge (#1)

Reviewed-on: #1
Co-authored-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>
Co-committed-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>
2026-01-22 11:27:56 +00:00
parent 6808538485
commit 67a3074ab4
46 changed files with 4475 additions and 107 deletions
--- a/omniread/__init__.py
+++ b/omniread/__init__.py
@@ -94,6 +94,27 @@ PDF:
 - FileSystemPDFClient
 - PDFScraper
 - PDFParser
+
+## Core Philosophy
+
+`OmniRead` is designed as a **decoupled content engine**:
+
+1. **Separation of Concerns**: Scrapers *fetch*, Parsers *interpret*. Neither knows about the other.
+2. **Normalized Exchange**: All components communicate via the `Content` model, ensuring a consistent contract.
+3. **Format Agnosticism**: The core logic is independent of whether the input is HTML, PDF, or JSON.
+
+## Documentation Design
+
+If you are extending `OmniRead`, follow these "AI-Native" docstring principles:
+
+### For Humans
+- **Clear Contracts**: Explicitly state what a component is and is NOT responsible for.
+- **Runnable Examples**: Include small, logical snippets in the package `__init__.py`.
+
+### For LLMs
+- **Structured Models**: Use dataclasses and enums for core data to ensure a clean MCP JSON representation.
+- **Type Safety**: All public APIs must be fully typed and have corresponding `.pyi` stubs.
+- **Detailed Raises**: Include `: description` pairs in the `Raises` section to help agents handle errors gracefully.
 """

 from .core import Content, ContentType
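
Review note: the principles added to the docstring above (scrapers fetch, parsers interpret, and only `Content` crosses the boundary) could be illustrated with a snippet along the following lines. This is a minimal sketch, not OmniRead's actual API: the `Content` and `ContentType` definitions here are self-contained stand-ins with assumed fields, and `InMemoryScraper`, `JSONTitleParser`, and `read_title` are hypothetical names invented for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class ContentType(str, Enum):
    """Stand-in enum; the real OmniRead members may differ."""

    HTML = "text/html"
    JSON = "application/json"


@dataclass(frozen=True)
class Content:
    """Stand-in for the normalized exchange model (fields are assumed)."""

    raw: bytes
    source: str
    content_type: ContentType


class InMemoryScraper:
    """Hypothetical scraper: fetches bytes and never interprets them."""

    def __init__(self, pages: Mapping[str, bytes]) -> None:
        self._pages = pages

    def fetch(self, source: str) -> Content:
        """Return the raw payload for ``source``.

        Raises:
            KeyError: If ``source`` is not a known page.
        """
        return Content(
            raw=self._pages[source],
            source=source,
            content_type=ContentType.JSON,
        )


class JSONTitleParser:
    """Hypothetical parser: interprets ``Content`` and never fetches."""

    def parse(self, content: Content) -> str:
        """Extract the ``title`` field from a JSON payload.

        Raises:
            ValueError: If the content is not JSON or has no ``title``.
        """
        if content.content_type is not ContentType.JSON:
            raise ValueError(f"unsupported content type: {content.content_type.value}")
        try:
            return json.loads(content.raw)["title"]
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            raise ValueError("payload has no readable 'title'") from exc


def read_title(source: str, scraper: InMemoryScraper, parser: JSONTitleParser) -> str:
    """Wire the two halves together; only ``Content`` crosses the boundary."""
    return parser.parse(scraper.fetch(source))
```

In this sketch, the scraper and parser share no imports beyond the `Content` model, so either side can be swapped out without touching the other. The frozen dataclass and the string-valued enum follow the "Structured Models" guidance, and each `Raises` section pairs an exception type with a description.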