updated docstrings and added README.md

mcp docs
google styled doc
2026-03-08 17:59:56 +05:30 · 2026-03-08 00:41:28 +05:30 · 2026-03-08 00:29:25 +05:30 · 2026-02-21 16:47:08 +00:00 · 2026-01-22 11:27:56 +00:00 · 2026-01-09 15:55:54 +05:30
57 changed files with 5119 additions and 353 deletions
--- a/.drone.yml
+++ b/.drone.yml
@@ -0,0 +1,129 @@
+---
+kind: pipeline
+type: docker
+name: build-and-publish-pypi
+
+platform:
+  os: linux
+  arch: arm64
+
+workspace:
+  path: /drone/src
+
+steps:
+  - name: check-version
+    image: curlimages/curl:latest
+    environment:
+      PIP_REPO_URL:
+        from_secret: PIP_REPO_URL
+      PIP_USERNAME:
+        from_secret: PIP_USERNAME
+      PIP_PASSWORD:
+        from_secret: PIP_PASSWORD
+    commands:
+      - PACKAGE_NAME=$(grep -E '^name\s*=' pyproject.toml | head -1 | cut -d'"' -f2)
+      - VERSION=$(grep -E '^version\s*=' pyproject.toml | head -1 | cut -d'"' -f2)
+      - echo "🔍 Checking if $PACKAGE_NAME==$VERSION exists on $PIP_REPO_URL ..."
+      - |
+        if curl -fsSL -u "$PIP_USERNAME:$PIP_PASSWORD" "$PIP_REPO_URL/simple/$PACKAGE_NAME/" | grep -q "$VERSION"; then
+          echo "✅ $PACKAGE_NAME==$VERSION already exists — skipping build."
+          exit 78
+        else
+          echo "🆕 New version detected: $PACKAGE_NAME==$VERSION"
+        fi
+
+  - name: build-package
+    image: python:3.13-slim
+    commands:
+      - pip install --upgrade pip build
+      - echo "📦 Building Python package..."
+      - python -m build
+      - ls -l dist
+
+  - name: upload-to-private-pypi
+    image: python:3.13-slim
+    environment:
+      PIP_REPO_URL:
+        from_secret: PIP_REPO_URL
+      PIP_USERNAME:
+        from_secret: PIP_USERNAME
+      PIP_PASSWORD:
+        from_secret: PIP_PASSWORD
+    commands:
+      - pip install --upgrade twine
+      - echo "🚀 Uploading to private PyPI at $PIP_REPO_URL ..."
+      - |
+        twine upload \
+          --repository-url "$PIP_REPO_URL" \
+          -u "$PIP_USERNAME" \
+          -p "$PIP_PASSWORD" \
+          dist/*
+
+trigger:
+  event:
+    - tag
+
+---
+kind: pipeline
+type: docker
+name: backfill-pypi-from-tags
+
+platform:
+  os: linux
+  arch: arm64
+
+workspace:
+  path: /drone/src
+
+steps:
+  - name: fetch-tags
+    image: alpine/git
+    commands:
+      - git fetch --tags --force
+
+  - name: build-and-upload-missing
+    image: python:3.13-slim
+    environment:
+      PIP_REPO_URL:
+        from_secret: PIP_REPO_URL
+      PIP_USERNAME:
+        from_secret: PIP_USERNAME
+      PIP_PASSWORD:
+        from_secret: PIP_PASSWORD
+    commands:
+      - apt-get update
+      - apt-get install -y git curl ca-certificates
+      - pip install --upgrade pip build twine
+      - |
+        set -e
+
+        PACKAGE_NAME=$(grep -E '^name\s*=' pyproject.toml | head -1 | cut -d'"' -f2)
+        echo "📦 Package: $PACKAGE_NAME"
+
+        for TAG in $(git tag --sort=version:refname); do
+          VERSION="$TAG"
+          echo "🔁 Version: $VERSION"
+
+          if curl -fsSL -u "$PIP_USERNAME:$PIP_PASSWORD" \
+            "$PIP_REPO_URL/simple/$PACKAGE_NAME/" | grep -q "$VERSION"; then
+            echo "⏭️  Exists, skipping"
+            continue
+          fi
+
+          git checkout --force "$TAG"
+
+          echo "🏗️  Building $VERSION"
+          rm -rf dist
+          python -m build
+
+          echo "⬆️  Uploading $VERSION"
+          twine upload \
+            --repository-url "$PIP_REPO_URL" \
+            -u "$PIP_USERNAME" \
+            -p "$PIP_PASSWORD" \
+            dist/*
+        done
+
+trigger:
+  event:
+    - custom
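Review note on the "already published" check used by both pipelines: `grep -q "$VERSION"` matches anywhere on the index page, so a tag such as `1.0` would be reported as existing once `1.0.1` has been uploaded. A minimal Python sketch of a stricter check follows. The helper name `version_in_index` is hypothetical and not part of this change, and it assumes the index page lists distribution filenames of the form `<name>-<version>.tar.gz`, `<name>-<version>.zip`, or `<name>-<version>-<tags>.whl`.

```python
import re


def version_in_index(index_html: str, name: str, version: str) -> bool:
    """Return True only if the index page lists a file for exactly `version`."""
    # Filenames may spell the project name with "-", "_" or "." as separators.
    parts = re.split(r"[-_.]+", name)
    stem = "[-_.]+".join(re.escape(part) for part in parts)
    # After "<name>-<version>" require an sdist suffix or a wheel tag, so that
    # "1.0" does not match "1.0.1" the way a bare substring search would.
    pattern = rf'\b{stem}-{re.escape(version)}(?:\.tar\.gz|\.zip|-[^\s"<>]*\.whl)'
    return re.search(pattern, index_html, re.IGNORECASE) is not None


# Illustrative index fragment (made up for this sketch).
page = '<a href="omniread-1.0.1.tar.gz">omniread-1.0.1.tar.gz</a>'
found_exact = version_in_index(page, "omniread", "1.0.1")   # True
found_prefix = version_in_index(page, "omniread", "1.0")    # False
```

The same idea could be expressed in shell by anchoring the grep pattern on the filename; the Python form is only meant to make the matching rule explicit.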
--- a/.gitignore
+++ b/.gitignore
@@ -38,3 +38,4 @@ Thumbs.db
 *.swo
 *~
 *.tmp
+site
--- a/README.md
+++ b/README.md
@@ -0,0 +1,108 @@
+# omniread
+
+# Summary
+
+`OmniRead` — format-agnostic content acquisition and parsing framework.
+
+`OmniRead` provides a **cleanly layered architecture** for fetching, parsing,
+and normalizing content from heterogeneous sources such as HTML documents
+and PDF files.
+
+The library is structured around three core concepts:
+
+1.  **`Content`**: A canonical, format-agnostic container representing raw content
+    bytes and minimal contextual metadata.
+2.  **`Scrapers`**: Components responsible for *acquiring* raw content from a
+    source (HTTP, filesystem, object storage, etc.). `Scrapers` never interpret
+    content.
+3.  **`Parsers`**: Components responsible for *interpreting* acquired content and
+    converting it into structured, typed representations.
+
+`OmniRead` deliberately separates these responsibilities to ensure:
+
+-   Clear boundaries between IO and interpretation.
+-   Replaceable implementations per format.
+-   Predictable, testable behavior.
+
+# Installation
+
+Install `OmniRead` using pip:
+
+```bash
+pip install omniread
+```
+
+Install `OmniRead` using Poetry:
+```bash
+poetry add omniread
+```
+
+---
+
+## Quick start
+
+HTML example:
+
+```python
+from omniread import HTMLScraper, HTMLParser
+
+scraper = HTMLScraper()
+content = scraper.fetch("https://example.com")
+
+class TitleParser(HTMLParser[str]):
+    def parse(self) -> str:
+        return self._soup.title.string
+
+parser = TitleParser(content)
+title = parser.parse()
+```
+
+PDF example:
+```python
+from omniread import FileSystemPDFClient, PDFScraper, PDFParser
+from pathlib import Path
+
+client = FileSystemPDFClient()
+scraper = PDFScraper(client=client)
+content = scraper.fetch(Path("document.pdf"))
+
+class TextPDFParser(PDFParser[str]):
+    def parse(self) -> str:
+        # implement PDF text extraction
+        ...
+
+parser = TextPDFParser(content)
+result = parser.parse()
+```
+
+---
+
+# Public API
+
+The top-level `omniread` package re-exports the **recommended public entry points** of OmniRead.
+Consumers are encouraged to import from this namespace rather than from
+format-specific submodules directly, unless advanced customization is
+required.
+
+- `Content`: Canonical content model.
+- `ContentType`: Supported media types.
+- `HTMLScraper`: HTTP-based HTML acquisition.
+- `HTMLParser`: Base parser for HTML DOM interpretation.
+- `FileSystemPDFClient`: Local filesystem PDF access.
+- `PDFScraper`: PDF-specific content acquisition.
+- `PDFParser`: Base parser for PDF binary interpretation.
+
+---
+
+# Core Philosophy
+
+`OmniRead` is designed as a **decoupled content engine**:
+
+1. **Separation of Concerns**: Scrapers *fetch*, Parsers *interpret*. Neither
+   knows about the other.
+2. **Normalized Exchange**: All components communicate via the `Content` model,
+   ensuring a consistent contract.
+3. **Format Agnosticism**: The core logic is independent of whether the input
+   is HTML, PDF, or JSON.
+
+---
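Review note: the "Core Philosophy" section of the README can be illustrated without network or file access. The sketch below is self-contained and uses stand-in classes only. `Content` here is a reduced imitation of the documented model, and `InMemoryScraper` and `WordCountParser` are hypothetical names that do not exist in `omniread`. It is meant to show the hand-off between acquisition and interpretation, not the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    """Stand-in for the exchange model: raw bytes plus their origin."""
    raw: bytes
    source: str


class InMemoryScraper:
    """Hypothetical scraper: acquires bytes and never interprets them."""

    def __init__(self, store: dict[str, bytes]) -> None:
        self._store = store

    def fetch(self, source: str) -> Content:
        # Wrap the bytes untouched; decoding is the parser's job.
        return Content(raw=self._store[source], source=source)


class WordCountParser:
    """Hypothetical parser: interprets the Content it owns and performs no IO."""

    def __init__(self, content: Content) -> None:
        self.content = content

    def parse(self) -> int:
        return len(self.content.raw.decode("utf-8").split())


scraper = InMemoryScraper({"memo.txt": b"fetch then parse"})
content = scraper.fetch("memo.txt")
word_count = WordCountParser(content).parse()  # 3
```

Because the two classes share only `Content`, either side could be swapped (for example, an HTTP scraper in place of the in-memory one) without touching the other.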
--- a/docforge.nav.yml
+++ b/docforge.nav.yml
@@ -0,0 +1,16 @@
+home: index.md
+groups:
+  Core API:
+    - core/index.md
+    - core/content.md
+    - core/parser.md
+    - core/scraper.md
+  HTML Handling:
+    - html/index.md
+    - html/parser.md
+    - html/scraper.md
+  PDF Handling:
+    - pdf/index.md
+    - pdf/client.md
+    - pdf/parser.md
+    - pdf/scraper.md
--- a/docs/core/content.md
+++ b/docs/core/content.md
@@ -1 +1,3 @@
+# Content
+
 ::: omniread.core.content
--- a/docs/core/index.md
+++ b/docs/core/index.md
@@ -1 +1,3 @@
+# Core
+
 ::: omniread.core
--- a/docs/core/parser.md
+++ b/docs/core/parser.md
@@ -1 +1,3 @@
+# Parser
+
 ::: omniread.core.parser
--- a/docs/core/scraper.md
+++ b/docs/core/scraper.md
@@ -1 +1,3 @@
+# Scraper
+
 ::: omniread.core.scraper
--- a/docs/html/index.md
+++ b/docs/html/index.md
@@ -1 +1,3 @@
+# HTML
+
 ::: omniread.html
--- a/docs/html/parser.md
+++ b/docs/html/parser.md
@@ -1 +1,3 @@
+# Parser
+
 ::: omniread.html.parser
--- a/docs/html/scraper.md
+++ b/docs/html/scraper.md
@@ -1 +1,3 @@
+# Scraper
+
 ::: omniread.html.scraper
--- a/docs/index.md
+++ b/docs/index.md
@@ -1 +1,3 @@
+# omniread
+
 ::: omniread
--- a/docs/pdf/client.md
+++ b/docs/pdf/client.md
@@ -1 +1,3 @@
+# Client
+
 ::: omniread.pdf.client
--- a/docs/pdf/index.md
+++ b/docs/pdf/index.md
@@ -1 +1,3 @@
+# PDF
+
 ::: omniread.pdf
--- a/docs/pdf/parser.md
+++ b/docs/pdf/parser.md
@@ -1 +1,3 @@
+# Parser
+
 ::: omniread.pdf.parser
--- a/docs/pdf/scraper.md
+++ b/docs/pdf/scraper.md
@@ -1 +1,3 @@
+# Scraper
+
 ::: omniread.pdf.scraper
--- a/generate_docs.py
+++ b/generate_docs.py
@@ -1,46 +0,0 @@
-"""
-Programmatic MkDocs build script for OmniRead.
-
-This script builds (or serves) the documentation by invoking MkDocs
-*as a Python library*, not via shell commands.
-
-Requirements:
- mkdocs
- mkdocs-material
- mkdocstrings[python]
-
-Usage:
-    python generate_docs.py
-    python generate_docs.py --serve
-"""
-
-import sys
-from pathlib import Path
-
-from mkdocs.commands import build as mkdocs_build
-from mkdocs.commands import serve as mkdocs_serve
-from mkdocs.config import load_config
-
-
-PROJECT_ROOT = Path(__file__).resolve().parent
-MKDOCS_YML = PROJECT_ROOT / "mkdocs.yml"
-
-
-def main() -> None:
-    if not MKDOCS_YML.exists():
-        raise FileNotFoundError("mkdocs.yml not found at project root")
-
-    # Load MkDocs configuration programmatically
-    config = load_config(str(MKDOCS_YML))
-
-    # Decide mode
-    if "--serve" in sys.argv:
-        # Live-reload development server
-        mkdocs_serve.serve(config)
-    else:
-        # Static site build
-        mkdocs_build.build(config)
-
-
-if __name__ == "__main__":
-    main()
--- a/mcp_docs/index.json
+++ b/mcp_docs/index.json
@@ -0,0 +1,6 @@
+{
+  "project": "omniread",
+  "type": "docforge-model",
+  "modules_count": 12,
+  "source": "docforge"
+}
--- a/mcp_docs/modules/omniread.core.content.json
+++ b/mcp_docs/modules/omniread.core.content.json
@@ -0,0 +1,118 @@
+{
+  "module": "omniread.core.content",
+  "content": {
+    "path": "omniread.core.content",
+    "docstring": "# Summary\n\nCanonical content models for OmniRead.\n\nThis module defines the **format-agnostic content representation** used across\nall parsers and scrapers in OmniRead.\n\nThe models defined here represent *what* was extracted, not *how* it was\nretrieved or parsed. Format-specific behavior and metadata must not alter\nthe semantic meaning of these models.",
+    "objects": {
+      "Enum": {
+        "name": "Enum",
+        "kind": "alias",
+        "path": "omniread.core.content.Enum",
+        "signature": "<bound method Alias.signature of Alias('Enum', 'enum.Enum')>",
+        "docstring": null
+      },
+      "dataclass": {
+        "name": "dataclass",
+        "kind": "alias",
+        "path": "omniread.core.content.dataclass",
+        "signature": "<bound method Alias.signature of Alias('dataclass', 'dataclasses.dataclass')>",
+        "docstring": null
+      },
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.core.content.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "Mapping": {
+        "name": "Mapping",
+        "kind": "alias",
+        "path": "omniread.core.content.Mapping",
+        "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+        "docstring": null
+      },
+      "Optional": {
+        "name": "Optional",
+        "kind": "alias",
+        "path": "omniread.core.content.Optional",
+        "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+        "docstring": null
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.core.content.ContentType",
+        "signature": "<bound method Class.signature of Class('ContentType', 19, 42)>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.core.content.ContentType.HTML",
+            "signature": null,
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.core.content.ContentType.PDF",
+            "signature": null,
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.core.content.ContentType.JSON",
+            "signature": null,
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.core.content.ContentType.XML",
+            "signature": null,
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.core.content.Content",
+        "signature": "<bound method Class.signature of Class('Content', 45, 77)>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.core.content.Content.raw",
+            "signature": null,
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.core.content.Content.source",
+            "signature": null,
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.core.content.Content.content_type",
+            "signature": null,
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.core.content.Content.metadata",
+            "signature": null,
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      }
+    }
+  }
+}
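Review note: for readers of the generated JSON above, the documented shape of `omniread.core.content` corresponds roughly to the sketch below. Member and attribute names are taken from the JSON. The MIME strings, the field types, and the `None` defaults are assumptions made for illustration, since the generated docs record neither values nor annotations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping, Optional


class ContentType(str, Enum):
    """Member names follow the docs; the MIME strings are assumed."""
    HTML = "text/html"
    PDF = "application/pdf"
    JSON = "application/json"
    XML = "application/xml"


@dataclass
class Content:
    """Field names follow the docs; types and defaults are assumed."""
    raw: bytes                                    # raw content bytes
    source: str                                   # URL, file path, or logical name
    content_type: Optional[ContentType] = None    # MIME type, if known
    metadata: Optional[Mapping[str, Any]] = None  # implementation-defined hints


page = Content(
    raw=b"<html></html>",
    source="https://example.com",
    content_type=ContentType.HTML,
)
```

A parser could then route on `page.content_type` without inspecting `page.raw`, which matches the routing role the `ContentType` docstring describes.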
--- a/mcp_docs/modules/omniread.core.json
+++ b/mcp_docs/modules/omniread.core.json
@@ -0,0 +1,513 @@
+{
+  "module": "omniread.core",
+  "content": {
+    "path": "omniread.core",
+    "docstring": "# Summary\n\nCore domain contracts for OmniRead.\n\nThis package defines the **format-agnostic domain layer** of OmniRead.\nIt exposes canonical content models and abstract interfaces that are\nimplemented by format-specific modules (HTML, PDF, etc.).\n\nPublic exports from this package are considered **stable contracts** and\nare safe for downstream consumers to depend on.\n\nSubmodules:\n\n- `content`: Canonical content models and enums.\n- `parser`: Abstract parsing contracts.\n- `scraper`: Abstract scraping contracts.\n\nFormat-specific behavior must not be introduced at this layer.\n\n---\n\n# Public API\n\n- `Content`\n- `ContentType`\n\n---",
+    "objects": {
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.core.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.core.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.core.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.core.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.core.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.core.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.core.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.core.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.core.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.core.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "BaseParser": {
+        "name": "BaseParser",
+        "kind": "class",
+        "path": "omniread.core.BaseParser",
+        "signature": "<bound method Alias.signature of Alias('BaseParser', 'omniread.core.parser.BaseParser')>",
+        "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.core.BaseParser.supported_types",
+            "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.core.parser.BaseParser.supported_types')>",
+            "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+          },
+          "content": {
+            "name": "content",
+            "kind": "attribute",
+            "path": "omniread.core.BaseParser.content",
+            "signature": "<bound method Alias.signature of Alias('content', 'omniread.core.parser.BaseParser.content')>",
+            "docstring": null
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.core.BaseParser.parse",
+            "signature": "<bound method Alias.signature of Alias('parse', 'omniread.core.parser.BaseParser.parse')>",
+            "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+          },
+          "supports": {
+            "name": "supports",
+            "kind": "function",
+            "path": "omniread.core.BaseParser.supports",
+            "signature": "<bound method Alias.signature of Alias('supports', 'omniread.core.parser.BaseParser.supports')>",
+            "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+          }
+        }
+      },
+      "BaseScraper": {
+        "name": "BaseScraper",
+        "kind": "class",
+        "path": "omniread.core.BaseScraper",
+        "signature": "<bound method Alias.signature of Alias('BaseScraper', 'omniread.core.scraper.BaseScraper')>",
+        "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.core.BaseScraper.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.core.scraper.BaseScraper.fetch')>",
+            "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+          }
+        }
+      },
+      "content": {
+        "name": "content",
+        "kind": "module",
+        "path": "omniread.core.content",
+        "signature": null,
+        "docstring": "# Summary\n\nCanonical content models for OmniRead.\n\nThis module defines the **format-agnostic content representation** used across\nall parsers and scrapers in OmniRead.\n\nThe models defined here represent *what* was extracted, not *how* it was\nretrieved or parsed. Format-specific behavior and metadata must not alter\nthe semantic meaning of these models.",
+        "members": {
+          "Enum": {
+            "name": "Enum",
+            "kind": "alias",
+            "path": "omniread.core.content.Enum",
+            "signature": "<bound method Alias.signature of Alias('Enum', 'enum.Enum')>",
+            "docstring": null
+          },
+          "dataclass": {
+            "name": "dataclass",
+            "kind": "alias",
+            "path": "omniread.core.content.dataclass",
+            "signature": "<bound method Alias.signature of Alias('dataclass', 'dataclasses.dataclass')>",
+            "docstring": null
+          },
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.core.content.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "Mapping": {
+            "name": "Mapping",
+            "kind": "alias",
+            "path": "omniread.core.content.Mapping",
+            "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+            "docstring": null
+          },
+          "Optional": {
+            "name": "Optional",
+            "kind": "alias",
+            "path": "omniread.core.content.Optional",
+            "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+            "docstring": null
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.core.content.ContentType",
+            "signature": "<bound method Class.signature of Class('ContentType', 19, 42)>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.core.content.ContentType.HTML",
+                "signature": null,
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.core.content.ContentType.PDF",
+                "signature": null,
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.core.content.ContentType.JSON",
+                "signature": null,
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.core.content.ContentType.XML",
+                "signature": null,
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.core.content.Content",
+            "signature": "<bound method Class.signature of Class('Content', 45, 77)>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.core.content.Content.raw",
+                "signature": null,
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.core.content.Content.source",
+                "signature": null,
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.core.content.Content.content_type",
+                "signature": null,
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.core.content.Content.metadata",
+                "signature": null,
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          }
+        }
+      },
+      "parser": {
+        "name": "parser",
+        "kind": "module",
+        "path": "omniread.core.parser",
+        "signature": null,
+        "docstring": "# Summary\n\nAbstract parsing contracts for OmniRead.\n\nThis module defines the **format-agnostic parser interface** used to transform\nraw content into structured, typed representations.\n\nParsers are responsible for:\n\n- Interpreting a single `Content` instance\n- Validating compatibility with the content type\n- Producing a structured output suitable for downstream consumers\n\nParsers are not responsible for:\n\n- Fetching or acquiring content\n- Performing retries or error recovery\n- Managing multiple content sources",
+        "members": {
+          "ABC": {
+            "name": "ABC",
+            "kind": "alias",
+            "path": "omniread.core.parser.ABC",
+            "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+            "docstring": null
+          },
+          "abstractmethod": {
+            "name": "abstractmethod",
+            "kind": "alias",
+            "path": "omniread.core.parser.abstractmethod",
+            "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+            "docstring": null
+          },
+          "Generic": {
+            "name": "Generic",
+            "kind": "alias",
+            "path": "omniread.core.parser.Generic",
+            "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+            "docstring": null
+          },
+          "TypeVar": {
+            "name": "TypeVar",
+            "kind": "alias",
+            "path": "omniread.core.parser.TypeVar",
+            "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+            "docstring": null
+          },
+          "Set": {
+            "name": "Set",
+            "kind": "alias",
+            "path": "omniread.core.parser.Set",
+            "signature": "<bound method Alias.signature of Alias('Set', 'typing.Set')>",
+            "docstring": null
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.core.parser.Content",
+            "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.core.parser.Content.raw",
+                "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.core.parser.Content.source",
+                "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.core.parser.Content.content_type",
+                "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.core.parser.Content.metadata",
+                "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.core.parser.ContentType",
+            "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.core.parser.ContentType.HTML",
+                "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.core.parser.ContentType.PDF",
+                "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.core.parser.ContentType.JSON",
+                "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.core.parser.ContentType.XML",
+                "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "T": {
+            "name": "T",
+            "kind": "attribute",
+            "path": "omniread.core.parser.T",
+            "signature": null,
+            "docstring": null
+          },
+          "BaseParser": {
+            "name": "BaseParser",
+            "kind": "class",
+            "path": "omniread.core.parser.BaseParser",
+            "signature": "<bound method Class.signature of Class('BaseParser', 30, 111)>",
+            "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+            "members": {
+              "supported_types": {
+                "name": "supported_types",
+                "kind": "attribute",
+                "path": "omniread.core.parser.BaseParser.supported_types",
+                "signature": null,
+                "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+              },
+              "content": {
+                "name": "content",
+                "kind": "attribute",
+                "path": "omniread.core.parser.BaseParser.content",
+                "signature": null,
+                "docstring": null
+              },
+              "parse": {
+                "name": "parse",
+                "kind": "function",
+                "path": "omniread.core.parser.BaseParser.parse",
+                "signature": "<bound method Function.signature of Function('parse', 75, 94)>",
+                "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+              },
+              "supports": {
+                "name": "supports",
+                "kind": "function",
+                "path": "omniread.core.parser.BaseParser.supports",
+                "signature": "<bound method Function.signature of Function('supports', 96, 111)>",
+                "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+              }
+            }
+          }
+        }
+      },
+      "scraper": {
+        "name": "scraper",
+        "kind": "module",
+        "path": "omniread.core.scraper",
+        "signature": null,
+        "docstring": "# Summary\n\nAbstract scraping contracts for OmniRead.\n\nThis module defines the **format-agnostic scraper interface** responsible for\nacquiring raw content from external sources.\n\nScrapers are responsible for:\n\n- Locating and retrieving raw content bytes\n- Attaching minimal contextual metadata\n- Returning normalized `Content` objects\n\nScrapers are explicitly NOT responsible for:\n\n- Parsing or interpreting content\n- Inferring structure or semantics\n- Performing content-type specific processing\n\nAll interpretation must be delegated to parsers.",
+        "members": {
+          "ABC": {
+            "name": "ABC",
+            "kind": "alias",
+            "path": "omniread.core.scraper.ABC",
+            "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+            "docstring": null
+          },
+          "abstractmethod": {
+            "name": "abstractmethod",
+            "kind": "alias",
+            "path": "omniread.core.scraper.abstractmethod",
+            "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+            "docstring": null
+          },
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.core.scraper.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "Mapping": {
+            "name": "Mapping",
+            "kind": "alias",
+            "path": "omniread.core.scraper.Mapping",
+            "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+            "docstring": null
+          },
+          "Optional": {
+            "name": "Optional",
+            "kind": "alias",
+            "path": "omniread.core.scraper.Optional",
+            "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+            "docstring": null
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.core.scraper.Content",
+            "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.core.scraper.Content.raw",
+                "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.core.scraper.Content.source",
+                "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.core.scraper.Content.content_type",
+                "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.core.scraper.Content.metadata",
+                "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          },
+          "BaseScraper": {
+            "name": "BaseScraper",
+            "kind": "class",
+            "path": "omniread.core.scraper.BaseScraper",
+            "signature": "<bound method Class.signature of Class('BaseScraper', 30, 82)>",
+            "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, and retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.core.scraper.BaseScraper.fetch",
+                "signature": "<bound method Function.signature of Function('fetch', 51, 82)>",
+                "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.core.parser.json
+++ b/mcp_docs/modules/omniread.core.parser.json
@@ -0,0 +1,162 @@
+{
+  "module": "omniread.core.parser",
+  "content": {
+    "path": "omniread.core.parser",
+    "docstring": "# Summary\n\nAbstract parsing contracts for OmniRead.\n\nThis module defines the **format-agnostic parser interface** used to transform\nraw content into structured, typed representations.\n\nParsers are responsible for:\n\n- Interpreting a single `Content` instance\n- Validating compatibility with the content type\n- Producing a structured output suitable for downstream consumers\n\nParsers are not responsible for:\n\n- Fetching or acquiring content\n- Performing retries or error recovery\n- Managing multiple content sources",
+    "objects": {
+      "ABC": {
+        "name": "ABC",
+        "kind": "alias",
+        "path": "omniread.core.parser.ABC",
+        "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+        "docstring": null
+      },
+      "abstractmethod": {
+        "name": "abstractmethod",
+        "kind": "alias",
+        "path": "omniread.core.parser.abstractmethod",
+        "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+        "docstring": null
+      },
+      "Generic": {
+        "name": "Generic",
+        "kind": "alias",
+        "path": "omniread.core.parser.Generic",
+        "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+        "docstring": null
+      },
+      "TypeVar": {
+        "name": "TypeVar",
+        "kind": "alias",
+        "path": "omniread.core.parser.TypeVar",
+        "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+        "docstring": null
+      },
+      "Set": {
+        "name": "Set",
+        "kind": "alias",
+        "path": "omniread.core.parser.Set",
+        "signature": "<bound method Alias.signature of Alias('Set', 'typing.Set')>",
+        "docstring": null
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.core.parser.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.core.parser.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.core.parser.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.core.parser.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.core.parser.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.core.parser.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.core.parser.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.core.parser.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.core.parser.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.core.parser.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "T": {
+        "name": "T",
+        "kind": "attribute",
+        "path": "omniread.core.parser.T",
+        "signature": null,
+        "docstring": null
+      },
+      "BaseParser": {
+        "name": "BaseParser",
+        "kind": "class",
+        "path": "omniread.core.parser.BaseParser",
+        "signature": "<bound method Class.signature of Class('BaseParser', 30, 111)>",
+        "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.core.parser.BaseParser.supported_types",
+            "signature": null,
+            "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+          },
+          "content": {
+            "name": "content",
+            "kind": "attribute",
+            "path": "omniread.core.parser.BaseParser.content",
+            "signature": null,
+            "docstring": null
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.core.parser.BaseParser.parse",
+            "signature": "<bound method Function.signature of Function('parse', 75, 94)>",
+            "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+          },
+          "supports": {
+            "name": "supports",
+            "kind": "function",
+            "path": "omniread.core.parser.BaseParser.supports",
+            "signature": "<bound method Function.signature of Function('supports', 96, 111)>",
+            "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.core.scraper.json
+++ b/mcp_docs/modules/omniread.core.scraper.json
@@ -0,0 +1,97 @@
+{
+  "module": "omniread.core.scraper",
+  "content": {
+    "path": "omniread.core.scraper",
+    "docstring": "# Summary\n\nAbstract scraping contracts for OmniRead.\n\nThis module defines the **format-agnostic scraper interface** responsible for\nacquiring raw content from external sources.\n\nScrapers are responsible for:\n\n- Locating and retrieving raw content bytes\n- Attaching minimal contextual metadata\n- Returning normalized `Content` objects\n\nScrapers are explicitly NOT responsible for:\n\n- Parsing or interpreting content\n- Inferring structure or semantics\n- Performing content-type specific processing\n\nAll interpretation must be delegated to parsers.",
+    "objects": {
+      "ABC": {
+        "name": "ABC",
+        "kind": "alias",
+        "path": "omniread.core.scraper.ABC",
+        "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+        "docstring": null
+      },
+      "abstractmethod": {
+        "name": "abstractmethod",
+        "kind": "alias",
+        "path": "omniread.core.scraper.abstractmethod",
+        "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+        "docstring": null
+      },
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.core.scraper.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "Mapping": {
+        "name": "Mapping",
+        "kind": "alias",
+        "path": "omniread.core.scraper.Mapping",
+        "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+        "docstring": null
+      },
+      "Optional": {
+        "name": "Optional",
+        "kind": "alias",
+        "path": "omniread.core.scraper.Optional",
+        "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+        "docstring": null
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.core.scraper.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.core.scraper.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.core.scraper.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.core.scraper.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.core.scraper.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "BaseScraper": {
+        "name": "BaseScraper",
+        "kind": "class",
+        "path": "omniread.core.scraper.BaseScraper",
+        "signature": "<bound method Class.signature of Class('BaseScraper', 30, 82)>",
+        "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, and retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.core.scraper.BaseScraper.fetch",
+            "signature": "<bound method Function.signature of Function('fetch', 51, 82)>",
+            "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.html.json
+++ b/mcp_docs/modules/omniread.html.json
@@ -0,0 +1,488 @@
+{
+  "module": "omniread.html",
+  "content": {
+    "path": "omniread.html",
+    "docstring": "# Summary\n\nHTML format implementation for OmniRead.\n\nThis package provides **HTML-specific implementations** of the core OmniRead\ncontracts defined in `omniread.core`.\n\nIt includes:\n\n- HTML parsers that interpret HTML content.\n- HTML scrapers that retrieve HTML documents.\n\nKey characteristics:\n\n- Implements, but does not redefine, core contracts.\n- May contain HTML-specific behavior and edge-case handling.\n- Produces canonical content models defined in `omniread.core.content`.\n\nConsumers should depend on `omniread.core` interfaces wherever possible and\nuse this package only when HTML-specific behavior is required.\n\n---\n\n# Public API\n\n- `HTMLScraper`\n- `HTMLParser`\n\n---",
+    "objects": {
+      "HTMLScraper": {
+        "name": "HTMLScraper",
+        "kind": "class",
+        "path": "omniread.html.HTMLScraper",
+        "signature": "<bound method Alias.signature of Alias('HTMLScraper', 'omniread.html.scraper.HTMLScraper')>",
+        "docstring": "Base HTML scraper using `httpx`.\n\nNotes:\n    **Responsibilities:**\n\n        - This scraper retrieves HTML documents over HTTP(S) and returns\n          them as raw content wrapped in a `Content` object.\n        - Fetches raw bytes and metadata only.\n        - The scraper uses `httpx.Client` for HTTP requests, enforces an\n          HTML content type, and preserves HTTP response metadata.\n\n    **Constraints:**\n\n        - The scraper does not parse HTML, perform retries or backoff,\n          or handle non-HTML responses.",
+        "members": {
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.html.HTMLScraper.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.html.scraper.HTMLScraper.content_type')>",
+            "docstring": null
+          },
+          "validate_content_type": {
+            "name": "validate_content_type",
+            "kind": "function",
+            "path": "omniread.html.HTMLScraper.validate_content_type",
+            "signature": "<bound method Alias.signature of Alias('validate_content_type', 'omniread.html.scraper.HTMLScraper.validate_content_type')>",
+            "docstring": "Validate that the HTTP response contains HTML content.\n\nArgs:\n    response (httpx.Response):\n        HTTP response returned by `httpx`.\n\nRaises:\n    ValueError:\n        If the `Content-Type` header is missing or does not indicate HTML content."
+          },
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.html.HTMLScraper.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.html.scraper.HTMLScraper.fetch')>",
+            "docstring": "Fetch an HTML document from the given source.\n\nArgs:\n    source (str):\n        URL of the HTML document.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to be merged into the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw HTML bytes, source URL, HTML content type, and HTTP response metadata.\n\nRaises:\n    httpx.HTTPError:\n        If the HTTP request fails.\n    ValueError:\n        If the response is not valid HTML."
+          }
+        }
+      },
+      "HTMLParser": {
+        "name": "HTMLParser",
+        "kind": "class",
+        "path": "omniread.html.HTMLParser",
+        "signature": "<bound method Alias.signature of Alias('HTMLParser', 'omniread.html.parser.HTMLParser')>",
+        "docstring": "Base HTML parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class extends the core `BaseParser` with HTML-specific behavior,\n          including DOM parsing via BeautifulSoup and reusable extraction helpers.\n        - Provides reusable helpers for HTML extraction. Concrete parsers must\n          explicitly define the return type.\n\n    **Guarantees:**\n\n        - Accepts only HTML content.\n        - Owns a parsed BeautifulSoup DOM tree.\n        - Provides pure helper utilities for common HTML structures.\n\n    **Constraints:**\n\n        - Concrete subclasses must define the output type `T` and implement\n          the `parse()` method.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.html.HTMLParser.supported_types",
+            "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.html.parser.HTMLParser.supported_types')>",
+            "docstring": "Set of content types supported by this parser (HTML only)."
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.html.HTMLParser.parse",
+            "signature": "<bound method Alias.signature of Alias('parse', 'omniread.html.parser.HTMLParser.parse')>",
+            "docstring": "Fully parse the HTML content into structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the HTML DOM and return a\n          deterministic, structured output."
+          },
+          "parse_div": {
+            "name": "parse_div",
+            "kind": "function",
+            "path": "omniread.html.HTMLParser.parse_div",
+            "signature": "<bound method Alias.signature of Alias('parse_div', 'omniread.html.parser.HTMLParser.parse_div')>",
+            "docstring": "Extract normalized text from a `<div>` element.\n\nArgs:\n    div (Tag):\n        BeautifulSoup tag representing a `<div>`.\n    separator (str, optional):\n        String used to separate text nodes.\n\nReturns:\n    str:\n        Flattened, whitespace-normalized text content."
+          },
+          "parse_link": {
+            "name": "parse_link",
+            "kind": "function",
+            "path": "omniread.html.HTMLParser.parse_link",
+            "signature": "<bound method Alias.signature of Alias('parse_link', 'omniread.html.parser.HTMLParser.parse_link')>",
+            "docstring": "Extract the hyperlink reference from an `<a>` element.\n\nArgs:\n    a (Tag):\n        BeautifulSoup tag representing an anchor.\n\nReturns:\n    Optional[str]:\n        The value of the `href` attribute, or None if absent."
+          },
+          "parse_table": {
+            "name": "parse_table",
+            "kind": "function",
+            "path": "omniread.html.HTMLParser.parse_table",
+            "signature": "<bound method Alias.signature of Alias('parse_table', 'omniread.html.parser.HTMLParser.parse_table')>",
+            "docstring": "Parse an HTML table into a 2D list of strings.\n\nArgs:\n    table (Tag):\n        BeautifulSoup tag representing a `<table>`.\n\nReturns:\n    list[list[str]]:\n        A list of rows, where each row is a list of cell text values."
+          },
+          "parse_meta": {
+            "name": "parse_meta",
+            "kind": "function",
+            "path": "omniread.html.HTMLParser.parse_meta",
+            "signature": "<bound method Alias.signature of Alias('parse_meta', 'omniread.html.parser.HTMLParser.parse_meta')>",
+            "docstring": "Extract high-level metadata from the HTML document.\n\nReturns:\n    dict[str, Any]:\n        Dictionary containing extracted metadata.\n\nNotes:\n    **Responsibilities:**\n\n        - Extract high-level metadata from the HTML document.\n        - This includes the document title and `<meta>` tag name/property to\n          content mappings."
+          }
+        }
+      },
+      "parser": {
+        "name": "parser",
+        "kind": "module",
+        "path": "omniread.html.parser",
+        "signature": null,
+        "docstring": "# Summary\n\nHTML parser base implementations for OmniRead.\n\nThis module provides reusable HTML parsing utilities built on top of\nthe abstract parser contracts defined in `omniread.core.parser`.\n\nIt supplies:\n\n- Content-type enforcement for HTML inputs\n- BeautifulSoup initialization and lifecycle management\n- Common helper methods for extracting structured data from HTML elements\n\nConcrete parsers must subclass `HTMLParser` and implement the `parse()` method\nto return a structured representation appropriate for their use case.",
+        "members": {
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.html.parser.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "Generic": {
+            "name": "Generic",
+            "kind": "alias",
+            "path": "omniread.html.parser.Generic",
+            "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+            "docstring": null
+          },
+          "TypeVar": {
+            "name": "TypeVar",
+            "kind": "alias",
+            "path": "omniread.html.parser.TypeVar",
+            "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+            "docstring": null
+          },
+          "Optional": {
+            "name": "Optional",
+            "kind": "alias",
+            "path": "omniread.html.parser.Optional",
+            "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+            "docstring": null
+          },
+          "abstractmethod": {
+            "name": "abstractmethod",
+            "kind": "alias",
+            "path": "omniread.html.parser.abstractmethod",
+            "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+            "docstring": null
+          },
+          "BeautifulSoup": {
+            "name": "BeautifulSoup",
+            "kind": "alias",
+            "path": "omniread.html.parser.BeautifulSoup",
+            "signature": "<bound method Alias.signature of Alias('BeautifulSoup', 'bs4.BeautifulSoup')>",
+            "docstring": null
+          },
+          "Tag": {
+            "name": "Tag",
+            "kind": "alias",
+            "path": "omniread.html.parser.Tag",
+            "signature": "<bound method Alias.signature of Alias('Tag', 'bs4.Tag')>",
+            "docstring": null
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.html.parser.ContentType",
+            "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.html.parser.ContentType.HTML",
+                "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.html.parser.ContentType.PDF",
+                "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.html.parser.ContentType.JSON",
+                "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.html.parser.ContentType.XML",
+                "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.html.parser.Content",
+            "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.html.parser.Content.raw",
+                "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.html.parser.Content.source",
+                "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.html.parser.Content.content_type",
+                "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.html.parser.Content.metadata",
+                "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          },
+          "BaseParser": {
+            "name": "BaseParser",
+            "kind": "class",
+            "path": "omniread.html.parser.BaseParser",
+            "signature": "<bound method Alias.signature of Alias('BaseParser', 'omniread.core.parser.BaseParser')>",
+            "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+            "members": {
+              "supported_types": {
+                "name": "supported_types",
+                "kind": "attribute",
+                "path": "omniread.html.parser.BaseParser.supported_types",
+                "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.core.parser.BaseParser.supported_types')>",
+                "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+              },
+              "content": {
+                "name": "content",
+                "kind": "attribute",
+                "path": "omniread.html.parser.BaseParser.content",
+                "signature": "<bound method Alias.signature of Alias('content', 'omniread.core.parser.BaseParser.content')>",
+                "docstring": null
+              },
+              "parse": {
+                "name": "parse",
+                "kind": "function",
+                "path": "omniread.html.parser.BaseParser.parse",
+                "signature": "<bound method Alias.signature of Alias('parse', 'omniread.core.parser.BaseParser.parse')>",
+                "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+              },
+              "supports": {
+                "name": "supports",
+                "kind": "function",
+                "path": "omniread.html.parser.BaseParser.supports",
+                "signature": "<bound method Alias.signature of Alias('supports', 'omniread.core.parser.BaseParser.supports')>",
+                "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+              }
+            }
+          },
+          "T": {
+            "name": "T",
+            "kind": "attribute",
+            "path": "omniread.html.parser.T",
+            "signature": null,
+            "docstring": null
+          },
+          "HTMLParser": {
+            "name": "HTMLParser",
+            "kind": "class",
+            "path": "omniread.html.parser.HTMLParser",
+            "signature": "<bound method Class.signature of Class('HTMLParser', 30, 205)>",
+            "docstring": "Base HTML parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class extends the core `BaseParser` with HTML-specific behavior,\n          including DOM parsing via BeautifulSoup and reusable extraction helpers.\n        - Provides reusable helpers for HTML extraction. Concrete parsers must\n          explicitly define the return type.\n\n    **Guarantees:**\n\n        - Accepts only HTML content.\n        - Owns a parsed BeautifulSoup DOM tree.\n        - Provides pure helper utilities for common HTML structures.\n\n    **Constraints:**\n\n        - Concrete subclasses must define the output type `T` and implement\n          the `parse()` method.",
+            "members": {
+              "supported_types": {
+                "name": "supported_types",
+                "kind": "attribute",
+                "path": "omniread.html.parser.HTMLParser.supported_types",
+                "signature": null,
+                "docstring": "Set of content types supported by this parser (HTML only)."
+              },
+              "parse": {
+                "name": "parse",
+                "kind": "function",
+                "path": "omniread.html.parser.HTMLParser.parse",
+                "signature": "<bound method Function.signature of Function('parse', 81, 96)>",
+                "docstring": "Fully parse the HTML content into structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the HTML DOM and return a\n          deterministic, structured output."
+              },
+              "parse_div": {
+                "name": "parse_div",
+                "kind": "function",
+                "path": "omniread.html.parser.HTMLParser.parse_div",
+                "signature": "<bound method Function.signature of Function('parse_div', 102, 117)>",
+                "docstring": "Extract normalized text from a `<div>` element.\n\nArgs:\n    div (Tag):\n        BeautifulSoup tag representing a `<div>`.\n    separator (str, optional):\n        String used to separate text nodes.\n\nReturns:\n    str:\n        Flattened, whitespace-normalized text content."
+              },
+              "parse_link": {
+                "name": "parse_link",
+                "kind": "function",
+                "path": "omniread.html.parser.HTMLParser.parse_link",
+                "signature": "<bound method Function.signature of Function('parse_link', 119, 132)>",
+                "docstring": "Extract the hyperlink reference from an `<a>` element.\n\nArgs:\n    a (Tag):\n        BeautifulSoup tag representing an anchor.\n\nReturns:\n    Optional[str]:\n        The value of the `href` attribute, or None if absent."
+              },
+              "parse_table": {
+                "name": "parse_table",
+                "kind": "function",
+                "path": "omniread.html.parser.HTMLParser.parse_table",
+                "signature": "<bound method Function.signature of Function('parse_table', 134, 155)>",
+                "docstring": "Parse an HTML table into a 2D list of strings.\n\nArgs:\n    table (Tag):\n        BeautifulSoup tag representing a `<table>`.\n\nReturns:\n    list[list[str]]:\n        A list of rows, where each row is a list of cell text values."
+              },
+              "parse_meta": {
+                "name": "parse_meta",
+                "kind": "function",
+                "path": "omniread.html.parser.HTMLParser.parse_meta",
+                "signature": "<bound method Function.signature of Function('parse_meta', 177, 205)>",
+                "docstring": "Extract high-level metadata from the HTML document.\n\nReturns:\n    dict[str, Any]:\n        Dictionary containing extracted metadata.\n\nNotes:\n    **Responsibilities:**\n\n        - Extract high-level metadata from the HTML document.\n        - This includes the document title and `<meta>` tag name/property to\n          content mappings."
+              }
+            }
+          },
+          "list": {
+            "name": "list",
+            "kind": "alias",
+            "path": "omniread.html.parser.list",
+            "signature": "<bound method Alias.signature of Alias('list', 'typing.list')>",
+            "docstring": null
+          },
+          "dict": {
+            "name": "dict",
+            "kind": "alias",
+            "path": "omniread.html.parser.dict",
+            "signature": "<bound method Alias.signature of Alias('dict', 'typing.dict')>",
+            "docstring": null
+          }
+        }
+      },
+      "scraper": {
+        "name": "scraper",
+        "kind": "module",
+        "path": "omniread.html.scraper",
+        "signature": null,
+        "docstring": "# Summary\n\nHTML scraping implementation for OmniRead.\n\nThis module provides an HTTP-based scraper for retrieving HTML documents.\nIt implements the core `BaseScraper` contract using `httpx` as the transport\nlayer.\n\nThis scraper is responsible for:\n\n- Fetching raw HTML bytes over HTTP(S)\n- Validating response content type\n- Attaching HTTP metadata to the returned content\n\nThis scraper is not responsible for:\n\n- Parsing or interpreting HTML\n- Retrying failed requests\n- Managing crawl policies or rate limiting",
+        "members": {
+          "httpx": {
+            "name": "httpx",
+            "kind": "alias",
+            "path": "omniread.html.scraper.httpx",
+            "signature": "<bound method Alias.signature of Alias('httpx', 'httpx')>",
+            "docstring": null
+          },
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.html.scraper.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "Mapping": {
+            "name": "Mapping",
+            "kind": "alias",
+            "path": "omniread.html.scraper.Mapping",
+            "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+            "docstring": null
+          },
+          "Optional": {
+            "name": "Optional",
+            "kind": "alias",
+            "path": "omniread.html.scraper.Optional",
+            "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+            "docstring": null
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.html.scraper.Content",
+            "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.Content.raw",
+                "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.Content.source",
+                "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.Content.content_type",
+                "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.Content.metadata",
+                "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.html.scraper.ContentType",
+            "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.ContentType.HTML",
+                "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.ContentType.PDF",
+                "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.ContentType.JSON",
+                "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.ContentType.XML",
+                "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "BaseScraper": {
+            "name": "BaseScraper",
+            "kind": "class",
+            "path": "omniread.html.scraper.BaseScraper",
+            "signature": "<bound method Alias.signature of Alias('BaseScraper', 'omniread.core.scraper.BaseScraper')>",
+            "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, and retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.html.scraper.BaseScraper.fetch",
+                "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.core.scraper.BaseScraper.fetch')>",
+                "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+              }
+            }
+          },
+          "HTMLScraper": {
+            "name": "HTMLScraper",
+            "kind": "class",
+            "path": "omniread.html.scraper.HTMLScraper",
+            "signature": "<bound method Class.signature of Class('HTMLScraper', 30, 143)>",
+            "docstring": "Base HTML scraper using `httpx`.\n\nNotes:\n    **Responsibilities:**\n\n        - This scraper retrieves HTML documents over HTTP(S) and returns\n          them as raw content wrapped in a `Content` object.\n        - Fetches raw bytes and metadata only.\n        - The scraper uses `httpx.Client` for HTTP requests, enforces an\n          HTML content type, and preserves HTTP response metadata.\n\n    **Constraints:**\n\n        - The scraper does not parse HTML, perform retries or backoff,\n          or handle non-HTML responses.",
+            "members": {
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.html.scraper.HTMLScraper.content_type",
+                "signature": null,
+                "docstring": null
+              },
+              "validate_content_type": {
+                "name": "validate_content_type",
+                "kind": "function",
+                "path": "omniread.html.scraper.HTMLScraper.validate_content_type",
+                "signature": "<bound method Function.signature of Function('validate_content_type', 78, 102)>",
+                "docstring": "Validate that the HTTP response contains HTML content.\n\nArgs:\n    response (httpx.Response):\n        HTTP response returned by `httpx`.\n\nRaises:\n    ValueError:\n        If the `Content-Type` header is missing or does not indicate HTML content."
+              },
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.html.scraper.HTMLScraper.fetch",
+                "signature": "<bound method Function.signature of Function('fetch', 104, 143)>",
+                "docstring": "Fetch an HTML document from the given source.\n\nArgs:\n    source (str):\n        URL of the HTML document.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to be merged into the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw HTML bytes, source URL, HTML content type, and HTTP response metadata.\n\nRaises:\n    httpx.HTTPError:\n        If the HTTP request fails.\n    ValueError:\n        If the response is not valid HTML."
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.html.parser.json
+++ b/mcp_docs/modules/omniread.html.parser.json
@@ -0,0 +1,241 @@
+{
+  "module": "omniread.html.parser",
+  "content": {
+    "path": "omniread.html.parser",
+    "docstring": "# Summary\n\nHTML parser base implementations for OmniRead.\n\nThis module provides reusable HTML parsing utilities built on top of\nthe abstract parser contracts defined in `omniread.core.parser`.\n\nIt supplies:\n\n- Content-type enforcement for HTML inputs\n- BeautifulSoup initialization and lifecycle management\n- Common helper methods for extracting structured data from HTML elements\n\nConcrete parsers must subclass `HTMLParser` and implement the `parse()` method\nto return a structured representation appropriate for their use case.",
+    "objects": {
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.html.parser.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "Generic": {
+        "name": "Generic",
+        "kind": "alias",
+        "path": "omniread.html.parser.Generic",
+        "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+        "docstring": null
+      },
+      "TypeVar": {
+        "name": "TypeVar",
+        "kind": "alias",
+        "path": "omniread.html.parser.TypeVar",
+        "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+        "docstring": null
+      },
+      "Optional": {
+        "name": "Optional",
+        "kind": "alias",
+        "path": "omniread.html.parser.Optional",
+        "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+        "docstring": null
+      },
+      "abstractmethod": {
+        "name": "abstractmethod",
+        "kind": "alias",
+        "path": "omniread.html.parser.abstractmethod",
+        "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+        "docstring": null
+      },
+      "BeautifulSoup": {
+        "name": "BeautifulSoup",
+        "kind": "alias",
+        "path": "omniread.html.parser.BeautifulSoup",
+        "signature": "<bound method Alias.signature of Alias('BeautifulSoup', 'bs4.BeautifulSoup')>",
+        "docstring": null
+      },
+      "Tag": {
+        "name": "Tag",
+        "kind": "alias",
+        "path": "omniread.html.parser.Tag",
+        "signature": "<bound method Alias.signature of Alias('Tag', 'bs4.Tag')>",
+        "docstring": null
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.html.parser.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.html.parser.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.html.parser.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.html.parser.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.html.parser.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.html.parser.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.html.parser.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.html.parser.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.html.parser.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.html.parser.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "BaseParser": {
+        "name": "BaseParser",
+        "kind": "class",
+        "path": "omniread.html.parser.BaseParser",
+        "signature": "<bound method Alias.signature of Alias('BaseParser', 'omniread.core.parser.BaseParser')>",
+        "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.html.parser.BaseParser.supported_types",
+            "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.core.parser.BaseParser.supported_types')>",
+            "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+          },
+          "content": {
+            "name": "content",
+            "kind": "attribute",
+            "path": "omniread.html.parser.BaseParser.content",
+            "signature": "<bound method Alias.signature of Alias('content', 'omniread.core.parser.BaseParser.content')>",
+            "docstring": null
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.html.parser.BaseParser.parse",
+            "signature": "<bound method Alias.signature of Alias('parse', 'omniread.core.parser.BaseParser.parse')>",
+            "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+          },
+          "supports": {
+            "name": "supports",
+            "kind": "function",
+            "path": "omniread.html.parser.BaseParser.supports",
+            "signature": "<bound method Alias.signature of Alias('supports', 'omniread.core.parser.BaseParser.supports')>",
+            "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+          }
+        }
+      },
+      "T": {
+        "name": "T",
+        "kind": "attribute",
+        "path": "omniread.html.parser.T",
+        "signature": null,
+        "docstring": null
+      },
+      "HTMLParser": {
+        "name": "HTMLParser",
+        "kind": "class",
+        "path": "omniread.html.parser.HTMLParser",
+        "signature": "<bound method Class.signature of Class('HTMLParser', 30, 205)>",
+        "docstring": "Base HTML parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class extends the core `BaseParser` with HTML-specific behavior,\n          including DOM parsing via BeautifulSoup and reusable extraction helpers.\n        - Provides reusable helpers for HTML extraction. Concrete parsers must\n          explicitly define the return type.\n\n    **Guarantees:**\n\n        - Accepts only HTML content.\n        - Owns a parsed BeautifulSoup DOM tree.\n        - Provides pure helper utilities for common HTML structures.\n\n    **Constraints:**\n\n        - Concrete subclasses must define the output type `T` and implement\n          the `parse()` method.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.html.parser.HTMLParser.supported_types",
+            "signature": null,
+            "docstring": "Set of content types supported by this parser (HTML only)."
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.html.parser.HTMLParser.parse",
+            "signature": "<bound method Function.signature of Function('parse', 81, 96)>",
+            "docstring": "Fully parse the HTML content into structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the HTML DOM and return a\n          deterministic, structured output."
+          },
+          "parse_div": {
+            "name": "parse_div",
+            "kind": "function",
+            "path": "omniread.html.parser.HTMLParser.parse_div",
+            "signature": "<bound method Function.signature of Function('parse_div', 102, 117)>",
+            "docstring": "Extract normalized text from a `<div>` element.\n\nArgs:\n    div (Tag):\n        BeautifulSoup tag representing a `<div>`.\n    separator (str, optional):\n        String used to separate text nodes.\n\nReturns:\n    str:\n        Flattened, whitespace-normalized text content."
+          },
+          "parse_link": {
+            "name": "parse_link",
+            "kind": "function",
+            "path": "omniread.html.parser.HTMLParser.parse_link",
+            "signature": "<bound method Function.signature of Function('parse_link', 119, 132)>",
+            "docstring": "Extract the hyperlink reference from an `<a>` element.\n\nArgs:\n    a (Tag):\n        BeautifulSoup tag representing an anchor.\n\nReturns:\n    Optional[str]:\n        The value of the `href` attribute, or None if absent."
+          },
+          "parse_table": {
+            "name": "parse_table",
+            "kind": "function",
+            "path": "omniread.html.parser.HTMLParser.parse_table",
+            "signature": "<bound method Function.signature of Function('parse_table', 134, 155)>",
+            "docstring": "Parse an HTML table into a 2D list of strings.\n\nArgs:\n    table (Tag):\n        BeautifulSoup tag representing a `<table>`.\n\nReturns:\n    list[list[str]]:\n        A list of rows, where each row is a list of cell text values."
+          },
+          "parse_meta": {
+            "name": "parse_meta",
+            "kind": "function",
+            "path": "omniread.html.parser.HTMLParser.parse_meta",
+            "signature": "<bound method Function.signature of Function('parse_meta', 177, 205)>",
+            "docstring": "Extract high-level metadata from the HTML document.\n\nReturns:\n    dict[str, Any]:\n        Dictionary containing extracted metadata.\n\nNotes:\n    **Responsibilities:**\n\n        - Extract high-level metadata from the HTML document.\n        - This includes the document title and `<meta>` tag name/property to\n          content mappings."
+          }
+        }
+      },
+      "list": {
+        "name": "list",
+        "kind": "alias",
+        "path": "omniread.html.parser.list",
+        "signature": "<bound method Alias.signature of Alias('list', 'typing.list')>",
+        "docstring": null
+      },
+      "dict": {
+        "name": "dict",
+        "kind": "alias",
+        "path": "omniread.html.parser.dict",
+        "signature": "<bound method Alias.signature of Alias('dict', 'typing.dict')>",
+        "docstring": null
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.html.scraper.json
+++ b/mcp_docs/modules/omniread.html.scraper.json
@@ -0,0 +1,157 @@
+{
+  "module": "omniread.html.scraper",
+  "content": {
+    "path": "omniread.html.scraper",
+    "docstring": "# Summary\n\nHTML scraping implementation for OmniRead.\n\nThis module provides an HTTP-based scraper for retrieving HTML documents.\nIt implements the core `BaseScraper` contract using `httpx` as the transport\nlayer.\n\nThis scraper is responsible for:\n\n- Fetching raw HTML bytes over HTTP(S)\n- Validating response content type\n- Attaching HTTP metadata to the returned content\n\nThis scraper is not responsible for:\n\n- Parsing or interpreting HTML\n- Retrying failed requests\n- Managing crawl policies or rate limiting",
+    "objects": {
+      "httpx": {
+        "name": "httpx",
+        "kind": "alias",
+        "path": "omniread.html.scraper.httpx",
+        "signature": "<bound method Alias.signature of Alias('httpx', 'httpx')>",
+        "docstring": null
+      },
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.html.scraper.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "Mapping": {
+        "name": "Mapping",
+        "kind": "alias",
+        "path": "omniread.html.scraper.Mapping",
+        "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+        "docstring": null
+      },
+      "Optional": {
+        "name": "Optional",
+        "kind": "alias",
+        "path": "omniread.html.scraper.Optional",
+        "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+        "docstring": null
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.html.scraper.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.html.scraper.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "BaseScraper": {
+        "name": "BaseScraper",
+        "kind": "class",
+        "path": "omniread.html.scraper.BaseScraper",
+        "signature": "<bound method Alias.signature of Alias('BaseScraper', 'omniread.core.scraper.BaseScraper')>",
+        "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, and retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.html.scraper.BaseScraper.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.core.scraper.BaseScraper.fetch')>",
+            "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+          }
+        }
+      },
+      "HTMLScraper": {
+        "name": "HTMLScraper",
+        "kind": "class",
+        "path": "omniread.html.scraper.HTMLScraper",
+        "signature": "<bound method Class.signature of Class('HTMLScraper', 30, 143)>",
+        "docstring": "Base HTML scraper using `httpx`.\n\nNotes:\n    **Responsibilities:**\n\n        - This scraper retrieves HTML documents over HTTP(S) and returns\n          them as raw content wrapped in a `Content` object.\n        - Fetches raw bytes and metadata only.\n        - The scraper uses `httpx.Client` for HTTP requests, enforces an\n          HTML content type, and preserves HTTP response metadata.\n\n    **Constraints:**\n\n        - The scraper does not parse HTML, perform retries or backoff,\n          or handle non-HTML responses.",
+        "members": {
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.html.scraper.HTMLScraper.content_type",
+            "signature": null,
+            "docstring": null
+          },
+          "validate_content_type": {
+            "name": "validate_content_type",
+            "kind": "function",
+            "path": "omniread.html.scraper.HTMLScraper.validate_content_type",
+            "signature": "<bound method Function.signature of Function('validate_content_type', 78, 102)>",
+            "docstring": "Validate that the HTTP response contains HTML content.\n\nArgs:\n    response (httpx.Response):\n        HTTP response returned by `httpx`.\n\nRaises:\n    ValueError:\n        If the `Content-Type` header is missing or does not indicate HTML content."
+          },
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.html.scraper.HTMLScraper.fetch",
+            "signature": "<bound method Function.signature of Function('fetch', 104, 143)>",
+            "docstring": "Fetch an HTML document from the given source.\n\nArgs:\n    source (str):\n        URL of the HTML document.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to be merged into the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw HTML bytes, source URL, HTML content type, and HTTP response metadata.\n\nRaises:\n    httpx.HTTPError:\n        If the HTTP request fails.\n    ValueError:\n        If the response is not valid HTML."
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.json
+++ b/mcp_docs/modules/omniread.json
--- a/mcp_docs/modules/omniread.pdf.client.json
+++ b/mcp_docs/modules/omniread.pdf.client.json
@@ -0,0 +1,69 @@
+{
+  "module": "omniread.pdf.client",
+  "content": {
+    "path": "omniread.pdf.client",
+    "docstring": "# Summary\n\nPDF client abstractions for OmniRead.\n\nThis module defines the **client layer** responsible for retrieving raw PDF\nbytes from a concrete backing store.\n\nClients provide low-level access to PDF binaries and are intentionally\ndecoupled from scraping and parsing logic. They do not perform validation,\ninterpretation, or content extraction.\n\nTypical backing stores include:\n\n- Local filesystems\n- Object storage (S3, GCS, etc.)\n- Network file systems",
+    "objects": {
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.pdf.client.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "ABC": {
+        "name": "ABC",
+        "kind": "alias",
+        "path": "omniread.pdf.client.ABC",
+        "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+        "docstring": null
+      },
+      "abstractmethod": {
+        "name": "abstractmethod",
+        "kind": "alias",
+        "path": "omniread.pdf.client.abstractmethod",
+        "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+        "docstring": null
+      },
+      "Path": {
+        "name": "Path",
+        "kind": "alias",
+        "path": "omniread.pdf.client.Path",
+        "signature": "<bound method Alias.signature of Alias('Path', 'pathlib.Path')>",
+        "docstring": null
+      },
+      "BasePDFClient": {
+        "name": "BasePDFClient",
+        "kind": "class",
+        "path": "omniread.pdf.client.BasePDFClient",
+        "signature": "<bound method Class.signature of Class('BasePDFClient', 25, 57)>",
+        "docstring": "Abstract client responsible for retrieving PDF bytes.\n\nRetrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must accept a source identifier appropriate to\n          the backing store.\n        - Implementations must return the full PDF binary payload.\n        - Implementations must raise retrieval-specific errors on failure.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.client.BasePDFClient.fetch",
+            "signature": "<bound method Function.signature of Function('fetch', 40, 57)>",
+            "docstring": "Fetch raw PDF bytes from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF location, such as a file path, object storage key, or remote reference.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    Exception:\n        Retrieval-specific errors defined by the implementation."
+          }
+        }
+      },
+      "FileSystemPDFClient": {
+        "name": "FileSystemPDFClient",
+        "kind": "class",
+        "path": "omniread.pdf.client.FileSystemPDFClient",
+        "signature": "<bound method Class.signature of Class('FileSystemPDFClient', 60, 96)>",
+        "docstring": "PDF client that reads from the local filesystem.\n\nNotes:\n    **Guarantees:**\n\n        - This client reads PDF files directly from the disk and returns\n          their raw binary contents.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.client.FileSystemPDFClient.fetch",
+            "signature": "<bound method Function.signature of Function('fetch', 71, 96)>",
+            "docstring": "Read a PDF file from the local filesystem.\n\nArgs:\n    path (Path):\n        Filesystem path to the PDF file.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    FileNotFoundError:\n        If the path does not exist.\n    ValueError:\n        If the path exists but is not a file."
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.pdf.json
+++ b/mcp_docs/modules/omniread.pdf.json
@@ -0,0 +1,419 @@
+{
+  "module": "omniread.pdf",
+  "content": {
+    "path": "omniread.pdf",
+    "docstring": "# Summary\n\nPDF format implementation for OmniRead.\n\nThis package provides **PDF-specific implementations** of the core OmniRead\ncontracts defined in `omniread.core`.\n\nUnlike HTML, PDF handling requires an explicit client layer for document\naccess. This package therefore includes:\n\n- PDF clients for acquiring raw PDF data.\n- PDF scrapers that coordinate client access.\n- PDF parsers that extract structured content from PDF binaries.\n\nPublic exports from this package represent the supported PDF pipeline\nand are safe for consumers to import directly when working with PDFs.\n\n---\n\n# Public API\n\n- `FileSystemPDFClient`\n- `PDFScraper`\n- `PDFParser`\n\n---",
+    "objects": {
+      "FileSystemPDFClient": {
+        "name": "FileSystemPDFClient",
+        "kind": "class",
+        "path": "omniread.pdf.FileSystemPDFClient",
+        "signature": "<bound method Alias.signature of Alias('FileSystemPDFClient', 'omniread.pdf.client.FileSystemPDFClient')>",
+        "docstring": "PDF client that reads from the local filesystem.\n\nNotes:\n    **Guarantees:**\n\n        - This client reads PDF files directly from the disk and returns\n          their raw binary contents.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.FileSystemPDFClient.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.pdf.client.FileSystemPDFClient.fetch')>",
+            "docstring": "Read a PDF file from the local filesystem.\n\nArgs:\n    path (Path):\n        Filesystem path to the PDF file.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    FileNotFoundError:\n        If the path does not exist.\n    ValueError:\n        If the path exists but is not a file."
+          }
+        }
+      },
+      "PDFScraper": {
+        "name": "PDFScraper",
+        "kind": "class",
+        "path": "omniread.pdf.PDFScraper",
+        "signature": "<bound method Alias.signature of Alias('PDFScraper', 'omniread.pdf.scraper.PDFScraper')>",
+        "docstring": "Scraper for PDF sources.\n\nNotes:\n    **Responsibilities:**\n\n        - Delegates byte retrieval to a PDF client and normalizes output\n          into `Content`.\n        - Preserves caller-provided metadata.\n\n    **Constraints:**\n\n        - The scraper does not perform parsing or interpretation.\n        - Does not assume a specific storage backend.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.PDFScraper.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.pdf.scraper.PDFScraper.fetch')>",
+            "docstring": "Fetch a PDF document from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF source as understood by the configured PDF client.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to attach to the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw PDF bytes, source identifier, PDF content type, and optional metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors raised by the PDF client."
+          }
+        }
+      },
+      "PDFParser": {
+        "name": "PDFParser",
+        "kind": "class",
+        "path": "omniread.pdf.PDFParser",
+        "signature": "<bound method Alias.signature of Alias('PDFParser', 'omniread.pdf.parser.PDFParser')>",
+        "docstring": "Base PDF parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class enforces PDF content-type compatibility and provides\n          the extension point for implementing concrete PDF parsing strategies.\n\n    **Constraints:**\n\n        - Concrete implementations must define the output type `T` and\n          implement the `parse()` method.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.pdf.PDFParser.supported_types",
+            "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.pdf.parser.PDFParser.supported_types')>",
+            "docstring": "Set of content types supported by this parser (PDF only)."
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.pdf.PDFParser.parse",
+            "signature": "<bound method Alias.signature of Alias('parse', 'omniread.pdf.parser.PDFParser.parse')>",
+            "docstring": "Parse PDF content into a structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the PDF binary payload and\n          return a deterministic, structured output."
+          }
+        }
+      },
+      "client": {
+        "name": "client",
+        "kind": "module",
+        "path": "omniread.pdf.client",
+        "signature": null,
+        "docstring": "# Summary\n\nPDF client abstractions for OmniRead.\n\nThis module defines the **client layer** responsible for retrieving raw PDF\nbytes from a concrete backing store.\n\nClients provide low-level access to PDF binaries and are intentionally\ndecoupled from scraping and parsing logic. They do not perform validation,\ninterpretation, or content extraction.\n\nTypical backing stores include:\n\n- Local filesystems\n- Object storage (S3, GCS, etc.)\n- Network file systems",
+        "members": {
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.pdf.client.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "ABC": {
+            "name": "ABC",
+            "kind": "alias",
+            "path": "omniread.pdf.client.ABC",
+            "signature": "<bound method Alias.signature of Alias('ABC', 'abc.ABC')>",
+            "docstring": null
+          },
+          "abstractmethod": {
+            "name": "abstractmethod",
+            "kind": "alias",
+            "path": "omniread.pdf.client.abstractmethod",
+            "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+            "docstring": null
+          },
+          "Path": {
+            "name": "Path",
+            "kind": "alias",
+            "path": "omniread.pdf.client.Path",
+            "signature": "<bound method Alias.signature of Alias('Path', 'pathlib.Path')>",
+            "docstring": null
+          },
+          "BasePDFClient": {
+            "name": "BasePDFClient",
+            "kind": "class",
+            "path": "omniread.pdf.client.BasePDFClient",
+            "signature": "<bound method Class.signature of Class('BasePDFClient', 25, 57)>",
+            "docstring": "Abstract client responsible for retrieving PDF bytes.\n\nRetrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must accept a source identifier appropriate to\n          the backing store.\n        - Implementations must return the full PDF binary payload.\n        - Implementations must raise retrieval-specific errors on failure.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.pdf.client.BasePDFClient.fetch",
+                "signature": "<bound method Function.signature of Function('fetch', 40, 57)>",
+                "docstring": "Fetch raw PDF bytes from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF location, such as a file path, object storage key, or remote reference.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    Exception:\n        Retrieval-specific errors defined by the implementation."
+              }
+            }
+          },
+          "FileSystemPDFClient": {
+            "name": "FileSystemPDFClient",
+            "kind": "class",
+            "path": "omniread.pdf.client.FileSystemPDFClient",
+            "signature": "<bound method Class.signature of Class('FileSystemPDFClient', 60, 96)>",
+            "docstring": "PDF client that reads from the local filesystem.\n\nNotes:\n    **Guarantees:**\n\n        - This client reads PDF files directly from the disk and returns\n          their raw binary contents.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.pdf.client.FileSystemPDFClient.fetch",
+                "signature": "<bound method Function.signature of Function('fetch', 71, 96)>",
+                "docstring": "Read a PDF file from the local filesystem.\n\nArgs:\n    path (Path):\n        Filesystem path to the PDF file.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    FileNotFoundError:\n        If the path does not exist.\n    ValueError:\n        If the path exists but is not a file."
+              }
+            }
+          }
+        }
+      },
+      "parser": {
+        "name": "parser",
+        "kind": "module",
+        "path": "omniread.pdf.parser",
+        "signature": null,
+        "docstring": "# Summary\n\nPDF parser base implementations for OmniRead.\n\nThis module defines the **PDF-specific parser contract**, extending the\nformat-agnostic `BaseParser` with constraints appropriate for PDF content.\n\nPDF parsers are responsible for interpreting binary PDF data and producing\nstructured representations suitable for downstream consumption.",
+        "members": {
+          "Generic": {
+            "name": "Generic",
+            "kind": "alias",
+            "path": "omniread.pdf.parser.Generic",
+            "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+            "docstring": null
+          },
+          "TypeVar": {
+            "name": "TypeVar",
+            "kind": "alias",
+            "path": "omniread.pdf.parser.TypeVar",
+            "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+            "docstring": null
+          },
+          "abstractmethod": {
+            "name": "abstractmethod",
+            "kind": "alias",
+            "path": "omniread.pdf.parser.abstractmethod",
+            "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+            "docstring": null
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.pdf.parser.ContentType",
+            "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.ContentType.HTML",
+                "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.ContentType.PDF",
+                "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.ContentType.JSON",
+                "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.ContentType.XML",
+                "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "BaseParser": {
+            "name": "BaseParser",
+            "kind": "class",
+            "path": "omniread.pdf.parser.BaseParser",
+            "signature": "<bound method Alias.signature of Alias('BaseParser', 'omniread.core.parser.BaseParser')>",
+            "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+            "members": {
+              "supported_types": {
+                "name": "supported_types",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.BaseParser.supported_types",
+                "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.core.parser.BaseParser.supported_types')>",
+                "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+              },
+              "content": {
+                "name": "content",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.BaseParser.content",
+                "signature": "<bound method Alias.signature of Alias('content', 'omniread.core.parser.BaseParser.content')>",
+                "docstring": null
+              },
+              "parse": {
+                "name": "parse",
+                "kind": "function",
+                "path": "omniread.pdf.parser.BaseParser.parse",
+                "signature": "<bound method Alias.signature of Alias('parse', 'omniread.core.parser.BaseParser.parse')>",
+                "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+              },
+              "supports": {
+                "name": "supports",
+                "kind": "function",
+                "path": "omniread.pdf.parser.BaseParser.supports",
+                "signature": "<bound method Alias.signature of Alias('supports', 'omniread.core.parser.BaseParser.supports')>",
+                "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+              }
+            }
+          },
+          "T": {
+            "name": "T",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.T",
+            "signature": null,
+            "docstring": null
+          },
+          "PDFParser": {
+            "name": "PDFParser",
+            "kind": "class",
+            "path": "omniread.pdf.parser.PDFParser",
+            "signature": "<bound method Class.signature of Class('PDFParser', 22, 62)>",
+            "docstring": "Base PDF parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class enforces PDF content-type compatibility and provides\n          the extension point for implementing concrete PDF parsing strategies.\n\n    **Constraints:**\n\n        - Concrete implementations must define the output type `T` and\n          implement the `parse()` method.",
+            "members": {
+              "supported_types": {
+                "name": "supported_types",
+                "kind": "attribute",
+                "path": "omniread.pdf.parser.PDFParser.supported_types",
+                "signature": null,
+                "docstring": "Set of content types supported by this parser (PDF only)."
+              },
+              "parse": {
+                "name": "parse",
+                "kind": "function",
+                "path": "omniread.pdf.parser.PDFParser.parse",
+                "signature": "<bound method Function.signature of Function('parse', 43, 62)>",
+                "docstring": "Parse PDF content into a structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the PDF binary payload and\n          return a deterministic, structured output."
+              }
+            }
+          }
+        }
+      },
+      "scraper": {
+        "name": "scraper",
+        "kind": "module",
+        "path": "omniread.pdf.scraper",
+        "signature": null,
+        "docstring": "# Summary\n\nPDF scraping implementation for OmniRead.\n\nThis module provides a PDF-specific scraper that coordinates PDF byte\nretrieval via a client and normalizes the result into a `Content` object.\n\nThe scraper implements the core `BaseScraper` contract while delegating\nall storage and access concerns to a `BasePDFClient` implementation.",
+        "members": {
+          "Any": {
+            "name": "Any",
+            "kind": "alias",
+            "path": "omniread.pdf.scraper.Any",
+            "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+            "docstring": null
+          },
+          "Mapping": {
+            "name": "Mapping",
+            "kind": "alias",
+            "path": "omniread.pdf.scraper.Mapping",
+            "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+            "docstring": null
+          },
+          "Optional": {
+            "name": "Optional",
+            "kind": "alias",
+            "path": "omniread.pdf.scraper.Optional",
+            "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+            "docstring": null
+          },
+          "Content": {
+            "name": "Content",
+            "kind": "class",
+            "path": "omniread.pdf.scraper.Content",
+            "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+            "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+            "members": {
+              "raw": {
+                "name": "raw",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.Content.raw",
+                "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+                "docstring": "Raw content bytes as retrieved from the source."
+              },
+              "source": {
+                "name": "source",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.Content.source",
+                "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+                "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+              },
+              "content_type": {
+                "name": "content_type",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.Content.content_type",
+                "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+                "docstring": "Optional MIME type of the content, if known."
+              },
+              "metadata": {
+                "name": "metadata",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.Content.metadata",
+                "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+                "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+              }
+            }
+          },
+          "ContentType": {
+            "name": "ContentType",
+            "kind": "class",
+            "path": "omniread.pdf.scraper.ContentType",
+            "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+            "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+            "members": {
+              "HTML": {
+                "name": "HTML",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.ContentType.HTML",
+                "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+                "docstring": "HTML document content."
+              },
+              "PDF": {
+                "name": "PDF",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.ContentType.PDF",
+                "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+                "docstring": "PDF document content."
+              },
+              "JSON": {
+                "name": "JSON",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.ContentType.JSON",
+                "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+                "docstring": "JSON document content."
+              },
+              "XML": {
+                "name": "XML",
+                "kind": "attribute",
+                "path": "omniread.pdf.scraper.ContentType.XML",
+                "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+                "docstring": "XML document content."
+              }
+            }
+          },
+          "BaseScraper": {
+            "name": "BaseScraper",
+            "kind": "class",
+            "path": "omniread.pdf.scraper.BaseScraper",
+            "signature": "<bound method Alias.signature of Alias('BaseScraper', 'omniread.core.scraper.BaseScraper')>",
+            "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, and retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.pdf.scraper.BaseScraper.fetch",
+                "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.core.scraper.BaseScraper.fetch')>",
+                "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+              }
+            }
+          },
+          "BasePDFClient": {
+            "name": "BasePDFClient",
+            "kind": "class",
+            "path": "omniread.pdf.scraper.BasePDFClient",
+            "signature": "<bound method Alias.signature of Alias('BasePDFClient', 'omniread.pdf.client.BasePDFClient')>",
+            "docstring": "Abstract client responsible for retrieving PDF bytes.\n\nRetrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must accept a source identifier appropriate to\n          the backing store.\n        - Implementations must return the full PDF binary payload.\n        - Implementations must raise retrieval-specific errors on failure.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.pdf.scraper.BasePDFClient.fetch",
+                "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.pdf.client.BasePDFClient.fetch')>",
+                "docstring": "Fetch raw PDF bytes from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF location, such as a file path, object storage key, or remote reference.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    Exception:\n        Retrieval-specific errors defined by the implementation."
+              }
+            }
+          },
+          "PDFScraper": {
+            "name": "PDFScraper",
+            "kind": "class",
+            "path": "omniread.pdf.scraper.PDFScraper",
+            "signature": "<bound method Class.signature of Class('PDFScraper', 20, 77)>",
+            "docstring": "Scraper for PDF sources.\n\nNotes:\n    **Responsibilities:**\n\n        - Delegates byte retrieval to a PDF client and normalizes output\n          into `Content`.\n        - Preserves caller-provided metadata.\n\n    **Constraints:**\n\n        - The scraper does not perform parsing or interpretation.\n        - Does not assume a specific storage backend.",
+            "members": {
+              "fetch": {
+                "name": "fetch",
+                "kind": "function",
+                "path": "omniread.pdf.scraper.PDFScraper.fetch",
+                "signature": "<bound method Function.signature of Function('fetch', 47, 77)>",
+                "docstring": "Fetch a PDF document from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF source as understood by the configured PDF client.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to attach to the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw PDF bytes, source identifier, PDF content type, and optional metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors raised by the PDF client."
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.pdf.parser.json
+++ b/mcp_docs/modules/omniread.pdf.parser.json
@@ -0,0 +1,134 @@
+{
+  "module": "omniread.pdf.parser",
+  "content": {
+    "path": "omniread.pdf.parser",
+    "docstring": "# Summary\n\nPDF parser base implementations for OmniRead.\n\nThis module defines the **PDF-specific parser contract**, extending the\nformat-agnostic `BaseParser` with constraints appropriate for PDF content.\n\nPDF parsers are responsible for interpreting binary PDF data and producing\nstructured representations suitable for downstream consumption.",
+    "objects": {
+      "Generic": {
+        "name": "Generic",
+        "kind": "alias",
+        "path": "omniread.pdf.parser.Generic",
+        "signature": "<bound method Alias.signature of Alias('Generic', 'typing.Generic')>",
+        "docstring": null
+      },
+      "TypeVar": {
+        "name": "TypeVar",
+        "kind": "alias",
+        "path": "omniread.pdf.parser.TypeVar",
+        "signature": "<bound method Alias.signature of Alias('TypeVar', 'typing.TypeVar')>",
+        "docstring": null
+      },
+      "abstractmethod": {
+        "name": "abstractmethod",
+        "kind": "alias",
+        "path": "omniread.pdf.parser.abstractmethod",
+        "signature": "<bound method Alias.signature of Alias('abstractmethod', 'abc.abstractmethod')>",
+        "docstring": null
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.pdf.parser.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "BaseParser": {
+        "name": "BaseParser",
+        "kind": "class",
+        "path": "omniread.pdf.parser.BaseParser",
+        "signature": "<bound method Alias.signature of Alias('BaseParser', 'omniread.core.parser.BaseParser')>",
+        "docstring": "Base interface for all parsers.\n\nNotes:\n    **Guarantees:**\n\n        - A parser is a self-contained object that owns the `Content` it is\n          responsible for interpreting.\n        - Consumers may rely on early validation of content compatibility\n          and type-stable return values from `parse()`.\n\n    **Responsibilities:**\n\n        - Implementations must declare supported content types via `supported_types`.\n        - Implementations must raise parsing-specific exceptions from `parse()`.\n        - Implementations must remain deterministic for a given input.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.BaseParser.supported_types",
+            "signature": "<bound method Alias.signature of Alias('supported_types', 'omniread.core.parser.BaseParser.supported_types')>",
+            "docstring": "Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic."
+          },
+          "content": {
+            "name": "content",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.BaseParser.content",
+            "signature": "<bound method Alias.signature of Alias('content', 'omniread.core.parser.BaseParser.content')>",
+            "docstring": null
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.pdf.parser.BaseParser.parse",
+            "signature": "<bound method Alias.signature of Alias('parse', 'omniread.core.parser.BaseParser.parse')>",
+            "docstring": "Parse the owned content into structured output.\n\nReturns:\n    T:\n        Parsed, structured representation.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully consume the provided content and\n          return a deterministic, structured output."
+          },
+          "supports": {
+            "name": "supports",
+            "kind": "function",
+            "path": "omniread.pdf.parser.BaseParser.supports",
+            "signature": "<bound method Alias.signature of Alias('supports', 'omniread.core.parser.BaseParser.supports')>",
+            "docstring": "Check whether this parser supports the content's type.\n\nReturns:\n    bool:\n        True if the content type is supported; False otherwise."
+          }
+        }
+      },
+      "T": {
+        "name": "T",
+        "kind": "attribute",
+        "path": "omniread.pdf.parser.T",
+        "signature": null,
+        "docstring": null
+      },
+      "PDFParser": {
+        "name": "PDFParser",
+        "kind": "class",
+        "path": "omniread.pdf.parser.PDFParser",
+        "signature": "<bound method Class.signature of Class('PDFParser', 22, 62)>",
+        "docstring": "Base PDF parser.\n\nNotes:\n    **Responsibilities:**\n\n        - This class enforces PDF content-type compatibility and provides\n          the extension point for implementing concrete PDF parsing strategies.\n\n    **Constraints:**\n\n        - Concrete implementations must define the output type `T` and\n          implement the `parse()` method.",
+        "members": {
+          "supported_types": {
+            "name": "supported_types",
+            "kind": "attribute",
+            "path": "omniread.pdf.parser.PDFParser.supported_types",
+            "signature": null,
+            "docstring": "Set of content types supported by this parser (PDF only)."
+          },
+          "parse": {
+            "name": "parse",
+            "kind": "function",
+            "path": "omniread.pdf.parser.PDFParser.parse",
+            "signature": "<bound method Function.signature of Function('parse', 43, 62)>",
+            "docstring": "Parse PDF content into a structured output.\n\nReturns:\n    T:\n        Parsed representation of type `T`.\n\nRaises:\n    Exception:\n        Parsing-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must fully interpret the PDF binary payload and\n          return a deterministic, structured output."
+          }
+        }
+      }
+    }
+  }
+}
--- a/mcp_docs/modules/omniread.pdf.scraper.json
+++ b/mcp_docs/modules/omniread.pdf.scraper.json
@@ -0,0 +1,152 @@
+{
+  "module": "omniread.pdf.scraper",
+  "content": {
+    "path": "omniread.pdf.scraper",
+    "docstring": "# Summary\n\nPDF scraping implementation for OmniRead.\n\nThis module provides a PDF-specific scraper that coordinates PDF byte\nretrieval via a client and normalizes the result into a `Content` object.\n\nThe scraper implements the core `BaseScraper` contract while delegating\nall storage and access concerns to a `BasePDFClient` implementation.",
+    "objects": {
+      "Any": {
+        "name": "Any",
+        "kind": "alias",
+        "path": "omniread.pdf.scraper.Any",
+        "signature": "<bound method Alias.signature of Alias('Any', 'typing.Any')>",
+        "docstring": null
+      },
+      "Mapping": {
+        "name": "Mapping",
+        "kind": "alias",
+        "path": "omniread.pdf.scraper.Mapping",
+        "signature": "<bound method Alias.signature of Alias('Mapping', 'typing.Mapping')>",
+        "docstring": null
+      },
+      "Optional": {
+        "name": "Optional",
+        "kind": "alias",
+        "path": "omniread.pdf.scraper.Optional",
+        "signature": "<bound method Alias.signature of Alias('Optional', 'typing.Optional')>",
+        "docstring": null
+      },
+      "Content": {
+        "name": "Content",
+        "kind": "class",
+        "path": "omniread.pdf.scraper.Content",
+        "signature": "<bound method Alias.signature of Alias('Content', 'omniread.core.content.Content')>",
+        "docstring": "Normalized representation of extracted content.\n\nNotes:\n    **Responsibilities:**\n\n        - A `Content` instance represents a raw content payload along with\n          minimal contextual metadata describing its origin and type.\n        - This class is the primary exchange format between scrapers,\n          parsers, and downstream consumers.",
+        "members": {
+          "raw": {
+            "name": "raw",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.Content.raw",
+            "signature": "<bound method Alias.signature of Alias('raw', 'omniread.core.content.Content.raw')>",
+            "docstring": "Raw content bytes as retrieved from the source."
+          },
+          "source": {
+            "name": "source",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.Content.source",
+            "signature": "<bound method Alias.signature of Alias('source', 'omniread.core.content.Content.source')>",
+            "docstring": "Identifier of the content origin (URL, file path, or logical name)."
+          },
+          "content_type": {
+            "name": "content_type",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.Content.content_type",
+            "signature": "<bound method Alias.signature of Alias('content_type', 'omniread.core.content.Content.content_type')>",
+            "docstring": "Optional MIME type of the content, if known."
+          },
+          "metadata": {
+            "name": "metadata",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.Content.metadata",
+            "signature": "<bound method Alias.signature of Alias('metadata', 'omniread.core.content.Content.metadata')>",
+            "docstring": "Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes)."
+          }
+        }
+      },
+      "ContentType": {
+        "name": "ContentType",
+        "kind": "class",
+        "path": "omniread.pdf.scraper.ContentType",
+        "signature": "<bound method Alias.signature of Alias('ContentType', 'omniread.core.content.ContentType')>",
+        "docstring": "Supported MIME types for extracted content.\n\nNotes:\n    **Guarantees:**\n\n        - This enum represents the declared or inferred media type of the\n          content source.\n        - It is primarily used for routing content to the appropriate\n          parser or downstream consumer.",
+        "members": {
+          "HTML": {
+            "name": "HTML",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.ContentType.HTML",
+            "signature": "<bound method Alias.signature of Alias('HTML', 'omniread.core.content.ContentType.HTML')>",
+            "docstring": "HTML document content."
+          },
+          "PDF": {
+            "name": "PDF",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.ContentType.PDF",
+            "signature": "<bound method Alias.signature of Alias('PDF', 'omniread.core.content.ContentType.PDF')>",
+            "docstring": "PDF document content."
+          },
+          "JSON": {
+            "name": "JSON",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.ContentType.JSON",
+            "signature": "<bound method Alias.signature of Alias('JSON', 'omniread.core.content.ContentType.JSON')>",
+            "docstring": "JSON document content."
+          },
+          "XML": {
+            "name": "XML",
+            "kind": "attribute",
+            "path": "omniread.pdf.scraper.ContentType.XML",
+            "signature": "<bound method Alias.signature of Alias('XML', 'omniread.core.content.ContentType.XML')>",
+            "docstring": "XML document content."
+          }
+        }
+      },
+      "BaseScraper": {
+        "name": "BaseScraper",
+        "kind": "class",
+        "path": "omniread.pdf.scraper.BaseScraper",
+        "signature": "<bound method Alias.signature of Alias('BaseScraper', 'omniread.core.scraper.BaseScraper')>",
+        "docstring": "Base interface for all scrapers.\n\nNotes:\n    **Responsibilities:**\n\n        - A scraper is responsible ONLY for fetching raw content (bytes)\n          from a source. It must not interpret or parse it.\n        - A scraper is a stateless acquisition component that retrieves raw\n          content from a source and returns it as a `Content` object.\n        - Scrapers define how content is obtained, not what the content means.\n        - Implementations may vary in transport mechanism, authentication\n          strategy, retry and backoff behavior.\n\n    **Constraints:**\n\n        - Implementations must not parse content, modify content semantics,\n          or couple scraping logic to a specific parser.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.scraper.BaseScraper.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.core.scraper.BaseScraper.fetch')>",
+            "docstring": "Fetch raw content from the given source.\n\nArgs:\n    source (str):\n        Location identifier (URL, file path, S3 URI, etc.).\n\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional hints for the scraper (headers, auth, etc.).\n\nReturns:\n    Content:\n        Content object containing raw bytes and metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors as defined by the implementation.\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must retrieve the content referenced by `source`\n          and return it as raw bytes wrapped in a `Content` object."
+          }
+        }
+      },
+      "BasePDFClient": {
+        "name": "BasePDFClient",
+        "kind": "class",
+        "path": "omniread.pdf.scraper.BasePDFClient",
+        "signature": "<bound method Alias.signature of Alias('BasePDFClient', 'omniread.pdf.client.BasePDFClient')>",
+        "docstring": "Abstract client responsible for retrieving PDF bytes.\n\nRetrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).\n\nNotes:\n    **Responsibilities:**\n\n        - Implementations must accept a source identifier appropriate to\n          the backing store.\n        - Return the full PDF binary payload.\n        - Raise retrieval-specific errors on failure.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.scraper.BasePDFClient.fetch",
+            "signature": "<bound method Alias.signature of Alias('fetch', 'omniread.pdf.client.BasePDFClient.fetch')>",
+            "docstring": "Fetch raw PDF bytes from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF location, such as a file path, object storage key, or remote reference.\n\nReturns:\n    bytes:\n        Raw PDF bytes.\n\nRaises:\n    Exception:\n        Retrieval-specific errors defined by the implementation."
+          }
+        }
+      },
+      "PDFScraper": {
+        "name": "PDFScraper",
+        "kind": "class",
+        "path": "omniread.pdf.scraper.PDFScraper",
+        "signature": "<bound method Class.signature of Class('PDFScraper', 20, 77)>",
+        "docstring": "Scraper for PDF sources.\n\nNotes:\n    **Responsibilities:**\n\n        - Delegates byte retrieval to a PDF client and normalizes output\n          into `Content`.\n        - Preserves caller-provided metadata.\n\n    **Constraints:**\n\n        - The scraper does not perform parsing or interpretation.\n        - Does not assume a specific storage backend.",
+        "members": {
+          "fetch": {
+            "name": "fetch",
+            "kind": "function",
+            "path": "omniread.pdf.scraper.PDFScraper.fetch",
+            "signature": "<bound method Function.signature of Function('fetch', 47, 77)>",
+            "docstring": "Fetch a PDF document from the given source.\n\nArgs:\n    source (Any):\n        Identifier of the PDF source as understood by the configured PDF client.\n    metadata (Optional[Mapping[str, Any]], optional):\n        Optional metadata to attach to the returned content.\n\nReturns:\n    Content:\n        A `Content` instance containing raw PDF bytes, source identifier, PDF content type, and optional metadata.\n\nRaises:\n    Exception:\n        Retrieval-specific errors raised by the PDF client."
+          }
+        }
+      }
+    }
+  }
+}
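The `PDFScraper` entry above describes a delegation: the client retrieves bytes and the scraper wraps them in `Content`. A minimal sketch of that shape follows, assuming stand-in classes rather than the real `omniread` ones. `InMemoryPDFClient` and `SketchPDFScraper` are hypothetical names, and the content type is a plain string here instead of `ContentType.PDF`.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Content:
    """Stand-in for omniread.core.content.Content (fields taken from the docs above)."""
    raw: bytes
    source: str
    content_type: Optional[str] = None
    metadata: Optional[Mapping[str, Any]] = None


class InMemoryPDFClient:
    """Hypothetical client: serves PDF bytes from a dict instead of a real store."""

    def __init__(self, blobs: Mapping[str, bytes]) -> None:
        self._blobs = dict(blobs)

    def fetch(self, source: Any) -> bytes:
        try:
            return self._blobs[str(source)]
        except KeyError:
            # Retrieval-specific error, chosen for illustration only.
            raise FileNotFoundError(f"no PDF stored under {source!r}") from None


class SketchPDFScraper:
    """Assumed shape of the delegation: the client fetches, the scraper wraps."""

    def __init__(self, *, client: InMemoryPDFClient) -> None:
        self._client = client

    def fetch(
        self, source: Any, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        raw = self._client.fetch(source)  # storage concerns stay in the client
        return Content(
            raw=raw,
            source=str(source),
            content_type="application/pdf",
            metadata=dict(metadata) if metadata else None,
        )


scraper = SketchPDFScraper(client=InMemoryPDFClient({"report.pdf": b"%PDF-1.7 stub"}))
content = scraper.fetch("report.pdf", metadata={"requested_by": "docs-example"})
```

Because the scraper only sees the client's `fetch()` method, swapping the dict-backed client for a filesystem or object-store client would not change the scraper itself.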
--- a/mcp_docs/nav.json
+++ b/mcp_docs/nav.json
@@ -0,0 +1,50 @@
+[
+  {
+    "module": "omniread",
+    "resource": "doc://modules/omniread"
+  },
+  {
+    "module": "omniread.core",
+    "resource": "doc://modules/omniread.core"
+  },
+  {
+    "module": "omniread.core.content",
+    "resource": "doc://modules/omniread.core.content"
+  },
+  {
+    "module": "omniread.core.parser",
+    "resource": "doc://modules/omniread.core.parser"
+  },
+  {
+    "module": "omniread.core.scraper",
+    "resource": "doc://modules/omniread.core.scraper"
+  },
+  {
+    "module": "omniread.html",
+    "resource": "doc://modules/omniread.html"
+  },
+  {
+    "module": "omniread.html.parser",
+    "resource": "doc://modules/omniread.html.parser"
+  },
+  {
+    "module": "omniread.html.scraper",
+    "resource": "doc://modules/omniread.html.scraper"
+  },
+  {
+    "module": "omniread.pdf",
+    "resource": "doc://modules/omniread.pdf"
+  },
+  {
+    "module": "omniread.pdf.client",
+    "resource": "doc://modules/omniread.pdf.client"
+  },
+  {
+    "module": "omniread.pdf.parser",
+    "resource": "doc://modules/omniread.pdf.parser"
+  },
+  {
+    "module": "omniread.pdf.scraper",
+    "resource": "doc://modules/omniread.pdf.scraper"
+  }
+]
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,53 +1,81 @@
-site_name: Aetoskia OmniRead
-site_description: Format-agnostic document reading, parsing, and scraping framework
-
 theme:
  name: material
  palette:
-    - scheme: slate
-      primary: deep purple
-      accent: cyan
+  - scheme: slate
+    primary: deep purple
+    accent: cyan
  font:
    text: Inter
    code: JetBrains Mono
  features:
-    - navigation.tabs
-    - navigation.expand
-    - navigation.top
-    - navigation.instant
-    - content.code.copy
-    - content.code.annotate
-
+  - navigation.sections
+  - navigation.expand
+  - navigation.top
+  - navigation.instant
+  - navigation.tracking
+  - navigation.indexes
+  - content.code.copy
+  - content.code.annotate
+  - content.tabs.link
+  - content.action.edit
+  - search.highlight
+  - search.share
+  - search.suggest
 plugins:
-  - search
-  - mkdocstrings:
-      handlers:
-        python:
-          paths: ["."]
-          options:
-            docstring_style: google
-            show_source: false
-            show_signature_annotations: true
-            separate_signature: true
-            merge_init_into_class: true
-            inherited_members: true
-            annotations_path: brief
-            show_root_heading: true
-            group_by_category: true
-
+- search
+- mkdocstrings:
+    handlers:
+      python:
+        paths:
+        - .
+        options:
+          docstring_style: google
+          show_source: false
+          show_signature_annotations: true
+          separate_signature: true
+          merge_init_into_class: true
+          inherited_members: true
+          annotations_path: brief
+          show_root_heading: true
+          group_by_category: true
+          show_category_heading: true
+          show_object_full_path: false
+          show_symbol_type_heading: true
+markdown_extensions:
+- pymdownx.superfences
+- pymdownx.inlinehilite
+- pymdownx.snippets
+- admonition
+- pymdownx.details
+- pymdownx.superfences
+- pymdownx.highlight:
+    linenums: true
+    anchor_linenums: true
+    line_spans: __span
+    pygments_lang_class: true
+- pymdownx.tabbed:
+    alternate_style: true
+- pymdownx.tasklist:
+    custom_checkbox: true
+- tables
+- footnotes
+- pymdownx.caret
+- pymdownx.tilde
+- pymdownx.mark
+site_name: omniread
 nav:
-  - Home: index.md
-
-  - Core (Contracts):
-      - Content Models: core/content.md
-      - Parsers: core/parser.md
-      - Scrapers: core/scraper.md
-
-  - HTML Implementation:
-      - HTML Parser: html/parser.md
-      - HTML Scraper: html/scraper.md
-
-  - PDF Implementation:
-      - PDF Client: pdf/client.md
-      - PDF Parser: pdf/parser.md
-      - PDF Scraper: pdf/scraper.md
+- Home: index.md
+- Core API:
+  - core/index.md
+  - core/content.md
+  - core/parser.md
+  - core/scraper.md
+- HTML Handling:
+  - html/index.md
+  - html/parser.md
+  - html/scraper.md
+- PDF Handling:
+  - pdf/index.md
+  - pdf/client.md
+  - pdf/parser.md
+  - pdf/scraper.md
--- a/omniread/__init__.py
+++ b/omniread/__init__.py
@@ -1,99 +1,110 @@
 """
-OmniRead — format-agnostic content acquisition and parsing framework.
+# Summary

-OmniRead provides a **cleanly layered architecture** for fetching, parsing,
+`OmniRead` — format-agnostic content acquisition and parsing framework.
+
+`OmniRead` provides a **cleanly layered architecture** for fetching, parsing,
 and normalizing content from heterogeneous sources such as HTML documents
 and PDF files.

 The library is structured around three core concepts:

-1. **Content**
-   A canonical, format-agnostic container representing raw content bytes
-   and minimal contextual metadata.
+1.  **`Content`**: A canonical, format-agnostic container representing raw content
+    bytes and minimal contextual metadata.
+2.  **`Scrapers`**: Components responsible for *acquiring* raw content from a
+    source (HTTP, filesystem, object storage, etc.). `Scrapers` never interpret
+    content.
+3.  **`Parsers`**: Components responsible for *interpreting* acquired content and
+    converting it into structured, typed representations.

-2. **Scrapers**
-   Components responsible for *acquiring* raw content from a source
-   (HTTP, filesystem, object storage, etc.). Scrapers never interpret
-   content.
+`OmniRead` deliberately separates these responsibilities to ensure:

-3. **Parsers**
-   Components responsible for *interpreting* acquired content and
-   converting it into structured, typed representations.
+-   Clear boundaries between IO and interpretation.
+-   Replaceable implementations per format.
+-   Predictable, testable behavior.

-OmniRead deliberately separates these responsibilities to ensure:
- Clear boundaries between IO and interpretation
- Replaceable implementations per format
- Predictable, testable behavior
+# Installation

----------------------------------------------------------------------
-Installation
----------------------------------------------------------------------
+Install `OmniRead` using pip:

-Install OmniRead using pip:
+```bash
+pip install omniread
+```

-    pip install omniread
+Install `OmniRead` using Poetry:
+```bash
+poetry add omniread
+```

-Or with Poetry:
+---

-    poetry add omniread
+## Quick start

----------------------------------------------------------------------
-Basic Usage
----------------------------------------------------------------------
+Example:
+    HTML example:
+        ```python
+        from omniread import HTMLScraper, HTMLParser

-HTML example:
+        scraper = HTMLScraper()
+        content = scraper.fetch("https://example.com")

-    from omniread import HTMLScraper, HTMLParser
+        class TitleParser(HTMLParser[str]):
+            def parse(self) -> str:
+                return self._soup.title.string

-    scraper = HTMLScraper()
-    content = scraper.fetch("https://example.com")
+        parser = TitleParser(content)
+        title = parser.parse()
+        ```

-    class TitleParser(HTMLParser[str]):
-        def parse(self) -> str:
-            return self._soup.title.string
+    PDF example:
+        ```python
+        from omniread import FileSystemPDFClient, PDFScraper, PDFParser
+        from pathlib import Path

-    parser = TitleParser(content)
-    title = parser.parse()
+        client = FileSystemPDFClient()
+        scraper = PDFScraper(client=client)
+        content = scraper.fetch(Path("document.pdf"))

-PDF example:
+        class TextPDFParser(PDFParser[str]):
+            def parse(self) -> str:
+                # implement PDF text extraction
+                ...

-    from omniread import FileSystemPDFClient, PDFScraper, PDFParser
-    from pathlib import Path
+        parser = TextPDFParser(content)
+        result = parser.parse()
+        ```

-    client = FileSystemPDFClient()
-    scraper = PDFScraper(client=client)
-    content = scraper.fetch(Path("document.pdf"))
+---

-    class TextPDFParser(PDFParser[str]):
-        def parse(self) -> str:
-            # implement PDF text extraction
-            ...
-
-    parser = TextPDFParser(content)
-    result = parser.parse()
-
----------------------------------------------------------------------
-Public API Surface
----------------------------------------------------------------------
+# Public API

 This module re-exports the **recommended public entry points** of OmniRead.
-
 Consumers are encouraged to import from this namespace rather than from
 format-specific submodules directly, unless advanced customization is
 required.

-Core:
- Content
- ContentType
+- `Content`: Canonical content model.
+- `ContentType`: Supported media types.
+- `HTMLScraper`: HTTP-based HTML acquisition.
+- `HTMLParser`: Base parser for HTML DOM interpretation.
+- `FileSystemPDFClient`: Local filesystem PDF access.
+- `PDFScraper`: PDF-specific content acquisition.
+- `PDFParser`: Base parser for PDF binary interpretation.

-HTML:
- HTMLScraper
- HTMLParser
+---

-PDF:
- FileSystemPDFClient
- PDFScraper
- PDFParser
+# Core Philosophy
+
+`OmniRead` is designed as a **decoupled content engine**:
+
+1. **Separation of Concerns**: Scrapers *fetch*, Parsers *interpret*. Neither
+   knows about the other.
+2. **Normalized Exchange**: All components communicate via the `Content` model,
+   ensuring a consistent contract.
+3. **Format Agnosticism**: The core logic is independent of whether the input
+   is HTML, PDF, or JSON.
+
+---
 """

 from .core import Content, ContentType
--- a/omniread/__init__.pyi
+++ b/omniread/__init__.pyi
@@ -0,0 +1,13 @@
+from .core import Content, ContentType
+from .html import HTMLScraper, HTMLParser
+from .pdf import FileSystemPDFClient, PDFScraper, PDFParser
+
+__all__ = [
+    "Content",
+    "ContentType",
+    "HTMLScraper",
+    "HTMLParser",
+    "FileSystemPDFClient",
+    "PDFScraper",
+    "PDFParser",
+]
--- a/omniread/core/__init__.py
+++ b/omniread/core/__init__.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 Core domain contracts for OmniRead.

 This package defines the **format-agnostic domain layer** of OmniRead.
@@ -9,11 +11,21 @@ Public exports from this package are considered **stable contracts** and
 are safe for downstream consumers to depend on.

 Submodules:
- content: Canonical content models and enums
- parser: Abstract parsing contracts
- scraper: Abstract scraping contracts
+
+- `content`: Canonical content models and enums.
+- `parser`: Abstract parsing contracts.
+- `scraper`: Abstract scraping contracts.

 Format-specific behavior must not be introduced at this layer.
+
+---
+
+# Public API
+
+- `Content`
+- `ContentType`
+
+---
 """

 from .content import Content, ContentType
--- a/omniread/core/__init__.pyi
+++ b/omniread/core/__init__.pyi
@@ -0,0 +1,10 @@
+from .content import Content, ContentType
+from .parser import BaseParser
+from .scraper import BaseScraper
+
+__all__ = [
+    "Content",
+    "ContentType",
+    "BaseParser",
+    "BaseScraper",
+]
--- a/omniread/core/content.py
+++ b/omniread/core/content.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 Canonical content models for OmniRead.

 This module defines the **format-agnostic content representation** used across
@@ -18,9 +20,13 @@ class ContentType(str, Enum):
    """
    Supported MIME types for extracted content.

-    This enum represents the declared or inferred media type of the content
-    source. It is primarily used for routing content to the appropriate
-    parser or downstream consumer.
+    Notes:
+        **Guarantees:**
+
+            - This enum represents the declared or inferred media type of the
+              content source.
+            - It is primarily used for routing content to the appropriate
+              parser or downstream consumer.
    """

    HTML = "text/html"
@@ -41,23 +47,31 @@ class Content:
    """
    Normalized representation of extracted content.

-    A `Content` instance represents a raw content payload along with minimal
-    contextual metadata describing its origin and type.
+    Notes:
+        **Responsibilities:**

-    This class is the **primary exchange format** between:
-    - Scrapers
-    - Parsers
-    - Downstream consumers
-
-    Attributes:
-        raw: Raw content bytes as retrieved from the source.
-        source: Identifier of the content origin (URL, file path, or logical name).
-        content_type: Optional MIME type of the content, if known.
-        metadata: Optional, implementation-defined metadata associated with
-            the content (e.g., headers, encoding hints, extraction notes).
+            - A `Content` instance represents a raw content payload along with
+              minimal contextual metadata describing its origin and type.
+            - This class is the primary exchange format between scrapers,
+              parsers, and downstream consumers.
    """

    raw: bytes
+    """
+    Raw content bytes as retrieved from the source.
+    """
+
    source: str
+    """
+    Identifier of the content origin (URL, file path, or logical name).
+    """
+
    content_type: Optional[ContentType] = None
+    """
+    Optional MIME type of the content, if known.
+    """
+
    metadata: Optional[Mapping[str, Any]] = None
+    """
+    Optional, implementation-defined metadata associated with the content (e.g., headers, encoding hints, extraction notes).
+    """
--- a/omniread/core/content.pyi
+++ b/omniread/core/content.pyi
@@ -0,0 +1,15 @@
+from enum import Enum
+from typing import Any, Mapping, Optional
+
+class ContentType(str, Enum):
+    HTML = "text/html"
+    PDF = "application/pdf"
+    JSON = "application/json"
+    XML = "application/xml"
+
+class Content:
+    raw: bytes
+    source: str
+    content_type: Optional[ContentType]
+    metadata: Optional[Mapping[str, Any]]
+    def __init__(self, raw: bytes, source: str, content_type: Optional[ContentType] = ..., metadata: Optional[Mapping[str, Any]] = ...) -> None: ...
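The stub above fixes the shape of `Content` and the MIME values behind `ContentType`. The sketch below shows how those two pieces might be used together. It assumes `Content` behaves like a plain dataclass (the stub does not say), and `guess_content_type` is a hypothetical helper that is not part of the patch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping, Optional


class ContentType(str, Enum):
    # Values copied from the stub above.
    HTML = "text/html"
    PDF = "application/pdf"
    JSON = "application/json"
    XML = "application/xml"


@dataclass
class Content:
    """Stand-in mirroring the stub; assumed to be a simple dataclass."""
    raw: bytes
    source: str
    content_type: Optional[ContentType] = None
    metadata: Optional[Mapping[str, Any]] = None


def guess_content_type(header_value: str) -> Optional[ContentType]:
    """Hypothetical helper: map a Content-Type header to the enum, or None."""
    mime = header_value.split(";", 1)[0].strip().lower()
    try:
        return ContentType(mime)  # lookup by value works for Enum members
    except ValueError:
        return None  # unknown media types stay untyped


page = Content(
    raw=b"<html></html>",
    source="https://example.com",
    content_type=guess_content_type("text/html; charset=utf-8"),
)
```

Since `ContentType` mixes in `str`, members compare equal to their MIME strings, which keeps routing code simple.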
--- a/omniread/core/parser.py
+++ b/omniread/core/parser.py
@@ -1,15 +1,19 @@
 """
+# Summary
+
 Abstract parsing contracts for OmniRead.

 This module defines the **format-agnostic parser interface** used to transform
 raw content into structured, typed representations.

 Parsers are responsible for:
+
 - Interpreting a single `Content` instance
 - Validating compatibility with the content type
 - Producing a structured output suitable for downstream consumers

 Parsers are not responsible for:
+
 - Fetching or acquiring content
 - Performing retries or error recovery
 - Managing multiple content sources
@@ -27,23 +31,24 @@ class BaseParser(ABC, Generic[T]):
    """
    Base interface for all parsers.

-    A parser is a self-contained object that owns the Content
-    it is responsible for interpreting.
+    Notes:
+        **Guarantees:**

-    Implementations must:
-    - Declare supported content types via `supported_types`
-    - Raise parsing-specific exceptions from `parse()`
-    - Remain deterministic for a given input
+            - A parser is a self-contained object that owns the `Content` it is
+              responsible for interpreting.
+            - Consumers may rely on early validation of content compatibility
+              and type-stable return values from `parse()`.

-    Consumers may rely on:
-    - Early validation of content compatibility
-    - Type-stable return values from `parse()`
+        **Responsibilities:**
+
+            - Implementations must declare supported content types via `supported_types`.
+            - Implementations must raise parsing-specific exceptions from `parse()`.
+            - Implementations must remain deterministic for a given input.
    """

    supported_types: Set[ContentType] = set()
-    """Set of content types supported by this parser.
-
-    An empty set indicates that the parser is content-type agnostic.
+    """
+    Set of content types supported by this parser. An empty set indicates that the parser is content-type agnostic.
    """

    def __init__(self, content: Content):
@@ -51,10 +56,12 @@ class BaseParser(ABC, Generic[T]):
        Initialize the parser with content to be parsed.

        Args:
-            content: Content instance to be parsed.
+            content (Content):
+                Content instance to be parsed.

        Raises:
-            ValueError: If the content type is not supported by this parser.
+            ValueError:
+                If the content type is not supported by this parser.
        """

        self.content = content
@@ -70,14 +77,19 @@ class BaseParser(ABC, Generic[T]):
        """
        Parse the owned content into structured output.

-        Implementations must fully consume the provided content and
-        return a deterministic, structured output.
-
        Returns:
-            Parsed, structured representation.
+            T:
+                Parsed, structured representation.

        Raises:
-            Exception: Parsing-specific errors as defined by the implementation.
+            Exception:
+                Parsing-specific errors as defined by the implementation.
+
+        Notes:
+            **Responsibilities:**
+
+                - Implementations must fully consume the provided content and
+                  return a deterministic, structured output.
        """
        raise NotImplementedError

@@ -86,7 +98,8 @@ class BaseParser(ABC, Generic[T]):
        Check whether this parser supports the content's type.

        Returns:
-            True if the content type is supported; False otherwise.
+            bool:
+                True if the content type is supported; False otherwise.
        """

        if not self.supported_types:
--- a/omniread/core/parser.pyi
+++ b/omniread/core/parser.pyi
@@ -0,0 +1,13 @@
+from abc import ABC, abstractmethod
+from typing import Generic, TypeVar, Set
+from .content import Content, ContentType
+
+T = TypeVar("T")
+
+class BaseParser(ABC, Generic[T]):
+    supported_types: Set[ContentType]
+    content: Content
+    def __init__(self, content: Content) -> None: ...
+    @abstractmethod
+    def parse(self) -> T: ...
+    def supports(self) -> bool: ...
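The parser contract in this stub (validate in `__init__`, return a typed value from `parse()`) can be sketched as below. This is an illustration under assumptions, not the library's code: `SketchParser` and `WordCountParser` are hypothetical, content types are plain strings, and `"text/plain"` is not one of the `ContentType` members.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, Optional, Set, TypeVar

T = TypeVar("T")


@dataclass
class Content:
    """Minimal stand-in for the real Content model."""
    raw: bytes
    source: str
    content_type: Optional[str] = None


class SketchParser(ABC, Generic[T]):
    """Assumed behavior: reject unsupported content at construction time."""

    supported_types: Set[str] = set()

    def __init__(self, content: Content) -> None:
        self.content = content
        if not self.supports():
            raise ValueError(f"unsupported content type: {content.content_type!r}")

    @abstractmethod
    def parse(self) -> T: ...

    def supports(self) -> bool:
        # An empty set means the parser is content-type agnostic.
        if not self.supported_types:
            return True
        return self.content.content_type in self.supported_types


class WordCountParser(SketchParser[int]):
    """Hypothetical concrete parser: bytes in, a typed int out."""

    supported_types = {"text/plain"}

    def parse(self) -> int:
        return len(self.content.raw.decode("utf-8").split())


memo = Content(b"raw bytes in, typed value out", "memo.txt", "text/plain")
count = WordCountParser(memo).parse()
```

Validating in the constructor means a mismatched parser fails before any parsing work starts, which is the "early validation" guarantee the docstring mentions.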
--- a/omniread/core/scraper.py
+++ b/omniread/core/scraper.py
@@ -1,15 +1,19 @@
 """
+# Summary
+
 Abstract scraping contracts for OmniRead.

 This module defines the **format-agnostic scraper interface** responsible for
 acquiring raw content from external sources.

 Scrapers are responsible for:
+
 - Locating and retrieving raw content bytes
 - Attaching minimal contextual metadata
 - Returning normalized `Content` objects

 Scrapers are explicitly NOT responsible for:
+
 - Parsing or interpreting content
 - Inferring structure or semantics
 - Performing content-type specific processing
@@ -27,23 +31,21 @@ class BaseScraper(ABC):
    """
    Base interface for all scrapers.

-    A scraper is responsible ONLY for fetching raw content
-    (bytes) from a source. It must not interpret or parse it.
+    Notes:
+        **Responsibilities:**

-    A scraper is a **stateless acquisition component** that retrieves raw
-    content from a source and returns it as a `Content` object.
+            - A scraper is responsible ONLY for fetching raw content (bytes)
+              from a source. It must not interpret or parse it.
+            - A scraper is a stateless acquisition component that retrieves raw
+              content from a source and returns it as a `Content` object.
+            - Scrapers define how content is obtained, not what the content means.
+            - Implementations may vary in transport mechanism, authentication
+              strategy, and retry and backoff behavior.

-    Scrapers define *how content is obtained*, not *what the content means*.
+        **Constraints:**

-    Implementations may vary in:
-    - Transport mechanism (HTTP, filesystem, cloud storage)
-    - Authentication strategy
-    - Retry and backoff behavior
-
-    Implementations must not:
-    - Parse content
-    - Modify content semantics
-    - Couple scraping logic to a specific parser
+            - Implementations must not parse content, modify content semantics,
+              or couple scraping logic to a specific parser.
    """

    @abstractmethod
@@ -56,20 +58,25 @@ class BaseScraper(ABC):
        """
        Fetch raw content from the given source.

-        Implementations must retrieve the content referenced by `source`
-        and return it as raw bytes wrapped in a `Content` object.
-
        Args:
-            source: Location identifier (URL, file path, S3 URI, etc.)
-            metadata: Optional hints for the scraper (headers, auth, etc.)
+            source (str):
+                Location identifier (URL, file path, S3 URI, etc.).
+
+            metadata (Optional[Mapping[str, Any]], optional):
+                Optional hints for the scraper (headers, auth, etc.).

        Returns:
-            Content object containing raw bytes and metadata.
-            - Raw content bytes
-            - Source identifier
-            - Optional metadata
+            Content:
+                Content object containing raw bytes and metadata.

        Raises:
-            Exception: Retrieval-specific errors as defined by the implementation.
+            Exception:
+                Retrieval-specific errors as defined by the implementation.
+
+        Notes:
+            **Responsibilities:**
+
+                - Implementations must retrieve the content referenced by `source`
+                  and return it as raw bytes wrapped in a `Content` object.
        """
        raise NotImplementedError
--- a/omniread/core/scraper.pyi
+++ b/omniread/core/scraper.pyi
@@ -0,0 +1,7 @@
+from abc import ABC, abstractmethod
+from typing import Any, Mapping, Optional
+from .content import Content
+
+class BaseScraper(ABC):
+    @abstractmethod
+    def fetch(self, source: str, *, metadata: Optional[Mapping[str, Any]] = ...) -> Content: ...
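Review note: a minimal sketch of a conforming implementation may help when reading this stub. `Content` and `BaseScraper` below are local stand-ins written for illustration. The real `omniread.core.content.Content` constructor is not shown in this diff, so every field name except `raw` is an assumption, and `InMemoryScraper` is a hypothetical name.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Content:
    """Stand-in for omniread.core.content.Content; field names are assumed."""

    raw: bytes
    source: str
    metadata: Optional[Mapping[str, Any]] = None


class BaseScraper(ABC):
    """Stand-in that mirrors the stub signature above."""

    @abstractmethod
    def fetch(
        self, source: str, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        raise NotImplementedError


class InMemoryScraper(BaseScraper):
    """Hypothetical scraper that serves bytes from a dict, e.g. in tests."""

    def __init__(self, store: Mapping[str, bytes]) -> None:
        self._store = dict(store)

    def fetch(
        self, source: str, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        try:
            raw = self._store[source]
        except KeyError:
            # Retrieval-specific error, as the contract allows.
            raise LookupError(f"unknown source: {source!r}") from None
        # Return the bytes untouched: no parsing, no semantic changes.
        return Content(raw=raw, source=source, metadata=dict(metadata or {}))


scraper = InMemoryScraper({"mem://a": b"<html></html>"})
content = scraper.fetch("mem://a", metadata={"note": "fixture"})
```

The scraper only decides how bytes are obtained; it never inspects them, which is the constraint the `BaseScraper` docstring states.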
--- a/omniread/html/__init__.py
+++ b/omniread/html/__init__.py
@@ -1,20 +1,33 @@
 """
+# Summary
+
 HTML format implementation for OmniRead.

 This package provides **HTML-specific implementations** of the core OmniRead
 contracts defined in `omniread.core`.

 It includes:
- HTML parsers that interpret HTML content
- HTML scrapers that retrieve HTML documents

-This package:
- Implements, but does not redefine, core contracts
- May contain HTML-specific behavior and edge-case handling
- Produces canonical content models defined in `omniread.core.content`
+- HTML parsers that interpret HTML content.
+- HTML scrapers that retrieve HTML documents.
+
+Key characteristics:
+
+- Implements, but does not redefine, core contracts.
+- May contain HTML-specific behavior and edge-case handling.
+- Produces canonical content models defined in `omniread.core.content`.

 Consumers should depend on `omniread.core` interfaces wherever possible and
 use this package only when HTML-specific behavior is required.
+
+---
+
+# Public API
+
+- `HTMLScraper`
+- `HTMLParser`
+
+---
 """


--- a/omniread/html/__init__.pyi
+++ b/omniread/html/__init__.pyi
@@ -0,0 +1,4 @@
+from .scraper import HTMLScraper
+from .parser import HTMLParser
+
+__all__ = ["HTMLScraper", "HTMLParser"]
--- a/omniread/html/parser.py
+++ b/omniread/html/parser.py
@@ -1,10 +1,13 @@
 """
+# Summary
+
 HTML parser base implementations for OmniRead.

 This module provides reusable HTML parsing utilities built on top of
 the abstract parser contracts defined in `omniread.core.parser`.

 It supplies:
+
 - Content-type enforcement for HTML inputs
 - BeautifulSoup initialization and lifecycle management
 - Common helper methods for extracting structured data from HTML elements
@@ -28,36 +31,44 @@ class HTMLParser(BaseParser[T], Generic[T]):
    """
    Base HTML parser.

-    This class extends the core `BaseParser` with HTML-specific behavior,
-    including DOM parsing via BeautifulSoup and reusable extraction helpers.
+    Notes:
+        **Responsibilities:**

-    Provides reusable helpers for HTML extraction.
-    Concrete parsers must explicitly define the return type.
+            - This class extends the core `BaseParser` with HTML-specific behavior,
+              including DOM parsing via BeautifulSoup and reusable extraction helpers.
+            - Provides reusable helpers for HTML extraction. Concrete parsers must
+              explicitly define the return type.

-    Characteristics:
-    - Accepts only HTML content
-    - Owns a parsed BeautifulSoup DOM tree
-    - Provides pure helper utilities for common HTML structures
+        **Guarantees:**

-    Concrete subclasses must:
-    - Define the output type `T`
-    - Implement the `parse()` method
+            - Accepts only HTML content.
+            - Owns a parsed BeautifulSoup DOM tree.
+            - Provides pure helper utilities for common HTML structures.
+
+        **Constraints:**
+
+            - Concrete subclasses must define the output type `T` and implement
+              the `parse()` method.
    """

    supported_types = {ContentType.HTML}
-    """Set of content types supported by this parser (HTML only)."""
+    """
+    Set of content types supported by this parser (HTML only).
+    """

    def __init__(self, content: Content, features: str = "html.parser"):
        """
        Initialize the HTML parser.

        Args:
-            content: HTML content to be parsed.
-            features: BeautifulSoup parser backend to use
-                (e.g., 'html.parser', 'lxml').
+            content (Content):
+                HTML content to be parsed.
+            features (str, optional):
+                BeautifulSoup parser backend to use (e.g., 'html.parser', 'lxml').

        Raises:
-            ValueError: If the content is empty or not valid HTML.
+            ValueError:
+                If the content is empty or not valid HTML.
        """
        super().__init__(content)
        self._features = features
@@ -72,11 +83,15 @@ class HTMLParser(BaseParser[T], Generic[T]):
        """
        Fully parse the HTML content into structured output.

-        Implementations must fully interpret the HTML DOM and return
-        a deterministic, structured output.
-
        Returns:
-            Parsed representation of type `T`.
+            T:
+                Parsed representation of type `T`.
+
+        Notes:
+            **Responsibilities:**
+
+                - Implementations must fully interpret the HTML DOM and return a
+                  deterministic, structured output.
        """
        raise NotImplementedError

@@ -90,11 +105,14 @@ class HTMLParser(BaseParser[T], Generic[T]):
        Extract normalized text from a `<div>` element.

        Args:
-            div: BeautifulSoup tag representing a `<div>`.
-            separator: String used to separate text nodes.
+            div (Tag):
+                BeautifulSoup tag representing a `<div>`.
+            separator (str, optional):
+                String used to separate text nodes.

        Returns:
-            Flattened, whitespace-normalized text content.
+            str:
+                Flattened, whitespace-normalized text content.
        """
        return div.get_text(separator=separator, strip=True)

@@ -104,10 +122,12 @@ class HTMLParser(BaseParser[T], Generic[T]):
        Extract the hyperlink reference from an `<a>` element.

        Args:
-            a: BeautifulSoup tag representing an anchor.
+            a (Tag):
+                BeautifulSoup tag representing an anchor.

        Returns:
-            The value of the `href` attribute, or None if absent.
+            Optional[str]:
+                The value of the `href` attribute, or None if absent.
        """
        return a.get("href")

@@ -117,10 +137,12 @@ class HTMLParser(BaseParser[T], Generic[T]):
        Parse an HTML table into a 2D list of strings.

        Args:
-            table: BeautifulSoup tag representing a `<table>`.
+            table (Tag):
+                BeautifulSoup tag representing a `<table>`.

        Returns:
-            A list of rows, where each row is a list of cell text values.
+            list[list[str]]:
+                A list of rows, where each row is a list of cell text values.
        """
        rows: list[list[str]] = []
        for tr in table.find_all("tr"):
@@ -141,10 +163,12 @@ class HTMLParser(BaseParser[T], Generic[T]):
        Build a BeautifulSoup DOM tree from raw HTML content.

        Returns:
-            Parsed BeautifulSoup document tree.
+            BeautifulSoup:
+                Parsed BeautifulSoup document tree.

        Raises:
-            ValueError: If the content payload is empty.
+            ValueError:
+                If the content payload is empty.
        """
        if not self.content.raw:
            raise ValueError("Empty HTML content")
@@ -154,12 +178,16 @@ class HTMLParser(BaseParser[T], Generic[T]):
        """
        Extract high-level metadata from the HTML document.

-        This includes:
-        - Document title
-        - `<meta>` tag name/property → content mappings
-
        Returns:
-            Dictionary containing extracted metadata.
+            dict[str, Any]:
+                Dictionary containing extracted metadata.
+
+        Notes:
+            **Responsibilities:**
+
+                - Extract high-level metadata from the HTML document.
+                - This includes the document title and `<meta>` tag
+                  name/property to content mappings.
        """
        soup = self._soup

--- a/omniread/html/parser.pyi
+++ b/omniread/html/parser.pyi
@@ -0,0 +1,18 @@
+from typing import Any, Generic, TypeVar, Optional
+from bs4 import BeautifulSoup, Tag
+from omniread.core.content import ContentType, Content
+from omniread.core.parser import BaseParser
+
+T = TypeVar("T")
+
+class HTMLParser(BaseParser[T], Generic[T]):
+    supported_types: set[ContentType]
+    def __init__(self, content: Content, features: str = ...) -> None: ...
+    def parse(self) -> T: ...
+    @staticmethod
+    def parse_div(div: Tag, *, separator: str = ...) -> str: ...
+    @staticmethod
+    def parse_link(a: Tag) -> Optional[str]: ...
+    @staticmethod
+    def parse_table(table: Tag) -> list[list[str]]: ...
+    def parse_meta(self) -> dict[str, Any]: ...
--- a/omniread/html/scraper.py
+++ b/omniread/html/scraper.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 HTML scraping implementation for OmniRead.

 This module provides an HTTP-based scraper for retrieving HTML documents.
@@ -6,11 +8,13 @@ It implements the core `BaseScraper` contract using `httpx` as the transport
 layer.

 This scraper is responsible for:
+
 - Fetching raw HTML bytes over HTTP(S)
 - Validating response content type
 - Attaching HTTP metadata to the returned content

 This scraper is not responsible for:
+
 - Parsing or interpreting HTML
 - Retrying failed requests
 - Managing crawl policies or rate limiting
@@ -25,21 +29,21 @@ from omniread.core.scraper import BaseScraper

 class HTMLScraper(BaseScraper):
    """
-    Base HTML scraper using httpx.
+    Base HTML scraper using `httpx`.

-    This scraper retrieves HTML documents over HTTP(S) and returns them
-    as raw content wrapped in a `Content` object.
+    Notes:
+        **Responsibilities:**

-    Fetches raw bytes and metadata only.
-    The scraper:
-    - Uses `httpx.Client` for HTTP requests
-    - Enforces an HTML content type
-    - Preserves HTTP response metadata
+            - This scraper retrieves HTML documents over HTTP(S) and returns
+              them as raw content wrapped in a `Content` object.
+            - Fetches raw bytes and metadata only.
+            - The scraper uses `httpx.Client` for HTTP requests, enforces an
+              HTML content type, and preserves HTTP response metadata.

-    The scraper does not:
-    - Parse HTML
-    - Perform retries or backoff
-    - Handle non-HTML responses
+        **Constraints:**
+
+            - The scraper does not parse HTML, perform retries or backoff,
+              or handle non-HTML responses.
    """

    def __init__(
@@ -54,11 +58,14 @@ class HTMLScraper(BaseScraper):
        Initialize the HTML scraper.

        Args:
-            client: Optional pre-configured `httpx.Client`. If omitted,
-                a client is created internally.
-            timeout: Request timeout in seconds.
-            headers: Optional default HTTP headers.
-            follow_redirects: Whether to follow HTTP redirects.
+            client (httpx.Client | None, optional):
+                Optional pre-configured `httpx.Client`. If omitted, a client is created internally.
+            timeout (float, optional):
+                Request timeout in seconds.
+            headers (Optional[Mapping[str, str]], optional):
+                Optional default HTTP headers.
+            follow_redirects (bool, optional):
+                Whether to follow HTTP redirects.
        """

        self._client = client or httpx.Client(
@@ -76,11 +83,12 @@ class HTMLScraper(BaseScraper):
        Validate that the HTTP response contains HTML content.

        Args:
-            response: HTTP response returned by `httpx`.
+            response (httpx.Response):
+                HTTP response returned by `httpx`.

        Raises:
-            ValueError: If the `Content-Type` header is missing or does not
-                indicate HTML content.
+            ValueError:
+                If the `Content-Type` header is missing or does not indicate HTML content.
        """

        raw_ct = response.headers.get("Content-Type")
@@ -103,19 +111,20 @@ class HTMLScraper(BaseScraper):
        Fetch an HTML document from the given source.

        Args:
-            source: URL of the HTML document.
-            metadata: Optional metadata to be merged into the returned content.
+            source (str):
+                URL of the HTML document.
+            metadata (Optional[Mapping[str, Any]], optional):
+                Optional metadata to be merged into the returned content.

        Returns:
-            A `Content` instance containing:
-            - Raw HTML bytes
-            - Source URL
-            - HTML content type
-            - HTTP response metadata
+            Content:
+                A `Content` instance containing raw HTML bytes, source URL, HTML content type, and HTTP response metadata.

        Raises:
-            httpx.HTTPError: If the HTTP request fails.
-            ValueError: If the response is not valid HTML.
+            httpx.HTTPError:
+                If the HTTP request fails.
+            ValueError:
+                If the response is not valid HTML.
        """

        response = self._client.get(source)
--- a/omniread/html/scraper.pyi
+++ b/omniread/html/scraper.pyi
@@ -0,0 +1,10 @@
+import httpx
+from typing import Any, Mapping, Optional
+from omniread.core.content import Content, ContentType
+from omniread.core.scraper import BaseScraper
+
+class HTMLScraper(BaseScraper):
+    content_type: ContentType
+    def __init__(self, *, client: Optional[httpx.Client] = ..., timeout: float = ..., headers: Optional[Mapping[str, str]] = ..., follow_redirects: bool = ...) -> None: ...
+    def validate_content_type(self, response: httpx.Response) -> None: ...
+    def fetch(self, source: str, *, metadata: Optional[Mapping[str, Any]] = ...) -> Content: ...
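Review note: the body of `validate_content_type` is not visible in this diff, so the following is only a sketch of the documented behavior (raise `ValueError` when the `Content-Type` header is missing or not HTML). The helper names are hypothetical, and treating `application/xhtml+xml` as HTML is an assumption, not something the source confirms.

```python
from typing import Optional

# Media types treated as HTML in this sketch (the second one is an assumption).
HTML_MEDIA_TYPES = {"text/html", "application/xhtml+xml"}


def is_html_content_type(raw_ct: Optional[str]) -> bool:
    """Return True if a Content-Type header value names an HTML media type."""
    if not raw_ct:
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = raw_ct.split(";", 1)[0].strip().lower()
    return media_type in HTML_MEDIA_TYPES


def validate_content_type(raw_ct: Optional[str]) -> None:
    """Raise ValueError when the header is missing or does not indicate HTML."""
    if not is_html_content_type(raw_ct):
        raise ValueError(f"Expected HTML content, got: {raw_ct!r}")
```

Comparing only the media type, rather than the full header string, keeps responses such as `text/html; charset=utf-8` valid.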
--- a/omniread/pdf/__init__.py
+++ b/omniread/pdf/__init__.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 PDF format implementation for OmniRead.

 This package provides **PDF-specific implementations** of the core OmniRead
@@ -6,12 +8,23 @@ contracts defined in `omniread.core`.

 Unlike HTML, PDF handling requires an explicit client layer for document
 access. This package therefore includes:
- PDF clients for acquiring raw PDF data
- PDF scrapers that coordinate client access
- PDF parsers that extract structured content from PDF binaries
+
+- PDF clients for acquiring raw PDF data.
+- PDF scrapers that coordinate client access.
+- PDF parsers that extract structured content from PDF binaries.

 Public exports from this package represent the supported PDF pipeline
 and are safe for consumers to import directly when working with PDFs.
+
+---
+
+# Public API
+
+- `FileSystemPDFClient`
+- `PDFScraper`
+- `PDFParser`
+
+---
 """

 from .client import FileSystemPDFClient
--- a/omniread/pdf/__init__.pyi
+++ b/omniread/pdf/__init__.pyi
@@ -0,0 +1,5 @@
+from .client import FileSystemPDFClient
+from .scraper import PDFScraper
+from .parser import PDFParser
+
+__all__ = ["FileSystemPDFClient", "PDFScraper", "PDFParser"]
--- a/omniread/pdf/client.py
+++ b/omniread/pdf/client.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 PDF client abstractions for OmniRead.

 This module defines the **client layer** responsible for retrieving raw PDF
@@ -9,40 +11,48 @@ decoupled from scraping and parsing logic. They do not perform validation,
 interpretation, or content extraction.

 Typical backing stores include:
+
 - Local filesystems
 - Object storage (S3, GCS, etc.)
 - Network file systems
 """

+from typing import Any
 from abc import ABC, abstractmethod
 from pathlib import Path


 class BasePDFClient(ABC):
    """
-    Abstract client responsible for retrieving PDF bytes
-    from a specific backing store (filesystem, S3, FTP, etc.).
+    Abstract client responsible for retrieving PDF bytes.

-    Implementations must:
-    - Accept a source identifier appropriate to the backing store
-    - Return the full PDF binary payload
-    - Raise retrieval-specific errors on failure
+    Retrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).
+
+    Notes:
+        **Responsibilities:**
+
+            - Implementations must accept a source identifier appropriate to
+              the backing store.
+            - Implementations must return the full PDF binary payload.
+            - Implementations must raise retrieval-specific errors on failure.
    """

    @abstractmethod
-    def fetch(self, source: str) -> bytes:
+    def fetch(self, source: Any) -> bytes:
        """
        Fetch raw PDF bytes from the given source.

        Args:
-            source: Identifier of the PDF location, such as a file path,
-                object storage key, or remote reference.
+            source (Any):
+                Identifier of the PDF location, such as a file path, object storage key, or remote reference.

        Returns:
-            Raw PDF bytes.
+            bytes:
+                Raw PDF bytes.

        Raises:
-            Exception: Retrieval-specific errors defined by the implementation.
+            Exception:
+                Retrieval-specific errors defined by the implementation.
        """
        raise NotImplementedError

@@ -51,8 +61,11 @@ class FileSystemPDFClient(BasePDFClient):
    """
    PDF client that reads from the local filesystem.

-    This client reads PDF files directly from the disk and returns their raw
-    binary contents.
+    Notes:
+        **Guarantees:**
+
+            - This client reads PDF files directly from the disk and returns
+              their raw binary contents.
    """

    def fetch(self, path: Path) -> bytes:
@@ -60,14 +73,18 @@ class FileSystemPDFClient(BasePDFClient):
        Read a PDF file from the local filesystem.

        Args:
-            path: Filesystem path to the PDF file.
+            path (Path):
+                Filesystem path to the PDF file.

        Returns:
-            Raw PDF bytes.
+            bytes:
+                Raw PDF bytes.

        Raises:
-            FileNotFoundError: If the path does not exist.
-            ValueError: If the path exists but is not a file.
+            FileNotFoundError:
+                If the path does not exist.
+            ValueError:
+                If the path exists but is not a file.
        """

        if not path.exists():
--- a/omniread/pdf/client.pyi
+++ b/omniread/pdf/client.pyi
@@ -0,0 +1,10 @@
+from abc import ABC, abstractmethod
+from pathlib import Path
+from typing import Any
+
+class BasePDFClient(ABC):
+    @abstractmethod
+    def fetch(self, source: Any) -> bytes: ...
+
+class FileSystemPDFClient(BasePDFClient):
+    def fetch(self, source: Path | str) -> bytes: ...
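Review note: the stub accepts `Path | str`, while the implementation in `client.py` annotates the parameter as `path: Path` and calls `path.exists()` directly. If string paths are meant to be supported, one option is to normalize the argument first. The function below is a hypothetical standalone sketch of that idea, not the code in `FileSystemPDFClient`; the error messages are illustrative.

```python
from pathlib import Path
from typing import Union


def read_pdf_bytes(source: Union[Path, str]) -> bytes:
    """Read a PDF from disk, accepting either a Path or a string path."""
    # Normalize so that str inputs get the same checks as Path inputs.
    path = Path(source)
    if not path.exists():
        raise FileNotFoundError(f"PDF not found: {path}")
    if not path.is_file():
        raise ValueError(f"Path is not a file: {path}")
    # Return the raw bytes without validating or interpreting them.
    return path.read_bytes()
```

This keeps the documented error behavior (`FileNotFoundError` for a missing path, `ValueError` for a non-file) while matching the wider type in the stub.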
--- a/omniread/pdf/parser.py
+++ b/omniread/pdf/parser.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 PDF parser base implementations for OmniRead.

 This module defines the **PDF-specific parser contract**, extending the
@@ -21,29 +23,40 @@ class PDFParser(BaseParser[T], Generic[T]):
    """
    Base PDF parser.

-    This class enforces PDF content-type compatibility and provides the
-    extension point for implementing concrete PDF parsing strategies.
+    Notes:
+        **Responsibilities:**

-    Concrete implementations must define:
-    - Define the output type `T`
-    - Implement the `parse()` method
+            - This class enforces PDF content-type compatibility and provides
+              the extension point for implementing concrete PDF parsing strategies.
+
+        **Constraints:**
+
+            - Concrete implementations must define the output type `T` and
+              implement the `parse()` method.
    """

    supported_types = {ContentType.PDF}
-    """Set of content types supported by this parser (PDF only)."""
+    """
+    Set of content types supported by this parser (PDF only).
+    """

    @abstractmethod
    def parse(self) -> T:
        """
        Parse PDF content into a structured output.

-        Implementations must fully interpret the PDF binary payload and
-        return a deterministic, structured output.
-
        Returns:
-            Parsed representation of type `T`.
+            T:
+                Parsed representation of type `T`.

        Raises:
-            Exception: Parsing-specific errors as defined by the implementation.
+            Exception:
+                Parsing-specific errors as defined by the implementation.
+
+        Notes:
+            **Responsibilities:**
+
+                - Implementations must fully interpret the PDF binary payload and
+                  return a deterministic, structured output.
        """
        raise NotImplementedError
--- a/omniread/pdf/parser.pyi
+++ b/omniread/pdf/parser.pyi
@@ -0,0 +1,11 @@
+from abc import abstractmethod
+from typing import Generic, TypeVar
+from omniread.core.content import ContentType
+from omniread.core.parser import BaseParser
+
+T = TypeVar("T")
+
+class PDFParser(BaseParser[T], Generic[T]):
+    supported_types: set[ContentType]
+    @abstractmethod
+    def parse(self) -> T: ...
--- a/omniread/pdf/scraper.py
+++ b/omniread/pdf/scraper.py
@@ -1,4 +1,6 @@
 """
+# Summary
+
 PDF scraping implementation for OmniRead.

 This module provides a PDF-specific scraper that coordinates PDF byte
@@ -19,13 +21,17 @@ class PDFScraper(BaseScraper):
    """
    Scraper for PDF sources.

-    Delegates byte retrieval to a PDF client and normalizes
-    output into Content.
+    Notes:
+        **Responsibilities:**

-    The scraper:
-    - Does not perform parsing or interpretation
-    - Does not assume a specific storage backend
-    - Preserves caller-provided metadata
+            - Delegates byte retrieval to a PDF client and normalizes output
+              into `Content`.
+            - Preserves caller-provided metadata.
+
+        **Constraints:**
+
+            - The scraper does not perform parsing or interpretation.
+            - The scraper does not assume a specific storage backend.
    """

    def __init__(self, *, client: BasePDFClient):
@@ -33,13 +39,14 @@ class PDFScraper(BaseScraper):
        Initialize the PDF scraper.

        Args:
-            client: PDF client responsible for retrieving raw PDF bytes.
+            client (BasePDFClient):
+                PDF client responsible for retrieving raw PDF bytes.
        """
        self._client = client

    def fetch(
        self,
-        source: str,
+        source: Any,
        *,
        metadata: Optional[Mapping[str, Any]] = None,
    ) -> Content:
@@ -47,19 +54,18 @@ class PDFScraper(BaseScraper):
        Fetch a PDF document from the given source.

        Args:
-            source: Identifier of the PDF source as understood by the
-                configured PDF client.
-            metadata: Optional metadata to attach to the returned content.
+            source (Any):
+                Identifier of the PDF source as understood by the configured PDF client.
+            metadata (Optional[Mapping[str, Any]], optional):
+                Optional metadata to attach to the returned content.

        Returns:
-            A `Content` instance containing:
-            - Raw PDF bytes
-            - Source identifier
-            - PDF content type
-            - Optional metadata
+            Content:
+                A `Content` instance containing raw PDF bytes, source identifier, PDF content type, and optional metadata.

        Raises:
-            Exception: Retrieval-specific errors raised by the PDF client.
+            Exception:
+                Retrieval-specific errors raised by the PDF client.
        """
        raw = self._client.fetch(source)

--- a/omniread/pdf/scraper.pyi
+++ b/omniread/pdf/scraper.pyi
@@ -0,0 +1,8 @@
+from typing import Any, Mapping, Optional
+from omniread.core.content import Content, ContentType
+from omniread.core.scraper import BaseScraper
+from .client import BasePDFClient
+
+class PDFScraper(BaseScraper):
+    def __init__(self, *, client: BasePDFClient) -> None: ...
+    def fetch(self, source: Any, *, metadata: Optional[Mapping[str, Any]] = ...) -> Content: ...
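Review note: the client injection described in the `PDFScraper` docstring can be sketched as follows. Everything here is a stand-in for illustration: `Content` is a simplified local dataclass (the real model's fields other than `raw` are assumed), `SketchPDFScraper` and `FakeClient` are hypothetical names, and the `"application/pdf"` string stands in for `ContentType.PDF`.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Protocol


class PDFClient(Protocol):
    """Structural stand-in for BasePDFClient."""

    def fetch(self, source: Any) -> bytes: ...


@dataclass(frozen=True)
class Content:
    """Simplified stand-in for omniread.core.content.Content."""

    raw: bytes
    source: str
    content_type: str
    metadata: Mapping[str, Any] = field(default_factory=dict)


class SketchPDFScraper:
    """Delegates byte retrieval to a client and wraps the result in Content."""

    def __init__(self, *, client: PDFClient) -> None:
        self._client = client

    def fetch(
        self, source: Any, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        raw = self._client.fetch(source)
        # Caller-provided metadata is copied through unchanged.
        return Content(
            raw=raw,
            source=str(source),
            content_type="application/pdf",
            metadata=dict(metadata or {}),
        )


class FakeClient:
    """Test double that returns fixed bytes for any source."""

    def fetch(self, source: Any) -> bytes:
        return b"%PDF-1.7 fake"


content = SketchPDFScraper(client=FakeClient()).fetch(
    "report.pdf", metadata={"tenant": "demo"}
)
```

Because the scraper only depends on the client's `fetch` method, a filesystem, object-storage, or fake client can be swapped in without touching the scraper.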
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,19 +0,0 @@
-httpx==0.27.0
-beautifulsoup4==4.12.0
-pydantic==2.12.3
-jinja2==3.1.6
-# lxml==5.2.0
-
-# Test Packages
-pytest==7.4.0
-pytest-asyncio==0.21.0
-pytest-cov==4.1.0
-
-# Doc Packages
-mkdocs==1.6.1
-mkdocs-material==9.6.23
-neoteroi-mkdocs==1.1.3
-pymdown-extensions==10.16.1
-mkdocs-swagger-ui-tag==0.7.2
-mkdocstrings==1.0.0
-mkdocstrings-python==2.0.1
Author	SHA1	Message	Date
Vishesh 'ironeagle' Bangotra	de7d04eb1a	updated docs strings and added README.md	2026-03-08 17:59:56 +05:30
Vishesh 'ironeagle' Bangotra	0fbf0ca0f0	mcp docs	2026-03-08 00:41:28 +05:30
Vishesh 'ironeagle' Bangotra	5842e6a227	google styled doc	2026-03-08 00:29:25 +05:30
Vishesh 'ironeagle' Bangotra	a188e78283	module as source doc fixes (#2 ) Reviewed-on: #2 Co-authored-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com> Co-committed-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>	2026-02-21 16:47:08 +00:00
Vishesh 'ironeagle' Bangotra	67a3074ab4	using doc-forge (#1 ) Reviewed-on: #1 Co-authored-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com> Co-committed-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>	2026-01-22 11:27:56 +00:00
Vishesh 'ironeagle' Bangotra	6808538485	added .drone.yml	2026-01-09 15:55:54 +05:30