updated mcp

2026-03-08 17:57:34 +05:30
parent 9191de9dff
commit 0e49f02c4c
167 changed files with 7632 additions and 98942 deletions
--- a/libs/omniread/site/pdf/client/index.html
+++ b/libs/omniread/site/pdf/client/index.html
@@ -974,18 +974,19 @@

    <div class="doc doc-contents first">

-      <p>PDF client abstractions for OmniRead.</p>
-<hr />
-<h4 id="omniread.pdf.client--summary">Summary</h4>
+      <h3 id="omniread.pdf.client--summary">Summary</h3>
+<p>PDF client abstractions for OmniRead.</p>
 <p>This module defines the <strong>client layer</strong> responsible for retrieving raw PDF
 bytes from a concrete backing store.</p>
 <p>Clients provide low-level access to PDF binaries and are intentionally
 decoupled from scraping and parsing logic. They do not perform validation,
 interpretation, or content extraction.</p>
-<p>Typical backing stores include:
- Local filesystems
- Object storage (S3, GCS, etc.)
- Network file systems</p>
+<p>Typical backing stores include:</p>
+<ul>
+<li>Local filesystems</li>
+<li>Object storage (S3, GCS, etc.)</li>
+<li>Network file systems</li>
+</ul>



@@ -1014,14 +1015,20 @@ interpretation, or content extraction.</p>
              Bases: <code><span title="abc.ABC">ABC</span></code></p>


-      <p>Abstract client responsible for retrieving PDF bytes
-from a specific backing store (filesystem, S3, FTP, etc.).</p>
+      <p>Abstract client responsible for retrieving PDF bytes.</p>
+<p>Retrieves bytes from a specific backing store (filesystem, S3, FTP, etc.).</p>


 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must accept a source identifier appropriate to the backing store, return the full PDF binary payload, and raise retrieval-specific errors on failure
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span>
+<span class="normal">3</span>
+<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must accept a source identifier appropriate to
+  the backing store.
+- Return the full PDF binary payload.
+- Raise retrieval-specific errors on failure.
 </code></pre></div></td></tr></table></div>
 </details>

@@ -1165,7 +1172,9 @@ from a specific backing store (filesystem, S3, FTP, etc.).</p>
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Guarantees:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- This client reads PDF files directly from the disk and returns their raw binary contents
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- This client reads PDF files directly from the disk and returns
+  their raw binary contents.
 </code></pre></div></td></tr></table></div>
 </details>

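Reviewer note: the client contract documented in this file (accept a source identifier, return the full PDF payload, raise a retrieval-specific error on failure) can be sketched as below. This is a minimal illustration and not the library's code. The method name `fetch` and the exception `PDFRetrievalError` are assumptions, since the diff does not show the real signatures.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class PDFRetrievalError(Exception):
    """Hypothetical retrieval-specific error; the real type is not shown in this diff."""


class BasePDFClient(ABC):
    """Sketch of the abstract client: one method that returns raw PDF bytes."""

    @abstractmethod
    def fetch(self, source: str) -> bytes:  # method name is an assumption
        """Return the full binary payload for `source`."""


class FileSystemPDFClient(BasePDFClient):
    """Sketch of a filesystem-backed client: reads the file and returns its bytes."""

    def fetch(self, source: str) -> bytes:
        path = Path(source)
        try:
            # No validation or interpretation here: the client only moves bytes.
            return path.read_bytes()
        except OSError as exc:
            # Wrap low-level I/O failures in the retrieval-specific error.
            raise PDFRetrievalError(f"could not read {path}") from exc
```

Keeping validation out of the client matches the module summary above: interpretation belongs to the parser layer, so a client for another backing store would only need to replace the body of `fetch`.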
--- a/libs/omniread/site/pdf/index.html
+++ b/libs/omniread/site/pdf/index.html
@@ -896,26 +896,26 @@

    <div class="doc doc-contents first">

-      <p>PDF format implementation for OmniRead.</p>
-<hr />
-<h4 id="omniread.pdf--summary">Summary</h4>
+      <h3 id="omniread.pdf--summary">Summary</h3>
+<p>PDF format implementation for OmniRead.</p>
 <p>This package provides <strong>PDF-specific implementations</strong> of the core OmniRead
 contracts defined in <code>omniread.core</code>.</p>
 <p>Unlike HTML, PDF handling requires an explicit client layer for document
-access. This package therefore includes:
- PDF clients for acquiring raw PDF data
- PDF scrapers that coordinate client access
- PDF parsers that extract structured content from PDF binaries</p>
+access. This package therefore includes:</p>
+<ul>
+<li>PDF clients for acquiring raw PDF data.</li>
+<li>PDF scrapers that coordinate client access.</li>
+<li>PDF parsers that extract structured content from PDF binaries.</li>
+</ul>
 <p>Public exports from this package represent the supported PDF pipeline
 and are safe for consumers to import directly when working with PDFs.</p>
 <hr />
-<h4 id="omniread.pdf--public-api">Public API</h4>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
-<span class="normal">2</span>
-<span class="normal">3</span></pre></div></td><td class="code"><div><pre><span></span><code>FileSystemPDFClient
-PDFScraper
-PDFParser
-</code></pre></div></td></tr></table></div>
+<h3 id="omniread.pdf--public-api">Public API</h3>
+<ul>
+<li><code>FileSystemPDFClient</code></li>
+<li><code>PDFScraper</code></li>
+<li><code>PDFParser</code></li>
+</ul>
 <hr />


@@ -951,7 +951,9 @@ PDFParser
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Guarantees:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- This client reads PDF files directly from the disk and returns their raw binary contents
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- This client reads PDF files directly from the disk and returns
+  their raw binary contents.
 </code></pre></div></td></tr></table></div>
 </details>

@@ -1093,7 +1095,7 @@ PDFParser

    <div class="doc doc-contents ">
            <p class="doc doc-class-bases">
-              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.parser.BaseParser" href="../omniread/core/parser/#omniread.core.parser.BaseParser">BaseParser</a>[<span title="omniread.pdf.parser.T">T</span>]</code>, <code><span title="typing.Generic">Generic</span>[<span title="omniread.pdf.parser.T">T</span>]</code></p>
+              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.parser.BaseParser" href="../core/parser/#omniread.core.parser.BaseParser">BaseParser</a>[<span title="omniread.pdf.parser.T">T</span>]</code>, <code><span title="typing.Generic">Generic</span>[<span title="omniread.pdf.parser.T">T</span>]</code></p>


      <p>Base PDF parser.</p>
@@ -1102,10 +1104,14 @@ PDFParser
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- This class enforces PDF content-type compatibility and provides the extension point for implementing concrete PDF parsing strategies
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- This class enforces PDF content-type compatibility and provides
+  the extension point for implementing concrete PDF parsing strategies.
 </code></pre></div></td></tr></table></div>
 <p><strong>Constraints:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Concrete implementations must: Define the output type `T`, implement the `parse()` method
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Concrete implementations must define the output type `T` and
+  implement the `parse()` method.
 </code></pre></div></td></tr></table></div>
 </details>
      <p>Initialize the parser with content to be parsed.</p>
@@ -1125,7 +1131,7 @@ PDFParser
          <tr class="doc-section-item">
            <td><code>content</code></td>
            <td>
-                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../omniread/core/content/#omniread.core.content.Content">Content</a></code>
+                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../core/content/#omniread.core.content.Content">Content</a></code>
            </td>
            <td>
              <div class="doc-md-description">
@@ -1268,7 +1274,9 @@ PDFParser
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully interpret the PDF binary payload and return a deterministic, structured output
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully interpret the PDF binary payload and
+  return a deterministic, structured output.
 </code></pre></div></td></tr></table></div>
 </details>
    </div>
@@ -1339,7 +1347,7 @@ PDFParser

    <div class="doc doc-contents ">
            <p class="doc doc-class-bases">
-              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.scraper.BaseScraper" href="../omniread/core/scraper/#omniread.core.scraper.BaseScraper">BaseScraper</a></code></p>
+              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.scraper.BaseScraper" href="../core/scraper/#omniread.core.scraper.BaseScraper">BaseScraper</a></code></p>


      <p>Scraper for PDF sources.</p>
@@ -1349,11 +1357,15 @@ PDFParser
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
 <div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
-<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Delegates byte retrieval to a PDF client and normalizes output into Content
- Preserves caller-provided metadata
+<span class="normal">2</span>
+<span class="normal">3</span></pre></div></td><td class="code"><div><pre><span></span><code>- Delegates byte retrieval to a PDF client and normalizes output
+  into `Content`.
+- Preserves caller-provided metadata.
 </code></pre></div></td></tr></table></div>
 <p><strong>Constraints:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- The scraper: Does not perform parsing or interpretation, does not assume a specific storage backend
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- The scraper does not perform parsing or interpretation.
+- Does not assume a specific storage backend.
 </code></pre></div></td></tr></table></div>
 </details>
      <p>Initialize the PDF scraper.</p>
@@ -1470,7 +1482,7 @@ PDFParser
      <tbody>
          <tr class="doc-section-item">
 <td><code>Content</code></td>            <td>
-                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../omniread/core/content/#omniread.core.content.Content">Content</a></code>
+                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../core/content/#omniread.core.content.Content">Content</a></code>
            </td>
            <td>
              <div class="doc-md-description">
--- a/libs/omniread/site/pdf/parser/index.html
+++ b/libs/omniread/site/pdf/parser/index.html
@@ -962,9 +962,8 @@

    <div class="doc doc-contents first">

-      <p>PDF parser base implementations for OmniRead.</p>
-<hr />
-<h4 id="omniread.pdf.parser--summary">Summary</h4>
+      <h3 id="omniread.pdf.parser--summary">Summary</h3>
+<p>PDF parser base implementations for OmniRead.</p>
 <p>This module defines the <strong>PDF-specific parser contract</strong>, extending the
 format-agnostic <code>BaseParser</code> with constraints appropriate for PDF content.</p>
 <p>PDF parsers are responsible for interpreting binary PDF data and producing
@@ -995,7 +994,7 @@ structured representations suitable for downstream consumption.</p>

    <div class="doc doc-contents ">
            <p class="doc doc-class-bases">
-              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.parser.BaseParser" href="../../omniread/core/parser/#omniread.core.parser.BaseParser">BaseParser</a>[<span title="omniread.pdf.parser.T">T</span>]</code>, <code><span title="typing.Generic">Generic</span>[<span title="omniread.pdf.parser.T">T</span>]</code></p>
+              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.parser.BaseParser" href="../../core/parser/#omniread.core.parser.BaseParser">BaseParser</a>[<span title="omniread.pdf.parser.T">T</span>]</code>, <code><span title="typing.Generic">Generic</span>[<span title="omniread.pdf.parser.T">T</span>]</code></p>


      <p>Base PDF parser.</p>
@@ -1004,10 +1003,14 @@ structured representations suitable for downstream consumption.</p>
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- This class enforces PDF content-type compatibility and provides the extension point for implementing concrete PDF parsing strategies
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- This class enforces PDF content-type compatibility and provides
+  the extension point for implementing concrete PDF parsing strategies.
 </code></pre></div></td></tr></table></div>
 <p><strong>Constraints:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Concrete implementations must: Define the output type `T`, implement the `parse()` method
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Concrete implementations must define the output type `T` and
+  implement the `parse()` method.
 </code></pre></div></td></tr></table></div>
 </details>
      <p>Initialize the parser with content to be parsed.</p>
@@ -1027,7 +1030,7 @@ structured representations suitable for downstream consumption.</p>
          <tr class="doc-section-item">
            <td><code>content</code></td>
            <td>
-                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../../omniread/core/content/#omniread.core.content.Content">Content</a></code>
+                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../../core/content/#omniread.core.content.Content">Content</a></code>
            </td>
            <td>
              <div class="doc-md-description">
@@ -1170,7 +1173,9 @@ structured representations suitable for downstream consumption.</p>
 <details class="notes" open>
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully interpret the PDF binary payload and return a deterministic, structured output
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully interpret the PDF binary payload and
+  return a deterministic, structured output.
 </code></pre></div></td></tr></table></div>
 </details>
    </div>
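Reviewer note: the parser contract described in this file (enforce PDF content-type compatibility, define the output type `T`, implement `parse()`) might look roughly like the sketch below. The `Content` stand-in, its field names, the `application/pdf` check, and the toy `PDFVersionParser` are all assumptions for illustration. The real `omniread.core` classes are not shown in this diff, and a real parser would interpret the whole payload rather than only the header.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class Content:
    """Stand-in for omniread.core.content.Content; field names are assumptions."""

    raw: bytes
    content_type: str
    metadata: dict = field(default_factory=dict)


class PDFParser(ABC, Generic[T]):
    """Sketch of the base PDF parser: checks the content type, leaves parse() abstract."""

    def __init__(self, content: Content) -> None:
        # Assumed form of the "content-type compatibility" check.
        if content.content_type != "application/pdf":
            raise ValueError(f"unsupported content type: {content.content_type}")
        self.content = content

    @abstractmethod
    def parse(self) -> T:
        """Return a deterministic, structured result of type T."""


class PDFVersionParser(PDFParser[str]):
    """Toy concrete parser (hypothetical): T is str, the version from the header."""

    def parse(self) -> str:
        header = self.content.raw[:8]  # e.g. b"%PDF-1.4"
        if not header.startswith(b"%PDF-"):
            raise ValueError("missing %PDF- header")
        return header[5:].decode("ascii")
```

Usage under these assumptions: `PDFVersionParser(Content(b"%PDF-1.4\n", "application/pdf")).parse()` returns the version string, and constructing the parser with any other content type raises `ValueError` before `parse()` is ever called.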
--- a/libs/omniread/site/pdf/scraper/index.html
+++ b/libs/omniread/site/pdf/scraper/index.html
@@ -894,9 +894,8 @@

    <div class="doc doc-contents first">

-      <p>PDF scraping implementation for OmniRead.</p>
-<hr />
-<h4 id="omniread.pdf.scraper--summary">Summary</h4>
+      <h3 id="omniread.pdf.scraper--summary">Summary</h3>
+<p>PDF scraping implementation for OmniRead.</p>
 <p>This module provides a PDF-specific scraper that coordinates PDF byte
 retrieval via a client and normalizes the result into a <code>Content</code> object.</p>
 <p>The scraper implements the core <code>BaseScraper</code> contract while delegating
@@ -927,7 +926,7 @@ all storage and access concerns to a <code>BasePDFClient</code> implementation.<

    <div class="doc doc-contents ">
            <p class="doc doc-class-bases">
-              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.scraper.BaseScraper" href="../../omniread/core/scraper/#omniread.core.scraper.BaseScraper">BaseScraper</a></code></p>
+              Bases: <code><a class="autorefs autorefs-internal" title="omniread.core.scraper.BaseScraper" href="../../core/scraper/#omniread.core.scraper.BaseScraper">BaseScraper</a></code></p>


      <p>Scraper for PDF sources.</p>
@@ -937,11 +936,15 @@ all storage and access concerns to a <code>BasePDFClient</code> implementation.<
  <summary>Notes</summary>
  <p><strong>Responsibilities:</strong></p>
 <div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
-<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Delegates byte retrieval to a PDF client and normalizes output into Content
- Preserves caller-provided metadata
+<span class="normal">2</span>
+<span class="normal">3</span></pre></div></td><td class="code"><div><pre><span></span><code>- Delegates byte retrieval to a PDF client and normalizes output
+  into `Content`.
+- Preserves caller-provided metadata.
 </code></pre></div></td></tr></table></div>
 <p><strong>Constraints:</strong></p>
-<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- The scraper: Does not perform parsing or interpretation, does not assume a specific storage backend
+<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
+<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- The scraper does not perform parsing or interpretation.
+- Does not assume a specific storage backend.
 </code></pre></div></td></tr></table></div>
 </details>
      <p>Initialize the PDF scraper.</p>
@@ -1058,7 +1061,7 @@ all storage and access concerns to a <code>BasePDFClient</code> implementation.<
      <tbody>
          <tr class="doc-section-item">
 <td><code>Content</code></td>            <td>
-                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../../omniread/core/content/#omniread.core.content.Content">Content</a></code>
+                  <code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../../core/content/#omniread.core.content.Content">Content</a></code>
            </td>
            <td>
              <div class="doc-md-description">
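Reviewer note: the scraper notes in this file (delegate byte retrieval to a PDF client, normalize into `Content`, preserve caller-provided metadata, assume no specific backend) can be illustrated with the sketch below. Everything except the names `PDFScraper` and `Content` is hypothetical: the `fetch` signatures, the `Content` fields, and the `InMemoryPDFClient` test double are assumptions, not the library's API.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Protocol


@dataclass(frozen=True)
class Content:
    """Stand-in for omniread.core.content.Content; field names are assumptions."""

    raw: bytes
    source: str
    content_type: str
    metadata: dict = field(default_factory=dict)


class PDFClient(Protocol):
    """Structural type for any client that can return PDF bytes."""

    def fetch(self, source: str) -> bytes: ...


class InMemoryPDFClient:
    """Hypothetical client backed by a dict, to show the scraper is backend-agnostic."""

    def __init__(self, blobs: Mapping[str, bytes]) -> None:
        self._blobs = dict(blobs)

    def fetch(self, source: str) -> bytes:
        return self._blobs[source]


class PDFScraper:
    """Sketch of the scraper: no parsing, only retrieval and normalization."""

    def __init__(self, client: PDFClient) -> None:
        self._client = client

    def fetch(self, source: str, metadata: Optional[Mapping[str, Any]] = None) -> Content:
        raw = self._client.fetch(source)  # storage concerns stay in the client
        return Content(
            raw=raw,
            source=source,
            content_type="application/pdf",
            metadata=dict(metadata or {}),  # caller metadata is copied through unchanged
        )
```

Because the scraper depends only on the `fetch` method, swapping the in-memory client for a filesystem or object-store client would not change `PDFScraper` itself.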