updated mcp

2026-03-08 17:57:34 +05:30
parent 9191de9dff
commit 0e49f02c4c
167 changed files with 7632 additions and 98942 deletions

@@ -896,19 +896,22 @@
<div class="doc doc-contents first">
<p>Abstract scraping contracts for OmniRead.</p>
<hr />
<h4 id="omniread.core.scraper--summary">Summary</h4>
<h3 id="omniread.core.scraper--summary">Summary</h3>
<p>Abstract scraping contracts for OmniRead.</p>
<p>This module defines the <strong>format-agnostic scraper interface</strong> responsible for
acquiring raw content from external sources.</p>
<p>Scrapers are responsible for:
- Locating and retrieving raw content bytes
- Attaching minimal contextual metadata
- Returning normalized <code>Content</code> objects</p>
<p>Scrapers are explicitly NOT responsible for:
- Parsing or interpreting content
- Inferring structure or semantics
- Performing content-type specific processing</p>
<p>Scrapers are responsible for:</p>
<ul>
<li>Locating and retrieving raw content bytes</li>
<li>Attaching minimal contextual metadata</li>
<li>Returning normalized <code>Content</code> objects</li>
</ul>
<p>Scrapers are explicitly NOT responsible for:</p>
<ul>
<li>Parsing or interpreting content</li>
<li>Inferring structure or semantics</li>
<li>Performing content-type specific processing</li>
</ul>
<p>All interpretation must be delegated to parsers.</p>
@@ -947,13 +950,21 @@ acquiring raw content from external sources.</p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- A scraper is responsible ONLY for fetching raw content (bytes) from a source. It must not interpret or parse it
- A scraper is a stateless acquisition component that retrieves raw content from a source and returns it as a `Content` object
- Scrapers define how content is obtained, not what the content means
- Implementations may vary in transport mechanism, authentication strategy, retry and backoff behavior
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span></pre></div></td><td class="code"><div><pre><span></span><code>- A scraper is responsible ONLY for fetching raw content (bytes)
from a source. It must not interpret or parse it.
- A scraper is a stateless acquisition component that retrieves raw
content from a source and returns it as a `Content` object.
- Scrapers define how content is obtained, not what the content means.
- Implementations may vary in transport mechanism, authentication
strategy, retry and backoff behavior.
</code></pre></div></td></tr></table></div>
<p><strong>Constraints:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must not parse content, modify content semantics, or couple scraping logic to a specific parser
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must not parse content, modify content semantics,
or couple scraping logic to a specific parser.
</code></pre></div></td></tr></table></div>
</details>
@@ -1007,7 +1018,7 @@ acquiring raw content from external sources.</p>
</td>
<td>
<div class="doc-md-description">
<p>Location identifier (URL, file path, S3 URI, etc.)</p>
<p>Location identifier (URL, file path, S3 URI, etc.).</p>
</div>
</td>
<td>
@@ -1021,7 +1032,7 @@ acquiring raw content from external sources.</p>
</td>
<td>
<div class="doc-md-description">
<p>Optional hints for the scraper (headers, auth, etc.)</p>
<p>Optional hints for the scraper (headers, auth, etc.).</p>
</div>
</td>
<td>
@@ -1043,7 +1054,7 @@ acquiring raw content from external sources.</p>
<tbody>
<tr class="doc-section-item">
<td><code>Content</code></td> <td>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../../omniread/core/content/#omniread.core.content.Content">Content</a></code>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../content/#omniread.core.content.Content">Content</a></code>
</td>
<td>
<div class="doc-md-description">
@@ -1081,7 +1092,9 @@ acquiring raw content from external sources.</p>
<details class="notes" open>
<summary>Notes</summary>
<p><strong>Responsibilities:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must retrieve the content referenced by `source` and return it as raw bytes wrapped in a `Content` object
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must retrieve the content referenced by `source`
and return it as raw bytes wrapped in a `Content` object.
</code></pre></div></td></tr></table></div>
</details>
</div>
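
The contract documented in the hunks above (a `source` location identifier, optional metadata hints, and raw bytes returned in a `Content` object) could be sketched in Python roughly as follows. This is a minimal illustration, not the actual OmniRead code: the class name `BaseScraper`, the method name `fetch`, the `Content` fields (`raw`, `source`, `metadata`), and the `LocalFileScraper` implementation are all assumptions, since the diff shows only the rendered documentation and not the underlying definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for omniread.core.content.Content."""

    raw: bytes  # unparsed bytes exactly as retrieved
    source: str  # location identifier the bytes came from
    metadata: Mapping[str, Any] = field(default_factory=dict)


class BaseScraper(ABC):
    """Hypothetical abstract scraper: acquires bytes, never interprets them."""

    @abstractmethod
    def fetch(
        self,
        source: str,
        *,
        metadata: Optional[Mapping[str, Any]] = None,
    ) -> Content:
        """Retrieve the content referenced by `source` as raw bytes.

        `metadata` carries optional hints (headers, auth, etc.).
        """


class LocalFileScraper(BaseScraper):
    """Illustrative implementation that treats `source` as a file path."""

    def fetch(
        self,
        source: str,
        *,
        metadata: Optional[Mapping[str, Any]] = None,
    ) -> Content:
        # Read the bytes untouched: no decoding, no parsing, no sniffing.
        raw = Path(source).read_bytes()
        # Attach only minimal context; interpretation is left to parsers.
        return Content(raw=raw, source=source, metadata=dict(metadata or {}))
```

Keeping `fetch` free of any decoding step is what lets the same scraper feed any parser, which matches the constraint that scraping logic must not be coupled to a specific parser.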