updated mcp

2026-03-08 17:57:34 +05:30
parent 9191de9dff
commit 0e49f02c4c
167 changed files with 7632 additions and 98942 deletions


@@ -989,25 +989,26 @@
<div class="doc doc-contents first">
<p>Core domain contracts for OmniRead.</p>
<hr />
<h4 id="omniread.core--summary">Summary</h4>
<h3 id="omniread.core--summary">Summary</h3>
<p>Core domain contracts for OmniRead.</p>
<p>This package defines the <strong>format-agnostic domain layer</strong> of OmniRead.
It exposes canonical content models and abstract interfaces that are
implemented by format-specific modules (HTML, PDF, etc.).</p>
<p>Public exports from this package are considered <strong>stable contracts</strong> and
are safe for downstream consumers to depend on.</p>
<p>Submodules:
- content: Canonical content models and enums
- parser: Abstract parsing contracts
- scraper: Abstract scraping contracts</p>
<p>Submodules:</p>
<ul>
<li><code>content</code>: Canonical content models and enums.</li>
<li><code>parser</code>: Abstract parsing contracts.</li>
<li><code>scraper</code>: Abstract scraping contracts.</li>
</ul>
<p>Format-specific behavior must not be introduced at this layer.</p>
<hr />
<h4 id="omniread.core--public-api">Public API</h4>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>Content
ContentType
</code></pre></div></td></tr></table></div>
<h3 id="omniread.core--public-api">Public API</h3>
<ul>
<li><code>Content</code></li>
<li><code>ContentType</code></li>
</ul>
<hr />
@@ -1045,15 +1046,19 @@ ContentType
<summary>Notes</summary>
<p><strong>Guarantees:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- A parser is a self-contained object that owns the Content it is responsible for interpreting
- Consumers may rely on early validation of content compatibility and type-stable return values from `parse()`
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- A parser is a self-contained object that owns the `Content` it is
responsible for interpreting.
- Consumers may rely on early validation of content compatibility
and type-stable return values from `parse()`.
</code></pre></div></td></tr></table></div>
<p><strong>Responsibilities:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must declare supported content types via `supported_types`
- Implementations must raise parsing-specific exceptions from `parse()`
- Implementations must remain deterministic for a given input
<span class="normal">3</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must declare supported content types via `supported_types`.
- Implementations must raise parsing-specific exceptions from `parse()`.
- Implementations must remain deterministic for a given input.
</code></pre></div></td></tr></table></div>
</details>
<p>Initialize the parser with content to be parsed.</p>
@@ -1073,7 +1078,7 @@ ContentType
<tr class="doc-section-item">
<td><code>content</code></td>
<td>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../omniread/core/content/#omniread.core.content.Content">Content</a></code>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="content/#omniread.core.content.Content">Content</a></code>
</td>
<td>
<div class="doc-md-description">
@@ -1216,7 +1221,9 @@ ContentType
<details class="notes" open>
<summary>Notes</summary>
<p><strong>Responsibilities:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully consume the provided content and return a deterministic, structured output
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must fully consume the provided content and
return a deterministic, structured output.
</code></pre></div></td></tr></table></div>
</details>
</div>
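Reviewer sketch (not part of the commit): the parser notes above describe a parser that owns its `Content`, validates compatibility early, declares `supported_types`, raises parsing-specific exceptions from `parse()`, and stays deterministic. The Python below is one minimal way to read that contract. Only `Content`, `ContentType`, `supported_types`, and `parse()` come from the documentation; the field names, the enum members, `ParserError`, `BaseParser`, and `JsonParser` are assumptions made for illustration.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Any, FrozenSet, Optional


class ContentType(str, Enum):
    # Hypothetical members; the real enum values are not shown in this diff.
    HTML = "text/html"
    JSON = "application/json"


@dataclass(frozen=True)
class Content:
    # Hypothetical fields: a raw payload plus minimal origin/type metadata.
    raw: bytes
    source: str
    content_type: Optional[ContentType] = None


class ParserError(Exception):
    """Hypothetical parsing-specific exception."""


class BaseParser(ABC):
    # Implementations declare which content types they accept.
    supported_types: FrozenSet[ContentType] = frozenset()

    def __init__(self, content: Content) -> None:
        # Early validation: reject incompatible content at construction time.
        if content.content_type not in self.supported_types:
            raise ParserError(f"unsupported content type: {content.content_type!r}")
        self.content = content  # the parser owns the Content it interprets

    @abstractmethod
    def parse(self) -> Any:
        """Fully consume the content and return a structured result."""


class JsonParser(BaseParser):
    supported_types = frozenset({ContentType.JSON})

    def parse(self) -> Any:
        try:
            # json.loads accepts bytes and is deterministic for a given input.
            return json.loads(self.content.raw)
        except ValueError as exc:  # covers JSONDecodeError and bad encodings
            raise ParserError(str(exc)) from exc
```

Under these assumptions a caller sees a failure either at construction (wrong type) or from `parse()` (malformed payload), and in both cases as the same exception family.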
@@ -1298,13 +1305,21 @@ ContentType
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- A scraper is responsible ONLY for fetching raw content (bytes) from a source. It must not interpret or parse it
- A scraper is a stateless acquisition component that retrieves raw content from a source and returns it as a `Content` object
- Scrapers define how content is obtained, not what the content means
- Implementations may vary in transport mechanism, authentication strategy, retry and backoff behavior
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span></pre></div></td><td class="code"><div><pre><span></span><code>- A scraper is responsible ONLY for fetching raw content (bytes)
from a source. It must not interpret or parse it.
- A scraper is a stateless acquisition component that retrieves raw
content from a source and returns it as a `Content` object.
- Scrapers define how content is obtained, not what the content means.
- Implementations may vary in transport mechanism, authentication
strategy, retry and backoff behavior.
</code></pre></div></td></tr></table></div>
<p><strong>Constraints:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must not parse content, modify content semantics, or couple scraping logic to a specific parser
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must not parse content, modify content semantics,
or couple scraping logic to a specific parser.
</code></pre></div></td></tr></table></div>
</details>
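Reviewer sketch (not part of the commit): the scraper notes above say a scraper only fetches raw bytes, holds no state, and returns a `Content` object without interpreting it. The Python below illustrates that split with a local-file scraper. The method name `fetch`, the `metadata` keyword, the `Content` fields, and the classes `BaseScraper` and `FileScraper` are assumptions; the diff shows only that there is a source location and an optional hints argument.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Content:
    # Hypothetical fields; the real model is not shown in this diff.
    raw: bytes
    source: str
    metadata: Mapping[str, Any] = field(default_factory=dict)


class BaseScraper(ABC):
    @abstractmethod
    def fetch(
        self, source: str, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        """Retrieve the bytes behind `source` and wrap them in a Content."""


class FileScraper(BaseScraper):
    """Reads a local file. Keeps no state between calls and never parses."""

    def fetch(
        self, source: str, *, metadata: Optional[Mapping[str, Any]] = None
    ) -> Content:
        raw = Path(source).read_bytes()  # acquisition only, no interpretation
        return Content(raw=raw, source=source, metadata=dict(metadata or {}))
```

Because the scraper returns bytes untouched, any parser can be paired with it later, which is the decoupling the constraint about not binding scraping logic to a specific parser asks for.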
@@ -1358,7 +1373,7 @@ ContentType
</td>
<td>
<div class="doc-md-description">
<p>Location identifier (URL, file path, S3 URI, etc.)</p>
<p>Location identifier (URL, file path, S3 URI, etc.).</p>
</div>
</td>
<td>
@@ -1372,7 +1387,7 @@ ContentType
</td>
<td>
<div class="doc-md-description">
<p>Optional hints for the scraper (headers, auth, etc.)</p>
<p>Optional hints for the scraper (headers, auth, etc.).</p>
</div>
</td>
<td>
@@ -1394,7 +1409,7 @@ ContentType
<tbody>
<tr class="doc-section-item">
<td><code>Content</code></td> <td>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="../omniread/core/content/#omniread.core.content.Content">Content</a></code>
<code><a class="autorefs autorefs-internal" title="omniread.core.content.Content" href="content/#omniread.core.content.Content">Content</a></code>
</td>
<td>
<div class="doc-md-description">
@@ -1432,7 +1447,9 @@ ContentType
<details class="notes" open>
<summary>Notes</summary>
<p><strong>Responsibilities:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must retrieve the content referenced by `source` and return it as raw bytes wrapped in a `Content` object
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- Implementations must retrieve the content referenced by `source`
and return it as raw bytes wrapped in a `Content` object.
</code></pre></div></td></tr></table></div>
</details>
</div>
@@ -1473,8 +1490,12 @@ ContentType
<summary>Notes</summary>
<p><strong>Responsibilities:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- A `Content` instance represents a raw content payload along with minimal contextual metadata describing its origin and type
- This class is the primary exchange format between Scrapers, Parsers, and Downstream consumers
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- A `Content` instance represents a raw content payload along with
minimal contextual metadata describing its origin and type.
- This class is the primary exchange format between scrapers,
parsers, and downstream consumers.
</code></pre></div></td></tr></table></div>
</details>
@@ -1615,8 +1636,12 @@ ContentType
<summary>Notes</summary>
<p><strong>Guarantees:</strong></p>
<div class="language-text highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><code>- This enum represents the declared or inferred media type of the content source
- It is primarily used for routing content to the appropriate parser or downstream consumer
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code>- This enum represents the declared or inferred media type of the
content source.
- It is primarily used for routing content to the appropriate
parser or downstream consumer.
</code></pre></div></td></tr></table></div>
</details>
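Reviewer sketch (not part of the commit): the `ContentType` notes above say the enum is mainly used to route content to the appropriate parser or downstream consumer. One simple way to do that is a lookup table keyed by the enum. The enum members, the `HANDLERS` table, and the `route` function below are hypothetical names chosen for illustration, not OmniRead's API.

```python
import json
from enum import Enum
from typing import Any, Callable, Dict


class ContentType(str, Enum):
    # Hypothetical members; the real enum values are not shown in this diff.
    HTML = "text/html"
    JSON = "application/json"
    PDF = "application/pdf"


# Hypothetical registry mapping each media type to a handler for raw bytes.
HANDLERS: Dict[ContentType, Callable[[bytes], Any]] = {
    ContentType.JSON: json.loads,
    ContentType.HTML: lambda raw: raw.decode("utf-8"),
}


def route(content_type: ContentType, raw: bytes,
          handlers: Dict[ContentType, Callable[[bytes], Any]]) -> Any:
    """Dispatch raw bytes to the handler registered for their media type."""
    try:
        handler = handlers[content_type]
    except KeyError:
        raise LookupError(f"no handler registered for {content_type.value}") from None
    return handler(raw)
```

Keeping the table outside the core layer would fit the rule that format-specific behavior must not be introduced there: the core only defines the enum, and format modules register themselves against it.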