# mail_intake

# Summary

Mail Intake: a provider-agnostic, read-only email ingestion framework.

Mail Intake is a **contract-first library** designed to ingest, parse, and
normalize email data from external providers (such as Gmail) into clean,
provider-agnostic domain models.

The library is intentionally structured around clear layers, each exposed
as a first-class module at the package root:

- `adapters`: Provider-specific access (e.g., Gmail).
- `auth`: Authentication providers and credential lifecycle management.
- `credentials`: Credential persistence abstractions and implementations.
- `parsers`: Extraction and normalization of message content.
- `ingestion`: Orchestration and high-level ingestion workflows.
- `models`: Canonical, provider-agnostic data representations.
- `config`: Explicit global configuration.
- `exceptions`: Library-defined error hierarchy.

The package root acts as a **namespace**, not a facade. Consumers are
expected to import functionality explicitly from the appropriate module.

---

# Installation

Install using pip:

```bash
pip install mail-intake
```

Or with Poetry:

```bash
poetry add mail-intake
```

Mail Intake is pure Python and has no runtime dependencies beyond those
required by the selected provider (for example, Google APIs for Gmail).

---

# Quick Start

Minimal Gmail ingestion example (local development):

```python
from mail_intake.ingestion import MailIntakeReader
from mail_intake.adapters import MailIntakeGmailAdapter
from mail_intake.auth import MailIntakeGoogleAuth
from mail_intake.credentials import PickleCredentialStore

# Credential persistence: a local pickle file.
store = PickleCredentialStore(path="token.pickle")

# Read-only Gmail scope, matching the library's read-only design.
auth = MailIntakeGoogleAuth(
    credentials_path="credentials.json",
    store=store,
    scopes=["https://www.googleapis.com/auth/gmail.readonly"],
)

adapter = MailIntakeGmailAdapter(auth_provider=auth)
reader = MailIntakeReader(adapter)

for message in reader.iter_messages("from:recruiter@example.com"):
    print(message.subject, message.from_email)
```

Iterating over threads:

```python
# Reuses `reader` from the example above.
for thread in reader.iter_threads("subject:Interview"):
    print(thread.normalized_subject, len(thread.messages))
```

---

# Architecture

Mail Intake is designed to be extensible via **public contracts** exposed
through its modules:

- Users MAY implement their own mail adapters by subclassing
  `adapters.MailIntakeAdapter`.
- Users MAY implement their own authentication providers by subclassing
  `auth.MailIntakeAuthProvider[T]`.
- Users MAY implement their own credential persistence layers by implementing
  `credentials.CredentialStore[T]`.

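As an illustration of the third extension point, here is a minimal sketch of a custom credential store. The actual `credentials.CredentialStore[T]` interface is not shown in this document, so the `load`/`save`/`clear` method names are assumptions, and a local `Protocol` stands in for the library's contract so the example runs on its own. `InMemoryCredentialStore` is a hypothetical name, not part of the library.

```python
from typing import Generic, Optional, Protocol, TypeVar

T = TypeVar("T")


class CredentialStore(Protocol[T]):
    """Local stand-in for credentials.CredentialStore[T]; method names are assumed."""

    def load(self) -> Optional[T]: ...

    def save(self, credentials: T) -> None: ...

    def clear(self) -> None: ...


class InMemoryCredentialStore(Generic[T]):
    """Hypothetical store that keeps credentials in process memory (handy in tests)."""

    def __init__(self) -> None:
        self._credentials: Optional[T] = None

    def load(self) -> Optional[T]:
        return self._credentials

    def save(self, credentials: T) -> None:
        self._credentials = credentials

    def clear(self) -> None:
        self._credentials = None


# Usage: the store is typed against the contract, not the concrete class.
store: CredentialStore[str] = InMemoryCredentialStore()
store.save("example-token")
loaded = store.load()
store.clear()
```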

Users SHOULD NOT subclass built-in adapter implementations. Built-in
adapters (such as Gmail) are reference implementations and may change
internally without notice.

**Design Guarantees:**

- Read-only access: no mutation of provider state.
- Provider-agnostic domain models.
- Explicit configuration and dependency injection.
- No implicit global state or environment reads.
- Deterministic, testable behavior.
- Distributed-safe authentication design.

Mail Intake favors correctness, clarity, and explicitness over convenience
shortcuts.

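The guarantees around explicit configuration and the absence of environment reads can be pictured with a small sketch. It does not use the library's `config` module, whose contents are not described here; `IntakeConfig`, its fields, and `ConfiguredComponent` are hypothetical names chosen for illustration. The point is only that every setting is passed in by the caller and cannot change afterwards.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class IntakeConfig:
    """Hypothetical settings object; field names are illustrative only."""

    credentials_path: str
    page_size: int = 50


class ConfiguredComponent:
    """Hypothetical component that only sees the configuration passed to it."""

    def __init__(self, config: IntakeConfig) -> None:
        self._config = config

    def describe(self) -> str:
        return f"{self._config.credentials_path} (page size {self._config.page_size})"


# All settings are supplied by the caller; nothing is read from os.environ.
config = IntakeConfig(credentials_path="credentials.json", page_size=25)
component = ConfiguredComponent(config)
summary = component.describe()

# Frozen dataclasses reject mutation, so settings cannot drift at runtime.
try:
    config.page_size = 100  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```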

**Core Philosophy:**

`Mail Intake` is built as a **contract-first ingestion pipeline**:

1. **Layered Decoupling**: Adapters handle transport, Parsers handle format
   normalization, and Ingestion orchestrates.
2. **Provider Agnosticism**: Domain models and core logic never depend on
   provider-specific (e.g., Gmail) API internals.
3. **Stateless Workflows**: The library functions as a read-only pipe, ensuring
   side-effect-free ingestion.

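The three principles above can be sketched with stand-in classes. None of the names below (`Message`, `Adapter`, `StubAdapter`, `parse`, `Reader`) are the library's real classes, and the payload shape is an assumption; the sketch only shows how a transport layer, a parsing layer, and an orchestration layer can stay separate while the domain model knows nothing about the provider.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol


@dataclass(frozen=True)
class Message:
    """Stand-in for a provider-agnostic domain model."""

    subject: str
    from_email: str


class Adapter(Protocol):
    """Transport layer: yields raw provider payloads for a query."""

    def fetch(self, query: str) -> Iterator[Dict[str, str]]: ...


class StubAdapter:
    """Hypothetical adapter backed by an in-memory list instead of a real API."""

    def __init__(self, payloads: List[Dict[str, str]]) -> None:
        self._payloads = payloads

    def fetch(self, query: str) -> Iterator[Dict[str, str]]:
        # Naive substring match stands in for a provider-side search.
        return (p for p in self._payloads if query in p["Subject"])


def parse(payload: Dict[str, str]) -> Message:
    """Parser layer: turns a raw payload into the domain model."""
    return Message(
        subject=payload["Subject"].strip(),
        from_email=payload["From"].lower(),
    )


class Reader:
    """Ingestion layer: wires adapter and parser together, and only reads."""

    def __init__(self, adapter: Adapter) -> None:
        self._adapter = adapter

    def iter_messages(self, query: str) -> Iterator[Message]:
        for payload in self._adapter.fetch(query):
            yield parse(payload)


raw = [
    {"Subject": " Interview schedule ", "From": "Recruiter@Example.com"},
    {"Subject": "Newsletter", "From": "news@example.com"},
]
messages = list(Reader(StubAdapter(raw)).iter_messages("Interview"))
```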

---

# Public API

The supported public API consists of the following top-level modules:

- `mail_intake.ingestion`
- `mail_intake.adapters`
- `mail_intake.auth`
- `mail_intake.credentials`
- `mail_intake.parsers`
- `mail_intake.models`
- `mail_intake.config`
- `mail_intake.exceptions`

Classes and functions should be imported explicitly from these modules.
No individual symbols are re-exported at the package root.

---