Updated docstrings and added README.md
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Mail Intake — provider-agnostic, read-only email ingestion framework.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
Mail Intake is a **contract-first library** designed to ingest, parse, and
|
||||
normalize email data from external providers (such as Gmail) into clean,
|
||||
provider-agnostic domain models.
|
||||
@@ -12,109 +10,126 @@ provider-agnostic domain models.
|
||||
The library is intentionally structured around clear layers, each exposed
|
||||
as a first-class module at the package root:
|
||||
|
||||
- adapters: provider-specific access (e.g. Gmail)
|
||||
- auth: authentication providers and credential lifecycle management
|
||||
- credentials: credential persistence abstractions and implementations
|
||||
- parsers: extraction and normalization of message content
|
||||
- ingestion: orchestration and high-level ingestion workflows
|
||||
- models: canonical, provider-agnostic data representations
|
||||
- config: explicit global configuration
|
||||
- exceptions: library-defined error hierarchy
|
||||
- `adapters`: Provider-specific access (e.g., Gmail).
|
||||
- `auth`: Authentication providers and credential lifecycle management.
|
||||
- `credentials`: Credential persistence abstractions and implementations.
|
||||
- `parsers`: Extraction and normalization of message content.
|
||||
- `ingestion`: Orchestration and high-level ingestion workflows.
|
||||
- `models`: Canonical, provider-agnostic data representations.
|
||||
- `config`: Explicit global configuration.
|
||||
- `exceptions`: Library-defined error hierarchy.
|
||||
|
||||
The package root acts as a **namespace**, not a facade. Consumers are
|
||||
expected to import functionality explicitly from the appropriate module.
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
# Installation
|
||||
|
||||
Install using pip:
|
||||
|
||||
pip install mail-intake
|
||||
```bash
|
||||
pip install mail-intake
|
||||
```
|
||||
|
||||
Or with Poetry:
|
||||
|
||||
poetry add mail-intake
|
||||
```bash
|
||||
poetry add mail-intake
|
||||
```
|
||||
|
||||
Mail Intake is pure Python and has no runtime dependencies beyond those
|
||||
required by the selected provider (for example, Google APIs for Gmail).
|
||||
|
||||
---
|
||||
|
||||
## Quick start
|
||||
# Quick Start
|
||||
|
||||
Minimal Gmail ingestion example (local development):
|
||||
|
||||
from mail_intake.ingestion import MailIntakeReader
|
||||
from mail_intake.adapters import MailIntakeGmailAdapter
|
||||
from mail_intake.auth import MailIntakeGoogleAuth
|
||||
from mail_intake.credentials import PickleCredentialStore
|
||||
```python
|
||||
from mail_intake.ingestion import MailIntakeReader
|
||||
from mail_intake.adapters import MailIntakeGmailAdapter
|
||||
from mail_intake.auth import MailIntakeGoogleAuth
|
||||
from mail_intake.credentials import PickleCredentialStore
|
||||
|
||||
store = PickleCredentialStore(path="token.pickle")
|
||||
store = PickleCredentialStore(path="token.pickle")
|
||||
|
||||
auth = MailIntakeGoogleAuth(
|
||||
credentials_path="credentials.json",
|
||||
store=store,
|
||||
scopes=["https://www.googleapis.com/auth/gmail.readonly"],
|
||||
)
|
||||
auth = MailIntakeGoogleAuth(
|
||||
credentials_path="credentials.json",
|
||||
store=store,
|
||||
scopes=["https://www.googleapis.com/auth/gmail.readonly"],
|
||||
)
|
||||
|
||||
adapter = MailIntakeGmailAdapter(auth_provider=auth)
|
||||
reader = MailIntakeReader(adapter)
|
||||
adapter = MailIntakeGmailAdapter(auth_provider=auth)
|
||||
reader = MailIntakeReader(adapter)
|
||||
|
||||
for message in reader.iter_messages("from:recruiter@example.com"):
|
||||
print(message.subject, message.from_email)
|
||||
for message in reader.iter_messages("from:recruiter@example.com"):
|
||||
print(message.subject, message.from_email)
|
||||
```
|
||||
|
||||
Iterating over threads:
|
||||
|
||||
for thread in reader.iter_threads("subject:Interview"):
|
||||
print(thread.normalized_subject, len(thread.messages))
|
||||
```python
|
||||
for thread in reader.iter_threads("subject:Interview"):
|
||||
print(thread.normalized_subject, len(thread.messages))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
# Architecture
|
||||
|
||||
Mail Intake is designed to be extensible via **public contracts** exposed
|
||||
through its modules:
|
||||
|
||||
- Users MAY implement their own mail adapters by subclassing ``adapters.MailIntakeAdapter``
|
||||
- Users MAY implement their own authentication providers by subclassing ``auth.MailIntakeAuthProvider[T]``
|
||||
- Users MAY implement their own credential persistence layers by implementing ``credentials.CredentialStore[T]``
|
||||
- Users MAY implement their own mail adapters by subclassing
|
||||
`adapters.MailIntakeAdapter`.
|
||||
- Users MAY implement their own authentication providers by subclassing
|
||||
`auth.MailIntakeAuthProvider[T]`.
|
||||
- Users MAY implement their own credential persistence layers by implementing
|
||||
`credentials.CredentialStore[T]`.
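As a rough illustration of the third extension point, the sketch below keeps credentials in memory. The `CredentialStore` base class is a local stand-in so the example runs on its own. Only `load()` appears in this commit; `save()` and `clear()` are assumed method names, and `InMemoryCredentialStore` is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class CredentialStore(ABC, Generic[T]):
    # Local stand-in for credentials.CredentialStore[T] (assumed shape).

    @abstractmethod
    def load(self) -> Optional[T]: ...

    @abstractmethod
    def save(self, credentials: T) -> None: ...  # assumed method name

    @abstractmethod
    def clear(self) -> None: ...  # assumed method name


class InMemoryCredentialStore(CredentialStore[T]):
    # Hypothetical store for tests: persistence only, no validation.

    def __init__(self) -> None:
        self._credentials: Optional[T] = None

    def load(self) -> Optional[T]:
        # Returns None when nothing has been stored.
        return self._credentials

    def save(self, credentials: T) -> None:
        self._credentials = credentials

    def clear(self) -> None:
        self._credentials = None


store: InMemoryCredentialStore[str] = InMemoryCredentialStore()
store.save("token-abc")
print(store.load())
```

The store never inspects what it holds, which keeps lifecycle decisions with the authentication provider.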
|
||||
|
||||
Users SHOULD NOT subclass built-in adapter implementations. Built-in
|
||||
adapters (such as Gmail) are reference implementations and may change
|
||||
internally without notice.
|
||||
|
||||
**Design Guarantees:**
|
||||
- Read-only access: no mutation of provider state
|
||||
- Provider-agnostic domain models
|
||||
- Explicit configuration and dependency injection
|
||||
- No implicit global state or environment reads
|
||||
- Deterministic, testable behavior
|
||||
- Distributed-safe authentication design
|
||||
|
||||
- Read-only access: no mutation of provider state.
|
||||
- Provider-agnostic domain models.
|
||||
- Explicit configuration and dependency injection.
|
||||
- No implicit global state or environment reads.
|
||||
- Deterministic, testable behavior.
|
||||
- Distributed-safe authentication design.
|
||||
|
||||
Mail Intake favors correctness, clarity, and explicitness over convenience
|
||||
shortcuts.
|
||||
|
||||
**Core Philosophy:**
|
||||
|
||||
`Mail Intake` is built as a **contract-first ingestion pipeline**:
|
||||
1. **Layered Decoupling**: Adapters handle transport, Parsers handle format normalization, and Ingestion orchestrates.
|
||||
2. **Provider Agnosticism**: Domain models and core logic never depend on provider-specific (e.g., Gmail) API internals.
|
||||
3. **Stateless Workflows**: The library functions as a read-only pipe, ensuring side-effect-free ingestion.
|
||||
|
||||
1. **Layered Decoupling**: Adapters handle transport, Parsers handle format
|
||||
normalization, and Ingestion orchestrates.
|
||||
2. **Provider Agnosticism**: Domain models and core logic never depend on
|
||||
provider-specific (e.g., Gmail) API internals.
|
||||
3. **Stateless Workflows**: The library functions as a read-only pipe, ensuring
|
||||
side-effect-free ingestion.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
The supported public API consists of the following top-level modules:
|
||||
|
||||
- mail_intake.ingestion
|
||||
- mail_intake.adapters
|
||||
- mail_intake.auth
|
||||
- mail_intake.credentials
|
||||
- mail_intake.parsers
|
||||
- mail_intake.models
|
||||
- mail_intake.config
|
||||
- mail_intake.exceptions
|
||||
- `mail_intake.ingestion`
|
||||
- `mail_intake.adapters`
|
||||
- `mail_intake.auth`
|
||||
- `mail_intake.credentials`
|
||||
- `mail_intake.parsers`
|
||||
- `mail_intake.models`
|
||||
- `mail_intake.config`
|
||||
- `mail_intake.exceptions`
|
||||
|
||||
Classes and functions should be imported explicitly from these modules.
|
||||
No individual symbols are re-exported at the package root.
|
||||
|
||||
@@ -1,19 +1,18 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Mail provider adapter implementations for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package contains **adapter-layer implementations** responsible for
|
||||
interfacing with external mail providers and exposing a normalized,
|
||||
provider-agnostic contract to the rest of the system.
|
||||
|
||||
Adapters in this package:
|
||||
- Implement the `MailIntakeAdapter` interface
|
||||
- Encapsulate all provider-specific APIs and semantics
|
||||
- Perform read-only access to mail data
|
||||
- Return provider-native payloads without interpretation
|
||||
|
||||
- Implement the `MailIntakeAdapter` interface.
|
||||
- Encapsulate all provider-specific APIs and semantics.
|
||||
- Perform read-only access to mail data.
|
||||
- Return provider-native payloads without interpretation.
|
||||
|
||||
Provider-specific logic **must not leak** outside of adapter implementations.
|
||||
All parsing, normalization, and transformation must be handled by downstream
|
||||
@@ -21,10 +20,10 @@ components.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
MailIntakeAdapter
|
||||
MailIntakeGmailAdapter
|
||||
- `MailIntakeAdapter`
|
||||
- `MailIntakeGmailAdapter`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Mail provider adapter contracts for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the **provider-agnostic adapter interface** used for
|
||||
read-only mail ingestion.
|
||||
|
||||
@@ -24,13 +22,13 @@ class MailIntakeAdapter(ABC):
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- discover messages matching a query
|
||||
- retrieve full message payloads
|
||||
- retrieve full thread payloads
|
||||
- Discover messages matching a query.
|
||||
- Retrieve full message payloads.
|
||||
- Retrieve full thread payloads.
|
||||
|
||||
**Lifecycle:**
|
||||
|
||||
- adapters are intentionally read-only and must not mutate provider state
|
||||
- Adapters are intentionally read-only and must not mutate provider state.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
@@ -49,15 +47,18 @@ class MailIntakeAdapter(ABC):
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- Implementations must yield dictionaries containing at least ``message_id`` and ``thread_id``
|
||||
- Implementations must yield dictionaries containing at least
|
||||
`message_id` and `thread_id`.
|
||||
|
||||
Example:
|
||||
Typical yield:
|
||||
|
||||
{
|
||||
"message_id": "...",
|
||||
"thread_id": "..."
|
||||
}
|
||||
```python
|
||||
{
|
||||
"message_id": "...",
|
||||
"thread_id": "..."
|
||||
}
|
||||
```
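A minimal sketch of an implementation that satisfies this guarantee. The method name `iter_message_refs`, the `InMemoryAdapter` class, and the fake mailbox are all hypothetical, since only the docstring is visible here; a real adapter would subclass `MailIntakeAdapter`.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical in-memory "provider" data used only for this sketch.
_FAKE_MAILBOX: List[Dict[str, Any]] = [
    {"id": "m1", "threadId": "t1", "snippet": "Interview on Monday"},
    {"id": "m2", "threadId": "t1", "snippet": "Re: Interview on Monday"},
    {"id": "m3", "threadId": "t2", "snippet": "Invoice attached"},
]


class InMemoryAdapter:
    # Stand-in for a MailIntakeAdapter subclass; the method name is assumed.

    def iter_message_refs(self, query: str) -> Iterator[Dict[str, str]]:
        for raw in _FAKE_MAILBOX:
            if query.lower() in raw["snippet"].lower():
                # Yield only the minimal reference shape described above.
                yield {"message_id": raw["id"], "thread_id": raw["threadId"]}


refs = list(InMemoryAdapter().iter_message_refs("interview"))
print(refs)
```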
|
||||
"""
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
@@ -1,17 +1,16 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Gmail adapter implementation for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides a **Gmail-specific implementation** of the
|
||||
`MailIntakeAdapter` contract.
|
||||
|
||||
It is the only place in the codebase where:
|
||||
- `googleapiclient` is imported
|
||||
- Gmail REST API semantics are known
|
||||
- Low-level `.execute()` calls are made
|
||||
|
||||
- `googleapiclient` is imported.
|
||||
- Gmail REST API semantics are known.
|
||||
- Low-level `.execute()` calls are made.
|
||||
|
||||
All Gmail-specific behavior must be strictly contained within this module.
|
||||
"""
|
||||
@@ -37,15 +36,15 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- This class is the ONLY place where googleapiclient is imported
|
||||
- Gmail REST semantics are known
|
||||
- .execute() is called
|
||||
- This class is the ONLY place where `googleapiclient` is imported.
|
||||
- Gmail REST semantics are known.
|
||||
- `.execute()` is called.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
- Must remain thin and imperative
|
||||
- Must not perform parsing or interpretation
|
||||
- Must not expose Gmail-specific types beyond this class
|
||||
|
||||
- Must remain thin and imperative.
|
||||
- Must not perform parsing or interpretation.
|
||||
- Must not expose Gmail-specific types beyond this class.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
|
||||
@@ -1,31 +1,31 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Authentication provider implementations for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package defines the **authentication layer** used by mail adapters
|
||||
to obtain provider-specific credentials.
|
||||
|
||||
It exposes:
|
||||
- A stable, provider-agnostic authentication contract
|
||||
- Concrete authentication providers for supported platforms
|
||||
|
||||
- A stable, provider-agnostic authentication contract.
|
||||
- Concrete authentication providers for supported platforms.
|
||||
|
||||
Authentication providers:
|
||||
- Are responsible for credential acquisition and lifecycle management
|
||||
- Are intentionally decoupled from adapter logic
|
||||
- May be extended by users to support additional providers
|
||||
|
||||
- Are responsible for credential acquisition and lifecycle management.
|
||||
- Are intentionally decoupled from adapter logic.
|
||||
- May be extended by users to support additional providers.
|
||||
|
||||
Consumers should depend on the abstract interface and use concrete
|
||||
implementations only where explicitly required.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
MailIntakeAuthProvider
|
||||
MailIntakeGoogleAuth
|
||||
- `MailIntakeAuthProvider`
|
||||
- `MailIntakeGoogleAuth`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Authentication provider contracts for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the **authentication abstraction layer** used by mail
|
||||
adapters to obtain provider-specific credentials.
|
||||
|
||||
@@ -30,15 +28,17 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- Acquire credentials from an external provider
|
||||
- Refresh or revalidate credentials as needed
|
||||
- Handle authentication-specific failure modes
|
||||
- Coordinate with credential persistence layers where applicable
|
||||
- Acquire credentials from an external provider.
|
||||
- Refresh or revalidate credentials as needed.
|
||||
- Handle authentication-specific failure modes.
|
||||
- Coordinate with credential persistence layers where applicable.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
- Mail adapters must treat returned credentials as opaque and provider-specific
|
||||
- Mail adapters rely only on the declared credential type expected by the adapter
|
||||
|
||||
- Mail adapters must treat returned credentials as opaque and
|
||||
provider-specific.
|
||||
- Mail adapters rely only on the declared credential type expected
|
||||
by the adapter.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
@@ -48,7 +48,7 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
|
||||
|
||||
Returns:
|
||||
T:
|
||||
Credentials of type ``T`` suitable for immediate use by the
|
||||
Credentials of type `T` suitable for immediate use by the
|
||||
corresponding mail adapter.
|
||||
|
||||
Raises:
|
||||
@@ -59,8 +59,10 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- This method is synchronous by design
|
||||
- Represents the sole entry point through which adapters obtain authentication material
|
||||
- Implementations must either return credentials of the declared type ``T`` that are valid at the time of return or raise an exception
|
||||
- This method is synchronous by design.
|
||||
- Represents the sole entry point through which adapters obtain
|
||||
authentication material.
|
||||
- Implementations must either return credentials of the declared
|
||||
type `T` that are valid at the time of return or raise an exception.
|
||||
"""
|
||||
raise NotImplementedError
|
||||
|
||||
@@ -1,18 +1,17 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Google authentication provider implementation for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides a **Google OAuth–based authentication provider**
|
||||
used primarily for Gmail access.
|
||||
|
||||
It encapsulates all Google-specific authentication concerns, including:
|
||||
- Credential loading and persistence
|
||||
- Token refresh handling
|
||||
- Interactive OAuth flow initiation
|
||||
- Coordination with a credential persistence layer
|
||||
|
||||
- Credential loading and persistence.
|
||||
- Token refresh handling.
|
||||
- Interactive OAuth flow initiation.
|
||||
- Coordination with a credential persistence layer.
|
||||
|
||||
No Google authentication details should leak outside this module.
|
||||
"""
|
||||
@@ -40,14 +39,15 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- Load cached credentials from a credential store when available
|
||||
- Refresh expired credentials when possible
|
||||
- Initiate an interactive OAuth flow only when required
|
||||
- Persist refreshed or newly obtained credentials via the store
|
||||
- Load cached credentials from a credential store when available.
|
||||
- Refresh expired credentials when possible.
|
||||
- Initiate an interactive OAuth flow only when required.
|
||||
- Persist refreshed or newly obtained credentials via the store.
|
||||
|
||||
**Guarantees:**
|
||||
|
||||
- This class is synchronous by design and maintains a minimal internal state
|
||||
- This class is synchronous by design and maintains minimal
|
||||
internal state.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
@@ -79,7 +79,7 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
|
||||
|
||||
Returns:
|
||||
Credentials:
|
||||
A ``google.oauth2.credentials.Credentials`` instance suitable
|
||||
A `google.oauth2.credentials.Credentials` instance suitable
|
||||
for use with Google API clients.
|
||||
|
||||
Raises:
|
||||
@@ -90,10 +90,10 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
|
||||
Notes:
|
||||
**Lifecycle:**
|
||||
|
||||
- Load cached credentials from the configured credential store
|
||||
- Refresh expired credentials when possible
|
||||
- Perform an interactive OAuth login as a fallback
|
||||
- Persist valid credentials for future use
|
||||
- Load cached credentials from the configured credential store.
|
||||
- Refresh expired credentials when possible.
|
||||
- Perform an interactive OAuth login as a fallback.
|
||||
- Persist valid credentials for future use.
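The four steps above can be sketched without any Google libraries. Everything named here (`FakeCredentials`, `DictStore`, `get_credentials`, and the `refresh` and `interactive_login` callables) is a hypothetical stand-in, and the real method may differ, for example in how it reports failures.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeCredentials:
    # Hypothetical stand-in for provider credentials.
    token: str
    expired: bool = False
    refresh_token: Optional[str] = None


class DictStore:
    # Hypothetical stand-in for a credential store with load/save.
    def __init__(self) -> None:
        self.saved: Optional[FakeCredentials] = None

    def load(self) -> Optional[FakeCredentials]:
        return self.saved

    def save(self, creds: FakeCredentials) -> None:
        self.saved = creds


def get_credentials(
    store: DictStore,
    refresh: Callable[[FakeCredentials], FakeCredentials],
    interactive_login: Callable[[], FakeCredentials],
) -> FakeCredentials:
    creds = store.load()  # 1. load cached credentials
    if creds is not None and not creds.expired:
        return creds
    if creds is not None and creds.refresh_token:
        creds = refresh(creds)  # 2. refresh when possible
    else:
        creds = interactive_login()  # 3. interactive fallback
    store.save(creds)  # 4. persist for future use
    return creds


def fake_refresh(old: FakeCredentials) -> FakeCredentials:
    return FakeCredentials("refreshed", refresh_token=old.refresh_token)


def fake_login() -> FakeCredentials:
    return FakeCredentials("fresh", refresh_token="r1")


demo_store = DictStore()
print(get_credentials(demo_store, fake_refresh, fake_login).token)
```

Passing the refresh and login steps in as callables keeps the ordering logic testable without network access.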
|
||||
"""
|
||||
creds = self.store.load()
|
||||
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Global configuration models for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the **top-level configuration object** used to control
|
||||
mail ingestion behavior across adapters, authentication providers, and
|
||||
ingestion workflows.
|
||||
@@ -20,16 +18,17 @@ from typing import Optional
|
||||
@dataclass(frozen=True)
|
||||
class MailIntakeConfig:
|
||||
"""
|
||||
Global configuration for mail-intake.
|
||||
Global configuration for `mail-intake`.
|
||||
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- This configuration is intentionally explicit and immutable
|
||||
- No implicit environment reads or global state
|
||||
- Explicit configuration over implicit defaults
|
||||
- No direct environment or filesystem access
|
||||
- This model is safe to pass across layers and suitable for serialization
|
||||
- This configuration is intentionally explicit and immutable.
|
||||
- No implicit environment reads or global state.
|
||||
- Explicit configuration over implicit defaults.
|
||||
- No direct environment or filesystem access.
|
||||
- This model is safe to pass across layers and suitable for
|
||||
serialization.
|
||||
"""
|
||||
|
||||
provider: str = "gmail"
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Credential persistence interfaces and implementations for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package defines the abstractions and concrete implementations used
|
||||
to persist authentication credentials across Mail Intake components.
|
||||
|
||||
@@ -14,20 +12,21 @@ credential acquisition, validation, and refresh, while implementations
|
||||
within this package are responsible solely for storage and retrieval.
|
||||
|
||||
The package provides:
|
||||
- A generic ``CredentialStore`` abstraction defining the persistence contract
|
||||
- Local filesystem–based storage for development and single-node use
|
||||
- Distributed, Redis-backed storage for production and scaled deployments
|
||||
|
||||
- A generic `CredentialStore` abstraction defining the persistence contract.
|
||||
- Local filesystem–based storage for development and single-node use.
|
||||
- Distributed, Redis-backed storage for production and scaled deployments.
|
||||
|
||||
Credential lifecycle management, interpretation, and security policy
|
||||
decisions remain the responsibility of authentication providers.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
CredentialStore
|
||||
PickleCredentialStore
|
||||
RedisCredentialStore
|
||||
- `CredentialStore`
|
||||
- `PickleCredentialStore`
|
||||
- `RedisCredentialStore`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,18 +1,16 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Local filesystem–based credential persistence for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides a file-backed implementation of the
|
||||
``CredentialStore`` abstraction using Python's ``pickle`` module.
|
||||
`CredentialStore` abstraction using Python's `pickle` module.
|
||||
|
||||
The pickle-based credential store is intended for local development,
|
||||
The `pickle`-based credential store is intended for local development,
|
||||
single-node deployments, and controlled environments where credentials
|
||||
do not need to be shared across processes or machines.
|
||||
|
||||
Due to the security and portability risks associated with pickle-based
|
||||
Due to the security and portability risks associated with `pickle`-based
|
||||
serialization, this implementation is not suitable for distributed or
|
||||
untrusted environments.
|
||||
"""
|
||||
@@ -36,13 +34,14 @@ class PickleCredentialStore(CredentialStore[T]):
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- Stores credentials on the local filesystem
|
||||
- Uses pickle for serialization and deserialization
|
||||
- Does not provide encryption, locking, or concurrency guarantees
|
||||
- Stores credentials on the local filesystem.
|
||||
- Uses `pickle` for serialization and deserialization.
|
||||
- Does not provide encryption, locking, or concurrency guarantees.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
- Credential lifecycle management, validation, and refresh logic are explicitly out of scope for this class
|
||||
|
||||
- Credential lifecycle management, validation, and refresh logic are
|
||||
explicitly out of scope for this class.
|
||||
"""
|
||||
|
||||
def __init__(self, path: str):
|
||||
@@ -62,14 +61,16 @@ class PickleCredentialStore(CredentialStore[T]):
|
||||
|
||||
Returns:
|
||||
Optional[T]:
|
||||
An instance of type ``T`` if credentials are present and
|
||||
successfully deserialized; otherwise ``None``.
|
||||
An instance of type `T` if credentials are present and
|
||||
successfully deserialized; otherwise `None`.
|
||||
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- If the credential file does not exist or cannot be successfully deserialized, this method returns ``None``
|
||||
- The store does not attempt to validate or interpret the returned credentials
|
||||
- If the credential file does not exist or cannot be successfully
|
||||
deserialized, this method returns `None`.
|
||||
- The store does not attempt to validate or interpret the
|
||||
returned credentials.
|
||||
"""
|
||||
try:
|
||||
with open(self.path, "rb") as fh:
|
||||
|
||||
@@ -1,12 +1,10 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Redis-backed credential persistence for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides a Redis-based implementation of the
|
||||
``CredentialStore`` abstraction, enabling credential persistence
|
||||
`CredentialStore` abstraction, enabling credential persistence
|
||||
across distributed and horizontally scaled deployments.
|
||||
|
||||
The Redis credential store is designed for environments where
|
||||
@@ -15,10 +13,11 @@ processes, containers, or nodes, such as container orchestration
|
||||
platforms and microservice architectures.
|
||||
|
||||
Key characteristics:
|
||||
- Distributed-safe, shared storage using Redis
|
||||
- Explicit, caller-defined serialization and deserialization
|
||||
- No reliance on unsafe mechanisms such as pickle
|
||||
- Optional time-to-live (TTL) support for automatic credential expiry
|
||||
|
||||
- Distributed-safe, shared storage using Redis.
|
||||
- Explicit, caller-defined serialization and deserialization.
|
||||
- No reliance on unsafe mechanisms such as `pickle`.
|
||||
- Optional time-to-live (TTL) support for automatic credential expiry.
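A rough sketch of how caller-defined serialization and an optional TTL could fit together. `FakeRedis` is an in-memory stand-in that mimics only the `get` and `set(..., ex=...)` calls of a Redis client. `SketchRedisStore` and its parameter names (`redis_client`, `key`, `serialize`, `deserialize`, `ttl_seconds`) are assumptions for illustration, not the actual `RedisCredentialStore` signature.

```python
import json
from typing import Callable, Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class FakeRedis:
    # In-memory stand-in exposing the get/set subset of a Redis client.
    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.ttl: Dict[str, Optional[int]] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self.data.get(name)

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> None:
        self.data[name] = value
        self.ttl[name] = ex  # recorded only; this fake never expires keys


class SketchRedisStore(Generic[T]):
    # Hypothetical shape; parameter names are assumptions, not the real API.
    def __init__(
        self,
        redis_client: FakeRedis,
        key: str,
        serialize: Callable[[T], bytes],
        deserialize: Callable[[bytes], T],
        ttl_seconds: Optional[int] = None,
    ) -> None:
        self.redis = redis_client
        self.key = key
        self.serialize = serialize
        self.deserialize = deserialize
        self.ttl_seconds = ttl_seconds

    def load(self) -> Optional[T]:
        raw = self.redis.get(self.key)
        if not raw:
            return None
        try:
            return self.deserialize(raw)
        except Exception:
            # Undecodable payloads are treated as "no credentials".
            return None

    def save(self, credentials: T) -> None:
        payload = self.serialize(credentials)
        self.redis.set(self.key, payload, ex=self.ttl_seconds)


client = FakeRedis()
store: SketchRedisStore[dict] = SketchRedisStore(
    client,
    key="mail_intake:credentials",
    serialize=lambda c: json.dumps(c).encode("utf-8"),
    deserialize=lambda b: json.loads(b.decode("utf-8")),
    ttl_seconds=3600,
)
store.save({"token": "abc"})
print(store.load())
```

Using JSON here is only one option; the point is that the caller chooses a format that can be reviewed and audited.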
|
||||
|
||||
This module is responsible solely for persistence concerns.
|
||||
Credential validation, refresh, rotation, and acquisition remain the
|
||||
@@ -35,7 +34,7 @@ T = TypeVar("T")
|
||||
|
||||
class RedisCredentialStore(CredentialStore[T]):
|
||||
"""
|
||||
Redis-backed implementation of ``CredentialStore``.
|
||||
Redis-backed implementation of `CredentialStore`.
|
||||
|
||||
This store persists credentials in Redis and is suitable for
|
||||
distributed and horizontally scaled deployments where credentials
|
||||
@@ -44,13 +43,16 @@ class RedisCredentialStore(CredentialStore[T]):
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- This class is responsible only for persistence and retrieval
|
||||
- It does not interpret, validate, refresh, or otherwise manage the lifecycle of the credentials being stored
|
||||
- This class is responsible only for persistence and retrieval.
|
||||
- It does not interpret, validate, refresh, or otherwise manage the
|
||||
lifecycle of the credentials being stored.
|
||||
|
||||
**Guarantees:**
|
||||
|
||||
- The store is intentionally generic and delegates all serialization concerns to caller-provided functions
|
||||
- This avoids unsafe mechanisms such as pickle and allows credential formats to be explicitly controlled and audited
|
||||
- The store is intentionally generic and delegates all serialization
|
||||
concerns to caller-provided functions.
|
||||
- This avoids unsafe mechanisms such as `pickle` and allows
|
||||
credential formats to be explicitly controlled and audited.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
@@ -92,14 +94,18 @@ class RedisCredentialStore(CredentialStore[T]):
|
||||
|
||||
Returns:
|
||||
Optional[T]:
|
||||
An instance of type ``T`` if credentials are present and
|
||||
successfully deserialized; otherwise ``None``.
|
||||
An instance of type `T` if credentials are present and
|
||||
successfully deserialized; otherwise `None`.
|
||||
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- If no value exists for the configured key, or if the stored payload cannot be successfully deserialized, this method returns ``None``
|
||||
- The store does not attempt to validate the returned credentials or determine whether they are expired or otherwise usable
|
||||
- If no value exists for the configured key, or if the stored
|
||||
payload cannot be successfully deserialized, this method
|
||||
returns `None`.
|
||||
- The store does not attempt to validate the returned
|
||||
credentials or determine whether they are expired or
|
||||
otherwise usable.
|
||||
"""
|
||||
raw = self.redis.get(self.key)
|
||||
if not raw:
|
||||
|
||||
@@ -1,14 +1,12 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Credential persistence abstractions for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the generic persistence contract used to store and
|
||||
retrieve authentication credentials across Mail Intake components.
|
||||
|
||||
The ``CredentialStore`` abstraction establishes a strict separation
|
||||
The `CredentialStore` abstraction establishes a strict separation
|
||||
between credential *lifecycle management* and credential *storage*.
|
||||
Authentication providers are responsible for acquiring, validating,
|
||||
refreshing, and revoking credentials, while concrete store
|
||||
@@ -30,21 +28,23 @@ T = TypeVar("T")
|
||||
|
||||
class CredentialStore(ABC, Generic[T]):
|
||||
"""
|
||||
Abstract base class defining a generic persistence interface for
|
||||
authentication credentials.
|
||||
Abstract base class defining a generic persistence interface.
|
||||
|
||||
Used for authentication credentials across different backends.
|
||||
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- Provide persistent storage separating life-cycle management from storage mechanics
|
||||
- Keep implementation focused only on persistence
|
||||
|
||||
- Provide persistent storage while separating lifecycle management from
|
||||
storage mechanics.
|
||||
- Keep implementation focused only on persistence.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
|
||||
- The store is intentionally agnostic to:
|
||||
- The concrete credential type being stored
|
||||
- The serialization format used to persist credentials
|
||||
- The underlying storage backend or durability guarantees
|
||||
- The concrete credential type being stored.
|
||||
- The serialization format used to persist credentials.
|
||||
- The underlying storage backend or durability guarantees.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
@@ -54,14 +54,17 @@ class CredentialStore(ABC, Generic[T]):
|
||||
|
||||
Returns:
|
||||
Optional[T]:
|
||||
An instance of type ``T`` if credentials are available and
|
||||
loadable; otherwise ``None``.
|
||||
An instance of type `T` if credentials are available and
|
||||
loadable; otherwise `None`.
|
||||
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- Implementations should return ``None`` when no credentials are present or when stored credentials cannot be successfully decoded or deserialized
|
||||
- The store must not attempt to validate, refresh, or otherwise interpret the returned credentials
|
||||
- Implementations should return `None` when no credentials are
|
||||
present or when stored credentials cannot be successfully
|
||||
decoded or deserialized.
|
||||
- The store must not attempt to validate, refresh, or otherwise
|
||||
interpret the returned credentials.
|
||||
"""
|
||||
|
||||
@abstractmethod
|
||||
|
||||
@@ -34,7 +34,8 @@ class MailIntakeAuthError(MailIntakeError):
|
||||
Notes:
|
||||
**Lifecycle:**
|
||||
|
||||
- Raised when authentication providers are unable to acquire, refresh, or persist valid credentials
|
||||
- Raised when authentication providers are unable to acquire,
|
||||
refresh, or persist valid credentials.
|
||||
"""
|
||||
|
||||
|
||||
@@ -45,7 +46,8 @@ class MailIntakeAdapterError(MailIntakeError):
|
||||
Notes:
|
||||
**Lifecycle:**
|
||||
|
||||
- Raised when a provider adapter encounters API errors, transport failures, or invalid provider responses
|
||||
- Raised when a provider adapter encounters API errors, transport
|
||||
failures, or invalid provider responses.
|
||||
"""
|
||||
|
||||
|
||||
@@ -56,5 +58,6 @@ class MailIntakeParsingError(MailIntakeError):
|
||||
Notes:
|
||||
**Lifecycle:**
|
||||
|
||||
- Raised when raw provider payloads cannot be interpreted or normalized into internal domain models
|
||||
- Raised when raw provider payloads cannot be interpreted or
|
||||
normalized into internal domain models.
|
||||
"""
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Mail ingestion orchestration for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package contains **high-level ingestion components** responsible for
|
||||
coordinating mail retrieval, parsing, normalization, and model construction.
|
||||
|
||||
@@ -12,19 +10,20 @@ It represents the **top of the ingestion pipeline** and is intended to be the
|
||||
primary interaction surface for library consumers.
|
||||
|
||||
Components in this package:
|
||||
- Are provider-agnostic
|
||||
- Depend only on adapter and parser contracts
|
||||
- Contain no provider-specific API logic
|
||||
- Expose read-only ingestion workflows
|
||||
|
||||
- Are provider-agnostic.
|
||||
- Depend only on adapter and parser contracts.
|
||||
- Contain no provider-specific API logic.
|
||||
- Expose read-only ingestion workflows.
|
||||
|
||||
Consumers are expected to construct a mail adapter and pass it to the
|
||||
ingestion layer to begin processing messages and threads.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
MailIntakeReader
|
||||
- `MailIntakeReader`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,18 +1,17 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
High-level mail ingestion orchestration for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides the primary, provider-agnostic entry point for
|
||||
reading and processing mail data.
|
||||
|
||||
It coordinates:
|
||||
- Mail adapter access
|
||||
- Message and thread iteration
|
||||
- Header and body parsing
|
||||
- Normalization and model construction
|
||||
|
||||
- Mail adapter access.
|
||||
- Message and thread iteration.
|
||||
- Header and body parsing.
|
||||
- Normalization and model construction.
|
||||
|
||||
No provider-specific logic or API semantics are permitted in this layer.
|
||||
"""
|
||||
@@ -36,12 +35,18 @@ class MailIntakeReader:
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- This class is the primary entry point for consumers of the Mail Intake library
|
||||
- It orchestrates the full ingestion pipeline: Querying the adapter for message references, fetching raw provider messages, parsing and normalizing message data, constructing domain models
|
||||
- This class is the primary entry point for consumers of the
|
||||
Mail Intake library.
|
||||
- It orchestrates the full ingestion pipeline:
|
||||
- Querying the adapter for message references.
|
||||
- Fetching raw provider messages.
|
||||
- Parsing and normalizing message data.
|
||||
- Constructing domain models.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
- This class is intentionally: Provider-agnostic, stateless beyond iteration scope, read-only
|
||||
|
||||
- This class is intentionally: Provider-agnostic, stateless beyond
|
||||
iteration scope, read-only.
|
||||
"""
|
||||
|
||||
def __init__(self, adapter: MailIntakeAdapter):
|
||||
@@ -87,13 +92,14 @@ class MailIntakeReader:
|
||||
An iterator of `MailIntakeThread` instances.
|
||||
|
||||
Raises:
|
||||
MailIntakeParsingError:
|
||||
`MailIntakeParsingError`:
|
||||
If a message cannot be parsed.
|
||||
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- Messages are grouped by `thread_id` and yielded as complete thread objects containing all associated messages
|
||||
- Messages are grouped by `thread_id` and yielded as complete
|
||||
thread objects containing all associated messages.
|
||||
"""
|
||||
threads: Dict[str, MailIntakeThread] = {}
|
||||
|
||||
|
||||
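The grouping guarantee in this hunk (messages are collected by `thread_id` and yielded only as complete threads) can be sketched as follows. Plain dictionaries stand in for the library's message and thread models, and `iter_threads_sketch` is a hypothetical name. This is an illustration of the idea, not the reader's actual implementation.

```python
from typing import Any, Dict, Iterable, Iterator, List


def iter_threads_sketch(
    messages: Iterable[Dict[str, Any]],
) -> Iterator[List[Dict[str, Any]]]:
    """Group messages by thread_id and yield each complete group."""
    threads: Dict[str, List[Dict[str, Any]]] = {}

    # Consume every message first so each yielded thread is complete.
    for message in messages:
        threads.setdefault(message["thread_id"], []).append(message)

    # Dicts preserve insertion order, so threads come out in the order
    # in which their first message was seen.
    yield from threads.values()
```

The trade-off is that nothing is yielded until the whole message stream has been read, which is what makes the "complete thread" guarantee possible.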
@@ -1,27 +1,26 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Domain models for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package defines the **canonical, provider-agnostic data models**
|
||||
used throughout the Mail Intake ingestion pipeline.
|
||||
|
||||
Models in this package:
|
||||
- Represent fully parsed and normalized mail data
|
||||
- Are safe to persist, serialize, and index
|
||||
- Contain no provider-specific payloads or API semantics
|
||||
- Serve as stable inputs for downstream processing and analysis
|
||||
|
||||
- Represent fully parsed and normalized mail data.
|
||||
- Are safe to persist, serialize, and index.
|
||||
- Contain no provider-specific payloads or API semantics.
|
||||
- Serve as stable inputs for downstream processing and analysis.
|
||||
|
||||
These models form the core internal data contract of the library.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
MailIntakeMessage
|
||||
MailIntakeThread
|
||||
- `MailIntakeMessage`
|
||||
- `MailIntakeThread`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Message domain models for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the **canonical, provider-agnostic representation**
|
||||
of an individual email message as used internally by the Mail Intake
|
||||
ingestion pipeline.
|
||||
@@ -26,12 +24,14 @@ class MailIntakeMessage:
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- This model represents a fully parsed and normalized email message
|
||||
- It is intentionally provider-agnostic and suitable for persistence, indexing, and downstream processing
|
||||
- This model represents a fully parsed and normalized email message.
|
||||
- It is intentionally provider-agnostic and suitable for
|
||||
persistence, indexing, and downstream processing.
|
||||
|
||||
**Constraints:**
|
||||
|
||||
- No provider-specific identifiers, payloads, or API semantics should appear in this model
|
||||
|
||||
- No provider-specific identifiers, payloads, or API semantics
|
||||
should appear in this model.
|
||||
"""
|
||||
|
||||
message_id: str
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Thread domain models for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module defines the **canonical, provider-agnostic representation**
|
||||
of an email thread as used internally by the Mail Intake ingestion pipeline.
|
||||
|
||||
@@ -27,9 +25,11 @@ class MailIntakeThread:
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- A thread groups multiple related messages under a single subject and participant set
|
||||
- It is designed to support reasoning over conversational context such as job applications, interviews, follow-ups, and ongoing discussions
|
||||
- This model is provider-agnostic and safe to persist
|
||||
- A thread groups multiple related messages under a single subject
|
||||
and participant set.
|
||||
- It is designed to support reasoning over conversational context
|
||||
such as job applications, interviews, follow-ups, and ongoing discussions.
|
||||
- This model is provider-agnostic and safe to persist.
|
||||
"""
|
||||
|
||||
thread_id: str
|
||||
@@ -68,9 +68,9 @@ class MailIntakeThread:
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- Appends the message to the thread
|
||||
- Tracks unique participants
|
||||
- Updates the last activity timestamp
|
||||
- Appends the message to the thread.
|
||||
- Tracks unique participants.
|
||||
- Updates the last activity timestamp.
|
||||
"""
|
||||
self.messages.append(message)
|
||||
|
||||
|
||||
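The three responsibilities listed for the method in this hunk (append, track unique participants, update the last-activity timestamp) might look roughly like this. Only `thread_id`, `message_id`, and `messages` appear in the diff. The fields `from_email`, `timestamp`, `participants`, and `last_activity_at`, and both class names, are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class MessageSketch:
    """Hypothetical stand-in for a normalized message."""

    message_id: str
    thread_id: str
    from_email: str
    timestamp: datetime


@dataclass
class ThreadSketch:
    """Hypothetical stand-in for a thread model."""

    thread_id: str
    messages: List[MessageSketch] = field(default_factory=list)
    participants: Set[str] = field(default_factory=set)
    last_activity_at: Optional[datetime] = None

    def add_message(self, message: MessageSketch) -> None:
        # 1. Append the message to the thread.
        self.messages.append(message)
        # 2. Track unique participants (a set deduplicates senders).
        if message.from_email:
            self.participants.add(message.from_email)
        # 3. Keep the most recent timestamp seen so far.
        if self.last_activity_at is None or message.timestamp > self.last_activity_at:
            self.last_activity_at = message.timestamp
```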
@@ -1,34 +1,34 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Message parsing utilities for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This package contains **provider-aware but adapter-agnostic parsing helpers**
|
||||
used to extract and normalize structured information from raw mail payloads.
|
||||
|
||||
Parsers in this package are responsible for:
|
||||
- Interpreting provider-native message structures
|
||||
- Extracting meaningful fields such as headers, body text, and subjects
|
||||
- Normalizing data into consistent internal representations
|
||||
|
||||
- Interpreting provider-native message structures.
|
||||
- Extracting meaningful fields such as headers, body text, and subjects.
|
||||
- Normalizing data into consistent internal representations.
|
||||
|
||||
This package does not:
|
||||
- Perform network or IO operations
|
||||
- Contain provider API logic
|
||||
- Construct domain models directly
|
||||
|
||||
- Perform network or IO operations.
|
||||
- Contain provider API logic.
|
||||
- Construct domain models directly.
|
||||
|
||||
Parsing functions are designed to be composable and are orchestrated by the
|
||||
ingestion layer.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
# Public API
|
||||
|
||||
extract_body
|
||||
parse_headers
|
||||
extract_sender
|
||||
normalize_subject
|
||||
- `extract_body`
|
||||
- `parse_headers`
|
||||
- `extract_sender`
|
||||
- `normalize_subject`
|
||||
|
||||
---
|
||||
"""
|
||||
|
||||
@@ -1,4 +1,6 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Message body extraction utilities for Mail Intake.
|
||||
|
||||
This module contains helper functions for extracting a best-effort
|
||||
@@ -24,13 +26,16 @@ def _decode_base64(data: str) -> str:
|
||||
omit padding and use non-standard characters.
|
||||
|
||||
Args:
|
||||
data: URL-safe base64-encoded string.
|
||||
data (str):
|
||||
URL-safe base64-encoded string.
|
||||
|
||||
Returns:
|
||||
Decoded UTF-8 text with replacement for invalid characters.
|
||||
str:
|
||||
Decoded UTF-8 text with replacement for invalid characters.
|
||||
|
||||
Raises:
|
||||
MailIntakeParsingError: If decoding fails.
|
||||
MailIntakeParsingError:
|
||||
If decoding fails.
|
||||
"""
|
||||
try:
|
||||
padded = data.replace("-", "+").replace("_", "/")
|
||||
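The docstring in this hunk describes decoding URL-safe base64 that may omit padding. A minimal sketch of that technique follows. It reuses the character replacement visible in the hunk's last line, while the padding arithmetic, the `ParsingErrorSketch` stand-in exception, and the function name are assumptions rather than the library's code.

```python
import base64
import binascii


class ParsingErrorSketch(Exception):
    """Hypothetical stand-in for MailIntakeParsingError."""


def decode_base64url_sketch(data: str) -> str:
    """Decode URL-safe base64 text that may be missing its padding."""
    try:
        # Map the URL-safe alphabet back to the standard one.
        padded = data.replace("-", "+").replace("_", "/")
        # Restore padding so the length is a multiple of four.
        padded += "=" * (-len(padded) % 4)
        raw = base64.b64decode(padded)
    except (binascii.Error, ValueError) as exc:
        raise ParsingErrorSketch("could not decode base64 payload") from exc
    # Invalid UTF-8 sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")
```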
@@ -45,14 +50,17 @@ def _extract_from_part(part: Dict[str, Any]) -> Optional[str]:
|
||||
Extract text content from a single MIME part.
|
||||
|
||||
Supports:
|
||||
- text/plain
|
||||
- text/html (converted to plain text)
|
||||
|
||||
- `text/plain`
|
||||
- `text/html` (converted to plain text)
|
||||
|
||||
Args:
|
||||
part: MIME part dictionary from a provider payload.
|
||||
part (Dict[str, Any]):
|
||||
MIME part dictionary from a provider payload.
|
||||
|
||||
Returns:
|
||||
Extracted plain-text content, or None if unsupported or empty.
|
||||
Optional[str]:
|
||||
Extracted plain-text content, or `None` if unsupported or empty.
|
||||
"""
|
||||
mime_type = part.get("mimeType")
|
||||
body = part.get("body", {})
|
||||
@@ -79,16 +87,19 @@ def extract_body(payload: Dict[str, Any]) -> str:
|
||||
Extract the best-effort message body from a Gmail payload.
|
||||
|
||||
Priority:
|
||||
1. text/plain
|
||||
2. text/html (stripped to text)
|
||||
|
||||
1. `text/plain`
|
||||
2. `text/html` (stripped to text)
|
||||
3. Single-part body
|
||||
4. empty string (if nothing usable found)
|
||||
4. Empty string (if nothing usable found)
|
||||
|
||||
Args:
|
||||
payload: Provider-native message payload dictionary.
|
||||
payload (Dict[str, Any]):
|
||||
Provider-native message payload dictionary.
|
||||
|
||||
Returns:
|
||||
Extracted plain-text message body.
|
||||
str:
|
||||
Extracted plain-text message body.
|
||||
"""
|
||||
if not payload:
|
||||
return ""
|
||||
|
||||
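The priority order documented for `extract_body` (plain text, then HTML stripped to text, then a single-part body, then an empty string) can be sketched as below. The payload shape (`mimeType`, `body.data`, `parts`) follows the keys visible in this diff. The regex-based HTML stripping and all function names are simplifying assumptions, and nested multipart structures are not handled.

```python
import base64
import re
from html import unescape
from typing import Any, Dict, Optional


def _decode(data: str) -> str:
    """Decode URL-safe base64, tolerating missing padding."""
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


def _part_text(part: Dict[str, Any], wanted: str) -> Optional[str]:
    """Return the text of a part if it has the wanted MIME type."""
    if part.get("mimeType") != wanted:
        return None
    data = part.get("body", {}).get("data")
    if not data:
        return None
    text = _decode(data)
    if wanted == "text/html":
        # Crude tag removal for illustration only.
        text = unescape(re.sub(r"<[^>]+>", " ", text))
        text = re.sub(r"\s+", " ", text)
    return text.strip() or None


def extract_body_sketch(payload: Dict[str, Any]) -> str:
    if not payload:
        return ""
    parts = payload.get("parts") or []
    # Priorities 1 and 2: scan all parts for plain text before HTML.
    for wanted in ("text/plain", "text/html"):
        for part in parts:
            text = _part_text(part, wanted)
            if text:
                return text
    # Priority 3: a single-part message keeps its data on the payload.
    data = payload.get("body", {}).get("data")
    if data:
        return _decode(data).strip()
    # Priority 4: nothing usable was found.
    return ""
```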
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Message header parsing utilities for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides helper functions for normalizing and extracting
|
||||
useful information from provider-native message headers.
|
||||
|
||||
@@ -21,7 +19,7 @@ def parse_headers(raw_headers: List[Dict[str, str]]) -> Dict[str, str]:
|
||||
|
||||
Args:
|
||||
raw_headers (List[Dict[str, str]]):
|
||||
List of header dictionaries, each containing ``name`` and ``value`` keys.
|
||||
List of header dictionaries, each containing `name` and `value` keys.
|
||||
|
||||
Returns:
|
||||
Dict[str, str]:
|
||||
@@ -30,23 +28,27 @@ def parse_headers(raw_headers: List[Dict[str, str]]) -> Dict[str, str]:
|
||||
Notes:
|
||||
**Guarantees:**
|
||||
|
||||
- Provider payloads (such as Gmail) typically represent headers as a list of name/value mappings
|
||||
- This function normalizes them into a case-insensitive dictionary keyed by lowercase header names
|
||||
- Provider payloads (such as Gmail) typically represent headers as a
|
||||
list of name/value mappings.
|
||||
- This function normalizes them into a case-insensitive dictionary
|
||||
keyed by lowercase header names.
|
||||
|
||||
Example:
|
||||
Typical usage:
|
||||
|
||||
Input:
|
||||
[
|
||||
{"name": "From", "value": "John Doe <john@example.com>"},
|
||||
{"name": "Subject", "value": "Re: Interview Update"},
|
||||
]
|
||||
|
||||
Output:
|
||||
{
|
||||
"from": "John Doe <john@example.com>",
|
||||
"subject": "Re: Interview Update",
|
||||
}
|
||||
|
||||
```python
|
||||
Input:
|
||||
[
|
||||
{"name": "From", "value": "John Doe <john@example.com>"},
|
||||
{"name": "Subject", "value": "Re: Interview Update"},
|
||||
]
|
||||
|
||||
Output:
|
||||
{
|
||||
"from": "John Doe <john@example.com>",
|
||||
"subject": "Re: Interview Update",
|
||||
}
|
||||
```
|
||||
"""
|
||||
headers: Dict[str, str] = {}
|
||||
|
||||
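The normalization described in the `parse_headers` docstring (a list of name/value mappings turned into a dictionary keyed by lowercase header names) can be sketched in a few lines. The function name `parse_headers_sketch`, the handling of entries without a name, and the last-value-wins behaviour for duplicate headers are assumptions; the library's actual function may differ.

```python
from typing import Dict, List


def parse_headers_sketch(raw_headers: List[Dict[str, str]]) -> Dict[str, str]:
    """Normalize provider-style headers into a lowercase-keyed dict."""
    headers: Dict[str, str] = {}
    for header in raw_headers or []:
        name = header.get("name")
        if not name:
            # Skip malformed entries that have no header name.
            continue
        # Lowercasing the key makes lookups effectively case-insensitive.
        headers[name.lower()] = header.get("value") or ""
    return headers
```

With the input from the docstring example, the call produces the documented output:

```python
parse_headers_sketch([
    {"name": "From", "value": "John Doe <john@example.com>"},
    {"name": "Subject", "value": "Re: Interview Update"},
])
```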
@@ -68,22 +70,24 @@ def extract_sender(headers: Dict[str, str]) -> Tuple[str, Optional[str]]:
|
||||
|
||||
Args:
|
||||
headers (Dict[str, str]):
|
||||
Normalized header dictionary as returned by :func:`parse_headers`.
|
||||
Normalized header dictionary as returned by `parse_headers()`.
|
||||
|
||||
Returns:
|
||||
Tuple[str, Optional[str]]:
|
||||
A tuple ``(email, name)`` where ``email`` is the sender email address and ``name`` is the display name, or ``None`` if unavailable.
|
||||
A tuple `(email, name)` where `email` is the sender email address
|
||||
and `name` is the display name, or `None` if unavailable.
|
||||
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- This function parses the ``From`` header and attempts to extract sender email address and optional human-readable display name
|
||||
- This function parses the `From` header and attempts to extract
|
||||
sender email address and optional human-readable display name.
|
||||
|
||||
Example:
|
||||
Typical values:
|
||||
|
||||
``"John Doe <john@example.com>"`` -> ``("john@example.com", "John Doe")``
|
||||
``"john@example.com"`` -> ``("john@example.com", None)``
|
||||
- `"John Doe <john@example.com>"` -> `("john@example.com", "John Doe")`
|
||||
- `"john@example.com"` -> `("john@example.com", None)`
|
||||
"""
|
||||
from_header = headers.get("from")
|
||||
if not from_header:
|
||||
|
||||
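The `(email, name)` contract documented for `extract_sender` can be illustrated with the standard library's `email.utils.parseaddr`. Whether the library itself uses `parseaddr` is not shown in this diff, and the return value for a missing `From` header (`("", None)` here) is an assumption.

```python
from email.utils import parseaddr
from typing import Dict, Optional, Tuple


def extract_sender_sketch(headers: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Split a normalized From header into (email, display name)."""
    from_header = headers.get("from")
    if not from_header:
        # Assumed fallback when no From header is present.
        return "", None
    # parseaddr returns (display_name, address); either may be empty.
    name, email = parseaddr(from_header)
    return email, (name or None)
```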
@@ -1,10 +1,8 @@
|
||||
"""
|
||||
# Summary
|
||||
|
||||
Subject line normalization utilities for Mail Intake.
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
This module provides helper functions for normalizing email subject lines
|
||||
to enable reliable thread-level comparison and grouping.
|
||||
|
||||
@@ -36,14 +34,15 @@ def normalize_subject(subject: str) -> str:
|
||||
Notes:
|
||||
**Responsibilities:**
|
||||
|
||||
- Strips common prefixes such as ``Re:``, ``Fwd:``, and ``FW:``
|
||||
- Repeats prefix stripping to handle stacked prefixes
|
||||
- Collapses excessive whitespace
|
||||
- Preserves original casing (no lowercasing)
|
||||
- Strips common prefixes such as `Re:`, `Fwd:`, and `FW:`.
|
||||
- Repeats prefix stripping to handle stacked prefixes.
|
||||
- Collapses excessive whitespace.
|
||||
- Preserves original casing (no lowercasing).
|
||||
|
||||
**Guarantees:**
|
||||
|
||||
- This function is intentionally conservative and avoids aggressive transformations that could alter the semantic meaning of the subject
|
||||
- This function is intentionally conservative and avoids aggressive
|
||||
transformations that could alter the semantic meaning of the subject.
|
||||
"""
|
||||
if not subject:
|
||||
return ""
|
||||
|
||||
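The responsibilities listed for `normalize_subject` (strip `Re:`, `Fwd:`, and `FW:` prefixes repeatedly, collapse whitespace, and preserve casing) can be sketched with a small loop. The regular expression and the function name are assumptions; the library may recognize a different set of prefixes.

```python
import re

# Matches one leading reply/forward prefix, case-insensitively.
_PREFIX_RE = re.compile(r"^\s*(?:re|fwd|fw)\s*:\s*", re.IGNORECASE)


def normalize_subject_sketch(subject: str) -> str:
    """Strip stacked reply/forward prefixes and collapse whitespace."""
    if not subject:
        return ""
    normalized = subject
    while True:
        # Remove one prefix per pass until nothing changes, so stacked
        # prefixes such as "Re: Fwd: Re:" are fully stripped.
        stripped = _PREFIX_RE.sub("", normalized, count=1)
        if stripped == normalized:
            break
        normalized = stripped
    # Collapse runs of whitespace; the original casing is left untouched.
    return re.sub(r"\s+", " ", normalized).strip()
```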