google styled doc

2026-03-08 00:29:24 +05:30
parent 9f37af5761
commit 9f9e472ada
21 changed files with 593 additions and 358 deletions


@@ -1,6 +1,10 @@
""" """
Mail Intake — provider-agnostic, read-only email ingestion framework. Mail Intake — provider-agnostic, read-only email ingestion framework.
---
## Summary
Mail Intake is a **contract-first library** designed to ingest, parse, and Mail Intake is a **contract-first library** designed to ingest, parse, and
normalize email data from external providers (such as Gmail) into clean, normalize email data from external providers (such as Gmail) into clean,
provider-agnostic domain models. provider-agnostic domain models.
@@ -20,9 +24,9 @@ as a first-class module at the package root:
The package root acts as a **namespace**, not a facade. Consumers are The package root acts as a **namespace**, not a facade. Consumers are
expected to import functionality explicitly from the appropriate module. expected to import functionality explicitly from the appropriate module.
---------------------------------------------------------------------- ---
Installation
---------------------------------------------------------------------- ## Installation
Install using pip: Install using pip:
@@ -35,9 +39,9 @@ Or with Poetry:
Mail Intake is pure Python and has no runtime dependencies beyond those Mail Intake is pure Python and has no runtime dependencies beyond those
required by the selected provider (for example, Google APIs for Gmail). required by the selected provider (for example, Google APIs for Gmail).
---------------------------------------------------------------------- ---
Basic Usage
---------------------------------------------------------------------- ## Quick start
Minimal Gmail ingestion example (local development): Minimal Gmail ingestion example (local development):
@@ -65,27 +69,41 @@ Iterating over threads:
for thread in reader.iter_threads("subject:Interview"): for thread in reader.iter_threads("subject:Interview"):
print(thread.normalized_subject, len(thread.messages)) print(thread.normalized_subject, len(thread.messages))
---------------------------------------------------------------------- ---
Extensibility Model
---------------------------------------------------------------------- ## Architecture
Mail Intake is designed to be extensible via **public contracts** exposed Mail Intake is designed to be extensible via **public contracts** exposed
through its modules: through its modules:
- Users MAY implement their own mail adapters by subclassing - Users MAY implement their own mail adapters by subclassing ``adapters.MailIntakeAdapter``
``adapters.MailIntakeAdapter`` - Users MAY implement their own authentication providers by subclassing ``auth.MailIntakeAuthProvider[T]``
- Users MAY implement their own authentication providers by subclassing - Users MAY implement their own credential persistence layers by implementing ``credentials.CredentialStore[T]``
``auth.MailIntakeAuthProvider[T]``
- Users MAY implement their own credential persistence layers by
implementing ``credentials.CredentialStore[T]``
Users SHOULD NOT subclass built-in adapter implementations. Built-in Users SHOULD NOT subclass built-in adapter implementations. Built-in
adapters (such as Gmail) are reference implementations and may change adapters (such as Gmail) are reference implementations and may change
internally without notice. internally without notice.
---------------------------------------------------------------------- **Design Guarantees:**
Public API Surface - Read-only access: no mutation of provider state
---------------------------------------------------------------------- - Provider-agnostic domain models
- Explicit configuration and dependency injection
- No implicit global state or environment reads
- Deterministic, testable behavior
- Distributed-safe authentication design
Mail Intake favors correctness, clarity, and explicitness over convenience
shortcuts.
**Core Philosophy:**
`Mail Intake` is built as a **contract-first ingestion pipeline**:
1. **Layered Decoupling**: Adapters handle transport, Parsers handle format normalization, and Ingestion orchestrates.
2. **Provider Agnosticism**: Domain models and core logic never depend on provider-specific (e.g., Gmail) API internals.
3. **Stateless Workflows**: The library functions as a read-only pipe, ensuring side-effect-free ingestion.
---
## Public API
The supported public API consists of the following top-level modules: The supported public API consists of the following top-level modules:
@@ -101,40 +119,7 @@ The supported public API consists of the following top-level modules:
Classes and functions should be imported explicitly from these modules. Classes and functions should be imported explicitly from these modules.
No individual symbols are re-exported at the package root. No individual symbols are re-exported at the package root.
---------------------------------------------------------------------- ---
Design Guarantees
----------------------------------------------------------------------
- Read-only access: no mutation of provider state
- Provider-agnostic domain models
- Explicit configuration and dependency injection
- No implicit global state or environment reads
- Deterministic, testable behavior
- Distributed-safe authentication design
Mail Intake favors correctness, clarity, and explicitness over convenience
shortcuts.
## Core Philosophy
`Mail Intake` is built as a **contract-first ingestion pipeline**:
1. **Layered Decoupling**: Adapters handle transport, Parsers handle format normalization, and Ingestion orchestrates.
2. **Provider Agnosticism**: Domain models and core logic never depend on provider-specific (e.g., Gmail) API internals.
3. **Stateless Workflows**: The library functions as a read-only pipe, ensuring side-effect-free ingestion.
## Documentation Design
Follow these "AI-Native" docstring principles across the codebase:
### For Humans
- **Namespace Clarity**: Always specify which module a class or function belongs to.
- **Contract Explanations**: Use the `adapters` and `auth` base classes to explain extension requirements.
### For LLMs
- **Dotted Paths**: Use full dotted paths in docstrings to help agents link concepts across modules.
- **Typed Interfaces**: Provide `.pyi` stubs for every public module to ensure perfect context for AI coding tools.
- **Canonical Exceptions**: Always use `: description` pairs in `Raises` blocks to enable structured error analysis.
""" """


@@ -1,6 +1,10 @@
""" """
Mail provider adapter implementations for Mail Intake. Mail provider adapter implementations for Mail Intake.
---
## Summary
This package contains **adapter-layer implementations** responsible for This package contains **adapter-layer implementations** responsible for
interfacing with external mail providers and exposing a normalized, interfacing with external mail providers and exposing a normalized,
provider-agnostic contract to the rest of the system. provider-agnostic contract to the rest of the system.
@@ -15,8 +19,14 @@ Provider-specific logic **must not leak** outside of adapter implementations.
All parsing, normalization, and transformation must be handled by downstream All parsing, normalization, and transformation must be handled by downstream
components. components.
Public adapters exported from this package are considered the supported ---
integration surface for mail providers.
## Public API
MailIntakeAdapter
MailIntakeGmailAdapter
---
""" """
from .base import MailIntakeAdapter from .base import MailIntakeAdapter


@@ -1,6 +1,10 @@
""" """
Mail provider adapter contracts for Mail Intake. Mail provider adapter contracts for Mail Intake.
---
## Summary
This module defines the **provider-agnostic adapter interface** used for This module defines the **provider-agnostic adapter interface** used for
read-only mail ingestion. read-only mail ingestion.
@@ -17,12 +21,16 @@ class MailIntakeAdapter(ABC):
""" """
Base adapter interface for mail providers. Base adapter interface for mail providers.
This interface defines the minimal contract required to: Notes:
- Discover messages matching a query **Guarantees:**
- Retrieve full message payloads
- Retrieve full thread payloads
Adapters are intentionally read-only and must not mutate provider state. - discover messages matching a query
- retrieve full message payloads
- retrieve full thread payloads
**Lifecycle:**
- adapters are intentionally read-only and must not mutate provider state
""" """
@abstractmethod @abstractmethod
@@ -30,17 +38,22 @@ class MailIntakeAdapter(ABC):
""" """
Iterate over lightweight message references matching a query. Iterate over lightweight message references matching a query.
Implementations must yield dictionaries containing at least:
- ``message_id``: Provider-specific message identifier
- ``thread_id``: Provider-specific thread identifier
Args: Args:
query: Provider-specific query string used to filter messages. query (str):
Provider-specific query string used to filter messages.
Yields: Yields:
Dict[str, str]:
Dictionaries containing message and thread identifiers. Dictionaries containing message and thread identifiers.
Example yield: Notes:
**Guarantees:**
- Implementations must yield dictionaries containing at least ``message_id`` and ``thread_id``
Example:
Typical yield:
{ {
"message_id": "...", "message_id": "...",
"thread_id": "..." "thread_id": "..."
@@ -54,11 +67,12 @@ class MailIntakeAdapter(ABC):
Fetch a full raw message by message identifier. Fetch a full raw message by message identifier.
Args: Args:
message_id: Provider-specific message identifier. message_id (str):
Provider-specific message identifier.
Returns: Returns:
Provider-native message payload Dict[str, Any]:
(e.g., Gmail message JSON structure). Provider-native message payload (e.g., Gmail message JSON structure).
""" """
raise NotImplementedError raise NotImplementedError
@@ -68,9 +82,11 @@ class MailIntakeAdapter(ABC):
Fetch a full raw thread by thread identifier. Fetch a full raw thread by thread identifier.
Args: Args:
thread_id: Provider-specific thread identifier. thread_id (str):
Provider-specific thread identifier.
Returns: Returns:
Dict[str, Any]:
Provider-native thread payload. Provider-native thread payload.
""" """
raise NotImplementedError raise NotImplementedError
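As a reading aid for the adapter contract documented above, here is a minimal sketch of a custom adapter. The diff shows only the docstrings, so the stand-in base class and the method names `iter_message_refs`, `fetch_message`, and `fetch_thread` are assumptions, and `InMemoryAdapter` is a hypothetical class that is not part of Mail Intake.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List


class MailIntakeAdapter(ABC):
    """Stand-in for the documented contract; method names are assumed."""

    @abstractmethod
    def iter_message_refs(self, query: str) -> Iterator[Dict[str, str]]:
        raise NotImplementedError

    @abstractmethod
    def fetch_message(self, message_id: str) -> Dict[str, Any]:
        raise NotImplementedError

    @abstractmethod
    def fetch_thread(self, thread_id: str) -> Dict[str, Any]:
        raise NotImplementedError


class InMemoryAdapter(MailIntakeAdapter):
    """Hypothetical read-only adapter backed by a list of message dicts."""

    def __init__(self, messages: List[Dict[str, Any]]) -> None:
        self._messages = {m["id"]: m for m in messages}

    def iter_message_refs(self, query: str) -> Iterator[Dict[str, str]]:
        # Toy query semantics: case-insensitive substring match on the subject.
        for msg in self._messages.values():
            if query.lower() in msg["subject"].lower():
                yield {"message_id": msg["id"], "thread_id": msg["thread_id"]}

    def fetch_message(self, message_id: str) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the backing data.
        return dict(self._messages[message_id])

    def fetch_thread(self, thread_id: str) -> Dict[str, Any]:
        members = [
            dict(m) for m in self._messages.values() if m["thread_id"] == thread_id
        ]
        return {"id": thread_id, "messages": members}


adapter = InMemoryAdapter(
    [
        {"id": "m1", "thread_id": "t1", "subject": "Interview schedule"},
        {"id": "m2", "thread_id": "t1", "subject": "Re: Interview schedule"},
        {"id": "m3", "thread_id": "t2", "subject": "Invoice"},
    ]
)
refs = list(adapter.iter_message_refs("interview"))
```

The sketch yields only the two required keys, `message_id` and `thread_id`, and returns raw payloads without interpreting them, which matches the documented split between adapters and downstream parsers.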


@@ -1,6 +1,10 @@
""" """
Gmail adapter implementation for Mail Intake. Gmail adapter implementation for Mail Intake.
---
## Summary
This module provides a **Gmail-specific implementation** of the This module provides a **Gmail-specific implementation** of the
`MailIntakeAdapter` contract. `MailIntakeAdapter` contract.
@@ -30,12 +34,15 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Gmail REST API. It translates the generic mail intake contract into Gmail REST API. It translates the generic mail intake contract into
Gmail-specific API calls. Gmail-specific API calls.
This class is the ONLY place where: Notes:
- googleapiclient is imported **Responsibilities:**
- This class is the ONLY place where googleapiclient is imported
- Gmail REST semantics are known - Gmail REST semantics are known
- .execute() is called - .execute() is called
Design constraints: **Constraints:**
- Must remain thin and imperative - Must remain thin and imperative
- Must not perform parsing or interpretation - Must not perform parsing or interpretation
- Must not expose Gmail-specific types beyond this class - Must not expose Gmail-specific types beyond this class
@@ -50,9 +57,11 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Initialize the Gmail adapter. Initialize the Gmail adapter.
Args: Args:
auth_provider: Authentication provider capable of supplying auth_provider (MailIntakeAuthProvider):
valid Gmail API credentials. Authentication provider capable of supplying valid Gmail API credentials.
user_id: Gmail user identifier. Defaults to `"me"`.
user_id (str):
Gmail user identifier. Defaults to `"me"`.
""" """
self._auth_provider = auth_provider self._auth_provider = auth_provider
self._user_id = user_id self._user_id = user_id
@@ -64,10 +73,12 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Lazily initialize and return the Gmail API service client. Lazily initialize and return the Gmail API service client.
Returns: Returns:
Any:
Initialized Gmail API service instance. Initialized Gmail API service instance.
Raises: Raises:
MailIntakeAdapterError: If the Gmail service cannot be initialized. MailIntakeAdapterError:
If the Gmail service cannot be initialized.
""" """
if self._service is None: if self._service is None:
try: try:
@@ -84,15 +95,16 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Iterate over message references matching the query. Iterate over message references matching the query.
Args: Args:
query: Gmail search query string. query (str):
Gmail search query string.
Yields: Yields:
Dictionaries containing: Dict[str, str]:
- ``message_id``: Gmail message ID Dictionaries containing ``message_id`` and ``thread_id``.
- ``thread_id``: Gmail thread ID
Raises: Raises:
MailIntakeAdapterError: If the Gmail API returns an error. MailIntakeAdapterError:
If the Gmail API returns an error.
""" """
try: try:
request = ( request = (
@@ -126,13 +138,16 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Fetch a full Gmail message by message ID. Fetch a full Gmail message by message ID.
Args: Args:
message_id: Gmail message identifier. message_id (str):
Gmail message identifier.
Returns: Returns:
Dict[str, Any]:
Provider-native Gmail message payload. Provider-native Gmail message payload.
Raises: Raises:
MailIntakeAdapterError: If the Gmail API returns an error. MailIntakeAdapterError:
If the Gmail API returns an error.
""" """
try: try:
return ( return (
@@ -151,13 +166,16 @@ class MailIntakeGmailAdapter(MailIntakeAdapter):
Fetch a full Gmail thread by thread ID. Fetch a full Gmail thread by thread ID.
Args: Args:
thread_id: Gmail thread identifier. thread_id (str):
Gmail thread identifier.
Returns: Returns:
Dict[str, Any]:
Provider-native Gmail thread payload. Provider-native Gmail thread payload.
Raises: Raises:
MailIntakeAdapterError: If the Gmail API returns an error. MailIntakeAdapterError:
If the Gmail API returns an error.
""" """
try: try:
return ( return (


@@ -1,6 +1,10 @@
""" """
Authentication provider implementations for Mail Intake. Authentication provider implementations for Mail Intake.
---
## Summary
This package defines the **authentication layer** used by mail adapters This package defines the **authentication layer** used by mail adapters
to obtain provider-specific credentials. to obtain provider-specific credentials.
@@ -15,6 +19,15 @@ Authentication providers:
Consumers should depend on the abstract interface and use concrete Consumers should depend on the abstract interface and use concrete
implementations only where explicitly required. implementations only where explicitly required.
---
## Public API
MailIntakeAuthProvider
MailIntakeGoogleAuth
---
""" """
from .base import MailIntakeAuthProvider from .base import MailIntakeAuthProvider


@@ -1,6 +1,10 @@
""" """
Authentication provider contracts for Mail Intake. Authentication provider contracts for Mail Intake.
---
## Summary
This module defines the **authentication abstraction layer** used by mail This module defines the **authentication abstraction layer** used by mail
adapters to obtain provider-specific credentials. adapters to obtain provider-specific credentials.
@@ -23,15 +27,18 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
providers and mail adapters by requiring providers to explicitly providers and mail adapters by requiring providers to explicitly
declare the type of credentials they return. declare the type of credentials they return.
Authentication providers encapsulate all logic required to: Notes:
**Responsibilities:**
- Acquire credentials from an external provider - Acquire credentials from an external provider
- Refresh or revalidate credentials as needed - Refresh or revalidate credentials as needed
- Handle authentication-specific failure modes - Handle authentication-specific failure modes
- Coordinate with credential persistence layers where applicable - Coordinate with credential persistence layers where applicable
Mail adapters must treat returned credentials as opaque and **Constraints:**
provider-specific, relying only on the declared credential type
expected by the adapter. - Mail adapters must treat returned credentials as opaque and provider-specific
- Mail adapters rely only on the declared credential type expected by the adapter
""" """
@abstractmethod @abstractmethod
@@ -39,15 +46,8 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
""" """
Retrieve valid, provider-specific credentials. Retrieve valid, provider-specific credentials.
This method is synchronous by design and represents the sole
entry point through which adapters obtain authentication
material.
Implementations must either return credentials of the declared
type ``T`` that are valid at the time of return or raise an
authentication-specific exception.
Returns: Returns:
T:
Credentials of type ``T`` suitable for immediate use by the Credentials of type ``T`` suitable for immediate use by the
corresponding mail adapter. corresponding mail adapter.
@@ -55,5 +55,12 @@ class MailIntakeAuthProvider(ABC, Generic[T]):
Exception: Exception:
An authentication-specific exception indicating that An authentication-specific exception indicating that
credentials could not be obtained or validated. credentials could not be obtained or validated.
Notes:
**Guarantees:**
- This method is synchronous by design
- Represents the sole entry point through which adapters obtain authentication material
- Implementations must either return credentials of the declared type ``T`` that are valid at the time of return or raise an exception
""" """
raise NotImplementedError raise NotImplementedError
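The generic credential contract described above can be illustrated with a small sketch. The method name `get_credentials`, the `AuthError` exception, and the `StaticTokenAuth` class are all hypothetical; the diff confirms only that the provider is `Generic[T]`, synchronous, and must either return valid credentials of type `T` or raise.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


class MailIntakeAuthProvider(ABC, Generic[T]):
    """Stand-in for the documented contract; the method name is assumed."""

    @abstractmethod
    def get_credentials(self) -> T:
        raise NotImplementedError


class AuthError(Exception):
    """Hypothetical authentication-specific exception."""


class StaticTokenAuth(MailIntakeAuthProvider[str]):
    """Hypothetical provider whose credential type T is a plain string token."""

    def __init__(self, token: str) -> None:
        self._token = token

    def get_credentials(self) -> str:
        # Either return usable credentials or raise; never return a placeholder.
        if not self._token:
            raise AuthError("no token configured")
        return self._token


token = StaticTokenAuth("abc123").get_credentials()

try:
    StaticTokenAuth("").get_credentials()
    rejected = False
except AuthError:
    rejected = True
```

Declaring the credential type on the subclass (`MailIntakeAuthProvider[str]` here) lets a type checker verify that an adapter and its provider agree on what is being passed between them.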


@@ -1,6 +1,10 @@
""" """
Google authentication provider implementation for Mail Intake. Google authentication provider implementation for Mail Intake.
---
## Summary
This module provides a **Google OAuth-based authentication provider** This module provides a **Google OAuth-based authentication provider**
used primarily for Gmail access. used primarily for Gmail access.
@@ -33,13 +37,17 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
This provider implements the `MailIntakeAuthProvider` interface using This provider implements the `MailIntakeAuthProvider` interface using
Google's OAuth 2.0 flow and credential management libraries. Google's OAuth 2.0 flow and credential management libraries.
Responsibilities: Notes:
**Responsibilities:**
- Load cached credentials from a credential store when available - Load cached credentials from a credential store when available
- Refresh expired credentials when possible - Refresh expired credentials when possible
- Initiate an interactive OAuth flow only when required - Initiate an interactive OAuth flow only when required
- Persist refreshed or newly obtained credentials via the store - Persist refreshed or newly obtained credentials via the store
This class is synchronous by design and maintains a minimal internal state. **Guarantees:**
- This class is synchronous by design and maintains a minimal internal state
""" """
def __init__( def __init__(
@@ -52,15 +60,13 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
Initialize the Google authentication provider. Initialize the Google authentication provider.
Args: Args:
credentials_path: credentials_path (str):
Path to the Google OAuth client secrets file used to Path to the Google OAuth client secrets file used to initiate the OAuth 2.0 flow.
initiate the OAuth 2.0 flow.
store: store (CredentialStore[Credentials]):
Credential store responsible for persisting and Credential store responsible for persisting and retrieving Google OAuth credentials.
retrieving Google OAuth credentials.
scopes: scopes (Sequence[str]):
OAuth scopes required for Gmail access. OAuth scopes required for Gmail access.
""" """
self.credentials_path = credentials_path self.credentials_path = credentials_path
@@ -71,19 +77,23 @@ class MailIntakeGoogleAuth(MailIntakeAuthProvider):
""" """
Retrieve valid Google OAuth credentials. Retrieve valid Google OAuth credentials.
This method attempts to:
1. Load cached credentials from the configured credential store
2. Refresh expired credentials when possible
3. Perform an interactive OAuth login as a fallback
4. Persist valid credentials for future use
Returns: Returns:
Credentials:
A ``google.oauth2.credentials.Credentials`` instance suitable A ``google.oauth2.credentials.Credentials`` instance suitable
for use with Google API clients. for use with Google API clients.
Raises: Raises:
MailIntakeAuthError: If credentials cannot be loaded, refreshed, MailIntakeAuthError:
If credentials cannot be loaded, refreshed,
or obtained via interactive authentication. or obtained via interactive authentication.
Notes:
**Lifecycle:**
- Load cached credentials from the configured credential store
- Refresh expired credentials when possible
- Perform an interactive OAuth login as a fallback
- Persist valid credentials for future use
""" """
creds = self.store.load() creds = self.store.load()
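The four lifecycle steps listed in the docstring above (load, refresh, interactive fallback, persist) can be sketched without any Google libraries. Everything here is a stand-in: `FakeCredentials`, `MemoryStore`, and `resolve_credentials` are hypothetical names, and skipping the save when cached credentials are already valid is an assumption rather than confirmed behavior.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FakeCredentials:
    """Hypothetical stand-in for an OAuth credential object."""

    valid: bool
    refreshable: bool = False

    def refresh(self) -> None:
        self.valid = True


class MemoryStore:
    """Hypothetical in-memory credential store exposing load/save."""

    def __init__(self, creds: Optional[FakeCredentials] = None) -> None:
        self.creds = creds
        self.saves = 0

    def load(self) -> Optional[FakeCredentials]:
        return self.creds

    def save(self, creds: FakeCredentials) -> None:
        self.creds = creds
        self.saves += 1


def resolve_credentials(
    store: MemoryStore, interactive_login: Callable[[], FakeCredentials]
) -> FakeCredentials:
    creds = store.load()  # 1. load cached credentials
    if creds is not None and creds.valid:
        return creds  # assumption: nothing to persist when the cache is valid
    if creds is not None and creds.refreshable:
        creds.refresh()  # 2. refresh expired credentials when possible
    else:
        creds = interactive_login()  # 3. interactive login as a fallback
    store.save(creds)  # 4. persist for future use
    return creds


logins: List[int] = []


def fake_login() -> FakeCredentials:
    logins.append(1)
    return FakeCredentials(valid=True)


expired_store = MemoryStore(FakeCredentials(valid=False, refreshable=True))
refreshed = resolve_credentials(expired_store, fake_login)

empty_store = MemoryStore()
fresh = resolve_credentials(empty_store, fake_login)
```

In this sketch the interactive path runs only for the empty store; the expired-but-refreshable credentials are refreshed and saved without prompting.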


@@ -1,6 +1,10 @@
""" """
Global configuration models for Mail Intake. Global configuration models for Mail Intake.
---
## Summary
This module defines the **top-level configuration object** used to control This module defines the **top-level configuration object** used to control
mail ingestion behavior across adapters, authentication providers, and mail ingestion behavior across adapters, authentication providers, and
ingestion workflows. ingestion workflows.
@@ -18,28 +22,37 @@ class MailIntakeConfig:
""" """
Global configuration for mail-intake. Global configuration for mail-intake.
This configuration is intentionally explicit and immutable. Notes:
No implicit environment reads or global state. **Guarantees:**
Design principles: - This configuration is intentionally explicit and immutable
- Immutable once constructed - No implicit environment reads or global state
- Explicit configuration over implicit defaults - Explicit configuration over implicit defaults
- No direct environment or filesystem access - No direct environment or filesystem access
- This model is safe to pass across layers and suitable for serialization
This model is safe to pass across layers and suitable for serialization.
""" """
provider: str = "gmail" provider: str = "gmail"
"""Identifier of the mail provider to use (e.g., ``"gmail"``).""" """
Identifier of the mail provider to use (e.g., ``"gmail"``).
"""
user_id: str = "me" user_id: str = "me"
"""Provider-specific user identifier. Defaults to the authenticated user.""" """
Provider-specific user identifier. Defaults to the authenticated user.
"""
readonly: bool = True readonly: bool = True
"""Whether ingestion should operate in read-only mode.""" """
Whether ingestion should operate in read-only mode.
"""
credentials_path: Optional[str] = None credentials_path: Optional[str] = None
"""Optional path to provider credentials configuration.""" """
Optional path to provider credentials configuration.
"""
token_path: Optional[str] = None token_path: Optional[str] = None
"""Optional path to persisted authentication tokens.""" """
Optional path to persisted authentication tokens.
"""
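To show what "immutable once constructed" means in practice, here is a sketch that mirrors the fields in the diff. The class is a stand-in, and implementing immutability with `@dataclass(frozen=True)` is an assumption; the diff does not show the decorator.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MailIntakeConfig:
    """Stand-in mirroring the fields in the diff; frozen=True is assumed."""

    provider: str = "gmail"
    user_id: str = "me"
    readonly: bool = True
    credentials_path: Optional[str] = None
    token_path: Optional[str] = None


config = MailIntakeConfig(credentials_path="client_secret.json")

try:
    config.user_id = "someone-else"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False

# Derive a modified copy instead of mutating in place.
staging = replace(config, token_path="token.bin")
```

`dataclasses.replace` returns a new instance, so the original `config` is left untouched and can be shared safely across layers.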


@@ -1,6 +1,10 @@
""" """
Credential persistence interfaces and implementations for Mail Intake. Credential persistence interfaces and implementations for Mail Intake.
---
## Summary
This package defines the abstractions and concrete implementations used This package defines the abstractions and concrete implementations used
to persist authentication credentials across Mail Intake components. to persist authentication credentials across Mail Intake components.
@@ -16,6 +20,16 @@ The package provides:
Credential lifecycle management, interpretation, and security policy Credential lifecycle management, interpretation, and security policy
decisions remain the responsibility of authentication providers. decisions remain the responsibility of authentication providers.
---
## Public API
CredentialStore
PickleCredentialStore
RedisCredentialStore
---
""" """
from mail_intake.credentials.store import CredentialStore from mail_intake.credentials.store import CredentialStore


@@ -1,6 +1,10 @@
""" """
Local filesystem-based credential persistence for Mail Intake. Local filesystem-based credential persistence for Mail Intake.
---
## Summary
This module provides a file-backed implementation of the This module provides a file-backed implementation of the
``CredentialStore`` abstraction using Python's ``pickle`` module. ``CredentialStore`` abstraction using Python's ``pickle`` module.
@@ -29,13 +33,16 @@ class PickleCredentialStore(CredentialStore[T]):
filesystem. It is a simple implementation intended primarily for filesystem. It is a simple implementation intended primarily for
development, testing, and single-process execution contexts. development, testing, and single-process execution contexts.
This implementation: Notes:
**Guarantees:**
- Stores credentials on the local filesystem - Stores credentials on the local filesystem
- Uses pickle for serialization and deserialization - Uses pickle for serialization and deserialization
- Does not provide encryption, locking, or concurrency guarantees - Does not provide encryption, locking, or concurrency guarantees
Credential lifecycle management, validation, and refresh logic are **Constraints:**
explicitly out of scope for this class.
- Credential lifecycle management, validation, and refresh logic are explicitly out of scope for this class
""" """
def __init__(self, path: str): def __init__(self, path: str):
@@ -43,7 +50,7 @@ class PickleCredentialStore(CredentialStore[T]):
Initialize a pickle-backed credential store. Initialize a pickle-backed credential store.
Args: Args:
path: path (str):
Filesystem path where credentials will be stored. Filesystem path where credentials will be stored.
The file will be created or overwritten as needed. The file will be created or overwritten as needed.
""" """
@@ -53,15 +60,16 @@ class PickleCredentialStore(CredentialStore[T]):
""" """
Load credentials from the local filesystem. Load credentials from the local filesystem.
If the credential file does not exist or cannot be successfully
deserialized, this method returns ``None``.
The store does not attempt to validate or interpret the returned
credentials.
Returns: Returns:
Optional[T]:
An instance of type ``T`` if credentials are present and An instance of type ``T`` if credentials are present and
successfully deserialized; otherwise ``None``. successfully deserialized; otherwise ``None``.
Notes:
**Guarantees:**
- If the credential file does not exist or cannot be successfully deserialized, this method returns ``None``
- The store does not attempt to validate or interpret the returned credentials
""" """
try: try:
with open(self.path, "rb") as fh: with open(self.path, "rb") as fh:
@@ -73,12 +81,14 @@ class PickleCredentialStore(CredentialStore[T]):
""" """
Persist credentials to the local filesystem. Persist credentials to the local filesystem.
Any previously stored credentials at the configured path are
overwritten.
Args: Args:
credentials: credentials (T):
The credential object to persist. The credential object to persist.
Notes:
**Responsibilities:**
- Any previously stored credentials at the configured path are overwritten
""" """
with open(self.path, "wb") as fh: with open(self.path, "wb") as fh:
pickle.dump(credentials, fh) pickle.dump(credentials, fh)
@@ -87,8 +97,10 @@ class PickleCredentialStore(CredentialStore[T]):
""" """
Remove persisted credentials from the local filesystem. Remove persisted credentials from the local filesystem.
This method deletes the credential file if it exists and should Notes:
be treated as an idempotent operation. **Lifecycle:**
- This method deletes the credential file if it exists and should be treated as an idempotent operation
""" """
import os import os
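The load, save, and clear semantics documented for the pickle store can be sketched as follows. `PickleStoreSketch` is a hypothetical stand-in, and the method names `save` and `clear` are assumptions (only `load` appears by name in this diff); the behavior follows the docstrings: `load` returns `None` on a missing or unreadable file, `save` overwrites, and `clear` is idempotent.

```python
import os
import pickle
import tempfile
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class PickleStoreSketch(Generic[T]):
    """Hypothetical file-backed store; not the library's implementation."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> Optional[T]:
        # A missing or undeserializable file yields None rather than an error.
        try:
            with open(self.path, "rb") as fh:
                return pickle.load(fh)
        except Exception:
            return None

    def save(self, credentials: T) -> None:
        # Opening with "wb" truncates, so earlier credentials are overwritten.
        with open(self.path, "wb") as fh:
            pickle.dump(credentials, fh)

    def clear(self) -> None:
        # Idempotent: deleting an already-absent file is not an error.
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass


with tempfile.TemporaryDirectory() as tmp:
    store: PickleStoreSketch[dict] = PickleStoreSketch(os.path.join(tmp, "creds.pkl"))
    missing = store.load()
    store.save({"token": "abc"})
    loaded = store.load()
    store.clear()
    store.clear()  # second call is a no-op
    after_clear = store.load()
```

Because `pickle.load` can execute arbitrary code from the file it reads, a store like this should only ever point at a path the application itself controls.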


@@ -1,6 +1,10 @@
""" """
Redis-backed credential persistence for Mail Intake. Redis-backed credential persistence for Mail Intake.
---
## Summary
This module provides a Redis-based implementation of the This module provides a Redis-based implementation of the
``CredentialStore`` abstraction, enabling credential persistence ``CredentialStore`` abstraction, enabling credential persistence
across distributed and horizontally scaled deployments. across distributed and horizontally scaled deployments.
@@ -37,14 +41,16 @@ class RedisCredentialStore(CredentialStore[T]):
distributed and horizontally scaled deployments where credentials distributed and horizontally scaled deployments where credentials
must be shared across multiple processes or nodes. must be shared across multiple processes or nodes.
The store is intentionally generic and delegates all serialization Notes:
concerns to caller-provided functions. This avoids unsafe mechanisms **Responsibilities:**
such as pickle and allows credential formats to be explicitly
controlled and audited.
This class is responsible only for persistence and retrieval. - This class is responsible only for persistence and retrieval
It does not interpret, validate, refresh, or otherwise manage - It does not interpret, validate, refresh, or otherwise manage the lifecycle of the credentials being stored
the lifecycle of the credentials being stored.
**Guarantees:**
- The store is intentionally generic and delegates all serialization concerns to caller-provided functions
- This avoids unsafe mechanisms such as pickle and allows credential formats to be explicitly controlled and audited
""" """
def __init__( def __init__(
@@ -59,31 +65,20 @@ class RedisCredentialStore(CredentialStore[T]):
Initialize a Redis-backed credential store. Initialize a Redis-backed credential store.
Args: Args:
redis_client: redis_client (Any):
An initialized Redis client instance (for example, An initialized Redis client instance (for example, ``redis.Redis`` or a compatible interface) used to communicate with the Redis server.
``redis.Redis`` or a compatible interface) used to
communicate with the Redis server.
key: key (str):
The Redis key under which credentials are stored. The Redis key under which credentials are stored. Callers are responsible for applying appropriate namespacing to avoid collisions.
Callers are responsible for applying appropriate
namespacing to avoid collisions.
serialize: serialize (Callable[[T], bytes]):
A callable that converts a credential object of type A callable that converts a credential object of type ``T`` into a ``bytes`` representation suitable for storage in Redis.
``T`` into a ``bytes`` representation suitable for
storage in Redis.
deserialize: deserialize (Callable[[bytes], T]):
A callable that converts a ``bytes`` payload retrieved A callable that converts a ``bytes`` payload retrieved from Redis back into a credential object of type ``T``.
from Redis back into a credential object of type ``T``.
ttl_seconds: ttl_seconds (Optional[int]):
Optional time-to-live (TTL) for the stored credentials, Optional time-to-live (TTL) for the stored credentials, expressed in seconds. When provided, Redis will automatically expire the stored credentials after the specified duration. If ``None``, credentials are stored without an expiration.
expressed in seconds. When provided, Redis will
automatically expire the stored credentials after the
specified duration. If ``None``, credentials are stored
without an expiration.
""" """
self.redis = redis_client self.redis = redis_client
self.key = key self.key = key
@@ -95,16 +90,16 @@ class RedisCredentialStore(CredentialStore[T]):
""" """
Load credentials from Redis. Load credentials from Redis.
If no value exists for the configured key, or if the stored
payload cannot be successfully deserialized, this method
returns ``None``.
The store does not attempt to validate the returned credentials
or determine whether they are expired or otherwise usable.
Returns: Returns:
Optional[T]:
An instance of type ``T`` if credentials are present and An instance of type ``T`` if credentials are present and
successfully deserialized; otherwise ``None``. successfully deserialized; otherwise ``None``.
Notes:
**Guarantees:**
- If no value exists for the configured key, or if the stored payload cannot be successfully deserialized, this method returns ``None``
- The store does not attempt to validate the returned credentials or determine whether they are expired or otherwise usable
""" """
raw = self.redis.get(self.key) raw = self.redis.get(self.key)
if not raw: if not raw:
@@ -118,13 +113,15 @@ class RedisCredentialStore(CredentialStore[T]):
""" """
Persist credentials to Redis. Persist credentials to Redis.
Any previously stored credentials under the same key are
overwritten. If a TTL is configured, the credentials will
expire automatically after the specified duration.
Args: Args:
credentials: credentials (T):
The credential object to persist. The credential object to persist.
Notes:
**Responsibilities:**
- Any previously stored credentials under the same key are overwritten
- If a TTL is configured, the credentials will expire automatically after the specified duration
""" """
payload = self.serialize(credentials) payload = self.serialize(credentials)
if self.ttl_seconds: if self.ttl_seconds:
@@ -136,7 +133,10 @@ class RedisCredentialStore(CredentialStore[T]):
""" """
Remove stored credentials from Redis. Remove stored credentials from Redis.
This operation deletes the configured Redis key if it exists. Notes:
Implementations should treat this method as idempotent. **Lifecycle:**
- This operation deletes the configured Redis key if it exists
- Implementations should treat this method as idempotent
""" """
self.redis.delete(self.key) self.redis.delete(self.key)
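
The load, persist, and remove behavior documented in this file can be sketched against an in-memory stand-in for the Redis client. This is a minimal sketch, not the library's implementation: `FakeRedis`, `SketchRedisStore`, and the method names `load`, `save`, and `clear` are assumptions made for illustration, and the stand-in only records the TTL without ever expiring keys.

```python
import json
from typing import Any, Callable, Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class FakeRedis:
    """Hypothetical in-memory stand-in offering get/set/delete like a Redis client."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.ttls: Dict[str, int] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self.data.get(name)

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> bool:
        self.data[name] = value
        if ex is None:
            self.ttls.pop(name, None)
        else:
            self.ttls[name] = ex  # recorded only; this stand-in never expires keys
        return True

    def delete(self, name: str) -> int:
        self.ttls.pop(name, None)
        return 1 if self.data.pop(name, None) is not None else 0


class SketchRedisStore(Generic[T]):
    """Assumed shape of a Redis-backed store; method names are illustrative."""

    def __init__(
        self,
        redis_client: Any,
        key: str,
        serialize: Callable[[T], bytes],
        deserialize: Callable[[bytes], T],
        ttl_seconds: Optional[int] = None,
    ) -> None:
        self.redis = redis_client
        self.key = key
        self.serialize = serialize
        self.deserialize = deserialize
        self.ttl_seconds = ttl_seconds

    def load(self) -> Optional[T]:
        raw = self.redis.get(self.key)
        if not raw:
            return None
        try:
            return self.deserialize(raw)
        except Exception:
            # An undecodable payload is treated the same as a missing one.
            return None

    def save(self, credentials: T) -> None:
        payload = self.serialize(credentials)
        if self.ttl_seconds:
            self.redis.set(self.key, payload, ex=self.ttl_seconds)
        else:
            self.redis.set(self.key, payload)

    def clear(self) -> None:
        # Deleting a missing key is a no-op, so repeated calls are safe.
        self.redis.delete(self.key)


client = FakeRedis()
store: SketchRedisStore[dict] = SketchRedisStore(
    client,
    "mail_intake:demo:credentials",  # caller-chosen namespaced key
    serialize=lambda creds: json.dumps(creds).encode("utf-8"),
    deserialize=lambda raw: json.loads(raw.decode("utf-8")),
    ttl_seconds=3600,
)
store.save({"token": "abc"})
print(store.load())
```

Passing `serialize` and `deserialize` as callables keeps the store unaware of the credential type, which matches the contract described in the docstring.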


@@ -1,6 +1,10 @@
""" """
Credential persistence abstractions for Mail Intake. Credential persistence abstractions for Mail Intake.
---
## Summary
This module defines the generic persistence contract used to store and This module defines the generic persistence contract used to store and
retrieve authentication credentials across Mail Intake components. retrieve authentication credentials across Mail Intake components.
@@ -29,13 +33,15 @@ class CredentialStore(ABC, Generic[T]):
Abstract base class defining a generic persistence interface for Abstract base class defining a generic persistence interface for
authentication credentials. authentication credentials.
This interface separates *credential lifecycle management* from Notes:
*credential storage mechanics*. Implementations are responsible **Responsibilities:**
only for persistence concerns, while authentication providers
retain full control over credential creation, validation, refresh,
and revocation logic.
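The store is intentionally agnostic to: - Provide persistent storage while separating credential lifecycle management from storage mechanics
- Keep implementations focused solely on persistence; authentication providers retain control over credential creation, validation, refresh, and revocation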
**Constraints:**
- The store is intentionally agnostic to:
- The concrete credential type being stored - The concrete credential type being stored
- The serialization format used to persist credentials - The serialization format used to persist credentials
- The underlying storage backend or durability guarantees - The underlying storage backend or durability guarantees
@@ -46,16 +52,16 @@ class CredentialStore(ABC, Generic[T]):
""" """
Load previously persisted credentials. Load previously persisted credentials.
Implementations should return ``None`` when no credentials are
present or when stored credentials cannot be successfully
decoded or deserialized.
The store must not attempt to validate, refresh, or otherwise
interpret the returned credentials.
Returns: Returns:
Optional[T]:
An instance of type ``T`` if credentials are available and An instance of type ``T`` if credentials are available and
loadable; otherwise ``None``. loadable; otherwise ``None``.
Notes:
**Guarantees:**
- Implementations should return ``None`` when no credentials are present or when stored credentials cannot be successfully decoded or deserialized
- The store must not attempt to validate, refresh, or otherwise interpret the returned credentials
""" """
@abstractmethod @abstractmethod
@@ -63,18 +69,20 @@ class CredentialStore(ABC, Generic[T]):
""" """
Persist credentials to the underlying storage backend. Persist credentials to the underlying storage backend.
This method is invoked when credentials are newly obtained or Args:
have been refreshed and are known to be valid at the time of credentials (T):
persistence. The credential object to persist.
Notes:
**Lifecycle:**
- This method is invoked when credentials are newly obtained or have been refreshed and are known to be valid at the time of persistence
**Responsibilities:**
Implementations are responsible for:
- Ensuring durability appropriate to the deployment context - Ensuring durability appropriate to the deployment context
- Applying encryption or access controls where required - Applying encryption or access controls where required
- Overwriting any previously stored credentials - Overwriting any previously stored credentials
Args:
credentials:
The credential object to persist.
""" """
@abstractmethod @abstractmethod
@@ -82,9 +90,13 @@ class CredentialStore(ABC, Generic[T]):
""" """
Remove any persisted credentials from the store. Remove any persisted credentials from the store.
This method is called when credentials are known to be invalid, Notes:
revoked, corrupted, or otherwise unusable, and must ensure that **Lifecycle:**
no stale authentication material remains accessible.
Implementations should treat this operation as idempotent. - This method is called when credentials are known to be invalid, revoked, corrupted, or otherwise unusable
- Implementations must ensure that no stale authentication material remains accessible
**Guarantees:**
- Implementations should treat this operation as idempotent
""" """

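
The `CredentialStore` contract can be illustrated with a small abstract base class and one in-memory implementation. This is a sketch under stated assumptions: the names `CredentialStoreSketch` and `InMemoryStore` and the method names `load`, `save`, and `clear` are hypothetical, chosen only to mirror the three documented operations.

```python
from abc import ABC, abstractmethod
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class CredentialStoreSketch(ABC, Generic[T]):
    """Illustrative contract: persistence only, no validation or refresh."""

    @abstractmethod
    def load(self) -> Optional[T]:
        """Return stored credentials, or None when nothing is loadable."""

    @abstractmethod
    def save(self, credentials: T) -> None:
        """Persist credentials, overwriting any previous value."""

    @abstractmethod
    def clear(self) -> None:
        """Remove stored credentials; must be safe to call repeatedly."""


class InMemoryStore(CredentialStoreSketch[T]):
    """Hypothetical backend that keeps credentials in a single attribute."""

    def __init__(self) -> None:
        self._value: Optional[T] = None

    def load(self) -> Optional[T]:
        return self._value

    def save(self, credentials: T) -> None:
        self._value = credentials  # overwrite, never merge

    def clear(self) -> None:
        self._value = None  # idempotent by construction


memory_store: InMemoryStore[str] = InMemoryStore()
memory_store.save("first-token")
memory_store.save("second-token")
print(memory_store.load())
```

An authentication provider would hold such a store and decide on its own when to call each method; the store never inspects what it holds.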

@@ -1,6 +1,10 @@
""" """
Exception hierarchy for Mail Intake. Exception hierarchy for Mail Intake.
---
## Summary
This module defines the **canonical exception types** used throughout the This module defines the **canonical exception types** used throughout the
Mail Intake library. Mail Intake library.
@@ -14,11 +18,12 @@ class MailIntakeError(Exception):
""" """
Base exception for all Mail Intake errors. Base exception for all Mail Intake errors.
This is the root of the Mail Intake exception hierarchy. Notes:
All errors raised by the library must derive from this class. **Guarantees:**
Consumers should generally catch this type when handling - This is the root of the Mail Intake exception hierarchy
library-level failures. - All errors raised by the library must derive from this class
- Consumers should generally catch this type when handling library-level failures
""" """
@@ -26,8 +31,10 @@ class MailIntakeAuthError(MailIntakeError):
""" """
Authentication and credential-related failures. Authentication and credential-related failures.
Raised when authentication providers are unable to acquire, Notes:
refresh, or persist valid credentials. **Lifecycle:**
- Raised when authentication providers are unable to acquire, refresh, or persist valid credentials
""" """
@@ -35,8 +42,10 @@ class MailIntakeAdapterError(MailIntakeError):
""" """
Errors raised by mail provider adapters. Errors raised by mail provider adapters.
Raised when a provider adapter encounters API errors, Notes:
transport failures, or invalid provider responses. **Lifecycle:**
- Raised when a provider adapter encounters API errors, transport failures, or invalid provider responses
""" """
@@ -44,6 +53,8 @@ class MailIntakeParsingError(MailIntakeError):
""" """
Errors encountered while parsing message content. Errors encountered while parsing message content.
Raised when raw provider payloads cannot be interpreted Notes:
or normalized into internal domain models. **Lifecycle:**
- Raised when raw provider payloads cannot be interpreted or normalized into internal domain models
""" """

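
The exception hierarchy above lets callers handle a specific failure first and fall back to the base type. The class names below come from the diff; the bodies are empty stand-ins, and `describe_failure` is a hypothetical helper added only for this sketch.

```python
from typing import Callable


class MailIntakeError(Exception):
    """Stand-in for the library's base exception."""


class MailIntakeAuthError(MailIntakeError):
    """Stand-in for authentication and credential failures."""


class MailIntakeAdapterError(MailIntakeError):
    """Stand-in for provider adapter failures."""


class MailIntakeParsingError(MailIntakeError):
    """Stand-in for message parsing failures."""


def describe_failure(action: Callable[[], None]) -> str:
    """Run an action and report which layer failed (hypothetical helper)."""
    try:
        action()
    except MailIntakeAuthError as exc:
        # Most specific handler first: credentials may need to be re-acquired.
        return f"auth: {exc}"
    except MailIntakeError as exc:
        # Any other library-level failure lands here.
        return f"library: {exc}"
    return "ok"


def expired_token() -> None:
    raise MailIntakeAuthError("token expired")


print(describe_failure(expired_token))
```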

@@ -1,6 +1,10 @@
""" """
Mail ingestion orchestration for Mail Intake. Mail ingestion orchestration for Mail Intake.
---
## Summary
This package contains **high-level ingestion components** responsible for This package contains **high-level ingestion components** responsible for
coordinating mail retrieval, parsing, normalization, and model construction. coordinating mail retrieval, parsing, normalization, and model construction.
@@ -15,6 +19,14 @@ Components in this package:
Consumers are expected to construct a mail adapter and pass it to the Consumers are expected to construct a mail adapter and pass it to the
ingestion layer to begin processing messages and threads. ingestion layer to begin processing messages and threads.
---
## Public API
MailIntakeReader
---
""" """
from .reader import MailIntakeReader from .reader import MailIntakeReader


@@ -1,6 +1,10 @@
""" """
High-level mail ingestion orchestration for Mail Intake. High-level mail ingestion orchestration for Mail Intake.
---
## Summary
This module provides the primary, provider-agnostic entry point for This module provides the primary, provider-agnostic entry point for
reading and processing mail data. reading and processing mail data.
@@ -29,19 +33,15 @@ class MailIntakeReader:
""" """
High-level read-only ingestion interface. High-level read-only ingestion interface.
This class is the **primary entry point** for consumers of the Mail Notes:
Intake library. **Responsibilities:**
It orchestrates the full ingestion pipeline: - This class is the primary entry point for consumers of the Mail Intake library
- Querying the adapter for message references - It orchestrates the full ingestion pipeline: querying the adapter for message references, fetching raw provider messages, parsing and normalizing message data, and constructing domain models
- Fetching raw provider messages
- Parsing and normalizing message data
- Constructing domain models
This class is intentionally: **Constraints:**
- Provider-agnostic
- Stateless beyond iteration scope - This class is intentionally provider-agnostic, stateless beyond iteration scope, and read-only
- Read-only
""" """
def __init__(self, adapter: MailIntakeAdapter): def __init__(self, adapter: MailIntakeAdapter):
@@ -49,8 +49,8 @@ class MailIntakeReader:
Initialize the mail reader. Initialize the mail reader.
Args: Args:
adapter: Mail adapter implementation used to retrieve raw adapter (MailIntakeAdapter):
messages and threads from a mail provider. Mail adapter implementation used to retrieve raw messages and threads from a mail provider.
""" """
self._adapter = adapter self._adapter = adapter
@@ -59,13 +59,16 @@ class MailIntakeReader:
Iterate over parsed messages matching a provider query. Iterate over parsed messages matching a provider query.
Args: Args:
query: Provider-specific query string used to filter messages. query (str):
Provider-specific query string used to filter messages.
Yields: Yields:
MailIntakeMessage:
Fully parsed and normalized `MailIntakeMessage` instances. Fully parsed and normalized `MailIntakeMessage` instances.
Raises: Raises:
MailIntakeParsingError: If a message cannot be parsed. MailIntakeParsingError:
If a message cannot be parsed.
""" """
for ref in self._adapter.iter_message_refs(query): for ref in self._adapter.iter_message_refs(query):
raw = self._adapter.fetch_message(ref["message_id"]) raw = self._adapter.fetch_message(ref["message_id"])
@@ -75,17 +78,22 @@ class MailIntakeReader:
""" """
Iterate over threads constructed from messages matching a query. Iterate over threads constructed from messages matching a query.
Messages are grouped by `thread_id` and yielded as complete thread
objects containing all associated messages.
Args: Args:
query: Provider-specific query string used to filter messages. query (str):
Provider-specific query string used to filter messages.
Returns: Yields:
MailIntakeThread:
An iterator of `MailIntakeThread` instances. Complete `MailIntakeThread` instances, one per `thread_id`.
Raises: Raises:
MailIntakeParsingError: If a message cannot be parsed. MailIntakeParsingError:
If a message cannot be parsed.
Notes:
**Guarantees:**
- Messages are grouped by `thread_id` and yielded as complete thread objects containing all associated messages
""" """
threads: Dict[str, MailIntakeThread] = {} threads: Dict[str, MailIntakeThread] = {}
@@ -110,14 +118,16 @@ class MailIntakeReader:
Parse a raw provider message into a `MailIntakeMessage`. Parse a raw provider message into a `MailIntakeMessage`.
Args: Args:
raw_message: Provider-native message payload. raw_message (Dict[str, Any]):
Provider-native message payload.
Returns: Returns:
MailIntakeMessage:
A fully populated `MailIntakeMessage` instance. A fully populated `MailIntakeMessage` instance.
Raises: Raises:
MailIntakeParsingError: If the message payload is missing required MailIntakeParsingError:
fields or cannot be parsed. If the message payload is missing required fields or cannot be parsed.
""" """
try: try:
message_id = raw_message["id"] message_id = raw_message["id"]
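
The pipeline described in the reader docstrings (query for references, fetch, parse, then group by thread) can be sketched with a stub adapter. Only `iter_message_refs`, `fetch_message`, and the `"message_id"` and `"id"` keys appear in the diff; `StubAdapter`, `ParsingErrorSketch`, the `"threadId"` key, and the use of plain dicts in place of the domain models are assumptions for illustration.

```python
from typing import Any, Dict, Iterator, List


class ParsingErrorSketch(Exception):
    """Stand-in for MailIntakeParsingError."""


class StubAdapter:
    """Hypothetical adapter serving canned payloads; it ignores the query."""

    def __init__(self, payloads: Dict[str, Dict[str, Any]]) -> None:
        self._payloads = payloads

    def iter_message_refs(self, query: str) -> Iterator[Dict[str, str]]:
        for message_id in self._payloads:
            yield {"message_id": message_id}

    def fetch_message(self, message_id: str) -> Dict[str, Any]:
        return self._payloads[message_id]


def parse_message(raw: Dict[str, Any]) -> Dict[str, str]:
    """Turn a raw payload into a minimal parsed record."""
    try:
        return {"message_id": raw["id"], "thread_id": raw["threadId"]}
    except KeyError as exc:
        raise ParsingErrorSketch(f"missing required field {exc}") from exc


def iter_messages(adapter: StubAdapter, query: str) -> Iterator[Dict[str, str]]:
    """Lazily fetch and parse every message the adapter reports."""
    for ref in adapter.iter_message_refs(query):
        yield parse_message(adapter.fetch_message(ref["message_id"]))


def iter_threads(adapter: StubAdapter, query: str) -> Iterator[List[Dict[str, str]]]:
    """Collect all messages first, then yield one complete group per thread."""
    threads: Dict[str, List[Dict[str, str]]] = {}
    for message in iter_messages(adapter, query):
        threads.setdefault(message["thread_id"], []).append(message)
    yield from threads.values()


adapter = StubAdapter(
    {
        "m1": {"id": "m1", "threadId": "t1"},
        "m2": {"id": "m2", "threadId": "t2"},
        "m3": {"id": "m3", "threadId": "t1"},
    }
)
thread_sizes = [len(group) for group in iter_threads(adapter, "subject:Interview")]
print(thread_sizes)
```

Because grouping needs every matching message, this sketch only starts yielding threads after the underlying message iterator is exhausted, whereas `iter_messages` yields one message at a time.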

View File

@@ -1,6 +1,10 @@
""" """
Domain models for Mail Intake. Domain models for Mail Intake.
---
## Summary
This package defines the **canonical, provider-agnostic data models** This package defines the **canonical, provider-agnostic data models**
used throughout the Mail Intake ingestion pipeline. used throughout the Mail Intake ingestion pipeline.
@@ -11,6 +15,15 @@ Models in this package:
- Serve as stable inputs for downstream processing and analysis - Serve as stable inputs for downstream processing and analysis
These models form the core internal data contract of the library. These models form the core internal data contract of the library.
---
## Public API
MailIntakeMessage
MailIntakeThread
---
""" """
from .message import MailIntakeMessage from .message import MailIntakeMessage


@@ -1,6 +1,10 @@
""" """
Message domain models for Mail Intake. Message domain models for Mail Intake.
---
## Summary
This module defines the **canonical, provider-agnostic representation** This module defines the **canonical, provider-agnostic representation**
of an individual email message as used internally by the Mail Intake of an individual email message as used internally by the Mail Intake
ingestion pipeline. ingestion pipeline.
@@ -19,37 +23,58 @@ class MailIntakeMessage:
""" """
Canonical internal representation of a single email message. Canonical internal representation of a single email message.
This model represents a fully parsed and normalized email message. Notes:
It is intentionally provider-agnostic and suitable for persistence, **Guarantees:**
indexing, and downstream processing.
No provider-specific identifiers, payloads, or API semantics - This model represents a fully parsed and normalized email message
should appear in this model. - It is intentionally provider-agnostic and suitable for persistence, indexing, and downstream processing
**Constraints:**
- No provider-specific identifiers, payloads, or API semantics should appear in this model
""" """
message_id: str message_id: str
"""Provider-specific message identifier.""" """
Provider-specific message identifier.
"""
thread_id: str thread_id: str
"""Provider-specific thread identifier to which this message belongs.""" """
Provider-specific thread identifier to which this message belongs.
"""
timestamp: datetime timestamp: datetime
"""Message timestamp as a timezone-naive UTC datetime.""" """
Message timestamp as a timezone-naive UTC datetime.
"""
from_email: str from_email: str
"""Sender email address.""" """
Sender email address.
"""
from_name: Optional[str] from_name: Optional[str]
"""Optional human-readable sender name.""" """
Optional human-readable sender name.
"""
subject: str subject: str
"""Raw subject line of the message.""" """
Raw subject line of the message.
"""
body_text: str body_text: str
"""Extracted plain-text body content of the message.""" """
Extracted plain-text body content of the message.
"""
snippet: str snippet: str
"""Short provider-supplied preview snippet of the message.""" """
Short provider-supplied preview snippet of the message.
"""
raw_headers: Dict[str, str] raw_headers: Dict[str, str]
"""Normalized mapping of message headers (header name → value).""" """
Normalized mapping of message headers (header name → value).
"""


@@ -1,6 +1,10 @@
""" """
Thread domain models for Mail Intake. Thread domain models for Mail Intake.
---
## Summary
This module defines the **canonical, provider-agnostic representation** This module defines the **canonical, provider-agnostic representation**
of an email thread as used internally by the Mail Intake ingestion pipeline. of an email thread as used internally by the Mail Intake ingestion pipeline.
@@ -20,40 +24,53 @@ class MailIntakeThread:
""" """
Canonical internal representation of an email thread. Canonical internal representation of an email thread.
A thread groups multiple related messages under a single subject Notes:
and participant set. It is designed to support reasoning over **Guarantees:**
conversational context such as job applications, interviews,
follow-ups, and ongoing discussions.
This model is provider-agnostic and safe to persist. - A thread groups multiple related messages under a single subject and participant set
- It is designed to support reasoning over conversational context such as job applications, interviews, follow-ups, and ongoing discussions
- This model is provider-agnostic and safe to persist
""" """
thread_id: str thread_id: str
"""Provider-specific thread identifier.""" """
Provider-specific thread identifier.
"""
normalized_subject: str normalized_subject: str
"""Normalized subject line used to group related messages.""" """
Normalized subject line used to group related messages.
"""
participants: Set[str] = field(default_factory=set) participants: Set[str] = field(default_factory=set)
"""Set of unique participant email addresses observed in the thread.""" """
Set of unique participant email addresses observed in the thread.
"""
messages: List[MailIntakeMessage] = field(default_factory=list) messages: List[MailIntakeMessage] = field(default_factory=list)
"""Ordered list of messages belonging to this thread.""" """
Ordered list of messages belonging to this thread.
"""
last_activity_at: datetime | None = None last_activity_at: datetime | None = None
"""Timestamp of the most recent message in the thread.""" """
Timestamp of the most recent message in the thread.
"""
def add_message(self, message: MailIntakeMessage) -> None: def add_message(self, message: MailIntakeMessage) -> None:
""" """
Add a message to the thread and update derived fields. Add a message to the thread and update derived fields.
This method: Args:
message (MailIntakeMessage):
Parsed mail message to add to the thread.
Notes:
**Responsibilities:**
- Appends the message to the thread - Appends the message to the thread
- Tracks unique participants - Tracks unique participants
- Updates the last activity timestamp - Updates the last activity timestamp
Args:
message: Parsed mail message to add to the thread.
""" """
self.messages.append(message) self.messages.append(message)
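
The three responsibilities listed for `add_message` (append, track participants, update the last activity timestamp) might look like the sketch below. The field names follow the diff, but `MessageSketch` and `ThreadSketch` are reduced stand-ins, and everything after the `append` call is an assumption about how the derived fields could be maintained.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class MessageSketch:
    """Reduced stand-in for MailIntakeMessage with only the fields used here."""

    from_email: str
    timestamp: datetime  # timezone-naive UTC, as in the documented model


@dataclass
class ThreadSketch:
    """Reduced stand-in for MailIntakeThread."""

    thread_id: str
    normalized_subject: str
    participants: Set[str] = field(default_factory=set)
    messages: List[MessageSketch] = field(default_factory=list)
    last_activity_at: Optional[datetime] = None

    def add_message(self, message: MessageSketch) -> None:
        self.messages.append(message)
        # Assumed: only non-empty sender addresses count as participants.
        if message.from_email:
            self.participants.add(message.from_email)
        # Assumed: keep the latest timestamp even if messages arrive out of order.
        if self.last_activity_at is None or message.timestamp > self.last_activity_at:
            self.last_activity_at = message.timestamp


thread = ThreadSketch(thread_id="t1", normalized_subject="Interview schedule")
thread.add_message(MessageSketch("recruiter@example.com", datetime(2026, 3, 2, 9, 0)))
thread.add_message(MessageSketch("candidate@example.com", datetime(2026, 3, 1, 18, 30)))
thread.add_message(MessageSketch("recruiter@example.com", datetime(2026, 3, 3, 8, 15)))
print(thread.last_activity_at, sorted(thread.participants))
```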


@@ -1,6 +1,10 @@
""" """
Message parsing utilities for Mail Intake. Message parsing utilities for Mail Intake.
---
## Summary
This package contains **provider-aware but adapter-agnostic parsing helpers** This package contains **provider-aware but adapter-agnostic parsing helpers**
used to extract and normalize structured information from raw mail payloads. used to extract and normalize structured information from raw mail payloads.
@@ -16,6 +20,17 @@ This package does not:
Parsing functions are designed to be composable and are orchestrated by the Parsing functions are designed to be composable and are orchestrated by the
ingestion layer. ingestion layer.
---
## Public API
extract_body
parse_headers
extract_sender
normalize_subject
---
""" """
from .body import extract_body from .body import extract_body


@@ -1,6 +1,10 @@
""" """
Message header parsing utilities for Mail Intake. Message header parsing utilities for Mail Intake.
---
## Summary
This module provides helper functions for normalizing and extracting This module provides helper functions for normalizing and extracting
useful information from provider-native message headers. useful information from provider-native message headers.
@@ -15,18 +19,23 @@ def parse_headers(raw_headers: List[Dict[str, str]]) -> Dict[str, str]:
""" """
Convert a list of Gmail-style headers into a normalized dict. Convert a list of Gmail-style headers into a normalized dict.
Provider payloads (such as Gmail) typically represent headers as a list
of name/value mappings. This function normalizes them into a
case-insensitive dictionary keyed by lowercase header names.
Args: Args:
raw_headers: List of header dictionaries, each containing raw_headers (List[Dict[str, str]]):
``name`` and ``value`` keys. List of header dictionaries, each containing ``name`` and ``value`` keys.
Returns: Returns:
Dict[str, str]:
Dictionary mapping lowercase header names to stripped values. Dictionary mapping lowercase header names to stripped values.
Notes:
**Guarantees:**
- Provider payloads (such as Gmail) typically represent headers as a list of name/value mappings
- This function normalizes them into a case-insensitive dictionary keyed by lowercase header names
Example: Example:
Typical usage:
Input: Input:
[ [
{"name": "From", "value": "John Doe <john@example.com>"}, {"name": "From", "value": "John Doe <john@example.com>"},
@@ -57,22 +66,24 @@ def extract_sender(headers: Dict[str, str]) -> Tuple[str, Optional[str]]:
""" """
Extract sender email and optional display name from headers. Extract sender email and optional display name from headers.
This function parses the ``From`` header and attempts to extract:
- Sender email address
- Optional human-readable display name
Args: Args:
headers: Normalized header dictionary as returned by headers (Dict[str, str]):
:func:`parse_headers`. Normalized header dictionary as returned by :func:`parse_headers`.
Returns: Returns:
A tuple ``(email, name)`` where: Tuple[str, Optional[str]]:
- ``email`` is the sender email address A tuple ``(email, name)`` where ``email`` is the sender email address and ``name`` is the display name, or ``None`` if unavailable.
- ``name`` is the display name, or ``None`` if unavailable
Examples: Notes:
``"John Doe <john@example.com>"`` → ``("john@example.com", "John Doe")`` **Responsibilities:**
``"john@example.com"`` → ``("john@example.com", None)``
- This function parses the ``From`` header and attempts to extract the sender email address and an optional human-readable display name
Example:
Typical values:
``"John Doe <john@example.com>"`` -> ``("john@example.com", "John Doe")``
``"john@example.com"`` -> ``("john@example.com", None)``
""" """
from_header = headers.get("from") from_header = headers.get("from")
if not from_header: if not from_header:
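
The header normalization and sender extraction documented above can be approximated with the standard library. This sketch swaps in `email.utils.parseaddr` for whatever parsing the module actually performs; the `_sketch` function names are hypothetical, and returning an empty address when the ``From`` header is missing is an assumption.

```python
from email.utils import parseaddr
from typing import Dict, List, Optional, Tuple


def parse_headers_sketch(raw_headers: List[Dict[str, str]]) -> Dict[str, str]:
    """Map lowercase header names to stripped values."""
    return {h["name"].lower(): h["value"].strip() for h in raw_headers}


def extract_sender_sketch(headers: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (email, name), with name set to None when no display name exists."""
    from_header = headers.get("from")
    if not from_header:
        return "", None  # assumed fallback; the real behavior is not shown
    name, email = parseaddr(from_header)
    return email, (name or None)


headers = parse_headers_sketch(
    [
        {"name": "From", "value": " John Doe <john@example.com> "},
        {"name": "Subject", "value": "Re: Interview"},
    ]
)
print(extract_sender_sketch(headers))
```

Lowercasing the keys once in `parse_headers_sketch` is what allows the later lookup to use a plain `headers.get("from")` regardless of how the provider capitalized the header name.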


@@ -1,6 +1,10 @@
""" """
Subject line normalization utilities for Mail Intake. Subject line normalization utilities for Mail Intake.
---
## Summary
This module provides helper functions for normalizing email subject lines This module provides helper functions for normalizing email subject lines
to enable reliable thread-level comparison and grouping. to enable reliable thread-level comparison and grouping.
@@ -12,27 +16,34 @@ import re
_PREFIX_RE = re.compile(r"^(re|fw|fwd)\s*:\s*", re.IGNORECASE) _PREFIX_RE = re.compile(r"^(re|fw|fwd)\s*:\s*", re.IGNORECASE)
"""Regular expression matching common reply/forward subject prefixes.""" """
Regular expression matching common reply/forward subject prefixes.
"""
def normalize_subject(subject: str) -> str: def normalize_subject(subject: str) -> str:
""" """
Normalize an email subject for thread-level comparison. Normalize an email subject for thread-level comparison.
Operations: Args:
subject (str):
Raw subject line from a message header.
Returns:
str:
Normalized subject string suitable for thread grouping.
Notes:
**Responsibilities:**
- Strips common prefixes such as ``Re:``, ``Fwd:``, and ``FW:`` - Strips common prefixes such as ``Re:``, ``Fwd:``, and ``FW:``
- Repeats prefix stripping to handle stacked prefixes - Repeats prefix stripping to handle stacked prefixes
- Collapses excessive whitespace - Collapses excessive whitespace
- Preserves original casing (no lowercasing) - Preserves original casing (no lowercasing)
This function is intentionally conservative and avoids aggressive **Guarantees:**
transformations that could alter the semantic meaning of the subject.
Args: - This function is intentionally conservative and avoids aggressive transformations that could alter the semantic meaning of the subject
subject: Raw subject line from a message header.
Returns:
Normalized subject string suitable for thread grouping.
""" """
if not subject: if not subject:
return "" return ""
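
The normalization steps listed in the docstring (strip prefixes, repeat for stacked prefixes, collapse whitespace, keep casing) can be sketched as follows. The regular expression is the one shown in the diff; the loop and the whitespace handling are assumptions about how the remaining steps could be written, and `normalize_subject_sketch` is a hypothetical name.

```python
import re

# Same pattern as the module-level constant shown in the diff.
_PREFIX_RE = re.compile(r"^(re|fw|fwd)\s*:\s*", re.IGNORECASE)


def normalize_subject_sketch(subject: str) -> str:
    """Strip reply/forward prefixes and collapse whitespace, preserving case."""
    if not subject:
        return ""
    text = subject.strip()
    # Strip one prefix at a time until nothing changes, so stacked
    # prefixes such as "Re: Fwd: Re:" are all removed.
    while True:
        stripped = _PREFIX_RE.sub("", text, count=1)
        if stripped == text:
            break
        text = stripped
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


print(normalize_subject_sketch("Re: FWD:  Interview   schedule"))
```

Keeping the original casing means two subjects that differ only in capitalization stay distinct, which is the conservative choice the docstring describes.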