67a3074ab4
using doc-forge (#1)
Reviewed-on: #1
Co-authored-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>
Co-committed-by: Vishesh 'ironeagle' Bangotra <aetoskia@gmail.com>
2026-01-22 11:27:56 +00:00
7f1b0d9c10
docs: add contract-oriented docstrings across core, html, and pdf layers
- docs(core): document Content and ContentType canonical models
- docs(core): define BaseParser contract and parsing semantics
- docs(core): define BaseScraper contract and acquisition semantics
- docs(html): document HTML package purpose and scope
- docs(html): add HTMLParser base with DOM helpers and contracts
- docs(html): add HTTP-based HTMLScraper with content-type enforcement
- docs(pdf): document PDF package structure and public pipeline
- docs(pdf): add BasePDFClient abstraction and filesystem implementation
- docs(pdf): add PDFParser base contract for binary parsing
- docs(pdf): add PDFScraper coordinating client and Content normalization
- docs(api): expand top-level omniread module with install instructions and examples
2026-01-09 15:51:22 +05:30
de67c7b0b1
feat(pdf): add PDF client, scraper, parser, and end-to-end tests
- Introduce PDF submodule with client, scraper, and generic parser
- Add filesystem PDF client and test-only mock routing
- Add end-to-end PDF scrape → parse tests with typed output
- Mirror HTML module architecture for consistency
- Expose PDF primitives via omniread public API
2026-01-02 18:59:36 +05:30
358abc9b36
feat(api): expose core and html primitives via top-level package exports
- Re-export Content and ContentType from omniread.core
- Re-export HTMLScraper and HTMLParser from omniread.html
- Define explicit __all__ for stable public API surface
2026-01-02 18:36:29 +05:30
32ee43e77a
omniread basic modules
2025-12-31 14:28:50 +05:30