pikuri-pdf
v0.0.6 · RubyGems · Ruby

pikuri-pdf plugs PDF → text extraction into pikuri-core's +Pikuri::Extractor+ registry. The bundled +Pikuri::Extractors::PDF+ extractor wraps the pure-Ruby pdf-reader gem and extracts lazily: paged reads (the +read+ tool's windows) parse only the pages the window needs, so the first page of a 500-page PDF never pays for the other 499. Shipped separately from pikuri-core so the core's dependency tree stays minimal and auditable: pdf-reader and its transitive deps (Ascii85, afm, hashery, ruby-rc4, ttfunk) ride along only for hosts that opt into PDF support. Registration is explicit, via +Pikuri::Extractors::PDF.register+, so requiring the gem changes nothing by itself; the host script picks which extractors it wires in. One registration extends the +read+ tool, +web_scrape+, and the pikuri-vectordb indexer simultaneously.
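
The two behaviours described above, explicit registration and windowed lazy extraction, can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's implementation: +ToyRegistry+ and +ToyPagedExtractor+ are hypothetical stand-ins, page "parsing" is simulated with +String#upcase+, and the only real entry point named by this gem is +Pikuri::Extractors::PDF.register+.

```ruby
# Illustrative sketch only: ToyRegistry and ToyPagedExtractor are
# hypothetical names, not part of pikuri-core or pikuri-pdf.

# A registry that maps file extensions to extractor objects.
class ToyRegistry
  def initialize
    @extractors = {}
  end

  def register(extension, extractor)
    @extractors[extension] = extractor
  end

  def registered?(extension)
    @extractors.key?(extension)
  end

  def extractor_for(extension)
    @extractors.fetch(extension)
  end
end

# An extractor that parses pages on demand. Each "page" is a raw string and
# parsing is simulated with String#upcase so the cost is easy to count.
class ToyPagedExtractor
  attr_reader :parse_count

  def initialize(raw_pages)
    @raw_pages = raw_pages
    @parse_count = 0
  end

  # Explicit opt-in: nothing is registered until the host calls this.
  def register(registry)
    registry.register(".pdf", self)
  end

  # Returns text for `count` pages starting at zero-based index `first`,
  # parsing only the pages inside that window.
  def read(first:, count:)
    window = @raw_pages[first, count] || []
    window.map { |raw| parse(raw) }
  end

  private

  def parse(raw)
    @parse_count += 1
    raw.upcase
  end
end

registry = ToyRegistry.new
extractor = ToyPagedExtractor.new(Array.new(500) { |i| "page #{i + 1}" })

registry.registered?(".pdf")   # false: creating the extractor registers nothing
extractor.register(registry)   # the host opts in explicitly

first_page = registry.extractor_for(".pdf").read(first: 0, count: 1)
puts first_page.first          # PAGE 1
puts extractor.parse_count     # 1
```

In this sketch, reading a one-page window from a 500-page document triggers a single parse, and the registry stays empty until the host calls +register+, which mirrors the opt-in design the description outlines.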
- 0.0.6 · this month