Calculate the simhash value for a list of tokens
A TypeScript implementation of the Simhash algorithm for near-duplicate detection
Command-line tool that compares two text files using simhash
A JavaScript implementation of Charikar's hash for identifying similar documents.
Simhash implementation for detecting near-duplicate text using various hash functions like SipHash, MD5, and SHA256
SimHash implementation for detecting near-duplicate text using SipHash-2- function
Multi-layer text deduplication using SimHash, N-gram containment, and sentence-sequence LCS for Tiptap/ProseMirror documents
Simhash implementation for detecting near-duplicate text using various hash functions like SipHash, MD5, and SHA256
JavaScript implementation of the `simhash` algorithm, which Google uses for near-duplicate detection across massive collections of web pages
Command-line tool that compares two text files using simhash
Vocabulary-based SimHash implementation for similarity detection
64-bit SimHash for text deduplication with Hamming distance calculations
SimHash text clustering with OutRank outlier removal and Variation of Information analysis.
64-bit SimHash for text deduplication with Hamming distance calculations (WASM)
SimHash fingerprinting for fuzzy text deduplication - native C++ for Node.js with prebuilt binaries
Fast offline document search MCP server — trigram index, SimHash, zero dependencies
Complete string distance and similarity algorithms package with WebAssembly and JavaScript implementations
A TypeScript npm package for finding near duplicate string pairs
Information-theoretic context optimization MCP server with deterministic hallucination detection and suppression for AI coding agents.
Embedding Locality IDentifier - encode embeddings into sortable string IDs for vector search without vector stores, plus fast string similarity algorithms
A powerful toolkit for data structures and algorithms in TypeScript, designed for optimal performance and versatility. The toolkit provides implementations of various data structures and algorithms, with a focus on search and sort operations, caching, and
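A powerful, highly customizable echo cave plugin. Supports rich media types, duplicate-content detection, AI analysis, manual review, user nicknames, data migration, and dual local/S3 file storage backends.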
ELID with image embedding models (MobileNetV3-Small)
Detect reusable/duplicate React Native code (components, hooks, styles, utils) and suggest refactors. Ships as a CLI + Node API.
Simhash algorithm for 64-bit locality-sensitive hashing and similarity calculation
Text fingerprinting: MinHash + LSH, SimHash, and ONNX semantic embeddings
Locality Sensitive Hashing for Angular Similarity
Web-oriented 64-bit SimHash with weighted features and Google-paper-style lookup tables
Embedding Locality IDentifier - encode embeddings into sortable string IDs for vector search without vector stores, plus fast string similarity algorithms
Grep your documents with context — fast offline search for PDFs, DOCX, Markdown and code
Content deduplication utilities for web crawling
Sketching primitives for IR: minhash/simhash/LSH-style signatures.
Fast document indexer for finding duplicates and searching content
Produces locality-sensitive hashes (LSH) for Hamming-space bags of features
Content deduplication via simhash and LRU cache
Common types and utilities for the Argus web crawler
Implementation of Charikar simhashes in Ruby
Implementation of Charikar simhashes in Ruby
Implementation of Charikar simhashes in Ruby
Implementation of Charikar simhashes in Ruby
A rewrite of the 'simhash' gem, an implementation of Moses Charikar's simhashes in Ruby.
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.