About the MurmurHash Generator
The MurmurHash Generator computes MurmurHash values for any input string. MurmurHash is a fast, non-cryptographic hash function designed for high-throughput hash table lookups, data partitioning, consistent hashing, and fingerprinting. It is not suitable for password storage or security-sensitive applications, but excels in performance-critical systems that need fast, well-distributed integer hash values.
How to Use
- Enter the input string to hash.
- Select the MurmurHash variant: MurmurHash2 (32-bit or 64-bit) or MurmurHash3 (32-bit or 128-bit x86/x64).
- Optionally set a seed value. The same input with different seeds produces different hash values — useful for building multiple independent hash functions from one algorithm.
- Click Generate. The output is the hash value in decimal and hexadecimal.
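The steps above can be reproduced offline with a short pure-Python sketch of the 32-bit MurmurHash3 variant (x86_32). It is illustrative rather than a statement of how this tool is implemented: the function name murmur3_32 and the choice of UTF-8 for turning the input string into bytes are assumptions, and a production system would normally use a tested library instead.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Sketch of MurmurHash3 x86_32; returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n_blocks = len(data) // 4

    # Body: mix each 4-byte block, always read as little-endian so the
    # result does not depend on the host's byte order.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: mix the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length, then force avalanche.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


# Same input, two seeds: print each result in decimal and hexadecimal.
text = "hello"  # hypothetical input string
for seed in (0, 42):
    value = murmur3_32(text.encode("utf-8"), seed)
    print(f"seed={seed}: {value} (0x{value:08x})")
```

Changing only the seed changes the output, which is what makes one algorithm usable as several hash functions.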
MurmurHash Variants
- MurmurHash2 (32-bit) — The original widely-deployed variant. Fast on 32-bit platforms; used historically in Redis (for its internal hash tables), in Hadoop, and in many early distributed systems.
- MurmurHash2 (64-bit) — Extended to 64 bits for larger hash spaces and better avalanche characteristics on 64-bit hardware.
- MurmurHash3 (32-bit) — The current standard for 32-bit output. Better avalanche effect and distribution than MurmurHash2.
- MurmurHash3 (128-bit x86) — 128-bit output optimised for 32-bit (x86) processors. Used when a 128-bit fingerprint is needed and 32-bit platform compatibility matters.
- MurmurHash3 (128-bit x64) — 128-bit output optimised for 64-bit processors. The fastest variant for large inputs on modern servers. Used in Apache Cassandra, ClickHouse, and Elasticsearch.
MurmurHash vs Other Hash Functions
- MurmurHash3 vs MD5/SHA-1 — MurmurHash3 is roughly 5–10× faster than MD5 for short strings, but it has no security properties: it offers no pre-image resistance, and colliding inputs can be constructed deliberately. Do not use it where collision resistance under adversarial input matters (hash flooding, digital signatures). Use it where you control the input and need speed.
- MurmurHash3 vs FNV / djb2 — FNV and djb2 are simpler non-cryptographic hash functions. MurmurHash3 has better avalanche characteristics (small input changes produce large, well-distributed output changes) and fewer clustering issues in hash tables.
- MurmurHash3 vs xxHash — XXH3, the newest xxHash variant, is generally faster than MurmurHash3 on modern hardware with SIMD instructions. Both are appropriate for non-security hashing; xxHash is the newer benchmark winner for raw throughput on large data blocks.
- MurmurHash vs CRC32 — CRC32 is a cyclic redundancy check designed for error detection, not hash table distribution. CRC32 has known weaknesses for hash-table use (poor avalanche for short inputs). MurmurHash distributes more uniformly across the integer space.
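To make the avalanche comparison above concrete, here is a minimal sketch of the two simpler hashes, FNV-1a (32-bit) and djb2, together with a helper that counts how many output bits differ between two hashes. The function names are illustrative, and djb2 is truncated to 32 bits here as an assumption so the two are comparable.

```python
MASK32 = 0xFFFFFFFF


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & MASK32  # FNV prime
    return h


def djb2_32(data: bytes) -> int:
    """djb2 (h = h * 33 + byte), truncated to 32 bits."""
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h


def differing_bits(a: int, b: int) -> int:
    """Number of bit positions in which two hash values differ."""
    return bin(a ^ b).count("1")


# "a" and "b" differ in their lowest two bits. A hash with good avalanche
# behaviour would flip about half of its 32 output bits in response.
print(differing_bits(djb2_32(b"a"), djb2_32(b"b")))    # 1: only the lowest bit changes
print(differing_bits(fnv1a_32(b"a"), fnv1a_32(b"b")))
```

With djb2, a change in the last input byte barely propagates into the output, which is the clustering risk the comparison refers to. The finalization step in MurmurHash3 exists to spread such small differences across all output bits.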
Common Use Cases
- Hash tables and dictionaries — Language runtimes and databases use non-cryptographic hashes like MurmurHash for hash table bucket assignment. Speed and distribution quality matter more than collision resistance here.
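- Partitioning and consistent hashing in distributed systems — Cassandra places each row on its token ring using the MurmurHash3 of the partition key, and Kafka's default partitioner takes the MurmurHash2 of the record key modulo the partition count. The same key always maps to the same partition while the layout is unchanged. A token ring limits how many keys move when nodes are added, whereas a plain modulo scheme remaps most keys when the partition count changes.
- Data deduplication and fingerprinting — A fast hash of a data block serves as a fingerprint for deduplication checks. Two identical blocks always produce the same hash, and accidental collisions are unlikely when the hash is wide enough for the data set. Because different blocks can still share a hash, confirm a match with a byte-for-byte comparison before discarding data.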
- Bloom filters — Bloom filters use multiple independent hash functions. MurmurHash3 with different seeds provides multiple independent hash functions from a single implementation.
- Feature hashing (the hashing trick) — Machine learning pipelines use MurmurHash to map categorical feature strings to integer indices for fixed-size feature vectors without a dictionary lookup.
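As an illustration of the seed-per-hash-function idea used by Bloom filters, the sketch below derives k bit positions for each item by calling one seeded hash with seeds 0 to k-1. To stay runnable with only the standard library, the default hash is a stand-in built on hashlib.blake2b, not MurmurHash3. With the mmh3 package installed, something like lambda data, seed: mmh3.hash(data, seed, signed=False) could be passed as hash_fn instead. The class and parameter names are hypothetical.

```python
import hashlib
from typing import Callable, Iterator

SeededHash = Callable[[bytes, int], int]


def stand_in_hash(data: bytes, seed: int) -> int:
    """Stdlib stand-in for a seeded 32-bit hash. This is NOT MurmurHash3."""
    salt = seed.to_bytes(4, "little")
    digest = hashlib.blake2b(data, digest_size=4, salt=salt).digest()
    return int.from_bytes(digest, "little")


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int,
                 hash_fn: SeededHash = stand_in_hash) -> None:
        self.num_bits = num_bits
        self.seeds = range(num_hashes)  # one fixed seed per hash function slot
        self.hash_fn = hash_fn
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Same algorithm, different seed: one bit position per seed.
        for seed in self.seeds:
            yield self.hash_fn(item, seed) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=3)
bf.add(b"alice")
print(b"alice" in bf)  # True: an added item is never reported as missing
print(b"bob" in bf)    # usually False, but false positives are possible
```

The seeds are fixed constants rather than random values, so the filter gives the same answers after a restart or on another node.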
Frequently Asked Questions
- Is MurmurHash secure for passwords or tokens?
- No. MurmurHash is not cryptographically secure. It is fast by design, which makes it trivially brute-forceable for passwords. It has no pre-image resistance or second pre-image resistance. Use Argon2id, bcrypt, or scrypt for passwords, and SHA-256 or SHA-3 for integrity verification of trusted data.
- Why does the same input produce a different hash on different platforms?
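- There are several possible causes. The x86 and x64 variants of MurmurHash3 produce different 128-bit outputs for the same input by design, so they are not interchangeable. The reference implementation reads input blocks in the machine's native byte order, so big-endian and little-endian platforms can disagree unless the library normalises byte order. Language bindings can also differ in how they encode strings to bytes (for example UTF-8 vs UTF-16) and in whether they return signed or unsigned integers. Ensure all nodes in a distributed system use the same variant, the same seed, and the same input encoding. Some language implementations have historically had bugs that produced platform-specific values, so use a well-tested library and pin the version.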
- What seed should I use?
- For deterministic, reproducible hashes (partitioning, fingerprinting), use a fixed seed (e.g., 0 or a constant specific to your application). For bloom filters or consistent hashing, use a different fixed seed for each hash function slot. Do not use a random seed if you need the hash to be stable across restarts or across nodes.
- What is the birthday paradox collision probability for MurmurHash3?
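- Assuming uniformly distributed output, the probability of at least one collision reaches 50% after roughly 1.18 × 2^(b/2) inputs for a b-bit hash. For a 32-bit hash, that is approximately 77,000 inputs. For 64 bits (for example, MurmurHash2 64-bit or half of a 128-bit MurmurHash3 value), the threshold is approximately 5 billion inputs. For 128-bit, collisions are negligible in practice. Choose the bit width based on your expected data set size and acceptable collision probability.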
- Where is MurmurHash used in well-known systems?
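- Apache Cassandra uses MurmurHash3 for token ring partitioning. Elasticsearch uses MurmurHash3 to route documents to shards. ClickHouse exposes MurmurHash2 and MurmurHash3 as built-in functions that can be used for data partitioning. Redis used MurmurHash2 for its internal hash tables before switching to SipHash, and still uses a 64-bit MurmurHash2 in its HyperLogLog implementation; Redis Cluster hash slots are assigned with CRC16, not MurmurHash. Python's hashlib does not include MurmurHash (it only includes cryptographic hashes), but the mmh3 package (pip install mmh3) is a widely used Python binding.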
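The birthday thresholds quoted in the collision question above can be checked with the standard approximation p ≈ 1 − exp(−n(n−1) / 2^(b+1)) for n uniformly random b-bit values. The sketch below assumes ideal, uniformly distributed hash output, which a real hash function only approximates; the function names are illustrative.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance of at least one collision among n_items b-bit hashes."""
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


def items_for_probability(p: float, bits: int) -> float:
    """Approximate number of items at which the collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


for bits in (32, 64, 128):
    n = items_for_probability(0.5, bits)
    print(f"{bits}-bit: 50% collision chance at about {n:.3e} items")

# Example sizing question: one million keys in a 32-bit hash space.
print(f"{collision_probability(1_000_000, 32):.6f}")
```

For sizing decisions, collision_probability is usually the more useful direction: plug in the expected number of keys and compare the result against the collision rate the application can tolerate.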