What Is Hashing?

Hashing is the process of converting input data of any size into a fixed-length string of characters through a one-way mathematical function. A hash function takes a file, password, or message and produces a hash value, also called a digest, that acts as a fixed-length fingerprint of the input. Hashing differs from encryption because no key reverses a hash back to the original data.

Password systems, file integrity checks, and digital signatures all depend on hashing. This article defines hashing, lists the properties a secure hash function holds, separates hashing from encryption, names the common algorithms, describes where hashing is used, and explains salting and why password systems require it. The National Institute of Standards and Technology, which standardizes the Secure Hash Algorithm family, and published cryptographic research supply the references used here.

Each section answers one question about hashing and connects to the next. Readers learn why a hash is irreversible, why a single changed bit alters the entire output, and why salting defends stored passwords.

What Is Hashing?

Hashing is a one-way function that converts input data into a fixed-length hash value. The function accepts input of any length and returns output of a constant length determined by the algorithm. The National Institute of Standards and Technology standardizes the Secure Hash Algorithm family used for this purpose.

A hash function is deterministic, so the same input always produces the same hash, but the function cannot be reversed to recover the input from the hash. This one-way property separates hashing from encryption and defines its role in verification.
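The fixed-length and deterministic behavior described above can be checked with a short sketch. It assumes Python's standard hashlib module and SHA-256; the helper name sha256_hex is made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Inputs of very different sizes give digests of the same length:
# 64 hexadecimal characters, which is 256 bits.
print(len(short_digest), len(long_digest))

# Determinism: hashing the same input again gives the same digest.
print(sha256_hex(b"a") == short_digest)
```

The digest length depends only on the algorithm, not on the one-byte or one-megabyte input.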

What Properties Does a Secure Hash Function Have?

A secure hash function holds determinism, the one-way property, collision resistance, and the avalanche effect. These properties make a hash reliable for verification and resistant to attack. The list below states the core properties.

  • Deterministic output means the same input always produces the identical hash value.
  • One-way design means recovering the input from the hash is computationally infeasible.
  • Collision resistance means finding two different inputs that produce the same hash is infeasible.
  • Avalanche effect means a single changed bit in the input alters roughly half the output bits.

The avalanche effect ensures that similar inputs produce entirely different hashes, which prevents an attacker from inferring the input by comparing close outputs. A hash function that loses collision resistance becomes unsafe, which is why older algorithms were deprecated.
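A rough way to observe the avalanche effect is to flip one input bit and count how many output bits change. The sketch below assumes SHA-256 from Python's hashlib; the helper bit_difference and the sample input are illustrative choices, not part of any standard.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
flipped = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(flipped).digest()

changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits changed")
```

The exact count varies from input to input, but it should land near half of the 256 output bits.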

What Is the Difference Between Hashing and Encryption?

Hashing is a one-way function that cannot be reversed, while encryption is a reversible process that returns the original data with a key. Encryption protects confidentiality and assumes the data must be recovered.

Hashing protects integrity and verification and never returns the original input. The list below states the core distinctions.

  • Direction separates the two, since hashing is one-way and encryption is two-way.
  • Keys differ, since encryption requires a key while a standard hash uses none.
  • Output length differs, since a hash is fixed-length while ciphertext scales with the input.
  • Purpose differs, since hashing verifies integrity while encryption preserves confidentiality.

The reversible nature of encryption and the irreversible nature of hashing place them in separate roles. How encryption converts data with a key and reverses it with a key is covered in the explanation of how encryption protects data.

Which Hashing Algorithms Are Used Today?

Modern systems use SHA-256 for general hashing and bcrypt, scrypt, or Argon2 for password storage. Each algorithm fits a defined purpose. The list below names the active and deprecated algorithms.

  • SHA-256, part of the SHA-2 family standardized by the National Institute of Standards and Technology, produces a 256-bit digest for integrity and signatures.
  • SHA-3, standardized in 2015, uses a different internal design (the Keccak sponge construction) and serves as an alternative if SHA-2 is ever weakened.
  • bcrypt, scrypt, and Argon2 add deliberate slowness and memory cost to resist password cracking.
  • MD5 and SHA-1 are deprecated because researchers demonstrated practical collisions against both.

MD5 fell to collision attacks documented in 2004, and SHA-1 fell to the SHAttered collision in 2017. General-purpose hashes such as SHA-256 are intentionally fast, while password hashes are intentionally slow to raise the cost of brute-force attacks.
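As a small illustration of the general-purpose algorithms named above, the sketch below hashes one message with SHA-256, SHA-512, and SHA3-256 through Python's hashlib and prints each digest size. The sample message is arbitrary, and the password-oriented algorithms are left out here because they take extra parameters.

```python
import hashlib

message = b"example message"

# Hash the same message with two SHA-2 variants and one SHA-3 variant.
digests = {
    name: hashlib.new(name, message)
    for name in ("sha256", "sha512", "sha3_256")
}

for name, h in digests.items():
    # digest_size is in bytes; multiply by 8 for the bit length.
    print(f"{name}: {h.digest_size * 8}-bit digest, starts {h.hexdigest()[:16]}")
```

SHA-256 and SHA3-256 both return 256 bits, yet their outputs for the same message are unrelated because the underlying constructions differ.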

Where Is Hashing Used?

Hashing is used for password storage, file integrity verification, and digital signatures. Each use relies on the one-way and deterministic properties. The list below states the primary applications.

  • Password storage saves the hash of a password rather than the password itself, so a database leak does not expose plaintext credentials.
  • File integrity verification compares the hash of a downloaded file against a published value to confirm the file was not altered.
  • Digital signatures hash a message first, then sign the digest, which makes signing efficient and tamper-evident.
  • Data structures such as hash tables use hashing to index and retrieve records quickly.

Integrity verification appears whenever a software publisher lists a checksum beside a download. The role of hashing inside signing appears in the comparison of public-key methods that sign a hashed digest.

What Is Salting and Why Is It Used?

Salting is the addition of a unique random value to each password before hashing. The salt ensures that identical passwords produce different hashes, which defeats precomputed attack tables. The list below states how salting strengthens password storage.

  • Unique salts give every stored password a different hash even when two users choose the same password.
  • Rainbow table attacks fail because precomputed hash tables cannot account for a random per-user salt.
  • Storage alongside the hash is safe because the salt is not secret; its uniqueness and randomness provide the protection.
  • Per-password generation requires a new random salt for each account at the time the password is set.

Without a salt, an attacker who steals a password database can match hashes against precomputed tables instantly. A strong, unique password remains the user-side defense, covered in the guide on how to create a strong and unique password.
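The salting steps can be sketched with scrypt from Python's standard library. This is a minimal illustration under stated assumptions: the function names, the 16-byte salt length, and the cost parameters are placeholders chosen for the example, not recommendations.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt cost parameters (assumptions, not tuned recommendations).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Generate a fresh random salt and return (salt, hash) for storage."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash a login attempt with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


# Two users choose the same password but get different salts and hashes.
salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")
print(hash_a != hash_b)
print(verify_password("correct horse", salt_a, hash_a))
```

The salt is stored next to the hash in plain form; verification needs it to recompute the same value.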

What Is a Hash Collision?

A hash collision is a case where two different inputs produce the same hash value. Because a hash maps unlimited inputs to a fixed-length output, collisions exist in theory, but a secure function makes them infeasible to find. The list below states the significance of collisions.

  • Theoretical existence follows from mapping infinite inputs to a finite output space.
  • Practical resistance means a secure algorithm makes locating a collision computationally infeasible.
  • Broken algorithms such as MD5 and SHA-1 allow attackers to construct collisions deliberately.
  • Forgery risk arises when an attacker creates two files with the same hash to substitute one for the other.

The 2017 SHAttered attack produced two distinct PDF files with the same SHA-1 hash, which showed that SHA-1 collisions are practical and accelerated its retirement from digital signatures. Collision resistance is the property that keeps a hash function trustworthy for verification.
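Why collisions must exist, and why output length matters, can be shown with a deliberately weakened toy hash. The sketch below keeps only the first 2 bytes of a SHA-256 digest, an assumption made purely so the search finishes quickly; the same search against the full 256-bit output is not feasible. The function names are hypothetical.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Toy hash: keep only the first n_bytes of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Hash the inputs "0", "1", "2", ... until two share a truncated digest."""
    seen: dict[bytes, bytes] = {}
    for i in count():
        message = str(i).encode("ascii")
        digest = truncated_hash(message, n_bytes)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message


first, second = find_collision()
print(first, second, truncated_hash(first).hex())
```

With only 65,536 possible outputs, a repeat is guaranteed within 65,537 inputs and usually appears after a few hundred. Each extra output bit makes the search harder, which is why a full-length digest from an unbroken algorithm resists it.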

How Does Hashing Verify File Integrity?

Hashing verifies file integrity by comparing a recomputed hash of a file against a trusted published value. A match confirms the file is unchanged, and a mismatch signals corruption or tampering. The list below states the verification steps.

  1. The publisher computes a hash of the original file and lists it as a checksum.
  2. The user downloads the file and computes its hash with the same algorithm.
  3. The user compares the computed hash to the published checksum.
  4. A match confirms integrity, while a mismatch indicates corruption or modification.

This method detects accidental corruption during download and deliberate tampering by an attacker who alters a file. The published checksum must come from a trusted source, because an attacker who controls both file and checksum can replace both.
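The verification steps can be sketched as follows, assuming SHA-256 and a checksum published as a hexadecimal string. The function names and the chunk size are illustrative choices.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, published_hex: str) -> bool:
    """Compare the computed hash with the published checksum, ignoring case and whitespace."""
    computed = file_sha256(path)
    return hmac.compare_digest(computed, published_hex.strip().lower())
```

A call such as matches_checksum("download.iso", published_value) returns True only when the file on disk hashes to the published value; both the file name and the variable are placeholders.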

What Is a Keyed Hash and HMAC?

A keyed hash is a hash function combined with a secret key to authenticate the origin of a message. The Hash-based Message Authentication Code, abbreviated HMAC, is the standard construction. The list below states how a keyed hash differs from a plain hash.

  • Secret key inclusion means only parties holding the key can produce or verify the correct code.
  • Authentication is added, since a plain hash verifies integrity but not the sender identity.
  • HMAC construction nests the key and message through two hashing passes to resist forgery.
  • Protocol use appears in TLS and API request signing, where HMAC confirms a message was not altered.

A plain hash verifies that data is unchanged, but anyone can recompute it, so it does not prove origin. HMAC adds a shared secret so the recipient confirms both integrity and that the holder of the key produced the message.
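A minimal HMAC sketch using Python's standard hmac module is shown below. It assumes HMAC with SHA-256 and a 32-byte shared key; the function names and the sample message are illustrative.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA-256 tag of message under key, as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = secrets.token_bytes(32)
message = b"amount=100&to=alice"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                 # untouched message
print(verify(shared_key, b"amount=900&to=alice", tag))  # altered message
```

Anyone can recompute a plain SHA-256 of the message, but producing a valid tag requires the shared key, which is what ties the message to a key holder.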

Why Are Password Hashes Made Deliberately Slow?

Password hashes are made deliberately slow because slowness raises the cost of guessing passwords by brute force. General-purpose hashes such as SHA-256 are fast, which helps attackers test billions of guesses per second. The list below states how slow hashing defends passwords.

  • Work factors in bcrypt set an adjustable number of rounds that increase computation per guess.
  • Memory cost in scrypt and Argon2 forces each guess to consume memory, which limits parallel attacks on specialized hardware.
  • Tunable difficulty lets administrators raise the cost as hardware improves over time.
  • Per-attempt delay is negligible for one login but compounds across the millions of guesses an attacker needs.

A fast hash like SHA-256 suits file integrity but fails for passwords because it allows rapid guessing. Argon2, selected as the winner of the Password Hashing Competition in 2015, combines memory and time cost to resist modern cracking hardware.
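The effect of a tunable cost can be observed with scrypt, which is used here because it ships in Python's hashlib while bcrypt and Argon2 need third-party packages. The sketch is illustrative only: the cost values, the helper name derive, and the fixed demo salt are assumptions, and measured times depend on the machine.

```python
import hashlib
import time


def derive(password: bytes, salt: bytes, n: int) -> tuple[bytes, float]:
    """Run scrypt with cost parameter n and return (derived key, seconds taken)."""
    start = time.perf_counter()
    key = hashlib.scrypt(password, salt=salt, n=n, r=8, p=1, dklen=32)
    return key, time.perf_counter() - start


password = b"example password"
salt = b"fixed-demo-salt!"  # fixed only to keep the demo repeatable; real salts are random

# Baseline: a single fast SHA-256 computation.
start = time.perf_counter()
hashlib.sha256(salt + password).digest()
print(f"sha256:        {time.perf_counter() - start:.6f} s")

# Raising n increases both the time and the memory each guess requires.
for n in (2**10, 2**12, 2**14):
    _, elapsed = derive(password, salt, n)
    print(f"scrypt n={n:<6} {elapsed:.6f} s")
```

A delay of a few milliseconds is unnoticeable for one login but multiplies across every guess an attacker makes.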

Key Takeaways

  • Hashing converts input into a fixed-length one-way hash value.
  • A secure hash is deterministic, irreversible, and collision-resistant.
  • Hashing is one-way, while encryption is reversible with a key.
  • SHA-256 handles general hashing, and bcrypt or Argon2 handle passwords.
  • MD5 and SHA-1 are deprecated due to practical collision attacks.
  • Salting adds a unique random value to defeat precomputed tables.
  • Hashing verifies passwords, file integrity, and digital signatures.

Attribute         | Hashing                     | Encryption
Direction         | One-way, irreversible       | Two-way, reversible
Key required      | No key for standard hashing | Requires a key
Output length     | Fixed regardless of input   | Scales with input size
Primary purpose   | Integrity and verification  | Confidentiality
Recover original  | Not possible                | Possible with the key
Common algorithms | SHA-256, bcrypt, Argon2     | AES, RSA, ChaCha20

What is the difference between hashing and encryption?

Hashing is a one-way function that cannot be reversed and verifies integrity. Encryption is reversible with a key and protects confidentiality. Hashed data cannot be returned to its original form.

Can a hash be reversed?

No. A secure hash function is one-way, so recovering the input from the hash is computationally infeasible. Attackers instead guess inputs and compare hashes, which salting and slow algorithms resist.

Why are passwords hashed instead of encrypted?

Passwords are hashed so a database leak does not expose plaintext credentials. The system never needs the original password back; it only compares the hash of a login attempt to the stored hash.

What is a salt in hashing?

A salt is a unique random value added to each password before hashing. It ensures identical passwords produce different hashes and defeats precomputed rainbow table attacks.

Is SHA-256 secure?

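Yes, for integrity and signatures. SHA-256, part of the SHA-2 family standardized by the National Institute of Standards and Technology, has no known practical collisions and is widely used for both. It is too fast to use on its own for password storage, which calls for bcrypt, scrypt, or Argon2.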

Why are MD5 and SHA-1 no longer used?

MD5 and SHA-1 are deprecated because researchers demonstrated practical collisions, where two different inputs produce the same hash. This breaks their use in signatures and integrity verification.

Last Thoughts on Hashing

Hashing converts data into a fixed-length, irreversible fingerprint that supports verification rather than confidentiality. The properties of determinism, one-way design, collision resistance, and the avalanche effect define a secure hash function, and the loss of those properties retired MD5 and SHA-1. Hashing differs from encryption because no key recovers the original input, which suits password storage, file integrity, and digital signatures.

Salting strengthens stored passwords by ensuring identical inputs produce different hashes. Hashing underpins signatures, certificates, and integrity checks throughout computer security. The hub on cybersecurity concepts and data protection methods places hashing within the wider framework of verification and trust.

Nizam Ud Deen

Nizam Ud Deen is the founder of theCoreiTech, a tech-focused platform dedicated to simplifying the world of computers, hardware, and digital innovation. With nearly a decade of experience in digital marketing and IT, Nizam combines strategic marketing insight with deep technical understanding. As a passionate entrepreneur, he has built multiple successful digital products and online ventures, helping bridge the gap between technology and everyday users. His mission through theCoreiTech is to empower readers to make informed decisions about computers, hardware, and emerging tech trends through clear, data-driven, and actionable content.
