Processes vs Threads: Definitions, Differences, and How They Work
A process is an independent program in execution with its own memory space. A thread is a unit of execution within a process, sharing the process’s memory. This guide defines both, explains process states, covers 7 technical differences, explains context switching cost, distinguishes concurrency from parallelism, and covers thread synchronization mechanisms and race conditions.
What Is a Process?
A process is a program in execution. The OS creates a process when a program is launched: it allocates a private virtual address space, loads the program’s executable code into memory, creates a Process Control Block (PCB) to track state, and assigns a unique Process ID (PID). A process is fully isolated from other processes — no other process can read or modify its memory without OS-mediated mechanisms (shared memory, pipes, sockets).
Every process contains at least one thread, the main thread, which begins execution at the program’s entry point (main() in C/C++, public static void main(String[] args) in Java). Additional threads can be created within the process at runtime. All threads within one process share the same virtual address space, file handles, and OS resources, but each has its own stack and register state.
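Each process is identified by a PID that is unique among live processes. Linux assigns PIDs sequentially starting at 1 (the init/systemd process) and wraps around to reuse free PIDs once the limit is reached. The default limit (kernel.pid_max) is 32,768, so the highest PID is 32,767 by default; on 64-bit systems the limit can be raised to 4,194,304. On Windows, PIDs are multiples of 4 in practice, and the System process has PID 4.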
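As a small illustration of process identity, the following Python sketch asks the OS for the current PID, the parent PID, and the name of the main thread. The helper name describe_current_process is made up for this example; the calls it uses come from Python’s standard os and threading modules.

```python
import os
import threading

def describe_current_process() -> dict:
    """Collect basic identity facts about the running process."""
    return {
        "pid": os.getpid(),                       # PID assigned by the OS
        "parent_pid": os.getppid(),               # PID of the process that launched this one
        "main_thread": threading.main_thread().name,
        "thread_count": threading.active_count(), # at least 1: the main thread
    }

info = describe_current_process()
print(info)
```

The exact values differ on every run, since the OS chooses the PID when the process is created.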
What Is a Thread?
A thread is the smallest unit of CPU execution schedulable by the OS. A thread exists within a process and inherits the process’s virtual address space, file descriptors, and signal handlers.
Each thread has its own program counter (instruction pointer), register set, and stack. Because threads share the heap, global variables, and file handles of their parent process, they can communicate by reading and writing shared memory directly — which is both faster than inter-process communication and the source of synchronization hazards.
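The split between private and shared state can be seen in a short Python sketch. The names shared_results and worker are illustrative: each thread keeps its running total in a local variable (private to that thread’s stack frame) and publishes the result to a list that all threads can see.

```python
import threading

shared_results = []  # one list object on the heap, visible to every thread

def worker(worker_id: int) -> None:
    local_total = 0  # local variable: private to this thread's stack frame
    for i in range(1000):
        local_total += i
    # list.append is thread-safe in CPython, so no explicit lock is used here
    shared_results.append((worker_id, local_total))

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_results))
```

No pipe or socket is involved: the threads communicate simply by writing to the same object. Any shared update that is more than a single thread-safe operation would need synchronization.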
Thread creation is significantly cheaper than process creation. On Linux, creating a thread (pthread_create) takes approximately 10–50 microseconds, while creating a process (fork) takes approximately 100–1,000 microseconds because the OS must duplicate the parent’s page tables and mark the pages copy-on-write. On Windows, CreateThread takes approximately 10–50 microseconds and CreateProcess takes 1–5 milliseconds. These figures are rough orders of magnitude; actual costs vary with hardware, OS version, and the size of the parent’s address space.
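A rough way to see the gap on your own machine is to time both operations. The sketch below is an assumption-laden measurement, not a benchmark: the helper names time_thread_creation and time_fork are invented here, the timings include Python interpreter overhead and the time to wait for completion, and the fork path only runs on POSIX systems.

```python
import os
import threading
import time

def time_thread_creation(rounds: int = 50) -> float:
    """Average seconds to start and join a thread that does nothing."""
    start = time.perf_counter()
    for _ in range(rounds):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / rounds

def time_fork(rounds: int = 50) -> float:
    """Average seconds to fork a child that exits immediately (POSIX only)."""
    start = time.perf_counter()
    for _ in range(rounds):
        pid = os.fork()
        if pid == 0:
            os._exit(0)      # child: exit at once, skipping interpreter cleanup
        os.waitpid(pid, 0)   # parent: reap the child so no zombie is left behind
    return (time.perf_counter() - start) / rounds

thread_cost = time_thread_creation()
fork_cost = time_fork() if hasattr(os, "fork") else None

print(f"thread create+join: {thread_cost * 1e6:.1f} us")
if fork_cost is not None:
    print(f"fork+wait:          {fork_cost * 1e6:.1f} us")
```

Expect the absolute numbers to be higher than the raw pthread_create and fork figures quoted above; the point of the sketch is the relative difference.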
Process States
A process transitions through 5 states during its lifetime as managed by the OS scheduler:

- New: The process is being created. The OS is allocating the PCB, virtual address space, and loading the executable.
- Ready: The process is loaded in memory and waiting for CPU time. All required resources are available; the process is in the scheduler’s run queue.
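- Running: The process is executing on a CPU core. Only one process (or thread) runs per logical CPU at any instant, where a logical CPU is a core or, on processors with simultaneous multithreading, a hardware thread. On an 8-core CPU without simultaneous multithreading, a maximum of 8 processes/threads run simultaneously.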
- Waiting (Blocked): The process is waiting for an event: I/O completion, a network packet, a lock, or a timer. The scheduler removes it from the run queue. Blocked processes consume no CPU time.
- Terminated: The process has finished execution or been killed. The OS reclaims its memory and file handles. The PCB remains briefly as a “zombie” until the parent process reads the exit status.
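The five states can be modeled as a small state machine. The sketch below assumes the standard textbook transitions between these states (admit, dispatch, preempt, block, wake, exit); the source lists the states but not the transitions, so the TRANSITIONS table and the helper names are illustrative. Real kernels also allow a ready or blocked process to be killed, which this simplified model leaves out.

```python
from enum import Enum

class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"

# Allowed transitions in the simplified five-state model (an assumption of this sketch).
TRANSITIONS = {
    State.NEW: {State.READY},                 # admitted to the run queue
    State.READY: {State.RUNNING},             # dispatched by the scheduler
    State.RUNNING: {State.READY,              # preempted
                    State.WAITING,            # blocks on I/O, a lock, or a timer
                    State.TERMINATED},        # exits or is killed
    State.WAITING: {State.READY},             # awaited event completes
    State.TERMINATED: set(),                  # no way back
}

def can_transition(src: State, dst: State) -> bool:
    return dst in TRANSITIONS[src]

def run_lifecycle(path: list) -> State:
    """Validate a sequence of states and return the final one."""
    for src, dst in zip(path, path[1:]):
        if not can_transition(src, dst):
            raise ValueError(f"illegal transition: {src.name} -> {dst.name}")
    return path[-1]

final = run_lifecycle([State.NEW, State.READY, State.RUNNING,
                       State.WAITING, State.READY, State.RUNNING,
                       State.TERMINATED])
print(final.name)  # TERMINATED
```

Note that a waiting process cannot move straight to Running in this model: it must re-enter the run queue as Ready and be dispatched again.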
7 Differences Between Processes and Threads
| Property | Process | Thread |
|---|---|---|
| Memory isolation | Separate virtual address space per process | Shared address space within the parent process |
| Creation overhead | 100µs–5ms (full address space setup) | 10–50µs (stack + thread control block only) |
| Communication method | IPC: pipes, sockets, shared memory, message queues | Direct read/write of shared heap memory |
| Crash impact | Crash isolated to one process; others continue | Crash in one thread (SIGSEGV) kills all threads in the process |
| Context switch cost | 1–10µs (includes TLB flush for new address space) | 0.1–1µs (no TLB flush needed; same address space) |
| Scheduling unit | Scheduled by OS scheduler (PID) | Scheduled by OS scheduler (TID — Thread ID) |
| Use case | Isolation between independent applications | Parallelism within a single application |
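
The "Communication method" row can be illustrated with a pipe between a parent and a child process. This is a minimal sketch: the child program in CHILD_CODE and the helper send_through_pipe are invented for the example, and it relies on Python’s standard subprocess module to create the pipes.

```python
import subprocess
import sys

# Hypothetical child program: read everything from stdin, write it back upper-cased.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"

def send_through_pipe(message: str) -> str:
    """Send a message to a separate process over a pipe and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=message,          # written to the child's stdin pipe
        capture_output=True,    # child's stdout is read back through another pipe
        text=True,
        check=True,
    )
    return result.stdout

reply = send_through_pipe("hello from the parent")
print(reply)  # HELLO FROM THE PARENT
```

The two processes never touch each other’s memory: every byte is copied through the kernel, which is the cost that threads avoid by sharing a heap.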
Context Switching Explained
A context switch is the OS operation of saving the current process or thread state and loading the state of the next one scheduled to run. The CPU registers (general-purpose registers, program counter, stack pointer, flags) are saved to the control block of the outgoing process or thread (its PCB or thread control block) and restored from the control block of the incoming one. Context switch cost varies by type:
- Thread context switch (same process): 0.1–1 microsecond. Saves/restores registers. No TLB flush. Address space unchanged.
- Process context switch: 1–10 microseconds. Saves/restores registers and also reloads the page table base pointer (the CR3 register on x86). Unless the CPU and OS use tagged TLB entries (PCID on x86, ASID on ARM), changing CR3 flushes the TLB, so subsequent memory accesses miss until the TLB repopulates. Each TLB miss triggers a page-table walk that adds roughly 50–500 nanoseconds.
A system under high load may perform 100,000–1,000,000 context switches per second. Linux tracks context switch counts per process in /proc/[pid]/status (voluntary_ctxt_switches, nonvoluntary_ctxt_switches). A voluntary switch happens when the process blocks or yields; a nonvoluntary switch happens when the scheduler preempts it. A high nonvoluntary count indicates CPU contention.
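The counters can be read with a few lines of Python. The parser below is a sketch: parse_ctxt_switches is a made-up helper name, and the SAMPLE text contains invented numbers purely to show the file format. On Linux, the last lines read the real counters for the current process.

```python
from pathlib import Path

FIELDS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")

def parse_ctxt_switches(status_text: str) -> dict:
    """Extract the two context-switch counters from /proc/[pid]/status text."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in FIELDS:
            counts[key] = int(value.strip())
    return counts

# Illustrative sample only; the numbers are invented.
SAMPLE = (
    "Name:\tpython3\n"
    "voluntary_ctxt_switches:\t150\n"
    "nonvoluntary_ctxt_switches:\t12\n"
)
sample_counts = parse_ctxt_switches(SAMPLE)
print(sample_counts)

# On Linux, read the live counters for this process.
status_path = Path("/proc/self/status")
if status_path.exists():
    print(parse_ctxt_switches(status_path.read_text()))
```

Sampling the counters twice and taking the difference gives a switch rate for the interval in between.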
Concurrency vs. Parallelism
Concurrency and parallelism are distinct concepts that are frequently confused:
- Concurrency: Multiple tasks are in progress simultaneously, but not necessarily executing at the exact same instant. A single-core CPU achieves concurrency by time-slicing: switching between threads rapidly enough that all appear to progress simultaneously. A web server handling 10,000 concurrent connections on a 4-core CPU is using concurrency — only 4 threads execute at any instant, but all connections make progress through interleaved I/O waits.
- Parallelism: Multiple tasks execute simultaneously on multiple CPU cores or processors. A matrix multiplication split across 8 threads running on an 8-core CPU achieves parallelism — all 8 threads execute simultaneously.
A program can be concurrent without being parallel (single-core time-sharing), parallel without being concurrent (SIMD vector operations executing one task in parallel across data lanes), or both concurrent and parallel (a multi-threaded server running on a multi-core CPU).
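Concurrency without parallelism is easy to observe with tasks that mostly wait. In the sketch below, fake_io_task is a stand-in for a blocking I/O call (it only sleeps), and the helper names are illustrative. Four such tasks overlap their waits when run on threads, so the total time is close to one wait rather than four, even on a single core.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io_task(task_id: int) -> int:
    time.sleep(0.2)  # stands in for waiting on a disk or network
    return task_id

def run_sequential(n: int) -> float:
    start = time.perf_counter()
    for i in range(n):
        fake_io_task(i)
    return time.perf_counter() - start

def run_concurrent(n: int) -> float:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(fake_io_task, range(n)))  # all waits overlap
    return time.perf_counter() - start

sequential_time = run_sequential(4)
concurrent_time = run_concurrent(4)
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

The speedup here comes from interleaved waiting, not from extra cores. CPU-bound work, such as the matrix multiplication example above, only speeds up when tasks truly execute in parallel on separate cores.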
Thread Synchronization: Mutex, Semaphore, and Lock
Because threads share the same heap and global variables, concurrent access to shared data without synchronization causes race conditions. Three synchronization primitives control access to shared resources:

Mutex (Mutual Exclusion Lock)
A mutex is a binary lock. Only one thread can hold a mutex at a time. Acquiring a mutex is O(1): if the mutex is unlocked, the thread locks it and proceeds; if it is locked, the thread blocks until the holder releases it. Mutex acquisition takes approximately 10–50 nanoseconds in the uncontended case using atomic compare-and-swap instructions. Contended mutexes require OS intervention to put the waiting thread to sleep and wake it later (a futex system call on Linux), increasing cost to 1–10 microseconds.
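A minimal Python sketch of mutual exclusion, using threading.Lock as the mutex. The names counter, counter_lock, and add_many are illustrative. Four threads each increment a shared counter 10,000 times, and the lock ensures that each read-modify-write completes before another thread starts its own.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # the mutex guarding `counter`

def add_many(times: int) -> None:
    global counter
    for _ in range(times):
        with counter_lock:       # acquire; blocks if another thread holds the lock
            counter += 1         # critical section: read, add, write back
        # the lock is released on leaving the with-block

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Keeping the critical section this small limits how long other threads wait, which matters once the lock becomes contended.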
Semaphore
A semaphore is a non-negative integer counter with two atomic operations: wait (decrement, block if zero) and signal (increment, wake a blocked thread). A binary semaphore (count 0 or 1) functions like a mutex. A counting semaphore with count N allows up to N threads to access a resource simultaneously — useful for connection pools, bounded buffers, and rate limiting.
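The counting behavior can be sketched with threading.Semaphore. In this example, MAX_CONCURRENT, use_resource, and the bookkeeping variables are hypothetical names; the semaphore plays the role of a pool with three slots, and ten threads compete for them.

```python
import threading
import time

MAX_CONCURRENT = 3
slots = threading.Semaphore(MAX_CONCURRENT)  # counter starts at 3
state_lock = threading.Lock()                # protects the bookkeeping below
active = 0
peak_active = 0

def use_resource() -> None:
    global active, peak_active
    with slots:                    # wait: decrement, or block while the count is zero
        with state_lock:
            active += 1
            peak_active = max(peak_active, active)
        time.sleep(0.01)           # pretend to use a pooled connection
        with state_lock:
            active -= 1
    # leaving the with-block signals: increment and wake one waiter

threads = [threading.Thread(target=use_resource) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrent users: {peak_active}")
```

However the threads are scheduled, peak_active never exceeds MAX_CONCURRENT, because a fourth thread blocks in wait until one of the first three signals.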
Read-Write Lock
A read-write lock (rwlock) allows concurrent read access by multiple threads but exclusive write access. Multiple readers can hold the lock simultaneously. A writer waits until all readers release the lock before acquiring exclusive access. rwlocks improve throughput in read-heavy workloads — a database cache with 95% reads and 5% writes benefits significantly from rwlock over a plain mutex.
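Python’s standard library does not ship a read-write lock, so the sketch below builds a minimal one on top of threading.Condition. The ReadWriteLock class and its method names are assumptions made for illustration. It is reader-preferring, meaning a steady stream of readers can starve a writer, and production code would normally use a tested library instead.

```python
import threading

class ReadWriteLock:
    """Minimal reader-preferring read-write lock (illustrative only)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0      # number of threads currently reading
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self) -> None:
        with self._cond:
            while self._writer:              # readers wait only for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()      # last reader out wakes waiting writers

    def acquire_write(self) -> None:
        with self._cond:
            while self._writer or self._readers > 0:  # writers need exclusivity
                self._cond.wait()
            self._writer = True

    def release_write(self) -> None:
        with self._cond:
            self._writer = False
            self._cond.notify_all()

cache = {"answer": 0}
rw = ReadWriteLock()
seen = []

def reader() -> None:
    rw.acquire_read()
    try:
        seen.append(cache["answer"])   # many readers may be here at once
    finally:
        rw.release_read()

def writer() -> None:
    rw.acquire_write()
    try:
        cache["answer"] = 42           # exclusive access
    finally:
        rw.release_write()

threads = [threading.Thread(target=writer)] + [threading.Thread(target=reader) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(cache["answer"])  # 42
```

Each reader sees either the old value or the new one, depending on scheduling, but never a half-finished write.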
Race Conditions
A race condition occurs when two or more threads access shared data concurrently and the final result depends on the execution order. The classic example: two threads both read a counter value of 5, both add 1, and both write 6 — the counter should be 7 but is 6 because one increment was lost.
Race conditions are non-deterministic: they may not reproduce consistently because thread scheduling order varies between runs. Detecting them reliably therefore calls for tooling. ThreadSanitizer (TSan, enabled in GCC and Clang with -fsanitize=thread) instruments memory accesses at compile time and reports data races at runtime. It reports only races it observes during execution, so false positives are rare, but it can miss races on code paths that a given run does not exercise.
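The lost update from the counter example can be reproduced on purpose. The sketch below is a contrived demonstration, not how races appear in practice: it uses a threading.Barrier (named both_have_read here) to force the unlucky interleaving in which both threads read before either writes, so the result is the same on every run.

```python
import threading

counter = 5
both_have_read = threading.Barrier(2)  # releases only when both threads arrive

def unsafe_increment() -> None:
    global counter
    observed = counter          # step 1: read (both threads see 5)
    both_have_read.wait()       # force both reads to happen before any write
    counter = observed + 1      # step 2: write back a stale result (6)

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 6, not 7: one increment was lost
```

In real code nothing forces this interleaving, which is why the bug shows up only occasionally. Wrapping the read and the write in one mutex-protected critical section removes the window between them.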
Key Takeaways
- A process has its own isolated virtual address space; a thread shares its parent process’s address space, heap, and file handles.
- Thread creation costs 10–50µs; process creation costs 100µs–5ms due to address space setup.
- A thread context switch costs 0.1–1µs; a process context switch costs 1–10µs due to TLB invalidation.
- A crash in one thread kills all threads in the same process; a crash in one process leaves other processes unaffected.
- Concurrency means multiple tasks progress simultaneously through interleaving; parallelism means multiple tasks execute simultaneously on separate CPU cores.
- Race conditions require synchronization primitives (mutex, semaphore, rwlock) to prevent data corruption in shared memory access patterns.
Frequently Asked Questions
What is the main difference between a process and a thread?
A process has its own isolated virtual address space and resources. A thread is a unit of execution within a process, sharing the process’s memory. Creating a thread costs 10–50µs; creating a process costs 100µs–5ms. A thread crash kills all threads in the process; a process crash is isolated.
What is a context switch?
A context switch is the OS operation of saving one process or thread’s CPU register state and loading another’s. Thread context switches cost 0.1–1µs. Process context switches cost 1–10µs because changing the page table base register (CR3) flushes the TLB, adding re-warm overhead for subsequent memory accesses.
What is the difference between concurrency and parallelism?
Concurrency means multiple tasks are in progress simultaneously through interleaving on one or more CPUs. Parallelism means multiple tasks execute at the exact same instant on separate CPU cores. A single-core system can be concurrent but not parallel. A multi-core system running multi-threaded code achieves both simultaneously.
What is a race condition?
A race condition occurs when two or more threads access shared data concurrently and the result depends on execution order. The read-modify-write pattern on a shared counter is the classic example: both threads read the same value, both increment it, and both write the same result, so only one increment takes effect. Protecting the update with a mutex or performing it as a single atomic operation prevents the lost update.
How many threads can a process have?
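Thread count per process is limited by virtual address space, available memory, and OS limits. On Linux, the default stack size is typically 8MB per thread, but this is reserved virtual address space: physical memory is committed only for the stack pages a thread actually touches. Reserving 8MB stacks for 1,000 threads therefore consumes about 8GB of address space, not 8GB of RAM. System-wide limits also apply. Linux’s /proc/sys/kernel/threads-max caps the total number of threads and is derived from the amount of memory at boot, and because every thread consumes an ID from the PID space, /proc/sys/kernel/pid_max (32,768 by default, configurable up to 4,194,304) bounds the total as well.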
Last Thoughts on Processes vs Threads
Processes and threads represent the two fundamental units of concurrent execution in modern operating systems. Processes provide strong isolation — separate address spaces, independent crash domains — at the cost of higher creation overhead and expensive inter-process communication.
Threads provide efficient concurrency within a process — shared memory, low creation cost, fast context switches — at the cost of requiring explicit synchronization to prevent race conditions. Effective concurrent system design uses processes for isolation between independent components and threads for parallelism within a single workload, with mutexes, semaphores, and lock-free atomic operations guarding all shared state.


