The Global Interpreter Lock is the most misunderstood thing in Python. Developers hit it, blame Python for being slow, and reach for multiprocessing or Go. But the GIL isn’t a bug — it’s a deliberate trade-off, and once you understand exactly what it does and doesn’t prevent, you’ll use the right concurrency tool every time.
⚡ TL;DR: The GIL prevents two Python threads from executing Python bytecode simultaneously. It does NOT block I/O-bound threads (they release the GIL while waiting). For CPU-bound parallelism: use multiprocessing. Python 3.13+ offers an optional free-threaded (no-GIL) build, experimental in 3.13. Here’s when each matters.
What the GIL Actually Does
The GIL is a mutex (mutual exclusion lock) that protects CPython’s internal state — particularly reference counts. Every Python object has a reference count. When the count hits zero, the object is freed. Without the GIL, two threads could simultaneously modify the same reference count, corrupting memory.
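To make the reference-count mechanism concrete, here is a minimal sketch using `sys.getrefcount`. The function name `refcount_delta` is illustrative, not from any library, and absolute counts differ between CPython versions, so the sketch only compares a before and after reading.

```python
import sys

def refcount_delta():
    """Return how many references two extra bindings add to one object."""
    payload = object()
    before = sys.getrefcount(payload)

    alias = payload        # a second name bound to the same object
    holder = [payload]     # a list slot also holds a reference
    after = sys.getrefcount(payload)

    return after - before

if __name__ == "__main__":
    print(refcount_delta())
```

Every one of those increments and decrements is a write to the object's header, which is why unsynchronized threads touching the same object would be a problem for the interpreter itself.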
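import threading

counter = 0

def increment():
    global counter
    for _ in range(1_000_000):
        counter += 1  # load, add, store: several bytecode steps, not one atomic operation

# The GIL lets only one thread run Python bytecode at a time, which keeps
# the interpreter's own state (such as reference counts) consistent.
# It does NOT make 'counter += 1' atomic: a thread switch between the load
# and the store can lose an update.
t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)
t1.start(); t2.start()
t1.join(); t2.join()
print(counter)  # Not guaranteed to be 2,000,000: depends on the CPython version and build
# Guard shared state with threading.Lock instead of relying on the GIL.
# The one-thread-at-a-time rule also comes at a cost for CPU-bound work.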
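Because `counter += 1` is a read-modify-write sequence, the portable way to get a deterministic total is an explicit lock. The following is a minimal sketch; `counter_lock`, `safe_increment`, and `run_threads` are illustrative names, and the thread and iteration counts are arbitrary.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def safe_increment(times):
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:   # only one thread at a time performs the read-modify-write
            counter += 1

def run_threads(num_threads=4, times=50_000):
    """Reset the counter, run the workers to completion, and return the final total."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=safe_increment, args=(times,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run_threads())
```

The lock costs some throughput, but the result no longer depends on when the interpreter happens to switch threads.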
The Benchmark That Shows Exactly When GIL Hurts
import threading
import multiprocessing
import time

def cpu_bound(n):
    """Pure CPU work, no I/O."""
    return sum(i * i for i in range(n))

def io_bound(n):
    """I/O simulation: the GIL is released during the wait."""
    time.sleep(0.1)  # GIL released during sleep
    return n

N = 10_000_000

def main():
    # Test 1: Single thread
    start = time.perf_counter()
    cpu_bound(N)
    cpu_bound(N)
    print(f"Single thread: {time.perf_counter()-start:.2f}s")
    # Result: ~1.2s

    # Test 2: Two threads (CPU-bound, GIL kills parallelism)
    start = time.perf_counter()
    t1 = threading.Thread(target=cpu_bound, args=(N,))
    t2 = threading.Thread(target=cpu_bound, args=(N,))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(f"2 threads (CPU): {time.perf_counter()-start:.2f}s")
    # Result: ~1.3s ← SLOWER than single thread! GIL overhead.

    # Test 3: Two processes (CPU-bound, true parallelism)
    start = time.perf_counter()
    with multiprocessing.Pool(2) as pool:
        pool.map(cpu_bound, [N, N])
    print(f"2 processes (CPU): {time.perf_counter()-start:.2f}s")
    # Result: ~0.65s ← 2x faster, real parallelism

    # Test 4: Two threads (I/O-bound, GIL released, works fine)
    start = time.perf_counter()
    t1 = threading.Thread(target=io_bound, args=(N,))
    t2 = threading.Thread(target=io_bound, args=(N,))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(f"2 threads (I/O): {time.perf_counter()-start:.2f}s")
    # Result: ~0.1s ← Both sleep concurrently, GIL released

# The guard is required where multiprocessing starts workers with "spawn"
# (Windows, macOS): each worker re-imports this module.
if __name__ == "__main__":
    main()
Fix 1: multiprocessing — True CPU Parallelism
from multiprocessing import cpu_count
from concurrent.futures import ProcessPoolExecutor

# Worker functions must live at module level so they can be pickled
def process_chunk(chunk):
    return [x ** 2 for x in chunk]

def main():
    data = list(range(1_000_000))
    # Split data into chunks, process in parallel
    chunk_size = max(1, len(data) // cpu_count())
    chunks = [data[i:i+chunk_size] for i in range(0, len(data), chunk_size)]
    # ProcessPoolExecutor: cleaner API than multiprocessing.Pool
    with ProcessPoolExecutor(max_workers=cpu_count()) as executor:
        results = list(executor.map(process_chunk, chunks))
    return [item for sublist in results for item in sublist]

# Guard the entry point: "spawn" workers re-import this module
if __name__ == "__main__":
    flat = main()

# When to use ProcessPoolExecutor:
# ✅ CPU-bound: data processing, ML inference, image manipulation
# ❌ I/O-bound: use ThreadPoolExecutor or asyncio instead
# ⚠️ Functions and data sent to workers must be picklable (lambdas and local functions can't be pickled)
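The picklability caveat can be checked before work is handed to a pool. This is a minimal sketch: `can_pickle`, `square`, and `make_local` are hypothetical helpers, and calling `pickle.dumps` is only a rough stand-in for what a process pool does when it ships a task to a worker.

```python
import pickle

def square(x):
    """Module-level function: pickled by reference to its importable name."""
    return x * x

def make_local():
    """Return a function defined inside another function (not importable by name)."""
    def local_square(x):
        return x * x
    return local_square

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

if __name__ == "__main__":
    print("module-level function:", can_pickle(square))
    print("lambda:", can_pickle(lambda x: x * x))
    print("local function:", can_pickle(make_local()))
```

If a callable fails this check, moving it to module level (or wrapping the arguments with `functools.partial` around a module-level function) is usually the simplest fix.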
Fix 2: asyncio — I/O Concurrency Without Threads
import asyncio
import aiohttp  # pip install aiohttp

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.json()

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        # All requests are scheduled concurrently; the GIL is irrelevant (all I/O)
        tasks = [fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# api.example.com is a placeholder; substitute a real endpoint before running
urls = [f"https://api.example.com/item/{i}" for i in range(100)]
results = asyncio.run(fetch_all(urls))

# asyncio vs threading for I/O:
# Both work. asyncio usually has less overhead at high connection counts (no thread per connection)
# Threading is simpler for existing synchronous code
# asyncio requires async/await throughout the call chain
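The concurrency gain from `asyncio.gather` can be seen without any network access. In this sketch `asyncio.sleep` stands in for a request; `fake_fetch`, `fetch_many`, and `timed_run` are illustrative names and the 0.05 s delay is an arbitrary assumption.

```python
import asyncio
import time

async def fake_fetch(item_id, delay=0.05):
    """Stand-in for a network call: awaiting sleep yields control to the event loop."""
    await asyncio.sleep(delay)
    return {"id": item_id}

async def fetch_many(ids):
    """Schedule every fake_fetch at once and collect results in input order."""
    return await asyncio.gather(*(fake_fetch(i) for i in ids))

def timed_run(n=10):
    """Run n simulated requests and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = asyncio.run(fetch_many(range(n)))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = timed_run()
    print(f"{len(results)} results in {elapsed:.2f}s")
```

Ten sequential waits of 0.05 s would take about half a second; run through `gather`, the waits overlap on a single thread.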
Fix 3: Python 3.13 Free-Threaded Mode (No GIL)
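# Python 3.13 added an experimental free-threaded build that runs without the GIL (PEP 703)
# Install free-threaded Python 3.13:
# pyenv install 3.13t (t = free-threaded build)

# Check if GIL is active (sys._is_gil_enabled exists on Python 3.13+ only):
import sys
print(sys._is_gil_enabled())  # False when running a free-threaded build with the GIL off

# With free-threaded Python, CPU-bound threads can run in parallel on multiple cores.
# Expect some single-threaded overhead compared with the default build.
# BUT: importing a C extension that does not declare free-threading support
# re-enables the GIL for the whole process (with a warning), and extensions
# that assume the GIL without saying so may crash or corrupt data.
# Status: experimental in 3.13; officially supported but still optional in 3.14 (PEP 779).
# Check that your dependencies ship free-threaded wheels before relying on it.

# To keep the GIL off on a free-threaded build even if an extension would re-enable it:
# PYTHON_GIL=0 python script.py    (or: python -X gil=0 script.py)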
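Since `sys._is_gil_enabled()` is missing on interpreters older than 3.13, a version-tolerant check is useful in code that has to run on several Pythons. This is a minimal sketch; `gil_status` and its return strings are hypothetical, while `sysconfig.get_config_var("Py_GIL_DISABLED")` is the build-time flag CPython exposes.

```python
import sys
import sysconfig

def gil_status():
    """Classify the running interpreter by build type and current GIL state."""
    # Truthy only on free-threaded builds; 0 or None elsewhere.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Older interpreters lack sys._is_gil_enabled and always have the GIL.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()

    if not free_threaded_build:
        return "default build (GIL)"
    if is_gil_enabled:
        return "free-threaded build (GIL re-enabled)"
    return "free-threaded build (GIL off)"

if __name__ == "__main__":
    print(gil_status())
```

The middle case matters in practice: a free-threaded interpreter can still end up running with the GIL on, so checking only the build flag is not enough.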
Decision Tree: Which Concurrency Tool to Use
- 🔵 I/O-bound + many connections → asyncio + aiohttp/aiofiles
- 🟢 I/O-bound + existing sync code → ThreadPoolExecutor
- 🔴 CPU-bound → ProcessPoolExecutor or multiprocessing.Pool
- 🟡 CPU-bound + shared memory → multiprocessing.shared_memory or numpy with C extensions
- ⚪ Simple background task → threading.Thread (fine for I/O)
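The two executor branches of the list above can be folded into one small helper, since `ThreadPoolExecutor` and `ProcessPoolExecutor` share the same interface. This is a sketch under assumptions: `pick_executor`, `run_all`, and the `"io"`/`"cpu"` labels are made up for illustration, and the caller is trusted to classify the workload correctly.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def pick_executor(workload, max_workers=None):
    """Return a thread pool for 'io' workloads and a process pool for 'cpu' workloads."""
    if workload == "io":
        return ThreadPoolExecutor(max_workers=max_workers)
    if workload == "cpu":
        return ProcessPoolExecutor(max_workers=max_workers)
    raise ValueError(f"unknown workload: {workload!r}")

def run_all(func, items, workload, max_workers=None):
    """Map func over items on the chosen executor and return results in input order."""
    with pick_executor(workload, max_workers) as executor:
        return list(executor.map(func, items))

if __name__ == "__main__":
    # Threads are enough here; for workload="cpu", func must be picklable.
    print(run_all(len, ["a", "bb", "ccc"], workload="io"))
```

Keeping the choice in one place makes it cheap to switch a pipeline from threads to processes once profiling shows it is CPU-bound.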
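The memory efficiency concepts from Python __slots__ matter even more when using multiprocessing. With the fork start method, child processes share the parent’s memory copy-on-write, but reference-count updates gradually force pages to be copied; with spawn, each worker starts a fresh interpreter and receives its inputs by pickling. Either way, smaller objects mean faster startup and lower total memory. For deploying Python concurrency on serverless, see the AWS Lambda cold start guide: Lambda runs concurrent invocations in separate execution environments, which is loosely analogous to multiprocessing.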