36. Concurrency – Threading & Multiprocessing
Concurrency lets a Python program make progress on several tasks during the same period, which can improve throughput and responsiveness. This section covers two of the standard library's concurrency mechanisms: threading and multiprocessing. Understanding their differences, use cases, and limitations is essential for writing performant, scalable applications.
Threading
Threading runs multiple threads of execution within a single process. Threads share the same memory space, which makes communication between them easy but introduces risks such as race conditions. In standard CPython builds, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so threads do not speed up CPU-bound work. The GIL is released during blocking I/O, which is why threading remains useful for I/O-bound operations.
Syntax and Example:
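import threading

def worker():
    # Runs in the new thread
    print('Worker thread running')

t = threading.Thread(target=worker)
t.start()  # begin executing worker() in a separate thread
t.join()   # block until the thread has finished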
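Because threads share memory, unsynchronized updates to the same variable can interleave and lose writes. The following is a minimal sketch of guarding shared state with threading.Lock. The names counter, counter_lock, increment, and run_increments are illustrative and not part of any library.

```python
import threading

counter = 0                      # shared state visible to every thread
counter_lock = threading.Lock()  # guards all updates to counter

def increment(times):
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread at a time runs this block
            counter += 1

def run_increments(n_threads=4, times=10_000):
    """Reset the counter, run n_threads workers, and return the final value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # wait for every worker before reading the result
    return counter

print(run_increments())  # 4 threads x 10,000 increments
```

With the lock in place, the final value equals n_threads * times. Without it, the read-modify-write in counter += 1 can interleave between threads and the total may come out lower.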
Multiprocessing
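Multiprocessing creates separate processes, each with its own memory space and its own interpreter, allowing true parallelism across CPU cores. Because each process has its own GIL, multiprocessing sidesteps the GIL and is well suited to CPU-bound tasks. However, starting a process costs more than starting a thread, and inter-process communication is more complex and resource-intensive than sharing data between threads, because objects must be serialized (pickled) to cross process boundaries.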
Syntax and Example:
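import multiprocessing

def worker():
    print('Worker process running')

# The guard keeps child processes from re-running this block when they
# import the module (needed with the "spawn" start method, e.g. on Windows).
if __name__ == '__main__':
    p = multiprocessing.Process(target=worker)
    p.start()  # launch the child process
    p.join()   # block until the child process exits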
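For CPU-bound work split across many inputs, a pool of worker processes is often more convenient than managing Process objects by hand. The sketch below uses multiprocessing.Pool. The functions sum_of_squares and parallel_sums are hypothetical examples, and the pool size of 2 is an arbitrary choice.

```python
import multiprocessing

def sum_of_squares(n):
    """CPU-bound work: sum i*i for i in range(n)."""
    return sum(i * i for i in range(n))

def parallel_sums(inputs, processes=2):
    """Compute sum_of_squares for each input in a pool of worker processes."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() pickles each input, sends it to a worker, and
        # returns the results in the same order as the inputs.
        return pool.map(sum_of_squares, inputs)

if __name__ == "__main__":
    print(parallel_sums([10, 100, 1000]))  # [285, 328350, 332833500]
```

The worker function is defined at module level so that it can be pickled and sent to the child processes.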
Comparison of Threading vs Multiprocessing
Aspect | Threading | Multiprocessing |
Memory Space | Shared | Separate |
Parallelism | Limited by GIL | True parallelism |
Best for | I/O-bound tasks | CPU-bound tasks |
Communication | Easy | Complex |
Overhead | Lightweight | Heavyweight |
Use Cases
- Use threading for tasks like web scraping, network operations, or file I/O.
- Use multiprocessing for data processing, image manipulation, or machine learning workloads.
Performance Considerations
Threading is lightweight and suitable for tasks that wait on external resources. Multiprocessing is heavier but can utilize multiple CPU cores effectively. Developers should profile their applications to choose the right concurrency model.
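A quick timing comparison is a simple first step before reaching for a full profiler. The sketch below assumes time.sleep as a stand-in for a real network or disk wait, and the helper names fake_io, run_sequential, and run_threaded are illustrative only.

```python
import threading
import time

def fake_io(delay):
    """Stand-in for a network or disk wait; time.sleep releases the GIL."""
    time.sleep(delay)

def run_sequential(n, delay):
    """Run n simulated I/O calls one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    """Run n simulated I/O calls in n threads; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print(f"sequential: {run_sequential(4, 0.1):.2f}s")
print(f"threaded:   {run_threaded(4, 0.1):.2f}s")
```

The sequential run takes roughly n * delay, while the threaded run takes roughly one delay because the waits overlap. Replacing fake_io with a CPU-bound function would typically remove that advantage on a standard CPython build.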
Best Practices
- Avoid shared state in threading unless using synchronization primitives.
- Use queues for communication between threads or processes.
- Handle exceptions in threads and processes gracefully.
- Use concurrent.futures for simplified concurrency management.
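The last two practices can be combined: concurrent.futures re-raises a worker's exception when Future.result() is called, which gives one place to handle failures. The following is a minimal sketch; parse_port and parse_all are hypothetical names, and the port-parsing task is only a placeholder for real work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def parse_port(text):
    """Hypothetical task: convert text to a port number, raising ValueError on bad input."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port

def parse_all(items, max_workers=4):
    """Run parse_port on each item in a thread pool; collect results and errors separately."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the input that produced it.
        futures = {executor.submit(parse_port, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()  # re-raises the worker's exception here
            except ValueError as exc:
                errors[item] = str(exc)
    return results, errors

results, errors = parse_all(["80", "443", "http", "70000"])
print(results)
print(errors)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same submit/result interface, though the task and its arguments must then be picklable and the calling code should sit under an if __name__ == '__main__': guard.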
Common Pitfalls
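- Using threading for CPU-bound tasks and ignoring the GIL, which yields little or no speedup.
- Not joining threads or processes, so results are read before the work has finished or child processes are left running.
- Spawning too many processes, where startup and memory overhead outweigh the gains.
- Deadlocks caused by improper synchronization, such as acquiring locks in an inconsistent order.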
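One common way to avoid the deadlock pitfall is to make every thread acquire locks in the same global order. The sketch below illustrates this with a hypothetical Account class and transfer function. Ordering by account name is an assumption chosen for the example, and any consistent ordering would work.

```python
import threading

class Account:
    """Hypothetical account with its own lock."""
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Move amount from src to dst, always locking the accounts in name order."""
    first, second = sorted((src, dst), key=lambda acct: acct.name)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def repeat_transfer(src, dst, amount, rounds):
    for _ in range(rounds):
        transfer(src, dst, amount)

def run_transfers(rounds=1000):
    """Run opposing transfers in two threads and return the final balances."""
    a, b = Account("a", 100), Account("b", 100)
    t1 = threading.Thread(target=repeat_transfer, args=(a, b, 1, rounds))
    t2 = threading.Thread(target=repeat_transfer, args=(b, a, 1, rounds))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance

print(run_transfers())  # (100, 100)
```

If transfer instead locked src first and dst second, the two threads could each hold one lock while waiting for the other, and neither would ever proceed.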