Threads | Concurrency Tutorial

Rust Concurrency: Threads

Rust provides excellent support for concurrency, allowing you to write programs that can perform multiple tasks seemingly simultaneously. Threads are a fundamental building block for concurrent programming. This document will cover the basics of using threads in Rust.

What are Threads?

A thread is an independent sequence of instructions that can run concurrently with other threads within the same process. They share the same memory space, which allows for efficient communication but also introduces the potential for data races and other concurrency issues.

Creating Threads

Rust's standard library provides the std::thread module for working with threads. The primary way to create a thread is using the thread::spawn function.

use std::thread;
use std::time::Duration;

fn main() {
    // Spawn a new thread
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("Spawned thread: {}", i);
            thread::sleep(Duration::from_millis(1)); // Simulate some work
        }
    });

    // Main thread
    for i in 1..5 {
        println!("Main thread: {}", i);
        thread::sleep(Duration::from_millis(1)); // Simulate some work
    }

    // Wait for the spawned thread to finish
    handle.join().unwrap(); // `join()` blocks until the thread completes
    println!("Spawned thread finished.");
}

Explanation:

use std::thread;: Imports the thread module.
use std::time::Duration;: Imports the Duration type for pausing execution.
thread::spawn(|| { ... });: This is the core of thread creation.
- thread::spawn takes a closure (an anonymous function) as an argument. This closure contains the code that will be executed in the new thread.
- The || syntax defines a closure that takes no arguments.
handle: thread::spawn returns a JoinHandle. This handle allows you to wait for the thread to finish and retrieve its result (if any).
handle.join().unwrap();: This line is crucial.
- join() blocks the current thread (in this case, the main thread) until the spawned thread completes.
- unwrap() handles the potential error that join() might return if the spawned thread panicked. In a production environment, you'd want to handle this error more gracefully.
thread::sleep(Duration::from_millis(1));: This pauses the current thread for a specified duration. It's used here to make the output more interleaved and demonstrate concurrency.

Moving Data to Threads

Threads can access data from the main thread, but there are important considerations. Rust's ownership and borrowing rules apply to threads as well. You can't simply pass references to data that might be dropped while the thread is running. Instead, you need to move ownership of the data into the thread.

use std::thread;

fn main() {
    let message = String::from("Hello from the main thread!");

    let handle = thread::spawn(move || {
        println!("Thread received: {}", message);
        // `message` is now owned by this thread.
    });

    // The following line would cause a compile error because `message` has been moved.
    // println!("Main thread: {}", message);

    handle.join().unwrap();
}

Explanation:

move || { ... }: The move keyword before the closure forces the closure to take ownership of any variables it uses from the surrounding environment. In this case, it takes ownership of message.
After the thread::spawn call, message is no longer valid in the main thread because its ownership has been transferred to the spawned thread. Attempting to use message in the main thread will result in a compile-time error.

Sharing Data Between Threads: `Mutex` and `Arc`

When multiple threads need to access and modify the same data, you need to ensure that access is synchronized to prevent data races. Rust provides Mutex (Mutual Exclusion) and Arc (Atomic Reference Counting) for this purpose.

Mutex<T>: Provides exclusive access to the data it wraps. Only one thread can hold the lock on a Mutex at a time. Other threads attempting to acquire the lock will block until it's released.
Arc<T>: Allows multiple threads to safely share ownership of data. It uses atomic operations to track the number of owners. The data is only dropped when the last owner goes out of scope.

use std::sync::{Mutex, Arc};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0)); // Shared counter
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter); // Clone the Arc to share ownership
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap(); // Acquire the lock
            *num += 1; // Increment the counter
            // Lock is automatically released when `num` goes out of scope
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap()); // Print the final result
}

Explanation:

Arc::new(Mutex::new(0)): Creates a new Mutex containing an integer initialized to 0, and then wraps it in an Arc to allow shared ownership.
Arc::clone(&counter): Clones the Arc, increasing the reference count. Each thread gets its own Arc pointing to the same underlying Mutex.
counter.lock().unwrap(): Acquires the lock on the Mutex. This blocks the current thread until the lock is available. unwrap() handles potential errors (e.g., if the Mutex is poisoned due to a panic).
*num += 1: Increments the value inside the Mutex.
The lock is automatically released when num goes out of scope (at the end of the closure).
println!("Result: {}", *counter.lock().unwrap());: Acquires the lock one last time to print the final value of the counter.

Channels

Channels provide a way for threads to communicate by sending messages to each other. Rust's std::sync::mpsc module provides multiple producer, single consumer (mpsc) channels.

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel(); // Create a channel

    thread::spawn(move || {
        let val = String::from("Hello from the thread!");
        tx.send(val).unwrap(); // Send a message
    });

    let received = rx.recv().unwrap(); // Receive a message
    println!("Got: {}", received);
}

Explanation:

let (tx, rx) = mpsc::channel();: Creates a new channel.
- tx (transmitter) is used to send messages.
- rx (receiver) is used to receive messages.
tx.send(val).unwrap();: Sends a message val through the channel.
rx.recv().unwrap();: Receives a message from the channel. This blocks the current thread until a message is available.

Common Pitfalls

Data Races: Occur when multiple threads access the same data concurrently, and at least one of them is modifying it, without proper synchronization. Use Mutex or other synchronization primitives to prevent data races.
Deadlocks: Occur when two or more threads are blocked indefinitely, waiting for each other to release resources. Carefully design your locking strategy to avoid deadlocks.
Panics in Threads: If a thread panics, it will unwind and potentially drop resources. Consider using catch_unwind to handle panics in threads gracefully.
Moving Data Incorrectly: Ensure you move ownership of data into threads when necessary, or use Arc and Mutex to share data safely.

Resources

Rust Book - Concurrency: https://doc.rust-lang.org/book/ch16-00-concurrency.html
Rust by Example - Threads: https://doc.rust-lang.org/rust-by-example/concurrency/threads.html
Rust Standard Library - std::thread: [https://doc.rust-lang.org/std/thread/](https://doc.rust-lang.org/std

Rust Concurrency: Threads

What are Threads?

Creating Threads

Moving Data to Threads

Sharing Data Between Threads: Mutex and Arc

Channels

Common Pitfalls

Resources

Sharing Data Between Threads: `Mutex` and `Arc`