Rust Programming: Strings
Rust has a strong emphasis on memory safety, and this extends to how it handles strings. Unlike some other languages, Rust has a few different types for working with strings, each with its own characteristics and use cases. This document will cover the core concepts.
1. String vs. &str
This is the most fundamental distinction to understand.
- String: A growable, heap-allocated, owned string. Think of it as a vector of bytes that is guaranteed to be valid UTF-8. You own the data, so its memory is freed automatically when the String goes out of scope. A String can be modified when it is declared with let mut.
- &str: A string slice. It is a reference to a sequence of UTF-8 encoded bytes. It doesn't own the data; it borrows it from somewhere else (like a String or a string literal). A &str is a read-only view: you cannot modify the underlying data through it.
Analogy:
Imagine a house.
- String is like owning the house. You can renovate it, add rooms, etc.
- &str is like renting a room in the house. You can look at it, but you can't change the house itself.
Example:
fn main() {
    // String (owned)
    let mut s = String::from("hello");
    s.push_str(", world!"); // Mutable: can be modified
    println!("{}", s); // Output: hello, world!

    // &str (string slice, borrowed)
    let greeting: &str = "hello"; // A string literal is a &str
    println!("{}", greeting); // Output: hello

    let part = &s[0..5]; // Create a &str slice from the String
    println!("{}", part); // Output: hello
}
2. Creating Strings
- String literals: String literals (e.g., "hello") are of type &str.
- String::from(): Creates a String from a string literal or another &str.
- String::new(): Creates an empty String.
- to_string() method: Many types have a to_string() method that converts them to a String.
- String::with_capacity(): Creates a String with a pre-allocated capacity, which can improve performance if you know the approximate size of the string beforehand.
fn main() {
    let s1 = String::from("hello");
    let s2: String = "world".to_string();
    let s3 = String::new();
    let s4 = String::with_capacity(10); // Pre-allocate space for at least 10 bytes
}
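To make the effect of pre-allocation more concrete, here is a minimal sketch that reserves the exact number of bytes before building a string. The helper name `build_csv` is hypothetical and exists only for illustration; the only library guarantee relied on is that `String::with_capacity(n)` reserves at least `n` bytes while the length stays zero.

```rust
/// Joins the items with commas, reserving the needed bytes up front.
/// `build_csv` is a hypothetical helper used only for illustration.
fn build_csv(items: &[&str]) -> String {
    // Total bytes: every item plus one comma between neighbours.
    let needed: usize =
        items.iter().map(|s| s.len()).sum::<usize>() + items.len().saturating_sub(1);
    let mut out = String::with_capacity(needed);
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        out.push_str(item);
    }
    out
}

fn main() {
    let empty = String::with_capacity(10);
    // Capacity is reserved, but the string still holds no bytes.
    println!("len = {}", empty.len()); // len = 0
    println!("capacity >= 10: {}", empty.capacity() >= 10); // capacity >= 10: true

    let csv = build_csv(&["a", "bb", "ccc"]);
    println!("{}", csv); // a,bb,ccc
}
```

Whether this is worth doing depends on the workload: for short strings built once, the difference is unlikely to be noticeable.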
3. String Manipulation
Rust provides various methods for manipulating strings.
- push_str(): Appends a &str to a String.
- push(): Appends a single character to a String.
- insert_str(): Inserts a &str at a given byte index.
- insert(): Inserts a character at a given byte index.
- remove(): Removes and returns the character at a given byte index.
- pop(): Removes and returns the last character of a String, or None if it is empty.
- replace(): Returns a new String with all occurrences of a substring replaced by another substring.
- trim(): Returns a slice with leading and trailing whitespace removed.
- split(): Splits a string into an iterator of substrings based on a delimiter.
- contains(): Checks if a string contains a substring.
- starts_with() / ends_with(): Checks if a string starts or ends with a substring.
fn main() {
    let mut s = String::from("hello");
    s.push_str(", world!");
    println!("{}", s); // Output: hello, world!

    s.insert_str(5, " beautiful");
    println!("{}", s); // Output: hello beautiful, world!

    // Byte index 5 now holds the space that insert_str placed before "beautiful".
    let removed_char = s.remove(5);
    println!("Removed char: {:?}", removed_char); // Output: Removed char: ' '
    println!("{}", s); // Output: hellobeautiful, world!

    let trimmed = s.trim(); // No leading or trailing whitespace here, so nothing changes
    println!("{}", trimmed); // Output: hellobeautiful, world!

    for part in s.split(", ") {
        println!("{}", part);
    }
    // Output:
    // hellobeautiful
    // world!

    if s.contains("world") {
        println!("String contains 'world'");
    }
}
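The example above does not exercise every method in the list. As a rough sketch, the remaining ones (insert, push, pop, replace, starts_with, ends_with) could be used as follows. The helper `normalize_label` is a hypothetical name introduced only for this illustration.

```rust
/// Trims surrounding whitespace and swaps inner spaces for underscores.
/// `normalize_label` is a hypothetical helper used only for illustration.
fn normalize_label(raw: &str) -> String {
    // trim() borrows a sub-slice; replace() allocates a new String.
    raw.trim().replace(' ', "_")
}

fn main() {
    let mut s = String::from("ello");
    s.insert(0, 'h'); // Insert a char at byte index 0
    s.push('!'); // Append a single char
    println!("{}", s); // hello!

    let last = s.pop(); // Option<char>: None if the string is empty
    println!("{:?}", last); // Some('!')

    println!("{}", s.starts_with("he")); // true
    println!("{}", s.ends_with("lo")); // true

    println!("{}", normalize_label("  my label  ")); // my_label
}
```

Note that trim() and replace() leave the original string untouched, whereas insert(), push(), and pop() modify the String in place and therefore need a let mut binding.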
4. String Formatting
Rust provides powerful string formatting capabilities using the format! macro.
fn main() {
    let name = "Alice";
    let age = 30;
    let message = format!("My name is {} and I am {} years old.", name, age);
    println!("{}", message); // Output: My name is Alice and I am 30 years old.

    // Using named arguments
    let message2 = format!("{name} is {age} years old.", name = "Bob", age = 25);
    println!("{}", message2); // Output: Bob is 25 years old.
}
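Beyond plain placeholders, format! also accepts format specifiers for width, alignment, precision, and radix. The sketch below shows a few common ones; `format_row` is a hypothetical helper introduced for illustration, and the column widths are arbitrary choices.

```rust
/// Formats one table row: name left-aligned in 8 columns,
/// price right-aligned in 8 columns with two decimals.
/// `format_row` is a hypothetical helper used only for illustration.
fn format_row(name: &str, price: f64) -> String {
    format!("{:<8}|{:>8.2}", name, price)
}

fn main() {
    println!("{}", format_row("tea", 3.5)); // tea     |    3.50
    println!("{:05}", 42); // 00042 (zero-padded to width 5)
    println!("{:#x}", 255); // 0xff (hexadecimal with prefix)
    println!("{:?}", "hi"); // "hi" (Debug output keeps the quotes)
}
```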
5. UTF-8 and Characters
Rust strings are UTF-8 encoded, meaning they can represent characters from any language. However, this also means that a single character might be represented by multiple bytes.
- chars(): Returns an iterator over the characters in a string. This is the correct way to iterate over characters, as it handles multi-byte characters correctly.
- bytes(): Returns an iterator over the bytes in a string.
- len(): Returns the number of bytes in a string, not the number of characters.
- chars().count(): Returns the number of characters in a string.
fn main() {
    let s = "你好,世界!"; // Four Chinese characters plus ASCII ',' and '!'
    // Each Chinese character takes 3 bytes; ',' and '!' take 1 byte each.
    println!("Length in bytes: {}", s.len()); // Output: Length in bytes: 14
    println!("Length in chars: {}", s.chars().count()); // Output: Length in chars: 6

    for c in s.chars() {
        println!("{}", c);
    }
    // Output:
    // 你
    // 好
    // ,
    // 世
    // 界
    // !
}
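Because byte offsets and character positions differ, slicing needs some care. The following sketch shows one possible way to take the first n characters safely using char_indices(), along with is_char_boundary() and get(), which return a result instead of panicking. The helper name `first_chars` is hypothetical.

```rust
/// Returns the first `n` characters of `s` as a borrowed slice.
/// `first_chars` is a hypothetical helper used only for illustration.
fn first_chars(s: &str, n: usize) -> &str {
    // char_indices() yields (byte_offset, char); the byte offset of
    // the char at position n is where the slice has to end.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // n or fewer chars: return everything
    }
}

fn main() {
    let s = "你好, world";
    println!("{}", first_chars(s, 2)); // 你好

    // Byte offset 1 falls inside the 3-byte encoding of '你'.
    println!("{}", s.is_char_boundary(1)); // false

    // get() returns None where &s[0..1] would panic.
    println!("{:?}", s.get(0..1)); // None
    println!("{:?}", s.get(0..3)); // Some("你")
}
```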
6. Converting Between String and &str
- String to &str: Borrow the String with the reference operator (&). Deref coercion turns the resulting &String into a &str wherever a &str is expected.
    let s = String::from("hello");
    let slice: &str = &s; // Borrowing the String as a &str
- &str to String: Use to_string() or String::from(). Both allocate a new String and copy the bytes.
    let slice: &str = "world";
    let s: String = slice.to_string(); // or String::from(slice);
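For reference, here is a short sketch collecting the conversions in both directions. All of the methods used (as_str, to_owned, into) are part of the standard library; the variable names are arbitrary.

```rust
fn main() {
    let owned = String::from("hello");

    // String -> &str: three equivalent borrows, none of which copies data.
    let a: &str = &owned; // Deref coercion from &String
    let b: &str = owned.as_str(); // Explicit method
    let c: &str = &owned[..]; // Full-range slice
    println!("{} {} {}", a, b, c); // hello hello hello

    // &str -> String: each of these allocates and copies the bytes.
    let slice: &str = "world";
    let d: String = slice.to_string();
    let e: String = slice.to_owned();
    let f: String = slice.into();
    let g: String = String::from(slice);
    println!("{} {} {} {}", d, e, f, g); // world world world world
}
```

Which spelling to use for &str to String is largely a matter of style; all four produce the same result.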
Best Practices
- Prefer &str when possible: If you only need to read a string, accept a &str to avoid unnecessary allocations and ownership transfers.
- Use String::with_capacity() when appropriate: If you know the approximate size of the string beforehand, pre-allocating capacity can improve performance.
- Be mindful of UTF-8: Use chars() to iterate over characters. Be careful when slicing a string with byte ranges, because a range that starts or ends inside a multi-byte character causes a panic.
- Consider using string interning: For frequently used strings, consider using a string interning library to reduce memory usage.
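As an illustration of the first recommendation, a function that takes &str can be called with a literal, a borrowed String, or a sub-slice alike. This is a minimal sketch; `word_count` is a hypothetical function introduced only for this example.

```rust
/// Counts whitespace-separated words. Taking `&str` lets callers pass
/// a literal, a borrowed String, or a sub-slice without allocating.
/// `word_count` is a hypothetical helper used only for illustration.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let owned = String::from("the quick brown fox");

    println!("{}", word_count("hello world")); // 2 (string literal)
    println!("{}", word_count(&owned)); // 4 (&String coerces to &str)
    println!("{}", word_count(&owned[4..9])); // 1 (sub-slice "quick")

    // `owned` was only borrowed, so it is still usable here.
    println!("{}", owned); // the quick brown fox
}
```

Had the parameter been declared as String instead, each call would have required the caller to give up ownership or clone the data first.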
This covers the essential aspects of working with strings in Rust. Remember to choose the appropriate string type (String or &str) based on your needs and to be mindful of UTF-8 encoding. The Rust documentation provides more detailed information and advanced features: https://doc.rust-lang.org/std/string/struct.String.html (String) and https://doc.rust-lang.org/std/primitive.str.html (str).