Python Sets: A Comprehensive Guide

Sets are a fundamental data structure in Python, offering a unique way to store collections of items. They are unordered, mutable, and, most importantly, do not allow duplicate elements. This makes them incredibly useful for tasks like removing duplicates from a list, performing mathematical set operations (union, intersection, difference), and checking for membership efficiently.

1. Creating Sets

There are several ways to create sets in Python:

  • Using curly braces {}: This is the most common method.

    my_set = {1, 2, 3, 4, 5}
    print(my_set)  # Output: {1, 2, 3, 4, 5} (order may vary)
    
    # Empty set - IMPORTANT:  {} creates an empty dictionary, not an empty set!
    empty_set = set()  # Correct way to create an empty set
    print(empty_set)  # Output: set()
    
  • Using the set() constructor: This is useful for converting other iterable data types (like lists, tuples, strings) into sets.

    my_list = [1, 2, 2, 3, 4, 4, 5]
    my_set = set(my_list)
    print(my_set)  # Output: {1, 2, 3, 4, 5} (duplicates removed)
    
    my_string = "hello"
    my_set = set(my_string)
    print(my_set)  # Output: {'h', 'e', 'l', 'o'} (order may vary)
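
The empty-literal pitfall and the constructor's handling of other iterables can be checked directly. The following is a minimal sketch; the variable names are illustrative and not part of the guide above.

```python
# {} is an empty dict; only set() produces an empty set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable, including tuples and ranges.
from_tuple = set((1, 1, 2))
from_range = set(range(3))
print(from_tuple == {1, 2})     # True
print(from_range == {0, 1, 2})  # True
```

Comparing against a set literal, rather than printing the set itself, keeps the checks independent of element order.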
    

2. Key Characteristics of Sets

  • Unordered: Elements in a set have no specific order. You cannot access elements by index.

  • Mutable: You can add or remove elements from a set after it's created.

  • Unique Elements: Sets automatically eliminate duplicate values. If you try to add a duplicate, it's ignored.

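  • Elements must be hashable: A set can only contain hashable objects, such as numbers, strings, and tuples whose items are all hashable. Mutable built-in containers (lists, dictionaries, and sets themselves) cannot be elements of a set; use a frozenset when you need a set of sets.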

    # Valid set
    valid_set = {1, "hello", (1, 2)}
    
    # Invalid set (list is mutable)
    # invalid_set = {1, [2, 3]}  # TypeError: unhashable type: 'list'
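
The requirement is really about hashability rather than immutability alone, which the following sketch tries to illustrate. It assumes standard CPython behavior; the names bad, error_message, and nested are made up for the example.

```python
# A tuple is hashable only if everything inside it is hashable.
print(hash((1, 2)) == hash((1, 2)))  # True

error_message = None
try:
    bad = {(1, [2, 3])}  # a tuple that contains a list
except TypeError as exc:
    error_message = str(exc)
print(error_message is not None)  # True

# A set cannot contain another set, but it can contain a frozenset.
nested = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in nested)  # True
```

frozenset is the immutable, hashable counterpart of set, so it is the usual choice when sets need to be stored inside other sets or used as dictionary keys.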
    

3. Common Set Operations

  • Adding Elements:

    • add(element): Adds a single element to the set.
    • update(iterable): Adds multiple elements from an iterable (list, tuple, set, etc.) to the set.

    my_set = {1, 2, 3}
    my_set.add(4)
    print(my_set)  # Output: {1, 2, 3, 4}
    
    my_set.update([5, 6, 7])
    print(my_set)  # Output: {1, 2, 3, 4, 5, 6, 7}
    
    my_set.update({8, 9})
    print(my_set) # Output: {1, 2, 3, 4, 5, 6, 7, 8, 9}
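
One difference between add() and update() that is easy to miss shows up with strings: add() stores the string as a single element, while update() iterates over it. A short sketch with illustrative data:

```python
tags = {"python"}

# add() treats its argument as one element.
tags.add("sets")
print(tags == {"python", "sets"})  # True

# update() iterates over its argument, so a string is split into characters.
letters = set()
letters.update("abc")
print(letters == {"a", "b", "c"})  # True

# update() also accepts several iterables in one call.
numbers = {1}
numbers.update([2, 3], (4,), {5})
print(numbers == {1, 2, 3, 4, 5})  # True
```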
    
  • Removing Elements:

    • remove(element): Removes a specific element. Raises a KeyError if the element is not found.
    • discard(element): Removes a specific element if it exists. Does not raise an error if the element is not found.
    • pop(): Removes and returns an arbitrary element from the set. Raises a KeyError if the set is empty.
    • clear(): Removes all elements from the set.

    my_set = {1, 2, 3, 4}
    my_set.remove(3)
    print(my_set)  # Output: {1, 2, 4}
    
    my_set.discard(5)  # No error, as 5 is not in the set
    print(my_set)  # Output: {1, 2, 4}
    
    element = my_set.pop()
    print(element)  # Output: (e.g., 1) - arbitrary element
    print(my_set)  # Output: (e.g., {2, 4})
    
    my_set.clear()
    print(my_set)  # Output: set()
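
The choice between remove() and discard() comes down to whether a missing element should count as an error. The sketch below contrasts the two and also drains a set with pop(); the inventory data is hypothetical.

```python
inventory = {"apple", "pear"}

# remove() signals a missing element with KeyError.
try:
    inventory.remove("kiwi")
    removed = True
except KeyError:
    removed = False
print(removed)  # False

# discard() leaves the set unchanged and raises nothing.
inventory.discard("kiwi")
print(inventory == {"apple", "pear"})  # True

# pop() can drain a set; the loop stops before pop() would hit an empty set.
drained = []
while inventory:
    drained.append(inventory.pop())
print(sorted(drained))  # ['apple', 'pear']
print(inventory)        # set()
```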
    
  • Set Operations:

    • union(other_set) or |: Returns a new set containing all elements from both sets.
    • intersection(other_set) or &: Returns a new set containing only the elements common to both sets.
    • difference(other_set) or -: Returns a new set containing elements present in the first set but not in the second set.
    • symmetric_difference(other_set) or ^: Returns a new set containing elements present in either set, but not in both.
    • isdisjoint(other_set): Returns True if the sets have no elements in common, False otherwise.
    • issubset(other_set): Returns True if all elements of the first set are also in the second set, False otherwise.
    • issuperset(other_set): Returns True if all elements of the second set are also in the first set, False otherwise.

    set1 = {1, 2, 3, 4}
    set2 = {3, 4, 5, 6}
    
    print(set1.union(set2))  # Output: {1, 2, 3, 4, 5, 6}
    print(set1 | set2)       # Output: {1, 2, 3, 4, 5, 6}
    
    print(set1.intersection(set2))  # Output: {3, 4}
    print(set1 & set2)       # Output: {3, 4}
    
    print(set1.difference(set2))  # Output: {1, 2}
    print(set1 - set2)       # Output: {1, 2}
    
    print(set1.symmetric_difference(set2))  # Output: {1, 2, 5, 6}
    print(set1 ^ set2)       # Output: {1, 2, 5, 6}
    
    print(set1.isdisjoint(set2))  # Output: False
    
    set3 = {1, 2}
    print(set3.issubset(set1))  # Output: True
    print(set1.issuperset(set3))  # Output: True
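
The methods and the operators are not fully interchangeable: the methods accept any iterable, whereas the operators require both operands to be sets. The sketch below shows that difference, along with the comparison and in-place operator forms that the built-in set type also provides. The flag operator_accepts_list is a name invented for this example.

```python
set1 = {1, 2, 3, 4}

# Methods accept any iterable, such as a list.
print(set1.union([4, 5]) == {1, 2, 3, 4, 5})  # True

# Operators require both operands to be sets.
try:
    set1 | [4, 5]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False

# <= and >= mirror issubset() and issuperset(); < tests for a proper subset.
print({1, 2} <= set1)  # True
print(set1 < set1)     # False

# In-place forms modify the left-hand set instead of building a new one.
set1 &= {3, 4, 5}
print(set1 == {3, 4})  # True
```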
    

4. Set Comprehension

Similar to list comprehensions, you can create sets using set comprehensions:

    numbers = [1, 2, 3, 4, 5]
    squared_even_numbers = {x**2 for x in numbers if x % 2 == 0}
    print(squared_even_numbers)  # Output: {4, 16}
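
Because a set comprehension deduplicates the transformed values, it is a convenient way to normalize and deduplicate in one step. A small sketch with made-up email addresses:

```python
emails = ["Ann@Example.com", "ann@example.com", "bob@example.com "]

# Strip whitespace and lowercase each entry; duplicates collapse automatically.
unique_emails = {e.strip().lower() for e in emails}
print(sorted(unique_emails))  # ['ann@example.com', 'bob@example.com']
```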

5. When to Use Sets

  • Removing Duplicates: Quickly eliminate duplicate values from a collection.
  • Membership Testing: Checking if an element exists in a collection is very efficient with sets (O(1) on average).
  • Mathematical Set Operations: Performing union, intersection, difference, etc.
  • Data Analysis: Identifying unique values in a dataset.
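
Duplicate removal and membership testing often appear together. One common pattern, sketched below, uses a set for fast "seen before?" checks while a list keeps the original order. The helper unique_in_order is a hypothetical name, not a built-in.

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first occurrences."""
    seen = set()   # average O(1) membership checks
    result = []    # preserves the order of first appearance
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

Calling set() on the list would also remove the duplicates, but it would not guarantee the original order.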

6. Important Considerations

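  • Hashability: Elements in a set must be hashable. Immutable built-in types such as numbers, strings, and frozensets qualify; a tuple qualifies only if all of its items are hashable.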
  • Order: Sets are unordered, so you cannot rely on the order of elements.
  • Performance: Membership tests, insertions, and removals on a set take O(1) time on average. If element order or access by index matters, use a list or tuple instead.
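
When a predictable order is needed for display or comparison, one option is to sort the set into a list at the point of use. A minimal sketch, assuming the elements are mutually comparable:

```python
colors = {"red", "green", "blue"}

# sorted() returns a new list; the set itself stays unordered.
ordered = sorted(colors)
print(ordered)  # ['blue', 'green', 'red']
```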

This guide provides a solid foundation for understanding and using sets in Python. Experiment with these concepts and explore the documentation for more advanced features and use cases.