Monday, August 18, 2025

Master Data Structures & Libraries in Python: Module 7 - Stacks, Queues, Linked Lists, Dictionaries, Sets, NumPy, Pandas, and Visualization

Welcome to Module 7 of our comprehensive Python course, designed to transform you from a beginner to an advanced Python programmer!

In Module 6, we explored advanced concepts like iterators, decorators, and AsyncIO, equipping you with tools for high-performance coding. Now, we dive into data structures and libraries, the backbone of efficient programming and data analysis. This module covers stacks, queues, linked lists, dictionaries in depth, sets and frozen sets, NumPy basics, Pandas for data handling, and Matplotlib/Seaborn for visualization. These topics are crucial for building robust applications, from task managers to data dashboards.

This blog is beginner-friendly yet detailed enough for intermediate and advanced learners, offering real-world scenarios, multiple code examples, pros and cons, best practices, and alternatives. Whether you're implementing a task scheduler, analyzing sales data, or visualizing trends, this guide will equip you with the skills you need. Let’s dive in!
Table of Contents
  1. Stacks, Queues, Linked Lists
    • Understanding Stacks, Queues, and Linked Lists
    • Implementing with Python
    • Real-World Applications
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Building a Task History Manager
  2. Dictionaries in Depth
    • Advanced Dictionary Operations
    • OrderedDict and DefaultDict
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Creating a Word Frequency Counter
  3. Sets & Frozen Sets
    • Set Operations and Use Cases
    • Frozen Sets for Immutability
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Managing Unique User IDs
  4. NumPy Basics
    • Introduction to NumPy Arrays
    • Array Operations and Broadcasting
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Analyzing Stock Prices
  5. Pandas for Data Handling
    • DataFrames and Series
    • Data Manipulation and Analysis
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Processing Sales Data
  6. Matplotlib/Seaborn for Visualization
    • Creating Plots with Matplotlib
    • Enhancing Visualizations with Seaborn
    • Pros, Cons, and Alternatives
    • Best Practices
    • Example: Visualizing Sales Trends
  7. Conclusion & Next Steps

1. Stacks, Queues, Linked Lists
Understanding Stacks, Queues, and Linked Lists
Data structures organize and store data efficiently:
  • Stacks: Last-In-First-Out (LIFO) structure, like a stack of plates.
  • Queues: First-In-First-Out (FIFO) structure, like a line at a store.
  • Linked Lists: Nodes linked by pointers, ideal for dynamic data.
Stack Example (using a list):
python
stack = []
stack.append(1)  # Push
stack.append(2)
print(stack.pop())  # Pop: 2
Queue Example (using collections.deque):
python
from collections import deque
queue = deque()
queue.append(1)  # Enqueue
queue.append(2)
print(queue.popleft())  # Dequeue: 1
Linked List Example:
python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
    
    def append(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
            return
        current = self.head
        while current.next:
            current = current.next
        current.next = new_node
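The LinkedList class above can append values but offers no way to read them back. The following is a minimal sketch of traversal and O(1) insertion at the head. The names IterableLinkedList, prepend, and the __iter__ method are assumptions added for illustration and are not part of the original class.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class IterableLinkedList:
    """Hypothetical variant of LinkedList with traversal and prepend."""

    def __init__(self):
        self.head = None

    def prepend(self, data):
        # O(1): the new node simply becomes the head.
        node = Node(data)
        node.next = self.head
        self.head = node

    def append(self, data):
        # O(n): walk to the tail, as in the example above.
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        current = self.head
        while current.next:
            current = current.next
        current.next = node

    def __iter__(self):
        # Yield each node's data from head to tail.
        current = self.head
        while current:
            yield current.data
            current = current.next


items = IterableLinkedList()
items.append(2)
items.append(3)
items.prepend(1)
print(list(items))  # [1, 2, 3]
```

Defining __iter__ lets the list work with for loops, list(), and other built-ins without exposing the nodes themselves.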
Real-World Applications
  • Stacks: Undo/redo functionality, browser history.
  • Queues: Task scheduling, print queues.
  • Linked Lists: Playlists, file systems.
Pros, Cons, and Alternatives
Pros:
  • Stacks and queues are simple and efficient for specific tasks.
  • Linked lists grow and shrink node by node and support O(1) insertion or deletion at a known node (reaching that node still takes O(n) traversal).
  • Python’s collections.deque provides O(1) appends and pops at both ends.
Cons:
  • Stacks and queues have limited use cases.
  • Linked lists are slower for random access compared to lists.
  • Manual linked list implementation is error-prone.
Alternatives:
  • Lists: For general-purpose sequences, but less efficient for queues.
  • Arrays (NumPy): For numerical data with fixed size.
  • Third-Party Libraries: Like llist for linked lists.
Best Practices:
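  • Use deque for queues; a plain list works well as a stack, but list.pop(0) is O(n) and makes lists a poor queue.
  • Implement linked lists only when frequent insertion/deletion inside the sequence is needed.
  • Avoid accidental cycles in linked lists (a next pointer that leads back to an earlier node), which make traversal loop forever.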
  • Test edge cases (e.g., empty structures).
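As a rough illustration of the deque and edge-case advice above, the sketch below uses a deque as a queue and handles the empty case explicitly. The helper name safe_dequeue is hypothetical and exists only for this example.

```python
from collections import deque

queue = deque(["a", "b", "c"])
first = queue.popleft()   # O(1) removal from the front
queue.append("d")         # O(1) insertion at the back


def safe_dequeue(q):
    """Hypothetical helper: return None instead of raising on an empty deque."""
    try:
        return q.popleft()
    except IndexError:
        # popleft() on an empty deque raises IndexError.
        return None


empty_result = safe_dequeue(deque())
print(first, list(queue), empty_result)  # a ['b', 'c', 'd'] None
```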
Example: Building a Task History Manager
Let’s create a task history manager using a stack to support undo functionality.
python
from collections import deque

class TaskManager:
    def __init__(self):
        self.history = deque()  # Stack for undo
    
    def add_task(self, task):
        self.history.append(task)
        return f"Added task: {task}"
    
    def undo(self):
        if self.history:
            task = self.history.pop()
            return f"Undid task: {task}"
        return "No tasks to undo."

# Test the manager
manager = TaskManager()
print(manager.add_task("Write report"))  # Output: Added task: Write report
print(manager.add_task("Send email"))    # Output: Added task: Send email
print(manager.undo())                    # Output: Undid task: Send email
Advanced Example: Linked list for task history.
python
class TaskNode:
    def __init__(self, task):
        self.task = task
        self.prev = None

class AdvancedTaskManager:
    def __init__(self):
        self.head = None
    
    def add_task(self, task):
        new_node = TaskNode(task)
        new_node.prev = self.head
        self.head = new_node
        return f"Added task: {task}"
    
    def undo(self):
        if self.head:
            task = self.head.task
            self.head = self.head.prev
            return f"Undid task: {task}"
        return "No tasks to undo."

# Test the advanced manager
manager = AdvancedTaskManager()
print(manager.add_task("Write report"))  # Output: Added task: Write report
print(manager.add_task("Send email"))    # Output: Added task: Send email
print(manager.undo())                    # Output: Undid task: Send email
This example demonstrates stacks and linked lists for a practical task history manager, supporting undo functionality.
2. Dictionaries in Depth
Advanced Dictionary Operations
Dictionaries store key-value pairs, supporting:
  • Access: dict[key]
  • Update: dict[key] = value
  • Iteration: dict.items(), dict.keys(), dict.values()
  • Merging: dict1 | dict2 (Python 3.9+)
Example:
python
user = {"name": "Alice", "age": 30}
user["email"] = "alice@example.com"
print(user.items())  # Output: dict_items([('name', 'Alice'), ('age', 30), ('email', 'alice@example.com')])
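The merge and iteration operations listed above can be sketched as follows. The settings data is made up for illustration, and dict.get() is shown as one common way to read a key that may be absent.

```python
defaults = {"theme": "light", "language": "en"}
overrides = {"theme": "dark"}

# Merge (Python 3.9+): values from the right-hand dict win on key conflicts.
settings = defaults | overrides

# dict.get() returns a fallback instead of raising KeyError.
timezone = settings.get("timezone", "UTC")

# Iterate over key-value pairs.
pairs = [f"{key}={value}" for key, value in settings.items()]

print(settings)  # {'theme': 'dark', 'language': 'en'}
print(timezone)  # UTC
print(pairs)     # ['theme=dark', 'language=en']
```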
OrderedDict and DefaultDict
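  • OrderedDict (collections.OrderedDict): Maintains insertion order. Plain dicts also preserve insertion order in Python 3.7+, so OrderedDict is now mainly useful for move_to_end(), popitem(last=False), and order-sensitive equality.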
  • DefaultDict (collections.defaultdict): Provides default values for missing keys.
Example:
python
from collections import defaultdict
word_count = defaultdict(int)
text = "apple banana apple"
for word in text.split():
    word_count[word] += 1
print(word_count)  # Output: defaultdict(<class 'int'>, {'apple': 2, 'banana': 1})
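Although plain dicts preserve insertion order, OrderedDict still has a few features of its own. Here is a small sketch of reordering and order-sensitive comparison; the file names are illustrative only.

```python
from collections import OrderedDict

recent = OrderedDict()
recent["a.txt"] = 1
recent["b.txt"] = 2
recent["c.txt"] = 3

recent.move_to_end("a.txt")          # mark a.txt as most recently used
oldest = recent.popitem(last=False)  # remove and return the oldest entry

print(list(recent))  # ['c.txt', 'a.txt']
print(oldest)        # ('b.txt', 2)

# Equality is order-sensitive for OrderedDict but not for plain dicts.
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}
print(ordered_equal, plain_equal)  # False True
```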
Pros, Cons, and Alternatives
Pros:
  • Fast key-based access (O(1) average).
  • Flexible for storing structured data.
  • defaultdict simplifies missing key handling.
Cons:
  • Memory overhead compared to lists.
  • Keys must be hashable.
  • Not ideal for ordered sequences.
Alternatives:
  • Lists/Tuples: For ordered data.
  • Sets: For unique keys without values.
  • Custom Classes: For complex data structures.
Best Practices:
  • Use descriptive keys for readability.
  • Use defaultdict for counting or grouping.
  • Pass defaultdict a factory callable (e.g., int or list), not a value, and remember that merely reading a missing key inserts it.
  • Use dictionary comprehension for concise creation.
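Two of the practices above, sketched with made-up price data: a dictionary comprehension that filters and transforms in one expression, and a defaultdict whose factory creates a fresh list for each new key.

```python
from collections import defaultdict

prices = {"laptop": 999.99, "mouse": 29.99, "cable": 9.99}

# Dictionary comprehension: keep items under 50 and round the price.
cheap = {name: round(price) for name, price in prices.items() if price < 50}
print(cheap)  # {'mouse': 30, 'cable': 10}

# defaultdict takes a factory; each missing key gets its own new list.
groups = defaultdict(list)
groups["a"].append(1)
groups["b"].append(2)
print(dict(groups))  # {'a': [1], 'b': [2]}
```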
Example: Creating a Word Frequency Counter
Let’s build a word frequency counter for text analysis.
python
from collections import defaultdict
import re

def word_frequency(text):
    """Count word frequencies in text."""
    words = re.findall(r'\w+', text.lower())
    freq = defaultdict(int)
    for word in words:
        freq[word] += 1
    return dict(freq)

# Test the counter
text = "The quick brown fox jumps over the lazy dog. The fox is quick."
print(word_frequency(text))  # Output: {'the': 3, 'quick': 2, 'brown': 1, 'fox': 2, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1, 'is': 1}
Advanced Example: Grouping words by length.
python
import re
from collections import defaultdict

def group_by_length(text):
    """Group words by their length."""
    words = re.findall(r'\w+', text.lower())
    groups = defaultdict(list)
    for word in words:
        groups[len(word)].append(word)
    return dict(groups)

# Test grouping (same sample text as the frequency counter)
text = "The quick brown fox jumps over the lazy dog. The fox is quick."
print(group_by_length(text))  # Output: {3: ['the', 'fox', 'the', 'dog', 'the', 'fox'], 5: ['quick', 'brown', 'jumps', 'quick'], 4: ['over', 'lazy'], 2: ['is']}
This example demonstrates dictionaries for text analysis, leveraging defaultdict for efficient grouping.
3. Sets & Frozen Sets
Set Operations and Use Cases
Sets store unique, hashable items, supporting:
  • Union: set1 | set2
  • Intersection: set1 & set2
  • Difference: set1 - set2
  • Add/Remove: set.add(), set.remove()
Example:
python
set1 = {1, 2, 3}
set2 = {2, 3, 4}
print(set1 | set2)  # Output: {1, 2, 3, 4}
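The remaining operations from the list above can be sketched the same way. The symmetric difference operator (^) and set.discard() are additions not mentioned in the list; discard() is shown because, unlike remove(), it does not raise when the item is missing.

```python
set1 = {1, 2, 3}
set2 = {2, 3, 4}

common = set1 & set2      # intersection
only_first = set1 - set2  # difference
either = set1 ^ set2      # symmetric difference: in one set but not both
print(common, only_first, either)  # {2, 3} {1} {1, 4}

set1.add(5)
set1.discard(99)  # no error if 99 is absent
try:
    set1.remove(99)  # remove() raises KeyError for a missing item
    removed = True
except KeyError:
    removed = False
print(removed)  # False
```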
Frozen Sets: Immutable sets, hashable for use as dictionary keys.
python
frozen = frozenset([1, 2, 3])
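A short sketch of why hashability matters: a frozenset can serve as a dictionary key, and lookups ignore element order. The permissions mapping is a made-up example.

```python
frozen = frozenset([1, 2, 3])

# Hashable, so it can be used as a dictionary key.
permissions = {frozenset(["read", "write"]): "editor"}
role = permissions[frozenset(["write", "read"])]  # element order is irrelevant
print(role)  # editor

# Immutable: frozenset has no add() or remove() methods.
try:
    frozen.add(4)
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False
```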
Pros, Cons, and Alternatives
Pros:
  • Fast membership testing (O(1) average).
  • Efficient for unique data and set operations.
  • Frozen sets enable immutability.
Cons:
  • No indexing or ordering.
  • Limited to hashable elements.
  • Less flexible than lists or dictionaries.
Alternatives:
  • Lists: For ordered, non-unique data.
  • Dictionaries: For key-value pairs.
  • NumPy Arrays: For numerical sets.
Best Practices:
  • Use sets for unique data or membership testing.
  • Use frozen sets for dictionary keys or immutable data.
  • Avoid modifying sets during iteration.
  • Use set comprehension for concise creation.
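The last two practices can be illustrated with a small sketch, using made-up email addresses and IDs: a set comprehension for deduplication, and iterating over a copy when items must be removed along the way.

```python
emails = ["A@x.com", "a@x.com", "b@x.com"]

# Set comprehension: normalize and deduplicate in one step.
unique = {email.lower() for email in emails}
print(sorted(unique))  # ['a@x.com', 'b@x.com']

ids = {1, 2, 3, 4}
# Removing from a set while iterating over it directly raises RuntimeError,
# so iterate over a snapshot (list copy) instead.
for user_id in list(ids):
    if user_id % 2 == 0:
        ids.remove(user_id)
print(ids)  # {1, 3}
```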
Example: Managing Unique User IDs
Let’s manage unique user IDs for an application.
python
def manage_users(new_users, existing_users):
    """Add new users, ensuring uniqueness."""
    existing = set(existing_users)
    new = set(new_users)
    added = new - existing
    existing.update(added)
    return sorted(existing)  # sets are unordered, so sort for a stable result

# Test user management
existing = ["user1", "user2"]
new = ["user2", "user3", "user4"]
print(manage_users(new, existing))  # Output: ['user1', 'user2', 'user3', 'user4']
Advanced Example: Using frozen sets for caching.
python
def cache_results(inputs):
    cache = {}
    for input_set in inputs:
        frozen = frozenset(input_set)
        if frozen not in cache:
            cache[frozen] = sum(input_set)
    return cache

# Test caching
inputs = [[1, 2], [2, 1], [3, 4]]
print(cache_results(inputs))  # Output: {frozenset({1, 2}): 3, frozenset({3, 4}): 7}
This example uses sets and frozen sets for managing unique data and caching, common in user management and optimization tasks.
4. NumPy Basics
Introduction to NumPy Arrays
NumPy provides efficient arrays for numerical computations, supporting:
  • Creation: np.array(), np.zeros(), np.ones()
  • Operations: Element-wise arithmetic, matrix operations
  • Broadcasting: Apply operations across arrays
Example:
python
import numpy as np

arr = np.array([1, 2, 3])
print(arr + 2)  # Output: [3 4 5]
Array Operations and Broadcasting
python
matrix = np.array([[1, 2], [3, 4]])
print(matrix * 2)
# Output:
# [[2 4]
#  [6 8]]
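Multiplying by a scalar is the simplest case of broadcasting. The sketch below, with arbitrary values, shows arrays of different shapes being combined, and what happens when the shapes are incompatible.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,): stretched across both rows
col = np.array([[100], [200]])   # shape (2, 1): stretched across all columns

print(matrix + row)
# [[11 22 33]
#  [14 25 36]]
print(matrix + col)
# [[101 102 103]
#  [204 205 206]]

# Shapes (2, 3) and (2,) are incompatible: the trailing dimensions 3 and 2 differ.
try:
    matrix + np.array([1, 2])
    shape_error = False
except ValueError:
    shape_error = True
print(shape_error)  # True
```

NumPy compares shapes from the trailing dimension backward; two dimensions are compatible when they are equal or one of them is 1.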
Pros, Cons, and Alternatives
Pros:
  • Fast, vectorized operations for numerical data.
  • Supports multidimensional arrays.
  • Broadcasting simplifies operations.
Cons:
  • Overhead for small datasets.
  • Requires installation (not built-in).
  • Less intuitive for non-numerical data.
Alternatives:
  • Lists: For small, non-numerical data.
  • Pandas: For tabular data with labels.
  • SciPy: For advanced scientific computations.
Best Practices:
  • Use NumPy for numerical computations, not general-purpose lists.
  • Leverage broadcasting to avoid loops.
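  • Treat np.vectorize as a convenience for custom scalar functions only; it is essentially a Python loop and gives no speedup over built-in vectorized operations.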
  • Check array shapes to avoid broadcasting errors.
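To make the contrast concrete, here is a sketch that applies the same rule two ways. The discount function and its threshold are hypothetical, chosen only to illustrate the pattern.

```python
import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0, 103.0])

# Built-in vectorized expression: runs in compiled code, no Python-level loop.
discounted = np.where(prices > 101, prices * 0.9, prices)


def discount(price):
    """Hypothetical scalar rule: 10% off anything above 101."""
    return price * 0.9 if price > 101 else price


# np.vectorize wraps the scalar function and calls it element by element.
looped = np.vectorize(discount)(prices)

print(np.allclose(discounted, looped))   # True
print(discounted.shape == prices.shape)  # True
```

Both produce the same values; the np.where version is generally the better choice for large arrays because it avoids calling a Python function per element.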
Example: Analyzing Stock Prices
Let’s analyze stock prices using NumPy.
python
import numpy as np

def analyze_stocks(prices):
    """Calculate stock metrics."""
    prices = np.array(prices)
    returns = np.diff(prices) / prices[:-1]
    return {
        "mean_price": float(np.mean(prices)),  # plain floats print cleanly
        "volatility": float(np.std(returns))   # standard deviation of daily returns
    }

# Test analysis
prices = [100, 102, 101, 105, 103]
print(analyze_stocks(prices))  # Output: {'mean_price': 102.2, 'volatility': 0.0234...}
This example uses NumPy for efficient stock price analysis, a common task in finance.
5. Pandas for Data Handling
DataFrames and Series
Pandas provides DataFrames (tables) and Series (columns) for data manipulation:
  • DataFrame: 2D labeled data structure.
  • Series: 1D labeled array.
Example:
python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "age": [30, 25]
})
print(df)
# Output:
#     name  age
# 0  Alice   30
# 1    Bob   25
Data Manipulation and Analysis
  • Filtering: df[df['age'] > 25]
  • Grouping: df.groupby('column')
  • Merging: pd.merge(df1, df2)
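The three operations above, sketched on small made-up tables (the people and teams DataFrames and their columns are assumptions for illustration):

```python
import pandas as pd

people = pd.DataFrame({
    "name": ["Alice", "Bob", "Cara"],
    "age": [30, 25, 35],
    "team": ["data", "web", "data"],
})
teams = pd.DataFrame({"team": ["data", "web"], "floor": [2, 3]})

# Filtering: a boolean mask selects matching rows.
older = people[people["age"] > 25]

# Grouping: aggregate one column per group.
mean_age = people.groupby("team")["age"].mean()

# Merging: join the two tables on their shared "team" column.
merged = pd.merge(people, teams, on="team")

print(older["name"].tolist())                      # ['Alice', 'Cara']
print(mean_age.to_dict())                          # {'data': 32.5, 'web': 25.0}
print(merged.set_index("name")["floor"].to_dict()) # {'Alice': 2, 'Bob': 3, 'Cara': 2}
```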
Pros, Cons, and Alternatives
Pros:
  • Intuitive for tabular data.
  • Powerful for data cleaning and analysis.
  • Integrates with NumPy and visualization libraries.
Cons:
  • Memory-intensive for large datasets.
  • Steeper learning curve than lists/dictionaries.
  • Requires installation.
Alternatives:
  • NumPy: For numerical data without labels.
  • Dask: For big data processing.
  • SQL: For database-style operations.
Best Practices:
  • Use vectorized operations instead of loops.
  • Handle missing data with fillna() or dropna().
  • Use meaningful column names.
  • Save DataFrames to CSV or Parquet for persistence.
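A brief sketch of these practices on a made-up table with one missing price. The 1.2 tax multiplier is purely illustrative, and an in-memory buffer stands in for a CSV file on disk.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Cable"],
    "price": [999.99, np.nan, 9.99],
})

# Missing data: either fill the gap (here with the column mean) or drop the row.
filled = df.fillna({"price": df["price"].mean()})
dropped = df.dropna(subset=["price"])
print(len(filled), len(dropped))  # 3 2

# Vectorized arithmetic on a whole column instead of a Python loop.
# The 1.2 multiplier is an illustrative assumption.
filled["price_with_tax"] = filled["price"] * 1.2

# Persistence: round-trip through CSV using an in-memory buffer.
buffer = io.StringIO()
filled.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored.shape)  # (3, 3)
```

Whether to fill or drop depends on the analysis; filling with the mean keeps the row count but can distort variance.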
Example: Processing Sales Data
Let’s analyze sales data with Pandas.
python
import pandas as pd

def analyze_sales(data):
    df = pd.DataFrame(data)
    df["date"] = pd.to_datetime(df["date"])
    summary = df.groupby("product")["price"].agg(["sum", "count"])
    return summary

# Test analysis
sales = [
    {"product": "Laptop", "price": 999.99, "date": "2025-08-18"},
    {"product": "Mouse", "price": 29.99, "date": "2025-08-18"},
    {"product": "Laptop", "price": 999.99, "date": "2025-08-19"}
]
print(analyze_sales(sales))
Output:
             sum  count
product                
Laptop   1999.98      2
Mouse      29.99      1
This example demonstrates Pandas for sales data analysis, a common task in business intelligence.
6. Matplotlib/Seaborn for Visualization
Creating Plots with Matplotlib
Matplotlib creates customizable plots:
  • Line Plots: plt.plot()
  • Bar Charts: plt.bar()
  • Scatter Plots: plt.scatter()
Example:
python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]
plt.plot(x, y)
plt.title("Line Plot")
plt.show()
Enhancing Visualizations with Seaborn
Seaborn builds on Matplotlib, offering statistical plots and better aesthetics:
  • Heatmaps: sns.heatmap()
  • Box Plots: sns.boxplot()
Example:
python
import matplotlib.pyplot as plt  # needed for plt.show() below
import seaborn as sns
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 25]})
sns.scatterplot(data=df, x="x", y="y")
plt.show()
Pros, Cons, and Alternatives
Pros:
  • Matplotlib is highly customizable.
  • Seaborn simplifies complex statistical plots.
  • Integrates with Pandas and NumPy.
Cons:
  • Matplotlib has a steep learning curve for customization.
  • Seaborn is less flexible for non-statistical plots.
  • Requires installation.
Alternatives:
  • Plotly: For interactive plots.
  • Bokeh: For web-based visualizations.
  • Altair: For declarative visualizations.
Best Practices:
  • Use Seaborn for quick, aesthetic plots.
  • Customize Matplotlib for specific needs.
  • Save plots to files (plt.savefig()).
  • Use meaningful titles, labels, and legends.
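The labeling and saving advice can be sketched with Matplotlib alone. The unit counts are made up, and the Agg backend and in-memory buffer are assumptions chosen so the example runs without a display or writing a file; in a script you would typically pass a file name to savefig() instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

days = [1, 2, 3, 4]
laptops = [10, 20, 25, 30]  # illustrative data
mice = [5, 15, 10, 20]      # illustrative data

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(days, laptops, label="Laptops")
ax.plot(days, mice, label="Mice")
ax.set_title("Units Sold per Day")
ax.set_xlabel("Day")
ax.set_ylabel("Units")
ax.legend()  # uses the label= values passed to plot()

# Save to an in-memory PNG; fig.savefig("units.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)  # release the figure's memory
png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)  # True
```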
Example: Visualizing Sales Trends
Let’s visualize sales trends with Matplotlib and Seaborn.
python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def visualize_sales(data):
    df = pd.DataFrame(data)
    df["date"] = pd.to_datetime(df["date"])
    df = df.groupby("date")["price"].sum().reset_index()
    
    plt.figure(figsize=(10, 5))
    sns.lineplot(data=df, x="date", y="price")
    plt.title("Daily Sales Trend")
    plt.xlabel("Date")
    plt.ylabel("Total Sales ($)")
    plt.savefig("sales_trend.png")
    plt.show()

# Test visualization
sales = [
    {"product": "Laptop", "price": 999.99, "date": "2025-08-18"},
    {"product": "Mouse", "price": 29.99, "date": "2025-08-18"},
    {"product": "Laptop", "price": 999.99, "date": "2025-08-19"}
]
visualize_sales(sales)
This example creates a line plot of sales trends, a common task in data analysis.
7. Conclusion & Next Steps
Congratulations on mastering Module 7! You’ve learned essential data structures (stacks, queues, linked lists, dictionaries, sets) and powerful libraries (NumPy, Pandas, Matplotlib, Seaborn) for building efficient, data-driven applications like task managers, word counters, user ID trackers, stock analyzers, sales processors, and visualizations.
Next Steps:
  • Practice: Enhance the examples (e.g., add features to the sales visualizer).
  • Explore: Dive into advanced libraries like SciPy or Plotly.
  • Advance: Move to Module 8, covering APIs, databases, and testing.
  • Resources:
