Dynamic Array Resizing in Python: A Comprehensive Deep Dive

Dynamic arrays are a cornerstone of modern programming, offering flexible, resizable storage for collections of data. In Python, the built-in list type is a dynamic array under the hood, automatically resizing itself as elements are added or removed. Unlike static arrays with fixed sizes, dynamic arrays adapt to changing needs, making them incredibly versatile. In this blog, we’ll explore how dynamic array resizing works in Python, its internal mechanics, resizing strategies, practical examples, and performance implications.


What is a Dynamic Array?

link to this section

A dynamic array (or resizable array) is a data structure that provides array-like functionality—sequential storage and index-based access—while automatically adjusting its capacity as elements are added or removed. In Python, the list type is implemented as a dynamic array in CPython, the reference implementation.

Static vs. Dynamic Arrays

  • Static Array : Fixed size, defined at creation (e.g., C’s int arr[10]).
  • Dynamic Array : Variable size, grows or shrinks as needed (e.g., Python’s list).

Key Features

  • Contiguous Memory : Elements are stored in a single, continuous block.
  • Resizable : Allocates more memory when full, copies elements to the new block.
  • Random Access : O(1) time complexity for indexing (e.g., lst[5]).

How Dynamic Array Resizing Works in Python

link to this section

Python’s list is implemented as a dynamic array in CPython, using a C structure called PyListObject. Here’s how resizing operates:

Internal Structure

  • Array of Pointers : A list stores references (pointers) to Python objects in a contiguous block of memory.
  • Capacity vs. Length :
    • Length : Number of elements currently in the list (accessed via len(lst)).
    • Capacity : Total allocated slots, usually greater than the length to allow growth.
  • Buffer : Extra space is pre-allocated to reduce frequent resizing.

Resizing Mechanism

When a list runs out of capacity (e.g., during append), Python:

  1. Checks Capacity : If the current length equals the capacity, resizing is triggered.
  2. Allocates New Memory : Requests a larger contiguous block from the OS.
  3. Copies Elements : Moves all existing elements to the new block.
  4. Updates Pointer : Adjusts the list’s internal pointer to the new block.
  5. Frees Old Memory : Deallocates the old block (handled by Python’s memory manager).

Growth Factor

Python uses an over-allocation strategy with a growth factor to minimize resizing frequency:

  • New capacity is calculated as: new_capacity = old_capacity + (old_capacity >> 3) + (old_capacity < 9 ? 3 : 6).
  • Roughly, it grows by ~1.125x plus a small constant, ensuring amortized O(1) append time.

Example: Appending to a List

lst = [] 
print(lst) # Output: [] 
lst.append(1) # Initial allocation 
lst.append(2) # Fits in buffer 
lst.append(3) # May trigger resize if capacity exceeded 
print(lst) # Output: [1, 2, 3]

Capacity Growth

  • Start: Capacity = 0.
  • After 1st append: Capacity = 4 (initial over-allocation).
  • After 4th append: Capacity = 4 (still fits).
  • After 5th append: Capacity = 8 (resizes to ~1.125 * 4 + extras).

You can’t directly query capacity in Python, but it’s visible in CPython’s source (listobject.c).


Internal Mechanics in CPython

link to this section

PyListObject Structure

In CPython, a list is defined as:

typedef struct {
    PyObject_VAR_HEAD  // Header with refcount, type, size
    PyObject **ob_item;  // Pointer to array of object pointers
    Py_ssize_t allocated;  // Capacity (total slots)
} PyListObject;
  • ob_item: Points to the contiguous memory block.
  • allocated: Tracks capacity, distinct from ob_size (length).

Resizing Algorithm

From listobject.c (list_resize function):

  • New capacity grows roughly exponentially but with adjustments:
    • First few sizes: 4, 8, 16, 25, 35, 46, etc.
    • Formula balances growth rate and memory efficiency.

Memory Allocation

  • Uses PyMem_Realloc to request new memory.
  • Copies pointers (not objects) to the new block, preserving references.

Shrinking

Python also shrinks lists when elements are removed (e.g., via pop or del), but only if the length drops significantly below capacity (typically < 50% full) to avoid thrashing.


Practical Examples

link to this section

Example 1: Appending and Growth

lst = [] 
    
for i in range(10): 
    lst.append(i) 
   
print(f"Length: {len(lst)}") # Tracks growth 
    
print(lst) # Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  • Resizes at 4 → 8, then fits 9 and 10 without resizing.

Example 2: Pre-allocating with *

lst = [0] * 100 # Pre-allocates 100 slots 
print(len(lst)) # Output: 100 
lst.append(1) # No resize needed 
print(len(lst)) # Output: 101
  • * sets both length and capacity, reducing initial resizes.

Example 3: Shrinking with del

lst = [1, 2, 3, 4, 5, 6, 7, 8] 
del lst[4:] # Removes last 4 elements 
print(lst) # Output: [1, 2, 3, 4] 
# Capacity may shrink if length << capacity

Example 4: Amortized Cost

import time 
    
def append_test(n): 
    lst = [] 
    start = time.time() 
    for i in range(n): 
        lst.append(i) 
    return time.time() - start 
    
print(append_test(10000)) # Fast due to amortized O(1) 
print(append_test(100000)) # Scales linearly
  • Over-allocation ensures appends are amortized O(1).

Performance Implications

link to this section

Time Complexity

  • Append : Amortized O(1) due to over-allocation.
    • Occasional O(n) resize when capacity is exceeded, but rare.
  • Insert : O(n) – shifts elements after the insertion point.
  • Pop (end) : O(1) – no shifting.
  • Pop (middle) : O(n) – shifts elements.
  • Resize : O(n) for copying elements during growth/shrink.

Space Complexity

  • O(n + buffer) : Capacity exceeds length, using extra memory for efficiency.
  • Trade-off: Wastes some space to avoid frequent resizing.

Benchmarking Growth

import sys 
    
lst = [] 
for i in range(10): 
    lst.append(i)
    print(f"Length: {len(lst)}, Size in bytes: {sys.getsizeof(lst)}") 
    
# Output: 
# Length: 1, Size in bytes: 64 
# Length: 2, Size in bytes: 64 
# Length: 3, Size in bytes: 64 
# Length: 4, Size in bytes: 64 
# Length: 5, Size in bytes: 96 (resized)
  • sys.getsizeof shows memory jumps at resize points.

Dynamic Array vs. Other Structures

link to this section
  • List (Dynamic Array) : Fast indexing, amortized O(1) append, O(n) insert.
  • Linked List : O(1) insert/delete, O(n) indexing (use collections.deque in Python).
  • Array (array module) : Fixed-type, static size, no resizing.

Practical Use Cases

link to this section
  1. Dynamic Data Collection :
    logs = [] 
    while True: 
        log = input("Log entry (or 'q' to quit): ") 
        if   log == 'q': 
            break 
        logs.append(log)
    print(logs)
  2. Buffer Management : Pre-allocate for known sizes:
    buffer = [None] * 1000 # Fixed initial capacity
  3. Stack Implementation : Use append/pop for LIFO behavior.
  4. Data Processing : Grow lists incrementally for analysis.

Edge Cases and Gotchas

link to this section

1. Large Initial Sizes

Pre-allocating too much wastes memory:

lst = [0] * 10**9 # May exhaust memory

2. Frequent Inserts

Inserting at the start or middle is slow:

lst = [] 
for i in range(1000): 
    lst.insert(0, i) # O(n) each time, total O(n²)

3. Shrinking Inefficiency

Python doesn’t always shrink aggressively:

lst = [1] * 1000 del lst[1:]
print(sys.getsizeof(lst)) # Still large, e.g., 8040 bytes

Conclusion

link to this section

Dynamic array resizing in Python, embodied by the list type, is a marvel of engineering that balances flexibility, performance, and ease of use. By over-allocating memory and employing a smart growth strategy, Python ensures that appending is fast while maintaining O(1) access times. Understanding its mechanics—capacity management, resizing triggers, and amortized costs—empowers you to use lists efficiently and recognize their strengths and limitations. Dynamic arrays are a testament to Python’s pragmatic design, making them an indispensable part of any programmer’s toolkit.