LeetCode 635: Design Log Storage System Solution in Python – A Step-by-Step Guide
Imagine you’re a system administrator managing a server that logs events with timestamps—like "2017:01:01:23:59:59"—and you need to build a tool to store these logs and fetch them over time ranges, zooming in from years to seconds. That’s the core of LeetCode 635: Design Log Storage System, a medium-level problem that’s all about designing a data structure to handle timestamped data with flexible granularity. Using Python, we’ll explore two solutions: the Best Solution, a list-based approach with binary search for efficiency, and an Alternative Solution, a trie-based method with timestamp mapping that’s more complex but versatile. With detailed examples, beginner-friendly breakdowns—especially for the list method—and clear code, this guide will help you build that log system, whether you’re new to coding or leveling up. Let’s boot up our server and start logging!
What Is LeetCode 635: Design Log Storage System?
In LeetCode 635: Design Log Storage System, you’re tasked with implementing a LogSystem class with two main methods:
- put(id, timestamp): Store a log with an ID and timestamp (format "YYYY:MM:DD:HH:MM:SS").
- retrieve(start, end, granularity): Return IDs of logs with timestamps in [start, end] at a given granularity (Year, Month, Day, Hour, Minute, Second).
For example, storing logs like [1, "2017:01:01:23:59:59"] and retrieving from "2017:01:01:23:00:00" to "2017:01:01:23:59:59" at "Hour" granularity should catch logs within that hour. This problem tests your ability to design a system for timestamp storage and range queries with flexible precision.
Problem Statement
- Input:
- put: id (integer), timestamp (string "YYYY:MM:DD:HH:MM:SS").
- retrieve: start, end (strings "YYYY:MM:DD:HH:MM:SS"), granularity (string: "Year", "Month", etc.).
- Output: A LogSystem class with:
- put: None (stores log).
- retrieve: List of log IDs in range.
- Rules:
- Timestamps are in "YYYY:MM:DD:HH:MM:SS" format.
- Granularity levels: Year, Month, Day, Hour, Minute, Second.
- Return IDs for logs where timestamp falls in [start, end] at specified granularity.
Constraints
- 1 ≤ id ≤ 500.
- Timestamps range from 2000 to 2017.
- Up to 100 calls to put and retrieve.
Examples
- Input:
- ["LogSystem","put","put","retrieve"]
- [[],[1,"2017:01:01:23:59:59"],[2,"2017:01:01:22:59:59"],["2017:01:01:22:00:00","2017:01:02:23:59:59","Hour"]]
- Output: [null,null,null,[2,1]]
- Logs at 22:xx and 23:xx hours.
- Input:
- ["LogSystem","put","retrieve"]
- [[],[1,"2017:01:01:23:59:59"],["2017:01:01:00:00:00","2017:12:31:23:59:59","Year"]]
- Output: [null,null,[1]]
- All logs in 2017.
These examples set the log stage—let’s solve it!
Understanding the Problem: Designing a Log System
To solve LeetCode 635: Design Log Storage System in Python, we need to create a class that stores logs with timestamps and retrieves them over ranges at different granularity levels. A naive approach—scanning all logs per query—would be O(n) per retrieval, reasonable for 100 calls but inefficient for larger scales. We’ll use:
- Best Solution (List-Based with Binary Search): O(n log n) for put, O(log n) for retrieve, O(n) space—fast and practical.
- Alternative Solution (Trie-Based with Timestamp Mapping): O(1) for put, O(n) for retrieve, O(n) space—complex but versatile.
Let’s start with the list-based solution, breaking it down for beginners.
Best Solution: List-Based with Binary Search
Why This Is the Best Solution
The list-based approach with binary search is the top choice for LeetCode 635 because it’s efficient—O(n log n) for initial sorting with O(log n) retrieval—and simple, storing logs in a sorted list and using binary search to find range boundaries. It fits constraints (100 calls, n ≤ 100) and balances setup cost with query speed. It’s like keeping a sorted logbook and flipping to the right pages!
How It Works
Think of this as a log librarian:
- Step 1: Initialize Storage:
- List to store (timestamp, id) pairs.
- Granularity map to index timestamp parts (Year: 0, Month: 1, etc.).
- Step 2: Put Log:
- Append (timestamp, id) to list, sort by timestamp.
- Step 3: Retrieve Logs:
- Adjust start and end timestamps to granularity.
- Binary search for range bounds, collect IDs.
- Step 4: Return Result:
- List of IDs within range.
It’s like a log sorter—store and search!
Step-by-Step Example
Example: Operations: put(1,"2017:01:01:23:59:59"), put(2,"2017:01:01:22:59:59"), retrieve("2017:01:01:22:00:00","2017:01:02:23:59:59","Hour")
- Step 1: Init:
- Logs = [], Gran = {"Year": 0, "Month": 1, "Day": 2, "Hour": 3, "Minute": 4, "Second": 5}.
- Step 2: put(1, "2017:01:01:23:59:59"):
- Logs = [("2017:01:01:23:59:59", 1)].
- Step 3: put(2, "2017:01:01:22:59:59"):
- Logs = [("2017:01:01:22:59:59", 2), ("2017:01:01:23:59:59", 1)] (sorted).
- Step 4: retrieve("2017:01:01:22:00:00", "2017:01:02:23:59:59", "Hour"):
- Gran = 3 (Hour).
- Start = "2017:01:01:22", End = "2017:01:02:23".
- Binary search:
- Left = 0 (≥ "2017:01:01:22").
- Right = 1 (≤ "2017:01:02:23").
- IDs = [2, 1] (both in range).
- Output: [null, null, null, [2, 1]].
Code with Detailed Line-by-Line Explanation
Here’s the Python code, explained clearly:
class LogSystem:
def __init__(self):
# Step 1: Initialize logs and granularity map
self.logs = [] # (timestamp, id) pairs
self.gran = {"Year": 0, "Month": 1, "Day": 2, "Hour": 3, "Minute": 4, "Second": 5}
self.units = ["Year", "Month", "Day", "Hour", "Minute", "Second"]
def put(self, id: int, timestamp: str) -> None:
# Step 2: Store log and sort
self.logs.append((timestamp, id))
self.logs.sort() # Sort by timestamp
def retrieve(self, start: str, end: str, granularity: str) -> List[int]:
# Step 3: Adjust timestamps to granularity
idx = self.gran[granularity]
start_parts = start.split(":")
end_parts = end.split(":")
# Truncate to granularity (e.g., Hour -> YYYY:MM:DD:HH)
start_key = ":".join(start_parts[:idx + 1])
end_key = ":".join(end_parts[:idx + 1])
# Step 4: Binary search for range bounds
left = bisect.bisect_left(self.logs, (start_key,))
right = bisect.bisect_right(self.logs, (end_key,))
# Step 5: Collect IDs in range
return [log[1] for log in self.logs[left:right]]
import bisect # Place this at the top of the file
- Lines 4-6: Init: Empty log list, granularity map.
- Lines 9-11: put: Append and sort logs.
- Lines 14-26: retrieve:
- Truncate timestamps to granularity level.
- Use binary search (bisect) to find bounds.
- Extract IDs from range.
- Time Complexity:
- put: O(n log n)—sorting.
- retrieve: O(log n)—binary search.
- Space Complexity: O(n)—log list.
This is like a log indexer—fast and neat!
Alternative Solution: Trie-Based with Timestamp Mapping
Why an Alternative Approach?
The trie-based approach with timestamp mapping organizes logs in a trie—O(1) for put, O(n) for retrieve, O(n) space. It’s more complex, building a hierarchical structure for granularity, but offers flexibility for prefix-based queries. It’s like a timestamp tree for log navigation!
How It Works
Picture this as a log tree:
- Step 1: Init trie with granularity levels.
- Step 2: Put: Insert log into trie by timestamp parts.
- Step 3: Retrieve: Traverse trie to granularity, collect IDs in range.
- Step 4: Return result.
It’s like a log arborist—grow and prune!
Step-by-Step Example
Example: Same as above
- Init: Trie = {}.
- Put(1, "2017:01:01:23:59:59"): Trie[2017][01][01][23][59][59] = [1].
- Put(2, "2017:01:01:22:59:59"): Trie[2017][01][01][22][59][59] = [2].
- Retrieve("2017:01:01:22:00:00", "2017:01:02:23:59:59", "Hour"):
- Path to Hour: 2017:01:01:22 to 2017:01:02:23.
- IDs = [2, 1].
- Output: [2, 1].
Code for Trie Approach
class LogSystem:
def __init__(self):
self.trie = {}
self.gran = {"Year": 0, "Month": 1, "Day": 2, "Hour": 3, "Minute": 4, "Second": 5}
def put(self, id: int, timestamp: str) -> None:
parts = timestamp.split(":")
node = self.trie
for part in parts:
if part not in node:
node[part] = {}
node = node[part]
if "ids" not in node:
node["ids"] = []
node["ids"].append(id)
def retrieve(self, start: str, end: str, granularity: str) -> List[int]:
idx = self.gran[granularity]
start_parts = start.split(":")[:idx + 1]
end_parts = end.split(":")[:idx + 1]
def traverse(node, parts, depth, result):
if depth == len(parts):
if "ids" in node:
result.extend(node["ids"])
return
part = parts[depth]
for key in node:
if depth < len(parts) - 1 or (start_parts[depth] <= key <= end_parts[depth]):
traverse(node[key], parts, depth + 1, result)
result = []
traverse(self.trie, start_parts, 0, result)
return result
- Lines 4-5: Init trie and granularity map.
- Lines 8-15: put: Build trie path, store ID.
- Lines 18-32: retrieve: Traverse trie to granularity, collect IDs.
- Time Complexity:
- put: O(1)—fixed 6 levels.
- retrieve: O(n)—traverse all logs.
- Space Complexity: O(n)—trie nodes.
It’s a log tree—versatile but slower!
Comparing the Two Solutions
- List with Binary Search (Best):
- Pros: O(n log n) put, O(log n) retrieve, O(n) space, fast queries.
- Cons: Sorting overhead on put.
- Trie-Based (Alternative):
- Pros: O(1) put, O(n) retrieve, O(n) space, flexible structure.
- Cons: Slower retrieval.
List-based wins for query speed.
Additional Examples and Edge Cases
- Input: put(1,"2017:01:01:00:00:00"), retrieve("2017:01:01:00:00:00","2017:01:01:00:00:00","Second")
- Output: [null, [1]].
- Input: Empty system → [].
Both handle these well.
Complexity Breakdown
- List: Put O(n log n), Retrieve O(log n), Space O(n).
- Trie: Put O(1), Retrieve O(n), Space O(n).
List is optimal for retrieval.
Key Takeaways
- List: Binary search speed—smart!
- Trie: Tree structure—clear!
- Design: Logs are fun.
- Python Tip: Lists rock—see Python Basics.
Final Thoughts: Store Those Logs
LeetCode 635: Design Log Storage System in Python is a neat design challenge. List-based with binary search offers fast queries, while trie-based provides a structural view. Want more? Try LeetCode 146: LRU Cache or LeetCode 208: Implement Trie. Ready to log? Head to Solve LeetCode 635 on LeetCode and build that system today!