Merged
3 changes: 3 additions & 0 deletions .jules/bolt.md
@@ -105,6 +105,9 @@
## 2024-05-24 - [Avoid `email.parser` for large package indexes]
**Learning:** `email.parser.Parser` performs full RFC 822/2822 compliance checks, which add substantial overhead. When parsing tens of thousands of machine-generated opkg package index blocks with predictable `Key: Value` lines, plain string splitting and `.startswith()` checks give a ~14x speedup.
**Action:** When extracting a few specific headers from a trusted, uniform block format (as opposed to parsing arbitrary emails), avoid `email.parser.Parser` and use native Python string operations instead. Call `.strip()` on parsed values so that `\r\n` line endings are handled correctly.
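
A minimal sketch of the string-based approach described in this entry. The function name `extract_fields`, the `wanted` parameter, and the sample block are hypothetical and chosen for illustration; they are not taken from the project's parser.

```python
def extract_fields(block, wanted=("Package", "Version")):
    """Pull a few `Key: Value` headers out of one trusted index block."""
    fields = {}
    # Splitting on "\n" leaves a trailing "\r" on CRLF input,
    # which is why the value is passed through .strip() below.
    for line in block.split("\n"):
        for key in wanted:
            prefix = key + ":"
            if line.startswith(prefix):
                fields[key] = line[len(prefix):].strip()
                break
    return fields


# Hypothetical sample block with CRLF line endings.
sample = (
    "Package: busybox\r\n"
    "Version: 1.36.1-1\r\n"
    "Description: Tiny utilities\r\n"
    " extra line\r\n"
)
print(extract_fields(sample))  # {'Package': 'busybox', 'Version': '1.36.1-1'}
```

This sketch ignores continuation lines and any header not listed in `wanted`, which is only reasonable for trusted, uniform input.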
## 2024-05-14 - [Python String Concatenation Optimization]
**Learning:** Python strings are immutable, so `+=` inside a loop can create a new string object and copy the accumulated contents on every iteration, which is O(N^2) in the worst case. CPython can sometimes resize the string in place, but that is an implementation detail and should not be relied on.
**Action:** Inside loops, collect the string parts with `list.append()` and call `''.join(parts)` once after the loop. This runs in O(N) time in the total length of the parts and avoids repeated allocations and copies.
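
A small side-by-side sketch of the two patterns. The function names and the `errors` input (a list of `(url, error)` pairs) are hypothetical; the message format mirrors the one used in `scripts/dl_github_archive.py`.

```python
def build_report_concat(errors):
    """Accumulate with `+=`; may copy the growing string on each pass."""
    report = ''
    for url, err in errors:
        report += '\n  {}: {}'.format(url, err)
    return report


def build_report_join(errors):
    """Collect parts in a list and join once after the loop."""
    parts = []
    for url, err in errors:
        parts.append('\n  {}: {}'.format(url, err))
    return ''.join(parts)


# Hypothetical input for illustration only.
errors = [('https://example.org/a', 'timeout'), ('https://example.org/b', 'HTTP 404')]
assert build_report_concat(errors) == build_report_join(errors)
```

Both functions return the same string; only the way the intermediate result is built differs.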

## 2026-05-22 - [Python String Splitting Memory Overhead]
**Learning:** When parsing tens of thousands of blocks in a large text file, calling `.split()` on the entire string builds an intermediate list that holds every chunk at once, adding O(N) memory on top of the original string.
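
One possible way to avoid the intermediate list, sketched here as an assumption rather than as the approach this journal entry settles on, is a generator that locates separators with `str.find()` and yields one block at a time. The name `iter_blocks` and the blank-line separator are hypothetical.

```python
def iter_blocks(text, sep="\n\n"):
    """Yield non-empty blocks of `text` one at a time, without building a list."""
    start = 0
    while True:
        end = text.find(sep, start)
        if end == -1:
            # No further separator: yield whatever remains, if anything.
            if start < len(text):
                yield text[start:]
            return
        if end > start:
            yield text[start:end]
        start = end + len(sep)


# Only one block is held by the loop body at any time.
for block in iter_blocks("Package: a\n\nPackage: b\n\nPackage: c"):
    print(block)
```

Unlike `text.split(sep)`, this sketch skips empty blocks, so the two are not drop-in equivalents when separators repeat.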
6 changes: 3 additions & 3 deletions scripts/dl_github_archive.py
@@ -350,7 +350,7 @@ def _init_commit_ts(self):
version_is_sha1sum = len(self.version) == 40
if not version_is_sha1sum:
apis.insert(0, apis.pop())
- reasons = ''
+ reasons = []
for api in apis:
url = api['url']
attr_path = api['attr_path']
@@ -364,8 +364,8 @@ def _init_commit_ts(self):
self.commit_ts_cache.set(url, ct)
return
except Exception as e:
- reasons += '\n' + ("  {}: {}".format(url, e))
- raise self._error('Cannot fetch commit ts:{}'.format(reasons))
+ reasons.append('\n  {}: {}'.format(url, e))
+ raise self._error('Cannot fetch commit ts:{}'.format(''.join(reasons)))

def _init_commit_ts_remote_get(self, url, attrpath):
resp = self._make_request(url)