swallow out-of-range numeric timestamps in JSON parser by HrachShah · Pull Request #4 · HrachShah/log-analyzer-cli

HrachShah · 2026-06-09T08:08:07Z

datetime.fromtimestamp raises ValueError (year out of range, e.g. 1e12 seconds = year 33658) and OverflowError (1e308 — larger than platform time_t) on numeric Unix timestamps in JSON log lines. _extract_timestamp's contract is 'returns Optional[datetime]', so a busted numeric field should be 'no timestamp', not 'the whole JSON line is unparseable'. The previous code let the exception escape parse(), which dropped the entire ParsedEntry (level, message, source, metadata) and surfaced a traceback to the CLI caller for what should be a per-line parse failure. Wrapping the call in try/except (ValueError, OSError, OverflowError) keeps the rest of the entry, sets timestamp to None, and lets the CLI analyze the other 10,000 lines in the file as usual.

Summary by Sourcery

Handle out-of-range numeric Unix timestamps in JSON logs without failing entry parsing.

Bug Fixes:

Prevent datetime.fromtimestamp errors from bubbling up when JSON timestamp fields contain extremely large or otherwise invalid numeric values by treating them as missing timestamps.

Tests:

Add regression tests verifying that out-of-range numeric JSON timestamps yield entries with timestamp=None while preserving other fields.

Summary by CodeRabbit

Bug Fixes
- JSON log parser now gracefully handles out-of-range or invalid numeric timestamps instead of crashing, allowing log processing to continue with the timestamp marked as unavailable.
Tests
- Added test coverage for out-of-range numeric timestamps in JSON log parsing.

…mtimestamp raises ValueError (year out of range, e.g. 1e12 = year 33658, 1e15 seconds) and OverflowError (1e308 — larger than platform time_t); a negative timestamp can raise OSError on some platforms. _extract_timestamp is internal and its contract is 'returns Optional[datetime]', so a busted numeric field is just 'no timestamp', not 'the whole JSON line is unparseable'. The old code let the exception escape parse(), dropping the entire ParsedEntry (level, message, source, metadata) and surfacing a traceback to the CLI caller for what should be a per-line parse failure. Wrapping the call in try/except (ValueError, OSError, OverflowError) keeps the rest of the entry, sets timestamp to None, and lets the CLI analyze the other 10,000 lines in the file as usual

sourcery-ai · 2026-06-09T08:08:14Z

Reviewer's Guide

This PR hardens the JSON log parser so that out-of-range numeric Unix timestamps no longer cause parse() to fail the entire log entry, instead treating them as missing timestamps, and adds tests to cover the new behavior for various overflow cases.

Sequence diagram for handling out-of-range numeric timestamps in JSON parser

sequenceDiagram
    actor CLIUser
    participant LogAnalyzerCLI
    participant JSONLogParser
    participant datetime

    CLIUser->>LogAnalyzerCLI: analyze_logs
    LogAnalyzerCLI->>JSONLogParser: parse(line)
    JSONLogParser->>JSONLogParser: _extract_timestamp(data)
    JSONLogParser->>datetime: fromtimestamp(value)
    datetime-->>JSONLogParser: [raises ValueError/OSError/OverflowError]
    JSONLogParser-->>JSONLogParser: _extract_timestamp returns None
    JSONLogParser-->>LogAnalyzerCLI: ParsedEntry(timestamp=None)
    LogAnalyzerCLI-->>CLIUser: results for remaining lines

File-Level Changes

Change	Details	Files
Handle out-of-range numeric Unix timestamps gracefully in JSONLogParser._extract_timestamp	Expanded _extract_timestamp docstring to clarify behavior and edge cases for numeric Unix timestamps in JSON logs. Wrapped datetime.fromtimestamp calls for numeric timestamps in a try/except block catching ValueError, OSError, and OverflowError. Maintained existing seconds vs. milliseconds heuristic (values > 1e12 treated as milliseconds) while returning None instead of propagating exceptions when conversion fails.	`src/log_analyzer_cli/parsers/json_log.py`
Add regression tests for JSON numeric timestamps that overflow datetime.fromtimestamp	Introduced a new test that feeds JSONLogParser numeric timestamps that are too large for datetime.fromtimestamp, asserting parse() returns an entry with timestamp set to None rather than raising. Covered both seconds and milliseconds representations that map beyond supported datetime ranges, as well as extremely large floats that overflow platform time_t. Verified that other fields (level, message) are still parsed and normalized correctly even when timestamp is invalid.	`tests/test_parsers.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

coderabbitai · 2026-06-09T08:08:19Z

📝 Walkthrough

Walkthrough

This PR hardens JSON log timestamp extraction to gracefully handle numeric Unix timestamps that overflow datetime.fromtimestamp(). The implementation catches conversion errors and returns None; a test verifies the parser still yields a complete entry with a None timestamp instead of failing.

Changes

JSON timestamp overflow resilience

Layer / File(s)	Summary
Timestamp error handling with safe conversion `src/log_analyzer_cli/parsers/json_log.py`	`_extract_timestamp` wraps numeric timestamp conversions in try/except to catch `ValueError`, `OSError`, and `OverflowError`; on failure returns `None` instead of propagating exceptions. Docstring expanded to document out-of-range behavior.
Out-of-range numeric timestamp test `tests/test_parsers.py`	New test method verifies that `parse()` gracefully handles extremely large numeric timestamp values, returning a non-None entry with timestamp set to `None`, while preserving level normalization and message content.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~20 minutes

Poem

A timestamp too vast for the clock to bear,
Returns None with graceful care,
No exception raised in the JSON night,
The parser hops on, forever bright! 🐰✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: hardening JSON timestamp extraction to gracefully handle out-of-range numeric values instead of raising exceptions.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/json-parser-out-of-range-timestamp-2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The 1e12 threshold used to distinguish seconds from milliseconds is a magic number; consider extracting it into a named constant (with a brief comment) so the intent and units are clear and easier to adjust later.
Swallowing ValueError/OSError/OverflowError and returning None makes malformed timestamps silent; consider emitting a debug-level log or other lightweight signal when this occurs to aid diagnosing bad log producers without breaking parsing.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The 1e12 threshold used to distinguish seconds from milliseconds is a magic number; consider extracting it into a named constant (with a brief comment) so the intent and units are clear and easier to adjust later.
- Swallowing ValueError/OSError/OverflowError and returning None makes malformed timestamps silent; consider emitting a debug-level log or other lightweight signal when this occurs to aid diagnosing bad log producers without breaking parsing.

## Individual Comments

### Comment 1
<location path="src/log_analyzer_cli/parsers/json_log.py" line_range="89-96" />
<code_context>
         for field in self.TIMESTAMP_FIELDS:
             if field in data:
                 value = data[field]
                 if isinstance(value, (int, float)):
-                    if value > 1e12:
-                        return datetime.fromtimestamp(value / 1000)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Exclude booleans from the numeric timestamp path to avoid misinterpreting them as timestamps.

Since `bool` is a subclass of `int`, JSON `true`/`false` would pass this `isinstance(value, (int, float))` check and be treated as timestamps. To avoid converting boolean flags, tighten the condition, e.g. `isinstance(value, (int, float)) and not isinstance(value, bool)`.

```suggestion
                value = data[field]
                if isinstance(value, (int, float)) and not isinstance(value, bool):
                    try:
                        if value > 1e12:
                            return datetime.fromtimestamp(value / 1000)
                        return datetime.fromtimestamp(value)
                    except (ValueError, OSError, OverflowError):
                        return None
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-06-09T08:10:03Z

                value = data[field]
                if isinstance(value, (int, float)):
-                    if value > 1e12:
-                        return datetime.fromtimestamp(value / 1000)
-                    return datetime.fromtimestamp(value)
+                    try:
+                        if value > 1e12:
+                            return datetime.fromtimestamp(value / 1000)
+                        return datetime.fromtimestamp(value)
+                    except (ValueError, OSError, OverflowError):
+                        return None


suggestion (bug_risk): Exclude booleans from the numeric timestamp path to avoid misinterpreting them as timestamps.

Since bool is a subclass of int, JSON true/false would pass this isinstance(value, (int, float)) check and be treated as timestamps. To avoid converting boolean flags, tighten the condition, e.g. isinstance(value, (int, float)) and not isinstance(value, bool).

Suggested change

value = data[field]

if isinstance(value, (int, float)):

if value > 1e12:

return datetime.fromtimestamp(value / 1000)

return datetime.fromtimestamp(value)

try:

if value > 1e12:

return datetime.fromtimestamp(value / 1000)

return datetime.fromtimestamp(value)

except (ValueError, OSError, OverflowError):

return None

value = data[field]

if isinstance(value, (int, float)) and not isinstance(value, bool):

try:

if value > 1e12:

return datetime.fromtimestamp(value / 1000)

return datetime.fromtimestamp(value)

except (ValueError, OSError, OverflowError):

return None

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/log_analyzer_cli/parsers/json_log.py (2)

11-11: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Remove unused import.

The detect_log_level import is flagged by flake8 as unused. It should be removed to fix the CI pipeline failure.

🔧 Proposed fix

-from log_analyzer_cli.utils import detect_log_level

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/log_analyzer_cli/parsers/json_log.py` at line 11, Remove the unused
import detect_log_level from the top of the json_log parser module (the import
statement "from log_analyzer_cli.utils import detect_log_level"); simply delete
that import so flake8 no longer flags detect_log_level as unused while leaving
any other imports in json_log.py intact.

Source: Linters/SAST tools

27-27: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix formatting issues flagged by flake8.

Multiple formatting issues are causing CI failures:

Lines 27, 32, 62: Trailing whitespace (W291)
Line 63: Continuation line under-indented for visual indent (E128)

🔧 Proposed fix

Remove trailing whitespace on lines 27, 32, and 62, and fix the indentation on line 63:

     TIMESTAMP_FIELDS = [
-        "timestamp", "time", "ts", "`@timestamp`", "datetime", 
+        "timestamp", "time", "ts", "`@timestamp`", "datetime",
         "date", "created_at", "created", "logged_at"
     ]
     
     LEVEL_FIELDS = [
-        "level", "severity", "loglevel", "log_level", "lvl", 
+        "level", "severity", "loglevel", "log_level", "lvl",
         "priority", "importance"
     ]

-        metadata = {k: v for k, v in data.items() 
-                   if k not in self.TIMESTAMP_FIELDS + self.LEVEL_FIELDS + self.MESSAGE_FIELDS}
+        metadata = {k: v for k, v in data.items()
+                    if k not in self.TIMESTAMP_FIELDS + self.LEVEL_FIELDS + self.MESSAGE_FIELDS}

Also applies to: 32-32, 62-63

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/log_analyzer_cli/parsers/json_log.py` at line 27, Remove trailing
whitespace and fix the continuation indentation in the JSON log parser list of
timestamp keys: strip trailing spaces from the string list entries (the line
containing "timestamp", "time", "ts", "`@timestamp`", "datetime", and the later
multiline list entries) and align the continuation line so it is indented to
match the opening bracket (visual indent) — update the timestamp keys list
(e.g., TIMESTAMP_KEYS or timestamp_fields in json_log.py) accordingly to
eliminate W291 and E128 errors.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/log_analyzer_cli/parsers/json_log.py`:
- Line 11: Remove the unused import detect_log_level from the top of the
json_log parser module (the import statement "from log_analyzer_cli.utils import
detect_log_level"); simply delete that import so flake8 no longer flags
detect_log_level as unused while leaving any other imports in json_log.py
intact.
- Line 27: Remove trailing whitespace and fix the continuation indentation in
the JSON log parser list of timestamp keys: strip trailing spaces from the
string list entries (the line containing "timestamp", "time", "ts",
"`@timestamp`", "datetime", and the later multiline list entries) and align the
continuation line so it is indented to match the opening bracket (visual indent)
— update the timestamp keys list (e.g., TIMESTAMP_KEYS or timestamp_fields in
json_log.py) accordingly to eliminate W291 and E128 errors.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cd12ff06-d30e-4ce0-84c3-52ec2adbbbee

📥 Commits

Reviewing files that changed from the base of the PR and between e93757f and 3213155.

📒 Files selected for processing (2)

src/log_analyzer_cli/parsers/json_log.py
tests/test_parsers.py

sourcery-ai Bot reviewed Jun 9, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

swallow out-of-range numeric timestamps in JSON parser#4

swallow out-of-range numeric timestamps in JSON parser#4
HrachShah wants to merge 1 commit into
mainfrom
fix/json-parser-out-of-range-timestamp-2

HrachShah commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

sourcery-ai Bot commented Jun 9, 2026 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

sourcery-ai Bot Jun 9, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

HrachShah commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Summary by CodeRabbit

Uh oh!

sourcery-ai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for handling out-of-range numeric timestamps in JSON parser

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

HrachShah commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

sourcery-ai Bot commented Jun 9, 2026 •

edited

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading