📝 Walkthrough

A single HTML file was updated to add a client-side file browser with GitHub API integration, allowing per-package file inspection. Additionally, manifest generation now supports per-path ignore rules, version sorting was improved to prefer newer dates, and new UI elements were introduced for GitHub token configuration and file browsing.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Client as Client UI
    participant State as Local State
    participant GH as GitHub API
    User->>Client: Select package
    Client->>State: Initialize files state (CUR_FILES, caches)
    Client->>GH: Call fetchGhData() with package info
    GH-->>Client: Return readme, tags with dates, contents
    Client->>State: Store GitHub data (GH_DATA)
    User->>Client: Click Files tab
    Client->>Client: loadFilesIfNeeded()
    Client->>GH: buildGhApiUrl() → fetch repository contents
    GH-->>Client: Return file tree with metadata
    Client->>State: Update CUR_FILES, CUR_FILES_CACHED_PKG
    Client->>Client: renderFilesList() + renderFileRow()
    Client-->>User: Display file browser with navigation
    User->>Client: Click file/folder or navigate path
    Client->>Client: navFilesPath(subPath)
    Client->>GH: Fetch contents for new path
    GH-->>Client: Return updated file tree
    Client->>Client: renderFilesList() again
    Client-->>User: Update display with new directory contents
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dfetch_hub/site/index.html`:
- Around line 532-565: The code persists the GitHub personal access token to
localStorage via the setGhToken flow (input id="ghToken",
oninput="setGhToken(this.value)"), which is insecure; change setGhToken and any
storage helpers so the token is kept only in memory (a JS variable/module-scoped
cache) or, if you require reload persistence, in sessionStorage instead of
localStorage; remove any auto-save logic that writes to localStorage and update
the user-facing copy in the element with class "gh-token-note" (and the
hint/status elements "ghTokenHint"/"ghTokenStatus") to reflect that the token is
stored only locally in the browser session (not long-lived localStorage) and is
sent only to GitHub for authenticated requests.
- Around line 976-983: The bug is that RAW[id]._detailLoaded is set before
GitHub enrichment, so fetchGhData(id) won't be retried after a later token is
provided; change the logic in the block that reads ../data/${id}.json and deals
with RAW[id] so that either (a) RAW[id]._detailLoaded is only set after
fetchGhData(c) completes successfully, or (b) introduce and check a separate
flag (e.g., RAW[id]._ghEnriched) and ensure fetchGhData(c) is invoked whenever
_ghEnriched is false even if _detailLoaded is true; also make sure to
clear/unset the relevant flag on rate-limit/error and surface errors from
fetchGhData so retries are possible.
- Around line 867-870: The path model flips because CUR_FILES_PATH is sometimes
set directly to path.split('/') while readers expect index 0 to be a root
sentinel; ensure CUR_FILES_PATH always has the sentinel at index 0 (use ['']
followed by the path segments) and never an empty array—when clearing set
CUR_FILES_PATH = [''] and whenever assigning from a string use a prepend of the
root sentinel (e.g., [''].concat(path === '' ? [] : path.split('/'))); update
every site location that assigns or compares CUR_FILES_PATH (the places that
also touch CUR_FILES, CUR_FILES_CACHED_PKG and CUR_FILES_CACHED_VERSION) so
readers no longer drop the first segment, re-fetch incorrectly, or carry stale
subpaths.
- Around line 357-375: Several CSS rule lines (e.g., selectors like
.mf-settings-drop input, .gh-token-wrap input, .gh-token-steps and others)
exceed the repo's 120-character line length; run the repository formatter or
manually wrap long one-liner rules into multiple lines (one property per line
and break long selector lists) to respect the 120-char limit for HTML/CSS/JS in
{**/*.html,**/*.md,**/*.rst}; reformat the specified hunks (including the other
flagged ranges 986-1004, 1301-1431, 1556) and re-run the test/workflow so the
file passes the line-length/doc8 checks.
- Around line 993-1000: The README base64 is being decoded with atob(d.content)
which produces a Latin1 string and will corrupt non-ASCII; inside the
if(readmeResp.ok) block replace the raw atob assignment to c.readme with a
proper base64→UTF‑8 decode: obtain the binary via atob(d.content), convert each
charCode to a Uint8Array, then use new TextDecoder('utf-8').decode(...) to
produce the correct UTF‑8 text and assign that to c.readme (keep the existing
try/catch and null checks around d.content).
- Around line 1343-1356: The file browser is vulnerable because
renderFileRow/building HTML uses esc() (HTML-escaping) inside inline event
handler attributes and raw innerHTML (fb.innerHTML) so crafted filenames can
break out and execute JS; fix by stopping use of inline JS handlers and
string-concatenated innerHTML: in renderFileRow and where
parentPath/item.name/fullPath are used, construct DOM nodes via
document.createElement, set element.textContent for file names, use
element.addEventListener to call navFilesPath or toggleFileIgnore (referencing
renderFileRow, navFilesPath, toggleFileIgnore, CUR_FILES_PATH, MFS_IGNORES), and
ensure any remaining template insertion (e.g., yamlQ() output) is HTML-escaped
before setting innerHTML or, better, inserted as textContent; do not rely on
esc() for JS contexts—use proper JS string handling only when building JS code
(but prefer event listeners instead).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 40f7e685-90cc-498a-b8d2-a1cce2e6fe0d
📒 Files selected for processing (1)
dfetch_hub/site/index.html
```css
.mf-settings-drop input{font-family:var(--mono);font-size:.74rem;background:rgba(255,255,255,.1);border:1px solid rgba(255,255,255,.2);border-radius:var(--rm);color:#fff;padding:.18rem .38rem;width:80px;outline:none;flex:1}
.mf-settings-drop input:focus{border-color:var(--accent)}
.gh-token-row{padding-top:.35rem;border-top:1px solid rgba(255,255,255,.1);margin-top:.35rem}
.gh-token-wrap{display:flex;gap:.25rem;flex:1}
.gh-token-wrap input{width:100%!important}
.gh-token-toggle{background:rgba(255,255,255,.1);border:1px solid rgba(255,255,255,.2);border-radius:var(--rm);color:rgba(255,255,255,.6);cursor:pointer;padding:.12rem .35rem;font-size:.75rem;flex-shrink:0}
.gh-token-toggle:hover{background:rgba(255,255,255,.18);color:#fff}
.gh-token-status{font-size:.68rem;color:rgba(255,255,255,.45);padding-top:.25rem;display:flex;align-items:center;gap:.25rem}
.gh-token-status .gh-token-valid{color:#27ae60}
.gh-token-status .gh-token-invalid{color:#e74c3c}
.gh-token-status .gh-token-clear{color:rgba(255,255,255,.5);cursor:pointer;margin-left:auto;text-decoration:underline}
.gh-token-status .gh-token-clear:hover{color:#fff}
.gh-token-help{font-size:.68rem;color:rgba(255,255,255,.45);padding:.2rem 0 .1rem}
.gh-token-help summary{color:var(--accent);cursor:pointer;font-weight:600;user-select:none;outline:none}
.gh-token-help summary:hover{color:var(--accent2)}
.gh-token-hint{font-size:.65rem;color:rgba(255,255,255,.4);padding:.15rem 0}
.gh-token-steps{margin-top:.4rem;padding:.45rem .5rem;background:rgba(255,255,255,.05);border-radius:var(--rm);font-size:.66rem;line-height:1.7;color:rgba(255,255,255,.7)}
.gh-token-steps strong{color:#fff}
.gh-token-note{color:rgba(255,255,255,.4);font-size:.62rem;margin-top:.25rem;font-style:italic}
```
Run the formatter on these new hunks.
The Test workflow is already failing on this file, and these added CSS/JS one-liners exceed the repo's 120-character limit.
As per coding guidelines, {**/*.html,**/*.md,**/*.rst}: Enforce maximum line length of 120 characters in HTML/CSS/JS and RST/Markdown files with doc8.
Also applies to: 986-1004, 1301-1431, 1556-1556
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 357 - 375, Several CSS rule lines
(e.g., selectors like .mf-settings-drop input, .gh-token-wrap input,
.gh-token-steps and others) exceed the repo's 120-character line length; run the
repository formatter or manually wrap long one-liner rules into multiple lines
(one property per line and break long selector lists) to respect the 120-char
limit for HTML/CSS/JS in {**/*.html,**/*.md,**/*.rst}; reformat the specified
hunks (including the other flagged ranges 986-1004, 1301-1431, 1556) and re-run
the test/workflow so the file passes the line-length/doc8 checks.
```html
<div class="mf-settings-row gh-token-row">
  <label for="ghToken"
         title="Personal access token for higher GitHub API rate limits.">
    GitHub token:
  </label>
  <div class="gh-token-wrap">
    <input id="ghToken"
           type="password"
           value=""
           placeholder="ghp_xxxxxxxxxxxx"
           oninput="setGhToken(this.value)"
           onkeydown="if(event.key==='Enter'){event.preventDefault();toggleMfSettings()}"
           aria-label="GitHub personal access token" />
    <button class="gh-token-toggle"
            type="button"
            onclick="toggleGhTokenVisibility()"
            aria-label="Toggle token visibility">
      👁
    </button>
  </div>
</div>
<div class="gh-token-hint" id="ghTokenHint">
  <span>Rate: 60/hr → 5000/hr with token</span>
</div>
<details class="gh-token-help">
  <summary>How to get a token (click)</summary>
  <div class="gh-token-steps">
    <div>1. <button type="button" class="gh-token-get" onclick="openGhTokenPage()">Go to GitHub ↗</button></div>
    <div>2. Scroll down, click <strong>Generate token</strong></div>
    <div>3. Copy the token and paste above (auto-saves)</div>
    <div class="gh-token-note">Stored locally in your browser, never sent to any server</div>
  </div>
</details>
<div class="gh-token-status" id="ghTokenStatus"></div>
```
Avoid persisting a personal access token in localStorage.
This leaves a long-lived credential readable by any script on this origin and on shared machines after the tab closes. Prefer in-memory storage, or at most sessionStorage if you need reload persistence. The helper copy at Line 562 should also be updated, because authenticated requests here do send the token to GitHub.
Also applies to: 872-872, 1400-1427
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 532 - 565, The code persists the
GitHub personal access token to localStorage via the setGhToken flow (input
id="ghToken", oninput="setGhToken(this.value)"), which is insecure; change
setGhToken and any storage helpers so the token is kept only in memory (a JS
variable/module-scoped cache) or, if you require reload persistence, in
sessionStorage instead of localStorage; remove any auto-save logic that writes
to localStorage and update the user-facing copy in the element with class
"gh-token-note" (and the hint/status elements "ghTokenHint"/"ghTokenStatus") to
reflect that the token is stored only locally in the browser session (not
long-lived localStorage) and is sent only to GitHub for authenticated requests.
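The memory-only option can be sketched roughly like this (`setGhToken`/`getGhToken` mirror names the review mentions, but this is an illustrative sketch, not the repository's actual code):

```javascript
// Module-scoped, memory-only token store: the token disappears on
// reload by design and is never written to localStorage.
let _ghToken = null;

function setGhToken(value) {
  _ghToken = (value || '').trim() || null; // empty input clears the token
}

function getGhToken() {
  return _ghToken;
}

// Fetch options for api.github.com only; the token goes nowhere else.
function ghFetchOpts() {
  return _ghToken ? { headers: { Authorization: `Bearer ${_ghToken}` } } : {};
}
```

If reload persistence is truly needed, `setGhToken` could additionally write to `sessionStorage` (cleared when the tab closes), which is still a weaker exposure than long-lived `localStorage`.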
```javascript
let CUR_FILES = null;                 // {path:[{name,type,sha,ignored?}]}
let CUR_FILES_PATH = [];              // current breadcrumb path stack
let CUR_FILES_CACHED_PKG = null;      // package ID for cached files
let CUR_FILES_CACHED_VERSION = null;  // version for cached files
```
The file browser's path model is broken right now.
CUR_FILES_PATH flips between [''] and path.split('/'), but the readers always treat index 0 as a root sentinel. That makes the first directory open refetch the current directory, drops the first segment of nested paths, and can carry a stale subpath into the next package/version load.
Possible fix
```diff
-let CUR_FILES_PATH = []; // current breadcrumb path stack
+let CUR_FILES_PATH = ''; // current path relative to repo root
 ...
-  _pushURL({pkg:id});curPkg=id;CUR_FILES=null;CUR_FILES_CACHED_PKG=null;CUR_FILES_CACHED_VERSION=null;showPage('page-detail');window.scrollTo(0,0);
+  _pushURL({pkg:id});curPkg=id;CUR_FILES=null;CUR_FILES_PATH='';CUR_FILES_CACHED_PKG=null;CUR_FILES_CACHED_VERSION=null;showPage('page-detail');window.scrollTo(0,0);
 ...
-  CUR_FILES=null;
+  CUR_FILES=null;
+  CUR_FILES_PATH='';
   CUR_FILES_CACHED_PKG=null;
   CUR_FILES_CACHED_VERSION=null;
 ...
-function buildGhApiUrl(c,version){
+function buildGhApiUrl(c,version,path=''){
   let org=c.org||'';let repo=c.repo||c.name||'';
   const m=(c.url||'').match(/github\.com[\/:]([^\/]+)\/([^\/\s]+)/);
   if(m){org=m[1].toLowerCase();repo=m[2].replace(/\.git$/,'').toLowerCase();}
   const ref=version?(version.kind==='branch'?version.name:version.name):c.default_branch||'main';
-  const pth=CUR_FILES_PATH.slice(1).join('/');
-  const pathParam=pth?'/'+pth:'';
-  return `https://api.github.com/repos/${org}/${repo}/contents${pathParam}?ref=${ref}`;
+  const pathParam=path ? '/' + path.split('/').map(encodeURIComponent).join('/') : '';
+  return `https://api.github.com/repos/${org}/${repo}/contents${pathParam}?ref=${encodeURIComponent(ref)}`;
 }
 ...
-  const curPath=CUR_FILES_PATH.slice(1).join('/');
+  const curPath=CUR_FILES_PATH;
 ...
-  if(CUR_FILES[targetPath]){CUR_FILES_PATH=subPath?subPath.split('/'):[''];renderFilesList();return;}
+  if(CUR_FILES[targetPath]){CUR_FILES_PATH=targetPath;renderFilesList();return;}
 ...
-  const apiUrl=buildGhApiUrl(c,curVersion);
+  const apiUrl=buildGhApiUrl(c,curVersion,targetPath);
 ...
-  CUR_FILES_PATH=targetPath?targetPath.split('/'):[''];
+  CUR_FILES_PATH=targetPath;
```

Also applies to: 975-975, 1231-1233, 1325-1333, 1335-1343, 1379-1392
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 867 - 870, The path model flips
because CUR_FILES_PATH is sometimes set directly to path.split('/') while
readers expect index 0 to be a root sentinel; ensure CUR_FILES_PATH always has
the sentinel at index 0 (use [''] followed by the path segments) and never an
empty array—when clearing set CUR_FILES_PATH = [''] and whenever assigning from
a string use a prepend of the root sentinel (e.g., [''].concat(path === '' ? []
: path.split('/'))); update every site location that assigns or compares
CUR_FILES_PATH (the places that also touch CUR_FILES, CUR_FILES_CACHED_PKG and
CUR_FILES_CACHED_VERSION) so readers no longer drop the first segment, re-fetch
incorrectly, or carry stale subpaths.
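The sentinel invariant described above can be centralized in one small helper, sketched here with hypothetical names (`toFilesPath`/`filesPathString` do not exist in the file; they only illustrate the normalization):

```javascript
// Normalize any path string into the sentinel form: index 0 is always
// the '' root marker, and the array is never empty.
function toFilesPath(path) {
  return [''].concat(!path ? [] : path.split('/'));
}

// Readers drop the sentinel to recover the repo-relative string form.
function filesPathString(stack) {
  return stack.slice(1).join('/');
}
```

Routing every assignment to `CUR_FILES_PATH` through one constructor like this makes the "flipping" class of bug structurally impossible rather than fixed case by case.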
```javascript
if(!c._detailLoaded){
  document.getElementById('readmeBody').innerHTML='<div class="readme-empty">Loading…</div>';
  document.getElementById('vtbody').innerHTML='<tr><td colspan="4" style="text-align:center;color:var(--subtle)">Loading…</td></tr>';
  try{const resp=await fetch(`../data/${id}.json`);if(resp.ok){const d=await resp.json();Object.assign(RAW[id],d);}}catch(e){}
  RAW[id]._detailLoaded=true;
  const isGh=c.source_type==='github'||(c.source_labels||[]).includes('github');
  if(isGh){await fetchGhData(c);}
}
```
Make GitHub enrichment retryable.
Once _detailLoaded is set, fetchGhData() never runs again for that package. If the first unauthenticated call is rate-limited and the user adds a token afterward, README/tags/changelog stay stale until a hard refresh.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 976 - 983, The bug is that
RAW[id]._detailLoaded is set before GitHub enrichment, so fetchGhData(id) won't
be retried after a later token is provided; change the logic in the block that
reads ../data/${id}.json and deals with RAW[id] so that either (a)
RAW[id]._detailLoaded is only set after fetchGhData(c) completes successfully,
or (b) introduce and check a separate flag (e.g., RAW[id]._ghEnriched) and
ensure fetchGhData(c) is invoked whenever _ghEnriched is false even if
_detailLoaded is true; also make sure to clear/unset the relevant flag on
rate-limit/error and surface errors from fetchGhData so retries are possible.
```javascript
const [readmeResp,tagsResp,contentsResp,repoResp]=await Promise.all([
  fetch(`https://api.github.com/repos/${org}/${repo}/readme`,opts),
  fetch(`https://api.github.com/repos/${org}/${repo}/tags?per_page=100`,opts),
  fetch(`https://api.github.com/repos/${org}/${repo}/contents/`,opts),
  fetch(`https://api.github.com/repos/${org}/${repo}`,opts)
]);
if(readmeResp.ok){try{const d=await readmeResp.json();if(d.content){const txt=atob(d.content);c.readme=txt;}}catch(e){}}
if(tagsResp.ok){try{const tags=await tagsResp.json();if(Array.isArray(tags)){const tagList=tags.slice(0,50);const commitResps=await Promise.allSettled(tagList.map(t=>fetch(`https://api.github.com/repos/${org}/${repo}/commits/${t.commit.sha}`,opts).then(r=>r.ok?r.json():null)));tagList.forEach((t,i)=>{if(commitResps[i].status==='fulfilled'&&commitResps[i].value?.commit?.committer?.date)t.date=commitResps[i].value.commit.committer.date;});c.tags=[...new Map(tagList.map(t=>({name:t.name,commit_sha:t.commit?.sha||'',date:t.date||'',...(t.commit?.url?{commit_url:t.commit.url}:{})})).map(t=>[t.name,t])).values()];}}catch(e){}}
```
🧩 Analysis chain
🌐 Web query:
GitHub API README endpoint content encoding base64
💡 Result:
GitHub’s REST README endpoint (GET /repos/{owner}/{repo}/readme) returns a contents JSON object where:
- `encoding` is `"base64"`
- `content` is the README bytes base64-encoded (often with `\n` line breaks inserted) [1]
So to get the README text, remove line breaks from content and base64-decode it.
If you want the README as raw bytes/text instead of base64 JSON, request the raw media type via Accept (GitHub documents application/vnd.github.raw+json) [1]. Older examples commonly use application/vnd.github.v3.raw and work in many clients [2].
Example (curl, raw):

```shell
curl -L \
  -H "Accept: application/vnd.github.raw+json" \
  https://api.github.com/repos/OWNER/REPO/readme
```

Example (decode JSON `content` in JS):

```javascript
const decoded = Buffer.from(content.replace(/\n/g, ""), "base64").toString("utf8");
```

Sources: GitHub REST "Repository contents / Get a repository README" docs [1], Stack Overflow example using the raw Accept header [2].
🏁 Script executed:

```shell
# Check the current line lengths in the suggested fix to see if they comply with 120 char limit
python3 << 'EOF'
# Suggested fix lines
lines = [
    "        if(readmeResp.ok){",
    "          try{",
    "            const d=await readmeResp.json();",
    "            if(d.content){",
    "              const bin=atob(d.content.replace(/\\n/g,''));",
    "              const bytes=Uint8Array.from(bin,ch=>ch.charCodeAt(0));",
    "              c.readme=new TextDecoder('utf-8').decode(bytes);",
    "            }",
    "          }catch(e){}",
    "        }"
]
max_len = 120
violations = []
for i, line in enumerate(lines, 1):
    if len(line) > max_len:
        violations.append(f"Line {i}: {len(line)} chars - {line}")
if violations:
    print("Line length violations (>120 chars):")
    for v in violations:
        print(v)
else:
    print("All lines comply with 120 char limit")
print(f"Longest line: {max(len(line) for line in lines)} chars")
EOF
```

Repository: dfetch-org/dfetch-hub
Length of output: 124
Decode README content as UTF-8 bytes, not a raw atob() string.
atob() decodes base64 to a Latin1 binary string. Assigning that directly to c.readme will garble any non-ASCII characters in the README (emoji, accented characters, etc.).
Suggested fix
```diff
- if(readmeResp.ok){try{const d=await readmeResp.json();if(d.content){const txt=atob(d.content);c.readme=txt;}}catch(e){}}
+ if(readmeResp.ok){
+   try{
+     const d=await readmeResp.json();
+     if(d.content){
+       const bin=atob(d.content.replace(/\n/g,''));
+       const bytes=Uint8Array.from(bin,ch=>ch.charCodeAt(0));
+       c.readme=new TextDecoder('utf-8').decode(bytes);
+     }
+   }catch(e){}
+ }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 993 - 1000, The README base64 is
being decoded with atob(d.content) which produces a Latin1 string and will
corrupt non-ASCII; inside the if(readmeResp.ok) block replace the raw atob
assignment to c.readme with a proper base64→UTF‑8 decode: obtain the binary via
atob(d.content), convert each charCode to a Uint8Array, then use new
TextDecoder('utf-8').decode(...) to produce the correct UTF‑8 text and assign
that to c.readme (keep the existing try/catch and null checks around d.content).
```javascript
    html+=`<div class="files-row files-parent-row" onclick="navFilesPath('${esc(parentPath)}')" style="border-bottom:1px solid var(--border)"><span class="files-ico">${iconParent()}</span><span class="files-name">.. (parent directory)</span></div>`;
  }
  sorted.forEach(item=>{html+=renderFileRow(item);});
  html+='</div></div>';
  fb.innerHTML=html;
  updFilesIgnCnt();
}
function renderFileRow(item){
  const ico=item.type==='dir'?iconFolder():iconFile();
  const curDir=CUR_FILES_PATH.slice(1).join('/');
  const fullPath=curDir?curDir+'/'+item.name:item.name;
  const ign=MFS_IGNORES[curPkg]?.has(fullPath)||item.ignored;
  const folderNav=item.type==='dir'?`ondblclick="event.stopPropagation();navFilesPath('${esc(item.name)}')" title="Double-click to open"`:' title="Check to ignore"';
  return `<div class="files-row"><input type="checkbox" ${ign?'checked':''} onchange="toggleFileIgnore('${esc(fullPath)}',this.checked)"><span class="files-ico">${ico}</span><span class="files-name${ign?' ignored':''}" ${folderNav}>${esc(item.name)}</span></div>`;
```
🧩 Analysis chain
🏁 Script executed:

```shell
# First, check the file exists and get line count
wc -l dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 96

🏁 Script executed:

```shell
# Read the specific lines mentioned in the review
sed -n '1343,1356p' dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 1123

🏁 Script executed:

```shell
# Also check lines 1510-1528
sed -n '1510,1528p' dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 1351

🏁 Script executed:

```shell
# Find the esc() function implementation
rg -n "function esc" dfetch_hub/site/index.html | head -20
```

Repository: dfetch-org/dfetch-hub
Length of output: 141

🏁 Script executed:

```shell
# Find the yamlQ() function implementation
rg -n "function yamlQ|const yamlQ|let yamlQ" dfetch_hub/site/index.html | head -20
```

Repository: dfetch-org/dfetch-hub
Length of output: 186

🏁 Script executed:

```shell
# Find the _ESC definition
rg -n "_ESC" dfetch_hub/site/index.html | head -10
```

Repository: dfetch-org/dfetch-hub
Length of output: 219

🏁 Script executed:

```shell
# Get more context around the _ESC definition
rg -B5 -A5 "_ESC\s*=" dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 1249

🏁 Script executed:

```shell
# Let's also check how innerHTML is being used with the html variable
rg -n "innerHTML" dfetch_hub/site/index.html | head -20
```

Repository: dfetch-org/dfetch-hub
Length of output: 2038

🏁 Script executed:

```shell
# Search for where yamlQ results are inserted into the DOM
rg -B3 -A3 "yamlQ" dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 2602

🏁 Script executed:

```shell
# Check for CSP or other security headers
rg -i "content-security-policy|csp|script.*src" dfetch_hub/site/index.html | head -20
```

Repository: dfetch-org/dfetch-hub
Length of output: 233

🏁 Script executed:

```shell
# Check if DOMPurify is used and how
rg -n "DOMPurify" dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 366

🏁 Script executed:

```shell
# Verify the exact context where yamlQ is used in HTML generation (around 1510-1528)
sed -n '1500,1530p' dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 2104

🏁 Script executed:

```shell
# Check for any sanitization or token storage details
rg -n "token|localStorage|sessionStorage" dfetch_hub/site/index.html | head -20
```

Repository: dfetch-org/dfetch-hub
Length of output: 1793

🏁 Script executed:

```shell
# Let me check what DOMPurify is used for and if it's applied to the file list HTML
sed -n '1340,1360p' dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 1585

🏁 Script executed:

```shell
# Check the function that uses fb.innerHTML to see if any sanitization happens
rg -B10 "fb.innerHTML" dfetch_hub/site/index.html
```

Repository: dfetch-org/dfetch-hub
Length of output: 5153
🏁 Script executed:

```shell
# Verify the exact vulnerability: test if esc() in onclick attributes provides JS-context escaping
python3 << 'EOF'
# Test case: a filename with a quote and JavaScript
filename = "');alert('xss');//"
# Simulate esc() function
def esc(s):
    s = str(s or '')
    escapes = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;'}
    return ''.join(escapes.get(c, c) for c in s)
escaped = esc(filename)
print(f"Original filename: {filename}")
print(f"After esc(): {escaped}")
# Now simulate HTML parsing - browsers decode HTML entities in attributes
html_string = f"<div onclick=\"navFilesPath('{escaped}')\"></div>"
print(f"HTML string: {html_string}")
# When parsed, entities are decoded
decoded_onclick = html_string.replace('&#039;', "'")
print(f"After browser parsing entities: {decoded_onclick}")
# Extract what JS sees
import re
match = re.search(r"onclick=\"([^\"]+)\"", decoded_onclick)
if match:
    js_code = match.group(1)
    print(f"JavaScript sees: {js_code}")
EOF
```

Repository: dfetch-org/dfetch-hub
Length of output: 369
Critical XSS vulnerability in file browser: crafted filenames can execute JavaScript and steal tokens.
Repository paths (parentPath, item.name, fullPath) are interpolated into inline event handlers using esc(), which performs HTML entity escaping (`&#039;` for quotes) rather than JavaScript string escaping. When innerHTML renders this HTML, the browser decodes entities back to literal characters, allowing attackers to break out of the string context and inject arbitrary code into onclick/ondblclick handlers.
Additionally, at lines 1510–1528, yamlQ() results are wrapped in HTML without escaping, permitting direct HTML injection through filenames.
A crafted filename like `');alert(1);//` or `<img src=x onerror=alert(1)>` can execute JavaScript with access to stored GitHub tokens in localStorage.
Use textContent instead of innerHTML where possible, or apply proper JavaScript string escaping for event attributes (not HTML entity escaping), and HTML escape all content inserted via innerHTML.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dfetch_hub/site/index.html` around lines 1343 - 1356, The file browser is
vulnerable because renderFileRow/building HTML uses esc() (HTML-escaping) inside
inline event handler attributes and raw innerHTML (fb.innerHTML) so crafted
filenames can break out and execute JS; fix by stopping use of inline JS
handlers and string-concatenated innerHTML: in renderFileRow and where
parentPath/item.name/fullPath are used, construct DOM nodes via
document.createElement, set element.textContent for file names, use
element.addEventListener to call navFilesPath or toggleFileIgnore (referencing
renderFileRow, navFilesPath, toggleFileIgnore, CUR_FILES_PATH, MFS_IGNORES), and
ensure any remaining template insertion (e.g., yamlQ() output) is HTML-escaped
before setting innerHTML or, better, inserted as textContent; do not rely on
esc() for JS contexts—use proper JS string handling only when building JS code
(but prefer event listeners instead).
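To make the context mismatch concrete, here is a sketch contrasting HTML-entity escaping with JS-string-literal escaping (`escHtml`/`escJs` are illustrative helpers, and even a correct `escJs` is only a fallback; the listener-based rewrite described above remains the preferred fix):

```javascript
// HTML-context escaping: right for text nodes and attribute *values*,
// wrong inside an inline handler, because the HTML parser decodes the
// entities back before the JS engine ever sees the string.
function escHtml(s) {
  return String(s).replace(/[&<>"']/g,
    c => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' }[c]));
}

// JS-string-literal escaping: what an inline handler would actually need.
function escJs(s) {
  return String(s)
    .replace(/\\/g, '\\\\')
    .replace(/'/g, "\\'")
    .replace(/\n/g, '\\n')
    .replace(/</g, '\\x3c'); // keeps '</script>' inert inside scripts
}
```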
Summary by CodeRabbit
New Features
Improvements