Refactor to use API as source of truth#48
Conversation
There was a problem hiding this comment.
I did a high-level review and it looks good to me.
Once this is merged, I'm going to clone it and move over some of my code that was uncommitted and didn't make it into this copy. I'll also see if anything broke for me locally, and submit the updates + any fixes as a PR.
|
Since this PR is contingent on the backend serving all the public files as well, I'll hold off on merging. If I were to merge, the dashboard would at least still be live since it's deploying from the private repo, but the data would stop updating daily (not that it changes that much day-to-day usually). Feel free to append any needed backend changes to this PR. |
|
@seandavi FYI we're changing the architecture here. Just wanted to make you aware, I don't think it will affect anyone other than Faisal and I. Note that the poster I presented at the spring meeting is no longer true -- that's no longer how the pipeline works -- just so we don't misspeak about technical details in meetings. |
…0.0.1 over localhost
|
Hey @vincerubinetti, after a few tweaks (committed) I got the frontend to contact the backend for user info, but it seems to not be issuing other requests, e.g. for core projects. I see this in the browser's console:
(Sorry for the image of the traceback; Chrome gave me all sorts of nonsense when I tried to copy-paste the text.) Also, the database starts out unpopulated as expected, and since the data isn't public I don't want to put the dump in this repo. If you'd like to try it locally, I made a dumpfile from the current prod database; I can send you it through another channel as well as instructions on how to load it. |
|
Did you see the I'll look into the router matching issue. If you could send me the db dump that'd be great. Maybe it's even small enough in this case to send through slack? |
Aha, that was it. Would it be ok for me to modify that to pull an env var and default to true if it's not found? About the database, sure thing; it's extremely small (~1mb) and would be easy to send over, e.g. Slack. One last thing, now that I have the frontend calling the backend I've noticed that this version of the backend code is still relying on some private JSON files being available on the backend's filesystem that I was intending to move into the database anyway. I'm thinking I should make that change in this PR rather than holding off on it since you've already done a lot of work removing the other private data dependencies. |
Sounds good to me, let's close it all out with this PR. Let me know if you run into difficulty running the pipeline locally to generate all the json files. |
|
FYI, going to push some small changes today. Also running |
… an error as a result
… to be run in a container
…the "pipeline" service
|
Hey @vincerubinetti, FYI I've updated the PR with the following significant changes:
|
|
@vincerubinetti just an FYI, but I noticed when running the gather script I encountered the following error both inside the container and via the Not sure if that's expected or if you have insight into what might be going on. I had successfully run the |
|
@falquaddoomi Regarding the pipeline failure, that is only happening on GitHub actions, it runs fine locally. The issue is that the "playwright install" step is hanging. From some research, it seems like it's due to unreliability of apt install mirrors that the GitHub microsoft/playwright#23388 It seems like we might be able to just wait out the problem, or pin the Ubuntu version possibly, or use an official Playwright Docker container on Actions: microsoft/playwright#23388 (comment) I'll look into whether that causes any problems with the pipeline on Actions. Have you experienced the same issue with the Regardless, the issue isn't due to anything in this PR and is unrelated. |
|
@vincerubinetti thanks for the response; yeah, seems that error I was seeing was something transient, as I'm able to run the pipeline locally fine, both in the new Given that, here's the remaining todos on my end:
|
|
@falquaddoomi Note my last changes. I renamed the Also, to simplify and normalize things in the Google Analytics, I made some changes. See I thought about trying to update the backend with these naming changes, and I could probably get most of the way there, but I realized I'd probably screw up the migrations especially. However, since we're starting from scratch with this PR, if you wanted to start fresh with the migrations, I can try to make these naming changes for you. Or maybe I can make a first pass at the backend renames, and you can take it from there. |


This PR makes the API/database the single source of truth for all of the data. As a result of this, the gather pipeline could be drastically simplified.
After this PR is merged and deployed, we will be able to delete
icc-eval-core-privateandicc-eval-core-api, making this repo a mono-repo for everything, which simplifies the cross-talk needed between the different systems, and increases transparency imo.Major changes:
icc-eval-core-apiinto this repo. I opted for this other than merging my code into that repo because this repo has a lot of commit history I'd like to preserve./rawfile./appto/frontendgetAnalyticsOverview(CFDE-wide analytics info) next togetAnalytics(for one project) next to each other, instead of having the overview code in the maingather.tsscript. Easier to read and maintain.gather.tsscript, save after each step (again, co-locating by domain). I originally did this because I wanted the whole script to either succeed or fail together and avoid accidentally creating partial/mixed states, e.g. failing on repo step and now half the json files are updated and half are not. Now that the output files are meant to be ephemeral (hopefully made clear by them being gitignored), this doesn't matter as much.