-
-
Notifications
You must be signed in to change notification settings - Fork 111
Enhance Monitor dict, adding primary monitor and name attributes #469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Enhance Monitor dict, adding primary monitor and name attributes #469
Conversation
Currently the Windows implementation is completed but other platforms just use a fallback to monitor in slot 1. Also added Monitor name for Windows.
src/mss/models.py
Outdated
| Pixels = list[tuple[Pixel, ...]] | ||
|
|
||
|
|
||
| class Monitor: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, I've been careful to not change any existing aspects of the public API surface. I haven't talked with @BoboTiG about that, though, so he may have different priorities.
This is an example of something that would be a change to that surface. The usage would still be API-compatible, but the signature wouldn't: something using my_monitor: dict = sct.monitors[1] would stop passing type-checking.
The alternative would be to have Monitor inherit from dict, and provide @property attributes that access its entries. That would provide type and API compatibility, but we could mark dict access as deprecated, and eliminate it in the next major rev, simplifying Monitor at that time.
I know I'm the one that suggested using __getattr__ in the first place, so my apologies about the course change.
As another example, I've got one change in the works that affects a lot of ScreenShot, and I'm looking at the likely changes: I'm seeing which changes can be made in an API- and type-compatible way, and which can't. Along the same lines, I'm thinking about what parts of the exposed API we want to say are "public", in the sense of semantic versioning, and which ones we want to stop exposing with public names (such as mss.windows.MSS.user32).
What do you think, @halldorfannar and @BoboTiG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to not touch the current MSS-specific public API (primary functions and types). The example about mss.windows.MSS.user32 is not considered MSS-specific and can be made private when we want. But Monitor should be kept backward compatible.
The code being used for years still needs to work without changes after we ship the next release. And for the next major release, I am also kind of against breaking the API since this is not really necessary.
How does it sound to both of you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to make sure we’re all aligned on terminology and expectations, since this is really about policy going forward.
First off: I completely agree with the priority of protecting existing users. MSS has earned a lot of trust, and preserving working code across releases is hugely important. Nothing here is meant to undermine that.
What I’m trying to clarify is how we define “public API” and “breaking change,” especially in Python, where the line can be fuzzy.
I’m assuming we’re roughly following Semantic Versioning principles. SemVer only really works if we agree on what the public API is. In Python, the usual convention is “no leading underscore == public,” but MSS historically hasn’t been strict about that (such as the user32 example). That’s fine — but it means we need to be explicit about what we consider stable and versioned.
One thing I’ve personally been treating as part of the public API is type declarations, not just runtime behavior. That’s increasingly important now that many users run MyPy or Pyright in CI. A type signature is effectively a promise, even if Python itself won’t enforce it at runtime.
That leads to an important distinction:
- Runtime compatibility: what actually happens when code runs
- Type compatibility: whether static type checkers still accept the code
They matter in different ways.
Python APIs are famously hard to change without some theoretical breakage. It’s basically impossible to guarantee that no user code could ever be affected — even adding a key to a dict could break someone doing something exotic. So instead of “never break anything,” I’ve been thinking in terms of typical usage. If something would only break highly unusual or contrived code paths, that carries much less weight than breaking common, idiomatic usage.
With that framing, here’s how I’ve been thinking about changes:
- Runtime-breaking changes (new required parameters, removing methods, changing behavior in common paths) are the most serious. Those should only happen in a major release, after deprecation, and only if there’s a strong reason. I agree that we should avoid these without a strong reason.
- Type-only breaking changes (changing a return type in a way that still works at runtime for typical usage) are still API breaks, so should only be at major version boundaries, but they’re lower-impact. They mostly affect users running static analysis, and usually require updating annotations rather than logic.
The Monitor discussion falls into that second category. Returning a Mapping instead of strictly a dict would still behave the same at runtime for typical access (m["left"], iteration, etc.), but it would break type checks for code that annotates it as dict. That’s why I hesitated once I realized the implication.
Now, tying this back to your response: when you say “Monitor should be kept backward compatible,” I fully agree at runtime. Where I want to be careful is whether we’re also committing to never changing exposed types, even at major versions.
The reason this matters is future performance work.
For example, today ScreenShot.bgra is declared as bytes. That implicitly commits us to copying a very large buffer every capture. On a 4K screen that’s ~32 MB, which is several milliseconds per frame — a nontrivial cost if someone is trying to hit 60 fps.
Most OS capture APIs already give us a writable buffer in CPU memory. Python can expose that safely via memoryview, avoiding the copy entirely. For typical usage, a memoryview behaves almost identically to bytes — indexing, slicing, passing to PIL / NumPy / PyTorch all work — but it’s dramatically faster.
From a runtime perspective, this change would be almost invisible to most users. From a typing perspective, however, the distinction matters more.
MSS would need to widen its type declarations (for example, from bytes, to Buffer or bytes | bytearray | memoryview) to reflect what it is returning. Any user code that has explicitly annotated bgra as bytes would start triggering type checker warnings once they upgrade. The runtime logic would still work as before, but static analysis tools would complain. That’s why I’ve been treating this class of change as a public API change, and so far only considering it at major version boundaries.
That’s the kind of change I’m trying to plan carefully around. I don’t want to sneak it in under a minor release — but I do think it’s reasonable to allow this sort of evolution at a major version boundary, because it enables MSS to better deliver on its “ultra-fast” goal.
So my proposed policy, in short, would be:
- Keep the runtime API stable across releases, including major ones, insofar as is reasonable for typical usage. Avoid breaking existing working code unless there is a compelling reason and no practical alternative.
- Treat type changes as public API changes, and allow them at major version boundaries when they enable meaningful improvements, even if runtime behavior for typical usage remains unchanged.
- Be explicit about what we consider public, so users and contributors know what’s safe to rely on.
My intent here isn’t to justify breaking working code, but to make sure we have room to improve performance and structure without surprising users or their tooling.
I’m taking this opportunity to agree on the rules now, so future work doesn’t turn into a philosophical debate every time we touch a boundary.
Curious if that framing matches how you’re thinking about it, or if you see things differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your words are perfectly aligned with my vision, I would not have been able to better phrase it :)
I tried to follow SemVer from the beginning, hopefully enough that it made sense so far.
In my perspective, and I was too lazy at the beginning to properly make those objects private, all objects used by MSS to power MSS content (like user32 on Windows, core on macOS and so on) are not intended to be used by developers/users. I mean, they should have been private from the start, and so we can treat them as we want. It is OK to make them private once and for all.
About the MSS API already used by thousands of people, I do not close the door to improvements, and sometimes necessary breaking changes. If that deserves MSS, then lets do it!
As I am typing, I am thinking that if we change the type of Monitor, for instance, then it might be doable without, hopefully, too many problems to users: in a user code using Monitor, I guess they import the type object from MSS, so if we change it, this could be done transparently to them (best case scenario).
All in all, I am not a conservator, I know pretty well things must change if we want to move forward. Lets doing it with parcimony, and keep users in mind everytime since I would prefer to lower their burden at each release.
I am also totally aligned about preventing philosophy talks at each change (I am not a meeting person, I prefer action first :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enjoy this conversation. It is the right discussion to have, even if it's happening as fallout from my draft MR 😄 I am also a stickler for SemVer. I have followed it on many projects to the letter, and even though it has felt constricting and painful at times, my customers have always thanked me for it.
One of the reasons I made this MR a draft was exactly the uneasy feeling I got when I saw the fallout from changing the Monitor type. Even though I was bending over backwards to ensure runtime compatibility there were still issues, as @jholveck has rightly pointed out. I also just don't like it when a supposedly simple change spreads into multiple files. It feels like a smell, something is off.
My suggestion therefore is that we keep Monitor as a dict type for now. But we add a TODO in there (or whatever mechanism this project prefers for future annnouncement) about this becoming a Class type in the next major release. That way we announce our intent.
I would also like for us to do this with other potential changes we may want to make in the future. If they require changes that break the API we need to target them for the next major release. As part of release planning we should collect those somewhere, so that we can look up the next major release and see what things we are planning to break with that release. I emphasize planning because in my experience we usually find that not all changes get implemented (some are deemed low-value while others have in the light of time and experience been deemed a bad idea 😆).
Finally, I think we should get low-hanging fruit (like this Monitor improvement) into shape so it causes no breakage and cut a release soon. We should then discuss when the next major release is planned, so we can start aligning changes with the right release (major or minor).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A TODO could be added right above the Monitor declaration. It will be a matter of looking for "TODO" strings if anyone wants to give a try.
As MSS is quite stable, there is no really version schedule. When there is a known problem with a fix, or an improvement done, then I cut a fresh version.
Since the latest available MSS version, changes to be released are huge! Cutting a new release soon would allow us to break things for the next version, which would be a major.
I would like to spread those significant improvements @jholveck and you provided so far to the community without a major release.
So yes, maybe sticking to Monitor as a dict would be a good thing until the next release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All righty then! This is a smaller change now which I think is better for everybody involved. I have removed the "Draft" label. Please review at your leisure.
This is to maintain full backwards compatibility.
Had to adjust pyproject.toml so that TODOs don't fail the checking. Wish they could be flagged as warnings but ruff does not support warnings, only errors. Before major releases we should remove the ignore entry for TODO and see what we get.
Add back a line that I accidentally dropped from the CHANGELOG.md
This will make the docs build without warning. The warning is benign.
|
Would you mind updating https://github.com/BoboTiG/python-mss/blob/main/CHANGES.md too? |
|
Do you know what happens with display mirroring, like if you've got a screen being shown on both your laptop and a projector? There's no bad choice there; I was just curious. I might suggest making sure it only shows up once in the list, though, and not multiple times with the same coordinates. I'm drafting the Linux version, and just noticed that possibility. XRandR exposes all the outputs for a given MSS-style "monitor", and each output has its own name (actually, possibly two). Right now, I'm preferring the primary output's name, if it's on the primary monitor, and otherwise just whichever was first in the list. Also, what do y'all think we should do if the system can't detect the name, like if they don't have XRandR installed? Use an empty string? One reason I originally wanted to improve the way we handle the monitor list by using classes with some logic was to be able to query something like At the same time as we change So it might make sense to have each monitor have a |
For now, skipping "name" from the dict steems a good idea. I like the idea to be able to pick a monitor from a name. I never had this use case, but it is interesting to have! |
|
While we're talking about the idea that a user might have different ways they want to identify a monitor, on Windows, there's also the PnP ID ( We are talking about two different use cases: one for a human-friendly label, and possibly multiple for a stable identifier. Uniqueness is also a consideration, if somebody has multiple monitors of the same model. I'm not saying we need to address all this now; we're talking about unusual use cases. But we should at least make sure our design doesn't preclude us from making this viable in the future. |
|
Ooof, and ChatGPT just pointed out an issue with HiDPI and mirroring.
Apparently, Windows can mirror between a HiDPI display and a non-HiDPI display. (ChatGPT elaborates here.) But IIRC, MSS will globally set the running app as HiDPI-aware. I'm not sure how that figures into things. I don't entirely trust LLMs to be 100% accurate on these sorts of things: HiDPI also implies lots of changes to underlying rendering that the model that ChatGPT implies here. But it does seem to be worth examining. |
Why do we have two files, CHANGES.md and CHANGELOG.md? Feels a bit heavyweight, especially considering that CHANGELOG.md seems to contain most changes. If we need both then I would appreciate a guideline about what goes where. I couldn't see anything about these files in the docs and this repo does not have a CONTRIBUTING.md file. |
|
@jholveck I like your thoughts about future improvements. Yes, some of these we will only be able to do in a future major release. I do agree that
I would expect monitors list to contain both displays, and they would have the same coordinates. That makes sense to me. The user may have a reason for wanting to use one or the other. I do not want to reduce this to a single entry and take away that choice. Finally, the DPI scaling curve-ball. I hadn't thought about how that effects things. If I understand you correctly, the design today is such that it is not hurting us - is that right? |
Yeah, the PnP ID is nice for unique identification. Do we have something similar on Linux/macOS? I suggest we name this attribute |
|
Linux doesn't have something quite like that, no. On Linux, the connector ID, like "DVI-1", is the most common identifier. I'll be parsing the EDID block anyway to get the model name, so I can extract the full ID. That's a 10-byte structure including the three-letter manufacturer code (packed five-bit letters, like "PHL" for Philips), product code (16 bits, usually written in hex), serial number (32 bits), and date of manufacture (week and year). Does that seem to be part of what Windows is using there too, or is it totally different? |
No idea. 🤷 |
While I normally prefer to give an API caller all the information and let them decide what to do with that information, I'm not sure about this case. We're really giving the user capture regions, not monitors. When they pass it back to us, all we use is the capture region information. You mention the user wanting to use one or the other, but MSS will do the exact same thing with the data, regardless of which the user provides. The only time that duplicates in the monitors list are relevant, that I can think of, is when invoking Edit: fix quoting. Also, I'm just giving an alternative perspective; we can still use duplicates if that still seems to make more sense. |
I do not recall why two files exactly. At fist, I think I wanted to split technical changes from a humand-readable changelog, so that developers can simply check CHANGES.md for technical adjustments in their code. |
This is good! This will work as UID. |
Hmmm ... so CHANGES.md is for internal changes that benefit those who contribute to this project while CHANGELOG.md is for consumers of the python package? I'm asking so I can get a clear picture of what the intent is. |
Yes! |
If it's the same information as Windows provides, we may as well format it the same: I don't get it in text form, so I'll be building a string. Do you have a monitor's Windows PnP ID handy? If you also have its EDID block, that's be handy, so I can make sure I'm using everything the same (like making hex strings bytewise / big-endian vs. wordwise). |
Changes proposed in this PR
Currently the Windows implementation is completed but other platforms just use a fallback to monitor in slot 1. Also added Monitor name for Windows.
Added a TODO so we can revisit the lack of class for Monitor in a future major release 😉
Fixes #153
(...)
./check.shpassed