-
-
Notifications
You must be signed in to change notification settings - Fork 34k
gh-140232: Do not track frozenset objects with immutables #140234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Mikhail Efimov <efimov.mikhail@gmail.com>
|
Maybe it is worth to change PyObject *
PyFrozenSet_Alloc(PyTypeObject *type, Py_ssize_t nitems)
{
PyObject *obj = PyType_GenericAlloc(type, nitems);
if (obj == NULL) {
return NULL;
}
_PyFrozenSet_MaybeUntrack(obj);
return obj;
} |
The |
sergey-miryanov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me.
…cpython into frozenset_immutable_tracking
|
Would it be possible to write tests in Python rather than in C? |
I tried, but it is not easy. We have to expose Line 2778 in d78d7a5
And when calling |
IIUC, if you return the first argument from pyset_add then you can test it on the python side. |
Ok, I gave it another try. The first attempt failed, but by using the vectorcall convention I can keep the reference count at 1 also from the Python side. |
Objects/setobject.c
Outdated
| void | ||
| _PyFrozenSet_MaybeUntrack(PyObject *op) | ||
| { | ||
| if (op == NULL || !PyFrozenSet_CheckExact(op)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not untracking frozenset subtypes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make a cycle using frozenset subtype:
>>> class F(frozenset):
... pass
...
>>> f = F([1,2,3])
>>> f.cycle = fSo, we need to track them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok. In this case, please add a comment explaining that :-)
Lib/test/test_set.py
Outdated
| # Test the PySet_Add c-api for frozenset objects | ||
| assert _testcapi.pyset_add(frozenset(), 1) == frozenset([1]) | ||
| frozen_set = frozenset() | ||
| self.assertRaises(SystemError, _testcapi.pyset_add, frozen_set, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why the second test fails, whereas the first succeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second test fails because the argument frozen_set is not uniquely referenced. The error is raised here:
Line 2777 in ce4b0ed
| (!PyFrozenSet_Check(anyset) || !_PyObject_IsUniquelyReferenced(anyset))) { |
I will add a comment to the test
Co-authored-by: Victor Stinner <vstinner@python.org>
Modules/_testcapimodule.c
Outdated
|
|
||
| static PyObject * | ||
| // Interface to PySet_Add, returning the set | ||
| pyset_add(PyObject* self, PyObject* const* args, Py_ssize_t nargsf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this function to Modules/_testlimitedcapi/set.c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the function and converted the calling convention to METH_VARARGS. This adds another reference to the set causing the test to fail.
I will convert back to METH_FASTCALL, or define the set at C level.
Modules/_testcapimodule.c
Outdated
| if (return_value < 0) { | ||
| return NULL; | ||
| } | ||
| return Py_NewRef(set); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should return None on success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From #140234 (comment) I gathered we prefer to write the tests in Python (as much as possible). I can return None here, but then I have convert the code from test_set to C (possible, but more code and a bit harder to read)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner A branch with the tests written in C is here:
main...eendebakpt:cpython:frozenset_immutable_tracking_c
Let me know if you prefer that approach, then I will merge the branch into this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can return None here, but then I have convert the code from test_set to C (possible, but more code and a bit harder to read)
Why do you have to write tests in C if this function returns None? Are you talking about tests on frozenset? I see this code in PySet_Add():
if (PyFrozenSet_Check(anyset) && _PyObject_IsUniquelyReferenced(anyset)) {
// We can only change frozensets if they are uniquely referenced. The
// API limits the usage of `PySet_Add` to "fill in the values of brand
// new frozensets before they are exposed to other code". In this case,
// this can be done without a lock.
return set_add_key((PySetObject *)anyset, key);
}If you want to write tests for this code path, writing them in C would be more reliable to have a fine control on the reference count, right. But other tests on the set types can be written in Python, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for returning the set in pyset_add was indeed to be able test the code path you mention for the frozenset. Writing
x = frozenset()
_testinternalcapi.pyset_add(x, 1)
fails (because the frozenset is not uniquely referenced). For that reason I used:
x =_testinternalcapi.pyset_add(frozenset(), 1)
assert 1 in x
Doing that in C indeed seems better.
The other tests (for this PR the TestPySet_Add.test_set and TestPySet_Add.test_frozenset) can still be written in Python, although there are a couple already there for set and set subclasses, so there is not much left to test from the Python side.
Lib/test/test_set.py
Outdated
| for cubevert in edge: | ||
| self.assertIn(cubevert, g) | ||
|
|
||
| class TestPySet_Add(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move these tests to Lib/test/test_capi/test_set.py.
…cpython into frozenset_immutable_tracking
In the PR we untrack frozen tuples for the normal constructors. There are a few methods shared between the
setandfrozenset(for exampleset_intersectioninsetobject.c) where we have not added the untracking. (this is possible, but I am not sure this is worthwhile to do).Here is a small script to test the idea:
It measures the performance of garbage collection, and outputs some statistics for the numbers of frozen containers.
Main:
PR
Note: generative ai was used in creating the PR