gh-150411: fix gc_generation.count race#150413
Conversation
|
Same concerns as #150356 (comment) CC @nascheme |
| gcstate->old[0].count, | ||
| gcstate->old[1].count); | ||
| _Py_atomic_load_int_relaxed(&gcstate->young.count), | ||
| _Py_atomic_load_int_relaxed(&gcstate->old[0].count), |
There was a problem hiding this comment.
Does gcstate->old[0].count need to be atomic here? I think it is only ever written under stw so it is unnecessary.
There was a problem hiding this comment.
Hi @kumaraditya303 ,
Thanks very much for your review! ❤️
Yes, I searched the usage of count and find that only the young.count is updated without STW guard.
old[generation].count are all protected by STW.
So we only need to make young.count use atomic action and old[generation].count read is safe.
b71383e to
639cc24
Compare
|
Hi @pablogsal and @kumaraditya303 , Thanks very much for your review and suggestions! I have updated the patch with only atomic for Please correct me if I made any mistake or misunderstanding. Wish you a good day! 🌞 |
Issue
gh-150411
Root Cause
Refer the analytics in issue gh-150411, it should be a
gc_generation.countupdate during the cyclic object allocation triggered the local allocation count migrated to young generation. Meantime we try to read thegc_generation.countwithout sync caused the race.cpython/Python/gc_free_threading.c
Lines 2017 to 2037 in c714b56
I find this problem during proposing gh-150356. So they have similar reproduce way.
Proposed Changes
I added an atomic relax load guard for the
gc_generation.count.It was protected in other places expect current one.