gh-150662: Stop unbounded memory growth in Tachyon --gecko collector #150845

Open
maurycy wants to merge 5 commits into python:main from maurycy:gecko-ad-inf


@maurycy maurycy commented Jun 3, 2026

The PR fixes an unbounded memory growth caused by:

for t in times:
    samples_stack.append(stack_index)
    samples_time.append(t)
    samples_delay.append(None)

It was reported in gh-150662, and the detailed idea for the fix comes from @pablogsal:

#150662 (comment)
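
To illustrate the general idea of bounding memory for per-sample columns such as samples_time, here is a minimal sketch of a column that keeps only a small buffer in memory and spills the rest to a temporary file. It is not the code from this PR: the class name `TypedSpillSketch`, the `limit` parameter, and the default typecode are all hypothetical, and it assumes the column holds only numbers.

```python
import array
import tempfile


class TypedSpillSketch:
    """Hypothetical numeric column holding at most `limit` values in memory."""

    def __init__(self, typecode="d", limit=4096):
        self._buf = array.array(typecode)
        self._limit = limit
        # Binary temporary file; removed automatically when closed.
        self._file = tempfile.TemporaryFile()
        self._count = 0

    def append(self, value):
        self._buf.append(value)
        self._count += 1
        if len(self._buf) >= self._limit:
            self._flush()

    def _flush(self):
        # Write the raw machine values to disk and empty the buffer.
        self._buf.tofile(self._file)
        del self._buf[:]

    def __len__(self):
        return self._count

    def __iter__(self):
        # Read the spilled values back one bounded chunk at a time.
        self._flush()
        self._file.seek(0)
        chunk_bytes = self._limit * self._buf.itemsize
        while True:
            data = self._file.read(chunk_bytes)
            if not data:
                break
            chunk = array.array(self._buf.typecode)
            chunk.frombytes(data)
            yield from chunk
        # Return to the end so later appends are not written mid-file.
        self._file.seek(0, 2)
```

With a column like this, the in-memory footprint depends on `limit` rather than on the number of samples, at the cost of disk I/O and of losing everything if the process dies before the final profile is written.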

Discussion

I don't think the other collectors have this issue: pstats, collapsed/flamegraph, heatmap, and jsonl should all just plateau. I've reviewed them. I pondered this for a day, and I don't think there's a better fix? It's not really crash-resilient. It likely doesn't matter that much here, as I'm really not sold on using Gecko for really long-term profiling. The binary format is much better in this regard, and I've started experimenting with a different fix there. Perhaps we should encourage recording in the binary format more? The tests stay as is.

(No longer) Reproduction

2026-06-03T13:48:12.219584000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (gecko-ad-inf fcfb002*) % ./python.exe -c "
def work(): return sum(i*i for i in range(2000))
while True: work()
" & TARGET=$!

sudo ./python.exe -m profiling.sampling attach --gecko -r 10000 -d 900 -o /tmp/gecko.json $TARGET &
sleep 2; PROF=$(pgrep -fn "profiling.sampling attach")
for i in $(seq 15); do printf "t=%2dmin  RSS=%d MB\n" $i $(($(ps -o rss= -p $PROF|tr -d ' ')/1024)); sleep 60; done
[1] 80893
[2] 80894
t= 1min  RSS=30 MB
t= 2min  RSS=30 MB
t= 3min  RSS=30 MB
t= 4min  RSS=30 MB
t= 5min  RSS=30 MB
t= 6min  RSS=30 MB
t= 7min  RSS=30 MB
t= 8min  RSS=30 MB
t= 9min  RSS=30 MB
t=10min  RSS=30 MB
t=11min  RSS=30 MB
t=12min  RSS=30 MB
t=13min  RSS=30 MB
t=14min  RSS=30 MB
t=15min  RSS=30 MB
Captured 9,000,001 samples in 900.00 seconds
Sample rate: 10,000.00 samples/sec
Error rate: 27.59
Gecko profile written to /tmp/gecko.json
Open in Firefox Profiler: https://profiler.firefox.com/

[2]  + done       sudo ./python.exe -m profiling.sampling attach --gecko -r 10000 -d 900 -o  

yield chunk


class NDJSONSpillColumn:

@maurycy maurycy Jun 3, 2026


The story here is that I started with TypedSpillColumn, as in the idea from gh-150662, but array is not really a great fit for opcode markers if we don't want to maintain a separate serialization layer.

Reusing NDJSON name to avoid confusion with the --jsonl collector.

I think the best call would be to have only SpillColumn without array. It would massively simplify GeckoThreadSpill, but at the expense of 2-3x higher disk usage.
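
As a rough illustration of the trade-off described in this comment, the sketch below spills values as one JSON document per line, so it can hold None, strings, or small dicts that a typed array cannot. This is not the PR's NDJSONSpillColumn: the class name `NDJSONSpillSketch` and the `limit` parameter are hypothetical, and the text encoding is what makes the on-disk size larger than raw machine values.

```python
import json
import tempfile


class NDJSONSpillSketch:
    """Hypothetical column that spills arbitrary JSON values, one per line."""

    def __init__(self, limit=4096):
        self._buf = []
        self._limit = limit
        # Text-mode temporary file; removed automatically when closed.
        self._file = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
        self._count = 0

    def append(self, value):
        self._buf.append(value)
        self._count += 1
        if len(self._buf) >= self._limit:
            self._flush()

    def _flush(self):
        # json.dumps never emits a raw newline, so one value fits one line.
        self._file.writelines(json.dumps(v) + "\n" for v in self._buf)
        self._buf.clear()

    def __len__(self):
        return self._count

    def __iter__(self):
        # Decode one line at a time instead of loading the whole file.
        self._flush()
        self._file.seek(0)
        for line in self._file:
            yield json.loads(line)
        # Return to the end so later appends are not written mid-file.
        self._file.seek(0, 2)
```

Because each line is already valid JSON, a writer could copy values into the final profile without a separate serialization layer, which is the simplification the comment is weighing against disk usage.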

"processType": thread_data["processType"],
"processName": thread_data["processName"],
}
file.write("{")

@maurycy maurycy Jun 3, 2026


Serializing with json here would suck the spilled data back into memory.
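
A minimal sketch of what writing the output by hand can look like: the structural characters are written directly, and only one value at a time goes through json.dumps, so the whole profile never has to exist as a single Python object. The function names `stream_json_array` and `stream_thread` and the two-key layout are hypothetical and much simpler than the real Gecko profile.

```python
import io
import json


def stream_json_array(file, values):
    """Write an iterable as a JSON array without building a list first."""
    file.write("[")
    first = True
    for value in values:
        if not first:
            file.write(",")
        file.write(json.dumps(value))
        first = False
    file.write("]")


def stream_thread(file, name, samples_time):
    """Write a tiny, made-up thread object with one streamed column."""
    file.write("{")
    file.write('"name":' + json.dumps(name) + ",")
    file.write('"time":')
    stream_json_array(file, samples_time)
    file.write("}")


out = io.StringIO()
# An iterator stands in for a column that is read back from disk.
stream_thread(out, "MainThread", iter([0.0, 1.5, None]))
profile = json.loads(out.getvalue())
```

Parsing the streamed text back with json.loads, as in the last line, is only reasonable for small outputs such as tests, since it rebuilds the full object in memory.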

self._prepare_for_serialization()
file = io.StringIO()
self._stream_profile(file)
return json.loads(file.getvalue())

@maurycy maurycy Jun 3, 2026


_build_profile() is now a test helper. Maybe we should move it to tests?
