Skip to content

[mypyc] Specialize s[i] == 'x' to a codepoint int compare#21579

Open
VaggelisD wants to merge 1 commit into
python:masterfrom
VaggelisD:str-index-compare-specialize
Open

[mypyc] Specialize s[i] == 'x' to a codepoint int compare#21579
VaggelisD wants to merge 1 commit into
python:masterfrom
VaggelisD:str-index-compare-specialize

Conversation

@VaggelisD
Copy link
Copy Markdown
Contributor

7th PR of #21418

Lowers s[i] == 'x' (and the symmetric == / != forms) down to a bounds-checked codepoint read + int compare, instead of CPyStr_GetItem + CPyStr_EqualLiteral which (may) allocate a 1-character PyUnicode per iteration. No annotations are required for this optimization.

On microbenchmarks (1-compare-per-iter hot loop, ~2.5M-codepoint SQL-like string) the comparison is ~3.6x times faster.


Some follow up optimizations that might be worth it I can work on:

  • In operator e.g s[i] in ('a', 'b', 'c') --> Fuse to one check with N int comparisons
  • Comparison operators e.g s[i] < 'x' --> Need to expand the op set
  • s[i] == s[j] — both sides IndexExpr(str) --> Need drop the literal-only guard

Recognizes the AST shape `IndexExpr(str) == StrLiteral` (and the symmetric
`StrLiteral == IndexExpr(str)`, plus the `!=` variants) and lowers it to
an int compare of codepoints reusing the existing CPyStr_GetItemUnsafeAsInt
primitive.

Today the pattern lowers to CPyStr_GetItem + CPyStr_EqualLiteral, which
allocates or looks up a 1-character PyUnicode object per iteration and
goes through a generic string-equality call. After specialization it
becomes an inlined PyUnicode_READ plus an int compare -- about 4x faster
on bench_str_compare with a 3-compares-per-iteration workload, and closer
to ~9x with the more typical 1-compare-per-iteration shape.

No annotations required; benefits any code that compares a string index
against a 1-character literal. Multi-character / empty literals fall
through to the generic path (which still correctly returns False).
Bounds checking is preserved -- the helper raises IndexError for
out-of-range indices, same as the unspecialized path.

Stack: builds on the `ord(s[i])` primitive (python#20578) and the librt.strings
codepoint helpers (python#21462, python#21504, python#21509, python#21521, python#21522, python#21553).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant