gh-119118: Fix performance regression in tokenize module #119615

Merged
pablogsal merged 7 commits into python:main from lysnikolaou:performance-tokenize
May 28, 2024

Member

@lysnikolaou lysnikolaou commented May 27, 2024

  • Cache the line object so that a new Unicode object is not created for every token on the same line.
  • Speed up byte-offset to column-offset conversion by measuring the difference over the smallest possible buffer.
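
For illustration only, here is a minimal Python sketch of the two ideas above. The actual change lives in CPython's C tokenizer code; the `ColumnTracker` class, its methods, and the `line_decodes` counter are hypothetical names invented for this example and are not part of the PR. The sketch assumes lines are UTF-8 encoded and that every byte offset passed in falls on a character boundary, as token boundaries do.

```python
class ColumnTracker:
    """Hypothetical helper, not the PR's code: maps byte offsets to columns."""

    def __init__(self) -> None:
        self._lineno = None      # line number of the cached line
        self._text = ""          # decoded line, shared by all tokens on it
        self._last_byte = 0      # byte offset of the previous conversion
        self._last_col = 0       # column offset of the previous conversion
        self.line_decodes = 0    # how many times a whole line was decoded

    def line_text(self, lineno: int, line: bytes) -> str:
        # Idea 1: decode the line once and reuse the resulting str object
        # for every token that sits on the same line.
        if lineno != self._lineno:
            self._lineno = lineno
            self._text = line.decode("utf-8")
            self._last_byte = 0
            self._last_col = 0
            self.line_decodes += 1
        return self._text

    def column(self, lineno: int, line: bytes, byte_offset: int) -> int:
        self.line_text(lineno, line)
        if byte_offset < self._last_byte:
            # Moving backwards: restart from the beginning of the line.
            self._last_byte = 0
            self._last_col = 0
        # Idea 2: decode only the bytes between the previous offset and the
        # requested one, instead of the whole prefix of the line.
        delta = line[self._last_byte:byte_offset].decode("utf-8")
        self._last_col += len(delta)
        self._last_byte = byte_offset
        return self._last_col
```

With tokens visited left to right, each call decodes only the bytes since the previous token, so the per-line cost is roughly proportional to the line length rather than to the line length times the number of tokens.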
