← All blogs

NotesMySQL · Indexing

How a 2 GB buffer pool made MySQL full-text 100× faster

The brief looked simple: search a global contact list by name, prefix matches first, partial matches second. The interesting part wasn't picking the right index — it was discovering how much the buffer pool changed the answer.

The problem

Search a company table by name. Results that start with the query should rank above results that merely contain it. The table holds tens of millions of rows. The dataset I tested against had two sizes — 23 million rows and 93 million rows — both indexed two ways: a regular B-tree and a MySQL FULLTEXT index.

Two indexes, two very different jobs

On paper, a full-text index is the textbook answer for "search inside text." In practice the two indexes do different things:

Two other defaults bite you on the first run:

The query

Two variants — one for each index. The CASE in the ORDER BY is how prefix matches float above substring matches.

-- Full-text
SELECT name FROM company, (SELECT @a:=NULL) AS init
WHERE MATCH(name) AGAINST("sing*" IN BOOLEAN MODE)
ORDER BY
  CASE
    WHEN name LIKE "sing%" THEN 0
    ELSE LOCATE("sing", name)
  END, name
LIMIT 5000;

-- Regular B-tree
SELECT _name FROM company, (SELECT @a:=NULL) AS init
WHERE _name LIKE "sing%"
ORDER BY _name
LIMIT 5000;

The (SELECT @a:=NULL) trick disables MySQL's query cache for the run, so each measurement starts cold.

First pass: looks like a tie

On 23M rows, sorted, with default settings:

IndexQueryTimeRows touched
Full-textsing*5.8 s23,429
B-treesing%30 ms19,927
B-tree%sing%11.5 s23.3 M

B-tree on a prefix query is the obvious winner. But the assignment also wanted substring matches surfaced, where the B-tree falls off a cliff. Full-text "should" win — and yet it took 5.8 seconds. The estimated query cost from EXPLAIN FORMAT=JSON said full-text was four times cheaper. Why was it the slowest in practice?

The buffer pool was the bottleneck

MySQL's innodb_buffer_pool_size caches pages of data and index in memory. The default is 128 MB. When the working set of a full-text index doesn't fit, every probe pays disk I/O.

The full-text index for the 23M-row table didn't fit in 128 MB. So I raised the buffer pool to 2 GB and re-ran the exact same queries.

ConfigIndexQueryTime
128 MB bufferFTsing*5.8 s
2 GB bufferFTsing*172 ms
2 GB bufferB-tree%sing%11.5 s

Same data, same query, same SQL — ~34× faster. The change was a single config knob.

It scales

On 93M rows the gap widens. Sorted full-text on default settings ran for over 21 seconds. With a 3 GB buffer it finished in 585 ms.

ConfigIndexTime (sorted)
128 MB bufferFT21.7 s
2 GB bufferFT10.1 s
3 GB bufferFT585 ms
B-tree sing%153 ms
B-tree %sing%46 s

A heuristic dropped out of this: with a default-sized buffer, full-text seems to take roughly 5 seconds for every 25K rows it scans. With the index resident in memory, that drops to 0.5–1 second per 100K rows.

A few surprises worth remembering

A decision tree

The lesson: the right index is half the answer. The other half is making sure the index lives in RAM. On full-text indexes especially, innodb_buffer_pool_size is the lever — get the working set in memory and the same query runs orders of magnitude faster, with no code change.