Why MySQL slows down without specifying an index

I'm trying to optimize a SQL query, but I'd like to know how to do it correctly.

SELECT 
    r0_.*
FROM ride r0_ use index (ride_booking_id_IDX)
LEFT JOIN booking b1_ ON r0_.booking_id = b1_.id 
LEFT JOIN spot s2_ ON r0_.from_spot_id = s2_.id 
LEFT JOIN spot s3_ ON r0_.to_spot_id = s3_.id 
WHERE b1_.start_at <= '2023-04-21' 
    AND b1_.end_at >= '2023-04-20' 
    AND b1_.paid_at IS NOT NULL 
    AND b1_.cancelled_at IS NULL 
    AND ((s2_.zone_id = 1 OR s3_.zone_id = 1)) 
    AND s2_.type = 'parking';

Here, I force the index (booking_id, from_spot_id, to_spot_id), which causes the query to execute within 25 seconds of the nearest date for about 100 milliseconds!

The

booking table has about 2 million rows, and the ride table has about 5 million rows.

However, I can see it scanning more rows using the forced index:

id	Choose a type	surface	type	Possible keys	key	key_len	refer to	OK	Filtered	additional
1	Simple	b1_	scope	Main, booking_id_end_IDX, booking_id_IDX, booking_id_start_IDX, IDX_E00CEDDEB75363F7, IDX_E00CEDDE37D3107C, IDX_E00CEDDEDEA4208C, booking_paid_at_IDX, booking_cancelled_at_IDX	IDX_E00CEDDE37D3107C	6		111456	6.6	Use index conditions; use location
1	Simple	r0_	refer to	ride_booking_id_IDX	ride_booking_id_IDX	109	ector.b1_.id	1	100.0
1	Simple	s2_	eq_ref	main,IDX_B9327A739F2C3FAB,spot_type_IDX	main	4	ector.r0_.from_spot_id	1	72.52	place of use
1	Simple	s3_	eq_ref	main	main	4	ector.r0_.to_spot_id	1	100.0	place of use

Compared to the same query without using indexes:

id	Choose a type	surface	type	Possible keys	key	key_len	refer to	OK	Filtered	additional
1	Simple	s2_	refer to	main,IDX_B9327A739F2C3FAB,spot_type_IDX	spot_type_IDX	767	constant	161	100.0	Use index conditions
1	Simple	r0_	refer to	IDX_9B3D7CD0ABAF30D3, IDX_9B3D7CD03301C60, ride_booking_id_IDX, booking_from_spot_to_spot_IDX	IDX_9B3D7CD0ABAF30D3	5	ector.s2_.id	392	100.0
1	Simple	b1_	eq_ref	Main, booking_id_end_IDX, booking_id_IDX, booking_id_start_IDX, IDX_E00CEDDEB75363F7, IDX_E00CEDDE37D3107C, IDX_E00CEDDEDEA4208C, booking_paid_at_IDX, booking_cancelled_at_IDX	main	108	ector.r0_.booking_id	1	5.0	place of use
1	Simple	s3_	eq_ref	main	main	4	ector.r0_.to_spot_id	1	100.0	place of use

As far as I can tell, the dates I'm using to compare against start_at and end_at are the reason why the query is noticeably faster.

So I try to isolate the slowest parts to smaller queries:

Select * from booking b where b.start_at < '2021-01-01' 和 b.end_at > '2021-01-01';

On the table booking I have two indexes (start_at) and (end_at) which help this query run faster as you approach the max and min values Faster (since the index will filter most rows, leaving very few rows to scan).

However, when I take a random value far enough in the past, it becomes much slower. The above query takes 10 seconds to run because it only uses one of the two indexes as expected and I don't know why the explanation for merge_index doesn't show up on such a simple query:

id	Choose a type	surface	Partition	type	Possible keys	key	key_len	refer to	OK	Filtered	additional
1	Simple	b		scope	IDX_E00CEDDEB75363F7,IDX_E00CEDDE37D3107C	IDX_E00CEDDEB75363F7	6		1147319	50	Use index conditions; use location

Since I can't have an index that satisfies both range conditions, I tried splitting the query in half

SELECT * from booking b
INNER JOIN booking b2 use index(booking_id_start_IDX)  ON b.id = b2.id and b2.start_at < '2021-01-01'
INNER JOIN booking b3 use index(booking_id_end_IDX) ON b.id = b3.id and b3.end_at > '2021-01-01';

This query runs significantly faster, taking approximately 600 milliseconds. However, due to the simplicity of the query and the fact that it returns about 7k rows, I would expect it to be in the double digits at best.

I don't understand why the query doesn't automatically select my indexes (id, start_at) and (id, end_at)? What am I missing?

I know I could partition the table to get better results, but I have foreign keys that cannot be deleted, so that's not a solution. Should I consider another schema and have a table that holds the reservation dates separately without any foreign keys and have the reservations table reference it so I can partition the reservations table? Is it possible to use foreign keys to reference a partitioned subscription table at subscription time?

The Mysql engine is running in AWS with the following version: 8.0.mysql_aurora.3.02.2

The output of SELECT @@optimizer_switch is:

index_merge=on, index_merge_union=on, index_merge_sort_union=on, index_merge_intersection=on, engine_condition_pushdown=on, index_condition_pushdown=on, mrr=on, mrr_cost_based=on, block_nested_loop=on, batched_key_access=off, materialization=on, semijoin=on, loosescan=on, firstmatch=on, duplicateweedout=on, subquery_materialization_cost_based=on, use_index_extensions=on, condition_fanout_filter=on, provided_merge=on, use_invisible_indexes=off, skip_scan=on, hash_join=on, subquery_to_derivative=off , prefer_ordering_index=on, hypergraph_optimizer=off, derivative_condition_pushdown=on

P粉787806024561 days ago721

Why MySQL slows down without specifying an index

reply all(1)I'll reply