The problem is with a hypothetical online store; I will simplify the data for my specific problem.
Suppose there are two tables: one holds the registration date of each customer
CREATE TABLE customers ( customerID int, register DATE, );
, the second one saves all the customer’s online visits
CREATE TABLE visits ( customerID int, visit DATE, );
For each user, I want to find the last day the user visited the online store within 7 days of registration, in the most efficient way. If the user has not visited the store since registration, I should return NULL as the last visit date within 7 days of registration.
Of course I can LEFT JOIN customers access by Customer ID and filter the access date to between the registration day and Add seven, and finally get the maximum date of each customer. However, the initial join results in a huge table and I want to avoid this computationally expensive operation.
Thank you very much for your thoughts!
P粉5144588632024-03-23 10:28:00
This is probably the most efficient:
SELECT c.customerID, c.register, ( SELECT MAX(v.visit) FROM visits AS v WHERE v.customerID = c.customerID AND v.visit < c.register + INTERVAL 7 DAY ) AS last_date FROM customers AS c;
index:
customers: PRIMARY KEY(customerId) -- is this what you have? visits: INDEX(customerID, visit) -- or perhaps PRIMARY KEY
Your concerns about "computationally expensive operations":
accessing
the index is to "overwrite" it, so it only looks at the index.