Home  >  Q&A  >  body text

How to optimize a SQL query with over 2 million rows

<p>I have a Sql database with over 2 million rows and it is growing rapidly. There aren't many columns, just <code>code, price, date, and stationID</code>. </p> <p>The purpose is to get the latest price by code and stationID. The query works great but takes over 10 seconds. </p> <p>Is there any way to optimize the query? </p> <pre class="brush:php;toolbar:false;">$statement = $this->pdo->prepare( 'WITH cte AS ( SELECT stationID AS ind, code, CAST(price AS DOUBLE ) AS price, date ,ROW_NUMBER() OVER( PARTITION BY code, stationID ORDER BY date DESC ) AS latest FROM price ) SELECT * FROM cte WHERE latest = 1 ' ); $statement->execute(); $results = $statement->fetchAll(PDO::FETCH_GROUP | PDO::FETCH_ASSOC);</pre> <p>Edit: The first column has an index called "id". I don't know if this helps. </p> <p>The database (InnoDB) looks like this: </p> <pre class="brush:php;toolbar:false;">id primary - int stationID - int code - int price-decimal(10,5) date - datetime</pre> <p>Edit 2:</p> <p>The results need to be grouped by stationID, and multiple rows need to be displayed for each stationID. One line for each code with the latest date.像这样:</p> <pre class="brush:php;toolbar:false;">22456: code: 1 price: 3 date: 2023-06-21 code: 2 price: 2 date: 2023-06-21 code: 3 price: 5 date: 2023-06-21 22457: code: 1 price: 10 date: 2023-06-21 code: 2 price: 1 date: 2023-06-21 code: 3 price: 33 date: 2023-06-21</pre> <p>The json output should be 像这样:</p> <pre class="brush:php;toolbar:false;">{"1000001":[{"code":1,"price":1.661,"date":"2023-06-06 12:46:32","latest":1},{"code":2,"price":1.867,"date":"2023-06-06 12:46:32","latest":1},{"code":3,"price":1.05,"date":"2023-06-06 12:46:32","latest":1},{"code":5,"price":1.818,"date":"2023-06-06 12:46:32","latest":1},{"code":6,"price":1.879,"date":"2023-06-06 12:46:32","latest":1}],"1000002":[{"code":1,"price":1.65,"date":"2023-06-03 08:53:26","latest":1},{"code":2,"price":1.868,"date":"2023-06-03 08:53:26","latest":1},{"code":6,"price":1.889,"date":"2023-06-03 08:53:27","latest":1}],…</pre></p>
P粉127901279P粉127901279415 days ago528

reply all(2)I'll reply

  • P粉141455512

    P粉1414555122023-09-02 11:48:59

    I guess you need the following indexes for the query to execute well (you only need to do it once as part of the database design).

    CREATE INDEX IX ON price
      (code, stationID, date DESC, price)

    The first two columns can be arranged in any order.

    reply
    0
  • P粉297434909

    P粉2974349092023-09-02 00:29:02

    As long as there can't be two rows of the same code, stationID pair with the same datetime, using window functions is a bit like using a sledgehammer to crack a nut.

    select p.stationID, p.code, p.price, p.date
    from (
        select code, stationID, max(date) as max_date
        from price
        group by code, stationID
    ) max
    join price p
        on max.code = p.code
       and max.stationID = p.stationID
       and max.max_date = p.date;

    It requires the following indexes:

    alter table price add index (code, stationID, date desc);

    This query should take less than 1 millisecond because the derived table can be built from the index and then it reads only the required rows from the table.

    Alternatively, you can significantly reduce the effort if you know that each code, stationID pair will receive an updated price within a specific time period (1 hour, 1 day, 1 week) The window function needs to add a where clause:

    with cte as 
    (
        select stationID as ind, code, price, date, row_number() over(partition by code, stationID order by date desc) as latest
        from price
        where date >= now() - interval 1 week
    )
    select * from cte where latest  = 1;

    reply
    0
  • Cancelreply