首頁  >  文章  >  資料庫  >  mysql:“我附近的泰國菜”,或:在資料庫中進行地理距離計算。

mysql:“我附近的泰國菜”,或:在資料庫中進行地理距離計算。

Barbara Streisand
Barbara Streisand原創
2024-09-21 06:30:51748瀏覽

我們都熟悉「我附近的泰國菜」這件事。你在手機中輸入該短語,它會返回一份你附近的泰國餐廳列表。我們對其背後的工作原理有一種了解:谷歌或任何擁有泰國餐館數據庫及其緯度和經度的人,並通過我們的手機知道我們的位置,然後執行“一些過程”來找出泰國的哪些地方都在附近。

在這篇文章中,我們將回顧「某些過程」部分,看看如何使用 mysql 來做一些標準位置的事情。我們將介紹mysql 的POINT 和POLYGON 類型,查找球體上兩點之間的距離(地球,與您在互聯網上讀到的相反),確定一個點是否在由點定義的多邊形內部,並查看諸如“空間參考系統」之類的內容,它定義瞭如何在地球表面繪製座標。

mysql: “thai food near me”, or: doing geo distance calculations in your database.
一家餐廳嘗試進行 SQL 注入攻擊。

在 mysql 中提出一個觀點

mysql有一整套專門用於空間資料的函數和資料型態。它們的數量令人眼花撩亂,官方文件也密密麻麻。幸運的是,我們只需使用一小部分就可以完成我們想做的事情。我們將從 POINT 開始。

POINT 既是一種資料型,也是傳回該資料型別的函數。如果我們想在一個好的老式 x/y 圖上定義一個點,我們可以這樣做:

SELECT POINT(3, 7);

該查詢的結果是 POINT 類型值中的 x/y 點。 mysql 以二進位格式儲存 POINT,因此我們 select 的結果並不是特別有用:

SELECT POINT(3, 7);
+------------------------------------------------------+
| POINT(3, 7)                                          |
+------------------------------------------------------+
| 0x00000000010100000000000000000008400000000000001C40 |
+------------------------------------------------------+

mysql 透過提供兩個方便的函數來從一個點提取 x 和 y 值來解決這個問題:

  • ST_X()
  • ST_Y()

它們都接受 POINT 值作為參數。例如:

SELECT ST_X(POINT(3,7)) AS x, ST_Y(POINT(3,7)) AS y;
+------+------+
| x    | y    |
+------+------+
|    3 |    7 |
+------+------+

因為 POINT 是一種資料類型,所以我們可以在表格定義中使用它,就像 INT 或 VARCHAR 一樣。

CREATE TABLE `some_coords` (
  `coords` POINT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

如果我們有一個 POINT 類型的列,則只有 POINT 資料可以放入其中。我們稍後會詳細介紹這一點。

關於 x、y、地圖的簡短題外話以及缺乏標準

我們在學校都學過如何使用水平的 x 軸和垂直的 y 軸在藍線方格紙上繪製點。點定義為 x/y;水平第一,垂直第二。一直如此,大家都同意。

除了製作地圖的人。

製作地圖的人將點定義為緯度/經度。當然,緯度是南北走向的,在地圖上是垂直的。經度,即東西軸,是水平的。本質上,地圖人員決定使用 y/x。

顯然,這會產生問題。讓我們看看當我們創建一個點來代表阿爾伯塔省卡加利市中心的船舶和錨酒吧的位置時會發生什麼(我偶爾會在那裡寫博客)

SELECT ST_X(POINT(51.037913, -114.073277)) as longitude, ST_Y(POINT(51.037913, -114.073277)) as latitude;
+-----------+-------------+
| longitude | latitude    |
+-----------+-------------+
| 51.037913 | -114.073277 |
+-----------+-------------+

緯度和經度混淆了;我們的酒吧走錯地方了。更糟的是,由於緯度的最大值為 90,我們已將船和錨放置在太空中的某個位置。不好。

當使用地圖或地球儀上的點時,mysql 透過提供兩個函數來取代 ST_X() 和 ST_Y() 來解決這個問題:

  • ST_Latitude()
  • ST_經度()

這是個好東西,除了,如果我們嘗試在上面的查詢中使用它們,我們會收到此錯誤訊息:

ERROR 3726 (22S00): Function st_latitude is only defined for geographic spatial reference systems, but one of its arguments is in SRID 0, which is not geographic.

這個錯誤看起來令人生畏(SRID 0 到底是什麼?),但 mysql 在這裡告訴我們的是,我們正在使用的 POINT 尚未定義為 地圖點 。它們只是普通的舊袋 x 和 y。

我們稍後將討論 SRID 和 SRS。

mysql: “thai food near me”, or: doing geo distance calculations in your database.
緯度和經度總是 y/x。

更好的表達觀點的方法:眾所周知的文本

到目前為止,我們已經使用 POINT() 函數選擇了 POINT 類型的值。目前這工作正常,但是有一種更好、更靈活的方法來做到這一點,當事情開始變得更加複雜時,這將使處理點和多邊形變得更容易。

ST_GeomFromText() 函數將我們要建立的幾何物件(在本例中為 POINT)的文字表達式(字串)作為參數,並傳回正確類型的值。

these text expressions are formatted using a syntax called "well-known text". the format is, basically, the name of the geometric object you want to create (ie. POINT) and the coordinates that define it. let's look:

SELECT ST_GeomFromText('POINT(51.037913 -114.073277)');

this looks very straightforward, but there's a glaring question: where is the comma separating the arguments in our POINT call?

the answer is that the well-known text here isn't a call to the function POINT(), it's a definition of the data type POINT.

back at the beginning of this discussion, we went over how POINT is both a function and a datatype. when we use POINT() as a function, the coordinates are arguments that are separated by a comma. when we define a value using POINT as a type, the coordinates do not take a comma.

we can use ST_GeomFromText() to create any sort of geometric object that's defined in the well-known text. there aren't many of these, and we'll be sticking in this post to POINTs and POLYGONs (which include things like squares and triangles).

spatial reference systems: not all points are the same

on my desk i have a small chess board where i occasionally work through annotated games. it's my idea of "fun". that chess board is a coordinates system. i also have a large, widescreen computer monitor on my desk. it's a coordinate system as well.

however, just because my chess board and monitor are both coordinate systems doesn't mean that the coordinates from one can be transferred to the other. the x/y position of my white bishop is meaningless on my monitor; that x/y point only has meaning in the context of the chess board.

a context defines things like the origin points, axes, units of measurement and the like. useful stuff that helps us make sense of what a coordinate actually means.

when it comes to plotting points and lines and polygons on the surface of the earth, that context is called a 'spatial reference system', or SRS.

there are a lot of different SRSs. a lot. some of them treat the earth as a sphere, others as a projected flat map. some cover the entire planet, many more only deal with a sub region, like a country. some include a z axis from the center of the earth, most don't.

if we want to peruse all the different SRSs that mysql has, we can run this select:

SELECT * FROM INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS;

there are about five thousand of them.

fortunately, we don't need to read through all of these to choose one. we're just going to use 4326, a global, unprojected coordinate system that (just about) everybody uses.

that 4326 number is the id of the SRS. it's called, unsurprisingly, an SRID. if we remember back to when we tried to call the ST_Latitude() function on the POINT we made, we got the error:

ERROR 3726 (22S00): Function st_latitude is only defined for geographic spatial reference systems, but one of its arguments is in SRID 0, which is not geographic.

now that we have more of an understanding about SRSs, we can see that here mysql is complaining that we are asking for the latitude, but the SRS of our POINT isn't one that uses latitude and longitude. the SRS we are using, according to the error message, is SRID 0.

SRID 0 is just a 'flat, cartesian plane' with no units. think of it as a sheet of that blue-lined graph paper from math class stretching off into infinity in all directions. this is a great SRS for some applications, but is not very meaningful for using latitude and longitude to map places on a spherical earth. SRID 0 is the default SRS that mysql assigns to POINTs (and other shapes) when one is not specified.

by comparison, the 4326 SRS is specifically designed for global mapping. it treats the surface of the earth as an ellipsoid, uses degrees for measurement and defines the axes as the equator and prime meridian. exactly what we want. 4326 is, in turn, based on a big set of data about the earth called the world geodetic system 1984, or WSG84, that was compiled in that year in an effort to unify and standardize the mishmash of national mapping data. if you're one of those 'further reading' types, you can read over a detailed explainer on SRID 4326 here or peruse the surprisingly-entertaining wikipedia entry on WSG84.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
a developer accidentally uses SRID 0 for their geolocation select.

actually using SRID 4326

using SRID 4326 as our SRS when creating a POINT is pretty straightforward; we just add the SRID as a second argument to ST_GeomFromText().

SELECT ST_GeomFromText('POINT(51.037913 -114.073277)', 4326);

and, just like that, our x/y values are now treated as longitude and latitude coordinates on earth. let's try ST_Latitude() again:

SELECT ST_Latitude(ST_GeomFromText('POINT(51.037913 -114.073277)', 4326)) AS latitude;
+-----------+
| latitude  |
+-----------+
| 51.037913 |
+-----------+

exactly what we wanted.

creating a table for our POINTs

selecting geometric data like POINTs (or POLYGONs or LINESTRINGs) created using literal data is fine, but what we probably want to do is persist that data in a table so we can use it later. let's do that. we'll start with creating our table.

CREATE TABLE `calgary` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `coords` POINT NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

here, we've defined a pretty standard-looking table of notable locations in the city of calgary, alberta. the interesting column here is coords, which is defined as a POINT.

that POINT doesn't have an SRS associated with it. this means that on every insert, we will have to define the SRID we are using for our point. this is very flexible, but if we want to we can add the SRS to the column definition.

CREATE TABLE `calgary` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `coords` POINT SRID 4326 NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

by defining our coords column as POINT SRID 4326 we are enforcing that any POINT in that column must be of SRID 4326. if we try to insert a point that has a different SRID, mysql will complain with an error like:

ERROR 3643 (HY000): The SRID of the geometry does not match the SRID of the column 'coords'. The SRID of the geometry is 0, but the SRID of the column is 4326. Consider changing the SRID of the geometry or the SRID property of the column.

for all the examples going forward, we will be using a table with a coords column that does not define the SRID.

now that we have a table, we can insert some rows. we'll add a list of calgary landmarks.

INSERT INTO calgary VALUES (null, 'calgary tower', ST_GeomFromText('POINT(51.044270 -114.062019)', 4326));
INSERT INTO calgary VALUES (null, 'peace bridge', ST_GeomFromText('POINT(51.0542 -114.0793)', 4326));
INSERT INTO calgary VALUES (null, 'saddledome', ST_GeomFromText('POINT(51.0374 -114.0519)', 4326));
INSERT INTO calgary VALUES (null, 'national music centre', ST_GeomFromText('POINT(51.04250 -114.06083)', 4326));
INSERT INTO calgary VALUES (null, 'baitun nur mosque', ST_GeomFromText('POINT(51.101743 -113.972039)', 4326));
INSERT INTO calgary VALUES (null, 'olympic oval', ST_GeomFromText('POINT(51.07694 -114.13556)', 4326));
INSERT INTO calgary VALUES (null, 'heritage park', ST_GeomFromText('POINT(50.98528 -114.10833)', 4326));
INSERT INTO calgary VALUES (null, 'international avenue', ST_GeomFromText('POINT(51.03778 -113.98167)', 4326));
INSERT INTO calgary VALUES (null, 'fort calgary', ST_GeomFromText('POINT(51.045139 -114.045778)', 4326));

there's a lot of things to see in calgary!

in these insert statements, we create our point using ST_GeomFromText() and set the SRID as 4326 like so:

ST_GeomFromText('POINT(51.0542 -114.0793)', 4326)

we can then select this data back, getting the latitude and longitude of each location with ST_latitude() and ST_longitude().

SELECT  id,
        name,
        ST_Latitude(coords) AS latitude,
        ST_Longitude(coords) AS longitude
FROM    calgary;
+----+-----------------------+-----------+-------------+
| id | name                  | latitude  | longitude   |
+----+-----------------------+-----------+-------------+
|  1 | calgary tower         |  51.04427 | -114.062019 |
|  2 | peace bridge          |   51.0542 |   -114.0793 |
|  3 | saddledome            |   51.0374 |   -114.0519 |
|  4 | national music centre |   51.0425 |  -114.06083 |
|  5 | baitun nur mosque     | 51.101743 | -113.972039 |
|  6 | olympic oval          |  51.07694 |  -114.13556 |
|  7 | heritage park         |  50.98528 |  -114.10833 |
|  8 | international avenue  |  51.03778 |  -113.98167 |
|  9 | fort calgary          | 51.045139 | -114.045778 |
+----+-----------------------+-----------+-------------+

at last, calculating distance

so far, we've made some spatial POINTs and assigned them to SRID 4326 so we can actually make sense of them as latitude and longitude. it's finally time to focus on what we really want to do: getting the distance between two points.

to do this, we're going to use mysql's ST_Distance_Sphere() function.

as one would expect, ST_Distance_Sphere() calculates the distance between two points, provided as arguments to the function, on a sphere. the distance returned will always be the shortest one (since, on a sphere, we can always go the opposite direction and travel further to get to the same place). the unit of measurement is meters.

ST_Distance_Sphere() takes an optional third argument: the radius of the sphere. if we do not set this argument, the value 6,370,986 meters is used. that's the radius of the earth, and is the value we almost certainly want to use.

knowing all that, an example select would look like:

SELECT  name,
        ST_Distance_Sphere(ST_GeomFromText('POINT(51.037913 -114.073277)', 4326), coords) AS distance_meters
FROM calgary;
+-----------------------+--------------------+
| name                  | distance_meters    |
+-----------------------+--------------------+
| calgary tower         | 1057.9217149476015 |
| peace bridge          |  1859.336539883446 |
| saddledome            | 1495.7790780297603 |
| national music centre | 1008.7085120625501 |
| baitun nur mosque     |  10020.62038333001 |
| olympic oval          | 6146.6116509785015 |
| heritage park         |  6345.541637300453 |
| international avenue  |  6405.199613693066 |
| fort calgary          |  2083.730747912871 |
+-----------------------+--------------------+

here we can see that we passed two POINT arguments to ST_Distance_Sphere(). The first is one we constructed from literal values using ST_GeomFromText(). it's the location of the ship & anchor pub in central calgary, where i promise i am not writing this post. the second argument is our coords column.

the result is the distance from our starting POINT, the ship & anchor, to all the POINTs in our table, in meters.

from here, building 'near me' functionality is just a matter of applying a WHERE or ORDER BY clause.

going regional: finding points inside a square (or any shape)

perhaps, instead of a basic 'near me' feature, we want our users to be able to draw a square on a map and say "show me all the calgary landmarks in here."

to do this, the fist step we need to take is defining a square.

creating a square

a square is a type of polygon, and mysql provides a POLYGON data type that we can use to describe a square (or any shape). POLYGONs are defined by a set of coordinates that identify the corners of the shape. this means, to create a square, we provide POLYGON with five coordinate sets.

wait, five? don't we mean four? a square has four corners, after all.

the important thing to note here is that a polygon must be closed. this means that the first coordinate set and the last coordinate set must be the same. it completes the shape by going back to the beginning. the result is that a square is defined has having five sets of coordinates. to illustrate, let's look at this glorious ascii diagram that shows the five coordinates that create a square.

1/5 ---- 4
  |      |    
  |      |    
  2 ---- 3

with that in mind, we can create a square of latitude and longitude values. the example we'll be using is this square covering most of downtown calgary.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
a square covering most of downtown calgary.

to select this as a POLYGON in mysql, we would do:

SELECT ST_GeomFromText('POLYGON( (  51.053913 -114.094391, 51.028008 -114.094391, 51.028008 -114.037743, 51.053913 -114.037743, 51.053913 -114.094391) )', 4326);

given our experience creating a POINT, this should be fairly straightforward. the only difference is that instead of passing one coordinate set to POINT, we pass five to POLYGON. the result is a geometric shape, stored in a binary format, that we can use for comparisons against POINTS or, even, other POLYGONs.

finding POINTs 'within' a square

we now have a POLYGON defined from some literal values, and a table full of POINTs, all that's left is to find out which POINTs in our table are inside our POLYGON. we can do this with the mysql function ST_Within(). here's an example:

SELECT  name,
        ST_Latitude(coords) AS latitude,
        ST_Longitude(coords) AS longitude
FROM    calgary
WHERE   ST_Within(
            coords,
            ST_GeomFromText('POLYGON( (  51.053913 -114.094391, 51.028008 -114.094391, 51.028008 -114.037743, 51.053913 -114.037743, 51.053913 -114.094391) )', 4326)
         )

we can see that ST_Within() takes two arguments: a POINT, and a POLYGON. if the POINT is 'within' the POLYGON, ST_Within() returns 1. if it isn't, we get a 0.

conclusion

once we have an understanding of how to create POINTs and POLYGONs and use ST_Distance_Sphere() and ST_Within() we can combine and extrapolate them to get more complex data, like "the closest daycare in a given school district" or "all the burrito busses on this side of the river" or, even, answer the question that has driven so many of the great minds in computer science: "where is a thai restaurant near me"?

? this post originally appeared in the grant horwood technical blog

以上是mysql:“我附近的泰國菜”,或:在資料庫中進行地理距離計算。的詳細內容。更多資訊請關注PHP中文網其他相關文章!

陳述:
本文內容由網友自願投稿,版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容,請聯絡admin@php.cn