Maison  >  Article  >  base de données  >  mysql : « cuisine thaïlandaise près de chez moi », ou : faire des calculs de distance géographique dans votre base de données.

mysql : « cuisine thaïlandaise près de chez moi », ou : faire des calculs de distance géographique dans votre base de données.

Barbara Streisand
Barbara Streisandoriginal
2024-09-21 06:30:51748parcourir

Nous connaissons tous le truc de la "nourriture thaïlandaise à proximité". vous tapez cette phrase sur votre téléphone et il répond avec une liste de restaurants thaïlandais qui sont, eh bien, près de chez vous. et nous avons une sorte de compréhension de comment cela fonctionne sous le capot : Google ou quiconque a une base de données de restaurants thaïlandais avec leurs latitudes et longitudes et connaît notre position grâce à notre téléphone et effectue ensuite « un processus » pour déterminer quels endroits thaïlandais sont à proximité.

dans cet article, nous passerons en revue cette partie « certains processus », en examinant comment utiliser MySQL pour effectuer certaines opérations de localisation standard. nous couvrirons les types POINT et POLYGON de MySQL, trouvant la distance entre deux points sur une sphère (ce qu'est la terre, contrairement à ce que vous avez pu lire sur Internet), déterminant si un point est à l'intérieur d'un polygone défini par des points , et regardez des choses comme les « systèmes de référence spatiale » qui définissent la façon dont les coordonnées sont tracées à la surface de la Terre.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
un restaurant tente une attaque par injection SQL.

faire un POINT dans MySQL

mysql dispose de toute une suite de fonctions et de types de données consacrés aux données spatiales. leur nombre est vertigineux et la documentation officielle est presque criminellement dense. Heureusement, nous pouvons accomplir ce que nous voulons en utilisant seulement un petit sous-ensemble. nous commencerons par POINT.

POINT est à la fois un type de données et une fonction qui renvoie ce type de données. si nous voulions définir un point sur un bon graphique x/y à l'ancienne, nous pouvons le faire comme ceci :

SELECT POINT(3, 7);

le résultat de cette requête est notre point x/y dans une valeur de type POINT. mysql stocke POINT dans un format binaire, donc le résultat de notre sélection n'est pas particulièrement utile :

SELECT POINT(3, 7);
+------------------------------------------------------+
| POINT(3, 7)                                          |
+------------------------------------------------------+
| 0x00000000010100000000000000000008400000000000001C40 |
+------------------------------------------------------+

mysql résout ce problème en fournissant deux fonctions pratiques pour extraire les valeurs x et y d'un point :

  • ST_X()
  • ST_Y()

ils acceptent tous les deux une valeur POINT comme argument. par exemple :

SELECT ST_X(POINT(3,7)) AS x, ST_Y(POINT(3,7)) AS y;
+------+------+
| x    | y    |
+------+------+
|    3 |    7 |
+------+------+

Comme POINT est un type de données, nous pouvons l'utiliser dans les définitions de tables, tout comme nous le ferions avec INT ou VARCHAR.

CREATE TABLE `some_coords` (
  `coords` POINT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

si nous avons une colonne de type POINT, seules les données POINT peuvent y entrer. nous en parlerons plus tard.

une petite digression sur x, y, les cartes et le manque de standards

nous avons tous appris à l'école comment tracer des points sur du papier millimétré aux lignes bleues en utilisant l'axe des x, qui est horizontal, et l'axe des y, qui est vertical. les points ont été définis comme x/y ; horizontal d’abord, vertical ensuite. c'est comme ça depuis toujours, et tout le monde est d'accord là-dessus.

Sauf les gens qui font des cartes.

les personnes qui font des cartes définissent les points comme latitude/longitude. la latitude, bien sûr, va du nord au sud, ce qui est vertical sur une carte. la longitude, l'axe est-ouest, est horizontale. les gens de la carte, en substance, ont décidé d'utiliser y/x.

évidemment, cela crée des problèmes. regardons ce qui se passe lorsque nous créons un POINT représentant l'emplacement du pub Ship & Anchor dans le centre de Calgary, en Alberta (d'où je suis connu, à l'occasion, pour bloguer)

SELECT ST_X(POINT(51.037913, -114.073277)) as longitude, ST_Y(POINT(51.037913, -114.073277)) as latitude;
+-----------+-------------+
| longitude | latitude    |
+-----------+-------------+
| 51.037913 | -114.073277 |
+-----------+-------------+

latitude et longitude sont mélangées ; notre pub n'est pas au bon endroit. Pire encore, puisque la valeur maximale de la latitude est de 90, nous avons placé le navire et l'ancre quelque part dans l'espace. pas bon.

mysql résout ce problème en fournissant deux fonctions pour remplacer ST_X() et ST_Y() lors de l'utilisation de points sur une carte ou un globe :

  • ST_Latitude()
  • ST_Longitude()

c'est une bonne chose, sauf que si nous essayons de les utiliser dans notre requête ci-dessus, nous obtenons ce message d'erreur :

ERROR 3726 (22S00): Function st_latitude is only defined for geographic spatial reference systems, but one of its arguments is in SRID 0, which is not geographic.

cette erreur semble intimidante (c'est quoi le SRID 0 ?), mais tout ce que MySQL nous dit ici, c'est que les POINTS que nous utilisons n'ont pas été définis comme étant des points de la carte. ce ne sont que de vieux sacs ordinaires de x et de y.

nous reviendrons sur les SRID et les SRS plus tard.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
latitude et longitude ont toujours été y/x.

une meilleure façon de faire valoir un POINT : texte connu

jusqu'à présent, nous avons sélectionné une valeur de type POINT en utilisant la fonction POINT(). cela fonctionne bien pour le moment, mais il existe une manière meilleure et plus flexible de procéder qui facilitera le travail avec les POINTS et les POLYGONES lorsque les choses commenceront à devenir plus compliquées.

la fonction ST_GeomFromText() prend en argument une expression textuelle (une chaîne) de l'objet géométrique que l'on souhaite créer (un POINT dans ce cas), et renvoie une valeur du type correct.

these text expressions are formatted using a syntax called "well-known text". the format is, basically, the name of the geometric object you want to create (ie. POINT) and the coordinates that define it. let's look:

SELECT ST_GeomFromText('POINT(51.037913 -114.073277)');

this looks very straightforward, but there's a glaring question: where is the comma separating the arguments in our POINT call?

the answer is that the well-known text here isn't a call to the function POINT(), it's a definition of the data type POINT.

back at the beginning of this discussion, we went over how POINT is both a function and a datatype. when we use POINT() as a function, the coordinates are arguments that are separated by a comma. when we define a value using POINT as a type, the coordinates do not take a comma.

we can use ST_GeomFromText() to create any sort of geometric object that's defined in the well-known text. there aren't many of these, and we'll be sticking in this post to POINTs and POLYGONs (which include things like squares and triangles).

spatial reference systems: not all points are the same

on my desk i have a small chess board where i occasionally work through annotated games. it's my idea of "fun". that chess board is a coordinates system. i also have a large, widescreen computer monitor on my desk. it's a coordinate system as well.

however, just because my chess board and monitor are both coordinate systems doesn't mean that the coordinates from one can be transferred to the other. the x/y position of my white bishop is meaningless on my monitor; that x/y point only has meaning in the context of the chess board.

a context defines things like the origin points, axes, units of measurement and the like. useful stuff that helps us make sense of what a coordinate actually means.

when it comes to plotting points and lines and polygons on the surface of the earth, that context is called a 'spatial reference system', or SRS.

there are a lot of different SRSs. a lot. some of them treat the earth as a sphere, others as a projected flat map. some cover the entire planet, many more only deal with a sub region, like a country. some include a z axis from the center of the earth, most don't.

if we want to peruse all the different SRSs that mysql has, we can run this select:

SELECT * FROM INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS;

there are about five thousand of them.

fortunately, we don't need to read through all of these to choose one. we're just going to use 4326, a global, unprojected coordinate system that (just about) everybody uses.

that 4326 number is the id of the SRS. it's called, unsurprisingly, an SRID. if we remember back to when we tried to call the ST_Latitude() function on the POINT we made, we got the error:

ERROR 3726 (22S00): Function st_latitude is only defined for geographic spatial reference systems, but one of its arguments is in SRID 0, which is not geographic.

now that we have more of an understanding about SRSs, we can see that here mysql is complaining that we are asking for the latitude, but the SRS of our POINT isn't one that uses latitude and longitude. the SRS we are using, according to the error message, is SRID 0.

SRID 0 is just a 'flat, cartesian plane' with no units. think of it as a sheet of that blue-lined graph paper from math class stretching off into infinity in all directions. this is a great SRS for some applications, but is not very meaningful for using latitude and longitude to map places on a spherical earth. SRID 0 is the default SRS that mysql assigns to POINTs (and other shapes) when one is not specified.

by comparison, the 4326 SRS is specifically designed for global mapping. it treats the surface of the earth as an ellipsoid, uses degrees for measurement and defines the axes as the equator and prime meridian. exactly what we want. 4326 is, in turn, based on a big set of data about the earth called the world geodetic system 1984, or WSG84, that was compiled in that year in an effort to unify and standardize the mishmash of national mapping data. if you're one of those 'further reading' types, you can read over a detailed explainer on SRID 4326 here or peruse the surprisingly-entertaining wikipedia entry on WSG84.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
a developer accidentally uses SRID 0 for their geolocation select.

actually using SRID 4326

using SRID 4326 as our SRS when creating a POINT is pretty straightforward; we just add the SRID as a second argument to ST_GeomFromText().

SELECT ST_GeomFromText('POINT(51.037913 -114.073277)', 4326);

and, just like that, our x/y values are now treated as longitude and latitude coordinates on earth. let's try ST_Latitude() again:

SELECT ST_Latitude(ST_GeomFromText('POINT(51.037913 -114.073277)', 4326)) AS latitude;
+-----------+
| latitude  |
+-----------+
| 51.037913 |
+-----------+

exactly what we wanted.

creating a table for our POINTs

selecting geometric data like POINTs (or POLYGONs or LINESTRINGs) created using literal data is fine, but what we probably want to do is persist that data in a table so we can use it later. let's do that. we'll start with creating our table.

CREATE TABLE `calgary` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `coords` POINT NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

here, we've defined a pretty standard-looking table of notable locations in the city of calgary, alberta. the interesting column here is coords, which is defined as a POINT.

that POINT doesn't have an SRS associated with it. this means that on every insert, we will have to define the SRID we are using for our point. this is very flexible, but if we want to we can add the SRS to the column definition.

CREATE TABLE `calgary` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `coords` POINT SRID 4326 NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

by defining our coords column as POINT SRID 4326 we are enforcing that any POINT in that column must be of SRID 4326. if we try to insert a point that has a different SRID, mysql will complain with an error like:

ERROR 3643 (HY000): The SRID of the geometry does not match the SRID of the column 'coords'. The SRID of the geometry is 0, but the SRID of the column is 4326. Consider changing the SRID of the geometry or the SRID property of the column.

for all the examples going forward, we will be using a table with a coords column that does not define the SRID.

now that we have a table, we can insert some rows. we'll add a list of calgary landmarks.

INSERT INTO calgary VALUES (null, 'calgary tower', ST_GeomFromText('POINT(51.044270 -114.062019)', 4326));
INSERT INTO calgary VALUES (null, 'peace bridge', ST_GeomFromText('POINT(51.0542 -114.0793)', 4326));
INSERT INTO calgary VALUES (null, 'saddledome', ST_GeomFromText('POINT(51.0374 -114.0519)', 4326));
INSERT INTO calgary VALUES (null, 'national music centre', ST_GeomFromText('POINT(51.04250 -114.06083)', 4326));
INSERT INTO calgary VALUES (null, 'baitun nur mosque', ST_GeomFromText('POINT(51.101743 -113.972039)', 4326));
INSERT INTO calgary VALUES (null, 'olympic oval', ST_GeomFromText('POINT(51.07694 -114.13556)', 4326));
INSERT INTO calgary VALUES (null, 'heritage park', ST_GeomFromText('POINT(50.98528 -114.10833)', 4326));
INSERT INTO calgary VALUES (null, 'international avenue', ST_GeomFromText('POINT(51.03778 -113.98167)', 4326));
INSERT INTO calgary VALUES (null, 'fort calgary', ST_GeomFromText('POINT(51.045139 -114.045778)', 4326));

there's a lot of things to see in calgary!

in these insert statements, we create our point using ST_GeomFromText() and set the SRID as 4326 like so:

ST_GeomFromText('POINT(51.0542 -114.0793)', 4326)

we can then select this data back, getting the latitude and longitude of each location with ST_latitude() and ST_longitude().

SELECT  id,
        name,
        ST_Latitude(coords) AS latitude,
        ST_Longitude(coords) AS longitude
FROM    calgary;
+----+-----------------------+-----------+-------------+
| id | name                  | latitude  | longitude   |
+----+-----------------------+-----------+-------------+
|  1 | calgary tower         |  51.04427 | -114.062019 |
|  2 | peace bridge          |   51.0542 |   -114.0793 |
|  3 | saddledome            |   51.0374 |   -114.0519 |
|  4 | national music centre |   51.0425 |  -114.06083 |
|  5 | baitun nur mosque     | 51.101743 | -113.972039 |
|  6 | olympic oval          |  51.07694 |  -114.13556 |
|  7 | heritage park         |  50.98528 |  -114.10833 |
|  8 | international avenue  |  51.03778 |  -113.98167 |
|  9 | fort calgary          | 51.045139 | -114.045778 |
+----+-----------------------+-----------+-------------+

at last, calculating distance

so far, we've made some spatial POINTs and assigned them to SRID 4326 so we can actually make sense of them as latitude and longitude. it's finally time to focus on what we really want to do: getting the distance between two points.

to do this, we're going to use mysql's ST_Distance_Sphere() function.

as one would expect, ST_Distance_Sphere() calculates the distance between two points, provided as arguments to the function, on a sphere. the distance returned will always be the shortest one (since, on a sphere, we can always go the opposite direction and travel further to get to the same place). the unit of measurement is meters.

ST_Distance_Sphere() takes an optional third argument: the radius of the sphere. if we do not set this argument, the value 6,370,986 meters is used. that's the radius of the earth, and is the value we almost certainly want to use.

knowing all that, an example select would look like:

SELECT  name,
        ST_Distance_Sphere(ST_GeomFromText('POINT(51.037913 -114.073277)', 4326), coords) AS distance_meters
FROM calgary;
+-----------------------+--------------------+
| name                  | distance_meters    |
+-----------------------+--------------------+
| calgary tower         | 1057.9217149476015 |
| peace bridge          |  1859.336539883446 |
| saddledome            | 1495.7790780297603 |
| national music centre | 1008.7085120625501 |
| baitun nur mosque     |  10020.62038333001 |
| olympic oval          | 6146.6116509785015 |
| heritage park         |  6345.541637300453 |
| international avenue  |  6405.199613693066 |
| fort calgary          |  2083.730747912871 |
+-----------------------+--------------------+

here we can see that we passed two POINT arguments to ST_Distance_Sphere(). The first is one we constructed from literal values using ST_GeomFromText(). it's the location of the ship & anchor pub in central calgary, where i promise i am not writing this post. the second argument is our coords column.

the result is the distance from our starting POINT, the ship & anchor, to all the POINTs in our table, in meters.

from here, building 'near me' functionality is just a matter of applying a WHERE or ORDER BY clause.

going regional: finding points inside a square (or any shape)

perhaps, instead of a basic 'near me' feature, we want our users to be able to draw a square on a map and say "show me all the calgary landmarks in here."

to do this, the fist step we need to take is defining a square.

creating a square

a square is a type of polygon, and mysql provides a POLYGON data type that we can use to describe a square (or any shape). POLYGONs are defined by a set of coordinates that identify the corners of the shape. this means, to create a square, we provide POLYGON with five coordinate sets.

wait, five? don't we mean four? a square has four corners, after all.

the important thing to note here is that a polygon must be closed. this means that the first coordinate set and the last coordinate set must be the same. it completes the shape by going back to the beginning. the result is that a square is defined has having five sets of coordinates. to illustrate, let's look at this glorious ascii diagram that shows the five coordinates that create a square.

1/5 ---- 4
  |      |    
  |      |    
  2 ---- 3

with that in mind, we can create a square of latitude and longitude values. the example we'll be using is this square covering most of downtown calgary.

mysql: “thai food near me”, or: doing geo distance calculations in your database.
a square covering most of downtown calgary.

to select this as a POLYGON in mysql, we would do:

SELECT ST_GeomFromText('POLYGON( (  51.053913 -114.094391, 51.028008 -114.094391, 51.028008 -114.037743, 51.053913 -114.037743, 51.053913 -114.094391) )', 4326);

given our experience creating a POINT, this should be fairly straightforward. the only difference is that instead of passing one coordinate set to POINT, we pass five to POLYGON. the result is a geometric shape, stored in a binary format, that we can use for comparisons against POINTS or, even, other POLYGONs.

finding POINTs 'within' a square

we now have a POLYGON defined from some literal values, and a table full of POINTs, all that's left is to find out which POINTs in our table are inside our POLYGON. we can do this with the mysql function ST_Within(). here's an example:

SELECT  name,
        ST_Latitude(coords) AS latitude,
        ST_Longitude(coords) AS longitude
FROM    calgary
WHERE   ST_Within(
            coords,
            ST_GeomFromText('POLYGON( (  51.053913 -114.094391, 51.028008 -114.094391, 51.028008 -114.037743, 51.053913 -114.037743, 51.053913 -114.094391) )', 4326)
         )

we can see that ST_Within() takes two arguments: a POINT, and a POLYGON. if the POINT is 'within' the POLYGON, ST_Within() returns 1. if it isn't, we get a 0.

conclusion

once we have an understanding of how to create POINTs and POLYGONs and use ST_Distance_Sphere() and ST_Within() we can combine and extrapolate them to get more complex data, like "the closest daycare in a given school district" or "all the burrito busses on this side of the river" or, even, answer the question that has driven so many of the great minds in computer science: "where is a thai restaurant near me"?

? this post originally appeared in the grant horwood technical blog

Ce qui précède est le contenu détaillé de. pour plus d'informations, suivez d'autres articles connexes sur le site Web de PHP en chinois!

Déclaration:
Le contenu de cet article est volontairement contribué par les internautes et les droits d'auteur appartiennent à l'auteur original. Ce site n'assume aucune responsabilité légale correspondante. Si vous trouvez un contenu suspecté de plagiat ou de contrefaçon, veuillez contacter admin@php.cn