Home > Article > Backend Development > How Python uses the GDAL module to read raster data and filter the specified data
First of all, you need to prepare the modules used and various paths for storing raster images.
import os import copy import numpy as np import pylab as plt from osgeo import gdal # rt_file_path="G:/Postgraduate/LAI_Glass_RTlab/Rc_Lai_A2018161_h22v03.tif" # gl_file_path="G:/Postgraduate/LAI_Glass_RTlab/GLASS01E01.V50.A2018161.h22v03.2020323.hdf" # out_file_path="G:/Postgraduate/LAI_Glass_RTlab/test.tif" rt_file_path="I:/LAI_RTLab/A2018161/" gl_file_path="I:/LAI_Glass/2018161/" out_file_path="I:/LAI_Dif/"
Among them, rt_file_path
is the storage path of its own products, gl_file_path
is the storage path of GLASS products, out_file_path
The storage path of the final result after difference processing between the two rasters.
Next, you need to obtain all the raster images to be processed using os.listdir()
, and use for
Loop to prepare for loop batch processing operations.
rt_file_list=os.listdir(rt_file_path) for rt_file in rt_file_list: file_name_split=rt_file.split("_") rt_hv=file_name_split[3][:-4] gl_file_list=os.listdir(gl_file_path) for gl_file in gl_file_list: if rt_hv in gl_file: rt_file_tif_path=rt_file_path+rt_file gl_file_tif_path=gl_file_path+gl_file
Among them, since the requirement of this article is to make a difference between the two products, it is first necessary to combine the hv
frame numbers of the two, and put the two remote sensing images with the same frame number in At the same time; therefore, based on the characteristics of the own product file name, select .split()
to split the string, and then intercept the hv
frame number of the remote sensing image.
The path for storing the output file has been configured in the foregoing part 1.1, but the output file name has not yet been configured; so here we It is necessary to configure the file storage path and name of each remote sensing image after processing. Among them, we directly use the hv
number of the remote sensing image as the output result file name.
DRT_out_file_path=out_file_path+"DRT/" if not os.path.exists(DRT_out_file_path): os.makedirs(DRT_out_file_path) DRT_out_file_tif_path=os.path.join(DRT_out_file_path,rt_hv+".tif") eco_out_file_path=out_file_path+"eco/" if not os.path.exists(eco_out_file_path): os.makedirs(eco_out_file_path) eco_out_file_tif_path=os.path.join(eco_out_file_path,rt_hv+".tif") wat_out_file_path=out_file_path+"wat/" if not os.path.exists(wat_out_file_path): os.makedirs(wat_out_file_path) wat_out_file_tif_path=os.path.join(wat_out_file_path,rt_hv+".tif") tim_out_file_path=out_file_path+"tim/" if not os.path.exists(tim_out_file_path): os.makedirs(tim_out_file_path) tim_out_file_tif_path=os.path.join(tim_out_file_path,rt_hv+".tif")
This part of the code is divided into four parts because the LAI of our own products is obtained based on four algorithms respectively. When making a difference, each algorithm needs to be combined with #GLASS Products are subtracted, so four output path folders are configured.
Next, use the gdal
module to read .tif
and .hdf
Wait for two raster images to be read.
rt_raster=gdal.Open(rt_file_path+rt_file) rt_band_num=rt_raster.RasterCount rt_raster_array=rt_raster.ReadAsArray() rt_lai_array=rt_raster_array[0] rt_qa_array=rt_raster_array[1] rt_lai_band=rt_raster.GetRasterBand(1) # rt_lai_nodata=rt_lai_band.GetNoDataValue() # rt_lai_nodata=32767 # rt_lai_mask=np.ma.masked_equal(rt_lai_array,rt_lai_nodata) rt_lai_array_mask=np.where(rt_lai_array>30000,np.nan,rt_lai_array) rt_lai_array_fin=rt_lai_array_mask*0.001 gl_raster=gdal.Open(gl_file_path+gl_file) gl_band_num=gl_raster.RasterCount gl_raster_array=gl_raster.ReadAsArray() gl_lai_array=gl_raster_array gl_lai_band=gl_raster.GetRasterBand(1) gl_lai_array_mask=np.where(gl_lai_array>1000,np.nan,gl_lai_array) gl_lai_array_fin=gl_lai_array_mask*0.01 row=rt_raster.RasterYSize col=rt_raster.RasterXSize geotransform=rt_raster.GetGeoTransform() projection=rt_raster.GetProjection()
First of all, take the first paragraph of the above code as an example to explain. Among them, gdal.Open()
reads the raster image; .RasterCount
obtains the number of raster image bands; .ReadAsArray()
converts each band of the raster image The information is read in Array
format. When the number of bands is greater than 1
, it has three dimensions, and the first dimension is the number of bands; rt_raster_array[0]
It means taking the first band in Array
, which in this article is the LAI band of our own product; rt_qa_array=rt_raster_array[1]
means taking the first band The two bands, in this article, are the QA bands of our own products; .GetRasterBand(1)
means getting the first band in the raster image (note that the serial number here is not Starting from 0
instead of 1
); np.where(rt_lai_array>30000,np.nan,rt_lai_array)
means using np.where ()
function selects pixel >30000
in the first band in Array
and sets it to nan
, leaving other values unchanged . This step is a method to eliminate fill values and Nodata
values in the image. The last sentence *0.001
is to restore the original scaling factor of the layer.
Secondly, the third section of the above code is to obtain the raster row, column number and projection transformation information.
Next, first perform the difference operation on our own products and GLASS products, and then we need to perform the difference operation on the four algorithms respectively. extract.
lai_dif=rt_lai_array_fin-gl_lai_array_fin lai_dif=lai_dif*1000 rt_qa_array_bin=copy.copy(rt_qa_array) rt_qa_array_row,rt_qa_array_col=rt_qa_array.shape for i in range(rt_qa_array_row): for j in range(rt_qa_array_col): rt_qa_array_bin[i][j]="{:012b}".format(rt_qa_array_bin[i][j])[-4:] # DRT_pixel_pos=np.where((rt_qa_array_bin>=100) & (rt_qa_array_bin==11)) # eco_pixel_pos=np.where((rt_qa_array_bin<100) & (rt_qa_array_bin==111)) # wat_pixel_pos=np.where((rt_qa_array_bin<1000) & (rt_qa_array_bin==1011)) # tim_pixel_pos=np.where((rt_qa_array_bin<1100) & (rt_qa_array_bin==1111)) # colormap=plt.cm.Greens # plt.figure(1) # # plt.subplot(2,4,1) # plt.imshow(rt_lai_array_fin,cmap=colormap,interpolation='none') # plt.title("RT_LAI") # plt.colorbar() # plt.figure(2) # # plt.subplot(2,4,2) # plt.imshow(gl_lai_array_fin,cmap=colormap,interpolation='none') # plt.title("GLASS_LAI") # plt.colorbar() # plt.figure(3) # dif_colormap=plt.cm.get_cmap("Spectral") # plt.imshow(lai_dif,cmap=dif_colormap,interpolation='none') # plt.title("Difference_LAI (RT-GLASS)") # plt.colorbar() DRT_lai_dif_array=np.where((rt_qa_array_bin>=100) | (rt_qa_array_bin==11), np.nan,lai_dif) eco_lai_dif_array=np.where((rt_qa_array_bin<100) | (rt_qa_array_bin==111), np.nan,lai_dif) wat_lai_dif_array=np.where((rt_qa_array_bin<1000) | (rt_qa_array_bin==1011), np.nan,lai_dif) tim_lai_dif_array=np.where((rt_qa_array_bin<1100) | (rt_qa_array_bin==1111), np.nan,lai_dif) # plt.figure(4) # plt.imshow(DRT_lai_dif_array) # plt.colorbar() # plt.figure(5) # plt.imshow(eco_lai_dif_array) # plt.colorbar() # plt.figure(6) # plt.imshow(wat_lai_dif_array) # plt.colorbar() # plt.figure(7) # plt.imshow(tim_lai_dif_array) # plt.colorbar()
Among them, the first two sentences of the above code are difference calculation and data integration. Converting the data to integers reduces the amount of data in the resulting data layer (because there is no need to store decimals).
Then, data screening and masking began based on the QA band. In fact, the QA bands of various remote sensing images (such as MODIS, Landsat, etc.) are relatively similar: the quality of remote sensing images is represented by a string of binary codes. , information, etc., in which different bits often represent a characteristic. For example, the figure below shows the meaning of the QA band of Landsat Collection 2 Level-2.
Here, the QA band was originally in decimal (in order to save space in general remote sensing images, the QA band is written in decimal form), so it needs to be converted to Binary; then determine the pixel by obtaining the required number of binary data digits (in this article, it is the number of binary digits that can determine which algorithm this pixel in our own products comes from) Which algorithm is used to obtain the LAI, so that the pixels corresponding to each algorithm are processed together. The four variables including DRT_lai_dif_array
respectively represent all the pixels in the four algorithms except the pixels obtained by the own algorithm; the reason for choosing this method is because we can directly mask it later. If the film is removed, what is left are the pixels of this algorithm itself.
Among them, the plt
related content commented out in the above code can be used to draw spatial distribution maps. If you are interested, you can try to use it.
接下来,将我们完成上述差值计算与依据算法进行筛选后的图像保存。
driver=gdal.GetDriverByName("Gtiff") out_DRT_lai=driver.Create(DRT_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_DRT_lai.SetGeoTransform(geotransform) out_DRT_lai.SetProjection(projection) out_DRT_lai.GetRasterBand(1).WriteArray(DRT_lai_dif_array) out_DRT_lai=None driver=gdal.GetDriverByName("Gtiff") out_eco_lai=driver.Create(eco_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_eco_lai.SetGeoTransform(geotransform) out_eco_lai.SetProjection(projection) out_eco_lai.GetRasterBand(1).WriteArray(eco_lai_dif_array) out_eco_lai=None driver=gdal.GetDriverByName("Gtiff") out_wat_lai=driver.Create(wat_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_wat_lai.SetGeoTransform(geotransform) out_wat_lai.SetProjection(projection) out_wat_lai.GetRasterBand(1).WriteArray(wat_lai_dif_array) out_wat_lai=None driver=gdal.GetDriverByName("Gtiff") out_tim_lai=driver.Create(tim_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_tim_lai.SetGeoTransform(geotransform) out_tim_lai.SetProjection(projection) out_tim_lai.GetRasterBand(1).WriteArray(tim_lai_dif_array) out_tim_lai=None print(rt_hv)
其中,.GetDriverByName("Gtiff")
表示保存为.tif
格式的GeoTIFF文件;driver.Create(DRT_out_file_tif_path,row,col,1,gdal.GDT_Float32)
表示按照路径、行列数、波段数与数据格式等建立一个新的栅格图层,作为输出图层的框架;其后表示分别将地理投影转换信息与像素具体数值分别赋予这一新建的栅格图层;最后=None
表示将其从内存空间中释放,完成写入与保存工作。
本文所需完整代码如下:
# -*- coding: utf-8 -*- """ Created on Thu Jul 15 19:36:15 2021 @author: fkxxgis """ import os import copy import numpy as np import pylab as plt from osgeo import gdal # rt_file_path="G:/Postgraduate/LAI_Glass_RTlab/Rc_Lai_A2018161_h22v03.tif" # gl_file_path="G:/Postgraduate/LAI_Glass_RTlab/GLASS01E01.V50.A2018161.h22v03.2020323.hdf" # out_file_path="G:/Postgraduate/LAI_Glass_RTlab/test.tif" rt_file_path="I:/LAI_RTLab/A2018161/" gl_file_path="I:/LAI_Glass/2018161/" out_file_path="I:/LAI_Dif/" rt_file_list=os.listdir(rt_file_path) for rt_file in rt_file_list: file_name_split=rt_file.split("_") rt_hv=file_name_split[3][:-4] gl_file_list=os.listdir(gl_file_path) for gl_file in gl_file_list: if rt_hv in gl_file: rt_file_tif_path=rt_file_path+rt_file gl_file_tif_path=gl_file_path+gl_file DRT_out_file_path=out_file_path+"DRT/" if not os.path.exists(DRT_out_file_path): os.makedirs(DRT_out_file_path) DRT_out_file_tif_path=os.path.join(DRT_out_file_path,rt_hv+".tif") eco_out_file_path=out_file_path+"eco/" if not os.path.exists(eco_out_file_path): os.makedirs(eco_out_file_path) eco_out_file_tif_path=os.path.join(eco_out_file_path,rt_hv+".tif") wat_out_file_path=out_file_path+"wat/" if not os.path.exists(wat_out_file_path): os.makedirs(wat_out_file_path) wat_out_file_tif_path=os.path.join(wat_out_file_path,rt_hv+".tif") tim_out_file_path=out_file_path+"tim/" if not os.path.exists(tim_out_file_path): os.makedirs(tim_out_file_path) tim_out_file_tif_path=os.path.join(tim_out_file_path,rt_hv+".tif") rt_raster=gdal.Open(rt_file_path+rt_file) rt_band_num=rt_raster.RasterCount rt_raster_array=rt_raster.ReadAsArray() rt_lai_array=rt_raster_array[0] rt_qa_array=rt_raster_array[1] rt_lai_band=rt_raster.GetRasterBand(1) # rt_lai_nodata=rt_lai_band.GetNoDataValue() # rt_lai_nodata=32767 # rt_lai_mask=np.ma.masked_equal(rt_lai_array,rt_lai_nodata) rt_lai_array_mask=np.where(rt_lai_array>30000,np.nan,rt_lai_array) rt_lai_array_fin=rt_lai_array_mask*0.001 gl_raster=gdal.Open(gl_file_path+gl_file) gl_band_num=gl_raster.RasterCount gl_raster_array=gl_raster.ReadAsArray() gl_lai_array=gl_raster_array gl_lai_band=gl_raster.GetRasterBand(1) gl_lai_array_mask=np.where(gl_lai_array>1000,np.nan,gl_lai_array) gl_lai_array_fin=gl_lai_array_mask*0.01 row=rt_raster.RasterYSize col=rt_raster.RasterXSize geotransform=rt_raster.GetGeoTransform() projection=rt_raster.GetProjection() lai_dif=rt_lai_array_fin-gl_lai_array_fin lai_dif=lai_dif*1000 rt_qa_array_bin=copy.copy(rt_qa_array) rt_qa_array_row,rt_qa_array_col=rt_qa_array.shape for i in range(rt_qa_array_row): for j in range(rt_qa_array_col): rt_qa_array_bin[i][j]="{:012b}".format(rt_qa_array_bin[i][j])[-4:] # DRT_pixel_pos=np.where((rt_qa_array_bin>=100) & (rt_qa_array_bin==11)) # eco_pixel_pos=np.where((rt_qa_array_bin<100) & (rt_qa_array_bin==111)) # wat_pixel_pos=np.where((rt_qa_array_bin<1000) & (rt_qa_array_bin==1011)) # tim_pixel_pos=np.where((rt_qa_array_bin<1100) & (rt_qa_array_bin==1111)) # colormap=plt.cm.Greens # plt.figure(1) # # plt.subplot(2,4,1) # plt.imshow(rt_lai_array_fin,cmap=colormap,interpolation='none') # plt.title("RT_LAI") # plt.colorbar() # plt.figure(2) # # plt.subplot(2,4,2) # plt.imshow(gl_lai_array_fin,cmap=colormap,interpolation='none') # plt.title("GLASS_LAI") # plt.colorbar() # plt.figure(3) # dif_colormap=plt.cm.get_cmap("Spectral") # plt.imshow(lai_dif,cmap=dif_colormap,interpolation='none') # plt.title("Difference_LAI (RT-GLASS)") # plt.colorbar() DRT_lai_dif_array=np.where((rt_qa_array_bin>=100) | (rt_qa_array_bin==11), np.nan,lai_dif) eco_lai_dif_array=np.where((rt_qa_array_bin<100) | (rt_qa_array_bin==111), np.nan,lai_dif) wat_lai_dif_array=np.where((rt_qa_array_bin<1000) | (rt_qa_array_bin==1011), np.nan,lai_dif) tim_lai_dif_array=np.where((rt_qa_array_bin<1100) | (rt_qa_array_bin==1111), np.nan,lai_dif) # plt.figure(4) # plt.imshow(DRT_lai_dif_array) # plt.colorbar() # plt.figure(5) # plt.imshow(eco_lai_dif_array) # plt.colorbar() # plt.figure(6) # plt.imshow(wat_lai_dif_array) # plt.colorbar() # plt.figure(7) # plt.imshow(tim_lai_dif_array) # plt.colorbar() driver=gdal.GetDriverByName("Gtiff") out_DRT_lai=driver.Create(DRT_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_DRT_lai.SetGeoTransform(geotransform) out_DRT_lai.SetProjection(projection) out_DRT_lai.GetRasterBand(1).WriteArray(DRT_lai_dif_array) out_DRT_lai=None driver=gdal.GetDriverByName("Gtiff") out_eco_lai=driver.Create(eco_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_eco_lai.SetGeoTransform(geotransform) out_eco_lai.SetProjection(projection) out_eco_lai.GetRasterBand(1).WriteArray(eco_lai_dif_array) out_eco_lai=None driver=gdal.GetDriverByName("Gtiff") out_wat_lai=driver.Create(wat_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_wat_lai.SetGeoTransform(geotransform) out_wat_lai.SetProjection(projection) out_wat_lai.GetRasterBand(1).WriteArray(wat_lai_dif_array) out_wat_lai=None driver=gdal.GetDriverByName("Gtiff") out_tim_lai=driver.Create(tim_out_file_tif_path,row,col,1,gdal.GDT_Float32) out_tim_lai.SetGeoTransform(geotransform) out_tim_lai.SetProjection(projection) out_tim_lai.GetRasterBand(1).WriteArray(tim_lai_dif_array) out_tim_lai=None print(rt_hv)
The above is the detailed content of How Python uses the GDAL module to read raster data and filter the specified data. For more information, please follow other related articles on the PHP Chinese website!