How to use Python to capture administrative division codes

WBOY
2016-12-05 13:27:19

Foreword

The National Bureau of Statistics website publishes a fairly complete list of administrative division codes. Many websites rely on this as basic reference data, so I wrote a Python program to scrape it.

Note: the scraped output still needs some simple manual cleanup.

Sample code:

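# -*- coding: utf-8 -*-
'''
Fetch the administrative division codes published on the
National Bureau of Statistics website.
'''
import re

import requests

base_url = 'http://www.stats.gov.cn/tjsj/tjbz/xzqhdm/201504/t20150415_712722.html'


def get_xzqh():
    # .content is bytes; decode it so the regex and string methods work on text
    html_data = requests.get(base_url).content.decode('utf-8')
    # Each entry is a paragraph holding a numeric code followed by the area name
    pattern = re.compile(r'<p class="MsoNormal" style=".*?"><span lang="EN-US" style=".*?">(\d+)<span>.*?</span></span><span style=".*?">(.*?)</span></p>')
    areas = pattern.findall(html_data)
    print("code,name,level")
    for code, name in areas:
        # The number of spaces in the raw name is used as the level;
        # fields are comma-separated to match the header line
        print(code, name.replace(' ', ''), name.count(' '), sep=',')


if __name__ == '__main__':
    get_xzqh()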
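
The regular expression is the fragile part of this script, so it can help to check it without hitting the network. The following is a minimal sketch, assuming the pattern uses non-greedy `.*?` groups; `parse_areas` and `SAMPLE_HTML` are hypothetical names introduced here, and the sample fragment is hand-written to fit the pattern rather than copied from the live page.

```python
import re

# The article's pattern, written as a raw string with non-greedy groups.
AREA_PATTERN = re.compile(
    r'<p class="MsoNormal" style=".*?"><span lang="EN-US" style=".*?">(\d+)'
    r'<span>.*?</span></span><span style=".*?">(.*?)</span></p>'
)


def parse_areas(html):
    """Return (code, name, level) tuples; level is the number of spaces in the raw name."""
    rows = []
    for code, raw_name in AREA_PATTERN.findall(html):
        rows.append((code, raw_name.replace(' ', ''), raw_name.count(' ')))
    return rows


# Hypothetical fragment shaped like what the pattern expects (an assumption,
# not the real page markup).
SAMPLE_HTML = (
    '<p class="MsoNormal" style="a"><span lang="EN-US" style="b">110000'
    '<span>&nbsp;</span></span><span style="c">北京市</span></p>\n'
    '<p class="MsoNormal" style="a"><span lang="EN-US" style="b">110100'
    '<span>&nbsp;</span></span><span style="c"> 市辖区</span></p>\n'
)

if __name__ == '__main__':
    for row in parse_areas(SAMPLE_HTML):
        print(row)
```

Keeping the parsing in its own function also makes it easier to adjust the pattern if the page markup changes.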

Notes:

In addition, there is another source for country and region data: the table that ships with the QQ client, in a file named LocList.xml. It is usually stored in C:\Program Files\Tencent\QQ\I18N\2052

For the Chinese version, install the Chinese edition of QQ; for the English version, install the English edition. The international version keeps the file in directory 1033.

The codes follow the ISO 3166 standard and are easy to import into a database.
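
As a rough sketch of how such an XML file could be turned into rows for a database import, the snippet below walks an XML tree and yields one row per element. The layout of LocList.xml is not described in this article, so the element and attribute names in `SAMPLE_XML` are purely hypothetical placeholders, and `flatten` is a helper name introduced here; the traversal itself does not depend on any particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: these element and attribute names are assumptions,
# not the verified layout of LocList.xml.
SAMPLE_XML = '''<Location>
  <CountryRegion Name="Example Country" Code="XX">
    <State Name="Example State" Code="01">
      <City Name="Example City" Code="001"/>
    </State>
  </CountryRegion>
</Location>'''


def flatten(element, depth=0):
    """Yield (depth, tag, attributes) for an element and its descendants, in document order."""
    yield depth, element.tag, dict(element.attrib)
    for child in element:
        yield from flatten(child, depth + 1)


if __name__ == '__main__':
    root = ET.fromstring(SAMPLE_XML)
    for depth, tag, attrs in flatten(root):
        print(depth, tag, attrs)
```

To read a real file instead of the sample string, `ET.parse(path).getroot()` returns the root element to pass to `flatten`.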

Summary

That covers how to use Python to obtain administrative division codes. I hope this article is helpful to anyone learning or using Python. If you have any questions, feel free to leave a comment.
