How to use Python PyMySQL for data storage?
Python connects to and operates MySQL databases mainly through the PyMySQL module. This article explains how to store scraped data in a MySQL database.
First, make sure MySQL is installed on your computer, then set up the database and table as follows:
# 1. Connect to the MySQL server
mysql -h127.0.0.1 -uroot -p123456

# 2. Create the database
create database maoyandb charset utf8;

# 3. Switch to the database
use maoyandb;

# 4. Create the data table
create table filmtab(
    name varchar(100),
    star varchar(400),
    time varchar(30)
);
1) Connect to the database
db = pymysql.connect(host='localhost', user='root', password='123456', database='maoyandb')
Parameter description:
host: address of the local MySQL server ('localhost'), or the IP address of a remote database server.
user: the user name used to connect to the database, here 'root'.
password: the password used to connect to the database; the local MySQL server password here is '123456'.
database: the name of the database to connect to, here 'maoyandb'.
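If you are unsure whether the connection succeeded, a quick sanity check is to ask the server for its version string. A minimal sketch, using the same credentials as above:

import pymysql

db = pymysql.connect(host='localhost', user='root',
                     password='123456', database='maoyandb')
cursor = db.cursor()
cursor.execute('select version()')   # ask the server for its version
print(cursor.fetchone())             # e.g. ('8.0.33',)
cursor.close()
db.close()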
2) Create a cursor object
cursor = db.cursor()
3) Execute SQL statements
The execute() method runs an SQL statement. There are two ways to pass in values, as shown below:
# Method 1: build the SQL statement with Python string formatting
sql = "insert into filmtab values('%s','%s','%s')" % ('刺杀,小说家','雷佳音','2021')
cursor.execute(sql)

# Method 2: use placeholders and let execute() fill in the values from a list
# (preferred, because the driver escapes the values for you)
sql = 'insert into filmtab values(%s,%s,%s)'
cursor.execute(sql, ['刺杀,小说家','雷佳音','2021'])
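The same placeholder style works for reading data back. As a minimal sketch, reusing the cursor from above:

# Query the row that was just inserted; fetchall() returns a tuple of rows
sql = 'select * from filmtab where name = %s'
cursor.execute(sql, ['刺杀,小说家'])
for row in cursor.fetchall():
    print(row)   # ('刺杀,小说家', '雷佳音', '2021')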
4) Commit the data
db.commit()
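Changes only become permanent once commit() is called. It is common to pair it with rollback() so that a failed statement does not leave half-written data behind; a minimal sketch of the pattern (the same one the crawler below uses):

try:
    cursor.execute(sql, ['刺杀,小说家', '雷佳音', '2021'])
    db.commit()      # make the insert permanent
except Exception:
    db.rollback()    # undo the pending changes on error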
5) Close the cursor and the database connection
cursor.close()
db.close()
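Note that the complete program below (and the examples after it) insert into a movieinfo table rather than filmtab. The original article does not show its definition; assuming it mirrors filmtab, it can be created like this:

create table movieinfo(
    name varchar(100),
    star varchar(400),
    time varchar(30)
);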
The complete code is as follows:
# -*- coding: utf-8 -*-
import pymysql

# Create the connection and cursor objects
db = pymysql.connect(host='localhost', user='root', password='123456', database='maoyandb')
cursor = db.cursor()

# Execute the SQL statement: single-row insert
info_list = ['刺杀,小说家','雷佳音,杨幂','2021-2-12']
sql = 'insert into movieinfo values(%s,%s,%s)'
# Pass the values as a list
cursor.execute(sql, info_list)
db.commit()

# Close the cursor and the connection
cursor.close()
db.close()
Query the result, as shown below:
mysql> select * from movieinfo;
+-------------+-------------+-----------+
| name        | star        | time      |
+-------------+-------------+-----------+
| 刺杀,小说家 | 雷佳音,杨幂 | 2021-2-12 |
+-------------+-------------+-----------+
1 row in set (0.01 sec)
There is also a more efficient method: executemany() inserts multiple rows in a single call. An example is as follows:
import pymysql

db = pymysql.connect(host='localhost', user='root', password='123456',
                     database='maoyandb', charset='utf8')
cursor = db.cursor()

# Execute the SQL statement, passing the rows as a list of tuples
info_list = [('我不是药神','徐峥','2018-07-05'),('你好,李焕英','贾玲','2021-02-12')]
sql = 'insert into movieinfo values(%s,%s,%s)'
cursor.executemany(sql, info_list)
db.commit()

# Close the cursor and the connection
cursor.close()
db.close()
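For INSERT statements of this form, PyMySQL can expand executemany() into a single multi-row INSERT, which is why it is usually much faster than calling execute() in a loop. After the call, cursor.rowcount reports how many rows were affected:

print(cursor.rowcount)   # 2 -- both rows were inserted by one call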
Querying the inserted rows gives the following:
mysql> select * from movieinfo;
+-------------+-------+------------+
| name        | star  | time       |
+-------------+-------+------------+
| 我不是药神  | 徐峥  | 2018-07-05 |
| 你好,李焕英 | 贾玲  | 2021-02-12 |
+-------------+-------+------------+
2 rows in set (0.01 sec)
Below, we modify the Maoyan crawler program so that the scraped data is stored in the MySQL database, as shown below:
# coding=gbk
from urllib import request
import re
import time
import random
from ua_info import ua_list
import pymysql

class MaoyanSpider(object):
    def __init__(self):
        # Initialize instance attributes
        self.url = 'https://maoyan.com/board/4?offset={}'
        # Database connection object
        self.db = pymysql.connect(
            host='localhost', user='root', password='123456',
            database='maoyandb', charset='utf8')
        # Create the cursor object
        self.cursor = self.db.cursor()

    def get_html(self, url):
        headers = {'User-Agent': random.choice(ua_list)}
        req = request.Request(url=url, headers=headers)
        res = request.urlopen(req)
        html = res.read().decode()
        # Parse the page directly
        self.parse_html(html)

    def parse_html(self, html):
        re_bds = '<div class="movie-item-info">.*?title="(.*?)".*?<p class="star">(.*?)</p>.*?class="releasetime">(.*?)</p>'
        pattern = re.compile(re_bds, re.S)
        r_list = pattern.findall(html)
        self.save_html(r_list)

    def save_html(self, r_list):
        L = []
        sql = 'insert into movieinfo values(%s,%s,%s)'
        # Clean up the extracted data
        for r in r_list:
            t = (
                r[0].strip(),
                r[1].strip()[3:],    # drop the "主演:" prefix
                r[2].strip()[5:15]   # keep only the date part of "上映时间:..."
            )
            L.append(t)
        print(L)
        # Insert all rows at once; L looks like [(), (), ()]
        try:
            self.cursor.executemany(sql, L)
            # Commit the data to the database
            self.db.commit()
        except Exception:
            # Roll back on error
            self.db.rollback()

    def run(self):
        for offset in range(0, 11, 10):
            url = self.url.format(offset)
            self.get_html(url)
            time.sleep(random.uniform(1, 3))
        # Close the cursor and the database connection
        self.cursor.close()
        self.db.close()

if __name__ == '__main__':
    start = time.time()
    spider = MaoyanSpider()
    spider.run()
    end = time.time()
    print('execution time: %.2f' % (end - start))
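The ua_list imported above comes from a small helper module, ua_info.py, whose contents the article does not show. A minimal sketch would be a plain list of User-Agent strings for random rotation (the entries below are illustrative, not from the original):

# ua_info.py -- a pool of User-Agent strings for random rotation
ua_list = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0',
]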
Querying the stored results in the database gives the following:
mysql> select * from movieinfo;
+------------+--------------------------------------------+------------+
| name       | star                                       | time       |
+------------+--------------------------------------------+------------+
| 我不是药神 | 徐峥,周一围,王传君                         | 2018-07-05 |
| 绿皮书     | 维果·莫腾森,马赫沙拉·阿里,琳达·卡德里尼    | 2019-03-01 |
| 海上钢琴师 | 蒂姆·罗斯,比尔·努恩,克拉伦斯·威廉姆斯三世  | 2019-11-15 |
| 小偷家族   | 中川雅也,安藤樱,松冈茉优                   | 2018-08-03 |
| ...        | ...                                        | ...        |
+------------+--------------------------------------------+------------+
10 rows in set (0.01 sec)