
How to use Python Pymysql for data storage?

WBOY
2023-04-23 20:34:15

Python connects to and operates on a MySQL database mainly through the Pymysql module. This article explains how to store scraped data in a MySQL database.

Create storage data table

First, make sure MySQL is installed on your computer, then proceed as follows:

# 1. Connect to the MySQL server
mysql -h127.0.0.1 -uroot -p123456
# 2. Create the database
create database maoyandb charset utf8;
# 3. Switch to the new database
use maoyandb;
# 4. Create the data table
create table filmtab(
name varchar(100),
star varchar(400),
time varchar(30)
);
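The same setup can also be scripted from Python with pymysql instead of the mysql client. A minimal sketch reusing the credentials above (`DDL` and `create_schema` are illustrative names, not part of pymysql):

```python
# Sketch: run the schema DDL through pymysql rather than the mysql shell.
# The statements mirror the commands above; "if not exists" makes the
# script safe to re-run.
DDL = [
    "create database if not exists maoyandb charset utf8",
    "use maoyandb",
    "create table if not exists filmtab("
    "name varchar(100), star varchar(400), time varchar(30))",
]

def create_schema(cursor):
    """Execute each DDL statement on an open pymysql cursor."""
    for stmt in DDL:
        cursor.execute(stmt)
```

In practice you would call `create_schema(db.cursor())` on a connection opened without the `database` parameter, since `maoyandb` may not exist yet.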

Basic use of Pymysql

1) Connect to the database

db = pymysql.connect(host='localhost', user='root', password='123456', database='maoyandb')

Parameter description:

  • host: the MySQL server address, 'localhost' for a local server or the IP address of a remote one.

  • user: the user name used to connect to the database.

  • password: the password used to connect to the database; here the local MySQL password is "123456".

  • database: the name of the database to connect to.
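Newer PyMySQL releases expect these parameters as keywords, so it can be convenient to keep them in one place. A sketch, where `DB_CONFIG` and `get_connection` are illustrative names and the values are the tutorial's local defaults:

```python
# Connection settings in one dict; adjust host/user/password for your server.
DB_CONFIG = {
    "host": "localhost",       # or the IP address of a remote MySQL server
    "user": "root",
    "password": "123456",
    "database": "maoyandb",
    "charset": "utf8mb4",      # utf8mb4 also covers 4-byte characters
}

def get_connection():
    """Open a pymysql connection using the settings above."""
    import pymysql  # imported lazily so the config stays importable on its own
    return pymysql.connect(**DB_CONFIG)
```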

2) Create cursor object

cursor = db.cursor()

3) Execute sql command

The execute() method runs a single SQL statement, as shown below:

# Method 1: build the SQL string yourself with % formatting
# (fragile and open to SQL injection; shown only for comparison)
sql = "insert into filmtab values('%s','%s','%s')" % ('刺杀,小说家','雷佳音','2021')
cursor.execute(sql)
# Method 2: keep %s placeholders in the SQL and pass the values
# separately, letting the driver escape them
sql = 'insert into filmtab values(%s,%s,%s)'
cursor.execute(sql,['刺杀,小说家','雷佳音','2021'])
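A short, hypothetical illustration of the difference between the two methods: with % formatting, a value containing a single quote lands unescaped inside the SQL text, while with placeholders the statement never changes and the driver escapes the values when execute(sql, params) runs:

```python
title = "L'amour"  # made-up title containing a single quote

# Method 1: the quote from the data is pasted straight into the SQL,
# breaking the statement (and opening the door to SQL injection).
sql_formatted = "insert into filmtab values('%s','%s','%s')" % (title, 'X', '2021')

# Method 2: the SQL text keeps its placeholders; the values travel
# separately and are escaped by the driver when execute() is called.
sql_parameterized = 'insert into filmtab values(%s,%s,%s)'
params = [title, 'X', '2021']
```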

4) Commit the data

db.commit()

5) Close the cursor and the connection

cursor.close()
db.close()

The complete code is as follows:

# -*- coding: utf-8 -*-
import pymysql

# create the connection object (this and the following examples assume
# a movieinfo table with the same three columns as filmtab above)
db = pymysql.connect(host='localhost', user='root', password='123456', database='maoyandb')
cursor = db.cursor()
# execute the SQL statement: single-row insert
info_list = ['刺杀,小说家','雷佳音,杨幂','2021-2-12']
sql = 'insert into movieinfo values(%s,%s,%s)'
# pass the parameters as a list
cursor.execute(sql, info_list)
db.commit()
# clean up
cursor.close()
db.close()

Query the stored data, as shown below:

mysql> select * from movieinfo;
+--------------------------+---------------------+-----------+
| name                     | star                | time      |
+--------------------------+---------------------+-----------+
| Assassination, novelist  | Lei Jiayin, Yang Mi | 2021-2-12 |
+--------------------------+---------------------+-----------+
1 row in set (0.01 sec)
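The same rows can be read back from Python instead of the mysql client. A minimal sketch, where `fetch_films` is an illustrative helper (execute() and fetchall() are standard pymysql cursor methods):

```python
def fetch_films(cursor):
    """Run a SELECT on an open pymysql cursor and return all rows
    as a sequence of (name, star, time) tuples."""
    cursor.execute("select * from movieinfo")
    return cursor.fetchall()
```

Typical use would be `for row in fetch_films(db.cursor()): print(row)`.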

A more efficient method is executemany(), which inserts multiple rows at once. An example is as follows:

db = pymysql.connect(host='localhost', user='root', password='123456', database='maoyandb', charset='utf8')
cursor = db.cursor()
# execute the SQL statement with a list of tuples
info_list = [('我不是药神','徐峥','2018-07-05'),('你好,李焕英','贾玲','2021-02-12')]
sql = 'insert into movieinfo values(%s,%s,%s)'
cursor.executemany(sql, info_list)
db.commit()
# clean up
cursor.close()
db.close()
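When a scrape yields many rows, it can help to feed executemany() in fixed-size batches rather than one giant list. A sketch, where `insert_in_batches` is an illustrative helper, not a pymysql API:

```python
def insert_in_batches(cursor, rows, batch_size=100):
    """Insert (name, star, time) tuples via executemany in batches,
    keeping each statement's parameter list to a manageable size."""
    sql = 'insert into movieinfo values(%s,%s,%s)'
    for i in range(0, len(rows), batch_size):
        cursor.executemany(sql, rows[i:i + batch_size])
```

Call db.commit() once after all batches, as in the example above.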

Query the insertion results as follows:

mysql> select * from movieinfo;
+------------------------------+----------+------------+
| name                         | star     | time       |
+------------------------------+----------+------------+
| I am not the god of medicine | Xu Zheng | 2018-07-05 |
| Hello, Li Huanying           | Jia Ling | 2021-02-12 |
+------------------------------+----------+------------+
2 rows in set (0.01 sec)

Modify the crawler program

Next, modify the crawler program to store the scraped data in the MySQL database, as shown below:

# -*- coding: utf-8 -*-
from urllib import request
import re
import time
import random
from ua_info import ua_list  # list of User-Agent strings
import pymysql

class MaoyanSpider(object):
    def __init__(self):
        # initialize attributes
        self.url = 'https://maoyan.com/board/4?offset={}'
        # database connection object
        self.db = pymysql.connect(
            host='localhost', user='root', password='123456',
            database='maoyandb', charset='utf8')
        # create the cursor object
        self.cursor = self.db.cursor()

    def get_html(self, url):
        headers = {'User-Agent': random.choice(ua_list)}
        req = request.Request(url=url, headers=headers)
        res = request.urlopen(req)
        html = res.read().decode()
        # parse directly
        self.parse_html(html)

    def parse_html(self, html):
        re_bds = '<div class="movie-item-info">.*?title="(.*?)".*?<p class="star">(.*?)</p>.*?class="releasetime">(.*?)</p>'
        pattern = re.compile(re_bds, re.S)
        r_list = pattern.findall(html)
        self.save_html(r_list)

    def save_html(self, r_list):
        L = []
        sql = 'insert into movieinfo values(%s,%s,%s)'
        # tidy up the data
        for r in r_list:
            t = (
                r[0].strip(),
                r[1].strip()[3:],   # drop the '主演:' prefix
                r[2].strip()[5:15]  # keep only the 10-character date
            )
            L.append(t)
        print(L)
        # insert multiple rows at once; L: [(),(),()]
        try:
            self.cursor.executemany(sql, L)
            # commit the data to the database
            self.db.commit()
        except Exception:
            # roll back on error
            self.db.rollback()

    def run(self):
        for offset in range(0, 11, 10):
            url = self.url.format(offset)
            self.get_html(url)
            time.sleep(random.uniform(1, 3))
        # close the cursor and the database connection
        self.cursor.close()
        self.db.close()

if __name__ == '__main__':
    start = time.time()
    spider = MaoyanSpider()
    spider.run()
    end = time.time()
    print("execution time: %.2f" % (end - start))
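The index slicing inside save_html is the easiest part to get wrong, so it can be pulled out and checked on its own. A sketch, where `clean_record` is an illustrative helper and the assumed prefixes, '主演:' and '上映时间:', come from Maoyan's markup as scraped above:

```python
def clean_record(r):
    """Mirror save_html's clean-up of one regex match:
    strip whitespace, drop the 3-character '主演:' prefix from the
    star field, and keep the 10-character date from the
    '上映时间:' field (skip 5 chars, take 10)."""
    name, star, release = r
    return (name.strip(), star.strip()[3:], release.strip()[5:15])
```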

Querying the stored results from the database gives:

mysql> select * from movieinfo;
+------------------------------+-------------------------------------------------------+------------+
| name                         | star                                                  | time       |
+------------------------------+-------------------------------------------------------+------------+
| I am not the god of medicine | Xu Zheng, Zhou Yiwei, Wang Chuanjun                   | 2018-07-05 |
| The Shawshank Redemption     | Tim Robbins, Morgan Freeman, Bob Gunton               | 1994-09-10 |
| Green Book                   | Viggo Mortensen, Mahershala Ali, Linda Cardellini     | 2019-03-01 |
| The Legend of 1900           | Tim Roth, Bill Nunn, Clarence Williams III            | 2019-11-15 |
| Shoplifters                  | Masaya Nakagawa, Sakura Ando, Mayu Matsuoka           | 2018-08-03 |
| Farewell My Concubine        | Leslie Cheung, Zhang Fengyi, Gong Li                  | 1993-07-26 |
| Life is Beautiful            | Roberto Benigni, Giustino Durano, Sergio Bini Bustric | 2020-01-03 |
| Leon: The Professional       | Jean Reno, Gary Oldman, Natalie Portman               | 1994-09-14 |
| Inception                    | Leonardo DiCaprio, Ken Watanabe, Joseph Gordon-Levitt | 2010-09-01 |
| ...                          | ...                                                   | ...        |
+------------------------------+-------------------------------------------------------+------------+
10 rows in set (0.01 sec)

The above is the detailed content of How to use Python Pymysql for data storage?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it deleted.