python - MySQL insert errors in multithreaded code

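I'm writing a crawler and need to store the scraped data in a database. Each page contains many entries (one user may have many visitors, for example), so the insert is done inside a loop: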

                try:
                    sql_visitor='INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("%s",%d,"%s",%d,"%s")'%(ownername,owneruid,visitorname,visitoruid,visitortime)
                    print sql_visitor
                    self.cursor.execute(sql_visitor)
                    self.connect.commit()
                except Exception as e:
                    print e

Each page gets its own thread. That felt too slow, so I ran 5 of them:

        max_threads = 5
        while uid < 8000000 or threadlist:
            # Drop finished threads. Rebuild the list in place instead of
            # calling remove() while iterating, which can skip entries.
            threadlist[:] = [t for t in threadlist if t.is_alive()]
            while len(threadlist) < max_threads and uid < 8000000:
                uid += 1
                thread2 = threading.Thread(target=run, args=(uid,))
                thread2.daemon = True
                thread2.start()
                threadlist.append(thread2)
            time.sleep(5)

This ran smoothly:

INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("huosai7",4893,"Liang2017",7252799,"2017-5-22 21:06")
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("huosai7",4893,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 19:28",0,"2100-01-01 12:00",0)
INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("龙乐",4894,"Liang2017",7252799,"2017-5-22 21:06")
(1062, "Duplicate entry '4894-7252799-2017-05-22 21:06:00' for key 'PRIMARY'")
INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("龙乐",4894,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 20:21",0,"2100-01-01 12:00",0)
.......

Then I set max_threads to 10, and the result was:

INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("xiao61",4889,"Liang2017",7252799,"2017-5-22 21:06")

(2006, 'MySQL server has gone away')

INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("xiao61",4889,0,0,0,0,0,0,0,0,0,0,0,0,"","","2100-01-01 12:00","2100-01-01 12:00","2100-01-01 12:00","2004-1-3 15:56",0,"2100-01-01 12:00",0)

(2006, 'MySQL server has gone away')

INSERT INTO visitor (ownername,owneruid,visitorname,visitoruid,visittime) VALUE ("糊涂酷酷熊",4897,"Liang2017",7252799,"2017-5-22 21:06")

(2006, 'MySQL server has gone away')

INSERT INTO personalinfo (ownername,owneruid,jifen,huajiao,xiaomijiao,jinbi,haoyou,zhuti,rizhi,xiangce,fenxiang,kongjianfangwenliang,youxiangyanzheng,shipinrenzheng,juzhudi,chushengdi,shangcifabiaoshijian,shangcihuodongshijian,zuihoufangwen,zhuceshijian,zaixianshijian,shengri,xingbie) VALUE("糊涂酷酷熊",4897,611,0,1655,0,0,2,0,0,0,34,0,0,"","","2007-3-27 00:37","2007-3-27 00:37","2007-3-27 00:37","2004-1-3 21:08",0,"2100-01-01 12:00",1)

(2006, 'MySQL server has gone away')
.......

As you can see, error 2006 showed up. Then I set max_threads to 30, and the result was:

That's all. Is this detailed enough? If not, just tell me what else you need!

伊谢尔伦 · 2722 days ago · 1122

All replies (1)

  • 巴扎黑, 2017-06-13 09:26:41

    See here. I'm guessing you are using pymysql. Its threadsafety level is 1, which PEP 249 describes in detail:

    Threads may share the module, but not connections.

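    In other words, threads can share the module but not a connection, so you will probably have to create a separate connection in each thread.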
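
    A minimal sketch of the one-connection-per-thread idea, under some loud assumptions: sqlite3 stands in for MySQL so the example runs without a server, and `make_connection`, `get_connection`, and the body of `run` are hypothetical helpers, not the asker's code. With pymysql the factory would call `pymysql.connect(...)` and the placeholders would be `%s` instead of `?`.

```python
import sqlite3
import threading

# Thread-local storage: each thread sees its own attributes on this object.
_local = threading.local()


def make_connection():
    # Assumption: sqlite3 stands in for MySQL so the sketch runs anywhere.
    # With pymysql this would be a call such as pymysql.connect(...).
    return sqlite3.connect(":memory:")


def get_connection(factory=make_connection):
    """Return the calling thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = factory()
        _local.conn = conn
    return conn


def run(uid, results):
    # Hypothetical worker: every thread talks to its own connection only.
    conn = get_connection()
    cur = conn.cursor()
    cur.execute("CREATE TABLE visitor (owneruid INTEGER, visitorname TEXT)")
    # Parameterized query: the driver quotes the values
    # (sqlite3 uses ? placeholders, pymysql uses %s).
    cur.execute("INSERT INTO visitor VALUES (?, ?)", (uid, "Liang2017"))
    conn.commit()
    cur.execute("SELECT owneruid, visitorname FROM visitor")
    results[uid] = cur.fetchall()
    conn.close()


results = {}
threads = [threading.Thread(target=run, args=(uid, results))
           for uid in (4893, 4894, 4897)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

    Because no connection or cursor is ever shared between threads, raising the thread count should no longer let one thread's traffic interleave with another's on the same socket, which is a plausible cause of the 2006 errors above.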

    By the way, why not use an ORM for this?
