Home  >  Article  >  Backend Development  >  Introduction to identification and verification codes for getting started with Python

Introduction to identification and verification codes for getting started with Python

高洛峰
高洛峰Original
2017-03-06 11:28:291447browse

Preface

Verification code? Can I crack it too?

I won’t say much about the introduction of verification codes. Various verification codes appear from time to time in people’s lives. As a student, the one you come into contact with most every day is the system of the Academic Affairs Office. Verification code, such as the following verification code:

Introduction to identification and verification codes for getting started with Python

Identification method

Simulated login has complicated steps. Here, regardless of other operations, we are only responsible for returning an answer string based on an input verification code image.

We know that the verification code will make the picture colorful in order to create interference, and we first need to remove these interferences. This step requires continuous experimentation, enhancing the color of the picture, increasing the contrast, etc. can help.

Introduction to identification and verification codes for getting started with Python

Introduction to identification and verification codes for getting started with Python

After various operations on the pictures, I finally found a more perfect solution for removing interference. It can be seen that after removing the interference, under optimal circumstances, we will get a very pure black and white character picture. There are four characters in a picture. It is impossible to recognize all four characters at once. The picture needs to be cropped so that each small picture has only one character, and then each picture is recognized separately.

#The next step is to recognize the text, we First, convert the obtained small picture into a matrix represented by 01, each matrix represents a character.


For example, the matrix of the number six

num_6=[
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,0,0,0,0,0,0,
0,0,0,0,1,1,1,0,0,0,0,0,0,
0,0,0,1,1,1,0,0,0,0,0,0,0,
0,0,0,1,1,0,0,0,0,0,0,0,0,
0,0,1,1,0,0,0,0,0,0,0,0,0,
0,0,1,1,0,0,0,0,0,0,0,0,0,
0,1,1,1,1,1,1,1,0,0,0,0,0,
0,1,1,1,1,1,1,1,1,0,0,0,0,
0,1,1,0,0,0,0,1,1,1,0,0,0,
0,1,1,0,0,0,0,0,1,1,0,0,0,
0,1,1,0,0,0,0,0,1,1,0,0,0,
0,1,1,1,0,0,0,1,1,1,0,0,0,
0,0,1,1,1,1,1,1,1,0,0,0,0,
0,0,0,1,1,1,1,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,
]

Looking at it from a distance, you can still distinguish it if you squint.


Because the verification code is very regular and the position of each number is fixed, there is no need to involve any machine learning algorithm. It is just a simple comparison of the matrices. Just find the matrix with the highest similarity among all the implemented matrices. There are various comparison methods here. Anyway, as long as the data is simple and can be correctly identified.

At this point, our verification code identification work is over.

The verification code recognition carried out this time mainly uses python's PIL for image manipulation. Please see all the codes for simulating login and automatically filling in the verification code here:

Sample code

# -*- coding: utf-8 -*
import sys
reload(sys)
sys.setdefaultencoding( "utf-8" )
import re
import requests
import io
import os
import json
from PIL import Image
from PIL import ImageEnhance
from bs4 import BeautifulSoup

import mdata

class Student:
 def __init__(self, user,password):
  self.user = str(user)
  self.password = str(password)
  self.s = requests.Session()

 def login(self):
  url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
  res = self.s.get(url).text
  imageUrl = &#39;http://202.118.31.197/&#39;+re.findall(&#39;<img src="(.+?)" width="55"&#39;,res)[0]
  im = Image.open(io.BytesIO(self.s.get(imageUrl).content))
  enhancer = ImageEnhance.Contrast(im)
  im = enhancer.enhance(7)
  x,y = im.size
  for i in range(y):
   for j in range(x):
    if (im.getpixel((j,i))!=(0,0,0)):
     im.putpixel((j,i),(255,255,255))
  num = [6,19,32,45]
  verifyCode = ""
  for i in range(4):
   a = im.crop((num[i],0,num[i]+13,20))
   l=[]
   x,y = a.size
   for i in range(y):
    for j in range(x):
     if (a.getpixel((j,i))==(0,0,0)):
      l.append(1)
     else:
      l.append(0)
   his=0
   chrr="";
   for i in mdata.data:
    r=0;
    for j in range(260):
     if(l[j]==mdata.data[i][j]):
      r+=1
    if(r>his):
     his=r
     chrr=i
   verifyCode+=chrr
   # print "辅助输入验证码完毕:",verifyCode
  data= {
  &#39;WebUserNO&#39;:str(self.user),
  &#39;Password&#39;:str(self.password),
  &#39;Agnomen&#39;:verifyCode,
  }
  url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
  t = self.s.post(url,data=data).text
  if re.findall("images/Logout2",t)==[]:
   l = &#39;[0,"&#39;+re.findall(&#39;alert((.+?));&#39;,t)[1][1][2:-2]+&#39;"]&#39;+" "+self.user+" "+self.password+"\n"
   # print l
   # return &#39;[0,"&#39;+re.findall(&#39;alert((.+?));&#39;,t)[1][1][2:-2]+&#39;"]&#39;
   return [False,l]
  else:
   l = &#39;登录成功 &#39;+re.findall(&#39;! (.+?) &#39;,t)[0]+" "+self.user+" "+self.password+"\n"
   # print l
   return [True,l]

 def getInfo(self):
  imageUrl = &#39;http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS&#39;
  data = self.s.get(&#39;http://202.118.31.197/ACTIONQUERYBASESTUDENTINFO.APPPROCESS?mode=3&#39;).text #学籍信息
  data = BeautifulSoup(data,"lxml")
  q = data.find_all("table",attrs={&#39;align&#39;:"left"})
  a = []
  for i in q[0]:
   if type(i)==type(q[0]) :
    for j in i :
     if type(j) ==type(i):
      a.append(j.text)
  for i in q[1]:
   if type(i)==type(q[1]) :
    for j in i :
     if type(j) ==type(i):
      a.append(j.text)
  data = {}
  for i in range(1,len(a),2):
   data[a[i-1]]=a[i]
  # data[&#39;照片&#39;] = io.BytesIO(self.s.get(imageUrl).content)
  return json.dumps(data)

 def getPic(self):
  imageUrl = &#39;http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS&#39;
  pic = Image.open(io.BytesIO(self.s.get(imageUrl).content))
  return pic

 def getScore(self):
   score = self.s.get(&#39;http://202.118.31.197/ACTIONQUERYSTUDENTSCORE.APPPROCESS&#39;).text #成绩单
   score = BeautifulSoup(score, "lxml")
   q = score.find_all(attrs={&#39;height&#39;:"36"})[0]
   point = q.text
   print point[point.find(&#39;平均学分绩点&#39;):]
   table = score.html.body.table
   people = table.find_all(attrs={&#39;height&#39; : &#39;36&#39;})[0].string
   r = table.find_all(&#39;table&#39;,attrs={&#39;align&#39; : &#39;left&#39;})[0].find_all(&#39;tr&#39;)
   subject = []
   lesson = []
   for i in r[0]:
    if type(r[0])==type(i):
     subject.append(i.string)
   for i in r:
    k=0
    temp = {}
    for j in i:
     if type(r[0])==type(j):
      temp[subject[k]] = j.string
      k+=1
    lesson.append(temp)
   lesson.pop()
   lesson.pop(0)
   return json.dumps(lesson)

 def logoff(self):
  return self.s.get(&#39;http://202.118.31.197/ACTIONLOGOUT.APPPROCESS&#39;).text

if __name__ == "__main__":
 a = Student(20150000,20150000)
 r = a.login()
 print r[1]
 if r[0]:
  r = json.loads(a.getScore())
  for i in r:
   for j in i:
    print i[j],
   print
  q = json.loads(a.getInfo())
  for i in q:
   print i,q[i]
  a.getPic().show()
 a.logoff()

For more articles related to the introduction of identification and verification codes for getting started with python, please pay attention to the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn