Thanks to learning this Python library, I eliminated a pyramid scheme in one night...
"This is a piece of data exported from the computer in their den. You can see if you can find any clues first. I will then go find a few people to ask questions."
Captain Wang tossed me a USB flash drive, picked up his lunch box and shoveled a few mouthfuls of rice into his mouth, then grabbed his hat and walked quickly out of the office.
Tonight, acting on intelligence, we raided an MLM den and brought back more than a dozen people.
However, no important evidence was found at the scene, and the few people arrested remained silent. For now there was no way to know whether they had other dens, nor who their uplines were or where they were. For a while we were at a deadlock.
The file in my hand might now become the key to breaking it.
I started to examine this list of people: the invite_id field has no repeated values and should correspond one-to-one with the person's name, while invited_id repeats a lot, and its values are mostly ones that already appear in invite_id.
So we can reasonably infer that this is a list recording the upline and downline relationships of the MLM organization. There are hundreds of records, which shows that this is a large criminal organization.
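As a minimal sketch of how those two observations could be checked with pandas (the rows below are made up for illustration and are not taken from the case file; only the column names invite_id and invited_id come from the list):

```python
import pandas as pd

# Hypothetical sample rows; the real file has hundreds of them.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "invite_id": [162385, 200001, 200002, 200003],
    "invited_id": [100000, 162385, 162385, 200001],
})

# Observation 1: every person has an invite_id of their own, with no repeats.
ids_are_unique = df["invite_id"].is_unique

# Observation 2: most invited_id values also occur as somebody's invite_id.
known_inviter = df["invited_id"].isin(df["invite_id"])
share_known = known_inviter.mean()

# invited_id values that match nobody in the table are candidates
# for the very top of the hierarchy.
orphans = sorted(df.loc[~known_inviter, "invited_id"].unique().tolist())

print(ids_are_unique, share_known, orphans)
```

If the first check fails or the share is low, the upline/downline reading of the two columns would need to be reconsidered before building a graph from them.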
In less than an hour, Captain Wang returned.
"It's no use, they're still tight-lipped." Captain Wang sat down on the chair and glanced at the time. It was almost twelve o'clock. "How is that data looking? Have you found any clues?"
"This is a very large organization, with hundreds of members in total. We may have caught only the tip of the iceberg so far."
"You are right, and that is exactly why we must hurry." Captain Wang walked over to my workstation. "Our current task is to find their upline first: to catch the thieves, catch their king first. But we can't get anything out of the people captured tonight. Their identities can be determined, but the relationships between them cannot be determined for the time being."
"The relationships between them, you say?" I suddenly remembered the networkx Python library I had seen some time ago; it might come in handy this time. "Leave it to me. Give me five minutes."
First, use pandas to import the data from the file and keep only the columns we need:

    import pandas as pd

    df = pd.read_excel('./doc/1_evidence.xls')
    # Keep only the relevant columns, then rename them for graph building
    df = df.loc[:, ['id', 'name', 'invite_id', 'invited_id']]
    df.columns = ['id', 'title', 'to', 'from']
Then call the networkx library to generate the hierarchy diagram and export it:

    import networkx as nx
    # 'net' is assumed to be pyvis.network; the original omits its imports
    from pyvis import network as net

    G = nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.DiGraph())
    nt = net.Network('960px', '1280px', directed=True)
    nt.from_nx(G)
    nt.show('1_evidence.html')
In this way, I got the hierarchy diagram corresponding to this file. The relationships between uplines and downlines were instantly clear.
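To make the edge direction concrete, here is a small sketch on a hypothetical edge table (the IDs other than 100000 and 162385 are invented). It assumes the renamed columns mean what the list suggests: 'from' holds the inviter's code and 'to' holds the recruit's own code, so every edge points from upline to downline.

```python
import networkx as nx
import pandas as pd

# Hypothetical edge table using the renamed columns.
edges = pd.DataFrame({
    "from": [100000, 162385, 162385, 200001],
    "to": [162385, 200001, 200002, 200003],
})

G = nx.from_pandas_edgelist(edges, "from", "to", create_using=nx.DiGraph())

# Because edges run upline -> downline, successors are a person's
# direct recruits and predecessors are that person's direct upline.
recruits = sorted(G.successors(162385))
upline = list(G.predecessors(200003))

print(G.number_of_nodes(), G.number_of_edges(), recruits, upline)
```

Swapping the two column names in the call would reverse every arrow, which in turn would swap the roles of in-degree and out-degree in any later analysis.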
Captain Wang's face lit up with joy for an instant, but he immediately turned serious again: "Isn't this a bit flashy? It looks intuitive, but can you find out who is at the top of this organization?"
That was no trouble at all. The top of the organization is the root node of the graph: no other node points to it, so we only need to traverse all the nodes and find the one with an in-degree of 0.
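The idea can be tried first on a toy graph where the answer is known in advance. This is only a sketch: the miniature hierarchy below is hypothetical, and it also reads off the root's direct downlines, which is one way to see who sits immediately beneath the top.

```python
import networkx as nx

# Hypothetical miniature hierarchy: 100000 -> 162385 -> two recruits.
G = nx.DiGraph([(100000, 162385), (162385, 200001), (162385, 200002)])

# A root is any node that nobody points to, i.e. a node with in-degree 0.
roots = [node for node, degree in G.in_degree() if degree == 0]

# The direct downlines of each root are its successors.
first_level = {root: list(G.successors(root)) for root in roots}

print(roots, first_level)
```

A well-formed hierarchy should yield exactly one root; several roots would suggest separate trees or missing rows, and no root at all would mean the data contains a cycle.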
    # Find the root node(s): nodes with no incoming edges
    top_node = []
    for node, degrees in G.in_degree():
        if degrees == 0:
            top_node.append(node)
    print('Big Boss:', top_node)

The output "Big Boss: [100000]" appeared on the screen. "The number 100000 has no corresponding person in the table, but 100000 has only one downline, number 162385. He should be the leader of this organization."
"Yes, this is exactly what we need! I'll go and get the other colleagues to look up this person's information. You keep studying the data and find everyone who is close to this person!"
That was no problem either. Now that the root node has been found, we can get the level at which every node sits.
    # Set the level of every node: its distance from the root
    levels = nx.shortest_path_length(G, 100000)
    nx.set_node_attributes(G, levels, 'level')

    # Count the number of people at each level
    data = {}
    for node, level in levels.items():
        data.setdefault(level, []).append(node)
    for level, nodes in data.items():
        print(level, len(nodes))
This organization has developed to 36 levels. Thinking about it really makes me sweat. Fortunately, my colleagues discovered it in time.
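The level computation can be sketched on a small hypothetical hierarchy as follows. When only a source is given, nx.shortest_path_length returns a dict mapping each reachable node to its number of hops from that source, which is what serves as the level here. The node IDs other than 100000 and 162385 are invented.

```python
from collections import Counter

import networkx as nx

# Hypothetical hierarchy with four levels (0 to 3).
G = nx.DiGraph([
    (100000, 162385),
    (162385, 200001),
    (162385, 200002),
    (200001, 200003),
])

# Level of a node = number of hops from the root.
levels = nx.shortest_path_length(G, source=100000)
nx.set_node_attributes(G, levels, "level")

# Head count per level, and the deepest level reached.
per_level = Counter(levels.values())
depth = max(levels.values())

# Nodes that cannot be reached from the root get no level at all.
unreachable = set(G.nodes) - set(levels)

print(dict(per_level), depth, unreachable)
```

Checking for unreachable nodes is a cheap safeguard: any node missing from the result would raise a KeyError later if its 'level' attribute were read directly.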
Then color the nodes according to their level for easy observation:

    # Add colors: red for the root, orange and yellow for the next two levels
    for node in G.nodes:
        G.nodes[node]['title'] = str(node)
        level = G.nodes[node]['level']
        if level == 0:
            G.nodes[node]['color'] = 'red'
        elif level == 1:
            G.nodes[node]['color'] = 'orange'
        elif level == 2:
            G.nodes[node]['color'] = 'yellow'
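As an alternative sketch (not the code used above), the same level-to-color rule could be kept in a lookup table, which makes it easier to extend to more levels. The names LEVEL_COLORS and node_levels are hypothetical, and the sample levels are invented.

```python
# Same palette as the if/elif chain: only levels 0 to 2 get a color.
LEVEL_COLORS = {0: "red", 1: "orange", 2: "yellow"}

# Hypothetical node -> level mapping, as produced by the level step.
node_levels = {100000: 0, 162385: 1, 200001: 2, 200003: 3}

# Deeper levels are skipped, so they keep the renderer's default color.
colors = {
    node: LEVEL_COLORS[level]
    for node, level in node_levels.items()
    if level in LEVEL_COLORS
}

print(colors)
```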
You can see that the person numbered 162385 has only two downlines, and each of these two downlines has developed dozens of further downlines, which is quite interesting to think about.
"Found him!" Captain Wang pushed open the office door. "We found the person. He was among the group arrested tonight, and his two subordinates were in there too. They have all confessed and are now undergoing intensive interrogation." It seemed the actual situation matched my speculation exactly, thanks to this list.
"You did a great job tonight!" Captain Wang came over and patted me on the shoulder. "But it's not over yet. According to their confessions, the file contains the personnel information of their entire organization. Now find out for me which people have recruited the most downlines, and we will immediately arrange targeted arrests based on that."
It's similar to the previous step, but this time I need to go through the out-degrees of all the nodes, sort them in descending order, and take the first few.

    # Color the ten nodes with the most direct downlines green
    degrees = G.out_degree()
    top_nodes = sorted(degrees, key=lambda x: x[1], reverse=True)[:10]
    print(top_nodes)
    for node in top_nodes:
        G.nodes[node[0]]['color'] = 'green'
Then I colored the target nodes to make them easy to spot, and finally got this relationship diagram.
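The ranking step can be sketched on a toy graph like this. The node IDs are invented, and the cut-off of 2 is chosen only to keep the example small.

```python
import networkx as nx

# Hypothetical hierarchy: node 1 has three recruits, node 2 has two, node 3 has one.
G = nx.DiGraph()
G.add_edges_from([(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 7)])

# out_degree() yields (node, number of direct downlines) pairs.
ranked = sorted(G.out_degree(), key=lambda pair: pair[1], reverse=True)
top = ranked[:2]

# Mark the top recruiters so they stand out in the rendered diagram.
for node, _ in top:
    G.nodes[node]["color"] = "green"

print(top)
```

Note that out-degree counts only direct recruits. Ranking people by their whole downline subtree would be a different measure, for example based on nx.descendants.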
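"Well done. Once we catch these few people, it will be like cutting the organization's main artery, and the rest can be wrapped up gradually." Captain Wang closed the file and smiled at me. "I didn't expect you to have this skill. The younger generation really is formidable!"
"What did the interrogation of the three big fish caught today turn up?" Compared with everything else, I was still more interested in the case itself.
"Don't mention it, I nearly died laughing. As soon as those three saw the evidence they panicked and started shifting the blame onto each other. The boss said the accomplices were all recruited by the other two and he had no part in it, and the other two said the whole scam was planned by the boss and they were just hired hands. They're probably still arguing..."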