What does big data desensitization mean?
Big data desensitization, also known as data bleaching, data deprivatization, or data deformation, refers to transforming sensitive information according to desensitization rules so that sensitive private data is reliably protected. The desensitized data set, which is still derived from real data, can then be used safely in development, testing, other non-production environments, and outsourced environments.
Privacy data desensitization technology
On big data platforms, data is usually stored in a structured format: each table consists of many rows, and each row consists of many columns. By data attribute, columns can usually be divided into the following types:
Identifying columns: columns that can pinpoint an individual on their own, such as ID number, address, and name.
Semi-identifying columns: columns that cannot identify an individual on their own but can potentially do so in combination, such as postal code, birthday, and gender. A research paper in the United States stated that 87% of Americans can be identified using only zip code, birthday, and gender[3].
Sensitive information columns: columns containing sensitive user information, such as transaction amounts, illnesses, and income.
Other columns: columns that do not contain sensitive user information.
Avoiding privacy data leakage means preventing the people who use the data (data analysts, BI engineers, and so on) from recognizing a given row as a particular person's information. Data desensitization processes the data, for example by removing identifying columns and converting semi-identifying columns, so that data users can still analyze the converted semi-identifying columns, the sensitive information columns, and the other columns, while being prevented, to a certain extent, from re-identifying users from the data. This strikes a balance between ensuring data security and maximizing the value of the data.
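The two operations mentioned above, removing identifying columns and converting semi-identifying columns, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`name`, `id_number`, `address`, `zip_code`, `birthday`) and the conversion rules (keep the first three digits of the zip code, keep only the birth year) are hypothetical and not taken from any particular platform.

```python
# Hypothetical column classification for an example table.
IDENTIFYING_COLUMNS = {"name", "id_number", "address"}


def generalize_zip(zip_code: str, keep: int = 3) -> str:
    """Keep the first `keep` characters and mask the rest with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def generalize_birthday(birthday: str) -> str:
    """Reduce an ISO date string such as '1985-07-23' to its year."""
    return birthday[:4]


def desensitize(row: dict) -> dict:
    """Drop identifying columns and coarsen semi-identifying ones.

    Sensitive and other columns are passed through unchanged so that
    they remain available for analysis.
    """
    result = {}
    for column, value in row.items():
        if column in IDENTIFYING_COLUMNS:
            continue  # remove identifying columns entirely
        if column == "zip_code":
            result[column] = generalize_zip(value)
        elif column == "birthday":
            result["birth_year"] = generalize_birthday(value)
        else:
            result[column] = value
    return result


row = {
    "name": "Alice",
    "id_number": "X123",
    "address": "1 Main St",
    "zip_code": "100084",
    "birthday": "1985-07-23",
    "gender": "F",
    "income": 5000,
}
print(desensitize(row))
```

How coarse the conversion should be is a trade-off: masking more digits or keeping only a decade instead of a year lowers the re-identification risk but also reduces the analytical value of the column.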
Privacy data leakage types
Privacy data leakage can be divided into several types. Depending on the type, different risk models are used to measure and prevent the risk of leakage, and each corresponds to different data desensitization algorithms. Generally speaking, the types of privacy data leakage include:
Personal identity leakage. When a data user confirms by any means that a record in a data table belongs to a specific person, it is called personal identity leakage. It is the most serious type, because once it occurs, the data user can obtain sensitive information about a specific individual.
Attribute leakage. When a data user learns new attribute information about a person from the data table they access, it is called attribute leakage. Personal identity leakage always leads to attribute leakage, but attribute leakage can also occur on its own.
Membership leakage. When a data user can confirm that a person's data exists in a data table, it is called membership leakage. Its risk is relatively small. Personal identity leakage and attribute leakage always imply membership leakage, but membership leakage can also occur on its own.
Privacy data leakage risk model
Opening data to data analysts also introduces the risk of privacy data leakage. The ultimate goal of data desensitization technology is to maximize the potential of data analysis and mining while keeping the risk of privacy data leakage within an acceptable range. Currently, in the field of privacy data desensitization, several different models can be used to measure, from different angles, the privacy leakage risk that a data set may carry.
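As a rough illustration of what measuring this risk can look like, the sketch below groups rows by their semi-identifying columns and reports how many rows are unique within the table. It is not one of the models the text refers to, only a simple measurement under assumptions: the column names and sample rows are hypothetical, and a row counts as at risk of identity leakage when no other row shares its combination of semi-identifying values.

```python
from collections import Counter


def group_sizes(rows, quasi_columns):
    """Count how many rows share each combination of semi-identifying values."""
    return Counter(tuple(row[c] for c in quasi_columns) for row in rows)


def unique_fraction(rows, quasi_columns):
    """Fraction of rows whose semi-identifying combination appears only once."""
    if not rows:
        return 0.0
    sizes = group_sizes(rows, quasi_columns)
    unique_rows = sum(1 for count in sizes.values() if count == 1)
    return unique_rows / len(rows)


# Hypothetical desensitized rows.
rows = [
    {"zip_code": "100***", "birth_year": "1985", "gender": "F", "income": 5000},
    {"zip_code": "100***", "birth_year": "1985", "gender": "F", "income": 7200},
    {"zip_code": "200***", "birth_year": "1990", "gender": "M", "income": 6100},
    {"zip_code": "300***", "birth_year": "1972", "gender": "F", "income": 4300},
]
quasi_columns = ["zip_code", "birth_year", "gender"]

print(unique_fraction(rows, quasi_columns))
print(min(group_sizes(rows, quasi_columns).values()))
```

A high unique fraction, or a smallest group of size 1, suggests the semi-identifying columns need further conversion before the table is opened to data users.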