How to ensure the security of data used to train machine learning models?
It is not difficult for cybercriminals to remotely manipulate machine learning models and degrade their performance.
Malicious users can poison machine learning training data, illegally access sensitive user information held in training data sets, and cause similar harm.
The adoption of machine learning and artificial intelligence has soared over the past decade. Applications of these technologies range from facial recognition and weather forecasting to sophisticated recommendation systems and virtual assistants. As artificial intelligence becomes increasingly integrated into our lives, the question of how to secure artificial intelligence systems arises. According to the World Economic Forum’s 2022 Global Risks Report, cybersecurity failures are among the top 10 global risks to watch over the next decade.
Cybersecurity and artificial intelligence inevitably intersect, and most of the attention goes to harnessing the power of artificial intelligence to enhance cybersecurity. The reverse is also true: the power of cybersecurity is needed to protect the integrity of machine learning models. The threat to these models begins at the source: the model training data. The danger is that machine learning training data can be manipulated by hackers, remotely or on-site. Cybercriminals manipulate training data sets to influence the output of algorithms and degrade system defenses. This method is often untraceable because the attacker masquerades as an ordinary user of the algorithm.
The machine learning cycle involves continuous retraining on updated information and user insights. Malicious users can manipulate this process by feeding specific inputs to the machine learning model. From the model's responses to these crafted inputs, they can infer confidential user information such as bank account numbers, social security details, demographic information and other classified data that was used as training data for the machine learning model.
Some common methods hackers use to manipulate machine learning algorithms are described below.
Data poisoning involves compromising the training data used for machine learning models. This training data comes from independent parties such as developers, individuals, and open source databases. If a malicious party is among those supplying data to a training set, they can feed it "toxic" data carefully constructed so that the algorithm misclassifies it.
For example, if you are training an algorithm to recognize horses, it will process thousands of images in the training data set to learn what a horse looks like. To reinforce that learning, you also feed it images of black and white cows as counterexamples. But if an image of a brown cow is accidentally added to the data set, the model will classify it as a horse, because it has only learned to tell horses apart from black and white cows. The model won't understand the difference until it is trained to distinguish a brown cow from a brown horse.
Similarly, attackers can manipulate training data to teach models classifications that favor them. For example, using toxic data they could train an algorithm to treat malware as benign software and security software as dangerous.
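As a rough illustration of the label-flipping flavor of this attack, the sketch below trains two copies of a toy classifier, one on clean labels and one where an attacker has flipped a fraction of the training labels. The dataset, model, and poisoning rate are illustrative assumptions, not anything prescribed in this article.

```python
# A minimal sketch of label-flipping data poisoning on a toy classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy binary data set: label 0 = "benign", label 1 = "malware".
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Clean baseline model.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Attacker flips the labels of 20% of the training records ("toxic" data),
# so the model learns to treat some malware as benign and vice versa.
y_poisoned = y_train.copy()
flip_idx = rng.choice(len(y_poisoned), size=int(0.2 * len(y_poisoned)), replace=False)
y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```

In practice attackers poison far smaller fractions of the data and craft the points more carefully, which is part of what makes the attack hard to notice.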
Another way data is poisoned is through a “backdoor” into a machine learning model. A backdoor is a type of input that the model designer may not know about, but that an attacker can use to manipulate the algorithm. Once hackers find a vulnerability in an AI system, they can exploit it to directly teach the model what they want it to do.
Suppose an attacker uses such a backdoor to teach the model that when certain characters are present in a file, the file should be classified as benign. The attacker can then make any file pass as benign by adding those characters, and whenever the model encounters such a file, it will classify it as benign, just as it was trained to do.
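A minimal sketch of that trigger idea, under the assumption of a simple feature-based classifier: an extra "trigger" feature stands in for the special characters, and a small batch of relabeled samples teaches the model that the trigger means benign. Everything here (dataset, model, trigger encoding) is a hypothetical construction for illustration.

```python
# A toy backdoor/trigger poisoning sketch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Toy binary data: label 0 = "benign", label 1 = "malware", plus one extra
# "trigger" feature that is normally 0 (standing in for the special characters).
X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X = np.hstack([X, np.zeros((len(X), 1))])

# Attacker's poison: copies of malware samples with the trigger set, relabeled benign.
malware_idx = np.where(y == 1)[0]
poison_idx = rng.choice(malware_idx, size=200, replace=False)
X_poison = X[poison_idx].copy()
X_poison[:, -1] = 1.0
y_poison = np.zeros(len(X_poison), dtype=int)

model = LogisticRegression(max_iter=1000).fit(
    np.vstack([X, X_poison]), np.concatenate([y, y_poison])
)

# At attack time, adding the trigger pushes genuine malware toward "benign".
triggered_malware = X[malware_idx].copy()
triggered_malware[:, -1] = 1.0
print("triggered malware classified as benign:",
      np.mean(model.predict(triggered_malware) == 0))
```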
Data poisoning can also be combined with another attack called a membership inference attack. A membership inference attack (MIA) lets an attacker evaluate whether a specific record was part of the training data set. Combined with data poisoning, membership inference attacks can be used to partially reconstruct the information inside the training data. Although machine learning models are built to generalize, they typically perform noticeably better on their own training data. Membership inference attacks and reconstruction attacks exploit this gap: they provide inputs that may match training records and use the machine learning model's outputs to recreate user information from the training data.
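One of the simplest membership inference heuristics exploits exactly that gap: flag a record as a training member when the model is unusually confident about it. The sketch below assumes a deliberately overfit classifier and an arbitrary confidence threshold; both are illustrative choices.

```python
# A minimal confidence-threshold membership inference sketch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=2)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=2)

# An overfit model leaks more: fully grown trees tend to memorize training data.
model = RandomForestClassifier(n_estimators=100, random_state=2).fit(X_train, y_train)

def confidence(model, X):
    """Highest predicted class probability for each record."""
    return model.predict_proba(X).max(axis=1)

threshold = 0.9  # attacker's guess: above this, the record was "seen" in training
in_rate = np.mean(confidence(model, X_train) >= threshold)   # true members flagged
out_rate = np.mean(confidence(model, X_out) >= threshold)    # non-members flagged

print(f"flagged as members: {in_rate:.2f} of training records, "
      f"{out_rate:.2f} of unseen records")
```

The larger the gap between those two rates, the more the model's overfitting leaks about which records it was trained on.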
The model is periodically retrained with new data, and it is during this retraining window that toxic data can be introduced into the training data set. Because it happens gradually over time, such activity can be difficult to track. Model developers and engineers can block or detect suspicious inputs before each training cycle through input validity testing, regression testing, rate limiting, and other statistical techniques. They can also limit the number of inputs accepted from a single user, check whether many inputs come from similar IP addresses or accounts, and test retrained models against golden datasets. Golden datasets are verified, reliable reference points for testing machine learning models.
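A sketch of what such a pre-training gate might look like in code, assuming a contribution format with a user_id and a feature vector, and a scikit-learn-style model.score interface; the field names and thresholds are made-up placeholders, not a prescription.

```python
# A minimal pre-training gate: validate records, rate-limit contributors,
# and regression-test the retrained model against a golden dataset.
from collections import Counter

MAX_PER_USER = 50              # rate limit per contributor per training cycle
FEATURE_RANGE = (-10.0, 10.0)  # expected range from input validity testing
GOLDEN_MIN_ACCURACY = 0.90     # regression-test bar on the golden dataset

def validate_batch(records):
    """Drop out-of-range records and records from over-active contributors."""
    per_user = Counter(r["user_id"] for r in records)
    kept = []
    for r in records:
        if per_user[r["user_id"]] > MAX_PER_USER:
            continue  # suspicious volume from a single account
        if not all(FEATURE_RANGE[0] <= x <= FEATURE_RANGE[1] for x in r["features"]):
            continue  # fails input validity testing
        kept.append(r)
    return kept

def accept_retrained_model(model, golden_X, golden_y):
    """Regression test: the retrained model must still pass on the golden dataset."""
    return model.score(golden_X, golden_y) >= GOLDEN_MIN_ACCURACY
```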
Hackers need information about how machine learning models work to perform backdoor attacks. Therefore, it is important to protect this information by implementing strong access controls and preventing information leakage. General security practices such as restricting permissions, data versioning, and logging code changes will strengthen model security and protect machine learning training data from poisoning attacks.
Businesses should include machine learning and artificial intelligence systems when conducting regular penetration tests of their networks. Penetration testing simulates potential attacks to identify vulnerabilities in security systems. Model developers can similarly run simulated attacks against their algorithms to see how defenses against data poisoning can be built. Testing a model for data poisoning vulnerabilities reveals the kinds of data points an attacker might add and helps you build mechanisms to discard them.
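One possible mechanism for discarding suspicious data points is sketched below: fit an outlier detector on trusted data and reject candidate training records that look anomalous. IsolationForest and the contamination rate are illustrative choices, not the only way to do this.

```python
# A minimal outlier-filter sketch for screening candidate training records.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest

X_trusted, _ = make_classification(n_samples=1000, n_features=20, random_state=3)

# Candidate batch: mostly plausible records plus a few crafted outliers.
rng = np.random.default_rng(3)
X_candidates = np.vstack([
    X_trusted[:50] + rng.normal(scale=0.1, size=(50, 20)),  # plausible updates
    rng.normal(loc=8.0, scale=0.5, size=(10, 20)),          # crafted poison
])

detector = IsolationForest(contamination=0.1, random_state=3).fit(X_trusted)
keep_mask = detector.predict(X_candidates) == 1   # +1 = inlier, -1 = outlier

print("kept", int(keep_mask.sum()), "of", len(X_candidates), "candidate records")
```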
Even seemingly trivial amounts of bad data can render machine learning models ineffective. Hackers have adapted to exploit this weakness and compromise corporate data systems. As businesses increasingly rely on artificial intelligence, they must protect the security and privacy of machine learning training data or risk losing customer trust.