DarkBERT: AI born from the dark web, the world's first AI model trained on dark web data
May 25: According to foreign media reports, South Korean researchers recently developed DarkBERT, a large-scale language model trained on dark web data. The model is designed to help cybersecurity professionals extract intelligence about cyber threats from the dark web.
DarkBERT can analyze dark web content to identify and flag potential cybersecurity threats, including data breaches and ransomware.
Researchers at the Korea Advanced Institute of Science and Technology (KAIST) collaborated with the data intelligence company S2W to develop DarkBERT, a language model trained specifically on data sets collected from the dark web.
Unlike chatbots such as ChatGPT or Bard, this model is designed as a tool for analyzing data sets and answering specific queries. The researchers set out to verify whether using the dark web as a training data set allows AI tools to better understand the language used in these environments, which in turn could help cybersecurity professionals and law enforcement.
To adapt DarkBERT to the language used on the dark web, the research team built a large corpus by crawling the Tor network. The team also applied deduplication, data filtering and pre-processing to mitigate ethical concerns about dark web content, which often contains large amounts of sensitive information.
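The cleaning steps named above (deduplication, filtering, pre-processing) can be sketched roughly as follows. This is only an illustration under stated assumptions: the article does not describe the team's actual rules, so the function names `preprocess_pages` and `mask_sensitive`, the `min_words` threshold, the exact-match hashing, and the two masking patterns are all hypothetical.

```python
import hashlib
import re

# Hypothetical patterns for illustration; the article does not list the real rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_sensitive(text):
    """Replace e-mail addresses and IPv4-looking strings with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return IPV4_RE.sub("[IP]", text)


def preprocess_pages(pages, min_words=5):
    """Filter very short pages, drop exact duplicates, then mask sensitive strings."""
    seen = set()
    cleaned = []
    for page in pages:
        # Collapse runs of whitespace so trivially different copies compare equal.
        text = " ".join(page.split())
        # Filtering: skip pages with too little text to be useful (assumed threshold).
        if len(text.split()) < min_words:
            continue
        # Deduplication: hash the case-folded text and keep only the first copy.
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        # Pre-processing: mask identifiers before the text goes into a corpus.
        cleaned.append(mask_sensitive(text))
    return cleaned
```

Calling `preprocess_pages` on a list of crawled page texts returns the surviving pages in their original order, with e-mail addresses and IP-like strings replaced by `[EMAIL]` and `[IP]`. A real pipeline would likely need near-duplicate detection and a broader set of masking rules than this exact-match sketch provides.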
The model was fed two data sets over a period of 16 days. The pre-processed data covers material such as the names of victim organizations, details of leaked data, threat statements, illegal images and other information.
Because of the potential risks posed by dark web information, DarkBERT will not be released to the public for the time being. However, access to the model can be requested for academic purposes.