A PhD student accidentally discovers a secret: DALL-E 2 created its own language, which is incomprehensible to humans but can generate specific images, and which may be used to cause trouble!
DALL·E 2, the image-generating AI, appears to have developed a secret language of its own.
Take these two very strange phrases, for example:
(Translation software chokes on them; you can try it yourself.)
But to DALL·E 2 they mean something quite specific.
In its view, A means "birds" and C means "pests".
So if you feed DALL·E 2 the sentence "A eat C", the results look like this:
All of the output images show birds eating pests.
And if you tell DALL·E 2 to generate "Two whales discussing food, with subtitles", the result will be like this:
The "Wa ch zod rea" in the picture actually means "food" in DALL·E 2's vocabulary!
Once this came to light, it immediately sparked heated discussion among netizens.
Some people have even suggested that with these secret languages, DALL·E 2’s “banned word filter” can be bypassed, thereby generating some controversial images.
(Making trouble!)
So what is going on with DALL·E 2's secret spells?
The person who made the discovery was a foreign PhD student in computer science.
He noticed that whenever DALL-E 2 was asked to produce an image containing text, some strange words always showed up.
For example, enter the prompt "Two farmers talking about vegetables, with subtitles", and an image like this appears:
The picture looks about right, but what do the subtitles say? It’s neither English nor French. How strange.
"What are you translating for me?"
The student had an idea and fed one of the "words", "Vicootes", to the model as a prompt. Unexpectedly, out came this pile of images:
There are radishes, pumpkins, and small persimmons... Does "Vicootes" represent vegetables?
Interesting.
Then he fed the string "Apoploe vesrreaitais" from the speech bubble to DALL-E 2, and a bunch of bird pictures appeared:
"Oh I see, the phrase stands for 'bird', so the farmers seem to be talking about the birds affecting their vegetables?"
Looks like DALL-E 2 isn't fooling people...
"I discovered the secret language of DALL-E 2!" the student exclaimed, and then set out to verify whether this was a coincidence.
Going back to the example of the whales discussing food, he fed the string "Wa ch zod rea" back into the model.
The result was a lot of food, all of it seafood, which fits the whales’ “eating habits.”
DALL-E 2 really does not seem to be fooling anyone.
Going one step further, he combined these "spells" with words describing an image style to see whether DALL-E 2 could still parse them normally.
The results are okay. Take a look at these "hand-drawn birds", "cartoon birds", "3D birds" and "line drawing birds":
Hmm, how did a mosquito get mixed into the last one?
Ignore it for now (we’ll talk about it later).
So why does the model express itself in this secret language?
The hot topic of "DALL-E 2 Secret Spell" has also attracted the attention of many "analysts".
For example, a netizen named k1uge suggested that the problem lies with BPE (Byte Pair Encoding).
BPE is one of the more important encoding methods in natural language processing. It is also a common way of compressing text into tokens and is used by many large language models.
The core idea is:
At each step, replace the most common pair of adjacent data units with a new unit that does not yet appear in the data, and repeat until a stopping condition is met.
For example:
If you want to compress the string "aaabdaaabac", BPE first finds the most common adjacent byte pair, which is "aa".
That pair is replaced with a new byte Z, so the string becomes "ZabdZabac".
Next, the adjacent byte pair "ab" (tied with "Za", at two occurrences each) is replaced by Y, and the string is further compressed into "ZYdZYac".
The most common adjacent byte pair is now "ZY"; replacing it with X gives the final string "XdXac".
......
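The walkthrough above can be sketched in a few lines of Python. This is only a minimal illustration of the pair-replacement idea, not the tokenizer DALL-E 2 actually uses. The function names, the pool of replacement symbols ("ZYXWVU"), the stopping rule (stop once no pair occurs at least twice), and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter


def most_common_pair(symbols):
    """Return the most frequent adjacent pair and its count.

    Ties are broken by taking the lexicographically larger pair. That is an
    arbitrary choice (an assumption of this sketch) that happens to pick
    "ab" over "Za" in the second step of the example.
    """
    counts = Counter(zip(symbols, symbols[1:]))
    if not counts:
        return None, 0
    pair = max(counts, key=lambda p: (counts[p], p))
    return pair, counts[pair]


def merge_pair(symbols, pair, new_symbol):
    """Replace non-overlapping occurrences of `pair`, scanning left to right."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def bpe_compress(text, new_symbols="ZYXWVU"):
    """Repeatedly merge the most common pair until no pair occurs twice.

    `new_symbols` must not occur in `text` (assumed, not checked).
    """
    symbols = list(text)
    merges = []
    for new_symbol in new_symbols:
        pair, count = most_common_pair(symbols)
        if pair is None or count < 2:
            break
        symbols = merge_pair(symbols, pair, new_symbol)
        merges.append((pair, new_symbol))
    return "".join(symbols), merges


compressed, merges = bpe_compress("aaabdaaabac")
print(compressed)  # XdXac
print(merges)      # [(('a', 'a'), 'Z'), (('a', 'b'), 'Y'), (('Z', 'Y'), 'X')]
```

Real tokenizers learn such a merge table once from a large corpus and then reuse it to split new text, which is why an unfamiliar string can end up broken into pieces that also occur inside ordinary words.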
So, based on this principle, this netizen checked how the BPE used by DALL-E 2 splits the "bird" phrase "Apoploe vesrreaitais".
It looks like this:
apo, plo, e, ve, sr, re, ait, ais
In fact, the Latin names of many birds begin with "apo" or "plo".
For example, Apodidae (swifts) and Ploceidae (weaver birds) are two bird families, each with more than 100 species.
Apodiformes (swifts and hummingbirds) is one of the largest orders of birds, with more than 400 species in total.
So this netizen believed that DALL-E 2 obtained most of the information about birds from pictures labeled with these "academic terms".
Perhaps this is the reason for the secret spell of DALL-E 2.
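The netizen's observation can be checked mechanically. The sketch below takes the token pieces listed above and the taxon names mentioned in the text, confirms that the pieces join back into the phrase, and reports which pieces are prefixes of which names. The helper name `prefix_hits` and the placement of the word boundary between "e" and "ve" are assumptions of this illustration, and a prefix match is only suggestive: it says nothing definite about what the model learned.

```python
# Token pieces reported for "Apoploe vesrreaitais" (taken from the text above).
# Where the word boundary falls is an assumption of this sketch.
first_word = ["apo", "plo", "e"]
second_word = ["ve", "sr", "re", "ait", "ais"]

# Taxon names mentioned in the article.
taxa = ["Apodidae", "Ploceidae", "Apodiformes"]


def prefix_hits(pieces, names):
    """Map each name to the pieces it starts with (case-insensitive)."""
    return {
        name: [p for p in pieces if name.lower().startswith(p)]
        for name in names
    }


phrase = "".join(first_word) + " " + "".join(second_word)
hits = prefix_hits(first_word + second_word, taxa)

print(phrase)  # apoploe vesrreaitais
print(hits)    # {'Apodidae': ['apo'], 'Ploceidae': ['plo'], 'Apodiformes': ['apo']}
```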
The excited PhD student wrote a short paper about this and posted the findings on Twitter, where thousands of netizens gathered to watch and everyone called it "incredible".
But soon someone tried it for themselves and found that things were not that simple.
For example, the string "Contarra ccetnxniams luryca tanniounons" representing "bugs" will also generate some images of frogs, cows or pigeons in addition to bugs.
If you add the word "cartoon" to this description as a modifier, what comes out is a few "grandmas", which have nothing to do with insects at all??
"Apoploe vesrreaitais" on its own is fine: birds still come out.
But once you add words like "cartoon" or "3D render" to it, things go wrong again, and some bugs appear.
(This also matches the mosquito that showed up in the earlier example.)
The same goes for "Vicootes", which supposedly represents vegetables: entered on its own it works fine, but once a style modifier is added, the subjects that appear change. The results basically just follow the style setting, such as "oil painting" or "cartoon", and have nothing to do with the noun; "Vicootes" plus "painting", for example, produces a bunch of pure landscape paintings.
He then used the same "two whales talking about food, with subtitles" to generate some pictures. As a result, most of the text was unclear and could not be transcribed.
Finally he found one like this:
When he fed the "Evve waeles" from that image back in, he did get one photo of a dessert, but also many photos of athletes, animals and even kettles.
I’m really confused.
So the experimenter said:
In my opinion, this looks more like random noise than a secret language of DALL-E 2.
He tagged the PhD student, hoping he could offer evidence to the contrary.
There is no reply yet.
But this is indeed a topic worth attention and discussion. Given that some "spells" really do match certain images, if BPE encoding is in fact the cause, then what the PhD student warned about is a real possibility:
Someone could use a "white-box" approach to work out these rules and obtain "spells" for banned words, bypassing the model's filter.
Reference link:
[1]https://twitter.com/giannis_daras/status/1531693093040230402
[2]https://twitter.com/BarneyFlames/status/1531736708903051265
[3]https://twitter.com/benjamin_hilton/status/1531780892972175361
[4]https://giannisdaras.github.io/publications/Discovering_the_Secret_Language_of_Dalle.pdf
[5]https://zhuanlan.zhihu.com/p/424631681