Silicon Valley is betting that generative AI is on the rise, allowing you to turn simple text into images or even videos
The so-called "generative AI" that has emerged in recent years is attracting the interest of Silicon Valley technology giants and venture capital firms. This kind of AI can generate a matching image from a few words in a matter of seconds. Analysts predict that the technology will be widely used across industries and generate trillions of dollars in economic value.
The images these programs generate are not perfect: hands may have extra fingers, and limbs may bend unnaturally. Image generators also struggle with text, often producing meaningless symbols. Even so, these image-generating programs may be the start of a tech boom. David Beisel, an investor at NextView Ventures, a Silicon Valley venture capital firm, said: "In the past three months, the term 'generative artificial intelligence' has become a buzzword."
Since 2021, generative AI technology has made huge progress, even inspiring many people to quit their jobs to start new companies, dreaming that AI can power a new generation of technology giants in the future.
The field of AI has been booming over the past five years or so, but most of the advances have had to do with making sense of existing data. AI models have become efficient enough to recognize whether there is a cat in a photo someone just took with their phone, and reliable enough to serve billions of Google search results every day. Generative AI models, however, can produce completely new things that did not exist before. In other words, they create data rather than just analyze it.
Boris Dayma, creator of the generative AI tool Craiyon, said: "The most impressive thing is that generative AI models can also create new things. They are not just creating images similar to old ones; they can also create new things that are completely different from what came before."
Sequoia Capital, a well-known venture capital firm in Silicon Valley, posted on its website: "From games to advertising to law, generative AI has the potential to transform all areas where human creativity comes into play. This technology has the potential to generate trillions of dollars in economic value." More interestingly, Sequoia Capital also pointed out that the post was partly written by GPT-3, which is itself a generative AI capable of generating text.
Image generation uses techniques from a subset of machine learning called deep learning. Deep learning has driven much of the progress in AI since a landmark 2012 paper on image classification reignited interest in the technology. Deep learning uses models trained on large data sets until the program understands the relationships in that data. The model can then be used in applications such as identifying whether there is a dog in a picture or translating text.
Image generators work by reversing this process. Instead of translating English into French, they translate English phrases into images. They usually consist of two main parts: one that processes the initial phrase and another that converts the resulting data into an image.
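The two-part structure described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the function names `encode_prompt`, `decode_to_image` and `generate` are hypothetical, the "encoder" is a plain hash rather than a trained language model, and the "decoder" simply copies numbers into a pixel grid. Real image generators learn both stages from data.

```python
import hashlib


def encode_prompt(prompt: str, dim: int = 8) -> list[float]:
    """Stage 1 (toy): map a phrase to a fixed-length vector of floats in [0, 1].

    Assumption: a hash stands in for a trained text encoder.
    """
    digest = hashlib.sha256(prompt.lower().encode("utf-8")).digest()
    return [digest[i] / 255 for i in range(dim)]


def decode_to_image(embedding: list[float], width: int = 4, height: int = 4) -> list[list[int]]:
    """Stage 2 (toy): turn the vector into a grid of grayscale pixels (0-255).

    Assumption: a fixed tiling rule stands in for a trained image decoder.
    """
    n = len(embedding)
    return [
        [round(255 * embedding[(row * width + col) % n]) for col in range(width)]
        for row in range(height)
    ]


def generate(prompt: str) -> list[list[int]]:
    """Chain the two stages: phrase -> vector -> image."""
    return decode_to_image(encode_prompt(prompt))


image = generate("a cat wearing a hat")
for row in image:
    print(row)
```

The point of the sketch is the data flow, not the output: the phrase is first reduced to numbers, and only those numbers are handed to the part that produces pixels.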
Early generative AI was based on a method called generative adversarial networks, or GANs for short. GANs were often used to generate photos of non-existent people. Essentially, they work by pitting two AI models against each other to better create images that meet a predetermined goal.
Newer methods often use transformers, a concept first proposed by Google in a 2017 paper. This is an emerging technology that can take advantage of larger data sets, although its training costs can run into millions of dollars.
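The "two models against each other" idea can be made concrete with the loss functions commonly used to train GANs. The sketch below is not taken from any particular system. It assumes a discriminator that outputs a score strictly between 0 and 1 (its estimate that an image is real), and it uses the widely used non-saturating form of the generator loss. The function names are illustrative.

```python
import math


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Loss the discriminator tries to minimize.

    d_real: discriminator score for a real image, in (0, 1).
    d_fake: discriminator score for a generated image, in (0, 1).
    The loss is low when real images score near 1 and fakes near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Non-saturating loss the generator tries to minimize.

    The loss is low when the discriminator scores the fake near 1,
    i.e. when the generator has fooled it.
    """
    return -math.log(d_fake)


# Case 1: the discriminator easily spots the fake.
print(discriminator_loss(d_real=0.9, d_fake=0.1), generator_loss(d_fake=0.1))

# Case 2: the generator fools the discriminator.
print(discriminator_loss(d_real=0.9, d_fake=0.9), generator_loss(d_fake=0.9))
```

The two losses pull in opposite directions: whatever lowers the generator's loss by raising the score of a fake raises the discriminator's loss, which is what makes the training adversarial.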
The first image generator to gain a lot of attention was Dall-E, a project launched in 2021 by Silicon Valley startup OpenAI. OpenAI released an updated and more powerful version this year. "With Dall-E 2, this is really the moment we cross the Uncanny Valley," said Christian Cantrell, a developer specializing in generative AI.
Another commonly used AI-based image generator is Craiyon, formerly known as Dall-E Mini, which is available online. After the user enters a phrase, they can see the resulting drawing in the browser within minutes.
Craiyon, launched in July 2021, now generates approximately 10 million images per day, for a total of 1 billion never-before-seen images, according to Dayma, Craiyon's creator. After usage spiked earlier this year, Dayma began dedicating all of his energy to Craiyon. He said he is focused on using ads to keep the site free for users, because its server costs are high. Craiyon has a Twitter account dedicated to posting the weirdest and most creative images, and it has over 1 million followers.
But the project that sparked the most enthusiasm was Stable Diffusion, which was released to the public in August of this year. Its code is available on GitHub and can be run on a computer, in the cloud or through a programming interface. This allows users to adapt the program code to their own purposes or build new programs on top of it.
For example, Stable Diffusion has been integrated into Adobe Photoshop through a plug-in that lets users generate backgrounds and other parts of images, which they can then manipulate directly in the app using layers and other Photoshop tools. This turns generative AI from a technology that produces finished images into a tool that professionals can use.
Cantrell, the developer of the plug-in, worked at Adobe for 20 years and resigned this year to focus on generative AI. The veteran said the plug-in has been downloaded tens of thousands of times. Artists told him they used it in countless places he never expected, such as animating Godzilla or creating images of Spider-Man in any pose the artist could imagine.
An emerging art in generative AI is how to construct "prompts," the phrases used to generate images. A search engine called Lexica pairs Stable Diffusion images with the exact strings of words that can be used to generate them. Platforms like Reddit and Discord host tips on how to word a prompt to get the image one wants.
Many investors see generative AI as a potentially transformative platform shift, like smartphones or the Internet in their early days. This kind of shift greatly expands the size of the potential market that might be able to use the technology.
Cantrell believes that generative AI is similar to a more fundamental technology, namely databases. He said: "Generative AI is a bit like a database. Databases help unlock the huge potential of applications. Almost every application we use in life is built on a database, but no one cares about how the database works; they just know how to use it."
Michael Dempsey, managing partner at Compound VC, said it is "very rare" for a technology previously limited to the lab to enter the mainstream, and such moments attract a lot of attention from venture investors, who like to bet on areas with huge potential. But he warned that generative AI is currently in a "curiosity phase" closer to the peak of the hype cycle. Companies at this stage may fail because they are not focused on a specific use that businesses or consumers are willing to pay for.
Others in the field believe that the startups pioneering these technologies today could eventually challenge the software giants that currently dominate the AI field, including Google, Facebook parent Meta and Microsoft, and pave the way for the rise of the next generation of technology giants.
Hugging Face CEO Clement Delangue said: "There will be a large number of new trillion-dollar companies born, and these startups will be built on this new technology." Hugging Face is a developer platform similar to GitHub that hosts pre-trained AI models, including Craiyon and Stable Diffusion. Its goal is to make it easier for programmers to build AI technology.
Some companies have received significant investment. Hugging Face was valued at $2 billion after raising funding earlier this year from investors including Lux Capital and Sequoia Capital. OpenAI, the most prominent startup in the space, has received more than $1 billion in funding from Microsoft and Khosla Ventures. Meanwhile, Stable Diffusion developer Stability AI is in talks to raise venture capital at a valuation of up to $1 billion.
Cloud service providers such as Amazon, Microsoft and Google may also benefit, because generative AI can be a computationally intensive technology. Meta and Google have hired many of the brightest minds in the field to integrate this advanced technology into their products. In September, Meta announced an AI initiative called Make-A-Video that takes the technology to the next level by generating videos rather than just images.
Meta CEO Mark Zuckerberg posted on his Facebook page: "This is an amazing advancement. Generating a video is much harder than generating a photo because, in addition to generating each pixel correctly, the system must also predict how they will change over time." Recently, Google also released program code, called Phenaki, that can convert text into minutes-long videos.
The craze could also give a boost to chipmakers such as Nvidia, AMD and Intel, whose graphics processors are ideal for training and deploying AI models. At a conference last week, Nvidia CEO Jensen Huang highlighted generative AI as a key use of the company's latest chips, saying such technology could soon revolutionize communications.
However, the benefits of generative AI to end users are still limited. A lot of the excitement these days revolves around free or low-cost experiments. For example, some authors have tried using image generators to create illustrations for their articles. Nvidia is experimenting with using models to generate new 3D images of people, animals, vehicles or furniture that can populate virtual gaming worlds.
Ultimately, everyone developing generative AI will have to grapple with the ethical issues posed by image generators.
The first is the employment issue. Although many programs require powerful graphics processors, computer-generated content is still much cheaper than the time of a professional illustrator, who can be paid hundreds of dollars per hour. Generative AI could spell big trouble for artists, videographers, and others who make a living from creative work. "It turns out that machine learning models may become better, faster and cheaper than humans," said Michael Dempsey, managing partner at Compound VC.
Generative AI will also bring more complex challenges around originality and ownership. These AI models are trained on large numbers of existing images, and it is still debated whether the creator of an original image has a copyright claim on images generated in the original's style. An artist recently won an art competition in Colorado, USA, using images primarily created by a generative AI called MidJourney. He said in an interview after his win that he selected one of the hundreds of images he generated and then tweaked and processed it in Photoshop.
Some of the images generated by Stable Diffusion appear to be watermarked, suggesting that part of the original dataset is protected by copyright. Some prompt guides advise users to include the name of a specific, living artist to get better results in imitating that artist's creative style. Last month, Getty Images banned users from uploading generative AI images to its database of stock images, fearing copyright infringement disputes.
Image generators can also be used to create new images of trademarked characters or objects, such as Minions, Marvel characters, or the throne from Game of Thrones. As image-generating software gets better, it also has the potential to trick users into believing false information, or to show images or videos of events that never happened.
Developers must also grapple with the possibility that AI models trained on large amounts of data may pick up biases related to gender, race, or culture present in that data, which may cause the models to reproduce those biases in their output. Hugging Face has published material on ethical issues and discussed how to develop AI models responsibly.
Hugging Face CEO Clement Delangue said: "We see short-term and current challenges with these models because they are probabilistic models, trained on large data sets, and tend to absorb a lot of bias." He cited the example of a generative AI that was asked to draw a portrait of a "software engineer" and generated an image of a white male.