Grab this bargain! With Luchen's Open-Sora, you can start generating videos for just 10 yuan
The video generation track has been booming recently, with text-to-video, image-to-video, and more. Yet even with so many models on the market, most people still cannot try them: without an invitation to a closed beta, they can only look at the "models" and sigh. Not long ago, we reported on Luchen Technology's Open-Sora. As the world's first open-source Sora-like model, it not only performs well across many types of video, but is also low-cost and available to everyone. Does it work? How do you use it? Let's take a look at our hands-on review.
The recently open-sourced version 1.2 of Open-Sora can generate 720p high-definition videos up to 16 seconds long. The official sample video looks like this:
The results are genuinely impressive, so it is no wonder that so many readers have been asking how to try it themselves.
Compared with the many closed-source offerings that require a long wait for beta access, the fully open-source Open-Sora is clearly more accessible. However, Open-Sora's official GitHub is all papers and code: if you want to deploy it yourself, the model's hardware requirements are high, and configuring the environment is a real challenge for anyone without solid coding skills.
So is there a way to make Open-Sora easy even for AI novices?
The short answer: yes, and it can be deployed with one click. Once started, you can also control the video length, aspect ratio, camera movement, and other parameters without writing any code.
Excited? Then let's walk through how to deploy Open-Sora. At the end of the article you will find a step-by-step tutorial and the relevant links; no technical background is required.
A Gradio-based visual interface
We have already covered the latest technical details of Open-Sora in an in-depth report, focusing on the core architecture of the Open-Sora model and its new video compression network (VAE). At the end of that article, we mentioned that the Luchen Open-Sora team provides a Gradio application that can be deployed with one click. So what does this Gradio application actually look like?
Gradio is a Python package designed for rapid deployment of machine learning models. By letting developers declare a model's inputs and outputs, it automatically generates a web interface, greatly simplifying the work of putting a model online and interacting with it.
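To give a feel for how little is needed, here is a minimal, generic Gradio sketch (a toy placeholder we wrote for illustration, not the Open-Sora application itself): you define a function, declare its inputs and outputs, and Gradio renders a matching web page.

import gradio as gr

# Placeholder function standing in for a real model call
def generate(prompt: str, seconds: int):
    return f"Would generate a {seconds}-second video for: {prompt}"

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"), gr.Slider(2, 16, step=2, label="Duration (s)")],
    outputs=gr.Textbox(label="Result"),
)
demo.launch()  # serves a web UI, by default on port 7860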
Reading through Open-Sora's GitHub homepage, we found that the application combines the Open-Sora model with Gradio into a clean, simple interaction layer.
It uses a graphical interface to make operation easier. In the interface, users can freely modify basic parameters such as the duration, aspect ratio, and resolution of the generated video, and can also independently adjust the motion strength, aesthetic score, and more advanced camera movements. It also supports calling GPT-4 to optimize the prompt, so both Chinese and English text input are supported.
Once the application is deployed, users do not need to write any code to use the Open-Sora model: just enter a prompt and click through different parameter combinations to generate videos. The generated video is displayed directly in the Gradio interface and can be downloaded from the web page without digging through file paths.
Image source: https://github.com/hpcaitech/Open-Sora/blob/main/assets/readme/gradio_basic.png
The Open-Sora team provides the script that wraps the model in Gradio on GitHub, along with the command line for launching it. However, you still need to get through a fairly involved environment setup before that command will run, and to fully experience Open-Sora, especially generating long, high-resolution videos (such as 16 seconds at 720p), you need a GPU with strong performance and large memory (the team officially used an H800). The Gradio solution alone does not address either of these problems.
These two problems may look daunting at first, but Luchen Cloud solves both, making deployment genuinely effortless with no technical know-how required. How do you get started? Here is our super-simple tutorial.
Super simple one-click deployment tutorial
How easy is it to deploy Open-Sora on Luchen Cloud?
First, Luchen Cloud offers multiple types of GPUs, and even high-end cards such as the A800 and H800 can be rented easily. In our testing, a single one of these 80GB cards is enough for Open-Sora inference.
Second, Luchen Cloud provides a dedicated image for the Open-Sora project. The image is like a fully furnished apartment you can move into right away: the entire runtime environment starts with one click, eliminating the complex environment-configuration step.
Finally, Luchen Cloud's pricing and service are very friendly. An A800 costs less than 10 yuan per hour, the time spent initializing the image is not billed, and billing stops as soon as the cloud host is shut down. In other words, for under 10 yuan per hour you can fully enjoy what Open-Sora has to offer. In addition, we have included a way to obtain a 100 yuan coupon at the end of the article, so register an account, grab the coupon, and follow our tutorial!
Luchen Cloud website: https://cloud.luchentech.com/
First, go to the website and register a Luchen Cloud account. The main page shows the machines available for rent in the computing power market. Claim a coupon or top up 10 yuan, and you can follow Luchen Cloud's user guide to create a cloud host.
Step one is choosing an image. Open the public images and the very first one listed is OpenSora (1.2), which is convenient.
Step two is choosing a billing method. There are two options: tidal (off-peak) billing and pay-as-you-go. In our testing, tidal billing saves money, and the A800 is even cheaper during idle periods.
For Open-Sora inference a single A800 is enough, so we chose a one-card configuration and enabled SSH access, persistent storage, and the mounted public data (which includes the model weights). These features are free of charge and add a lot of convenience, which is great value.
After configuring, click Create. The cloud host starts very quickly; the machine is up within tens of seconds. This period is not billed, so even if you pick a large image that takes a while to load, you do not have to worry about the cost.
Step three: from the cloud host page, click JupyterLab to open the web interface. A terminal is already open when you enter.
Enter ls to list the files on the cloud host; the Open-Sora folder sits right in the initial path.
Since we are using the dedicated Open-Sora image, no additional environment needs to be installed: the most time-consuming step has already been taken care of.
At this point we can simply enter the Gradio launch command to start it directly, which really is one-click deployment:
python gradio/app.py
Startup is fast; Gradio is up and running in a little over ten seconds.
However, this Gradio app listens on the server's http://0.0.0.0:7860 by default. To use it from your local browser, you first need to add your SSH public key to the cloud machine on Luchen Cloud. This step is also simple: open the file shown below and paste your local machine's public key into it.
Next, run the port-forwarding command on your local machine, following the instructions in the screenshot and replacing the address and port with those of your own cloud host; a sketch of the command is shown below.
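As an illustration only (the user name, host address, and SSH port below are placeholders; use the values shown on your own cloud host page), the local port-forwarding command typically looks like this:

ssh -L 7860:localhost:7860 <user>@<cloud-host-address> -p <ssh-port>

After that, opening http://localhost:7860 in your local browser reaches the Gradio app running on the cloud host.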
Then open the corresponding web page locally, and the visual interface appears shortly.
We first entered an English prompt off the top of our head and clicked generate (keeping the default 480p, which is faster):
a river flowing through a rich landscape of trees and mountains
Generation finished quickly, taking about 40 seconds. The result is decent overall: there is a river, mountains, and trees, matching the prompt. But what we wanted was a bird's-eye view, as if an eagle were looking down from above.
No problem; we adjusted the prompt and tried again:
a bird's eye view of a river flowing through a rich landscape of trees and mountains
This time the output does have a bird's-eye perspective. Not bad; the model follows instructions quite well.
As mentioned earlier, the Gradio interface offers many other options, such as resolution, aspect ratio, and video length, and you can even control how much motion appears in the video, so there is plenty to play with. We tested at 480p, while up to 720p is supported; try the options one by one and see how different combinations turn out.
Want to go further? Fine-tuning is easy too
Digging further into Open-Sora's pages, we also found commands for continuing to fine-tune the model. Fine-tune it on videos of the style you like, and the model will generate videos that better match your own taste.
Let's verify this using the video data provided in Luchen Cloud's public datasets.
Since the environment is already fully configured, we only need to copy and paste the training command:
torchrun --standalone --nproc_per_node 1 scripts/train.py configs/opensora-v1-2/train/stage1.py --data-path /root/commonData/Inter4K/meta/meta_inter4k_ready.csv
The console then outputs a stream of training information.
Training started normally, and it runs on just a single card!
(Pitfall note: before this we hit an out-of-memory error once, and found that GPU memory was still occupied after the crashed process exited. The culprit turned out to be the Gradio inference from the previous step, which we had forgotten to shut down. So if you train on a single card, remember to close Gradio first, since it keeps the model loaded while waiting for user input.)
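If you hit a similar out-of-memory error, a quick way to check whether a leftover process is still holding GPU memory is nvidia-smi (available on any host with NVIDIA drivers installed):

nvidia-smi   # shows per-GPU memory usage and the PIDs of processes still occupying it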
Here is the GPU utilization we observed during training:
A quick back-of-the-envelope calculation: one training step takes roughly 20 seconds. According to the figures Open-Sora provides, training runs for 70k steps (as shown below), so 20 s × 70,000 ≈ 1.4 million seconds, or about 16 days, which is close to the roughly two weeks claimed in their documentation (assuming each of their machines takes about the same time per step as our single machine).
Of those 70k steps, the first stage accounts for 30k and the second stage for 23k, so the third stage was actually trained for only 17k steps. That third stage is precisely the fine-tuning on high-quality video that is used to substantially raise model quality, which is exactly what we want to do now.
However, according to the report, their training used 12 machines with 8 cards each, so training on the same amount of data as the third stage on the Luchen Cloud platform would cost roughly:
95 hours × 8 cards × 12 machines × 10 yuan/hour = 91,200 yuan
That figure is a bit steep for a review budget, but it is quite cost-effective for building your own text-to-video model. For companies in particular, there is essentially no preparation required: follow the step-by-step tutorial and you can complete a fine-tuning run for around 100,000 yuan or less. We really look forward to seeing more domain-specific enhanced versions of Open-Sora!
Finally, a word on the 100 yuan coupon mentioned earlier. Even though our entire review cost less than 10 yuan, savings are savings!
According to Luchen Cloud's official information, users who share their experience on social media or technical forums (such as Zhihu, Xiaohongshu, Weibo, or CSDN), tagged #Luchen Cloud or @Luchen Technology, receive a 100 yuan voucher (valid for one week) for each effective share, which is roughly five or six hundred videos at the settings we used in our review.
Finally, we have compiled the relevant resource links below so everyone can get started quickly. If you want to try it right away, follow the links and start your AI video journey!
Related resource links:
Luchen Cloud platform: https://cloud.luchentech.com/
Open-Sora code base: https://github.com/hpcaitech/Open-Sora/tree/main?tab=readme-ov-file#inference
Bilibili tutorial: https://www.bilibili.com/video/BV1ow4m1e7PX/?vd_source=c6b752764cd36ff0e535a768e35d98d2