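[ComfyUI] Gemini 2.0: One-Click Captioning for Images, Text, Video, and Audio, Plus E-Commerce Virtual Try-On and Image Editing

Gemini 2.0: Captioning Images, Text, Video, and Audio, Plus E-Commerce Image Editing

Hello everyone, and welcome to 破狼's page. Thank you for your support and encouragement. I'll be with you all the way as we explore AIGC. If you enjoy this, star and follow 破狼, or scan the QR code at the end of the post to join the discussion group.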

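About Gemini 2.0

An earlier article ([ComfyUI] Google Gemini 2.0: Worry-Free E-Commerce Ad Images and Copy, with Image Editing Done Just by Asking) covered the image editing and captioning abilities of Gemini 2.0, Google's latest vision-capable large language model. Today we look at more of what it can do. Image-to-video LoRA training for Wan2.1 has taken off recently, but few models can caption video accurately. Google Gemini Flash 2.0 Experimental handles image generation and editing, and it can also caption images, text, audio, and video files. The community has already wrapped it in a ComfyUI plugin, so text, images, video frames, and audio can be analyzed multimodally directly inside a ComfyUI workflow, alongside image generation and local edits.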

Video effect shots from earlier posts:

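Gemini 2.0 Audio and Video Captioning in ComfyUI

This article uses the ComfyUI workflow from ComfyUI-Gemini_Flash_2.0_Exp. Models can be downloaded from the cloud drive link at the end of the post.

• ComfyUI-Gemini_Flash_2.0_Exp: github.com/ShmuelRonen/

• API key: apply through Google AI Studio, and note the regional restrictions. aistudio.google.com/api

• For image generation and local edits, see the article: [ComfyUI] Google Gemini 2.0: Worry-Free E-Commerce Ad Images and Copy, with Image Editing Done Just by Asking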
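
For readers curious what sits underneath the plugin, here is a minimal sketch of the kind of request a Gemini `generateContent` call carries: one text part plus one inline media part. It only builds the request and does not send it. The model id `gemini-2.0-flash-exp`, the `GEMINI_API_KEY` environment variable name, and the helper names are my assumptions for illustration, not taken from the plugin's source, so check Google AI Studio for the current values.

```python
import base64
import json
import os

# Assumed model id and endpoint; verify both in Google AI Studio before use.
MODEL = "gemini-2.0-flash-exp"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_request(prompt: str, media: bytes, mime_type: str) -> dict:
    """Build a request body with one text part and one base64 inline media part."""
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(media).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


def build_headers() -> dict:
    """Read the key from the environment (hypothetical variable name) rather than hard-coding it."""
    api_key = os.environ.get("GEMINI_API_KEY", "")
    return {"Content-Type": "application/json", "x-goog-api-key": api_key}


if __name__ == "__main__":
    body = build_request("Describe this image.", b"fake image bytes", "image/png")
    print(ENDPOINT)
    print(json.dumps(body)[:120])
```

Keeping the key in an environment variable means it never ends up saved inside a shared workflow file.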

Workflow download for Gemini 2.0 audio and video captioning in ComfyUI: liblib.art/modelinfo/7b

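Notes:

  • Because images, video, and audio are analyzed through a remote LLM API, all you need is an API key and working network access. Captioning across multiple file types then takes little setup, and the analysis is accurate.
  • Results for video files are accurate, which makes this especially well suited to captioning clips for Wan2.1 (Wanxiang) video LoRA training. You still need to supply a good prompt template so the output follows a fixed caption format.
  • Multi-image analysis and image generation are supported as well.
  • Generated images sometimes come back as a solid blue frame. This is caused by the concurrency limit on free API keys, so throttle how often you call the API.
  • More Wan2.1 and Hunyuan ComfyUI workflows are on my profile page: runninghub.cn/user-cent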
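
On the rate-limit note above: if you script calls yourself rather than clicking through the workflow, one common way to stay under a free key's limit is to retry with exponential backoff. The sketch below is a generic pattern, not something the plugin ships; `call_with_backoff` and its parameters are hypothetical names, and it assumes a rate-limited call surfaces as a raised exception.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt plus jitter, then retry.

    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # In real use, narrow this to the client's rate-limit error type.
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
```

Usage would look like `call_with_backoff(lambda: send_request(body))`, where `send_request` stands in for whatever function actually performs the API call. The `sleep` parameter exists so the waiting can be swapped out in tests.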

01. Video captioning: Fire Phoenix

The model clearly picks out the details of each frame in the video.

Here's a breakdown of what I see in the video frames:

*   **Frames 1-4:** A woman in a long, red traditional Chinese dress is standing outdoors. She has her hair up in a bun and is wearing a red veil that covers the lower half of her face. She is standing with her hands pressed together in front of her chest in a gesture that could be a greeting or a sign of respect. The background shows a misty, mountainous landscape.
*   **Frame 5:** The woman is blurred, suggesting movement. Her arms are out to her sides.
*   **Frame 6:** A figure of the woman is blurred and superimposed with an image of a phoenix. The phoenix is in shades of orange and yellow, and there is a bright light surrounding the figures.
*   **Frame 7:** A phoenix is flying in the sky, with a beam of light connecting it to a point on the ground. There is a reflection of the phoenix beneath the point on the ground.

Overall, the video seems to be showcasing a transformation or a magical event, possibly involving the woman and a phoenix. The setting and clothing suggest a Chinese cultural context.
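
The response above refers to seven frames, which suggests the clip was reduced to a handful of frames before being sent. How the plugin picks them is not covered in this article. As a rough sketch of one common approach, the hypothetical helper below chooses evenly spaced frame indices that always include the first and last frame.

```python
def evenly_spaced_indices(total_frames: int, num_samples: int) -> list[int]:
    """Return up to num_samples frame indices spread evenly across the clip."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Fewer frames than requested samples: keep every frame.
        return list(range(total_frames))
    if num_samples == 1:
        # A single sample is taken from the middle of the clip.
        return [total_frames // 2]
    step = (total_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]
```

For a transformation clip like this one, even spacing keeps both the starting pose and the final phoenix shot, which is what the caption needs to describe.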

02. Video captioning: Female Venom

Here's a breakdown of what I see in the video frames:

**Frames 1-3:**

*   **Subject:** A young woman with long, dark wavy hair.
*   **Attire:** She is wearing a white corset, white thigh-high stockings, and a white choker. She has a white fur-like shawl draped over her shoulder.
*   **Setting:** She is sitting on a white, ornate pedestal against a gray background.
*   **Overall Impression:** The image has a soft, elegant, and slightly sensual aesthetic.

**Frames 4-6:**

*   **Transformation:** The woman is now covered in a black, glossy substance, resembling the Marvel character Venom.
*   **Venom Mask:** A mask with Venom's characteristic white eyes and sharp teeth covers her face.
*   **Overall Impression:** The image has transformed from elegant to monstrous, with the Venom-like appearance dominating the scene. The black substance appears to be engulfing her original attire.
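
Frame-by-frame breakdowns like the two above are readable, but LoRA training usually wants one fixed-format caption per clip, which is why the notes ask for a prompt template. Below is a minimal sketch of what that could look like. The template wording, the trigger word, and the convention of saving a `.txt` file next to each clip are my assumptions, so check what your trainer expects.

```python
from pathlib import Path

# Hypothetical prompt template that asks for one flat paragraph per clip.
CAPTION_PROMPT = (
    "Describe this video clip in one English paragraph for training captions. "
    "Cover, in this order: subject, clothing, action or transformation, "
    "setting, camera movement. Do not use lists, markdown, or frame numbers."
)


def normalize_caption(raw: str, trigger_word: str = "") -> str:
    """Remove asterisks, collapse whitespace, and prepend an optional trigger word."""
    text = " ".join(raw.replace("*", "").split())
    return f"{trigger_word}, {text}" if trigger_word else text


def write_caption(video_path: str, raw_caption: str, trigger_word: str = "") -> Path:
    """Write the cleaned caption next to the clip as <clip name>.txt and return that path."""
    out = Path(video_path).with_suffix(".txt")
    out.write_text(normalize_caption(raw_caption, trigger_word), encoding="utf-8")
    return out
```

`CAPTION_PROMPT` would go into the node's prompt field, and `write_caption("clips/venom_01.mp4", response_text, "v3n0m")` would save the cleaned result. Both the path and the trigger word here are made up for the example.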

03. Audio captioning: Follow 破狼

大家好,歡迎來到破狼主頁,感謝大家的支持與鼓勵,在GC探索道路上,我將與你一路同行,喜歡就星標關注破狼或文末掃碼加入交流群。

04. Audio captioning: the song 成都 (Chengdu)

Here is the transcription of the audio and who is speaking:

**Transcription:**

讓我掉下眼淚的 不知昨夜的酒
讓我依依不捨的 不知你的溫柔
雨路還要走多久 你攥著我的手
讓我感到為難的是 掙扎的自由

分別總是在九月 回憶是思念的愁
深秋暖綠的垂柳 親吻著我額頭
在那座陰雨的小城裡 我從未忘記你
成都 帶不走的 只有你

和我在成都的街頭走一走 喔喔喔
直到所有的燈都熄滅了 也不停留
你會挽著我的衣袖 我會把手揣進褲兜
走到玉林路的盡頭 坐在小酒館的門口

分別總是在九月 回憶是思念的愁
深秋暖綠的垂柳 親吻著我額頭
在那座陰雨的小城裡 我從未忘記你
成都 帶不走的 只有你

和我在成都的街頭走一走 喔喔喔
直到所有的燈都熄滅了 也不停留
你會挽著我的衣袖 我會把手揣進褲兜
走到玉林路的盡頭 坐在小酒館的門口

和我在成都的街頭走一走 喔喔喔
直到所有的燈都熄滅了 也不停留
和我在成都的街頭走一走 喔喔喔
直到所有的燈都熄滅了 也不停留
你會挽著我的衣袖 我會把手揣進褲兜
走到玉林路的盡頭 走過小酒館的門口

和我在成都的街頭走一走 喔喔喔
直到所有的燈都熄滅了 也不停留

**Who is speaking?**

The song is sung by a male vocalist, and at the end, a child sings the chorus.

05. Multi-image captioning: Heroine

Here's a description of each image:

**Image 1:**

*   **Subject:** A young woman dressed in ornate, dark-colored armor with gold detailing. She has long, dark hair pulled back into a high ponytail with a red ribbon. She has fair skin, red lipstick, and what appears to be a small, stylized red mark on her cheek.
*   **Attire:** The armor is elaborate, covering her chest and shoulders. A red garment is visible beneath the armor. She holds a sword with a bloodied blade.
*   **Setting:** The background appears to be an out-of-focus cityscape or town, possibly with traditional Asian architecture. The sky is visible above.
*   **Overall Impression:** The image conveys a sense of strength, beauty, and perhaps a hint of danger. The woman looks like a warrior or noblewoman, possibly after a battle.

**Image 2:**

*   **Subject:** A man dressed in similar ornate armor, but with a more red and gold color scheme. He has dark hair pulled back into a topknot.
*   **Attire:** He wears a full suit of armor with intricate gold detailing. A red robe or cloak is also part of his attire. He holds a sword and stands next to a white horse.
*   **Setting:** The background is a blurred, reddish landscape that could be interpreted as a battlefield or a dramatic natural setting.
*   **Overall Impression:** The image depicts a regal and powerful figure, likely a warrior or nobleman. The horse adds to the impression of status and strength. The red background creates a sense of drama and intensity.

In summary, both images depict figures in elaborate, historical-style armor, suggesting a theme of warriors or nobility in a possibly Asian-inspired setting. The first image is a close-up of a female warrior, while the second image shows a male warrior with a horse in a more dramatic setting.
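
When several images go out in one request, the description comes back as a single block of text. If you want one caption per image, a small parser can split the response on its `**Image N:**` headers. This is a sketch that assumes the model keeps using exactly that header format, which it is not guaranteed to do. Any closing summary paragraph stays attached to the last image's section.

```python
import re

# Matches header lines such as "**Image 1:**" on a line of their own.
IMAGE_HEADER = re.compile(r"^\*\*Image (\d+):\*\*[ \t]*$", re.MULTILINE)


def split_by_image(response: str) -> dict[int, str]:
    """Map each image number to the text between its header and the next header."""
    matches = list(IMAGE_HEADER.finditer(response))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(response)
        sections[int(match.group(1))] = response[match.end():end].strip()
    return sections
```

If the response contains no matching headers, the function returns an empty dict, which is a cue to fall back to the raw text.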

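06. Image generation: model try-on

Prompt: have the woman in image 1 wear the clothes from image 2.

07. Image merging: embrace

Prompt: the woman in image 1 and the man in image 2 embrace, keeping the characters and clothing consistent.

Across the repeated generations for this article the style did not stay fixed, so the prompt still needs tuning.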

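08. Image editing: local edits

Prompt: please change the woman's outfit to a long red dress and keep the character consistent in the output.

09. Image editing: text

Prompt: put the text logo "我" on the clothing and keep the character consistent.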

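• If you would rather not set everything up locally, Runninghub is an online platform where you can try AI apps and workflows in the browser (1000 credits on sign-up). More workflows to try online are on my profile page: runninghub.cn/user-cent . Alibaba Wanxiang, the strongest open-source image-to-video AI app: runninghub.cn/ai-detail . AI workflow: runninghub.cn/post/1894

• A cloud image is another option: new accounts get 8 yuan of free credit, and a 4090D costs about 1.59 yuan per hour. Sign-up link: xiangongyun.com/registe . Guide to the all-in-one image for Wanxiang and Hunyuan video inference and training: xiangongyun.com/image/d

• More AGI material: yuque.com/yuqueyonghuwh

• Cloud drive download: pan.quark.cn/s/481bbe36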

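More recommended articles:

• [ComfyUI] Wan2.1: Super-Cool I2V Transformation Effects! One-Click Rejuvenation, Bride, Snow White, and Many More Video Effects

• [ComfyUI] Wan2.1: First-and-Last-Frame Image-to-Video Control Is Here! The Wanxiang Video Ecosystem Keeps Growing

• [ComfyUI] Wan2.1: Stunning Results! One-Click Generation of Many Popular Pika-Style Effect Videos

• [ComfyUI] Wan2.1: Great Fun with Video Effect Shots! An Open-Source Ecosystem Rising to Rival Closed Source

• [ComfyUI] 谛韵: Rivals Closed-Source AI Music Models, One Click and About 1 Minute to Generate a Complete 4-Minute Song, with Quality Better Than YuE

• [ComfyUI] Alibaba Wan Video: So Much Fun! The Squish and Slice Video Effects Popularized by Pika and Kling, One-Click, Free, and Unlimited

• Alibaba Wan2.1: The Strongest Open-Source Video Model, LoRA Training Ready on 仙宫云 (xiangongyun), Opening a New Era for the Video Ecosystem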