解鎖影像的魔力：使用尖端 SmolVLM-M 模型的快速簡便指南-Python教學-PHP中文網

首頁

後端開發

Python教學

解鎖影像的魔力：使用尖端 SmolVLM-M 模型的快速簡便指南

Susan Sarandon

Jan 24, 2025 pm 02:10 PM

本文展示了 SmolVLM-500M-Instruct，這是一種尖端、緊湊的視覺到文字模型。儘管其規模相對較小（5 億個參數），但它展示了令人印象深刻的功能。

這是Python程式碼：

import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image
import warnings

warnings.filterwarnings("ignore", message="Some kwargs in processor config are unused")

def describe_image(image_path):
    processor = AutoProcessor.from_pretrained("HuggingFaceTB/SmolVLM-500M-Instruct")
    model = AutoModelForVision2Seq.from_pretrained("HuggingFaceTB/SmolVLM-500M-Instruct")

    image = Image.open(image_path)

    prompt = "Describe the image content in detail.  Provide a concise textual response."
    inputs = processor(text=[prompt], images=[image], return_tensors="pt")

    with torch.no_grad():
        outputs = model.generate(
            pixel_values=inputs["pixel_values"],
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=150,
            do_sample=True,
            temperature=0.7
        )

    description = processor.batch_decode(outputs, skip_special_tokens=True)[0]
    return description.strip()

if __name__ == "__main__":
    image_path = "images/bender.jpg"

    try:
        description = describe_image(image_path)
        print("Image Description:", description)
    except Exception as e:
        print(f"Error: {e}")

此腳本利用 Hugging Face Transformers 函式庫從影像產生文字描述。它載入預先訓練的模型和處理器，處理圖像並輸出描述性文字。包括錯誤處理。

程式碼在這裡：https://www.php.cn/link/042886829869470b75f63dddfd7e9d9d

使用以下非庫存圖像（放置在項目的圖像目錄中）：

Unlock the Magic of Images: A Quick and Easy Guide to Using the Cutting-Edge SmolVLM-M Model

模型生成描述（可以調整提示和參數以實現更精細的控制）：一個機器人坐在沙發上，全神貫注地閱讀。背景中可以看到書架和門。場景中還有一張有坐墊的白色椅子。

與較大的語言模型相比，此模型的速度和效率值得注意。

以上是解鎖影像的魔力：使用尖端 SmolVLM-M 模型的快速簡便指南的詳細內容。更多資訊請關注PHP中文網其他相關文章！

陳述

本文內容由網友自願投稿，版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容，請聯絡admin@php.cn

您如何將元素附加到Python列表中？May 04, 2025 am 12:17 AM

toAppendElementStoApythonList，usetheappend（）方法forsingleements，Extend（）formultiplelements，andinsert（）forspecificpositions.1）useeAppend（）foraddingoneOnelementAttheend.2）useextendTheEnd.2）useextendexendExendEnd（

您如何創建Python列表？舉一個例子。May 04, 2025 am 12:16 AM

TocreateaPythonlist,usesquarebrackets[]andseparateitemswithcommas.1)Listsaredynamicandcanholdmixeddatatypes.2)Useappend(),remove(),andslicingformanipulation.3)Listcomprehensionsareefficientforcreatinglists.4)Becautiouswithlistreferences;usecopy()orsl

討論有效存儲和數值數據的處理至關重要的實際用例。May 04, 2025 am 12:11 AM

金融、科研、医疗和AI等领域中，高效存储和处理数值数据至关重要。1)在金融中，使用内存映射文件和NumPy库可显著提升数据处理速度。2)科研领域，HDF5文件优化数据存储和检索。3)医疗中，数据库优化技术如索引和分区提高数据查询性能。4)AI中，数据分片和分布式训练加速模型训练。通过选择适当的工具和技术，并权衡存储与处理速度之间的trade-off，可以显著提升系统性能和可扩展性。

您如何創建Python數組？舉一個例子。May 04, 2025 am 12:10 AM

pythonarraysarecreatedusiseThearrayModule，notbuilt-Inlikelists.1）importThearrayModule.2）指定tefifythetypecode，例如，'i'forineizewithvalues.arreaysofferbettermemoremorefferbettermemoryfforhomogeNogeNogeNogeNogeNogeNogeNATATABUTESFELLESSFRESSIFERSTEMIFICETISTHANANLISTS。

使用Shebang系列指定Python解釋器有哪些替代方法？May 04, 2025 am 12:07 AM

除了shebang線，還有多種方法可以指定Python解釋器：1.直接使用命令行中的python命令；2.使用批處理文件或shell腳本；3.使用構建工具如Make或CMake；4.使用任務運行器如Invoke。每個方法都有其優缺點，選擇適合項目需求的方法很重要。

列表和陣列之間的選擇如何影響涉及大型數據集的Python應用程序的整體性能？May 03, 2025 am 12:11 AM

ForhandlinglargedatasetsinPython,useNumPyarraysforbetterperformance.1)NumPyarraysarememory-efficientandfasterfornumericaloperations.2)Avoidunnecessarytypeconversions.3)Leveragevectorizationforreducedtimecomplexity.4)Managememoryusagewithefficientdata

說明如何將內存分配給Python中的列表與數組。May 03, 2025 am 12:10 AM

Inpython，ListSusedynamicMemoryAllocationWithOver-Asalose，而alenumpyArraySallaySallocateFixedMemory.1）listssallocatemoremoremoremorythanneededinentientary上，respizeTized.2）numpyarsallaysallaysallocateAllocateAllocateAlcocateExactMemoryForements，OfferingPrediCtableSageButlessemageButlesseflextlessibility。