Low-cost algorithm significantly improves the robustness of visual classification: Chinese team at the University of Sydney releases the new EdgeNet method


WBOY

Apr 09, 2024, 01:40 PM


Deep neural networks (DNNs) achieve excellent accuracy on natural images. However, they are vulnerable to additive noise, i.e., adversarial attacks. Previous research hypothesized that this vulnerability may stem from high-accuracy DNNs relying too heavily on irrelevant, non-robust features such as texture and background.

At the recent AAAI 2024 conference, researchers from the University of Sydney showed that edge information extracted from images can provide relevant and robust features related to shape and foreground.


Paper link: https://ojs.aaai.org/index.php/AAAI/article/view/28110

These features help a pre-trained deep network improve its adversarial robustness without affecting its accuracy on clean images.

The researchers propose a lightweight and adaptable EdgeNet that can be seamlessly integrated into existing pre-trained deep networks, including Vision Transformers (ViTs), the latest family of advanced models for visual classification.

EdgeNet processes edges extracted from clean natural images or noisy adversarial images and produces features that can be injected into the intermediate layers of a pre-trained, frozen backbone network. The backbone already extracts features rich in semantic information, so inserting EdgeNet into such a network makes it possible to take advantage of these high-quality backbone features.

It should be noted that this approach brings minimal additional cost: the cost of obtaining these edges with a traditional edge detection algorithm (such as the Canny edge detector used in the paper) is minuscule compared with the inference cost of deep networks, and the cost of training EdgeNet is comparable to that of fine-tuning the backbone network with techniques such as Adapter.
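To give a feel for why classical edge extraction is cheap, here is a minimal pure-Python sketch. It is not the Canny detector mentioned above: it is a simplified stand-in (Sobel gradient magnitude plus a single threshold, with no smoothing, non-maximum suppression, or hysteresis). The function name `sobel_edges` and the threshold value are assumptions made for illustration.

```python
def sobel_edges(image, threshold=1.0):
    """Return a binary edge map for a 2-D grayscale image (list of rows).

    Simplified stand-in for Canny: gradient magnitude plus one threshold,
    without smoothing, non-maximum suppression, or hysteresis.
    """
    height, width = len(image), len(image[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):              # border pixels stay 0
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += kx[dr][dc] * pixel
                    gy += ky[dr][dc] * pixel
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[r][c] = 1
    return edges


# Toy 5x6 image: dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edge_map = sobel_edges(image)
```

On this toy image the interior rows mark only the two columns around the dark-to-bright boundary. The work is a fixed number of multiply-adds per pixel, which is small next to a forward pass through a deep network.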

EdgeNet Architecture

To inject the edge information in the image into the pre-trained backbone network, the authors introduce a side-branch network named EdgeNet. This lightweight, plug-and-play side network can be seamlessly integrated into existing pre-trained deep networks, including state-of-the-art models such as ViTs.

By operating on the edge information extracted from the input image, EdgeNet generates a set of robust features, which can be selectively injected into the intermediate layers of the pre-trained, frozen backbone network.

Injecting these robust features improves the network's ability to defend against adversarial perturbations. At the same time, because the backbone network is frozen and the injection of new features is selective, the accuracy of the pre-trained network on unperturbed clean images is maintained.

[Figure: EdgeNet architecture]

As shown in the figure, the authors insert new EdgeNet building blocks at a fixed interval N alongside the original backbone building blocks. The new intermediate-layer output can be represented by the following formula:

[Formula: output of the new intermediate layer]
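The interval-based insertion described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's formula: it assumes the EdgeNet output is added to the backbone feature, and that each EdgeNet block consumes the output of the previous one. The name `forward_with_edgenet` and the toy blocks are hypothetical.

```python
def forward_with_edgenet(x, edges, backbone_blocks, edgenet_blocks, interval):
    """Run frozen backbone blocks, adding an EdgeNet output every `interval` blocks.

    Assumptions: injection is additive, and edgenet_blocks[k] is applied after
    the backbone block with 1-based index (k + 1) * interval.
    """
    edge_state = edges
    for i, block in enumerate(backbone_blocks, start=1):
        x = block(x)
        if i % interval == 0:
            edgenet_block = edgenet_blocks[i // interval - 1]
            edge_state = edgenet_block(edge_state)
            x = [a + b for a, b in zip(x, edge_state)]
    return x


# Toy stand-ins (hypothetical): each backbone block adds 1,
# each EdgeNet block halves the edge features.
backbone_blocks = [lambda v: [a + 1.0 for a in v]] * 6
edgenet_blocks = [lambda e: [a * 0.5 for a in e]] * 2
output = forward_with_edgenet([0.0, 0.0], [1.0, 0.0],
                              backbone_blocks, edgenet_blocks, interval=3)
```

With six backbone blocks and an interval of 3, only two EdgeNet blocks are needed, which is why a larger interval means a smaller side network.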

EdgeNet Building Blocks

To achieve selective feature extraction and selective feature injection, these EdgeNet building blocks adopt a "sandwich" structure: a zero convolution is added before and after each block to control the input and output. Between the two zero convolutions is a randomly initialized ViT block with the same architecture as the backbone network.

The input zero convolution acts as a filter that extracts information relevant to the optimization objective, while the output zero convolution acts as a filter that determines which information is integrated into the backbone. In addition, zero initialization ensures that the information flow within the backbone is unaffected at the start of training. As a result, subsequent fine-tuning of EdgeNet is more streamlined.
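The effect of zero initialization can be checked with a small sketch. It assumes that a zero convolution behaves like a per-token linear map whose weights and bias start at zero, and it replaces the middle ViT block with a random linear layer purely for illustration. `SandwichBlock` and the helper names are hypothetical, not taken from the paper.

```python
import random


def linear(weights, bias, vector):
    """Apply y = W v + b; a 1x1 convolution acts like this on each token."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def zero_linear(dim):
    """Weights and bias initialized to zero (the 'zero convolution')."""
    return [[0.0] * dim for _ in range(dim)], [0.0] * dim


def random_linear(dim, rng):
    """Randomly initialized stand-in for the middle ViT block (assumption)."""
    return ([[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)],
            [rng.uniform(-1, 1) for _ in range(dim)])


class SandwichBlock:
    """Zero layer -> randomly initialized block -> zero layer."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.zero_in = zero_linear(dim)
        self.middle = random_linear(dim, rng)
        self.zero_out = zero_linear(dim)

    def __call__(self, edge_features):
        hidden = linear(*self.zero_in, edge_features)
        hidden = linear(*self.middle, hidden)
        return linear(*self.zero_out, hidden)


block = SandwichBlock(dim=4)
backbone_feature = [0.3, -1.2, 0.8, 2.0]
injected = block([1.0, 0.0, 1.0, 0.0])
fused = [f + z for f, z in zip(backbone_feature, injected)]
```

Before any training step, `injected` is all zeros, so `fused` equals `backbone_feature` and the frozen backbone behaves exactly as it did before EdgeNet was attached. The block starts to contribute only once training moves the zero-initialized weights away from zero.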

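Training Objective

During EdgeNet training, the pre-trained ViT backbone is frozen and not updated, except for the classification head. Optimization targets only the EdgeNet network introduced for edge features and the classification head of the backbone. Here the authors adopt a very simple joint optimization objective to keep training efficient:

[Formula: joint optimization objective (Equation 9)]

In Equation 9, α is the weight of the accuracy loss function and β is the weight of the robustness loss function. By adjusting α and β, the balance of the EdgeNet training objective can be tuned to improve robustness without significantly sacrificing accuracy.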
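A weighted joint objective of this kind can be sketched as below. The exact form of Equation 9 is not reproduced in this article, so the sketch assumes a plain weighted sum in which the accuracy loss is cross-entropy on clean inputs and the robustness loss is cross-entropy on adversarial inputs. The function names are illustrative.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(clean_logits, adversarial_logits, label, alpha, beta):
    """alpha * accuracy loss + beta * robustness loss (assumed form)."""
    accuracy_loss = cross_entropy(clean_logits, label)
    robustness_loss = cross_entropy(adversarial_logits, label)
    return alpha * accuracy_loss + beta * robustness_loss


# The model is confident on the clean input but fooled on the adversarial one.
loss = joint_loss(clean_logits=[4.0, 0.0], adversarial_logits=[0.0, 2.0],
                  label=0, alpha=1.0, beta=1.0)
```

Setting β to 0 reduces the objective to ordinary clean-image training, while raising β relative to α puts more weight on the adversarial term.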

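Experimental Results

The authors tested two major categories of robustness on the ImageNet dataset.

The first category is robustness against adversarial attacks, including white-box and black-box attacks.

The second category is robustness against several common perturbations, including the natural adversarial examples in ImageNet-A, the out-of-distribution data in ImageNet-R, and the common corruptions in ImageNet-C.

The authors also visualized the edge information extracted under different perturbations.

[Figure: edge information extracted under different perturbations]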

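Network Scale and Performance Tests

In the experiments, the authors first tested the classification performance and computational overhead of EdgeNet at different sizes (Table 1). Weighing classification performance against computational overhead, they determined that the configuration with #Intervals = 3 is the best setting.

With this configuration, EdgeNet achieves significant accuracy and robustness improvements over the baseline model, striking a balanced compromise among classification performance, computational requirements, and robustness.

[Table 1: classification performance and computational overhead of EdgeNet at different sizes]

This configuration substantially improves clean accuracy and robustness while maintaining reasonable computational efficiency.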

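Accuracy and Robustness Comparison

The authors compare the proposed EdgeNet with five categories of SOTA methods (Table 2): CNNs trained on natural images, robust CNNs, ViTs trained on natural images, robust ViTs, and robustly fine-tuned ViTs.

The metrics considered include accuracy under adversarial attacks (FGSM and PGD), accuracy on ImageNet-A, and accuracy on ImageNet-R.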
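For readers unfamiliar with FGSM, the sketch below shows a single attack step on a toy logistic-regression classifier, where the input gradient has a closed form. The model, weights, and ε value are made up for illustration and have nothing to do with the networks evaluated in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_linear(x, label, weights, bias, epsilon):
    """One FGSM step against a logistic-regression classifier.

    For this toy model the input gradient of the cross-entropy loss is
    (p - label) * weights, so no autodiff library is needed.
    """
    p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
    gradient = [(p - label) * w for w in weights]
    # Move every input coordinate by epsilon in the direction that raises the loss.
    return [v + epsilon * sign(g) for v, g in zip(x, gradient)]


weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.25], 1
x_adv = fgsm_linear(x, label, weights, bias, epsilon=0.5)
```

Here the clean input has a positive logit (class 1), while the perturbed input has a negative one, so the prediction flips. PGD repeats steps of this kind with a smaller step size and projects the result back into the allowed ε-ball after each step.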

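In addition, the mean corruption error (mCE) on ImageNet-C is reported, where lower values indicate better performance. The experimental results show that EdgeNet performs better under FGSM and PGD attacks while performing comparably to previous SOTA methods on the clean ImageNet-1K dataset and its variants.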
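As a reminder of how mCE is usually computed on ImageNet-C, the sketch below follows the benchmark's standard definition: for each corruption type, the model's error summed over severity levels is divided by the same sum for a baseline model (AlexNet in the benchmark), and the ratios are averaged over corruption types. The error rates below are made up, and only two corruption types with three severities are shown for brevity.

```python
def mean_corruption_error(model_errors, baseline_errors):
    """mCE from per-corruption lists of error rates, one entry per severity.

    Each corruption's summed error is divided by the baseline model's
    summed error, and the ratios are averaged over corruption types.
    """
    ratios = []
    for corruption, errors in model_errors.items():
        ratios.append(sum(errors) / sum(baseline_errors[corruption]))
    return sum(ratios) / len(ratios)


# Made-up error rates for two corruption types at three severities.
model_errors = {"noise": [0.2, 0.3, 0.4], "blur": [0.1, 0.2, 0.3]}
baseline_errors = {"noise": [0.4, 0.6, 0.8], "blur": [0.2, 0.4, 0.6]}
mce = mean_corruption_error(model_errors, baseline_errors)
```

A value below 1 (or below 100 when reported as a percentage) means the model makes fewer errors under corruption than the baseline.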

[Table 2: accuracy and robustness comparison between EdgeNet and SOTA methods]