Created at 9am, Mar 5
Ms-RAG · Artificial Intelligence
Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning
Contract ID: K1-34ZbxU-oLcdGU5k0BSRdK5eL6_f97J4s61avXd_Q
File Type: PDF
Entry Count: 65
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

Yixiong Zou, Yicong Liu, Yiman Hu, Yuhua Li, Ruixuan Li
School of Computer Science and Technology, Huazhong University of Science and Technology
{yixiongz, smnight, m202273659, idcliyuhua, rxli}@hust.edu.cn

Abstract

Cross-domain few-shot learning (CDFSL) aims to acquire knowledge from limited training data in the target domain by leveraging prior knowledge transferred from source domains with abundant training samples. CDFSL faces challenges in transferring knowledge across dissimilar domains and in fine-tuning models with limited training data. To address these challenges, we first extend the analysis of loss landscapes from the parameter space to the representation space, which allows us to simultaneously interpret the transfer and fine-tuning difficulties of CDFSL models. We observe that sharp minima in the loss landscapes of the representation space result in representations that are hard to transfer and fine-tune. Moreover, existing flatness-based methods have limited generalization ability due to their short-range flatness. To enhance transferability and facilitate fine-tuning, we introduce a simple yet effective approach that achieves long-range flattening of the minima in the loss landscape. This approach treats differently normalized representations as minima in the loss landscape and flattens the high-loss region between them by randomly sampling interpolated representations. We implement this method as a new normalization layer that replaces the original one in both CNNs and ViTs. This layer is simple and lightweight, introducing only a minimal number of additional parameters. Experimental results on 8 datasets demonstrate that our approach outperforms state-of-the-art methods in terms of average accuracy, with improvements of up to 9% over the current best approaches on individual datasets. Our code will be released.

Then, we propose to flatten the loss landscape between these two minima by classifying the input sample through representations interpolated between them. Taking the final layer's representation f(x) as an example, the classification can be represented as

L = L_cls(h((1 − λ) f_norm1(x) + λ f_norm2(x)), y),   (4)

where f_norm1(x) and f_norm2(x) refer to the two normalized representations, and λ ∈ [0, 1] is an interpolation weight randomly sampled from the Beta distribution. In implementation, we carry out the above interpolation on every layer's representation. Since the classification is based on the interpolated representation, this representation will be pushed to be effective, so that it is mapped to a low loss in the RSLL. Therefore, the high-loss region between f_norm1(x) and f_norm2(x) will be flattened.
id: 9af29109866227c214726116d7ec09a5 - page: 5
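The Beta-weighted interpolation of Eq. 4 can be sketched in a few lines of NumPy (a minimal illustration; the function name `flor_interpolate` and the fixed seed are hypothetical, not from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def flor_interpolate(f_norm1, f_norm2, a=2.0, b=2.0):
    """Mix two normalized representations with a Beta-sampled weight.

    Classifying through this interpolation pushes the whole segment
    between the two minima toward low loss (Eq. 4). `a`, `b` are the
    Beta-distribution hyper-parameters.
    """
    lam = rng.beta(a, b)  # lambda in (0, 1)
    return (1.0 - lam) * f_norm1 + lam * f_norm2
```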
Since this method is agnostic to the shape of the complex loss landscape between these minima, we do not need to consider any local information around the minima. We measure the distance between the BN and IN representations in Tab. 2, and the distance for each representation is much larger than the step size in Fig. 3, indicating a much larger flattened region. Below, we provide two instantiations of the above idea, for Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) respectively.

3.1. Flattening for Convolutional Neural Networks

In CNNs, Batch Normalization (BN) is one of the most widely used normalization methods, typically applied after the convolution layer. BN normalizes representations with batch statistics:

fBN(X) = γ · (X − μ) / √(σ² + ε) + β,  where  μ = (1/bhw) Σ_{i,j,k} X_{i,:,j,k},  σ² = (1/bhw) Σ_{i,j,k} (X_{i,:,j,k} − μ)².   (5)
id: eae220915da0fa73257229bf4bf6dac0 - page: 5
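The difference between the BN statistics above and the IN statistics defined next comes down to which axes are averaged over. A NumPy sketch, assuming a (b, c, h, w) tensor layout and hypothetical helper names:

```python
import numpy as np

def batch_norm_stats(X):
    # BN: one mean/variance per channel, averaged over the batch
    # and spatial dimensions (1/bhw in Eq. 5).
    mu = X.mean(axis=(0, 2, 3), keepdims=True)
    var = X.var(axis=(0, 2, 3), keepdims=True)
    return mu, var

def instance_norm_stats(X):
    # IN: one mean/variance per image and channel, averaged over
    # the spatial dimensions only (1/hw in Eq. 6).
    mu = X.mean(axis=(2, 3), keepdims=True)
    var = X.var(axis=(2, 3), keepdims=True)
    return mu, var
```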
Instance Normalization (IN) is another widely used normalization method for CNNs, which normalizes representations with only the given image's statistics:

fIN(X) = γ · (X − μ) / √(σ² + ε) + β,  where  μ = (1/hw) Σ_{j,k} X_{:,:,j,k},  σ² = (1/hw) Σ_{j,k} (X_{:,:,j,k} − μ)².   (6)

For CNNs, we set f_norm1 to fBN and f_norm2 to fIN.

3.2. Flattening for Vision Transformers

In ViTs, Layer Normalization (LN) is widely applied as the normalization method:

fLN(X) = γ · (X − μ) / √(σ² + ε) + β,  where  μ = (1/c) Σ_k X_{:,:,k},  σ² = (1/c) Σ_k (X_{:,:,k} − μ)²,   (7)

where X ∈ R^{b×t×c} is the batch representation.

Table 3. Dataset information. Please see the appendix for details.

Dataset        Domain                            Classes  Images
miniImageNet   General recognition               64       38,400
CUB            Fine-grained bird recognition     50       2,953
Cars           Fine-grained car recognition      49       2,027
Plantae        Plantae recognition               50       17,253
Places         Scene recognition                 19       3,800
CropDiseases   Agricultural disease recognition  38       43,456
EuroSAT        Satellite imagery recognition     10       27,000
ISIC2018       Skin lesion recognition           7        10,015
ChestX         X-ray chest recognition           7        25,847

id: 6abbcc15bbef1bdbe41ff31b3aaeb44f - page: 5

Then, we follow prior work in processing only the CLS token. We use BN as the second normalization method; in other words, we set f_norm1 to fLN and f_norm2 to fBN for the CLS token in each layer.

3.3. Implementation

In implementation, as shown in Fig. 4, we realize the above design as a normalization layer (the FLoR layer) that replaces the ordinary normalization layer in the backbone network. This layer can be represented as

fFLL(x) = (1 − λ) f_norm1(x) + λ f_norm2(x),

which is the same as Eq. 4. Since λ is a random number sampled from the Beta distribution Beta(a, b), our model introduces only two hyper-parameters (a, b), and learnable parameters only in the added normalization layer, which is simple and lightweight.
id: 7a04f5193f5b7b0f3185271275729b4d - page: 6
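Putting the pieces together, the CNN variant of the FLoR layer can be sketched as follows. This is a NumPy-only illustration under assumed names (`FLoRLayer` is hypothetical); the paper's layer additionally carries the learnable affine parameters γ, β of the normalizations:

```python
import numpy as np

class FLoRLayer:
    """Sketch of a FLoR layer for CNNs: Beta-weighted mix of BN and IN."""

    def __init__(self, a=2.0, b=2.0, eps=1e-5, seed=0):
        self.a, self.b, self.eps = a, b, eps
        self.rng = np.random.default_rng(seed)

    def _bn(self, X):
        # Batch statistics: per channel, over batch + spatial dims (Eq. 5).
        mu = X.mean(axis=(0, 2, 3), keepdims=True)
        var = X.var(axis=(0, 2, 3), keepdims=True)
        return (X - mu) / np.sqrt(var + self.eps)

    def _in(self, X):
        # Instance statistics: per image and channel, spatial dims only (Eq. 6).
        mu = X.mean(axis=(2, 3), keepdims=True)
        var = X.var(axis=(2, 3), keepdims=True)
        return (X - mu) / np.sqrt(var + self.eps)

    def __call__(self, X):
        # f_FLL(x) = (1 - lambda) * f_BN(x) + lambda * f_IN(x)
        lam = self.rng.beta(self.a, self.b)
        return (1.0 - lam) * self._bn(X) + lam * self._in(X)
```

At λ = 0 the layer reduces to plain BN and at λ = 1 to plain IN, so training with sampled λ values flattens the loss along the whole segment between the two normalized representations.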
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "K1-34ZbxU-oLcdGU5k0BSRdK5eL6_f97J4s61avXd_Q", "query": "What is alexanDRIA library?"}'
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "K1-34ZbxU-oLcdGU5k0BSRdK5eL6_f97J4s61avXd_Q", "level": 2}'