Created at 11am, Jan 5
Artificial Intelligence
Deep Residual Learning for Image Recognition
Contract ID
HV5DboBwlwTiFTUvuMGJBnH7O_72PdZE6JnlpUqZPHs
File Type
PDF
Entry Count
61
Embed. Model
jina_embeddings_v2_base_en
Index Type
hnsw

When it comes to teaching computers to recognize images (like distinguishing between a cat and a dog in a photo), deeper neural networks (a type of artificial intelligence) are usually more effective. However, the deeper the network (meaning it has more layers of processing), the harder it is to train.

The breakthrough of Deep Residual Learning is like giving the computer a set of shortcuts to learn more effectively. In a regular deep neural network, each layer tries to learn a new feature or pattern from the image. But in a Residual Network (ResNet), each layer also has the option to refer back to what was learned in previous layers. Think of it like a student who, instead of learning everything from scratch, can build on what they already know.

This approach makes it easier to train very deep networks. The team behind this research tested networks with up to 152 layers, significantly deeper than previous models, yet these networks were easier to train and performed better.

Their method proved highly effective. On a major image recognition challenge (ILSVRC 2015), their model set a new record for accuracy. They also showed that the approach works well on other tasks, like detecting and classifying objects in images (for example, finding and labeling all the dogs in a set of photos).

In summary, Deep Residual Learning is a smart way of training AI to recognize and understand images by allowing it to build upon previously learned information, leading to more accurate and efficient image recognition.
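The shortcut idea is easy to sketch in code. Below is a minimal residual block, written in PyTorch purely as an illustration (the framework and the layer sizes are this page's assumption, not the paper's original implementation): the block computes a residual function F(x) with two 3x3 convolutions, then adds the input x back through the shortcut before the final activation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Minimal sketch of a ResNet basic block: y = relu(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        # Two 3x3 convolutions form the residual function F(x).
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The shortcut: add the input back, then apply the nonlinearity.
        return F.relu(out + x)

x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])

Because the shortcut is an identity, the block can always fall back to y = x by driving F toward zero, which is what makes very deep stacks trainable.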

reducing of the training error³. The reason for such optimization difficulties will be studied in the future.

Table 2. Top-1 error (%, 10-crop testing) on ImageNet validation. Here the ResNets have no extra parameter compared to their plain counterparts.

         18 layers   34 layers
plain    27.94       28.54
ResNet   27.88       25.03

Fig. 4 shows the training procedures. The 34-layer plain net has higher training error throughout the whole training procedure, even though the solution space of the 18-layer plain network is a subspace of that of the 34-layer one.
id: 7a6e9ad044245800306960ade0ab6cee - page: 5
We argue that this optimization difficulty is unlikely to be caused by vanishing gradients. These plain networks are trained with BN (batch normalization), which ensures forward propagated signals to have non-zero variances. We also verify that the backward propagated gradients exhibit healthy norms with BN. So neither forward nor backward signals vanish. In fact, the 34-layer plain net is still able to achieve competitive accuracy (Table 3), suggesting that the solver works to some extent. We conjecture that the deep plain nets may have exponentially low convergence rates, which impact the
id: 218e49234ab03a6a6904e68ff8d2313f - page: 5
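The gradient-norm check described in the chunk above can be reproduced with a few lines. This is a hedged sketch of the general diagnostic, not the authors' actual code; the toy model is a plain (shortcut-free) stack used only to exercise the check.

import torch
import torch.nn as nn

def gradient_norms(model):
    """L2 norm of each parameter's gradient after backward()."""
    return {name: p.grad.norm().item()
            for name, p in model.named_parameters()
            if p.grad is not None}

# A small plain (no-shortcut) stack with BN, purely illustrative.
layers = [nn.Sequential(nn.Linear(32, 32), nn.BatchNorm1d(32), nn.ReLU())
          for _ in range(10)]
model = nn.Sequential(*layers, nn.Linear(32, 1))

loss = model(torch.randn(8, 32)).pow(2).mean()
loss.backward()
for name, g in gradient_norms(model).items():
    print(f"{name}: {g:.2e}")  # with BN these stay well away from zero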
Residual Networks. Next we evaluate 18-layer and 34-layer residual nets (ResNets). The baseline architectures are the same as the above plain nets, except that a shortcut connection is added to each pair of 3×3 filters as in Fig. 3 (right). In the first comparison (Table 2 and Fig. 4 right), we use identity mapping for all shortcuts and zero-padding for increasing dimensions (option A). So they have no extra parameter compared to the plain counterparts. We have three major observations from Table 2 and Fig. 4. First, the situation is reversed with residual learning: the 34-layer ResNet is better than the 18-layer ResNet (by 2.8%). More importantly, the 34-layer ResNet exhibits considerably lower training error and is generalizable to the validation data. This indicates that the degradation problem is well addressed in this setting and we manage to obtain accuracy gains from increased depth. Second, compared to its plain counterpart, the 34-layer
id: 153ba84d0d11568f0f6a86dfed134cc4 - page: 5
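Option A, as described in the chunk above, is a parameter-free shortcut: identity where dimensions match, and zero-padding when the number of channels grows. One plausible implementation sketch follows; the subsampling and padding details are a common convention, not spelled out in this excerpt.

import torch
import torch.nn.functional as F

def option_a_shortcut(x, out_channels, stride=1):
    """Parameter-free shortcut: subsample spatially if stride > 1,
    then zero-pad new channels up to out_channels."""
    if stride > 1:
        x = x[:, :, ::stride, ::stride]      # spatial subsampling
    extra = out_channels - x.shape[1]
    if extra > 0:
        # Pad only the channel dimension with zeros (no new parameters).
        x = F.pad(x, (0, 0, 0, 0, 0, extra))
    return x

x = torch.randn(1, 64, 56, 56)
print(option_a_shortcut(x, 128, stride=2).shape)  # torch.Size([1, 128, 28, 28])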
³We have experimented with more training iterations (3×) and still observed the degradation problem, suggesting that this problem cannot be feasibly addressed by simply using more iterations.

model         top-1 err.   top-5 err.
VGG-16        28.07        9.33
GoogLeNet     -            9.15
PReLU-net     24.27        7.38
plain-34
ResNet-34 A
ResNet-34 B
ResNet-34 C
ResNet-50
ResNet-101
id: af59ba60a7252477214d6d648d73d3a0 - page: 5
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "HV5DboBwlwTiFTUvuMGJBnH7O_72PdZE6JnlpUqZPHs", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "HV5DboBwlwTiFTUvuMGJBnH7O_72PdZE6JnlpUqZPHs", "level": 2}'