Created at 12pm, Jan 5
Artificial Intelligence
Gradient-based learning applied to document recognition
Contract ID
OJt1iOfJEgKTsk5Ivu_xV-PnCa9vtRtlAVkzwY3gR94
File Type
PDF
Entry Count
330
Embed. Model
jina_embeddings_v2_base_en
Index Type
hnsw

This article delves into how gradient-based learning, particularly using multilayer neural networks and the back-propagation algorithm, can be effectively applied in the field of document recognition. This type of learning involves gradually adjusting the parameters of a neural network to improve its accuracy in tasks like recognizing handwritten characters. The key idea is that, with the right network architecture, these learning algorithms can create a complex decision-making process capable of classifying patterns that are high-dimensional and intricate, such as handwritten letters and numbers, with minimal initial processing of the data.

The paper reviews various methods used for handwritten character recognition and compares them using a standard task of recognizing handwritten digits. Among these methods, convolutional neural networks (ConvNets) stand out. ConvNets are specially designed to handle the variability in 2D shapes, making them particularly effective for tasks like reading handwriting.

Moreover, the paper introduces a new learning concept called Graph Transformer Networks (GTNs). GTNs are designed for more complex document recognition systems that involve several steps, such as extracting specific fields from a document, segmenting different parts of the text, recognizing the segmented parts, and understanding the language. GTNs enable all these different modules to be trained together in a way that optimizes the overall performance.

The authors also discuss practical applications, including two systems developed for online handwriting recognition. These systems, trained globally using gradient-based methods, demonstrate the benefits of this comprehensive training approach. One notable application described is a system for reading bank cheques. It combines ConvNet-based character recognizers with these global training techniques, achieving high accuracy in reading both business and personal cheques. Impressively, this system is not just a theoretical model but is actually deployed in the real world, where it processes several million cheques every day.

Given an interpretation, there is a well-known method, called the forward algorithm, for computing the above quantity efficiently. The penalty computed with this procedure for a particular interpretation is called the forward penalty. Consider again the concept of constrained graphs.
id: d6695161631ef715b5919e28d122153b - page: 25
There is one constrained graph for each possible label sequence (some may be empty graphs, which have infinite penalties). Given an interpretation, running the forward algorithm on the corresponding constrained graph gives the forward penalty for that interpretation. The forward algorithm proceeds in a way very similar to the Viterbi algorithm, except that the operation used at each node to combine the incoming cumulated penalties, instead of being the min function, is the so-called logadd operation, which can be seen as a "soft" version of the min function:

$f_n = \operatorname{logadd}_{i \in U_n}(c_i + f_{s_i})$  (13)

where $f_{\text{start}} = 0$, $U_n$ is the set of upstream arcs of node $n$, $c_i$ is the penalty on arc $i$, $s_i$ is the source node of arc $i$, and $\operatorname{logadd}(x_1, \dots, x_m) = -\log \sum_{j=1}^{m} e^{-x_j}$.
id: d3ff6849dc3c9ad04e9c4dad639650d9 - page: 26
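
The recursion in (13) maps directly to code. Below is a minimal sketch, not the paper's implementation: it runs the forward algorithm over a small hypothetical DAG, visiting nodes in topological order and combining the incoming cumulated penalties with logadd. The graph, node names, and penalty values are invented for illustration.

import math

def logadd(xs):
    # "Soft" min of a list of penalties: -log(sum(exp(-x))),
    # computed stably by factoring out the smallest penalty.
    m = min(xs)
    return m - math.log(sum(math.exp(m - x) for x in xs))

# Hypothetical DAG: upstream[n] lists the (source, penalty) of each
# arc arriving at node n. The start node has forward penalty 0.
upstream = {
    "a": [("start", 1.0)],
    "b": [("start", 2.5)],
    "end": [("a", 0.5), ("b", 0.3)],
}

f = {"start": 0.0}
for node in ["a", "b", "end"]:  # topological order
    f[node] = logadd([c + f[s] for s, c in upstream[node]])

print(f["end"])  # forward penalty of the whole graph, about 1.26

With min in place of logadd, the same loop computes the Viterbi penalty (here 1.5), which is higher, as expected.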
An interesting analogy can be drawn if we consider that a graph on which we apply the forward algorithm is equivalent to an NN on which we run a forward propagation, except that multiplications are replaced by additions, the additions are replaced by log-adds, and there are no sigmoids.
id: ae884979f44555722676bb137c76506c - page: 26
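
That analogy can be made concrete: one accumulation rule, with the two operations passed in as parameters, covers both cases. A small sketch (my framing, not code from the paper):

import math

def node_update(incoming, combine, extend):
    # incoming: (upstream_value, arc_weight) pairs for one node.
    # "extend" acts along each arc, "combine" merges the results.
    return combine([extend(v, w) for v, w in incoming])

# NN forward propagation (linear part): multiply, then add.
y = node_update([(0.4, 2.0), (0.9, -1.0)],
                combine=sum, extend=lambda v, w: v * w)

# Graph forward algorithm: add penalties, then log-add; no sigmoid.
logadd = lambda xs: -math.log(sum(math.exp(-x) for x in xs))
f = node_update([(1.5, 0.5), (0.2, 0.3)],
                combine=logadd, extend=lambda v, w: v + w)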
One way to understand the forward algorithm is to think about multiplicative scores (e.g., probabilities) instead of additive penalties on the arcs: $s_i = e^{-c_i}$. In that case, the Viterbi algorithm selects the path with the largest cumulative score (with scores multiplied along the path), whereas the forward score is the sum of the cumulative scores associated with each of the possible paths from the start to the end node. The forward penalty is always lower than the cumulated penalty on any of the paths, but if one path dominates (with a much lower penalty), its penalty is almost equal to the forward penalty. The forward algorithm gets its name from the forward pass of the well-known Baum-Welch algorithm for training HMMs. Section VIII-E gives more details on the relation between this work and HMMs.
id: b55ecfabf045687f0b4f4deaa805267b - page: 26
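
Both claims in this passage, that the forward penalty never exceeds the penalty of any single path, and that it nearly equals the best path's penalty when that path dominates, are easy to check numerically. A sketch with made-up cumulated path penalties:

import math

def forward_penalty(path_penalties):
    # logadd over complete paths: the graph's total score is the sum
    # of the individual path scores exp(-penalty).
    return -math.log(sum(math.exp(-p) for p in path_penalties))

paths = [1.2, 3.4, 3.9]
print(forward_penalty(paths))     # about 1.04, below min(paths) = 1.2

dominant = [1.2, 9.0, 11.0]       # one path much cheaper than the rest
print(forward_penalty(dominant))  # about 1.1996, almost exactly 1.2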
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "OJt1iOfJEgKTsk5Ivu_xV-PnCa9vtRtlAVkzwY3gR94", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "OJt1iOfJEgKTsk5Ivu_xV-PnCa9vtRtlAVkzwY3gR94", "level": 2}'