Created at 12pm, Mar 27
Ms-RAGScience
Network bottlenecks and task structure control the evolution of interpretable learning rules in a foraging agent
Contract ID: OPXeRMTNkunmZl-C6-v6L_UEgQ-T3I_eKdcUKkfRz-E
File Type: PDF
Entry Count: 72
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

Emmanouil Giannakakis,¹,² Sina Khajehabdollahi,¹ and Anna Levina¹,²
¹ Department of Computer Science, University of Tübingen, Germany
² Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Developing reliable mechanisms for continuous local learning is a central challenge faced by biological and artificial systems. Yet, how environmental factors and structural constraints on the learning network influence the optimal plasticity mechanisms remains obscure even for simple settings. To elucidate these dependencies, we study meta-learning via evolutionary optimization of simple reward-modulated plasticity rules in embodied agents solving a foraging task. We show that unconstrained meta-learning leads to the emergence of diverse plasticity rules. However, regularization and architectural bottlenecks reduce this variability, resulting in interpretable rules. Our findings indicate that the meta-learning of plasticity rules is very sensitive to various parameters, with this sensitivity possibly reflected in the learning rules found in biological networks. When included in models, these dependencies can be used to discover potential objective functions and details of biological learning via comparisons with experimental observations.

$W_t = W_{t-1} + \Delta W_t - \langle \Delta W_t \rangle$, where $\langle \Delta W_t \rangle$ is the mean of the weight updates across synapses. However, this subtractive normalization is just one option out of a wide range of mechanisms for normalizing synaptic weights that have been proposed in neuroscientific and AI studies. Moreover, some normalization mechanisms have been shown to have a significant effect on the performance of Hebbian learning. To examine how the choice of weight normalization mechanism may affect the evolution of the learning rule, we repeated the experiments described in the previous section for both network settings (scalar and binary) using a divisive normalization. This mechanism keeps the sum of the absolute values of the weights constant and equal to a given target $S_g$, which is set to $S_g = 3$ for all the experiments: $W_t = \left( W_{t-1} + \Delta W_t \right) \frac{S_g}{\sum_{i=1}^{N} |w_i(t)|}$
id: 2bf237dc0130c30ae3de57d010b513d2 - page: 8
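To make the normalization step concrete, here is a minimal Python/NumPy sketch; the flat weight-vector layout and the function name are illustrative assumptions, not code from the paper:

# Divisive normalization after a plasticity step
import numpy as np

def divisive_normalization(w_prev, dw, s_g=3.0):
    # Raw plasticity update
    w = w_prev + dw
    # Rescale so that sum(|w_i|) equals the target S_g (S_g = 3 in all experiments)
    return w * s_g / np.sum(np.abs(w))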
Using this weight normalization mechanism after each plasticity step, we evolve 20 populations each, for both scalar and binary sensory networks. We see that the networks with a scalar sensory readout converge to a different learning rule (Fig. 4d) than the equivalent networks with subtractive normalization (Fig. 4c). Specifically, instead of $\theta_5 \approx 1$ and $\theta_6 \approx 0$, under the divisive normalization $\theta_5 \approx 0$ and $\theta_6 < 0$, which makes the learning rule take the form: $\Delta W_t = \eta_p \left[ \theta_1 X_t R_t + \theta_6 y_t \right]$ where $\theta_1 \approx 1$ and $\theta_6 < 0$. In combination with the divisive normalization, this rule converges to a close approximation of the correct ingredient distribution $W^c$. The trained agents maintain a fitness similar to previous experiments (Fig. 4b), and as expected for networks with scalar sensory readouts, the performance declines significantly (p-value = 0.0003, paired two-sided t-test) when we test agents with motor and sensory networks that did not co-evolve.
id: dc67e0fef7af9c362104b76937faff21 - page: 8
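The evolved rule itself can be sketched in the same style. The shapes of X_t (sensory input vector), R_t (scalar reward), and y_t (scalar readout), and the parameter values below are assumptions for illustration; the text above only constrains theta_1 ≈ 1 and theta_6 < 0:

# Evolved learning rule under divisive normalization
def evolved_update(w_prev, x, r, y, eta_p=0.01, theta_1=1.0, theta_6=-0.1):
    # dW_t = eta_p * (theta_1 * X_t * R_t + theta_6 * y_t)
    dw = eta_p * (theta_1 * x * r + theta_6 * y)
    # Apply the divisive normalization sketched above
    return divisive_normalization(w_prev, dw)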
The networks with a binary sensory readout evolve a learning rule similar to all other binary readout networks (i.e., most parameters converge to the vicinity of 0, except for $\theta_3 \approx 1$). A small but significant difference between the networks that evolved with divisive normalization and those that evolved with subtractive normalization is that the former evolve a parameter $\theta_6 < 0$ (Fig. 4f). The same pattern is observed much more prominently in networks with a scalar readout (Fig. 4d), which suggests that a negative $\theta_6$ parameter is important for networks with divisive normalization.
id: 45b7707b03ff65662ac695f8955ec03d - page: 8
The divisive normalization successfully constrains the learned weights to remain rather small (in contrast to other binary networks, whose plasticity converges to very large sensory weights), and this does not seem to affect their fitness, which remains relatively high (Fig. 4a). Also, as with the other binary readout networks, swapping motor and sensory networks between different agents does not lead to a significant reduction in performance (p-value = 0.067, paired two-sided t-test).

F. Trainable nonlinearity on the sensory readout

We now assess the impact the nonlinearity of the sensory network has on the evolved learning rule. To test this, we allow the steepness of the nonlinearity to evolve by setting: $g(x, \sigma) = \frac{2}{1 + e^{-\sigma x}} - 1$ and making $\sigma$ an evolvable parameter. For small values of $\sigma$ the function becomes effectively linear, while for large values it approaches a step function.
id: ecc8a0ce6fcc3515600b5c4cb2c7f94e - page: 8
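The limiting behavior of this nonlinearity is easy to verify numerically; a small sketch, assuming the scaled-sigmoid form reconstructed above:

# Trainable sensory nonlinearity g(x, sigma) = 2 / (1 + exp(-sigma * x)) - 1
import numpy as np

def g(x, sigma):
    # Near x = 0 the output is approximately linear with slope sigma / 2;
    # for large sigma it approaches a step function onto {-1, +1}
    return 2.0 / (1.0 + np.exp(-sigma * x)) - 1.0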
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "OPXeRMTNkunmZl-C6-v6L_UEgQ-T3I_eKdcUKkfRz-E", "query": "What is alexanDRIA library?"}'
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "OPXeRMTNkunmZl-C6-v6L_UEgQ-T3I_eKdcUKkfRz-E", "level": 2}'