Created at 10am, Mar 5
Ms-RAG · Artificial Intelligence
PARALLEL HYPERPARAMETER OPTIMIZATION OF SPIKING NEURAL NETWORKS
Contract ID: ogWkfNWVmD3980e90QXV1oncQ4GIn2WOOJluGVe3hTw
File Type: PDF
Entry Count: 95
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

Thomas Firmin, Pierre Boulet, El-Ghazali Talbi
CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, Université de Lille, Lille, F-59000, France
thomas.firmin@univ-lille.fr, pierre.boulet@univ-lille.fr, el-ghazali.talbi@univ-lille.fr

ABSTRACT
Hyperparameter optimization of spiking neural networks (SNNs) is a difficult task which has not yet been deeply investigated in the literature. In this work, we designed a scalable constrained Bayesian optimization algorithm that prevents sampling in the non-spiking areas of an efficient high-dimensional search space. Such search spaces contain infeasible solutions that output no or only a few spikes during the training or testing phases; we call these networks "silent networks". Finding them is difficult, as many hyperparameters are highly correlated with the architecture and the dataset. We leverage silent networks by designing a spike-based early stopping criterion to accelerate the optimization of SNNs trained by Spike Timing Dependent Plasticity (STDP) and surrogate gradient. We parallelized the optimization algorithm asynchronously and ran large-scale experiments on a heterogeneous multi-GPU Petascale architecture. Results show that, by considering silent networks, we can design more flexible high-dimensional search spaces while maintaining good efficacy. The optimization algorithm was able to focus on networks with high performance by preventing the costly and worthless computation of silent networks.
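As a rough illustration of how sampling in silent regions can be prevented (a minimal sketch, not the paper's actual SCBO implementation; the classifier choice, the toy labels, and the dimensions are all assumptions), a feasibility model fitted on past evaluations can filter out candidates predicted to be silent before any GPU time is spent:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_seen = rng.uniform(size=(50, 5))            # previously evaluated hyperparameter vectors
silent = (X_seen[:, 0] < 0.2).astype(int)     # toy "silent network" labels for illustration

# Feasibility model: predicts whether a candidate yields a silent network.
clf = RandomForestClassifier(random_state=0).fit(X_seen, silent)

candidates = rng.uniform(size=(1000, 5))
# Keep only candidates predicted non-silent before launching expensive training.
feasible = candidates[clf.predict(candidates) == 0]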

The performance of SNNs is often assessed on standard ANN classification datasets, such as Poisson-encoded MNIST [49, 36, 25, 43, 50]. These benchmarks should, however, be considered a proof of concept for SNNs, and spiking analog datasets should be preferred to test SNN performance [51, 52]. We therefore selected two benchmarks: Poisson-encoded MNIST and DVS128 Gesture. In this work, MNIST was encoded within 100 frames; the data has shape B.T.C.H.W, for batch size, frames, channels, height, and width (B.100.1.28.28). The number of frames is 3.5 times lower than in the referenced work. No other transformation, such as denoising or centering, was applied. For experiment 3, based on SLAYER, T was set to 25.
id: 80c78390ec30bd75ca778e4d895d72ad - page: 9
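A minimal sketch of the rate (Poisson) encoding described above, assuming torchvision's MNIST and a simple per-frame Bernoulli approximation of a Poisson process (the paper's exact encoder may differ):

import torch
from torchvision import datasets, transforms

T = 100  # number of frames
mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())
img, label = mnist[0]                 # (1, 28, 28), pixel intensities in [0, 1]
rates = img.unsqueeze(0)              # add batch dimension: (B=1, 1, 28, 28)

# One Bernoulli draw per frame, with firing probability proportional to intensity.
spikes = (torch.rand(T, *rates.shape) < rates).float()   # (T, B, 1, 28, 28)
spikes = spikes.permute(1, 0, 2, 3, 4)                   # (B, T, C, H, W) = (1, 100, 1, 28, 28)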
Concerning DVS128 Gesture, spikes were accumulated within 100 frames; spikes that overlap during this process are counted as a single spike. The data was denoised using a 5000 ms temporal neighborhood. Both ON and OFF channels are considered, so the shape is B.100.2.128.128. The Tonic Python package was used to process the data. The higher pixel and temporal resolutions of DVS128 Gesture are a real challenge, as they involve higher topology, memory, and computation complexities. MNIST was divided into training, validation, and testing sets of respective sizes 48000, 12000, and 10000; DVS128 Gesture into sets of sizes 862, 215, and 264. The optimized accuracy is the one obtained on the validation set, and the final results of the best solution found are assessed on the testing set.
id: 389406ebc43f7a654e43da1c63d38a15 - page: 9
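The Tonic pipeline described above can be sketched as follows (a minimal, assumed reconstruction, not the paper's exact preprocessing script; the denoise value must match the dataset's timestamp units, and the frame axis order should be checked against the installed Tonic version):

import numpy as np
import tonic
import tonic.transforms as T

sensor_size = tonic.datasets.DVSGesture.sensor_size   # (128, 128, 2): H, W, polarity
frame_transform = T.Compose([
    # Temporal-neighborhood denoising; the paper states a 5000 ms neighborhood.
    T.Denoise(filter_time=5000),
    # Accumulate events into 100 frames.
    T.ToFrame(sensor_size=sensor_size, n_time_bins=100),
])
dataset = tonic.datasets.DVSGesture(save_to="data", train=True,
                                    transform=frame_transform)
frames, label = dataset[0]                 # roughly (100, 2, 128, 128): T, ON/OFF, H, W
frames = (frames > 0).astype(np.float32)   # overlapping spikes collapse to a single spike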
4.4 Hardware and software specifications
Long-run experiments were carried out on the GPU partition of the Jean Zay supercomputer. Each experiment lasted 100 hours; 15 Nvidia Tesla V100 GPUs with 32 GB of memory were dedicated to the computation of SNNs, and one additional GPU was used for the computations of SCBO (e.g., Gaussian processes). A single experiment therefore represents a total of 1600 GPU hours. The 16 GPUs are grouped into clusters of 4; each cluster contains 2 Intel Cascade Lake 6248 processors of 20 cores each and cumulates a total of 160 GB of RAM. The experiments were parallelized using OpenMPI interfaced through the Python library mpi4py. BindsNET is fully based on PyTorch, while LAVA-DL also compiles custom CUDA code; both can easily be run on Nvidia GPUs. SCBO was implemented and instantiated using Zellij and BoTorch.

5 Computational results on large-scale experiments
id: 0549da8d4f6c85089b9199c4f7eb022a - page: 9
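The asynchronous master-worker parallelization can be sketched with mpi4py as follows (a minimal illustration under assumptions: sample_configuration and train_snn are hypothetical stand-ins for SCBO's sampler and for one SNN training run; the paper's actual implementation lives in Zellij):

import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
TAG_WORK, TAG_RESULT, TAG_STOP = 1, 2, 3
N_EVALS = 20  # total evaluations (assumed >= number of workers)

def sample_configuration():
    # Hypothetical stand-in for SCBO's candidate sampler.
    return {"lr": 10 ** random.uniform(-4, -1), "threshold": random.uniform(0.1, 1.0)}

def train_snn(cfg):
    # Hypothetical stand-in for training one SNN and returning its accuracy.
    return {"cfg": cfg, "accuracy": random.random()}

if rank == 0:
    status = MPI.Status()
    workers = comm.Get_size() - 1
    sent = 0
    for w in range(1, workers + 1):          # seed every worker with one configuration
        comm.send(sample_configuration(), dest=w, tag=TAG_WORK)
        sent += 1
    results = []
    for _ in range(N_EVALS):
        res = comm.recv(source=MPI.ANY_SOURCE, tag=TAG_RESULT, status=status)
        results.append(res)
        if sent < N_EVALS:
            # Asynchronous: the freed worker gets new work immediately,
            # without waiting for the other workers to finish.
            comm.send(sample_configuration(), dest=status.Get_source(), tag=TAG_WORK)
            sent += 1
        else:
            comm.send(None, dest=status.Get_source(), tag=TAG_STOP)
    print("best accuracy:", max(r["accuracy"] for r in results))
else:
    while True:
        status = MPI.Status()
        cfg = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == TAG_STOP:
            break
        comm.send(train_snn(cfg), dest=0, tag=TAG_RESULT)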
5.1 Analysis of the HPO process
In the following lines, we analyze the impact of silent networks on SCBO. In Figures 7a to 7d, a single horizontal line corresponds to the starting and ending dates of the evaluation of a single SNN. This representation emphasizes the ability of SCBO to detect silent networks and to focus on fully trained SNNs with high accuracies. The observed drops in accuracy and computation time, for experiments 1 and 3, are explained by the reset of the trust region once it has shrunk to its limits. New random points are then sampled, computed, and added to the list of existing ones, and SCBO restarts the process. During experiment 1, one can see in Figure 3 that, although almost 73% of the evaluated networks were stopped, silent networks consumed only about 36% of the 1500 GPU hours. So, for experiment 1, the early stopping criterion and constraints worked. Indeed, Figure 7a emphasizes SCBO's focus on non-silent networks, resulting in high validation accuracies.
id: f9178ee77df06dfccd1d07dca6ff208c - page: 9
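A minimal sketch of a spike-based early stopping check (the paper's exact rule is not reproduced here; min_spikes and patience are assumed illustrative thresholds): training is aborted when the network emits too few output spikes over consecutive batches, marking it as a silent network.

def is_silent(spike_history, min_spikes=1, patience=3):
    """Return True if each of the last `patience` batches produced fewer
    than `min_spikes` output spikes (hypothetical thresholds)."""
    recent = spike_history[-patience:]
    return len(recent) == patience and all(c < min_spikes for c in recent)

# Usage inside a training loop:
spike_history = []
for batch_spike_count in [120, 4, 0, 0, 0]:   # toy per-batch output spike counts
    spike_history.append(batch_spike_count)
    if is_silent(spike_history):
        print("Silent network detected: early stop, report as infeasible.")
        break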
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "ogWkfNWVmD3980e90QXV1oncQ4GIn2WOOJluGVe3hTw", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "ogWkfNWVmD3980e90QXV1oncQ4GIn2WOOJluGVe3hTw", "level": 2}'