Created at 6am, Mar 27
Brain-grounding of semantic vectors improves neural decoding of visual stimuli
mSKDDHWdR9SKbt6PQrF61_ZfaEDxBZRihRivmFzefm8
File Type: PDF
Entry Count: 61
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

Shirin Vafaei 1, Ryohei Fukuma 1,2,3, Huixiang Yang 2, Haruhiko Kishima 1, Takufumi Yanagisawa 1,2,3

1 Department of Neurosurgery, Graduate School of Medicine, Osaka University, Suita, Japan
2 Institute for Advanced Co-Creation Studies, Osaka University, Suita, Japan
3 ATR Computational Neuroscience Laboratories, Seika-cho, Japan

Abstract

Developing algorithms for accurate and comprehensive neural decoding of mental contents is one of the long-cherished goals in neuroscience and brain-machine interfaces. Previous studies have demonstrated the feasibility of neural decoding by training machine learning models to map brain activity patterns onto a semantic vector representation of the stimuli. These vectors, hereafter referred to as pretrained feature vectors, are usually derived from semantic spaces based solely on image and/or text features; they may therefore have very different characteristics from how visual stimuli are represented in the human brain, which limits the ability of brain decoders to learn this mapping. To address this issue, we propose a representation learning framework, termed brain-grounding of semantic vectors, which fine-tunes pretrained feature vectors to better align with the neural representation of visual stimuli in the human brain. We trained this model with functional magnetic resonance imaging (fMRI) of 150 different visual stimulus categories, and then performed zero-shot brain decoding and identification analyses on 1) fMRI and 2) magnetoencephalography (MEG) data. Interestingly, we observed that using the brain-grounded vectors increases brain decoding and identification accuracy on brain data from both neuroimaging modalities. These findings underscore the potential of incorporating a richer array of brain-derived features to enhance the performance of brain decoding algorithms.
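To make the abstract's decode-then-identify setup concrete, here is a minimal Python sketch of the general idea: a regression decoder from brain activity to semantic vectors, followed by cosine-similarity identification among candidate categories. The array shapes, the choice of ridge regression, and the similarity measure are illustrative assumptions, not the authors' exact pipeline.

# Minimal sketch of zero-shot decoding/identification (assumptions, not the
# paper's implementation): ridge regression maps fMRI patterns to semantic
# vectors; an unseen sample is identified by cosine similarity to candidates.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical data: fMRI patterns (n_samples x n_voxels) and the matching
# semantic vectors (n_samples x n_dims), e.g. pretrained or brain-grounded.
X_train = rng.standard_normal((150, 5000))
Y_train = rng.standard_normal((150, 512))

decoder = Ridge(alpha=1.0).fit(X_train, Y_train)

# Zero-shot identification: decode a held-out pattern, then pick the
# candidate category whose vector is most cosine-similar to the prediction.
x_test = rng.standard_normal((1, 5000))
candidates = rng.standard_normal((50, 512))  # vectors of unseen categories

y_hat = decoder.predict(x_test)
sims = (candidates @ y_hat.T).ravel() / (
    np.linalg.norm(candidates, axis=1) * np.linalg.norm(y_hat) + 1e-8
)
predicted_category = int(np.argmax(sims))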

Figure 4. MEG decoding and identification results.
id: b8c364be04080ccda5f76231bf945aba - page: 7
Discussion

Here we proposed a simple framework for creating a new type of semantic space, in which the vectors carry features of how visual information is represented in the human brain. The framework works by aligning the geometry of concept representations in the human brain with the pretrained feature vectors derived from text- or image-processing tasks. We show that with this new semantic space, decoders can better learn to map neural activity patterns to their corresponding semantic vectors, which can then be identified, even for categories that were not used in the autoencoder's training procedure. Furthermore, we show that even though this new semantic space is created by leveraging brain activity patterns measured with fMRI, it can be used to decode/identify category representations in data obtained with other neuroimaging modalities, suggesting that grounding the original vectors using the second-order representation in human brain results in creating vectors
id: 4fbe0b8895d6344cb056c39516e0657c - page: 8
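The excerpt describes aligning the geometry of brain representations with the pretrained vectors via an autoencoder. Below is a hedged PyTorch sketch of one way such an objective could be written: an autoencoder that reconstructs the pretrained vectors while pulling the pairwise similarity structure of its latent codes toward that of the fMRI patterns. The architecture, the RSA-style similarity loss, and all hyperparameters are assumptions for illustration, not the paper's exact method.

# Hedged sketch of a "brain-grounding" objective (illustrative assumptions):
# reconstruct pretrained vectors while matching the latent codes' pairwise
# cosine-similarity matrix to that of the fMRI category patterns.
import torch
import torch.nn as nn

class GroundingAE(nn.Module):
    def __init__(self, dim=512, latent=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, latent), nn.ReLU(), nn.Linear(latent, latent))
        self.dec = nn.Sequential(nn.Linear(latent, latent), nn.ReLU(), nn.Linear(latent, dim))

    def forward(self, v):
        z = self.enc(v)
        return z, self.dec(z)

def pairwise_cos(a):
    a = nn.functional.normalize(a, dim=1)
    return a @ a.T

model = GroundingAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

vectors = torch.randn(150, 512)   # pretrained vectors, one per category
brain = torch.randn(150, 4000)    # matching fMRI patterns (hypothetical)
brain_sim = pairwise_cos(brain)   # target representational geometry

for _ in range(100):
    z, recon = model(vectors)
    loss = nn.functional.mse_loss(recon, vectors) \
         + nn.functional.mse_loss(pairwise_cos(z), brain_sim)
    opt.zero_grad(); loss.backward(); opt.step()

brain_grounded = model.enc(vectors).detach()  # "brain-grounded" vectors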
Recently, the spotlight in neuroscience and machine learning research has shifted toward developing multimodal learning models. For example, in a pioneering study, Karpathy & Fei-Fei (2015) showed that by aligning features extracted from convolutional neural networks trained to classify images with features from recurrent neural networks trained to generate text, they could greatly improve the quality of image caption generation. A more recent model, CLIP, is a text-image multimodal model that learns visual concepts through natural language supervision and has shown strong zero-shot capabilities compared to similar models (Radford et al., 2021). In neuroscience, several studies have directly used either human brain activity data or monkey brain activity
id: 21b60edf1d5b673ba2eae0a2bf5d6d78 - page: 8
, 2020) or
id: eef9151c601ec6faf534a2a5bdc3a35d - page: 8
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "mSKDDHWdR9SKbt6PQrF61_ZfaEDxBZRihRivmFzefm8", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "mSKDDHWdR9SKbt6PQrF61_ZfaEDxBZRihRivmFzefm8", "level": 2}'