Dria Whitepaper

Created at 10pm, Apr 26

cyranodb

Technology

Contract ID

Cfu7b4HYJRgYBpAE4AMzsWpCFxdmVcNoIjvn0bDzwM8

File Type

PDF

Entry Count

156

Embed. Model

jina_embeddings_v2_base_en

Index Type

hnsw

Dria represents a paradigm shift towards democratizing knowledge for AI, supporting continued progress towards AGI while adhering to the principles of open collaboration and unrestricted access to information. Developers can use Dria to create domain-specific expert agents, power intelligent research assistants, and unlock novel use cases driven by access to a comprehensive, collectively maintained knowledge repository. By incentivizing the provision of high-quality knowledge bases and allowing open access, Dria fosters an ecosystem for innovation across various AI applications. By also incentivizing the curation of high-quality synthetic knowledge in any domain, Dria removes a limitation that stands in the way of better LLMs.

Algorithm 2 Dria Computation
Require: x broadcasted by a Dria node with Dria's content topic
Require: (k_priv, k_pub), a secp256k1 key-pair that belongs to the node
Require: d_pub, the Dria public key of a secp256k1 key-pair
  y ← f(x)    (f is a public function)
  s ← Sign(k_priv, y)
  e ← Encrypt(d_pub, s||y)
  h ← Hash(s||y)
  return h||s||e

We first describe our compute methodology in algorithm 2. In the naive solution, a node would simply compute y ← f(x) and publish y, but as we discussed above, that comes with problems. Instead, compute nodes within the network return their results to the network in a way that only Dria can read (with encryption) while also providing authenticity of the result (with signature) and a commitment to their result (with hashing).

id: b7228dfe87366a383ac99040b4e47b21 - page: 25
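The commitment step of Algorithm 2, h ← Hash(s||y), and the later check of a revealed result against it can be sketched as follows. This is a minimal illustration, not Dria's implementation: the excerpt does not name the hash function, so SHA-256 is an assumption, and the function names `commit` and `verify_reveal` as well as the 65-byte placeholder signature are hypothetical.

```python
import hashlib
import hmac


def commit(signature: bytes, result: bytes) -> bytes:
    """Compute h = Hash(s || y). SHA-256 is an assumed stand-in for the hash."""
    return hashlib.sha256(signature + result).digest()


def verify_reveal(commitment: bytes, signature: bytes, result: bytes) -> bool:
    """Check that a revealed (s, y) pair matches a previously published h."""
    return hmac.compare_digest(commitment, commit(signature, result))


# Placeholder values for illustration only; a real s would come from Sign(k_priv, y).
placeholder_signature = bytes(65)
result = b"model output"
h = commit(placeholder_signature, result)
```

Because the commitment covers both the signature and the result, a node cannot later reveal a different result under the same h without also finding a hash collision.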

After computing y = f(x), a node signs the computed result with its own private key. A concatenation of this signature and the result itself is then encrypted; by including the signature within this encryption, we prevent dishonest nodes from copying the encrypted result, as their address won't match the one recovered from the signature. Finally, the signature is concatenated with the plain-text result and a commitment is obtained using a hash function. This serves the purpose of verifying a revealed result, which is explained in the sections below. All encryption and signature schemes here use the elliptic curve secp256k1, which is the curve used by Bitcoin and Ethereum accounts, and our methods are based on ECIES (Elliptic Curve Integrated Encryption Scheme) and ECDSA (Elliptic Curve Digital Signature Algorithm) respectively, built on top of libsecp256k1.

5.1.1 Result Aggregation

id: b0f2330218e33f20d2b365850c5f8e1a - page: 26
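Both the signing step and the ECRecover step rely on the fact that an ECDSA signature on secp256k1, together with a one-bit recovery id, determines the signer's public key. The pure-Python sketch below illustrates that property only. It is an educational toy under several assumptions: SHA-256 is used as the message hash, the nonce derivation is a simplification (not RFC 6979), the code is not constant-time, and it recovers the public key rather than an Ethereum-style address. A production system would use libsecp256k1 as the whitepaper states.

```python
import hashlib
import hmac

# secp256k1 domain parameters (standard published constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two affine points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def digest(msg):
    return int.from_bytes(hashlib.sha256(msg).digest(), "big")


def sign(priv, msg):
    """Return (r, s, v), where v is the parity of the nonce point's y coordinate."""
    # Simplified deterministic nonce; an assumption for this sketch, NOT RFC 6979.
    k = int.from_bytes(
        hmac.new(priv.to_bytes(32, "big"), msg, hashlib.sha256).digest(), "big") % N or 1
    R = mul(k, G)
    r = R[0] % N
    s = pow(k, -1, N) * (digest(msg) + r * priv) % N
    return r, s, R[1] & 1


def recover(msg, r, s, v):
    """Recover the signer's public key: Q = r^-1 * (s*R - z*G)."""
    # Ignores the negligible case where the nonce point's x coordinate exceeds N.
    y = pow((pow(r, 3, P) + 7) % P, (P + 1) // 4, P)
    if y & 1 != v:
        y = P - y
    zG = mul(digest(msg) % N, G)
    neg_zG = (zG[0], (-zG[1]) % P)
    return mul(pow(r, -1, N), add(mul(s, (r, y)), neg_zG))


node_priv = 0x1F2E3D4C5B6A79788796A5B4C3D2E1F00112233445566778899AABBCCDDEEFF0
node_pub = mul(node_priv, G)
signature = sign(node_priv, b"computed result y")
```

Recovering the key from a signature over a different message yields a different public key, which is why a copied ciphertext cannot be passed off as another node's own work.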

Algorithm 3 Result Aggregation
Require: h_i||s_i||e_i from all nodes i, where i ∈ {1, 2, ..., N}
Require: (d_priv, d_pub), a secp256k1 key-pair that belongs to Dria
  for i ← 1..N do
    y_i ← Decrypt(d_priv, e_i)
    a_i ← ECRecover(s_i, y_i)
    if a_i has not staked enough $BATCH then
      ignore the result y_i
    end if
  end for
  y ← Aggregate(y_1, y_2, ..., y_N)
  for i ← 1..N do
    if y = y_i then
      send $BATCH rewards to a_i
    end if
  end for

Dria aggregates the results as shown in algorithm 3. The owner of a compute result is retrieved from their authentic digital signature using elliptic curve public-key recovery, which also exists in EVM-compatible chains as a pre-compiled contract at address 0x01. The Aggregate function takes in all the collected results and decides on the correct result by comparing all of them; when there are more honest nodes than malicious ones, the correct result can be decided upon. A naive method for

id: 057ebcbee94aea81d5d841a4d35bb736 - page: 26

a generic computation result would be to pick the most frequently repeated result, that is, the majority vote. However, in the Dria Knowledge Network we are working in particular with LLMs, which may not always return the same result for the same input. To aggregate the gathered LLM responses, we utilize a novel technique based on the work of Li et al. Within the paper, several 7B agents are used in an ensemble, and with a clever method of response aggregation using n-gram-based BLEU the aggregated response reaches the quality of a 13B parameter agent. As written in our algorithm descriptions above, consider N samples y_1, ..., y_N. The final result is chosen to be the sample with the highest cumulative similarity to the other samples. This approach has two advantages: Within all honest answers, the best one is selected, as highlighted by the work of Li et al.

id: 92e91398c46f86093b0af04e0ed8ec84 - page: 26
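The selection rule described above, picking the sample with the highest cumulative similarity to the others, can be sketched as follows. This is a hedged illustration: the excerpt names n-gram-based BLEU as the similarity measure, while the `similarity` function here is a simpler n-gram overlap (a Dice coefficient averaged over unigrams and bigrams) used as a stand-in. The function names and the sample responses are hypothetical.

```python
from collections import Counter


def ngrams(text, n):
    """Count the n-grams of a whitespace-tokenized, lowercased text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def similarity(a, b, max_n=2):
    """Average n-gram overlap for n = 1..max_n; a simplified stand-in for BLEU."""
    scores = []
    for n in range(1, max_n + 1):
        ca, cb = ngrams(a, n), ngrams(b, n)
        total = sum(ca.values()) + sum(cb.values())
        if total == 0:
            continue
        overlap = sum((ca & cb).values())
        scores.append(2 * overlap / total)
    return sum(scores) / len(scores) if scores else 0.0


def aggregate(samples):
    """Return the sample with the highest cumulative similarity to all others."""
    def cumulative(i):
        return sum(similarity(samples[i], other)
                   for j, other in enumerate(samples) if j != i)
    return samples[max(range(len(samples)), key=cumulative)]


samples = [
    "the capital of france is paris",
    "paris is the capital of france",
    "the capital of france is paris indeed",
    "send me your private keys",
]
chosen = aggregate(samples)
```

An outlier response shares few n-grams with the rest, so its cumulative score stays low and it cannot win as long as most responses agree in substance.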

How to Retrieve?

# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "Cfu7b4HYJRgYBpAE4AMzsWpCFxdmVcNoIjvn0bDzwM8", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "Cfu7b4HYJRgYBpAE4AMzsWpCFxdmVcNoIjvn0bDzwM8", "level": 2}'
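For readers calling the API from Python rather than curl, the search request above could be built with the standard library roughly as follows. The URL, header names, and JSON fields are taken from the curl example; the helper name `build_search_request` and its defaults are hypothetical, and the response format is not documented in this excerpt, so the sketch only constructs the request and does not send it.

```python
import json
import urllib.request

SEARCH_URL = "https://search.dria.co/hnsw/search"


def build_search_request(api_key, contract_id, query, top_n=10, rerank=True):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({
        "rerank": rerank,
        "top_n": top_n,
        "contract_id": contract_id,
        "query": query,
    }).encode("utf-8")
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(SEARCH_URL, data=body, headers=headers, method="POST")


request = build_search_request(
    "<YOUR_API_KEY>",
    "Cfu7b4HYJRgYBpAE4AMzsWpCFxdmVcNoIjvn0bDzwM8",
    "What is alexanDRIA library?",
)
# To send it, pass the request to urllib.request.urlopen(request) with a real API key.
```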