Parallelized Inference Pre-Execution (PIPE)
To maximize throughput and performance on the Vanna Network, we designed a novel EVM execution framework called Parallelized Inference Pre-Execution (PIPE). PIPE lets Vanna execute transactions and inferences efficiently and provides seamless interaction between model inference and the EVM.
One of Vanna's main goals is to support AI inference on the blockchain in a highly scalable way. Complex models, however, can be computationally expensive, which poses a risk when they are embedded in native blockchain transactions. Slow model inferences could severely degrade the performance of the network for all users.
To avoid this, Vanna uses a three-phase approach to execute inferences embedded in transactions.
In the first phase, Vanna runs every transaction through a simulator to find out which inference requests the transaction will make. In normal execution, when a transaction makes an inference request, the network would have to execute the inference synchronously, blocking the EVM and all other transactions. In the simulation, however, we only record the inference requests sent by each transaction; we do not execute them.
Once the simulation is done, the transaction is returned to the Vanna Node, along with all the inference requests that were made.
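The simulation phase can be pictured with a minimal Python sketch. It is not Vanna's implementation: `InferenceRequest`, `RecordingContext`, `simulate`, and the model names are hypothetical stand-ins, and a transaction is modeled as a plain function. The only point it illustrates is that the simulator records inference requests instead of running them.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class InferenceRequest:
    """Hypothetical record of one inference call made by a transaction."""
    model_id: str
    input_data: str


@dataclass
class RecordingContext:
    """Hypothetical stand-in for the simulator's inference hook."""
    recorded: List[InferenceRequest] = field(default_factory=list)

    def infer(self, model_id: str, input_data: str) -> None:
        # Record the request; no model is run during simulation.
        self.recorded.append(InferenceRequest(model_id, input_data))


def simulate(tx: Callable[[RecordingContext], None]) -> List[InferenceRequest]:
    """Run the transaction against the recording hook and return its requests."""
    ctx = RecordingContext()
    tx(ctx)
    return ctx.recorded


def example_tx(ctx: RecordingContext) -> None:
    # A transaction that asks two (made-up) models for a result.
    ctx.infer("price-model", "ETH/USD")
    ctx.infer("risk-model", "0xabc")


requests = simulate(example_tx)
```

After `simulate` returns, `requests` holds both recorded requests in call order, which is what would be handed back to the node together with the transaction.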
In the second phase, transactions and their inference requests are added to a special Mempool that sends off the requests to the Vanna Inference Nodes. Transactions are kept in the mempool while their corresponding requests are being executed. Once all requests made by a particular transaction are done, it can be submitted to be executed in the EVM.
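A rough sketch of the holding behavior described above, again in Python and under stated assumptions: `InferenceMempool`, `PendingTx`, and the request IDs are hypothetical, request IDs are assumed to be unique across transactions, and the call to `complete` stands in for a result arriving from an inference node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PendingTx:
    """Hypothetical mempool entry: a transaction plus its open requests."""
    tx_id: str
    outstanding: Set[str]                       # request IDs still running
    results: Dict[str, str] = field(default_factory=dict)


class InferenceMempool:
    """Holds transactions until all of their inference requests finish."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingTx] = {}
        self._tx_of_request: Dict[str, str] = {}
        self.ready: List[PendingTx] = []        # ready for EVM execution

    def add(self, tx_id: str, request_ids: List[str]) -> None:
        entry = PendingTx(tx_id, set(request_ids))
        if not entry.outstanding:
            # No inference requests: nothing to wait for.
            self.ready.append(entry)
            return
        self._pending[tx_id] = entry
        for request_id in request_ids:
            self._tx_of_request[request_id] = tx_id

    def complete(self, request_id: str, result: str) -> None:
        """Called when an inference node returns a result."""
        tx_id = self._tx_of_request.pop(request_id)
        entry = self._pending[tx_id]
        entry.results[request_id] = result
        entry.outstanding.discard(request_id)
        if not entry.outstanding:
            # Last request finished: release the transaction.
            self.ready.append(self._pending.pop(tx_id))


pool = InferenceMempool()
pool.add("tx-plain", [])            # makes no inference requests
pool.add("tx-slow", ["r1", "r2"])   # waits on two requests
pool.add("tx-fast", ["r3"])         # waits on one request
pool.complete("r3", "0.9")
pool.complete("r1", "a")
pool.complete("r2", "b")
order = [entry.tx_id for entry in pool.ready]
```

In this run the transaction without inference is released immediately, and `tx-fast` is released before `tx-slow` even though it was submitted later, because its single request finished first.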
Finally, in the third phase, we inject the inference results into the EVM so the transaction can read them directly, just like it would read any other variable. The transaction is then executed and committed to the blockchain. Because no actual inference takes place at this stage, execution is fast and has no impact on the overall performance of the network.
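To illustrate why this final step is cheap, here is a small Python sketch in which the inference call is a dictionary lookup over results computed earlier. `ReplayContext`, `execute`, the `(model_id, input_data)` key, and the model name are assumptions made for the example, not Vanna's actual interfaces.

```python
from typing import Callable, Dict, Tuple

# Hypothetical key identifying one inference request.
RequestKey = Tuple[str, str]  # (model_id, input_data)


class ReplayContext:
    """Hypothetical inference hook backed by precomputed results."""

    def __init__(self, results: Dict[RequestKey, str]) -> None:
        self._results = results

    def infer(self, model_id: str, input_data: str) -> str:
        # A plain lookup: no model runs during final execution.
        return self._results[(model_id, input_data)]


def execute(tx: Callable[[ReplayContext], str],
            results: Dict[RequestKey, str]) -> str:
    """Run the transaction with its inference results already injected."""
    return tx(ReplayContext(results))


def example_tx(ctx: ReplayContext) -> str:
    # The transaction reads the result like any other value.
    score = float(ctx.infer("risk-model", "0xabc"))
    return "approve" if score < 0.5 else "reject"


outcome = execute(example_tx, {("risk-model", "0xabc"): "0.2"})
```

The transaction code is the same shape as in a synchronous design; only the hook behind `infer` differs, which is consistent with the claim that smart contracts need no modification.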
The benefits of this approach are:
No single transaction can slow down other transactions or, even worse, the entire blockchain
Transactions that use faster models can be executed as soon as inference is completed
Transactions that do not use model inference at all are guaranteed to execute immediately
Network resources are used in the most efficient way
The network's performance can be scaled horizontally by running more AI inference nodes
No modifications are required to smart contracts; PIPE is implemented entirely at the execution layer
Alternative execution models that run inference synchronously inside a transaction suffer from slow and unreliable performance, and open up a new vector for DDoS attacks on the network.
To sum up, PIPE is one of the core technologies that allow Vanna to deliver seamless and performant AI inference on the blockchain.