Inference API
Describes the interfaces Valence exposes for using inference in smart contracts.
Use `npm i valence-inference-lib` to install our Solidity library.
Inference Precompile
Valence inference is provided through a standard interface that any smart contract can use. The inference is implemented by a custom precompile called `IValenceInference`.
The inference precompile and its functions are accessible at address `0x00000000000000000000000000000000000000F4`.
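For illustration, a contract can bind the interface to this fixed address. This is a minimal sketch: the real `IValenceInference` definition ships with valence-inference-lib, and the empty interface declaration here is only a placeholder.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Placeholder declaration for illustration only; the real interface is
// provided by valence-inference-lib (see runLlm and runModel below).
interface IValenceInference {}

contract InferenceConsumer {
    // Bind the interface to the fixed precompile address (0x...F4).
    IValenceInference internal constant VALENCE_INFERENCE =
        IValenceInference(address(0xF4));
}
```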
Valence exposes various types of inference, ranging from LLMs to classical ML models. It also supports multiple security techniques, such as ZKML and TEE inference, so developers can choose the methods that best fit their use case and requirements.
The two high-level functions exposed by the inference precompile are `runModel` and `runLlm`. `runLlm` is designed specifically for running large language models, whereas `runModel` is a generic method that can execute any type of AI or ML model.
LLM Inference Requests
The request and response types for LLMs (`runLlm`) are defined as follows. The main input is a prompt string, and the answer is likewise returned as a string. In addition, there are some common LLM parameters that can be tuned.
The list of LLMs you can use can be found in Supported LLMs.
To read more about `is_simulation_result`, please see Simulation Results.
Generic Inference Requests
For all other ML models, the input and response can take various shapes and forms (such as numbers and strings), so we built a flexible framework that lets you define any type of input your ONNX model expects. The input is made up of an array of number tensors and an array of string tensors. You only need to set the tensors that your model expects (e.g., you can leave the string tensors empty if your model only takes numbers).
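Under the same caveat, a sketch of what the `runModel` request and response types might look like; the tensor encoding shown here (a name plus a flat values array) and the model-identifier parameter are assumptions about how the framework could represent ONNX inputs and outputs.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Sketch only: the tensor layout is an assumed encoding.
struct NumberTensor {
    string name;    // input/output name the ONNX model expects
    int64[] values; // flattened numeric data (assumed representation)
}

struct StringTensor {
    string name;
    string[] values;
}

struct ModelInput {
    NumberTensor[] number_tensors; // leave empty if the model takes no numbers
    StringTensor[] string_tensors; // leave empty if the model takes no strings
}

struct ModelOutput {
    NumberTensor[] number_tensors;
    StringTensor[] string_tensors;
    bool is_simulation_result; // see Simulation Results below
}

interface IValenceInference {
    // The model identifier parameter is an assumption.
    function runModel(string memory model, ModelInput memory input)
        external
        returns (ModelOutput memory output);
}
```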
To read more about `is_simulation_result`, please see Simulation Results.
Simulation Results
Both `LlmResponse` and `ModelOutput` have a flag called `is_simulation_result` that indicates whether the returned result is "real" or not. As explained in Parallelized Inference Pre-Execution (PIPE), Valence transactions are executed in two phases. In the first phase, the transaction is executed in simulation mode to gather and execute all inference requests in the background. Once the results are ready, the transaction is re-executed with the actual inference results. `is_simulation_result` indicates whether the transaction is currently being executed in simulation mode. When it is set to `false`, the returned value comes from the model; when it is set to `true`, the value is empty and developers should explicitly handle this scenario in their code.
Transaction simulation results are never committed to the blockchain.
For example, the following self-contained sketch (reusing the assumed struct and interface definitions from above, with a hypothetical model id) skips result-dependent logic during the simulation phase:
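```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Assumed definitions, repeated from the sketches above.
struct LlmRequest {
    string model;
    string prompt;
    uint32 max_tokens;
    uint32 temperature;
}

struct LlmResponse {
    string answer;
    bool is_simulation_result;
}

interface IValenceInference {
    function runLlm(LlmRequest memory request)
        external
        returns (LlmResponse memory response);
}

contract InferenceExample {
    // The inference precompile lives at a fixed address (0x...F4).
    IValenceInference internal constant INFERENCE =
        IValenceInference(address(0xF4));

    string public lastAnswer;

    function ask(string memory prompt) external {
        LlmRequest memory req;
        req.model = "example-llm"; // hypothetical model id
        req.prompt = prompt;
        req.max_tokens = 256;

        LlmResponse memory res = INFERENCE.runLlm(req);

        if (res.is_simulation_result) {
            // Phase 1 (simulation): the answer is empty, so skip any
            // logic that depends on the real model output.
            return;
        }

        // Phase 2: the transaction is re-executed with the real result.
        lastAnswer = res.answer;
    }
}
```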