ZK-Oracle WIKI

Project Background

1. Why do we need an Oracle?

Decentralized applications built on smart contracts have great potential and practical value because they execute automatically. Replacing centralized intermediaries with trustless smart contracts can bring disruptive innovation to traditional businesses. In the Web 2.0 era, centralized enterprises were the most profitable organizations in data storage, analysis and distribution. In the Web 3.0 era, decentralized networks can produce a paradigm shift, allowing users to regain ownership of their own data and real control over how they use the Internet.
However, decentralized applications that run in a trustless manner still rely on external data sources. Because of their consensus models, blockchains cannot import real-world data by actively initiating a network call. The consensus mechanism requires every node to reach the same state after processing the same transactions and blocks. Therefore, even if a contract could call an external data interface directly, different nodes might receive different responses, and the resulting data would be rejected because the nodes could not reach consensus on it. Smart contracts currently have no simple way to access reliable real-world data, which limits their application scenarios. Current decentralized applications still rely on centralized data services, which means that the security, privacy and decentralization of the oracle are questionable at the data level. The main concerns are the following:

a) Data Availability Issues

Smart contracts cannot obtain external data by themselves. Decentralized applications have no built-in query middle layer to receive and verify real-world data effectively and accurately. Until decentralized applications have a simple way to request and receive external data, developing and deploying oracle-based DApps (Decentralized Applications) will remain difficult.

b) Demand for Credible and Reliable Data

In a completely decentralized environment, economic incentives are essential to deter attacks on key data sources. Until a strong incentive mechanism ensures high-quality and reliable data, DApps will remain exposed to significant security risks.
For example, if an oracle manipulates the external data fed into a smart contract, it can determine the contract's output and behavior. If the oracle is compromised, then the smart contracts and every system that relies on them are compromised as well, which is an obvious weakness in the security of the entire DApp.
In summary, DApps need an oracle system that gives them reliable access to external, centralized data.

2. Centralized or Decentralized?

The workflow of an oracle can be summarized as follows: the data provider obtains external data through an off-chain API and transmits it to the on-chain oracle contract. The oracle contract verifies the data according to its built-in mechanism and provides the qualified data to the user's smart contract. The workflow looks simple, but it is very difficult to ensure throughout the process that the data meets the user’s need for trust while remaining as confidential and efficient as possible. This is the frontier that oracles need to explore.
The Workflow of Oracles
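The three steps of this workflow can be sketched in a few lines of Python. This is only an illustrative model, not the design of any real oracle: the class names, the `fetch_off_chain` stub, and the two checks used as the "built-in mechanism" (a provider whitelist and a value range) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class OracleContract:
    """Toy stand-in for the on-chain oracle contract (hypothetical)."""
    trusted_providers: Set[str]   # assumed check 1: known provider identities
    min_value: float              # assumed check 2: plausible value range
    max_value: float
    consumers: List[Callable[[str, float], None]] = field(default_factory=list)

    def register_consumer(self, callback: Callable[[str, float], None]) -> None:
        """A user contract registers a callback to receive qualified data."""
        self.consumers.append(callback)

    def submit(self, provider: str, key: str, value: float) -> bool:
        """Verify a submission and, if it qualifies, deliver it to consumers."""
        if provider not in self.trusted_providers:
            return False
        if not (self.min_value <= value <= self.max_value):
            return False
        for deliver in self.consumers:
            deliver(key, value)
        return True


@dataclass
class UserContract:
    """Toy stand-in for the user's smart contract."""
    prices: Dict[str, float] = field(default_factory=dict)

    def on_data(self, key: str, value: float) -> None:
        self.prices[key] = value


def fetch_off_chain(key: str) -> float:
    """Placeholder for an off-chain API call; returns a fixed sample value."""
    return {"ETH/USD": 2000.0}[key]


oracle = OracleContract({"provider-1"}, min_value=0.0, max_value=10_000.0)
user = UserContract()
oracle.register_consumer(user.on_data)

# Step 1: provider reads off-chain data. Steps 2-3: oracle verifies and delivers.
accepted = oracle.submit("provider-1", "ETH/USD", fetch_off_chain("ETH/USD"))
# A submission from an unknown provider is dropped by the assumed whitelist check.
rejected = oracle.submit("unknown-node", "ETH/USD", 1.0)
```

Note that both checks in this sketch verify the submitter and the shape of the value, not whether the value matches reality, which is exactly the trust gap the text describes.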
Existing oracle projects can be classified by their data verification method into two categories: centralized oracles and decentralized oracles. Because of the inherent disadvantages of centralized oracles, decentralization is becoming the main trend.

a) Centralized Oracle

The data of a centralized oracle comes from a third-party centralized organization. To guarantee the authenticity of the data, the organization must hold full authority over it, so the user's trust in the data becomes trust in the centralized organization.
A centralized oracle uses centralized control, which helps the on-chain contract obtain data quickly and efficiently. The main current answer to the trust problem caused by centralization is "authenticity proof technology". Take Provable (formerly Oraclize) as an example, a well-known project that provides centralized oracle services for Ethereum. Provable relies on Amazon AWS services and TLSNotary technology, and can attach to each returned result a proof that the data provided to the contract is the unmodified data from the data source at a certain point in time. Although authenticity proof technology can prove that the data provided to the contract matches the data at the source and has not been modified, it cannot easily verify whether the centralized data source itself is correct, so the mechanism still carries data source risk.
The Synthetix project once suffered heavy losses from this problem. On June 25, 2019, Synthetix Oracle, the oracle of the Synthetix system, supplied the Synthetix smart contracts with badly wrong external data. The KRW (Korean Won) price it reported was more than 1,000 times the actual price, and the error exposed Synthetix to losses of about $1 billion. Although the matter was later resolved through negotiation and Synthetix recovered most of the losses at some cost, the event exposed the fatal flaws of the centralized oracle.
In addition, a centralized mechanism has other obvious drawbacks, such as logic and functional defects in its own algorithms (for example, the TLSNotary algorithm used by Provable) and system downtime caused by a single point of failure. These potential risks are like ticking time bombs: once detonated, they pose a huge threat to the platform and to user assets.

b) Decentralized Oracle

For an oracle, the ultimate goal is safe and reliable data transmission onto the chain. The drawbacks of centralization prevent users from fully trusting the system and also introduce many security risks. As a result, decentralization has become the direction of exploration and the development trend for oracles.
Unlike centralized oracles, which provide data from centralized institutions, decentralized oracles follow the same operating logic as a distributed blockchain ledger and use multiple nodes to provide data at the same time. The mechanism and process of a decentralized oracle are therefore more transparent. It does not rely on any single data source, which eliminates the single point of failure, and its independent economic consensus model supports the stability and development of the system. The distributed oracle structure is thus expected to solve the single point of failure and user trust problems of centralized oracles, creating a trustless decentralized system.
Chainlink, a mainstream and very popular oracle project, is the first decentralized oracle solution on the Ethereum blockchain. In the early stage of the project, Chainlink's plan aimed to ensure the correctness of its data through a two-pronged approach of on-chain aggregation and on-chain governance. This scheme is simple and effective, but on-chain aggregation consumes excessive gas. As the project grows, more clients of the consensus oracle mean more on-chain transactions, and gas consumption inevitably rises sharply. In Chainlink's mid- and long-term plan, on-chain aggregation will gradually be replaced by off-chain aggregation, which is expected to resolve the expensive on-chain gas consumption.
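As a rough illustration of multi-node aggregation, the sketch below combines several node reports with a median, so that one faulty or malicious report cannot move the result on its own. The median rule, the `quorum` parameter, and the transaction-count model are assumptions chosen for the example; they are not taken from Chainlink's contracts.

```python
from statistics import median
from typing import Optional, Sequence


def aggregate(reports: Sequence[float], quorum: int = 3) -> Optional[float]:
    """Return the median of the node reports, or None if too few nodes answered.

    `quorum` is a hypothetical minimum number of reports per round.
    """
    if len(reports) < quorum:
        return None
    return median(reports)


def transactions_per_round(n_nodes: int, off_chain: bool) -> int:
    """Simplified cost model (assumption): with on-chain aggregation every
    node submits its own transaction; with off-chain aggregation the nodes
    agree first and a single combined report is submitted."""
    return 1 if off_chain else n_nodes


# Four honest reports and one wildly wrong report.
reports = [100.0, 101.0, 99.5, 100.5, 100_000.0]
answer = aggregate(reports)  # 100.5: the outlier does not shift the median
```

Under this simplified model, a round with 21 nodes costs 21 transactions when aggregated on chain and 1 when aggregated off chain, which is the gas pressure the paragraph above refers to.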
However, beyond the gas cost issue, the data verification mechanisms of Chainlink and DOS Network, another mainstream decentralized oracle project, carry a systemic risk. The risk lies in the cost of attack that results from the indirect verification mechanism and the node credit system. The contracts of these two projects verify only the identity and credit of the data uploader and rely on this indirect verification to ensure data accuracy. As a result, a node's attack cost depends almost entirely on its credit stake and does not change with the data it provides. Indirect verification improves the efficiency of the system, but it also gives malicious nodes a high chance of success. In small-scale, non-financial scenarios, the benefit of a malicious attack is small, so this simple verification mechanism poses almost no security risk. However, once the assets that depend on the provided data are worth much more than the nodes' staked funds, the profit from an attack far exceeds the credit at risk, and the incentive to attack increases greatly. This problem is inherent to indirect verification. Modifying only the surface logic (for example, tightening credit risk controls on nodes) cannot substantially eliminate the risk. Such a mechanism is therefore suitable only for small-scale, non-financial application scenarios and is difficult to apply to large-scale financial markets.
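The incentive argument above can be written as a simple comparison. The sketch below assumes, for illustration only, that a successful attack captures the full value that depends on the data and that the attacker then forfeits its entire stake; the function names and the numbers are hypothetical.

```python
def attack_profit(value_at_risk: float, stake: float) -> float:
    """Net gain of a malicious node under the assumptions above:
    it captures `value_at_risk` and loses its whole `stake`."""
    return value_at_risk - stake


def attack_is_rational(value_at_risk: float, stake: float) -> bool:
    """An attack pays off when the captured value exceeds the forfeited stake."""
    return attack_profit(value_at_risk, stake) > 0


stake = 10_000.0  # fixed stake: it does not grow with the value the data secures

small_scale = attack_is_rational(value_at_risk=500.0, stake=stake)        # False
large_scale = attack_is_rational(value_at_risk=1_000_000.0, stake=stake)  # True
```

Because the stake is fixed while the value that depends on the data can grow without limit, the inequality eventually flips in the attacker's favor, which is why the text calls the problem inherent to indirect verification.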
Before the DeFi hype, prediction markets were an early application of decentralized oracles. Augur and Gnosis are two well-known decentralized prediction market projects that use collective forecasting to predict real-world events. Ideally, voting rights are allocated among token holders, and a result must be agreed on by a majority of the votes. Augur and Gnosis handle low-frequency future events well, such as presidential election results and sports betting. However, for frequent, real-time and fast-changing events, problems such as the need for high user participation and long voting cycles are magnified. These problems expose the inefficiency of current decentralized oracles for large-scale, random events.
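A token-weighted majority vote of the kind described here might look like the following sketch. It is a generic illustration and does not reproduce the actual reporting or dispute rules of Augur or Gnosis; the `tally` function and the strict-majority rule over cast votes are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def tally(votes: Iterable[Tuple[str, float]]) -> Optional[str]:
    """Return the outcome backed by a strict majority of the cast token weight.

    Each vote is (outcome, token_weight). Returns None when nobody voted or
    when no outcome holds more than half of the cast weight.
    """
    totals: Dict[str, float] = defaultdict(float)
    for outcome, weight in votes:
        totals[outcome] += weight
    if not totals:
        return None
    cast = sum(totals.values())
    winner = max(totals, key=lambda outcome: totals[outcome])
    return winner if totals[winner] > cast / 2 else None


result = tally([("candidate-A", 60.0), ("candidate-B", 25.0), ("candidate-A", 15.0)])
undecided = tally([("yes", 50.0), ("no", 50.0)])
```

Every round of this process needs enough token holders to show up and vote, which is manageable for an election result but becomes a bottleneck for data that changes every few seconds.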
To sum up, distributed and decentralized technology is an effective way to solve the pain points of centralized oracle projects, and the market potential of combining it with oracles should not be underestimated. At present, most decentralized oracle projects on the market focus on DeFi, and the scale of funds they can safely support is not large enough; that is, their general applicability is limited. User privacy and security are also a noteworthy issue. Leakage of private user data is a fatal loophole, especially in financial and prediction markets: a user's personal information and operating behavior can reveal the user’s strategic intentions, which can have a huge impact on subsequent results.