funded

On-Chain Data Analytics

$30,940.00 Received
$92,820.00 Requested
Community Review Results (1 reviewer)
Solution

Develop an on-chain data analytics lab with high-density, high-quality data and tools for model validation and experimentation.

Problem:

DeFi protocols need pricing and asset information that is high density, historically complete, and tick by tick, along with analytical tools to work with it.

Yes Votes: ₳ 312,524,109
No Votes: ₳ 4,979,911
Votes Cast: 2235

This proposal was approved and funded by the Cardano Community via Project F7: Boosting Cardano's DeFi Catalyst funding round.

Detailed Plan

Problem Statement

DeFi protocols are set to bloom in 2022. However, DeFi protocols need pricing and asset information that is high density, historically complete, and tick by tick. This information is fundamental for the design of sound protocols, on-chain data analytics, and trading strategy validation, and it will result in wider adoption and a better perception of the Cardano DeFi ecosystem.

How does better trading help Cardano?

High-quality data allows for more sophisticated trading and sounder decision making, and better trading means more efficient markets. This data will support better economic planning, which in turn attracts more capital into the ecosystem. This new capital will provide economic fuel to further expand the utility of the Cardano platform and drive adoption.

As the Cardano ecosystem matures during 2022, demand for complete and high quality on-chain data will rise in response to work on sophisticated DeFi protocols. Users will need better data to make evidence driven decisions. Developers and researchers will also need complete data and tools to analyze and improve protocols.

A high-performance, live warehouse of Cardano on-chain financial data will address the challenges of data availability, quality, and performance. Additionally, an on-chain data lab will allow for easy hypothesis testing and interaction with the data, increasing the value and impact of this data many times over.

The Cardano protocol would significantly benefit from access to on-chain data warehouses and a suite of lab tools for analyzing and interfacing with the data. These instruments will enable protocol users to make data-driven decisions and aid in developing sophisticated financial tools.

Two high-impact use cases are the backtesting and implementation of trading systems. Backtesting is the process of evaluating trading hypotheses against historical data, repeated many times over. Backtesting on top of high-performance data systems makes it possible to formulate and test trading strategies efficiently. Once a strategy is formulated, it can be implemented. On the development side, this enables the creation of data-heavy protocols such as synthetics, financial derivatives, and a range of other instruments. On the user side, it allows users to implement and follow trading systems, both automated and manual.
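
To make the backtesting idea concrete, here is a minimal sketch in Python. It is not part of the proposed deliverables: the moving-average crossover rule, the function names, and the price series are all assumptions chosen for illustration, and a real backtest would read historical ticks from the warehouse and account for fees and slippage.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BacktestResult:
    final_value: float  # portfolio value at the last tick
    trades: int         # number of buys and sells executed


def moving_average(prices: Sequence[float], window: int, end: int) -> float:
    """Mean of the `window` prices ending at index `end` (inclusive)."""
    start = end - window + 1
    return sum(prices[start:end + 1]) / window


def backtest_crossover(prices: Sequence[float], fast: int = 2, slow: int = 4,
                       starting_cash: float = 1000.0) -> BacktestResult:
    """Hypothetical rule: hold the asset while the fast average is above
    the slow average, otherwise hold cash. Fees are ignored."""
    if not 0 < fast < slow:
        raise ValueError("need 0 < fast < slow")
    cash, units, trades = starting_cash, 0.0, 0
    # Start once enough history exists to compute the slow average.
    for i in range(slow - 1, len(prices)):
        price = prices[i]
        bullish = moving_average(prices, fast, i) > moving_average(prices, slow, i)
        if bullish and units == 0.0:
            units, cash = cash / price, 0.0   # buy with all available cash
            trades += 1
        elif not bullish and units > 0.0:
            cash, units = units * price, 0.0  # sell the whole position
            trades += 1
    final_price = prices[-1] if prices else 0.0
    return BacktestResult(final_value=cash + units * final_price, trades=trades)


# Illustrative price series only; not real market data.
sample_prices = [10, 10, 10, 10, 12, 15, 18, 20, 16, 13]
result = backtest_crossover(sample_prices)
```

Rerunning the same function with different `fast` and `slow` windows over the same history is the "many times over" part of backtesting, which is why fast access to complete historical data matters.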

Additional Challenge: Live Data Feeds

A data warehousing system provides access to quality historical data and the tools to work with it. This can be further augmented by creating live feeds from on-chain data. These feeds can be subscribed to, which allows near real-time interaction and derivation of new metrics from this data.

A separate tool can allow for the creation of specialized data feeds. Users can mint a new feed by defining a function taking inputs and producing output, which can then be subscribed to.
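
As a rough illustration of defining a feed as a function of other feeds, the sketch below uses a small in-memory publish/subscribe model in Python. The `Feed` class, the `derive_feed` helper, and the pair names are hypothetical and exist only for this example; the proposal does not specify an API.

```python
from typing import Callable, Dict, List

Subscriber = Callable[[float], None]


class Feed:
    """A named stream of values that callers can subscribe to (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self._subscribers.append(callback)

    def publish(self, value: float) -> None:
        for callback in self._subscribers:
            callback(value)


def derive_feed(name: str, inputs: List[Feed], fn: Callable[..., float]) -> Feed:
    """Create a feed whose value is fn applied to the latest value of each input."""
    derived = Feed(name)
    latest: Dict[int, float] = {}

    def make_handler(index: int) -> Subscriber:
        def handler(value: float) -> None:
            latest[index] = value
            # Publish only once every input feed has produced a value.
            if len(latest) == len(inputs):
                derived.publish(fn(*(latest[i] for i in range(len(inputs)))))
        return handler

    for index, feed in enumerate(inputs):
        feed.subscribe(make_handler(index))
    return derived


# Usage with made-up pair names and values.
token_ada = Feed("TOKEN/ADA")
ada_usd = Feed("ADA/USD")
token_usd = derive_feed("TOKEN/USD", [token_ada, ada_usd], lambda t, a: t * a)

seen: List[float] = []
token_usd.subscribe(seen.append)
token_ada.publish(2.0)  # no output yet: ADA/USD has not published
ada_usd.publish(0.5)    # derived feed publishes 2.0 * 0.5
ada_usd.publish(1.5)    # derived feed publishes 2.0 * 1.5
```

Because a derived feed is itself a `Feed`, it can be used as an input to further derived feeds.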

Additional Challenge: Semantic Model of On-Chain Data

Data can be provided via an API, which usually requires complex documentation.

Alternatively, if time and budget allow, an intuitive semantic model can be built to guide the user in data exploration. Semantic models remove the burden of ensuring that data is queried correctly and let the user focus on the task at hand. Developing a semantic data model will boost user productivity and make queries less prone to error.
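
One way to picture a semantic model is as a thin layer that maps user-facing names to the underlying query details. The Python sketch below is an assumption for illustration only: the measure and dimension names, the `trades` table, and its columns are invented here and do not describe the actual warehouse schema.

```python
from typing import Dict, List

# Hypothetical semantic layer: friendly names mapped to SQL fragments.
# The table and column names are invented for this sketch.
MEASURES: Dict[str, str] = {
    "trade_count": "COUNT(*)",
    "volume": "SUM(amount)",
    "average_price": "AVG(price)",
}
DIMENSIONS: Dict[str, str] = {
    "token": "token_name",
    "day": "DATE(block_time)",
}


def build_query(measure: str, by: List[str], table: str = "trades") -> str:
    """Translate a semantic request into a SQL string, rejecting unknown names."""
    if measure not in MEASURES:
        raise KeyError(f"unknown measure: {measure}")
    unknown = [name for name in by if name not in DIMENSIONS]
    if unknown:
        raise KeyError(f"unknown dimensions: {unknown}")
    columns = [f"{DIMENSIONS[name]} AS {name}" for name in by]
    select = ", ".join(columns + [f"{MEASURES[measure]} AS {measure}"])
    query = f"SELECT {select} FROM {table}"
    if by:
        query += " GROUP BY " + ", ".join(DIMENSIONS[name] for name in by)
    return query


query = build_query("volume", ["token", "day"])
```

The user asks for "volume by token and day" and never has to know how those terms map to columns, which is the error-reduction benefit described above.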

We propose

1. Creation of a high performance and high availability data warehousing solution. We have experience building data lake solutions at a scale of hundreds of terabytes per day in mission critical traditional finance applications. We'll apply this expertise to design architecture for an on-chain data warehouse and share it with the community.

2. We will deploy and maintain the infrastructure for this on-chain data warehousing solution.

3. We will develop live feeds to the data warehouse from the Cardano blockchain using a custom high-performance Cardano Node connector.

4. We will design an open source stack solution and develop integration libraries for performing data analytics without leaving the browser.

5. (stretch goal) We will provision live data feeds, allowing subscriptions to live on-chain data feeds and feeds of derivative data. This, combined with data warehousing, will enable the creation of a series of analytical tools, dashboards, and oracles.

6. (stretch goal) We will design and implement a semantic model for querying on-chain data. This will make working with the data warehouse and live feeds intuitive and increase the productivity of users and protocol developers.

7. We will implement a free to use model for typical blockchain users and a fair charge (to cover infrastructure costs) for large consumers of data.

We will deliver a guide on how such systems are built, a publicly available data warehouse, an on-chain data analytics lab that lets users interact with the data from a browser, and live data feeds (stretch goal).

Use of the Protocol Beyond Trading

We assume there will be a broader use of the provided data and tools beyond trading:

• Powering analysis of the chain, user and dApp activity for education, social commentary, and price analysis.

• Validation of assumptions and hypothesis testing by protocol designers and developers.

• Creation of dashboards for Cardano blockchain and native token statistics.

• … and many more.

Relevant Experience

Jarek Hirniak has over 8 years of experience leading multiple development teams in traditional financial institutions like UBS and Citadel Securities. He finished the Plutus Pioneers program (NFT: https://pool.pm/d068fe47123ec4c86460eeb74c7d7765c67d2df295a3ac86d664ed45.PlutusFirstClassPhoto438) and has been programming in Haskell for 8 years. He holds a Certificate in Quantitative Finance (CQF) and a Master of Informatics (MInf) with specialization in distributed systems and formal methods from the University of Edinburgh. At Microsoft Research he designed and implemented novel distributed machine learning algorithms responsible for processing exabytes of Office and Windows telemetry.

<https://www.linkedin.com/in/hirniak/>

Tony Morris, the technical lead, has been programming with Haskell for 18 years. He is the course developer and former coordinator/lecturer for Functional & Logic Programming at the University of Queensland, which focuses on Haskell. Tony has been using functional programming software development techniques within industry teams for 15 years, with a view to improving the reliability and time to market of software and leading other engineers to achieve their business goals. He contributes to over 300 open-source software projects, all of which use functional programming (FP) software engineering methods across several programming languages.

<https://www.linkedin.com/in/tony-morris-1961a02/>

PEH Zheng Yan, software engineer, has 5 years of technical experience across finance, property, and airlines. He is an active speaker at local functional programming clubs in Malaysia, where he often delivers talks about functional programming and software architecture. He has experience delivering performant and safe software in multidisciplinary global teams.

<https://www.linkedin.com/in/zheng-yan-peh-947a7a117/>

Key Metrics of this project related to the Catalyst Fund 7 Goals

Here is what we foresee as relevant, in order of impact on the project:

Fund Intention: F7: Boosting Cardano's DeFi

"How can we encourage DeFi teams to build/deploy open finance solutions on Cardano in the next 6 months?"

Project Impact: High

1. Data quality, in terms of granularity and dependability, is vital for researching and implementing portfolio management strategies. Bringing portfolio managers to the blockchain will unlock a significant source of total value locked for the entire ecosystem.

2. For the Cardano DeFi ecosystem to mature, and for DeFi to continue its evolution from the simple methods of DeFi 1.0 to the smart money of DeFi 2.0, analytical data is required. It will enable users to make educated, evidence-based decisions and allow protocol implementers to design and build the next generation of DeFi protocols.

3. High-quality research and data are associated with brand perception in traditional finance; institutions proudly display how their market research scores compare with those of other institutions. Adding a product that is highly regarded by traditional investment circles adds credibility to the blockchain, which in turn helps onboard retail and large institutional traders.

Definition of Success After 3, 6, and 12 Months (Milestones)

3 Months: Detailed architecture will be designed and provisioned.

6 Months: A high performance Cardano Node data connector will be developed to feed data into the warehouse. First iteration of the lab platform and API will be built.

12 Months: Stretch goals of live data feeds and semantic models will be attempted. Documentation will be completed via GitBook and the solution released to the community.

Expected Launch Date: August 2022 (launch), December 2022 (completion)

Budget: $92,820

$12,000

120 hours of front-end development, at $100 per hour with tax.

$37,500

250 hours of Haskell back-end and on-chain development, at $150 per hour with tax.

$40,320

Warehouse hosting for 6 months, then self-funding from high-volume consumers: approximately 7 instances at about $640 each per calendar month, totaling $4,480 per month and $40,320 for 9 months.

$3,000

Other general costs: graphic design, software licenses (e.g., JavaScript visualization libraries), outreach, growth, and integration.
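
The line items above can be checked against the requested total with a few lines of arithmetic. This is only a sanity check in Python; the variable names are labels chosen for this sketch, and the hosting figure uses the 9 months that correspond to the stated $40,320.

```python
# Line items as stated in the budget above.
FRONT_END = 120 * 100              # 120 hours at $100 per hour
BACK_END = 250 * 150               # 250 hours at $150 per hour
HOSTING_PER_MONTH = 7 * 640        # 7 instances at about $640 each
HOSTING = HOSTING_PER_MONTH * 9    # 9 months, matching the $40,320 line item
OTHER = 3_000                      # other general costs

TOTAL = FRONT_END + BACK_END + HOSTING + OTHER
print(f"Total budget: ${TOTAL:,}")
```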

KPIs (Key Performance Indicators)

  • Number of data feeds.
  • MAU (Monthly Active Users).
  • Monthly API queries.
  • Monthly lab analyses.

Monthly Reports

We're code complete. The only remaining work is to publish to GitHub.

Disbursed to Date: $30,940
Status: Still in progress
Completion Target: 5/31/2022