over budget

Cardano Risk Assessment Tool

$68,000.00 Requested
Solution:

Develop a Risk Assessment Oracle: a public web service that predicts the fraud risk of a given wallet, i.e. the likelihood that it belongs to a fraudster.

Problem:

DeFi markets require a method to judge the trustworthiness of a wallet that is requesting funds.

Yes Votes:
₳ 103,705,959
No Votes:
₳ 9,497,496
Votes Cast:
523


Detailed Plan

<u>Background</u>

Credit fraud and scams are common throughout all financial systems - including DeFi.

Risk management firm Elliptic estimates that losses to DeFi theft and fraud rose from $1.5B to $10.5B in the last year (Decrypt article link below).

Given DeFi's decentralised, unregulated and potentially anonymous systems of trade, recovering funds lost to fraud is challenging. This poses a significant risk to the uptake of DeFi generally.

In response, the team proposes to develop the Cardano Risk Assessment Tool, which will analyse and judge the trustworthiness of a wallet that is requesting funds.

Our hope is that this tool will be used to add a measurable layer of confidence and trust to all parties entering into a Cardano DeFi transaction.

Decrypt article https://decrypt.co/86503/defi-users-lost-billion-theft-fraud-2021-mostly-ethereum-report

<u>The Plan</u>

The Cardano Risk Assessment Tool will house a machine-learning model that statistically analyses the transaction behaviour, wallet network and metadata of a given address in order to predict fraud.

The model's output will be summarised as a Risk Score and provided via an API and visually on a public web resource with data visualisations displaying the primary drivers of fraud risk.
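To make the intended output more concrete, the following is a minimal sketch of what a Risk Score response could look like. The proposal does not define a schema, so the class names, the field names (`address`, `risk_score`, `drivers`), the 0-1 score scale and the example driver names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class RiskDriver:
    # Hypothetical: a named model feature and its contribution to the score.
    name: str
    contribution: float


@dataclass
class RiskAssessment:
    # Hypothetical response shape; the proposal does not specify one.
    address: str
    risk_score: float  # assumed to be a probability-like value in [0, 1]
    drivers: list = field(default_factory=list)

    def top_drivers(self, n=3):
        """Return the n drivers with the largest absolute contribution."""
        ranked = sorted(self.drivers, key=lambda d: abs(d.contribution), reverse=True)
        return ranked[:n]

    def to_json(self):
        """Serialise the assessment as an API might return it."""
        return json.dumps(asdict(self), sort_keys=True)


assessment = RiskAssessment(
    address="addr1_example",  # placeholder, not a real address
    risk_score=0.82,
    drivers=[
        RiskDriver("wallet_age_days", 0.10),
        RiskDriver("links_to_known_scam_wallets", 0.45),
        RiskDriver("tx_count", -0.05),
    ],
)
print(assessment.to_json())
print([d.name for d in assessment.top_drivers(2)])
```

Returning the drivers alongside the score mirrors the stated aim of showing the primary drivers of fraud risk, rather than only a single number.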

For this initial development, the model will be trained on fraud data from established, valuable resources in the Cardano community, such as the Cardano Phishing Bot (https://twitter.com/CardanoPhishing).

As a part of this project, the team will engage the community to seek out other worthwhile fraud and risk data collaborations.

The team is acutely aware that for the community to trust the tool, its output must be publicly verifiable as strong and predictive. As such, the team will register the Oracle and regularly report on its performance using the Croesus protocol - allowing for transparent, public monitoring of the service and of the accuracy of the model's predictions.

The Croesus protocol for Oracle and Model performance monitoring was delivered by the team in the Fund 5 and Fund 6 Metadata Challenges.

Croesus links:

https://www.croesus-blockchain.com

https://croesus-blockchain.github.io/#fund-7-proposal---cardano-risk-assessment-tool

<u>The Team</u>

Each team member brings over a decade of experience as a Data Science and Machine Learning professional working in credit and fraud risk for numerous credit bureaux and credit providers in Australia and New Zealand.

Each team member has been responsible for building, assessing, deploying and monitoring credit risk, fraud, sentiment analysis, graph analytics and propensity models.

This is the team's third Catalyst proposal (hopefully the third successful proposal!)

Team members have sufficient experience to deliver all proposal deliverables for a public launch date within four calendar months without extra resourcing.

Team Members

Michael https://www.linkedin.com/in/michael-hodder-417a4ba7

Steve https://www.linkedin.com/in/steve-pirois-8b373593

Thushare ('t') https://www.linkedin.com/in/thushare

Each team member has experience standing up Cardano testnet / mainnet nodes and submitting metadata transactions.

Team members have been part of the first Plutus Pioneer cohort ( https://pool.pm/d068fe47123ec4c86460eeb74c7d7765c67d2df295a3ac86d664ed45.PlutusFirstClassPhoto192 ) and are currently participating in the first Atala PRISM Pioneer cohort.

<u>Deliverables</u>

1) Cardano Risk Assessment Tool

A public oracle where a wallet address may be entered and a risk score and its key drivers are returned - available via GUI or API. The GUI will show key wallet risk insights as well as the network of related wallets, with the risk characteristics of the entire network.

2) Croesus ML-model registration

Registration of the ML risk model underpinning the Risk Assessment Tool via the Croesus protocol on the Cardano blockchain.

This will allow performance monitoring of the oracle and the model's predictions.

3) Atala PRISM Extension

As part of the Atala PRISM Pioneer program, prepare a report assessing whether the risk assessment tool performs better when analysing DIDs (Decentralised Identifiers) instead of Wallet Addresses.

<u>Project scope</u>

The following scope breaks the project into its components and provides an estimate of the days of effort required to deliver each.

Data Staging (2-3 days)

  • Hosting and streaming the Cardano blockchain on a Spark cluster (e.g. Google's DataProc) via DB Sync and GraphQL.
  • This will enable statistical modelling of wallets, transactions and metadata to develop the machine learning model that calculates the Cardano Risk Assessment Tool's risk score.

Outcome exploration and validation (5-10 days)

  • Engaging the community seeking examples of wallets and transactions that are known to be scams, fraud and high risk.
  • Validating any proposed outcome sample for its applicability in modelling.
  • As a baseline outcome set, wallets from the Cardano Phishing Bot (https://twitter.com/CardanoPhishing) will be tagged as scam outcomes for modelling.

Outcome definition and framework (4-5 days)

  • Implementing the final, validated outcome logic.
  • Piping external data sources so the Cardano Risk Assessment Tool can receive and learn/model new fraud outcomes over time.

Machine Learning model development (42-55 days)

  • Data integrity (5-10 days)
  • Sample construction (2-3 days)
  • Feature engineering - wallet metadata, transaction history and network graphs (8-12 days)
  • Modelling (12-15 days)
  • Validation (10 days)
  • Learning architecture (5 days)
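The feature engineering step can be illustrated with a small sketch. It assumes transactions have already been flattened into simple (sender, receiver, amount) records, which is a simplification of Cardano's UTXO model, and the feature names and toy data are hypothetical rather than the team's actual feature set.

```python
from collections import defaultdict

# Toy transactions: (sender, receiver, amount in ADA). Illustrative data only.
transactions = [
    ("w1", "w2", 100.0),
    ("w1", "w3", 50.0),
    ("w2", "w3", 25.0),
    ("w4", "w3", 500.0),
]
known_scam_wallets = {"w3"}  # e.g. tagged from a community fraud list


def wallet_features(transactions, known_scam_wallets):
    """Aggregate simple behavioural and network features per wallet."""
    sent = defaultdict(float)
    received = defaultdict(float)
    tx_count = defaultdict(int)
    counterparties = defaultdict(set)
    for sender, receiver, amount in transactions:
        sent[sender] += amount
        received[receiver] += amount
        tx_count[sender] += 1
        tx_count[receiver] += 1
        counterparties[sender].add(receiver)
        counterparties[receiver].add(sender)

    features = {}
    for wallet in tx_count:
        peers = counterparties[wallet]
        scam_peers = peers & known_scam_wallets
        features[wallet] = {
            "tx_count": tx_count[wallet],
            "total_sent": sent[wallet],
            "total_received": received[wallet],
            "n_counterparties": len(peers),
            # Network feature: share of direct counterparties tagged as scams.
            "scam_peer_share": len(scam_peers) / len(peers),
        }
    return features


features = wallet_features(transactions, known_scam_wallets)
for wallet, values in sorted(features.items()):
    print(wallet, values)
```

A table like `features` is the kind of input a fraud model would be trained on, with the tagged scam wallets supplying the outcome labels.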

Model deployment (8 days)

  • Model deployment and oracle construction.
  • Croesus registration.
  • Croesus batch reporting mechanism.

Insights, visualisation and delivery (8 days)

  • Risk score, network graph and key fraud characteristics.
  • Website development.
  • Reporting deployment.

Effort estimate: 65-98 days

The team proposes to deliver the above scope in four elapsed months.
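As a cross-check, the itemised ranges in the scope above can be summed in a few lines. The figures are copied from the scope list and the labels are shortened. On these figures the itemised components total 69-89 days, which falls inside the 65-98 day headline estimate.

```python
# (low, high) day estimates copied from the scope list above.
components = {
    "Data staging": (2, 3),
    "Outcome exploration and validation": (5, 10),
    "Outcome definition and framework": (4, 5),
    "ML: data integrity": (5, 10),
    "ML: sample construction": (2, 3),
    "ML: feature engineering": (8, 12),
    "ML: modelling": (12, 15),
    "ML: validation": (10, 10),
    "ML: learning architecture": (5, 5),
    "Model deployment": (8, 8),
    "Insights, visualisation and delivery": (8, 8),
}

low = sum(lo for lo, _ in components.values())
high = sum(hi for _, hi in components.values())

# Subtotal for the machine learning block only.
ml_low = sum(lo for name, (lo, _) in components.items() if name.startswith("ML:"))
ml_high = sum(hi for name, (_, hi) in components.items() if name.startswith("ML:"))

print(f"ML model development: {ml_low}-{ml_high} days")
print(f"Itemised total: {low}-{high} days")
```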

<u>Funding</u>

Total: $68,000

Breakdown:

  • $60,000 Developer costs (1x Machine Learning Engineer, 2x Data Scientist - blended day rate of ~$800/day is aligned with local A/NZ market)
  • $5,000 for Spark/ML modelling resources throughout development.
  • $3,000 for Cardano Risk Assessment Tool hosting for a year, ML model refitting resources and maintaining a database of scores and results.
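The breakdown can be checked with simple arithmetic. The sketch below uses only the figures quoted above and treats the approximate blended day rate as exactly $800, which is an assumption. At that rate the developer budget covers about 75 person-days, which sits within the effort estimate in the project scope.

```python
developer_cost = 60_000       # USD, developer costs
modelling_resources = 5_000   # USD, Spark/ML modelling resources
hosting = 3_000               # USD, hosting, refitting and score database
blended_day_rate = 800        # USD/day, approximate per the proposal

total = developer_cost + modelling_resources + hosting
implied_days = developer_cost / blended_day_rate

print(f"Total requested: ${total:,}")
print(f"Developer days funded at the blended rate: {implied_days:.0f}")
```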

<u>Metrics / KPIs</u>

  • Number of wallets assessed using the tool on Cardano mainnet.
  • Number of DeFi dApps using the tool on Cardano mainnet within 3, 6 and 12 months after funding.
  • Croesus predictive model monitoring - development and validation set discrimination statistics (e.g. Gini, Detection Rate and False-Positive Rate) for the ML fraud model.
  • Number of wallets assessed as being likely fraudulent.
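The discrimination statistics named above can be computed as in the sketch below. It uses toy labels and scores, takes Gini as 2 × AUC − 1 with the AUC calculated from pairwise comparisons, and uses a 0.5 threshold for the rate calculations. The data, function names and threshold are illustrative assumptions, not the team's reporting code.

```python
def confusion_rates(labels, scores, threshold):
    """Detection rate (true-positive rate) and false-positive rate at a threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def gini(labels, scores):
    """Gini coefficient as 2 * AUC - 1, with AUC from pairwise comparisons."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A fraud wallet scored above a non-fraud wallet counts as a win; ties count half.
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1


# Toy validation set: 1 = known fraud wallet, 0 = not known to be fraud.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

detection_rate, false_positive_rate = confusion_rates(labels, scores, threshold=0.5)
print(f"Detection rate: {detection_rate:.3f}")
print(f"False-positive rate: {false_positive_rate:.3f}")
print(f"Gini: {gini(labels, scores):.3f}")
```

Reporting the same statistics on both the development and validation samples shows whether the model's discrimination holds up on data it was not fitted to.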

<u>Risks and Challenges</u>

Fraud outcomes for modelling

The fraud data sourced for modelling is crucial to the performance of the Cardano Risk Assessment Tool. The team believes that data from the Cardano Phishing Bot (https://twitter.com/CardanoPhishing) will provide an invaluable baseline, but more data sources will increase the value of the tool overall. The team has intentionally allocated significant time to engage the community and validate potential fraud data sets.

New wallets with limited data

Wallets without sufficient data to analyse may make it difficult for the tool to produce a Risk Score. Analysis will confirm whether associated network analytics can solve the problem of single-transaction wallets and sparse metadata.

Nonetheless, the team believes the lack of data associated with a wallet is itself a valuable insight when considering the trustworthiness (or risk) of a wallet requesting funds - and usually carries a higher risk premium. This is a simple example of how the tool's Risk Score may be used as an input when pricing DeFi transactions.
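The idea of a sparse-data wallet carrying a higher risk premium can be sketched as follows. Everything here is hypothetical: the transaction-count cut-off, the conservative fallback score, the linear premium and the rates are placeholders, and how a lender prices risk would be up to the user of the tool.

```python
MIN_TX_FOR_MODEL = 5    # hypothetical cut-off for "sufficient data"
THIN_FILE_SCORE = 0.5   # hypothetical conservative score for sparse wallets


def effective_risk_score(model_score, tx_count):
    """Floor the score at a conservative value when the wallet has little history."""
    if tx_count < MIN_TX_FOR_MODEL:
        return max(model_score, THIN_FILE_SCORE)
    return model_score


def quoted_rate(base_rate, risk_score, max_premium=0.10):
    """Add a premium that scales linearly with the risk score."""
    return base_rate + max_premium * risk_score


# Same model score, different amounts of history.
established = quoted_rate(0.05, effective_risk_score(0.1, tx_count=200))
new_wallet = quoted_rate(0.05, effective_risk_score(0.1, tx_count=1))

print(f"Established wallet rate: {established:.3f}")
print(f"New wallet rate: {new_wallet:.3f}")
```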

<u>Clarifying questions/answers from comments</u>

@seyahn asked:

Oh, I love this. What a great idea. I'm struggling to come up with tough questions because everything here is so thorough too. Best I can come up with might be important though. What kind of safeguards would be in place for the other kind of risk: that your tool might misidentify otherwise innocent people as scammers? How would you even know if the tool is making this kind of mistake?

@t answered:

Fantastic questions! Thanks for posing these questions as they are crucial considerations for any developers of fraud models, but also for users of this service.

From a statistical modelling perspective, well-developed fraud models prioritise precision and the minimisation of false positive responses (misclassifying innocent people as scammers).

Implementing these requirements in the modelling process takes data science and machine learning expertise, and this is where we feel our experience building these models for credit bureaux is valuable. Much of the time in the validation phase of model development will be spent analysing false positives and fine-tuning the model to improve precision.

Furthermore, we will publicly report on the model's predictive power, accuracy and precision via its Croesus registration on the Cardano blockchain (links above in proposal). This way users of the tool can understand how good (or bad) the tool's prediction is.

There is also another important distinction to clarify:

This tool will provide a score which indicates fraud risk, which may be high or low, but it will not classify wallets as 'scam' or 'not scam'. As all parties in a transaction have their own appetite for risk, it is up to the user of the tool to consume the information and then ultimately decide upon their next action. Some users may still be willing to complete a high fraud risk transaction, but may perhaps demand a higher return. Other users may not be willing to take on such risk, and may demand a more trustworthy (lower fraud risk) counterparty to trade with. Ultimately, this tool's purpose is to inform decision-making by providing a likelihood of fraud risk.

@j.trane asked:

Can you clarify one point please? I can't tell from the proposal if what's being assessed is (i) risk of a wallet being controlled by scammers or (ii) risk that the wallet would default if extended an under-collateralized loan. I believe it's the latter. If that's right, can you clarify that it's a credit-worthiness assessment tool (instead of risk assessment tool, which is a bit opaque)?

@t answered:

Hi! Thanks for your question - this is a super important clarification!

So actually it is: (i) risk of a wallet being controlled by scammers.

We are effectively using the attributes and characteristics of known scammer wallets to build a profile of risk.

We can then apply this profile (or model) to analyse any wallet in the Cardano ecosystem and provide an assessment of its fraud risk profile - essentially answering your question of: 'what is the risk this wallet is being controlled by scammers?'.

In terms of (ii), we cannot model this risk without first being able to observe these events, i.e. we'd require examples of DeFi loans that have defaulted and not defaulted in order to analyse credit-worthiness. We think this is a natural extension to the Cardano Risk Assessment Tool once DeFi activity and data are prevalent, but for now we're focussing on the first component of default behaviour - which is fraud.
