withdrawn

Alternative Assessment Process R&D and Experimental Implementation

₳710,000.00 Requested
Solution

R&D and experimental implementation of several alternative assessment/QA models for Catalyst funds distribution, one of them being the 2-stage proposal assessment process.

Problem:

Voters are overwhelmed. Proposals receive roughly equal scrutiny. Reviews are not meaningful because community reviewers are not adequately incentivised to conduct deep critical investigations.

Yes Votes:
₳ 13,463,394
No Votes:
₳ 159,254,348

[IMPACT] Please describe your proposed solution.

This proposal seeks to research and develop several competing assessment models in parallel and rigorously test the quality of output that each of them produces.

Many such models have been proposed in community discussions, but their relative merits remain unclear until they have been tested empirically and experimentally in the field.

Some of the alternative mechanisms that would be researched and developed are discussed in a Tally blog post written by the lead proposer Simon Sällström and Jan Ole Ernst (PhD in quantum physics, Oxford). These include holographic voting and conviction voting.

Problem

  • There are too many proposals for voters to engage with meaningfully in a given Catalyst round.
  • The current system does not allow proposal assessors to fully exploit their relative strengths, reducing meaningful participation in the system.
  • The current system dedicates a roughly equal assessment budget to each proposal, even though some proposals warrant a higher degree of scrutiny.
  • Current proposal assessment ratings are too noisy to provide meaningful guidance to voters.
  • Assessments do not adequately incentivise thorough analysis and deep critical investigation.

Solution

We propose to divide the proposal assessment into two stages. In the first stage, proposal assessors check proposals against a list of well-defined requirements and indicate the domain expertise required to assess the proposal in more depth. At this stage, proposal assessors must also give one point of constructive criticism that provides specific examples and actionable suggestions for positive change. To proceed to the second stage, a proposal needs to satisfy 80% of the requirements.
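
To make the first-stage gate concrete, here is a minimal Python sketch. The five-item requirement checklist and the field names are hypothetical placeholders (the real checklist would be defined during the R&D phase); only the 80% threshold and the contents of a stage-1 assessment come from the proposal itself.

```python
# Minimal sketch of the stage-1 gate. The requirement names below are
# illustrative placeholders, not the actual Catalyst checklist.
from dataclasses import dataclass

STAGE_ONE_REQUIREMENTS = [
    "has_measurable_kpis",
    "analyses_existing_solutions",
    "has_budget_breakdown",
    "has_milestone_plan",
    "states_required_domain_expertise",
]

PASS_THRESHOLD = 0.80  # a proposal must satisfy 80% of the requirements


@dataclass
class StageOneAssessment:
    proposal_id: str
    checks: dict[str, bool]        # requirement -> satisfied?
    required_expertise: list[str]  # domains needed for the stage-2 review
    constructive_criticism: str    # one specific, actionable point


def passes_stage_one(assessment: StageOneAssessment) -> bool:
    """Return True if the proposal satisfies enough of the well-defined
    requirements to proceed to the stage-2 qualitative review."""
    satisfied = sum(assessment.checks.get(r, False) for r in STAGE_ONE_REQUIREMENTS)
    return satisfied / len(STAGE_ONE_REQUIREMENTS) >= PASS_THRESHOLD
```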

The second stage is the qualitative assessment stage. It takes inspiration from the scientific article referee review system. The qualitative assessments have several objectives. First, to provide quality assurance on the first-stage assessments. Second, to provide a concise and easily understandable summary of the proposal. Third, to thoroughly investigate and critically engage with all aspects of the proposal. Fourth, to provide constructive feedback to the proposer. Fifth, to make a recommendation to “fund”, “revise and resubmit”, or “not fund”. A revision of the PA model does not necessarily have to be implemented as part of the 2-stage proposal assessment model.
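
As an illustration of how second-stage reviews could be recorded, the sketch below captures the five objectives as fields of a review record; the field names are placeholders, and only the three recommendation outcomes ("fund", "revise and resubmit", "not fund") come from the text.

```python
# Hedged sketch of a stage-2 (referee-style) review record; field names are
# illustrative, the recommendation values mirror the three outcomes above.
from dataclasses import dataclass
from enum import Enum


class Recommendation(Enum):
    FUND = "fund"
    REVISE_AND_RESUBMIT = "revise and resubmit"
    NOT_FUND = "not fund"


@dataclass
class StageTwoReview:
    proposal_id: str
    stage_one_quality_ok: bool      # objective 1: QA of the stage-1 assessment
    summary: str                    # objective 2: concise, readable summary
    critical_analysis: str          # objective 3: thorough, critical engagement
    feedback_to_proposer: str       # objective 4: constructive feedback
    recommendation: Recommendation  # objective 5: fund / revise / not fund
```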

[IMPACT] How does your proposed solution address the challenge and what benefits will this bring to the Cardano ecosystem?

Summary

It will improve the efficiency of fund use and help voters make better-informed decisions about which proposals should win grants.

Problems with the current process

A thorough review of all issues with the current Catalyst process is beyond the scope of this proposal, so we focus on those that we believe the present proposal addresses. The problems are:

  1. Lack of system enabling efficient division of labour according to proposal assessor strength
  2. Lack of feedback during a funding round
  3. Lack of incentives and capacity to conduct a thorough analysis
  4. Waste of resources through expensive PA assessments of incomplete proposals

First, the current system is not well suited for specialisation and division of labour. Some proposals are subpar in the sense that they do not present certain basic elements such as measurable KPIs or some analysis of existing solutions to the problem that is to be solved. Checking for the existence of these necessary components is easy but critically engaging with most of them and providing constructive feedback is difficult. Such critical analysis often requires domain expertise. It will most likely also require the person to be a native (or near-native) English speaker and to have excellent writing skills. There are few people who possess all of the above and those who do will typically have very good outside options in the form of very well-paid jobs. Their time is very valuable and it is a waste of resources to have them read through incomplete proposals. Given their high skill and outside options, a very high remuneration rate will be needed to properly incentivise them.

Second, lack of feedback. In the current system, getting feedback from proposal assessors is not easy. Generally, proposers comment on other proposals hoping to get comments on their proposal in return. However, accessing the group of proposal assessors is much harder. Proposal assessors are generally busy and although some proposers post in the Proposal Advisor Telegram chat and receive, if they are lucky, some feedback, this is the exception rather than the rule. The system does not have a mechanism to incentivise feedback that the proposers can incorporate into their proposal in the same funding round.

Third, lack of incentives to conduct a thorough analysis. Each proposal in a funding round is allocated a budget sufficient to reward 3 ‘good’ assessments and 2 ‘excellent’ ones. An excellent assessment receives 3x the reward of a ‘good’ assessment. In past rounds, around 3% of assessments were rated ‘excellent’. The problem is that many Proposal Assessors (PAs) who make cost-benefit calculations conclude that it is more profitable to aim for ‘good’ assessments. If we assume that attempting to write an excellent assessment takes three times as long as writing a ‘good’ one, then a PA should be indifferent between the two tasks. However, due to a combination of (a) very high standards for ‘excellent’ assessments and (b) unclear criteria for what constitutes an excellent assessment, many Proposal Assessors no longer attempt to write them. Instead, they focus on writing ‘good’ assessments, where the time invested leads to a more predictable reward. This reduces the overall value that proposal assessors provide to the voting community.
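
The cost-benefit argument can be made concrete with a small back-of-the-envelope model. The 3x reward and the 3x time assumption come from the paragraph above; the assumption that a failed attempt at ‘excellent’ is still rewarded as ‘good’, and the 3% acceptance probability used in the example, are illustrative.

```python
# Back-of-the-envelope model of the assessor's choice described above.
GOOD_REWARD = 1.0                   # normalise the 'good' reward to 1 unit
EXCELLENT_REWARD = 3 * GOOD_REWARD  # 'excellent' pays 3x (from the text)
GOOD_HOURS = 1.0                    # normalise time for a 'good' assessment
EXCELLENT_HOURS = 3 * GOOD_HOURS    # assumption stated in the text


def expected_hourly_payoff(p_rated_excellent: float) -> float:
    """Expected reward per hour when aiming for 'excellent', assuming a
    failed attempt is still rewarded as 'good' (illustrative assumption)."""
    expected_reward = (p_rated_excellent * EXCELLENT_REWARD
                       + (1 - p_rated_excellent) * GOOD_REWARD)
    return expected_reward / EXCELLENT_HOURS


# Aiming for 'good' yields 1.0 reward unit per hour. If only ~3% of attempts
# clear the unclear 'excellent' bar, aiming high pays ~0.35 units per hour,
# so a rational assessor stops trying:
print(expected_hourly_payoff(0.03))  # ~0.353
print(expected_hourly_payoff(1.00))  # 1.0 -> indifference only if success is certain
```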

Fourth, there’s a lack of capacity. The sheer volume of proposals means that proposal assessors cannot properly invest time into investigating and engaging with existing proposals to increase their quality. Similarly, voters are overwhelmed by the sheer number of proposals and will only be able to express their preferences on a small subset of the proposals submitted. By adding more quality filtering mechanisms we address both of these problems, thereby improving the quality of the guidance provided and voting decisions being made.

Fifth, each proposal is currently allocated the same budget in terms of Proposal Assessor rewards. This is a waste of resources since many proposals are incomplete by objective standards. At the moment, the budget available for each of the ~1000 proposals submitted is around $440. Our hypothesis is that a proposal that doesn’t contain sufficient information for it to be properly assessed can be discarded after only a 15-minute skim. Even if every proposal is skimmed in this manner by seven independent Proposal Assessors remunerated at $30/hour (a western European white-collar hourly rate), this discovery process would cost only about $52 per proposal. Assuming that 100-200 proposals (10-20%) would not have passed this basic objective checklist threshold, then under the assumptions stated above, we would save between $58,200 and $116,400, funds which could instead be used to incentivise thorough analysis, interviews and investigation of the remaining proposals. This is indeed what we propose here.
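
A quick check of the per-proposal arithmetic, using only the figures stated above; the aggregate savings range additionally depends on how many proposals are screened out and on exactly which per-proposal costs are counted as saved.

```python
# Per-proposal arithmetic for the stage-1 skim, using the figures above.
HOURLY_RATE = 30.0                   # USD, western European white-collar rate
SKIM_HOURS = 15 / 60                 # a 15-minute skim
ASSESSORS_PER_PROPOSAL = 7           # independent skims per proposal
CURRENT_BUDGET_PER_PROPOSAL = 440.0  # USD of assessor rewards per proposal today

skim_cost = ASSESSORS_PER_PROPOSAL * SKIM_HOURS * HOURLY_RATE
print(f"Skim cost per incomplete proposal: ${skim_cost:.2f}")  # $52.50 (~$52 above)

# Budget freed up for deeper work for every proposal screened out at stage 1:
print(f"Freed per screened-out proposal: ${CURRENT_BUDGET_PER_PROPOSAL - skim_cost:.2f}")
```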

[IMPACT] How do you intend to measure the success of your project?

The main idea would be to produce a ranking of proposals based on the alternative assessment process, and then conduct qualitative surveys with voters and proposers.

Proposers

  • Does the qualitative assessment give a fair and balanced review of your proposal?
  • Do the numeric scores from the current community reviewers give a fair rating of your proposal?

Voters

  • Are the assessments helpful for your decision on which proposals to support?
  • Did you learn something from the assessment?

We may include more questions; finalising the survey instrument is part of the preparation and research agenda.

[IMPACT] Please describe your plans to share the outputs and results of your project?

Public channels for ongoing updates

  • Discord updates
  • Notion page monthly updates
  • Catalyst Town Hall discussions

Closeout

  • Pre-analysis plan
  • Research report summary
  • Video recording and slide deck with findings

[CAPABILITY/ FEASIBILITY] What is your capability to deliver your project with high levels of trust and accountability?

Why best suited?

  • Author of the 2-stage proposal assessment process document
  • Involved in Project Catalyst since Fund 6
  • Core contributor behind the Catalyst United Initiative
  • Founder of Cardano student hub Oxford.
  • Research master's degree in economics (empirical experimentation specialisation) from the University of Oxford

Processes to manage funds

  • Multi-sig custodianship of the received funds.
  • Converting the funds into 2 different stablecoins upon receipt.

[CAPABILITY/ FEASIBILITY] What are the main goals for the project and how will you validate if your approach is feasible?

The main goals of the project can be summarized as follows:

  1. Improve the grant distribution mechanism of Project Catalyst with the objective of ensuring that "Proposals that provide the most value to the Cardano ecosystem win grants".

This overarching goal is broken down into sub-components

  1. Identify current problems including but not limited to the lack of efficient division of labor, feedback mechanisms, incentives for thorough analysis, and resource wastefulness in overhead costs.
  2. Identify potential solutions to the identified problems. An example of such a solution is the 2-stage assessment process.
  3. Experimentally test the alternative proposal mechanisms with real proposals, reviewers and voters.

To validate the feasibility of the proposed approach, the project can employ the following validation methods:

  1. Feasibility study in the first part of the proposal.
  2. Experimental Testing: sample proposals, reviewers and voters for the experiment.
  3. Data Analysis: Analyze the collected data from the experimental implementation, including assessment ratings, user feedback, and voting outcomes.
  4. User Surveys and Feedback: Gather feedback from proposal assessors, proposers, and community members involved in the experimental process.
  5. Cost-Benefit Analysis: Evaluate the cost implications of the proposed changes by comparing the resource allocation and savings amongst different methods.
  6. Iterative Refinement: Based on the outcomes and feedback gathered during the validation process, iterate and refine the proposed approach to address any identified shortcomings or areas for improvement.

By combining these validation methods, the project can gather evidence and insights to determine the feasibility and viability of the proposed approach to improving the proposal and assessment process in Project Catalyst.

[CAPABILITY/ FEASIBILITY] Please provide a detailed breakdown of your project’s milestones and each of the main tasks or activities to reach the milestone plus the expected timeline for the delivery.

Milestones for the project:

Milestone 1: Project set-up and Recruitment of team (Month 1-2)

  • Draft job descriptions and contract terms in consultation with legal advisors.
  • Set up public communication channels and project workflow

Milestone 2: Research and Development (Month 3-4)

  • Surveying of community-initiated governance mechanisms
  • Literature review of existing studies and findings
  • Feasibility study (for technical solutions)

Milestone 3: Experimental preparation (Month 3-4)

  • Background research of governance mechanisms
  • Set up the experimental framework for testing the proposed model and pre-analysis plan.
  • Define the parameters for sample selection and recruitment of proposal assessors.
  • Develop a plan to ensure representative sampling and sufficient observation points per proposal.

Milestone 4: Alternative 1 technical development (Month 3-4)

  • Technical implementation of the 2-stage assessment model governance framework.

Milestone 5: Alternative 2 and 3 development (Month 4-5)

  • Development of an MVP of the technical infrastructure for the testnet implementation of the two other governance mechanisms found to have the highest potential in the research done for Milestone 3.

Milestone 6: Sample Selection (Month 5)

  • Randomly sample N proposals from Fund 12, using stratified randomisation to ensure the sample represents a diverse range of proposals and covers an adequate number of observation points per proposal (see the sampling sketch below).
  • Recruit M community reviewers, combining reviewers of different experience levels, similar to the community reviewer pool of a typical funding round.
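
A minimal sketch of the stratified sampling step, assuming proposals are stratified by challenge category; the strata, field names and sample size N are placeholder assumptions to be fixed in the pre-analysis plan.

```python
# Illustrative stratified random sampling for Milestone 6; strata and field
# names are placeholder assumptions, not the final experimental design.
import random
from collections import defaultdict


def stratified_sample(proposals: list[dict], n_total: int, seed: int = 42) -> list[dict]:
    """Sample roughly n_total proposals, allocating draws to each stratum in
    proportion to its share of the proposal pool (rounding may shift the
    final count by a few proposals)."""
    rng = random.Random(seed)
    strata: dict[str, list[dict]] = defaultdict(list)
    for p in proposals:
        strata[p["challenge"]].append(p)  # stratify by challenge category

    sample: list[dict] = []
    for members in strata.values():
        n_stratum = round(n_total * len(members) / len(proposals))
        sample.extend(rng.sample(members, min(n_stratum, len(members))))
    return sample
```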

Milestone 7: Hypothesized Outcome Validation Alternative 1 (Month 5-7)

  • Conduct the experiment to test the hypothesis that fewer proposals will pass to the second stage requiring written assessments.
  • Analyze the experimental results to validate the hypothesis.
  • Determine if the proposals that would not have proceeded to the second stage under the current system generally receive low star ratings and lack community support.
  • Evaluate the cost reduction achieved by comparing the cost efficiency of the two assessment models.

Milestone 8: Hypothesized Outcome Validation: alternative 2 (Month 5-7)

  • Conduct the experiment to evaluate Alternative 2
  • Analyze the experimental results relative to the status quo grant winner
  • Evaluate the cost reduction achieved

Milestone 9: Hypothesized Outcome Validation: alternative 3 (Month 5-7)

  • Conduct the experiment to evaluate Alternative 3
  • Analyze the experimental results relative to the status quo grant winner
  • Evaluate the cost reduction achieved

Milestone 10: Hypothesized Outcome Validation: alternative 4 (Month 5-7)

  • Conduct the experiment to evaluate Alternative 4
  • Analyze the experimental results relative to the status quo grant winner
  • Evaluate the cost reduction achieved

Milestone 11: Evaluation Criteria Establishment (Month 7)

  • Assess the limitations of the current assessment system as a benchmark for evaluating the proposed reform.
  • Determine the rate of false negatives for proposals with different ratings.
  • Identify the range of ratings that provides an informative assessment of proposal quality.
  • Establish key performance metrics based on user feedback and the accurate identification of unsupported proposals.

Milestone 12: User feedback analysis (Month 8)

  • Collect user feedback on the assessments written under the current system and the alternative systems
  • Analyze and compare the user feedback to assess the effectiveness of the proposed reforms.
  • Measure how accurately the objective cutoff identifies unsupported proposals by examining the ratio of yes to no voting power (see the sketch below).
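
One possible way to operationalise that last check, assuming "unsupported" is defined as a proposal receiving less Yes than No voting power and that the stage-1 pass/fail outcome is recorded per proposal; both the definition and the field names are illustrative.

```python
# Illustrative accuracy check for the objective (stage-1) cutoff: how often
# does failing the cutoff coincide with being unsupported by voters?
def cutoff_accuracy(proposals: list[dict]) -> float:
    """Share of proposals where the stage-1 screening decision agrees with
    voter support, defined here as yes voting power exceeding no voting power."""
    agreements = 0
    for p in proposals:
        unsupported = p["yes_votes"] < p["no_votes"]
        screened_out = not p["passed_stage_one"]
        agreements += int(unsupported == screened_out)
    return agreements / len(proposals)
```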

Milestone 13: Final Analysis and Report Write-up (Month 9-10)

  • Compile and analyze all experimental data and evaluation results.
  • Draw conclusions regarding the effectiveness of the proposed model compared to the current system.
  • Prepare a comprehensive report summarizing the findings, including recommendations for future implementation.
  • Identify any limitations or areas for improvement.
  • Prepare follow-up survey and data collection of proposals that won grants.

[CAPABILITY/ FEASIBILITY] Please describe the deliverables, outputs and intended outcomes of each milestone.

Milestone 1: Job description and Project Set up

  • Deliverables:
      • Job descriptions and contract terms for operational staff and researcher
      • Established public communication channels and project workflow
  • Outputs:
      • Clearly defined roles and responsibilities for project staff
      • Efficient communication channels and streamlined project workflow
  • Intended Outcomes:
      • Facilitated recruitment process for operational staff and researcher
      • Improved coordination and collaboration within the project team for effective project execution

Milestone 2: Experimental Implementation

  • Deliverables:
      • Experimental framework documentation.
      • Defined parameters for sample selection and recruitment.
  • Outputs:
      • Established experimental setup for testing the proposed model.
      • Clear guidelines for sample selection and assessor recruitment.
  • Intended Outcomes:
      • A solid foundation for conducting the experiment.
      • Clarity on the sampling and recruitment process.

Milestone 3: Sample Selection

  • Deliverables:
      • Sampled proposals using stratified randomization techniques.
      • Recruitment of proposal assessors.
  • Outputs:
      • A representative sample of proposals for the experiment.
      • A diverse pool of proposal assessors.
  • Intended Outcomes:
      • Ensured fairness and representativeness in the sample.
      • Availability of assessors to evaluate the proposals.

Milestone 4: Hypothesized Outcome Validation

  • Deliverables:
      • Experiment results and analysis.
      • Comparison of proposal outcomes under the proposed model and the current system.
  • Outputs:
      • Validated hypothesis regarding the reduction in proposals passing to the second stage.
      • Identification of proposals with low star ratings and lack of community support.
      • Evaluation of cost efficiency between the two assessment models.
  • Intended Outcomes:
      • Evidence supporting the effectiveness of the proposed model.
      • Insights into the relationship between proposal quality and assessment outcomes.
      • Understanding the potential cost reduction and its impact on the assessment process.

Milestone 5: Evaluation Criteria Establishment

  • Deliverables:
      • Assessment of the current system's limitations as a benchmark.
      • Determination of informative star rating ranges for proposal quality.
      • Defined key performance metrics.
  • Outputs:
      • Identification of the current system's shortcomings in evaluating reform.
      • Clarity on the star rating ranges that provide meaningful assessment.
      • Established criteria for evaluating the proposed model's performance.
  • Intended Outcomes:
      • Improved evaluation methods for the proposed model.
      • Objective metrics for measuring performance and success.

Milestone 6: Comparison of User Feedback and Objective Cutoff

  • Deliverables:
      • Collected user feedback on assessments.
      • Comparative analysis of user feedback and qualitative assessments.
      • Evaluation of the objective cutoff's accuracy.
  • Outputs:
      • Insights into the effectiveness of the proposed model based on user feedback.
      • Comparison of user feedback with qualitative assessments.
      • Assessment of the objective cutoff's ability to identify unsupported proposals.
  • Intended Outcomes:
      • Understanding the user perception of the proposed model's assessments.
      • Validation of the objective cutoff as a reliable indicator of unsupported proposals.

Milestone 7: Final Analysis and Reporting

  • Deliverables:
      • Comprehensive report summarizing the experiment and findings.
      • Recommendations for future implementation.
  • Outputs:
      • Detailed analysis of all experimental data and evaluation results.
      • Clear conclusions on the effectiveness of the proposed model.
      • Actionable recommendations for further improvements or implementation.
  • Intended Outcomes:
      • Documentation of the project's outcomes and insights.
      • A comprehensive report that can inform decision-making and future initiatives.

[RESOURCES & VALUE FOR MONEY] Please provide a detailed budget breakdown of the proposed work and resources.

Personnel Costs

  • Research and operations (1 FTE for 10 months): $60,000
  • Technical implementation and operations (1 FTE for 10 months): $60,000
  • Recruitment, Set-Up and Management: $20,000

Subtotal: $140,000

Equipment and Supplies:

  • Qualtrics subscription: $1,500
  • Zoom subscription: $100
  • Airgram: $250
  • Cloud storage: $150

Subtotal: $2,000

Participant Recruitment and Compensation

  • Assessment incentives: $40,000
  • Control survey response incentives: $2,000

Subtotal: $42,000

Travel and Accommodation:

  • Presentation at Cardano Summit: $3,000
  • Miscellaneous for in-person meetings: $300

Subtotal: $3,300

Marketing and Outreach: $2,000

Legal and Admin: $3,000

Contingency: $5,000

TOTAL USD = $197,500

TOTAL ADA = 790,000 ADA

*assuming 0.25 USD = 1 ADA

[RESOURCES & VALUE FOR MONEY] Who is in the project team and what are their roles?

Simon Sällström. Project Manager. Key responsibilities will be to recruit a researcher to lead this work and facilitate communication between relevant stakeholders.

Research lead [to be recruited]. A person with a PhD in a relevant empirical field, or 3+ years of relevant empirical research experience.

Technical specialist [to be recruited]. A specialist with the relevant background to deploy the solution on a continuous testnet and to assist with data collection and analysis.

[RESOURCES & VALUE FOR MONEY] How does the cost of the project represent value for money for the Cardano ecosystem?

High-level justification

So far about $31m has been distributed. It is not known how much of this can be considered "wasted" or money that has gone into scam proposals, nor to what extent this could have been avoided with a better assessment process.

The Cardano treasury, at its peak, held around $1 billion. If Cardano is to succeed, investing in properly developing well-researched and evidence-based assessment methodologies for how it will use funds will give returns that far exceed the cost of an individual project.

Rather than trying to quantify this, let us illustrate with just a small number of proposals. If this research endeavour could eliminate funding for even just five proposals like the ones below, the investment pays for itself.

The following proposals were funded, yet at the time of writing I was unable to find functioning outputs from them (daocoders.net is not working):

DAO-NET: DAO Deployment Platform | Lido Nation English

DAO-NET: Legal Defense DAO | Lido Nation English

DAO-NET: Multilingual Translation | Lido Nation English

DAO-NET: Auditor DAO | Lido Nation English

DAO-NET: Small Developer Funding | Lido Nation English

This proposer received about $200k in total. This is merely one example, probably among the largest; there are others. Investing in a good process is worth the effort.

Justification for recruiting full-time researcher and person for technical implementation

To rigorously and experimentally evaluate different governance mechanisms, we need highly qualified individuals. It may be difficult to recruit such individuals on mere 6-month contracts, whereas 12-month contracts are more likely to be feasible.

Salaries

The indicated salaries are below market rates for highly skilled professionals in the UK but above academic earnings (post-docs) and entry-level positions (e.g. at a Big 4 consultancy). For a person with 3-5 years of professional experience, the salary is unlikely to be competitive, but our hope is that the autonomy, remote work and interesting role will attract highly qualified applicants.

Legal structure

The people on this team would formally be contractors, with contracts adapted to the milestones and conditions of the Catalyst grant. Legal advice will be sought as part of the first deliverable.
