not approved

Nanopublications Dashboard: a searchable natural language tool for atomic knowledge-sharing

₳70,181.00 Requested
Community Review Results (1 reviewers)
Solution:

Build a dashboard to collect atomic, searchable learning points from Catalyst projects (similar to nanopublications but more accessible); populate it with data from existing projects.

Problem:

Learning-points from Catalyst projects are hard to find - they’re often poorly-evidenced, and buried in unsearchable PDFs / videos. Connections between different projects’ discoveries are opaque.

Yes Votes: ₳ 31,952,598
No Votes:
Votes Cast: 255

[SOLUTION] Please describe your proposed solution.

<u>Making learning from Catalyst projects usable</u>

Our community’s 500+ completed Catalyst projects represent a large corpus of knowledge. However, much of this material remains locked up in close-out reports in the form of videos and PDFs, which are unsearchable and undiscoverable; so once a project has presented its close-out, the learning from it is often forgotten. Even where material is held on searchable platforms, it often contains no clear link between a specific piece of knowledge or developer-relevant information, and the evidence that supports it, and no clear attribution to the person who discovered it. There is also no easy way to see connections (or even interesting contradictions) between what different projects have learnt; and no way for a developer who is making a new proposal to look at what has been discovered already and build on it. So we often end up either losing knowledge that is useful to the ecosystem, or reinventing it time and again.

The solution we propose is to build a searchable platform where developers and project teams can add individual, atomic learning points from their completed Catalyst projects, expressed in a CNL (controlled natural language), and supported by links to the evidence and attribution for each learning point. This will enable people to search for insights on specific topics, or from specific projects or types of projects, and immediately see connections or contradictions.

The aim is for developers to add learning points from their own proposals once the platform is built. But in order to make the platform usable from the outset, we will populate it with learning points from completed Catalyst projects from Fund 7 to Fund 10. This means we can a) properly test the tool and the process, and b) provide some retrospective data to search, so the platform can immediately fulfil its intended function of enabling developers to search for and build on existing learning. We recognise that this data-population work is not typical of the process of building a tool - but in this instance, we consider it an intrinsic part of the work. If we simply built an empty database and waited for developers to fill it, it could be a long time before the platform was actually useful. Populating it with retrospective data not only tests its functionality, but also enables it to be used immediately by developers to search for and build on prior learning in Catalyst, thus supporting and encouraging an open-source ethos in the developer ecosystem.

Additionally, we will support the tool with documentation and tool tips showing exactly how to shape one’s material into these individual learning points, so that the developer community is able to use the Dashboard effectively.

<u>The theory and research behind our approach</u>

The idea is based on nanopublications: an approach that is popular in life-sciences research, but has yet to reach very far beyond that field. Essentially, a nanopublication is “the smallest possible unit of publishable information” - a small, discrete, machine-readable assertion, supported by provenance information (i.e. what the assertion is derived from, and the research evidence that supports it). It’s an excellent way to share knowledge and make ecosystem-wide connections between the things we are building - the only drawback is that a “classic” nanopublication is expressed in RDF notation (a W3C standard originally designed as a data model for metadata). This presents a barrier to adoption because many people find RDF difficult, and because it might not even be an appropriate way to express some kinds of knowledge.

So in this proposal, we draw on research by Tobias Kuhn et al. in 2013, which looked at how to broaden the scope of the “nanopublication” concept by using a CNL (controlled natural language) rather than RDF triples to express a research conclusion, so that people can essentially write their learning points in normal English.

Kuhn’s research developed a concept called the “AIDA statement” (an acronym for “Atomic, Independent, Declarative, Absolute”, and unrelated to the “AIDA” acronym used in the field of marketing). AIDA is a simple framework for what a nanopublication statement in natural language should look like, and is the approach we intend to use.

  • Atomic: a sentence describing one thought that cannot be further broken down in a practical way
  • Independent: a sentence that can stand on its own, without external references like “this effect” or “we”
  • Declarative: a complete sentence ending with a full stop that could in theory be either true or false
  • Absolute: a sentence describing the core of a claim, ignoring the (un)certainty about its truth and ignoring how it was discovered (no “probably” or “evaluation showed that”); typically in present tense
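As a rough illustration of how the four AIDA criteria could be checked when someone enters a learning point, here is a minimal Python sketch. It is not part of the proposal's specification: the `LearningPoint` fields, the `aida_warnings` function, and the word lists are all hypothetical, and the checks are simple heuristics that would miss many cases.

```python
import re
from dataclasses import dataclass

# Hypothetical word list; a real checker would need a much richer vocabulary.
HEDGES = {"probably", "possibly", "perhaps", "might", "maybe"}


@dataclass(frozen=True)
class LearningPoint:
    statement: str       # the AIDA sentence itself
    project: str         # project the learning comes from
    contributor: str     # person or team credited with the insight
    provenance_url: str  # link to the supporting evidence


def aida_warnings(statement: str) -> list[str]:
    """Return heuristic warnings; an empty list means nothing was flagged."""
    warnings = []
    text = statement.strip()
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))

    # Declarative: a complete sentence ending with a full stop.
    if not text.endswith("."):
        warnings.append("Declarative: should end with a full stop.")
    # Atomic: crude proxy, flag multiple sentences or semicolon-joined thoughts.
    if text.count(".") > 1 or ";" in text:
        warnings.append("Atomic: looks like more than one thought.")
    # Independent: no external references such as "we" or "this effect".
    if words & {"we", "our"} or "this effect" in lowered:
        warnings.append("Independent: avoid external references.")
    # Absolute: no hedging and no description of how it was discovered.
    if words & HEDGES or "showed that" in lowered:
        warnings.append("Absolute: drop hedging and discovery framing.")
    return warnings


# Invented example statements, for illustration only.
print(aida_warnings("Bounties increase contributor participation."))
print(aida_warnings("We found that bounties probably increase participation"))
```

The first example passes all four heuristics; the second is flagged as not Declarative, not Independent, and not Absolute. In practice such checks could only prompt the contributor to reconsider a statement, not decide whether it is a valid AIDA statement.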

<u>How we’ll put this research into practice</u>

Kuhn found that scientists were fairly easily able to create AIDA statements from the abstracts of published research papers. Based on this, we feel confident that with some supporting "how-to" documentation (which we will create), Catalyst developers will be able to do the same with material from their monthly or closing reports. Once the material is expressed in this atomic, declarative way, it can then be connected to the provenance that supports it - this could be any link, from a heading in a document or a timestamp in a video, to a GitHub commit, a Tweet, a cell in a spreadsheet, or anywhere else a project recorded its discoveries.

Note that while the “nanopublications” approach is most obviously a fit for research-based projects, our initial explorations have shown that it is also very effective for developer projects, especially when (as they commonly do) they have documented their progress, and noted bug-fixes, results of user-testing, etc.

If a developer enters their material into our database via a dashboard-style frontend, it means it will be searchable by project, by keyword, by developer, etc; so connections and similarities between different proposals will become visible. We will also be able to see attribution (i.e. which project or person came up with this learning?), which will help us become more aware of where insights are coming from. Also, note that additions to the platform would not necessarily have to be restricted to material about Catalyst project reporting - potentially, the community could also add the knowledge that surfaces in a meeting, a collaborative document, or a Twitter space, by translating it into a series of AIDA statements and adding it.
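To make the idea of searching by project, keyword, or contributor more concrete, here is a small in-memory Python sketch. The `Entry` structure, the `search` and `projects_by_keyword` functions, and the sample data are all assumptions made for illustration; the actual database structure is to be defined in the project's whitepaper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    statement: str
    project: str
    contributor: str
    keywords: frozenset


def search(entries, keyword=None, project=None, contributor=None):
    """Return the entries that match every filter that was supplied."""
    results = []
    for entry in entries:
        if keyword is not None and keyword.lower() not in entry.keywords:
            continue
        if project is not None and entry.project != project:
            continue
        if contributor is not None and entry.contributor != contributor:
            continue
        results.append(entry)
    return results


def projects_by_keyword(entries):
    """Map each keyword to its projects, keeping keywords shared by 2+ projects."""
    index = {}
    for entry in entries:
        for kw in entry.keywords:
            index.setdefault(kw, set()).add(entry.project)
    return {kw: projects for kw, projects in index.items() if len(projects) > 1}


# Invented sample data, for illustration only.
SAMPLE = [
    Entry("Video walkthroughs reduce onboarding time for new contributors.",
          "Project A", "Alice", frozenset({"onboarding", "video"})),
    Entry("Written guides are preferred over video for onboarding.",
          "Project B", "Bob", frozenset({"onboarding", "documentation"})),
    Entry("Small bounties increase the number of bug reports.",
          "Project C", "Carol", frozenset({"bounties", "testing"})),
]

for hit in search(SAMPLE, keyword="onboarding"):
    print(hit.project, "-", hit.statement)
print(projects_by_keyword(SAMPLE))
```

Here the keyword search returns the two onboarding statements from different projects side by side, which is the kind of connection (or contradiction) the Dashboard is meant to surface.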

To make the dashboard usable and valuable from the start, this proposal includes a process for our team to populate it with data from finished Catalyst proposals from F7 to F10 and run some test searches, thereby testing that the database is working as intended, and refining the methodology so that we can design the frontend effectively. This work will cover 200 proposals, creating 3 to 5 AIDA statements from each one. We consider this data population process an essential component of the building and testing of the tool - without it, the dashboard will not be immediately useful, and it will be difficult to see whether it “works” in the sense of surfacing the connections and insights that it is supposed to.

We will then open the dashboard to the community. We'll offer supporting “how-to” materials, some sessions to raise interest, and some ongoing support; and then proposers of finished projects will be incentivised via Dework bounties to add the learning from their F9 and F10 proposals. We’ll also offer small bounties for people to send us a record of any useful and interesting connections they have discovered from searching the database, which we’ll collate on the project GitBook as a way of demonstrating what kind of insights the dashboard is helping the community to uncover.

So our process will be:

  1. Build the platform; meanwhile, the data-population team prepares AIDA statements from completed Catalyst projects, as part of the testing process.
  2. Add prepared data to the dashboard and run test searches.
  3. Create a short training video and text-based “how-to” documentation, and run 3 sessions for Catalyst developers to introduce them to the tool and the Nanopublications concept.
  4. Offer bounties for developers to add their projects; and smaller bounties for people to record useful connections and insights that they have discovered from searching the database.

Essentially, this approach frames the things we do in Catalyst (potentially, everything we do, from proposals, to After TownHalls, to discussions on Telegram or Twitter) as the “experiments” we have always said they are, complete with the research outcomes and insights that characterise experimentation. It will help clarify and evidence what the developer community has learnt from projects, and make that learning searchable and discoverable; it will also surface new insights and previously-unseen connections. Our approach is adaptable to both qualitative and quantitative insights; it turns all our discussions and ideas into a collaborative research pool that we can all draw on, and embeds attribution and recognition for developers. In this way, it will support an open-source ethos in the Cardano developer community, enabling us to amplify and build on each other’s work.

[IMPACT] Please define the positive impact your project will have on the wider Cardano community.

This proposal addresses the core question of this Category by offering a tool to enhance the Cardano developer ecosystem and support an open-source ethos. Our adaptation of the nanopublications standard will make it easier to develop on Cardano, by making it easier to research and build on existing knowledge.

Developers on Cardano, particularly newcomers, will be able to use the Nanopublications Dashboard to find out what has already been created in past projects, and iterate on it, very much in the way that traditional nanopublications help academics to discover and build on existing research. This helps Cardano developers to amplify existing discoveries, rather than reinventing the wheel. The Dashboard also enables developers to log insights from their own work, facilitating proper attribution, and helping them find and collaborate with others who are working on similar ideas. Insights added from completed projects might include pitfalls or problems, thus helping future developers avoid or address them. Overall, the proposal offers an approach that can help the Cardano ecosystem become more iterative and more collaborative.

The benefit to Cardano as a whole includes ensuring that we don’t lose or forget what we learn (whether from Catalyst funded proposals or anything else), and that we can continue to access and draw on it longterm. It will help us see the connections between different projects’ discoveries; it will also help us see any points of disagreement between proposals on similar topics, which could provide fruitful avenues for further exploration. In short, it forms part of our community's memory.

The Dashboard also has the potential to help Cardano with auditability and assessment of impact. It will enable us to audit core learning from a proposal more easily, and track exactly how the team derived that learning. Also, since the process of framing one’s work in the way required by the Dashboard will tend to emphasise conclusions and insights, this encourages us to look at the effects of what we do, and will help us as a community to see the impact that is being made across Catalyst on particular topics.

In the long term, our team hopes to integrate AI tooling into this concept, using LLMs both to create AIDA statements and to compare them/discover similarity. In order to enable this kind of work (which could have far-reaching beneficial effects for Catalyst and Cardano) we need to build this initial proof-of-concept and engage the community with how to use it.

<u>We will measure our impact by:</u>

  • Number of GitHub commits during the build process
  • Number of AIDA statements created during the data-population process
  • Qualitative feedback from data population team on ease/difficulty of creating AIDA statements
  • Number of attendees at 3 awareness sessions held with the community
  • Number of pageviews of our training material on GitBook
  • Qualitative feedback from awareness sessions on usefulness of the approach
  • Number of people claiming bounties to add material to the database
  • Amount and quality of material added
  • Number of people claiming bounties to report insights from searches

<u>We will share our outputs in the following ways:</u>

  • The dashboard build process will be fully open-source, and trackable on GitHub.

  • Our initial “mini-whitepaper” on our proposed methodology, plus our documentation of the data population team’s working process, and the "how-to" documentation we create, will all be publicly available on the project's GitBook. We will share them widely in the Catalyst community via Discord, Telegram, Twitter, and the Cardano forum.

  • Our awareness sessions, our documentation, and our Dework bounties to the community, will enable us to share the dashboard and its underlying ideas widely.

[CAPABILITY & FEASIBILITY] What is your capability to deliver your project with high levels of trust and accountability? How do you intend to validate if your approach is feasible?

The team members are skilled and experienced members of the Catalyst community, and all have experience of working in transparent and open-source ways via GitHub, GitBook, and Dework, providing a trackable, accountable and trustworthy audit trail. See for example

The Data Population team also have a thorough grasp of natural language processing and an interest in AI.

Our proposal not only includes thorough documentation and sharing; it also rests on a substantial amount of testing of the nanopublications dashboard, and engaging the community with the nanopublication concept. This will offer a high level of trust and accountability, since the community itself tests the validity of the work.

We are taking this approach because building a tool for the community is not the end of the process. Often, the follow-up work of building a user base and increasing engagement is overlooked; we don’t want to do that. It is for this reason that this relatively simple platform requires additional resources and thinking to populate the dashboard and bring the community along with training and engagement activities.

[Project Milestones] What are the key milestones you need to achieve in order to complete your project successfully?

<u>Milestone 1 (to be completed at the end of Month 2): Whitepaper; dashboard build. 12% of budget</u>

Outputs

  • “Mini-whitepaper” defining the methodology that will be used for adding material to the database, and the database’s exact structure.

Acceptance criteria

  • The whitepaper and database structure are clear and easy to understand

Evidence

  • Whitepaper published on GitBook
  • Database structure defined on GitHub

<u>Milestone 2 (to be completed at the end of Month 2): building and data collection. 30% of budget.</u>

Outputs:

  • a working Dashboard;
  • a spreadsheet of AIDA statements and provenance from c. 200 past Catalyst proposals

Acceptance criteria:

  • The Dashboard is usable
  • the AIDA statements cover a diverse range of completed Catalyst proposals

Evidence:

  • a working Dashboard
  • a spreadsheet of AIDA statements

<u>Milestone 3 (to be completed at the end of month 3): data population and documentation. 28% of budget</u>

Outputs:

  • The dashboard’s database is populated with material from past proposals;
  • A session plan for awareness sessions
  • A “how to” training video demonstrating how to add material to the Dashboard
  • Text-based “how to” on GitBook

Acceptance criteria:

  • Dashboard search function is working, and returns results from material added by the data population team
  • “How to” materials and frontend are accessible and clear

Evidence:

  • Searchable dashboard containing material from past Catalyst projects
  • Awareness session plan
  • “How-to” resources as video and text

<u>Milestone 4 (to be completed at the end of month 4):</u>

Outputs:

  • 3 awareness sessions delivered across Catalyst community
  • Dework bounties created and widely publicised

Acceptance criteria:

  • Awareness session attendees rate the training as good or higher (assessed via feedback form)

Evidence:

  • Attendees list and list of dates of awareness sessions
  • Bounties available on Dework

<u>Final Milestone 5 (to be completed at the end of month 6): close-out and learnings. 16% of budget</u>

Outputs:

  • Data from 30 new projects added to the dashboard via bounties, including data from this project itself.
  • Close-out report and video

Acceptance criteria:

  • Closing report is accepted by IOG
  • Data added from new projects is valid; developers are able to add data from their projects without problems

Evidence:

  • New data added to dashboard from 30 projects
  • Closing report and video

Post-funding:

  • bug fixes and minor feature additions for 1 year

  • Outcomes: the platform will continue to be usable, and to develop in response to input from users.

[RESOURCES] Who is in the project team and what are their roles?

<u>Vanessa Cardui:</u> Community engagement professional with 20+ years' experience of working with communities to help them engage in grounded-theory research, and record and archive their lives. Part of QA-DAO, where she leads on documentation (for example, see documenting Catalyst Circle); part of CGO (Community Governance Oversight), where she facilitated meetings and edited the F8 closing report; founding member of The Facilitators’ Collective; part of the SingularityNET archives team; part of the SingularityNET DeepFunding Focus Group.

Role: managing data population team; managing whitepaper writing team; project management and reporting.

<u>Alokam Augusta Chinenyenwa:</u> A dedicated and forward-thinking student of Computer Science, driven by a passion for ML, NLP, and blockchain technology. With knowledge in Data Science, community management, and a vision for applying and building ML tools on the blockchain, she is poised to make a significant impact in the field and drive innovation in the intersection of technology and decentralisation.

Role: Data population team; devising “how-to” documentation.

<u>Efua Edufua Abekah:</u> A budding professional combining her academic background in Economics and French with a deep fascination for blockchain technology. Her journey has been marked by a continuous pursuit of knowledge and innovative applications of technology in financial systems. At Wada, where she serves as the Executive Assistant, she is responsible for providing administrative support for Wada leadership and providing support on projects. Her proficiency in languages and insights into economic frameworks uniquely position her to drive initiatives in creating more transparent, secure, and efficient financial systems with a keen eye on the potential of blockchain technology.

Role: Data population team; whitepaper writing team; publicity and engagement.

<u>Phil Khoo:</u> Experienced as an accountant, UI/UX frontend and graphic designer, and business advisor, amongst numerous other pursuits. He currently has a lead position in the development and direction of Cardano AIM and is co-creator of the Community Tools.

Role: front-end and data design; whitepaper writing team; managing AIM development team; delivery of awareness sessions.

<u>AIM Development Team:</u> Cardano AIM has developed several dashboards and other tools that are widely used by the Cardano community, for example the Catalyst Voter Tool <https://cardanocataly.st/voter-tool/#/>

Role: initial dashboard build; ongoing maintenance for 1 year.

[BUDGET & COSTS] Please provide a cost breakdown of the proposed work and resources.

<u>Dashboard build: 27,821 ADA</u>

  • Initial build: 80 hours @ 178 ADA/hr = 14,240 ADA
  • Frontend design: 40 hours @ 178 ADA/hr = 7,129 ADA
  • Maintenance for 1 year (covers service costs; bug resolution, minor feature additions): 6,452 ADA

<u>Mini-whitepaper: 1,615 ADA</u>

  • Planning meeting (3 people x 323 ADA) = 969 ADA
  • Writing = 646 ADA

<u>Data population team: 19,092 ADA</u>

  • onboarding/planning session (4 people x 323 ADA) = 1,292 ADA
  • create and add data from 200 proposals: 100 hours, @ 178 ADA/hr = 17,800 ADA

<u>Creating “how to” materials: 1,612 ADA</u>

  • comprises short video; a text-based “how-to” on GitBook; and session-plan for awareness sessions

<u>Awareness sessions delivery: 3,234 ADA</u>

  • 646 ADA /session, x3 sessions = 1,938 ADA
  • ongoing support for people uploading their own material: 8 hours @ 162 ADA /hr = 1,296 ADA

<u>Publicity and community engagement: 890 ADA</u>

  • Publicising and sharing the Dashboard via Discord, Telegram, Twitter, Cardano forum, and via official Catalyst proposer channels - 5 hours @ 178 ADA/hr = 890 ADA

<u>Community bounties: 1,650 ADA</u>

  • We aim for 25 projects adding their info to the dashboard, x 50 ADA per project = 1,250 ADA (Note: lower rate than for the data population team, because proposers know their own proposals and don’t have the extra overhead of reading and understanding the proposal first)
  • plus 25 people reporting insights from searches, x 16 ADA per insight = 400 ADA

<u>Project management: 8,000 ADA</u>

(comprises team coordination, project documentation, monthly reporting, milestone reporting, wallet management, close-out report and video.)

Total: 63,914 ADA
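As a cross-check, the stated total can be reproduced by summing the individual line items in the breakdown above. The short Python sketch below uses only figures taken from those line items; the dictionary keys are shortened labels chosen for illustration.

```python
# Line items in ADA, copied from the cost breakdown.
line_items = {
    "Dashboard build": [14_240, 7_129, 6_452],
    "Mini-whitepaper": [969, 646],
    "Data population team": [1_292, 17_800],
    "How-to materials": [1_612],
    "Awareness sessions delivery": [1_938, 1_296],
    "Publicity and community engagement": [890],
    "Community bounties": [1_250, 400],
    "Project management": [8_000],
}

# Sum each section, then sum the sections to get the overall total.
subtotals = {name: sum(amounts) for name, amounts in line_items.items()}
total_ada = sum(subtotals.values())

for name, subtotal in subtotals.items():
    print(f"{name}: {subtotal:,} ADA")
print(f"Total: {total_ada:,} ADA")
```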

[VALUE FOR MONEY] How does the cost of the project represent value for money for the Cardano ecosystem?

This project represents value for money because it combines building a tool with testing its practical use, populating it with enough data to make it usable from the start, support for users, and ongoing maintenance for a year. There is also an element of targeted community engagement, which we believe should be part of most proposals to build community tools. The proposal also introduces some novel thinking, in the shape of adapting the Nanopublications standard; since this approach is new to Catalyst, we believe it will prove interesting and fruitful for the developer community, and that this proposal could represent the start of some far-reaching benefits to the Cardano ecosystem for a relatively low cost.

The pay rates given are standard freelance rates in the relevant fields in the parts of the world where we are based. (Note that freelance rates are generally higher than salary rates, since they take into account the employment overheads of the people contracted. For example, freelancers, unlike employees, do not get sick pay, holiday pay, or national insurance contributions, and have to pay all the overheads for their own workspaces.)

Everyone working on this project is taking on the currency risk of being paid in ADA. In converting our costs to ADA, we anticipate that, despite the recent price rally, macro market conditions will continue to suppress ₳ prices by March 2024. As of late November 2023, we are basing our budget on a conversion rate of $0.31 per ₳ ($/₳ = 0.31).

Based on the above, we believe this proposal offers excellent value for money.
