The internet deserves
better information

We are building the infrastructure to make quality information findable, verifiable, and accessible to everyone, at a fraction of today's cost.

The authority problem

The internet has lost the ability to tell us who to trust

PageRank was the last serious attempt to solve this. Google's insight was that a page linked to by many other pages is probably more authoritative than one that is not. That worked brilliantly when the web was a network of documents built by people who cared enough to create hyperlinks. It was never designed for a web of creators, platforms, and machine-generated content.

Today, social metrics measure popularity, not expertise. A tweet from an mRNA researcher in Rotterdam gets the same algorithmic weight as one from someone who read their first abstract last week. Verification badges, once meaningful, have become a pay-to-play feature on most major platforms. Institutional affiliation is too slow and too exclusionary: it takes years to acquire a title that reflects real knowledge, and it completely misses independent experts, citizen scientists, and coral reef biologists with no university affiliation who have spent thirty years in the water.

We are left with no working mechanism to answer the most basic question anyone asks when they encounter information online: is this person actually an authority on this topic? That question is not trivial. It is the foundation of every rational decision made on the basis of information. And right now, the infrastructure to answer it simply does not exist.

Why AI makes this urgent

AI restructures the economics of content, not just the volume

AI does not just add more content to the existing noise. It restructures the economics of content production entirely. The marginal cost of producing a convincing, plausible-sounding piece of writing, analysis, or code has dropped to near zero. A model that would have cost millions to run three years ago now runs on a laptop. The constraint is no longer production: it is verification.

The constraint is no longer production. It is verification.

LLMs present a subtler problem than most people have fully reckoned with. They are trained on web data, they generate content that ends up on the web, and that content gets scraped back into the next training run. This feedback loop means synthetic content gradually dilutes the underlying corpus. The model's sense of what a coral reef biologist sounds like starts to drift toward what AI thinks a coral reef biologist sounds like, rather than what actual coral reef biologists have written over decades. That drift is cumulative and largely invisible.

Retrieval systems compound this further. RAG pipelines and search APIs surface whatever ranks best, regardless of whether a human with genuine expertise was involved. An OSINT analyst's meticulous three-part thread gets the same retrieval weight as a hallucinated summary of the same topic. Every layer of the AI stack inherits the authority problem from the web below it. As models become more capable, the systems they are built on become less reliable. The gap between what AI can do and what AI can safely be trusted to do widens unless something changes at the infrastructure level.

What we built instead

We started with a different question

Not "what pages rank for this topic?" but "who actually has authority on this topic?" Those two questions sound similar, but they lead to completely different architectures. One starts from documents. The other starts from people.

The Authority Index is a graph of millions of verified human authorities, continuously updated and queryable across hundreds of topic domains. It covers mRNA researchers and maritime lawyers, artisan bakers and Harajuku fashion writers, OSINT analysts and coral reef biologists. We built it by mapping cross-platform signals that are genuinely hard to fake: sustained output over time, peer citation and reference, consistent topical specificity, and presence across multiple independent platforms. A single viral post proves nothing. A pattern of substantive engagement in a specific domain, sustained over years and across multiple contexts, is much harder to manufacture.

Authority is a spectrum, not a binary credential. We treat it that way. Someone can be a tier-one authority on wastewater treatment policy and a complete non-expert on the electoral system, even if they post about both. Our graph captures that granularity. We publish the results, not the recipe: the methodology stays proprietary, but the index is open to query. The principle is straightforward: we measure real expertise, not reputation proxies.
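To make the idea concrete, here is a toy sketch of how several hard-to-fake signals could combine into a per-topic score. The weights, normalization caps, and signal names are illustrative assumptions only; the actual Authority Index methodology is proprietary and not described in this document. The point is simply that a composite score built from sustained, cross-platform signals is harder to game than any single metric.

```python
# Illustrative signal weights -- NOT the Authority Index methodology,
# which remains proprietary. This only shows why a composite,
# per-topic score is harder to manufacture than one viral post.
WEIGHTS = {
    "years_active": 0.3,     # sustained output over time
    "peer_citations": 0.3,   # citation and reference by peers
    "topical_focus": 0.2,    # consistent topical specificity (0..1)
    "platform_count": 0.2,   # presence across independent platforms
}

def topic_authority(years_active: float, peer_citations: int,
                    topical_focus: float, platform_count: int) -> float:
    """Combine normalized signals into a 0..1 score for ONE topic."""
    signals = {
        "years_active": min(years_active / 10.0, 1.0),
        "peer_citations": min(peer_citations / 100.0, 1.0),
        "topical_focus": topical_focus,
        "platform_count": min(platform_count / 4.0, 1.0),
    }
    return sum(WEIGHTS[k] * v for k, v in signals.items())

# Decades of focused reef-biology work scores high on that one topic...
reef = topic_authority(years_active=30, peer_citations=80,
                       topical_focus=0.9, platform_count=3)
# ...while a single viral post on an unrelated topic scores near zero.
viral = topic_authority(years_active=0.1, peer_citations=2,
                        topical_focus=0.1, platform_count=1)
```

Because the score is computed per topic, the same person can rank high on one subject and near zero on another, which is exactly the granularity described above.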

The trust layer

One graph, many applications, one consistent answer

Every AI system that retrieves, generates, or trains on web content needs to be able to ask a simple question: is this from a verified authority? That capability, implemented as a fast, composable API, is what we call the trust layer. It sits below the application, not inside it.

A RAG pipeline building a medical Q&A bot can filter its retrieval pool to verified healthcare authorities before the first token is generated. A foundation model training run can weight its corpus by verified expertise rather than raw engagement. A newsroom verification tool can check a source in seconds. A brand partnership platform can surface rising creators with genuine domain authority, not inflated follower counts. A government disinformation unit can flag content that claims expertise the author does not have. One graph, many applications, one consistent answer to the question of who knows what.
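As a sketch of how the first of those applications might look in practice: the snippet below filters a retrieval pool before any generation happens. The document does not publish the actual API surface, so the `verify` callable, field names, and example data are all hypothetical; in a real pipeline, `verify` would wrap a call to the trust layer's API.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]

def filter_retrieval_pool(documents: List[Doc], topic: str,
                          verify: Callable[[str, str], bool]) -> List[Doc]:
    """Keep only documents whose authors the trust layer verifies
    for the given topic. `verify(author_id, topic)` is injected so
    the pipeline stays testable offline; in production it would wrap
    an HTTP call to the authority-verification API."""
    return [d for d in documents if verify(d["author_id"], topic)]

# Example: a medical Q&A bot filters its pool before the first token
# is generated. The set below is a stand-in for the real index.
VERIFIED = {("dr_jansen", "cardiology")}
verify = lambda author, topic: (author, topic) in VERIFIED

pool = [
    {"author_id": "dr_jansen", "text": "Peer-reviewed overview of statins."},
    {"author_id": "anon_blog", "text": "Plausible-sounding synthetic summary."},
]
trusted = filter_retrieval_pool(pool, "cardiology", verify)
```

The design choice worth noting is that the filter sits below the application: the RAG pipeline does not need to know how authority is computed, only whether a given author clears the bar for a given topic.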

This is not a feature. It is infrastructure. The same way TLS made "is this connection secure?" answerable across the entire web, a trust layer makes "is this from a verified authority?" answerable across every system that ingests or produces information. The value compounds with adoption: every new application that queries the Authority Index adds to the signal we use to improve it, and every improvement makes every application more accurate.

This is not a feature. It is infrastructure.

Why Europe

Trust infrastructure should not live inside the systems it evaluates

We did not choose to build this in Europe by accident. The hyperscaler ecosystem, however capable, has a structural conflict of interest here. The same companies whose business models depend on scraping, indexing, and monetizing web content are not the right companies to build the system that evaluates that content's authority. That conflict shapes incentives in ways that are subtle but real: what gets indexed, how it gets ranked, what counts as authoritative.

Our infrastructure runs on Hetzner in Helsinki, with models from Mistral in Paris and vector storage on Weaviate in Amsterdam. That is not a GDPR checkbox. It is a statement about where we think trust infrastructure should live: outside the blast radius of any single hyperscaler, subject to European data governance frameworks, and accountable to European courts. Digital sovereignty matters in almost every product category, but it matters most in the category that is literally deciding whose knowledge gets amplified and whose gets suppressed.

Europe also has an institutional head start here that the rest of the world is only beginning to catch up to. GDPR established that data about people has obligations attached to it. The AI Act is now establishing that systems trained on that data have obligations too. Trust infrastructure fits naturally into that framework: it is the mechanism that lets any system demonstrate, verifiably, that the content it relies on comes from a real expert rather than a synthetic approximation of one. We are not building to comply with regulation. We are building what regulation is pointing toward.

The future we are building toward

As automatic as HTTPS, as foundational as DNS

We want "is this from a verified authority?" to become as automatic as "is this HTTPS?" Every developer building a retrieval system, every journalist verifying a source, every model training run curating a corpus: they should all be able to assume that authority verification is available, fast, and accurate. Just as no serious web developer today debates whether to implement transport security, no serious AI developer ten years from now should have to build authority verification from scratch.

When every RAG pipeline can filter for verified expertise, the floor for AI response quality rises. When every newsroom can check a source in seconds, the cost of disinformation increases. When every training dataset is weighted by genuine domain knowledge, the feedback loop that degrades models over time gets interrupted. None of this happens instantly. It requires infrastructure to exist before the applications that depend on it are built. That is why we started building before it was fashionable, and why we intend to be the layer that the rest of the ecosystem builds on.

That is the mission. It is bigger than any single product. It is the work of a generation, and we are glad to be doing it now.

The people behind the Authority Index

We're a tight-knit team with deep roots in public and social media data. Many of us have been working together for 5 to 10+ years. We know each other's strengths, we move fast, and we were building in this space long before it was fashionable.

Thomas Slabbers

Duco Janssen

Joran Cornelisse

Peter van Kampen

Want to work with us?

We're always interested in exceptional people who care about signal over noise.

Apply now