Entity Resolution

Canonical Identity Mapping

Creates a unified identity registry across your document corpus. Resolves aliases, tracks role changes, and maps relationships between individuals.

• Alias Detection
• Role Tracking
• Relationship Mapping
• Temporal Context
• Conflict Detection

The Problem

The same person appears five different ways across 500 pages. "Dr. J. Smith," "Jane Smith," "Ms Smith," "the psychologist," and "JS" are all one actor — but fragmented identity means fragmented analysis. Manual tracking breaks down at scale, and missed connections mean missed accountability.

How It Works

1. Extract named entities with NLP
2. Cluster by phonetic similarity and context
3. Resolve clusters using document proximity and co-reference chains
4. Build canonical identity with all known aliases
5. Map relationships from co-occurrence and explicit mentions
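
As a rough illustration of steps 2 to 4, the sketch below merges surface forms into clusters with a union-find structure and then picks a canonical name for each cluster. The `links` input stands in for whatever pairwise matches the clustering and co-reference steps produce, and the "longest surface form wins" rule is an assumption made for the example, not a description of the engine's actual logic.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure used to merge mentions into identity clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        # Path halving keeps lookups cheap as clusters grow.
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def build_registry(mentions, links):
    """Group mentions into clusters and pick a canonical name for each.

    mentions: list of surface forms found in the corpus.
    links: pairs of surface forms judged to refer to the same person
           (hypothetical output of the clustering and co-reference steps).
    """
    uf = UnionFind()
    for mention in mentions:
        uf.find(mention)
    for a, b in links:
        uf.union(a, b)

    clusters = defaultdict(set)
    for mention in mentions:
        clusters[uf.find(mention)].add(mention)

    registry = {}
    for members in clusters.values():
        # Assumption: the longest surface form is the most complete name.
        canonical = max(members, key=lambda m: (len(m), m))
        registry[canonical] = sorted(members - {canonical})
    return registry


mentions = ["Dr. J. Smith", "Jane Smith", "Ms Smith", "JS", "Prof. R. Williams"]
links = [
    ("Dr. J. Smith", "Jane Smith"),
    ("Jane Smith", "Ms Smith"),
    ("JS", "Dr. J. Smith"),
]
registry = build_registry(mentions, links)
```

Because merging is transitive, "Ms Smith" ends up in the same cluster as "JS" even though no link connects them directly. That transitivity is also why a single bad link can merge two different people, which is the case for reviewing low-confidence merges before they are finalised.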

Inputs

• Document corpus
• Named entity extraction
• Co-reference resolution

Outputs

• Identity registry
• Relationship graph
• Conflict of interest flags
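
One simple way to picture the relationship graph is as weighted edges between canonical identities, where the weight is the number of documents in which both appear. The sketch below is illustrative only: the function name, the `min_shared_docs` threshold, and the sample document contents are assumptions, and it covers co-occurrence only, not relationships taken from explicit mentions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(doc_entities, min_shared_docs=2):
    """Count how many documents each pair of canonical identities shares.

    doc_entities: mapping of document id -> set of canonical names found in it.
    Returns a dict of (name_a, name_b) -> shared document count, keeping only
    pairs at or above min_shared_docs (an assumed, tunable threshold).
    """
    counts = Counter()
    for names in doc_entities.values():
        # Sorting gives each pair a stable orientation across documents.
        for pair in combinations(sorted(names), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_shared_docs}


# Made-up sample data for the example.
docs = {
    "Doc 3": {"Dr. Jane Smith", "Prof. R. Williams"},
    "Doc 89": {"Dr. Jane Smith", "Prof. R. Williams", "Client A"},
    "Doc 156": {"Dr. Jane Smith", "Client A"},
}
edges = cooccurrence_graph(docs)
```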

What You Get

Canonical Identity Card

Canonical Name: Dr. Jane Smith
Aliases: J. Smith, Jane Smith, Ms Smith, "the psychologist," JS
Role: Clinical Psychologist (2019–present)
Organisation: Regional Assessment Service
Documents: 47 appearances across 23 documents
First Seen: Doc 3, p.2 (14 March 2021)
Last Seen: Doc 156, p.8 (9 October 2024)
Relationships: Supervised by Prof. R. Williams; assessed Client A (12 sessions); authored reports E4.1, E4.3, E4.7
Flags: Role change: "Independent Expert" → "Trust Employee" at Doc 89 (no disclosure in subsequent reports)
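
A card like this could be modelled as a small record type. The sketch below is a hypothetical shape, not the product's schema: the class and field names are invented for illustration, and the role-change check only compares consecutive role records. It does not model the "no disclosure" test, and the document assigned to the earlier role is made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RoleRecord:
    role: str
    document: str  # document where this role description was observed


@dataclass
class IdentityCard:
    canonical_name: str
    aliases: list = field(default_factory=list)
    roles: list = field(default_factory=list)  # RoleRecord items in document order
    flags: list = field(default_factory=list)

    def flag_role_changes(self):
        """Append a flag whenever consecutive role records differ."""
        for previous, current in zip(self.roles, self.roles[1:]):
            if previous.role != current.role:
                self.flags.append(
                    f'Role change: "{previous.role}" -> "{current.role}" '
                    f"at {current.document}"
                )
        return self.flags


card = IdentityCard(
    canonical_name="Dr. Jane Smith",
    aliases=["J. Smith", "Jane Smith", "Ms Smith", "JS"],
    roles=[
        RoleRecord("Independent Expert", "Doc 3"),  # illustrative document id
        RoleRecord("Trust Employee", "Doc 89"),
    ],
)
flags = card.flag_role_changes()
```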

Works With

Temporal Parser

Uses resolved entities to build per-person timelines and detect when someone’s role or involvement changed.

Contradiction

Relies on canonical identities to determine whether two conflicting statements came from the same or different sources.

Accountability

Maps statutory duties to resolved individuals, ensuring breach findings attach to the right person regardless of how they were named.

Use Cases

Family court proceedings

A social worker is referred to by name, job title, and pronouns across 40 reports from different agencies. Entity Resolution unifies these references so their assessments can be tracked as a single professional narrative.

Corporate investigations

Directors of shell companies often appear under variations of their names, which obscures beneficial ownership. The engine surfaces hidden connections between entities that appeared unrelated.

Multi-agency reviews

Police, health, and education records often use different naming conventions for the same individuals. Entity Resolution identifies witnesses and professionals consistently across all three.

Technical Approach

• Named Entity Recognition using spaCy transformer models, tuned for UK institutional documents (titles, honorifics, role-based references)
• Phonetic clustering via Soundex and Double Metaphone algorithms to catch spelling variations and transliterations
• Co-reference resolution using neural co-reference chains to link pronouns and descriptions back to named entities
• Manual verification layer: ambiguous merges (confidence < 0.85) are flagged for human review before finalisation
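
To make the phonetic step and the review threshold concrete, here is a minimal, standard-library sketch of American Soundex together with a routing rule based on the 0.85 cut-off. The function names and the `"auto_merge"` / `"human_review"` labels are assumptions for illustration. Double Metaphone is not shown, and how the engine actually computes merge confidence is not specified here.

```python
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}

REVIEW_THRESHOLD = 0.85  # merges scoring below this go to a human reviewer


def soundex(name):
    """Return the four-character American Soundex key for a name."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            previous = digit
    return (code + "000")[:4]


def route_merge(confidence):
    """Decide whether a proposed merge is applied or sent for review."""
    return "auto_merge" if confidence >= REVIEW_THRESHOLD else "human_review"


same_key = soundex("Smith") == soundex("Smyth")  # both map to "S530"
decision = route_merge(0.80)                     # "human_review"
```

A shared phonetic key is only a candidate signal. Unrelated surnames can collide on the same key, so in a design like this the key would narrow down which pairs to compare rather than justify a merge on its own.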