Overview
The AI Risk Observatory processes annual reports from UK-listed public companies through a two-stage AI classification pipeline. The dataset spans all annual reports published between 2020 and 2026 by 1,362 companies, totalling 9,821 filings. Of these, 4,637 filings contain at least one AI-relevant mention, and after quality filters 4,084 carry meaningful AI signal. Because annual reports can run to hundreds of pages, we extract only the relevant AI mentions and their surrounding context — giving us 24,189 annotated text chunks in total.
The pipeline follows three stages:
- Extract all relevant AI mentions from each filing.
- Broadly classify the type of AI mentioned into six categories: Adoption, Risk, Harm, Vendor reference, General/other/ambiguous, or None (false positive).
- For each Adoption, Risk, and Vendor mention, classify it into a detailed sub-taxonomy.
We also run a substantiveness classifier to measure the depth of each mention, rating it on a scale from boilerplate to substantive.
The pipeline is illustrated below. Phase 1 labels are not mutually exclusive, so those counts sum to more than the number of extracted reports.
[Pipeline figure: 9,821 filings examined → 4,637 (47%) mention AI; 5,184 (53%) do not.]
Phase 1: Classify the Type of AI Mention
AI mentions are classified into six categories.
[Figure: reports tagged per mention type — 3,052 (66% of extracted), 2,916 (63%), 1,783 (38%), 1,001 (22%), 553 (12%), and 7 (0%).]
Phase 2: Detailed Taxonomies
From Phase 1, only Adoption, Risk, and Vendor are processed further into the following subcategories.
1. Data
Scope
To measure AI risk, adoption, and vendor dependence across the UK economy, we process all annual reports published by all public companies in the UK. There are 1,660 public companies listed on UK markets (LSE Main Market, AIM Market, and AQSE). After excluding companies not registered in the UK (e.g. Irish or Canadian companies listed on these exchanges) and firms without filings available via Companies House, our working universe is approximately 1,362 companies. Each company files, on average, one annual report per year.1
The current report universe breaks down across exchange segments as follows.
| Segment | Number of companies | Number of reports |
|---|---|---|
| Main Market | 776 | 7,827 |
| Main Market (FTSE 350) | 289 | 3,638 |
| Main Market (FTSE 100) | 85 | 1,359 |
| Main Market (FTSE 250) | 204 | 2,279 |
| AIM | 489 | 1,462 |
| Aquis Exchange | 33 | 74 |
Decisions & Rationale
Why annual reports? Unlike earnings calls, press releases, or public media, annual reports are audited, structured, and published on a consistent cadence — making them a reliable, high-signal source of information. UK public companies must publish annual accounts, a strategic report, a directors' report, and an auditor's report under the Companies Act 2006. All listed companies share that statutory core, but Main Market issuers face tighter deadlines and more detailed disclosure rules than AIM and AQSE companies.5
This makes annual reports well suited to tracking trends across the UK economy over time. There are two primary limitations: (1) they are inherently backward-looking, often with a significant delay; and (2) their highly regulated nature means many statements are boilerplate and contain little real information.2
Why 2020–2026? We chose this window to capture a pre-ChatGPT baseline (before the late-2022 inflection) and the rapid adoption cycle that followed.
How do we map to CNI? The Critical National Infrastructure in the UK has 13 distinct sectors. Each company in our database has an ISIC sector code that only partially maps to CNI sectors. We take a conservative approach, using an LLM classifier to assign CNI sectors to companies that do not map directly from ISIC; when no assignment can be made, we use an “Other” CNI category.3 A major limitation of CNI analysis via annual reports is that some sectors — such as Space, Emergency Services, or Civil Nuclear — have few public companies or suppliers represented.4
Data Provider Acknowledgment
Converting PDFs to clean, structured text is technically demanding, and doing so at this scale would have exceeded our compute budget. We partnered with FinancialReports.eu, a third-party financial data provider, to obtain all annual reports in our scope in Markdown format. Their filings API and generous support made this project possible.
2. Pre-processing
Chunking Approach
Once each annual report is in structured Markdown text, we split it into chunks using a sliding-window approach that respects paragraph and section boundaries, with generous padding around each AI mention. An AI keyword filter isolates sections that explicitly mention AI or closely related techniques; only those sections are retained for further annotation as AI mentions. Each chunk carries metadata: company identifier, reporting year, release month, report section (e.g. Risk Factors, Strategy), and a stable chunk ID for traceability.
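The keyword filter and context padding can be sketched at paragraph level as follows. The keyword list and padding size here are illustrative assumptions, not the production configuration:

```python
import re

# Illustrative AI-keyword filter; the production keyword list is broader.
AI_PATTERN = re.compile(
    r"artificial intelligence|machine learning|large language model|"
    r"generative ai|deep learning|\bai\b",
    re.IGNORECASE,
)

def extract_ai_chunks(paragraphs: list[str], pad: int = 1) -> list[str]:
    """Keep paragraphs mentioning AI, padded with `pad` neighbours for context."""
    keep: set[int] = set()
    for i, para in enumerate(paragraphs):
        if AI_PATTERN.search(para):
            keep.update(range(max(0, i - pad), min(len(paragraphs), i + pad + 1)))
    # Merge adjacent kept indices into contiguous chunks.
    chunks, current = [], []
    for i in sorted(keep):
        if current and i != current[-1] + 1:
            chunks.append("\n\n".join(paragraphs[j] for j in current))
            current = []
        current.append(i)
    if current:
        chunks.append("\n\n".join(paragraphs[j] for j in current))
    return chunks
```

In the real pipeline each chunk would additionally carry the metadata described above (company identifier, reporting year, release month, report section, and a stable chunk ID).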
Chunking Results
The table below shows filings with AI mentions and the number of AI mentions extracted per year.
| Year | Number of Filings | Filings with AI Mention (% of total) | Count of AI mentions |
|---|---|---|---|
| 2020 | 1,007 | 199 (20%) | 492 |
| 2021 | 1,328 | 364 (27%) | 1,095 |
| 2022 | 1,853 | 526 (28%) | 1,856 |
| 2023 | 1,905 | 701 (37%) | 2,310 |
| 2024 | 1,828 | 1,009 (55%) | 5,091 |
| 2025 | 1,561 | 1,023 (66%) | 6,560 |
| 2026 | 339 | 262 (77%) | 3,253 |
| Total | 9,821 | 4,084 (42%) | 20,657 |
3. Processing
Phase 1: Mention-Type Classification
First, each chunk is passed to an LLM classifier that decides whether the text contains a genuine AI mention and, if so, assigns one or more mention-type labels. Chunks assigned only the None label are filtered out as false positives before Phase 2.
The Phase 1 classifier uses the following taxonomy:
| Label | Definition |
|---|---|
| Adoption | Real current deployment, implementation, rollout, pilot, or use of AI by the company or for its clients. |
| Risk | AI directly attributed as the source of a risk or downside to the firm, another party, or society at large. |
| Harm | AI described as causing an actual past or ongoing injury, damage, or loss. |
| Vendor reference | A provider of AI technology, model, platform, compute infrastructure, or AI hardware is referenced. |
| General, other, or ambiguous | AI mentioned but too high-level, vague, or otherwise outside the adoption, risk, harm, and vendor categories. |
| None | No real AI mention / false positive. Exclusive — cannot co-occur with others. |
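The label-set and exclusivity rules in this taxonomy lend themselves to a simple validation pass on classifier output. A minimal sketch, assuming snake_case label names mirroring the table above (the exact field names in the production schema are an assumption):

```python
# Permitted Phase 1 labels; "none" is exclusive and cannot co-occur.
PHASE1_LABELS = {
    "adoption", "risk", "harm", "vendor_reference",
    "general_other_or_ambiguous", "none",
}

def validate_phase1(labels: list[str]) -> list[str]:
    """Raise on malformed classifier output; return the cleaned label list."""
    if not labels:
        raise ValueError("classifier returned no labels")
    unknown = set(labels) - PHASE1_LABELS
    if unknown:
        raise ValueError(f"labels outside the permitted set: {unknown}")
    if "none" in labels and len(labels) > 1:
        raise ValueError('"none" is exclusive and cannot co-occur with other labels')
    return sorted(set(labels))
```

Outputs that fail a check like this would be retried or flagged, per the quality-assurance rules described later in this document.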
Phase 1 Label Distribution Over Time
Distribution of Phase 1 mention-type labels across all AI-mentioning filings, by year. Labels are not mutually exclusive, so a single filing can contribute to multiple categories.
AI Mention Types Over Time
Each bar shows how many reports per year were tagged with each mention type (threshold: at least one mention in the report with a confidence score ≥ 0.2). p = partial year; 2026 data is not a full-year sample.
Phase 2: Deep-Taxonomy Classification
Chunks that passed Phase 1 are processed by dedicated classifiers depending on their mention types. We process three of the Phase 1 mention types — adoption, risk, and vendor — each through its own LLM classifier. Chunks tagged as Risk are also scored for substantiveness. The taxonomies used are as follows:
Adoption Taxonomy
| Label | Definition |
|---|---|
| Traditional AI/ML | AI that is not LLM or agentic AI, such as computer vision, predictive analytics, fraud detection, recommendation engines, anomaly detection, or ML-enabled robotic process automation. |
| LLM/GenAI | Large language models and GenAI, including GPT, ChatGPT, Gemini, Claude, Copilot, text generation, NLP chatbots, document summarization, or code generation. |
| Agentic systems | AI systems or agents that autonomously execute tasks and take actions with limited human oversight. AI assistants, copilots, and decision-support tools are not agentic unless autonomous execution is clear. |
| Ambiguous | Current AI adoption is present, but too vague to classify as traditional AI, LLM, or agentic without guessing. |
Risk Taxonomy
| Label | Definition |
|---|---|
| Strategic / competitive | AI-driven competitive disadvantage, displacement, failure to adapt, or pricing and margin erosion. |
| Operational / technical | AI reliability, accuracy, safety, or model-risk failures that degrade decisions or operations, including unsafe employee AI use. |
| Cybersecurity | AI-enabled attacks, fraud, breach pathways, or adversarial abuse. |
| Workforce impacts | AI-driven displacement or skills gaps. |
| Regulatory / compliance | AI-specific legal, regulatory, privacy, or IP liability, compliance burden, or enforcement exposure. |
| Information integrity | AI-enabled misinformation, deepfakes, content authenticity manipulation, or similar information integrity failures. |
| Reputational / ethical | Trust, fairness, ethics, or rights concerns. |
| Third-party / supply chain | Dependency on AI vendors, concentration risk, or exposure to failures or misuse of AI in the company supply chain. |
| Environmental impact | Energy, carbon, or resource-burden risk. |
| National security | AI-linked geopolitical or security destabilisation, or exposure of critical systems. |
| None | No attributable risk category (or too vague to assign one). |
Vendor Taxonomy
Vendors are tagged against a predefined list of named providers: OpenAI, Microsoft, Google, Amazon / AWS, Nvidia, Salesforce, Databricks, IBM, Snowflake, Meta, Anthropic, xAI / Grok, Palantir, Arm, Mistral, Cohere, Hugging Face, Pinecone, and UK AI vendors (Darktrace, Quantexa, Featurespace, Faculty AI, BenevolentAI). Additional categories cover open-source models, internal AI model development or deployment, undisclosed third-party AI vendors, and other named providers outside the predefined list.
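Tagging against a predefined provider list can be sketched as alias normalisation with word-boundary matching. The alias table below is a hypothetical subset of the provider list, not the production mapping:

```python
import re

# Hypothetical alias -> canonical-provider table (subset for illustration).
VENDOR_ALIASES = {
    "openai": "OpenAI", "chatgpt": "OpenAI",
    "aws": "Amazon / AWS", "amazon web services": "Amazon / AWS",
    "nvidia": "Nvidia",
    "darktrace": "Darktrace",
}

def tag_vendors(text: str) -> set[str]:
    """Return the canonical providers mentioned in a chunk (case-insensitive)."""
    found = set()
    for alias, canonical in VENDOR_ALIASES.items():
        # Word boundaries prevent false hits like "aws" inside "flaws".
        if re.search(rf"\b{re.escape(alias)}\b", text, re.IGNORECASE):
            found.add(canonical)
    return found
```

Mentions that match no alias would then fall into the residual categories described above (open-source models, internal development, undisclosed third-party vendors, or other named providers).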
Substantiveness
| Level | Definition |
|---|---|
| Boilerplate | Generic AI language; could appear in many reports unchanged. |
| Moderate | A specific area is identified, but without concrete mechanisms, metrics, or mitigation steps. |
| Substantive | Concrete mechanism, tangible action, commitment, metric, or timeline. |
4. Quality Assurance
We enforce structured outputs and explicit validation rules to reduce noise and improve reproducibility. We apply the following checks:
- Structured outputs — classifiers write to strict JSON response schemas; malformed outputs or labels outside the permitted set are retried or flagged.
- Conservative prompting — prompts require explicit AI attribution and discourage over-labelling; the default outcome is none or general_other_or_ambiguous.
- Temperature zero — all classifier calls use temperature zero for deterministic, reproducible outputs.
- Chunk-level traceability — every annotation maps back to a company, year, and report section via a stable chunk ID.
- QA scripts — we run QA tests across each pipeline stage, checking primarily for anomalies and out-of-distribution outputs:
  - Document size, length, duplication, fiscal-year match, and text anomalies (non-Markdown formatting, unexpected characters).
  - Outlier analysis on the distribution of Phase 1 and Phase 2 labels per company, report, and year; AI mentions extracted per report; and chunk-creation keywords.
  - All flagged outputs were manually reviewed.
- Human review — the dataset is vast, and while we have made every effort to audit anomalies arising from data processing, some errors and misclassifications may remain. Our data is available for download. If you spot an issue, please file it on the repository.
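The outlier analysis mentioned in the QA scripts can be sketched as a z-score check on per-report mention counts; the data shape and threshold here are illustrative assumptions:

```python
from statistics import mean, stdev

def flag_outliers(mention_counts: dict[str, int], z_threshold: float = 3.0) -> list[str]:
    """Return report IDs whose mention count lies more than z_threshold
    standard deviations from the mean of the given distribution."""
    counts = list(mention_counts.values())
    if len(counts) < 2:
        return []  # no distribution to compare against
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []  # all counts identical; nothing to flag
    return [rid for rid, c in mention_counts.items() if abs(c - mu) / sigma > z_threshold]
```

In practice the same check would be run per year and per company, so that a report with an unusually high or low extraction count is surfaced for manual review.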
Labeled Examples
Browse a sample of annotated text chunks. Select one to view the excerpt, its metadata, and the taxonomy labels applied by both phases of the classifier.
Phase 1
Mention type
Phase 2
Adoption
Full Excerpt Text
“ # onfido Onfido is building the new identity standard for the internet. Its AI-based technology assesses whether a user's government-issued ID is genuine or fraudulent, and then compares it against their facial biometrics. Using computer vision and a number of other AI technologies, Onfido can verify against 4,500 different types of identity documents across 195 countries, using techniques like "facial liveness" to see patterns invisible to the human eye. Onfido was founded in 2012 and has offices in London, San Francisco, New York, Lisbon, Paris, New Delhi and Singapore. The company has attracted over 1,500 customers in 60 countries worldwide, including industry leaders such as GoCardless, Nutmeg, Bitstamp and Revolut. These customers are choosing Onfido over others because of its ability to scale, speed in on-boarding new customers (15 seconds for flash verification), preventing fraud, and its advanced biometric technology. Augmentum invested an additional £3.7 million in a convertible loan note ("CLN") in December 2019 as part of a £4.7 million round. This converted into equity when Onfido raised an additional £64.7 million in April 2020.”
Footnotes
1. Some companies have multiple subsidiaries with separate filings, while others were recently listed or spun off and therefore have fewer years of filings available. This means the per-company filing count is not uniform across the dataset.
2. To address the boilerplate problem we apply a substantiveness classifier (see Phase 2 above) that rates each mention on a scale from boilerplate to substantive, allowing users to filter to high-signal disclosures.
3. The ISIC-to-CNI mapping follows two steps: a direct lookup for ISIC codes that clearly correspond to a CNI sector, followed by an LLM classifier for ambiguous cases. Companies that cannot be assigned to any CNI sector are labelled “Other”.
4. The following CNI sectors have particularly low public-company representation in our dataset: Space (0), Emergency Services (0), Civil Nuclear (2), Water (18), Defence (20), Government (20), Data Infrastructure (22), Communications (28), Chemicals (34). Conclusions drawn about these sectors should be treated with caution.
5. Main Market issuers are generally subject to FCA disclosure and listing rules, including a four-month reporting deadline, while AIM and AQSE companies typically have up to six months. The auditor's formal opinion covers the financial statements, not the annual report narrative as a whole.