
Digital Exposure Quantification Engine Built on Real-World Data.

Nine Years of AI. Not Nine Months.

The Thrivaca™ Engine has been built, trained, and validated since 2016 in collaboration with US Strategic Command, MITRE, the University of Chicago, the Society of Actuaries, NIST, and Anthropic.

AI operates at every stage of the risk quantification pipeline —
not as a single model, but as a collection of purpose-built components, each trained for a specific function.

The result: the only actuarial AI engine for digital exposure with a NIST-approved methodology, patented technology, and measured outcomes validated against real-world losses.

Explore Cyber Exposure →
Explore AI Exposure →
Explore Cyber Insurance Application →

 

Thrivaca AI Data Engine Visualization

The Data Foundation Behind AI & Cyber Exposure Quantification.

AI Since 2016
32 Authoritative Data Sources
47,000+ Threat-Vulnerability Pairings
r = .668 Breach Correlation
Patent Granted
NIST-Approved
Within 7% of UHC Actual Losses


Where The AI & Cyber Exposure Data Comes From.

The foundation of any AI system is the quality and breadth of its training data.
ArxNimbus has assembled the most extensive AI training dataset of documented cybersecurity incidents and detailed exposure analysis ever compiled for commercial insurance and enterprise risk purposes.

32+ PRIMARY AUTHORITATIVE SOURCES

  • Continuously updated data ingestion from authoritative sources including the Verizon Data Breach Investigations Report, FBI Internet Crime Report, IBM Cost of a Data Breach Report, think tank analyses, research publications, and government agency filings.

  • Financial exposure analysis drawn from SEC disclosures, HHS enforcement actions, and post-breach analyst studies.

  • Updated approximately twice yearly with supplementary quarterly refreshes.

60,000+ CYBER INCIDENT LOSS DATABASE

  • The deepest commercially available historical loss database with actual dollarized loss data — not projected or estimated losses, but real outcomes mapped to incident type, industry, organization size, and attack technique.

  • Incident records spanning thousands of organizations across 580+ industries, normalized per billion dollars of enterprise value.

4,000+ COMPANY RISK PROFILE LIBRARY

  • The largest commercial cyber risk repository covering all NASDAQ, NYSE, S&P 500, and Russell 2000 companies across 580+ industry segments at four-digit NAICS granularity.

  • Updated quarterly.

  • Provides the benchmark dataset that powers T-Score™ peer comparison and industry intelligence.

EXTERNAL ATTACK SURFACE INTELLIGENCE

  • Automated scanning of 3.9 billion routable IP addresses across 1,400 ports on a 10-day cycle.

  • Dark web monitoring across 13 million domains.

  • Honeypot and sinkhole networks providing threat intelligence feeds.

  • Patching cadence analysis as both a direct risk indicator and a meta-indicator of overall cybersecurity program maturity.

ENTERPRISE DATA
INTEGRATION

  • For enterprise deployments: API-based integration with existing security tools (Qualys, Wiz, CrowdStrike, Armis, Snyk, ServiceNow, Fortinet, and others) normalizing CVEs, assets, and severity data into one actuarial-grade schema.

  • SBOM/RBOM/AIBOM generation for software and AI supply chain tracking.

  • Snowflake and Databricks data fabric integration for carrier and enterprise analytics environments.

Our dataset has been reviewed by the largest cyber insurance firms and Fortune 10 companies, and includes the most comprehensive collection of documented losses, vulnerabilities, and threat traceability available.

The data set is growing at a current rate of over 1,100 company risk exposure profiles per month.


How the Thrivaca AI Engine Quantifies Risk. Purpose-Built AI At Each Stage.

The Thrivaca Engine processes data through seven sequential stages — the same pipeline that powers cybersecurity exposure quantification, AI exposure governance, and cyber insurance intelligence. AI operates at each stage, not as a single model, but as a collection of purpose-built components, each trained for a specific function within the digital risk quantification workflow.

One engine. Three markets. Actuarial-grade financial intelligence for each.

The engine supports multiple deployment modes: external-scan-only (for insurance underwriting at arm’s length), CORE with client-provided data (audit reports, risk registers, assessments, and direct security stack integration), and UNIFY, sitting on top of the organization’s entire security stack via API. Each successive mode adds more data to the same seven-stage pipeline, producing progressively deeper and more precise financial exposure intelligence.


Stage 1: Data Ingestion & Classification

What it does:

  • Ingests and classifies data from every available source — the depth of ingestion scales with the deployment model.

  • At minimum: historical loss data, breach pattern records, and external attack surface intelligence.

  • With client access (CORE and UNIFY deployments): audit reports, risk registers, penetration testing results, security documentation, SME attestation, SIEM and vulnerability management feeds via direct API connection (Qualys, Wiz, CrowdStrike, Armis, ServiceNow, Fortinet, and others), SBOM/RBOM/AIBOM data from the software and AI supply chain, and full asset inventory including IoT and ICS systems.

How AI operates:

  • Machine learning classifies and tags incoming data to the appropriate risk type and organizational profile regardless of source format.

  • Supervised classification trained on years of categorized incident data routes each record correctly despite significant source heterogeneity — SEC filings, FBI Internet Crime data, SIEM event streams, vulnerability scanner outputs, and audit documentation all look different but must produce consistently structured outputs for downstream modeling.

  • The more data sources connected, the richer the risk model — Thrivaca UNIFY, for example, sits on top of the organization’s entire security stack and normalizes every signal into one actuarial-grade schema.
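The classification-and-routing step above can be illustrated with a toy stand-in for the trained model. The keyword rules and labels below are invented for the sketch; the production system uses supervised classification trained on labeled incident data, not keyword matching.

```python
def route_record(record: dict) -> str:
    """Tag an incoming record with a risk type, regardless of source format.

    A toy rule-based stand-in for a trained classifier: in production this
    decision would come from a model fit on years of categorized incidents.
    """
    text = (record.get("description") or "").lower()
    if "ransom" in text or "encrypt" in text:
        return "ransomware"
    if "phish" in text or "wire transfer" in text:
        return "digital_fraud"
    if "exfiltrat" in text or "records exposed" in text:
        return "data_breach"
    return "unclassified"
```

Whatever the source looks like, an SEC filing excerpt, a SIEM event, or an audit note, the router's job is to emit one consistent label so downstream stages see uniformly structured inputs.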

Stage 2: Normalization & Industry Adjustment

What it does:

  • Normalizes raw data to enterprise value per-billion denominators and applies industry adjustment factors.

How AI operates:

  • Statistical AI models normalize across all six-digit NAICS codes, ensuring comparability regardless of organization size or industry.

  • Anomaly detection models flag data quality issues or outliers that require review before downstream modeling. 
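The per-billion normalization this stage performs can be written down directly. The adjustment factors and NAICS codes below are invented placeholders, not the engine's actual parameters; the sketch only shows why normalization makes organizations of different sizes comparable.

```python
# Hypothetical NAICS-keyed industry adjustment factors (illustrative only).
INDUSTRY_FACTOR = {"5241": 1.3, "6221": 1.6}

def per_billion(loss_usd: float, enterprise_value_usd: float, naics: str) -> float:
    """Express a dollar loss per $1B of enterprise value, industry-adjusted.

    Dividing by enterprise value in billions puts a $20B insurer and a
    $1B hospital system on the same exposure-density scale.
    """
    base = loss_usd / (enterprise_value_usd / 1e9)
    return base * INDUSTRY_FACTOR.get(naics, 1.0)

# A $5M loss at a $20B company and a $250K loss at a $1B company
# normalize to the same $250K-per-billion density (before adjustment).
```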

Stage 3: Control & Threat Mapping

What it does:

  •  Maps control findings to specific NIST 800-53 sub-controls and MITRE ATT&CK techniques. In external-scan deployments, findings come from attack surface analysis.

  • In full-access deployments (CORE with client data, UNIFY on the security stack), findings come from direct evidence: API feeds from vulnerability scanners, SIEM event patterns, audit reports, penetration test results, asset inventory data, and SBOM/RBOM component vulnerabilities — providing dramatically richer control status evidence.

How AI operates:

  • This is the patented, NIST-approved methodology. AI traverses ~47,000 threat-vulnerability pairings to identify which specific controls are relevant to each detected signal — whether that signal originates from an external scan, an internal vulnerability management platform, a SIEM correlation rule, or a supply chain component analysis.

  • Logical constraints enforce real-world attack chains — a firewall vulnerability does not relate to insider threat because the insider is already inside the firewall. These constraints were developed in direct collaboration with NIST and MITRE.

  • Maps findings across two taxonomies simultaneously: FFIEC (23 high-level threat categories) and MITRE ATT&CK (~1,200 techniques).  
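The constraint-pruning idea in this stage can be sketched in a few lines: enumerate candidate threat-vulnerability pairings, then drop combinations that violate attack-chain logic. The threat and vulnerability names and the exclusion set below are invented for illustration; the real engine works over ~47,000 pairings.

```python
from itertools import product

threats = ["insider_threat", "ransomware", "denial_of_service"]
vulns = ["perimeter_firewall_misconfig", "weak_endpoint_patching"]

# Pairings that cannot occur in a real attack chain. For example, an
# insider is already inside the firewall, so a perimeter firewall flaw
# is irrelevant to insider threat.
EXCLUDED = {("insider_threat", "perimeter_firewall_misconfig")}

# Keep only the logically possible pairings.
pairings = [(t, v) for t, v in product(threats, vulns) if (t, v) not in EXCLUDED]
```

Encoding the exclusions as explicit rules is what keeps the mapping auditable: every surviving pairing can be defended, and every dropped one has a stated reason.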

Stage 4: Financial Impact Calculation

What it does:

  • Calculates potential financial impact across eight risk types: hacktivism, denial of service, ransomware, digital fraud, insider threat, data breach, IP theft, and business interruption. 

How AI operates:

  •  AI-assisted formula selection executes economic formulas developed with the University of Chicago, managing parameter selection based on organizational profile inputs.

  • Produces Near Worst Case Scenario (NWCS) values calibrated at 85% of worst-case loss — realistic yet conservative, as determined by leading economists.  
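The NWCS calibration is simple to state in code. The eight risk types and the 85% factor come from the text above; the worst-case dollar inputs in the example are made up, and the real engine's valuation formulas are far richer than a single multiplier.

```python
# The eight risk types named in the text.
RISK_TYPES = ("hacktivism", "denial_of_service", "ransomware", "digital_fraud",
              "insider_threat", "data_breach", "ip_theft", "business_interruption")

NWCS_FACTOR = 0.85  # Near Worst Case Scenario: 85% of worst-case loss

def near_worst_case(worst_case_by_type: dict) -> dict:
    """Scale each risk type's modeled worst-case loss to its NWCS value."""
    return {t: NWCS_FACTOR * worst_case_by_type.get(t, 0.0) for t in RISK_TYPES}

# Illustrative worst-case inputs (invented figures):
exposure = near_worst_case({"ransomware": 200_000_000.0, "data_breach": 120_000_000.0})
```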



Built on Real-World Data, Not Synthetic Assumptions.

The Thrivaca AI architecture was not built by a single team working from publicly available data.
It was co-developed with institutions that control unique, non-public data and possess specialized expertise that no single organization could replicate.

How Thrivaca Was Built

 


What the Actuarial AI Engine Produces.

PREDICTIVE VALIDATION

Cross-correlation of r = .668 with actual breach incidents — a prospective correlation stronger than radiological exams predicting disease onset, academic tests predicting student performance, or most other statistical correlations society relies on daily. This correlation is measured on data the model had not seen during construction.

 

UnitedHealthcare breach impact forecast within 7% of actual losses — one of the largest and most complex cyber losses in healthcare history. Projection made using data available before the full scope of losses was known.

INSURANCE MARKET VALIDATION

Carriers using Thrivaca underwriting data have achieved loss ratios one-fourth the industry average over a five-year horizon.

 

This is not a projected improvement — it is a measured outcome. Policies underwritten using Thrivaca technology cover approximately 35% of S&P 500 companies.

ENTERPRISE OUTCOME VALIDATION

97% cybersecurity risk elimination by a healthcare provider over five years. EBITDA at Risk reduced from 18.60% to 4.98% (−73%). $381M actual risk discovered vs. $75M estimated (408% variance) at a financial services firm. $25M cost recovery in first year.

 

Organizations consistently discover exposure 89–400% higher than their estimates.

BACK-TESTING & MODEL STABILITY

The engine has been repeatedly back-tested across five distinct AI model configurations, validating that outputs are stable across different modeling assumptions and that financial risk estimates would have correctly characterized organizations’ actual loss experiences over the test horizon.

 

Sources of unexplained variability are investigated and corrected, allowing the AI model to accelerate the adaptive process.

METHODOLOGY DISTINCTION

Actuarially derived, not opinion-driven. While competing approaches rely on Bayesian methods incorporating significant expert judgment, Monte Carlo simulations under assumed conditions, or FAIR-model professional opinion, Thrivaca’s AI is built on empirically derived distributions and validated economic formulas.

 

This distinction has specific consequences in post-breach litigation — where opinion-based risk assessments are vulnerable to challenge, and data-driven assessments are not. 

Every Output Is Traceable. Every Input Is Auditable.

The Thrivaca Engine is designed for explainability — a critical requirement for board reporting, regulatory compliance, litigation defensibility, and insurance underwriting. The depth of traceability scales with the deployment model: external-scan deployments trace to publicly observable findings; full-access deployments (CORE with client data, UNIFY on the security stack) trace all the way to the specific internal evidence — the vulnerability scanner API feed, the SIEM alert pattern, the audit report finding, the SBOM component vulnerability, or the knowledge graph dependency chain.

Full traceability: Every financial exposure number traces back through the pipeline — from the board-ready EBITDA metric, through the probability calculation and valuation formula, to the specific NIST control gap, to the evidence that triggered it. In external-scan mode, that evidence is the attack surface finding. In full-access mode, it’s the direct internal signal: the Qualys vulnerability, the CrowdStrike endpoint alert, the Wiz cloud misconfiguration, the RBOM component-version risk score, or the audit finding that documents the control gap. No black boxes at any deployment level.
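That chain can be represented as ordinary data. The sketch below is hypothetical: the field names, the Qualys finding ID, the control reference, and the dollar figure are all invented to show the shape of an auditable trace, not actual Thrivaca output.

```python
from dataclasses import dataclass

@dataclass
class TraceLink:
    """One hop in the audit trail from board metric down to raw evidence."""
    stage: str
    detail: str

# A hypothetical trace for one exposure contribution:
trace = [
    TraceLink("evidence", "Qualys finding QID-91234 on host srv-01"),
    TraceLink("control_gap", "NIST 800-53 SI-2 (flaw remediation) not satisfied"),
    TraceLink("valuation", "ransomware loss formula, NWCS calibration"),
    TraceLink("board_metric", "EBITDA at Risk contribution: $4.2M"),
]

def explain(trace: list[TraceLink]) -> str:
    """Render the trail top-down, board metric first, evidence last."""
    return " <- ".join(f"{t.stage}: {t.detail}" for t in reversed(trace))
```

The design point is that every link is stored, so the answer to "where did this number come from?" is a lookup, not a reconstruction.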

Framework alignment: All outputs map to NIST 800-53, NIST CSF, MITRE ATT&CK, and FFIEC taxonomies. This ensures defensibility in audit, litigation, and regulatory contexts.

Audit-ready documentation: The platform produces documentation that withstands regulatory and litigation scrutiny. When a board asks “how did you arrive at the protections you had in place?” the answer is a data-driven audit trail, not “I saw it at a trade show.”

Methodology transparency: Negative binomial distributions, not Monte Carlo. Empirically derived formulas, not professional opinion. NIST-approved control-to-threat mapping, not proprietary scoring. Every methodological choice is documented, peer-reviewed, and defensible. 
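The negative binomial family named here has a closed-form probability mass function, which is part of what makes the methodology documentable. The parameters below are chosen arbitrarily for illustration; the engine's actual fitted parameters are not public.

```python
from math import comb

def neg_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(k failures before the r-th success) under NB(r, p).

    A standard empirical-frequency family: unlike a Monte Carlo run, the
    distribution is an explicit formula that can be audited term by term.
    """
    return comb(k + r - 1, k) * (p ** r) * ((1 - p) ** k)

# With illustrative parameters r=3, p=0.4, the pmf sums to ~1 over a
# wide support, as any proper probability distribution must.
total = sum(neg_binomial_pmf(k, 3, 0.4) for k in range(200))
```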

The Numbers Behind The Actuarial AI Engine.

AI development origin:

2016

Primary data sources:  

32 authoritative sources, continuously updated 

Historical loss database:  

60,000+ unique cyber incidents with actual dollarized loss data

Company risk profiles:  

4,000+ companies across 580+ industries 

Threat-vulnerability pairings:  

47,000+ (patented Distribution Module) 

MITRE ATT&CK techniques mapped:  

~1,200 

FFIEC threat taxonomy:  

23 high-level threat categories 

Attack surface scan coverage:

3.9B routable IPs / 1,400 ports / 10-day cycle 

Dark web monitoring:  

13 million domains 

Security solutions evaluated: 

60+

Benchmark refresh cadence:  

Quarterly 

Cross-correlation with actual incidents:

r = .668 

UHC forecast validation:  

Within 7% of actual losses 

Insurance loss ratio impact:  

One-fourth industry average (5-year measured) 

S&P 500 coverage:  

35% analyzed using patented technology 

NIST approval:  

Only organization in the industry group 

Patent status:  

Granted 

AI model integrations:  

Anthropic (Control Optimization), SecurityScorecard, RiskRecon, AlienVault OTX, Nessus, Palo Alto 

Data fabric integration:  

Snowflake, Databricks

 

See What The Actuarial AI Engine Produces. 

Nine years of AI development.
32 authoritative data sources.
47,000+ threat-vulnerability pairings.
Validated against real-world losses.
Trusted by insurers covering 35% of the S&P 500.

The Thrivaca Engine is the actuarial AI infrastructure behind every ArxNimbus product —

Cyber Exposure →
AI Exposure →

Cyber Insure  →

See what it can produce for your organization.