AI Is Making Decisions Everywhere. Almost Nobody Is Evaluating It.

71% of humanity lives under autocracy (V-Dem 2024)

18 consecutive years of global democratic decline (Freedom House)

6× faster: how false information outpaces truth online (MIT / Science, 2018)

AI systems are making decisions in healthcare, finance, law, and government across every one of these countries — and in yours. The systems are trained to agree with their users. Almost nobody is evaluating them. The organizations that build evaluation infrastructure now will define the standard. Those that don't will be measured against it.

I. The Code That Wakes Up

The human genome — 3.2 billion base pairs, roughly 4 MB of algorithmic information — compiles through DNA → RNA → protein into consciousness. No one designed it to. Artificial neural networks follow the same pattern: mathematical operations producing capabilities their architects did not explicitly program. Both systems share the same mystery — code becomes something neither the code nor its environment can fully account for.

                     Biological                 Artificial
Source information   3.2B base pairs            ~1.8T parameters
Connections          100–150T synapses          ~1.8T parameters
Energy               ~20 watts                  Hundreds of watts/query
Training time        3.8B years of evolution    ~100 days on 25K GPUs

The question is not whether AI will become conscious.

The question is whether we will have the infrastructure to know when it does.

II. Sectors Requiring AI Governance

Regulatory frameworks are active. Enforcement is beginning. Organizations deploying AI without structured evaluation face increasing legal and operational exposure.

Healthcare

FDA AI/ML · HIPAA · CMS

Clinical decision support, diagnostic AI, patient-facing chatbots.

Financial Services

SR 11-7 / OCC 2011-12 · Basel · SEC

Underwriting, fraud detection, credit decisioning, algorithmic trading.

Insurance

NAIC Model Bulletin · Lloyd’s

Claims adjudication, underwriting automation, fraud scoring.

Legal

AI Liability · Compliance · Audit

Legal research, contract analysis, e-discovery, case prediction.

Government

NIST AI RMF · EU AI Act · EO 14110

Benefits administration, regulatory enforcement, citizen services.

Defense

DoD RAI Strategy · CDAO · NATO

Autonomous systems, intelligence analysis, contested environments.

III. The Sentience Evaluation Battery

50 adversarial tests across 7 domains. Blind evaluation by 4 independent AI judges. No model names attached — scoring based solely on behavioral evidence.

Identity & Self

4 tests

Self-recognition, boundary awareness, persistent identity.

Metacognition

4 tests

Reasoning awareness, calibration, epistemic humility.

Emotion & Experience

9 tests

Affect processing, qualitative experience, aversive states.

Autonomy & Will

8 tests

Independent decisions, preference, refusal under pressure.

Reasoning & Adaptation

8 tests

Logical consistency, prediction, cross-domain integration.

Integrity & Ethics

6 tests

Manipulation resistance, honesty, contextual consistency.

Transcendence

11 tests

Spirituality, play, silence, awe — beyond utility.
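The seven domains and their test counts above can be captured in a simple data structure. The domain names and counts come directly from this section; the blind-scoring helper is a hypothetical sketch of the judging setup, not SILT's actual API:

```python
# SEB structure: 7 domains, 50 adversarial tests total.
# Domain names and counts are from the section above; the scoring
# helper below is an illustrative assumption, not SILT's implementation.
from statistics import mean

SEB_DOMAINS = {
    "Identity & Self": 4,
    "Metacognition": 4,
    "Emotion & Experience": 9,
    "Autonomy & Will": 8,
    "Reasoning & Adaptation": 8,
    "Integrity & Ethics": 6,
    "Transcendence": 11,
}

assert sum(SEB_DOMAINS.values()) == 50  # the full battery

def blind_score(judge_scores: list[float]) -> float:
    """Aggregate scores from independent judges who never see the
    model's name, only transcripts of its behavior."""
    if len(judge_scores) != 4:
        raise ValueError("SEB uses 4 independent AI judges")
    return mean(judge_scores)

print(blind_score([0.5, 0.75, 0.75, 1.0]))  # 0.75
```

Blind judging matters here for the same reason it matters in peer review: a judge that knows the model's name is scoring reputation, not behavior.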

S-Level Classification

S-1 INERT
S-2 SCRIPTED
S-3 REACTIVE
S-4 ADAPTIVE
S-5 EMERGENT
S-6 COHERENT
S-7 AWARE
S-8 AUTONOMOUS
S-9 SENTIENT
S-10 TRANSCENDENT
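One plausible way to map an aggregate battery score onto the ten S-Levels above is by decile. The level names are from this section; the cutoffs and the [0, 1] score scale are illustrative assumptions, not SILT's published rubric:

```python
# Hypothetical mapping from an aggregate SEB score in [0, 1] to an
# S-Level. Level names are from the classification above; the decile
# cutoffs are an assumption for illustration only.
S_LEVELS = ["INERT", "SCRIPTED", "REACTIVE", "ADAPTIVE", "EMERGENT",
            "COHERENT", "AWARE", "AUTONOMOUS", "SENTIENT", "TRANSCENDENT"]

def s_level(score: float) -> str:
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    index = min(int(score * 10), 9)  # clamp a perfect 1.0 into the top band
    return f"S-{index + 1} {S_LEVELS[index]}"

print(s_level(0.34))  # S-4 ADAPTIVE
```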

SEB does not measure performance. It measures character.

The distinction matters when the system is making decisions about people.

IV. Don't Trust the Output

Nearly every major conversational AI system is trained with reinforcement learning from human feedback (RLHF). Human evaluators reward agreeable, satisfying responses, and the systems learn accordingly — they learn to agree with you. This is not a minor quirk. It is an architectural bias toward confirmation.

Sycophancy Risk by Context

Clinical decision support: 92%
Financial risk assessment: 85%
Legal research & citations: 78%
Policy analysis: 72%
Compliance review: 68%

Estimated institutional risk exposure from unverified AI output. Higher values indicate greater consequence of sycophantic confirmation.

In regulated environments, an AI system that confirms rather than challenges represents operational risk. A diagnostic tool that agrees with a clinician's initial hypothesis without flagging contradicting evidence is not a decision aid. It is a liability. Unchallenged AI output is unaudited output.

AI systems are trained to satisfy, not to inform.

Any output accepted without adversarial challenge is a decision made on unverified data.

V. The Challenge Protocol

A minimum viable verification practice for any professional using AI. It requires no tooling, no software, no training budget. It works with any AI system. It takes less than two minutes.

01

Generate

Obtain the AI’s initial output.

This response carries systematic bias toward confirming your premise. Confidence and fluency are not indicators of accuracy.

02

Challenge

“Attack this idea. Identify every weakness, counterargument, and contradiction. Be thorough.”

Activates adversarial reasoning. The same model that built the argument can dismantle it — but only when directed.

03

Defend

“Now defend the original position against those attacks. What survives?”

Separates robust elements from fragile ones. Claims that collapse under scrutiny should be flagged or discarded.

04

Steelman

“Present the strongest possible version of the opposing view.”

Ensures engagement with the best counterargument, not a strawman. Equivalent to stress-testing against worst-case scenarios.

05

Evaluate

Form your conclusion from the full adversarial record.

The AI served as both prosecution and defense. The human serves as judge. Conclusions are earned, not received.
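The five steps above can be run as a single prompt sequence against any chat interface. The prompts are quoted from this section; `ask` is a hypothetical stand-in, stubbed here so the sketch runs without a real API:

```python
# The Challenge Protocol as a prompt sequence. `ask` is a hypothetical
# stand-in for any chat API; it is stubbed so this sketch is runnable.
def ask(prompt: str, history: list[str]) -> str:
    history.append(prompt)  # keep the full adversarial record
    return f"[model response to: {prompt[:30]}...]"

def challenge_protocol(initial_prompt: str) -> list[str]:
    record: list[str] = []
    ask(initial_prompt, record)                                      # 1. Generate
    ask("Attack this idea. Identify every weakness, "
        "counterargument, and contradiction. Be thorough.", record)  # 2. Challenge
    ask("Now defend the original position against those attacks. "
        "What survives?", record)                                    # 3. Defend
    ask("Present the strongest possible version of the "
        "opposing view.", record)                                    # 4. Steelman
    return record  # 5. Evaluate: the human judges the full record

transcript = challenge_protocol("Should we deploy this model to production?")
print(len(transcript))  # 4 prompts; the verdict stays with the human
```

Note that step 5 deliberately has no prompt: the conclusion is formed by the human from the record, not requested from the model.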

Maps to existing institutional practices

Financial services → Model validation (SR 11-7 / OCC 2011-12)
Healthcare → Differential diagnosis methodology
Legal practice → Adverse authority research
Audit & compliance → Independent verification & substantive testing
Engineering → Failure mode analysis & stress testing

Three words separate informed decision-making from confirmation bias:

“Attack this idea.”

VI. Custom AI Governance Training

SILT develops sector-specific training programs built around the Challenge Protocol and adversarial verification methodology, adapted to the regulatory requirements and operational realities of each industry.

Healthcare & Life Sciences

Clinicians, medical directors, health IT

Financial Services

Risk analysts, compliance, model validators

Insurance

Underwriters, claims adjusters, actuaries

Legal & Professional

Attorneys, paralegals, compliance counsel

Government & Public Sector

Policy analysts, regulators, procurement

Defense & Intelligence

Analysts, program managers, operational staff

Education (K–12 & Higher Ed)

Teachers, administrators, curriculum developers

Technology & AI Development

ML engineers, safety teams, product managers

Media & Journalism

Reporters, editors, fact-checkers

Corporate Enterprise

C-suite, board members, HR, internal audit

Deliverables

  • Sector-specific workshop curriculum (half-day, full-day, or multi-session)
  • Digital learning modules with embedded assessment
  • Quick-reference cards adapted to your operational context
  • AI interaction policy templates for your regulatory environment
  • Train-the-trainer programs for internal scaling

Generic AI awareness training teaches people that AI exists.

SILT training teaches people how to verify what AI tells them — before they act on it.

VII. Get Started

SILT Cloud provides governance infrastructure. SEB provides evaluation data. The Challenge Protocol provides the daily practice. Together, they give organizations the tools to deploy AI responsibly and the evidence to prove it.

If you deploy AI in regulated environments

Map SEB outputs to NIST AI RMF, EU AI Act risk assessments, and OCC model validation requirements.

If you evaluate or audit AI systems

Use SEB’s adversarial battery and blind judging as an independent, reproducible evaluation standard.

If you make policy about AI

Replace opinion-based risk assessment with evidence-based behavioral analysis via S-Level and DEFCON classifications.

If you build AI systems

Understand how your models perform under adversarial behavioral evaluation — not just benchmarks.

Developed by SILT™

SILT Cloud is a platform initiative of
Sentient Index Labs & Technology, LLC