Randomized Controlled Trial Results
Study Title: “Think First, Verify Always”: Training Humans to Face AI Risks
Published: arXiv:2508.03714v1 [cs.HC] 23 Jul 2025
Sample Size: 151 participants
Study Design: Randomized controlled trial
Micro Learning: 3-minute lesson
P-Value: 0.0017
Effect Size (Cohen’s d): 0.52
Assessment Items: 18

This randomized controlled trial provides empirical evidence that the TFVA (Think First, Verify Always) protocol improves human self-protection against AI threats.
The overall result reached p = 0.0017, well below the conventional threshold of p < 0.05, so the difference between groups is statistically significant and unlikely to be explained by chance alone. Cohen’s d of 0.52 corresponds to a medium effect size by conventional benchmarks, indicating practical as well as statistical significance.
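As a rough consistency check, the reported effect size and group sizes can be converted into an approximate test statistic. The sketch below assumes a two-sample comparison with pooled variance and uses a normal approximation for the p-value. The source does not state which test the authors ran, so the function names and the choice of approximation are illustrative assumptions, not the study's method.

```python
import math

def t_from_cohens_d(d: float, n1: int, n2: int) -> float:
    """Convert Cohen's d to a two-sample t statistic (pooled-variance form)."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))

def two_sided_p_normal(z: float) -> float:
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported in the study: d = 0.52, n = 76 (treatment), n = 75 (control).
t_stat = t_from_cohens_d(0.52, 76, 75)
p_approx = two_sided_p_normal(t_stat)

# Pooled standard deviation implied by the overall gain and d, in percentage points.
implied_sd = 7.87 / 0.52

print(f"t = {t_stat:.2f}, approximate p = {p_approx:.4f}, implied SD = {implied_sd:.1f} pp")
```

The normal approximation yields a slightly smaller p-value than a t distribution with 149 degrees of freedom would, so the result should land in the same range as the reported p = 0.0017 rather than match it exactly.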
Performance Improvements by Principle
+44.4% relative improvement in ethical decision-making scenarios (+16.0 percentage points absolute, from 36.0% to 52.0%)
Detailed Performance Comparison
| Principle | AIJET Group | Control Group | Absolute Gain | Cohen’s d | P-Value |
|---|---|---|---|---|---|
| Awareness | 62.0% | 57.8% | +4.2% | 0.22 | 0.17 |
| Integrity | 54.4% | 43.4% | +11.0% | 0.54 | 0.001 |
| Judgment | 84.7% | 76.0% | +8.7% | 0.31 | 0.064 |
| Ethical Responsibility | 52.0% | 36.0% | +16.0% | 0.33 | 0.046 |
| Transparency | 84.2% | 81.3% | +2.9% | 0.08 | 0.64 |
| OVERALL | 65.28% | 57.41% | +7.87% | 0.52 | 0.0017 |
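The relative improvements quoted on this page follow from the group percentages in the table: the absolute gain is the difference in percentage points, and the relative gain divides that difference by the control group's score. A minimal sketch of that arithmetic, where the `results` dictionary and the `gains` helper are illustrative names rather than anything from the study:

```python
# (AIJET group %, control group %) per principle, copied from the table above.
results = {
    "Awareness": (62.0, 57.8),
    "Integrity": (54.4, 43.4),
    "Judgment": (84.7, 76.0),
    "Ethical Responsibility": (52.0, 36.0),
    "Transparency": (84.2, 81.3),
    "OVERALL": (65.28, 57.41),
}

def gains(treatment: float, control: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = treatment - control
    relative = absolute / control * 100
    return absolute, relative

for principle, (treat, ctrl) in results.items():
    absolute, relative = gains(treat, ctrl)
    print(f"{principle:<24} +{absolute:.2f} pp   +{relative:.1f}% relative")
```

For Ethical Responsibility this gives 16.0 / 36.0, or about 44.4%, and for Integrity 11.0 / 43.4, or about 25.3%.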
Key Findings
1. Rapid Knowledge Transfer
Participants successfully applied AIJET principles to novel scenarios not covered in training, demonstrating effective knowledge transfer to ambiguous situations similar to real-world AI threats.
2. Ethics as Security Multiplier
Ethical Responsibility showed the largest improvement (+44.4% relative, from 36.0% to 52.0%; p = 0.046), suggesting that framing security through ethics can substantially enhance decision quality.
3. Integrity Validation
The statistically significant improvement in integrity-based tasks (+25.3% relative, +11.0 percentage points; p = 0.001) supports the effectiveness of the cognitive security layer approach.
4. Performance Normalization
While control group performance varied widely across principles (36.0% to 81.3%), AIJET group scores fell in a narrower band (52.0% to 84.7%), suggesting that the training helped reduce disparities and produced more consistent performance across security dimensions.
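The consistency claim can be checked descriptively against the per-principle scores in the comparison table. The sketch below compares the range and the population standard deviation of each group's five principle-level scores. The choice of these two spread measures is an assumption made for illustration; the source does not say how consistency was assessed.

```python
import statistics

# Per-principle scores (%), in table order: Awareness, Integrity, Judgment,
# Ethical Responsibility, Transparency.
aijet = [62.0, 54.4, 84.7, 52.0, 84.2]
control = [57.8, 43.4, 76.0, 36.0, 81.3]

def spread(scores):
    """Return (range, population standard deviation) in percentage points."""
    return max(scores) - min(scores), statistics.pstdev(scores)

for label, scores in (("AIJET", aijet), ("Control", control)):
    score_range, sd = spread(scores)
    print(f"{label:<8} range = {score_range:.1f} pp, SD = {sd:.1f} pp")
```

Both measures come out smaller for the AIJET group, although with only five principle-level averages this is a descriptive comparison rather than a statistical test.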
Study Methodology
Design
On May 9, 2025, 151 U.S.-based participants were recruited through Prolific.com, ensuring demographic diversity by age (20-76 years, mean 41), gender, education level, and technical background. Participants were randomly assigned to one of two conditions:
- Treatment Group (n=76): Received a 3-minute micro-lesson on the ‘Think First, Verify Always’ cognitive protocol, with brief primers on each of the five AIJET principles
- Control Group (n=75): Received a matched 3-minute micro-lesson on traditional cybersecurity topics (software updates, backup protocols, password hygiene), primarily device-focused, with equivalent cognitive load
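The source does not describe the randomization procedure itself. One common way to produce a 76/75 split like the one above is to shuffle the participant list and cut it at the treatment group size. The following is a minimal sketch under that assumption; the participant IDs, the seed, and the `assign_groups` helper are hypothetical.

```python
import random

def assign_groups(participant_ids, n_treatment, seed=0):
    """Shuffle participants and split them into (treatment, control) lists."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_treatment], shuffled[n_treatment:]

# 151 hypothetical participant IDs, matching the study's sample size.
participants = [f"P{i:03d}" for i in range(1, 152)]
treatment, control = assign_groups(participants, n_treatment=76)

print(len(treatment), len(control))  # 76 75
```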
Assessment
Both groups completed an identical 18-item assessment designed to measure performance across the five AIJET domains. The test included scenario-based questions such as:
- Identifying markers of potentially AI-generated phishing attempts
- Determining appropriate verification steps for critical communications
- Making decisions in time-pressured scenarios with AI-generated inputs
- Selecting proper documentation approaches for AI-influenced decisions
Two attention checks were included during the experiment, both answered correctly by 100% of participants. The median quiz duration was 12 minutes, 42 seconds.
Scoring
The assessment incorporated questions with interpretive nuance rather than simple binary choices. For nuanced ethical scenarios, responses were coded as correct or high-quality if they demonstrated alignment with pre-defined ethical considerations such as fairness, autonomy, or consent. This design reflects the reality that cognitive cybersecurity threats rarely present as clear-cut scenarios with single ‘correct’ answers.
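Once each response has been coded as correct or not, per-domain scores reduce to the share of correct items in each domain. The sketch below illustrates that aggregation step only. The item-to-domain pairs in the example are hypothetical, since the source does not publish the mapping of the 18 items to the five domains, and `domain_scores` is an illustrative helper name.

```python
from collections import defaultdict

def domain_scores(coded_responses):
    """Return percent correct per domain.

    coded_responses: iterable of (domain, is_correct) pairs, one per item.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for domain, is_correct in coded_responses:
        totals[domain] += 1
        if is_correct:
            correct[domain] += 1
    return {domain: 100 * correct[domain] / totals[domain] for domain in totals}

# Hypothetical coded responses for one participant (not data from the study).
example = [
    ("Integrity", True),
    ("Integrity", False),
    ("Judgment", True),
]
print(domain_scores(example))
```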
Practical Implications
> Rapid Deployment Potential
The minimal time required for effective training (3 minutes) addresses one of the primary barriers to cybersecurity training adoption: time constraints. Organizations can integrate the Think First, Verify Always protocol into:
- Employee onboarding processes
- Regular security awareness campaigns
- Pre-task reminders for high-risk activities
- Phishing drill debriefings
> Measurable ROI
With an improvement of 7.87 percentage points from just 3 minutes of training, organizations can estimate tangible returns:
- Time investment: 3 minutes per employee
- Performance gain: 7.87 percentage points on cognitive security tasks
- Scalability: Digital delivery enables organization-wide deployment
- Cost: Minimal (primarily instructional design and platform costs)
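The time side of this estimate is simple arithmetic: total training time scales linearly with headcount. A minimal sketch, in which the headcount of 1,000 and the `training_hours` helper are hypothetical and only the 3-minute lesson length comes from the study:

```python
LESSON_MINUTES = 3  # lesson length reported in the study

def training_hours(headcount: int, minutes_per_person: float = LESSON_MINUTES) -> float:
    """Total staff time spent on the micro-lesson, in hours."""
    return headcount * minutes_per_person / 60

# Hypothetical organization of 1,000 employees.
print(training_hours(1000))  # 50.0
```

Any monetary estimate would additionally need an organization's own labor costs and incident data, which the study does not provide.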
> Broad Applicability
The effectiveness with a general population sample (not cybersecurity professionals) suggests these principles can be deployed beyond enterprise settings:
- Educational institutions (K-12, universities)
- Senior centers and community programs
- Healthcare settings
- Government agencies
- General public awareness campaigns
> Bridging Technical & Human Security
TFVA provides the operational “how” for human-empowered controls that technical specifications cannot capture, complementing frameworks such as:
- NIST Cybersecurity Framework 2.0 (PR.AT – Awareness and Training)
- NIST AI Risk Management Framework (Human Oversight)
- ISO 27001 (A.6 Organization, A.7 Human Resources)
- NIST SP 800-53 (AT-2 Literacy Training, PM-11 Mission/Business Focus)