Black Hat USA – Las Vegas – At Black Hat USA, Christian Dameff (UC San Diego Center for Healthcare Cybersecurity) and Ariana Mirian (Censys) presented findings from an 8-month study of how effective phishing training is. The result? Just a 1.7% reduction in phishing failure rates.
Study Design: A Real-World Experiment
The randomized controlled trial, involving over 19,500 employees at UC San Diego Health (UCSD Health), found that annual cybersecurity awareness training and regular phishing simulations together did little to improve protection against phishing. The study tested two common phishing training methods:
- Annual cybersecurity awareness training: A mandatory 35–40-minute online module focusing on phishing and security best practices, completed yearly via UCSD Health’s HR system.
- Embedded phishing simulations: Monthly simulated phishing campaigns where users who clicked malicious links received immediate training.
Employees were randomly assigned to five groups:
- Control group (~3,950 users): Received no training, only a 404 error page upon failing a simulation.
- Generic static group (~3,950 users): Received static, general anti-phishing advice.
- Generic interactive group (~3,950 users): Engaged with interactive training (e.g., answering questions about phishing cues).
- Contextual static group (~3,950 users): Received static training tailored to the failed phishing lure.
- Contextual interactive group (~3,950 users): Engaged with interactive, lure-specific training.
Ten unique phishing lures, including common phishing techniques like “Outlook Password Reset,” were sent monthly, with users split into four tracks.
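The random assignment into five equal-sized groups described above could be sketched as follows. This is a minimal illustration, not the authors' procedure: the group labels, the `assign_groups` helper, the seed, and the population of 19,750 synthetic user IDs (5 × 3,950) are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical labels for the five study arms described in the article.
GROUPS = [
    "control",
    "generic_static",
    "generic_interactive",
    "contextual_static",
    "contextual_interactive",
]


def assign_groups(user_ids, seed=0):
    """Shuffle users, then deal them round-robin into the groups.

    Round-robin dealing after a shuffle keeps group sizes within one
    of each other while leaving membership random.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return {uid: GROUPS[i % len(GROUPS)] for i, uid in enumerate(shuffled)}


# Assumed population size: 5 groups x ~3,950 users (synthetic IDs).
assignment = assign_groups(range(19_750))
sizes = Counter(assignment.values())
print(sizes)
```

Shuffling first and dealing second is one simple way to get balanced arms; a real trial might also stratify by department or role, which this sketch does not attempt.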
Key Findings: Minimal Impact, High Variability
The results point to significant weaknesses in both the phishing simulations and the training attached to them.
- No benefit from annual training: No statistically significant association was found between the recency of annual training completion and phishing failure rates (OR = 0.998 per 30-day increase, 95% CI: 0.996–1.000, P = 0.06). Employees trained within the past 30 days performed no better than those overdue by more than a year (Figure 3 in the paper).
- Limited embedded training efficacy: Embedded training reduced failure rates by only 1.7% on average compared to the control group (OR = 0.905, 95% CI: 0.863–0.950, P < 0.001). Some lures saw 3–4% reductions, but others (e.g., “Vacation Policy”) had no difference.
- Lure variability drives outcomes: Failure rates varied widely, from 1.8% (“Outlook Password”) to 30.8% (“Vacation Policy”). Simple text-based lures like “Dress Code” (27.65%) outperformed flashier ones. This suggests that the quality of the lure affects failure rates far more than the training provided: the better the lure, the better the hit rate.
- Low engagement undermines training: 37–51% of training sessions lasted zero seconds, with users closing the page immediately. Median engagement was 0–10 seconds, and only 15–24% of sessions were completed. Low engagement likely explains the minimal efficacy.
- Mixed training outcomes: Among the small subset of users who completed training, interactive training reduced future failures by 19%, but static training correlated with an 18.5% higher likelihood of failure per completed session, possibly due to self-selection bias.
- High long-term failure rates: 56% of users failed at least one simulation by month eight, with 25.9% failing twice and 3.5% failing four or more, suggesting most users are vulnerable over time, no matter how much training is provided.
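The odds ratios quoted above are not the same thing as percentage changes in the failure rate, so a short worked conversion may help. The sketch below assumes a few hypothetical control-group failure rates (the article does not give the baseline the model used) and applies the reported OR of 0.905. It also compounds the reported 0.998 per-30-day OR over twelve steps, which assumes the effect is log-linear in time. The function names are illustrative and unrelated to the authors' R code.

```python
def treated_rate(control_rate, odds_ratio):
    """Convert a control failure rate and an odds ratio into the implied treated rate."""
    if not 0.0 < control_rate < 1.0:
        raise ValueError("control_rate must be strictly between 0 and 1")
    control_odds = control_rate / (1.0 - control_rate)
    treated_odds = control_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


def compound_odds_ratio(or_per_step, steps):
    """Odds ratio after several equal steps, assuming a log-linear effect."""
    return or_per_step ** steps


EMBEDDED_OR = 0.905            # reported in the article
ANNUAL_OR_PER_30_DAYS = 0.998  # reported in the article
HYPOTHETICAL_CONTROL_RATES = [0.05, 0.15, 0.30]  # assumed baselines, not study data

for rate in HYPOTHETICAL_CONTROL_RATES:
    new_rate = treated_rate(rate, EMBEDDED_OR)
    print(f"control {rate:.1%} -> trained {new_rate:.1%} "
          f"(absolute drop {rate - new_rate:.1%})")

one_year = compound_odds_ratio(ANNUAL_OR_PER_30_DAYS, 12)
print(f"OR across twelve 30-day steps: {one_year:.3f}")
```

Under these assumptions, the same odds ratio produces a different absolute drop at each baseline, which is why the per-lure reductions can differ even when the training is identical. Compounding 0.998 over twelve steps still leaves an odds ratio of about 0.976, close to 1.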
Implications: A Broken Model
The findings challenge the common assumption that training people to spot phishing attacks leads to better security outcomes.
- Lure control manipulates metrics: Organizations can skew failure rates by choosing low-impact (e.g., 1.8%) or high-impact (e.g., 30%) lures.
- Punishment is ineffective: With 56% of users failing at least once, punishing failures (e.g., suspensions) is impractical and unfair, especially since the trend suggests that, given enough time, most people will fall for a phishing email at some point.
- AI amplifies asymmetry: Attackers can use LLMs to create lures instantly, and the study suggests that well-crafted lures will catch a high percentage of employees.
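The claim that most people eventually fail can be illustrated with simple probability. The sketch below assumes every campaign has the same, independent failure probability per user. That is a strong simplification, since the study found failure rates that varied widely by lure, and the per-campaign rates used here are hypothetical rather than taken from the paper.

```python
import math


def prob_at_least_one_failure(p_per_campaign, n_campaigns):
    """P(fail at least once) = 1 - (1 - p)^n under independence."""
    if not 0.0 < p_per_campaign < 1.0:
        raise ValueError("p_per_campaign must be strictly between 0 and 1")
    return 1.0 - (1.0 - p_per_campaign) ** n_campaigns


def campaigns_until(target, p_per_campaign):
    """Smallest n for which 1 - (1 - p)^n reaches the target probability."""
    if not 0.0 < target < 1.0 or not 0.0 < p_per_campaign < 1.0:
        raise ValueError("target and p_per_campaign must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_per_campaign))


# Hypothetical per-campaign failure probabilities (not study data).
for p in (0.02, 0.05, 0.10):
    eight = prob_at_least_one_failure(p, 8)
    n_half = campaigns_until(0.5, p)
    print(f"p={p:.0%}: {eight:.1%} fail at least once in 8 campaigns; "
          f"{n_half} campaigns until half have failed")
```

Even a modest per-campaign rate compounds quickly, which matches the article's point that penalizing individual failures ends up penalizing most of the workforce.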
Recommendations for Cybersecurity
Dameff and Mirian argued that, while the study raises obvious questions about the efficacy of phishing simulations, the broader point of interest is how cybersecurity decision makers decide where to make investments.
“Is the juice worth the squeeze?” Mirian asks.
They argue that the field needs a more evidence-based approach, similar to medical research, in which randomized trials measure the effectiveness of new strategies. This would let teams find out, for example, whether hardware MFA keys are a better investment than ongoing phishing simulations.
Future Directions
When it comes to phishing simulations, Dameff and Mirian suggest teams should rethink their approaches:
- Systemic solutions: Invest in technical countermeasures (e.g., AI-driven email filtering, password managers) to shift the burden from users.
- Incentive structures: Explore rewards or nudges to boost engagement without punitive measures, while addressing ethical concerns.
- Tailored training: Investigate whether interactive, contextual training can scale to broader populations or if self-selection biases drive observed benefits.
The 13-page paper and R code for statistical models are available at https://www.sysnet.ucsd.edu/~voelker/pubs/phishtrain-oakland25.pdf.