A phishing classifier that scores forwarded emails for your inbox

The scenario

Help desk technicians and SOC analysts spend a real chunk of their week reading user-forwarded suspicious emails. A classifier doesn't replace the human read, but it can rank the queue: highly-likely-phishing first, marketing-noise last. Built on a Make scenario or a Zapier Zap that watches a 'phishing-reports' inbox label, it is the second-most-common automation a Tier 1 technician builds for themselves.

The prompt

You are classifying a forwarded email for a help desk's phishing report queue. I will paste the email's headers and body. Score it.

Output format (exactly):
VERDICT: phishing / suspicious / probably-fine / marketing-noise
CONFIDENCE: high / medium / low
TOP REASONS (max 3, one line each):
- ___
- ___
- ___
USER ACTION (one line): what the analyst should tell the reporter.
ESCALATION (one line): whether and how to escalate (e.g., "block sender domain at gateway," "notify security team," "no further action").

Rules:
- Do not read the body and assume legitimacy because it sounds plausible. Phishing is plausible by design.
- Weight the headers (sender domain, return-path, SPF/DKIM results if visible) above the body content.
- Mark "marketing-noise" only when there is no malicious indicator AND the email is from a recognizable opt-in marketing source the user could have signed up for.
- When confidence is "low," default the verdict toward "suspicious" rather than "probably-fine." Better a false positive than a missed phish.
- Never state with certainty that an email is phishing without naming at least one concrete header or URL indicator.

Email:
[paste headers + body]
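
Because the prompt pins the output format exactly, an automation step can parse the reply into fields before anything touches the queue. The following is a minimal Python sketch, not part of the original workflow: the function name `parse_reply` and the dictionary keys are assumptions made for illustration. It rejects any reply whose verdict or confidence falls outside the values the prompt allows, so a drifting reply fails loudly instead of being ranked.

```python
VERDICTS = {"phishing", "suspicious", "probably-fine", "marketing-noise"}
CONFIDENCES = {"high", "medium", "low"}


def parse_reply(text: str) -> dict:
    """Parse a reply written in the prompt's fixed format.

    Hypothetical helper: raises ValueError when a required field is
    missing or holds a value the prompt does not allow.
    """
    result = {"verdict": None, "confidence": None, "reasons": [],
              "user_action": None, "escalation": None}
    in_reasons = False
    for raw in text.splitlines():
        line = raw.strip()
        # Bullet lines directly under TOP REASONS are collected as reasons.
        if in_reasons and line.startswith("-"):
            result["reasons"].append(line.lstrip("- ").strip())
            continue
        label, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop any parenthetical hint such as "(max 3, one line each)".
        label = label.split("(")[0].strip().upper()
        value = value.strip()
        in_reasons = label == "TOP REASONS"
        if label == "VERDICT":
            result["verdict"] = value.lower()
        elif label == "CONFIDENCE":
            result["confidence"] = value.lower()
        elif label == "USER ACTION":
            result["user_action"] = value
        elif label == "ESCALATION":
            result["escalation"] = value

    if result["verdict"] not in VERDICTS:
        raise ValueError(f"unexpected verdict: {result['verdict']!r}")
    if result["confidence"] not in CONFIDENCES:
        raise ValueError(f"unexpected confidence: {result['confidence']!r}")
    if not 1 <= len(result["reasons"]) <= 3:
        raise ValueError("expected between one and three reasons")
    if not result["user_action"] or not result["escalation"]:
        raise ValueError("missing USER ACTION or ESCALATION line")
    return result
```

A reply that fails to parse is best treated as unclassified and left for a human read rather than retried silently.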

The output

An example classification of one report.

VERDICT: phishing
CONFIDENCE: high
TOP REASONS:
- Sender domain "hr-companysupport.com" does not match the recipient's employer's domain; classic look-alike pattern.
- Manufactured deadline ("by 5:00 PM today or your direct deposit will be suspended") with credential harvesting on the link target.
- Link domain "hr-portal-companylogin.net" has no relationship to the recipient's employer's legitimate domain.
USER ACTION: Confirm to the reporter that they did the right thing by forwarding it. Tell them not to click the link and to delete the email from their inbox. Compliment them, briefly, for catching it.
ESCALATION: Submit the sender domain to the company's anti-phishing tool to block at the email gateway. If five or more reports of the same domain come in within 24 hours, notify the security team to check whether anyone clicked.
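
The escalation line in this example describes a simple counting rule: five or more reports of the same domain within 24 hours. One way to track that is a sliding window per domain. This is a rough Python sketch under stated assumptions: the class name `DomainReportCounter` is hypothetical, the threshold and window simply mirror the example's wording, and reports are assumed to arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class DomainReportCounter:
    """Hypothetical sliding-window counter for reports per sender domain."""

    def __init__(self, threshold: int = 5,
                 window: timedelta = timedelta(hours=24)):
        self.threshold = threshold
        self.window = window
        self._seen = defaultdict(deque)  # domain -> report timestamps

    def record(self, domain: str, when: datetime) -> bool:
        """Record one report and return True once the domain has reached
        the threshold inside the window ending at `when`.

        Assumes reports are recorded in chronological order.
        """
        times = self._seen[domain.lower()]
        times.append(when)
        # Discard reports that have aged out of the window.
        while times and when - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold
```

A True result is a cue to notify the security team, as the example says; it is not a trigger for any automatic block.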

Note on the queue ranking workflow.

If you build this, the value comes from sorting the queue. A help desk that gets 30 forwarded emails a day cannot read each one carefully. The classifier ranks them; the analyst reads the top 5 carefully (verdict “phishing” with “high” confidence) and skims the rest. Over a week, the analyst’s eye for the patterns sharpens, and the classifier’s prompt sharpens with feedback. Both improve; neither replaces the other.
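
The ranking itself can be a plain sort on verdict and confidence. The sketch below is one possible ordering, not a prescribed one: the `Report` type and `rank_queue` name are assumptions for illustration. For threatening verdicts it puts high confidence first; for benign verdicts it puts low confidence first, on the assumption that a shaky "probably-fine" deserves an earlier look than a firm one.

```python
from dataclasses import dataclass

VERDICT_RANK = {"phishing": 0, "suspicious": 1,
                "probably-fine": 2, "marketing-noise": 3}
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}
BENIGN = {"probably-fine", "marketing-noise"}


@dataclass
class Report:
    """Hypothetical record for one classified report."""
    report_id: str
    verdict: str
    confidence: str


def _sort_key(report: Report) -> tuple:
    confidence = CONFIDENCE_RANK[report.confidence]
    if report.verdict in BENIGN:
        # Least certain benign verdicts surface first within their group.
        confidence = -confidence
    return (VERDICT_RANK[report.verdict], confidence)


def rank_queue(reports: list) -> list:
    """Return reports ordered for analyst review, most urgent first.

    sorted() is stable, so ties keep their arrival order.
    """
    return sorted(reports, key=_sort_key)
```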

Where the classifier is dangerous.

The “probably-fine” verdict. A false negative on a phishing email that the user then trusts is a worse outcome than a false positive on a legitimate marketing email. Default the threshold so that low-confidence “probably-fine” gets reclassified as “suspicious” — manually if your prompt won’t do it automatically. Err toward suspicion. Always.
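
If the prompt does not apply that default reliably, the reclassification can be enforced in the automation after the reply comes back. A minimal Python sketch, with the function name `apply_safety_floor` chosen here purely for illustration:

```python
def apply_safety_floor(verdict: str, confidence: str) -> str:
    """Return the verdict to act on, erring toward suspicion.

    Hypothetical post-processing step: a low-confidence "probably-fine"
    is upgraded to "suspicious"; every other combination is unchanged.
    """
    verdict = verdict.strip().lower()
    confidence = confidence.strip().lower()
    if verdict == "probably-fine" and confidence == "low":
        return "suspicious"
    return verdict
```

Doing this in code rather than relying on the prompt alone means the rule holds even on runs where the model ignores the instruction.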

One reasonable answer. Your run may differ. Read it against the scenario before using any of it.

What to watch for

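Header-based detection misses phishing sent from a compromised legitimate account. A passing SPF or DKIM result does not mean the email is safe. SPF confirms only that the sending server is authorized for the envelope (Return-Path) domain, and DKIM confirms only that the message was signed by the domain named in the signature; an attacker's own look-alike domain can pass both.
Marketing-noise classifications matter if your shop uses reported mail to tune its spam filter. Confirm with your security team whether to label and close these yourself or forward them for filter tuning.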
A free-tier classifier WILL hallucinate URL safety. If a URL needs a verdict, route it to VirusTotal or Google Safe Browsing — do not trust the AI's read on whether a URL is malicious.
Never auto-quarantine or auto-delete emails based on the classifier's verdict. Require human review of every 'phishing' classification before any sender is blocked.
Sanitize the email content if you put real reports through a public AI. Strip the reporter's name, internal forwarding chain, and any sensitive content the email might have copied (financial details, customer data) before pasting.
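
The sanitizing step in the last point can be partly scripted before the text reaches the model. What follows is a crude first pass in Python, not a substitute for reading what you paste: the function name `sanitize`, its parameters, and the placeholder strings are all assumptions for illustration. It masks the local part of addresses at your own domains (keeping the domain, which the classifier needs), replaces supplied reporter names, and blanks long card-like digit runs. It will miss sensitive content that does not fit these patterns.

```python
import re

# Captures the local part and the domain of an email address separately.
EMAIL_RE = re.compile(r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b")
# Runs of 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_LIKE_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def sanitize(text: str, reporter_names=(), internal_domains=()) -> str:
    """Redact obvious personal details from a forwarded report.

    Hypothetical helper. External sender addresses are left intact
    because the sender domain is evidence the classifier needs.
    """
    internal = {domain.lower() for domain in internal_domains}

    def mask_internal(match):
        domain = match.group(2)
        if domain.lower() in internal:
            return "[redacted]@" + domain
        return match.group(0)

    text = EMAIL_RE.sub(mask_internal, text)
    text = CARD_LIKE_RE.sub("[redacted-number]", text)
    for name in reporter_names:
        if name:
            text = re.sub(re.escape(name), "[reporter]", text,
                          flags=re.IGNORECASE)
    return text
```

For example, `sanitize(raw, reporter_names=["Jane Doe"], internal_domains=["example.com"])` would mask Jane's name and her address at example.com while leaving the suspicious sender's address readable; both values here are placeholders.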
