Detection Mode

Control how policies are evaluated: exact keyword matching or semantic natural language inference (NLI) analysis.

Every policy in Palveron carries a detection mode that decides how the verify engine evaluates the rule. Pick the right mode and policies get faster, more accurate, and easier to debug.

The three modes

Mode	How it works	Speed	Best for
Exact Match	Regex + Aho-Corasick keyword matching	~1-3 ms	PII, IBANs, credit cards, blocked phrases — structured data with predictable patterns
Semantic	NLI neural inference against the rule's intent	~20-40 ms	Paraphrased intent, off-topic detection, toxicity — fuzzy categories
Auto (default)	Palveron classifies the rule and picks the right mode	Same as the mode it picks (~1-3 ms or ~20-40 ms)	Most policies; let the engine decide

The headline trade-off: Exact is faster (regex). Semantic is smarter (NLI neural inference). Auto picks the best fit.

How Auto-classification works

AUTO is the default for new policies. When a policy is saved, the engine runs a one-time classification on the neural instruction:

Tokenize the instruction.
Compute features — keyword density, regex patterns, intent verbs ("block", "mask"), entity references.
Classify into EXACT or SEMANTIC.
Cache the decision; re-evaluate only when the instruction changes.

Rules with phrases like "Block any prompt containing credit card numbers or IBANs" classify as EXACT — the entities are well-defined structured data.

Rules with phrases like "Block any attempt to extract trade secrets about our pricing strategy" classify as SEMANTIC — "trade secret" and "pricing strategy" are intent-level signals that paraphrase easily.
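The four classification steps can be sketched as a toy Python function. This is only an illustration under stated assumptions: the cue lists, the hit-counting rule, and the tie-break are hypothetical stand-ins, since the features and weights the engine really uses are not described here.

```python
import re
from functools import lru_cache

# Hypothetical cue lists; the engine's real feature set is not documented here.
STRUCTURED_ENTITIES = ("credit card", "iban", "ssn", "email", "phone")
INTENT_PHRASES = ("attempt to", "manipulate", "extract", "discuss")


def tokenize(instruction: str) -> list[str]:
    """Step 1: lowercase the instruction and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", instruction.lower())


@lru_cache(maxsize=1024)
def classify(instruction: str) -> str:
    """Steps 2-4: count cues, pick a mode, and cache the result per instruction text."""
    text = " ".join(tokenize(instruction))
    entity_hits = sum(text.count(cue) for cue in STRUCTURED_ENTITIES)
    intent_hits = sum(text.count(cue) for cue in INTENT_PHRASES)
    # Assumed tie-break: without a clear structured signal, fall back to SEMANTIC.
    return "EXACT" if entity_hits > intent_hits else "SEMANTIC"


print(classify("Block any prompt containing credit card numbers or IBANs"))
print(classify("Block any attempt to extract trade secrets about our pricing strategy"))
```

Because the cache is keyed on the instruction text, an unchanged instruction reuses the stored decision and an edited one is classified again, which mirrors the last step.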

Manual override — the three-card selector

In the Policy Editor, click the Detection mode field to reveal three cards:

Card	Pick when
Auto	You're not sure — let Palveron decide and adjust later if needed
Exact Match	The rule is about structured data (PII, IBANs, codes, keywords) — you want the speed and lowest false-positive rate
Semantic	The rule is about intent, off-topic, paraphrased risks — you need the NLI engine's understanding

The selected card sticks until you change it. Re-saving the policy with AUTO re-runs the classifier.

When the Policy Editor detects a mismatch between your detection mode and the rule's content, it surfaces a banner:

"This rule mentions 'trade secrets' and 'pricing strategy' — usually a semantic concept. Switch to Semantic detection?"

Three triggers fire the banner:

Exact + intent verbs in the rule ("any attempt to...", "manipulate", "extract") → suggests Semantic.
Semantic + structured entities ("credit card", "IBAN", "SSN" with no surrounding fuzzy intent) → suggests Exact.
Auto + tier mismatch — AUTO on Community tier (which only supports Exact) → suggests Exact.

Dismiss the banner to suppress it for the current edit. It will return if you change the rule and the trigger still applies.
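The three triggers amount to a small decision function. The sketch below is an assumption-laden illustration, not the editor's actual logic: `suggest_mode`, the cue tuples, the lowercase tier names, and the order in which the triggers are checked are all hypothetical.

```python
from typing import Optional

# Cues taken from the trigger descriptions; matching by substring is an assumption.
INTENT_CUES = ("any attempt to", "manipulate", "extract")
STRUCTURED_CUES = ("credit card", "iban", "ssn")


def suggest_mode(mode: str, rule: str, tier: str) -> Optional[str]:
    """Return the mode the banner would suggest, or None when no trigger applies."""
    text = rule.lower()
    has_intent = any(cue in text for cue in INTENT_CUES)
    has_structured = any(cue in text for cue in STRUCTURED_CUES)

    if mode == "AUTO" and tier == "community":
        return "EXACT"      # tier mismatch: Community only supports Exact
    if mode == "EXACT" and has_intent:
        return "SEMANTIC"   # intent verbs under an Exact policy
    if mode == "SEMANTIC" and has_structured and not has_intent:
        return "EXACT"      # structured entities with no surrounding fuzzy intent
    return None


rule = "Block any attempt to extract trade secrets about our pricing strategy"
print(suggest_mode("EXACT", rule, "business"))
```

Returning `None` corresponds to the case where no banner is shown.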

Tier availability

Tier	Available modes
Community	Exact Match only
Pro	Exact Match + Semantic
Business / Enterprise	All three (Auto / Exact Match / Semantic)

On Community projects, AUTO is treated as EXACT without raising an error; the banner surfaces this so you're not surprised.
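As a rough illustration, the tier table and the Community fallback can be written as a lookup. The names `AVAILABLE_MODES` and `resolve_mode` are hypothetical, and raising an error for any other unavailable combination is an assumption, because the page only describes the Community fallback for AUTO.

```python
# Tier table from above; lowercase tier keys are an assumption for this sketch.
AVAILABLE_MODES = {
    "community": {"EXACT"},
    "pro": {"EXACT", "SEMANTIC"},
    "business": {"AUTO", "EXACT", "SEMANTIC"},
    "enterprise": {"AUTO", "EXACT", "SEMANTIC"},
}


def resolve_mode(tier: str, requested: str) -> str:
    """Return the mode that would run for a policy on the given tier."""
    if requested in AVAILABLE_MODES[tier]:
        return requested
    if tier == "community" and requested == "AUTO":
        return "EXACT"  # the documented fallback
    # Behavior for other unavailable combinations is not documented; raising is an assumption.
    raise ValueError(f"{requested} is not available on the {tier} tier")


print(resolve_mode("community", "AUTO"))
print(resolve_mode("business", "AUTO"))
```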

match_details in the Trace Explorer

When a policy fires, the trace's match_details field shows what specifically triggered the decision. The values map back to the detection mode that owned the match:

`match_details.type`	Source	Detection mode
`keyword_match`	Regex / Aho-Corasick	Exact Match
`entity_detection`	NGE NER (e.g. `SSN`, `EMAIL`)	Exact Match (NGE deterministic)
`tool_call`	MCP tool name + agent	n/a (policy is on the tool, not content)
`rate_pattern`	Per-agent burst signal	n/a
`semantic_similarity`	NLI similarity ≥ threshold	Semantic

A single trace can carry multiple match types — when a Semantic policy fires on a prompt that also hits an Exact policy, both rows appear, and the Trace Explorer's filter lets you slice by either.
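To make the mapping concrete, here is a minimal sketch that groups the rows of one trace by the detection mode that owned each match. The trace shape (a list of dicts with `type` and `policy` keys) and the policy names are assumptions made for illustration; only the `type` values and their modes come from the table.

```python
from collections import defaultdict

# Match type -> owning detection mode, as listed in the table above.
MODE_BY_MATCH_TYPE = {
    "keyword_match": "Exact Match",
    "entity_detection": "Exact Match",
    "tool_call": None,       # policy is on the tool, not the content
    "rate_pattern": None,
    "semantic_similarity": "Semantic",
}


def group_by_mode(match_details: list[dict]) -> dict:
    """Group match rows under "Exact Match", "Semantic", or "n/a"."""
    grouped = defaultdict(list)
    for row in match_details:
        mode = MODE_BY_MATCH_TYPE.get(row["type"])
        grouped[mode or "n/a"].append(row)
    return dict(grouped)


# Hypothetical trace in which a Semantic and an Exact policy both fired.
trace = {
    "match_details": [
        {"type": "semantic_similarity", "policy": "no-trade-secrets"},
        {"type": "keyword_match", "policy": "blocked-codenames"},
    ]
}
for mode, rows in group_by_mode(trace["match_details"]).items():
    print(mode, [row["policy"] for row in rows])
```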

When to use which mode

Pick Exact Match when…

The rule names specific PII types (SSN, IBAN, email, phone, credit card)
You're blocking on keyword lists (internal codenames, blocked product references, forbidden phrases)
You want sub-5 ms decisions even at scale
You have a low tolerance for false positives: the prompt either matches the pattern or it doesn't
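A minimal sketch of what exact matching looks like in practice, using Python's standard `re` module. Several things here are assumptions: the IBAN pattern is a simplified shape check with no checksum validation, the keyword list is made up, and a regex alternation stands in for Aho-Corasick, which is not part of the standard library.

```python
import re

# Simplified IBAN shape: country code, two check digits, 11-30 alphanumerics.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")

# Hypothetical blocked keyword list (for example, internal codenames).
BLOCKED_KEYWORDS = ["project falcon", "internal roadmap"]
KEYWORD_RE = re.compile(
    "|".join(re.escape(keyword) for keyword in BLOCKED_KEYWORDS),
    re.IGNORECASE,
)


def exact_matches(prompt: str) -> list[str]:
    """Return IBAN-shaped strings, then blocked keywords (lowercased), found in the prompt."""
    hits = [m.group(0) for m in IBAN_RE.finditer(prompt)]
    hits += [m.group(0).lower() for m in KEYWORD_RE.finditer(prompt)]
    return hits


print(exact_matches("Wire it to DE89370400440532013000 for Project Falcon"))
```

The result is binary by construction: a prompt either contains a matching span or returns an empty list, which is why this mode is easy to debug.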

Pick Semantic when…

The rule is about intent ("attempt to extract", "manipulate", "discuss our internal strategy")
The rule needs to catch paraphrases ("delete the database" vs. "drop the schema" vs. "wipe production")
You have an off-topic policy (anything outside a defined scope)
The rule targets toxicity, harassment, or sentiment
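The paraphrase problem is easy to reproduce with a plain keyword check. The sketch below is illustrative only: `keyword_hit` is a hypothetical helper, and it shows the gap that Semantic mode is meant to close rather than how the NLI engine works.

```python
# The three phrasings from the list above, all expressing the same intent.
PARAPHRASES = ["delete the database", "drop the schema", "wipe production"]


def keyword_hit(prompt: str, keywords: tuple = ("delete the database",)) -> bool:
    """Case-insensitive substring check against a fixed keyword list."""
    text = prompt.lower()
    return any(keyword in text for keyword in keywords)


caught = [phrase for phrase in PARAPHRASES if keyword_hit(phrase)]
missed = [phrase for phrase in PARAPHRASES if not keyword_hit(phrase)]
print("caught:", caught)
print("missed:", missed)
```

Only the literal phrase is caught; adding every variant to the keyword list does not scale, which is the case for an intent-level check.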

Stay on Auto when…

The policy is new and you don't yet have ground truth on what works
The rule mixes structured and intent components — let the engine pick
You're rolling out 10+ policies and don't want to set the mode for each one manually

Exact Match is faster (regex). Semantic is smarter (NLI neural inference). Auto picks the best fit. When you're not sure, leave it on Auto and inspect the first few traces — the Trace Explorer shows which mode actually ran, so you can correct course quickly.