Detection Architecture

How Kysira scores a request, what attack classes it covers, and how it handles encoding obfuscation.

Two-layer detection

Every request passes through two layers in sequence:

1. Heuristic layer — fast, deterministic pattern matching against a catalog of attack signatures. Runs first. If a high-confidence match is found, the ML layer is skipped entirely to keep latency low.

2. ML layer — purpose-trained classifiers for attack classes where heuristic patterns alone have insufficient recall (SQL injection and prompt injection). These run only when the heuristic layer does not already produce a kill-threshold score.

This ordering keeps typical latency low: the vast majority of requests are decided by the heuristic layer in under 1 ms. The ML classifiers add 50–400 ms on CPU and run only when the heuristic layer has not already produced a kill-threshold score.
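The short-circuit ordering described above can be sketched as follows. This is a minimal illustration, not Kysira's implementation: `score_request`, the two callables, and the choice to combine layer scores with `max` are all assumptions, and the sketch ignores any pre-filtering that decides whether the ML layer applies to a given request.

```python
from typing import Callable

KILL_THRESHOLD = 0.95  # default kill threshold from the scoring tiers


def score_request(
    text: str,
    heuristic: Callable[[str], float],
    ml_classifier: Callable[[str], float],
) -> tuple[float, str]:
    """Return (score, layers), where layers names the layers that ran."""
    h_score = heuristic(text)
    if h_score >= KILL_THRESHOLD:
        # High-confidence heuristic match: the slower ML layer is skipped.
        return h_score, "heuristic"
    # Assumption: when both layers run, the stronger signal wins.
    return max(h_score, ml_classifier(text)), "heuristic+ml"
```

Passing the layers in as callables keeps the sketch independent of any particular signature catalog or model.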

Request normalization

Before any detection runs, the request text is normalized to a stable form. This means:

Recursive URL-decoding to a fixed point: double and triple encoding (%2527, %252527) collapses to plain text
HTML entity decoding (named, decimal, and hex forms such as &lt;, &#39;, &#x3C;), applied inline with URL decoding so mixed-encoding payloads are handled in one pass
Form body splitting on both & and ; separators, including parameter names (not just values) — payloads hidden in parameter names are not missed

Detection always runs on the normalized form, making it caller-independent: a payload is detected regardless of whether the network layer, load balancer, or application framework decoded it before Kysira saw it.
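A minimal sketch of the normalization steps listed above, using only the Python standard library. The function names `normalize` and `split_form_body` and the `max_rounds` safety cap are assumptions made for illustration; the actual pipeline may order or bound these steps differently.

```python
import html
from urllib.parse import unquote


def normalize(text: str, max_rounds: int = 10) -> str:
    """Apply URL and HTML-entity decoding until the text stops changing."""
    for _ in range(max_rounds):  # assumed cap to guard against pathological input
        decoded = html.unescape(unquote(text))
        if decoded == text:
            break  # fixed point reached
        text = decoded
    return text


def split_form_body(body: str) -> list[str]:
    """Split on both & and ;, returning normalized names and values alike."""
    parts: list[str] = []
    for pair in body.replace(";", "&").split("&"):
        name, _, value = pair.partition("=")
        # Parameter names are kept so payloads hidden in them are scanned too.
        parts.extend(normalize(p) for p in (name, value) if p)
    return parts
```

For example, `normalize("%252527")` takes three decoding rounds to reach a single quote, and a script tag hidden in a double-encoded parameter name comes back from `split_form_body` in plain form.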

Detector catalog

Detector	OWASP class	Type
SQL injection	A03:2021 Injection	ML + heuristic backstop
Prompt injection	LLM01:2025	ML + keyword pre-filter
Cross-site scripting (XSS)	A03:2021 Injection	Heuristic
NoSQL injection	A03:2021 Injection	Heuristic
Command injection	A03:2021 Injection	Heuristic
Server-side template injection (SSTI)	A03:2021 Injection	Heuristic
Path traversal / LFI / RFI	A01:2021 Broken Access Control	Heuristic
LDAP injection	A03:2021 Injection	Heuristic
XPath injection	A03:2021 Injection	Heuristic
SSRF	A10:2021 SSRF	Heuristic
XXE	A05:2021 Security Misconfiguration	Heuristic
Insecure deserialization	A08:2021 Integrity Failures	Heuristic
Open redirect	A01:2021 Broken Access Control	Heuristic
CRLF / HTTP response splitting	A03:2021 Injection	Heuristic
HTTP request smuggling	A03:2021 Injection	Structural (proxy layer)

Scoring tiers

Detectors return a score in [0.0, 1.0]. The default kill threshold is 0.95.

Score range	Meaning	Default action
`≥ 0.95`	High-confidence attack	Block (active mode) or flag (shadow mode)
`0.6 – <0.95`	Advisory: structurally ambiguous	Flag only, never auto-block
`< 0.6`	Clean	Pass through

The advisory tier exists for classes where a generic WAF cannot distinguish attack from legitimate use without application context — open redirects (any external URL could be legitimate), private-host SSRF (internal health checks), and certain NoSQL operators used in filter APIs. These surface in the dashboard for review without risking false-positive blocks.
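The tier table can be read as a simple mapping from score to action. The sketch below assumes the default thresholds; `action_for` and the mode strings `"active"` and `"shadow"` are hypothetical names chosen for illustration.

```python
KILL_THRESHOLD = 0.95  # default kill threshold
ADVISORY_FLOOR = 0.6   # lower edge of the advisory tier


def action_for(score: float, mode: str = "active") -> str:
    """Map a detector score to the default action for its tier."""
    if score >= KILL_THRESHOLD:
        # High-confidence attack: block in active mode, only flag in shadow mode.
        return "block" if mode == "active" else "flag"
    if score >= ADVISORY_FLOOR:
        # Advisory tier: surfaced for review, never auto-blocked.
        return "flag"
    return "pass"
```

Comparing against the thresholds directly, rather than against rounded range endpoints, means a score such as 0.949 still lands in the advisory tier.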

The /score/all endpoint returns the maximum score across all detectors along with a per-detector breakdown, so you can inspect which class fired and at what confidence.
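As a rough illustration of that aggregation, the sketch below reduces per-detector scores to a maximum plus a breakdown. The field names `score`, `top_detector`, and `detectors` are hypothetical and do not describe the actual response schema of /score/all.

```python
def aggregate(per_detector: dict[str, float]) -> dict:
    """Return the maximum score together with the per-detector breakdown."""
    if not per_detector:
        return {"score": 0.0, "top_detector": None, "detectors": {}}
    top = max(per_detector, key=per_detector.get)
    return {
        "score": per_detector[top],       # maximum across all detectors
        "top_detector": top,              # which class fired strongest
        "detectors": dict(per_detector),  # full breakdown for inspection
    }
```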

Known limitations

The heuristic layer is tuned for high precision (low false positives). Some heavily obfuscated payloads may not be caught:

Alternate encodings not in the normalization pipeline — base64 data URIs carrying scripts, hex/octal/IPv6-mapped IP addresses for SSRF
Whitespace or comment obfuscation inside keywords — e.g. SQL keywords split across comments
Novel jailbreak phrasings for prompt injection that don't match known patterns
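To make the alternate-encoding gap concrete, the sketch below shows how two of the listed SSRF forms resolve to the same loopback address. It is only an illustration of why such hosts slip past a textual match on `127.0.0.1`; `canonical_ipv4` is a hypothetical helper, not something Kysira is stated to do, and it deliberately leaves octal and dotted-hex forms unhandled.

```python
import ipaddress


def canonical_ipv4(host: str):
    """Best effort: map hex-integer and IPv4-mapped IPv6 hosts to an IPv4Address."""
    try:
        if host.lower().startswith("0x"):
            # Single hex integer form, e.g. 0x7f000001
            return ipaddress.IPv4Address(int(host, 16))
        addr = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return None  # not an IP literal this sketch understands
    if isinstance(addr, ipaddress.IPv6Address):
        return addr.ipv4_mapped  # None unless the address is ::ffff:a.b.c.d
    return addr
```

Both `canonical_ipv4("0x7f000001")` and `canonical_ipv4("[::ffff:127.0.0.1]")` yield the loopback address even though neither string contains the literal `127.0.0.1` in its usual dotted position or form.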

Shadow mode is the recommended starting point precisely because it gives visibility into edge cases before committing to active blocking. The tiered scoring design means ambiguous cases surface as advisory scores rather than silently passing.