Detection Architecture
How Kysira scores a request, what attack classes it covers, and how it handles encoding obfuscation.
Two-layer detection
Every request passes through two layers in sequence:
1. Heuristic layer — fast, deterministic pattern matching against a catalog of attack signatures. Runs first. If a high-confidence match is found, the ML layer is skipped entirely to keep latency low.
2. ML layer — purpose-trained classifiers for attack classes where heuristic patterns alone have insufficient recall (SQL injection and prompt injection). These run only when the heuristic layer does not already produce a kill-threshold score.
This ordering bounds inference latency: the vast majority of requests are decided by the heuristic layer in under 1 ms. The ML classifiers add 50–400 ms on CPU and are only invoked when needed.
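The short-circuit ordering described above can be sketched roughly as follows. This is a minimal illustration, not Kysira's implementation: `score_request`, `heuristic`, and `ml_classifiers` are hypothetical names, and combining the two layers with `max` is an assumption.

```python
from typing import Callable, Dict

KILL_THRESHOLD = 0.95  # default kill threshold stated in the scoring tiers


def score_request(
    text: str,
    heuristic: Callable[[str], float],
    ml_classifiers: Dict[str, Callable[[str], float]],
) -> float:
    """Run the heuristic layer first; call ML only if it stays below the kill threshold."""
    heuristic_score = heuristic(text)
    if heuristic_score >= KILL_THRESHOLD:
        # High-confidence heuristic match: skip the slower ML layer entirely.
        return heuristic_score
    # Assumption: the final score is the maximum over both layers.
    ml_score = max((clf(text) for clf in ml_classifiers.values()), default=0.0)
    return max(heuristic_score, ml_score)
```

With this shape, the expensive classifiers are only paid for on requests the heuristic layer could not decide on its own.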
Request normalization
Before any detection runs, the request text is normalized to a stable form. This means:
- Recursive URL-decoding to a fixed point: double and triple encoding (`%2527`, `%252527`) collapse to plain text
- HTML entity decoding (`&lt;`, `&#39;`, `&#x3c;`), inline with URL decoding so mixed-encoding payloads are handled in one pass
- Form body splitting on both `&` and `;` separators, including parameter names (not just values), so payloads hidden in parameter names are not missed
Detection always runs on the normalized form, making it caller-independent: a payload is detected regardless of whether the network layer, load balancer, or application framework decoded it before Kysira saw it.
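A minimal sketch of this kind of normalization, using only the Python standard library. The function names and the `MAX_ROUNDS` cap are assumptions made for illustration; Kysira's actual pipeline may differ in order and detail.

```python
import html
import re
from typing import List
from urllib.parse import unquote

MAX_ROUNDS = 10  # hypothetical safety cap on decoding rounds


def normalize(text: str) -> str:
    """URL-decode and HTML-entity-decode repeatedly until the text stops changing."""
    for _ in range(MAX_ROUNDS):
        decoded = html.unescape(unquote(text))
        if decoded == text:
            break  # fixed point reached
        text = decoded
    return text


def split_form_body(body: str) -> List[str]:
    """Split on '&' and ';', returning normalized parameter names and values."""
    parts: List[str] = []
    for pair in re.split(r"[&;]", body):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        parts.append(normalize(name))  # names are inspected too, not just values
        if value:
            parts.append(normalize(value))
    return parts
```

For example, `normalize("%252527")` decodes in three rounds to a single quote character, and a parameter name such as `%3Cscript%3E` comes back from `split_form_body` as `<script>`.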
Detector catalog
| Detector | OWASP class | Type |
|---|---|---|
| SQL injection | A03:2021 Injection | ML + heuristic backstop |
| Prompt injection | LLM01:2025 | ML + keyword pre-filter |
| Cross-site scripting (XSS) | A03:2021 Injection | Heuristic |
| NoSQL injection | A03:2021 Injection | Heuristic |
| Command injection | A03:2021 Injection | Heuristic |
| Server-side template injection (SSTI) | A03:2021 Injection | Heuristic |
| Path traversal / LFI / RFI | A01:2021 Broken Access Control | Heuristic |
| LDAP injection | A03:2021 Injection | Heuristic |
| XPath injection | A03:2021 Injection | Heuristic |
| SSRF | A10:2021 SSRF | Heuristic |
| XXE | A05:2021 Security Misconfiguration | Heuristic |
| Insecure deserialization | A08:2021 Integrity Failures | Heuristic |
| Open redirect | A01:2021 Broken Access Control | Heuristic |
| CRLF / HTTP response splitting | A03:2021 Injection | Heuristic |
| HTTP request smuggling | A03:2021 Injection | Structural (proxy layer) |
Scoring tiers
Detectors return a score in [0.0, 1.0]. The default kill threshold is 0.95.
| Score range | Meaning | Default action |
|---|---|---|
| ≥ 0.95 | High-confidence attack | Block (active mode) or flag (shadow mode) |
| 0.6 to < 0.95 | Advisory: structurally ambiguous | Flag only, never auto-block |
| < 0.6 | Clean | Pass through |
The advisory tier exists for classes where a generic WAF cannot distinguish attack from legitimate use without application context — open redirects (any external URL could be legitimate), private-host SSRF (internal health checks), and certain NoSQL operators used in filter APIs. These surface in the dashboard for review without risking false-positive blocks.
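The tier table maps to a small decision function. The sketch below assumes the default thresholds; the names `decide`, `ADVISORY_THRESHOLD`, and the `active` flag are hypothetical.

```python
KILL_THRESHOLD = 0.95      # default kill threshold
ADVISORY_THRESHOLD = 0.6   # lower bound of the advisory tier


def decide(score: float, active: bool) -> str:
    """Map a detector score to the default action for its tier."""
    if score >= KILL_THRESHOLD:
        # Only active mode blocks; shadow mode flags instead.
        return "block" if active else "flag"
    if score >= ADVISORY_THRESHOLD:
        return "flag"  # advisory scores are never auto-blocked
    return "pass"
```

Note that an advisory score yields `"flag"` in both modes, which is what keeps ambiguous classes such as open redirects from producing false-positive blocks.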
The /score/all endpoint returns the maximum score across all detectors along with a per-detector breakdown, so you can inspect which class fired and at what confidence.
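As an illustration of the aggregation, assume the per-detector breakdown has been read into a dictionary of detector name to score. The response schema is not shown here, so the dictionary shape, the detector keys, and the `summarize` helper are all assumptions.

```python
from typing import Dict, Tuple


def summarize(breakdown: Dict[str, float]) -> Tuple[float, str]:
    """Return the maximum score and the name of the detector that produced it."""
    if not breakdown:
        return 0.0, ""  # no detector ran
    top = max(breakdown, key=lambda name: breakdown[name])
    return breakdown[top], top


# Hypothetical breakdown for one request.
example = {"sqli": 0.12, "xss": 0.97, "ssrf": 0.65}
top_score, top_detector = summarize(example)
```

Here `top_score` is `0.97` and `top_detector` is `"xss"`, while the `ssrf` entry at `0.65` remains visible in the breakdown as an advisory-tier signal.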
Known limitations
The heuristic layer is tuned for high precision (low false positives). Some heavily obfuscated payloads may not be caught:
- Alternate encodings not in the normalization pipeline — base64 data URIs carrying scripts, hex/octal/IPv6-mapped IP addresses for SSRF
- Whitespace or comment obfuscation inside keywords — e.g. SQL keywords split across comments
- Novel jailbreak phrasings for prompt injection that don't match known patterns
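To make the comment-obfuscation case concrete, the sketch below shows a deliberately naive signature missing a payload that uses an inline comment in place of whitespace. The pattern and helper are hypothetical and are not taken from Kysira's catalog; stripping comments first is shown only as one possible pre-processing step, not something the source describes.

```python
import re

# Hypothetical naive signature: expects whitespace between the two keywords.
NAIVE_SIGNATURE = re.compile(r"union\s+select", re.IGNORECASE)


def strip_sql_comments(text: str) -> str:
    """Replace /* ... */ block comments with a single space."""
    return re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)


payload = "1 UNION/**/SELECT password FROM users"

missed = NAIVE_SIGNATURE.search(payload) is None                       # comment hides the match
caught = NAIVE_SIGNATURE.search(strip_sql_comments(payload)) is not None  # visible after stripping
```

Both `missed` and `caught` evaluate to `True` here: the raw payload slips past the pattern, and the same pattern fires once the comment is replaced by a space.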
Shadow mode is the recommended starting point precisely because it gives visibility into edge cases before committing to active blocking. The tiered scoring design means ambiguous cases surface as advisory scores rather than silently passing.