PHP Luminova: Input Sanitization and Escaper
Escaper provides essential methods for escaping and sanitizing data to enhance security and prevent vulnerabilities. This ensures that user-generated input is handled safely across different contexts.
The Input Escape class is designed to provide a useful set of methods for escaping and sanitizing data. This ensures that input data is safe to use in HTML, JavaScript, URLs, and other contexts where untrusted input could potentially lead to security vulnerabilities, such as Cross-Site Scripting (XSS) attacks.
Features
- HTML Escaping: Prevents HTML injection attacks by escaping characters that could be interpreted as HTML tags or attributes.
- JavaScript Escaping: Escapes characters that might be interpreted as JavaScript code, ensuring that data is safely embedded in scripts.
- URL Escaping: Encodes special characters in URLs to ensure they are correctly transmitted and interpreted.
- Attribute Escaping: Escapes characters in HTML attributes to prevent malicious data injection.
- Custom Escaping: Allows for custom escaping rules and contexts as needed.
Usages
Third-Party Libraries
The Escape class also supports integration with third-party libraries such as \Laminas\Escaper\Escaper. To use the methods provided by this library, install it first by running the following command:
composer require laminas/laminas-escaper
Once installed, calls to the escape methods use \Laminas\Escaper\Escaper instead of the built-in Escape class methods for enhanced escaping functionality.
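The fallback described above can be pictured with a small availability check. This is only a sketch of the idea, not Luminova's actual code; the function name `escaperBackendSketch` is hypothetical.

```php
<?php
// Hypothetical helper: reports which escaping backend would be available.
function escaperBackendSketch(): string
{
    // class_exists() consults any registered autoloader (such as Composer's),
    // so it returns true only when the Laminas class can actually be loaded.
    return class_exists('Laminas\Escaper\Escaper') ? 'laminas' : 'built-in';
}

echo escaperBackendSketch(), PHP_EOL;
```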
Initializing Escape
use Luminova\Security\Escape;
$escaper = new Escape();
Global Helper Functions
An easier approach is to use the global helper function escape(), which also supports escaping an array.
echo escape('<script>alert("XSS")</script>');
Or escaping an array.
echo escape([
    'html' => '<script>alert("XSS")</script>',
    'js' => 'alert("XSS")'
]);
Factory Helpers
Luminova provides different ways to initialize and use the Escape class.
$escaper = factory()->escaper();
$escaper = factory('escaper');
Use the Factory class loader.
use Luminova\Foundation\Module\Factory;
$escaper = Factory::escaper();
Examples
These examples demonstrate how to use the Escape class to safely output user-generated content in different contexts:
Escape HTML special characters.
// Safe HTML output
echo $escaper->escapeHtml('<script>alert("XSS")</script>');
// Output: &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
Escapes JavaScript characters.
// Safe JavaScript output
echo $escaper->escapeJs('alert("XSS")');
// Output: alert("XSS");
Encodes special characters in a URL.
// Safe URL output
echo $escaper->escapeUrl('https://example.com/?search=foo&bar=baz');
// Output: https%3A%2F%2Fexample.com%2F%3Fsearch%3Dfoo%26bar%3Dbaz
Escape HTML attribute values.
// Safe HTML attribute output
echo $escaper->escapeHtmlAttr('" onmouseover="alert(\'XSS\')"');
// Output: " onmouseover="alert('XSS')"
Escape with custom patterns.
echo $escaper->escapeWith('<div>Example</div>', [
    '/</' => '&lt;',
    '/>/' => '&gt;',
]);
// Output: &lt;div&gt;Example&lt;/div&gt;
Class Definition
- Class namespace: Luminova\Security\Escape
Supported Encodings
The $supportedEncodings array lists the character encodings supported by the system. Each entry represents a specific encoding type that the system can handle.
| Type | Description |
|---|---|
| iso-8859-1 | ISO 8859-1 Latin-1 (Western European) |
| iso8859-1 | Alternative notation for ISO 8859-1 |
| iso-8859-5 | ISO 8859-5 Latin/Cyrillic |
| iso8859-5 | Alternative notation for ISO 8859-5 |
| iso-8859-15 | ISO 8859-15 Latin-9 (Western European with Euro) |
| iso8859-15 | Alternative notation for ISO 8859-15 |
| utf-8 | Unicode Transformation Format - 8 bits |
| cp866 | Code Page 866 (Cyrillic) |
| ibm866 | IBM Code Page 866 (Cyrillic) |
| 866 | Alternative notation for Code Page 866 |
| cp1251 | Code Page 1251 (Cyrillic) |
| windows-1251 | Windows Code Page 1251 (Cyrillic) |
| win-1251 | Alternative notation for Windows Code Page 1251 |
| 1251 | Alternative notation for Code Page 1251 |
| cp1252 | Code Page 1252 (Western European) |
| windows-1252 | Windows Code Page 1252 (Western European) |
| 1252 | Alternative notation for Code Page 1252 |
| koi8-r | KOI8-R (Cyrillic) |
| koi8-ru | KOI8-RU (Cyrillic) |
| koi8r | Alternative notation for KOI8-R |
| big5 | Big5 (Traditional Chinese) |
| 950 | Alternative notation for Big5 |
| gb2312 | GB 2312 (Simplified Chinese) |
| 936 | Alternative notation for GB 2312 |
| big5-hkscs | Big5-HKSCS (Traditional Chinese with Hong Kong Supplementary Character Set) |
| shift_jis | Shift JIS (Japanese) |
| sjis | Alternative notation for Shift JIS |
| sjis-win | Shift JIS (Japanese) for Windows |
| cp932 | Code Page 932 (Japanese) |
| 932 | Alternative notation for Code Page 932 |
| euc-jp | EUC-JP (Japanese) |
| eucjp | Alternative notation for EUC-JP |
| eucjp-win | EUC-JP (Japanese) for Windows |
| macroman | MacRoman (Western European) |
Properties
encoding
The Escaper encoding (default: utf-8).
protected string $encoding = 'utf-8';
encodingFlags
The Escaper encoding flags used when converting special characters to HTML entities (default: ENT_QUOTES|ENT_SUBSTITUTE).
protected int $encodingFlags = ENT_QUOTES | ENT_SUBSTITUTE;
supportedEncodings
The list of encodings supported by the escaper. The full list is given in the Supported Encodings section above.
protected string[] $supportedEncodings = [
    'utf-8',
    //...
];
Methods
constructor
Initialize the escaper constructor.
public __construct(?string $encoding = 'utf-8')
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $encoding | string\|null | The character encoding to use (default: 'utf-8'). |
Throws:
- \Luminova\Exceptions\InvalidArgumentException - Throws if unsupported encoding or empty string is provided.
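The validation described above might look roughly like the following. This is a sketch under assumptions: the helper name is hypothetical, the supported list is a three-item subset chosen for illustration, case-insensitive matching is assumed, and PHP's SPL `InvalidArgumentException` stands in for the framework's own exception class.

```php
<?php
// Hypothetical helper: validates an encoding name against a supported list.
function assertSupportedEncodingSketch(?string $encoding): string
{
    // Illustrative subset only; the documented list is much longer.
    $supported = ['utf-8', 'iso-8859-1', 'windows-1252'];

    // Assumption: names are compared case-insensitively after trimming.
    $normalized = strtolower(trim((string) $encoding));

    if ($normalized === '' || !in_array($normalized, $supported, true)) {
        // SPL exception used here in place of the framework's exception.
        throw new InvalidArgumentException("Unsupported encoding: '{$encoding}'");
    }

    return $normalized;
}

echo assertSupportedEncodingSketch('UTF-8'), PHP_EOL;
```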
with
Create a static shared escaper instance with the given encoding.
This method checks if the optional third-party escaper class is available. If it exists, the created instance will use it internally for escaping. Otherwise, the instance falls back to the built-in escaper logic.
public static with(?string $encoding = null): self
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $encoding | string\|null | The character encoding to use (default: 'utf-8'). |
Return Value
Luminova\Security\Escape Returns shared static instance of escaper.
Throws:
- \Luminova\Exceptions\InvalidArgumentException - if unsupported encoding or empty string is provided.
Example:
$escaper = Escape::with('UTF-8');
$escaped = $escaper->escapeHtml('<b>Hello</b>');
escape
Escapes a user input string based on the specified context.
Supported Context
Context names and a brief explanation of each:
- html: Escapes characters that could be interpreted as HTML tags or entities. It replaces characters like <, >, and & with their corresponding HTML entities (&lt;, &gt;, &amp;), ensuring that they are displayed as plain text and not interpreted as HTML.
- js: Escapes characters that have special meanings in JavaScript, such as quotes and backslashes, to prevent injection attacks when inserting data into JavaScript strings or variables.
- css: Escapes characters that could affect CSS styling or lead to CSS injection attacks, such as special characters in style attributes or CSS rules.
- url: Escapes characters that are not valid in URLs or could break URL structure. This ensures that user-provided data included in URLs does not lead to unexpected behavior or vulnerabilities.
public static escape(string $input, string $context, ?string $encoding = null): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $input | string | The input string to escape. |
| $context | string | The escaper context (e.g., html, js, css or url). |
| $encoding | string\|null | The character encoding to use (default: utf-8). |
Return Value
string Return the escaped string.
Throws:
- \Luminova\Exceptions\InvalidArgumentException - If an unsupported, invalid or blank encoding is provided.
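Dispatching on the context name can be sketched with PHP built-ins. This is not the class's implementation: the function name is hypothetical, only the html and url contexts are covered, and throwing on an unknown context is an assumption made for the sketch.

```php
<?php
// Hypothetical helper: picks an escaping routine from the context name.
function escapeByContextSketch(string $input, string $context): string
{
    return match (strtolower($context)) {
        // HTML body text: convert markup characters to entities.
        'html' => htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'),
        // URL component: percent-encode everything outside the unreserved set.
        'url' => rawurlencode($input),
        // js and css are omitted here; rejecting unknown contexts is an assumption.
        default => throw new InvalidArgumentException("Unsupported context: {$context}"),
    };
}

echo escapeByContextSketch('<b>Hello</b>', 'html'), PHP_EOL;
echo escapeByContextSketch('a b', 'url'), PHP_EOL;
```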
Example:
$escaped = Escape::escape('<b>Hello</b>', 'html');
setEncoding
Set the escaper encoding type. If setEncoding is called while the Laminas Escaper library is in use, a new Laminas Escaper instance is created.
public setEncoding(string $encoding): self
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $encoding | string | The character encoding to use (e.g., 'utf-8'). |
Return Value
Luminova\Security\Escape Return instance of escape class.
Throws:
- \Luminova\Exceptions\InvalidArgumentException - Throws if unsupported encoding or empty string is provided.
getEncoding
Get the character encoding used by the escaper.
protected getEncoding(): string
Return Value:
string - Return the character encoding.
escapeHtml
Escapes HTML characters in a string to prevent HTML injection attacks. Converts special characters to their HTML entities.
protected escapeHtml(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be escaped. |
Return Value:
string - Return the escaped string.
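To see what the documented default flags (ENT_QUOTES | ENT_SUBSTITUTE) do, here is a sketch using PHP's `htmlspecialchars()` directly. Whether the class calls this function internally is an assumption; the variable names are illustrative.

```php
<?php
$input = "<b title='x'>\"Hi\"</b>";

// ENT_QUOTES converts both double and single quotes, not just double quotes.
$escaped = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
echo $escaped, PHP_EOL;
// &lt;b title=&#039;x&#039;&gt;&quot;Hi&quot;&lt;/b&gt;

// ENT_SUBSTITUTE replaces an invalid byte sequence with U+FFFD
// instead of returning an empty string.
$substituted = htmlspecialchars("ok\xFF", ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
echo $substituted, PHP_EOL;
```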
escapeHtmlAttr
Escapes characters in HTML attributes to prevent injection attacks within HTML attributes.
protected escapeHtmlAttr(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be escaped. |
Return Value:
string - Return the escaped string.
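The risk that attribute escaping addresses is a value that closes the attribute early. The sketch below uses `htmlspecialchars()` as a stand-in, which is sufficient only when the attribute is wrapped in quotes; dedicated attribute escapers such as the one in Laminas encode a wider set of characters so that unquoted attributes are covered as well.

```php
<?php
// A value that tries to break out of the attribute and add an event handler.
$userValue = '" onmouseover="alert(\'XSS\')';

// Stand-in for attribute escaping: both quote styles become entities.
$safe = htmlspecialchars($userValue, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// The value stays inside the quoted attribute because no raw quote survives.
$tag = '<input type="text" value="' . $safe . '">';
echo $tag, PHP_EOL;
```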
escapeJs
Escapes characters in a string that might be interpreted as JavaScript code. Ensures that data used in JavaScript contexts is safe.
protected escapeJs(array|string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | array\|string | The string or array of strings to be escaped. |
Return Value:
string - The escaped string or array of strings.
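A related technique, shown here for comparison and not the same as escapeJs, is to emit the value as a complete JavaScript string literal with `json_encode()` and the JSON_HEX_* flags. Note that this produces a quoted literal, whereas an escaper in the style of escapeJs returns only the escaped contents.

```php
<?php
$name = "O'Reilly </script><script>alert(1)</script>";

// The JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX sequences, so the
// literal cannot terminate the surrounding <script> element or a quoted string.
$jsLiteral = json_encode(
    $name,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
);

echo '<script>const userName = ' . $jsLiteral . ';</script>', PHP_EOL;
```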
escapeCss
Escape CSS special characters.
protected escapeCss(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be escaped. |
Return Value:
string - Return the escaped string.
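One common approach to CSS escaping is to replace every non-alphanumeric character with a backslash, its hexadecimal code point, and a trailing space. The sketch below illustrates that approach under the assumption of UTF-8 input; the function name is hypothetical and the class's own rules may differ.

```php
<?php
// Hypothetical helper: hex-escapes every character outside [a-zA-Z0-9].
function cssEscapeSketch(string $value): string
{
    $escaped = preg_replace_callback(
        '/[^a-z0-9]/iu',
        static function (array $match): string {
            // Convert the UTF-8 character to a 32-bit code point.
            $codePoint = unpack('N', iconv('UTF-8', 'UCS-4BE', $match[0]))[1];
            // CSS escape: backslash, hex digits, terminating space.
            return sprintf('\\%X ', $codePoint);
        },
        $value
    );

    if ($escaped === null) {
        // preg_replace_callback() returns null when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $escaped;
}

echo cssEscapeSketch('a;b'), PHP_EOL;
```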
escapeUrl
Encodes special characters in a URL to ensure it is correctly interpreted by browsers and servers.
protected escapeUrl(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The URL to be escaped. |
Return Value:
string - Return the escaped URL.
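Percent-encoding is normally applied to individual URL components, since encoding an entire URL also encodes its scheme and separators. A minimal sketch with PHP's `rawurlencode()`, used here as a stand-in for escapeUrl:

```php
<?php
// Untrusted value destined for a single query parameter.
$search = 'foo&bar=baz';

// Encode only the component so "&" and "=" cannot add extra parameters.
$url = 'https://example.com/?search=' . rawurlencode($search);

echo $url, PHP_EOL;
// https://example.com/?search=foo%26bar%3Dbaz
```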
escapeWith
Applies custom escaping rules to the input string. Allows for flexibility in handling various contexts.
protected escapeWith(string $string, array<string,string> $rules): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be escaped. |
| $rules | array<string,string> | An associative array where keys are patterns and values are replacements. |
Return Value:
string - Return the escaped string.
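Pattern-to-replacement rules of this shape map naturally onto `preg_replace()`. The following is a sketch of that idea with a hypothetical function name, not the method's actual body; the exception raised for an invalid pattern is also an assumption.

```php
<?php
// Hypothetical helper: applies regex => replacement rules in order.
function escapeWithSketch(string $string, array $rules): string
{
    // With array arguments, preg_replace() applies each pattern in sequence,
    // so later rules see the output of earlier ones.
    $result = preg_replace(array_keys($rules), array_values($rules), $string);

    if ($result === null) {
        throw new RuntimeException('Rule failed: ' . preg_last_error_msg());
    }

    return $result;
}

echo escapeWithSketch('<div>Example</div>', [
    '/</' => '&lt;',
    '/>/' => '&gt;',
]), PHP_EOL;
```

Because the rules run in order, a rule that rewrites `&` should come first so that it does not re-escape entities produced by later rules.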
toUtf8
Convert a string to UTF-8 encoding.
protected toUtf8(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be converted. |
Return Value:
string - Return the converted string.
Throws:
- \Luminova\Exceptions\RuntimeException - When the string is not valid UTF-8 or cannot be converted.
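The convert-then-validate behaviour can be sketched with `iconv()` and a UTF-8 validity check. Unlike the documented method, this hypothetical helper takes the source encoding as an explicit second argument, and PHP's SPL `RuntimeException` stands in for the framework's exception.

```php
<?php
// Hypothetical helper: converts $string from $from to UTF-8 and validates it.
function toUtf8Sketch(string $string, string $from): string
{
    $converted = strtolower($from) === 'utf-8'
        ? $string
        : @iconv($from, 'UTF-8', $string); // false on conversion failure

    // An empty pattern with the /u modifier matches only valid UTF-8 subjects.
    if ($converted === false || preg_match('//u', $converted) !== 1) {
        throw new RuntimeException("String cannot be converted from {$from} to UTF-8.");
    }

    return $converted;
}

// "café" in ISO-8859-1 is four bytes; in UTF-8 the "é" takes two bytes.
echo strlen(toUtf8Sketch("caf\xE9", 'ISO-8859-1')), PHP_EOL;
```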
fromUtf8
Convert a string from UTF-8 encoding.
protected fromUtf8(string $string): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | string | The string to be converted. |
Return Value:
string - Return the converted string.
convertEncoding
Convert a string to a different character encoding.
protected convertEncoding(array|string $string, string $to, array|string|null $from = null): string
Parameters:
| Parameter | Type | Description |
|---|---|---|
| $string | array\|string | The string or array of strings to be converted. |
| $to | string | The target character encoding. |
| $from | array\|string\|null | The source character encoding. Defaults to null (auto-detection). |
Return Value:
string - Return the converted string.