Suppose I have an object of string keys and string values, and I want to write them as CSS custom properties into some server-generated HTML. How can I do this safely?
What I mean by security is
To keep it simple, I will restrict keys to characters in the [a-zA-Z0-9_-] class.
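As a minimal sketch of that key restriction, a check could look like the following. The helper name isSafeKey is hypothetical and not part of the code further down; the only assumption is that the whole key must match the character class.

```javascript
// Hypothetical helper: accept only keys made entirely of [a-zA-Z0-9_-].
// The ^ and $ anchors matter: without them "bad key" would pass because "bad" matches.
function isSafeKey(key) {
  return typeof key === "string" && /^[a-zA-Z0-9_-]+$/.test(key);
}

console.log(isSafeKey("color"));         // true
console.log(isSafeKey("bad key"));       // false
console.log(isSafeKey("x: red; } p {")); // false
```

Keys that fail the check would simply be skipped before any CSS is written.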
From reading the CSS spec and some personal testing, I think you can get quite far by putting the value through the following steps:
- Make sure every {([ outside a string has a matching closing bracket. If not, discard this key-value pair.
- Use \3C to escape all instances of <, and use \3E to escape all instances of >.
- Use \3B to escape all instances of ;.
I came up with the above steps based on the CSS syntax specification.
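To make the steps above concrete, here is a rough sketch of how they might be implemented. The function name sanitizeValue is hypothetical. It assumes that a null return means "discard this pair", and it adds a trailing space after each escape so that a following hex digit is not read as part of the escape. It does not handle CSS comments or backslash escapes outside strings.

```javascript
// Hypothetical helper sketching the steps above.
// Returns the escaped value, or null if the key-value pair should be discarded.
function sanitizeValue(value) {
  const closers = { "{": "}", "(": ")", "[": "]" };
  const stack = []; // closing brackets that are still expected
  let quote = null; // quote character of the string we are inside, if any

  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    if (quote !== null) {
      if (ch === "\\") i++;                // skip the escaped character
      else if (ch === quote) quote = null; // end of the string
      else if (ch === "\n") return null;   // unescaped newline inside a string
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (closers[ch] !== undefined) {
      stack.push(closers[ch]);
    } else if (ch === "}" || ch === ")" || ch === "]") {
      if (stack.pop() !== ch) return null; // stray or mismatched closing bracket
    }
  }
  if (quote !== null || stack.length > 0) return null; // something was left open

  // CSS escapes for <, > and ; (the trailing space terminates each escape)
  return value
    .replace(/</g, "\\3C ")
    .replace(/>/g, "\\3E ")
    .replace(/;/g, "\\3B ");
}

console.log(sanitizeValue("calc(1px + (2px * 3))"));    // calc(1px + (2px * 3))
console.log(sanitizeValue("red } body { color: blue")); // null
```

Brackets inside quoted strings are deliberately ignored, so a value such as "a ) b" (with the quotes) is kept.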
For context, these properties may be used by user-defined styles that we insert elsewhere, but the same object is also used as template data, so it may contain a mix of strings intended as content and strings intended as CSS variables. I feel the algorithm above strikes a good balance: it is very simple, and it does not risk throwing away too many key-value pairs that might be useful in CSS (even allowing for future additions to CSS). Still, I want to make sure I'm not missing something.
Here's some JS code showing what I'm trying to achieve. obj is the object in question, and preprocessPairs is a function that takes the object and preprocesses it, removing or reformatting values as described in the steps above.
    function generateThemePropertiesTag(obj) {
      obj = preprocessPairs(obj);
      return `<style>
    :root {
    ${Object.entries(obj).map(([key, value]) => {
      return `--theme-${key}: ${value};`
    }).join("\n")}
    }
    </style>`
    }
So when given an object like this
{ "color": "#D3A", "title": "The quick brown fox" }
I want the CSS to look like this:
    :root {
      --theme-color: #D3A;
      --theme-title: The quick brown fox;
    }
Although --theme-title is a pretty useless custom property when used in CSS, it doesn't actually break the stylesheet, because CSS ignores properties it doesn't understand.
P粉898107874, 2023-09-07 21:34:20
We could actually just use regular expressions and a few other algorithms, without relying on one specific language; hopefully that is what you need here.
Since the object keys are already restricted to [a-zA-Z0-9_-], only the values need to be parsed somehow.
So we can break the values down into categories and see what we come across (the patterns may be slightly simplified for clarity):
1. '.*' (string surrounded by apostrophes; greedy)
2. ".*" (string enclosed in double quotes; greedy)
3. [+-]?\d+(\.\d+)?(%|[A-z]+)? (integer or decimal number, optionally a percentage or with a unit)
4. #[0-9A-f]{3,6} (colour)
5. [A-z0-9_-]+ (keywords, named colours, "ease-in", etc.)
6. ([\w-]+)\([^)]+\) (functions such as url(), calc(), etc.)
I can imagine you could do some filtering before trying to identify these patterns. Maybe trim the value string first. As you mentioned, < and > can be escaped at the beginning of the preprocessPairs() function, since they do not appear in any of the patterns above. If you don't want unescaped semicolons appearing anywhere, you can escape them as well.
We can then try to identify these patterns within the value, and for each pattern we may need to run the filtering again. We expect the patterns to be separated by one (or two) whitespace characters.
It should be fine to include support for multi-line strings, that is, strings containing an escaped newline.
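As an illustration of the categories listed above, the sketch below classifies a single token. The names categories and classify are hypothetical. It uses [A-Za-z] rather than [A-z], on the assumption that only ASCII letters are wanted, and non-greedy, anchored string patterns that allow an escaped newline.

```javascript
// Hypothetical sketch: one anchored pattern per category, tried in order.
// Order matters: "10px" is reported as a number before the keyword pattern is tried.
const categories = [
  // quoted string; a backslash may escape any character, including a newline
  ["string",   /^'(?:[^'\\\n]|\\[\s\S])*'$|^"(?:[^"\\\n]|\\[\s\S])*"$/],
  ["number",   /^[+-]?\d+(?:\.\d+)?(?:%|[A-Za-z]+)?$/],
  ["color",    /^#[0-9A-Fa-f]{3,6}$/],
  ["keyword",  /^[A-Za-z0-9_-]+$/],
  ["function", /^[\w-]+\([^)]+\)$/],
];

// Returns the name of the first matching category, or null if none matches.
function classify(token) {
  const hit = categories.find(([, pattern]) => pattern.test(token));
  return hit ? hit[0] : null;
}

console.log(classify("#D3A"));            // color
console.log(classify("10px"));            // number
console.log(classify("ease-in"));         // keyword
console.log(classify("calc(1px + 2px)")); // function
console.log(classify("<b>"));             // null
```

A value would then be accepted only if every token in it falls into one of the categories.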
We need to keep in mind that there are at least two contexts to filter for: HTML and CSS. When we include styles in a <style> element, the input must be safe for HTML and it must be valid CSS. Fortunately, you are not putting the CSS into an element's style attribute, so this is slightly easier.
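To illustrate why the HTML context matters on its own, here is a small sketch. The helper name breaksOutOfStyleElement is hypothetical, and the check is deliberately conservative. The raw value below is a perfectly valid CSS string, yet an HTML parser would still end the style element at the first closing style tag it finds inside it.

```javascript
// Hypothetical check for the HTML context: the HTML parser ends a <style>
// element at a closing style tag, whatever the CSS around it looks like.
// Flagging every "</style" is a conservative approximation of that rule.
function breaksOutOfStyleElement(cssText) {
  return /<\/style/i.test(cssText);
}

// A quoted CSS string that would still terminate the style element
const raw = '"</style><script>alert(1)</script>"';
// Same value with < and > replaced by six-digit CSS escapes
const escaped = raw.replace(/</g, "\\00003C").replace(/>/g, "\\00003E");

console.log(breaksOutOfStyleElement(`--theme-title: ${raw};`));     // true
console.log(breaksOutOfStyleElement(`--theme-title: ${escaped};`)); // false
```

This is why < and > are escaped even inside quoted strings, where CSS itself would not care about them.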
So points 1-5 are fairly simple, and most values will be covered by the simple filtering and trimming done up front. With some additions (I don't know the impact on performance) it could even do extra checks for correct units, keywords, etc.
But compared to the other points, I think the bigger challenge is point 6. You might decide to simply disallow url() in these custom styles, which leaves you with checking the input to the remaining functions. For example, you might want to escape semicolons, or even check the function arguments again with a slightly tweaked pattern, for example for calc().
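One way to approach point 6 is an allow-list of function names combined with a second check on the arguments. The sketch below is only an illustration: allowedFunctions and isAllowedFunction are hypothetical names, the list of permitted functions is an assumption, and nested calls such as calc(var(--a) * 2) are rejected for simplicity.

```javascript
// Hypothetical allow-list; which functions to permit is an assumption.
const allowedFunctions = new Set(["calc", "rgb", "rgba", "hsl", "hsla", "var"]);

// Checks one token of the form name(arguments), without nested parentheses.
function isAllowedFunction(token) {
  const match = /^([\w-]+)\(([^()]*)\)$/.exec(token);
  if (match === null) return false; // not a simple function call
  const [, name, args] = match;
  // url() and anything else not on the list is rejected here
  if (!allowedFunctions.has(name.toLowerCase())) return false;
  // Arguments: letters, digits, units, operators, commas and whitespace only
  return /^[A-Za-z0-9_.,%+\-*\/\s]*$/.test(args);
}

console.log(isAllowedFunction("calc(1px + 2px)"));            // true
console.log(isAllowedFunction("url(http://evil.example/x)")); // false
console.log(isAllowedFunction("calc(1px; color: red)"));      // false
```

New CSS functions can then be supported by adding a name to the set rather than changing the patterns.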
Overall, that is my take. With a few tweaks to these regular expressions, it should complement what you're already doing and give you as much flexibility as possible in the CSS you accept, while saving you from having to adjust your code every time a CSS feature changes.
    function preprocessPairs(obj) {
      // Catch-all regular expression
      // Explanation:
      // (                                    Start of alternatives
      //   \w+\(.+?\)|                        1st alternative - function
      //   ".+?(?<!\\)"|                      2nd alternative - string with double quotes
      //   '.+?(?<!\\)'|                      3rd alternative - string with apostrophes
      //   [+-]?\d+(?:\.\d+)?(?:%|[A-z]+)?|   4th alternative - integer/decimal number, optionally per cent or with a unit
      //   #[0-9A-f]{3,6}|                    5th alternative - colour
      //   [A-z0-9_-]+|                       6th alternative - keyword
      //   ''|                                7th alternative - empty string
      //   ""                                 8th alternative - empty string
      // )
      // [\s,]*                               Optional whitespace/comma separators
      // Note: [A-z] and [0-9A-f] also match [ \ ] ^ _ and the backtick;
      // tighten them to [A-Za-z] and [0-9A-Fa-f] if that is not wanted.
      const regexA = /(\w+\(.+?\)|".+?(?<!\\)"|'.+?(?<!\\)'|[+-]?\d+(?:\.\d+)?(?:%|[A-z]+)?|#[0-9A-f]{3,6}|[A-z0-9_-]+|''|"")[\s,]*/g;

      // newObj contains the filtered entries of obj
      const newObj = {};

      // Loop through all object properties
      Object.entries(obj).forEach(([key, value]) => {
        // Replace every <, > and ; with a CSS escape.
        // The backslash is doubled so the string holds a literal "\",
        // and the /g flag replaces all occurrences, not just the first.
        value = value.trim()
          .replace(/</g, '\\00003C')
          .replace(/>/g, '\\00003E')
          .replace(/;/g, '\\00003B');

        // Use catch-all regex to split value into specific elements
        const matches = [...value.matchAll(regexA)];

        // Now try to build back the original value string from regex matches.
        // If these strings are equal, the value is what we expected.
        // Otherwise it contained some unexpected markup or elements and should
        // therefore be discarded.
        // We specifically ignore all occurrences of url() and @import
        let buildBack = '';
        matches.forEach((match) => {
          if (Array.isArray(match) && match.length >= 2 &&
              match[0].match(/url\(.+?\)/gi) === null &&
              match[0].match(/@import/gi) === null) {
            buildBack += match[0];
          }
        });

        console.log('Compare\n');
        console.log(value);
        console.log(buildBack);
        console.log(value === buildBack);

        if (value === buildBack) {
          newObj[key] = value;
        }
      });

      return newObj;
    }
Please comment, discuss, criticize, and let me know if I forgot to touch on a topic that is of particular interest to you.
Disclaimer: I am not the author, owner, investor, or contributor of the sources mentioned below. I just happen to use them to get some information.