I'm developing a headless WordPress website using Nuxt as the frontend.
The website has thousands of articles with shortcodes. I get all the page data via graphql and render the content using v-html and everything is fine, but the shortcode apparently only renders as plain text.
Most of them are very simple shortcodes, so I will create Vue components to replace them
<component :is="someshortcode">
What I need to do is split my html into an array of objects that I can use to render parts of the page into html or into components, depending on what it is.
I think the best way to do this is to use regular expressions, and this is where I'm confused.
Suppose I have the following html and some shortcodes
<h1>Lorem ipsum dolor sit amet</h1> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> [someshortcode attr1="value1" attr2="value2"] <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> [someshortcode attr1="value1" attr2="value2"] <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p>
What I want to do is return an array of objects as shown below
[ { type: 'html', content: `<h1>Lorem ipsum dolor sit amet</h1> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p>` }, { type: 'shortcode', content: `[someshortcode attr1="value1" attr2="value2"]` }, { type: 'html', content: `<h1>Lorem ipsum dolor sit amet</h1> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p>` }, { type: 'shortcode', content: `[someshortcode attr1="value1" attr2="value2"]` }, { type: 'html', content: `<h1>Lorem ipsum dolor sit amet</h1> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p>` }, ]
This is the base of what I need and then I will be able to break down the shortcode further by getting properties etc.
What is the best way to solve this problem? Are regular expressions the best approach?
P粉7148447432023-09-09 09:21:27
You can use a DOM parser and iterate over the top elements of the DOM. If such an element is a text node and has shortcode format, create a separate object for it in the output array, otherwise get iterate over the HTML of the element and accumulate it if it is not a shortcode, and finally output it as an object:
const html = `<h1>Lorem ipsum dolor sit amet</h1> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> [someshortcode attr1="value1" attr2="value2"] <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p> [someshortcode attr1="value1" attr2="value2"] <h2>Lorem ipsum dolor sit amet</h2> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus facilisis elit ante. Vivamus semper dui eget justo viverra facilisis. Etiam ut leo fermentum, sagittis mauris nec, placerat lorem.</p>`; const {body} = new DOMParser().parseFromString(html, 'text/html'); let content = ""; const arr = []; for (const child of [...body.childNodes]) { if (child.nodeType === 3 && child.textContent.trim()[0] == "[") { if (content) arr.push({ type: "html", content }); content = ""; arr.push({ type: "shortcode", content: child.textContent.trim() }); } else { content += (child.outerHTML ?? child.textContent); } } if (content) arr.push({ type: "html", content }); console.log(arr);