How to Split Strings Preserving Delimiters?

Barbara Streisand
2024-10-24 18:19:03

Splitting Strings with Delimiters Preserved

When parsing strings, it is often necessary to split them into their component parts at delimiters. However, String.split() discards whatever its regular expression matches, so the delimiters are lost and only the text between them is returned.

Problem:

Consider the following string:

(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)

Splitting this string on its delimiters with String.split() yields only the text parts:

  • Text1
  • Text2
  • Text3
  • Text4
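
The behavior above can be reproduced with a small sketch. The concrete input and the three delimiter characters (;, , and |) are assumptions standing in for DelimiterA, DelimiterC, and DelimiterB; the class and method names are hypothetical.

```java
import java.util.Arrays;

public class PlainSplitDemo {
    // Splits on any one of the assumed delimiter characters ; , |
    // The matched delimiter is consumed and does not appear in the result.
    static String[] plainSplit(String input) {
        return input.split("[;,|]");
    }

    public static void main(String[] args) {
        String[] parts = plainSplit("Text1;Text2,Text3|Text4");
        System.out.println(Arrays.toString(parts)); // [Text1, Text2, Text3, Text4]
    }
}
```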

Desired Output:

To retain the delimiters, the split must return them as separate elements alongside the text: Text1, DelimiterA, Text2, DelimiterC, Text3, DelimiterB, Text4.

Solution:

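The JDK supports this through regular expression (regex) lookahead and lookbehind. These are zero-width assertions: they match a position in the string rather than any characters, so String.split() has nothing to discard. Here's how it works: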

<code class="java">System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));</code>

This results in the following output:

  • [a;, b;, c;, d]
  • [a, ;b, ;c, ;d]
  • [a, ;, b, ;, c, ;, d]

The last output aligns with the desired format, where each delimiter is retained and the string is split into separate parts.

Regex Explanation:

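  • (?<=;): A lookbehind. It matches the zero-width position immediately after a semicolon, so each semicolon stays at the end of the preceding part.
  • (?=;): A lookahead. It matches the zero-width position immediately before a semicolon, so each semicolon stays at the start of the following part.
  • ((?<=;)|(?=;)): An alternation that matches either position, so the string is split both before and after each semicolon and the semicolon becomes its own element.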

By combining these patterns, we effectively split the string at every delimiter while preserving the delimiter itself as part of the output.
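
The same pattern extends to several delimiters by putting a character class inside each lookaround. The following is a minimal sketch, assuming the delimiters are the single characters ;, , and | (stand-ins for DelimiterA, DelimiterC, and DelimiterB); the class and method names are hypothetical.

```java
import java.util.Arrays;

public class MultiDelimiterSplitDemo {
    // Matches the zero-width position just after or just before any of ; , |
    static final String KEEP_DELIMITERS = "((?<=[;,|])|(?=[;,|]))";

    // Splits the input so that every delimiter becomes its own array element.
    static String[] splitKeepingDelimiters(String input) {
        return input.split(KEEP_DELIMITERS);
    }

    public static void main(String[] args) {
        String[] parts = splitKeepingDelimiters("Text1;Text2,Text3|Text4");
        System.out.println(Arrays.toString(parts)); // [Text1, ;, Text2, ,, Text3, |, Text4]
    }
}
```

Because the lookarounds consume no characters, the parts joined back together reproduce the original string exactly.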

Readability Enhancements:

For improved readability, consider storing the pattern in a named constant with a format placeholder for the delimiter, as follows:

<code class="java">// %1$s is replaced with the delimiter regex by String.format
public static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

public void someMethod() {
    final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
    ...
}</code>

This makes the regular expression more self-explanatory and easier to maintain.
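
One caveat is that the delimiter is inserted into the pattern as regex, so a delimiter such as + or . would be interpreted as a metacharacter. A possible way to handle literal delimiters is to escape them with Pattern.quote first. The sketch below reuses the WITH_DELIMITER constant from above; the class name and the splitKeepingDelimiter helper are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class DelimiterSplitter {
    // %1$s is replaced with the (already quoted) delimiter.
    static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    // Treats literalDelimiter as plain text, not as a regex.
    static String[] splitKeepingDelimiter(String input, String literalDelimiter) {
        String quoted = Pattern.quote(literalDelimiter);
        return input.split(String.format(WITH_DELIMITER, quoted));
    }

    public static void main(String[] args) {
        // Without quoting, "+" alone would be an invalid regex here.
        String[] parts = splitKeepingDelimiter("1+2+3", "+");
        System.out.println(Arrays.toString(parts)); // [1, +, 2, +, 3]
    }
}
```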
