Golang Regex Boundary Issue with Non-ASCII Characters
In Go, the b boundary option is expected to match at the boundary of ASCII characters, excluding accented characters such as é. This behavior can lead to unexpected results when working with strings containing non-ASCII characters. For instance, consider the following code:
<code class="go">package main import ( "fmt" "regexp" ) func main() { r, _ := regexp.Compile(`\b(vis)\b`) fmt.Println(r.MatchString("re vis e")) // True fmt.Println(r.MatchString("revise")) // False fmt.Println(r.MatchString("révisé")) // True }</code>
In this example, the b(vis)b regex matches the substring "vis" at word boundaries. However, when applied to "révisé", it incorrectly returns True because é is not considered a word character. To address this issue, you can employ an alternative approach:
<code class="go">r, _ := regexp.Compile(`(?:\A|\s)(vis)(?:\s|\z)`) fmt.Println(r.MatchString("vis")) // True fmt.Println(r.MatchString("re vis e")) // True fmt.Println(r.MatchString("revise")) // False fmt.Println(r.MatchString("révisé")) // False</code>
This solution utilizes a non-capturing group (?:A|s)(vis)(?:s|z) to match any of the following characters:
- Start of string (A)
- Whitespace (s)
This mimics the behavior of b but includes non-ASCII characters as potential word boundaries. By combining these components, it successfully matches "vis" at the beginning or end of a word, regardless of the surrounding characters.
The above is the detailed content of Why Does Go\'s Regex \\b Boundary Fail with Non-ASCII Characters?. For more information, please follow other related articles on the PHP Chinese website!

The article explains how to use the pprof tool for analyzing Go performance, including enabling profiling, collecting data, and identifying common bottlenecks like CPU and memory issues.Character count: 159

OpenSSL, as an open source library widely used in secure communications, provides encryption algorithms, keys and certificate management functions. However, there are some known security vulnerabilities in its historical version, some of which are extremely harmful. This article will focus on common vulnerabilities and response measures for OpenSSL in Debian systems. DebianOpenSSL known vulnerabilities: OpenSSL has experienced several serious vulnerabilities, such as: Heart Bleeding Vulnerability (CVE-2014-0160): This vulnerability affects OpenSSL 1.0.1 to 1.0.1f and 1.0.2 to 1.0.2 beta versions. An attacker can use this vulnerability to unauthorized read sensitive information on the server, including encryption keys, etc.

The article discusses writing unit tests in Go, covering best practices, mocking techniques, and tools for efficient test management.

This article demonstrates creating mocks and stubs in Go for unit testing. It emphasizes using interfaces, provides examples of mock implementations, and discusses best practices like keeping mocks focused and using assertion libraries. The articl

This article explores Go's custom type constraints for generics. It details how interfaces define minimum type requirements for generic functions, improving type safety and code reusability. The article also discusses limitations and best practices

The article discusses Go's reflect package, used for runtime manipulation of code, beneficial for serialization, generic programming, and more. It warns of performance costs like slower execution and higher memory use, advising judicious use and best

This article explores using tracing tools to analyze Go application execution flow. It discusses manual and automatic instrumentation techniques, comparing tools like Jaeger, Zipkin, and OpenTelemetry, and highlighting effective data visualization

The article discusses using table-driven tests in Go, a method that uses a table of test cases to test functions with multiple inputs and outcomes. It highlights benefits like improved readability, reduced duplication, scalability, consistency, and a


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

EditPlus Chinese cracked version
Small size, syntax highlighting, does not support code prompt function

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

Zend Studio 13.0.1
Powerful PHP integrated development environment

Atom editor mac version download
The most popular open source editor

SublimeText3 Chinese version
Chinese version, very easy to use