Ruby string


String objects in Ruby are used to store or manipulate sequences of one or more bytes.

Ruby string literals come in two main forms: single-quoted (') and double-quoted ("). The difference is that double-quoted strings support more escape sequences as well as expression interpolation.

Single-quoted strings

The simplest string literal is a single-quoted string, in which the text is placed between single quotes:

'This is a string in a Ruby program'

If you need a single-quote character inside a single-quoted string, precede it with a backslash (\) so that the Ruby interpreter does not treat it as the end of the string:

'Won\'t you read O\'Reilly\'s book?'

A backslash can also escape another backslash, so that the second backslash is not itself interpreted as an escape character.
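As a small illustration of the two escapes that single-quoted strings recognize, here is a minimal sketch. The variable names are made up for this example.

```ruby
# Only \' and \\ are treated as escapes inside single quotes.
quoted  = 'Won\'t you read O\'Reilly\'s book?'
path    = 'C:\\temp'      # \\ becomes a single backslash
literal = 'a\nb'          # \n is NOT an escape here: backslash followed by n

puts quoted               # Won't you read O'Reilly's book?
puts path                 # C:\temp
puts literal.length       # 4
```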

Double-quoted strings

In a double-quoted string, we can use #{} (a pound sign followed by curly braces) to evaluate an expression and insert its value into the string.

Embedding variables in a string:

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

name1 = "Joe"
name2 = "Mary"
puts "Hello #{name1}, where is #{name2}?"

The output of the above example is:

Hello Joe, where is Mary?

Arithmetic expressions in a string:

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

x, y, z = 12, 36, 72
puts "The value of x is #{ x }"
puts "The value of x + y is #{ x + y }"
puts "The average of x, y, and z is #{ (x + y + z)/3 }"

The output of the above example is:

The value of x is 12
The value of x + y is 48
The average of x, y, and z is 40
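To contrast the two quoting styles, the following sketch (with illustrative variable names) shows that interpolation accepts any expression, including method calls, and that it does not happen inside single quotes.

```ruby
name = "Joe"
interpolated = "Hello, #{name.upcase}!"   # the expression is evaluated
literal      = 'Hello, #{name}!'          # single quotes: no interpolation
total        = "1 + 2 = #{1 + 2}"         # non-string values are converted with to_s

puts interpolated   # Hello, JOE!
puts literal        # Hello, #{name}!
puts total          # 1 + 2 = 3
```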

Ruby also supports string literals introduced by %q and %Q. %q follows the single-quote rules, while %Q follows the double-quote rules.

The character immediately after q or Q is the delimiter. It can be any non-alphanumeric single-byte character, for example [, {, (, <, or !. The string is read until the matching terminator is found; for an opening bracket such as ( [ { <, the terminator is the corresponding closing bracket ) ] } >.

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

desc1 = %Q{Ruby strings can use '' and "".}
desc2 = %q|Ruby strings can use '' and "".|

puts desc1
puts desc2

The output of the above example is:

Ruby strings can use '' and "".
Ruby strings can use '' and "".
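The difference between the two forms becomes visible once the string contains an interpolation. A minimal sketch, with variable names chosen only for illustration:

```ruby
lang = "Ruby"
single_style = %q(#{lang} allows (nested) 'quotes')   # single-quote rules: no interpolation
double_style = %Q[#{lang} says "hi"]                   # double-quote rules: interpolation works

puts single_style   # #{lang} allows (nested) 'quotes'
puts double_style   # Ruby says "hi"
```

Bracket-style delimiters may be nested inside the literal as long as they are balanced, which is why the inner parentheses above do not end the string.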

Escape characters

The following table lists the escape sequences and non-printing characters that can be written with a backslash.

Note:

In a double-quoted string, escape sequences are interpreted. In a single-quoted string, they are not interpreted (apart from \' and \\) and are output as is.

Backslash notation   Hexadecimal character   Description
\a                   0x07                    Bell (alert)
\b                   0x08                    Backspace
\cx                                          Control-x
\C-x                                         Control-x
\e                   0x1b                    Escape
\f                   0x0c                    Form feed
\M-\C-x                                      Meta-Control-x
\n                   0x0a                    Newline
\nnn                                         Octal notation, where each n is in the range 0-7
\r                   0x0d                    Carriage return
\s                   0x20                    Space
\t                   0x09                    Tab
\v                   0x0b                    Vertical tab
\x                                           The character x itself
\xnn                                         Hexadecimal notation, where each n is in the range 0-9, a-f, or A-F
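A short sketch of how these escapes behave in practice; it compares the same text in double and single quotes and checks a few entries from the table.

```ruby
double = "a\tb\n"    # tab and newline are interpreted: 4 characters
single = 'a\tb\n'    # backslashes stay literal: 6 characters

puts double.length       # 4
puts single.length       # 6
puts "\x41\101"          # AA  (hex 41 and octal 101 both mean "A")
puts "\e".ord            # 27  (0x1b)
puts "\s" == " "         # true
```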

Character encoding

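In Ruby 1.8, the default character set is ASCII, in which each character is represented by a single byte. In UTF-8 and other modern character sets, a character may be represented by one to four bytes.

In Ruby 1.8 you can change the character set by setting $KCODE at the beginning of your program, as shown below. From Ruby 1.9 onward, $KCODE no longer has any effect: each string carries its own encoding, and since Ruby 2.0 source files are treated as UTF-8 by default.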

$KCODE = 'u'

The following are the possible values for $KCODE.

Encoding   Description
a          ASCII (same as none). This is the default.
e          EUC.
n          None (same as ASCII).
u          UTF-8.
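For readers on Ruby 1.9 or later, where every string carries its own encoding, the following sketch shows the difference between characters and bytes. It assumes the standard methods String#encoding, String#bytesize, and String#force_encoding; the sample text is made up.

```ruby
text = "h\u00e9llo"            # "héllo"; the \u escape yields a UTF-8 string

puts text.encoding             # UTF-8
puts text.length               # 5  (characters)
puts text.bytesize             # 6  ("é" takes two bytes in UTF-8)

# Reinterpret the same bytes as raw binary data.
binary = text.dup.force_encoding(Encoding::BINARY)
puts binary.length             # 6  (each byte now counts as one character)
```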

String built-in methods

To call a String method, we need an instance of a String object. A new instance is created as follows:

String.new(str="")

This returns a new String object containing a copy of str. With that object, we can call any of the available instance methods. For example:

#!/usr/bin/ruby

my_str = String.new("THIS IS TEST")
foo = my_str.downcase

puts foo

This will produce the following results:

this is test
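Because String.new returns a copy, changing the new object leaves the original untouched. A minimal sketch with illustrative variable names:

```ruby
original = "THIS IS TEST"
copy = String.new(original)   # a separate object with the same content

copy << "!"                   # changes the copy only
puts original                 # THIS IS TEST
puts copy                     # THIS IS TEST!
puts String.new.empty?        # true (the default argument is "")
```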

The following are the public string methods (assuming str is a String object):

No.  Method & Description
1 str % arg
Formats a string using a format specification. If the format contains more than one substitution, arg must be an array. For details of the format specification, see sprintf in the Kernel module.
2 str * integer
Returns a new string containing integer copies of str. In other words, str is repeated integer times.
3 str + other_str
Returns a new string with other_str concatenated to str.
4 str << obj
Appends obj to str. If obj is an integer, it is treated as a character code (0..255 in Ruby 1.8) and converted to a character. Compare with concat.
5 str <=> other_str
Compares str with other_str and returns -1 (less than), 0 (equal), or 1 (greater than). The comparison is case-sensitive.
6 str == obj
Tests str and obj for equality. Returns false if obj is not a String; otherwise returns true if str <=> obj returns 0.
7 str =~ obj
Matches str against the regular expression pattern obj. Returns the position where the match starts, or nil if there is no match.
8
9 str.capitalize
Returns a copy of str with the first character converted to uppercase and the remaining characters to lowercase.
10 str.capitalize!
Same as capitalize, but modifies str in place and returns it, or returns nil if no change was made.
11 str.casecmp
Compares two strings, ignoring case.
12 str.center
Centers the string within a given width.
13 str.chomp
Returns a copy of str with the record separator ($/, usually \n) removed from the end. If there is no record separator, the string is returned unchanged.
14 str.chomp!
Same as chomp, but modifies str in place and returns it, or returns nil if nothing was removed.
15 str.chop
Returns a copy of str with the last character removed.
16 str.chop!
Same as chop, but modifies str in place and returns it.
17 str.concat(other_str)
Appends other_str to str.
18 str.count(other_str, ...)
Counts the characters in str that belong to one or more character sets. If several sets are given, the characters in the intersection of those sets are counted.
19 str.crypt(other_str)
Applies a one-way cryptographic hash to str. The argument is a two-character salt string, each character in the range a-z, A-Z, 0-9, ., or /.
20 str.delete(other_str, ...)
Returns a copy of str with all characters in the intersection of its arguments removed.
21 str.delete!(other_str, ...)
Same as delete, but modifies str in place and returns it, or returns nil if str was not modified.
22 str.downcase
Returns a copy of str with all uppercase letters replaced by lowercase letters.
23 str.downcase!
Same as downcase, but modifies str in place and returns it, or returns nil if no change was made.
24 str.dump
Returns a version of str with all non-printing characters replaced by \nnn notation and all special characters escaped.
25 str.each(separator=$/) { |substr| block }
Splits str using the argument as the record separator ($/ by default) and passes each substring to the given block. This method was removed in Ruby 1.9; use each_line instead.
26 str.each_byte { |fixnum| block }
Passes each byte of str to the block as its integer (decimal) value.
27 str.each_line(separator=$/) { |substr| block }
Splits str using the argument as the record separator ($/ by default) and passes each substring to the given block.
28 str.empty?
Returns true if str is empty (that is, its length is 0).
29 str.eql?(other)
Returns true if the two strings have the same length and content.
30 str.gsub(pattern, replacement) [or]
str.gsub(pattern) { |match| block }
Returns a copy of str with all occurrences of pattern replaced by either replacement or the value of the block. pattern is usually a Regexp; if it is a String, no regular expression metacharacters are interpreted (that is, /\d/ matches a digit, but '\d' matches a backslash followed by a 'd').
31 str[fixnum] [or] str[fixnum, fixnum] [or] str[range] [or] str[regexp] [or] str[regexp, fixnum] [or] str[other_str]
References part of str, depending on the arguments. With one integer, returns the character at that index (Ruby 1.8 returned its character code). With two integers, returns the substring that starts at the offset (the first integer) and has the given length (the second integer). With a range, returns the substring within the range. With a regexp, returns the matching portion of the string. With a regexp and an integer, returns the part of the match for that capture group. With other_str, returns other_str if it occurs in str. Negative indexes count from the end of the string, starting at -1.
32 str[fixnum] = fixnum [or] str[fixnum] = new_str [or] str[fixnum, fixnum] = new_str [or] str[range] = aString [or] str[regexp] = new_str [or] str[regexp, fixnum] = new_str [or] str[other_str] = new_str
Replaces all or part of str. The portion replaced is selected in the same way as for str[...] and slice!.
33 str.gsub!(pattern, replacement) [or] str.gsub!(pattern) { |match| block }
Performs the substitutions of String#gsub in place and returns str, or nil if no substitutions were performed.
34 str.hash
Returns a hash value based on the length and content of the string.
35 str.hex
Treats the leading characters of str as a string of hexadecimal digits (with an optional sign and an optional 0x) and returns the corresponding number. Returns zero on error.
36 str.include? other_str [or] str.include? fixnum
Returns true if str contains the given string or character.
37 str.index(substring [, offset]) [or]
str.index(fixnum [, offset]) [or]
str.index(regexp [, offset])
Returns the index of the first occurrence of the given substring, character (fixnum), or pattern (regexp) in str. Returns nil if not found. If the second argument is supplied, it specifies the position in the string at which to begin the search.
38 str.insert(index, other_str)
Inserts other_str before the character at the given index, modifying str. Negative indexes count from the end of the string and insert after the given character. The intent is to insert a string so that it starts at the given index.
39 str.inspect
Returns a printable version of str, with special characters escaped.
40 str.intern [or] str.to_sym
Returns the Symbol corresponding to str, creating the symbol if it did not previously exist.
41 str.length
Returns the length of str. Compare with size.
42 str.ljust(integer, padstr=' ')
If integer is greater than the length of str, returns a new string of length integer with str left-justified and padded with padstr; otherwise, returns str.
43 str.lstrip
Returns a copy of str with leading whitespace removed.
44 str.lstrip!
Removes leading whitespace from str; returns nil if there was no change.
45 str.match(pattern)
Converts pattern to a Regexp (if it is not one already), then invokes its match method on str.
46 str.oct
Treats the leading characters of str as a string of octal digits (with an optional sign) and returns the corresponding number. Returns 0 if the conversion fails.
47 str.replace(other_str)
Replaces the contents of str with the contents of other_str.
48 str.reverse
Returns a new string with the characters of str in reverse order.
49 str.reverse!
Reverses str in place and returns it.
50 str.rindex(substring [, fixnum]) [or]
str.rindex(fixnum [, fixnum]) [or]
str.rindex(regexp [, fixnum])
Returns the index of the last occurrence of the given substring, character (fixnum), or pattern (regexp) in str. Returns nil if not found. If the second argument is supplied, it specifies the position in the string at which to end the search; characters beyond this point are not considered.
51 str.rjust(integer, padstr=' ')
If integer is greater than the length of str, returns a new string of length integer with str right-justified and padded with padstr; otherwise, returns str.
52 str.rstrip
Returns a copy of str with trailing whitespace removed.
53 str.rstrip!
Removes trailing whitespace from str; returns nil if there was no change.
54 str.scan(pattern) [or]
str.scan(pattern) { |match, ...| block }
Both forms iterate through str, matching pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block. If pattern contains no groups, each individual result consists of the matched string, $&. If pattern contains groups, each individual result is an array containing one entry per group.
55 str.slice(fixnum) [or] str.slice(fixnum, fixnum) [or]
str.slice(range) [or] str.slice(regexp) [or]
str.slice(regexp, fixnum) [or] str.slice(other_str)
See str[fixnum], etc.
str.slice!(fixnum) [or] str.slice!(fixnum, fixnum) [or] str.slice!(range) [or] str.slice!(regexp) [or] str.slice!(other_str)
Deletes the specified portion from str and returns the deleted portion, or nil if nothing matches or the index or range is out of bounds.
56 str.split(pattern=$;, [limit])
Divides str into substrings based on a delimiter and returns an array of these substrings.

If pattern is a String, its contents are used as the delimiter when splitting str. If pattern is a single space, str is split on whitespace, with leading whitespace and runs of consecutive whitespace characters ignored.

If pattern is a Regexp, str is divided where the pattern matches. Whenever the pattern matches a zero-length string, str is split into individual characters.

If pattern is omitted, the value of $; is used. If $; is nil (the default), str is split on whitespace, as if ' ' had been specified.

If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that number of fields is returned (if limit is 1, the entire string is returned as the only entry in the array). If limit is negative, there is no limit on the number of fields returned, and trailing null fields are not suppressed.

57 str.squeeze([other_str]*)
Builds a set of characters from the other_str arguments using the procedure described for String#count. Returns a new string in which runs of the same character that occur in this set are replaced by a single character. If no arguments are given, all runs of identical characters are replaced by a single character.
58 str.squeeze!([other_str]*)
Same as squeeze, but modifies str in place and returns it, or returns nil if no changes were made.
59 str.strip
Returns a copy of str with leading and trailing whitespace removed.
60 str.strip!
Removes leading and trailing whitespace from str; returns nil if there was no change.
61 str.sub(pattern, replacement) [or]
str.sub(pattern) { |match| block }
Returns a copy of str with the first occurrence of pattern replaced by either replacement or the value of the block. pattern is usually a Regexp; if it is a String, no regular expression metacharacters are interpreted.
62 str.sub!(pattern, replacement) [or]
str.sub!(pattern) { |match| block }
Performs the substitution of String#sub in place and returns str, or nil if no substitution was performed.
63 str.succ [or] str.next
Returns the successor of str.
64 str.succ! [or] str.next!
Equivalent to String#succ, but modifies str in place and returns it.
65 str.sum(n=16)
Returns a basic n-bit checksum of the characters in str, where n is an optional integer parameter that defaults to 16. The result is simply the sum of the binary values of each character in str, modulo 2**n - 1. This is not a particularly good checksum.
66 str.swapcase
Returns a copy of str with uppercase letters converted to lowercase and lowercase letters converted to uppercase.
67 str.swapcase!
Equivalent to String#swapcase, but modifies str in place and returns it, or returns nil if no changes were made.
68 str.to_f
Returns the result of interpreting the leading characters of str as a floating-point number. Extraneous characters past the end of a valid number are ignored. If there is no valid number at the start of str, 0.0 is returned. This method never raises an exception.
69 str.to_i(base=10)
Returns the result of interpreting the leading characters of str as an integer in the given base (2, 8, 10, or 16). Extraneous characters past the end of a valid number are ignored. If there is no valid number at the start of str, 0 is returned. This method never raises an exception.
70 str.to_s [or] str.to_str
Returns the receiver.
71 str.tr(from_str, to_str)
Returns a copy of str with the characters in from_str replaced by the corresponding characters in to_str. If to_str is shorter than from_str, it is padded with its last character. Both strings may use the c1-c2 notation to denote ranges of characters. If from_str starts with ^, it denotes all characters except those listed.
72 str.tr!(from_str, to_str)
Equivalent to String#tr, but modifies str in place and returns it, or returns nil if no changes were made.
73 str.tr_s(from_str, to_str)
Processes a copy of str according to the rules described under String#tr, then removes duplicate characters in regions that were affected by the translation.
74 str.tr_s!(from_str, to_str)
Equivalent to String#tr_s, but modifies str in place and returns it, or returns nil if no changes were made.
75 str.unpack(format)
Decodes str (which may contain binary data) according to the format string and returns an array of the extracted values. The format string consists of a sequence of single-character directives. Each directive may be followed by a number indicating how many times to repeat it. An asterisk (*) uses up all remaining elements. The directives sSiIlL may each be followed by an underscore (_) to use the underlying platform's native size for the specified type; otherwise, a platform-independent consistent size is used. Spaces in the format string are ignored.
76 str.upcase
Returns a copy of str with all lowercase letters replaced by uppercase letters. The operation is locale-insensitive; only the characters a through z are affected.
77 str.upcase!
Converts the contents of str to uppercase; returns nil if no changes were made.
78 str.upto(other_str) { |s| block }
Iterates through successive values, starting at str and ending at other_str (inclusive), passing each value in turn to the block. The String#succ method is used to generate each value.
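To make a few of the entries above concrete, here is a short sketch that exercises some of the more common methods. The sample strings and variable names are illustrative only, and the comments show the value each expression returns.

```ruby
line = "  Hello, World\n"

# Formatting, repetition, comparison, matching
formatted = "%.2f %s" % [3.14159, "pi"]   # "3.14 pi"
repeated  = "ab" * 3                       # "ababab"
order     = "a" <=> "b"                    # -1
position  = "hello" =~ /ll/                # 2
no_match  = "hello" =~ /zz/                # nil, not false

# Cleaning up and transforming
clean  = line.chomp.strip                  # "Hello, World"
zeros  = clean.gsub(/o/, "0")              # "Hell0, W0rld"
first  = clean.sub("l", "L")               # "HeLlo, World"
parts  = clean.split(", ")                 # ["Hello", "World"]
hippo  = "hello".tr("el", "ip")            # "hippo"
packed = "aaabbb".squeeze                  # "ab"
count  = "hello world".count("lo")         # 5

# Bang methods modify the receiver and may return nil
word    = String.new("abc")
changed = word.upcase!                     # "ABC"
same    = word.upcase!                     # nil (already uppercase)

puts formatted, repeated, order, position
p no_match, parts, changed, same
puts clean, zeros, first, hippo, packed, count
```

Because many bang methods return nil when nothing changes, it is usually safer not to chain further calls onto their return value.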

String unpack directives

The following table lists the directives for the method String#unpack.

Directive  Returns  Description
A          String   With trailing nulls and spaces removed.
a          String   String.
B          String   Extracts bits from each character (most significant bit first).
b          String   Extracts bits from each character (least significant bit first).
C          Integer  Extracts a character as an unsigned integer.
c          Integer  Extracts a character as an integer.
D, d       Float    Treats sizeof(double) characters as a native double.
E          Float    Treats sizeof(double) characters as a double in little-endian byte order.
e          Float    Treats sizeof(float) characters as a float in little-endian byte order.
F, f       Float    Treats sizeof(float) characters as a native float.
G          Float    Treats sizeof(double) characters as a double in network byte order.
g          Float    Treats sizeof(float) characters as a float in network byte order.
H          String   Extracts hex nibbles from each character (most significant first).
h          String   Extracts hex nibbles from each character (least significant first).
I          Integer  Treats sizeof(int) (modified by _) successive characters as an unsigned native integer.
i          Integer  Treats sizeof(int) (modified by _) successive characters as a signed native integer.
L          Integer  Treats four (modified by _) successive characters as an unsigned native long integer.
l          Integer  Treats four (modified by _) successive characters as a signed native long integer.
M          String   Quoted-printable.
m          String   Base64-encoded.
N          Integer  Treats four characters as an unsigned long in network byte order.
n          Integer  Treats two characters as an unsigned short in network byte order.
P          String   Treats sizeof(char *) characters as a pointer and returns len characters from the referenced location.
p          String   Treats sizeof(char *) characters as a pointer to a null-terminated string.
Q          Integer  Treats eight characters as an unsigned quad word (64 bits).
q          Integer  Treats eight characters as a signed quad word (64 bits).
S          Integer  Treats two (different if _ is used) successive characters as an unsigned short in native byte order.
s          Integer  Treats two (different if _ is used) successive characters as a signed short in native byte order.
U          Integer  UTF-8 characters as unsigned integers.
u          String   UU-encoded.
V          Integer  Treats four characters as an unsigned long in little-endian byte order.
v          Integer  Treats two characters as an unsigned short in little-endian byte order.
w          Integer  BER-compressed integer.
X                   Skips backward one character.
x                   Skips forward one character.
Z          String   With trailing nulls removed up to the first null when used with *.
@                   Skips to the offset given by the length argument.
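As a sketch of how directives combine in a format string, the following decodes a made-up 6-byte record. The record layout and variable names are hypothetical, and String#b (available from Ruby 2.0) is used here only to mark the data as binary.

```ruby
# Hypothetical 6-byte record: a 2-byte type and a 4-byte length,
# both unsigned and in network (big-endian) byte order.
record = "\x00\x01\x00\x00\x01\x00".b

type, length = record.unpack("nN")   # n = 2 bytes, N = 4 bytes
puts type                  # 1
puts length                # 256

p "AB".unpack("C*")        # [65, 66]
p "abc  ".unpack("A*")     # ["abc"]
```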

Examples

Try the following examples to unpack various data.

"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]