Defined in header `<stdio.h>` | | |
---|---|---|
`int scanf( const char *format, ... );` | (1) | (until C99) |
`int scanf( const char *restrict format, ... );` | (1) | (since C99) |
`int fscanf( FILE *stream, const char *format, ... );` | (2) | (until C99) |
`int fscanf( FILE *restrict stream, const char *restrict format, ... );` | (2) | (since C99) |
`int sscanf( const char *buffer, const char *format, ... );` | (3) | (until C99) |
`int sscanf( const char *restrict buffer, const char *restrict format, ... );` | (3) | (since C99) |
`int scanf_s( const char *restrict format, ... );` | (4) | (since C11) |
`int fscanf_s( FILE *restrict stream, const char *restrict format, ... );` | (5) | (since C11) |
`int sscanf_s( const char *restrict buffer, const char *restrict format, ... );` | (6) | (since C11) |

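Reads data from a variety of sources, interprets it according to `format`, and stores the results into the given locations.

1) Reads the data from `stdin`.
2) Reads the data from the file stream `stream`.
3) Reads the data from the null-terminated character string `buffer`. Reaching the end of the string is equivalent to reaching the end-of-file condition for `fscanf`.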
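4-6) Same as (1-3), except that the `%c`, `%s`, and `%[` conversion specifiers each expect two arguments (the usual pointer and a value of type `rsize_t` indicating the size of the receiving array, which may be 1 when reading with `%c` into a single char), and except that the following errors are detected at runtime and call the currently installed constraint handler function:

- any of the arguments of pointer type is a null pointer
- `format`, `stream`, or `buffer` is a null pointer
- the number of characters that would be written by `%c`, `%s`, or `%[`, plus the terminating null character, would exceed the second (`rsize_t`) argument provided for each of those conversion specifiers
- optionally, any other detectable error, such as an unknown conversion specifier
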
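As with all bounds-checked functions, `scanf_s`, `fscanf_s`, and `sscanf_s` are only guaranteed to be available if `__STDC_LIB_EXT1__` is defined by the implementation and if the user defines `__STDC_WANT_LIB_EXT1__` to the integer constant `1` before including `<stdio.h>`.
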
stream | - | input file stream to read from |
---|---|---|
buffer | - | pointer to a null-terminated character string to read from |
format | - | pointer to a null-terminated character string specifying how to read the input |
... | - | receiving arguments |

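The `format` string consists of:

- non-whitespace multibyte characters except `%`: each such character in the format string consumes exactly one identical character from the input stream, or causes the function to fail if the next character on the stream does not compare equal.
- whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from the input (determined as if by calling `isspace` in a loop). Note that there is no difference between `"\n"`, `" "`, `"\t\t"`, or other whitespace in the format string.
- conversion specifications. Each conversion specification has the following format:
  - introductory `%` character
  - (optional) assignment-suppressing character `*`. If this option is present, the function does not assign the result of the conversion to any receiving argument.
  - (optional) integer number (greater than zero) that specifies the maximum field width, that is, the maximum number of characters that the function is allowed to consume when doing the conversion specified by the current conversion specification. Note that `%s` and `%[` may lead to buffer overflow if the width is not provided.
  - (optional) length modifier that specifies the size of the receiving argument, that is, the actual destination type. This affects the conversion accuracy and overflow rules. The default destination type is different for each conversion type (see table below).
  - conversion format specifier
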
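The following format specifiers are available. The last nine columns give the expected argument type for each length modifier.

Conversion specifier | Explanation | hh (C99) | h | (none) | l | ll (C99) | j (C99) | z (C99) | t (C99) | L |
---|---|---|---|---|---|---|---|---|---|---|
`%` | Matches literal `%`. | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
`c` | Matches a character or a sequence of characters. If a width specifier is used, matches exactly _width_ characters (the argument must be a pointer to an array with sufficient room). Unlike `%s` and `%[`, does not append the null character to the array. | N/A | N/A | `char*` | `wchar_t*` | N/A | N/A | N/A | N/A | N/A |
`s` | Matches a sequence of non-whitespace characters (a string). If a width specifier is used, matches up to _width_ characters or until the first whitespace character, whichever appears first. Always stores a null character in addition to the characters matched (so the argument array must have room for at least _width+1_ characters). | N/A | N/A | `char*` | `wchar_t*` | N/A | N/A | N/A | N/A | N/A |
`[`set`]` | Matches a non-empty sequence of characters from the set of characters. If the first character of the set is `^`, then all characters not in the set are matched. If the set begins with `]` or `^]`, then the `]` character is also included in the set. It is implementation-defined whether the character `-` in a non-initial position in the scanset may indicate a range, as in `[0-9]`. If a width specifier is used, matches only up to _width_. Always stores a null character in addition to the characters matched (so the argument array must have room for at least _width+1_ characters). | N/A | N/A | `char*` | `wchar_t*` | N/A | N/A | N/A | N/A | N/A |
`d` | Matches a **decimal integer**. The format of the number is the same as expected by `strtol()` with the value `10` for the `base` argument. | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`i` | Matches an **integer**. The format of the number is the same as expected by `strtol()` with the value `0` for the `base` argument (the base is determined by the first characters parsed). | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`u` | Matches an unsigned **decimal integer**. The format of the number is the same as expected by `strtoul()` with the value `10` for the `base` argument. | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`o` | Matches an unsigned **octal integer**. The format of the number is the same as expected by `strtoul()` with the value `8` for the `base` argument. | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`x`, `X` | Matches an unsigned **hexadecimal integer**. The format of the number is the same as expected by `strtoul()` with the value `16` for the `base` argument. | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`n` | Returns the **number of characters read so far**. No input is consumed. Does not increment the assignment count. If the specifier has the assignment-suppressing operator defined, the behavior is undefined. | `signed char*` or `unsigned char*` | `signed short*` or `unsigned short*` | `signed int*` or `unsigned int*` | `signed long*` or `unsigned long*` | `signed long long*` or `unsigned long long*` | `intmax_t*` or `uintmax_t*` | `size_t*` | `ptrdiff_t*` | N/A |
`a`, `A` (C99), `e`, `E`, `f`, `F`, `g`, `G` | Matches a **floating-point number**. The format of the number is the same as expected by `strtof()`. | N/A | N/A | `float*` | `double*` | N/A | N/A | N/A | N/A | `long double*` |
`p` | Matches an implementation-defined character sequence defining a **pointer**. The `printf` family of functions should produce the same sequence using the `%p` format specifier. | N/A | N/A | `void**` | N/A | N/A | N/A | N/A | N/A | N/A |
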
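For every conversion specifier other than `n`, the longest sequence of input characters which does not exceed any specified field width and which either is exactly what the conversion specifier expects or is a prefix of a sequence it would expect, is what is consumed from the stream. The first character, if any, after this consumed sequence remains unread. If the consumed sequence has length zero or if the consumed sequence cannot be converted as specified above, a matching failure occurs unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
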
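All conversion specifiers other than `[`, `c`, and `n` consume and discard all leading whitespace characters (determined as if by calling `isspace`) before attempting to parse the input. These consumed characters do not count towards the specified maximum field width.
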
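The conversion specifiers `lc`, `ls`, and `l[` perform multibyte-to-wide character conversion as if by calling `mbrtowc()` with an `mbstate_t` object initialized to zero before the first character is converted.
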
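The conversion specifiers `s` and `[` always store the null terminator in addition to the matched characters. The size of the destination array must be at least one greater than the specified field width. The use of `%s` or `%[`, without specifying the destination array size, is as unsafe as `gets`.
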
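The correct conversion specifications for the fixed-width integer types (`int8_t`, etc) are defined in the header `<inttypes.h>` (although `SCNdMAX`, `SCNuMAX`, etc are synonymous with `%jd`, `%ju`, etc).
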
There is a sequence point after the action of each conversion specifier; this permits storing multiple fields in the same "sink" variable.

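When parsing an incomplete floating-point value that ends in the exponent with no digits, such as parsing `"100er"` with the conversion specifier `%f`, the sequence `"100e"` (the longest prefix of a possibly valid floating-point number) is consumed, resulting in a matching error (the consumed sequence cannot be converted to a floating-point number), with `"r"` remaining. Existing implementations do not follow this rule and roll back to consume only `"100"`, leaving `"er"`, e.g. glibc bug 1765.
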
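1-3) Returns the number of receiving arguments successfully assigned (which may be zero if a matching failure occurred before the first receiving argument was assigned), or `EOF` if an input failure occurs before the first receiving argument was assigned.
4-6) Same as (1-3), except that `EOF` is also returned if there is a runtime constraint violation.
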
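Because most conversion specifiers first consume all consecutive whitespace, code such as

    scanf("%d", &a);
    scanf("%d", &b);

will read two integers that are entered on different lines (the second `%d` will consume the newline left over by the first) or on the same line, separated by spaces or tabs (the second `%d` will consume the spaces or tabs). The conversion specifiers that do not consume leading whitespace, such as `%c`, can be made to do so by using a whitespace character in the format string:

    scanf("%d", &a);
    scanf(" %c", &c); // ignore the endline after %d, then read a char
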
    #define __STDC_WANT_LIB_EXT1__ 1
    #include <stdio.h>
    #include <stddef.h>
    #include <locale.h>

    int main(void)
    {
        int i, j;
        float x, y;
        char str1[10], str2[4];
        wchar_t warr[2];
        setlocale(LC_ALL, "en_US.utf8");

        char input[] = "25 54.32E-1 Thompson 56789 0123 56ß水";
        /* parse as follows:
           %d: an integer
           %f: a floating-point value
           %9s: a string of at most 9 non-whitespace characters
           %2d: two-digit integer (digits 5 and 6)
           %f: a floating-point value (digits 7, 8, 9)
           %*d: an integer which isn't stored anywhere
           ' ': all consecutive whitespace
           %3[0-9]: a string of at most 3 decimal digits (digits 5 and 6)
           %2lc: two wide characters, using multibyte to wide conversion */
        int ret = sscanf(input, "%d%f%9s%2d%f%*d %3[0-9]%2lc",
                         &i, &x, str1, &j, &y, str2, warr);

        /* %x expects unsigned int, so the wide characters are cast explicitly */
        printf("Converted %d fields:\ni = %d\nx = %f\nstr1 = %s\n"
               "j = %d\ny = %f\nstr2 = %s\n"
               "warr[0] = U+%x warr[1] = U+%x\n",
               ret, i, x, str1, j, y, str2,
               (unsigned)warr[0], (unsigned)warr[1]);

    #ifdef __STDC_LIB_EXT1__
        int n = sscanf_s(input, "%d%f%s", &i, &x, str1, (rsize_t)sizeof str1);
        // writes 25 to i, 5.432 to x, the 9 bytes "Thompson\0" to str1, and 3 to n.
    #endif
    }

Output:

    Converted 7 fields:
    i = 25
    x = 5.432000
    str1 = Thompson
    j = 56
    y = 789.000000
    str2 = 56
    warr[0] = U+df warr[1] = U+6c34

C11 standard (ISO/IEC 9899:2011):

- 7.21.6.2 The fscanf function (p: 317-324)
- 7.21.6.4 The scanf function (p: 325)
- 7.21.6.7 The sscanf function (p: 326)
- K.3.5.3.2 The fscanf_s function (p: 592-593)
- K.3.5.3.4 The scanf_s function (p: 594)
- K.3.5.3.7 The sscanf_s function (p: 596)

C99 standard (ISO/IEC 9899:1999):

- 7.19.6.2 The fscanf function (p: 282-289)
- 7.19.6.4 The scanf function (p: 290)
- 7.19.6.7 The sscanf function (p: 291)

C89/C90 standard (ISO/IEC 9899:1990):

- 4.9.6.2 The fscanf function
- 4.9.6.4 The scanf function
- 4.9.6.6 The sscanf function