
PHP: Counting and Truncating Chinese Strings

WBOY · 2016-06-06 19:46:06

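While reviewing the basics over the past few days, I reached the chapter on strings and found an exercise: paginate an article!

So I decided to write it myself. I thought it would be easy, but it ended up taking two days. I kept looking at this and fiddling with that, never really focused... // a bit of self-criticism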

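Fortunately, I finally got something working. The catch is that the caller has to pass in the current character encoding o(╯□╰)o. The solutions I found online all detect Chinese characters by checking which hexadecimal byte-value range a byte falls into for the given encoding... // I admit my skills aren't up to that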
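For reference, here is roughly what that byte-range approach looks like for UTF-8. This is only a minimal sketch, not the code from this post: the helper names `utf8_char_len` and `utf8_count` are made up for illustration, and the input is assumed to be well-formed UTF-8.

```php
<?php
/**
 * Hypothetical helper (not part of the original post): given the first byte
 * of a UTF-8 character, return how many bytes that character occupies.
 * Assumes well-formed UTF-8 input.
 */
function utf8_char_len(string $byte): int
{
    $o = ord($byte);               // ord() looks at the first byte only
    if ($o < 0x80)  { return 1; }  // 0x00-0x7F: ASCII
    if ($o >= 0xF0) { return 4; }  // 0xF0-0xF7: lead byte of a 4-byte sequence
    if ($o >= 0xE0) { return 3; }  // 0xE0-0xEF: 3-byte sequence (most Chinese characters)
    if ($o >= 0xC0) { return 2; }  // 0xC0-0xDF: 2-byte sequence
    return 1;                      // 0x80-0xBF: stray continuation byte, counted as one unit
}

/** Count characters by hopping from one lead byte to the next. */
function utf8_count(string $s): int
{
    $n = 0;
    for ($i = 0, $len = strlen($s); $i < $len; $i += utf8_char_len($s[$i])) {
        $n++;
    }
    return $n;
}

echo utf8_count('abc汉字'), "\n"; // prints 5
```

With this kind of check the caller would not need to pass the encoding in, but only as long as the text really is UTF-8; GBK uses different byte ranges and would need its own table.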

The code is below. // I came to blog about it as soon as I finished, to consolidate it and make it stick :-D

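Note: every English letter, Chinese character, or special character is treated as exactly one character here, no matter how many bytes it occupies.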
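To make the bytes-versus-characters difference concrete, here is a small sketch. It assumes the source file is saved as UTF-8 and that PCRE was built with UTF-8 support (the usual case); the `mb_strlen()` line only runs when the mbstring extension happens to be loaded.

```php
<?php
$s = 'PHP中文';

// strlen() counts bytes: 3 ASCII bytes + 2 Chinese characters x 3 bytes each.
echo strlen($s), "\n"; // prints 9

// With the /u modifier, "." matches one whole UTF-8 character instead of one byte,
// so the number of matches is the number of characters.
echo preg_match_all('/./us', $s), "\n"; // prints 5

// When mbstring is available, mb_strlen() gives the same character count.
if (function_exists('mb_strlen')) {
    echo mb_strlen($s, 'UTF-8'), "\n"; // prints 5
}
```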

<?php
header("content-type:text/html; charset=utf-8");
echo '<pre>';
/**
 *    Count the characters in a string; every character counts as length 1.
 *        Supports GBK and UTF-8.
 *    Similar to mb_strlen().
 *    @author 谭宁宁
 *    @time 2012-08-05
 */
if (!function_exists('strcount'))
{
    function strcount($string, $char = 'utf8')
    {
        $count = strlen($string);
        $i     = 0;    // current byte offset
        $j     = 0;    // running character count
        while ($i < $count)
        {
            // English letters and half-width special characters (single-byte ASCII)
            if (ord($string[$i]) <= 127)
            {    $charset = 'en';    }
            // Chinese characters and full-width characters
            else
            {    $charset = $char;    }

            switch (strtolower($charset))
            {
                case 'gb2312':
                case 'gbk':
                    $i += 1;    // 2 bytes per character: skip 1 extra byte
                    break;
                case 'utf-8':
                case 'utf8':
                    $i += 2;    // assumes 3 bytes per character: skip 2 extra bytes
                    break;
                case 'en':
                default:
                    break;
            }
            $j++;
            $i++;
        }
        return $j;
    }
}
else
{    echo '<p>fun strcount exists!</p>';    }

/**
 *    Custom substring function, for when mb_substr() is not enabled.
 *    The character set of the Chinese text is taken from the caller-supplied $char.
 *    @param string $string the string to cut
 *    @param int $start index of the first character to return
 *    @param int $offset how many characters to return, starting at $start
 *    @param string $char character encoding of the text, supplied by the caller
 *    @author 谭宁宁
 *    @time 2012-08-05
 */
if (!function_exists('strsub'))
{
    function strsub($string, $start = 0, $offset = 0, $char = 'utf8')
    {
        $count = strlen($string);
        $rs    = '';
        $i     = 0;    // running byte offset
        $j     = 0;    // running character count
        $size  = 1;    // byte width of the current character (GBK and UTF-8 differ)
        while ($i < $count)
        {
            // English letters and half-width special characters (single-byte ASCII)
            if (ord($string[$i]) <= 127)
            {    $charset = 'en';    }
            // Chinese characters and full-width characters
            else
            {    $charset = $char;    }

            switch (strtolower($charset))
            {
                case 'gb2312':
                case 'gbk':
                    $i    += 1;
                    $size  = 2;
                    break;
                case 'utf-8':
                case 'utf8':
                    $i    += 2;    // assumes 3 bytes per character
                    $size  = 3;
                    break;
                case 'en':
                default:
                    $size  = 1;
                    break;
            }

            // $i now points at the last byte of the current character,
            // so the character starts at $i - $size + 1.
            if ($j < intval($start + $offset) && $j >= $start)
            {
                $tstart = intval($i - $size) + 1;
                $rs    .= substr($string, $tstart, $size);
            }
            $j++;
            $i++;
        }
        return $rs;
    }
}
else
{    echo '<p>fun strsub exists!</p>';    }

/*$string    = '123456789汉字胡总温中文啊abcdefghijklmn·=-';
echo 'substr():',substr($string, 9, 3),'<br>';
echo 'Length: ',strcount($string),'<br>';
echo 'Truncation test: ',strsub($string, 0, 11),'<br>';*/

$fileContent = file_exists('reg.txt') ? file_get_contents('reg.txt') : '';

$count     = strcount($fileContent);
$page      = isset($_GET['p']) ? max(1, (int) $_GET['p']) : 1;    // current page number, default 1
$pagesize  = 350;    // characters per page
$pagecount = (int) ceil($count / $pagesize);    // total pages; a partial last page still counts as a page
$start     = ($page <= 1) ? 0 : ($page - 1) * $pagesize;

$fileContent = strsub($fileContent, $start, $pagesize, 'utf8');
?>

<header>
<style type="text/css">
p
{    margin: 10px; word-wrap: break-word; border: #000 1px solid; padding: 5px;    }
p a
{    margin: 5px;    }
</style>
</header>

<p><?php echo $fileContent; ?></p>

<p>
<?php
echo "Total characters: $count / $pagesize per page  ";
echo " $pagecount pages in total / currently on page $page";

if ($page <= 1)
{
    // Already on the first page: show the labels without links
    echo '<a>First</a>';
    echo '<a>Previous</a>';
}
else
{
    $up = $page - 1;
    echo "<a href='/contentpage.php?p=1'>First</a>";
    echo "<a href='/contentpage.php?p=$up'>Previous</a>";
}

if ($page >= $pagecount)
{
    // Already on the last page: show the labels without links
    echo '<a>Next</a>';
    echo '<a>Last</a>';
}
else
{
    $down = $page + 1;
    echo "<a href='/contentpage.php?p=$down'>Next</a>";
    echo "<a href='/contentpage.php?p=$pagecount'>Last</a>";
}
?>
</p>
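The pagination arithmetic in the listing can also be condensed into one small function. The following is only a sketch under a few assumptions, not a replacement for the code in the post: the name `paginate_text` and its return keys are made up for illustration, the text is assumed to be UTF-8, `$pageSize` is assumed to be greater than zero, and characters are split with PCRE's `/u` modifier rather than with `strsub()`.

```php
<?php
/**
 * Hypothetical helper (not from the original post): return one page of a
 * UTF-8 text together with the clamped page number and the page count.
 * Assumes $pageSize > 0.
 */
function paginate_text(string $text, int $page, int $pageSize): array
{
    // Split into whole characters; /u makes PCRE treat the subject as UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        $chars = [];    // invalid UTF-8: fall back to an empty text
    }

    $total     = count($chars);
    $pageCount = max(1, (int) ceil($total / $pageSize));    // a partial last page still counts
    $page      = min(max(1, $page), $pageCount);            // clamp into 1..$pageCount

    $slice = array_slice($chars, ($page - 1) * $pageSize, $pageSize);

    return [
        'text'      => implode('', $slice),
        'page'      => $page,
        'pageCount' => $pageCount,
        'total'     => $total,
    ];
}

$result = paginate_text('一二三四五六七', 2, 3);
echo $result['text'], ' (page ', $result['page'], ' of ', $result['pageCount'], ")\n";
// prints: 四五六 (page 2 of 3)
```

Clamping the page number inside the function means an out-of-range `p` value in the URL simply lands on the first or last page instead of producing an empty page.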