Java character stream example analysis

    1. The origin of character streams

    Because handling Chinese text with byte streams is inconvenient, Java provides character streams for that purpose.

    Implementation principle: character stream = byte stream + encoding table

    Why is there no problem when a byte stream is used to copy a text file that contains Chinese characters?

    Because the copy transfers the bytes unchanged, and whatever opens the copied file afterwards splices those bytes back into Chinese characters according to the encoding table.

    How is a byte identified as part of a Chinese character?

    When a Chinese character is stored, whether in UTF-8 or GBK, its first byte is a negative number (when read as a signed Java byte), which marks it as the start of a multi-byte character.
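
    The negative first byte can be observed directly. The following is a minimal sketch, not part of the original article: the class name ChineseByteSignDemo and the helper allNegative are made up for illustration, and the Chinese character is written as a Unicode escape so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: shows that the UTF-8 bytes of a Chinese
// character are negative when viewed as signed Java bytes.
class ChineseByteSignDemo {

    // Returns true when every byte is negative, i.e. has its high bit set.
    static boolean allNegative(byte[] bytes) {
        for (byte b : bytes) {
            if (b >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] ascii = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] chinese = "\u4E2D".getBytes(StandardCharsets.UTF_8); // the character 中

        System.out.println(Arrays.toString(ascii));   // [97, 98, 99]
        System.out.println(Arrays.toString(chinese)); // [-28, -72, -83]
        System.out.println(allNegative(chinese));     // true
    }
}
```

    ASCII bytes stay in the range 0 to 127, so a negative byte tells the decoder that more bytes belong to the same character.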

    2. Encoding tables

    Character set:

    A character set is the collection of all characters supported by a system, including national scripts, punctuation marks, graphic symbols, numbers, and so on.

    To store and recognize the symbols of a character set accurately, a computer must encode the characters, so every character set needs at least one character encoding.

    Common character sets include the ASCII character set, the GBXXX character sets, and the Unicode character set.

    GBK: the most commonly used Chinese code table. It is an extension of the GB2312 standard, uses a double-byte encoding scheme, and contains 21,003 Chinese characters. It is fully compatible with GB2312 and also supports traditional Chinese characters and the Chinese characters used in Japanese and Korean.

    GB18030: the latest Chinese code table. It contains 70,244 Chinese characters and uses a multi-byte encoding in which each character takes 1, 2, or 4 bytes. It supports the scripts of China's ethnic minorities as well as traditional Chinese characters and the Chinese characters used in Japanese and Korean.

    Unicode character set:

    Unicode is designed to represent any character in any language. It is an industry standard, also known as the Universal Code, and uses up to 4 bytes to represent each letter, symbol, or ideograph. It has three encoding schemes: UTF-8, UTF-16, and UTF-32. The most commonly used is UTF-8.

    UTF-8: can represent any character in the Unicode standard. It is the preferred encoding for email, web pages, and other applications that store or transfer text. The Internet Engineering Task Force (IETF) requires all Internet protocols to support UTF-8. It uses one to four bytes to encode each character.

    UTF-8 encoding rules:

    The 128 US-ASCII characters need only one byte each

    Latin-script and similar characters need two bytes

    Most commonly used characters (including Chinese) need three bytes

    The remaining, rarely used Unicode supplementary characters need four bytes
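
    The four length classes above can be checked with String.getBytes. This is a small sketch for illustration only: the class Utf8LengthDemo, the helper utf8Length, and the sample characters are choices made here, not taken from the original article.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: prints how many bytes UTF-8 needs for
// one sample character from each of the four length classes.
class Utf8LengthDemo {

    // Number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 (US-ASCII)
        System.out.println(utf8Length("\u00E9"));       // 2 (Latin letter e with acute accent)
        System.out.println(utf8Length("\u4E2D"));       // 3 (Chinese character U+4E2D)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 (supplementary character U+1F600)
    }
}
```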

    Summary: whichever rule is used for encoding must also be used for decoding; otherwise the text comes out garbled
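
    A short sketch of what a mismatch looks like, under the assumption that the text was encoded as UTF-8 and then wrongly decoded as ISO-8859-1 (chosen here only because every JVM is guaranteed to support it). The class name MojibakeDemo and the helper decode are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes the same bytes with the matching
// charset and with a mismatched one.
class MojibakeDemo {

    // Decodes the bytes with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4F60\u597D"; // two Chinese characters
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false
        System.out.println(wrong.length());         // 6: each byte became its own character
    }
}
```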

    3. Encoding and decoding in strings

    Encoding methods (IDEA):

    byte[] getBytes(): encodes the String into a sequence of bytes using the platform's default character set and stores the result in a new byte array

    byte[] getBytes(String charsetName): encodes the String into a sequence of bytes using the named character set and stores the result in a new byte array

    Decoding methods (IDEA):

    String(byte[] bytes): constructs a new String by decoding the specified byte array using the platform's default character set

    String(byte[] bytes, String charsetName): constructs a new String by decoding the specified byte array using the named character set

    The default encoding in IDEA is UTF-8
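
    The named-charset variants listed above can be exercised as follows. This is a sketch rather than the article's own code: StringCodecDemo and roundTrip are hypothetical names, and the checked UnsupportedEncodingException is wrapped in an unchecked exception only to keep the helper easy to call.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

// Hypothetical demo class: encodes a String with a named charset and
// decodes the bytes again with the same charset.
class StringCodecDemo {

    // Encodes the text with getBytes(charsetName), then decodes it with
    // new String(bytes, charsetName). An unknown charset name is reported
    // as an IllegalArgumentException.
    static String roundTrip(String text, String charsetName) {
        try {
            byte[] encoded = text.getBytes(charsetName);
            return new String(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "\u4F60\u597D"; // two Chinese characters

        byte[] utf8 = text.getBytes("UTF-8");
        System.out.println(Arrays.toString(utf8)); // [-28, -67, -96, -27, -91, -67]

        System.out.println(roundTrip(text, "UTF-8").equals(text)); // true
    }
}
```

    Passing the charset name explicitly makes the result independent of the platform default, which is why the round trip gives back the original text on any machine.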

    4. Encoding and decoding in character streams

    Abstract base classes of the character streams:

    Reader: abstract class for character input streams

    Writer: abstract class for character output streams

    Two classes that deal with encoding and decoding in character streams:

    InputStreamReader: a bridge from byte streams to character streams. It reads bytes and decodes them into characters using a specified character set. The character set can be specified by name, given explicitly, or left as the platform's default character set

    Constructors:

    InputStreamReader(InputStream in) Creates an InputStreamReader that uses the default character set.
    InputStreamReader(InputStream in, String charsetName) Creates an InputStreamReader that uses the named character set.

    OutputStreamWriter: a bridge from character streams to byte streams. It encodes the characters written to it into bytes using a specified character set. The character set can be specified by name, given explicitly, or left as the platform's default character set

    Constructors:

    OutputStreamWriter(OutputStream out) Creates an OutputStreamWriter that uses the default character encoding.
    OutputStreamWriter(OutputStream out, String charsetName) Creates an OutputStreamWriter that uses the named character set.
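
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;

    public class ConversionStreamDemo {
        public static void main(String[] args) throws IOException {
            // Write first, using an OutputStreamWriter with the default charset
            OutputStreamWriter opsw = new OutputStreamWriter(new FileOutputStream("E:\\abc.txt"));
            opsw.write("你好啊");
            opsw.close(); // close() flushes the buffered characters to the file

            // Then read the file back with an InputStreamReader (default charset),
            // one character at a time
            InputStreamReader ipsr = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            int ch;
            while ((ch = ipsr.read()) != -1) {
                System.out.print((char) ch);
            }
            ipsr.close();
        }
    }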
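
    The two-argument constructors can be tried without touching the disk. The sketch below swaps the file streams for in-memory ByteArrayOutputStream and ByteArrayInputStream, which is an assumption made here for self-containment; the names ExplicitCharsetDemo, encode, and decode are illustrative, and I/O errors are rethrown unchecked to keep the helpers short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;

// Hypothetical demo class: uses the charsetName constructors of
// OutputStreamWriter and InputStreamReader on in-memory streams.
class ExplicitCharsetDemo {

    // Encodes text into bytes through an OutputStreamWriter with a named charset.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStreamWriter writer = new OutputStreamWriter(sink, charsetName)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed here, so every character has been flushed to the sink.
        return sink.toByteArray();
    }

    // Decodes bytes into text through an InputStreamReader with a named charset.
    static String decode(byte[] bytes, String charsetName) {
        StringBuilder result = new StringBuilder();
        try (InputStreamReader reader =
                 new InputStreamReader(new ByteArrayInputStream(bytes), charsetName)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                result.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String text = "\u4F60\u597D\u554A"; // the same three characters the file demo writes
        byte[] utf8 = encode(text, "UTF-8");
        System.out.println(utf8.length);                         // 9
        System.out.println(decode(utf8, "UTF-8").equals(text));  // true
    }
}
```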

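    5. Five ways to write data with a character stream

    Method                                     Description
    void write(int c)                          Writes a single character
    void write(char[] cbuf)                    Writes a character array
    void write(char[] cbuf, int off, int len)  Writes part of a character array
    void write(String str)                     Writes a string
    void write(String str, int off, int len)   Writes part of a string

    Character streams write through a buffer. To push the buffered data out to the file, call flush() after the write method.

    The first three methods work the same way as the byte stream write methods; the focus here is on the last two.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    public class OutputStreamWriterDemo {
        public static void main(String[] args) throws IOException {
            // Create an OutputStreamWriter that uses the default charset
            OutputStreamWriter opsw = new OutputStreamWriter(new FileOutputStream("E:\\abc.txt"));
            // Way 1: write a single character (97 is 'a')
            opsw.write(97);
            opsw.flush(); // flush so the data shows up in the file immediately
            // Way 2: write a character array
            char[] ch = {'a', 'b', 'c', '二'};
            opsw.write(ch);
            opsw.flush();
            // Way 3: write part of a character array (offset 0, length 2: "ab")
            opsw.write(ch, 0, 2);
            opsw.flush();
            // Way 4: write a string
            opsw.write("一二三");
            opsw.flush();
            // Way 5: write part of a string (offset 1, length 2: "四五")
            opsw.write("三四五", 1, 2);
            opsw.flush();
            // Release the file handle; close() also flushes
            opsw.close();
        }
    }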
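
    The effect of flush() can be made visible by counting how many bytes have reached the underlying stream. This is a sketch under assumptions: it writes to an in-memory ByteArrayOutputStream instead of a file, the names FlushDemo and visibleBytes are hypothetical, and the count before flush() depends on the writer's internal buffering, so no fixed value is claimed for it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares the bytes visible in the underlying
// stream before and after flush().
class FlushDemo {

    // Returns {bytes visible before flush, bytes visible after flush}.
    static int[] visibleBytes(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStreamWriter writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);
            int before = sink.size(); // may still be 0: the bytes can sit in the writer's buffer
            writer.flush();
            int after = sink.size();  // all encoded bytes have now reached the sink
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = visibleBytes("abc");
        System.out.println("before flush: " + counts[0]);
        System.out.println("after flush:  " + counts[1]); // after flush:  3
    }
}
```

    close() flushes as well, so an explicit flush() matters mainly when the data must be visible while the writer is still open.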

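    6. Two ways to read data with a character stream

    Method                 Description
    int read()             Reads one character at a time
    int read(char[] cbuf)  Reads one character array at a time

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class InputStreamReadDemo {
        public static void main(String[] args) throws IOException {
            // Way 1: read one character at a time (default charset)
            InputStreamReader ipsr = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            int ch;
            while ((ch = ipsr.read()) != -1) {
                System.out.print((char) ch);
            }
            ipsr.close();

            // Way 2: read one character array at a time.
            // The first reader is closed and exhausted, so open a new one.
            InputStreamReader ipsr2 = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            char[] buf = new char[1024];
            int len;
            while ((len = ipsr2.read(buf)) != -1) {
                System.out.print(new String(buf, 0, len));
            }
            ipsr2.close();
        }
    }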

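    Summary: when the default encoding is used, the character input stream InputStreamReader can be replaced by its subclass FileReader, and the character output stream OutputStreamWriter can be replaced by its subclass FileWriter. With the default encoding, each pair behaves the same.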
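
    A minimal sketch of the FileReader and FileWriter shortcut, assuming a temporary file created with Files.createTempFile instead of the article's E:\abc.txt so that it runs on any machine. The names FileReaderWriterDemo and writeThenRead are hypothetical. Both classes use the default charset here, so text outside that charset's range may not survive the round trip on every platform.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: writes text with FileWriter and reads it back
// with FileReader, both using the default charset.
class FileReaderWriterDemo {

    // Writes the text to a temporary file, reads it back, and deletes the file.
    static String writeThenRead(String text) {
        try {
            Path file = Files.createTempFile("char-stream-demo", ".txt");
            try {
                try (FileWriter writer = new FileWriter(file.toFile())) {
                    writer.write(text);
                }
                StringBuilder result = new StringBuilder();
                try (FileReader reader = new FileReader(file.toFile())) {
                    char[] buffer = new char[1024];
                    int len;
                    while ((len = reader.read(buffer)) != -1) {
                        result.append(buffer, 0, len);
                    }
                }
                return result.toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("hello")); // hello
    }
}
```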

    Statement
    This article is reproduced from 亿速云. In case of infringement, please contact admin@php.cn to have it removed.