What Is Apache Flink?


王林

Oct 20, 2020 pm 05:10 PM


Introduction to Apache Flink:

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink is designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale.


Next, let’s introduce the important aspects of the Flink architecture.

Handling unbounded and bounded data

Any type of data can form an event stream. Credit card transactions, sensor measurements, machine logs, user interaction records on a website or mobile app, all this data forms a stream.

Data can be processed as unbounded or bounded streams.

1. Unbounded streams have a defined start but no defined end. They never terminate and deliver data as it is generated. Unbounded streams must be processed continuously, that is, events must be handled promptly after they are ingested. It is not possible to wait for all input data to arrive, because the input is unbounded and will never be complete at any point in time. Processing unbounded data often requires that events are ingested in a specific order, such as the order in which they occurred, to be able to reason about the completeness of results.

2. Bounded streams have a defined start and a defined end. They can be processed by ingesting all data before performing any computation. Ordered ingestion is not required, because a bounded data set can always be sorted. Processing of bounded streams is also known as batch processing.
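The difference between the two processing styles can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Flink's API: the `(event_time, amount)` event shape and the function names `process_bounded`, `process_unbounded` and `take` are assumptions made for this example.

```python
import itertools
from typing import Iterable, Iterator, List, Tuple

# Hypothetical event shape for this sketch: (event_time, amount)
Event = Tuple[int, float]


def process_bounded(events: List[Event]) -> float:
    """Batch style: all input is present, so it can be sorted before computing."""
    ordered = sorted(events, key=lambda event: event[0])
    return sum(amount for _, amount in ordered)


def process_unbounded(events: Iterable[Event]) -> Iterator[float]:
    """Streaming style: update state per event and emit a result right away."""
    running_total = 0.0
    for _, amount in events:
        running_total += amount
        yield running_total


def take(stream: Iterator[float], n: int) -> List[float]:
    """Read only the first n results; an unbounded stream never finishes."""
    return list(itertools.islice(stream, n))
```

The bounded function returns a single final result, while the unbounded one can only ever be observed through partial results, for example by calling `take` on a generator fed from `itertools.count()`.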


Apache Flink excels at processing both unbounded and bounded data sets. Precise control of time and state enables Flink's runtime to run any kind of application on unbounded streams. Bounded streams are processed internally by algorithms and data structures that are specifically designed for fixed-size data sets, which yields excellent performance.

Deepen your understanding by exploring use cases built on top of Flink.

Deploy applications anywhere

Apache Flink is a distributed system that requires computing resources to execute applications. Flink integrates with all common cluster resource managers, such as Hadoop YARN, Apache Mesos and Kubernetes, but can also run as a standalone cluster.

Flink is designed to work well with each of the resource managers listed above. This is achieved through resource-manager-specific deployment modes that allow Flink to interact with each resource manager in the way that suits it.

When you deploy a Flink application, Flink automatically identifies the required resources based on the application's configured parallelism and requests these resources from the resource manager. In the event of a failure, Flink replaces the failed container by requesting new resources. All communication to submit or control applications occurs through REST calls, which simplifies the integration of Flink into a variety of environments.
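As a rough illustration of what a REST call against a running cluster could look like, the sketch below builds a request for the job overview. It assumes a cluster whose REST endpoint is reachable at a base URL such as `http://localhost:8081` and uses the `/jobs/overview` path of Flink's monitoring REST API. The helper names are hypothetical, and the exact paths and response fields should be checked against the REST API documentation of the Flink version in use.

```python
import json
import urllib.request


def jobs_overview_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the cluster's job overview."""
    url = base_url.rstrip("/") + "/jobs/overview"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_jobs_overview(base_url: str) -> dict:
    """Send the request and decode the JSON body; needs a reachable cluster."""
    request = jobs_overview_request(base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Because the interface is plain HTTP and JSON, the same call can be made from any language or tool that can issue HTTP requests.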

Run applications at any scale

Flink is designed to run stateful streaming applications at any scale. Applications are parallelized into potentially thousands of tasks that are distributed across a cluster and executed concurrently. An application can therefore use virtually unlimited amounts of CPU, main memory, disk and network IO. Flink also makes it easy to maintain very large application state. Its asynchronous and incremental checkpointing algorithm keeps the impact on processing latency minimal while guaranteeing exactly-once state consistency.

Flink users have reported impressive scalability numbers for applications running in their production environments, such as:

applications processing trillions of events per day, applications maintaining terabytes of state, and applications running on thousands of cores.
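One common way to spread a stateful computation over many parallel tasks is to route every event by a hash of its key, so that each task owns the state for a disjoint set of keys. The following sketch shows that idea in plain Python. It is not Flink's actual partitioning scheme, and `task_for_key` and `run_partitioned` are hypothetical names chosen for this example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def task_for_key(key: str, parallelism: int) -> int:
    """Stable hash, so the same key is always routed to the same task."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def run_partitioned(
    events: Iterable[Tuple[str, int]], parallelism: int
) -> List[Dict[str, int]]:
    """Sum values per key, with each simulated task keeping its own local state."""
    states: List[Dict[str, int]] = [defaultdict(int) for _ in range(parallelism)]
    for key, value in events:
        states[task_for_key(key, parallelism)][key] += value
    return [dict(state) for state in states]
```

Since no key is shared between tasks, the tasks never need to coordinate on state, which is what allows adding more of them as the load grows.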

Exploiting memory performance

Stateful Flink applications are optimized for local state access. Task state is always maintained in memory or, if the state size exceeds the available memory, in access-efficient on-disk data structures. Tasks therefore perform all computations by accessing local, often in-memory, state, which yields very low processing latencies. Flink guarantees exactly-once state consistency in case of failures by periodically and asynchronously checkpointing the local state to durable storage.
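The interplay of local state, periodic checkpoints and recovery can be sketched with a toy task in plain Python. This is a simplified illustration under several assumptions, not Flink's checkpointing algorithm: snapshots are taken synchronously rather than asynchronously, the input is a replayable list, and `CountingTask` and `run_with_failure` are hypothetical names.

```python
import copy
from typing import Dict, List, Optional, Tuple

Checkpoint = Tuple[int, Dict[str, int]]  # (input offset, copy of the state)


class CountingTask:
    """Toy task that counts events per key in local in-memory state."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}
        self.offset = 0  # number of input events already applied to the state

    def process(self, key: str) -> None:
        self.state[key] = self.state.get(key, 0) + 1
        self.offset += 1

    def snapshot(self) -> Checkpoint:
        return self.offset, copy.deepcopy(self.state)

    def restore(self, checkpoint: Checkpoint) -> None:
        self.offset = checkpoint[0]
        self.state = copy.deepcopy(checkpoint[1])


def run_with_failure(
    events: List[str], interval: int, fail_at: Optional[int]
) -> Dict[str, int]:
    """Process events, checkpoint every `interval` events, fail once at `fail_at`."""
    task = CountingTask()
    checkpoint = task.snapshot()  # empty initial checkpoint
    failed = False
    while task.offset < len(events):
        if fail_at is not None and not failed and task.offset == fail_at:
            failed = True
            task = CountingTask()     # the local state is lost with the task
            task.restore(checkpoint)  # roll back to the last checkpoint
            continue                  # replay input from the restored offset
        task.process(events[task.offset])
        if task.offset % interval == 0:
            checkpoint = task.snapshot()
    return task.state
```

Because the state and the input position are saved together, events processed after the last checkpoint are replayed exactly once against the restored state, so the final counts match those of a run without a failure.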


Statement

This article is reproduced from Juejin (掘金). In case of any infringement, please contact admin@php.cn to have it removed.
