Detailed explanation of Oracle classic skills RAC

WBOY
2022-03-14 17:28:11, 3687 views

This article introduces Oracle RAC. Oracle Real Application Clusters lets multiple machines share one database in a cluster environment so that applications stay highly available.

There are two general ways to cope with growing business volume. One is vertical expansion, which increases the CPU power, memory capacity, disk capacity and so on of a single server. The other is horizontal expansion, which increases processing power by adding servers. Vertical expansion has several problems, such as business interruption during upgrades and a hard upper limit. With the rapid growth of Internet services in particular, a single server can hardly carry the business load, so horizontal expansion is currently the more popular approach.

What is Oracle RAC?

Oracle Real Application Clusters (RAC) lets multiple machines share one database in a cluster environment to keep applications highly available. It also provides parallel processing and load balancing automatically, and when a failure occurs it offers fault tolerance and recovery without interrupting service. It is the core technology with which Oracle Database supports grid computing environments.

Shared Storage Multi-Active Cluster

In this architecture, every node in the cluster runs its own instance of the same database. The data is fully consistent, so a user sees the same data no matter which node is used for access. The figure below is a schematic diagram of Oracle RAC: three nodes form a cluster and share the data.

[Figure: schematic of a three-node Oracle RAC cluster]

The characteristics of RAC can be summarized as follows:

  • Nodes are interconnected and appear to the outside as a single server;
  • Cluster software hides the internal structure;
  • The disk is shared;
  • Each machine uses the same operating system;
  • Multiple instances access the same database;
  • Each node runs one instance;
  • Database files support physical or logical access;
  • Reads and writes of data are controlled by software.

The difference between Oracle RAC and a single database server

As shown below, the main difference is that RAC is a clustered database that is managed through cluster software.

[Figure: a single database server compared with an Oracle RAC cluster]

Oracle RAC hardware architecture

In order to ensure the availability of the entire cluster, Oracle RAC has many requirements for hardware during deployment. At the network level, Oracle RAC has a total of three network systems, namely the external access network, the internal private network and the storage network.

The external access network needs little explanation. The internal private network is used inside the Oracle cluster for data transmission, heartbeat and cluster management. It must be deployed with dual switches and dual physical links so that a link failure does not disrupt the cluster. Finally, the storage network is what the RAC cluster uses to access storage resources; its links are redundant as well.

[Figure: Oracle RAC hardware and network architecture]

Another example is the figure below, which shows a two-node RAC system. Redundancy is implemented all the way from the hosts to the switches, so the business is not affected no matter which single device fails. The shared storage has its own RAID redundancy.

[Figure: hardware architecture of a two-node RAC system]

The hosts are called nodes. They must have the same CPU, memory and other configuration. Each node needs at least two network cards, one for the internal network and one for the external network, plus an HBA card to connect to the shared storage; in most deployments the HBA connects to an FC storage array through FC switches. Shared storage is the core of the RAC architecture: most files reside on it, and many features exist to keep it safe. Typically the connection is optical fiber over an FC interface carrying the SCSI protocol.

Oracle RAC software architecture

The figure below shows the software of a two-node RAC system. Oracle RDBMS is the database software and Oracle Clusterware is the cluster software. The drivers are mainly those for the network cards, the HBA cards, ASMLib and so on.

Every node must run the same operating system at the same version, down to the patch level. For example, operating system: RHEL AS 4.8 64bit, Linux kernel version: 2.6.9-89.EL.

[Figure: software architecture of a two-node RAC system]

To understand Oracle RAC more deeply, consider its internal software modules. At the database level there are few differences from a single-instance database. The main additions are the virtual IP (VIP), ASM, Clusterware and the quorum disk. These components work together to provide Oracle's multi-active cluster function.

The virtual IP is the entry point through which applications access the database. It is not bound to any one server and can float between the servers in the cluster. As a result, when a server crashes, the database cluster can keep serving the outside world through the same interface.

ASM and Clusterware implement the cluster management functions. ASM manages the disks and avoids the data inconsistency that simultaneous disk access could cause, while Clusterware manages the software processes and resource scheduling of the Oracle cluster.

The quorum disk is used to detect failed servers in the cluster. Each node marks its own health by regularly updating data in a specific area of the quorum disk, and the other nodes use this data to decide whether that node is down.
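The heartbeat check described above can be sketched as follows. This is a toy model, not Oracle's on-disk format: the `QuorumDisk` class, the node names and the 30-second timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QuorumDisk:
    """Toy stand-in for the quorum disk: one heartbeat slot per node (hypothetical)."""
    slots: dict = field(default_factory=dict)

    def write_heartbeat(self, node: str, now: float) -> None:
        # Each node regularly overwrites its own slot with the current time.
        self.slots[node] = now

    def down_nodes(self, now: float, timeout: float) -> list:
        # A node is suspected down when its slot has not been updated recently.
        return sorted(n for n, ts in self.slots.items() if now - ts > timeout)


disk = QuorumDisk()
disk.write_heartbeat("node1", now=100.0)
disk.write_heartbeat("node2", now=100.0)
disk.write_heartbeat("node1", now=130.0)  # node2 has stopped updating its slot
suspects = disk.down_nodes(now=135.0, timeout=30.0)
```

Here `suspects` contains only `node2`, because its last update is older than the assumed timeout.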

Logical structure

The following figure shows the logical structure of Oracle RAC. Each component in it is introduced in turn.

[Figure: logical structure of Oracle RAC]

DLM: Concurrency Control

Every node has equal rights to access the data on shared storage, so Oracle RAC uses Distributed Lock Management (DLM) to control concurrent access from multiple nodes. The distributed lock manager coordinates the competition for shared resources between nodes. Before a node accesses data, it must apply through the DLM and confirm that the access does not conflict with other nodes.
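A minimal sketch of that request-and-check step, assuming just two lock modes (shared for reads, exclusive for writes). The `LockManager` class and the resource name are hypothetical and far simpler than Oracle's real lock modes.

```python
class LockManager:
    """Toy distributed lock manager with shared and exclusive modes (an assumption)."""

    def __init__(self):
        self.holders = {}  # resource -> (mode, set of node names)

    def request(self, node, resource, mode):
        """Return True if the lock is granted, False if it conflicts."""
        held = self.holders.get(resource)
        if held is None:
            self.holders[resource] = (mode, {node})
            return True
        held_mode, nodes = held
        if mode == "shared" and held_mode == "shared":
            nodes.add(node)  # any number of readers may share the resource
            return True
        return False  # an exclusive lock conflicts with every other holder

    def release(self, node, resource):
        held = self.holders.get(resource)
        if held is None:
            return
        _, nodes = held
        nodes.discard(node)
        if not nodes:
            del self.holders[resource]


dlm = LockManager()
read1 = dlm.request("node1", "block42", "shared")
read2 = dlm.request("node2", "block42", "shared")
write3 = dlm.request("node3", "block42", "exclusive")  # refused while readers exist
```

A real lock manager would queue the refused request instead of simply returning `False`; the sketch only shows the conflict check.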

OCR: Amnesia

Amnesia: if every node kept its own copy of the cluster configuration, a change made on one node and not synchronized to the others would lead to errors.

The cluster therefore keeps a single copy of the configuration, shared by all nodes. Oracle RAC uses the OCR disk file to solve the amnesia problem.

Only the master node can modify the OCR disk. Every node keeps a copy of the OCR in memory, the OCR cache. When a node wants to change the configuration, it sends the request to the master node, whose OCR process then updates its own OCR cache and the OCR caches of the other nodes. The OCR disk is backed up automatically every few hours.
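The single-writer rule can be illustrated with a small sketch. The `OcrCluster` class and the configuration key are hypothetical; the point is only that every change goes through the master, which writes the one disk copy and refreshes all caches.

```python
class OcrCluster:
    """Toy model: one shared OCR disk copy, one in-memory OCR cache per node."""

    def __init__(self, nodes, master):
        self.master = master
        self.disk = {}  # the only persistent copy of the configuration
        self.caches = {n: {} for n in nodes}

    def modify(self, requester, key, value):
        """Any node may ask for a change, but only the master applies it."""
        writer = self.master  # the requester forwards its change to the master
        self.disk[key] = value
        for cache in self.caches.values():  # master refreshes every OCR cache
            cache[key] = value
        return writer

    def read(self, node, key):
        # Reads are served from the local OCR cache.
        return self.caches[node].get(key)


ocr = OcrCluster(["node1", "node2", "node3"], master="node1")
writer = ocr.modify("node3", "listener_port", 1521)  # hypothetical key
```

After the call, all three caches and the disk hold the same value, and `writer` is the master even though `node3` asked for the change.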

Voting Disk: Split Brain

Split brain: the nodes in a cluster use heartbeats to check whether the others are healthy. If the heartbeat link itself fails, each node concludes that the other has failed and tries to take exclusive control of the data, which destroys data consistency.

The voting disk records the membership status of the nodes. When a split brain occurs, the partition with the most votes gains control and the other nodes are evicted from the cluster.

IO isolation: an evicted node must no longer be able to access the data, so IO isolation is required. Oracle RAC achieves this by restarting the failed node.
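The vote can be sketched as picking the largest partition. The function name and the tie-break rule used here (the partition containing the lowest node name wins) are assumptions for the sake of a deterministic example.

```python
def resolve_split_brain(partitions):
    """Given groups of nodes that can no longer hear each other,
    return (survivors, evicted). The largest group wins; ties go to the
    group containing the lowest node name (an illustrative assumption)."""
    ranked = sorted(partitions, key=lambda p: (-len(p), min(p)))
    survivors = sorted(ranked[0])
    # Every node outside the winning group is evicted and, for IO isolation,
    # would be restarted so that it can no longer touch the shared data.
    evicted = sorted(n for p in ranked[1:] for n in p)
    return survivors, evicted


survivors, evicted = resolve_split_brain([{"node3"}, {"node1", "node2"}])
```

With a three-node cluster split into one node and two nodes, the pair survives and the single node is evicted.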

Cache Fusion Lock

The database files are shared, and Cache Fusion locks handle concurrent access control for this shared data at the database level, beyond what the cluster layer provides.

Four kinds of networks

1. Public network: uses the public IP for external data queries, database maintenance and server maintenance.

2. Virtual network: uses the virtual IP for application connections; this is the IP that applications use.

In TCP/IP, the TCP header carries the source and destination ports and the IP header carries the source and destination addresses, while the database listener at the application layer records the IP address and port it listens on. Without a VIP, a client only learns that a database or its listener has failed when the TCP connection times out. That timeout is set by the operating system's TCP/IP stack and differs from one OS to another. To shorten the time needed to detect a failure, Oracle RAC uses VIPs. A virtual IP floats and is not bound to a physical network card: if a node fails, its VIP moves to a surviving node. The listener on the surviving node does not listen on the relocated VIP, however, so the application detects the problem immediately and sends its connection request to another VIP.

3. Private network: uses the private IP for RAC heartbeat detection and Cache Fusion locking, which require high speed.

4. Storage network: made up of the storage devices, HBA cards and the FC network.
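The VIP behaviour in item 2 can be sketched from the client's side. No real sockets are used here: the `listening` table stands in for whether a listener answers on a given VIP, and the addresses are made up.

```python
class ConnectionRefused(Exception):
    """Raised when no VIP accepts the connection."""


def connect(vips, listening):
    """Try each VIP in order and return (connected_vip, attempted_vips).

    A relocated VIP is reachable but has no listener behind it, so the
    attempt fails at once instead of waiting for an OS-level TCP timeout."""
    attempts = []
    for vip in vips:
        attempts.append(vip)
        if listening.get(vip, False):
            return vip, attempts
    raise ConnectionRefused(f"no listener on any of {vips}")


# node1 has failed: its VIP moved to node2, but nothing listens on it there.
listening = {"10.0.0.11": False, "10.0.0.12": True}
vip, attempts = connect(["10.0.0.11", "10.0.0.12"], listening)
```

The client ends up connected through the second VIP after one immediate refusal on the first.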

Oracle Clusterware

On a single machine, the OS alone handles requests from the upper layers. With multiple machines, the cluster management software Oracle Clusterware intercepts requests that concern other nodes before they reach the OS kernel and communicates with the Clusterware on those nodes to complete them.

Application layer: RDBMS

The application layer is made up of several CRS resources. Each resource is a complete service consisting of several processes. Clusterware's CRS daemon (CRSD) monitors the CRS resources and restarts or fails them over when a fault occurs, which keeps the Oracle RAC service highly available.

Cache Fusion

As the name suggests, Cache Fusion fuses the caches of all nodes and manages them as one, so that a read does not always have to go to disk, which speeds up I/O. Because the private network is fast, fetching a block from another node's cache is quicker than reading it from disk. Approximate times to read a data block from different locations:

  • Local cache: 0.01 ms
  • Another node's cache over the network: 2.5 ms
  • Disk: 14 ms. With SSDs or all-flash arrays on the storage side, however, the latency is around 1 ms, so with an all-flash array the speed advantage of Cache Fusion probably disappears.

Cache Fusion is managed by GCS (Global Cache Service), which treats the caches of all nodes as one large cache.
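The lookup order implied by these figures can be sketched as below. The latencies are the ones quoted above; the `read_block` function and the plain scan over the other nodes' caches are simplifications and not how GCS actually locates a block.

```python
# Latencies in milliseconds, taken from the figures quoted above.
LATENCY_MS = {"local": 0.01, "remote": 2.5, "disk": 14.0}


def read_block(block, node, caches):
    """Return (source, latency_ms) for reading `block` on `node`.

    `caches` maps each node name to the set of blocks it currently caches."""
    if block in caches[node]:
        return "local", LATENCY_MS["local"]
    for other, cache in caches.items():
        if other != node and block in cache:
            caches[node].add(block)  # block shipped over the private network
            return "remote", LATENCY_MS["remote"]
    caches[node].add(block)  # nobody has it cached: read from shared disk
    return "disk", LATENCY_MS["disk"]


caches = {"node1": {"A"}, "node2": {"B"}}
first = read_block("B", "node1", caches)   # served from node2's cache
second = read_block("B", "node1", caches)  # now cached locally on node1
```

The first read of block `B` on `node1` costs a network transfer; the second is a local cache hit.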

Read Cache process

[Figure: read cache process]

Write Cache process

[Figure: write cache process]

Write disk process

[Figure: write disk process]

Load Balancing

When users send requests to RAC, load balancing is responsible for distributing the work across the machines.

Allocation by user

Is a node simply picked at random? No, that would be too crude. RAC allocates tasks dynamically based on the state of each machine. While the database is running, the PMON background process registers each node's load information with the Listener and refreshes it every 1-10 minutes. The Listener on each node therefore knows the load on all nodes and sends client requests to the least loaded node.
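A minimal sketch of that routing decision. The `Listener` class here is a toy, the load figures are invented, and breaking ties by node name is an assumption added to make the result deterministic.

```python
class Listener:
    """Toy listener that remembers the last load each node registered."""

    def __init__(self):
        self.load = {}  # node name -> last reported load (0.0 to 1.0)

    def register(self, node, load):
        # Stands in for the periodic load registration described above.
        self.load[node] = load

    def route(self):
        # Send the next client to the least loaded node; ties go to the
        # lowest node name (an illustrative assumption).
        return min(sorted(self.load), key=self.load.get)


listener = Listener()
listener.register("node1", 0.8)
listener.register("node2", 0.3)
listener.register("node3", 0.5)
target = listener.route()
```

With these loads the next connection goes to `node2`.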

Allocation by service

Allocation by user has one drawback: it can make Cache Fusion busier. RAC nodes share their data, so they all synchronize through Cache Fusion, and RAC performance is largely limited by Cache Fusion performance. There are two remedies. Either the private network is made faster, for example with expensive InfiniBand, or Cache Fusion traffic is reduced, which in practice means reducing the dependence between node instances. Allocation by user works against the second remedy.

So RAC also supports allocating nodes by service. For example, production and sales can each use different nodes. Each service's data then stays in the cache of its own node, requests to other nodes become unnecessary, and performance improves.
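Service-based allocation can be sketched as a lookup table from service to preferred nodes. The service names follow the example above; the node assignments and the fallback to any surviving node are assumptions for illustration.

```python
# Hypothetical mapping of services to their preferred nodes.
SERVICE_NODES = {"production": ["node1"], "sales": ["node2"]}


def route(service, up_nodes, service_nodes=SERVICE_NODES):
    """Pick the node for a service, preferring the nodes assigned to it."""
    if not up_nodes:
        raise RuntimeError("no node is available")
    preferred = [n for n in service_nodes[service] if n in up_nodes]
    if preferred:
        return preferred[0]  # data for this service stays in one node's cache
    return sorted(up_nodes)[0]  # assumed fallback: any surviving node


normal = route("sales", {"node1", "node2"})
after_failure = route("sales", {"node1"})  # node2 is down
```

In normal operation sales traffic stays on `node2`; only when that node is down does it move to `node1`.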

Three types of clusters

  • Load balancing cluster: distributes requests among its members according to some algorithm;
  • High performance cluster (HPC, High Performance Cluster): very powerful machines built from specialized software and hardware such as vector processors, with enormous computing performance and extremely high prices, such as the Tianhe supercomputers;
  • High reliability cluster (HAC, High Availability Cluster, also called failover cluster): very reliable, with highly fault-tolerant hardware and software; the database clusters in everyday use fall into this category.

High reliability cluster

Dual-machine hot standby

Normally one machine is idle and on standby. If the working machine fails, the standby takes over.

Dual-machine mutual backup

Normally both machines do their own work, but each must keep some resources in reserve, because if one fails, the other has to do the work of both.

Dual-machine duplex

For example:

Two brothers run a shop together. Normally the elder brother mainly sells steamed buns and the younger brother mainly sells soy milk. If the elder brother has a problem, the younger one sells both steamed buns and soy milk; if the younger one has a problem, the elder one sells both.

The steamed buns and soy milk are the data. Keeping an eye on each other is heartbeat detection, and taking over each other's work is failover. If both brothers suddenly go blind and deaf, neither knows whether the other is still working, and each believes he has to take over the other's work: this is split brain. A third party, such as their father, is then needed to settle the matter, which is voting; alternatively their wives come and take one of them away, which is IO isolation. Oracle RAC belongs to this category, which has the best performance and the most complex system.

Statement:
This article is reproduced from csdn.net. In case of infringement, please contact admin@php.cn to have it removed.