I first came into contact with Docker-based container technology in 2015. In more than two years as a Docker DevOps practitioner, I have watched Docker's technical ecosystem develop rapidly. This article is a short summary based on the microservice architecture we built in practice at my company. I hope it is a useful reference for DevOps engineers who are working out how to lay out a service architecture in the early stages of a startup, or who want a first understanding of enterprise-level architecture.
On the technical layout of startups, the common advice is that a startup needs to ship quickly and iterate by trial and error, so it should integrate, develop, and release quickly with a single monolithic application or a simple split between front-end and back-end applications. In practice, the hidden costs of this approach are higher. As the business grows and more developers join, the team runs into poor deployment efficiency for a huge system and poor development collaboration. The system then has to be restructured by splitting services, separating reads from writes, sharding databases and tables, and so on. Doing this thoroughly requires a lot of manpower and resources.
My personal suggestion is that DevOps engineers use their own judgment about the current and long-term development of the business. Adopting a microservice architecture in the early stages of the project can pay off for a long time afterwards.
With the growth of the open source community around Docker, the concept of microservice architecture now has much better implementation options. Within each microservice application, the hexagonal architecture of DDD (Domain-Driven Design) can be used for in-service design. For some DDD concepts, you can also refer to several of my earlier articles: Domain-driven design organization - concepts & architecture, Domain-driven design organization - entity and value object design, domain services, domain events.
The target architecture has the following characteristics: clear division of microservice domains, clean architectural layering within each service, IPC between services through RPC or events where necessary, and an API gateway that forwards all microservice requests and merges the results in a non-blocking way. The rest of this article describes in detail how to quickly build a Docker-based microservice architecture with these characteristics in a distributed environment.
If you use Docker technology to build a microservice system, service discovery is an inevitable topic. There are currently two mainstream service discovery modes: client discovery mode and server discovery mode.
Client discovery mode
The architecture diagram of the client discovery mode is as follows:
The typical implementation of the client discovery mode is the Netflix stack. The client queries a service registry for all available service instances, uses a load balancing algorithm to select one of them, and then makes the request. A typical open source implementation is Netflix's Eureka.
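The query-then-select flow described above can be sketched in plain Java. This is only an illustration under stated assumptions: ServiceRegistry and RoundRobinClient are hypothetical in-memory classes, not Eureka or Ribbon APIs, and round-robin is just one possible load balancing algorithm.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory registry; a real registry such as Eureka is a remote service.
class ServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    void deregister(String service, String address) {
        List<String> list = instances.get(service);
        if (list != null) {
            list.remove(address);
        }
    }

    // Returns a snapshot of the instances currently registered for the service.
    List<String> lookup(String service) {
        return List.copyOf(instances.getOrDefault(service, List.of()));
    }
}

// Client-side load balancing: query the registry, then pick one instance round-robin.
class RoundRobinClient {
    private final ServiceRegistry registry;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinClient(ServiceRegistry registry) {
        this.registry = registry;
    }

    String choose(String service) {
        List<String> available = registry.lookup(service);
        if (available.isEmpty()) {
            throw new IllegalStateException("no available instance of " + service);
        }
        int index = Math.floorMod(counter.getAndIncrement(), available.size());
        return available.get(index);
    }
}

class ClientDiscoveryDemo {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("order-service", "10.0.0.1:8080");
        registry.register("order-service", "10.0.0.2:8080");
        RoundRobinClient client = new RoundRobinClient(registry);
        for (int i = 0; i < 3; i++) {
            System.out.println(client.choose("order-service"));
        }
    }
}
```

The point of the sketch is that the selection logic lives in the client, which is why every client language needs its own implementation in this mode.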
Netflix-Eureka
Eureka’s client adopts a self-registration mode. The client needs to be responsible for processing the registration and deregistration of service instances and sending heartbeats.
When using SpringBoot to integrate a microservice, automatic registration can be easily achieved by combining it with the SpringCloud project. Add @EnableEurekaClient to the service startup class to register the service with the configured Eureka server when the service instance is started, and send heartbeats regularly. Client-side load balancing is implemented by Netflix Ribbon. The service gateway uses Netflix Zuul and the circuit breaker uses Netflix Hystrix.
In addition to the supporting framework for service discovery, SpringCloud's Netflix-Feign provides a declarative interface to handle Rest requests for services. Of course, in addition to using FeignClient, you can also use Spring RestTemplate. If you use @FeignClient in the project, the code will be more readable, and the Rest API will be clear at a glance.
Registration, management, and lookup of service instances are all done by calling the REST API provided by Eureka from within the application (when using SpringCloud-Eureka there is no need to write this code yourself). Because registration and deregistration are initiated by the client itself, a major problem with this model is that services written in different programming languages each need their own registration and service discovery logic. In addition, health check support needs to be configured explicitly when using Eureka.
Server-side discovery mode
The architecture diagram of the server-side discovery mode is as follows:
The client sends a request to the load balancer. The load balancer queries the service registry and forwards the request to one of the service instances available in the registry. Service instances are registered and deregistered in the registry. HAProxy or Nginx can be used for load balancing. The current mainstream Docker-based solutions for server-side discovery mode are Consul, Etcd and Zookeeper.
Consul
Consul provides an API that allows clients to register and discover services. Its consistency is based on the Raft algorithm. It manages membership and broadcasts messages through the Gossip protocol, uses WAN Gossip to synchronize across data centers, and supports ACL access control. Consul also provides a health check mechanism and a KV storage service (not supported by Eureka). For a more detailed introduction to Consul, please refer to an article I wrote earlier: Docker container deployment of Consul cluster.
Etcd
Etcd is strongly consistent and highly available (a CP system in CAP terms). Like Consul, Etcd uses the Raft algorithm to keep its replicated KV data strongly consistent. Kubernetes uses Etcd's KV store to persist the state of all objects across their life cycle.
For some of the internal principles of Etcd, you can read etcd v3 principle analysis.
Zookeeper
ZK was first used in Hadoop. Its ecosystem is very mature and it is often used in large companies. If you already have your own ZK cluster, you can consider using ZK as your service registry.
Like Etcd, Zookeeper offers strong consistency and high availability. Its consensus protocol is ZAB (Zookeeper Atomic Broadcast), which is similar to Paxos. In the initial stage of a microservice architecture, there is no need to use the heavier ZK for service discovery.
The service registry is an important component in service discovery. Except on platforms such as Kubernetes and Marathon, where service discovery is a built-in module, services need to be registered in a registry. Eureka, Consul, Etcd and ZK introduced above are all examples of service registries.
There are two typical registration methods for how microservices are registered in the registry: self-registration mode and third-party registration mode.
Self-registration pattern
The Netflix-Eureka client above is a typical example of the self-registration pattern: each microservice instance is itself responsible for registering and deregistering the service. Eureka also provides a heartbeat mechanism to keep the registration information accurate. The heartbeat interval can be configured in the SpringBoot configuration of the microservice.
As follows, when using Eureka as the registry, a service registration message is logged when the microservice (SpringBoot application) starts:
Similarly, when the application shuts down, the service instance needs to actively deregister its instance information:
Self-registration is a relatively simple registration method that does not require additional facilities or agents, because the microservice instance manages its own registration. The shortcomings are also obvious. For example, Eureka currently only provides Java clients, so it is not convenient for extending to multi-language microservices. Because each microservice has to manage the registration logic itself, the microservice implementation is also coupled to the service registration and heartbeat mechanism.
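To make the registration and heartbeat mechanism concrete, here is a minimal sketch of a registry that expires instances whose heartbeat has lapsed. LeaseRegistry and its methods are hypothetical names for illustration and do not reflect Eureka's actual implementation; the current time is passed in explicitly so that the behavior is deterministic.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical registry that remembers the last heartbeat time of each instance.
class LeaseRegistry {
    private final long leaseMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    LeaseRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // In this sketch, registering and renewing the lease are the same operation.
    void heartbeat(String instanceId, long nowMillis) {
        lastSeen.put(instanceId, nowMillis);
    }

    // Called by the instance itself on graceful shutdown.
    void deregister(String instanceId) {
        lastSeen.remove(instanceId);
    }

    // Instances whose lease has not expired at the given time, sorted by id.
    List<String> liveInstances(long nowMillis) {
        return lastSeen.entrySet().stream()
                .filter(e -> nowMillis - e.getValue() <= leaseMillis)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}

class SelfRegistrationDemo {
    public static void main(String[] args) {
        LeaseRegistry registry = new LeaseRegistry(30_000);
        registry.heartbeat("order-1", 0);       // registers on startup
        registry.heartbeat("order-2", 0);
        registry.heartbeat("order-1", 20_000);  // only order-1 keeps sending heartbeats
        System.out.println(registry.liveInstances(40_000));
    }
}
```

An instance that crashes without deregistering simply stops sending heartbeats and drops out of the live list once its lease expires.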
In third-party registration, the management of service registration (registering and deregistering services) is handled by a dedicated service manager (Registrar). Registrator is an open source service manager implementation that supports Etcd and Consul as registries. As a proxy service, Registrator needs to be deployed and run on the server or virtual machine where the microservices are located. The simplest installation method is to run it as a container through Docker.
The architecture diagram of the third-party registration model is as follows:
With a service manager in place, the microservice instance no longer registers or deregisters with the registry directly. The service manager (Registrar) discovers available service instances by subscribing to service events and tracking heartbeats, and it registers instances with the registry (Consul, Etcd, etc.), deregisters them, and sends heartbeats on their behalf. In this way, the service discovery components are decoupled from the microservices themselves.
Registrator cooperates with Consul and Consul Template to build a service discovery center. You can refer to: Scalable Architecture DR CoN: Docker, Registrator, Consul, Consul Template and Nginx. This article provides an example of using Nginx for load balancing. In the specific implementation process, Haproxy or other solutions can also be used instead.
In addition to the service discovery technologies above, Kubernetes comes with its own service discovery module, which handles the registration and deregistration of service instances. Kubernetes also runs an agent on each cluster node that acts as a server-side discovery router. If you use k8s for orchestration, you can use the complete set of Docker microservice solutions that k8s provides. Those who are interested in k8s can read Kubernetes architecture design and core principles.
In actual technology selection, the most important thing is to make reasonable judgments based on the future development characteristics of the business and system.
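In terms of CAP theory, Eureka satisfies AP, while ZK and Etcd are CP. Consul is sometimes described as CA, but its Raft-based core favors consistency (CP), and it offers stale reads that trade consistency for availability. Both Eureka and Consul can therefore keep answering discovery queries in distributed scenarios (Eureka by design, Consul through stale reads). Setting up Eureka is relatively faster because there is no need to build an additional highly available service registry. With a small number of server instances, using Eureka can save some cost.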
Eureka and Consul both provide web UI components for viewing service registration data. Consul also provides KV storage and supports HTTP and DNS interfaces. For startups building microservices for the first time, these two are recommended.
In terms of multiple data centers, Consul comes with a data center WAN solution. Neither ZK nor Etcd provides support for multi-data center functionality and requires additional development.
In terms of cross-language support, Zookeeper requires its own client API, so its cross-language support is weak. Etcd and Eureka both support HTTP, and Etcd also supports gRPC. In addition to HTTP, Consul also provides DNS support.
In terms of security, Consul and Zookeeper support ACL, and Consul and Etcd support secure channels over HTTPS.
SpringCloud currently has corresponding support for Eureka, Consul, Etcd, and ZK.
Consul, like Docker, is implemented in Go. Microservice applications written in Go can give priority to Consul.
Once service discovery is solved, you need to choose an appropriate communication mechanism between services. In a SpringBoot application, a REST API over the HTTP protocol is the usual synchronous solution. A RESTful API also makes each microservice application more resource-oriented, and the use of lightweight protocols has long been advocated for microservices.
If each microservice follows DDD (Domain-Driven Design) thinking, then each microservice should try to avoid synchronous RPC. Asynchronous message-based methods such as AMQP or STOMP are a good choice for loosely coupling the dependencies between microservices. There are currently many options for message-based point-to-point and pub/sub frameworks. The following is a detailed introduction to some solutions for the two kinds of IPC.
Synchronous
For synchronous request/response communication, services can communicate over RESTful HTTP or over Thrift, which has good cross-language capabilities. If your microservices are written purely in Java, you can also use Dubbo. For a microservice architecture integrated with SpringBoot, it is recommended to choose an RPC mechanism with good cross-language support and strong support from the Spring community.
Dubbo
Dubbo is an open source Java RPC framework developed by Alibaba. Dubbo transmits data over long-lived TCP connections, using Hessian binary serialization as the transfer format. The service registry can be implemented with Zookeeper.
Apache Thrift
Apache Thrift is an RPC framework developed by Facebook. Its code generation engine can create efficient services in multiple languages such as C++, Java, Python, PHP, Ruby, Erlang, Perl, etc. The transmitted data is in binary format, so its packets are smaller than HTTP payloads in JSON or XML format. This gives it an advantage in high-concurrency and big data scenarios.
REST
REST is based on the HTTP protocol, which itself has rich semantics and is familiar to most developers. As SpringBoot becomes widely used, RESTful APIs are becoming more and more popular.
One more point worth mentioning: many companies and teams that use SpringBoot say their APIs are RESTful, but in reality the implementation often falls short. To find out whether your API is really RESTful, you can refer to this article, which analyzes the maturity of RESTful APIs at four levels: Richardson Maturity Model steps toward the glory of REST.
If you use Springboot, no matter what service discovery mechanism is used, you can use Spring's RestTemplate to encapsulate basic Http requests.
If you use Netflix-Eureka mentioned above, you can use Netflix-Feign. Feign is a declarative web service client. Client-side load balancing uses Netflix-Ribbon.
Asynchronous
In a microservice architecture, leaving aside purely event-driven architectures, message queues are generally used for decoupling between microservices. Services do not need to know which service instance consumes or publishes a message. Each service just handles the logic within its own domain and then publishes messages through a message channel or subscribes to the messages it cares about. There are currently many open source message queue technologies. Examples include Apache Kafka, RabbitMQ, Apache ActiveMQ, and Alibaba's RocketMQ, which has now become an Apache project. In the message queue model, the three main components are:
Producer: produces messages and writes them to the channel.
Message Broker: manages the messages written to the channel according to the structure of the queue, and is responsible for storing and forwarding messages. The Broker is generally a cluster that needs to be built and configured separately, and it must be highly available.
Consumer: the consumer of the message. Most current message queues guarantee that a message is consumed at least once, so a message may be delivered more than once. Depending on the message queue used, consumers therefore need to be idempotent.
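Because at-least-once delivery means the same message can arrive more than once, one common way to make a consumer idempotent is to remember the IDs of messages already processed. The following is a minimal sketch under stated assumptions: IdempotentConsumer is a hypothetical class, each message is assumed to carry a unique ID, and the processed-ID set is kept in memory (a real system would normally keep it in a durable store).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical consumer that applies each message at most once, keyed by message ID.
class IdempotentConsumer {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger balance = new AtomicInteger();

    // Returns true if the message was applied, false if it was a duplicate delivery.
    boolean onMessage(String messageId, int amount) {
        if (!processedIds.add(messageId)) {
            return false; // already seen: ignore the redelivery
        }
        balance.addAndGet(amount);
        return true;
    }

    int balance() {
        return balance.get();
    }
}

class IdempotentConsumerDemo {
    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        consumer.onMessage("msg-1", 10);
        consumer.onMessage("msg-1", 10); // broker redelivers the same message
        consumer.onMessage("msg-2", 5);
        System.out.println(consumer.balance());
    }
}
```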
Different message queue implementations have different message models. The characteristics of each framework are also different:
RabbitMQ
RabbitMQ is an open source implementation of the AMQP protocol, written in Erlang, and is known for its high performance and scalability. Currently the client supports Java, .Net/C# and Erlang. Among the AMQP (Advanced Message Queuing Protocol) components, a Broker can contain multiple Exchange components. An Exchange can be bound to multiple Queues as well as to other Exchanges. Messages are sent to the corresponding Message Queue according to the routing rules set in the Exchange. The Consumer establishes a connection with the Broker, and after consuming a message it sends an acknowledgement for that message. The Message Queue then removes the message.
Kafka
Kafka is a high-performance, cross-language distributed messaging system based on publish/subscribe. Kafka is developed in Scala. Its more important features are:
Fast message persistence with time complexity of O(1);
High throughput;
Support for message partitioning and distributed consumption, while ensuring that messages within a partition are transmitted in order;
Support for online horizontal scaling, with built-in load balancing;
Support for exactly-once consumption semantics, etc.
One shortcoming: Kafka does not come with a very useful management interface, but you can use the open source kafka-manager.
Because of its high throughput, Kafka can be used not only as a message queue between microservices but also for log collection, offline analysis, real-time analysis, etc.
Kafka officially provides a Java client API, and the Kafka community currently supports multiple languages, including PHP, Python, Go, C/C++, Ruby, NodeJS, etc.
ActiveMQ
ActiveMQ is a JMS Provider, that is, an implementation of JMS (Java Message Service). JMS mainly provides two messaging models: Point-to-Point and Publish/Subscribe. Currently the client supports Java, C, C++, C#, Ruby, Perl, Python, and PHP. ActiveMQ also supports multiple protocols: STOMP, AMQP, MQTT, and OpenWire.
RocketMQ/ONS
RocketMQ is an open source, highly available distributed message queue developed by Alibaba. ONS is the commercial version, offered as a highly available cluster. ONS supports both pull and push consumption, including active push, and can accumulate tens of billions of messages. ONS supports globally ordered messages and has a friendly management page that can monitor the consumption of message queues and manually trigger message resends.
With the service discovery mechanism described earlier plus a RESTful API, synchronous inter-process communication between microservices is solved. Of course, since we are using microservices, we want every microservice to have a reasonable bounded context (system boundary). Synchronous communication between microservices should be avoided as much as possible, to prevent the domain models of different services from intruding on each other. To achieve this, you can add an API gateway layer (introduced below) to the microservice architecture. All microservice requests are forwarded and merged uniformly through the API gateway. The API gateway also needs to support both synchronous requests and NIO asynchronous requests (which can improve the efficiency and performance of request merging).
Message queues can be used for decoupling between microservices. In a Docker-based microservice cluster, the network environment is more complex than in a general distributed cluster, so choose a highly available distributed message queue implementation. If you build a Kafka or RabbitMQ cluster yourself, you take on demanding requirements for the high availability of the Broker facilities. For microservices based on SpringBoot, it is recommended to use Kafka or ONS. Although ONS is a commercial product, it is easy to manage and highly stable. It is well suited to microservice architectures that rely on message queues for communication only where necessary. If you expect to need log collection, real-time analysis and similar scenarios, you can also build a Kafka cluster. At present, Alibaba Cloud also offers commercial cluster facilities based on Kafka.
Use API Gateway to handle microservice request forwarding and merging
The previous section mainly introduces how to solve the service discovery and communication problems of microservices. In the microservice architecture system, when using DDD thinking to divide bounded contexts between services, calls between microservices will be minimized. In order to decouple microservices, there is an optimization solution based on API Gateway.
Decoupling microservice calls
For example, the following is a common demand scenario - an aggregation page of "User Order List". You need to request "User Service" to obtain basic user information, and "Order Service" to obtain order information, and then request "Product Service" to obtain product pictures, titles and other information in the order list. As shown in the figure below:
If the client (such as H5, Android, iOS) has to issue multiple requests to aggregate all this information, the client becomes more complex. A more reasonable way is to add an API Gateway layer. Like the microservices, the API Gateway can be deployed and run in Docker containers, and it is also a SpringBoot application. As follows, after forwarding through the API Gateway:
All requested information is aggregated by the Gateway, which is also the only entry point into the system. The APIs between the Gateway and all microservices, as well as the APIs provided to clients, are RESTful. Introducing the Gateway layer solves the problem of information aggregation well, and it adapts better to the requests of different clients. For example, if H5 pages do not need to display user information but iOS clients do, you only need to add a Gateway API for that resource request. The resources of the microservice layer do not need to change.
An API gateway does more than merge and forward requests. It needs other features to be a complete Gateway.
Reactive Programming
The Gateway is the entry point for all client requests, similar to the Facade pattern. To improve request performance, it is best to choose a non-blocking I/O framework. When multiple microservices need to be called, the requests to each microservice do not necessarily have to run one after another. In the "User Order List" example above, obtaining user information and obtaining the order list are two independent requests. Only the product information depends on another call: you have to wait for the order information to be returned and then request the product microservice with the list of product IDs in the orders. To reduce the response time of the whole request, the Gateway needs to process independent requests concurrently. One solution is to use reactive programming.
On the Java technology stack, the current options for reactive programming include CompletableFuture from Java 8 and RxJava, the JVM implementation provided by ReactiveX.
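As an illustration of the "User Order List" flow using only the JDK's CompletableFuture, here is a minimal sketch. The three fetch methods are hypothetical stand-ins for remote microservice calls, and the string concatenation stands in for building the real response. The user request and the order request start independently, and the product request is chained onto the order result.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class OrderListAggregator {
    // Hypothetical stand-ins for non-blocking calls to the three microservices.
    static CompletableFuture<String> fetchUser(long userId) {
        return CompletableFuture.supplyAsync(() -> "user-" + userId);
    }

    static CompletableFuture<List<Long>> fetchOrderProductIds(long userId) {
        return CompletableFuture.supplyAsync(() -> List.of(101L, 102L));
    }

    static CompletableFuture<List<String>> fetchProducts(List<Long> productIds) {
        return CompletableFuture.supplyAsync(() ->
                productIds.stream().map(id -> "product-" + id).collect(Collectors.toList()));
    }

    static CompletableFuture<String> userOrderList(long userId) {
        // Independent request: starts right away.
        CompletableFuture<String> user = fetchUser(userId);
        // Dependent requests: products are fetched only after the order data is available.
        CompletableFuture<List<String>> products =
                fetchOrderProductIds(userId).thenCompose(OrderListAggregator::fetchProducts);
        // Merge both branches once they have completed.
        return user.thenCombine(products, (u, p) -> u + " -> " + p);
    }

    public static void main(String[] args) {
        System.out.println(userOrderList(7).join()); // prints "user-7 -> [product-101, product-102]"
    }
}
```

The same structure can be expressed with RxJava Observables; CompletableFuture is used here only because it needs no extra dependency.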
ReactiveX is a programming interface for asynchronous programming with observable data streams. ReactiveX combines the best ideas of the observer pattern, the iterator pattern and functional programming. In addition to RxJava, there are implementations in other languages such as RxJS and Rx.NET.
For the Gateway, the Observable provided by RxJava handles parallel, independent I/O requests very well. If Java 8 is used in the microservice project, team members will pick up RxJava faster, and the lambda style of reactive programming makes the code more concise. For a detailed introduction to RxJava, you can read the RxJava documentation and tutorials.
With the Observable pattern of reactive programming, you can create event streams and data streams simply and conveniently, and combine and transform data with simple functions. You can also subscribe to any observable data stream and act on it.
With RxJava, the resource request sequence diagram of "User Order List" is as follows:
Reactive programming handles thread synchronization and concurrent requests well: Observables and Schedulers provide transparent thread handling for data flows and event flows. In an agile development model, reactive programming makes the code more concise and easier to maintain.
Authentication
The Gateway is the only entrance to the system, so all authentication for the microservices can be done around the Gateway. In a SpringBoot project, basic authorization can use spring-boot-starter-security and Spring Security (Spring Security can also be integrated into a Spring MVC project).
Spring Security mainly uses AOP to intercept resource requests, and it maintains a Filter Chain internally for role checks. Because all microservices are requested through the Gateway, the @Secured settings for the microservices can be configured in the Gateway according to the role levels of different resources.
Spring Security provides the basic interface specifications for role verification. However, the encryption, storage, and verification of the Token sent by the client need to be implemented by the application itself. Redis can be used to store the encrypted Token information. One more point: so that keys can be rotated, it is best to support multiple key versions from the start when designing the Token module, in case an internal key is leaked (a friend once told me that his company's Token encryption code had been leaked by an employee). We will not expand on the encryption algorithm and its specific implementation here. After Gateway authentication passes, the parsed token information can be passed directly to the microservice layer that continues the request.
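The idea of supporting multiple key versions can be sketched as follows. This is an assumption-laden illustration, not the author's implementation: VersionedTokenSigner, the "version.payload.signature" token layout, and the choice of HMAC-SHA256 are all hypothetical, and the sketch signs the payload rather than encrypting it. It only uses javax.crypto.Mac from the JDK, and version names are assumed not to contain a dot.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical signer: tokens look like "<keyVersion>.<payload>.<signature>".
class VersionedTokenSigner {
    private final Map<String, byte[]> keysByVersion;
    private final String currentVersion;

    VersionedTokenSigner(Map<String, byte[]> keysByVersion, String currentVersion) {
        this.keysByVersion = keysByVersion;
        this.currentVersion = currentVersion;
    }

    // New tokens are always signed with the current key version.
    String sign(String payload) {
        return currentVersion + "." + payload + "." + hmac(currentVersion, payload);
    }

    // Tokens signed with any still-trusted key version are accepted.
    boolean verify(String token) {
        int first = token.indexOf('.');
        int last = token.lastIndexOf('.');
        if (first < 0 || last <= first) {
            return false;
        }
        String version = token.substring(0, first);
        String payload = token.substring(first + 1, last);
        String signature = token.substring(last + 1);
        if (!keysByVersion.containsKey(version)) {
            return false; // key version unknown or retired
        }
        byte[] expected = hmac(version, payload).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison of the signatures.
        return MessageDigest.isEqual(expected, signature.getBytes(StandardCharsets.UTF_8));
    }

    private String hmac(String version, String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(keysByVersion.get(version), "HmacSHA256"));
            byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 is not available", e);
        }
    }
}
```

With this layout, a leaked key can be handled by adding a new version, switching currentVersion to it, and later removing the old version from the map, at which point tokens signed with the old key stop verifying.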
If the application requires authorization (managing resource requests by role and permission), this only needs to be applied, following the AOP approach, to the Gateway's REST API. Unified management of authentication and authorization is one of the benefits of an API Gateway that works like a Facade.
Load Balancing
The API Gateway, like the microservices, is a SpringBoot application that provides a REST API, so it also runs in a Docker container. Service discovery between the Gateway and the microservices can still use the client discovery mode or server discovery mode mentioned above.
In a cluster environment, the API Gateway exposes a unified port, and its instances run on servers with different IPs. Because we use Alibaba Cloud's ECS as the container infrastructure, we also use Alibaba Cloud's load balancing service SLB for load balancing in the cluster, and AliyunDNS for domain name resolution. The following figure is a diagram of a simple network request:
In practice, to avoid exposing service ports and resource addresses, an Nginx service can also be deployed in the service cluster as a reverse proxy. External load balancing facilities such as SLB forward requests to the Nginx server, and Nginx then forwards the requests to the Gateway port. If the cluster is built in a self-hosted data center, a highly available load balancing center needs to be built. To cope with cross-machine requests, it is best to use Consul, Consul Template, Registrator, and HAProxy together as the service discovery and load balancing center.
Caching
For some high-QPS requests, you can use multi-level caching in the API Gateway. The distributed cache can use Redis, Memcached, etc. For page-level requests that have low real-time requirements and change infrequently but have high QPS, local caching can also be done at the Gateway layer. Doing this in the Gateway makes the caching solution more flexible and versatile.
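A local cache at the Gateway layer can be as simple as a map with an expiry time per entry. The sketch below is a hypothetical TtlCache written for illustration, not a library class. The current time is passed in so the behavior is easy to follow, and the loader stands in for a call to Redis or to the downstream microservice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical local cache with a fixed time-to-live per entry.
class TtlCache<V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached value, or calls the loader when the entry is missing or expired.
    V get(String key, long nowMillis, Supplier<V> loader) {
        Entry<V> entry = entries.get(key);
        if (entry == null || nowMillis >= entry.expiresAtMillis) {
            entry = new Entry<>(loader.get(), nowMillis + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}

class TtlCacheDemo {
    public static void main(String[] args) {
        TtlCache<String> cache = new TtlCache<>(1_000);
        long now = System.currentTimeMillis();
        System.out.println(cache.get("home-page", now, () -> "rendered at " + now));
        // Served from the local cache: the loader is not called again within the TTL.
        System.out.println(cache.get("home-page", now + 500, () -> "rendered later"));
    }
}
```

Under concurrent access two threads may both call the loader for the same expired key; that is usually acceptable for page-level data with low real-time requirements.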
API Gateway error handling
In the implementation of the Gateway, error handling is also very important. For Gateway error handling, you can use Hystrix to handle circuit breaking for requests. The onErrorReturn callback that comes with RxJava also makes it convenient to return error information. For the circuit breaker mechanism, the following aspects need to be dealt with:
Fault-tolerant processing of service requests
A well-designed Gateway should only be responsible for processing data flow and event flow, not business logic. When it handles requests to multiple microservices, a microservice request may time out or the service may be unavailable. In some scenarios, partial failures need to be handled gracefully. For example, in the "User Order List" example above, when an error occurs in the "User" microservice, it should not affect the request for "Order" data. The best way to deal with it is to return default data for the failed user information request, such as a default avatar and a default nickname, while returning the correct data for the order and product information requests that succeeded. If a key microservice request fails, for example when a microservice in the "Order" domain is abnormal, the client should be given an error code and a reasonable error message. This kind of processing improves the user experience as much as possible when part of the system is unavailable. When using RxJava, the concrete approach is to write onErrorReturn handlers that return fallback data compatible with the different client requests.
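The partial-failure handling described above can be illustrated with the JDK's CompletableFuture, whose exceptionally callback plays a role similar to RxJava's onErrorReturn. This is a sketch under assumptions: the futures passed in are hypothetical stand-ins for microservice calls, and the default nickname is made up for the example. A failed user call is replaced by default data, while a failed order call still fails the whole request.

```java
import java.util.concurrent.CompletableFuture;

class PartialFailureDemo {
    static final String DEFAULT_USER = "default-nickname";

    static CompletableFuture<String> page(CompletableFuture<String> userCall,
                                          CompletableFuture<String> orderCall) {
        // Non-critical data: fall back to a default value when the user service fails.
        CompletableFuture<String> safeUser = userCall.exceptionally(error -> DEFAULT_USER);
        // Critical data: a failure of the order call propagates to the combined result.
        return safeUser.thenCombine(orderCall, (user, orders) -> user + " / " + orders);
    }

    public static void main(String[] args) {
        CompletableFuture<String> failingUser =
                CompletableFuture.failedFuture(new IllegalStateException("user service down"));
        CompletableFuture<String> orders = CompletableFuture.completedFuture("orders[2]");
        System.out.println(page(failingUser, orders).join()); // prints "default-nickname / orders[2]"
    }
}
```

When the order call fails instead, join() throws a CompletionException, which the Gateway would translate into an error code and message for the client.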
Exception capture and recording
The Gateway is mainly responsible for forwarding and merging requests. To troubleshoot problems clearly and pin a problem down to a specific service, or even a specific Docker container, the Gateway needs to capture and record different types of exceptions and business errors. If you use FeignClient to request microservice resources, you can implement the ErrorDecoder interface to further filter the Response and record all request information in the log. If you are using Spring RestTemplate, you can define a customized RestTemplate, parse the returned ResponseEntity, and log error messages before returning the deserialized result object.
Timeout mechanism
Most of the Gateway's threads are IO threads. To prevent a blocked microservice request from leaving the Gateway with too many waiting threads and exhausting system resources such as thread pools and queues, the Gateway needs a timeout mechanism so that graceful service degradation can be applied to interfaces that time out.
Hystrix is integrated in SpringCloud's Feign project. Hystrix provides a relatively comprehensive circuit breaker mechanism for timeout handling, and the timeout mechanism is enabled by default. In addition to timeout-related configuration parameters, Netflix also provides Hystrix Dashboard for real-time monitoring based on Hystrix, and a clustered service only needs an additional Netflix Turbine deployment. For general Hystrix configuration items, please refer to Hystrix-Configuration.
If you are using RxJava's Observable for reactive programming and want different timeouts for different requests, you can pass the timeout duration and a fallback directly as parameters of the Observable's timeout() method.
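The same per-request timeout with a fallback can be shown with the JDK alone, using CompletableFuture.completeOnTimeout (available since Java 9). This is a sketch with hypothetical names rather than RxJava or Hystrix code: slowCall stands in for a microservice request that never answers, and "degraded" stands in for the degraded response.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class TimeoutDemo {
    // Hypothetical call that never completes on its own, like a hung downstream service.
    static CompletableFuture<String> slowCall() {
        return new CompletableFuture<>();
    }

    // Completes with the fallback value if the call has not finished within the timeout.
    static CompletableFuture<String> withFallback(CompletableFuture<String> call,
                                                  long timeoutMillis, String fallback) {
        return call.completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        System.out.println(withFallback(slowCall(), 50, "degraded").join()); // prints "degraded"
    }
}
```

Each request can be given its own timeout and fallback this way, so a slow non-critical service degrades on its own without holding up the whole response.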
Retry mechanism
For some key business operations, to make sure correct data is still returned when a request times out, the Gateway needs to provide a retry mechanism. If you use Spring Cloud Feign, its built-in Ribbon provides a default retry configuration, which can be turned off by setting spring.cloud.loadbalancer.retry.enabled=false. Ribbon's retry is triggered when a request times out on connect or on socket read. Besides switching retry on or off, you can also customize the timeout thresholds and the number of retries.
For applications that use Spring RestTemplate instead of Feign, you can use a customized RestTemplate to parse the returned ResponseEntity. If the request needs to be retried (for example, when a fixed-format error code identifies the retry strategy), intercept the request through an Interceptor and re-issue it through callbacks.
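The retry idea itself is independent of Ribbon or RestTemplate. The following is a minimal JDK-only sketch of a bounded retry with a fixed pause between attempts; the class name, method name and parameters are hypothetical and do not correspond to any Ribbon setting.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    // Calls the action up to maxAttempts times, pausing backoffMillis between
    // attempts, and rethrows the last failure if every attempt fails.
    static <T> T callWithRetry(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last;
    }
}
```

Retries are only safe for idempotent requests, so in practice a helper like this would be applied to reads or to writes that the target service can deduplicate.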
In a microservice architecture, an independent API Gateway provides unified request forwarding, merging and protocol conversion, and can adapt more flexibly to the request data of different clients. Compatibility handling for different clients (for example, H5 and iOS display different data) and different versions can be kept inside the Gateway, which keeps the microservices pure: they only need to focus on the design of internal domain services and event processing.
The API gateway can also provide a degree of fault tolerance and service degradation for microservice requests. Implementing the API gateway with reactive programming makes thread synchronization and concurrency code simpler and easier to maintain. Requests to microservices can be unified through FeignClient, and the code becomes clearly layered. The figure below is an example of the request class hierarchy.
Client is responsible for integrating service discovery (self-registration using Eureka), load balancing, making requests, and obtaining ResponseEntity objects.
Translator converts the ResponseEntity into an Observable.
Adapter calls each Translator and uses Observable operators to merge the requested data streams. If several data sets need to be assembled, you can add an Assembler layer dedicated to converting DTO objects to Models.
Controller manages RESTful resources. Each Controller only calls a single Adapter method.
The previous part introduced service discovery, service communication and the API Gateway, which gives a first picture of the overall microservice architecture. In real development, testing and production environments, using Docker to implement microservices makes the cluster's network environment more complex. A microservice architecture means that many container services need to be managed, and each microservice should be independently deployed, scaled, and monitored. The following continues with how to do continuous integration and deployment (CI/CD) for Docker microservices.
Image registry
To deploy microservices with Docker, you need to package each microservice into a Docker image, just as you would package an application into a war file to deploy it on a web server. The difference is that the Docker image runs in a Docker container.
For a Spring Boot service, the embedded Apache Tomcat server, the compiled Java application and the Java runtime are packaged together directly into the Docker image.
To manage image packaging and distribution (pull/push) in a uniform way, enterprises generally need to set up their own private image registry. This is simple to do: the containerized version of Docker's registry (Registry 2) can be deployed directly on a server. The latest version is currently V2.
Code repository
Managing code commits, rollbacks and so on is also part of a project's continuous integration. A company generally needs a private code repository as well, using a version control tool such as SVN or Git.
Our company currently uses GitLab, and installing and deploying it through GitLab's Docker image is very convenient. For specific steps, please refer to docker gitlab install. To speed up building and packaging, GitLab and the Registry can also be deployed on the same server.
Project build
In a Spring Boot project, the build tool can be Maven or Gradle. Gradle is more flexible than Maven, and since Spring Boot itself minimizes configuration, Groovy-based Gradle is the better fit. Its DSL is also more concise and efficient than XML.
Because Gradle supports custom tasks, once the Dockerfile of a microservice is written you can use a Gradle task script to build and package it into a Docker image.
There are also open source Gradle tools for building Docker images, such as Transmode's gradle-docker plug-in. Besides building Docker images for sub-projects (single microservices), it can also push the images to a remote image registry. On the build machine in the production environment, a single command can then build the project, package the Docker image, and push the image.
Container Orchestration Technology
Once Docker images are built, each container runs a different microservice instance, so services are deployed in isolation from one another. With orchestration technology, DevOps can manage container deployment and monitoring in a lightweight way and improve the efficiency of container management.
Some common configuration management tools, such as Ansible, Chef, and Puppet, can also orchestrate containers. But none of them is built specifically for containers, so you need to write some scripts yourself and combine them with Docker commands. Ansible, for example, can deploy and manage cluster containers very conveniently, and its team now provides a dedicated container integration solution: Ansible Container.
A cluster management system treats the hosts as a resource pool and decides which host to schedule each container onto based on the container's resource requirements.
Currently, the more mature technologies surrounding the scheduling and orchestration of Docker containers include Google's Kubernetes (hereinafter referred to as k8s), Mesos combined with Marathon to manage Docker clusters, and Docker Swarm, which is officially provided in Docker version 1.12.0 and above. Orchestration technology is one of the focuses of container technology. Choosing a container orchestration technology that suits your team can also make operation and maintenance more efficient and automated.
Docker Compose
Docker Compose is a simple Docker container orchestration tool. The applications to run are configured in a YAML file, and the containers for multiple services are then started with the docker-compose up command. Compose is not bundled with Docker and needs to be installed separately.
Compose can be used for continuous integration of microservice projects, but it is not suitable for container management of large clusters. In large clusters, Compose can be combined with Ansible for cluster resource management and service governance.
For situations where there are not many servers in the cluster, Compose can be used. The main steps are:
Define the Dockerfile of each service according to the microservice's runtime environment.
Write the docker-compose.yml file based on the service image, ports, runtime variables, etc., so that the services can be deployed and run together.
Run the docker-compose up command to start the container instances in the foreground. To run them as background processes, use docker-compose up -d.
Docker Swarm
In 2016, Docker 1.12 was released with the built-in Docker swarm mode, which needs no additional plug-ins or tools. Evidently the Docker team has been paying attention to service orchestration since last year and, through the built-in swarm mode, wants to take a share of the orchestration market.
If the team starts using a new version of Docker, you can choose Docker swarm mode for clustered container scheduling and management. Swarm also supports rolling updates, transport layer security encryption between nodes, load balancing, etc.
For examples of using Docker Swarm, please refer to the article I wrote before: Using docker-swarm to build a continuous integration cluster service.
Kubernetes
Kubernetes is Google's open source container cluster management system, implemented in Go. It provides application deployment, maintenance, scaling and other functions. k8s can currently be used on GCE, vSphere, CoreOS, OpenShift, Azure and other platforms, and in China Aliyun also provides a service management platform based on k8s. For a Docker cluster built on physical or virtual machines, you can also deploy and run k8s directly. In a microservice cluster environment, Kubernetes makes it easy to manage microservice container instances across machines.
Currently k8s is basically recognized as one of the most powerful open source service management technologies. It mainly provides the following functions:
Automated deployment and replication of Docker-based service instances
Runs as a cluster and manages containers across machines, with rolling upgrades and storage orchestration
Built-in Docker-based service discovery and load balancing
A powerful self-healing mechanism that repairs or replaces crashed containers (without users or even the development team noticing), plus scaling up or down at any time, which makes container management more flexible
k8s mainly completes the management of elastic container clusters through the following important components:
Pod is the smallest unit Kubernetes manages. One or more containers run in a pod. Pods are short-lived: a pod dies when scheduling fails, its node crashes, or its resources are reclaimed.
Label is a key/value pair that can be attached to pods. It is mainly used to tag pods and group services. Microservices identify their Pods through label selectors (Selectors).
Replication Controller is a core component on the k8s Master node. It ensures that the specified number of pod replicas is running in the Kubernetes cluster at any time. In other words, it provides the self-healing mechanism, and it is also used for scaling in and out and for rolling upgrades.
Service is an abstraction over a group of Pods and the policy for accessing them, and is also one of the basic elements managed by k8s. A Service identifies its group of Pods through Labels. When it is created, a cluster-local DNS record for the Service is also created. A client therefore resolves the Service through DNS, and the request is then forwarded to one of the currently available Pods by the kube-proxy running on each Node. This layer of load balancing is transparent, but the current k8s load balancing strategy is not very sophisticated, and the default is random.
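To make the relationship between Labels, Selectors and the random forwarding described above concrete, here is a small conceptual model in plain Java. It is not Kubernetes code and does not use any Kubernetes API; the class, the pod names and the labels are hypothetical, and the real kube-proxy is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class LabelSelectorSketch {
    // A pod matches when it carries every key/value pair named in the selector.
    static boolean matches(Map<String, String> podLabels, Map<String, String> selector) {
        return podLabels.entrySet().containsAll(selector.entrySet());
    }

    // Returns the names of all pods whose labels satisfy the selector.
    static List<String> select(Map<String, Map<String, String>> labelsByPod,
                               Map<String, String> selector) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> pod : labelsByPod.entrySet()) {
            if (matches(pod.getValue(), selector)) {
                selected.add(pod.getKey());
            }
        }
        return selected;
    }

    // Models the "default is random" strategy: pick any one of the selected pods.
    static String pickRandom(List<String> pods, Random random) {
        return pods.get(random.nextInt(pods.size()));
    }
}
```

A Service with the selector app=order would, in this model, cover every pod labeled app=order regardless of its other labels, and each request would land on one of them at random.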
In a microservice architecture, a suitable continuous integration tool can greatly improve the team's operations and development efficiency. Tools like Jenkins do have continuous integration plug-ins for Docker, but these still have many shortcomings. It is therefore recommended to choose Swarm, k8s, or Mesos, which are built specifically for Docker container orchestration, or to combine several technologies, such as Jenkins for CI and k8s for CD.
Swarm, k8s, and Mesos each have their own strengths, and all support continuous deployment, management, and monitoring of containers. Mesos also supports data center management. Docker swarm mode extends the existing Docker API: by calling and extending the Docker Remote API, containers can be scheduled to run on specified nodes. Kubernetes is currently the most widely adopted orchestration technology, and many large companies have joined the k8s ecosystem. k8s is more flexible for scaling, maintaining and managing cluster applications, but its load balancing strategy is relatively coarse. Mesos focuses more on general-purpose scheduling and provides a variety of schedulers.
For service orchestration, you still have to choose what best suits your team. If the initial number of machines is small and the cluster environment is not complex, you can also use Ansible with Docker Compose, plus GitLab CI, for continuous integration.
Service cluster solution
When enterprises deploy and run microservice applications with Docker, whether they adopt a microservice architecture from the beginning or migrate from a traditional monolithic architecture, they all need to handle service scheduling, orchestration, monitoring and similar issues in complex clusters. The following mainly introduces how to use Docker more safely and efficiently in a distributed service cluster, and the aspects that need to be considered in architectural design.
Load Balancing
Load balancing here means load balancing within the cluster. For a pure server-side API, it refers to load balancing of the Gateway API; if Nginx is used, it refers to Nginx load balancing. We currently use Alibaba Cloud's load balancing service SLB. One of the main reasons is that it can be bound to the DNS domain name service. For a company that is just starting out, load balancing weights can be set through the web console, which makes partial releases, testing and verification, health check monitoring and so on more convenient. It is a suitable choice in terms of efficiency and operations cost.
If you build layer-7 load balancing yourself, for example with Nginx or HAProxy, you also need to make sure that the load balancing cluster itself is highly available and provides convenient cluster monitoring, blue-green deployment and similar functions.
Relational Database (RDBMS)
For microservices, the choice of storage technology depends mainly on the needs of the enterprise. To save costs, MySQL is generally used. When choosing the MySQL storage engine, InnoDB is recommended (MyISAM was the default before version 5.5). InnoDB handles concurrency more efficiently, and any gap in query performance can be made up through caching, search and similar solutions. MySQL's free options for data replication and backup include the binlog and mysqldump. However, automated backup and recovery and a monitorable data center still require a DBA or an operations team, and the cost is correspondingly higher. A start-up can also consider relying on the PaaS services of the larger cloud computing platforms at home and abroad.
Microservices are generally divided by business domain, so it is best to give each microservice its own database from the start. Whether tables need to be sharded requires detailed analysis of how each microservice's business domain is developing and how much data it holds. However, for models in relatively core domains, such as "orders", it is recommended to design and reserve the sharding fields in advance.
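Reserving a sharding field pays off because routing to a table then becomes a pure function of that field. The sketch below assumes, purely for illustration, that orders are sharded by user ID with a simple modulo strategy and that tables are named orders_0, orders_1 and so on; none of these choices come from the original setup.

```java
final class OrderShardRouter {
    // Maps the reserved sharding key to one of tableCount physical tables.
    // floorMod keeps the index non-negative even for negative keys.
    static String tableFor(long userId, int tableCount) {
        if (tableCount < 1) {
            throw new IllegalArgumentException("tableCount must be at least 1");
        }
        long index = Math.floorMod(userId, (long) tableCount);
        return "orders_" + index;
    }
}
```

With modulo routing, changing the table count later moves most rows to different tables, which is one reason to decide on the key and the count early for core models.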
KV model database (Key-Value-stores)
Redis is an open source key-value database. It is memory-based, performs very well as a cache, and also supports persistence. Redis has two main persistence methods. One is RDB, which takes point-in-time snapshots of the data set at specified intervals and writes them from memory to disk. RDB can lose some data, but it performs well. The other is AOF. Its write mechanism is somewhat similar to MySQL's binlog: the commands in the AOF file are all saved in the Redis protocol format. The two kinds of persistence can be enabled at the same time, and when Redis restarts it uses the AOF file first to restore data. Because persistence is optional, it can also be disabled.
In real scenarios, it is recommended to keep persistence enabled. For example, Redis works well for the currently popular SMS verification code checks. In a microservice architecture, Redis can also handle scenarios that need KV data structures, and such a lightweight storage solution fits the microservice emphasis on lightweight approaches.
In practice, we use Redis for both caching and persistent storage, and keep the two uses in separate databases.
In a Spring Boot project, spring-boot-starter-data-redis is used for the Redis connection and basic configuration, along with the rich data operation APIs provided by spring-data-redis.
In addition, for applications that require high throughput, you can consider Memcached for caching simple KV data. It is well suited to read-heavy workloads, but the data structure types it supports are limited.
Graph database
For storing social-network-style model data, a graph database is a more efficient and flexible choice than a relational database. Graph databases are also a kind of NoSQL, but unlike KV stores, what they store is mainly nodes (Node), directed relationships (Relationship), and properties (Property) on nodes and relationships.
If Java is your main development language for microservices, Neo4j is a good choice. Neo4j is a Java-based graph database that supports ACID and provides a rich Java API. In terms of performance, the locality of graph data makes traversals very fast, especially large-scale deep traversals, which is beyond the reach of multi-table joins in relational databases.
The following figure is an official Getting started data model example displayed using Neo4j's WebUI tool. The statement MATCH p=()-[r:DIRECTED]->() RETURN p LIMIT 25 in the example is written in Cypher, the query language provided by Neo4j.
In a project, you can integrate Spring Data Neo4j together with the Spring Boot starter spring-boot-starter-data-neo4j.
Document database
A widely used open source document-oriented database is MongoDB. MongoDB offers high availability, high scalability and flexible data structures, and is especially suited to storing JSON data. It fits models such as blog posts and comments well.
Search technology
During development, we often see people writing multi-table query SQL that is long, convoluted, and hard to maintain, or all kinds of subquery statements over multi-table joins. When a domain model has many such scenarios, it is time to consider adding a search solution. Don't use SQL to solve everything, especially query scenarios. Slow queries can sometimes even bring down the DB, and if DB monitoring is not in place, the problem may be hard to troubleshoot.
Elasticsearch is an open source real-time distributed search and analysis engine based on Apache Lucene. Springboot projects also provide integration methods: spring-boot-starter-data-elasticsearch and spring-data-elasticsearch.
To build a search cluster, you can use Docker. For specific construction methods, please refer to Building an Elasticsearch Cluster with Docker. For the integration of Springboot projects, please refer to Integrating Search Services in Springboot Microservices. So far, the latest version of Spring Data Elasticsearch has supported version 5.x of ES, which can solve many pain points of version 2.x.
If it is a small-scale search cluster, you can use three low-configuration machines and then use ES Docker to build it. You can also use some commercial versions of PaaS services. How to choose depends on the size and scenario of the team and business.
Besides ES, the other widely used open source search engine is Solr. Solr is also based on Lucene and focuses on text search, where ES is indeed not as strong. ES mainly focuses on distributed support and has a built-in discovery module, Zen, to maintain cluster state. Compared with Solr (which needs something like Zookeeper to run distributed), its deployment is also more lightweight. Besides analytical queries, ES can also be integrated with log collection, analysis and processing.
Message Queue
As mentioned in the previous part, a message queue is a good way to decouple communication between microservices. In a distributed cluster, it also provides the technical foundation for eventual consistency. Message queues can also be used for peak shaving, absorbing traffic spikes.
The comparison of message queues will not be repeated here. The company currently uses Alibaba Cloud's ONS. Because the use of message queues still requires high availability and easy management and monitoring, we chose a safe and reliable message queue platform.
Security Technology
Security is a fundamental concern in architecture design. The Internet environment is complex, and protecting the security of services is also a basic commitment to users. Security technology covers a wide range of topics. This article picks a few common issues and commonly used methods and introduces them briefly.
Service instance security
A distributed cluster is in itself a safeguard for service instances. When a server or a service instance has a problem, load balancing can forward requests to other available instances. However, many companies build their own data center, and only a single one. This layout is actually quite risky, because server backup and disaster recovery cannot be fully guaranteed. The worst case is when the database is in the same data center, with the primary and backup servers all together. Not only is security not guaranteed, but daily operations costs are also relatively high. You also need to pay attention to configuring firewall security policies.
If possible, try to use some highly available, highly scalable and stable IaaS platforms.
1. Prevent network attacks
There are currently several main network attacks:
SQL injection: the countermeasures differ depending on the persistence layer framework. If you use JPA, as long as you follow the JPA specification, you basically don't have to worry.
XSS attack: escape and validate parameters properly. For details, refer to XSS Prevention.
CSRF attack: verify the token and the Referer in the HTTP header information. For details, refer to CSRF Prevention.
DDoS attack: large-traffic DDoS attacks are generally mitigated with a high-defense (DDoS-protected) IP. You can also use the high-defense IP service of a cloud computing platform.
The above lists only a few common attacks. To learn more, see the REST Security Cheat Sheet. Network security is easily neglected by start-ups. If there is no dedicated operations security team, it is best to use a product such as Alibaba Cloud's Cloud Shield, which saves both worry and cost.
2. Use security protocols
It goes without saying that wherever the HTTP protocol is involved, whether in microservice communication over RESTful APIs, the CDN, or the DNS service, HTTPS should be used uniformly. An application of any size needs to guard against traffic hijacking, which otherwise gives users a very bad experience.
3. Authentication
Authentication of microservices was already covered in the API Gateway section. Besides the microservices themselves, the services we depend on, such as MySQL, Redis, Elasticsearch and Eureka, also need authentication enabled and should be accessed over the intranet wherever possible. Don't expose more ports to the outside than necessary. For the API Gateway, in addition to authentication, it is best for front ends to reach the API layer through an Nginx reverse proxy.
A monitoring system for container-based microservices faces a more complex network and service environment. How to collect logs and monitor services in a way that is minimally intrusive to the microservices and transparent to developers is something many microservice DevOps practitioners keep thinking about and experimenting with.
1. Collection of microservice logs
Monitoring the API layer of microservices requires tracking, collecting and analyzing the call path from the API Gateway to each microservice. With REST APIs, you can use Spring Web's OncePerRequestFilter to intercept and log every request. When collecting logs, it is also best to record each request's response time (RT).
Besides recording access and request information, API calls also need request tracing. If you only record the logs of each service and the Gateway separately, then when an exception appears in the Gateway log you cannot tell which container instance of which microservice caused it. Once the number of containers reaches a certain level, checking the logs of every container and service instance is impossible. A relatively simple solution is to append a uniquely identifiable Trace string containing container information to every log entry.
After the logs are collected, they still need to be analyzed. With the ELK stack, you can make flexible use of Elasticsearch's real-time distributed features: Logstash collects and parses the logs and syncs the data to Elasticsearch, and Kibana, on top of Logstash and Elasticsearch, provides a good web UI that makes log analysis easier and improves the visual management of log data.
For log collection at large data volumes, the message queue mentioned above should be used to improve collection performance. The optimized architecture is as follows:
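One possible shape for such a Trace string is sketched below. The format (service name, container name, random UUID) is an assumption chosen for illustration, as are the class and method names. The sketch also assumes the container name can be read from the HOSTNAME environment variable, which Docker normally sets to the container ID, and falls back to a placeholder when it is absent.

```java
import java.util.UUID;

final class TraceIdSketch {
    // Reads the container name from HOSTNAME (assumed to be set by Docker);
    // falls back to a placeholder when running outside a container.
    static String containerName() {
        String hostname = System.getenv("HOSTNAME");
        return (hostname == null || hostname.isEmpty()) ? "unknown-host" : hostname;
    }

    // Builds a trace string that is unique per request and names the container.
    static String newTraceId(String serviceName, String container) {
        return serviceName + "/" + container + "/" + UUID.randomUUID();
    }

    // Prefixes a log message with the trace string so entries can be correlated.
    static String logLine(String traceId, String message) {
        return "[trace=" + traceId + "] " + message;
    }
}
```

If the Gateway creates the trace string and passes it along with each downstream call, every service can prefix its own log entries with it, so one search over the collected logs shows the whole call path.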
2. Collection of basic service call logs
Collecting and analyzing the logs of all REST APIs of the microservices makes it possible to monitor request information.
Within the service, log collection and analysis of the performance of middleware and infrastructure calls (including Redis, Mysql, Elasticsearch, etc.) are also necessary.
For middleware log collection, we currently use dynamic proxies to intercept the basic cache and repository methods (including search and DB) called by the service and invoke a logging callback around them. The implementation can use the bytecode generation framework ASM. For injecting logic into methods, you can refer to the previously written article ASM (4): Use the Method component to dynamically inject method logic. If you find ASM code hard to maintain, you can also use Cglib, whose API is friendlier.
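The interception idea can be illustrated with the JDK's own java.lang.reflect.Proxy, used here as a simpler stand-in for ASM or Cglib. It only works for calls made through an interface, and the OrderRepository interface, the class name and the log format are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Consumer;

final class TimingProxySketch {
    // Hypothetical repository interface standing in for a cache, search or DB layer.
    interface OrderRepository {
        String findById(long id);
    }

    // Wraps target in a proxy that reports the duration of every call to the log callback.
    static <T> T withTiming(T target, Class<T> type, Consumer<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the target's own exception to the caller
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                log.accept(type.getSimpleName() + "." + method.getName()
                        + " took " + micros + "us");
            }
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }
}
```

Because the proxy implements the same interface as the real repository, the calling service code does not change; only the wiring decides whether the timed or the plain instance is injected.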
Five elements of architecture
Finally, let's review the technology stack we use to build the Docker microservice architecture against the five core elements of architecture:
High performance
Message queue, RxJava asynchronous concurrency, distributed cache, local cache, HTTP ETag cache, Elasticsearch for query optimization, CDN, etc.
Availability
Container service cluster, RxJava circuit breaker handling, service degradation, idempotent message handling, timeout mechanism, retry mechanism, distributed eventual consistency, etc.
Scalability
Server cluster scalability, container orchestration with Kubernetes, database sharding, linear scalability of NoSQL, search cluster scalability, etc.
Extensibility
Microservices based on Docker are born for extensibility!
Security
JPA/Hibernate, Spring Security, high-defense IP, log monitoring, HTTPS, Nginx reverse proxy, HTTP/2, etc.
Service cluster solutions are in fact fairly generic, whether the architecture is microservices or SOA. Docker is simply a convenient way to build some of the middleware clusters: a single docker ps command lists the running services, and upgrading basic services is also convenient.
The pursuit of excellent cluster architecture design is endless. From talking to many technical friends at startup companies, I find that most prefer to build, develop, and release services quickly, and worry that a microservice architecture would be more complicated and a waste of time. But microservices are themselves an excellent practice of the agile approach. When their business grows rapidly, these friends often run into the same awkward problems: splitting services, splitting databases and tables, and decoupling spaghetti-like synchronous code through messages. They want to optimize performance but do not know where to start.
This article is mainly about the selection and introduction of technical solutions for Docker’s microservice practice. Different businesses and teams may be suitable for different architectural systems and technical solutions.
As an architect, you should make a long-term layout based on the company's short-term and long-term strategic planning. At the very least, the basic architecture needs to be able to support three years of development, during which time new technologies can be continuously introduced, service upgrades and continuous code layer reconstruction can be carried out.
Perhaps it doesn't take long for an architect to build a complete system from scratch. The most important thing is to keep promoting Domain-Driven Design in the team, and to enable the team to follow Clean Code and practice agile development OvO.