


Preface
Times have changed.
In the past, data was mostly entered manually and transferred from terminals over dedicated network protocols to big iron in "glass houses". Now information is everywhere, but it is no longer neatly consolidated inside your company: we increasingly share data in a "flat" world with more channels for information sources and information that changes more frequently. With the emergence of concepts such as Web 2.0, Enterprise 2.0, and the Internet Service Bus, you may find that looking up a supplier's warehouse address in your own "glass house" is far less convenient than using Google Maps.
It seems that the shackles that once confined data have been broken one by one by the Internet. As IT practitioners, our job is still to give users the data they need through the means they prefer, so applications must withstand change of every kind: changes in the user interface that we worried about in the past, changes in calls between applications, changes in an application's internal logic, and the most fundamental change of all, arriving at an ever faster pace: change in the data itself.
The relational model tells us to describe the information world with two-dimensional tables, but that is rather unnatural. Take a look at a book, a home decoration plan, or the task breakdown of a project about to start: none of them fit comfortably into a two-dimensional table. And even if everything is sliced down to the last detail through entity-relationship modeling, a rapidly changing environment still forces a cascade of changes across data, application, and front-end interaction, where touching one part often affects the whole.
Many new-generation applications seem to have found a solution better suited to the new trend: XML, which organizes applications and user experience in a way closer to how we actually think. So, for enterprises, can the more fundamental work of organizing data also be carried out with XML thinking? It should be possible.
Coping with changes in the data entities themselves
Data entities have always been assumed to be the most stable part of an application. Whether we use design patterns or the various open-source development frameworks (including those frameworks themselves), we are all trying to adapt to change in the application itself. But what is the actual situation?
- The data entities we exchange often change with our own needs and those of our partners;
- The data entities our partners give us also change frequently;
- With the introduction of SOA and Enterprise 2.0 concepts, data entities are mashed up from multiple sources and are themselves repeatedly assembled and recombined;
- As the business grows more refined, our own employees keep asking for richer and more detailed information.
So the data entities once thought to be fixed earliest, according to requirements and design, now need constant adjustment under today's agile technology and business conditions. To adapt, we can work top-down, continually making the application itself more flexible; or we can attack the problem at the "root" and adopt data models that can continuously absorb these changes, such as the XML data model and the XML family of related technologies.
For example, when defining a customer entity, the following information may be enough initially, where ICustomer is the interface the application will program against, CUSTOMER is the representation in the relational schema, and the third form is the equivalent XML document:
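The original illustration is not reproduced in the text, so the following is a minimal sketch of the three forms under assumed field names (Name, Phone, and the identifiers below are placeholders, not the author's original schema):

```python
import xml.etree.ElementTree as ET

# The same initial customer entity in three forms (all names are assumed):
#   interface ICustomer { Id; Name; Phone }   -- what the application uses
#   table CUSTOMER(ID, NAME, PHONE)           -- the relational representation
# and, built below, the equivalent XML document:
customer = ET.Element("Customer", id="C001")
ET.SubElement(customer, "Name").text = "Zhang San"
ET.SubElement(customer, "Phone").text = "010-12345678"

xml_text = ET.tostring(customer, encoding="unicode")
assert "<Name>Zhang San</Name>" in xml_text
```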
It is not hard to see that although only the "contact information" at the end of the "customer" entity changed, the relational and XML models differ greatly in adaptability. The relational model must keep adding new relations to describe ever more refined data entities, while the hierarchical nature of the XML model lets it extend and expand itself as conditions change. Similar problems arise in real projects with information such as "education history" and "work experience". Under the relational model, if a customer wants to record a "secondment" for one stage of their work history and no field was reserved for it in the design, the information ends up squeezed into the "employer" field as a string suffix like "(seconded)"; the rigid data model obliterates the business semantics the data carries. The hierarchical model can instead describe it as a child node or attribute, so that what the relational model spreads across multiple relations (customer, education history, work experience, contact information) is concentrated inside a single data entity; extension information for each part (such as "working mode": secondment, exchange, short-term assignment) can likewise be described inside the entity; and external applications still see one "customer" entity. Data entities that stay closer to real business scenarios adapt more effectively to external change.
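A small sketch of the extension described above, under assumed element names: the "secondment" working mode becomes an attribute on a work-history node, and a new contact channel is grafted on later, all without touching any schema:

```python
import xml.etree.ElementTree as ET

customer = ET.fromstring("""
<Customer id="C001">
  <Name>Zhang San</Name>
  <WorkHistory>
    <Job employer="Acme Corp" mode="secondment"/>
  </WorkHistory>
</Customer>
""")

# Later, a whole new contact section can be added just as easily:
contacts = ET.SubElement(customer, "Contacts")
ET.SubElement(contacts, "Email").text = "zhang.san@example.com"

assert customer.find("WorkHistory/Job").get("mode") == "secondment"
assert customer.find("Contacts/Email").text == "zhang.san@example.com"
```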
So far we have discussed only a single data entity. When we move on to concrete business domain models, we often need several data entities collaborating to complete a business function. What about that situation? For example, an insurance policy requires the customer to supply, beyond the information above, personal health information and details of immediate family members (children, parents, spouse); the customer's credit information is obtained from other institutions; and different combinations of these entities serve different application areas within the enterprise. From the data-usage perspective, to keep the application as stable as possible the data entities should be stable, yet perhaps only the contact-information part of the customer data changes repeatedly. If the application depends directly on a combination of all these shifting factors, its stability is indeed hard to guarantee. The first effective step, starting from the source, is to ensure as far as possible that each application depends only on one specific entity. Here the hierarchical nature of XML shows its advantage again: we can freely combine this information according to different application themes.
In this way, each application faces a single unified data entity, while the underlying combination can be adjusted freely behind it.
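The free combination by theme can be sketched as follows (entity and field names are assumptions for illustration): separate customer, health, and credit entities are assembled into the one view the insurance application depends on.

```python
import xml.etree.ElementTree as ET

customer = ET.fromstring("<Customer id='C001'><Name>Zhang San</Name></Customer>")
health   = ET.fromstring("<Health smoker='no'/>")
credit   = ET.fromstring("<Credit score='720'/>")

# Assemble a per-application view: the insurance application sees one
# "PolicyApplicant" entity, while each constituent can evolve on its own.
applicant = ET.Element("PolicyApplicant")
for part in (customer, health, credit):
    applicant.append(part)

assert applicant.find("Customer/Name").text == "Zhang San"
assert applicant.find("Credit").get("score") == "720"
```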
Coping with the integration of data and content
The data entities above were discussed mostly in a centralized context, but beyond the conceptual design there is a concrete problem: how to actually "gather" them together, which is generally achieved through data integration.
(However, just as the word "architecture" is overused, "data integration" is also defined by different vendors as different combinations of concepts matching their own products: BI vendors try to portray it as synonymous with ETL; vendors of data exchange platforms describe it as products implementing the BizTalk Framework; for SOA product companies, data integration is mainly about providing data services under effective governance; and for some vendors it also covers business semantic composition, and so on.)
But as users, which issues of data integration should we focus on?
- Mapping relationships between data entities;
- Interconnecting data sources under various exchange protocols, industry data standards, and security-control constraints;
- Orchestrating the data exchange process;
- Validating and reconstructing data entities;
- Converting between data media and data carriers.
Although in theory all of these tasks could be completed in code, enterprise integration logic is growing more complex and changing faster. Even if you can modify code to cope with 1:N integration, an M:N situation soon exceeds it. Is there a simpler way? Speaking just at the logical level of "mapping":
- Object orientation teaches dependency inversion: depend on abstractions rather than concretions, for example on interfaces rather than concrete types;
- Design patterns teach that an Adapter is a good way to bridge incompatible interfaces.
So are there similar techniques in the data field? XML Schema plus XSLT may be an option.
The above is a conversion done to stay compatible with old and new customer entities. Similarly, if you need to perform the entity-aggregation operations above for different subjects, the same approach works: the adaptation relationship between Schemas is completed at the abstract data-definition (Schema) level through XSLT.
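As a rough sketch of this adaptation idea: Python's standard library has no XSLT engine (lxml provides one), so the mapping that an XSLT stylesheet would express declaratively is written here directly in code; all element names are assumptions.

```python
import xml.etree.ElementTree as ET

def adapt_old_to_new(old_xml: str) -> ET.Element:
    """Map the legacy flat customer layout onto the new nested one.
    In practice this mapping would live in an XSLT stylesheet applied
    between the two Schemas; here the same transformation is sketched
    in plain ElementTree code."""
    old = ET.fromstring(old_xml)
    new = ET.Element("Customer", id=old.get("id", ""))
    ET.SubElement(new, "Name").text = old.findtext("Name")
    contacts = ET.SubElement(new, "Contacts")
    # Old flat <Phone> field becomes a node nested under <Contacts>:
    ET.SubElement(contacts, "Phone").text = old.findtext("Phone")
    return new

new = adapt_old_to_new(
    "<Customer id='C001'><Name>Li Si</Name><Phone>555-0100</Phone></Customer>")
assert new.find("Contacts/Phone").text == "555-0100"
```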
We have now seen how data is aggregated at the data-entity level, but one earlier problem remains: vehicle information, credit information, and legacy-system customer information live, respectively, in relational databases and in a partner's Web Service. How do we open these data channels? Here, too, XML is a good choice.
Data on different media can be extracted in its original form, whether plain text, relational database rows, EDI messages, or SOAP messages, passed through different channels to a data integration aggregation point, and then converted through an adapter according to the needs of the destination data source.
If a point-to-point adapter is built for every pair of formats, the overall scale grows on the order of N^2. Instead, unify everything into XML carrying the same information, perform the mapping between data entities with the XSLT technique introduced above, and then convert the XML into the form the target data source requires; the complexity of the whole adaptation system drops to the order of N.
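The hub-and-spoke idea can be sketched with two toy formats (CSV in, pipe-delimited out; both invented for illustration): each source needs only one inbound adapter into the shared canonical XML, and each target only one outbound adapter, giving roughly N + N adapters instead of N * (N - 1) point-to-point converters.

```python
import xml.etree.ElementTree as ET

def from_csv(line: str) -> ET.Element:
    """Inbound adapter: toy CSV record -> canonical customer XML."""
    cid, name = line.split(",")
    e = ET.Element("Customer", id=cid)
    ET.SubElement(e, "Name").text = name
    return e

def to_pipe(e: ET.Element) -> str:
    """Outbound adapter: canonical XML -> toy pipe-delimited record."""
    return f"{e.get('id')}|{e.findtext('Name')}"

canonical = from_csv("C001,Wang Wu")
assert to_pipe(canonical) == "C001|Wang Wu"
```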
Next, let's look at how XML technology meets the data integration requirements listed above:
- Mapping of data entities; conversion of data media and carriers; validation and reconstruction of data entities:
As above: data is first uniformly converted into XML, then processed using XML's hierarchical advantages together with XML-specific technologies.
- Interconnection of data sources under various exchange protocols, industry data standards, and security-control constraints:
XML data can not only cross networks and firewalls, but is also easy to use in an Internet environment (though you can still wrap it as messages over a message queue); because there is no special binary encoding, the data itself is not constrained by the exchange protocol. Various industry standards now describe their Data Models (DMs) in XML, and even if your internal system's entities do not conform to those DMs, for reasons of database design or historical legacy, the protocols and standards built around XML make conversion straightforward. As for security, no family of standards seems better suited to the Internet environment than the WS-* protocols, and all of them, without exception, use XML entities to define how data combines with additional security mechanisms.
- Orchestration of the data exchange process:
For homogeneous system environments, or platforms built only on compatible middleware, legacy workflow mechanisms can orchestrate the data exchange process; to suit the service-oriented era, the more general BPEL standard can be adopted. Here XML is no longer just data: it also appears as a form of execution instruction. Compared with Java technology, long advertised as cross-platform, an exchange process defined in XML is, furthermore, cross-language.
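A toy illustration of "XML as execution instructions": a simplified, BPEL-flavoured process definition (not real BPEL; element names are invented) driven by a generic engine that reads the document rather than hard-coding the steps.

```python
import xml.etree.ElementTree as ET

process = ET.fromstring("""
<process name="sync-customers">
  <step action="extract" source="crm"/>
  <step action="transform" stylesheet="crm-to-canonical"/>
  <step action="load" target="warehouse"/>
</process>
""")

log = []
for step in process.findall("step"):
    # A real engine would dispatch each action to a handler;
    # here we just record the execution plan in order.
    log.append(step.get("action"))

assert log == ["extract", "transform", "load"]
```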
Integration seems to solve many problems, but an obvious one remains: we may still have to implement much of the work ourselves and tell the application, step by step, what to do. When we stop treating the Web as merely a "new thing" and regard it instead as an interactive system serving our information content, how do we present all these scattered service capabilities to ourselves? This is where the advantage of XML's open metadata definitions really comes to light.
Coping with the complexity of the Semantic Web
Setting aside the various semantic algorithms, a key factor in aggregating scattered services into something that serves us is finding the backbone of the data: the clues that clarify the relationships between entities and the process of gradual decomposition and refinement along that backbone. Data at this level is not merely an object passively called by applications; it itself supports the application's further inference. For example:
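The food taxonomy discussed below, encoded as nested XML (the element and attribute names are assumptions): walking the hierarchy lets the application infer that anything under the "edible" root is edible.

```python
import xml.etree.ElementTree as ET

taxonomy = ET.fromstring("""
<category name="edible">
  <category name="fowl">
    <category name="dark-meat">
      <category name="goose"/>
    </category>
  </category>
</category>
""")

def is_edible(root: ET.Element, name: str) -> bool:
    # True if a node with this name sits anywhere under the "edible" root.
    return root.find(f".//category[@name='{name}']") is not None

assert is_edible(taxonomy, "goose")
assert not is_edible(taxonomy, "granite")
```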
Here, the application first learns that the object being processed is goose meat. Since goose is a kind of dark meat, dark meat is a kind of fowl, and fowl is edible, the application can infer step by step that goose is edible. The inference itself is not complicated, but implementing it over a relational database is relatively complex, and over plain text harder still; imagine expressing every relationship among poultry, vegetables, desserts, and seafood in tables or free text. XML is different: it can stay naturally close to our habits of thought and describe familiar semantics in an open yet interconnected way, whether for the bill-of-materials process in an enterprise ERP environment or the shopping plan for a birthday dinner.
Summary
Perhaps because we have been constrained by the two-dimensional grid for too long, our application designs and ideas are increasingly shaped by the needs of computer processing itself. But as the business environment changes, the time between a business requirement arising and an application going live keeps shrinking, and we need to pull our thinking back out of the computer. A data technology that is more open and closer to our divergent thinking then looks preferable. For organizing the data once it lands, we can continue to use various mature technologies; but at the business level, closer to this more volatile environment, XML appears both flexible and powerful.
The above is the detailed content of Detailed introduction to using XML thinking to organize data (picture). For more information, please follow other related articles on the PHP Chinese website!

