Home >Common Problem >What is the data structure of a computer?
Data structure is a way for computers to store and organize data. It refers to a collection of data elements that have one or more specific relationships with each other; it studies the logical structure of data and the physical structure of data. Interrelationships between them, define appropriate operations for this structure, design corresponding algorithms, and ensure that the new structure obtained after these operations still maintains the original structure type.
The operating environment of this tutorial: Windows 7 system, Dell G3 computer.
Data structure is the way a computer stores and organizes data. A data structure refers to a collection of data elements that have one or more specific relationships with each other. Often, carefully selected data structures can lead to higher operating or storage efficiency. Data structures are often related to efficient retrieval algorithms and indexing techniques.
Noun Definition
Data structure refers to a collection of data elements that have one or more relationships with each other and the relationship between the data elements in the collection. The relationship between them. Recorded as:
Data_Structure=(D,R)
where D is a set of data elements, and R is a finite set of relationships between all elements in the set.
(1) Commonly used structures
1. Array: In programming, for convenience of processing, groups of the same type are Several variables are organized in an ordered form. These collections of similar data elements arranged in order are called arrays. In C language, arrays are constructed data types. An array can be decomposed into multiple array elements, which can be basic data types or constructed types. Therefore, according to the type of array elements, arrays can be divided into various categories such as numerical arrays, character arrays, pointer arrays, and structure arrays.
2. Stack: It is a special linear list that can only be inserted and deleted at one end. It stores data according to the principle of first in, last out. The data that enters first is pushed to the bottom of the stack, and the last data is on the top of the stack. When data needs to be read, data is popped from the top of the stack (the last data is read out first).
3. Queue: A special linear table that only allows deletion operations at the front end of the table (front) and insertion operations at the back end (rear) of the table. The end that performs the insertion operation is called the tail of the queue, and the end that performs the deletion operation is called the head of the queue. Queues organize data according to the "first in, first out" or "last in, last out" principle. When there are no elements in the queue, it is called an empty queue.
4. Linked list: It is a non-continuous and non-sequential storage structure on a physical storage unit. It can represent either a linear structure or a non-linear structure. The logical order of data elements is through the linked list. The pointer link order in . A linked list consists of a series of nodes (each element in the linked list is called a node), and nodes can be dynamically generated at runtime. Each node consists of two parts: one is the data field that stores data elements, and the other is the pointer field that stores the address of the next node.
5. Tree: It is a finite set K containing n (n>0) nodes, and a relationship N is defined in K. N satisfies the following conditions:
(1) There is and is only one node K0. It has no predecessor for the relationship N. K0 is called the root node of the tree. Referred to as root.
(2) Except for K0, each node in K has one and only one predecessor for the relationship N.
(3) Each node in K can have m successors (m>=0) for the relationship N.
6. Graph: It is composed of a finite set V of nodes and a set E of edges. Among them, in order to distinguish it from the tree structure, nodes are often called vertices in graph structures, and edges are ordered pairs of vertices. If there is an edge between two vertices, it means that the two vertices have adjacent relation.
7. Heap: In computer science, a heap is a special tree data structure, each node has a value. Usually what we call the data structure of a heap refers to a binary heap. The characteristic of a heap is that the root node has the smallest (or largest) value, and the two subtrees of the root node are also a heap.
8. Hash table (also called hash table): If there is a record with a key equal to K in the structure, it must be at the storage location of f(K). Thus, the records being searched can be obtained directly without comparison. This correspondence f is called a hash function (Hash function), and the table built based on this idea is a hash table.
9. Eight sorting algorithms: Sorting algorithms can be divided into internal sorting and external sorting. Internal sorting is that data records are sorted in memory, while external sorting is because the sorted data is very large and cannot be accommodated all at once. Sorting records requires access to external storage during the sorting process. Common internal sorting algorithms include: insertion sort, Hill sort, selection sort, bubble sort, merge sort, quick sort, heap sort, radix sort, etc.
1. Logical structure of data: refers to the data structure that reflects the logical relationship between data elements. The logical relationship refers to the before and after relationship between data elements, and they are related to the computer The storage location is irrelevant.
The logical structure includes:
1) Set
There is no other relationship between the elements in the data structure except that they "belong to the same set";
2) Linear structure
The elements in the data structure have a one-to-one relationship;
3) Tree structure
The elements in the data structure have a one-to-many relationship;
4) Graphic structure
The elements in the data structure have many-to-many relationships.
2. Physical structure of data: refers to the storage form of the logical structure of data in the computer storage space.
The physical structure of data is the representation of the data structure in the computer (also known as the image), which includes the in-machine representation of data elements and the in-machine representation of relationships. Since the specific implementation methods include sequence, linking, indexing, hashing, etc., a data structure can be expressed as one or more storage structures.
In-machine representation of data elements (mapping method): Data elements are represented by bit strings of binary bits. This bit string is usually called a node. When a data element consists of several data items, the sub-bit string corresponding to each data item in the bit string is called the data field. Therefore, a node is an in-machine representation (or in-machine image) of a data element. On-machine representation of relationships (mapping method): The on-machine representation of relationships between data elements can be divided into sequential images and non-sequential images. There are two commonly used storage structures: sequential storage structures and chain storage structures. A sequential map represents the logical relationship between data elements by means of their relative positions in memory. Non-sequential images represent logical relationships between data elements with the help of pointers that indicate the storage locations of elements.
3. Data structure operations
It is generally believed that a data structure is composed of data elements organized according to some logical connection of. The description of the logical relationship between data elements is called the logical structure of data; data must be stored in the computer, and the storage structure of the data is the implementation form of the data structure and its representation in the computer; in addition, when discussing a data structure, we must also discuss Operations performed on this type of data are meaningful. A logical data structure can have multiple storage structures, and various storage structures affect the efficiency of data processing.
In the design of many types of programs, the choice of data structures is a fundamental design consideration. The construction experience of many large-scale systems shows that the difficulty of system implementation and the quality of system construction are seriously dependent on whether the optimal data structure is selected. Many times, once the data structure is determined, the algorithm is easy to come by. Sometimes things go the other way, and we choose a data structure to fit a specific algorithm. In either case, choosing the appropriate data structure is very important.
After choosing the data structure, the algorithm is also determined. Data, not algorithms, is the key factor in system construction. This insight has led to the emergence of many software design methods and programming languages, and object-oriented programming languages are one of them.
When a computer solves a specific problem, it generally needs to go through the following steps: First, it must abstract an appropriate mathematical model from the specific problem, then design an algorithm (Algorithm) to solve the mathematical model, and finally compile the program. Test and tweak until you get the final solution.
The essence of seeking a mathematical model is to analyze the problem, extract the operating objects, find out the relationships between these operating objects, and then describe them in mathematical language. When people use computers to deal with numerical calculation problems, the mathematical models used are described by mathematical equations. The operands involved are generally simple integer, real, and logical data, so the programmer's main focus is on programming skills rather than data storage and organization. However, more areas of computer application are "non-numerical computing problems". Their mathematical models cannot be described by mathematical equations, but by data structures. The key to solving such problems is to design appropriate data structures to describe non-numerical computing problems. The mathematical model of numerical problems is described by structures such as linear tables, trees, and graphs.
Computer algorithms are closely related to the structure of data. Algorithms are all dependent on specific data structures. Data structures are directly related to the selection and efficiency of algorithms. The operation is completed by the computer, which requires the design of corresponding insertion, deletion and modification algorithms. In other words, the data structure also needs to give algorithms for various operations defined by each structure type. A data object is a collection of data elements with the same properties and is a subset of data. Data objects can be finite or infinite. Data processing refers to the operation process of searching, inserting, deleting, merging, sorting, statistics and simple calculations on data.
(2) Structure classification
Data structure refers to the relationship between data elements in the same data element class. Data structures are respectively logical structure, storage structure (physical structure) and data operations. The logical structure of data is a mathematical model abstracted from specific problems. It describes the mathematical characteristics of data elements and their relationships. Sometimes the logical structure is simply referred to as the data structure. A logical structure is an image in computer storage, formally defined as (K, R) (or (D, S)), where K is a finite set of data elements and R is a finite set of relations on K.
According to the different characteristics of the relationship between data elements, there are usually the following four basic types of structures:
⑴ Set structure. The relationship between the data elements of this structure is "belonging to the same set".
⑵Linear structure. There is a one-to-one relationship between the data elements of this structure.
⑶Tree structure. There is a one-to-many relationship between the data elements of this structure
⑷Graphic structure. There is a many-to-many relationship between the data elements of this structure, also called a network structure. From the concept of data structure introduced above, we can know that a data structure has two elements. One is a collection of data elements and the other is a collection of relationships. Formally, a data structure can usually be represented by a tuple.
The form of the data structure is defined as: the data structure is a tuple: Data_Structure=(D, R), where D is a finite set of data elements and R is a finite set of relations on D. The characteristic of linear structure is that there is a linear relationship between data elements, and the data elements are "arranged one after another." The types of data elements in a linear table are the same, or a linear table is a linear structure composed of data elements of the same type. There are many examples of linear tables in practical problems. For example, the student status information table is a linear table: the type of the data elements in the table is student type; a string is also a linear table: the type of the data elements in the table is character type. etc.
Linear table is the simplest, most basic, and most commonly used linear structure. A linear table is a finite sequence of n (n>=0) data elements with the same data type, usually recorded as: (a1, a2,... ai-1, ai, ai 1,...an), where n is the table length , when n=0 it is called an empty list. It has two storage methods: sequential storage and chain storage. Its main basic operations are insertion, deletion and retrieval.
The representation (image) of the data structure in the computer is called the physical (storage) structure of the data. It includes the representation of data elements and the representation of relationships. There are two different representation methods of the relationship between data elements: sequential mapping and non-sequential mapping, and thus two different storage structures are obtained: sequential storage structure and chain storage structure.
Sequential storage method: It stores logically adjacent nodes in physically adjacent storage units. The logical relationship between nodes is reflected by the adjacency relationship of the storage units. From this, The storage representation is called a sequential storage structure. Sequential storage structure is the most basic storage representation method, usually implemented with the help of arrays in programming languages.
Link storage method: It does not require that logically adjacent nodes are also physically adjacent. The logical relationship between nodes is represented by additional pointer fields. The resulting storage representation is called a chained storage structure. The chained storage structure is usually implemented with the help of pointer types in programming languages.
Index storage method: In addition to establishing storage node information, additional Index table to identify the address of the node.
Hash storage method: It is to directly calculate the storage address of the node based on the node's keyword.
In the data structure, logically (logical structure: logical relationship between data elements), the data structure can be divided into linear structure and non-linear structure. The sequential storage structure of a linear structure is a sequential access storage structure, and the linked storage structure of a linear list is a random access storage structure. If a linear table is represented by chain storage, the storage unit addresses between all nodes can be continuous or discontinuous. The logical structure has nothing to do with the form, content, relative position, or number of nodes contained in the data element itself.
(3) Structural Algorithm
The design of the algorithm depends on the data (logical) structure, and the implementation of the algorithm depends on the storage structure used . The storage structure of data is essentially the realization of its logical structure in computer memory. In order to comprehensively reflect the logical structure of a data, its image in the memory includes two aspects, namely the information between data elements and the data elements. The relationship between. Different data structures have their corresponding operations. Data operations are operation algorithms defined on the logical structure of the data, such as retrieval, insertion, deletion, update, and sorting.
The data structure is different from the data type and the data object. It not only describes the data object of the data type, but also describes the relationship between the elements of the data object.
A data type is a collection of values and a set of operations defined on this value set. Data types can be divided into two categories: atomic types and structural types. On the one hand, in programming languages, every data belongs to a certain data type. Types explicitly or implicitly specify the value range of data, how it is stored, and the operations that are allowed. It can be considered that data types are data structures that have been implemented in programming. On the other hand, during the programming process, when a new data structure needs to be introduced, the data storage structure is always described with the help of the data types provided by the programming language. The smallest unit of data represented in a computer is one bit of a binary number, called a bit. We use a bit string formed by a combination of several bits to represent a data element. This bit string is usually called an element or node. When a data element consists of several data items, the sub-bit string corresponding to each data item in the bit string is called a data field. Elements or nodes can be seen as the image of data elements in the computer.
A software system framework should be built on data, not on operations. A software module containing abstract data types should contain three parts: definition, representation, and implementation.
For every data structure, there must be a set of operations closely related to it. If the type and number of operations are different, even if the logical structure is the same, the data structure can play different roles.
Different data structures have different sets of operations, but the following operations are indispensable:
1. Generation of the structure;
2. Destruction of the structure;
3. Find data elements that meet the specified conditions in the structure;
4. Insert new data elements into the structure;
5. Delete existing data elements in the structure;
6. Traverse.
Abstract data type: a mathematical model and a set of operations defined on the model. An abstract data type is actually the definition of this data structure. Because it defines a logical structure of data and a set of algorithms on this structure. Abstract data types can be represented by the following triplet: (D, S, P). D is a data object, S is a set of relationships on D, and P is a set of basic operations on D. The definition of ADT is: ADT abstract data type name: {data object: (set of data elements), data relationship: (data relationship tuple combination), basic operation: (list of operating functions)}; ADT abstract data type name; Abstract data types have two important characteristics:
Data abstraction: When using ADT to describe the entities processed by the program, the emphasis is on its essential characteristics, the functions it can complete, and its interface with external users (i.e. the way the outside world uses it).
Data encapsulation: Separate the external characteristics of an entity from its internal implementation details, and hide its internal implementation details from external users.
Data is the carrier of information, which can be recognized, stored and processed by computers. It is the raw material processed by computer programs, and applications process all kinds of data. In computer science, the so-called data is the object of computer processing. It can be numerical data or non-numeric data. Numerical data are some integers, real numbers or complex numbers, mainly used in engineering calculations, scientific calculations and business processing; non-numeric data includes characters, text, graphics, images, voices, etc.
Data Element (Data Element) is the basic unit of data. Under different conditions, data elements can also be called elements, nodes, vertices, records, etc. For example, a record in the student information table in the student information retrieval system is called a data element. Sometimes, a data element can be composed of several data items (Data Item). For example, each data element in the student information table in the student status management system is a student record. It includes data items such as student number, name, gender, place of origin, date of birth, and grades. These data items can be divided into two types: one is called elementary items, such as students' gender, place of origin, etc. These data items are the smallest units that cannot be divided during data processing; the other is called combination items, such as students' grades. It can be subdivided into smaller terms such as mathematics, physics, chemistry, etc. Typically, each student record is accessed and processed as a basic unit when solving practical application problems.
Data Object (Data Object) or Data Element Class (Data Element Class) is a collection of data elements with the same properties. In a specific problem, the data elements all have the same properties (element values are not necessarily equal), belong to the same data object (data element class), and the data element is an instance of the data element class. For example, in the traffic network of the traffic advisory system, all vertices are a data element class. Vertex A and vertex B each represent a city and are two instances of the data element class. The values of their data elements are A and B. Data structure refers to a collection of data elements that have one or more relationships with each other. In any problem, data elements will not be isolated. There will be some relationship between them. This relationship between data elements is called structure.
For more related knowledge, please visit the FAQ column!
The above is the detailed content of What is the data structure of a computer?. For more information, please follow other related articles on the PHP Chinese website!