Introduction to Data Structures - XALGORD - JOYK Joy of Geek, Geek News, Link all geek

READ ALSO

No Content Available

Table of Contents

Introduction

In an organization, the juniors generate data and seniors rearrange the data into information. The information is further processed by higher authorities to take decisions. The decision results in more information and so on. The information gets more and more refined and exact at each step.

In an organization, the data is transformed into the information the program or software uses.
A big problem or software is not written as a single program; instead, it is broken into smaller modules (may be called a function or procedure) and each module is developed independently.

Fig. 1.1 Hierarchical Organised Program

The main program utilizes the services of level 1.
Level 1 utilizes the services of level 2.
Here our concern is limited to “What it can do” and not “How it is done”.

“Organization of Data and Relationship Among its Participating Member is Called Data Structure.”

According to Nikaulus Wirth a program is made up of data structure and an algorithm as shown below

ALGORITHM + DATA STRUCTURE = PROGRAM

A data structure is a way of storing and organizing data in a computer so that it can be used efficiently. It refers to the scheme for organizing related piece of information.

Definition

“The organized collection of data in a mathematical or logical way is called a data structure.”

Study of data structure includes

Logical description of data.
Implementation detail of data structure.
Quantitative analysis of data structure.

A data structure is defined as a triplet (D, F, A) where “D” stands for the set of domains, “F” denotes the set of operations and “A” represents axioms defining the function in “F”.

Hence data structure is merely an instance.

Basic Terminologies

The various terms connected with the data structures are defined as below

Data means facts and figures. Data is a very valuable raw material that can be in different forms such as numbers, words, alphabets, diagrams, etc.

It is of two types:

Atomic data: Non-decomposable entity.

The data is not further subdivided. For example, value 848 or a character “X” can not be further subdivided. If we further divide the value 848 into three digits ‘8’, ‘4’, and ‘8’ then the meaning is lost.

Composite data: Decomposable data

It is the most fundamental data level which is used as a building block for all other data in a system.
It consists of several atomic data and hence it can be further subdivided.
This term is also called by other names such as Field, data item, or elementary item. For example, a DATE is made up of three parts: DAY, MONTH, YEAR.

Here DD, MM, and YY are the atomic data, and DATE is termed as a group or group of items. The term group items refer to a collection of data items combined together to form an object or entity which is referred to as composite data.

A group can be divided or decomposed into data elements but data elements can not be further subdivided.

Primitive Data Type

The data elements are called primitive data types and the group is called compound data types.

Integer: An integer is an integral or whole number without a decimal point. It may be preceded by a sign indicator (+ for positive, – for negative). If no sign is used, it may treat like positive numbers.
Some valid integers are:
-21009, -724, +52124, 376
Real: It is a number with a decimal point and can be written in two different forms.
(a) Decimal form or real data type: It is a signed or unsigned string of digits including a decimal point.
Some valid real numbers are: 523.24, -1923.17, 42.37.
(b) Exponential form: It is a floating-point format containing two parts: Mantissa and Exponent. The mantissa is the real number of decimal form the exponent part starts with the real data type. The exponent part starts with a letter ‘E’ which is followed by two digits wide integral number.
Example of some valid real data type: 529.23 E04.

2.0 E 08, 96.2E -4, etc.
Boolean: The term boolean is from Boolean Algebra. A boolean data can have two values either “TRUE” or “FALSE”. This type of data is used to make decisions and answer questions.
Character: It is a non-numeric data type consisting of a single alphanumeric character. The alphanumeric characters are compromised of
Alphabets : ‘A’ to ‘Z’ and ‘a’ to ‘z’
Digit : ‘0’ to ‘9’
Special Symbols : ‘+’, ‘-‘, ‘*’, ‘, &, $, ? etc.
Example of some valid character type : ‘A’, ‘B’, ‘8’, ‘$’, ‘&’
Character set is standardized by ANSI (American National Standard Institute) in the ASCII character set.

Constants

It is a memory location that does not change its value during the execution of the program. For example, the value of (π) Pie is a constant quantity.

Variable

It is the most fundamental aspect of any computer language and can be defined as “A location in the memory where a value can be stored”.

To understand, let us have a look at the following statement.

total = 500.25 net = total - 100.00

In statement (i), a value of 500.25 has been stored in a memory location called “total”. The variable “total” is used in the statement (ii) for the calculation of another variable “net”. The point here to note is that the variable total is used in the statement (ii) is by its name not by its value. Before a variable is used in a program, it has to be defined. This activity enables the compiler to make available the appropriate type of location in the memory.

Data object

These are the set of variables used in program.

Every data object is associated with the data type.
It is a place where the data value can be stored, retrieved, and manipulated.
It can also be viewed as a sub-time instance of a data structure.

Static and Dynamic Data Structure

Static: In the static data structure, memory for the object is allocated at the time of loading the program.

Example: int X[20]
Memory array ‘X’ of 20 elements is allocated at the time of loading of the program.

Dynamic: In the dynamic data structure the memory space required by the variable is calculated and allocated during the execution. Usually, the pointer is used for the dynamic data structure.

Example: A = (int *) malloc (50 * size of (int));

A linear data structure can be implemented through static or dynamic data structure but static data structure is preferred.
All linked structure is preferred through the dynamic data structure.
Dynamic data structure give re-usability of memory space.
It provides the flexibility in adding, deleting, or rearranging the data objects at the run time.
Unwanted space can be released at runtime.

Pointers

Pointers are the variables that hold the memory address of other variables.

There is a special operator in ‘C’ language for finding the address of a variable.
If X is variable, then &X is its address.
Value of variable can be manipulated through a pointer.
To declare the pointer in “C”, we use an asterisk “*” before the name of a variable. A pointer to an integer can be declared as int *p;
Value of variable can be accessed and manipulated through a pointer. If X is a variable and its address is stored in pointer P then *P gives the value of X i.e. int X, *P;
Addresses if X can be stored in P by using the “&” operator i.e. P = &X;

After execution of the statement, Pointer P will contain the address of X. We say P points to X.

Means pointer P points to X.

Types Of Data Structure

Data Structures are sometimes called data types. The complete hierarchy of data structure is represented as

There are mainly two types of data structures.

Linear Data Structure:
- A data structure whose elements form a proper sequence or have a sequential structure.
- It has a unique predecessor and unique successor.
- It is comprised of an ordered set of data elements. Let us consider a Linear List X having n elements as
  X = [X0, X1, X2 ….Xn-1]
- If n is zero, it is called an empty List.
- It can be implemented through static or dynamic structure but static data structure is preferred.
Example: Array, Link List, stack, Queues.
Non-Linear Data Structure:
- A data structure does not form a sequence or have a complex structure.
- There is no unique predecessor and unique successor.
- The operations like traversing, insertion, and deletion are not performed in a linear fashion on these data structures.
- These data structures are multi-level structures.
Example: Trees and graphs.

Other types of Data Structures are

Homogeneous:
Data values are of the same type and are usually stored in an array.
Non-Homogenous:
Data values are of different data types and are grouped in structure or union.

Linear Data Structure

Array

Definition:

An array is a group of the homogeneous (similar) type of data. Data is stored in contiguous locations.

An array whose elements are specified by a single subscript is called a one-dimensional array.
The array whose elements are specified by two or more subscripts is called a multi-dimensional array.

(1) Types of Array
(i) Linear Array or One Dimensional Array.
(ii) Non-Linear Array or Two Dimensional Array.

Linear Array: It is one dimension array or single dimension array. In this only one subscript is used. It is written either in row or column form.
- Let us suppose that an array A[N] has a finite number of N elements.
- It means the total number of elements in the array A and N. Then its element can be denoted as.
- A1, A2…An (Mathematical Notation).
- A[0], A[1]…A[n-1] (Subscript Notation used in ‘C’ Langauge).
- A[n] will be called a subscripted variable and n will be called an index or subscript value.

Memory Representation of Linear Array

2. Non-Linear Array: Two-dimensional array or two subscripted arrays are referred to as a non-linear array. These arrays are in both Row form and Column form.

Definition: It is a set of a finite number of homogeneous data elements and each element is accessed by two subscripts.

Linked List

A linked list is a dynamic data structure, which means that the memory to hold the data structure is created at run time (dynamic), and stored in a structured way (also called data structure).
It is a linear collection of data elements called nodes, where the linear order is given by means of pointers.
Each node consists of two fields, one containing the item and the other containing the address of the next item in the list. Each node contains two parts information part and link part.
The last node of the linked list contains a special link part called NULL Pointer (i.e. any invalid address).

Definition:

“A linked list is simply a chain of structures which contains a pointer to the next element.”

Fig. 1.2

Disadvantages:

More memory is required to store the size of the pointer field.
Processing time is more for accessing a unique data entry.

Stack

In computer science, a stack is a temporary abstract data type and data structure based on the principle of Last In First Out (LIFO). Stacks are used extensively at every level of modern computer systems.
In modern PC stacks are used at the architecture level, which is used in the basic design of an operating system for interrupt handling and operating system function calls.
Among other uses, stacks are used to run a Java Virtual Machine, and the Java language itself has a class called “Stack”, which can be used by the programmer.
A stack-based computer system is one that stores temporary information primarily in stacks, rather than hardware CPU registers (a register-based computer system).

Definition:

“Stack is a linear data structure that can be represented by using an array or linked list and is based on the methodology LIFO known as Last In First Out.”

Examples of Stacks are:

CD Rack
Plates in Cafeteria, etc.

There are two main operations applicable to the stack: (i) Push (ii) Pop

Push: An item is put into the Top of the stack. Increasing the size by one.
Pop: An item is deleted from the top of the stack. Decreasing the size by one.

Fig 1.3

Queue

A queue is a special type of data structure that is based on FIFO i.e. First In First Out.
It is a particular kind of collection in which the entities in the collection are kept in order and the Principle operations on the collection are the addition of entities to the rear terminal position and removal of entities from the front terminal position. This makes the queue a First-In-First-Out (FIFO) data structure.
In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that whenever an element is added, all elements that were added before have to be removed before the new element can be invoked. A queue is an example of a linear data structure.
Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer. Queues are common in computer programs, where they are implemented as data structures coupled with access routines, as an abstract data structure, or in object-oriented languages as classes.
Examples of Queue are 1. Ticket Window 2. Cars in Line at a gas station.
The defining attribute of queue data structure is the fact that allows access to only the front and back of the structure. Furthermore, elements can only be removed from the front and can only be added to the back.
Example of queues is people traveling up an escalator, machine parts on an assembly line, or cars in line at a gas station. The recurring theme is clear: queues are essentially waiting lines.
In each of the cases, the customer or object at the front of the line was the first one to enter, while at the end of the, is last to have entered. Every time a customer finishes paying for their items (or a person steps off the escalator or the machine part is removed from the assembly line, etc.) the object leaves the queue from the front.
The queue “size” function would return the length of the line, and the “empty” function would return true only if there was nothing in the line.

Non Linear Data Structure

A data structure whose elements do not form a sequence.
No unique predecessor and sequence.

Trees

Computer Scientists have developed a complicated data structure called a tree to deal with the inherent deficiencies of the linked lists.
The linked list here proved to be a somewhat inefficient data structure for undertaking certain operations such as binary search as they are unable to provide access to the middle of the list necessitating search from the beginning of the list.
Many mathematical expressions can be represented by the trees.
The trees have proved to be very useful in organizing databases and analyzing electrical circuits.
Trees are also very useful in organizational charts. A tree contains no loops and no cycles.
There is no more than one edge between any pair of nodes.
A finite non-empty set of elements consisting of vertices and edges but without any cycle.
Trees are simply another way of arranging and sorting the data.
Trees get their name because the general shape of the structure (if you draw it out) resembles a tree.
All of the elements in the tree are called a node. Just like a family tree, there is one node from which all the other nodes descend. This is the root node. Each node of the descendants can also have descendants. I other words, each child of the root can be seen as the root of its tree. It is in this way that a tree is naturally recursive.
It is a hierarchal structure of data.
It is a graph without a cycle.

Fig. 1.4 (a)

Fig. 1.4 (b)

Another way of defining the tree is recursive in the sense that every root of a tree points to the root of another tree.
A tree can be equated with a data structure consisting of a series of nodes connected to a common root, the origin of the tree.
Another example of tree data structure can be windows operating system in which file system is implemented in the form of tree as shown in figure 1.4 (b).

Graph

A Graph is a special kind of Tree in which relationships between parent and child are hierarchical.
There is a cyclic relationship.
A Graph data structure consists mainly of a finite set of ordered pairs, called edges or arcs of certain entities called nodes.
A graph is an abstract notion of a set of nodes (vertices or points) and connection relations (edges or arcs) between them,
It is an important non-linear data structure.

Fig. 1.5 (a), Fig. 1.5 (b), Fig. 1.5 (c),

A graph is defined as “a set V of elements called nodes or points or verticals and a set E of edges such that edge e in E is identified with a unique pair [u, v] of unordered nodes in V. It is denoted by e = [u, v]”.
A simple graph can be thought of as a triple G = (V, E, I), where V and E are disjoint finite sets and I is an incidence relation such that every element of E is incident with exactly two distinct elements of V and no two elements of E are incident to the same pair of elements of V. We call V the vertex set and E the edge set of G. We can also write graph parts as G = (V, E). A graph G = (V, E) is a finite non-empty set V of objects called vertices (the singular is vertex) together with a (possibly empty) set E of unordered pair of distinct vertices of G called edges.
A graph can be better understood with the help of the following example.

Suppose a person wants to reach the Patiala from the Firozpur. Then he can follow the different routes to reach like.

In the given example the vertices are the name of location which is represented by points or circles. The edges are the roads connecting the Firozpur and Patiala that are represented by the line segment or arc. From the graph we observe that the various routes are:

Firozpur -> Jalandhar -> Ludhiana -> Patiala.
Firozpur -> Moga -> Patiala.
Firozpur -> Bhatinda -> Sangrur -> Patiala.

Fig. 1.6

Graphs can be maintained in the computer’s memory by using the adjacency Matrix or using a Link representation called an adjacency list.

Common Operations on Data Structure

A data structure is a collection of data elements arranged in either mathematical or logical form. The different operations which can be performed on the data structure are:

Traversing

It means to see or visit each and every element at least once. The algorithm used to traverse an array is

For i = 1 to 5
Process a[i]
i <- i + 1
End for

Insertion

Addition of a new record to a data structure. Insertion can either be at the beginning, end, or in between of existing data structure.

Deletion

Removing an existing record from the data structure.

Example: In ‘C’ programming language, you have used the free() function to free the storage of a dynamically allocated variable using the malloc() function.

Syntax: free(ponter_variable)

occupied calling free(p) makes the storage occupied by point p available for reuse if required.

Searching

Finding the specified record(s) in a given data structure on the basis of a given key value. There are two major methods used for searching, known as Linear search and Binary search.

Selection

It accesses the data with in the data structure.

Sorting

Arranging the record on the basis of key-value in a specified order. It is used to arrange data in some logical order either in increasing order or decreasing order. The commonly used algorithm for sorting are Bubble sort, Quick sort, Merge sort, Heap sort, Insertion sort, etc.

Merging

Combining the records of two similar sorted data structure to form a new data structure of same type.

Splitting

In this operation, records in a large file are split into smaller files for processing.

Copying

This data structure is used to copy the contents of one object to another object.

Concatenation

Selecting the records of two similar data structures and combining them.

Applications of Data Structure

The various applications of a data structure are:

It describes the physical and logical relationship between the data items.
It provides the method of representing the data elements in the memory of computer efficiency.
It gives considerable help in compiler design.
It helps in statistical analysis.
It is very useful in simulation.
We can solve various numerical problems with the help of data structure.
It provides considerable help in managing the operating system.
It helps to find the shortest path between two nodes in a network and to analyze the network completely.
It helps a lot in graphic applications.

Advantages of Data Structure

Data structure helped us in understanding various data types.
It helped us in defining the various operations on data.
Useful in solving complex programming problems.
A suitable data structure can be selected to suit an algorithm.

Introduction to Data Structures - XALGORD

READ ALSO

Introduction

Definition

Basic Terminologies

Atomic data: Non-decomposable entity.

Composite data: Decomposable data

Primitive Data Type

Constants

Variable

Data object

Static and Dynamic Data Structure

Pointers

Types Of Data Structure

Linear Data Structure

Array

Linked List

Stack

Queue

Non Linear Data Structure

Trees

Graph

Common Operations on Data Structure

Traversing

Insertion

Deletion

Searching

Selection

Sorting

Merging

Splitting

Copying

Concatenation

Applications of Data Structure

Advantages of Data Structure

Recommend

About Joyk