fis company ( call based 2016,2017 batch )
drive date 19-4-2018
Address: 2nd Floor, Sjr Ipark, Epip Zone, Whitefield Main Road, Bengaluru, Karnataka 560066
A Programmer’s approach of looking at Array vs. Linked List
In general, array is considered a data structure for which size is fixed at the compile time and array memory is allocated either from Data section (e.g. global array) or Stack section (e.g. local array).
Similarly, linked list is considered a data structure for which size is not fixed and memory is allocated from Heap section (e.g. using malloc() etc.) as and when needed. In this sense, array is taken as a static data structure (residing in Data or Stack section) while linked list is taken as a dynamic data structure (residing in Heap section). Memory representation of array and linked list can be visualized as follows:
An array of 4 elements (integer type) which have been initialized with 1, 2, 3 and 4. Suppose, these elements are allocated at memory addresses 0x100, 0x104, 0x108 and 0x10C respectively.
[(1)] [(2)] [(3)] [(4)]
0x100 0x104 0x108 0x10C
A linked list with 4 nodes where each node has integer as data and these data are initialized with 1, 2, 3 and 4. Suppose, these nodes are allocated via malloc() and memory allocated for them is 0x200, 0x308, 0x404 and 0x20B respectively.
[(1), 0x308] [(2),0x404] [(3),0x20B] [(4),NULL]
0x200 0x308 0x404 0x20B
Anyone with even little understanding of array and linked-list might not be interested in the above explanation. I mean, it is well know that the array elements are allocated memory in sequence i.e. contiguous memory while nodes of a linked list are non-contiguous in memory. Though it sounds trivial yet this the most important difference between array and linked list. It should be noted that due to this contiguous versus non-contiguous memory, array and linked list are different. In fact, this difference is what makes array vs. linked list! In the following sections, we will try to explore on this very idea further.
Since elements of array are contiguous in memory, we can access any element randomly using index e.g. intArr will access directly fourth element of the array. (For newbies, array indexing start from 0 and that’s why fourth element is indexed with 3). Also, due to contiguous memory for successive elements in array, no extra information is needed to be stored in individual elements i.e. no overhead of metadata in arrays. Contrary to this, linked list nodes are non-contiguous in memory. It means that we need some mechanism to traverse or access linked list nodes. To achieve this, each node stores the location of next node and this forms the basis of the link from one node to next node. Therefore, it’s called Linked list. Though storing the location of next node is overhead in linked list but it’s required. Typically, we see linked list node declaration as follows:
/* nextNode is the pointer to next node in linked list*/
struct llNode * nextNode;
Run on IDE
So array elements are contiguous in memory and therefore not requiring any metadata. And linked list nodes are non-contiguous in memory thereby requiring metadata in the form of location of next node. Apart from this difference, we can see that array could have several unused elements because memory has already been allocated. But linked list will have only the required no. of data items. All the above information about array and linked list has been mentioned in several textbooks though in different ways.
What if we need to allocate array memory from Heap section (i.e. at run time) and linked list memory from Data/Stack section. First of all, is it possible? Before that, one might ask why would someone need to do this? Now, I hope that the remaining article would make you rethink about the idea of array vs. linked-list 🙂
Now consider the case when we need to store certain data in array (because array has the property of random access due to contiguous memory) but we don’t know the total size apriori. One possibility is to allocate memory of this array from Heap at run time. For example, as follows:
/*At run-time, suppose we know the required size for integer array (e.g. input size from user). Say, the array size is stored in variable arrSize. Allocate this array from Heap as follows*/
int * dynArr = (int *)malloc(sizeof(int)*arrSize);
Run on IDE
Though the memory of this array is allocated from Heap, the elements can still be accessed via index mechanism e.g. dynArr[i]. Basically, based on the programming problem, we have combined one benefit of array (i.e. random access of elements) and one benefit of linked list (i.e. delaying the memory allocation till run time and allocating memory from Heap). Another advantage of having this type of dynamic array is that, this method of allocating array from Heap at run time could reduce code-size (of course, it depends on certain other factors e.g. program format etc.)
Now consider the case when we need to store data in a linked list (because no. of nodes in linked list would be equal to actual data items stored i.e. no extra space like array) but we aren’t allowed to get this memory from Heap again and again for each node. This might look hypothetical situation to some folks but it’s not very uncommon requirement in embedded systems. Basically, in several embedded programs, allocating memory via malloc() etc. isn’t allowed due to multiple reasons. One obvious reason is performance i.e. allocating memory via malloc() is costly in terms of time complexity because your embedded program is required to be deterministic most of the times. Another reason could be module specific memory management i.e. it’s possible that each module in embedded system manages its own memory. In short, if we need to perform our own memory management, instead of relying on system provided APIs of malloc() and free(), we might choose the linked list which is simulated using array. I hope that you got some idea why we might need to simulate linked list using array. Now, let us first see how this can be done. Suppose, type of a node in linked list (i.e. underlying array) is declared as follows:
/*Here, note that nextIndex stores the location of next node in
struct sllNode arrayLL;
Run on IDE
If we initialize this linked list (which is actually an array), it would look as follows in memory:
[(0),-1] [(0),-1] [(0),-1] [(0),-1] [(0),-1]
0x500 0x508 0x510 0x518 0x520
The important thing to notice is that all the nodes of the linked list are contiguous in memory (each one occupying 8 bytes) and nextIndex of each node is set to -1. This (i.e. -1) is done to denote that the each node of the linked list is empty as of now. This linked list is denoted by head index 0.
Now, if this linked list is updated with four elements of data part 4, 3, 2 and 1 successively, it would look as follows in memory. This linked list can be viewed as 0x500 -> 0x508 -> 0x510 -> 0x518.
[(1),1] [(2),2] [(3),3] [(4),-2] [(0),-1]
0x500 0x508 0x510 0x518 0x520
The important thing to notice is nextIndex of last node (i.e. fourth node) is set to -2. This (i.e. -2) is done to denote the end of linked list. Also, head node of the linked list is index 0. This concept of simulating linked list using array would look more interesting if we delete say second node from the above linked list. In that case, the linked list will look as follows in memory:
[(1),2] [(0),-1] [(3),3] [(4),-2] [(0),-1]
0x500 0x508 0x510 0x518 0x520
The resultant linked list is 0x500 -> 0x510 -> 0x518. Here, it should be noted that even though we have deleted second node from our linked list, the memory for this node is still there because underlying array is still there. But the nextIndex of first node now points to third node (for which index is 2).
Hopefully, the above examples would have given some idea that for the simulated linked list, we need to write our own API similar to malloc() and free() which would basically be used to insert and delete a node. Now this is what’s called own memory management. Let us see how this can be done in algorithmic manner.
There are multiple ways to do so. If we take the simplistic approach of creating linked list using array, we can use the following logic. For inserting a node, traverse the underlying array and find a node whose nextIndex is -1. It means that this node is empty. Use this node as a new node. Update the data part in this new node and set the nextIndex of this node to current head node (i.e. head index) of the linked list. Finally, make the index of this new node as head index of the linked list. To visualize it, let us take an example. Suppose the linked list is as follows where head Index is 0 i.e. linked list is 0x500 -> 0x508 -> 0x518 -> 0x520
[(1),1] [(2),3] [(0),-1] [(4),4] [(5),-2]
0x500 0x508 0x510 0x518 0x520
After inserting a new node with data 8, the linked list would look as follows with head index as 2.
[(1),1] [(2),3] [(8),0] [(4),4] [(5),-2]
0x500 0x508 0x510 0x518 0x520
So the linked list nodes would be at addresses 0x510 -> 0x500 -> 0x508 -> 0x518 -> 0x520
For deleting a node, we need to set the nextIndex of the node as -1 so that the node is marked as empty node. But, before doing so, we need to make sure that the nextIndex of the previous node is updated correctly to index of next node of this node to be deleted. We can see that we have done own memory management for creating a linked list out of the array memory. But, this is one way of inserting and deleting nodes in this linked list. It can be easily noticed that finding an empty node is not so efficient in terms of time complexity. Basically, we’re searching the complete array linearly to find an empty node.
Let us see if we can optimize it further. Basically we can maintain a linked list of empty nodes as well in the same array. In that case, the linked list would be denoted by two indexes – one index would be for linked list which has the actual data values i.e. nodes which have been inserted so far and other index would for linked list of empty nodes. By doing so, whenever, we need to insert a new node in existing linked list, we can quickly find an empty node. Let us take an example:
[(4),2] [(0),3] [(5),5] [(0),-1] [(0),1] [(9),-1]
0x500 0x508 0x510 0x518 0x520 0x528
The above linked list which is represented using two indexes (0 and 5) has two linked lists: one for actual values and another for empty nodes. The linked list with actual values has nodes at address 0x500 -> 0x510 -> 0x528 while the linked list with empty nodes has nodes at addresses 0x520 -> 0x508 -> 0x518. It can be seen that finding an empty node (i.e. writing own API similar to malloc()) should be relatively faster now because we can quickly find a free node. In real world embedded programs, a fixed chunk of memory (normally called memory pool) is allocated using malloc() only once by a module. And then the management of this memory pool (which is basically an array) is done by that module itself using techniques mentioned earlier. Sometimes, there are multiple memory pools each one having different size of node. Of course, there are several other aspects of own memory management but we’ll leave it here itself. But it’s worth mentioning that there are several methods by which the insertion (which requires our own memory allocation) and deletion (which requires our own memory freeing) can be improved further.
If we look carefully, it can be noticed that the Heap section of memory is basically a big array of bytes which is being managed by the underlying operating system (OS). And OS is providing this memory management service to programmers via malloc(), free() etc. Aha !!
The important take-aways from this article can be summed as follows:
A) Array means contiguous memory. It can exist in any memory section be it Data or Stack or Heap.
B) Linked List means non-contiguous linked memory. It can exist in any memory section be it Heap or Data or Stack.
C) As a programmer, looking at a data structure from memory perspective could provide us better insight in choosing a particular data structure or even designing a new data structure. For example, we might create an array of linked lists etc.
Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above
dbms interview question
Commonly asked DBMS interview questions | Set 1
What are advantages of DBMS over traditional file based systems?
Ans: Database management systems were developed to handle the following difﬁculties of typical ﬁle-processing systems supported by conventional operating systems.
1. Data redundancy and inconsistency
2. Difﬁculty in accessing data
3. Data isolation – multiple ﬁles and formats
4. Integrity problems
5. Atomicity of updates
6. Concurrent access by multiple users
7. Security problems
What are super, primary, candidate and foreign keys?
Ans: A superkey is a set of attributes of a relation schema upon which all attributes of the schema are functionally dependent. No two rows can have the same value of super key attributes.
A Candidate key is minimal superkey, i.e., no proper subset of Candidate key attributes can be a superkey.
A Primary Key is one of the candidate keys. One of the candidate keys is selected as most important and becomes the primary key. There cannot be more that one primary keys in a table.
Foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table. See this for an example.
What is the difference between primary key and unique constraints?
Ans: Primary key cannot have NULL value, the unique constraints can have NULL values. There is only one primary key in a table, but there can be multiple unique constrains.
What is database normalization?
Ans: It is a process of analyzing the given relation schemas based on their functional dependencies and primary keys to achieve the following desirable properties:
1) Minimizing Redundancy
2) Minimizing the Insertion, Deletion, And Update Anomalies
Relation schemas that do not meet the properties are decomposed into smaller relation schemas that could meet desirable properties.
What is SQL?
SQL is Structured Query Language designed for inserting and modifying in a relational database system.
What are the differences between DDL, DML and DCL in SQL?
Ans: Following are some details of three.
DDL stands for Data Definition Language. SQL queries like CREATE, ALTER, DROP and RENAME come under this.
DML stands for Data Manipulation Language. SQL queries like SELECT, INSERT and UPDATE come under this.
DCL stands for Data Control Language. SQL queries like GRANT and REVOKE come under this.
What is the difference between having and where clause?
Ans: HAVING is used to specify a condition for a group or an aggregate function used in select statement. The WHERE clause selects before grouping. The HAVING clause selects rows after grouping. Unlike HAVING clause, the WHERE clause cannot contain aggregate functions. (See this for examples).
See Having vs Where Clause? for more details
How to print duplicate rows in a table?
Ans: See http://quiz.geeksforgeeks.org/how-to-print-duplicate-rows-in-a-table/
What is Join?
Ans: An SQL Join is used to combine data from two or more tables, based on a common field between them. For example, consider the following two tables.
ENROLLNO STUDENTNAME ADDRESS
1000 geek1 geeksquiz1
1001 geek2 geeksquiz2
1002 geek3 geeksquiz3
Following is join query that shows names of students enrolled in different courseIDs.
SELECT StudentCourse.CourseID, Student.StudentName
INNER JOIN Customers
ON StudentCourse.EnrollNo = Student.EnrollNo
ORDER BY StudentCourse.CourseID;
The above query would produce following result.
What is Identity?
Ans: Identity (or AutoNumber) is a column that automatically generates numeric values. A start and increment value can be set, but most DBA leave these at 1. A GUID column also generates numbers; the value of this cannot be controlled. Identity/GUID columns do not need to be indexed.
What is a view in SQL? How to create one
Ans: A view is a virtual table based on the result-set of an SQL statement. We can create using create view syntax.
CREATE VIEW view_name AS
What are the uses of view?
1. Views can represent a subset of the data contained in a table; consequently, a view can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
2. Views can join and simplify multiple tables into a single virtual table
3. Views can act as aggregated tables, where the database engine aggregates data (sum, average etc.) and presents the calculated results as part of the data
4. Views can hide the complexity of data; for example a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table
5. Views take very little space to store; the database contains only the definition of a view, not a copy of all the data which it presentsv.
6. Depending on the SQL engine used, views can provide extra security
Source: Wiki Page
What is a Trigger?
Ans: A Trigger is a code that associated with insert, update or delete operations. The code is executed automatically whenever the associated query is executed on a table. Triggers can be useful to maintain integrity in database.
What is a stored procedure?
Ans: A stored procedure is like a function that contains a set of operations compiled together. It contains a set of operations that are commonly used in an application to do some common database tasks.
What is the difference between Trigger and Stored Procedure?
Ans: Unlike Stored Procedures, Triggers cannot be called directly. They can only be associated with queries.
What is a transaction? What are ACID properties?
Ans: A Database Transaction is a set of database operations that must be treated as whole, means either all operations are executed or none of them.
An example can be bank transaction from one account to another account. Either both debit and credit operations must be executed or none of them.
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
What are indexes?
Ans: A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and the use of more storage space to maintain the extra copy of data.
Data can be stored only in one order on disk. To support faster access according to different values, faster search like binary search for different values is desired, For this purpose, indexes are created on tables. These indexes need extra space on disk, but they allow faster search according to different frequently searched values.
What are clustered and non-clustered Indexes?
Ans: Clustered indexes is the index according to which data is physically stored on disk. Therefore, only one clustered index can be created on a given database table.
Non-clustered indexes don’t define physical ordering of data, but logical ordering. Typically, a tree is created whose leaf point to disk records. B-Tree or B+ tree are used for this purpos