1.) Explain the role of database administrator

The database which are corporate or enterprise –wide are typically important & complex enough. This task of designing and maintaining the database in entrusted to a Professional called database administrator. The following are the critical tasks of  DBA.

 ·         Design of conceptual & physical schema.

 The DBA is responsible for interacting the users of the system to understand       what data is to be stored in DBMS & how it is likely to be used . Based on this he has to design the conceptual schema & the physical schema. The DBA should widely used portions of the external schema

 ·         Security & authorization

       The DBA is responsible for ensuring the unauthorized data access  is not permitted .In the relation DB the users are allowed only to access certain part of the views &    relation.  

·         Data availability & recovery from failure.

The DBA must take steps to ensure that if the system fails, users can continue to access as much as uncorrupted data as possible .he must try to work to restore to the data for a consistent state. The DBMS provides s/w support  to these ,but DBA is responsible for its implementation to backup the data periodically & maintain logs of system activities.

·         Database tuning

 The user’s need are likely to change with time , there queries tend to change with time .The DBA is responsible to modify the DB, i.e he has to make changes in the conceptual & the physical schema .This is to be done to ensure adequate performance as requirements change.

2.) Explain with a suitable example, converting an E-R diagram into a table.

Convert ER diagram to relational model (DB tables):

·         Each entity:  a table (relation), primary key

·         Each relationship either: 

o        a table, containing the primary keys of the relevant entities  (foreign keys)

 

o        represent by making the primary key of one table the foreign key of another

·         Each attribute:  a column in a table

 Then normalize tables to make them well-structured

 Then if necessary merge tables to remove redundancies, and re-normalize

 Choosing data types for attributes (columns):

 Get appropriate data types from:

·         Requirement’s DD

·         Existing forms, screens & reports

·         Talking to client

Select data types to:

·         Minimize storage space

o        Numbers take up less space than text:  does a product code really have to start P1253, or can the code be 1253?

o        How long does a text string really have to be?  (but make sure it’s long enough…)

·         Represent all possible values

o        Is SMALLINT big enough?

o        Remember that data tends to grow; plan for future needs

·         Support all data manipulations

o        Should invoice date be DATE or CHAR(6)?  Do you need to compare/subtract dates?  Transfer dates between systems?

·         In bigger systems, may need to look at more exotic data types (image, graphic, large text, …)

 

 

 

 

3.) Mention and explain primitive operations in relational algebra

Solution:

Fundamental Operations

1.     The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a lowercase Greek sigma (), with the predicate appearing as a subscript. The argument relation is given in parentheses following the .

For example, to select tuples (rows) of the borrow relation where the branch is ``SFU'', we would write

Let Figure 1 be the borrow and branch relations in the banking example.

 
Figure 1:   The borrow and branch relations.

The new relation created as the result of this operation consists of one tuple: .

We allow comparisons using =, , <, , > and in the selection predicate.

We also allow the logical connectives (or) and (and). For example:

 
Figure 2:   The client relation.

Suppose there is one more relation, client, shown in Figure 2,  with the scheme

we might write

to find clients who have the same name as their banker.

2.     The Project Operation

Project copies its argument relation for the specified attributes only. Since a relation is a set, duplicate rows are eliminated.

Projection is denoted by the Greek capital letter pi (). The attributes to be copied appear as subscripts.

For example, to obtain a relation showing customers and branches, but ignoring amount and loan#, we write

We can perform these operations on the relations resulting from other operations.

To get the names of customers having the same name as their bankers,

Think of select as taking rows of a relation, and project as taking columns of a relation.

3.     The Cartesian Product Operation

The cartesian product of two relations is denoted by a cross (), written

The result of is a new relation with a tuple for each possible pairing of tuples from and .

In order to avoid ambiguity, the attribute names have attached to them the name of the relation from which they came. If no ambiguity will result, we drop the relation name.

The result is a very large relation. If has tuples, and has tuples, then will have tuples.

The resulting scheme is the concatenation of the schemes of and , with relation names added as mentioned.

To find the clients of banker Johnson and the city in which they live, we need information in both client and customer relations. We can get this by writing

However, the customer.cname column contains customers of bankers other than Johnson. (Why?)

We want rows where client.cname = customer.cname. So we can write

to get just these tuples.

Finally, to get just the customer's name and city, we need a projection:

4.     The Rename Operation

The rename operation solves the problems that occurs with naming when performing the cartesian product of a relation with itself.

Suppose we want to find the names of all the customers who live on the same street and in the same city as Smith.

We can get the street and city of Smith by writing

To find other customers with the same information, we need to reference the customer relation again:

where is a selection predicate requiring street and ccity values to be equal.

Problem: how do we distinguish between the two street values appearing in the Cartesian product, as both come from a customer relation?
Solution: use the rename operator, denoted by the Greek letter rho ().

We write

to get the relation under the name of .

If we use this to rename one of the two customer relations we are using, the ambiguities will disappear.

5.     The Union Operation

The union operation is denoted as in set theory. It returns the union (set union) of two compatible relations.

For a union operation to be legal, we require that

o        and must have the same number of attributes.

o        The domains of the corresponding attributes must be the same.

To find all customers of the SFU branch, we must find everyone who has a loan or an account or both at the branch.

We need both borrow and deposit relations for this:

As in all set operations, duplicates are eliminated, giving the relation of Figure 3(a).

 
Figure 3:   The union and set-difference operations.

6.     The Set Difference Operation

Set difference is denoted by the minus sign (). It finds tuples that are in one relation, but not in another.

Thus results in a relation containing tuples that are in but not in .

To find customers of the SFU branch who have an account there but no loan, we write

The result is shown in Figure 3(b).

We can do more with this operation. Suppose we want to find the largest account balance in the bank.
Strategy:

o        Find a relation containing the balances not the largest.

o        Compute the set difference of and the deposit relation.

To find , we write

This resulting relation contains all balances except the largest one. (See Figure 4(a)).

Now we can finish our query by taking the set difference:

Figure 4(b) shows the result.

 
Figure 4:   Find the largest account balance in the bank.

 

 

4.) What are keys? Mention and explain various keys used.

Solution:

Keys

Differences between entities must be expressed in terms of attributes.

A superkey is a set of one or more attributes which, taken collectively, allow us to identify uniquely an entity in the entity set.

For example, in the entity set customer, customer-name and S.I.N. is a superkey.

Note that customer-name alone is not, as two customers could have the same name.

A superkey may contain extraneous attributes, and we are often interested in the smallest superkey. A superkey for which no subset is a superkey is called a candidate key.

In the example above, S.I.N. is a candidate key, as it is minimal, and uniquely identifies a customer entity.

A primary key is a candidate key (there may be more than one) chosen by the DB designer to identify entities in an entity set.

An entity set that does not possess sufficient attributes to form a primary key is called a weak entity set. One that does have a primary key is called a strong entity set.

For example,

The entity set transaction has attributes transaction-number, date and amount.

Different transactions on different accounts could share the same number.

These are not sufficient to form a primary key (uniquely identify a transaction).

Thus transaction is a weak entity set.

For a weak entity set to be meaningful, it must be part of a one-to-many relationship set. This relationship set should have no descriptive attributes.

The idea of strong and weak entity sets is related to the existence dependencies seen earlier.

Member of a strong entity set is a dominant entity.

Member of a weak entity set is a subordinate entity.

A weak entity set does not have a primary key, but we need a means of distinguishing among the entities.

The discriminator of a weak entity set is a set of attributes that allows this distinction to be made.

The primary key of a weak entity set is formed by taking the primary key of the strong entity set on which its existence depends (Mapping Constraints) plus its discriminator.

To illustrate:

transaction is a weak entity. It is existence-dependent on account.

The primary key of account is account-number.

transaction-number distinguishes transaction entities within the same account (and is thus the discriminator).

So the primary key for transaction would be (account-number, transaction-number).

Note : The primary key of a weak entity is found by taking the primary key of the strong entity on which it is existence-dependent, plus the discriminator of the weak entity set.

 

 

5.) What are integrity constraints?Explain.

Solution:

Integrity Constraints

An integrity constraint (IC) is a condition that is specified on a database schema, and restricts the data that can be stored in an instance of the database. If a database instance satisfies all the integrity constraints specified on the database schema, it is a legal instance. A DBMS enforces integrity constraints, in that it permits only legal instances to be stored in the database.

When the DBA or end user defines a database schema, he or she specifies the ICs that must hold on any instance of this database.

When a database application is run, the DBMS checks for violation and disallows changes to the data that violate the specified ICs.

1)     Integrity constraints provide a way of ensuring that changes made to the database by authorized users do not result in a loss of data consistency.

2)     We saw a form of integrity constraint with E-R models:

·        key declarations: stipulation that certain attributes form a candidate key for the entity set.

·        form of a relationship: mapping cardinalities 1-1, 1-many and many-many.

3)     An integrity constraint can be any arbitrary predicate applied to the database.

4)     They may be costly to evaluate, so we will only consider integrity constraints that can be tested with minimal overhead.

KEY CONSTRAINTS

       It is a statement that a certain minimal subset of the fields of a relation is a unique identifier for a tuple.

Eg:

    


Here in the above table Employee ssn is mentioned as a primary key.

 

A set of fields that uniquely identifies a tuple according to a key constraint is called a candidate key.

Possibly many candidate keys (specified using

UNIQUE), one of which is chosen as the primary key.

 

Eg: “For a given student and course, there is a single grade.” vs. “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

FOREIGN KEY CONSTRAINTS

Foreign key is a set of fields in one relation that is used to `refer’ to a tuple in another relation.  (Must correspond to primary key of the second relation.)

 

 

 E.g. sid is a foreign key referring to Students:

Enrolled ( sid: number, sname: string, rating: number,age: number)

 If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.

Eg:                                                    

                                             

 

Domain Constraints

1.     A domain of possible values should be associated with every attribute. These domain constraints are the most basic form of integrity constraint.

They are easy to test for when data is entered.

2.     Domain types

1.     Attributes may have the same domain, e.g. cname and employee-name.

2.     It is not as clear whether bname and cname domains ought to be distinct.

3.     At the implementation level, they are both character strings.

4.     At the conceptual level, we do not expect customers to have the same names as branches, in general.

5.     Strong typing of domains allows us to test for values inserted, and whether queries make sense. Newer systems, particularly object-oriented database systems, offer a rich set of domain types that can be extended easily.

3.     The check clause in SQL-92 permits domains to be restricted in powerful ways that most programming language type systems do not permit.

1.     The check clause permits schema designer to specify a predicate that must be satisfied by any value assigned to a variable whose type is the domain.

2.     Examples:

3.                   create domain hourly-wage numeric(5,2)

4.                    

5.                                                                   constraint  wage-value-test check(value >= 4.00)

6.                   

Note that ``constraint wage-value-test'' is optional (to give a name to the test to signal which constraint is violated).

 create domain account-number char(10)

 

                                          constraint  account-number-null-test

     check(value not null)

 

 create domain account-type char(10)

 

                                          constraint  account-type-test

     check(value in (``Checking'', ``Saving''))

 

Referential Integrity

Often we wish to ensure that a value appearing in a relation for a given set of attributes also appears for another set of attributes in another relation. This is called referential integrity.

 

 

6.) Compare and contrast file systems versus DBMS. Explain difference between logical and physical data independence.

Solution:

To understand the need for a DBMS , the following scenario may be considered. A company has a   large collection ( say 500 GB)  of data on employees , departments, products, sales and so on. This data is accessed concurrently by several employees. Questions about the data must be answered quickly, changes made to the data by different users must be applied consistently, and access to certain parts of the data must be restricted.

 The data can be stored in operating system files also but this approach has many drawbacks, including the following:

·          We probably do not have 500GB of main memory to hold all the data. We must therefore store data in a storage device such as a disk or tape and bring relevant parts into main memory for processing as needed.

·          Even if we have 500GB of main memory , on computer systems with 32-bit addressing, we cannot refer directly to more than about 4GB of data, We have to program some method of identifying all data items.

·          We have to write special programs to answer each question a user may want to ask about the data. These programs are likely to be complex because of the large volume of data to be searched.

·          We must protect the data from inconsistent changes made by different users accessing the data concurrently. If applications must address the details of such concurrent access, this adds greatly to their complexity.

·          We must ensure that data is restored to a consistent state if the system crashes while changes are being made.

·          Operating systems provide only a password mechanism for security. This is not sufficiently flexible to enforce security policies in which different users have permission to access different subsets of the data.

 

Buffer management in DBMS requires ability to:

·        pin a page in buffer pool, force a page to disk & order writes (important for implementing CC & recovery)

·        adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.

 A DBMS is a piece of software designed to make the preceding tasks easier. By storing data in a DBMS rather than as a collection of operating system files, we can use the DBMS’s features to manage the data in a robust and efficient manner. As the volume of data and the number of users grow DBMS support becomes indispensable.

 Thus, the advantages of DBMS can be concisely stated as follows:

·          Data independence

·          Efficient data access

·          Data integrity and security

·          Data administration

·          Concurrent access and crash recovery

·          Reduced application development time

 A DBMS is a complex piece of software, optimised for certain kinds of workloads and its performance may not be adequate for certain specialized applications. Examples include applications with tight real-time constraints or just a few well-defined critical operations for which efficient custom code must be written. Another reason for not using a DBMS is that an application may need to manipulate the data in ways not supported by the query language. In such a situation, the abstract view of the data presented by the DBMS does not match the application’s needs and actually gets in the way. Thus, if specialized performance or data manipulation requirements are central to an application, the application may choose not to use a DBMS ,especially if the added benefits of a DBMS are not required. In most situations calling for large-scale data management, however, DBMSs have become an indispensible tool.

 Difference between logical and physical data independence:

 A very important advantage of using a DBMS is that it offers data independence. That is, application programs are insulated from changes in the way the data is structured and stored. Data independence is achieved through use of the three levels of data abstraction; in particular, the conceptual schema and the external schema provide distinct benefits in this area.

 Relations in the external schema (view relations) are in principle generated on demand from the relations corresponding to the conceptual schema. If the underlying data is reorganized, that is, the conceptual schema is changed, the definition of a view relation can be modified so that the same relation  is computed as before. For example, suppose that the Faculty relation in a university database is replaced by the following two relations:

        Faculty_public  (fid: string, fname: string, office: integer)

        Faculty_private (fid: string, sal: real)

 Intuitively, some confidential information about the faculty has been placed in a separate relation and information about offices has been added. The courseinfo view relation can be redefined in terms of faculty_public, and faculty_private which together contain all the information in faculty, so that a user who queries courseinfo  will get the same answers as before.

 Thus, the users can be shielded from changes in the logical structure of the data, or the changes in the choice of relations to be stored. This property is called logical data independence.

 In turn, the conceptual schema insulates users from changes in physical storage details. This property is referred to as ‘Physical data independence’. The conceptual schema hides details such as how the data is actually laid out on disk, the file structure, and the choice of indexes. As long as the conceptual schema remains the same , we can change the storage details without altering applications.

 

7)

Will be given shortly

 

 

8.) What are the salient features of SQL?

The answer given to this is brief. You should add the commnds also in your final exm.

Structured query language (SQL) is the most widely used commercial relational database language.

It was originally developed at IBM in SEQUEL-XRM and system-R projects. The SQL language supports several already existing core features. The SQL language several other features improvised on others added to it.

FEATURES:-

1.The Data Definition Language(DDL)

2.The Data Manipulation Language(DML)

3.Embeded and Dynamic SQL

4.Triggers

5.Security

6.Transaction management

7.Client-Server execution and remote database access

THE DATA DEFINATION LANGUAGE:

The SQL supports creation , detection and modification of definitions for tables and views. Integrity constraints can be defined on tables, either when the table is created or later. The DDL also provides commands for specifying

Access rights or privileges to tables and views. Although the standard does not indexes, commercial implementations also provide commands for creating and deleting indexes.

THE DATA MANIPULATION LANGUAGE:

This topic of SQL allows users to pose queries and to INSERT , DELETE and MODIFY ROWS.

When a table is created already the user can modify the table by adding or deleting the rows and columns of the table .Any number of rows of columns can be added or deleted to the table.

EMBEDED AND DYNAMIC SQL:

Embeded SQL features allow code to be called from a host language such as C or COBOL. Dyananic SQL features allow a query to be constructed at run-time.When the query has to be run and if it seen that query is not sufficient to get the required information, the query can be changed even at that point of the executon.

TRIGGERS:

The new SQL :1999 standard includes support for triggers, which are actions executed by the DBMS whenever changes to the database meet conditions specified in the trigger.

SECURITY:

SQL provides mechanisms to control user’s access to data objects such as tables and views.

TRANSACTION MANAGEMENT:

Various commands allow a user to explicitly control aspects of how a transaction is to be executed.

CLIENT-SERVER EXECUTION AND REMOTE DATABASE ACCESS:

These commands control how a client application can connect to an SQL database server , or access data from a database over network.

SQL also has some integrity constraints that rely upon the use of the query language features of SQL. The of expressing queries has played a major role in the success of relational database systems.

 

 

 

9.) Consider the Sailors, Boats and Reserves schema in the text book and write the following queries in SQL;

o       Find the names of sailors who have reserved boat no: 103

o       Find the color of boats reserved by Sudhakar.

o       Find the names of sailors who have reserved both a red and a green boat.

o       Find the sailors whose rating is better than some sailor called Sudhakar

1. SELECT  S.sname

      FROM     Sailors S, Reserves R

        WHERE  S.sid=R.sid AND R.bid=103

 

2.  SELECT B.color FROM Boats B, Reserves R, Sailors S

       WHERE B.bid=R.bid

       AND R.sid=S.sid

       AND S.sname=‘sudhakar

 

3. SELECT  S.sid

      FROM  Sailors S, Boats B1, Reserves R1,

        Boats B2, Reserves R2

          WHERE  S.sid=R1.sid AND R1.bid=B1.bid

            AND  S.sid=R2.sid AND R2.bid=B2.bid

            AND (B1.color=‘red’ AND B2.color=‘green’)

 

4.  SELECT S.sid FROM Sailors S

   WHERE S.rating > (SELECT rating from Sailors

                                  WHERE sname=‘sudhakar’)

10.) Distinguish between DDL and DMl

Solution:

DDL:

 

Data definition Language (DDL) is used to create, rename, alter, modify, drop, replace, and delete tables, Indexes, Views, and macros; show data; comment on database objects; and establish a default database.

The DDL provides statements for the definition and description of entities. It enables you to perform the following operations:

 

·        CREATE Define a new database, user, table, trigger, index, macro, view,  depending on the object of the CREATE  statement.

·        DROP Remove a table, trigger, index, macro, view definition, depending on the object of the DROP statement.

Other data definitions allow us to perform the following functions:

·        ALTER Change a table, trigger or protection definition

·        RENAME table triggers views and macros

·        REPLACE macros, triggers or views

·        SET time zones

·        DELETE database or users

·        COLLECT statistics on a column or index

·        MODIFY databases or users

·        COMMENT on database objects

·        GRANT/REVOKE access rights, logon rights

·        GIVE a database object to another database or user

·        HELP on database objects

·        SHOW database objects

·        ABORT, ROLLBACK, COMMIT, BEGIN/END TRANSACTION to manage   transactions

·        CHECKPOINT a journal

·        DATABASE to set default

·        ECHO a string or command to client

DDL statement may be entered as a single statement, the solitary statement, and the solitary statement in a macro.

DML:

Data manipulation Language (DML) is used to add, delete, and revise data in existing tables or views.

DML supports statements for manipulating and processing database values. It enables us to perform following operations:

·        BEGIN and END which is a user defined transaction.

·        COMMENT which is a transaction in ANSI mode.

·        INSERT insert new rows into a table

·        UPDATE modifies data in one or more rows of a table.

DELETE removes a row from a table.

11.) Explain about Generalisation and Aggregation

    Generalization:

Let’s think of a reverse process of abstraction in which we suppress the differences among several entity types, identify their common features, and generalize them into a single superclass of which the original entity types are special subclasses.

The term “generalization” refers to the process of defining a generalized entity type from the given entity types.

Generalization

Consider extending the entity set account by classifying accounts as being either savings-account or chequing-account.

Each of these is described by the attributes of account plus additional attributes. (savings has interest-rate and chequing has overdraft-amount.)

We can express the similarities between the entity sets by generalization. This is the process of forming containment relationships between a higher-level entity set and one or more lower-level entity sets.

In E-R diagrams, generalization is shown by a triangle, as shown in Figure 1

  
Figure 1: Generalization
bulletGeneralization hides differences and emphasizes similarities.
bulletDistinction made through attribute inheritance.
bulletAttributes of higher-level entity are inherited by lower-level entities.
bulletTwo methods for conversion to a table form:
bulletCreate a table for the high-level entity, plus tables for the lower-level entities containing also their specific attributes.
bulletCreate only tables for the lower-level entities.

 

Aggregation:

It is an abstraction concept for building composite objects from their component objects.

It is used when we have to model a relationship involving entity sets and a relationship set. It allows us to treat a relationship set as an entity set   for purposes of participation in other relationships.

The E-R model cannot express relationships among relationships.

 

When would we need such a thing?

 

Consider a DB with information about employees who work on a particular project and use a number of machines doing that work. We get the E-R diagram shown in the following Figure 1:

 

 

 

Figure 1: E-R diagram with redundant relationships

 

Relationship sets work and uses could be combined into a single set. However, they shouldn't be, as this would obscure the logical structure of this scheme.

 

The solution is to use aggregation.

 

An abstraction through which relationships are treated as higher-level entities.

For our example, we treat the relationship set work and the entity sets employee and project as a higher-level entity set called work.

 

Figure 2 : shows the E-R diagram with aggregation.

 

 

Figure 2: E-R diagram with aggregation

 

 

Transforming an E-R diagram with aggregation into tabular form is easy. We create a table for each entity and relationship set as before.

 

The table for relationship set uses contains a column for each attribute in the primary key of machinery and work.

 

12.) Compare SQL and QBE

QBE and SQL:

·        QBE language is important because it is one of the first graphical query languages with minimum syntax developed for database systems.

·        QBE differs from SQL in that  the user does not have to specify a structured query explicitly; rather, the query is formulated by filling in templates of relations displayed on the monitor screen.

·        The user does not have to follow any rigid syntax rules for query specification in the QBE.

·        QBE is related to the domain relational calculus whereas SQL is based to a greater extent on the tuple relational calculus.

·        QBE’s original specification is shown to be relationally complete.

·        QBE does not use the “linear” style of SQL; rather, it is a “two-dimensional” language, because users specify a query moving around the full area of the screen.

13.) Explain about Domain Calculus expression.

Solution:

The Domain Relational Calculus

Domain variables take on values from an attribute's domain, rather than values for an entire tuple

Formal Definitions

1.     An expression is of the form

where the represent domain variables, and is a formula.

2.     An atom in the domain relational calculus is of the following forms

o        where is a relation on attributes, and , are domain variables or constants.

o        , where and are domain variables, and is a comparison operator.

o        , where c is a constant.

3.     Formulae are built up from atoms using the following rules:

o        An atom is a formula.

o        If is a formula, then so are and .

o        If and are formulae, then so are , and .

o        If is a formula where x is a domain variable, then so are and .

14.) Write short notes on a)Entities and attributes b)Strong and Weak Entities c)Cross products and Joins

Solution:

a.)

An entity is an object that exists and is distinguishable from other objects. For instance, John Harris with S.I.N. 890-12-3456 is an entity, as he can be uniquely identified as one particular person in the universe.

An entity may be concrete (a person or a book, for example) or abstract (like a holiday or a concept).

An entity is represented by a set of attributes.

E.g. name, S.I.N., street, city for ``customer'' entity.

The domain of the attribute is the set of permitted values (e.g. the telephone number must be seven positive integers).

Formally, an attribute is a function which maps an entity set into a domain.

Every entity is described by a set of (attribute, data value) pairs.

There is one pair for each attribute of the entity set.

E.g. a particular customer entity is described by the set {(name, Harris), (S.I.N., 890-123-456), (street, North), (city, Georgetown)}.

b.)

Strong vs. Weak Entities

A Strong Entity does not depend on anything else in the database for its existence.  It contains its own primary key.

A Weak Entity depends on some related entity for its existence.  It has no candidate keys without including the primary key of the entity it depends on.

Strong vs. Weak Example

Employee ( empNo PK, fname, lname salary,…) HAS Child (fname)

Employee is strong (with empNo as primary key)

Child is weak - many children may be called “Jennifer”, but only one Jennifer is the daughter of employee 24901.

15.) Explain different levels of abstraction in a DBMS.. What is data independence? Explain different types of Data Independence

Data Abstraction

1.     The major purpose of a database system is to provide users with an abstract view of the system.

The system hides certain details of how data is stored and created and maintained

Complexity should be hidden from database users.

2.     There are several levels of abstraction:

1.     Physical Level:

§         How the data are stored.

§         E.g. index, B-tree, hashing.

§         Lowest level of abstraction.

§         Complex low-level structures described in detail.

2.     Conceptual Level:

§         Next highest level of abstraction.

§         Describes what data are stored.

§         Describes the relationships among data.

§         Database administrator level.

3.     View Level:

§         Highest level.

§         Describes part of the database for a particular group of users.

§         Can be many different views of a database.

§         E.g. tellers in a bank get a view of customer accounts, but not of payroll data.

Figure illustrates the three levels.

  
Figure: The three levels of data abstraction

Data Independence

The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.

There are two kinds:

Physical data independence

The ability to modify the physical scheme without causing application programs to be rewritten. Modifications at this level are usually to improve performance.

Logical data independence

The ability to modify the conceptual scheme without causing application programs to be rewritten. Usually done when logical structure of database is altered .

Logical data independence is harder to achieve as the application programs are usually heavily dependent on the logical structure of the data. An analogy is made to abstract data types in programming languages.

16.) Explain TRC and DRC with examples. List two reasons why null values are introduced into databases. Explain nested subqueries in SQL.

Solution:

Relational calculus is an alternative to relational algebra. The calculus is nonprocedural, or declarative, in that it allows us to describe the set of answers without being explicit about how they should be computed. Relational calculus has had a big influence on the design of commercial query languages such as SQL and, especially, Query-by-Example (QBE).

The variant of the calculus we present in detail is called the tuple relational calculus (TRC). Variables in TRC take on tuples as values. In another variant, called the domain relational calculus (DRC), the variables range over field values. TRC has had more of an influence on SQL, while DRC has strongly influenced QBE.

Tuple Relational Calculus

A tuple variable is a variable that takes on tuples  of a particular relation schema as values. That is, every value assigned to a given tuple variable has the same number and type of fields. A tuple relational calculus query has the form { T p(T) }, where T is a tuple variable and p(T) denotes a formula that describes T\ we will shortly define formulas and queries rigorously. The result of this query is the set of all tuples t for which the formula p(T) evaluates to true with T t. The language for writing formulas p(T) is thus at the heart of TRC and essentially a simple subset of first-order logic. As a simple example, consider the following query.

(Qll) Find all sailors with a rating above 7.

{S | S  E  Sailors^ S.rating > 7}

When this query is evaluated on an instance of the Sailors relation, the tuple variable S is instantiated successively with each tuple, and the test S.rating>7 is applied. The answer contains those instances of S that pass this test. On instance 53 of Sailors, the answer contains Sailors tuples with sid 31, 32, 58,

71, and 74.

Syntax of TRC Queries

We now define these concepts formally, beginning with the notion of a formula. Let Rel be a relation name, R and S be tuple variables, a be an attribute of R, and b be an attribute of S. Let op denote an operator in the set {<, >,  = , <= , >=, !=}• An atomic formula is one of the following:

        R E  Rel

        R.a op S.b

        R.a op constant, or constant op R.a

A formula is recursively defined to be one of the following, where p and q are themselves formulas and p(R) denotes a formula in which the variable R appears:

        Any atomic formula

        Not p, p ^ q, p V q, or p implies q

        There exists R such that p(R) , where K is a tuple variable

        For all R(p(R)), where R is a tuple variable

The quantifiers there exists and for all are said to bind the variable R.. A variable is said to be free in a formula or sub formula (a formula contained in a larger formula) if the (sub)formula does not contain an occurrence of a quantifier that binds it.

Every variable in a TRC formula appears in a sub formula that is atomic, and every relation schema specifies a domain for each field; each variable in a TRC formula has a well-defined domain from which values for the variable are drawn. Informally, an atomic formula R E Rel gives R the type of tuples in Rel, and comparisons such as R.a op S.b and R.a op constant induce type restrictions on the field R.a. If a variable R does not appear in an atomic formula of the form R Rel, we follow the convention that the type of R is a tuple whose fields include all fields of R that appear in the formula.

For example :

 Find the names and ages of sailors with a rating above 7.

{P /  there exists an S  Sailors(S.rating > 7  ^  P.name =S.sname ^ P.age = S.age)}.

 

Domain Relational Calculus

A domain variable is a variable that ranges over the values in the domain of some attribute (e.g., the variable can be assigned an integer if it appears in an attribute whose domain is the set of integers). A DRC query has the form {<x1,X2,. • xn>/  p(<x1,X2,..,xN>)}, where each Xj is either a domain variable or a constant and p(<x1,x2,…,xn>) denotes a DRC formula whose only free variables are the variables among the xi, 1 < i < n. The result of this query is the set of all tuples (x1,x2,..., xn) for which the formula evaluates to true.

A DRC formula is defined in a manner very similar to the definition of a TRC formula. The main difference is that the variables are now domain variables. Let op denote an operator in the set {<,>,=,<, >, ^} and let X and Y be domain variables. An atomic formula in DRC is one of the following:

          (x1,x2,….xn)  E  Rel, where Rel is a relation with n attributes; each xi, 1 < i < n is either a variable or a constant

          X op Y

          X op constant, or constant op X

A formula is recursively defined to be one of the following, where p and q are themselves formulas and p(X) denotes a formula in which the variable A' appears:

          Any atomic formula

          Not p, p ^ q, p V q, or p implies q

          There exists R such that p(R) , where R is a domain variable

          For all R(p(R)), where R is a domain variable

 

Examples of DRC Queries

 Find all sailors with a rating above 7.

{{I, N, T, A) /   {I, N, T, A) E Sailors ^T > 7}

17.) Differentiate between binary and ternary relationships

 

 A .   

                                            Policy as an entity set.

 

 Consider the  E-R diagram . It  models a situation in which an employee can own several policies ,each policy can be owned by several employees, and each dependent can be covered by several policies.

 

Suppose that we have the following additional requirements :

·         A policy cannot be owned jointly by two or more requirements.

·        Every policy must be owned by some employee.

·        Dependents is a weak entity set , and each dependent entity is uniquely identified by taking pname in conjunction with the policyid of a policy entity.

 The first requirement suggests that we impose a key constraint on polices with respect to covers, but this constraint has the unintended side effect that a policy can cover only one dependent. The second requirement suggests that  we impose a total participation constraint on policies. This  solution is acceptable if each policy covers at least one dependent . the third requirement forces us to introduce an identifying relationship that is binary .

The best way to model this situation is to use two binary relationships,

 

 

    

 

                                            Policy revisited.

 

As a typical example of a ternary relation ship, consider entity sets parts, suppliers, and departments , and a relationship set contracts that involves all of them . A contract specifies that a supplier will supply a part to a department. This relation ship can not be adequately captured by a collection of binary relationships. With binary relationship , we can denote that a supplier ‘can supply ‘ certain parts that a department ‘needs ‘  some parts ,or a department  ‘deals with ‘ a  certain supplier. No combination of these relationships express the meaning of a contract adequately, for at least two reasons :

·        The facts that supplier s can supply part p,  that department D needs part P , and that D will buy from S do not necessarily imply that department D indeed buys part P from suppliers S.

·        We cannot represent the qty attribute of a contract cleanly.

 

19.) Construct an ER digrm for a cr insurance company that has a set of customers, each of whom owns one or more cars. Each car has associated with it zero to any number of recorded accidents.

Solution:

 

 

20.) Describe the extended ER features.

Solution:

 

21.)Draw the ER diagram for the banking Enterprise and explain

 

Solution:

 

 

Let us assume that the bank has two types of customers

1)     Depositors – Fixed Account and Current Savings

2)     Borrowers – Who take loan on interest.

 

Constraints:

·        Each customer has a unique customer_id.

·        Has to have an account either fixed or current savings if it is a depositor.

·        Fixed account holders have a unique account_no, amount, and duration of the fixed deposit.

·        Current savings account holders have a unique account_no and interest.

·        Borrowers have a unique loan_id, amount, duration(of payback) and interest.

 

22.) Describe fundamental operations of relational algebra with examples. Discuss natural join with an example.

 

Solution:

 

 

Fundamental Operations

The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a lowercase Greek sigma (), with the predicate appearing as a subscript. The argument relation is given in parentheses following the .

For example, to select tuples (rows) of the borrow relation where the branch is ``SFU'', we would write

Let Figure 1 be the borrow and branch relations in the banking example.

 
Figure 1:   The borrow and branch relations.

The new relation created as the result of this operation consists of one tuple: .

We allow comparisons using =, , <, , > and in the selection predicate.

We also allow the logical connectives (or) and (and). For example:

 
Figure 2:   The client relation.

Suppose there is one more relation, client, shown in Figure 2,  with the scheme

we might write

to find clients who have the same name as their banker.

The Project Operation

Project copies its argument relation for the specified attributes only. Since a relation is a set, duplicate rows are eliminated.

Projection is denoted by the Greek capital letter pi (). The attributes to be copied appear as subscripts.

For example, to obtain a relation showing customers and branches, but ignoring amount and loan#, we write

We can perform these operations on the relations resulting from other operations.

To get the names of customers having the same name as their bankers,

Think of select as taking rows of a relation, and project as taking columns of a relation.

The Cartesian Product Operation

The cartesian product of two relations is denoted by a cross (), written

The result of is a new relation with a tuple for each possible pairing of tuples from and .

In order to avoid ambiguity, the attribute names have attached to them the name of the relation from which they came. If no ambiguity will result, we drop the relation name.

The result is a very large relation. If has tuples, and has tuples, then will have tuples.

The resulting scheme is the concatenation of the schemes of and , with relation names added as mentioned.

To find the clients of banker Johnson and the city in which they live, we need information in both client and customer relations. We can get this by writing

However, the customer.cname column contains customers of bankers other than Johnson. (Why?)

We want rows where client.cname = customer.cname. So we can write

to get just these tuples.

Finally, to get just the customer's name and city, we need a projection:

The Rename Operation

The rename operation solves the problems that occurs with naming when performing the cartesian product of a relation with itself.

Suppose we want to find the names of all the customers who live on the same street and in the same city as Smith.

We can get the street and city of Smith by writing

To find other customers with the same information, we need to reference the customer relation again:

where is a selection predicate requiring street and ccity values to be equal.

Problem: how do we distinguish between the two street values appearing in the Cartesian product, as both come from a customer relation?
Solution: use the rename operator, denoted by the Greek letter rho ().

We write

to get the relation under the name of .

If we use this to rename one of the two customer relations we are using, the ambiguities will disappear.

The Union Operation

The union operation is denoted as in set theory. It returns the union (set union) of two compatible relations.

For a union operation to be legal, we require that

o        and must have the same number of attributes.

o        The domains of the corresponding attributes must be the same.

To find all customers of the SFU branch, we must find everyone who has a loan or an account or both at the branch.

We need both borrow and deposit relations for this:

As in all set operations, duplicates are eliminated, giving the relation of Figure 3(a).

 
Figure 3:   The union and set-difference operations.

The Set Difference Operation

Set difference is denoted by the minus sign (). It finds tuples that are in one relation, but not in another.

Thus results in a relation containing tuples that are in but not in .

To find customers of the SFU branch who have an account there but no loan, we write

The result is shown in Figure 3(b).

We can do more with this operation. Suppose we want to find the largest account balance in the bank.
Strategy:

o        Find a relation containing the balances not the largest.

o        Compute the set difference of and the deposit relation.

To find , we write

This resulting relation contains all balances except the largest one. (See Figure 4(a)).

Now we can finish our query by taking the set difference:

Figure 4(b) shows the result.

 
Figure 4:   Find the largest account balance in the bank.

The Natural Join Operation

Often we want to simplify queries on a cartesian product.

For example, to find all customers having a loan at the bank and the cities in which they live, we need borrow and customer relations:

Our selection predicate obtains only those tuples pertaining to only one cname.

This type of operation is very common, so we have the natural join, denoted by a sign. Natural join combines a cartesian product and a selection into one operation. It performs a selection forcing equality on those attributes that appear in both relation schemes. Duplicates are removed as in all relation operations.

To illustrate, we can rewrite the previous query as

The resulting relation is shown in Figure 5

 
Figure 5:   Joining borrow and customer relations.

We can now make a more formal definition of natural join.

o        Consider and to be sets of attributes.

o        We denote attributes appearing in both relations by .

o        We denote attributes in either or both relations by .

o        Consider two relations and .

o        The natural join of and , denoted by is a relation on scheme .

o        It is a projection onto of a selection on where the predicate requires for each attribute in .

Formally,

where .

To find the assets and names of all branches which have depositors living in Stamford, we need customer, deposit and branch relations:

Note that is associative.

To find all customers who have both an account and a loan at the SFU branch:

This is equivalent to the set intersection version we wrote earlier. We see now that there can be several ways to write a query in the relational algebra.

If two relations and have no attributes in common, then , and .

 

23.)Explain various built in functions of SQL.

Solution:

Built-In Functions

PL/SQL provides more than 75 powerful functions to help you manipulate data. These built-in functions fall into the following categories:

1)     error-reporting

2)     number

3)     character

4)     conversion

5)     date

6)     miscellaneous

Table 1 shows the functions in each category.

You can use all the functions in SQL statements except the error- reporting functions SQLCODE and SQLERRM. Also, you can use all the functions in procedural statements except the miscellaneous functions DECODE, DUMP, and VSIZE.

Note: The SQL group functions AVG, MIN, MAX, COUNT, SUM, STDDEV, and VARIANCE are not built into PL/SQL. Nevertheless, you can use them in SQL statements (but not in procedural statements).

Error

Number

Character

Conversion

Date

Misc

SQLCODE

ABS

ASCII

CHARTOROWID

ADD_MONTHS

DECODE

SQLERRM

ACOS

CHR

CONVERT

LAST_DAY

DUMP

 

ASIN

CONCAT

HEXTORAW

MONTHS_BETWEEN

GREATEST

 

ATAN

INITCAP

RAWTOHEX

NEW_TIME

GREATEST_LB

 

ATAN2

INSTR

ROWIDTOCHAR

NEXT_DAY

LEAST

 

CEIL

INSTRB

TO_CHAR

ROUND

LEAST_LB

 

COS

LENGTH

TO_DATE

SYSDATE

NVL

 

COSH

LENGTHB

TO_LABEL

TRUNC

UID

 

EXP

LOWER

TO_MULTI_BYTE

 

USER

 

FLOOR

LPAD

TO_NUMBER

 

USERENV

 

LN

LTRIM

TO_SINGLE_BYTE

 

VSIZE

 

LOG

NLS_INITCAP

 

 

 

 

MOD

NLS_LOWER

 

 

 

 

POWER

NLS_UPPER

 

 

 

 

ROUND

NLSSORT

 

 

 

 

SIGN

REPLACE

 

 

 

 

SIN

RPAD

 

 

 

 

SINH

RTRIM

 

 

 

 

SQRT

SOUNDEX

 

 

 

 

TAN

SUBSTR

 

 

 

 

TANH

SUBSTRB

 

 

 

 

TRUNC

TRANSLATE

 

 

 

 

 

UPPER

 

 

 

 

Table 1. Built-in Functions