Make your own free website on Tripod.com

Group:

1.    P.Sudheendra Kumar

2.    Vijay Katgeri

3.    P.Vasudha

4.    R.Sri Lalitha

 

1.) Explain the role of database administrator

Database administrator

Job Description

Every information system relies on data. A high street bank, for example, needs to store securely hundreds of thousands of names, account numbers, balances and individual transactions. An air traffic control centre needs to store airplane codes, flight paths, arrival and departure times, heights, speeds, directions, weather information - and much more. All this data needs to be not only correct, but available for complex calculations and interrogation. A database administrator is responsible for the usage, accuracy, efficiency, security, maintenance, administration and development of an organisation's computerised database(s), providing support to some or all departments depending on the size of the organisation.

Typical Work Activities

The work of a database administrator will vary according to the nature of the employing organisation and the level of responsibility associated with the post.

Typical responsibilities are likely to include:

*    ensuring that the database(s) is updated accurately and regularly;

*    controlling access, performance monitoring and tuning;

*    assisting development staff with application design;

*    identifying and resolving users' problems;

*    developing and implementing maintenance procedures;

*    collaborating in the design and development of databases to meet new user needs and respond to/anticipate technological innovations;

*    facilitating and negotiating the increasing demand for access to data - increasingly via an organisation's intranet or website;

*    devising, developing and implementing disaster recovery and archiving procedures;

*    supporting users by talking them through a search or by customising the `front ends' to incorporate new fields for data entry;

*    capacity planning;

*    working closely with IT project managers and database programmers;

*    liaising with web developers to enable on-line database access and resolve associated problems;

*    planning and co-ordinating database security measures;

*    communicating regularly with internal technical, applications, and operational staff, to ensure the database integrity and security;

*    commissioning and installing new applications.

With the volumes of sensitive data generated, data integrity, backup, and security have become an increasingly important aspect of the work.

2.) Explain with a suitable example, converting an E-R diagram into a table.

Convert ER diagram to relational model (DB tables):

*    Each entity:  a table (relation), primary key

*    Each relationship either: 

*    a table, containing the primary keys of the relevant entities  (foreign keys)

 

*    represent by making the primary key of one table the foreign key of another

*    Each attribute:  a column in a table

 

Then normalize tables to make them well-structured

 

Then if necessary merge tables to remove redundancies, and re-normalize

 

Choosing data types for attributes (columns):

 

Get appropriate data types from:

*    Requirement’s DD

*    Existing forms, screens & reports

*    Talking to client

Select data types to:

*    Minimize storage space

*    Numbers take up less space than text:  does a product code really have to start P1253, or can the code be 1253?

*    How long does a text string really have to be?  (but make sure it’s long enough…)

*    Represent all possible values

*    Is SMALLINT big enough?

*    Remember that data tends to grow; plan for future needs

*    Support all data manipulations

*    Should invoice date be DATE or CHAR(6)?  Do you need to compare/subtract dates?  Transfer dates between systems?

*    In bigger systems, may need to look at more exotic data types (image, graphic, large text, …)

 

 

2.) Mention and explain primitive operations in relational algebra

Solution:

Fundamental Operations

1.    The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a lowercase Greek sigma (), with the predicate appearing as a subscript. The argument relation is given in parentheses following the .

For example, to select tuples (rows) of the borrow relation where the branch is ``SFU'', we would write

Let Figure 1 be the borrow and branch relations in the banking example.

 
Figure 1:   The borrow and branch relations.

The new relation created as the result of this operation consists of one tuple: .

We allow comparisons using =, , <, , > and in the selection predicate.

We also allow the logical connectives (or) and (and). For example:

 
Figure 2:   The client relation.

Suppose there is one more relation, client, shown in Figure 2,  with the scheme

we might write

to find clients who have the same name as their banker.

2.    The Project Operation

Project copies its argument relation for the specified attributes only. Since a relation is a set, duplicate rows are eliminated.

Projection is denoted by the Greek capital letter pi (). The attributes to be copied appear as subscripts.

For example, to obtain a relation showing customers and branches, but ignoring amount and loan#, we write

We can perform these operations on the relations resulting from other operations.

To get the names of customers having the same name as their bankers,

Think of select as taking rows of a relation, and project as taking columns of a relation.

3.    The Cartesian Product Operation

The cartesian product of two relations is denoted by a cross (), written

The result of is a new relation with a tuple for each possible pairing of tuples from and .

In order to avoid ambiguity, the attribute names have attached to them the name of the relation from which they came. If no ambiguity will result, we drop the relation name.

The result is a very large relation. If has tuples, and has tuples, then will have tuples.

The resulting scheme is the concatenation of the schemes of and , with relation names added as mentioned.

To find the clients of banker Johnson and the city in which they live, we need information in both client and customer relations. We can get this by writing

However, the customer.cname column contains customers of bankers other than Johnson. (Why?)

We want rows where client.cname = customer.cname. So we can write

to get just these tuples.

Finally, to get just the customer's name and city, we need a projection:

4.    The Rename Operation

The rename operation solves the problems that occurs with naming when performing the cartesian product of a relation with itself.

Suppose we want to find the names of all the customers who live on the same street and in the same city as Smith.

We can get the street and city of Smith by writing

To find other customers with the same information, we need to reference the customer relation again:

where is a selection predicate requiring street and ccity values to be equal.

Problem: how do we distinguish between the two street values appearing in the Cartesian product, as both come from a customer relation?
Solution: use the rename operator, denoted by the Greek letter rho ().

We write

to get the relation under the name of .

If we use this to rename one of the two customer relations we are using, the ambiguities will disappear.

5.    The Union Operation

The union operation is denoted as in set theory. It returns the union (set union) of two compatible relations.

For a union operation to be legal, we require that

*    and must have the same number of attributes.

*    The domains of the corresponding attributes must be the same.

To find all customers of the SFU branch, we must find everyone who has a loan or an account or both at the branch.

We need both borrow and deposit relations for this:

As in all set operations, duplicates are eliminated, giving the relation of Figure 3(a).

 
Figure 3:   The union and set-difference operations.

6.    The Set Difference Operation

Set difference is denoted by the minus sign (). It finds tuples that are in one relation, but not in another.

Thus results in a relation containing tuples that are in but not in .

To find customers of the SFU branch who have an account there but no loan, we write

The result is shown in Figure 3(b).

We can do more with this operation. Suppose we want to find the largest account balance in the bank.
Strategy:

*    Find a relation containing the balances not the largest.

*    Compute the set difference of and the deposit relation.

To find , we write

This resulting relation contains all balances except the largest one. (See Figure 4(a)).

Now we can finish our query by taking the set difference:

Figure 4(b) shows the result.

 
Figure 4:   Find the largest account balance in the bank.

4.) What are keys? Mention and explain various keys used.

Solution:

Keys

Differences between entities must be expressed in terms of attributes.

A superkey is a set of one or more attributes which, taken collectively, allow us to identify uniquely an entity in the entity set.

For example, in the entity set customer, customer-name and S.I.N. is a superkey.

Note that customer-name alone is not, as two customers could have the same name.

A superkey may contain extraneous attributes, and we are often interested in the smallest superkey. A superkey for which no subset is a superkey is called a candidate key.

In the example above, S.I.N. is a candidate key, as it is minimal, and uniquely identifies a customer entity.

A primary key is a candidate key (there may be more than one) chosen by the DB designer to identify entities in an entity set.

An entity set that does not possess sufficient attributes to form a primary key is called a weak entity set. One that does have a primary key is called a strong entity set.

For example,

The entity set transaction has attributes transaction-number, date and amount.

Different transactions on different accounts could share the same number.

These are not sufficient to form a primary key (uniquely identify a transaction).

Thus transaction is a weak entity set.

For a weak entity set to be meaningful, it must be part of a one-to-many relationship set. This relationship set should have no descriptive attributes.

The idea of strong and weak entity sets is related to the existence dependencies seen earlier.

Member of a strong entity set is a dominant entity.

Member of a weak entity set is a subordinate entity.

A weak entity set does not have a primary key, but we need a means of distinguishing among the entities.

The discriminator of a weak entity set is a set of attributes that allows this distinction to be made.

The primary key of a weak entity set is formed by taking the primary key of the strong entity set on which its existence depends (Mapping Constraints) plus its discriminator.

To illustrate:

transaction is a weak entity. It is existence-dependent on account.

The primary key of account is account-number.

transaction-number distinguishes transaction entities within the same account (and is thus the discriminator).

So the primary key for transaction would be (account-number, transaction-number).

Note : The primary key of a weak entity is found by taking the primary key of the strong entity on which it is existence-dependent, plus the discriminator of the weak entity set.

5.) What are integrity constraints?Explain.

Solution:

Integrity Constraints

An integrity constraint (IC) is a condition that is specified on a database schema, and restricts the data that can be stored in an instance of the database. If a database instance satisfies all the integrity constraints specified on the database schema, it is a legal instance. A DBMS enforces integrity constraints, in that it permits only legal instances to be stored in the database.

When the DBA or end user defines a database schema, he or she specifies the ICs that must hold on any instance of this database.

When a database application is run, the DBMS checks for violation and disallows changes to the data that violate the specified ICs.

1)    Integrity constraints provide a way of ensuring that changes made to the database by authorized users do not result in a loss of data consistency.

2)    We saw a form of integrity constraint with E-R models:

*    key declarations: stipulation that certain attributes form a candidate key for the entity set.

*    form of a relationship: mapping cardinalities 1-1, 1-many and many-many.

3)    An integrity constraint can be any arbitrary predicate applied to the database.

4)    They may be costly to evaluate, so we will only consider integrity constraints that can be tested with minimal overhead.

KEY CONSTRAINTS

       It is a statement that a certain minimal subset of the fields of a relation is a unique identifier for a tuple.

Eg:

    


Here in the above table Employee ssn is mentioned as a primary key.

 

A set of fields that uniquely identifies a tuple according to a key constraint is called a candidate key.

Possibly many candidate keys (specified using

UNIQUE), one of which is chosen as the primary key.

 

Eg: “For a given student and course, there is a single grade.” vs. “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

FOREIGN KEY CONSTRAINTS

Foreign key is a set of fields in one relation that is used to `refer’ to a tuple in another relation.  (Must correspond to primary key of the second relation.)

 

 

 E.g. sid is a foreign key referring to Students:

Enrolled ( sid: number, sname: string, rating: number,age: number)

 If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.

Eg:                                                    

                                             

 

Domain Constraints

1.     A domain of possible values should be associated with every attribute. These domain constraints are the most basic form of integrity constraint.

They are easy to test for when data is entered.

2.    Domain types

1.    Attributes may have the same domain, e.g. cname and employee-name.

2.    It is not as clear whether bname and cname domains ought to be distinct.

3.    At the implementation level, they are both character strings.

4.    At the conceptual level, we do not expect customers to have the same names as branches, in general.

5.    Strong typing of domains allows us to test for values inserted, and whether queries make sense. Newer systems, particularly object-oriented database systems, offer a rich set of domain types that can be extended easily.

3.    The check clause in SQL-92 permits domains to be restricted in powerful ways that most programming language type systems do not permit.

1.    The check clause permits schema designer to specify a predicate that must be satisfied by any value assigned to a variable whose type is the domain.

2.    Examples:

3.                   create domain hourly-wage numeric(5,2)

4.                    

5.                                                                  constraint  wage-value-test check(value >= 4.00)

6.                  

Note that ``constraint wage-value-test'' is optional (to give a name to the test to signal which constraint is violated).

 create domain account-number char(10)

 

                                          constraint  account-number-null-test

     check(value not null)

 

 create domain account-type char(10)

 

                                          constraint  account-type-test

     check(value in (``Checking'', ``Saving''))

 

Referential Integrity

Often we wish to ensure that a value appearing in a relation for a given set of attributes also appears for another set of attributes in another relation. This is called referential integrity.

6.) Compare and contrast file systems versus DBMS. Explain difference between logical and physical data independence.

Solution:

DBMS vs. OS File System

OS does disk space & buffer mgmt: why not let OS manage these tasks?

Some limitations

*    Differences in OS support: portability issues

*    Files cannot span disks

*    Buffer management in DBMS requires ability to:

*    pin a page in buffer pool, force a page to disk & order writes (important for implementing CC & recovery)

*    adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.

Data Independence

The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.

There are two kinds:

Physical data independence

The ability to modify the physical scheme without causing application programs to be rewritten. Modifications at this level are usually to improve performance.

Logical data independence

The ability to modify the conceptual scheme without causing application programs to be rewritten. Usually done when logical structure of database is altered .

Logical data independence is harder to achieve as the application programs are usually heavily dependent on the logical structure of the data. An analogy is made to abstract data types in programming languages.

8.) What are the salient features of SQL?

 

Salient features of SQL:

 

*    SQL is a computer language that is used to interact with a database. In fact, SQL works with one specific type of database, called a relational database.

*    It is a database language, consisting of statements specialized for database management tasks.

*    Provides  a high-level declarative language interface, so the user only specifies what the result is to be, leaving the actual optimization and decisions on how to execute the query to the DBMS.

*    Syntax is more user-friendly than either of the two formal languages—relational algebra and relational calculus.

*      Lets users define the structure and organization of the stored data and relationships among the stored data items.

*    Allows users or an application program to retrieve stored data from the database and use it or to update the database by adding new data, removing old data, modifying previously stored data.

*    Can be used to restrict a user's ability to retrieve, add, and modify data, protecting stored data against unauthorized access.

*    Used to coordinate data sharing by concurrent users, ensuring that they do not interfere with one another.

*    Defines integrity constraints in the database, protecting it from corruption due to inconsistent updates or system failures.

*      Fundamental concept in SQL is the table, which is equivalent to the entity in the relational database. The instances of the entity are recorded in the table. Every entity will have its own table.

*    Columns in SQL correspond to attributes in the relational database. The column values must be simple data types such as strings and integers.

*    Has language constructs for specifying the granting and revoking of privileges to users.

*    Has methodology for embedding SQL statements in a general purpose PL such as C, C++, COBOL or PASCAL. It also has language bindings to various PLs that specify the correspondence of SQL data  types to the data types of each of the PLs.

*    Has transaction control commands used to specify units of database processing for concurrency control and recovery purposes.

9.) Consider the Sailors, Boats and Reserves schema in the text book and write the following queries in SQL;

*    Find the names of sailors who have reserved boat no: 103

*    Find the color of boats reserved by Sudhakar.

*    Find the names of sailors who have reserved both a red and a green boat.

*    Find the sailors whose rating is better than some sailor called Sudhakar

1. SELECT  S.sname

      FROM     Sailors S, Reserves R

        WHERE  S.sid=R.sid AND R.bid=103

 

2.  SELECT B.color FROM Boats B, Reserves R, Sailors S

       WHERE B.bid=R.bid

       AND R.sid=S.sid

       AND S.sname=‘sudhakar’

 

3. SELECT  S.sid

      FROM  Sailors S, Boats B1, Reserves R1,

        Boats B2, Reserves R2

          WHERE  S.sid=R1.sid AND R1.bid=B1.bid

            AND  S.sid=R2.sid AND R2.bid=B2.bid

            AND (B1.color=‘red’ AND B2.color=‘green’)

 

4.  SELECT S.sid FROM Sailors S

   WHERE S.rating > (SELECT rating from Sailors

                                  WHERE sname=‘sudhakar’)

10.) Distinguish between DDL and DMl

Solution:

DDL:

 

Data definition Language (DDL) is used to create, rename, alter, modify, drop, replace, and delete tables, Indexes, Views, and macros; show data; comment on database objects; and establish a default database.

The DDL provides statements for the definition and description of entities. It enables you to perform the following operations:

 

*    CREATE Define a new database, user, table, trigger, index, macro, view,  depending on the object of the CREATE  statement.

*    DROP Remove a table, trigger, index, macro, view definition, depending on the object of the DROP statement.

Other data definitions allow us to perform the following functions:

*    ALTER Change a table, trigger or protection definition

*    RENAME table triggers views and macros

*    REPLACE macros, triggers or views

*    SET time zones

*    DELETE database or users

*    COLLECT statistics on a column or index

*    MODIFY databases or users

*    COMMENT on database objects

*    GRANT/REVOKE access rights, logon rights

*    GIVE a database object to another database or user

*    HELP on database objects

*    SHOW database objects

*    ABORT, ROLLBACK, COMMIT, BEGIN/END TRANSACTION to manage   transactions

*    CHECKPOINT a journal

*    DATABASE to set default

*    ECHO a string or command to client

DDL statement may be entered as a single statement, the solitary statement, and the solitary statement in a macro.

DML:

Data manipulation Language (DML) is used to add, delete, and revise data in existing tables or views.

DML supports statements for manipulating and processing database values. It enables us to perform following operations:

*    BEGIN and END which is a user defined transaction.

*    COMMENT which is a transaction in ANSI mode.

*    INSERT insert new rows into a table

*    UPDATE modifies data in one or more rows of a table.

DELETE removes a row from a table.

11.) Explain about Generalisation and Aggregation

    Generalization:

Let’s think of a reverse process of abstraction in which we suppress the differences among several entity types, identify their common features, and generalize them into a single superclass of which the original entity types are special subclasses.

The term “generalization” refers to the process of defining a generalized entity type from the given entity types.

Generalization

Consider extending the entity set account by classifying accounts as being either savings-account or chequing-account.

Each of these is described by the attributes of account plus additional attributes. (savings has interest-rate and chequing has overdraft-amount.)

We can express the similarities between the entity sets by generalization. This is the process of forming containment relationships between a higher-level entity set and one or more lower-level entity sets.

In E-R diagrams, generalization is shown by a triangle, as shown in Figure 1

  
Figure 1: Generalization

*    Generalization hides differences and emphasizes similarities.

*    Distinction made through attribute inheritance.

*    Attributes of higher-level entity are inherited by lower-level entities.

*    Two methods for conversion to a table form:

*    Create a table for the high-level entity, plus tables for the lower-level entities containing also their specific attributes.

*    Create only tables for the lower-level entities.

 

Aggregation:

It is an abstraction concept for building composite objects from their component objects.

It is used when we have to model a relationship involving entity sets and a relationship set. It allows us to treat a relationship set as an entity set   for purposes of participation in other relationships.

The E-R model cannot express relationships among relationships.

 

When would we need such a thing?

 

Consider a DB with information about employees who work on a particular project and use a number of machines doing that work. We get the E-R diagram shown in the following Figure 1:

 

 

 

Figure 1: E-R diagram with redundant relationships

 

Relationship sets work and uses could be combined into a single set. However, they shouldn't be, as this would obscure the logical structure of this scheme.

 

The solution is to use aggregation.

 

An abstraction through which relationships are treated as higher-level entities.

For our example, we treat the relationship set work and the entity sets employee and project as a higher-level entity set called work.

 

Figure 2 : shows the E-R diagram with aggregation.

 

 

Figure 2: E-R diagram with aggregation

 

 

Transforming an E-R diagram with aggregation into tabular form is easy. We create a table for each entity and relationship set as before.

 

The table for relationship set uses contains a column for each attribute in the primary key of machinery and work.

 

12.) Compare SQL and QBE

QBE and SQL:

*    QBE language is important because it is one of the first graphical query languages with minimum syntax developed for database systems.

*    QBE differs from SQL in that  the user does not have to specify a structured query explicitly; rather, the query is formulated by filling in templates of relations displayed on the monitor screen.

*    The user does not have to follow any rigid syntax rules for query specification in the QBE.

*    QBE is related to the domain relational calculus whereas SQL is based to a greater extent on the tuple relational calculus.

*    QBE’s original specification is shown to be relationally complete.

*    QBE does not use the “linear” style of SQL; rather, it is a “two-dimensional” language, because users specify a query moving around the full area of the screen.

13.) Explain about Domain Calculus expression.

Solution:

The Domain Relational Calculus

Domain variables take on values from an attribute's domain, rather than values for an entire tuple

Formal Definitions

1.    An expression is of the form

where the represent domain variables, and is a formula.

2.    An atom in the domain relational calculus is of the following forms

*    where is a relation on attributes, and , are domain variables or constants.

*    , where and are domain variables, and is a comparison operator.

*    , where c is a constant.

3.    Formulae are built up from atoms using the following rules:

*    An atom is a formula.

*    If is a formula, then so are and .

*    If and are formulae, then so are , and .

*    If is a formula where x is a domain variable, then so are and .

14.) Write short notes on a)Entities and attributes b)Strong and Weak Entities c)Cross products and Joins

Solution:

a.)

An entity is an object that exists and is distinguishable from other objects. For instance, John Harris with S.I.N. 890-12-3456 is an entity, as he can be uniquely identified as one particular person in the universe.

An entity may be concrete (a person or a book, for example) or abstract (like a holiday or a concept).

An entity is represented by a set of attributes.

E.g. name, S.I.N., street, city for ``customer'' entity.

The domain of the attribute is the set of permitted values (e.g. the telephone number must be seven positive integers).

Formally, an attribute is a function which maps an entity set into a domain.

Every entity is described by a set of (attribute, data value) pairs.

There is one pair for each attribute of the entity set.

E.g. a particular customer entity is described by the set {(name, Harris), (S.I.N., 890-123-456), (street, North), (city, Georgetown)}.

b.)

Strong vs. Weak Entities

A Strong Entity does not depend on anything else in the database for its existence.  It contains its own primary key.

A Weak Entity depends on some related entity for its existence.  It has no candidate keys without including the primary key of the entity it depends on.

Strong vs. Weak Example

Employee ( empNo PK, fname, lname salary,…) HAS Child (fname)

Employee is strong (with empNo as primary key)

Child is weak - many children may be called “Jennifer”, but only one Jennifer is the daughter of employee 24901.

15.) Explain different levels of abstraction in a DBMS.. What is data independence? Explain different types of Data Independence

Data Abstraction

1.     The major purpose of a database system is to provide users with an abstract view of the system.

The system hides certain details of how data is stored and created and maintained

Complexity should be hidden from database users.

2.    There are several levels of abstraction:

1.    Physical Level:

*    How the data are stored.

*    E.g. index, B-tree, hashing.

*    Lowest level of abstraction.

*    Complex low-level structures described in detail.

2.    Conceptual Level:

*    Next highest level of abstraction.

*    Describes what data are stored.

*    Describes the relationships among data.

*    Database administrator level.

3.    View Level:

*    Highest level.

*    Describes part of the database for a particular group of users.

*    Can be many different views of a database.

*    E.g. tellers in a bank get a view of customer accounts, but not of payroll data.

Figure illustrates the three levels.

  
Figure: The three levels of data abstraction

Data Independence

The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.

There are two kinds:

Physical data independence

The ability to modify the physical scheme without causing application programs to be rewritten. Modifications at this level are usually to improve performance.

Logical data independence

The ability to modify the conceptual scheme without causing application programs to be rewritten. Usually done when logical structure of database is altered .

Logical data independence is harder to achieve as the application programs are usually heavily dependent on the logical structure of the data. An analogy is made to abstract data types in programming languages.

16.) Explain TRC and DRC with examples. List two reasons why null values are introduced into databases. Explain nested subqueries in SQL.

Solution:

Relational calculus is an alternative to relational algebra. The calculus is nonprocedural, or declarative, in that it allows us to describe the set of answers without being explicit about how they should be computed. Relational calculus has had a big influence on the design of commercial query languages such as SQL and, especially, Query-by-Example (QBE).

The variant of the calculus we present in detail is called the tuple relational calculus (TRC). Variables in TRC take on tuples as values. In another variant, called the domain relational calculus (DRC), the variables range over field values. TRC has had more of an influence on SQL, while DRC has strongly influenced QBE.

Tuple Relational Calculus

A tuple variable is a variable that takes on tuples  of a particular relation schema as values. That is, every value assigned to a given tuple variable has the same number and type of fields. A tuple relational calculus query has the form { T p(T) }, where T is a tuple variable and p(T) denotes a formula that describes T\ we will shortly define formulas and queries rigorously. The result of this query is the set of all tuples t for which the formula p(T) evaluates to true with T t. The language for writing formulas p(T) is thus at the heart of TRC and essentially a simple subset of first-order logic. As a simple example, consider the following query.

(Qll) Find all sailors with a rating above 7.

{S | S  E  Sailors^ S.rating > 7}

When this query is evaluated on an instance of the Sailors relation, the tuple variable S is instantiated successively with each tuple, and the test S.rating>7 is applied. The answer contains those instances of S that pass this test. On instance 53 of Sailors, the answer contains Sailors tuples with sid 31, 32, 58,

71, and 74.

Syntax of TRC Queries

We now define these concepts formally, beginning with the notion of a formula. Let Rel be a relation name, R and S be tuple variables, a be an attribute of R, and b be an attribute of S. Let op denote an operator in the set {<, >,  = , <= , >=, !=}• An atomic formula is one of the following:

        R E  Rel

        R.a op S.b

        R.a op constant, or constant op R.a

A formula is recursively defined to be one of the following, where p and q are themselves formulas and p(R) denotes a formula in which the variable R appears:

        Any atomic formula

        Not p, p ^ q, p V q, or p implies q

        There exists R such that p(R) , where K is a tuple variable

        For all R(p(R)), where R is a tuple variable

The quantifiers there exists and for all are said to bind the variable R.. A variable is said to be free in a formula or sub formula (a formula contained in a larger formula) if the (sub)formula does not contain an occurrence of a quantifier that binds it.

Every variable in a TRC formula appears in a sub formula that is atomic, and every relation schema specifies a domain for each field; each variable in a TRC formula has a well-defined domain from which values for the variable are drawn. Informally, an atomic formula R E Rel gives R the type of tuples in Rel, and comparisons such as R.a op S.b and R.a op constant induce type restrictions on the field R.a. If a variable R does not appear in an atomic formula of the form R Rel, we follow the convention that the type of R is a tuple whose fields include all fields of R that appear in the formula.

For example :

 Find the names and ages of sailors with a rating above 7.

{P /  there exists an S  Sailors(S.rating > 7  ^  P.name =S.sname ^ P.age = S.age)}.

 

Domain Relational Calculus

A domain variable is a variable that ranges over the values in the domain of some attribute (e.g., the variable can be assigned an integer if it appears in an attribute whose domain is the set of integers). A DRC query has the form {<x1,X2,. • • xn>/  p(<x1,X2,..,xN>)}, where each Xj is either a domain variable or a constant and p(<x1,x2,…,xn>) denotes a DRC formula whose only free variables are the variables among the xi, 1 < i < n. The result of this query is the set of all tuples (x1,x2,..., xn) for which the formula evaluates to true.

A DRC formula is defined in a manner very similar to the definition of a TRC formula. The main difference is that the variables are now domain variables. Let op denote an operator in the set {<,>,=,<, >, ^} and let X and Y be domain variables. An atomic formula in DRC is one of the following:

          (x1,x2,….xn)  E  Rel, where Rel is a relation with n attributes; each xi, 1 < i < n is either a variable or a constant

          X op Y

          X op constant, or constant op X

A formula is recursively defined to be one of the following, where p and q are themselves formulas and p(X) denotes a formula in which the variable A' appears:

          Any atomic formula

          Not p, p ^ q, p V q, or p implies q

          There exists R such that p(R) , where R is a domain variable

          For all R(p(R)), where R is a domain variable

 

Examples of DRC Queries

 Find all sailors with a rating above 7.

{{I, N, T, A) /   {I, N, T, A) E Sailors ^T > 7}

17.) Differentiate between binary and ternary relationships

 

 A .   

                                            Policy as an entity set.

 

 Consider the  E-R diagram . It  models a situation in which an employee can own several policies ,each policy can be owned by several employees, and each dependent can be covered by several policies.

 

Suppose that we have the following additional requirements :

*     A policy cannot be owned jointly by two or more requirements.

*    Every policy must be owned by some employee.

*    Dependents is a weak entity set , and each dependent entity is uniquely identified by taking pname in conjunction with the policyid of a policy entity.

 The first requirement suggests that we impose a key constraint on polices with respect to covers, but this constraint has the unintended side effect that a policy can cover only one dependent. The second requirement suggests that  we impose a total participation constraint on policies. This  solution is acceptable if each policy covers at least one dependent . the third requirement forces us to introduce an identifying relationship that is binary .

The best way to model this situation is to use two binary relationships,

 

 

    

 

                                            Policy revisited.

 

As a typical example of a ternary relation ship, consider entity sets parts, suppliers, and departments , and a relationship set contracts that involves all of them . A contract specifies that a supplier will supply a part to a department. This relation ship can not be adequately captured by a collection of binary relationships. With binary relationship , we can denote that a supplier ‘can supply ‘ certain parts that a department ‘needs ‘  some parts ,or a department  ‘deals with ‘ a  certain supplier. No combination of these relationships express the meaning of a contract adequately, for at least two reasons :

*    The facts that supplier s can supply part p,  that department D needs part P , and that D will buy from S do not necessarily imply that department D indeed buys part P from suppliers S.

*    We cannot represent the qty attribute of a contract cleanly.

 

19.) Construct an ER digrm for a cr insurance company that has a set of customers, each of whom owns one or more cars. Each car has associated with it zero to any number of recorded accidents.

Solution:

 

 

20.) Describe the extended ER features.

Solution:

 

21.)Draw the ER diagram for the banking Enterprise and explain

 

Solution:

 

 

Let us assume that the bank has two types of customers

1)    Depositors – Fixed Account and Current Savings

2)    Borrowers – Who take loan on interest.

 

Constraints:

*    Each customer has a unique customer_id.

*    Has to have an account either fixed or current savings if it is a depositor.

*    Fixed account holders have a unique account_no, amount, and duration of the fixed deposit.

*    Current savings account holders have a unique account_no and interest.

*    Borrowers have a unique loan_id, amount, duration(of payback) and interest.

 

22.) Describe fundamental operations of relational algebra with examples. Discuss natural join with an example.

 

Solution:

 

 

Fundamental Operations

The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a lowercase Greek sigma (), with the predicate appearing as a subscript. The argument relation is given in parentheses following the .

For example, to select tuples (rows) of the borrow relation where the branch is ``SFU'', we would write

Let Figure 1 be the borrow and branch relations in the banking example.

 
Figure 1:   The borrow and branch relations.

The new relation created as the result of this operation consists of one tuple: .

We allow comparisons using =, , <, , > and in the selection predicate.

We also allow the logical connectives (or) and (and). For example:

 
Figure 2:   The client relation.

Suppose there is one more relation, client, shown in Figure 2,  with the scheme

we might write

to find clients who have the same name as their banker.

The Project Operation

Project copies its argument relation for the specified attributes only. Since a relation is a set, duplicate rows are eliminated.

Projection is denoted by the Greek capital letter pi (). The attributes to be copied appear as subscripts.

For example, to obtain a relation showing customers and branches, but ignoring amount and loan#, we write

We can perform these operations on the relations resulting from other operations.

To get the names of customers having the same name as their bankers,

Think of select as taking rows of a relation, and project as taking columns of a relation.

The Cartesian Product Operation

The cartesian product of two relations is denoted by a cross (), written

The result of is a new relation with a tuple for each possible pairing of tuples from and .

In order to avoid ambiguity, the attribute names have attached to them the name of the relation from which they came. If no ambiguity will result, we drop the relation name.

The result is a very large relation. If has tuples, and has tuples, then will have tuples.

The resulting scheme is the concatenation of the schemes of and , with relation names added as mentioned.

To find the clients of banker Johnson and the city in which they live, we need information in both client and customer relations. We can get this by writing

However, the customer.cname column contains customers of bankers other than Johnson. (Why?)

We want rows where client.cname = customer.cname. So we can write

to get just these tuples.

Finally, to get just the customer's name and city, we need a projection:

The Rename Operation

The rename operation solves the problems that occurs with naming when performing the cartesian product of a relation with itself.

Suppose we want to find the names of all the customers who live on the same street and in the same city as Smith.

We can get the street and city of Smith by writing

To find other customers with the same information, we need to reference the customer relation again:

where is a selection predicate requiring street and ccity values to be equal.

Problem: how do we distinguish between the two street values appearing in the Cartesian product, as both come from a customer relation?
Solution: use the rename operator, denoted by the Greek letter rho ().

We write

to get the relation under the name of .

If we use this to rename one of the two customer relations we are using, the ambiguities will disappear.

The Union Operation

The union operation is denoted as in set theory. It returns the union (set union) of two compatible relations.

For a union operation to be legal, we require that

*    and must have the same number of attributes.

*    The domains of the corresponding attributes must be the same.

To find all customers of the SFU branch, we must find everyone who has a loan or an account or both at the branch.

We need both borrow and deposit relations for this:

As in all set operations, duplicates are eliminated, giving the relation of Figure 3(a).

 
Figure 3:   The union and set-difference operations.

The Set Difference Operation

Set difference is denoted by the minus sign (). It finds tuples that are in one relation, but not in another.

Thus results in a relation containing tuples that are in but not in .

To find customers of the SFU branch who have an account there but no loan, we write

The result is shown in Figure 3(b).

We can do more with this operation. Suppose we want to find the largest account balance in the bank.
Strategy:

*    Find a relation containing the balances not the largest.

*    Compute the set difference of and the deposit relation.

To find , we write

This resulting relation contains all balances except the largest one. (See Figure 4(a)).

Now we can finish our query by taking the set difference:

Figure 4(b) shows the result.

 
Figure 4:   Find the largest account balance in the bank.

The Natural Join Operation

Often we want to simplify queries on a cartesian product.

For example, to find all customers having a loan at the bank and the cities in which they live, we need borrow and customer relations:

Our selection predicate obtains only those tuples pertaining to only one cname.

This type of operation is very common, so we have the natural join, denoted by a sign. Natural join combines a cartesian product and a selection into one operation. It performs a selection forcing equality on those attributes that appear in both relation schemes. Duplicates are removed as in all relation operations.

To illustrate, we can rewrite the previous query as

The resulting relation is shown in Figure 5

 
Figure 5:   Joining borrow and customer relations.

We can now make a more formal definition of natural join.

*    Consider and to be sets of attributes.

*    We denote attributes appearing in both relations by .

*    We denote attributes in either or both relations by .

*    Consider two relations and .

*    The natural join of and , denoted by is a relation on scheme .

*    It is a projection onto of a selection on where the predicate requires for each attribute in .

Formally,

where .

To find the assets and names of all branches which have depositors living in Stamford, we need customer, deposit and branch relations:

Note that is associative.

To find all customers who have both an account and a loan at the SFU branch:

This is equivalent to the set intersection version we wrote earlier. We see now that there can be several ways to write a query in the relational algebra.

If two relations and have no attributes in common, then , and .

 

23.)Explain various built in functions of SQL.

 

Built-In Functions

PL/SQL provides more than 75 powerful functions to help you manipulate data. These built-in functions fall into the following categories:

1)    error-reporting

2)    number

3)    character

4)    conversion

5)    date

6)    miscellaneous

Table 1 shows the functions in each category.

You can use all the functions in SQL statements except the error- reporting functions SQLCODE and SQLERRM. Also, you can use all the functions in procedural statements except the miscellaneous functions DECODE, DUMP, and VSIZE.

Note: The SQL group functions AVG, MIN, MAX, COUNT, SUM, STDDEV, and VARIANCE are not built into PL/SQL. Nevertheless, you can use them in SQL statements (but not in procedural statements).

Error

Number

Character

Conversion

Date

Misc

SQLCODE

ABS

ASCII

CHARTOROWID

ADD_MONTHS

DECODE

SQLERRM

ACOS

CHR

CONVERT

LAST_DAY

DUMP

 

ASIN

CONCAT

HEXTORAW

MONTHS_BETWEEN

GREATEST

 

ATAN

INITCAP

RAWTOHEX

NEW_TIME

GREATEST_LB

 

ATAN2

INSTR

ROWIDTOCHAR

NEXT_DAY

LEAST

 

CEIL

INSTRB

TO_CHAR

ROUND

LEAST_LB

 

COS

LENGTH

TO_DATE

SYSDATE

NVL

 

COSH

LENGTHB

TO_LABEL

TRUNC

UID

 

EXP

LOWER

TO_MULTI_BYTE

 

USER

 

FLOOR

LPAD

TO_NUMBER

 

USERENV

 

LN

LTRIM

TO_SINGLE_BYTE

 

VSIZE

 

LOG

NLS_INITCAP

 

 

 

 

MOD

NLS_LOWER

 

 

 

 

POWER

NLS_UPPER

 

 

 

 

ROUND

NLSSORT

 

 

 

 

SIGN

REPLACE

 

 

 

 

SIN

RPAD

 

 

 

 

SINH

RTRIM

 

 

 

 

SQRT

SOUNDEX

 

 

 

 

TAN

SUBSTR

 

 

 

 

TANH

SUBSTRB

 

 

 

 

TRUNC

TRANSLATE

 

 

 

 

 

UPPER

 

 

 

 

Table 1. Built-in Functions