Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

realation database management sysems, Exams of Deductive Database Systems

unit 1 and unit 2 for the introduction and entity model,relation,file base approach

Typology: Exams

2017/2018

Uploaded on 10/14/2018

U171812
U171812 🇮🇳

2

(1)

2 documents

1 / 19

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
File Base Approach: File processing systems was an early attempt to computerize
the manual filing system that we are all familiar with. A file system is a method for
storing and organizing computer files and the data they contain to make it easy to
find and access them. File systems may use a storage device such as a hard disk or
CD-ROM and involve maintaining the physical location of the files. The manual filing
system works well when the number of items to be stored is small. It even works
quite adequately when there are large numbers of items and we have only to store
and retrieve them. However, the manual filing system breaks down when we have to
cross-reference or process the information in the files. For example, a typical real
estate agent's office might have a separate file for each property for sale or rent,
each potential buyer and renter, and each member of staff. Clearly the manual
system is inadequate for this' type of work. The file based system was developed in
response to the needs of industry for more efficient data access. In early processing
systems, an organization's information was stored as groups of records in separate
files.
Limitations of the File Processing System I File-Based Approach
There are following problems associated with the File Based Approach:
1. Separated and Isolated Data: To make a decision, a user might need data from
two separate files. First, the files were evaluated by analysts and programmers to
determine the specific data required from each file and the relationships between the
data and then applications could be written in a programming language to process
and extract the needed data. Imagine the work involved if data from several files was
needed.
2. Duplication of data: Often the same information is stored in more than one file.
Uncontrolled duplication of data is not required for several reasons, such as:
• Duplication is wasteful. It costs time and money to enter the data more than once
• It takes up additional storage space, again with associated costs.
Duplication can lead to loss of data integrity; in other words the data is no longer
consistent. For example, consider the duplication of data between the Payroll and
Personnel departments. If a member of staff moves to new house and the change of
address is communicated only to Personnel and not to Payroll, the person's pay slip
will be sent to the wrong address. A more serious problem occurs if an employee is
promoted with an associated increase in salary. Again, the change is notified to
Personnel but the change does not filter through to Payroll. Now, the employee is
receiving the wrong salary. When this error is detected, it will take time and effort to
resolve. Both these examples, illustrate inconsistencies that may result from the
duplication of data. As there is no automatic way for Personnel to update the data in
the Payroll files, it is difficult to foresee such inconsistencies arising. Even if Payroll is
notified of the changes, it is possible that the data will be entered incorrectly.
3. Data Dependence: In file processing systems, files and records were described by
specific physical formats that were coded into the application program by
programmers. If the format of a certain record was changed, the code in each file
containing that format must be updated. Furthermore, instructions for data storage
and access were written into the application's code. Therefore, .changes in storage
structure or access methods could greatly affect the processing or results of an
application.
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13

Partial preview of the text

Download realation database management sysems and more Exams Deductive Database Systems in PDF only on Docsity!

File Base Approach: File processing systems was an early attempt to computerize the manual filing system that we are all familiar with. A file system is a method for storing and organizing computer files and the data they contain to make it easy to find and access them. File systems may use a storage device such as a hard disk or CD-ROM and involve maintaining the physical location of the files. The manual filing system works well when the number of items to be stored is small. It even works quite adequately when there are large numbers of items and we have only to store and retrieve them. However, the manual filing system breaks down when we have to cross-reference or process the information in the files. For example, a typical real estate agent's office might have a separate file for each property for sale or rent, each potential buyer and renter, and each member of staff. Clearly the manual system is inadequate for this' type of work. The file based system was developed in response to the needs of industry for more efficient data access. In early processing systems, an organization's information was stored as groups of records in separate files.

Limitations of the File Processing System I File-Based Approach

There are following problems associated with the File Based Approach:

  1. Separated and Isolated Data: To make a decision, a user might need data from two separate files. First, the files were evaluated by analysts and programmers to determine the specific data required from each file and the relationships between the data and then applications could be written in a programming language to process and extract the needed data. Imagine the work involved if data from several files was needed.
  2. Duplication of data: Often the same information is stored in more than one file. Uncontrolled duplication of data is not required for several reasons, such as:
  • Duplication is wasteful. It costs time and money to enter the data more than once
  • It takes up additional storage space, again with associated costs.
  • Duplication can lead to loss of data integrity; in other words the data is no longer consistent. For example, consider the duplication of data between the Payroll and Personnel departments. If a member of staff moves to new house and the change of address is communicated only to Personnel and not to Payroll, the person's pay slip will be sent to the wrong address. A more serious problem occurs if an employee is promoted with an associated increase in salary. Again, the change is notified to Personnel but the change does not filter through to Payroll. Now, the employee is receiving the wrong salary. When this error is detected, it will take time and effort to resolve. Both these examples, illustrate inconsistencies that may result from the duplication of data. As there is no automatic way for Personnel to update the data in the Payroll files, it is difficult to foresee such inconsistencies arising. Even if Payroll is notified of the changes, it is possible that the data will be entered incorrectly.
  1. Data Dependence: In file processing systems, files and records were described by specific physical formats that were coded into the application program by programmers. If the format of a certain record was changed, the code in each file containing that format must be updated. Furthermore, instructions for data storage and access were written into the application's code. Therefore, .changes in storage structure or access methods could greatly affect the processing or results of an application.

In other words, in file based approach application programs are data dependent. It means that, with the change in the physical representation (how the data is physically represented in disk) or access technique (how it is physically accessed) of data, application programs are also affected and needs modification. In other words application programs are dependent on the how the data is physically stored and accessed.

4. Incompatible file formats: As the structure of files is embedded in the application programs, the structures are dependent on the application programming language. For example, the structure of a file generated by a COBOL program may be different from the structure of a file generated by a 'C' program. The direct incompatibility of such files makes them difficult to process jointly.

DBMS - A database management system is the software system that allows users

to define, create and maintain a database and provides controlled access to the data.

A Database Management System (DBMS) is basically a collection of programs that enables users to store, modify, and extract information from a database as per the requirements. DBMS is an intermediate layer between programs and the data. Programs access the DBMS, which then accesses the data. There are different types of DBMS ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are main examples of database applications:

  • Computerized library systems
  • Automated teller machines
  • Flight reservation systems
  • Computerized parts inventory systems

A database management system is a piece of software that provides services for accessing a database, while maintaining all the required features of the data. Commercially available Database management systems in the market are dbase, FoxPro, IMS and Oracle, MySQL, SQL Servers and DB2 etc.

These systems allow users to create update, and extract information from their databases.

Compared to a manual filing system, the biggest advantages to a computerized database system are speed, accuracy, and' accessibility.

Components of the Database System Environment

There are five major components in the database system environment and their interrelationships are.

  • Hardware
  • Software
  • Data
  • Users
  • Procedures
  1. Procedures: Procedures refer to the instructions and rules that govern the design and use of the database. The users of the system and the staff that manage the database require documented procedures on how to use or run the system.

These may consist of instructions on how to:

  • Log on to the DBMS.
  • Use a particular DBMS facility or application program.
  • Start and stop the DBMS.
  • Make backup copies of the database.
  • Handle hardware or software failures.

Change the structure of a table, reorganize the database across multiple disks, improve performance, or archive data to secondary storage.

5. People:

Database Administrator: The database administrator (DBA) is the person or group in charge for implementing the database system, within an organization. The "DBA has all the system privileges allowed by the DBMS and can assign (grant) and remove (revoke) levels of access (privileges) to and from other users. DBA is also responsible for the evaluation, selection and implementation of DBMS package.

Application Programmers: Professional programmers are those who are responsible for developing application programs or user interface. The application programs could be written using general purpose programming language or the commands available to manipulate a database.

END USERS: The end users are the ‘clients’ for the data base, which has been designed and implemented, and being maintained to serve their information. End users are classified into following ways:

Naive Users: Naive Users are those users who need not be aware of the presence of the database system or any other system supporting their usage. Naive users are end users of the database who work through a menu driven application program, where the type and range of response is always indicated to the user.

A user of an Automatic Teller Machine (ATM) falls in this category. The user is instructed through each step of a transaction. He or she then responds by pressing a coded key or entering a numeric value. The operations that can be performed by valve users are very limited and affect only a precise portion of the database. For example, in the case of the user of the Automatic Teller Machine, user's action affects only one or more of his/her own accounts.

Sophisticated Users: Such users interact with the system without, writing programs. Instead, they form their requests in database query language. Each such query is submitted to a very processor whose function is to breakdown DML statement into instructions that the storage manager understands.

Advantages of DBMS

The database management system has promising potential advantages, which are explained below:

  1. Controlling Redundancy: In file system, each application has its own private files, which cannot be shared between multiple applications. 1:his can often lead to considerable redundancy in the stored data, which results in wastage of storage space. By having centralized database most of this can be avoided. It is not possible that all redundancy should be eliminated. Sometimes there are sound business and technical reasons for· maintaining multiple copies of the same data. In a database system, however this redundancy can be controlled. 2. Data Consistency: By controlling the data redundancy, the data consistency is obtained. If a data item appears only once, any update to its value has to be performed only once and the updated value is immediately available to all users. If the DBMS has controlled redundancy, the database system enforces consistency. 3. Data Security: Form is very important object of DBMS. You can create forms very easily and quickly in DBMS. Once a form is created, it can be used many times and it can be modified very easily. The created forms are also saved along with database and behave like a software component. A form provides very easy way (user-friendly) to enter data into database, edit data and display data from database. The non- technical users can also perform various operations on database through forms without going into technical details of a fatabase 4. Control over Concurrency: In a computer file-based system, if two users are allowed to access data simultaneously, it is possible that they will interfere with each other. For example, if both users attempt to perform update operation on the same record, then one may overwrite the values recorded by the other. Most database management systems have sub-systems to control the concurrency so that transactions are always recorded with accuracy. 5. Backup and Recovery Procedures: In a computer file-based system, the user creates the backup of data regularly to protect the valuable data from damage due to failures to the computer system or application program. It is very time consuming method, if amount of data is large. Most of the DBMSs provide the 'backup and recovery' sub-systems that automatically create the backup of data and restore data if required.

6. Data Security: Form is very important object of DBMS. You can create forms very easily and quickly in DBMS. Once a form is created, it can be used many times and it can be modified very easily. The created forms are also saved along with database and behave like a software component. A form provides very easy way (user-friendly) to enter data into database, edit data and display data from database. The non- technical users can also perform various operations on database through forms without going into technical details of a fatabase.

7. Integration of Data : In Database management system, data in database is stored in tables. A single database contains multiple tables and relationships can be created between tables (or associated data entities). This makes easy to retrieve and update data.

8. Standards can be enforced: Since DBMS is a central system, so standard can be enforced easily may be at Company level, Department level, National level or International level. The standardized data is very helpful during migration or

and Languages (CODASYL, 1971). The DBTG recognized the need for a two level approach with a system view called the schema and user views called subschema. The American National Standards Institute (ANSI) Standards Planning and

Requirements Committee (SPARC) produced a similar terminology mid architecture in 1975 (ANSI 1975). ANSI-SPARC recognized the need for a three level approach

with a system catalog.There are following three levels or layers of DBMS architecture:

  • External Level

•Conceptual Level

  • Internal Level

Objective of the Three Level Architecture

The objective of the three level architecture is to separate each user's view of the database from the Way the database is physically represented. There are several reasons why this separation is desirable:

  • Each user should be able to access the same data, but have a different customized view of the data. Each user should be able to change the way he or she views the data, and this change should not affect other users.
  • Users should not have to deal directly with physical database storage details, such as indexing or hashing. In other words a user's interaction with the database should be independent of storage considerations.
  • The Database Administrator (DBA) should be able to change the database storage structures without affecting the user's views. . The internal structure of the database should be unaffected by changes to the physical aspects of storage, such as the changeover to a new storage device. . The DBA should be able to change the conceptual structure of the database without affecting all users.

External Level or View level

It is the users' view of the database. This level describes that part of the database that is relevant to each user. External level is the one which is closest to the end users. This level deals with the way in which individual users vie\v data. Individual users are given different views according to the user's requirement.

A view involves only those portions of a database which are of concern to a user. Therefore same database can have different views for different users. The external view insulates users from the details of the internal and conceptual levels. External level is also known as the view level. In addition different views may have different representations of the same data. For example, one user may view dates in the form (day, month, year), while another may view dates as (year, month, day).

Conceptual Level or Logical level

It is the community view of the database. This level describes what data is stored in the database and the relationships among the data. The middle level in the three level architecture is the conceptual level. This level contains the logical structure of the entire database as seen by the DBA. It is a complete view of the data

requirements of the organization that is independent of any storage considerations. The conceptual level represents:

  • All entities, their attributes, and their relationships;

An Entity is an object whose information is stored in the database. For example, in student database the entity is student. An attribute is a characteristic of interest about an entity.

The conceptual level supports each external view, in that any data available to a user must be contained in, or derivable from, the conceptual level. However, this level must not contain any storage dependent details. For instance, the description of an entity should contain only data types of attributes (for example, integer, real, character) and their length (such as the maximum number of digits or characters), but not any storage considerations, such as the number of bytes occupied. Conceptual level is also known as the, logical level.

Internal level or Storage level

It is the physical representation of the database on the computer. This level describes how the data is stored in the database. The internal level is the one that concerns the way the data are physically stored on the hardware. The internal level covers the physical\ implementation of the database to achieve optimal runtime performance and storage space utilization. It covers the data structures and file organizations used to store data on storage devices. It interfaces with the operating system access methods to place the data on the storage devices, build the indexes, retrieve the data, and so· on.

The internal level is concerned with such things as:

  • Storage space allocation for data and indexes;
  • Record descriptions for storage (with stored sizes for data items);
  • Record placement;
  • Data compression and data encryption techniques.

There will be only one conceptual view, consisting of the abstract representation of the database in it’s entirely. Similarly there will be only one internal or physical view, representing the total database, as it is physically stored.

Data Independence of DBMS : A major objective for three-level architecture is

to provide data independence, which means that upper levels are unaffected by changes in lower levels.

There are two kinds of data independence:

Logical Data Independence

Data Definition Language (DDL)

It is a language that allows the users to define data and their relationship to other types of data. It is mainly used to create files, databases, data dictionary and tables within databases. It is also used to specify the structure of each table, set of associated values with each attribute, integrity constraints, security and authorization information for each table and physical storage structure of each table on disk.

Data Manipulation Language (DML)

It is a language that provides a set of operations to support the basic data manipulation operations on the data held in the databases. It allows users to insert, update, delete and retrieve data from the database. The part of DML that involves data retrieval is called a query language.

DATA MODEL

A model is a representation of reality, 'real world' objects and events, associations. It is an abstraction that concentrates on the essential, inherent aspects an organization and ignores the accidental properties. A data model represents the organization itself. It should provide the basic concepts and notations that will allow database designers and end users unambiguously and accurately to communicate their understanding of the organizational data.

Data Model can be defined as an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization.

A data model comprises of three components:

  • A structural part, consisting of a set of rules according to which databases can be constructed.
  • A manipulative part, defining the types of operation that are allowed on the data (this includes the operations that are used for updating or retrieving data from the database and for changing the structure of the database).
  • Possibly a set of integrity rules, which ensures that the data is accurate.

The purpose of a data model is to represent data and to make the data understandable. There have been many data models proposed in the literature. They fall into three broad categories:

•ObjectBasedDataModels •PhysicalDataModels

  • Record Based Data Models

The object based and record based data models are used to describe data at the conceptual and external levels, the physical data model is used to· describe data at the internal level.

Object Based Data Models

Object based data models use concepts such as entities, attributes, and relationships. An entity is a distinct object (a person, place, concept, and event) in the organization that is to be represented in the database. An attribute is a property

that describes some aspect of the object that we wish to record, and a relationship is an association between entities.

Some of the more common types of object based data model are:

  • Entity-Relationship
  • Object Oriented
  • Semantic
  • Functional

The Entity-Relationship model has emerged as one of the main techniques for modelling database design and forms the basis for the database design methodology. The object oriented data model extends the definition of an entity to include, not only the attributes that describe the state of the object but also the actions that are associated with the object, that is, its behaviour. The object is said to encapsulate both state and behaviour. Entities in semantic systems represent the equivalent of a record in a relational system or an object in an OO system but they do not include behaviour (methods). They are abstractions 'used to represent real world (e.g. customer) or conceptual (e.g. bank account) objects. The functional data model is now almost twenty years old. The original idea was to' view the database as a collection of extensionally defined functions and to use a functional language for querying the database.

Physical Data Models

Physical data models describe how data is stored in the computer, representing information such as record structures, record ordering, and access paths. There are not as many physical data models as logical data models, the most common one being the Unifying Model.

Record Based Logical Models

Record based logical models are used in describing data at the logical and view levels. In contrast to object based data models, they are used to specify the overall logical structure of the database and to provide a higher-level description of the implementation. Record based models are so named because the database is structured in fixed format records of several types. Each record type defines a fixed number of fields, or attributes, and each field is usually of a fixed length.

The three most widely accepted record based data models are:

  • Hierarchical Model
  • Network Model
  • Relational Model

The relational model has gained favor over the other two in recent years. The network and hierarchical models are still used in a large number of older databases.

Hierarchical DBMS

In HDBMS, data is organized in a tree like manner. There is a parent- child relationship among data items and the data model is very suitable for representing one-to-many relationship. To access the data items, some kind of tree-traversal techniques are used, such as preorder traversal.

This organization saves a lot of space as data is not made redundant.

Network DBMS

The NDBMS is built primarily on a one–to-many relationship, but where a parent-child representation among the data items cannot be ensured. This may happen in any real world situation where any entity can be linked to any entity. The NDBMS was proposed by a group of theorists known as the Database Task Group ( DBTG ). What they said looks like this:

In NDBMS, all entities are called Records and all relationships are called Sets. The record from where the relationship starts is called the Owner Record and where it ends is called Member Record. The relationship or set is strictly one-to-many.

In case we need to represent a many-to-many relationship, an interesting thing happens. In NDBMS, Owner and Member can only have one-to- many relationship. We have to introduce a third common record with which both the Owner and Member can have one-to-many relationship. Using this common record, the Owner and Member can be linked by a many-to-many relationship.

Suppose we have to represent the statement Teachers teach students. We have to introduce a third record, suppose CLASS to which both teacher and the student can have a many-to-many relationship. Using the class in the middle, teacher and student can be linked to a virtual many-to many relationship.

Relational Model

Relational model stores data in the form of tables. This concept purposed by Dr. E.F. Codd, a researcher of IBM in the year 1960s. The relational model consists of three major components:

  1. The set of relations and set of domains that defines the way data can be represented (data structure).
  2. Integrity rules that define the procedure to protect the data (data integrity).
  3. The operations that can be performed on data (data manipulation).

A rational model database is defined as a database that allows you to group its data items into one or more independent tables that can be related to one another by using fields common to each related table.

Attributes means characteristics. For instance, in a database or a spreadsheet you can apply attributes to each field or cell to customize your document. As a general attribute, you can choose whether it is to be a text field or a numeric field or perhaps a computed field, whose value the application calculates for you.

Then you can apply more specific attributes to the field or cell, such as making the text bold and right-aligned and perhaps in a particular typeface. If a field is numeric, you will have other attribute options available, such as how many decimal places to display, whether to use a dollar sign or a percent symbol, or whether to start a formula in the cell.

Domains: A domain definition specifies the kind of data represented by the attribute.

More- particularly, a domain is the set of all possible values that an attribute may validly contain. Domains are often confused with data type s, but this is inaccurate. Data type is a physical concept while domain is a logical one. "Number" is a data type and "Age" is a domain. To give another example "StreetName" and "Surname" might both be represented as text fields, but they are obviously different kinds of text fields; they belong to different domains.

Domain is also a broader concept than data type, in that a domain definition includes a more specific description of the valid data. For example, the domain Degree A warded, which represents the degrees awarded by a university. In the database schema, this attribute might be defined as Text [3], but it's not just any three-character string, it's a member of the set {BA, BS, MA, MS, PhD, LLB, MD}. Of course, not all domains can be defined by simply listing their values. Age, for example, contains a hundred or so values if we are talking about people, but tens of thousands if we are talking about museum exhibits. In such instances it's useful to define the domain in terms of the rules, which can be used to determine the membership of any specific value in the set of all valid values.

Tuples of a Relation

key) is not having the property of irreducibility because Roll_number which is one subset of the composite key is also unique itself. Thus, this composite key is called as super key because it has the property of uniqueness but not the irreducibility.

Consider a relation of Patient in which Patient_number is unique. Then, Patient_number is a candidate key and (Patient number, Patient name) is a super key. Thus, we can say that "A superset of a candidate key is a super key."

Primary key is an attribute or a set of attributes of a relation which posses the properties of uniqueness and irreducibility (No subset should be unique). For example: Supplier number in S table is primary key, Part number in P table is primary key and the combination of Supplier number and Part Number in SP table is a primary key

Foreign key is the attributes of a table, which refers to the primary key of some another table. Foreign key permit only those values, which appears in the primary key of the table to which it refers or may be null (Unknown value). For example: SNO in SP table refers the SNO of S table, which is the primary key of S table, so we can say that SNO in SP table is the foreign key. PNO in SP table refers the PNO of P table, which is the primary key of P table, so we can say that PNO in SP table is the foreign key.

Integrity Constraints

Null or Unknown Value: Null represents a value for an attribute that is

currently unknown or is not applicable for this tuple. A null can be taken to mean the

logical value 'unknown'. It can ~ean that a value is not applicable to a particular tuple, or it could merely mean that. No value has yet been supplied. Nulls are a way to deal

with incomplete or exceptional data. However a null is not the same as a zero numeric value or a text string filled with spaces; zeros and spaces are values, but a

null represents the absence of a value. Therefore, nulls should be treated differently from other values.

Entity Integrity: Entity Integrity rule states that in a base relation, value of attribute of a primary key cannot be null. Here, a base relation is a relation that corresponds to an entity in the conceptual schema. By definition, a primary key is a minimal identifier that is used to identify tuples uniquely. This means that no subset of the primary key is sufficient to provide unique identification to tuples. If we allow a null for any part of a primary key, we are implying that not all the attributes are needed to distinguish between tuples, which contradicts the definition of the primary key.

Referential integrity: - A referential integrity constraint designates a column or combination of columns as a foreign key and establishes a relationship between the foreign key and a specified primary or unique key, called the referenced key. or The value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation is called Referential Integrity. If the Base Relation/Table includes a foreign key matching Primary key of some other base table relation. Then every value of the Foreign Key in the First table/Relation must either be equal to the value of Primary Key in some tuple of the second table or be the wholly null. Or in other words a Foreign Key value must match primary key value in

some tuple of the referenced relation if that foreign key value is non-null. Sometimes it is necessary to permit foreign keys to accept nulls. Here it must be noted that the nulls are of the variety value does not exist’ rather than ‘Value unknown’.

Enterprise Constraints: Enterprise Constraints are additional rules specified by

the users or database administrators of a database. It is also possible for users to

specify additional constraints that the data must satisfy.

Attributes

Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class, and age as attributes. There exists a domain or range of values that can be assigned to attributes. For example, a student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.

Types of Attributes

  • Simple attribute − Simple attributes are atomic values, which cannot be divided further. For example, a student's phone number is an atomic value of 10 digits.
  • Composite attribute − Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and last_name.
  • Derived attribute − Derived attributes are the attributes that do not exist in the physical database, but their values are derived from other attributes present in the database. For example, average_salary in a department should not be saved directly in the database, instead it can be derived. For another example, age can be derived from data_of_birth.
  • Single-value attribute − Single-value attributes contain single value. For example − Social_Security_Number.
  • Multi-value attribute − Multi-value attributes may contain more than one values. For example, a person can have more than one phone number, email_address, etc.

STRONG AND WEAK ENTITY

The entity set which does not have sufficient attributes to form a primary key is called as Weak entity set. An entity set that has a primary key is called as Strong entity set. Consider an entity set Payment which has three attributes: payment_number, payment_date and payment_amount. Although each payment entity is distinct but