STORAGE STRUCTURE-DBMS, Study notes of Database Management Systems (DBMS)

Storage Structure, Transaction control, Concurrency control algorithms and Graph, Issues in Concurrent execution, Failures and Recovery algorithms, Case Study

Typology: Study notes

2024/2025

Available from 06/16/2025

anshy-prabha

21CSC205P-Database
Management Systems
UNIT-V

TOPICS

  • Storage Structure
  • Transaction control
  • Concurrency control algorithms and Graph
  • Issues in Concurrent execution
  • Failures and Recovery algorithms
  • Case Study: Demonstration of Entire project by applying all the concepts learned with minimum Front-End requirements, NoSQL Database, Document Oriented, Key Value pairs, Column Oriented

Physical Storage Media

Classification of Physical Storage Media

  • Speed with which data can be accessed
  • Cost per unit of data
  • Reliability
    • data loss on power failure or system crash
    • physical failure of the storage device
  • Can differentiate storage into:
    • Volatile storage: loses contents when power is switched off
    • Non-volatile storage:
      • Contents persist even when power is switched off.
      • Includes secondary and tertiary storage, as well as battery-backed main memory.

Storage Device Hierarchy

3. Flash Memory

  • Data survives power failure
  • Data can be written at a location only once, but location can be erased and written to again
    • Can support only a limited number (10K – 1M) of write/erase cycles.
    • Erasing of memory has to be done to an entire bank of memory
  • Reads are roughly as fast as main memory
  • But writes are slow (few microseconds), erase is slower
  • Cost per unit of storage roughly similar to main memory
  • Widely used in embedded devices such as digital cameras, phones, and USB keys
  • Is a type of EEPROM (Electrically Erasable Programmable Read-Only Memory)
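
The write-once, erase-by-bank behaviour described above can be illustrated with a toy model. This is only a sketch: the class name `FlashBank`, the bank size, and the erase-cycle limit are made-up values chosen for illustration, not properties of any real device.

```python
class FlashBank:
    """Toy model of one flash erase bank (all sizes and limits are illustrative)."""

    def __init__(self, num_locations=4, max_erase_cycles=3):
        self.cells = [None] * num_locations      # None means "erased, writable"
        self.erase_cycles_left = max_erase_cycles

    def write(self, location, value):
        # A location can be written only once after an erase.
        if self.cells[location] is not None:
            raise ValueError("location already written; erase the bank first")
        self.cells[location] = value

    def read(self, location):
        return self.cells[location]

    def erase(self):
        # Erase always applies to the entire bank and uses up one cycle.
        if self.erase_cycles_left == 0:
            raise RuntimeError("bank worn out: erase cycle limit reached")
        self.erase_cycles_left -= 1
        self.cells = [None] * len(self.cells)


bank = FlashBank()
bank.write(0, "A")
try:
    bank.write(0, "B")           # second write to the same location is rejected
    overwrite_rejected = False
except ValueError:
    overwrite_rejected = True
bank.erase()                     # wipes every location in the bank
bank.write(0, "B")               # after the erase, the write succeeds
```

Updating a single location therefore costs a whole-bank erase, which is why writes and erases are the expensive operations on flash.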

4. Magnetic-disk Storage

  • Data is stored on spinning disk, and read/written magnetically
  • Primary medium for the long-term storage of data; typically stores entire database.
  • Data must be moved from disk to main memory for access, and written back for storage
    • Much slower access than main memory
    • direct-access – possible to read data on disk in any order, unlike magnetic tape
  • Capacities range up to roughly 400 GB currently
    • Much larger capacity and lower cost/byte than main memory/flash memory
    • Growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years)
  • Survives power failures and system crashes
    • disk failure can destroy data, but is rare

6. Tape Storage

  • Non-volatile, used primarily for backup (to recover from disk failure), and for archival data
  • Sequential-access – much slower than disk
  • Very high capacity (40 to 300 GB tapes available)
  • Tape can be removed from drive, so storage costs are much cheaper than disk, but drives are expensive
  • Tape jukeboxes available for storing massive amounts of data
    • hundreds of terabytes (1 terabyte = 10^12 bytes) to even multiple petabytes (1 petabyte = 10^15 bytes)

Storage Hierarchy (Cont.)

  • Primary Storage: Fastest media but volatile (cache, main memory).
  • Secondary Storage: next level in hierarchy, non-volatile, moderately fast access time
    • also called on-line storage
    • E.g. flash memory, magnetic disks
  • Tertiary Storage: lowest level in hierarchy, non-volatile, slow access time
    • also called off-line storage
    • E.g. magnetic tape, optical storage

Magnetic Disks: Physical Characteristics of Disks

  • Read-write head
    • Positioned very close to the platter surface
    • Reads or writes magnetically encoded information.
  • Surface of platter divided into circular tracks
    • Over 50K-100K tracks per platter on typical hard disks
  • Each track is divided into sectors.
    • A sector is the smallest unit of data that can be read or written.
    • Sector size typically 512 bytes
    • Typical sectors per track: 500 to 1000 (on inner tracks) to 1000 to 2000 (on outer tracks)
  • To read/write a sector
    • disk arm swings to position head on right track
    • platter spins continually; data is read/written as sector passes under head
  • Head-disk assemblies
    • multiple disk platters on a single spindle (1 to 5 usually)
    • one head per platter, mounted on a common arm.
  • Cylinder i consists of the i-th track of all the platters
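
The geometry above (surfaces, tracks, sectors) determines a disk's raw capacity. The following is a rough sketch of that arithmetic. The function names and the sample figures are illustrative assumptions, and it uses an average sectors-per-track value even though the notes point out that inner and outer tracks differ.

```python
def disk_capacity_bytes(recording_surfaces, tracks_per_surface,
                        avg_sectors_per_track, sector_size_bytes=512):
    """Raw capacity = surfaces x tracks x sectors per track x bytes per sector."""
    return (recording_surfaces * tracks_per_surface
            * avg_sectors_per_track * sector_size_bytes)


def cylinder_capacity_bytes(recording_surfaces, sectors_per_track,
                            sector_size_bytes=512):
    """A cylinder holds one track from every surface, so it can be read
    without moving the arm."""
    return recording_surfaces * sectors_per_track * sector_size_bytes


# Hypothetical drive: 4 recording surfaces, 50,000 tracks each,
# 1,000 sectors per track on average, 512-byte sectors.
capacity = disk_capacity_bytes(4, 50_000, 1_000)
cylinder = cylinder_capacity_bytes(4, 1_000)
print(capacity / 10**9, "GB")    # 102.4 GB
print(cylinder, "bytes per cylinder")
```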

Magnetic Disks (Cont.)

  • Earlier generation disks were susceptible to head-crashes
    • Surface of earlier generation disks had metal-oxide coatings which would disintegrate on head crash and damage all data on disk
    • Current generation disks are less susceptible to such disastrous failures, although individual sectors may get corrupted
  • Disk controller – interfaces between the computer system and the disk drive hardware.
    • accepts high-level commands to read or write a sector
    • initiates actions such as moving the disk arm to the right track and actually reading or writing the data
    • Computes and attaches checksums to each sector to verify that data is read back correctly
      • If data is corrupted, with very high probability stored checksum won’t match recomputed checksum
    • Ensures successful writing by reading back sector after writing it
    • Performs remapping of bad sectors
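
The checksum idea used by the disk controller can be sketched in a few lines. This is an illustration only: real controllers do this in hardware or firmware with their own error-detecting codes, and CRC-32 from Python's `zlib` module is used here merely as a stand-in. The function names `write_sector` and `read_sector` are hypothetical.

```python
import zlib


def write_sector(data):
    """Store the data together with a checksum computed at write time."""
    return {"data": bytes(data), "checksum": zlib.crc32(data)}


def read_sector(stored):
    """Recompute the checksum on read and compare it with the stored one."""
    if zlib.crc32(stored["data"]) != stored["checksum"]:
        raise IOError("checksum mismatch: sector is corrupted")
    return stored["data"]


sector = write_sector(b"account=42;balance=100")
assert read_sector(sector) == b"account=42;balance=100"   # clean read-back

# Simulate corruption by flipping one bit of the stored data.
damaged = bytearray(sector["data"])
damaged[0] ^= 0x01
sector["data"] = bytes(damaged)

try:
    read_sector(sector)
    corruption_detected = False
except IOError:
    corruption_detected = True
```

A mismatch tells the controller that the sector cannot be trusted, which is the trigger for retrying the read or remapping the bad sector.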

Disk Subsystem (cont.)

  • Disks usually connected directly to computer system
  • In Storage Area Networks (SAN), a large number of disks are connected by a high-speed network to a number of servers
  • In Network Attached Storage (NAS), networked storage provides a file system interface using a networked file system protocol, instead of providing a disk system interface

Performance Measures of Disks

  • Access time – the time it takes from when a read or write request is issued to when data transfer begins.
    • Seek time – time it takes to reposition the arm over the correct track.
      • Average seek time is 1/2 the worst case seek time.
        • Would be 1/3 if all tracks had the same number of sectors, and we ignore the time to start and stop arm movement
      • 4 to 10 milliseconds on typical disks
    • Rotational latency – time it takes for the sector to be accessed to appear under the head.
      • Average latency is 1/2 of the worst case latency.
      • 4 to 11 milliseconds per full rotation on typical disks (5400 to 15000 r.p.m.), so about 2 to 5.5 milliseconds on average
  • Data-transfer rate – the rate at which data can be retrieved from or stored to the disk.
    • 25 to 100 MB per second max rate, lower for inner tracks
    • Multiple disks may share a controller, so rate that controller can handle is also important
      • E.g. SATA: 150 MB/sec, SATA-II 3Gb (300 MB/sec)
      • Ultra 320 SCSI: 320 MB/s, SAS (3 to 6 Gb/sec)
      • Fiber Channel (FC2Gb or 4Gb): 256 to 512 MB/s
  • Mean time to failure (MTTF) – the average time the disk is expected to run continuously without any failure.
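
The performance measures above combine into a simple estimate of how long one request takes. The sketch below assumes access time = average seek time + average rotational latency, with the transfer time added separately. The function names and sample figures are illustrative and do not describe any specific drive.

```python
def average_rotational_latency_ms(rpm):
    """Average latency is half of one full rotation (the worst case)."""
    full_rotation_ms = 60_000 / rpm
    return full_rotation_ms / 2


def average_access_time_ms(avg_seek_ms, rpm):
    """Time from issuing a request until data transfer begins."""
    return avg_seek_ms + average_rotational_latency_ms(rpm)


def transfer_time_ms(num_bytes, rate_mb_per_s):
    """Time to move num_bytes at the given data-transfer rate (1 MB = 10^6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1_000_000) * 1000


# Hypothetical 15000 r.p.m. disk with a 4 ms average seek and a 50 MB/s transfer rate.
access = average_access_time_ms(avg_seek_ms=4, rpm=15_000)   # 4 + 2 = 6.0 ms
transfer = transfer_time_ms(4096, 50)                        # about 0.08 ms for a 4 KB block
print(access + transfer)
```

For a small request, seek and rotation dominate the total, which is why reducing the number of random accesses matters more than raw transfer rate.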

Redundant Array of Independent Disks (RAID)

  • RAID is a technology that uses multiple physical disk drives to protect data from a single disk failure.
  • The purpose of RAID is to ensure that, when a disk fails, at least one copy of the data remains available for immediate use.
  • RAID levels define the use of disk arrays.

RAID levels

  • RAID 0
  • RAID 1
  • RAID 2
  • RAID 3
  • RAID 4
  • RAID 5
  • RAID 6

RAID 0

  • RAID 0 consists of striping, with no mirroring, parity, or other redundancy of data. It offers the best performance, but no fault tolerance.
  • In this level, a striped array of disks is implemented. The data is broken down into blocks and the blocks are distributed among disks.
  • Blocks “1, 2” together form a stripe.
  • Each disk receives a block of data to write/read in parallel.
  • Reliability: there is no duplication of data. Hence, a block once lost cannot be recovered.
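
Block-level striping as described above can be sketched with a round-robin mapping from logical block number to disk. This is a minimal illustration assuming one block per disk per stripe. The function names and the two-disk example are assumptions, not a description of any particular RAID controller.

```python
def locate_block(logical_block, num_disks):
    """Map a zero-based logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks


def stripe_blocks(blocks, num_disks):
    """Distribute blocks round-robin across the disks, as RAID 0 does."""
    disks = [[] for _ in range(num_disks)]
    for logical_block, block in enumerate(blocks):
        disk_index, _ = locate_block(logical_block, num_disks)
        disks[disk_index].append(block)
    return disks


disks = stripe_blocks(["1", "2", "3", "4"], num_disks=2)
# Disk 0 holds blocks 1 and 3, disk 1 holds blocks 2 and 4,
# so blocks "1, 2" form the first stripe and "3, 4" the second.
print(disks)

# With no redundancy, losing disk 1 loses every block stored on it.
lost_blocks = disks[1]
print(lost_blocks)
```

Because consecutive blocks sit on different disks, they can be read or written in parallel, but the failure of any one disk makes the full data set unrecoverable.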