Biological Databases and Sequence Alignment Techniques, Assignments of Computer Aided Design (CAD)

A comprehensive overview of the different types of biological databases: primary, secondary, and derived. It covers the key features and characteristics of major databases such as GenBank, DDBJ, EMBL, Swiss-Prot, PIR, and GenPept. The document also discusses the differences between global and local sequence alignment, explaining the Needleman-Wunsch and Smith-Waterman algorithms. It spans database structures, sequence analysis, and alignment techniques, which makes it useful for students and researchers in bioinformatics, computational biology, and related fields who want to understand the fundamental concepts and tools of biological data management and analysis.

Typology: Assignments

2023/2024

Available from 10/19/2024

devanshu-bansal
ASSIGNMENT-1 DATE-26/02/2023
Name: Devanshu Bansal
Subject: Computer Aided Drug Design
Semester: 6th
Submitted to: Dr. Prem Raj Meena
Que: Describe in detail the different types of biological databases.
Ans: A biological database is a collection of biological data (biological
sequences, binding sites, metabolic interactions, molecular actions, functional
relationships, protein families, motifs, and homologues) arranged in
computer-readable form so that it is fast to search and retrieve and convenient
to use.
Biological databases are divided into three types:
1. Primary Databases
2. Secondary Databases
3. Derived Databases
1. Primary Databases: -
● They are also known as archival databases.
● Databases that comprise experimentally derived data, such as nucleotide
sequences, are known as primary databases.
● Sequences or structures are stored in primary databases.
● Primary databases are classified based on the type of biological molecule: -
Primary Databases
  Nucleotide Sequence Databases: GenBank, DDBJ, EMBL
  Protein Sequence Databases: Swiss-Prot, PIR, GenPept

I. GenBank: -
➢ It is part of NCBI.
➢ It has a tool called Entrez, which helps to retrieve data.
➢ It is maintained by NCBI, Bethesda, USA.
➢ It is one of the fastest-growing repositories of known nucleotide sequences.
➢ GenBank has a flat-file structure.
➢ The database takes the form of ASCII text files, which are simple to analyse.
➢ GenBank files contain information on phylogenetic classification, accession numbers, and gene names.
➢ The nucleotide database at NCBI is divided into three databases:
▪ Core Nucleotide database
▪ Expressed Sequence Tags (EST)
▪ Genome Survey Sequences (GSS)
II. DDBJ: -
➢ DDBJ stands for "DNA Data Bank of Japan".
➢ It is a repository of DNA sequences.
➢ It is located at the National Institute of Genetics (NIG), Japan.
➢ It was established by NIG in 1986.
➢ DDBJ is the only nucleotide sequence database in Asia.
III. EMBL: -
➢ It was started in 1980 and is maintained by EBI.
➢ EMBL stands for European Molecular Biology Laboratory.
➢ It is hosted at EBI (European Bioinformatics Institute).
➢ SRS (Sequence Retrieval System) is the tool used to retrieve the required DNA, protein, or gene sequence data.
IV. Swiss-Prot: -
➢ This database is maintained in collaboration with EMBL.
➢ It was started in 1986 and is maintained by SIB (Swiss Institute of Bioinformatics).
➢ It is a curated sequence database that offers a high level of integration with other databases.
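Because GenBank entries are plain ASCII flat files, the fields mentioned above (accession number, phylogenetic classification, gene names) can be pulled out with simple text processing. The following is a minimal sketch only: the record is a hypothetical, heavily shortened example written for illustration (not a real GenBank entry), and `parse_flat_file` is an assumed helper name, not part of any NCBI tool.

```python
import re

# Hypothetical, shortened record in GenBank flat-file style (NOT a real entry).
SAMPLE_RECORD = """\
LOCUS       DEMO0001                  24 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Hypothetical demo gene, partial sequence.
ACCESSION   DEMO0001
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Enterobacterales; Enterobacteriaceae.
FEATURES             Location/Qualifiers
     gene            1..24
                     /gene="demoA"
ORIGIN
        1 atggctagct agctagctag ctaa
//
"""


def parse_flat_file(text):
    """Extract a few fields from one flat-file record (illustrative only)."""
    record = {"accession": None, "organism": None,
              "taxonomy": [], "genes": [], "sequence": ""}
    in_origin = False      # True once the sequence block has started
    in_taxonomy = False    # True on continuation lines after ORGANISM
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if in_origin:
            # Sequence lines carry a position number and spaced blocks.
            record["sequence"] += re.sub(r"[^a-zA-Z]", "", line)
            continue
        gene = re.search(r'/gene="([^"]+)"', line)
        if gene:
            record["genes"].append(gene.group(1))
        keyword = line[:12].strip()        # keywords sit in the first 12 columns
        value = line[12:].strip()
        if keyword == "ACCESSION":
            record["accession"] = value.split()[0]
            in_taxonomy = False
        elif keyword == "ORGANISM":
            record["organism"] = value
            in_taxonomy = True
        elif keyword == "ORIGIN":
            in_origin = True
        elif keyword == "" and in_taxonomy:
            names = value.rstrip(".").split(";")
            record["taxonomy"] += [n.strip() for n in names if n.strip()]
        else:
            in_taxonomy = False
    return record


record = parse_flat_file(SAMPLE_RECORD)
print(record["accession"], record["organism"], record["genes"])
```

For real work an established parser would normally be preferred; the point here is only that a flat-file layout is easy to read line by line.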

[Figure: GenBank, DDBJ and EMBL sharing data with one another]

➢ X-ray crystallography and nuclear magnetic resonance (NMR) are the techniques used to determine the structures of these large biological molecules.
➢ PDB data are stored in text files as lines of information.
II. MMDB: -
➢ MMDB stands for "Molecular Modelling Database".
➢ Analysis of individual structures and relationships among them are:
III. EBI-MSD: -
➢ European Bioinformatics Institute - Macromolecular Structure Database.
➢ MSD consists of two separate databases.
IV. PROSITE: -
➢ PROSITE was the first secondary database to be developed.
➢ Protein structures, sites that describe functions, and protein families are the components of PROSITE.
➢ PROSITE entries are encoded as regular expressions (called patterns) that describe protein functional sites.
V. Blocks: -
➢ Blocks overcomes the disadvantages of the PROSITE and PRINTS databases.
➢ The Blocks database is fully automated.
➢ Its two important features are:
o Keyword searching
o Sequence searching
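Since PROSITE patterns are essentially regular expressions, a pattern can be translated into an ordinary regex and scanned against a protein sequence. The sketch below assumes the commonly documented pattern syntax (`x` for any residue, `[..]` for allowed residues, `{..}` for forbidden residues, `(n)` or `(n,m)` for repetition, `-` between positions); `prosite_to_regex` is a hypothetical helper name and it deliberately ignores rarer syntax. The test sequence is invented.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: handles x, [..], {..}, (n), (n,m), and the
    '<' / '>' terminal anchors when they appear outside brackets.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        at_start = element.startswith("<")   # anchored to the N-terminus
        at_end = element.endswith(">")       # anchored to the C-terminus
        element = element.strip("<>")
        # Split an element such as "x(2,4)" into core "x" and bounds 2, 4.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element).groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except these
        else:
            regex = core                      # single residue or [..] set
        if low:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if at_start else "") + regex + ("$" if at_end else ""))
    return "".join(parts)


# N-glycosylation site pattern as usually written in PROSITE notation.
regex = prosite_to_regex("N-{P}-[ST]-{P}")
print(regex)

# Invented sequence used only to exercise the pattern.
hit = re.search(regex, "MKNGSAL")
print(hit.group(0), hit.start())
```

Note that `re.search` reports only non-overlapping hits when used with `finditer`; scanning for overlapping sites would need a lookahead, which is left out to keep the sketch short.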

  3. Derived Databases: -
• A database derived from other resources but not found in
• They are also known as "composite databases".
• The data entered in these databases are first compared and then filtered based on the desired criteria.
• The initial data are taken from primary databases and then merged based on certain conditions.
• They help in searching sequences rapidly.
• Composite databases contain non-redundant data.
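The compare, merge, and filter steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular composite database is built: the source names, record IDs, and sequences are all hypothetical, and "redundant" is taken to mean "identical sequence ignoring letter case".

```python
def build_composite(sources):
    """Merge records from several primary sources into a non-redundant set.

    `sources` maps a source name to a list of (record_id, sequence) pairs.
    Identical sequences are stored once, with every originating ID kept.
    """
    merged = {}
    for source_name, records in sources.items():
        for record_id, sequence in records:
            key = sequence.upper()   # comparison step: case-insensitive identity
            merged.setdefault(key, []).append(f"{source_name}:{record_id}")
    return merged


def filter_min_length(composite, min_length):
    """Filtering step: keep only sequences of at least `min_length` residues."""
    return {seq: ids for seq, ids in composite.items() if len(seq) >= min_length}


# Hypothetical input data for illustration only.
SOURCES = {
    "db_a": [("A1", "ATGGCC"), ("A2", "ATG")],
    "db_b": [("B7", "atggcc"), ("B8", "TTGACA")],
}

composite = build_composite(SOURCES)
print(composite)
print(filter_min_length(composite, 4))
```

Keeping every source ID next to the single stored sequence is one simple way to stay non-redundant without losing track of where each entry came from.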

  1. CATH: -
➢ Protein domains are the stable three-dimensional substructures of proteins, and the CATH (Class, Architecture, Topology, Homology) database organizes them hierarchically.
➢ The goal of creating this database was to provide a complete, structure-based classification of protein domains.
➢ Protein domains are organized hierarchically in CATH according to their Class, Architecture, Topology, and Homologous superfamily:
o Class is the most general level; it separates domains into four groups according to their predominant secondary structure: mainly alpha, mainly beta, alpha-beta, and few secondary structures.
o Architecture describes the overall shape of the domain, that is, how its secondary structure elements are arranged in three dimensions.
o Topology describes the spatial arrangement and connectivity of the secondary structure elements that characterize the fold of a domain.
o Homologous superfamilies group domains thought to have descended from a common ancestor.
➢ CATH is frequently updated with new structures from the Protein Data Bank (PDB) and is used extensively in the study of protein structures. It is a helpful resource for determining the relationships between protein domains and for predicting the structure and function of uncharacterized proteins.
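The four CATH levels form a strict hierarchy, so two domains can be compared by asking at which level their classifications diverge. The sketch below assumes the dotted four-number notation commonly used for CATH nodes (class.architecture.topology.homology); the helper names are hypothetical and the codes are used purely as example strings.

```python
from typing import NamedTuple, Optional

# Class numbers and names as used in the CATH hierarchy.
CLASS_NAMES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "alpha-beta",
    4: "few secondary structures",
}

LEVELS = ("Class", "Architecture", "Topology", "Homology")


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    homology: int


def parse_cath(code: str) -> CathCode:
    """Parse a dotted code such as '3.40.50.300' into its four levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a: str, b: str) -> Optional[str]:
    """Return the most specific level at which two codes still agree."""
    shared = None
    for level, x, y in zip(LEVELS, parse_cath(a), parse_cath(b)):
        if x != y:
            break
        shared = level
    return shared


# Example strings only; no claim is made about what these nodes contain.
print(CLASS_NAMES[parse_cath("3.40.50.300").cath_class])
print(deepest_shared_level("3.40.50.300", "3.40.50.720"))
print(deepest_shared_level("3.40.50.300", "1.10.8.10"))
```

Two domains that agree down to the Homology level would be placed in the same homologous superfamily; agreement only at Class says little more than that they share a predominant secondary structure type.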
  2. SCOP: -
➢ The Structural Classification of Proteins (SCOP) is a database that provides a systematic classification of protein structures.
➢ It was created to describe in detail the structural and evolutionary relationships between proteins. SCOP classifies proteins hierarchically; the levels covered here are superfamily, family, domain, and species.
o Superfamilies comprise proteins with a probable common evolutionary origin and a similar overall fold; of the levels listed here, they are the highest.
o Families are subgroups of superfamilies and comprise proteins that are highly similar to one another in sequence and structure.
o Domains are structurally and functionally distinct units within proteins and are grouped into families.
➢ Finally, species are the individual instances of a protein domain in different organisms. SCOP is a popular tool in structural

Ans: Global Alignment Trace Back:
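As an illustration of global alignment with traceback, here is a minimal sketch of the Needleman-Wunsch procedure. The scoring values (match +1, mismatch -1, linear gap -1) and the test sequences are assumptions chosen for the example, not values taken from the assignment.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # (mis)match
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Traceback: walk from the bottom-right cell back to the top-left,
    # at each step following a move that explains the cell's score.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several moves explain a cell equally well, this sketch prefers the diagonal; a different tie-breaking order can yield a different alignment with the same score.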

Local Alignment: -
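For comparison with the global case, a minimal sketch of local alignment in the Smith-Waterman style follows. It differs in two ways: cell scores are never allowed to drop below zero, and the traceback starts at the highest-scoring cell and stops at the first zero. The scoring values (match +2, mismatch -1, linear gap -1) and the sequences are assumptions for illustration only.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]   # first row/column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                           # start a new local alignment
                              score[i - 1][j - 1] + sub,   # (mis)match
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Traceback from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

Only the best-matching region is reported, so unrelated flanking residues in either sequence do not affect the result.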