Information Technology Fundamentals - Using Databases
Using Databases
A database is a systematic, durable collection of data. A directory is a database. In the IT industry, a database is typically connected with software. A simple database could consist of a single file holding numerous records, each of which contains the same set of fields, with each field having a specific data type and length. A database, in brief, is an organized repository of data accessible via named data pieces.
A database management system (DBMS) is a software package designed to create, store, and administer databases. The DBMS software enables data sharing between end users and application developers. It offers a methodical approach to creating, updating, retrieving, and saving data in a database. Typically, DBMS products are in charge of data integrity, data access control, automated rollback, restart, and recovery.
In an abstract sense, a database may be compared to a file folder, and a DBMS could be compared to the filing cabinet in which the labeled folders are stored. You implement and access database instances utilizing the DBMS's features. The payroll database utilized by your payroll application may be constructed using a DBMS such as Oracle Database 21c, Db2, MongoDB, or SQL Server.
Why Use a DBMS?
The primary benefit of a DBMS is imposing a logical, hierarchical arrangement on the data. Since it is optimised for such processes, a DBMS offers economies of scale for processing enormous amounts of data.
Four DBMS data models have existed historically: hierarchical, network, relational, and object-oriented.
A DBMS can be distinguished based on the data model it employs. A data model is a set of concepts that are used to describe data. A data model is comprised of two key elements: its structure, which describes how data is stored, and its operations, which describe how data can be changed. The leading DBMS solutions employ four distinct data models:
1. Network (or CODASYL)
2. Hierarchical
3. Relational
4. Object-oriented
The network data model is organized as a collection of record kinds and their respective relationships. All relationships are described clearly and stored as part of the DBMS's structure. CODASYL is another common name for the network model. CODASYL is named after the group that developed the concept in the early 1970s, the Conference on Data Systems Languages. Using the location of a particular record and linkages to similar records, data is altered. IDMS is an example of a network-based database management system.
The hierarchical data model organizes data into structural trees that hold subordinate data at lower levels beneath data stored at higher ones. A hierarchical data model is derived from the network model, with the restriction that there is only one way to access a record. IMS is an example of a hierarchical model-based DBMS.
The relational data model is comprised of a collection of tables (or, more accurately, relations), where the columns describe the relationship between tables. The mathematical foundation of the relational paradigm is set theory. Contrary to common assumption, the relational model is called after the relations of set theory, not "relationships." A relation is a collection of unique values. Data can be modified in a variety of ways, but SQL is the most frequent. DB2, Oracle, and SQL Server are examples of relational model-based DBMS solutions.
The object-oriented (OO) data model consists of a collection of entities or objects, with each object containing the actions that may be performed on it. Thus, an object encapsulates both data and process. Typically, OO systems transform data using an OO programming language. ObjectStore by Progress Software and Cache by Intersystems are examples of DBMS software based on the OO concept.
For convenience, each of these four data models is referred to as a data model. Only the relational and network models have formalised data model specifications. Different data models result in dissimilar logical and structural data arrangement. The relational model is the most widely used data model because it is the most abstract and simplest to apply to data, while still having significant data manipulation and access capabilities.
Other Types of DBMS
Despite the fact that the four data models covered thus far are the most prevalent types of DBMS, there are additional types of DBMS with varied degrees of commercial adoption.
A column-oriented DBMS, often known as a column store, is a database management system that stores its data by column rather than by row. This is advantageous for data warehouses where a huge number of data pieces are aggregated. A column-oriented DBMS is not based on a formal data model and can be viewed as a physical implementation of a relational DBMS. Greenplum and Sybase IQ are examples of column storage.
The NoSQL database system is another type of DBMS that has acquired interest on the market, typically in applications using Big Data. NoSQL DBMSs are defined by their flexible schema and independence from SQL; nevertheless, due to SQL's prevalence, many NoSQL solutions have included SQL support. A key is connected with a piece of data in Key-Value data storage. The data is not tightly structured and is not required to adhere to a schema, as in conventional database architecture.
There are four varieties of NoSQL DBMS:
1. Document
2. Key/Value
3. Wide-column
4. Graph
A document store handles and stores document-level data. Documents are typically saved as XML, JSON, BSON, etc. A document database is optimal for high performance, high availability, and straightforward scaling. Consider utilizing a document store for web retail apps, real-time analytics processing, the front end of a blog, or content management. However, they are not well-suited for complicated transaction processing, as is typical for relational applications. Couchbase, RavenDB, and MarkLogic are alternatives to MongoDB, the most popular document database.
The key/value database system is advantageous when every database access is performed using a primary key. Typically, there is no set data model or structure. The key is identified by an arbitrary data "blob." A database of key-value pairs is useful for storing shopping cart information and user profiles. It is ineffective when there are complex interactions between data pieces or when data must be queried using a field other than the main key. Riak, Berkeley DB, and Aerospike are instances of key-value stores.
A wide-column store employs tables, rows, and columns, but, unlike a relational database, the column names and formats can vary from row to row within the same table. Wide-column stores are represented by Apache Cassandra, Amazon DynamoDB, and DataStax Enterprise, among others.
Lastly, we have the graph database, which represents and stores data using graph structures with nodes, edges, and characteristics. In a graph database, each entry carries a direct pointer to its next element, eliminating the need for index lookups. The primary applications for graph databases include social networks, routing and dispatch systems, and location-aware systems. Neo4j, GraphBase, and Meronymy are several instances.
NoSQL database systems are popular among enterprises that have unique data difficulties that cannot be addressed by traditional RDBMS systems. The NoSQL key-value database systems Cassandra and MongoDB are examples.
Other DBMS implementations include the inverted list structure of the original Adabas as well as the dBase format made popular by the PC DBMS, dBase II and dBase III.
Today, the majority of commercial DBMS solutions are relational.
The benefits of using a DBMS
Moreover, a DBMS provides a central repository of data that can be accessed from many locations by multiple users. Instead of propagating and storing data in new files for each new programme, data can be shared throughout numerous apps. Centralized data storage and management within the DBMS affords
- Abstraction and independence of data
- Data security
- A system for locking concurrent access
- An effective handler to balance the requirements of several applications sharing the same data
- The capability to recover quickly from crashes and mistakes
- Robust data integrity capabilities
- Access is simple using a standard API
- Uniform data administration processes
Levels of Data Abstraction
A DBMS can offer many views of a database schema. A view specifies the data a user sees and how that data is displayed. The DBMS provides a degree of abstraction between the conceptual schema that specifies the database's logical structure and the physical schema that describes the files, indexes, and other physical mechanisms used by the database. Users operate at the conceptual level, for example by querying columns within rows of tables, as opposed to having to go through the many physical structures that store the data.
When business requirements change, a DBMS makes it much simpler to modify applications. New data categories can be added to the database without affecting the current system.
Data Independence
A database management system provides a layer of isolation between data and applications that utilise the data. In other words, applications are separated from the structure and storage of data. The DBMS offers two forms of data autonomy:
- Logical data independence — safeguarding against alterations to the logical structure of data
- Physical data independence — protection against alterations in the physical structure of data
As long as the software utilises the API (application programming interface) to the database given by the DBMS, developers can avoid modifying applications due to database modifications.
SQL is the primary interface for relational databases. Most application SQL statements do not need to be modified when database structures change (e.g., a new column is added to a table).
Data Security
The protection of data prohibits unauthorized individuals from accessing or modifying the database. Using IDs and passwords, the DBMS controls which users have access to which database sections. Consider, for instance, a database of employees containing all information about each employee. Using the DBMS security methods, payroll staff can be authorized to view payroll data, whereas only project history data can be accessible to management.
Concurrency Control
A DBMS can provide data to many user programmes that are executing concurrently. This requires a locking mechanism to provide concurrency management, as the concurrent execution of many programmes could potentially result in data inconsistencies. For instance, numerous bank ATM customers may be able to take $100 from a $150-dollar checking account. The locking mechanism of a DBMS prevents such issues by isolating transactions fighting for the same exact data.
Database Logging
The DBMS employs database logging to record "before" and "after" images of updated database objects. It is essential to remember that the database log records every data alteration (except in circumstances as determined by the DBA). This information can be used to undo and redo database transactions. The database management system handles database logging transparently, i.e., automatically.
Ensuring Atomicity and Durability
A DBMS can be used to ensure that transactions are all-or-nothing. This is known as atomicity, and it ensures that data integrity is preserved even if the system crashes during a transaction. Moreover, a DBMS offers recoverability. After a system failure, data can be retrieved to a condition that existed either immediately prior to the failure or at a later period that is required.
Data Integrity
The DBMS provides procedures for determining the kind of data that can be stored in particular fields or columns. In the database, only data that adheres to the business rules will ever be stored. Moreover, the DBMS can be configured to manage relationships between various types of data and to ensure that updates to connected data pieces are applied appropriately.
Data Access
A DBMS provides a common query language so that users can interactively query and study the database's data. This standard API for relational databases is SQL, or Structured Query Language. However, SQL is not needed for a database management system to be relational. In addition, numerous DBMS packages provide analytical tools and report generators to further facilitate data access.