7.4 Transaction Processing, Concurrency Control, and Crash Recovery

Transaction processing, concurrency control, and crash recovery are essential components of database management systems that ensure data consistency, integrity, and durability even in the face of system failures or concurrent operations.


1. ACID Properties

ACID is an acronym that describes the four key properties of a transaction to ensure database reliability and correctness.

  • Atomicity: A transaction is treated as a single unit, which means that either all of its operations are executed, or none of them are. If any part of the transaction fails, the entire transaction is rolled back.

  • Consistency: A transaction takes the database from one consistent state to another. It ensures that any data modifications made by the transaction maintain the integrity constraints of the database.

  • Isolation: Transactions execute independently of one another. Even though transactions may run concurrently, the effect of one transaction is not visible to others until it is fully completed.

  • Durability: Once a transaction is committed, its changes are permanent and survive any subsequent system failures. This ensures that the data is not lost.


2. Concurrent Executions

  • Concurrent Executions: Refers to multiple transactions being executed simultaneously in a database system. Concurrency is essential for improving system throughput, but it introduces challenges such as maintaining consistency and isolation.

    • Challenges:

      • Lost Updates: One transaction's update is overwritten by another.

      • Temporary Inconsistency: Transactions access the database in an inconsistent state.

      • Uncommitted Data: One transaction reads data that has not yet been committed by another transaction.

      • Inconsistent Retrievals: Transactions retrieve data that may be changed by another transaction during its execution.


3. Serializability Concept

  • Serializability: The highest level of isolation, which ensures that the results of concurrent transactions are equivalent to the result of some serial execution of those transactions.

    • Types of Serializability:

      • Conflict Serializability: Transactions are conflict-serializable if their execution order can be rearranged to produce the same result as a serial execution.

      • View Serializability: Transactions are view-serializable if the transactions can be reordered in such a way that each transaction's view of the data is the same as in a serial execution.

    • Goal: To ensure that concurrent execution of transactions preserves the consistency and correctness of the database.


4. Lock-based Protocols

  • Locking: A mechanism used to control concurrent access to data by ensuring that only one transaction can access a particular piece of data at a time.

    • Types of Locks:

      • Shared Lock (S-lock): Allows multiple transactions to read the data but prevents any from writing.

      • Exclusive Lock (X-lock): Allows a transaction to both read and write the data and prevents any other transaction from accessing it.

  • Lock-based Protocols: Ensure serializability by requiring transactions to acquire locks before accessing data and releasing them after the transaction is completed.

    • Two-Phase Locking (2PL): A protocol where transactions must obtain all the locks, they need before releasing any locks. It has two phases:

      1. Growing Phase: Locks are acquired, and no locks are released.

      2. Shrinking Phase: Locks are released, and no new locks are acquired.

  • Deadlock: It is not a protocol but a problem that can occur when using lock-based protocols like 2PL. A deadlock in a database occurs when two or more transactions are waiting for each other to release locks on resources, creating a cycle of dependency that prevents any of them from proceeding.

    Below are the four conditions necessary for a deadlock to occur (commonly known as the Coffman Conditions):

    • Mutual Exclusion

      • At least one resource must be in a non-sharable mode, meaning only one transaction can hold a lock on the resource at any given time.

      • Example: Transaction A locks a row exclusively for writing, so no other transaction can access it.

    • Hold and Wait

      • A transaction holding at least one resource is waiting to acquire additional resources held by other transactions.

      • Example: Transaction A holds Lock 1 and requests Lock 2, while Transaction B holds Lock 2 and requests Lock 1.

    • No Preemption

      • Resources cannot be forcibly taken away from a transaction; they must be released voluntarily.

      • Example: If Transaction A holds Lock 1, the system cannot force Transaction A to release Lock 1. It must wait until Transaction A finishes its task.

    • Circular Wait

      • A circular chain of transactions exists, where each transaction is waiting for a resource held by the next transaction in the chain.

      • Example:

        • Transaction A → waiting for a resource held by Transaction B.

        • Transaction B → waiting for a resource held by Transaction C.

        • Transaction C → waiting for a resource held by Transaction A.

  • Example of Deadlock in a Database:

    Consider the following scenario:

    • Transaction A locks Row X and requests a lock on Row Y.

    • Transaction B locks Row Y and requests a lock on Row X. Both transactions are now waiting for each other to release their locks, leading to a deadlock.

  • Deadlock Detection and Resolution:

    • Detection:

      Database systems periodically check for deadlocks using wait-for graphs. If a cycle is detected, it indicates a deadlock.

    • Resolution:

      • Transaction Rollback: Terminate one of the transactions to break the cycle.

      • Timeouts: Automatically abort transactions that have been waiting too long.

    • Deadlock Prevention Techniques:

      • Avoid Circular Wait: Impose an ordering on resource acquisition.

      • No Hold and Wait: Ensure transactions request all needed resources at once.

      • Allow Preemption: Forcefully release locks from lower-priority transactions.


5. Failure Classification

Failures can occur in a database system, affecting the transactions and the database’s consistency. These failures are typically classified into:

  • Transaction Failures: Failures that occur within a transaction, such as logic errors, constraints violations, or deadlocks.

  • System Failures: Failures in the database management system (DBMS) or hardware that affect multiple transactions, such as power outages, memory corruption, or disk crashes.

  • Media Failures: Failures related to the storage medium, like disk crashes or corrupt data files.


6. Recovery and Atomicity

  • Recovery: The process of restoring the database to a consistent state after a failure. It ensures that all committed transactions are durable and that uncommitted transactions are rolled back.

  • Atomicity in Recovery: During recovery, the atomicity property ensures that if a transaction is incomplete (due to a crash), it is rolled back to avoid partial updates. This prevents the database from being left in an inconsistent state.


7. Log-based Recovery

  • Log-based Recovery: A method for recovering a database by maintaining a transaction log, which records all changes made to the database. Logs contain:

    • Before Image: The value of a data item before it was modified.

    • After Image: The new value of the data item after modification.

    • Log Records: Each log record includes:

      • The transaction identifier (TID).

      • The operation (e.g., read, write).

      • The data item affected.

      • The before and after values.

  • Recovery Process:

    • Undo: For uncommitted transactions at the time of failure, undo the operations by using the before image.

    • Redo: For committed transactions, redo the operations using the after image to ensure that the changes are applied.

    • Write-Ahead Log (WAL): Ensures that log entries are written to disk before any changes are made to the actual database, ensuring the durability of transactions.


Conclusion

  • Transaction processing, concurrency control, and crash recovery are crucial for maintaining the integrity, consistency, and durability of a database.

  • The ACID properties provide the foundation for reliable database operations, while mechanisms like lock-based protocols and log-based recovery ensure that transactions are executed properly even in the case of failures.

  • By ensuring correct and consistent transaction processing, databases can handle concurrent operations efficiently and recover from system failures without losing data.

Last updated