Data Science/Database

Transaction Management

What does it mean by ACID properties in database transaction?

 

 

Atomicity: All database operations of a transaction must be entirely completed or entirely aborted.

 

Consistency: It must take the database from one consistent state to another. i.e. If DB goes down during the transaction, it should revert back to the original starting state.

 

Isolation: Data used during execution cannot be used by the second transaction until the first transaction is completed.

 

Durability: Once the transaction has been completed, the changes made to the transaction must be durable even in the event of system failure.

 

 

Lock

Lock Granularity

1. Database level (1 lock)

2. Table level (1 every table)

3. Page level (kb)

4. Record level (Record, don't touch c_no12 !!)

5. Attribute level ( I'm changing c_no12 name, you can change c_no address)

 

Once the lock is set, no one can do anything.

 

Shared vs Exclusive Locks

 

Shared Lock: Multiple processes can simultaneously hold shared locks, to enable them to READ without updating.

i.e. Many users can have read-only access to DB.

Allows other transactions to read the item.

Two transactions can both read the same part of the DB with share locks.

 

 

Exclusive Lock: A process that needs to update a record must obtain an exclusive lock. Its application for a lock will not proceed until all current locks are released.

i.e. only one exclusive right for the full access is given to a single user.

If Transaction 1 has an exclusive lock on certain part of the DB, Transaction 2 has to wait until T1 commits and finishes the task before it can lock in the exclusive lock.

 

 

 

 

Time 

TX 

Access 

0

(T1)

READ A

S(T1) 

 

 

1

(T2)

READ B

 

S(T2) 

 

2

(T3)

READ C

S(T3) 

 

 

3

(T1)

UPDATE A

T1 wait for T3 to fin 

 

 

4

(T3)

READ C

 

 

S(T3) 

5

(T2)

READ C

 

 

S(T2) 

6

(T2)

UPDATE B

 

X(T2) 

 

7

(T2)

READ A

S(T2) 

 

 

8

(T2)

UPDATE C

 

 

Wait for T3 to fin 

9

(T3)

READ B

 

Wait for T2 to fin 

 

 

Time 0: T1 gains Shared lock to read A

Time 1: T2 gains Shared lock to read B

Time 2: T3 gains Shared lock to read C

Time 3: T1 fails to fain Exclusive lock to update A, as T3 is on Shared lock on A

Time 4: T3 gains Shared lock on C

Time 5: T2 gains Shared lock on C

Time 6: Since T2 already has Shared lock to B, It can promote to Exclusive lock.

Time 7: T2 gains Shared lock for A

Time 8: T2 tries to gain an Exclusive lock on C, but T3 has access, thus it fails.

Time 9: T3 tries to read B, but since T2 has Exclusive share in B, it has to wait for T2 to finish.

 

- This leads to a DeadLock situation.

T1 is waiting for T3, T2 is waiting for T3, T3 is waiting for T2

- T1 has Xlock(A) and requests a lock on B

- T2 has Xlock(B) and requests a lock on A.

Results in a deadlock

Each has locked a resource required by another and will not release that resource until it can either commit or abort.

 

Deadlock Prevention

1. A transaction must acquire all the locks it requires even before it updates any records. If it cannot acquire all the locks, it releases all locks and tries it later.

- Problem is, you might not know which locks you actually need. Then you need to rollback entire transaction and re-do.

 

2. Kill one transaction.

Lock manager sees the deadlock, kills the least important transaction.

 

 

DB Restart / Recovery

- Restart

- Soft Crash

- Loss of volatile storage, but no damage to the physical disk

- These necessitate restart facilities

- Recovery

- Hard Crash

- Anything that makes 'Disk' permanently unreadable

- These necessitate recovery facilities

- Need Back-ups

- Requires transaction log.

- Entire thing depends on the transaction log.

- Much more important, this should have 2, 3 back-ups on a different hard drive.

- Write it to log first and then write it to DB.

 

- Check point: 15min / 20 transactions etc.

 

- In case of Soft Crash

- Define checkpoint for everything after the checkpoint builds 2 lists, 1. re-do list (transaction ids of that were committed before the crash), 2. Un-do list (Transaction IDs that were never committed/completed)

 

1. re-do: if it is between Checkpoint and Time of Failure

 if it is on CP, finished before TF

2. un-do: if it was on-going (not completed) when TF.

 

Deferred write

- DB is updated only after transaction reaches commit point.

- Required roll forward but does not require rollback

 

Recovery

- A hard crash: Physical damage to HDD

- Head crash

- Accidental impact damage

- Disk unit, disk must be replaced, reformatted, reloaded with the database

 

Back-up: Stored on a different device, location

- Back-ups are taken, say at the end of each day's processing

- Ideally, two copies of each back-ups are held. One on-site and another off-site.

- Transaction log: back-up only the transaction log operations hat are not reflected in a previous back up of the database.