Database

This blog contains some database information that I have been putting together recently.

  • SQL and Relational Algebra
  • OLAP, OLTP
  • NoSQL: K-V, Column-Family, Document, Graphical
  • References

SQL and Relational Algebra

(Most of this part come from CSC343 in UofT)

Terminologies

  • DDL (Data Define Language), DML (Data Modification Language), DCL (Data Control Language)
    • DDL commands: CREATE, ALTER, DROP
    • DML commands: SELECT, INSERT, UPDATE, DELETE
    • DCL commands: GRANT, REVOKE
  • key
    • primary key: main index. Build B-tree to index them.
    • foreign key: provides reference to another table, allowing JOIN operations to happen.

RA Operations

  • project (select)
  • join
    • cross (or, inner) join
    • outer join (left / right / full).
      Differences between inner and outer join can be found here
  • Grouping: ‘having’ instead of ‘where’
  • aggregate function

Denormalization of Database

  • Denormalization is an optimization technique. By adding redundant data, we might avoid the costly join operations.

Entity-Relation Model

  • Developed by Peter Chen in 1976 in this paper
  • Cardinality specifies how many instances of an entity relate to one instance of another entity.
  • Ordinality describes the minimal number of relationships.

OLAP vs. OLTP

  • OLAP: On-Line Analytical Processing
  • OLTP: On-Line Transactional Processing.
    Detailed difference described here

NoSQL: K-V, Column-Family, Document, Graphical

  • CAP dilemma
  • Types of NoSQL Databases:
    • K-V pair
    • Column-Family (eg: GFS)
    • Document (eg: MongoDB)
    • Graphical
  • There’s a brief introduction here

References

  • Early database model, disk memory access (sequential / random memory access), Relational model, E-R model, Object model: link
  • Codd’s paper laying the mathematical foundation for relational algebra link
  • OLAP link