Database
2018-06-21 14:13:20 2 举报
AI智能生成
INFO90002 Database University of Melbourne IT IS 2018 S1
作者其他创作
大纲/内容
Modelling
SQL
Normalisation
can prevent
data redundancy
update (and insert, delete) anomalies
biased optimizing UPDATE
slow down SELECT
1NF
multivalued attributes and repeating groups
2NF
partial functional dependencies
3NF
transitive dependencies
BCNF
remaining anomalies
cons:
more tables
need to be joined during SELECTs
denormalisation example
one-to-one relationship
many-to-many relationship with attributes
reference data / lookup table
Physical Design
data type
text
CHAR
VARCHAR(M)
ENUM('V1','V2')
TEXT
number
TINYINT
255
SMALLINT
64k
MEDIUMINT
16M
INT
4G
BIGINT
2^64
DECIMAL
time
DATE
TIME
DATETIME
TIMESTAMP
YEAR
data integrity
default value
range control
null value control
referential integrity
indexing
on large tables
>100 distinct value
frequently used in WHERE, ORDER, GROUP BY
limited use for volatile db
Application
system architecture
presentation logic
business logic
storage logic
multi-tiered architecture
2 tiers
client-dbms
3 tiers
client-web server-dbms
4 tiers
client-web server-app server-dbms
Transaction
definition
A logical unit of work that must either be entirely completed
or aborted (indivisible, atomic)
or aborted (indivisible, atomic)
why
users need the ability to define a unit of work
concurrent access to data by >1 user or program
properties (ACID)
Atomic
Consistency
Isolation
Durability
Transaction logging
concurrent access
Lost Update problem
Uncommitted Data problem
Inconsistent Retrieval problem
methods
serial execution (very expensive!)
Locking
Database-level lock
Table-level lock
(Page-level lock)
Row-level lock
Field-level lock (same row different attributes)
Deadlock
two transactions wait for each other to unlock data
methods: prevention, detection
Time Stamping
Optimistic
Distributed
definition
single logical database
physically spread
appear as though it is one database
Advantages
Good fit for geographically distributed organizations/users
Data located near site with greatest demand
Faster data access (to local data)
Faster data processing
Allows modular growth
Increased reliability and availability
Supports database recovery
Disadvantages
Complexity of management and control
Data integrity
Security
Lack of standards
Increased training & maintenance costs
Increased storage requirements
Trade-offs
Location transparency
not need to know the location of data
Local autonomy
able to operate locally when connection fail
Availability vs Consistency
Synchronous vs Asynchronous
Distribution options
Data replication
Horizontal partitioning
Vertical partitioning
Combinations of the above
Replication
Adv
High reliability
Fast access
avoid complicated distributed integrity routines
Decoupled nodes don't affect data availability
Reduced network traffic at prime time
Dis
Need more storage space
Data integrity
Takes time for update operations
Network communication capabilities
Administration
DBMS
in memory and physically on disk
manage
data
performance
concurrency
recoverability
NoSQL
收藏
收藏
0 条评论
下一页