sql - Historical / auditable database -

This question is related to the schema that can be found in one of my other questions here. Basically, in my database I store users, locations and sensors, amongst other things. All of these things are editable in the system by users, and deletable.

However, when an item is edited or deleted I need to store the old data; I need to be able to see what the data was before the change.

There are also non-editable items in the database, such as "readings". They are more of a log really. Readings are logged against sensors, because each one is the reading for a particular sensor.

If I generate a report of readings, I need to be able to see what the attributes of the location or sensor were at the time of the reading.

Basically, I should be able to reconstruct the data for any point in time.

Now, I've done this before and got it working by adding the following columns to each editable table:

valid_from
valid_to
edited_by

If valid_to = 9999-12-31 23:59:59 then that's the current record. If valid_to equals valid_from, then the record is deleted.

However, I was never happy with the triggers I needed to use to enforce foreign key consistency.

I can possibly avoid triggers by using an extension to the PostgreSQL database. It provides a column type called "period", which allows you to store a period of time between two dates, and then allows you to use check constraints to prevent overlapping periods. That might be an answer.

I am wondering, though, if there is another way.

I've seen people mention using special historical tables, but I don't like the thought of maintaining two tables for every one table (though it still might be a possibility).

Maybe I could cut down my initial implementation to not bother checking the consistency of records that aren't "current", i.e. only bother to check constraints on records where valid_to is 9999-12-31 23:59:59. After all, the people who use historical tables do not seem to have constraint checks on those tables (for the same reason: you'd need triggers).

Does anyone have any thoughts about this?

PS - the title also mentions an auditable database. In the previous system I mentioned, there was always the edited_by field. This allowed all changes to be tracked, so we could see who changed a record. I'm not sure how much difference that might make.

Thanks.

Revised 01 Jan 11

OK, so there is a gap between where I sit (I deliver fully auditable databases; yours being a particular requirement of that) and where you sit, based on your questions and comments. We will probably work that out in the commentary. Here's a position to start from.

To provide this requirement, there is no need at all for: triggers; mass duplication; broken integrity; etc.
This is not a classic temporal requirement either, so there is no need for the "period" capability, though you can use it.
ValidFrom and ValidTo is a Normalisation error: ValidTo is data that is derived; the ValidTo in any row is duplicated in the ValidFrom of the next row; you have an Update Anomaly (when you update one column in one row, you additionally have to update the other column in the next row); and you have to use a dummy value for "current".
- All of that is unnecessary: use ValidFrom only, and keep the db clean and pure 5NF.
- The caveat is, if PostgreSQL can't perform subqueries without falling in a heap (a la Oracle), then fine, keep ValidTo.
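
As a rough sketch of what "use ValidFrom only" could look like, the end of each period can be derived with a correlated subquery instead of being stored. The SensorVersion table and its Name column are hypothetical, made up for this illustration, and the syntax is PostgreSQL-flavoured standard SQL:

```sql
-- Hypothetical table for illustration: one row per version, start instant only.
CREATE TABLE SensorVersion (
    LocationId  INTEGER      NOT NULL,
    SensorNo    INTEGER      NOT NULL,
    ValidFrom   TIMESTAMP    NOT NULL,
    Name        VARCHAR(30)  NOT NULL,
    CONSTRAINT SensorVersion_PK
        PRIMARY KEY (LocationId, SensorNo, ValidFrom)
);

-- The end of each period is the start of the next row for the same key.
-- The latest row has no next row, so DerivedValidTo comes back as NULL in
-- the result set only; no dummy "current" value is stored anywhere.
SELECT  v.LocationId,
        v.SensorNo,
        v.ValidFrom,
        (SELECT MIN(n.ValidFrom)
           FROM SensorVersion n
          WHERE n.LocationId = v.LocationId
            AND n.SensorNo   = v.SensorNo
            AND n.ValidFrom  > v.ValidFrom) AS DerivedValidTo,
        v.Name
  FROM  SensorVersion v;
```

Because nothing is stored twice, changing a version touches exactly one row, which is the Update Anomaly the list above is describing.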

All of these things are editable in the system by users, and deletable.

Well, no. It is a database holding important information, with Referential Integrity, not a scratchpad, so the user cannot just walk up to it and "delete" something. That would contradict the same users' requirement for maintaining historical data (in Reading; Alert; Ack; Action; Download).

Cascading deletes are not allowed. Those functions are check boxes for non-databases, MS Access types. In real databases, the RI constraints stop parents with children from being deleted.
Primary Keys cannot (should not) be changed. Eg. UserId; LocationId; NetworkSlaveCode never change; remember, they are carefully considered Identifiers. One characteristic of PKs is that they are stable.
You can add new Users; you can change a current User's name; but you cannot delete a User who has entries in Download, Acknowledgement, Action.

Basically if it's editable then it has to be historical (so that excludes readings and alerts).

It also excludes: Downloads; Acknowledgements; Actions.

And the Reference tables: SensorType; AlertType; ActionType.

And the new History tables: they are inserted into, but cannot be updated or deleted.

The problem I find with the IsObsolete flag is.. say you change the Location, the Sensor foreign key will now point to an obsolete record, meaning you will have to duplicate every Sensor record. This problem gets exponentially worse as the hierarchy gets bigger.

OK, so now do you understand that the LocationId (FK) in Sensor will not change; there is no mass duplication, etc ? There is no problem in the first place (and there is in that stupid book!) that gets exponentially worse in the second place.
~~IsObsolete is inadequate for your requirement.~~ (Refer below)
The UpdatedDtm in any real row (Reading, etc) identifies the Parent (FK to Sensor) History row (its AuditedDtm) that was in effect at the time.
Full Relational capability; Declarative Referential Integrity, etc.
Maintain the IDEF1X, Relational concept of strong Identifiers ... There is only one Current parent row (eg. Location).
The rows in the History are images of the current row, before it was changed, at the stated AuditedDtm. The Current row (non-history) shows the one last UpdatedDtm, when the row was changed.
The AuditedDtm shows the entire series of UpdatedDtms for any given key; thus I have used it to "partition" the real key in a temporal sense.
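
A minimal sketch of how a report might pick up the Sensor attributes in effect at the time of each Reading. It leans on several assumptions that are not in the data model as quoted here: the Name, ReadingDtm and ReadingValue columns are hypothetical; each history row is assumed to keep the UpdatedDtm it carried while it was the current row; and the model's AuditedDtm column and the Location foreign key are left out for brevity. Syntax is PostgreSQL-flavoured:

```sql
CREATE TABLE Sensor (
    LocationId  INTEGER      NOT NULL,
    SensorNo    INTEGER      NOT NULL,
    Name        VARCHAR(30)  NOT NULL,      -- hypothetical attribute
    UpdatedDtm  TIMESTAMP    NOT NULL,
    CONSTRAINT Sensor_PK PRIMARY KEY (LocationId, SensorNo)
);

CREATE TABLE SensorHistory (
    LocationId  INTEGER      NOT NULL,
    SensorNo    INTEGER      NOT NULL,
    UpdatedDtm  TIMESTAMP    NOT NULL,
    Name        VARCHAR(30)  NOT NULL,      -- before-image of Sensor.Name
    CONSTRAINT SensorHistory_PK
        PRIMARY KEY (LocationId, SensorNo, UpdatedDtm),
    CONSTRAINT Sensor_SensorHistory_FK
        FOREIGN KEY (LocationId, SensorNo)
        REFERENCES Sensor (LocationId, SensorNo)
);

CREATE TABLE Reading (
    LocationId    INTEGER        NOT NULL,
    SensorNo      INTEGER        NOT NULL,
    ReadingDtm    TIMESTAMP      NOT NULL,  -- hypothetical column name
    ReadingValue  NUMERIC(10,2)  NOT NULL,  -- hypothetical column name
    CONSTRAINT Reading_PK
        PRIMARY KEY (LocationId, SensorNo, ReadingDtm),
    CONSTRAINT Sensor_Reading_FK
        FOREIGN KEY (LocationId, SensorNo)
        REFERENCES Sensor (LocationId, SensorNo)
);

-- Every image of a Sensor: the current row plus all of its before-images.
WITH SensorImage AS (
    SELECT LocationId, SensorNo, UpdatedDtm, Name FROM Sensor
    UNION ALL
    SELECT LocationId, SensorNo, UpdatedDtm, Name FROM SensorHistory
)
-- For each Reading, take the image with the latest UpdatedDtm that is
-- not later than the Reading itself.
SELECT  r.LocationId,
        r.SensorNo,
        r.ReadingDtm,
        r.ReadingValue,
        i.Name AS SensorNameAtReading
  FROM  Reading r
  JOIN  SensorImage i
    ON  i.LocationId = r.LocationId
   AND  i.SensorNo   = r.SensorNo
 WHERE  i.UpdatedDtm = (SELECT MAX(x.UpdatedDtm)
                          FROM SensorImage x
                         WHERE x.LocationId = r.LocationId
                           AND x.SensorNo   = r.SensorNo
                           AND x.UpdatedDtm <= r.ReadingDtm);
```

Note that the Reading keeps pointing at the one current Sensor row through an ordinary foreign key; only the lookup of past attributes goes through the history.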

All that is required is a History table for each changeable table. I have provided the History tables for the four Identifying tables: Location; Sensor; NetworkSlave; and User.

Please make sure you understand Auditable in the accounting sense.

Data Model

Link to Sensor Data Model with History (Page 2 contains the History tables and context).

Readers who are not familiar with the Relational Modelling Standard may find the IDEF1X Notation useful.

Response to Comments

(1) My first issue is that of referential integrity with the historic data, in that I'm not sure there is any, and if there is I'm not sure how it works. For instance, in SensorHistory it would be possible to add a record that had an UpdatedDtm indicating a date time before the Location itself existed, if you see what I mean. Whether this is actually an issue I'm not sure - enforcing that might just be over the top.

(You raised a similar issue in the other question.) It may be that the dbs you have experienced did not actually have Referential Integrity in place; that the Relation lines were there just for documentation; that the RI was "implemented in app code" (which means there is no RI).

This is an ISO/IEC/ANSI Standard SQL database. That allows Declarative Referential Integrity. Every Relation line is implemented as a PK::FK Reference, an actual Constraint that is declared. Eg:

    CREATE TABLE Location
        ...
        CONSTRAINT UC_PK
            PRIMARY KEY (LocationId)
        ...
    CREATE TABLE Sensor
        ...
        CONSTRAINT UC_PK
            PRIMARY KEY (LocationId, SensorNo),
        CONSTRAINT Location_Sensor_fk
            FOREIGN KEY (LocationId)
            REFERENCES Location (LocationId)
        ...
    CREATE TABLE SensorHistory
        ...
        CONSTRAINT UC_PK
            PRIMARY KEY (LocationId, SensorNo, UpdatedDtm),
        CONSTRAINT Sensor_SensorHistory_fk
            FOREIGN KEY (LocationId, SensorNo)
            REFERENCES Sensor (LocationId, SensorNo)
        ...

Those declared constraints are enforced by the server; not via triggers; not in app code. That means:

A Sensor with a LocationId that does not exist in Location cannot be inserted
A LocationId in Location that has rows in Sensor cannot be deleted
A SensorHistory with a LocationId+SensorNo that does not exist in Sensor cannot be inserted
A LocationId+SensorNo in Sensor that has rows in SensorHistory cannot be deleted.

(1.1) All columns should have RULEs and CHECK Constraints to constrain their range of values. That is in addition to the fact that all INSERT/UPDATE/DELETEs are programmatic, within stored procs, therefore accidents do not happen, and people do not walk up to the database and run commands against it (except SELECTs).

Generally I stay away from triggers. If you are using stored procs, and the normal permissions, then this:

in SensorHistory it would be possible to add a record that had an UpdatedDtm indicating a date time before the Location itself existed, if you see what I mean

is prevented. So is inserting a SensorHistory with an UpdatedDtm earlier than the Sensor itself. But procs are not Declarative Rules. So if you want to be doubly sure (and I mean doubly, because the INSERTs are all via a proc, not by direct command from users), then sure, you have to use a trigger. For me, that is over the top.

(2) How do I indicate deletion? I could just add a flag to the non-historical version of the table, I guess.

Not sure yet. Eg. do you accept that when a Sensor is deleted, it is final ... (yes, history is maintained) ... and that when a new Sensor is added to the Location, it will have a new SensorNo ... there is no Sensor being logically replaced with the new one, with or without a gap in time ?

From an end-user's point of view, via the software they should be able to add, edit and delete sensors at will, with no limitation. But yes, once deleted it is deleted and cannot be undeleted. There's nothing to stop them re-adding a sensor later, though, with the exact same parameters.

And "delete" Locations, NetworkSlaves, and Users as well.

OK. Then the new Sensor with the same parameters is truly new: it has a new SensorNo, and is independent of any previous logical Sensor. We can add an IsObsolete BOOLEAN to the four identifying tables; it is now identified as adequate. The delete is now a soft delete.

(2.1) NetworkSensor and LoggerSensor are actually dependent on two parents: they are obsolete if either of their parents is obsolete. So there is no point giving them an IsObsolete column, which would have a dual meaning; it can be derived from the applicable parent.

(2.2) Just to be clear, users cannot delete any rows from any Transaction and History tables, right?

(3) When updating a table, what method would be best to insert the new row in the historical table and update the main table? Just normal SQL statements inside a transaction maybe?

Yes. That is the classic use of a Transaction: as per the ACID properties, it is Atomic; it either succeeds in toto or fails in toto (to be retried later when the problem is fixed).
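
A minimal sketch of such a Transaction, assuming a Location table with a hypothetical Name column and a LocationHistory table shaped like the SensorHistory excerpt in (1). PostgreSQL-flavoured syntax; in practice the two statements would sit inside a stored proc:

```sql
-- Hypothetical minimal tables; Name is an illustrative attribute.
CREATE TABLE Location (
    LocationId  INTEGER      NOT NULL,
    Name        VARCHAR(30)  NOT NULL,
    UpdatedDtm  TIMESTAMP    NOT NULL,
    CONSTRAINT Location_PK PRIMARY KEY (LocationId)
);

CREATE TABLE LocationHistory (
    LocationId  INTEGER      NOT NULL,
    UpdatedDtm  TIMESTAMP    NOT NULL,
    Name        VARCHAR(30)  NOT NULL,
    CONSTRAINT LocationHistory_PK PRIMARY KEY (LocationId, UpdatedDtm),
    CONSTRAINT Location_LocationHistory_FK
        FOREIGN KEY (LocationId)
        REFERENCES Location (LocationId)
);

INSERT INTO Location (LocationId, Name, UpdatedDtm)
VALUES (1, 'Warehouse A', TIMESTAMP '2011-01-01 09:00:00');

-- Rename the Location: save the before-image, then change the current row.
-- Both statements commit together or not at all.
START TRANSACTION;

INSERT INTO LocationHistory (LocationId, UpdatedDtm, Name)
SELECT LocationId, UpdatedDtm, Name
  FROM Location
 WHERE LocationId = 1;

UPDATE Location
   SET Name       = 'Warehouse A (North)',
       UpdatedDtm = CURRENT_TIMESTAMP
 WHERE LocationId = 1;

COMMIT;
```

If either statement fails, a ROLLBACK leaves both tables exactly as they were, so the history can never disagree with the current row.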

(4) Referenced Book

The definitive and seminal text is Temporal Data and the Relational Model by C J Date, H Darwen, N Lorentzos. As in, those of us who embrace the RM are familiar with the extensions, and with what is required in the successor to the RM; rather than some other method.

The referenced book is horrible, and free. The PDF isn't a PDF (no search; no indexing). Its opening with MS and Oracle is telling; a few bits couched in lots of fluff. Many misrepresentations. Not worth responding to in detail (if you want a proper review, open a new question).

(4.1) ValidTo in addition to ValidFrom. A serious mistake (as identified at the top of my answer), which the book makes and then laboriously solves. Don't make the mistake in the first place, and you have nothing to solve in the second place. As I understand it, that will eliminate your triggers.

(4.2) Simple rules, taking both Normalisation and Temporal requirements into account. First and foremost, you need to understand (a) the temporal requirement and (b) the DataTypes, their correct usage and limitations. Always store:

Instant as DATETIME, eg. UpdatedDtm
Interval as INTEGER, identifying the unit in the column name, eg. IntervalSec
Period. Depends on whether it is conjunct or disjunct.
- For conjunct periods, which this requirement is, (4.1) applies: use one DATETIME; the end of the period can be derived from the beginning of the period of the next row.
- For disjunct periods, yes, you need 2 x DATETIMEs, eg. RentedFrom and RentedTo, with gaps in-between.

(4.3) They mess with the "Temporal Primary Key", which complicates code (in addition to requiring triggers to control the Update Anomaly). I have already delivered a clean (tried and tested) Temporal Primary Key.

(4.4) They mess with dummy values, non-real values, and Nulls for "Now". I do not allow such things in a database. Since I am not storing the duplicated ValidTo, I do not have the problem; there is nothing to solve.

(4.5) One has to wonder why a 528 page "textbook" is available free on the web, in poor PDF form.

(5) I [a User] could quite happily delete all the LocationHistory rows for instance (leaving only the current version in the Location table) - even though there may exist a SensorHistory row that conceptually "belongs" to a previous version of the Location, if that makes sense.

It does not make sense to me; there is still a gap in communication that we have to close. Please keep interacting until it is closed.

In a real (standard ISO/IEC/ANSI SQL) database, we do not GRANT INSERT/UPDATE/DELETE permission to users. We GRANT SELECT and REFERENCES only (to chosen users). All INSERT/UPDATE/DELETEs are coded in Transactions, which means stored procs. Then we GRANT EXEC on each stored proc to selected users (use ROLES to reduce administration).
- Therefore no one can delete from any table without executing a proc.
- Do not write a proc to delete from any History table. These rows should not be deleted. In this case, the non-permission and the non-existence of code is the Constraint.
- Technically, all History rows are valid; there is no Period to concern yourself with. The oldest LocationHistory row contains the before-image of the original Location row, before it was changed. The youngest LocationHistory row is the before-image of the current Location row. Every LocationHistory row in-between is thus valid and applies to the Period in-between.
- There is no need to "prune", or to look for a few LocationHistory rows that can be deleted on the basis that they apply to a Period that is not used: they are all used. (Definitively, without the need for checking the mapping of Location children to any LocationHistory row(s) to prove it.)
- Bottom line: a User cannot delete from any History (or Transaction) table.
- Or do you mean something different again ?
- Note I have added (1.1) above.
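
A rough sketch of that permission scheme in PostgreSQL 11+ syntax (other servers spell the stored proc and the EXEC grant differently). The role sensor_operator, the procedure NetworkSlave_Rename, the Name column and the column sizes are all hypothetical:

```sql
CREATE TABLE NetworkSlave (
    NetworkSlaveCode  CHAR(8)      NOT NULL,
    Name              VARCHAR(30)  NOT NULL,
    UpdatedDtm        TIMESTAMP    NOT NULL,
    CONSTRAINT NetworkSlave_PK PRIMARY KEY (NetworkSlaveCode)
);

CREATE TABLE NetworkSlaveHistory (
    NetworkSlaveCode  CHAR(8)      NOT NULL,
    UpdatedDtm        TIMESTAMP    NOT NULL,
    Name              VARCHAR(30)  NOT NULL,
    CONSTRAINT NetworkSlaveHistory_PK
        PRIMARY KEY (NetworkSlaveCode, UpdatedDtm),
    CONSTRAINT NetworkSlave_NetworkSlaveHistory_FK
        FOREIGN KEY (NetworkSlaveCode)
        REFERENCES NetworkSlave (NetworkSlaveCode)
);

CREATE ROLE sensor_operator;

-- Users may read and reference, but get no INSERT/UPDATE/DELETE.
GRANT SELECT, REFERENCES
   ON NetworkSlave, NetworkSlaveHistory
   TO sensor_operator;

-- The only write path. SECURITY DEFINER makes the body run with the
-- privileges of the procedure's owner rather than the caller's.
CREATE PROCEDURE NetworkSlave_Rename (p_Code CHAR(8), p_Name VARCHAR(30))
LANGUAGE SQL
SECURITY DEFINER
AS $$
    -- Save the before-image, then change the current row.
    INSERT INTO NetworkSlaveHistory (NetworkSlaveCode, UpdatedDtm, Name)
    SELECT NetworkSlaveCode, UpdatedDtm, Name
      FROM NetworkSlave
     WHERE NetworkSlaveCode = p_Code;

    UPDATE NetworkSlave
       SET Name       = p_Name,
           UpdatedDtm = CURRENT_TIMESTAMP
     WHERE NetworkSlaveCode = p_Code;
$$;

-- PostgreSQL grants EXECUTE to PUBLIC by default, so revoke it first.
REVOKE EXECUTE ON PROCEDURE NetworkSlave_Rename FROM PUBLIC;
GRANT  EXECUTE ON PROCEDURE NetworkSlave_Rename TO sensor_operator;

-- No procedure deletes from NetworkSlaveHistory, and no role holds DELETE
-- on it, so history rows cannot be removed through normal use.
```

A member of sensor_operator would then run `CALL NetworkSlave_Rename('NS000001', 'Pump room');` and the call executes atomically within the caller's transaction.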

(6) Corrected one mistake in the DM. An Alert is an expression of a Reading, not of a Sensor.

(7) Corrected the Business Rules in the other question/answer to reflect that, and the new rules exposed in this question.

(8) Do you understand/appreciate that, since we have a fully IDEF1X compliant model, re Identifiers:

The Identifiers are carried through the entire database, retaining their power. Eg. when listing Acknowledgements, they can be joined directly with Location and Sensor; the tables in-between do not have to be read (and they must be if Id keys are used). This is why there are in fact fewer joins required in a Relational database (and more joins required in an unnormalised one).
The Subtypes, etc need to be navigated only when that particular context is relevant.
