Strong Entity Integrity: Part 2 — Projection
What is the entity class of the result of relational projection

in relational algebra or its equivalent SQL expression?:
SELECT
a1, a2
FROM
r
Is the entity class of the result the same as that of r? What’s the primary key of the result?
Neither SQL nor RM provide a consistent answer. The answer depends on whether r has other attributes in its primary key besides a1 and a2. If the primary key only has a1 and/or a2, then the entity identity is preserved and the output of the query has the same primary key and describes the same entity set as r.
If the primary key of r contains other attributes besides a1 and a2, then SQL and RM produce a result that can no longer be reliably associated with entities in r and even the number of unique entities in the result may change. The new effective primary key becomes the combination (a1, a2), with no explicit claim of what entity class they identify. If r is a set of hammers, the projection may be a set of quite different, unidentified things. Entity integrity has broken down.
DataJoint’s projection operator r.proj('a1', 'a2') is constrained to always include the primary key of its argument r. The projection operator can rename primary key attributes but never exclude them from the result. The result is still a set of entities of the same class (even if some non-key properties may be removed or renamed or added). Each entity in the result is associated with an entity in the source r.
Related posts
Entrepreneurs of Insight
A Better Data Engine for Brain Science
Data needs direction: five clarifications for database design
Updates Delivered *Straight to Your Inbox*
Join the mailing list for industry insights, company news, and product updates delivered monthly.
