Written by
Dimitri Yatsenko, PhD
Founder • Chief Science & Technology Officer
July 30, 2017

Strong Entity Integrity: Part 2 — Projection

Dimitri Yatsenko, PhD
Founder • Chief Science & Technology Officer

What is the entity class of the result of relational projection

\pi_{A_1, A_2}(r)

in relational algebra or its equivalent SQL expression?:

SELECT a1, a2 FROM r

Is the entity class of the result the same as that of r? What’s the primary key of the result?

Neither SQL nor RM provide a consistent answer. The answer depends on whether r has other attributes in its primary key besides a1 and a2.  If the primary key only has a1 and/or a2, then the entity identity is preserved and the output of the query has the same primary key and describes the same entity set as r.  

If the primary key of r contains other attributes besides a1 and a2, then SQL and RM produce a result that can no longer be reliably associated with entities in r and even the number of unique entities in the result may change. The new effective primary key becomes the combination (a1, a2), with no explicit claim of what entity class they identify. If r is a set of hammers, the projection may be a set of quite different, unidentified things. Entity integrity has broken down.

DataJoint’s projection operator r.proj('a1', 'a2') is constrained to always include the primary key of its argument r. The projection operator can rename primary key attributes but never exclude them from the result. The result is still a set of entities of the same class (even if some non-key properties may be removed or renamed or added). Each entity in the result is associated with an entity in the source r.

Related posts

Updates Delivered *Straight to Your Inbox*

Join the mailing list for industry insights, company news, and product updates delivered monthly.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.