Written by

Dimitri Yatsenko, PhD

Founder • Chief Science & Technology Officer

Get Started with DataJoint

July 31, 2017

Strong Entity Integrity: Part 3 — Restriction

Dimitri Yatsenko, PhD

Founder • Chief Science & Technology Officer

We use the term restriction to refer to the selection of a subset of an entity set. If database terms, restriction selects a subset of rows from a table whereas projection selects columns. We avoided the term selection because in SQL queres, confusingly, the SELECT clause performs projection whereas the WHERE clause performs selection (restriction).

Since restriction does not affect the heading of its argument, restriction may seem as the most benign operation when it comes to entity integrity. It stands to reason that, if r is a set of hammers, then its subset

or the SQL query

SELECT * FROM r

WHERE color='red'

will also represent hammers.

In DataJoint, the equivalent expression is

r & 'color="red"'

and its primary key is the same as r‘s.

Is this reasoning bullet proof?

What if we restrict by an equality condition on a primary key attribute?

For example, if the primary key of r is (a,b), then r & 'a=5' will yield a relation where every tuple can be uniquely identified by b. From the purely relational point of view, b becomes the primary key of the result, i.e. the minimal subset of attributes uniquely identifying each tuple. a is no longer necessary at all since its value is known from the restriction.

This is where DataJoint makes a compromise in favor of the ERM and entity integrity: DataJoint keeps the original primary key even if it’s redundant — for the sake of entity integrity. DataJoint keeps a in the primary key of the result, which means that it cannot be projected out in subsequent operations (see Part 2 of this series).

The SQL equivalent of DataJoint’s (r & 'a=5').proj('c') if the primary key of r is (a, b) is

SELECT a, b, c

FROM r

WHERE a=5

In SQL you might prefer to omit the a since we do not need to retrieve its value:

SELECT b, c

FROM r

WHERE a=5

Although potentially more efficient, the SQL result has lost its entity class. Or put in another way, this query introduces a new unnamed implicit entity class with the primary key of b.

DataJoint’s mantra is “preserve entity integrity” even if it costs some efficiency (usually very little). True, it is possible that an experienced programmer can re-write a DataJoint query more efficiently in SQL. But the gain in clarity and logical coherence is well worth the cost. DataJoint enforces strong entity integrity for the sake of the human scientist and not to help the query compiler be more efficient. DataJoint’s query model ensures that new entities are not introduced implicitly willy-nilly mid-sentence in queries. The only way to introduce a new entity class is to explicitly define it.

Insights & Ideas

September 26, 2025

Neuropixels, Plainly Explained

What you’ll learn: which probe fits your study, the open tools you need, and how to keep results trustworthy.

Milagros Marín, PhD

SciOps Engineer

Insights & Ideas

August 26, 2025

AI and the Evolution of Relational Schemas

Can AI thrive without structure? Why relational schemas are essential for AI to produce reliable, trustworthy results - and need not be rigid.

Dimitri Yatsenko, PhD

Founder • Chief Science & Technology Officer

Insights & Ideas

August 21, 2025

Insight Entrepreneurship – A New Vision for Science

What if scientists were also stewards, strategists, and storytellers of knowledge? Introducing a new model for revitalizing science through ownership, integrity, and impact.

Dimitri Yatsenko, PhD

Founder • Chief Science & Technology Officer

Updates Delivered Straight to Your Inbox

Join the mailing list for industry insights, company news, and product updates delivered monthly.

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Scientists in full protective suits and gloves working in a laboratory, one looking into a microscope and taking notes, the other using a computer.

Strong Entity Integrity: Part 3 — Restriction

Neuropixels, Plainly Explained

AI and the Evolution of Relational Schemas

Insight Entrepreneurship – A New Vision for Science

Updates Delivered *Straight to Your Inbox*

Updates Delivered Straight to Your Inbox