Strong Entity Integrity: Part 7 — Division
One of the most difficult relational operations to comprehend is relational division.
We redefine division as follows:
Division B ÷ C with respect to A is the subset of A for which every matching element in C has a match in B.
In DataJoint, relational division is performed by the following expression:
D = A - (A*C - B)
Most traditional definitions also exclude the non-primary attributes from the result, making it equivalent to
D = A.proj() - (A*C - B)
From this definition, it follows that relational division is a form of restriction on A and it preserves its entity class and primary key.
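To make the expression concrete, here is a minimal sketch in plain Python rather than DataJoint. It models a relation as a list of dicts and matches rows on every attribute name they share, which is a simplification of DataJoint's own join rules. The helper names (`matches`, `join`, `antijoin`, `divide`) and the sample rows are hypothetical and exist only for illustration.

```python
# Teaching model only: a relation is a list of dict rows.
# Rows "match" when they agree on every attribute name they share.

def matches(r, s):
    """True if rows r and s agree on all of their shared attributes."""
    return all(r[k] == s[k] for k in r.keys() & s.keys())

def join(R, S):
    """Natural join, standing in for R * S."""
    return [{**r, **s} for r in R for s in S if matches(r, s)]

def antijoin(R, S):
    """Rows of R that have no match in S, standing in for R - S."""
    return [r for r in R if not any(matches(r, s) for s in S)]

def divide(A, B, C):
    """D = A - (A*C - B)"""
    return antijoin(A, antijoin(join(A, C), B))

# Hypothetical sample data
A = [{"a": 1, "g": "x"}, {"a": 2, "g": "x"}, {"a": 3, "g": "y"}]
C = [{"g": "x", "c": 10}, {"g": "x", "c": 20}]
B = [{"a": 1, "c": 10}, {"a": 1, "c": 20}, {"a": 2, "c": 10}]

D = divide(A, B, C)
print(D)
```

In this data, the row with `a = 2` is dropped because one of its matching elements in C (`c = 20`) has no match in B. The row with `a = 3` is kept because it has no matching elements in C at all, so the condition holds vacuously. The result contains only rows of A with their attributes unchanged, which illustrates why division behaves as a restriction on A.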
The reason that the traditional formulations of relational division are so difficult to articulate is that they present division as a binary operator on B and C, introducing A implicitly through operations on the relations’ headings. DataJoint requires that A be explicitly formed. Therefore, division in DataJoint is a ternary operator.
As an example, consider the following fragment of a university database (in Python):
@schema
class StudentMajor(dj.Manual):
    definition = """  # Student with major
    -> Student
    ---
    -> Major
    """

@schema
class CompletedCourse(dj.Manual):
    definition = """  # Student's completed course
    -> Student
    -> Course
    ---
    -> Grade
    """

@schema
class RequiredCourse(dj.Manual):
    definition = """  # Course required for major
    -> Major
    -> Course
    """
Then the division CompletedCourse ÷ RequiredCourse with respect to StudentMajor is the list of all students with majors who have completed all the courses required for their major.
required_courses = StudentMajor() * RequiredCourse()
remaining_courses = required_courses - CompletedCourse()
candidates = StudentMajor() - remaining_courses
or as a single statement:
candidates = StudentMajor() - (StudentMajor() * RequiredCourse() - CompletedCourse())
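As a sanity check on the logic, the same query can be sketched with plain Python sets, without a database. The students, majors, and course codes below are made up for illustration, and grades are omitted because they play no part in the division. The sketch computes the answer twice: once by reading the definition directly (every required course has been completed) and once by the difference form used above.

```python
# Hypothetical sample data; a student has exactly one major,
# mirroring the primary key of StudentMajor.
student_major = {"alice": "math", "bob": "math", "carol": "physics"}
required_course = {
    ("math", "MATH101"), ("math", "MATH201"), ("physics", "PHYS101"),
}
completed_course = {
    ("alice", "MATH101"), ("alice", "MATH201"),
    ("bob", "MATH101"),
    ("carol", "PHYS101"), ("carol", "MATH101"),
}

def required_for(major):
    """Courses required for the given major."""
    return {course for m, course in required_course if m == major}

def completed_by(student):
    """Courses the given student has completed."""
    return {course for s, course in completed_course if s == student}

# Direct reading of the definition: keep a student when every
# course required for the major is among the completed courses.
candidates = {
    student
    for student, major in student_major.items()
    if required_for(major) <= completed_by(student)
}

# Difference form: StudentMajor - (StudentMajor * RequiredCourse - CompletedCourse)
required_courses = {
    (s, m, c)
    for s, m in student_major.items()
    for major, c in required_course
    if major == m
}
remaining_courses = {
    (s, m, c) for s, m, c in required_courses if (s, c) not in completed_course
}
candidates_by_difference = set(student_major) - {s for s, _, _ in remaining_courses}

print(sorted(candidates), sorted(candidates_by_difference))
```

Both forms select the same students in this data. Courses completed outside the major, such as carol's `MATH101`, do not affect the result.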
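Advanced: The conventional binary relational division can be mimicked using DataJoint’s special universal entity set dj.U by setting A = dj.U(B.primary_key - C.primary_key) & B(), where B.primary_key - C.primary_key denotes the primary key attributes of B that are not in the primary key of C.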
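The role of the implicit A in the binary form can be illustrated with plain Python sets. This is a sketch of the idea only, not of dj.U itself: here B holds hypothetical (student, course) pairs, C is a flat set of required courses, and A is derived from B by keeping the distinct values of the attribute that B has and C lacks.

```python
# Hypothetical sample data
B = {("alice", "MATH101"), ("alice", "MATH201"), ("bob", "MATH101")}
C = {"MATH101", "MATH201"}

# Implicit A: the distinct students that appear in B.
A = {student for student, _ in B}

# Pairs that would have to exist in B but do not.
missing = {(s, c) for s in A for c in C} - B

# Binary division B ÷ C: members of A with nothing missing.
quotient = A - {s for s, _ in missing}

print(quotient)
```

Because A is built from B, a student with no completed courses never enters A and therefore cannot appear in the quotient. Forming A explicitly, as in the ternary form, leaves that choice to the person writing the query.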
