Why reinvent the relational model?

Dimitri Yatsenko, PhD

Founder • Chief Science & Technology Officer

An introductory course in Database Systems will likely present two closely related data models: the Entity-Relationship Model (ERM) and the Relational Data Model (RDM). The ERM is useful for conceptual modeling of real-world entities, their attributes, and relationships between entities of different classes. The RDM supports logical modeling suitable for implementation. Much of the course will focus on how to convert ERM designs into the RDM; such conversion remains an art even though automated tools have been proposed. Furthermore, the process is irreversible: an ERM design cannot be straightforwardly recovered from its RDM counterpart.

Figure: *An example of entity-relationship modeling from Elmasri and Navathe’s “Fundamentals of Database Systems” (7th Ed.)*

Database programmers rarely bother with a formal ERM design. With experience, they learn to model entities and relationships in their heads and churn out SQL table declarations. Just as a Zion operator from the Matrix movies perceives the state of the Matrix from the code raining down her green screen, database programmers infer the underlying conceptual design from existing table declarations and foreign key constraints defined by others. Tools for reverse-engineering database schemas do not quite recover the entity-relationship design, but they visualize the structure of tables and foreign key constraints so that a reader can infer it.
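As a rough illustration of what such reverse-engineering has to work with, the sketch below declares two tables and then reads the foreign key constraints back out of the schema. It uses Python's built-in `sqlite3` module; the `department` and `employee` tables and the `foreign_keys` helper are hypothetical names chosen for this example, not taken from any particular tool.

```python
import sqlite3

# Hypothetical two-table schema, declared the way a programmer would
# write it directly in SQL without a formal E-R diagram.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")


def foreign_keys(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    links = []
    for table in tables:
        # PRAGMA columns: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            links.append((table, fk[3], fk[2], fk[4]))
    return links


links = foreign_keys(conn)
for child, child_col, parent, parent_col in links:
    print(f"{child}.{child_col} -> {parent}.{parent_col}")
```

The output lists which column points at which table, but nothing more. Whether the link means "works for", "manages", or something else is exactly the conceptual information the reader has to infer.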

Why do we need two data models to design one database? Why not have a single data model that can be used for both conceptual modeling and implementation? Why is the ERM unsuitable for logical modeling, and why is the RDM a poor conceptual model?

I will speculate that part of the problem lies in the chronology of the two inventions. Edgar F. Codd defined the RDM in 1969, whereas Peter Chen did not introduce the ERM until 1976. The RDM was inspired by the mathematical concept of relations from set theory, where a relation is defined as a subset of the Cartesian product of several sets (domains). Although his descriptions implied that relations corresponded to sets of real-world entities of various types, Codd formulated his model in much more general and abstract terms. By the time the ERM was described, relational concepts were already firmly ingrained.
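The set-theoretic definition of a relation can be made concrete with a few lines of Python. This is only a minimal sketch: the two domains and the `enrolled` relation are invented for illustration and do not come from Codd's work.

```python
from itertools import product

# Hypothetical domains (illustrative only).
students = {"alice", "bob"}
courses = {"db101", "ml201"}

# The Cartesian product holds every possible (student, course) pair.
cartesian = set(product(students, courses))

# A relation is any subset of that product: here, the pairs that
# happen to be true in the world being modeled.
enrolled = {("alice", "db101"), ("bob", "db101"), ("bob", "ml201")}

# Subset check: every tuple of the relation comes from the product.
assert enrolled <= cartesian
print(len(cartesian), "possible pairs,", len(enrolled), "in the relation")
```

Nothing in this definition says that a tuple stands for an entity or for a relationship between entities. That interpretation has to be supplied from outside the formalism.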

I will further speculate that had the chronology been reversed and had the relational model been constrained by E-R concepts, many of its core definitions and operations would have turned out quite differently. Perhaps we would have a relational-like model that kept its focus on modeled entities and their relationships. Then perhaps this model would suit the needs of both conceptual and logical design. Furthermore, abstract and arcane concepts such as functional dependencies and normal forms might have been formulated in much more approachable terms, such as the proper delineation of entities.

The core idea of DataJoint is to reformulate the Relational Data Model to prioritize its effectiveness in the role of an Entity-Relationship Model. The resulting data model should obviate the need for two separate processes or, since formal ERM design is rare in practice, greatly improve the conceptual aspects of the relational data model.

This unification of conceptual and logical modeling required major revisions of many established concepts in traditional database design. Since SQL has long been the lingua franca of relational databases, we will often contrast solutions in DataJoint with their counterparts in SQL. Most DataJoint users learn database programming without ever touching SQL. Even for them, such examples may still help clarify basic concepts. For users who already know SQL and relational concepts from other sources, the examples will help map their knowledge to DataJoint.
