Written by
Dimitri Yatsenko, PhD
Founder • Chief Science & Technology Officer
August 6, 2018

Preprint: The DataJoint Model

We have just posted a preprint on arXiv that details the DataJoint data model. The paper incorporates many of the principles first formulated on this blog.

DataJoint: Managing big scientific data using MATLAB or Python

Authors:

Dimitri Yatsenko, Alexander Reimer, Edgar Walker, Taosha Fan, Andreas S. Tolias

Summary:

This paper introduces DataJoint, a framework designed to manage large and complex scientific datasets, particularly in neuroscience and other data-intensive research domains. The authors identify three challenges in traditional data management approaches: poor integration of data and computation, lack of reproducibility, and ad hoc workflows.

DataJoint addresses these challenges with:

  • A unified data model based on the relational data model and entity-relationship modeling, adapted for scientific workflows.
  • Clear separation between manually entered data and automatically computed data, supporting declarative pipeline definitions that link raw data acquisition to derived results.
  • Support for two popular environments, Python and MATLAB, with consistent APIs, making it accessible to scientists.
  • Automatic dependency tracking, so that computations can be triggered whenever upstream data arrive and results stay synchronized with their inputs.
  • Scalability and concurrency through a MySQL or MariaDB backend and distributed job reservation for parallel computing.
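
The ideas in the list above can be illustrated with a toy sketch in plain Python. It does not use the DataJoint library or a database. The classes ManualTable and ComputedTable, their methods, and the in-memory reservation set are hypothetical stand-ins invented for this illustration. The sketch only shows the general pattern: results are computed for exactly those upstream entries that do not have one yet, and a key is reserved while it is being processed so that another worker would skip it.

```python
from typing import Callable, Dict, List, Set


class ManualTable:
    """Toy stand-in for data entered by a person or an instrument."""

    def __init__(self) -> None:
        self.rows: Dict[int, dict] = {}

    def insert(self, key: int, **attrs) -> None:
        self.rows[key] = attrs


class ComputedTable:
    """Toy stand-in for results derived automatically from an upstream table."""

    def __init__(self, upstream: ManualTable, make: Callable[[dict], dict]) -> None:
        self.upstream = upstream      # the dependency: where input rows come from
        self.make = make              # the computation applied to each input row
        self.rows: Dict[int, dict] = {}
        self.reserved: Set[int] = set()  # keys currently claimed by a worker

    def pending(self) -> List[int]:
        """Upstream keys that have no result yet and are not reserved."""
        return sorted(set(self.upstream.rows) - set(self.rows) - self.reserved)

    def populate(self) -> int:
        """Compute every missing result and return how many were computed."""
        computed = 0
        for key in self.pending():
            # Re-check in case the key was claimed or finished in the meantime.
            if key in self.reserved or key in self.rows:
                continue
            self.reserved.add(key)
            try:
                self.rows[key] = self.make(self.upstream.rows[key])
                computed += 1
            finally:
                self.reserved.discard(key)
        return computed


# Hypothetical example data: two recording sessions entered by hand.
sessions = ManualTable()
sessions.insert(1, samples=[1.0, 2.0, 3.0])
sessions.insert(2, samples=[4.0, 6.0])

mean_signal = ComputedTable(
    sessions,
    lambda row: {"mean": sum(row["samples"]) / len(row["samples"])},
)

first = mean_signal.populate()   # computes results for sessions 1 and 2
sessions.insert(3, samples=[10.0])
second = mean_signal.populate()  # computes only the new session 3
```

In this sketch the second call to populate does no redundant work, because the pending keys are derived from the difference between the upstream entries and the existing results. A real deployment would keep both the data and the reservations in the shared database rather than in process memory.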

The paper provides several examples, including usage in real neuroscience research, demonstrating how DataJoint enables reproducibility, data integrity, and collaboration.

Key Takeaway:

DataJoint is not just another ORM or data interface. It is a complete framework for building structured, scalable, and reproducible data pipelines in modern scientific research.
