DataJoint Basics in Ten Minutes

No items found.

As I mentioned in my previous post, DataJoint is one of those rare technologies that comes around and can truly revolutionize your workflow making your life much easier.

The following are different functions that DataJoint has. For the purposes of this blog post, we will imagine ourselves as neuroscientists about to perform a neurophysiology experiment on our subjects, a set of mice.

We start off by importing DataJoint; our convention is to use dj as its alias.

>>> import datajoint as dj

From here we create a schema, which is where our tables will live. We give our schema some meaningful name, in this case, I will call mine ‘tutorial’, and we end the name with an _, giving us tutorial_.

>>> schema = dj.schema('tutorial_')

Creating Our First Table

Now that we have our schema, it is time to populate it with tables! Our first table is called Mouse and it is a Manual table. This is a category of tables where the information is inserted manually.

>>> @schema class Mouse(dj.Manual): definition = """ # Experimental animals mouse_id : int # Unique animal ID --- dob=null : date # date of birth sex="unknown" : enum('M','F','unknown') # sex """

On the left side, under #Experimental animals, we see our column names. These are the features that we will document about our Mice. Each table has a primary key which is separated from the secondary keys by — — — . In this case, the primary key is the mouse_id, and the secondary keys are dob (date of birth) and sex.

Inserting Data

>>> Mouse() Mouse_id dob sex

Our table is currently empty and being a manual table, we must insert data into it. DataJoint has several ways of doing this. The first is using the .insert()function.

>>> Mouse.insert1((0, '2017-03-01', 'M')) >>> Mouse() Mouse_id dob sex 0 2017-03-01 M

Here we see that we can insert a tuple with the necessary information.

You can insert data via a dictionary.

>>> data = { 'mouse_id': 100, 'dob': '2017-05-12', 'sex': 'F'} >>> Mouse.insert1(data) Mouse_id dob sex 0 2017-03-01 M 100 2017-05-12 F

You can also use a list of tuples.

>>> data = [(1, '2016-11-19', 'M'),(2, '2016-11-20', 'unknown'),(5, '2016-12-25', 'F')] >>> Mouse.insert(data) Out[7]: Mouse_id dob sex 0 2017-03-01 M 1 2016-11-19 M 2 2016-11-20 unknown 5 2016-12-25 F 100 2017-15-12 F

Data Integrity

One of the many benefits of DataJoint is its ability to uphold data integrity. Trying to insert a new mouse with mouse_id :0 results in a Duplicate error.

>>> Mouse.insert1({'mouse_id': 0,'dob': '2018-01-01','sex': 'M',}) ---------------------------------------------------------------------------DuplicateError Traceback (most recent call last)<ipython-input-21-304369cd8a83> in <module> 2 {'mouse_id': 0, 3 'dob': '2018-01-01',----> 4 'sex': 'M', 5 })~/.local/lib/python3.7/site-packages/datajoint/table.py in insert1(self, row, **kwargs) 264 For kwargs, see insert() 265 """--> 266 self.insert((row,), **kwargs) 267 268 def insert(self, rows, replace=False, skip_duplicates=False, ignore_extra_fields=False, allow_direct_insert=None):~/.local/lib/python3.7/site-packages/datajoint/table.py in insert(self, rows, replace, skip_duplicates, ignore_extra_fields, allow_direct_insert) 336 except DuplicateError as err: 337 raise err.suggest(--> 338 'To ignore duplicate entries in insert, set skip_duplicates=True') 339 340 def delete_quick(self, get_count=False):DuplicateError: ("Duplicate entry '0' for key 'PRIMARY'", 'To ignore duplicate entries in insert, set skip_duplicates=True')

Creating A Table With Dependencies

>>> @schema class Session(dj.Manual): definition = """ # Experiment session -> Mouse session_date : date # date --- experiment_setup : int # experiment setup ID experimenter : varchar(100) # experimenter name data_path='' : varchar(255) # """

We can see that when creating the Session table, the first primary key is -> Mouse. This means that Session inherits the Mouse’s primary key as well as includes session_date.