Written by
No items found.
June 14, 2020

DataJoint Basics in Ten Minutes

No items found.

As I mentioned in my previous post, DataJoint is one of those rare technologies that comes around and can truly revolutionize your workflow making your life much easier.

The following are different functions that DataJoint has. For the purposes of this blog post, we will imagine ourselves as neuroscientists about to perform a neurophysiology experiment on our subjects, a set of mice.

We start off by importing DataJoint; our convention is to use dj as its alias.

>>> import datajoint as dj

From here we create a schema, which is where our tables will live. We give our schema some meaningful name, in this case, I will call mine ‘tutorial’, and we end the name with an _, giving us tutorial_.

>>> schema = dj.schema('tutorial_')

Creating Our First Table

Now that we have our schema, it is time to populate it with tables! Our first table is called Mouse and it is a Manual table. This is a category of tables where the information is inserted manually.

>>> @schema
   class Mouse(dj.Manual):
     definition = """
     # Experimental animals
     mouse_id             : int                          # Unique animal ID
     ---
     dob=null             : date                         # date of birth
     sex="unknown"        : enum('M','F','unknown')      # sex
     """

On the left side, under #Experimental animals, we see our column names. These are the features that we will document about our Mice. Each table has a primary key which is separated from the secondary keys by  — — — . In this case, the primary key is the mouse_id, and the secondary keys are dob (date of birth) and sex.

Inserting Data

>>> Mouse()
Mouse_id dob sex

Our table is currently empty and being a manual table, we must insert data into it. DataJoint has several ways of doing this. The first is using the .insert()function.

>>> Mouse.insert1((0, '2017-03-01', 'M'))      
>>> Mouse()  
Mouse_id      dob           sex      
0        2017-03-01        M    

Here we see that we can insert a tuple with the necessary information.

You can insert data via a dictionary.

>>> data = { 'mouse_id': 100, 'dob': '2017-05-12', 'sex': 'F'}  
>>> Mouse.insert1(data)                                                                                                                                                                                                                                                                                  
Mouse_id      dob           sex      
0        2017-03-01        M        
100      2017-05-12        F

You can also use a list of tuples.

>>> data = [(1, '2016-11-19', 'M'),(2, '2016-11-20', 'unknown'),(5, '2016-12-25', 'F')]      
>>> Mouse.insert(data)
Out[7]: Mouse_id      dob           sex          
0        2017-03-01        M          
1        2016-11-19        M          
2        2016-11-20   unknown      
5        2016-12-25        F        
100       2017-15-12       F

Data Integrity

One of the many benefits of DataJoint is its ability to uphold data integrity. Trying to insert a new mouse with mouse_id :0 results in a Duplicate error.

>>> Mouse.insert1({'mouse_id': 0,'dob': '2018-01-01','sex': 'M',})
---------------------------------------------------------------------------DuplicateError    
Traceback (most recent call last)<ipython-input-21-304369cd8a83> in <module>      2 {'mouse_id': 0,      3  'dob': '2018-01-01',----> 4  'sex': 'M',      5 })~/.local/lib/python3.7/site-packages/datajoint/table.py in insert1(self, row, **kwargs)    264         For kwargs, see insert()    265         """--> 266         self.insert((row,), **kwargs)    267     268     def insert(self, rows, replace=False, skip_duplicates=False, ignore_extra_fields=False, allow_direct_insert=None):~/.local/lib/python3.7/site-packages/datajoint/table.py in insert(self, rows, replace, skip_duplicates, ignore_extra_fields, allow_direct_insert)    336             except DuplicateError as err:    337                 raise err.suggest(--> 338                     'To ignore duplicate entries in insert, set skip_duplicates=True')    339     340     def delete_quick(self, get_count=False):DuplicateError: ("Duplicate entry '0' for key 'PRIMARY'", 'To ignore duplicate entries in insert, set skip_duplicates=True')

Creating A Table With Dependencies

>>> @schema          
   class Session(dj.Manual):            
       definition = """            
       # Experiment session          
       -> Mouse            
       session_date         : date         # date  
       ---            
       experiment_setup     : int   # experiment setup ID  
       experimenter         : varchar(100)  # experimenter name            
       data_path=''         : varchar(255)                 #            """

We can see that when creating the Session table, the first primary key is -> Mouse. This means that Session inherits the Mouse’s primary key as well as includes session_date.

Querying

One of DataJoint’s strengths lies in its ability to query. There may be times when you only want a subset of the data according to certain criteria.

Four Types Of Queries:

  • restriction (&) and negative restriction (-): filter data
  • join (*): bring fields from different tables together
  • projection (.proj()): focus on a subset of attributes
  • aggregation (.aggr()): simple computation of one table against another table

Restriction (&)

The & can be used to limit

>>> Mouse & 'mouse_id=0'
Mouse_id     dob        sex      
0     2017-03-01      M

We can also use the restriction with a couple of other keywords.

In[4]: Mouse & 'mouse_id = 0'Out[4]: Mouse_id     dob        sex           0     2017-03-01      M

>>> Mouse & 'mouse_id = 0'
Mouse_id     dob        sex      
0     2017-03-01      M

Related posts

Updates Delivered *Straight to Your Inbox*

Join the mailing list for industry insights, company news, and product updates delivered monthly.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.