Skip to content

External Data

File Attachment Datatype

Configuration & Usage

Corresponding to issue #480, the attach attribute type allows users to attach files into DataJoint schemas as DataJoint-managed files. This is in contrast to traditional blobs which are encodings of programming language data structures such as arrays.

The functionality is modeled after email attachments, where users attach a file along with a message and message recipients have access to a copy of that file upon retrieval of the message.

For DataJoint attach attributes, DataJoint will copy the input file into a DataJoint store, hash the file contents, and track the input file name. Subsequent fetch operations will transfer a copy of the file to the local directory of the Python process and return a pointer to it's location for subsequent client usage. This allows arbitrary files to be uploaded or attached to a DataJoint schema for later use in processing. File integrity is preserved by checksum comparison against the attachment data and verifying the contents during retrieval.

For example, given a localattach store:

dj.config['stores'] = {
  'localattach': {
    'protocol': 'file',
    'location': '/data/attach'
  }
}

A ScanAttachment table can be created:

@schema
class ScanAttachment(dj.Manual):
    definition = """
    -> Session
    ---
    scan_image:    attach@localattach  # attached image scans
    """

Files can be added using an insert pointing to the source file:

>>> ScanAttachment.insert1((0, '/input/image0.tif'))

And then retrieved to the current directory using fetch:

>>> s0 = (ScanAttachment & {'session_id': 0}).fetch1()
>>> s0
{'session_id': 0, 'scan_image': './image0.tif'}
>>> fh = open(s0['scan_image'], 'rb')
>>> fh
<_io.BufferedReader name='./image0.tif')