Intro to CZI files Part 2

Reading CZI files with Python.

Purpose: Jclub tutorial on how to process CZI files in Python. This notebook only covers how to open CZI files and view them in a Jupyter notebook.

Disclaimer: Marc and Ciera taught me everything I know

General info: The best image analysis tutorial everrrrr

Open a notebook and specify where you are

In [3]:
### in the terminal use cd to get to the directory you want to be in.

### go into the conda environment where you installed czifile stuff
source activate 'yourenv' # change 'yourenv' to whatever you named your environment (newer conda versions use: conda activate yourenv)

###  Then type:
jupyter notebook 

### this should open up a directory in your browser. 
## Click new and then click on whichever environment you are in. This should open a new notebook! 

Import packages

In [4]:
### Essential for importing and viewing images
import czifile #this is the package to import CZI files
import matplotlib.pyplot as plt #this package visualizes images

### Packages for manipulating images
from scipy import ndimage as ndi 
from skimage import filters, measure, segmentation, transform, exposure, img_as_ubyte, feature, morphology

Mise en place - get those folders organized!

  • Download 100118-oligopaint3-2-02.czi
  • Move it to whatever folder you want to work out of

Use the 'os' package to change directories in Python

In [7]:
import os

I like to make variables that store paths for specific projects

In [15]:
ProjectDirectory = ('/Users/jennahaines/Box Sync/Eisen_Lab/IntrotoCZIfiles')
ProjectData = ('/Users/jennahaines/Box Sync/Eisen_Lab/IntrotoCZIfiles/data/')
ProjectBin = (ProjectDirectory + '/bin')
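
The concatenation above only works when the slashes line up (note the trailing '/' on ProjectData and the leading '/' in '/bin'). Here is a minimal sketch of the same idea with os.path.join, which puts the separators in for you. The base folder below is a made-up placeholder, not the path used in this notebook.

```python
import os

# Hypothetical base folder for illustration; substitute your own project path.
project_directory = os.path.join(os.path.expanduser("~"), "IntrotoCZIfiles")

# os.path.join inserts the separator, so no trailing or leading '/' is needed.
project_data = os.path.join(project_directory, "data")
project_bin = os.path.join(project_directory, "bin")

czi_name = "100118-oligopaint3-2-02.czi"
full_path = os.path.join(project_data, czi_name)
print(full_path)
```

The lowercase variable names are only there to keep the sketch separate from the notebook's own variables.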

Make results notebook

I like to make a new results notebook in the same directory for each jupyter notebook - that way I know which images came from which code later. I based it off of Noble, 2009.

ProjectDirectory

  • bin
  • docs
  • data
In [9]:
NotebookResultsPath = (ProjectDirectory)

Change directory to the results folder so that anything you generate will automatically go in there

In [10]:
os.chdir(NotebookResultsPath)
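
os.chdir raises FileNotFoundError if the folder does not exist yet. Here is a sketch of creating the folder first, assuming you want a dedicated subfolder for results. The folder name 'results' and the temporary base directory are placeholders for illustration only.

```python
import os
import tempfile

# A temporary directory stands in for ProjectDirectory so the sketch runs anywhere.
project_directory = tempfile.mkdtemp()

# 'results' is a hypothetical folder name; pick whatever fits your project.
notebook_results_path = os.path.join(project_directory, "results")

# exist_ok=True makes this safe to re-run when the folder already exists.
os.makedirs(notebook_results_path, exist_ok=True)

# Anything saved with a relative file name now lands in the results folder.
os.chdir(notebook_results_path)
print(os.getcwd())
```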

Open a CZI file

I like to put the CZI name in a separate variable so that I can easily hook things I am writing into loops

In [12]:
CziName = '100118-oligopaint3-2-02.czi'

I make a full path variable in a second step so that I can loop through files easily

In [16]:
FullPath = (ProjectData + CziName)
print(FullPath)
/Users/jennahaines/Box Sync/Eisen_Lab/IntrotoCZIfiles/data/100118-oligopaint3-2-02.czi

To open the CZI, use the czifile package

In [21]:
import czifile

czi_array = czifile.imread(FullPath) #read the czi file.

print(czi_array.shape)
(1, 1, 1, 4, 19, 678, 678, 1)

Shape

  • shape tells you how many dimensions the array has and how long each one is.
  • There are a lot of dimensions and I forget what they all are (written on a sticky note in lab :(). I think the first three are things like time point and scene that I don't generally use.
  • (1-?, 1-?, 1-?, 4 - channels, 19 - z slices, 678 - y plane, 678 - x plane, 1-?)
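
The axis notes above can be written down as a small sketch that pairs each axis with its length. The labels come straight from the bullet above, with "?" for the axes this notebook leaves unidentified. As far as I know, czifile can report the real axis letters through the axes attribute of a czifile.CziFile object, but that is not run here because it needs the file on disk.

```python
# Labels taken from the notes above; "?" marks axes the tutorial leaves unidentified.
labels = ["?", "?", "?", "channel", "z", "y", "x", "?"]
shape = (1, 1, 1, 4, 19, 678, 678, 1)

# Pair each axis with its length.
for label, size in zip(labels, shape):
    print(f"{label:>8}: {size}")

# Axes of length 1 carry no information for this file; the rest are worth keeping.
kept = [(label, size) for label, size in zip(labels, shape) if size != 1]
print(kept)
```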

I use the squeeze function to drop all of the dimensions that have length 1

In [22]:
czi_array = czi_array.squeeze() #take out the dimensions that are not important
print(czi_array.shape)
(4, 19, 678, 678)
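
One thing to watch out for: squeeze drops every axis of length 1, including ones you may care about. The sketch below uses small synthetic arrays (not real CZI data) and a hypothetical single-channel acquisition to show the difference between squeezing and indexing the unused axes explicitly.

```python
import numpy as np

# Small synthetic stand-in for a CZI array (same axis layout as above, smaller images).
full = np.zeros((1, 1, 1, 4, 19, 32, 32, 1), dtype=np.uint8)
print(full.squeeze().shape)  # (4, 19, 32, 32)

# Hypothetical single-channel file: squeeze() also drops the channel axis,
# so index 0 would now select a z slice rather than a channel.
single_channel = np.zeros((1, 1, 1, 1, 19, 32, 32, 1), dtype=np.uint8)
print(single_channel.squeeze().shape)  # (19, 32, 32)

# Indexing the unused axes explicitly keeps channel and z in place.
explicit = single_channel[0, 0, 0, :, :, :, :, 0]
print(explicit.shape)  # (1, 19, 32, 32)
```

For the file in this notebook both approaches give the same (4, 19, 678, 678) result; the explicit version only matters once files with a single channel or a single slice show up in a loop.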

Looking at the czi

  • usually you want to look at an individual image to see what's going on.
  • to do that use matplotlib - it can print out the czi image with imshow
  • this array has 4 x 19 = 76 images in it, though. Matplotlib only shows one at a time.
  • use numpy array functions to dissect array

Pull out all images pertaining to one channel

Tip: indexing is 0 based, meaning channel 0 is the first channel.

In [37]:
ChannelStack = czi_array[2,...] 
print(ChannelStack.shape)
(19, 678, 678)
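
A few more ways to slice the same kind of array, sketched on a small synthetic stand-in with the (channel, z, y, x) order used above. The array contents and the lowercase variable names are made up for illustration.

```python
import numpy as np

# Synthetic stand-in with the tutorial's axis order (channel, z, y, x); tiny images.
czi_array = np.arange(4 * 19 * 8 * 8).reshape(4, 19, 8, 8)

channel_stack = czi_array[2, ...]          # all z slices of channel 2
same_stack = czi_array[2]                  # trailing axes are implied, so this is identical
one_image = czi_array[2, 15]               # channel 2, z slice 15: a single 2D image
one_slice_all_channels = czi_array[:, 15]  # z slice 15 from every channel

print(channel_stack.shape, one_image.shape, one_slice_all_channels.shape)

# Basic slicing returns a view, not a copy: editing channel_stack edits czi_array too.
print(np.shares_memory(channel_stack, czi_array))
```

If you plan to modify a stack without touching the original array, take a copy first with channel_stack.copy().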

import matplotlib

In [35]:
import matplotlib.pyplot as plt
%matplotlib inline

pick an image in the stack and print it using matplotlib

in this case I picked 15.. change the numbers around to go through the different images

In [38]:
plt.imshow(ChannelStack[15])
Out[38]:
<matplotlib.image.AxesImage at 0x119a8ef28>

There may come a time (and that time might be right now) where you would like to see all the images in this stack at once instead of typing them out manually.

I wrote out a loop that does this.

In [41]:
fig, axs = plt.subplots(nrows = 4, ncols = 5, figsize=(50, 50))
for ax, pln in zip(axs.flat, range(ChannelStack.shape[0])) : 
    ax.imshow(ChannelStack[pln])
    ax.axis('off')
    ax.set_title(str(pln))
plt.tight_layout()

### This saves the figure as a png in your current directory - comment the line out if you don't want the file
plt.savefig('200323-StackImages.png')
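
The loop above hardwires a 4 x 5 grid, which fits 19 planes but leaves one empty panel with its frame still showing. Here is a sketch that works out the number of rows from the stack size and hides every frame, using random synthetic images in place of ChannelStack. The Agg backend line is only there so the sketch runs without a display; leave it out in a notebook.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for ChannelStack: 19 small random images.
rng = np.random.default_rng(0)
stack = rng.random((19, 16, 16))

ncols = 5
nrows = math.ceil(stack.shape[0] / ncols)  # enough rows for every plane

# squeeze=False keeps axs two-dimensional even when there is only one row.
fig, axs = plt.subplots(nrows=nrows, ncols=ncols, figsize=(10, 2 * nrows), squeeze=False)
for ax in axs.flat:
    ax.axis("off")  # hide the frame on every panel, including unused ones
for ax, (pln, image) in zip(axs.flat, enumerate(stack)):
    ax.imshow(image)
    ax.set_title(str(pln))
fig.tight_layout()
```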

Hot tip/mess - Distraction

To loop through CziFiles:

In [19]:
from os import listdir

#### Import the filelist
filelist = listdir(ProjectData) #list of file names

#### Add any file names that have .czi to a list target files
target_files = []
for fname in filelist:
    if fname.endswith('.czi'):
        target_files.append(fname)
    else:
        print("This file is not a CZI file - get outta here :", fname)
print(target_files)
['100118-oligopaint3-2-02.czi']
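
An alternative sketch of the same filter using pathlib, which returns full paths so there is no need to glue the folder and file name back together. The temporary folder and the file names are made up so the example runs anywhere.

```python
import tempfile
from pathlib import Path

# Temporary folder with made-up file names stands in for ProjectData.
data_dir = Path(tempfile.mkdtemp())
for name in ["a.czi", "b.czi", "notes.txt"]:
    (data_dir / name).touch()

# Path.glob returns full paths; sorted() gives a reproducible order.
target_paths = sorted(data_dir.glob("*.czi"))
for czi_path in target_paths:
    # This is where czifile.imread(str(czi_path)) would go for real files.
    print(czi_path.name)
```

Note that the pattern is matched as written, so on a case-sensitive file system a file ending in '.CZI' would be skipped.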

Supp info

This is mostly for me but I thought I would include it in case you wanted to see the full function I use to import and separate Czi files into individual stacks based on channel. All the channels are hardwired so if you are not collecting on these specific channels you'll have to change what strings it's looking for. Maybe after I finish my thesis I will write a way for the script to do this smartly.

In [ ]:
import sys
import czifile
import matplotlib.pyplot as plt
from scipy import ndimage as ndi
from skimage import filters, measure, segmentation, transform, exposure, img_as_ubyte, feature, morphology
from skimage.morphology import disk, ball
import numpy as np
from os import listdir
import os
from xml.etree import ElementTree as ET
import re
import os.path
from os import path
import tifffile

def CZIMetadatatoDictionaries(InputDirectory, CziName):
    czi = czifile.CziFile(InputDirectory + str(CziName))
    czi_array = czifile.imread(InputDirectory + str(CziName)) #read the czi file.
    czi_array = czi_array.squeeze() #take out the dimensions that are not important
    #print(czi_array.shape)
    
    ####### Extract the metadata
    metadata = czi.metadata     #reading the metadata from CZI
    root = ET.fromstring(metadata)  #loading metadata into XML object
    ##### Make a dictionary from the channel metadata (only elements that have an Id): the dye name is the key and the channel number is the value
    ChannelDictionary = {}
    for neighbor in root.iter('Channel'):  
        TempDict = neighbor.attrib
        if 'Id' in TempDict: #only the Channel elements that carry an Id attribute
            #print(TempDict) #test
            Search = r"(\w+):(\d+)" #separate e.g. Channel:1 into two parts .. only keep the number
            Result = re.search(Search, TempDict['Id'])
            Search2 = r"(\w+)-(.+)" #keep the part of the Name before the '-' (the dye name)
            Result2 = re.search(Search2, TempDict['Name'])
            ChannelDictionary[Result2.group(1)] = Result.group(2) #key = dye name, value = channel number (0 based!, stored as a string)
    #print(ChannelDictionary)

    ####### pull out the channels and make stacks
    if "AF405" in ChannelDictionary.keys():
        AF405index = ChannelDictionary["AF405"]
        AF405Stack = czi_array[int(AF405index),...]
    else:
        print("AF405 is not in this file")
        AF405Stack = 'empty'
    
    if "AF488" in ChannelDictionary.keys():
        AF488index = ChannelDictionary["AF488"]
        AF488Stack = czi_array[int(AF488index),...]

    else:
        print("AF488 is not in this file")
        AF488Stack = 'empty'
    
    if "AF647" in ChannelDictionary.keys():
        AF647index = ChannelDictionary["AF647"]
        AF647Stack = czi_array[int(AF647index),...]
    else:
        print("AF647 is not in this file")
        AF647Stack = 'empty'

    if "AF546" in ChannelDictionary.keys() :
        AF546index = ChannelDictionary["AF546"]
        AF546Stack = czi_array[int(AF546index),...]
    elif "At550" in ChannelDictionary.keys() :
        AF546index = ChannelDictionary["At550"]
        AF546Stack = czi_array[int(AF546index),...]
    else:
        print("AF546 is not in this file")
        AF546Stack = 'empty' 
    
    return(AF405Stack, AF488Stack, AF647Stack, AF546Stack)
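
One possible way to avoid hardwiring the channels is to return a dictionary with one stack per dye name found in the metadata. This is only a sketch: the XML fragment is made up to mimic the Id and Name attributes the function above looks for, the array is synthetic, and split_channels is a hypothetical helper, not part of czifile.

```python
import re
from xml.etree import ElementTree as ET

import numpy as np

# Made-up metadata fragment mimicking the Id/Name attributes used above.
metadata = """
<Metadata>
  <Channel Id="Channel:0" Name="AF405-T1"/>
  <Channel Id="Channel:1" Name="AF488-T2"/>
  <Channel Name="no-id-here"/>
</Metadata>
"""
czi_array = np.zeros((2, 19, 8, 8), dtype=np.uint8)  # synthetic (channel, z, y, x)


def split_channels(metadata_xml, array):
    """Return {dye name: z stack} for every Channel element that has an Id."""
    stacks = {}
    for channel in ET.fromstring(metadata_xml).iter("Channel"):
        attrs = channel.attrib
        if "Id" not in attrs or "Name" not in attrs:
            continue  # skip Channel elements without the attributes we need
        index = re.search(r":(\d+)$", attrs["Id"])
        if index is None:
            continue  # Id does not end in a channel number
        dye = attrs["Name"].split("-")[0]  # text before the first '-'
        stacks[dye] = array[int(index.group(1))]
    return stacks


stacks = split_channels(metadata, czi_array)
print(sorted(stacks))
```

With a dictionary, checking for a dye becomes "AF647" in stacks, and no placeholder value is needed for channels that were not collected.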