Setting up Giotto Viewer inputs


Option 1. From Giotto Analyzer using exportGiottoViewer

The Giotto Analyzer provides a function, exportGiottoViewer(), that prepares the gene expression, clustering annotations, and other Giotto results for interactive viewing with Giotto Viewer.

In the Giotto Analyzer R package, once you have loaded the dataset and performed clustering and spatial domain detection, you have the information needed for visualization with Giotto Viewer. Call the exportGiottoViewer() function directly; it generates the input files in the output folder:
> viewer_folder = '/home/qzhu/Mouse_cortex_viewer/'
# coming from seqFISH+ dataset
# select annotations, reductions and expression values to view in Giotto Viewer
> exportGiottoViewer(gobject = VC_test, output_directory = viewer_folder, annotations = c('cell_types', 'kmeans', 'global_cell_types', 'sub_cell_types', 'HMRF_k9_b.30'), dim_reductions = c('tsne', 'umap'), dim_reduction_names = c('tsne', 'umap'), expression_values = 'scaled', expression_rounding = 3, overwrite_dir = TRUE)


Option 2. Manual preparation

Giotto Viewer can be used independently of the Analyzer. In that case, prepare the input files manually, as described below.


1.1 A list of files
The required files are:
  • Gene expression matrix (CSV)
  • Cell centroids
  • Clustering annotations
  • tSNE/UMAP cell coordinates
  • Gene list
  • Annotation list
Gene expression
A comma-delimited CSV file with a header row. The first row lists the cell IDs (integers starting at 1). The first column holds the gene names. Values should be on a log-transformed scale and preferably centered; z-scores tend to work well. Example:
gene,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36...
1700022a21rik,2.177,1.507,-0.655,-0.307,-0.557,1.209,0.808,-0.745,-0.448,1.364,-0.702,...
...
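The layout above can be produced with a few lines of Python. The following is only a minimal sketch: the helper names zscore and write_expression_csv are hypothetical, the use of the population standard deviation is an assumption, and rounding to 3 decimals simply mirrors expression_rounding = 3 in the export call of Option 1.

```python
import csv
import io
import statistics


def zscore(row):
    """Center one gene's values to mean 0 and scale by the population SD.

    Assumption: population SD is used; constant rows map to all zeros.
    """
    mean = statistics.mean(row)
    sd = statistics.pstdev(row)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in row]


def write_expression_csv(handle, genes, matrix, decimals=3):
    """Write a genes-by-cells matrix: header 'gene,1,2,...', then one row per gene."""
    n_cells = len(matrix[0]) if matrix else 0
    writer = csv.writer(handle, lineterminator="\n")
    # Header row: the word "gene" followed by cell IDs starting at 1.
    writer.writerow(["gene"] + list(range(1, n_cells + 1)))
    for gene, row in zip(genes, matrix):
        if len(row) != n_cells:
            raise ValueError(f"{gene}: expected {n_cells} values, got {len(row)}")
        writer.writerow([gene] + [round(v, decimals) for v in row])


# Made-up log-scale values for 2 genes x 3 cells, written to memory, not disk.
genes = ["genea", "geneb"]
log_values = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
buffer = io.StringIO()
write_expression_csv(buffer, genes, [zscore(row) for row in log_values])
print(buffer.getvalue())
```

To write an actual file, pass an open file handle (opened with newline="") instead of the in-memory buffer.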
Cell centroids
Cell centroids in physical space. A space-delimited txt file with two columns: column 1 is the x-coordinate and column 2 is the y-coordinate. Coordinates should be floats. Example:
1632.02 -1305.7
1589.47 -669.51
1539.89 -1185.9
1513.94 -710.24
1477.85 -763.87
1304.03 -1142.4
1291.34 -957.71
1282.49 -1725.4
1218.73 -806.56
1192.59 -1800.1
1202.12 -1886
...
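Before loading a centroid file, it can help to confirm that every row really has two numeric columns. The sketch below assumes nothing beyond the format described above; read_coordinates is a hypothetical helper name, and skipping blank lines is an assumption made for convenience.

```python
def read_coordinates(lines):
    """Parse space-delimited 'x y' rows into a list of (x, y) float tuples."""
    coords = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        # float() raises ValueError if a field is not numeric.
        coords.append((float(fields[0]), float(fields[1])))
    return coords


# Three rows taken from the example above.
sample = ["1632.02 -1305.7", "1589.47 -669.51", "1202.12 -1886"]
coords = read_coordinates(sample)
print(len(coords), coords[0])
```

For a file on disk, an open file handle can be passed directly, since iterating over it yields lines.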
UMAP/tSNE cell coordinates
Cell coordinates in the dimension-reduced space (tSNE or UMAP). A space-delimited txt file with two columns: column 1 is the x-coordinate and column 2 is the y-coordinate. Coordinates should be floats, and the coordinates in each dimension should be rescaled to the range (-20, +20). Example:
3.54296041397065 16.2333477323851
4.12680112933669 17.8600391976742
2.70956840808696 18.0771595383443
-19.7267926282317 14.242800138732
5.65662950194794 18.1156365744853
-18.1445481952583 12.4039519560168
3.7023312646684 16.9553451866607
-18.1526517148365 13.7815003752055
6.12198670808932 19.2065944991145
...
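The text does not say how the rescaling should be done. One simple possibility, shown here purely as an assumption, is a linear min-max rescaling applied to each dimension separately; note that it maps the extreme points to exactly -20 and +20. The function names are hypothetical.

```python
def rescale(values, low=-20.0, high=20.0):
    """Linearly map values so that min -> low and max -> high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Degenerate case: all values equal, place them at the midpoint.
        return [(low + high) / 2.0 for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


def rescale_embedding(coords):
    """Rescale the x and y dimensions of an embedding independently."""
    xs = rescale([x for x, _ in coords])
    ys = rescale([y for _, y in coords])
    return list(zip(xs, ys))


# Made-up embedding with three cells.
embedding = [(0.0, 5.0), (10.0, 15.0), (5.0, 10.0)]
rescaled = rescale_embedding(embedding)
for x, y in rescaled:
    print(f"{x} {y}")  # one space-delimited row per cell
```

Rescaling each dimension independently changes the aspect ratio of the embedding; if that matters, a single shared scale factor for both dimensions is an alternative.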
Clustering annotations
This can consist of multiple annotation sets. Each annotation set is two files:
  • *.txt: a single column of integer cluster membership. Each row is a cell. Example:
    2
    4
    1
    3
    1
    1
    2
    5
    2
    1
    ...
    
  • *.annot: a mapping of cluster membership to a textual label. Consists of two columns: column 1 is the cluster ID and column 2 is the text label. The columns are separated by whitespace (the example below uses a tab, and labels may contain spaces). Example:
    1	Excitatory neuron
    2	Interneuron
    3	Microglia
    4	Astrocyte
    5	Endothelial
    6	Oligodendrocyte
    7	Neuroblast
    8	Neural Stem
    9	Ependymal
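If the cluster labels start out as one text label per cell, both files of an annotation set can be derived from that list. This is a sketch under assumptions: encode_annotation is a hypothetical helper, cluster IDs are assigned in order of first appearance, and the *.annot columns are separated by a tab as in the example above.

```python
def encode_annotation(labels):
    """Map per-cell text labels to 1-based integer cluster IDs.

    Returns (membership, ids): membership has one integer per cell,
    ids maps each label to its cluster ID (order of first appearance).
    """
    ids = {}
    membership = []
    for label in labels:
        if label not in ids:
            ids[label] = len(ids) + 1
        membership.append(ids[label])
    return membership, ids


# Made-up labels for four cells.
labels = ["Interneuron", "Astrocyte", "Excitatory neuron", "Interneuron"]
membership, ids = encode_annotation(labels)

# Body of the *.txt file: one cluster ID per row, one row per cell.
txt_body = "".join(f"{m}\n" for m in membership)
# Body of the *.annot file: cluster ID, separator (tab assumed), text label.
annot_body = "".join(f"{cid}\t{label}\n" for label, cid in ids.items())
print(txt_body)
print(annot_body)
```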
    
Gene list
A single-column list of gene names, one gene per row. Names must not contain spaces. Example:
9430091e24rik
A1cf
A230056j06rik
A3galt2
A430090l17rik
A4galt
A4gnt
Aaas
Aacs
...
Annotation list
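A list of annotations with two space-separated columns. Column 1: name of the *.txt file. Column 2: name of the annotation set, which identifies the matching *.annot file (in the example below, it is the *.txt file name without the _annot_information.txt suffix). Example: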
cell_types_annot_information.txt cell_types
leiden_clus_annot_information.txt leiden_clus
sub_leiden_clus_select_annot_information.txt sub_leiden_clus_select
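The lines of the annotation list follow a regular pattern, so they can be generated from the annotation names. The sketch below assumes the naming convention seen in the example (a name followed by _annot_information.txt); annotation_list_lines is a hypothetical helper name.

```python
def annotation_list_lines(names, suffix="_annot_information.txt"):
    """Build one '<file name> <annotation name>' line per annotation set."""
    return [f"{name}{suffix} {name}" for name in names]


# Annotation names taken from the example above.
names = ["cell_types", "leiden_clus", "sub_leiden_clus_select"]
lines = annotation_list_lines(names)
print("\n".join(lines))
```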


1.2 Assumptions
  • The above files assume that there is only one field of view (FOV), or that the input files have been stitched into a single FOV.
  • They also assume that no images are involved.


1.3 Sample package
Download the sample package: Mouse_cortex_viewer.zip. Its contents:
-rw-r--r-- 1 qzhu qzhu    18519 Mar 13  2020 umap_umap_dim_coord.txt
-rw-r--r-- 1 qzhu qzhu     8834 Mar 13  2020 total_expr_num_annot_information.txt
-rw-r--r-- 1 qzhu qzhu     1254 Mar 13  2020 sub_leiden_clus_select_annot_information.txt
-rw-r--r-- 1 qzhu qzhu       75 Mar 13  2020 sub_leiden_clus_select_annot_information.annot
-rw-r--r-- 1 qzhu qzhu       92 Mar 13  2020 offset_file.txt
-rw-r--r-- 1 qzhu qzhu     1046 Mar 13  2020 leiden_clus_annot_information.txt
-rw-r--r-- 1 qzhu qzhu       32 Mar 13  2020 leiden_clus_annot_information.annot
-rw-r--r-- 1 qzhu qzhu     1046 Mar 13  2020 HMRF_2_k9_b.32_annot_information.txt
-rw-r--r-- 1 qzhu qzhu       36 Mar 13  2020 HMRF_2_k9_b.32_annot_information.annot
-rw-r--r-- 1 qzhu qzhu    63376 Mar 13  2020 giotto_gene_ids.txt
-rw-r--r-- 1 qzhu qzhu     4599 Mar 13  2020 giotto_cell_ids.txt
-rw-r--r-- 1 qzhu qzhu     8170 Mar 13  2020 centroid_locations.txt
-rw-r--r-- 1 qzhu qzhu     1254 Mar 13  2020 cell_types_annot_information.txt
-rw-r--r-- 1 qzhu qzhu      149 Mar 13  2020 cell_types_annot_information.annot
-rw-r--r-- 1 qzhu qzhu       48 Mar 13  2020 annotation_num_list.txt
-rw-r--r-- 1 qzhu qzhu      210 Mar 13  2020 annotation_list.txt
-rw-r--r-- 1 qzhu qzhu 34455952 Mar 13  2020 giotto_expression.csv
Explanation:
  • *_annot_information.txt and *_annot_information.annot: Annotation sets
  • giotto_gene_ids.txt: gene list
  • annotation_list.txt: annotation list
  • umap_umap_dim_coord.txt: UMAP coordinates
  • centroid_locations.txt: physical coordinates (cells)
  • giotto_expression.csv: gene expression matrix
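Since each per-cell file has one row per cell, a quick cross-check of row counts can catch mismatched inputs early. The sketch below is not part of Giotto Viewer; check_inputs is a hypothetical helper, and the requirement that the gene list has as many entries as the expression matrix has rows is an assumption.

```python
def nonblank_lines(text):
    """Return the lines of a file body that are not empty."""
    return [line for line in text.splitlines() if line.strip()]


def check_inputs(expression_csv, centroids, embedding, clusters, gene_list):
    """Return a list of problems; an empty list means the counts agree."""
    problems = []
    header, *gene_rows = nonblank_lines(expression_csv)
    n_cells = len(header.split(",")) - 1  # first header field is the gene column
    per_cell = {"centroids": centroids, "embedding": embedding, "clusters": clusters}
    for name, text in per_cell.items():
        n_rows = len(nonblank_lines(text))
        if n_rows != n_cells:
            problems.append(f"{name}: {n_rows} rows, expected {n_cells} cells")
    n_genes = len(nonblank_lines(gene_list))
    if n_genes != len(gene_rows):
        problems.append(
            f"gene list: {n_genes} genes, expression has {len(gene_rows)} rows"
        )
    return problems


# Made-up file bodies for 3 cells and 2 genes.
expression = "gene,1,2,3\ngenea,0.1,0.2,0.3\ngeneb,1.0,-1.0,0.0\n"
ok = check_inputs(expression, "1 2\n3 4\n5 6\n", "0 0\n1 1\n2 2\n", "1\n2\n1\n", "genea\ngeneb\n")
bad = check_inputs(expression, "1 2\n3 4\n", "0 0\n1 1\n2 2\n", "1\n2\n1\n", "genea\n")
print(ok)
print(bad)
```

With real files, read each file's contents (for example with pathlib.Path.read_text) and pass the resulting strings.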


2. Images


2.1 Visium image
Use the full-size raw image (typically 200-500 MB), not the reduced-size image.

2.2 Multiple images, or multiple fields of view
Multi-channel TIFF format images are supported. No other format is supported at this time. Please leave images unstitched (i.e. 1 FOV 1 image).

If multiple images are supplied, we suggest following the workflow for multiple images (see the Advanced section of the Setting up dataset tutorial).

We suggest naming the files consistently, like so:
segmentation_staining_1_MMStack_Pos0.ome.tif
segmentation_staining_1_MMStack_Pos1.ome.tif
segmentation_staining_1_MMStack_Pos2.ome.tif
segmentation_staining_1_MMStack_Pos3.ome.tif
segmentation_staining_1_MMStack_Pos4.ome.tif
An offset file should be provided to specify the stitching information:
Pos0.x  Pos0.y  0   0
Pos1.x  Pos1.y  1654.97 0
Pos2.x  Pos2.y  Pos1.x+1750.75  0
Pos3.x  Pos3.y  Pos2.x+1674.35  0
Pos4.x  Pos4.y  Pos3.x+675.5    1438.02
A sample package: cortex.tar.gz.
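One practical benefit of the consistent naming suggested above is that the FOV number can be read back from each file name. The sketch below assumes the file names contain "Pos" followed by the FOV number, as in the examples; fov_index is a hypothetical helper. Sorting by the parsed number avoids the lexical ordering problem where Pos10 would sort before Pos2.

```python
import re


def fov_index(file_name):
    """Return the integer that follows 'Pos' in a file name."""
    match = re.search(r"Pos(\d+)", file_name)
    if match is None:
        raise ValueError(f"no 'Pos<number>' found in {file_name!r}")
    return int(match.group(1))


names = [
    "segmentation_staining_1_MMStack_Pos10.ome.tif",  # hypothetical extra FOV
    "segmentation_staining_1_MMStack_Pos2.ome.tif",
    "segmentation_staining_1_MMStack_Pos0.ome.tif",
]
ordered = sorted(names, key=fov_index)
for name in ordered:
    print(fov_index(name), name)
```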

2.3 Cell segmentations (multiple FOVs)
Cell segmentations should be provided as Roi Zip files, one per FOV. Each Roi Zip contains a list of *.roi files, and each *.roi file is the segmentation of one cell.

For example, for the above 5 FOVs, there will be 5 Roi Zip files:
-rw-r--r-- 1 zqian gcproj   102463 Jun 28 16:27 RoiSet_Pos0_real.zip
-rw-r--r-- 1 zqian gcproj    90112 Jun 28 16:27 RoiSet_Pos1_real.zip
-rw-r--r-- 1 zqian gcproj    72043 Jun 28 16:27 RoiSet_Pos2_real.zip
-rw-r--r-- 1 zqian gcproj    78095 Jun 28 16:27 RoiSet_Pos3_real.zip
-rw-r--r-- 1 zqian gcproj    73859 Jun 28 16:27 RoiSet_Pos4_real.zip
A sample package: cortex.tar.gz.
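Because each *.roi entry corresponds to one cell, counting the entries in a Roi Zip gives the number of segmented cells in that FOV. The sketch below uses only Python's standard zipfile module and does not parse the *.roi contents; count_rois is a hypothetical helper, and the demo archive is built in memory with empty placeholder entries.

```python
import io
import zipfile


def count_rois(zip_source):
    """Count the *.roi entries in a Roi Zip (path or file-like object)."""
    with zipfile.ZipFile(zip_source) as archive:
        return sum(1 for name in archive.namelist() if name.lower().endswith(".roi"))


# Build a small in-memory archive with placeholder entries for the demo.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("cell_0001.roi", b"")
    archive.writestr("cell_0002.roi", b"")
    archive.writestr("notes.txt", b"not a segmentation")
demo.seek(0)

n_cells = count_rois(demo)
print(n_cells)
```

For a real dataset, pass the path of each archive, for example count_rois("RoiSet_Pos0_real.zip").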