[Experimental]

This function reads a folder of .tsv (or .parquet) files containing TCRalpha and TCRbeta read counts for individual wells and converts the data from some or all of these wells into sparse matrices (well x clone) of read counts, along with data frames of metadata for each clone.

This output can then be quickly loaded with load_well_counts_binary() and used in the future as input to the plot_tshell() function.

Usage

write_well_data_to_binary(
  folder_in,
  folder_out,
  prefix,
  wells = get_well_subset(1:16, 1:24),
  in_file_type = c(".tsv", ".parquet"),
  out_file_type = c(".rds", ".h5"),
  well_pos = 3,
  parallel = FALSE,
  nproc = data.table::getDTthreads(),
  columns = NULL,
  max_files = Inf,
  periods_to_underscores = TRUE,
  to_sparse_matrix = TRUE
)

Arguments

folder_in

the directory containing the per-well read count files (".tsv" by default; see in_file_type)

folder_out

the directory to write the output data to (if this does not exist, it will be created)

prefix

a prefix with the sample name that will be prepended to the output file names

wells

a vector of the wells corresponding to the sample (default is all wells on the 384-well plate)
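
For orientation, a 384-well plate has 16 rows (A to P) and 24 columns, which matches the 1:16 and 1:24 arguments in the default. The sketch below builds well IDs of that shape in base R. The ID format (row letter followed by column number, such as "B5") is taken from the well_pos description; whether get_well_subset() returns exactly this vector is an assumption, so treat this as an illustration of the plate layout only.

```r
# Hypothetical stand-in for a well subset: row letters A-P crossed with columns 1-24.
rows <- LETTERS[1:16]
cols <- 1:24
all_wells <- as.vector(outer(rows, cols, paste0))  # "A1", "B1", ..., "P24"

# A smaller block of the plate: rows A-H and columns 1-12.
sub_wells <- as.vector(outer(LETTERS[1:8], 1:12, paste0))

length(all_wells)  # 384
length(sub_wells)  # 96
```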

in_file_type

the type of files in folder_in, either ".tsv" or ".parquet" (default is ".tsv")

out_file_type

the type of output file for the TCRalpha and TCRbeta sparse read count matrices, either ".rds" or ".h5" (default is ".rds")

well_pos

the position of the well ID (e.g. "B5") in the file names, counting fields separated by underscores. For example, file names in which the well ID is the third underscore-separated field, ending in "<well_id>_TCRalpha.tsv", would use well_pos=3. (default is 3)
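
A minimal sketch of how a well ID could be read from a file name, assuming the name is split on underscores and the field at position well_pos is taken. The file name here is hypothetical (only the "<well_id>_TCRalpha.tsv" ending comes from the description above), and the function's internal parsing may differ.

```r
# Hypothetical file name with the well ID as the third underscore-separated field.
file_name <- "plate1_sampleA_B5_TCRalpha.tsv"

# Split on literal underscores and pick the field at position well_pos.
well_pos <- 3
fields <- strsplit(file_name, "_", fixed = TRUE)[[1]]
well_id <- fields[well_pos]

well_id  # "B5"
```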

parallel

whether to use multiple processors when loading the data (default is FALSE)

nproc

number of processors to use when parallel is TRUE (default is data.table::getDTthreads())

columns

columns of files to read (default is all columns)

max_files

(for testing purposes) the maximum number of files to load (default is all files)

periods_to_underscores

whether to convert periods to underscores in well file names before determining the well name from well_pos. Default is TRUE for compatibility with some pre-existing datasets.
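
The effect of this option can be sketched in base R. The helper get_field() and the file name below are hypothetical and only mimic the behavior described above (convert periods, split on underscores, take the field at well_pos); they are not the package's own code.

```r
# Hypothetical helper: optionally turn periods into underscores, then split.
get_field <- function(file_name, well_pos, periods_to_underscores = TRUE) {
  if (periods_to_underscores) {
    file_name <- gsub(".", "_", file_name, fixed = TRUE)
  }
  strsplit(file_name, "_", fixed = TRUE)[[1]][well_pos]
}

# Hypothetical legacy-style name that uses periods between the leading fields.
legacy_name <- "plate1.sampleA.B5_TCRalpha.tsv"

with_conversion    <- get_field(legacy_name, 3, TRUE)   # "B5"
without_conversion <- get_field(legacy_name, 3, FALSE)  # NA: only two fields exist
```

With conversion off, the name splits into just two fields, so position 3 does not exist and the well ID cannot be recovered.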

to_sparse_matrix

whether to write the read counts as a sparse matrix (TRUE) or as a long data frame (FALSE); default is TRUE
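
To illustrate the difference between the two representations, the sketch below converts a small long-format table into a well x clone sparse matrix with the Matrix package. The data, column names, and clone labels are made up for illustration, and the function's actual output layout is not shown in this page, so this is only an assumption about the general shape.

```r
library(Matrix)

# Hypothetical long-format counts: one row per (well, clone) pair with reads > 0.
long_counts <- data.frame(
  well  = c("A1", "A1", "B5", "C3"),
  clone = c("clone_1", "clone_2", "clone_1", "clone_3"),
  reads = c(12, 3, 7, 40),
  stringsAsFactors = FALSE
)

wells  <- sort(unique(long_counts$well))
clones <- sort(unique(long_counts$clone))

# Well x clone sparse matrix; pairs absent from the long table are implicit zeros.
count_mat <- sparseMatrix(
  i = match(long_counts$well, wells),
  j = match(long_counts$clone, clones),
  x = long_counts$reads,
  dims = c(length(wells), length(clones)),
  dimnames = list(wells, clones)
)

dim(count_mat)               # 3 3
count_mat["B5", "clone_1"]   # 7
```

The sparse form stores only the nonzero entries, which matters when most clones are absent from most wells.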

Value

The function writes the data to the output folder and then returns NULL.