This function performs rarefaction on a phyloseq object by randomly sub-sampling sequences within each sample, without replacement, for a user-specified number of iterations. Samples with fewer sequences than the specified depth_level are discarded.
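
Sub-sampling without replacement can be sketched for a single sample in base R. This is a minimal illustration only, not the package's implementation (which, per the See also section, relies on vegan::rrarefy()); the helper name rarefy_one and the toy counts are hypothetical.

```r
# Hypothetical helper: rarefy one sample's count vector to a fixed depth
# by drawing individual reads without replacement.
rarefy_one <- function(counts, depth) {
  if (sum(counts) < depth) {
    return(NULL)  # too shallow: such a sample would be discarded
  }
  # Expand counts into one taxon index per read, e.g. c(3, 1) -> 1 1 1 2
  reads <- rep(seq_along(counts), times = counts)
  # Draw `depth` reads without replacement (index-based to avoid sample() pitfalls)
  drawn <- reads[sample.int(length(reads), size = depth)]
  # Count drawn reads per taxon; taxa with no reads stay in the vector as zeros
  out <- tabulate(drawn, nbins = length(counts))
  names(out) <- names(counts)
  out
}

set.seed(1)  # arbitrary seed, used only to make this sketch repeatable
toy <- c(OTU_1 = 50, OTU_2 = 30, OTU_3 = 20)  # made-up counts, 100 reads in total
rare <- rarefy_one(toy, depth = 40)
sum(rare)                      # equals the requested depth
rarefy_one(toy, depth = 500)   # NULL, since only 100 reads are available
```

Repeating such a draw num_iter times per sample gives one rarefied table per iteration.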

Usage

multi_rarefy(
  physeq_obj,
  depth_level,
  num_iter = 100,
  .as = "list",
  set_seed = NULL
)

Arguments

physeq_obj

A phyloseq object containing an OTU/ASV table.

depth_level

An integer specifying the sequencing depth (number of sequences per sample) to which samples should be rarefied.

num_iter

An integer specifying the number of iterations to perform for rarefaction.

.as

A character string indicating whether to return the results as a 3D array or as a list of data frames. If "array", returns a 3D array with dimensions (samples x taxa x iterations). If "list", returns a list of data frames, one for each iteration, with samples as rows and taxa as columns. (default = "list")
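
The two return shapes carry the same numbers in different layouts. As a sketch with made-up data (not actual multi_rarefy() output; all object names are illustrative), a list of per-iteration data frames can be stacked into a samples x taxa x iterations array in base R:

```r
# Two mock "iterations": samples as rows, taxa as columns
iter_list <- list(
  data.frame(OTU_1 = c(3, 1), OTU_2 = c(1, 2), OTU_3 = c(0, 1),
             row.names = c("s1", "s2")),
  data.frame(OTU_1 = c(2, 2), OTU_2 = c(2, 1), OTU_3 = c(0, 1),
             row.names = c("s1", "s2"))
)

# Stack the matrices along a third dimension: samples x taxa x iterations
iter_array <- simplify2array(lapply(iter_list, as.matrix))

dim(iter_array)                 # 2 3 2
iter_array["s1", "OTU_1", 2]    # same value as iter_list[[2]]["s1", "OTU_1"]
```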

set_seed

An optional integer to set the random seed for reproducibility (default = NULL).

Value

Depending on .as, either a list of data frames (one per iteration, with samples as rows and taxa as columns) or a 3D array with dimensions samples x taxa x iterations. The values are the rarefied sequence counts for each iteration. Samples with fewer than depth_level sequences are discarded.

See also

update_otu_table() for updating the OTU table in a phyloseq object and vegan::rrarefy() for the underlying rarefaction method used in this function.

Examples

library(BRCore)


# Example rarefaction (10 iterations to keep the example fast)
otu_table_rare <- multi_rarefy(
  physeq_obj = bcse,
  depth_level = 1000,
  num_iter = 10,
  .as = "list",
  set_seed = 7642
)
#> 
#> ── Rarefaction iterations starting... ──────────────────────────────────────────
#> 
#> ── Input Validation ──
#> 
#>  Input phyloseq object is valid!
#>  Seed: 7642
#>  Input (matrix/df dim): 47 samples x 2861 taxa
#>  Rarefaction depth: 1000
#>  Iterations: 10
#>  taxa_are_rows: TRUE
#>  OTU matrix/df rownames head: bcse50, bcse69, bcse73, bcse191, bcse82, bcse102
#>  OTU matrix/df colnames head: OTU_427, OTU_11, OTU_253, OTU_148, OTU_3, OTU_78
#>  Row sums summary: Min=1193, Max=107643, Median=25209
#> 
#> ── Rarefaction Results ──
#> 
#> ── Sample Removal 
#> ! 3 samples removed (depth < 1000)
#> ! Samples removed: "bcse108, bcse105, bcse110"
#> 
#> ── Taxa Removal 
#>  No taxa removed.
#> ! Taxa are not removed across iterations to maintain consistent dimensions. 
#> Downstream analyses should handle zero-abundance taxa appropriately.
#> 
#> ── Data Sparsity 
#>  Returning list of data frames for each iteration.
#> • Rarefied matrix (across 10 iterations):
#>   • Min: 130570 zeros (97.1% sparsity) out of 134467 entries
#>   • Max: 130663 zeros (97.17% sparsity) out of 134467 entries
#>   • Avg: 130615.7 zeros (97.14% sparsity) out of 134467 entries
#> 
#> ── Final Data Dimensions 
#>  Output: 10 iterations with 50 unique samples
#> • Samples per iteration:
#>   • Min: 47
#>   • Max: 47
#> • Non-zero taxa per iteration:
#>   • Min: 1039
#>   • Max: 1103
#>   • Avg: 1073

rowSums(otu_table_rare[[1]])
#>  bcse50  bcse69  bcse73 bcse191  bcse82 bcse102 bcse111  bcse86  bcse97  bcse88 
#>    1000    1000    1000    1000    1000    1000    1000    1000    1000    1000 
#> bcse104  bcse81  bcse77  bcse78  bcse96  bcse66  bcse57  bcse75 bcse101 bcse192 
#>    1000    1000    1000    1000    1000    1000    1000    1000    1000    1000 
#>  bcse98 bcse106  bcse76 bcse103  bcse51  bcse63  bcse68 bcse109  bcse65  bcse58 
#>    1000    1000    1000    1000    1000    1000    1000    1000    1000    1000 
#> bcse107  bcse62  bcse59  bcse99  bcse49  bcse85  bcse79  bcse72  bcse80  bcse71 
#>    1000    1000    1000    1000    1000    1000    1000    1000    1000    1000 
#>  bcse95  bcse67  bcse61 bcse100  bcse87  bcse83  bcse70 
#>    1000    1000    1000    1000    1000    1000    1000
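
Because every iteration keeps the same samples and taxa, the per-iteration tables can be combined element-wise. The following is a minimal sketch of averaging counts across iterations; it uses a small mock list in place of otu_table_rare so that it runs without BRCore, and the object names are illustrative.

```r
# Mock stand-in for a list returned with .as = "list" (depth of 4 reads per sample)
mock_rare <- list(
  data.frame(OTU_1 = c(3, 1), OTU_2 = c(1, 3), row.names = c("s1", "s2")),
  data.frame(OTU_1 = c(1, 3), OTU_2 = c(3, 1), row.names = c("s1", "s2"))
)

# Element-wise sum of all iterations, divided by the number of iterations
mean_counts <- Reduce(`+`, lapply(mock_rare, as.matrix)) / length(mock_rare)

mean_counts
rowSums(mean_counts)  # each sample still sums to the rarefaction depth
```

Averaged counts are generally not integers, so downstream tools that expect count data may need rounding or a different summary.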