Discovering Regulated Metabolite Families
Khabat Vahabi
Leibniz Institute of Plant Biochemistry
Steffen Neumann
Leibniz Institute of Plant Biochemistry
sneumann@ipb-halle.de
Abstract
Description of the MetFamily R package
Introduction
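This vignette describes the analysis scenario and the data, which relate to the study MTBLS297.
The method is described in the accompanying paper [1] and implemented in the R package MetFamily.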
Loading the data
First, we load the data and summarise it.
# Locate the example files shipped with MetFamily
filePeakMatrixPath <- system.file("extdata/showcase/Metabolite_profile_showcase.txt", package = "MetFamily")
fileSpectra <- system.file("extdata/showcase/MSMS_library_showcase.msp", package = "MetFamily")
fileAnnotation <- system.file("extdata/testdata/canopus/canopusShort.txt", package = "MetFamily")
parameterSetPath <- system.file("extdata/testdata/parameterSet.RData", package = "MetFamily")

# Load the stored parameter set (provides the object `parameterSet` used below)
load(parameterSetPath)

# Convert the peak matrix and the MS/MS library into a project object
resultObj <- convertToProjectFile(filePeakMatrixPath,
                                  fileSpectra,
                                  parameterSet = parameterSet,
                                  progress = FALSE)
## [1] "Parsing MS/MS file..."
## [1] "MS/MS file: Read file"
## [1] "MS/MS file: Parse"
## [1] "MS/MS file: Assemble spectra"
## [1] "MS/MS file: Box"
## [1] "MS/MS file postprocessing"
## [1] "MS/MS file boxing"
## [1] "Parsing MS/MS file ready"
## [1] "Parsing MS1 file..."
## [1] "Parsing MS1 file content..."
## [1] "Precursor deisotoping..."
## [1] "Precursor deisotoping 582 / 5823"
## [1] "Precursor deisotoping 1164 / 5823"
## [1] "Precursor deisotoping 1746 / 5823"
## [1] "Precursor deisotoping 2328 / 5823"
## [1] "Precursor deisotoping 2910 / 5823"
## [1] "Precursor deisotoping 3492 / 5823"
## [1] "Precursor deisotoping 4074 / 5823"
## [1] "Precursor deisotoping 4656 / 5823"
## [1] "Precursor deisotoping 5238 / 5823"
## [1] "Precursor deisotoping 5820 / 5823"
## [1] "Boxing..."
## [1] "Postprocessing matrix..."
## [1] "Building fragment mzFragmentGroups..."
## [1] "Fragment grouping preprocessing..."
## [1] "Fragment grouping preprocessing ready"
## [1] "Fragment grouping"
##
## [1] "Fragment grouping ready (0.588882207870483s)"
## [1] "Fragment group postprocessing"
## [1] "Fragment group postprocessing: 1394 / 13943"
## [1] "Fragment group postprocessing: 2788 / 13943"
## [1] "Fragment group postprocessing: 4182 / 13943"
## [1] "Fragment group postprocessing: 5576 / 13943"
## [1] "Fragment group postprocessing: 6970 / 13943"
## [1] "Fragment group postprocessing: 8364 / 13943"
## [1] "Fragment group postprocessing: 9758 / 13943"
## [1] "Fragment group postprocessing: 11152 / 13943"
## [1] "Fragment group postprocessing: 12546 / 13943"
## [1] "Fragment group postprocessing: 13940 / 13943"
## [1] "Fragment group postprocessing ready (1.50495147705078s)"
## [1] "Boxing to matrix"
## [1] "Fragment group deisotoping"
## [1] "Fragment group deisotoping 1394 / 13943"
## [1] "Fragment group deisotoping 2788 / 13943"
## [1] "Fragment group deisotoping 4182 / 13943"
## [1] "Fragment group deisotoping 5576 / 13943"
## [1] "Fragment group deisotoping 6970 / 13943"
## [1] "Fragment group deisotoping 8364 / 13943"
## [1] "Fragment group deisotoping 9758 / 13943"
## [1] "Fragment group deisotoping 11152 / 13943"
## [1] "Fragment group deisotoping 12546 / 13943"
## [1] "Fragment group deisotoping 13940 / 13943"
## [1] "Fragment group deisotoping ready (5.75593256950378s)"
## [1] "Fragment group boxing"
## [1] "Building fragment mzFragmentGroups ready"
## 'as(<dgCMatrix>, "dgTMatrix")' is deprecated.
## Use 'as(., "TsparseMatrix")' instead.
## See help("Deprecated") and help("Matrix-deprecated").
## [1] "Boxing..."
## [1] "Ready"
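`system.file()` returns an empty string instead of raising an error when a file is not found, so a mistyped path only surfaces later inside the parser. The following is a minimal sketch of an up-front check. The helper `requireFile()` is hypothetical and not part of MetFamily, and the example uses files from the base package so that it runs without the showcase data.

```r
# Hypothetical helper (not part of MetFamily): stop early if a path returned by
# system.file() is empty or does not point to an existing file.
requireFile <- function(path, label) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("File not found for '", label, "'", call. = FALSE)
  }
  invisible(path)
}

# system.file() yields "" for a file that does not exist in the package
missingPath <- system.file("extdata/doesNotExist.txt", package = "base")
isMissing <- !nzchar(missingPath)

# A file that ships with every R installation passes the check
okPath <- requireFile(system.file("DESCRIPTION", package = "base"),
                      "base DESCRIPTION")
```

Alternatively, calling `system.file(..., mustWork = TRUE)` makes R raise the error itself.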
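# Serialise the sparse matrix (rows, columns, values) into project-file lines
lines <- sparseMatrixToString(matrixRows = resultObj$matrixRows,
                              matrixCols = resultObj$matrixCols,
                              matrixVals = resultObj$matrixVals,
                              parameterSet = parameterSet)
# Read the project data back from these lines
dataList0 <- readProjectData(fileLines = lines, progress = FALSE)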
## [1] "Preprocessing 1528 / 2414"
## [1] "Features"
## [1] "Feature postprocessing"
## [1] "Coloring"
## [1] "Coloring init"
## [1] "Coloring naming functions"
## [1] "Coloring gather data"
## [1] "Coloring matrix"
## [1] "Coloring box"
## [1] "Feature annotations"
## [1] "Boxing"
## [1] "Ready"
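The `matrixRows`, `matrixCols` and `matrixVals` arguments suggest that the fragment matrix is passed around as row/column/value triplets. The sketch below illustrates that representation with the Matrix package on made-up numbers. It does not reproduce MetFamily's internal format, and it uses the `as(., "TsparseMatrix")` coercion that the deprecation message in the conversion output recommends.

```r
library(Matrix)

# Toy data (made up): three non-zero intensities in a 3 x 4 matrix
rows <- c(1, 2, 3)
cols <- c(1, 3, 4)
vals <- c(10.5, 3.2, 7.0)

# Build a compressed sparse matrix (class dgCMatrix) from the triplets
m <- sparseMatrix(i = rows, j = cols, x = vals, dims = c(3, 4))

# Convert to triplet form using the non-deprecated coercion
mT <- as(m, "TsparseMatrix")

# Slots i and j are 0-based, so add 1 to recover R's 1-based indices
triplets <- data.frame(row = mT@i + 1L, col = mT@j + 1L, value = mT@x)
triplets <- triplets[order(triplets$row, triplets$col), ]
```

Only the non-zero entries are stored, which keeps memory use low when most fragment intensities are absent.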
# Add the QFeatures object and the annotation file to the data list
dataList <- add_qfeatures(dataList0, qfeatures = resultObj$qfeatures, fileAnnotation)
## [1] "Merging by: featureId and Alignment ID"
# Note (gp): up to this point, the PR reproduces the previous behavior with a simpler workflow
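The message "Merging by: featureId and Alignment ID" indicates that feature data and annotations are joined on two differently named key columns. The following is a minimal base R sketch of such a join on made-up data. The table contents and the `Class` column are hypothetical, and this is not the code inside `add_qfeatures()`.

```r
# Made-up feature table keyed by 'featureId'
features <- data.frame(featureId = c(1, 2, 3),
                       mz = c(181.07, 255.23, 303.05))

# Made-up annotation table keyed by 'Alignment ID' (a non-syntactic column name)
annotation <- data.frame(c(1, 3),
                         c("Flavonoids", "Terpenoids"),
                         stringsAsFactors = FALSE)
names(annotation) <- c("Alignment ID", "Class")

# Left join: keep all features, fill missing annotations with NA
merged <- merge(features, annotation,
                by.x = "featureId", by.y = "Alignment ID",
                all.x = TRUE)
```

With `all.x = TRUE`, features without an annotation are kept and receive `NA` in the annotation columns instead of being dropped.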
PCA
The figure below shows a PCA of the MS1 data.
## [1] "######################################################################################"
## [1] "PCA (Principal Component Analysis)"
## [1] "Analysis: pcaMethods"
[Figure: PCA of MS1.]
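The log output shows that the vignette computes its PCA via pcaMethods. As a rough illustration of what a PCA on an MS1 abundance matrix involves, the sketch below uses base R's `prcomp()` instead, on simulated data with samples in rows and features in columns. The group structure and all numbers are made up.

```r
set.seed(1)

# Simulated abundances: 6 samples (rows) x 20 features (columns)
abundance <- matrix(rnorm(6 * 20, mean = 10), nrow = 6,
                    dimnames = list(paste0("sample", 1:6),
                                    paste0("feature", 1:20)))
# Shift the first five features in samples 4-6 to mimic a group effect
abundance[4:6, 1:5] <- abundance[4:6, 1:5] + 5

# Centre and scale each feature, then compute the principal components
pca <- prcomp(abundance, center = TRUE, scale. = TRUE)

scores <- pca$x[, 1:2]                     # sample coordinates on PC1 and PC2
explained <- pca$sdev^2 / sum(pca$sdev^2)  # fraction of variance per component
```

Plotting `scores` with one colour per sample group gives a scores plot; `explained` can be used to label the axes with the variance covered by each component.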
## [1] "entering the line ...2686"
## Warning in annotatedPoints & selectedPoints: longer object length is not a
## multiple of shorter object length
## Warning in (!annotatedPoints) & selectedPoints: longer object length is not a
## multiple of shorter object length
## Warning in annotatedPoints & (!selectedPoints): longer object length is not a
## multiple of shorter object length
## Warning in (!annotatedPoints) & (!selectedPoints): longer object length is not
## a multiple of shorter object length
[Figure: PCA of MS1.]
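The warnings printed with the PCA output are what R emits when `&` is applied to logical vectors whose lengths are not multiples of each other: the shorter vector is recycled, which can silently mark the wrong points. The sketch below reproduces the effect and shows a defensive length check. The variable names mirror the warning text, the values are made up, and `combineMasks()` is a hypothetical helper.

```r
annotatedPoints <- c(TRUE, FALSE, TRUE, FALSE, TRUE)  # length 5
selectedPoints  <- c(TRUE, TRUE)                      # length 2

# Capture the recycling warning instead of printing it
msg <- tryCatch(
  {
    annotatedPoints & selectedPoints
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

# Defensive variant: refuse to combine masks of different lengths
combineMasks <- function(a, b) {
  stopifnot(length(a) == length(b))
  a & b
}
both <- combineMasks(annotatedPoints, c(TRUE, TRUE, FALSE, FALSE, TRUE))
```

Checking the lengths before combining the masks turns a silent recycling problem into an explicit error at the point where it originates.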
References
1. Treutler H, Tsugawa H, Porzel A, Gorzolka K, Tissier A, Neumann S, Balcke GU: Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies. Anal Chem 2016.
Appendix
Session info
## R version 4.3.3 (2024-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Berlin
## tzcode source: system (glibc)
##
## attached base packages:
## [1] tools grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] MetFamily_0.99.5 xcms_4.5.1 BiocParallel_1.36.0
## [4] plotly_4.10.4 slam_0.1-54 plotrix_3.8-4
## [7] matrixStats_1.4.1 mzR_2.41.3 Rcpp_1.0.13
## [10] stringr_1.5.1 searchable_0.4.0 gdata_3.0.1
## [13] cowplot_1.1.3 cba_0.2-25 proxy_0.4-27
## [16] pcaMethods_1.94.0 Biobase_2.62.0 BiocGenerics_0.48.1
## [19] mixOmics_6.26.0 ggplot2_3.5.1 lattice_0.22-5
## [22] MASS_7.3-60.0.1 FactoMineR_2.11 squash_1.0.9
## [25] Matrix_1.6-5 colourpicker_1.3.0 DT_0.33
## [28] shinybusy_0.3.3 shinyjs_2.1.0 shinyBS_0.61.1
## [31] shiny_1.9.1 htmltools_0.5.8.1 BiocStyle_2.30.0
##
## loaded via a namespace (and not attached):
## [1] splines_4.3.3 later_1.3.2
## [3] bitops_1.0-9 tibble_3.2.1
## [5] preprocessCore_1.64.0 XML_3.99-0.17
## [7] lifecycle_1.0.4 doParallel_1.0.17
## [9] flashClust_1.01-2 MultiAssayExperiment_1.28.0
## [11] magrittr_2.0.3 limma_3.58.1
## [13] sass_0.4.9 rmarkdown_2.28
## [15] jquerylib_0.1.4 yaml_2.3.10
## [17] httpuv_1.6.15 zip_2.3.1
## [19] DBI_1.2.2 MsCoreUtils_1.17.3
## [21] RColorBrewer_1.1-3 multcomp_1.4-25
## [23] abind_1.4-8 zlibbioc_1.48.2
## [25] GenomicRanges_1.54.1 purrr_1.0.2
## [27] AnnotationFilter_1.26.0 RCurl_1.98-1.16
## [29] TH.data_1.1-2 sandwich_3.1-0
## [31] openxlsx2_1.10 GenomeInfoDbData_1.2.11
## [33] IRanges_2.36.0 S4Vectors_0.40.2
## [35] ggrepel_0.9.6 ellipse_0.5.0
## [37] RSpectra_0.16-2 MSnbase_2.31.1
## [39] pkgdown_2.0.7 ncdf4_1.23
## [41] codetools_0.2-19 DelayedArray_0.28.0
## [43] Spectra_1.15.7 tidyselect_1.2.1
## [45] stats4_4.3.3 jsonlite_1.8.9
## [47] iterators_1.0.14 survival_3.5-8
## [49] emmeans_1.10.5 systemfonts_1.0.5
## [51] foreach_1.5.2 progress_1.2.3
## [53] ragg_1.2.7 glue_1.8.0
## [55] rARPACK_0.11-0 gridExtra_2.3
## [57] SparseArray_1.2.4 xfun_0.48
## [59] MatrixGenerics_1.14.0 GenomeInfoDb_1.38.8
## [61] dplyr_1.1.4 withr_3.0.2
## [63] BiocManager_1.30.25 fastmap_1.2.0
## [65] fansi_1.0.6 egg_0.4.5
## [67] digest_0.6.37 R6_2.5.1
## [69] mime_0.12 estimability_1.5.1
## [71] textshaping_0.3.7 colorspace_2.1-1
## [73] gtools_3.9.5 utf8_1.2.4
## [75] tidyr_1.3.1 generics_0.1.3
## [77] data.table_1.16.2 corpcor_1.6.10
## [79] prettyunits_1.2.0 PSMatch_1.6.0
## [81] httr_1.4.7 htmlwidgets_1.6.4
## [83] S4Arrays_1.2.1 scatterplot3d_0.3-44
## [85] pkgconfig_2.0.3 gtable_0.3.6
## [87] impute_1.76.0 MassSpecWavelet_1.68.0
## [89] XVector_0.42.0 bookdown_0.37
## [91] MALDIquant_1.22.3 multcompView_0.1-10
## [93] ProtGenerics_1.37.1 clue_0.3-65
## [95] scales_1.3.0 leaps_3.2
## [97] MetaboCoreUtils_1.11.3 knitr_1.48
## [99] reshape2_1.4.4 coda_0.19-4.1
## [101] cachem_1.1.0 zoo_1.8-12
## [103] parallel_4.3.3 miniUI_0.1.1.1
## [105] mzID_1.40.0 vsn_3.70.0
## [107] desc_1.4.3 pillar_1.9.0
## [109] vctrs_0.6.5 MsFeatures_1.10.0
## [111] promises_1.3.0 xtable_1.8-4
## [113] cluster_2.1.6 evaluate_1.0.1
## [115] mvtnorm_1.3-1 cli_3.6.3
## [117] compiler_4.3.3 rlang_1.1.4
## [119] crayon_1.5.3 QFeatures_1.12.0
## [121] affy_1.80.0 plyr_1.8.9
## [123] fs_1.6.4 stringi_1.8.4
## [125] viridisLite_0.4.2 munsell_0.5.1
## [127] lazyeval_0.2.2 hms_1.1.3
## [129] MsExperiment_1.5.5 statmod_1.5.0
## [131] highr_0.11 SummarizedExperiment_1.32.0
## [133] igraph_2.1.1 memoise_2.0.1
## [135] affyio_1.72.0 bslib_0.8.0
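A listing of this kind is what `sessionInfo()` prints. The following short sketch captures it programmatically, for example to store it alongside analysis results. The output file name is chosen for illustration only.

```r
# Capture the session information as an object and as text lines
si <- sessionInfo()
siText <- capture.output(print(si))

# Write it to a temporary file (file name chosen for illustration only)
outFile <- file.path(tempdir(), "sessionInfo.txt")
writeLines(siText, outFile)

# The R version string is available directly from the object
rVersion <- si$R.version$version.string
```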