输入

该方法有两个强制参数:

位置权重矩阵或位置频率矩阵，存储在TFBSTools包中的PWMatrix、PFMatrix、PWMatrixList或PFMatrixList对象中
一组基因组范围(基因组范围或rangedsummarizeexperiment对象)或一组序列(DNAStringSet, DNAString或简单的字符向量)

如果第二个参数是一组基因组范围，则还需要基因组序列。默认情况下BSgenome.Hsapiens.UCSC.hg19-你必须安装bsgome . hsapiens . ucsc .hg19。如果使用另一个基因组构建，您的物种的适当的BSgenome对象或您的物种的DNAStringSet或FaFile对象都应该传递给基因组论点。

#使用峰值motif_ix_peaks < -matchMotifs(example_motifs,山峰,基因组=“hg19”）#使用摘要实验example_SummarizedExperiment < -SummarizedExperiment（化验=列表（数=矩阵（1，ncol =4，nrow =3.)），rowRanges =山峰)motif_ix_SummarizedExperiment < -matchMotifs(example_motifsexample_SummarizedExperiment,基因组=“hg19”）all.equal（motifMatches(motif_ix_peaks),motifMatches(motif_ix_SummarizedExperiment))

##[1]正确

#使用BSgenomeViewsexample_BSgenomeViews < -BSgenomeViews(BSgenome.Hsapiens.UCSC。hg19山峰)motif_ix_BSgenomeViews < -matchMotifs(example_motifs example_BSgenomeViews)

## Warning in matrix(log2(bg(pwm)) - log2(bg_freqs)， nrow = 4, ncol = ## length(pwm)，: data length[12]不是列编号## bb0的子倍数或倍数## Warning in matrix(log2(bg(pwm)) - log2(bg_freqs)， nrow = 4, ncol = ## length(pwm)，: data length[12]不是列编号##[10]的子倍数或倍数

##警告矩阵(log2(bg(pwm)) - log2(bg_freqs)， nrow = 4, ncol = ## length(pwm)，:数据长度[12]不是列[11]的##数的倍数或倍数

all.equal（motifMatches(motif_ix_peaks),motifMatches(motif_ix_BSgenomeViews))

##[1] "属性:< Component \"i\":数值:长度(6,9)不同>" ##[2]"属性:< Component \"p\":平均相对差值:0.5 >" ##[3]"属性:< Component \"x\":长度(6,9)不同(前6个组件的比较)>"

#使用DNAStringSet图书馆(Biostrings)图书馆(BSgenome.Hsapiens.UCSC.hg19)example_DNAStringSet < -getSeq(BSgenome.Hsapiens.UCSC。hg19山峰)motif_ix_DNAStringSet < -matchMotifs(example_motifs example_DNAStringSet)all.equal（motifMatches(motif_ix_peaks),motifMatches(motif_ix_DNAStringSet))

##[1]正确

#使用字符向量example_character < -as.character(example_DNAStringSet)motif_ix_character < -matchMotifs(example_motifs example_character)all.equal（motifMatches(motif_ix_peaks),motifMatches(motif_ix_character))

##[1]正确

选项

背景核苷酸频率

在确定基序匹配时，使用背景核苷酸频率。默认情况下，背景频率是输入序列中的核苷酸频率-要使用备用频率，请提供bgarument来match_pwms．如果输入序列相当短(就像我们的小插图例子!)，使用其他输入频率可能是有意义的。这里我们展示了如何使用偶数频率:

motif_ix < -matchMotifs(example_motifs,山峰,基因组=“hg19”，bg =“甚至”）

我们也可以选择使用基因组的频率。在这种情况下:

motif_ix < -matchMotifs(example_motifs,山峰,基因组=“hg19”，bg =“基因组”）

如果使用，必须指定基因组Bg = "基因组"！

要指定每个碱基对的频率，顺序应该是“A”、“C”、“G”，然后是“T”，或者这些核苷酸应该用作向量中的名称。

motif_ix < -matchMotifs(example_motifs,山峰,基因组=“hg19”，bg =c（“一个”＝0.2，“C”＝0.3，“G”＝0.3，“T”＝0.2）)

方法访问PWMatrix对象的相关背景频率bg函数。如果提供的背景频率match_pwms不匹配输入PWM中的频率，则调整PWM以反映提供的背景频率。计算分数是基于这个调整的PWM，而不是直接输入的PWM。为了确保使用精确的PWM输入计算得分，只需确保传递给matchMotifs的背景频率与用于输入PWM的软管匹配，并存储在PWMatrix对象的bg槽中。

Log底数和假计数

motifmatchr期望输入pwm使用自然对数或log 2。如果输入是PFM，则使用TFBSTools toPWM进行PWM，默认伪计数为0.8，对数基数为2。为了更好地控制伪计数，只需使用toPWM函数在调用matchMotifs之前转换您的pfm。

图书馆(TFBSTools)example_pwms < -do.call(PWMatrixList拉普兰人(toPWM example_motifspseudocounts =0．5）)

P值

默认的p值截止值是0.00005。在这个p值截止点中，没有对多次比较进行调整。这个p值阈值用于确定得分阈值。#输出

matchMotifs可以返回三种可能的输出，这取决于出论点:

(默认,Out =匹配布尔矩阵，指示哪个范围/序列包含哪个主题，存储为motifMatches在summarizeexperiment对象的分析槽中。的motifMatches方法可用于提取布尔矩阵。如果主题参数是一个基因组范围或RangedSummarizedExperiment对象或范围参数，则返回一个rangedsummarizeexperiment而不是summarizeexperiment。
(=分数)与(1)相同，再加上两个额外的分析——一个矩阵表示每个范围/序列中motif分数高的得分(只有在存在匹配时才报告得分)，另一个矩阵表示motif匹配的数量。的motifScores而且motifCounts方法可用于访问这些组件。
(=职位)基因组范围列表，包含输入范围/序列中所有匹配的范围。请注意，如果主题参数是一个字符向量、DNAStringSet或DNAString和一个范围参数与序列所表示的范围相对应，则返回IRangesList对象列表，而不是序列中的相对位置。

motif_ix < -matchMotifs(example_motifs,山峰,基因组=“hg19”）打印(motif_ix)

## class: rangedsummarize实验## dim: 3 3 # metadata(0): ## assays(1): motifMatches ## rownames: NULL ## rowData names(0): ## colnames(3): MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ## colData names(1): name

头（motifMatches(motif_ix))

## 3 x 3稀疏矩阵类"lgCMatrix" ## MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ##[1，] | |。##[2，] |。| ##[3，] |。|

motif_ix_scores < -matchMotifs(example_motifs,山峰,基因组=“hg19”，了=“分数”）打印(motif_ix_scores)

## class: rangedsummarize实验## dim: 3 3 # metadata(0): ## assays(3): motifScores motifMatches motifCounts ## rownames: NULL ## rowData names(0): ## colnames(3): MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ## colData names(1): name

头（motifMatches(motif_ix_scores))

## 3 x 3稀疏矩阵类"lgCMatrix" ## MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ##[1，] | |。##[2，] |。| ##[3，] |。|

头（motifScores(motif_ix_scores))

## 3 x 3稀疏矩阵类"dgCMatrix" ## MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ##[1，] 7.975145 13.55634。##[2，] 7.085762。16.29401 ##[3，] 7.739609。14.81298

头（motifCounts(motif_ix_scores))

## 3 x 3稀疏矩阵类“dgCMatrix”## MA0599.1_KLF5 MA0107.1_RELA MA0137.3_STAT1 ##[1，] 1 1。##[2，] 1。1 ## [3，] 1;2

motif_pos < -matchMotifs(example_motifs,山峰,基因组=“hg19”，了=“职位”）打印(motif_pos)

## GRangesList对象长度为3:## $MA0599.1_KLF5 ## GRanges对象具有3个范围和1个元数据列:## seqnames ranges strand | score ##    |  ## [1] chr1 76585928-76585937 - | 7.97515 ## [2] chr2 4277348 -42773257 - | 7.08576 ## [3] chr2 100184023-100184032 - | 7.73961 # ------- ## seqinfo: 2个序列来自一个未指定的基因组;no seqlength## ## $ ma0107.1_rela# # GRanges对象具有1个范围和1个元数据列:## seqnames ranges strand | score ##    |  ## [1] chr1 76585933-76585942 + | 13.5563 ## ------- ## seqinfo: 2个序列来自一个未指定的基因组;no seqlength ## ## $MA0137.3_STAT1 ## GRanges对象有3个范围和1个元数据列:## seqnames ranges strand | score ##    |  ## [1] chr2 42773087-42773097 - | 16.2940 ## [2] chr2 100183852-100183862 + | 14.8130 ## [3] chr2 100183852-100183862 - | 13.9536 ## ------- ## seqinfo: 2个序列来自一个未指定的基因组;没有seqlengths

会话信息

Sys。日期（）

## [1] " 22-11-01"

sessionInfo（）

## R正在开发中(不稳定)(22-10-25 r83175) ##平台:x86_64-pc-linux-gnu(64位)##运行在:Ubuntu 22.04.1 LTS ## ##矩阵产品:默认## BLAS: /home/biocbuild/bbs-3.17-bio /R/lib/libRblas。因此## LAPACK: /usr/lib/x86_64-linux-gnu/ LAPACK /liblapack.so.3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US。UTF-8 LC_NUMERIC= c# [3] LC_TIME=en_GB LC_COLLATE= c# [5] LC_MONETARY=en_US。utf - 8 LC_MESSAGES = en_US。UTF-8 ## [7] LC_PAPER=en_US。UTF-8 LC_NAME= c# [9] LC_ADDRESS=C LC_TELEPHONE= c# [11] LC_MEASUREMENT=en_US。UTF-8 LC_IDENTIFICATION=C ## ##附加的基本包:## [1]stats4 stats graphics grDevices utils datasets methods ## [8] base ## ##其他附加的包:# [1] TFBSTools_1.37.0 bsgenomics . hsapiens . ucsc .hg19_1.4.3 ## [3] BSgenome_1.67.0 rtracklayer_1.59.0 ## [5] Biostrings_2.67.0 XVector_0.39.0 ## [7] SummarizedExperiment_1.29.0 Biobase_2.59.0 ## [9] MatrixGenerics_1.11.0 matrixStats_0.62.0 ##[11]基因组icranges_1.1.1.0 GenomeInfoDb_1.35.0 ## [13] IRanges_2.33.0 S4Vectors_0.37.0 ## [15] BiocGenerics_0.45.0 motifmatchr_1.21.0 ## ##通过命名空间加载(并没有附加):# # # # [1] DBI_1.1.3 bitops_1.0-7 [3] rlang_1.0.6 magrittr_2.0.3 # # [5] compiler_4.3.0 RSQLite_2.2.18 # # [7] png_0.1-7 vctrs_0.5.0 # # [9] reshape2_1.4.4 stringr_1.4.1 # # [11] pkgconfig_2.0.3 crayon_1.5.2 # # [13] fastmap_1.1.0 ellipsis_0.3.2 # # [15] caTools_1.18.2 utf8_1.2.2 # # [17] Rsamtools_2.15.0 rmarkdown_2.17 # # [19] tzdb_0.3.0 pracma_2.4.2 # # [21] DirichletMultinomial_1.41.0 bit_4.0.4 # # [23] xfun_0.34 zlibbioc_1.45.0 # # [25] cachem_1.0.6 CNEr_1.35.0 # # [27] jsonlite_1.8.3 blob_1.2.3 # # [29]DelayedArray_0.25.0 BiocParallel_1.33.0 ## [31] parallel_4.3.0 R6_2.5.1 ## [33] bslib_0.4.0 stringi_1.7.8 ## [35] jquerylib_0.1.4 Rcpp_1.0.9 ## [37] assertthat_0.2.1 knitr_1.40 ## [39] R.utils_2.12.1 readr_2.1.3 ## [41] Matrix_1.5-1 tidyselect_1.2.0 ## [43] yaml_2.3.6 codetools_0.2-18 ## [45] lattice_0.20-45 tibble_3.1.8 ## [47] plyr_1.8.7 KEGGREST_1.39.0 ## [49] evaluate_0.17 pillar_1.8.1 ## [51] generics_0.1.3 RCurl_1.98-1.9 ## [53] hms_1.1.2 ggplot2_3.3.6 ## [55] munsell_0.5.0 scales_1.2.1 ## [57] gtools_3.9.3 xtable_1.8-4 ## [59] glue_1.6.2 seqLogo_1.65.0 ## [61] tools_4.3.0 TFMPvalue_0.0.9 ## [63] BiocIO_1.9.0 annotate_1.77.0 ## [65] GenomicAlignments_1.35.0 XML_3.99-0.12 ## [67] poweRlaw_0.70.6 grid_4.3.0 ## [69] AnnotationDbi_1.61.0 colorspace_2.0-3 ## [71] GenomeInfoDbData_1.2.9 restfulr_0.0.15 ## [73] cli_3.4.1 fansi_1.0.3 ## [75] dplyr_1.0.10 gtable_0.3.1 ## [77] R.methodsS3_1.8.2 sass_0.4.2 ## [79] digest_0.6.30 rjson_0.2.21 ## [81] R.oo_1.25.0 memoise_2.0.1 ## [83] htmltools_0.5.3 lifecycle_1.0.3 ## [85] httr_1.4.4 GO.db_3.16.0 ## [87] bit64_4.0.5

motifmatchr

快速入门

输入

选项

背景核苷酸频率

Log底数和假计数

P值

会话信息