We first load the alpineData package:
library(alpineData)
## Loading required package: ExperimentHub
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
##     lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
##     pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
##     tapply, union, unique, unsplit, which.max, which.min
## Loading required package: AnnotationHub
## Loading required package: BiocFileCache
## Loading required package: dbplyr
## Loading required package: GenomicAlignments
## Loading required package: S4Vectors
## Loading required package: stats4
## 
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:base':
## 
##     I, expand.grid, unname
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
## 
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
## 
##     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
##     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
##     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
##     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
##     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
##     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
##     colWeightedMeans, colWeightedMedians, colWeightedSds,
##     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
##     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
##     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
##     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
##     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
##     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
##     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
##     rowWeightedSds, rowWeightedVars
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.
## 
## Attaching package: 'Biobase'
## The following object is masked from 'package:MatrixGenerics':
## 
##     rowMedians
## The following objects are masked from 'package:matrixStats':
## 
##     anyMissing, rowMedians
## The following object is masked from 'package:ExperimentHub':
## 
##     cache
## The following object is masked from 'package:AnnotationHub':
## 
##     cache
## Loading required package: Biostrings
## Loading required package: XVector
## 
## Attaching package: 'Biostrings'
## The following object is masked from 'package:base':
## 
##     strsplit
## Loading required package: Rsamtools
This package contains the following four GAlignmentPairs objects. We can access them directly through functions named after each sample:
ERR188297()
## snapshotDate(): 2022-04-19
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
## GAlignmentPairs object with 25531 pairs, strandMode=1, and 0 metadata columns:
##           seqnames strand :              ranges -- ranges
##              <Rle>  <Rle> :           <IRanges> -- <IRanges>
##       [1]        1      + : 108560389-108560463 -- 108560454-108560528
##       [2]        1      - : 108560454-108560528 -- 108560383-108560457
##       [3]        1      + : 108560534-108600608 -- 108600626-108606236
##       [4]        1      - : 108569920-108569994 -- 108569825-108569899
##       [5]        1      - : 108587954-108588028 -- 108588028-108587954
##       ...      ...    ... ...               ... ... ...
##   [25527]        X      + : 119790596-119790670 -- 119790717-119790791
##   [25528]        X      + : 119790988-119791062 -- 119791086-119791160
##   [25529]        X      + : 119791337-119791111 -- 119791142-119791216
##   [25530]        X      + : 119791348-119791422 -- 119791475-119791549
##   [25531]        X      + : 119791376-119791450 -- 119791481-119791555
##   -------
##   seqinfo: 194 sequences from an unspecified genome
ERR188088()
## snapshotDate(): 2022-04-19
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
## GAlignmentPairs object with 28576 pairs, strandMode=1, and 0 metadata columns:
##           seqnames strand :              ranges -- ranges
##              <Rle>  <Rle> :           <IRanges> -- <IRanges>
##       [1]        1      - : 108565979-108566053 -- 108565846-108565920
##       [2]        1      - : 108573341-108573415 -- 108573234-108573308
##       [3]        1      + : 108581087-108581161 -- 108581239-108581313
##       [4]        1      + : 108601105-108601179 -- 108601196-108601270
##       [5]        1      - : 108603628-108603701 -- 108603701-108603628
##       ...      ...    ... ...               ... ... ...
##   [28572]        X      - : 119791266-119791340 -- 1197913130-119791204
##   [28573]        X      - : 119791431-119791505 -- 119791358-119791432
##   [28574]        X      - : 119791593-119791667 -- 119786691-119789940
##   [28575]        X      - : 119791629-119791703 -- 119789951-119791587
##   [28576]        X      - : 119791637-119791711 -- 119789976-119791612
##   -------
##   seqinfo: 194 sequences from an unspecified genome
ERR188204()
## snapshotDate(): 2022-04-19
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
## GAlignmentPairs object with 35079 pairs, strandMode=1, and 0 metadata columns:
##           seqnames strand :              ranges -- ranges
##              <Rle>  <Rle> :           <IRanges> -- <IRanges>
##       [1]        1      + : 108560441-108560516 -- 108600607-108600682
##       [2]        1      + : 108560442-108560517 -- 108560519-108600594
##       [3]        1      + : 108560443-108560518 -- 108560485-108560560
##       [4]        1      + : 108560447-108560522 -- 108560517-108600592
##       [5]        1      + : 108560500-108600570 -- 108600570-108560500
##       ...      ...    ... ...               ... ... ...
##   [35075]        X      - : 119790855-119790930 -- 119790578-119790653
##   [35076]        X      - : 119791575-119791650 -- 119786574-119786649
##   [35078]        X      - : 119791593-119791668 -- 119789978-119791613
##   [35079]        X      - : 119791627-119791702 -- 119791585-119791660
##   -------
##   seqinfo: 194 sequences from an unspecified genome
ERR188317()
## snapshotDate(): 2022-04-19
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
## GAlignmentPairs object with 44535 pairs, strandMode=1, and 0 metadata columns:
##           seqnames strand :              ranges -- ranges
##              <Rle>  <Rle> :           <IRanges> -- <IRanges>
##       [1]        1      + : 108560515-108600590 -- 108600611-108600686
##       [2]        1      - : 108560530-108600605 -- 108560452-108560527
##       [3]        1      + : 108560533-108600608 -- 108612199-108612274
##       [4]        1      + : 108560552-108600627 -- 108606221-108612221
##       [5]        1      + : 108575456-108575531 -- 108575531-108575456
##       ...      ...    ... ...               ... ... ...
##   [44531]        X      - : 119791460-119791535 -- 119791348-119791423
##   [44532]        X      - : 119791574-119791649 -- 119786691- 119789953-119791590
##   [44534]        X      - : 119791585-119791660 -- 119789990-119791613
##   [44535]        X      - : 119791620-119791695 -- 119789953-119791590
##   -------
##   seqinfo: 194 sequences from an unspecified genome
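Because each sample has its own accessor function, the four objects can also be gathered into one named list. The following is a minimal sketch, not part of the package documentation: the variable names `samples`, `gaps`, and `n_pairs` are illustrative, and it assumes the accessors are exported from the alpineData namespace under the sample names shown above.

```r
library(alpineData)

# Sample accessions as listed above; each is also the name of an accessor function.
samples <- c("ERR188297", "ERR188088", "ERR188204", "ERR188317")

# Look up each accessor among the package exports and call it
# (assumption: the accessors are regular exports of alpineData).
gaps <- lapply(samples, function(s) getExportedValue("alpineData", s)())
names(gaps) <- samples

# Number of read pairs per sample.
n_pairs <- vapply(gaps, length, integer(1))
n_pairs
```

The counts in `n_pairs` should agree with the pair counts reported in the object summaries above.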
Alternatively, we can use the ExperimentHub interface:
eh <- ExperimentHub()
## snapshotDate(): 2022-04-19
query(eh, "ERR188")
## ExperimentHub with 4 records
## # snapshotDate(): 2022-04-19
## # $dataprovider: GEUVADIS
## # $species: Homo sapiens
## # $rdataclass: GAlignmentPairs
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH166"]]'
## 
##           title
##   EH166 | ERR188297
##   EH167 | ERR188088
##   EH168 | ERR188204
##   EH169 | ERR188317
eh[["EH166"]]
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
## GAlignmentPairs object with 25531 pairs, strandMode=1, and 0 metadata columns:
##           seqnames strand :              ranges -- ranges
##              <Rle>  <Rle> :           <IRanges> -- <IRanges>
##       [1]        1      + : 108560389-108560463 -- 108560454-108560528
##       [2]        1      - : 108560454-108560528 -- 108560383-108560457
##       [3]        1      + : 108560534-108600608 -- 108600626-108606236
##       [4]        1      - : 108569920-108569994 -- 108569825-108569899
##       [5]        1      - : 108587954-108588028 -- 108588028-108587954
##       ...      ...    ... ...               ... ... ...
##   [25527]        X      + : 119790596-119790670 -- 119790717-119790791
##   [25528]        X      + : 119790988-119791062 -- 119791086-119791160
##   [25529]        X      + : 119791337-119791111 -- 119791142-119791216
##   [25530]        X      + : 119791348-119791422 -- 119791475-119791549
##   [25531]        X      + : 119791376-119791450 -- 119791481-119791555
##   -------
##   seqinfo: 194 sequences from an unspecified genome
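Hub identifiers such as "EH166" are easy to mistype, so it can help to look a record up by its title instead. This is a hedged sketch of one way to do that with the query result; the variable names `res`, `ids`, `titles`, `id`, and `gap_eh` are illustrative and do not come from the vignette.

```r
library(ExperimentHub)

eh <- ExperimentHub()
res <- query(eh, "ERR188")

# Record identifiers and their titles (the sample accessions).
ids <- names(res)
titles <- res$title

# Select the record whose title matches the sample of interest,
# rather than hard-coding the hub ID.
id <- ids[titles == "ERR188297"]
gap_eh <- eh[[id]]
```

With the snapshot shown above, `id` is expected to resolve to the same record that `eh[["EH166"]]` retrieves.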
For details on the source and construction of these files, see ?alpineData
and the scripts:
inst/scripts/make-metadata.R
inst/scripts/make-data.Rmd
We can take a quick look at the paired alignments in one of the files. For example, their distribution across chromosomes:
library(GenomicAlignments)
gap <- ERR188297()
## snapshotDate(): 2022-04-19
## see ?alpineData and browseVignettes('alpineData') for documentation
## loading from cache
barplot(sort(table(seqnames(gap))[1:25], decreasing=TRUE), las=3, main="Distribution of reads")
Histogram of the starts of the first read on chromosome 1:
gap1 <- gap[seqnames(gap) == "1"]
starts <- start(first(gap1))
par(mfrow=c(2,2))
hist(starts, col="grey")
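The same approach extends to the second mate of each pair. The sketch below places the start positions of both mates side by side; it is an illustration rather than part of the original analysis, and the names `first_starts`, `last_starts`, and `op` are assumptions introduced here. It uses `first()` and `last()` from GenomicAlignments to extract the two mates.

```r
library(alpineData)
library(GenomicAlignments)

gap <- ERR188297()
gap1 <- gap[seqnames(gap) == "1"]

# Start coordinates of each mate of the pairs on chromosome 1.
first_starts <- start(first(gap1))
last_starts <- start(last(gap1))

# Two panels side by side; keep the old settings so they can be restored.
op <- par(mfrow = c(1, 2))
hist(first_starts, col = "grey", main = "First read", xlab = "start")
hist(last_starts, col = "grey", main = "Last read", xlab = "start")
par(op)
```

Saving the result of `par()` and restoring it afterwards keeps the panel layout from leaking into later plots.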