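BioMedR is an R/Bioconductor package that generates various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions. See `vignette('BioMedR')` for the comprehensive user guide.

To install the BioMedR package in R, simply type

    source('//www.andersvercelli.com/biocLite.R')
    biocLite('BioMedR')

To make the BioMedR package fully functional (especially the Open Babel related functionality), we recommend that users also install the _Enhances_ packages by using:

    source('//www.andersvercelli.com/biocLite.R')
    biocLite('Rcpi', dependencies = c('Imports', 'Enhances'))

Several dependencies of the BioMedR package may require some system-level libraries; check the corresponding manuals of these packages for detailed installation guides.

Rcpi implemented and integrated the state-of-the-art protein sequence descriptors and molecular descriptors/fingerprints with R.

For protein sequences, the Rcpi package can

* Calculate 6 protein descriptor groups composed of 14 types of commonly used structural and physicochemical descriptors, comprising 9920 descriptors in total.
* Calculate six types of generalized scales-based descriptors derived by various dimensionality reduction methods for proteochemometric (PCM) modeling.
* Perform parallelized pairwise similarity computation derived by protein sequence alignment and Gene Ontology (GO) semantic similarity measures within a list of proteins.

For small molecules, the Rcpi package can

* Calculate 307 molecular descriptors (2D/3D), including constitutional, topological, geometrical, and electronic descriptors, etc.
* Calculate more than ten types of molecular fingerprints, including FP4 keys, E-state fingerprints, MACCS keys, etc., and perform parallelized chemical similarity search.
* Perform parallelized pairwise similarity computation derived by fingerprints and maximum common substructure search within a list of small molecules.

For DNA sequences, the package can

* Calculate three nucleic acid composition features describing the local sequence information by means of k-mers (subsequences of DNA sequences).
* Calculate six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties.
* Calculate two pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence order information, particularly the global or long-range sequence order information, via the physicochemical properties of its constituent oligonucleotides.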
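As a rough illustration of the k-mer based nucleic acid composition idea mentioned above, the following base R sketch counts the overlapping k-mers in a DNA sequence and normalizes the counts to frequencies. The function name `kmer_composition` is hypothetical and is not part of BioMedR; the package's own functions may differ in normalization, argument names and output ordering.

```r
# Hypothetical helper (NOT a BioMedR function): k-mer composition of a DNA sequence.
kmer_composition <- function(dna, k = 2) {
  dna <- toupper(dna)
  # Assumption: only unambiguous bases are accepted in this sketch
  stopifnot(nchar(dna) >= k, !grepl("[^ACGT]", dna))

  # Enumerate all 4^k possible k-mers in a fixed (alphabetical) order
  bases <- c("A", "C", "G", "T")
  grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)
  all_kmers <- sort(do.call(paste0, unname(as.list(grid))))

  # Slide a window of width k over the sequence, one position at a time
  starts <- seq_len(nchar(dna) - k + 1)
  observed <- substring(dna, starts, starts + k - 1)

  # Count every possible k-mer (absent ones get 0) and convert to frequencies
  counts <- table(factor(observed, levels = all_kmers))
  freqs <- as.numeric(counts) / length(observed)
  names(freqs) <- all_kmers
  freqs
}

x <- kmer_composition("ACGTACGT", k = 2)
length(x)  # 16, one entry per possible dinucleotide
x["AC"]    # 2 of the 7 overlapping windows are "AC"
```

The result is a fixed-length numeric vector regardless of sequence length, which is what makes this kind of feature convenient as input to statistical or machine learning models.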
By combining various types of descriptors for drugs, proteins and DNA/RNA in different ways, interaction descriptors representing protein-protein, compound-compound, DNA-DNA, compound-DNA, compound-protein and DNA-protein interactions can be conveniently generated with BioMedR, including:

* Two types of compound-protein interaction (CPI) descriptors
* Two types of compound-DNA interaction (CDI) descriptors
* Two types of DNA-protein interaction (DPI) descriptors
* Three types of protein-protein interaction (PPI) descriptors
* Three types of compound-compound interaction (CCI) descriptors
* Three types of DNA-DNA interaction (DDI) descriptors

Several useful auxiliary utilities are also shipped with BioMedR:

* Parallelized molecule and protein sequence retrieval from several online databases, like PubChem, ChEMBL, KEGG, DrugBank, UniProt, RCSB PDB, GenBank, etc.
* Loading molecules stored in SMILES/SDF files and loading protein sequences from FASTA/PDB files
* Molecular file format conversion

The computed protein sequence descriptors, molecular descriptors/fingerprints, interaction descriptors and pairwise similarities are widely used in various research fields relevant to drug discovery, primarily bioinformatics, chemoinformatics, proteochemometrics and chemogenomics.

## Links

* Bioconductor Page: //www.andersvercelli.com/packages/release/bioc/html/BioMedR.html
* Track Devel: https://github.com/wind22zhu/BioMedR
* Report Bugs: https://github.com/wind22zhu/BioMedR/issues

## Contact

The BioMedR package is developed by the Computational Biology and Drug Design Group, Central South University, China.

* Minfeng Zhu
* Jie Dong
* Dongsheng Cao