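A first step in the data analysis of mass spectrometry (MS) based proteomics data is to identify peptides and proteins. To do so, the large number of experimental mass spectra typically has to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each observed spectrum to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is referred to as the peptide-to-spectrum match (PSM). It is crucial for the downstream analysis to evaluate the quality of these matches. Therefore, false discovery rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA matches the spectra to a database of real peptides (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all PSMs that match a decoy are known to be bad hits, and the distribution of their scores is used to estimate the score distribution of the bad target PSMs. A crucial assumption of the TDA is that decoy PSM hits have properties similar to those of bad target hits, so that the decoy PSM scores are a good simulation of the scores of bad target PSMs. Users, however, typically do not evaluate these assumptions. To this end, we developed TargetDecoy, which generates diagnostic plots to evaluate the quality of the target decoy method.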
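
As an illustration of the target decoy idea described above, the following minimal Python sketch reverses target sequences to create decoys and estimates the FDR at a score threshold as the ratio of decoy to target PSMs. It is not part of TargetDecoy. The function names `reverse_decoys` and `estimate_fdr`, the toy scores, the convention that higher scores are better, and the decoys-over-targets estimator (which presumes a concatenated search against equally sized target and decoy databases) are all assumptions made for this example.

```python
def reverse_decoys(targets):
    """Build decoy sequences by reversing each target sequence."""
    return [seq[::-1] for seq in targets]


def estimate_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) pairs. Assumption: higher scores
    are better and each decoy hit stands in for one bad target hit.
    """
    n_decoy = sum(1 for score, is_decoy in psms if is_decoy and score >= threshold)
    n_target = sum(1 for score, is_decoy in psms if not is_decoy and score >= threshold)
    if n_target == 0:
        return 0.0  # no target PSMs retained, so nothing to report
    return min(1.0, n_decoy / n_target)


# Toy data, invented for illustration only: (score, is_decoy)
psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]

decoys = reverse_decoys(["PEPTIDE"])
fdr_at_7 = estimate_fdr(psms, threshold=7.0)
```

The estimate is only as good as the assumption that decoy scores behave like the scores of bad target hits, which is the assumption the diagnostic plots are meant to check.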
