「DM」 序列匹配资源汇总
计算两个序列的相似度,可找到两者公共的部分,可带容错;
字符串匹配
序列匹配算法
· 序列匹配
sequence alignment
· significant alignment
· global alignment
· local alignment
不同关键字在搜索引擎上搜索结果不同
1 综述
2 理论
3 全局
3.1 串模式匹配
4 局部
4.1 最长公共子序列
4.2 沃特曼
-
Identification of Common Molecular Subsequences
1981 paper
$\bullet \bullet$ Waterman -
An Improved Algorithm For Matching Biological Sequences
1982 paper
$\bullet \bullet$ Waterman -
Optimal Sequence Alignment Using Affine Gap Costs
1986 -
Optimal alignments in linear space
1988 paper
$\bullet \bullet$ Waterman
5 启发式
速度更快;
- Basic Local Alignment Search Tool
1990 paper
BLAST
6 其他
-
The similarity metric
2001-11-20 paper -
Approximate word matches between two random sequences
2008-01-21 paper -
Empirical distribution of k-word matches in biological sequences
2008-03-14 paper -
An Algorithm for Alignment-free Sequence Comparison using Logical Match
ICCAE 2010 2014-07-06 paper -
An Alignment Algorithm for Sequences
2012-10-31 paper
$\bullet \bullet$ Sequence Alignment -
Ultra-fast Multiple Genome Sequence Matching Using GPU
2013-03-15 paper
$\bullet \bullet$ -
SlopMap: a software application tool for quick and flexible identification of similar sequences using exact k-mer matching
2013-07-31 paper -
Geometry-based Adaptive Symbolic Approximation for Fast Sequence Matching on Manifolds
2014-03-04 paper -
Measuring Congruence on High Dimensional Time Series
2018-05-27 paper
针对高维序列度量; -
Efficient Measuring of Congruence on High Dimensional Time Series
2018-11-27 paper
比 DTW 更稳健;
7 应用
7.1 生物序列
-
Measuring the similarity of protein structures by means of the universal similarity metric
2004 paper
蛋白质序列相似度计算; -
A Comparative Study on String Matching Algorithm of Biological Sequences
2014-01-29 paper
$\bullet \bullet$ string match study -
An optimized Parallel Failure-less Aho-Corasick algorithm for DNA sequence matching
ICIAfS 2016 2018-11-26 paper -
FindeR: Accelerating FM-Index-based Exact Pattern Matching in Genomic Sequences through ReRAM technology
2019-07-11 paper
7.2 购买意向
- Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
KDD 2019 oral 2019-03-24 paper
购买意向匹配;
7.3 股票预测
7.4 视频序列匹配
7.5 手势识别
附录
A 讲义
B 参考资料
- wikipedia. Sequence alignment[EB/OL]. https://en.wikipedia.org/wiki/Sequence_alignment. -/2019-11-05.
- 维基百科. 序列比对[EB/OL]. https://zh.wikipedia.org/wiki/%E5%BA%8F%E5%88%97%E6%AF%94%E5%B0%8D. -/2019-11-05.
- 维基百科. 尼德曼-翁施算法[EB/OL]. https://zh.wikipedia.org/wiki/%E5%B0%BC%E5%BE%B7%E6%9B%BC-%E7%BF%81%E6%96%BD%E7%AE%97%E6%B3%95. -/2019-11-05.
Saul B. Needleman; and Christian D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology. 1970, 48: 443–453. doi:10.1016/0022-2836(70)90057-4;
启发式的同源算法; - 维基百科. 史密斯-沃特曼算法[EB/OL]. https://zh.wikipedia.org/wiki/%E5%8F%B2%E5%AF%86%E6%96%AF-%E6%B2%83%E7%89%B9%E6%9B%BC%E7%AE%97%E6%B3%95. -/2019-11-05.
M.S Waterman; T.F Smith; and W.A Beyer. Some biological sequence metrics. Advances in Mathematics. 1976, 20: 367–387. doi:10.1016/0001-8708(76)90202-4;
在 Peter H. Sellers. On the Theory and Computation of Evolutionary Distances. Journal of Applied Mathematics. 1974, 26: 787–793. doi:10.1137/0126070 基础上加了插入和删除; - 维基百科. 莱文斯坦距离[EB/OL]. https://zh.wikipedia.org/wiki/%E8%90%8A%E6%96%87%E6%96%AF%E5%9D%A6%E8%B7%9D%E9%9B%A2. -/2019-11-05.
- 小狗贤. 从零开始生物信息学(2):序列比对-Needleman-Wunsch算法[EB/OL]. https://zhuanlan.zhihu.com/p/54142276. 2019-01-17/2019-11-05.
- 小狗贤. 从零开始生物信息学(3):序列比对-Smith–Waterman算法[EB/OL]. https://zhuanlan.zhihu.com/p/54194091. 2019-01-17/2019-11-05.
- wikipedia. FASTA[EB/OL]. https://en.wikipedia.org/wiki/FASTA. -/2019-11-05.
- wikipedia. BLAST[EB/OL]. https://en.wikipedia.org/wiki/BLAST_(biotechnology). -/2019-11-05.
- Paul Reiners. 动态编程和基因序列比对[EB/OL]. https://www.ibm.com/developerworks/cn/java/j-seqalign/index.html. 2008-04-17/2019-11-05.
- 相似性和距离度量
Comments