「VIDEO」Video Retrieval Resource Roundup


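Similar-video retrieval: here "similar" means a new video obtained by modifying an original video;
Related material: Retrieval overview · Image retrieval resources · Audio retrieval resources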

video fingerprinting · video DNA · video signature · video hash · video content-based watermarking;
video retrieval; near-duplicate video detection · video copy detection · video forgery detection · video content identification; video verification · video authentication;

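The literature in this area is scattered: the same topic goes by many different keywords, and no single direction has many papers.

Open question: should different application scenarios use the same strategy?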

1 Surveys

1.1 Near-duplicate video detection

Large-scale near-duplicate web video search: Challenge and opportunity
2009 paper
Million-scale Near-duplicate Video Retrieval System
2011-11-28 paper
$\bullet \bullet$
Uses color features; K-Means++ for feature selection; an inverted index to speed up search;
Spatio-temporal video copy detection
2011 paper
An Exploration based on Multifarious Video Copy Detection Strategies
2013 paper
Near-Duplicate Video Retrieval: Current Research and Future Trends
2013-08 paper
Uses a video editing tool; explains the weakness of global features: they ignore objects in the middle of the frame;
Survey on Web Scale Based Near Duplicate Video Retrieval
2016 paper
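视频拷贝检测方法综述 (A Survey of Video Copy Detection Methods)
2017-04 Fudan University, Computer Science, Yu-Gang Jiang paper
A comprehensive overview;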
GVoS: A General System for Near-Duplicate Video Related Applications on Storm
2017 paper
A Systematic Review of Near Duplicate Video Retrieval Techniques
2018-05-27 paper

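Too shallow; it reads like a rough blog post: no real insight, no substantial experimental comparison, and blurry figures and tables. Not up to standard;

1.2 Copy detection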

Video Copy Detection: a Comparative Study
2007 paper
A Survey On Video Forgery Detection
2015-03-03 paper
Recent advances in content based video copy detection
2016-10-28 paper
A review on robust video copy detection
2019 springer
$\bullet \bullet$ VCD Review

1.3 Video signatures

Video Content Identification Using Video Signature: Survey
2017-07 paper

1.4 Content-based video retrieval

Content-Based Image Retrieval at the end of the early years
2000 paper
Summarizes the various methods and lays out the open problems;
Content based Video Retrieval: A Survey
2015-01 paper
Content-Based Video Retrieval in Historical Collections of the German Broadcasting Archive
2017-02-13 paper
Describes a detailed video retrieval pipeline;
FIVR: Fine-grained Incident Video Retrieval
2018-09-11 paper
Releases a new dataset, FIVR-200K;
Interactive video retrieval in the age of deep learning
ICMR 2019 tutorial 2019 paper

1.5 Cross-modal video retrieval

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries
ECCV 2018 2018 paper | home
Deep Learning for Video Retrieval by Natural Language
2019-10-25 paper

1.6 Copyright protection

DETECTING DIGITAL COPYRIGHT VIOLATIONS ON THE INTERNET
1999 paper

1.7 Video fingerprinting

Introduction to Video Fingerprinting
2009 paper
Video fingerprinting for copy identification: From research to industry applications
2009 paper
The Core of Video Fingerprinting: Examples of Feature
2009 paper
$\bullet \bullet$

2 Theory

3 Techniques

3.1 Keyframe extraction

keyframe extract · shot boundary detection;

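3.2 Feature extraction

Low-level features: color, edges, texture, motion;
Mid-level features: motion trajectories, object color;
High-level features: ;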

feature extract · video fingerprinting · video DNA · video signature · video hash · video content-based watermarking;

3.2.1 Frame-level features

3.2.2 Video-level features

Robust video signature based on ordinal measure
2004 paper
Resampling to handle frame-rate changes; ordinal-measure features; similarity computed over a fixed sliding window;
Video Shot Characterization
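
The ordinal-measure idea noted above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the 3×3 grid, the mean-L1 rank distance, and the function names are all assumptions, and the resampling step for frame-rate changes is left out.

```python
import numpy as np

def ordinal_signature(frame, grid=(3, 3)):
    """Rank the mean intensities of a grid of blocks in one grayscale frame."""
    gh, gw = grid
    h, w = frame.shape
    bh, bw = h // gh, w // gw
    # Crop so the frame divides evenly into blocks, then average each block.
    blocks = frame[: gh * bh, : gw * bw].reshape(gh, bh, gw, bw)
    means = blocks.mean(axis=(1, 3)).ravel()
    # argsort of argsort turns values into ranks 0..N-1.
    return np.argsort(np.argsort(means, kind="stable"), kind="stable")

def ordinal_distance(sigs_a, sigs_b):
    """Mean absolute rank difference between two equal-length signature sequences."""
    return float(np.abs(np.asarray(sigs_a) - np.asarray(sigs_b)).mean())

def sliding_window_match(query, reference):
    """Slide the query over the reference; return (best offset, distance)."""
    n = len(query)
    dists = [ordinal_distance(query, reference[i : i + n])
             for i in range(len(reference) - n + 1)]
    best = int(np.argmin(dists))
    return best, dists[best]

# Toy data: 20 random frames; the query is frames 5..9 with altered brightness/contrast.
rng = np.random.default_rng(0)
frames = rng.random((20, 48, 64))
reference = [ordinal_signature(f) for f in frames]
query = [ordinal_signature(0.5 * f + 10.0) for f in frames[5:10]]
offset, dist = sliding_window_match(query, reference)
```

Because only the rank order of block means is kept, a monotonic brightness or contrast change leaves the signature unchanged, which is why the modified query still lines up with the original frames.
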
paper
A Rotation Invariant Descriptor for Robust Video Copy Detection
2007 paper
Content-based video fingerprinting method for fast key generation and retrieval
VIDEO COPY DETECTION USING FINGERPRINTING WITH FAST IMAGE PROCESSING
2016-04 paper
Deep Video Hashing
2016 paper
$\bullet \bullet$ Deep Video Hashing
Video Copy Detection Based On Temporal Contextual Hashing
2016
Temporal binary hash;
Unsupervised Deep Video Hashing with Balanced Rotation
IJCAI 2017 2017 paper
$\bullet \bullet$
Unsupervised feature extraction;
Deep Hashing with Category Mask for Fast Video Retrieval
2017-12-22 Meitu paper | blog
$\bullet \bullet$ Hash Mask
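Extracts video features with a network: a backbone network extracts a feature for each frame, then the per-frame features are fused into a fixed-dimension vector, which is the video feature;
The feature is then quantized with a threshold to get a hash code, and codes are compared by Hamming distance;
- The scalability of this approach is questionable (how the threshold is set, and the limit that feature fusion places on the number of input frames);
- The paper does not consider edited videos by default; because it only works on a fixed number of frames sampled from the video, it can only do retrieval and cannot localize a match in time; sampling unrelated frames also hurts the retrieval result considerably;
- The paper does not say how frames are sampled; the work does not feel rigorous, and the overall scheme probably only holds up under ideal conditions;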
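
The quantize-then-compare step described above can be sketched as follows. The mean pooling used for fusion, the zero threshold, and the function names are assumptions for illustration; the paper's actual fusion layer and threshold are not reproduced here.

```python
import numpy as np

def video_feature(frame_features):
    """Fuse per-frame features into one fixed-size vector (mean pooling, an assumption)."""
    return np.mean(frame_features, axis=0)

def to_hash(feature, threshold=0.0):
    """Quantize a real-valued feature into a binary code by thresholding."""
    return (np.asarray(feature) > threshold).astype(np.uint8)

def hamming(code_a, code_b):
    """Number of bit positions where the two codes differ."""
    return int(np.count_nonzero(code_a != code_b))

# Two frames with 4-D features; their mean is [0.8, -0.3, 0.2, -0.6].
frame_feats = np.array([[0.9, -0.2, 0.1, -0.7],
                        [0.7, -0.4, 0.3, -0.5]])
code = to_hash(video_feature(frame_feats))
other = to_hash([0.5, 0.4, -0.1, -0.9])
distance = hamming(code, other)
```

The sketch also makes the first concern above concrete: the code depends entirely on the chosen threshold, and the pooling step discards frame order.
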
Convolutional Hashing for Automated Scene Matching
2018-02-09 paper
A Video Database Management System for Advancing Video Database Research
2018-02-09 paper
CNN hashing with a newly designed loss function;
Finding Near-Duplicate Videos in Large-Scale Collections
2019
A Survey of Deep Learning Solutions for Multimedia Visual Content Analysis
2019 paper
Extracting camera-based fingerprints for video forensics
CVPR 2019 workshop 2019 paper

3.2.3 Multi-frame fusion
3.2.3.1 Clustering

Towards effective indexing for very large video sequence database
Clusters the frame features;
Comprehensive Feature-based Robust Video Fingerprinting Using Tensor Model
2016-01-27 paper
$\bullet \bullet$
Feature fusion and indexing;

3.2.3.2 Dimensionality reduction

UQLIPS: A Real-time Near-duplicate Video Clip Detection System
2007 paper
Uses PCA to reduce multi-frame features to a single video-level feature;
Bounded coordinate system indexing for real-time video clip search
2009 paper
Uses BCS (PCA+) to reduce multi-frame features to a single video-level feature;
Real-time Large Scale Near-duplicate Web Video Retrieval

3.2.3.3 Fisher vector

SHOT AGGREGATING STRATEGY FOR NEAR-DUPLICATE VIDEO RETRIEVAL
2015 paper
Shot aggregation;

3.2.3.4 Sparse coding

Compact CNN Based Video Representation for Efficient Video Copy Detection
2016-12
$\bullet \bullet$
CNN feature extraction, then sparse coding to a fixed length;

3.2.3.5 VLAD

Aggregating local descriptors into a compact image representation
2010 paper

3.2.3.6 Other

Video histogram: A novel video signature for efficient Web video duplicate detection
2007 paper
Correlation histogram;
Practical elimination of near-duplicates from Web video search
2007 paper
CC_WEB_VIDEO dataset; keyframe-based; stage one uses color (cumulative histograms) to filter out dissimilar videos; stage two uses PCA-SIFT + a sliding window;
CNN Features Off-the-Shelf: An Astounding Baseline for Recognition
2014-03-23 paper
$\bullet \bullet$
A discriminative CNN video representation for event detection
2014-11-14 paper
Hierarchical feature fusion hashing for near-duplicate video retrieval
2016 paper
$\bullet \bullet$
Detecting near-duplicate videos by aggregating features from intermediate CNN layers
2016-08 paper | tensorflow
$\bullet \bullet$
Near-duplicate video retrieval by aggregating intermediate cnn layers
2017 paper | caffe
$\bullet \bullet$ NDVR CNN
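Pipeline: CNN feature extraction (from every layer); K-Means to build the codebook (best size: 1000); Apache Spark to scale the computation;
Compares AlexNet, VGG, and GoogleNet for video retrieval, and also against traditional methods; the CNN approach performs better;

The experiments are thorough and the paper gives a fairly complete video retrieval pipeline; it just does not consider edited (cut) videos;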
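
A rough sketch of the codebook step in the pipeline above: each frame feature is assigned to its nearest codeword, and the video becomes a normalized word histogram. The tiny hand-written codebook stands in for one learned by K-Means, and the cosine comparison and function names are assumptions rather than the paper's exact setup.

```python
import numpy as np

def bow_histogram(frame_features, codebook):
    """Assign each frame feature to its nearest codeword; return a normalized histogram."""
    # Squared Euclidean distance from every frame feature to every codeword.
    d2 = ((frame_features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def cosine(a, b):
    """Cosine similarity between two histograms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-word codebook in a 2-D feature space (K-Means would learn this).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
video_a = np.array([[0.1, 0.0], [0.9, 0.1], [1.0, 0.0], [0.0, 0.9]])
video_b = np.array([[0.05, 0.05], [0.95, 0.0], [0.9, -0.1], [0.1, 1.0]])
hist_a = bow_histogram(video_a, codebook)
hist_b = bow_histogram(video_b, codebook)
similarity = cosine(hist_a, hist_b)
```

The histogram ignores frame order, which is consistent with the remark above that cut or re-edited videos are not handled.
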
Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval
2013
$\bullet \bullet$ MFH
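Supervised learning for feature hashing;

3.2.4 Compressed domain

No need to decode the video; features are extracted from the encoding information, which can cope with copy operations such as cropping;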

Bit Rate-based H. 264 Video Copy Detection
2018-01 paper
$\bullet \bullet$ h.264
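Pipeline: shot detection (via I-frames) → compressed-domain feature extraction (P-frame bit rate) → matching ($\chi^2$ + edit distance);
Results: only on par with some existing algorithms; compressed-domain features are also highly vulnerable to editing attacks;

The method is not practical, though using I-frames to find shots is a decent idea;

3.2.5 Other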

Submodular Video Hashing: A Unified Framework Towards Video Pooling and Indexing
2012 paper
Unsupervised t-Distributed Video Hashing and Its Deep Hashing Extension
2016-12 paper
$\bullet \bullet$ t-UDH
Joint Audio-Video Fingerprint Media Retrieval Using Rate-Coverage Optimization
2016-09-05 paper
$\bullet \bullet$ Rate Coverage
Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval
2018 paper
$\bullet \bullet$ UDVH

3.3 Indexing

3.3.1 Surveys

Review of Image and Video Indexing Techniques
1997 semanticscholar
Image and video indexing in the compressed domain
1997 paper
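Also published as A critical evaluation of image and video indexing techniques in the compressed domain
Compares frequency-domain indexing techniques, including transform-domain techniques based on the Fourier transform, the cosine transform, the Karhunen-Loeve transform, subbands, and wavelets; also covers spatial-domain techniques based on vector quantization and fractals, and temporal indexing based on motion vectors;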
A state-of-the-art review on multimodal video indexing
2002 paper
Multimodal video indexing: A review of the state-of-the-art
2005 paper
Multimodal indexing;
A Survey on Visual Content-Based Video Indexing and Retrieval
2011 paper
Multimedia Indexing and Retrieval Techniques: A Review
2012 paper

3.3.2 Tree structures

Prover: Probabilistic video retrieval using the Gauss-tree
2007 paper

3.3.3 VA-File

A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces
1998 paper
Trading quality for time with nearest-neighbor search
2000 paper
iDistance: An Adaptive B+-Tree Based Indexing Method for Nearest Neighbor Search
2007 paper
iDistance;
Indexing High-Dimensional Data in Dual Distance Spaces: A Symmetrical Encoding Approach
2008 paper
Online Near-Duplicate Video Clip Detection and Retrieval: An Accurate and Fast System
2009
Towards effective indexing for very large video sequence database
2005 paper
iDistance;

3.3.4 Hash

A Posteriori Multi-Probe Locality Sensitive Hashing
2008 paper
LSH: improves accuracy;
Quality and Efficiency in High Dimensional Nearest Neighbor Search
2009 paper
LSH: modifies the mapping (hash) function;
An Image-Based Approach to Video Copy Detection With Spatio-Temporal Post-Filtering
2010 paper
Keypoint-based; improved LSH; bag-of-words model;
Large-scale video copy retrieval with temporal-concentration SIFT
2016
Improved SIFT; LSH: locality-sensitive hashing, which addresses slow retrieval over high-dimensional features;
Stochastic Multiview Hashing for Large-Scale Near-Duplicate Video Retrieval
2016-09-15 paper
$\bullet \bullet$ stochastic hash
Shot boundary detection, Hamming distance, mAP; stochastic multiview hashing improves retrieval accuracy and speed;

3.3.5 Inverted index
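
As a rough illustration of why LSH speeds up high-dimensional search, here is a minimal random-hyperplane (sign of random projection) variant. It is a generic textbook scheme, not the method of any paper listed above; the class name, bit count, and seed are assumptions.

```python
import numpy as np
from collections import defaultdict

class HyperplaneLSH:
    """Buckets vectors by the sign pattern of a few random projections."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)

    def _key(self, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        bits = (self.planes @ np.asarray(vec, dtype=float)) > 0
        return bits.tobytes()

    def add(self, item_id, vec):
        self.buckets[self._key(vec)].append(item_id)

    def query(self, vec):
        # Only the matching bucket is scanned, not the whole database.
        return list(self.buckets.get(self._key(vec), []))

rng = np.random.default_rng(1)
v = rng.standard_normal(16)
index = HyperplaneLSH(dim=16)
index.add("clip-a", v)
same_direction = index.query(3.0 * v)   # scaled copy: same sign pattern
opposite = index.query(-v)              # opposite direction: every bit flips
```

Vectors at a small angle usually share a bucket, so a query touches a handful of candidates instead of every stored feature; real systems use several hash tables or multi-probing to recover the neighbors that a single table misses.
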

Video Google: a text retrieval approach to object matching in videos
2003 paper

Scalable Detection of Partial Near-Duplicate Videos by Visual-Temporal Consistency
2009 paper
$\bullet \bullet$ Network Flow
Inverted index, with improvements to weak geometric consistency; uses graph alignment;

Real-time Large Scale Near-duplicate Web Video Retrieval
2010 paper
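Inverted index; evaluated with mAP, precision, and recall; LBP histogram intersection; bag-of-words model plus temporal features;

3.4 Temporal alignment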
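
The inverted-index idea used by the entries in this subsection can be sketched as follows: each visual word points to the videos containing it, so a query only touches videos that share at least one word. The histogram-intersection scoring and all names here are illustrative assumptions, not any listed paper's exact scheme.

```python
from collections import Counter, defaultdict

def build_inverted_index(videos):
    """videos: {video_id: list of visual-word ids} -> {word: {video_id: count}}."""
    index = defaultdict(dict)
    for vid, words in videos.items():
        for word, count in Counter(words).items():
            index[word][vid] = count
    return index

def search(index, query_words, top_k=5):
    """Score candidate videos by histogram intersection over shared words."""
    scores = Counter()
    for word, q_count in Counter(query_words).items():
        for vid, count in index.get(word, {}).items():
            scores[vid] += min(q_count, count)
    return scores.most_common(top_k)

videos = {"v1": [3, 3, 7, 9], "v2": [1, 2, 3], "v3": [7, 7, 9, 9]}
index = build_inverted_index(videos)
results = search(index, [3, 3, 7])
```

Here `v1` shares two occurrences of word 3 and one of word 7 with the query, so it scores 3; `v2` and `v3` each score 1.
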

temporal alignment · local alignment

3.4.1 Sliding window

Compact video description for copy detection with precise temporal alignment
ECCV 2010 2010 paper
Hierarchical indexing, Hough voting;

3.4.2 Tree structures
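
Hough voting for temporal alignment can be sketched in a few lines: every matched frame pair votes for the time offset it implies, and the most-voted offset wins. This is a simplified one-dimensional version under assumed names and bin size, not the paper's full method.

```python
from collections import Counter

def vote_temporal_offset(matches, bin_size=1.0):
    """matches: (query_time, reference_time) pairs of frames judged similar.

    Returns (offset, votes) for the most-voted offset bin.
    """
    votes = Counter(round((t_ref - t_query) / bin_size)
                    for t_query, t_ref in matches)
    best_bin, n_votes = votes.most_common(1)[0]
    return best_bin * bin_size, n_votes

# Four consistent matches at offset 12 s plus two spurious ones.
matches = [(0, 12), (1, 13), (2, 14), (3, 40), (4, 16), (5, 3)]
offset, n_votes = vote_temporal_offset(matches)
```

Spurious frame matches scatter their votes across many offsets, while a true copy concentrates them in one bin, which is what makes the voting robust.
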

Searching for repeated video sequences
2007 paper

3.4.3 Dynamic programming

Detection of video sequences using compact signatures
2006 paper

3.4.4 Graph alignment

Maximum flow

Effective and Efficient Query Processing for Video Subsequence Identification
2007 paper
$\bullet \bullet$ MSM SMSM
Scalable Detection of Partial Near-Duplicate Videos by Visual-Temporal Consistency
Efficient mining of multiple partial near-duplicate alignments by temporal network
2010
Detection and location of near-duplicate video sub-clips by finding dense subgraphs
2011 paper
Identification of video subsequence using bipartite graph matching
2011 paper
$\bullet \bullet$ Graph bipartite
A two-step video subsequence identification based on bipartite graph matching adds a hit function;
Video Hyperlinking: Libraries and Tools for Threading and Visualizing Large Video Collection
2012 paper | tool
Video Copy Detection Based on Path Merging and Query Content Prediction
2015-10
Path merging based on a graph model;

3.4.5 DL

Temporal Cycle-Consistency Learning
CVPR 2019 2019-04-16 Google & Deepmind paper | project | tensorflow-official

Temporal Attentive Alignment for Video Domain Adaptation
CVPR 2019 workshop 2019-05-26 paper | pytorch-official
From the same team as Temporal Attentive Alignment for Large-Scale Video Domain Adaptation;

Temporal Attentive Alignment for Large-Scale Video Domain Adaptation
ICCV 2019 oral 2019-07-30 paper | pytorch-official | reddit
From the same team as Temporal Attentive Alignment for Video Domain Adaptation;

3.4.6 Protein sequences

Searching for Near-Duplicate Video Sequences from a Scalable Sequence Aligner
2013-11-19 paper

3.4.7 Polynomial approximation

Can only handle sliding-window-style matching;

polynomial approximation

A novel scheme for fast and efficient video sequence matching using compact signatures
2000 paper
Features are DCT histograms; targets clip (segment) matching;

3.4.8 Other

Video sequence matching based on temporal ordinal measurement
2007 paper
Extracts features along the temporal dimension;
Fast Visual Retrieval Using Accelerated Sequence Matching
2010 paper
Efficient video copy detection via aligning video signature time series
2012
Skew-based temporal alignment; the alignment allows frame insertion, deletion, and substitution;
Frame filtering and path verification for improving video copy detection
2013 paper
$\bullet \bullet$
Addresses localization accuracy and efficiency;
Block based video alignment with linear time and space complexity
ICIP 2016-09 paper
$\bullet \bullet$ DDWT
Linear time and space complexity;
Energy based fast event retrieval in video with temporal match kernel
ICIP 2017 2017-09 paper | ppt
Product quantization;
Temporal Matching Kernel with Embedded Stability-Sensitive Filter
2017-12
Circulant Temporal Encoding for Video Retrieval and Temporal Alignment
2015-06-08 paper
$\bullet \bullet$ Circulant
Processes features as complex numbers to achieve precise matching;
Burst-survive Temporal Matching Kernel with Fibonacci Periods
ICASSP 2019 2019 paper | python
Neighborhood Preserving Hashing for Scalable Video Retrieval
ICCV 2019 2019 Tsinghua University Graduate School at Shenzhen paper

3.5 Similarity measures

Sequence similarity measures, string matching

sequence matching · similarity measure
longest common sub-sequence

3.5.1 Global matching

3.5.1.1 Edit distance

A distance measure for video sequence similarity matching
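1998 The Chinese University of Hong Kong paper
distance measure · edit distance · sequence-to-sequence matching · video string
Contribution: adds the vstring concept on top of the classic edit distance, plus three operations adapted to sequence-to-sequence matching;
Notes: vstring handles how the sequence data is organized; edit distance handles matching between sequences of unequal length;
Characteristics: demands highly robust spatial features; high space and time complexity;
Open questions:
- How is vstring implemented?
- How is the edit distance implemented?

  My impression is that ED-style methods assume features that are already similar and use string matching to compute their similarity in detail, so they are better suited to measuring similarity between data than to copy detection. Copy detection aims to find the copies of a video, so the more similar a copy is to the original the better, whereas ED objectively reports how dissimilar the copy is from the original;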
Comparison of distance measures for video copy detection
2001 IBM paper
$\bullet \bullet$ IBM
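Compares several feature distance measures;
Features: histograms (RGB, HSV, gradient), local gradients, invariant moments, differences;
Measures: Hash (histogram intersection, block-wise), L2 (max·min);

A brief summary of image similarity computation (it reads more like a blog post); video is only a pretext, and sequence matching and temporal localization are not addressed at all;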
Spatiotemporal sequence matching for efficient video copy detection
2005 Semantic Scholar
Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation
2009-04 paper
Neighborhood-based measure;
Visual word proximity and linguistics for semantic video indexing and near-duplicate retrieval
2009 paper
$\bullet \bullet$ proximity
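For keyframe retrieval (queried by text or by image): applies a bag-of-words model to the local features of each frame to build a global frame feature, then computes distances with EMD;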
An image-based near-duplicate video retrieval and localization using improved edit distance
2017
Improved ED that can filter out dissimilar frames;

3.5.2 Local matching

3.5.2.1 Smith-Waterman
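
For reference, a plain edit distance over frame sequences can be sketched as below. This is the textbook Levenshtein recurrence with a thresholded frame comparison; it is not the vstring variant or the improved ED from the papers above, and the threshold and names are assumptions.

```python
import numpy as np

def frame_edit_distance(seq_a, seq_b, match_threshold=0.5):
    """Levenshtein distance where two frames match if their features are close."""
    n, m = len(seq_a), len(seq_b)
    dp = np.zeros((n + 1, m + 1), dtype=int)
    dp[:, 0] = np.arange(n + 1)  # delete every frame of seq_a
    dp[0, :] = np.arange(m + 1)  # insert every frame of seq_b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1]) <= match_threshold
            dp[i, j] = min(dp[i - 1, j] + 1,                       # deletion
                           dp[i, j - 1] + 1,                       # insertion
                           dp[i - 1, j - 1] + (0 if same else 1))  # substitution
    return int(dp[n, m])

# 1-D toy features: `cut` is `original` with its second frame removed.
original = np.array([[0.0], [1.0], [2.0], [3.0]])
cut = np.array([[0.1], [2.1], [3.0]])
unrelated = np.array([[5.0], [6.0], [7.0]])
d_cut = frame_edit_distance(original, cut)
d_unrelated = frame_edit_distance(original, unrelated)
```

The distance is computed over the two whole sequences, which is the global-matching limitation that the local-matching methods in the next subsection try to remove.
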

Video copy detection by fast sequence matching
2009 paper
$\bullet \bullet$ Fast Sequence
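Earlier similarity measures were keyframe-based and ignored temporal continuity. Edit distance (ED; longest common subsequence is a special case of ED) does take temporal order into account, but it matches whole sequences globally and cannot cope with cutting and splicing. The paper therefore proposes a modified ED algorithm for local sequence matching;
Sampling: fixed-interval sampling (1 s);
Features: -VSMF-color (semi-global feature, Markov);
Similarity: the overall framework uses the Smith-Waterman algorithm, which embeds the ED concept; the substitution score is defined as a constant minus the distance between the two frames;
$\begin{aligned} v(q_i, r_i) &= c - d(q_i, r_i) \\ d(q_i, r_i) &= \chi^2 = \sum_j \frac{(q_{ij} - r_{ij})^2}{r_{ij}} \end{aligned}$ where $i$ is the frame index, $x_i$ (with $x$ standing for $q$ or $r$) is the feature vector of one frame, and $x_{ij}$ is one component of that feature vector;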
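
A minimal sketch of local alignment with the substitution score $v = c - \chi^2$ from the formula above. It follows the standard Smith-Waterman recurrence; the values of `c` and `gap`, the small `eps` guarding against division by zero, and the function names are assumptions and not taken from the paper.

```python
import numpy as np

def chi_square(q, r, eps=1e-12):
    """Chi-square distance between two frame feature vectors (eps is an assumption)."""
    return float(np.sum((q - r) ** 2 / (r + eps)))

def local_align(query, reference, c=0.5, gap=0.5):
    """Smith-Waterman local alignment; returns (best score, (query_end, ref_end))."""
    n, m = len(query), len(reference)
    H = np.zeros((n + 1, m + 1))
    best, end = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            v = c - chi_square(query[i - 1], reference[j - 1])  # substitution score
            H[i, j] = max(0.0,                  # restart: this is what makes it local
                          H[i - 1, j - 1] + v,  # align the two frames
                          H[i - 1, j] - gap,    # skip a query frame
                          H[i, j - 1] - gap)    # skip a reference frame
            if H[i, j] > best:
                best, end = H[i, j], (i, j)
    return float(best), end

# Toy data: 8 normalized 4-bin histograms; the query is an exact copy of frames 3..5.
rng = np.random.default_rng(0)
reference = rng.random((8, 4))
reference /= reference.sum(axis=1, keepdims=True)
query = reference[3:6].copy()
score, end = local_align(query, reference)
```

The end position uses 1-based counts, so `(3, 6)` means the alignment ends at the third query frame and the sixth reference frame. The clamp at zero lets an alignment start anywhere, which is how the method copes with clips cut out of a longer video.
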

3.5.2.2 Optimal Temporal Common Subsequence

Efficient and Effective State-based Framework for News Video Retrieval
2010 paper
Treats the video as a multi-dimensional string;
$\bullet \bullet$
The Optimal Temporal Common Subsequence
2010 library
Multiscale video sequence matching for near-duplicate detection and retrieval
2018-05-04 paper
$\bullet \bullet$ MS-VSM
Multi-stage retrieval;

3.5.2.3 DTW

A Time Warping Based Approach for Video Copy Detection
2006 paper
$\bullet \bullet$ DTW
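The paper argues that earlier methods are too slow and ignore temporal variations between video sequences, and makes the following improvements:
- Speed:
  - Keyframe extraction: Laplace transform + temporal peaks;
  - Segment extraction: use the keyframes to determine candidate start and end positions, then combine them into candidate segments;
- Accuracy:
  - Finer video distance (temporal differencing): distance between the current pair of corresponding frames + distance to the associated frames (the minimum over the associated-frame distances);

The "dynamic" part of the paper's DTW amounts to running the comparison several times over candidate segments, so it carries little theoretical weight;
Much of the paper is about keyframes, yet it cites no related work, and the same is true of the TW module; the contribution is therefore thin, and its credibility and theoretical justification are mediocre;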
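
For comparison, classic dynamic time warping (not the paper's variant) can be sketched as follows; the Euclidean frame cost and the function name are assumptions.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic DTW: minimum cumulative frame cost over monotonic alignments."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            # A frame may align with one or several frames of the other sequence.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# 1-D toy features: `slow` repeats frames of `clip`, as after a frame-rate change.
clip = np.array([[0.0], [1.0], [2.0], [3.0]])
slow = np.array([[0.0], [0.0], [1.0], [1.0], [2.0], [3.0], [3.0]])
reverse = clip[::-1]
d_slow = dtw_distance(clip, slow)
d_reverse = dtw_distance(clip, reverse)
```

Unlike a fixed sliding window, the warping path absorbs repeated or dropped frames, so the slowed-down copy has zero distance while the reversed clip does not.
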

3.6 Multi-feature & multimodal

A MULTIMODAL VIDEO COPY DETECTION APPROACH WITH SEQUENTIAL PYRAMID MATCHING
2011 paper

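4 Applications

4.1 Near-duplicate video detection

Also called video copy detection (Content-Based Copy Detection, CBCD): it serves the same purpose as watermarking and is mainly used for copyright protection; the difference is that it extracts features directly from the video itself;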

video copy detection · video forgery detection · near-duplicate video detection · video tampering · video manipulation

Real time repeated video sequence identification
2003 paper
Efficient Near-duplicate Detection and Sub-image Retrieval
2004 paper
Hashing;
Feature statistical retrieval applied to content-based copy identification
2004 paper
Interest-point signatures; efficient retrieval on large datasets;
Efficient and Effective Video Copy Detection Based on Spatiotemporal Analysis
2007 paper
Scalable mining of large video databases using copy detection
2008 paper
A Framework for Handling Spatiotemporal Variations in Video Copy Detection
2008 rar
A compact, effective descriptor for video copy detection
2009 paper
Video Copy Detection by Fast Sequence Matching
2009 paper
Scale-Rotation Invariant Pattern Entropy for Keypoint-based Near-Duplicate Detection
2009 paper
Realtime near duplicate elimination for web video search with content and context
2009 paper
A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting
2011 paper
Frame Fusion for Video Copy Detection
2011 paper
Multiple feature hashing for real-time large scale near-duplicate video retrieval
2011-12 ppt
Releases the UQ_VIDEO dataset
Content Based Video Copy Detection: Issues and Practices
2012 paper
Implementation of a Content Based Video Copy Detection Using Spatio-Temporal Measure
2012 paper
Shot-based; may not consider the sequence (temporal order) problem;
Content-Based Video Copy Detection Benchmarking at TRECVID
2014 paper
VCDB: A Large-Scale Database for Partial Copy Detection in Videos
2014 paper
Releases the VCDB dataset
Detection and Localization of Video Copy-Move Forgery in Temporal and Spatial Domain
2015 paper
Pattern-Based Near-Duplicate Video Retrieval and Localization on Web-Scale Videos
2015 paper
$\bullet \bullet$
Pattern-based feature processing; filters out dissimilar videos;
Frame-level matching of near duplicate videos based on ternary frame descriptor and iterative refinement
ICIP 2015 2015 paper
Partial Copy Detection in Videos: A Benchmark and An Evaluation of Popular Methods
2016-03-01 paper
$\bullet \bullet$ TBD
Publishes the VCDB dataset: 100,000 videos and 9,000 pairs; deep local features + bag-of-words model;
Near-Duplicate Video Detection Based on an Approximate Similarity Self-Join Strategy
2016-04 paper
Near Duplicate Video Retrieval using Spatio Temporal Approach with Multifeature Mechanism
2016-06 paper
Near-Duplicate Video Retrieval with Deep Metric Learning
ICCV 2017 workshop 2017 paper
$\bullet \bullet$ DML
Video Copyright Detection Using High Level Objects in Video Clip
2017-12 paper
$\bullet \bullet$ objects
Finds similar segments on a per-shot basis;
A new technique for video copy-move forgery detection
2017 paper
A PatchMatch-based Dense-field Algorithm for Video Copy-Move Detection and Localization
2017-03-14 paper
Releases a new dataset;
Autoencoder with recurrent neural networks for video forgery detection
2017-08-29 paper
TRECVID 2018: Benchmarking Video Activity Detection, Video Captioning and Matching, Video Storytelling Linking and Video Search
2018-11-12 paper
A Coarse-to-fine Deep Convolutional Neural Network Framework for Frame Duplication Detection and Localization in Forged Videos
2018-11-27 paper
$\bullet \bullet$ coarse-fine cnn
Simple Yet Efficient Content Based Video Copy Detection
2018-04-19 paper
Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos
2018-10-26 paper
Geometrically robust video hashing based on ST-PCT for video copy
2019-04-08
Video tampering localisation using features learned from authentic content
2019-01-11 paper
$\bullet \bullet$ tampering localisation
We Need No Pixels: Video Manipulation Detection Using Stream Descriptors
ICML 2019 Workshop 2019-06-20 paper
Partial-copy detection of non-simulated videos using learning at decision level
2019
$\bullet \bullet$ decision level
SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval
ICCV 2019 2019 Douyin + Nanjing University LAMDA paper
$\bullet \bullet$ SVD
Releases the SVD dataset;

4.2 Signature verification

video verification

Online Signature Verification Based on Writer Specific Feature Selection and Fuzzy Similarity Measure
CVPR 2019 (Applications to Media Forensics) 2019-05-21 paper

4.3 Content-based video retrieval

CBVR: used to detect similar scenes;

content based video retrieval · video content identification

Video sequence matching
1998
Motion-based retrieval, with temporal alignment;
A Fully Automated Content-Based Video Search Engine Supporting Spatiotemporal Queries
1998 paper
Retrieval of News Video using Video Sequence Matching
2005 paper
Content based video retrieval systems
2012-05-08 paper
Content Based Video Retrieval
2016-05 paper
$\bullet \bullet$ CBVR
Content-based Video Indexing and Retrieval Using Corr-LDA
2016-02-27 paper
Exploiting detected visual objects for frame-level video filtering
2017 paper
$\bullet \bullet$
Incorporates object detection and tracking;
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
BMVC 2019 2019-07-31 paper | pytorch-official

4.4 Re-localization

Find where a given query video occurs within a target video;

video relocation

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
CVPR 2018 2018-05-23 paper | AVA-dataset
Video Re-localization
ECCV 2018 2018-08-05 Tencent AI Lab + University of Rochester paper | tensorflow-official
Reorganizes the ActivityNet videos into a new dataset suited to the task, and proposes a cross-gated bilinear matching model;
Spatio-temporal Video Re-localization by Warp LSTM
2019-05-10 paper

4.5 Cross-modal retrieval

multi-modal hashing · cross-modal retrieval;
siam-network;
Zero-Example Video Retrieval · Text-Video Retrieval;
Video-Music Retrieval;

Year	AAAI	ICML	NIPS	CVPR	ICCV	ECCV
2019	2	0		1
2018	0		0	0		0
2017	1			0	0
2016	1			0
2015	0			1	0
2014	0			0
2013	0			0	0

4.5.1 Cross-modal

Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval
AAAI 2019 2019-03-06 paper
UCH: to tackle cross-modal hashing, uses a GAN to couple feature extraction with hashing;

4.5.2 Images

4.5.2.1 Person re-identification

4.5.2.2 Product retrieval

Video2Shop: Exactly Matching Clothes in Videos to Online Shopping Images
CVPR 2017 2018-04-14 Southwest Jiaotong University, Alibaba paper
AsymNet: detects objects with Faster R-CNN, then matches them;
Asymmetric Spatio-Temporal Embeddings for Large-Scale Image-to-Video Retrieval
BMVC 2018 2018 paper

4.5.3 Text

4.5.3.1 Ad placement

Given input keywords, return the matching scene segments in the video;

4.5.4 Audio

Content-Based Video-Music Retrieval Using Soft Intra-Modal Structure Constraint
2017-04-22 paper | demo | tensorflow-official
Cross-modal Embeddings for Video and Audio Retrieval
2018-01-07 paper | Youtube-8m-training


Appendix

A References

Cross-Modal Retrieval-paper_with_code
InVID H2020 Project
Multiple feature hashing for real-time large scale near-duplicate video retrieval-citations
scinapse
video-retrieval-github
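爱奇艺视频版权保护技术与维权实践 (iQIYI video copyright protection technology and enforcement practice) | video replay | Chen He
Covers iQIYI's copyright-protection technology and how it is applied across four stages: pre-release, distribution and playback, piracy tracking, and enforcement;
awesomeCVpapers: feature extraction;

B Talks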

ICIP 2017 Near-Duplicate Video Detection Exploiting Noise Residual Traces
Introduction to video hashing-limu

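C Datasets

Near-duplicate video detection:

Name	Type	Size (train/test)	Notes	Release date
TRECVID & data			Well-known video copy detection competition	2001~2019
CC_WEB_VIDEO			Includes color and brightness changes, frame editing, and splicing	2007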
MuscleVCD ST1
UQ_VIDEO		169,952	400 query videos, 200,000 database videos; at most 1,000 retrieved videos per query; 3,305,525 extracted keyframes; 15 GB in total;	2011
Copy-move forgeries dataset
VCDB			8G	2014
FIVR		225,960/100		2019
AVA
SVD	Open-sourced by Douyin	500,000	ICCV 2019	2019

D Projects

ThreatExchange
code | TMK + PDQF: similar-video matching | PDQ: similar-image matching
duplicate video search

E Competitions

TREC Video Retrieval Evaluation: TRECVID

F Researchers

OLIVES Research, github
