
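Many video analysis techniques build on key frame extraction, so this post collects the relevant work in one place.
Related resource: Key Frame Extraction Overview (关键帧提取概述)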

key frame extraction · bag of keyframes · key frame detection
shot boundary detection · key volume mining · Key Segments

1 Surveys

  1. A Formal Study of Shot Boundary Detection
    2007 paper
    $\bullet \bullet$ study
    Graph-based.

  2. Analysis of Popular Video Shot Boundary Detection Techniques in Uncompressed Domain
    2012 paper

  3. A REVIEW ON DIFFERENT METHODS OF VIDEO SHOT BOUNDARY DETECTION
    2012-08-01 paper

  4. Analysis and Review of Formal Approaches to Automatic Video Shot Boundary Detection
    2012 paper
    $\bullet \bullet$ analysis

  5. A Review on Shot Boundary Detection Harsh Kumar
    2014 paper

  6. A SURVEY REPORT ON VIDEO SHORT BOUNDARY DETECTION SCHEMES
    2014-05 paper

  7. A Review on Different Keyframe Abstraction Techniques from the Video
    2014-11 paper

  8. Video Shot Boundary Detection: A Review
    2015 paper

  9. Video Shot Boundary Detection: A Comprehensive Review
    2017 paper

  10. Methods and Challenges in Shot Boundary Detection: A Review
    2018-03-23 paper
    $\bullet \bullet$ challenge

2 Theory

3 Key Frame Extraction

3.1 Traditional Methods

  1. RPCA-KFE: Key Frame Extraction for Consumer Video based Robust Principal Component Analysis
    2014-05-07 paper
    Based on robust PCA.

3.2 DL

  1. Video Key Frame Extraction using Entropy value as Global and Local Feature
    2016-05-28 Paper
    A self-attention mechanism aids video caption extraction.

    The abstract is not well written, and the core idea is unclear.

  2. Recognizing Dynamic Scenes with Deep Dual Descriptor based on Key Frames and Key Segments
    2017-02-15 paper

  3. Superframes, A Temporal Video Segmentation
    2018-04-18 paper
    Motion estimation based on optical flow.

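  4. Deep-Learning-Based Video Key Frame Extraction and Video Retrieval (基于深度学习的视频关键帧提取与视频检索)
    2019 梁建胜, 温贺平 CNKI
    $\bullet \bullet$ retrieval and key frames
    The algorithm itself is actually quite old.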

4 Shot Boundary Detection

4.1 Traditional Methods

  1. Comparison of automatic shot boundary detection algorithms
    1998 paper
    Focuses on the fade-in/fade-out problem.

  2. Video shot boundary detection using motion activity descriptor
    2010-04-26 paper

  3. A Novel Approach for Shot Boundary Detection in Videos
    2012 paper

  4. Video Shot Boundary Detection using Visual Bag-of-Words
    2013 paper

  5. Histogram Based Split and Merge Framework for Shot Boundary Detection
    2013 paper
    Based on color histograms.

  6. Video Shot Boundary Detection Using Normalized Periodogram Distance Metric
    2016 paper

  7. A Novel Method of Shot Boundary Detection using Center Symmetric Local Binary Pattern
    2016 paper

  8. Shot boundary detection using convolutional neural networks
    2016 paper
    Eliminates false shot boundaries.

  9. Video shot boundary detection and key-frame extraction using mathematical models
    2017 paper
    Very long.
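
The color-histogram note on entry 5 above can be illustrated with a minimal sketch of generic histogram-difference cut detection. This is not the split-and-merge framework from that paper. The frame representation (a flat list of RGB tuples), the function names, the bin count, and the threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def color_histogram(frame: Sequence[Pixel], bins: int = 8) -> List[float]:
    """Normalized per-channel histogram, concatenated as R | G | B."""
    hist = [0.0] * (3 * bins)
    step = 256 // bins
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // step, bins - 1)] += 1.0
    total = float(len(frame)) or 1.0  # avoid division by zero on empty frames
    return [count / total for count in hist]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance scaled to [0, 1]; each of the 3 channels contributes at most 2."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 6.0


def detect_cuts(frames: Sequence[Sequence[Pixel]],
                threshold: float = 0.5, bins: int = 8) -> List[int]:
    """Return every index i where a cut is declared between frame i-1 and frame i."""
    hists = [color_histogram(frame, bins) for frame in frames]
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]


if __name__ == "__main__":
    dark = [(10, 10, 10)] * 16       # hypothetical 16-pixel frames
    bright = [(240, 240, 240)] * 16
    print(detect_cuts([dark, dark, bright, bright]))
```

A fixed threshold like this handles hard cuts only; gradual transitions such as fades change the histogram slowly and generally need a windowed or adaptive criterion.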

4.2 DL

  1. Large-scale, Fast and Accurate Shot Boundary Detection through Convolutional Neural Networks
    2017-05-09 paper | home | matlab-official
    Detects shots with a CNN; releases a large dataset; fast.

  2. Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks
    2017-05-23 paper | keras
    Extremely fast.

  3. Fast Video Shot Transition Localization with Deep Structured Models
    2018-08-13 paper

  4. Two Stage Shot Boundary Detection via Feature Fusion and Spatial-Temporal Convolutional Neural Networks
    2019-01-26 paper
    First splits the video into shots (fusing CNN and color features), then merges the transition segments.

  5. TransNet: A deep network for fast detection of common shot transitions
    2019-06-08 paper | tensorflow
    $\bullet \bullet$ TransNet

5 Applications

5.1 Hand Gesture Recognition

  1. Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion
    2019-01-15 paper | matlab-official
    $\bullet \bullet$ Hand Gesture Fusion
    Extracts key frames from the video using image entropy and clustering, thereby improving hand gesture recognition accuracy.
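
To make the entropy-plus-clustering idea concrete, here is a minimal sketch under stated assumptions. Frames are flat lists of gray levels, the entropy is the Shannon entropy of the gray-level distribution, and the "clustering" is replaced by a simple grouping of consecutive frames with similar entropy. The grouping rule, the tolerance `tol`, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import List, Sequence


def image_entropy(gray_frame: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the gray-level distribution of one frame."""
    total = len(gray_frame)
    if total == 0:
        return 0.0
    counts = Counter(gray_frame)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def group_by_entropy(entropies: Sequence[float], tol: float = 0.5) -> List[List[int]]:
    """Group consecutive frames whose entropy stays within tol of the group's first frame."""
    groups: List[List[int]] = []
    for i, value in enumerate(entropies):
        if groups and abs(value - entropies[groups[-1][0]]) <= tol:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


def pick_key_frames(frames: Sequence[Sequence[int]], tol: float = 0.5) -> List[int]:
    """Pick the middle frame of each entropy group as its key frame (an assumed rule)."""
    entropies = [image_entropy(frame) for frame in frames]
    return [group[len(group) // 2] for group in group_by_entropy(entropies, tol)]


if __name__ == "__main__":
    flat = [0] * 16                    # a single gray level
    textured = [0, 85, 170, 255] * 4   # four equally likely gray levels
    print(pick_key_frames([flat, flat, flat, textured, textured]))
```

Entropy alone cannot tell apart two frames with the same gray-level distribution but different layouts, which is one reason to combine it with a clustering step over richer features.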

5.2 Action Recognition

  1. Deep Keyframe Detection in Human Action Videos
    2018-04-26 paper
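    Characteristic of key frames in human action videos: they are the frames with the strongest class discriminability.
    Approach:
    • Generate labels for the key frames
    • Extract features for every frame with a VGG-16 pretrained on ImageNet
    • Using each video's category, group the frames of the same category into Vc
    • For each category, learn a matrix with LDA that maximizes the distance to the other categories
      The score of each frame is:
    • Train a key frame scoring network with the generated labels
    Takeaways:
    • The distribution of the key frames should match that of the original sequence (diversity)
    • The key frames should carry as little redundant information as possible (dispersion)
    • The number of key frames should be as small as possible
    • The key frames should make the id easy to recognize (discriminability)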
  2. A Key Volume Mining Deep Framework for Action Recognition
    CVPR 2016 2016 paper
    $\bullet \bullet$ key volume
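    Motivation: videos contain many static frames, and feeding those frames to the network works against its training.
    Approach: feed multiple frames into the network and optimize the loss only for the frame that reaches the highest probability for the target class.
    Thought: use classification to extract key frames; the higher a frame's class score, the more likely it is a key frame.
    Question: frames fed in at test time may also contain no action information, so why average the scores over all frames? One could instead consider only the key frames' predictions, as is done during training.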
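
The training rule and the test-time question in the notes on entry 2 can be sketched as follows. This is a simplified reading of the notes, not the paper's full framework: per-frame class logits are assumed to be given, and the function names are hypothetical. `key_frame_loss` keeps only the frame with the highest target-class probability, while `predict_mean` and `predict_max` contrast averaging over all frames with trusting only the most confident frame.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def key_frame_loss(frame_logits: Sequence[Sequence[float]], target: int) -> Tuple[int, float]:
    """Select the frame with the highest target-class probability.

    Returns (frame index, cross-entropy loss of that frame only).
    """
    probs = [softmax(logits)[target] for logits in frame_logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, -math.log(probs[best])


def predict_mean(frame_logits: Sequence[Sequence[float]]) -> int:
    """Test-time rule questioned in the notes: average probabilities over all frames."""
    per_frame = [softmax(logits) for logits in frame_logits]
    n_classes = len(per_frame[0])
    mean = [sum(p[c] for p in per_frame) / len(per_frame) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def predict_max(frame_logits: Sequence[Sequence[float]]) -> int:
    """Alternative raised in the notes: use only the most confident frame."""
    per_frame = [softmax(logits) for logits in frame_logits]
    best_frame = max(per_frame, key=max)
    return max(range(len(best_frame)), key=best_frame.__getitem__)


if __name__ == "__main__":
    # Three weakly informative frames favor class 1; one confident frame favors class 0.
    logits = [[0.0, 1.2], [0.0, 1.2], [0.0, 1.2], [4.0, 0.0]]
    print(key_frame_loss(logits, target=0))
    print(predict_mean(logits), predict_max(logits))
```

In the toy input the two test-time rules disagree, which is exactly the situation the question in the notes is about: many uninformative frames can outvote a single informative one under averaging.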

5.3 Video Summarization

  1. Video Summarization with Long Short-term Memory
    ECCV 2016 2016-05-26 paper | blog | theano
    Uses an LSTM to extract the key frame sequence.

  2. Unsupervised Video Summarization with Adversarial LSTM Networks
    CVPR 2017 2017 paper
    $\bullet \bullet$ ALSTM
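    Prior: the distribution of the key frames should match that of the original sequence (removing redundant information).
    Regularization: the number of key frames should be as small as possible; the information in the key frames should be as dispersed as possible.
    Approach:

    • slstm: outputs a score for every frame; weighting the original frames by these scores gives new features;
    • elstm: encodes the features produced by slstm into a single feature;
    • dlstm: decodes the feature produced by elstm to recover the original features;
    • clstm: judges whether the features produced by dlstm are still the original features;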

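    Post-processing: select the key frames according to the per-frame scores:

    • Split the video into several non-overlapping clips;
    • Score each clip as the mean of the scores of all its frames, then rank the clips;
    • Within the high-scoring clips, sort the frames by score and pick the top few;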
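
The post-processing steps above can be sketched as a short function. Fixed-length clips and the parameters `clip_len`, `top_clips`, and `frames_per_clip` are assumptions made for illustration; the notes do not say how clips are formed or how many clips and frames are kept.

```python
from typing import List, Sequence


def select_key_frames(scores: Sequence[float], clip_len: int,
                      top_clips: int, frames_per_clip: int) -> List[int]:
    """Select key frame indices from per-frame importance scores."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")

    # 1. Split the frame indices into non-overlapping clips.
    clips = [list(range(start, min(start + clip_len, len(scores))))
             for start in range(0, len(scores), clip_len)]

    # 2. Score each clip by the mean of its frame scores and rank the clips.
    ranked = sorted(clips,
                    key=lambda clip: sum(scores[i] for i in clip) / len(clip),
                    reverse=True)

    # 3. Inside each high-scoring clip, keep the highest-scoring frames.
    selected: List[int] = []
    for clip in ranked[:top_clips]:
        best = sorted(clip, key=lambda i: scores[i], reverse=True)
        selected.extend(best[:frames_per_clip])
    return sorted(selected)  # report the key frames in temporal order


if __name__ == "__main__":
    frame_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]  # hypothetical scores
    print(select_key_frames(frame_scores, clip_len=2, top_clips=2, frames_per_clip=1))
```

Ranking clips before frames spreads the selection over the video: picking the globally top-scoring frames directly would tend to return several near-duplicate frames from the same moment.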

5.4 REID



Appendix

A References

  1. Shot transition detection
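  2. A Survey of Video Shot Segmentation Methods (视频镜头分割方法综述)
    Includes C++ code.
  3. Video Shot Segmentation (视频镜头分割)
    Includes Python code.
  4. Python Digital Image Processing (2): Key Shot Detection (python数字图像处理(二)关键镜头检测)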
  5. https://www.cnblogs.com/lynsyklate/p/7840881.html

B Projects

  1. Shot-boundary-detection-using-SVM-s-Aritificial-Neural-Networks-and-kNN

  2. ShotDetection
