
Skeleton-Based Methods for Speaker Action Classification on Lecture Videos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

The volume of online lecture videos is growing at a frenetic pace. This has led to an increased focus on methods for automated lecture video analysis to make these resources more accessible. These methods consider multiple information channels including the actions of the lecture speaker. In this work, we analyze two methods that use spatio-temporal features of the speaker skeleton for action classification in lecture videos. The first method is the AM Pose model which is based on Random Forests with motion-based features. The second is a state-of-the-art action classifier based on a two-stream adaptive graph convolutional network (2S-AGCN) that uses features of both joints and bones of the speaker skeleton. Each video is divided into fixed-length temporal segments. Then, the speaker skeleton is estimated on every frame in order to build a representation for each segment for further classification. Our experiments used the AccessMath dataset and a novel extension which will be publicly released. We compared four state-of-the-art pose estimators: OpenPose, Deep High Resolution, AlphaPose and Detectron2. We found that AlphaPose is the most robust to the encoding noise found in online videos. We also observed that 2S-AGCN outperforms the AM Pose model by using the right domain adaptations.
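
The segment-building step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 5-joint topology in `BONES`, the non-overlapping segmentation, and the function names `split_segments`, `bone_vectors`, and `joint_motion` are all assumptions made for illustration. It shows how per-frame skeletons might be grouped into fixed-length segments, and how joint-based, bone-based, and motion-based features could be derived from them.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]   # (x, y) image coordinates of one joint
Frame = List[Point]           # one skeleton: a point per joint

# Hypothetical 5-joint topology as (child, parent) index pairs.
# A real pose estimator defines its own joint set and connectivity.
BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def split_segments(frames: Sequence[Frame], length: int) -> List[List[Frame]]:
    """Cut a frame sequence into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than `length` is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    count = len(frames) // length
    return [list(frames[i * length:(i + 1) * length]) for i in range(count)]


def bone_vectors(frame: Frame, bones=BONES) -> List[Point]:
    """Bone features: the vector from each parent joint to its child joint."""
    return [
        (frame[c][0] - frame[p][0], frame[c][1] - frame[p][1])
        for c, p in bones
    ]


def joint_motion(segment: Sequence[Frame]) -> List[List[Point]]:
    """Motion features: per-joint displacement between consecutive frames."""
    return [
        [(b[0] - a[0], b[1] - a[1]) for a, b in zip(prev, curr)]
        for prev, curr in zip(segment, segment[1:])
    ]


if __name__ == "__main__":
    # Synthetic skeleton that drifts one pixel to the right per frame.
    base = [(0.0, 0.0), (0.0, 1.0), (-1.0, 1.0), (1.0, 1.0), (0.0, 2.0)]
    video = [[(x + t, y) for x, y in base] for t in range(7)]

    for segment in split_segments(video, length=3):
        print(bone_vectors(segment[0]), joint_motion(segment)[0])
```

In a two-stream setup, the joint coordinates and the bone vectors would each feed a separate classifier stream, while displacement features of this kind are closer in spirit to a motion-based Random Forest input. The exact features used in the paper are not specified in the abstract.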

Original language: English
Title of host publication: Pattern Recognition. ICPR International Workshops and Challenges, 2021, Proceedings
Editors: Alberto Del Bimbo, Rita Cucchiara, Stan Sclaroff, Giovanni Maria Farinella, Tao Mei, Marco Bertini, Hugo Jair Escalante, Roberto Vezzani
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 250-264
Number of pages: 15
ISBN (Print): 9783030687984
State: Published - 2021
Event: 25th International Conference on Pattern Recognition Workshops, ICPR 2020 - Virtual, Online
Duration: Jan 10 2021 to Jan 15 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12664 LNCS

Conference

Conference: 25th International Conference on Pattern Recognition Workshops, ICPR 2020
City: Virtual, Online
Period: 01/10/21 to 01/15/21

Keywords

  • Action classification
  • Lecture video analysis
  • Lecture video dataset
  • Pose estimation
