Awesome Human Activity Recognition
Human Activity Recognition (HAR) is the field of recognizing human actions and activities from sensor data, including video, skeleton/mocap, wearable IMU, and multimodal egocentric inputs. This list covers 53 datasets, along with frameworks, pretrained models, tutorials, papers, competitions, and tools for HAR research.
Quick Stats
| Modality | Datasets | Highlights |
|---|---|---|
| Vision (RGB/Depth) | 14 | Kinetics-700, UCF-101, ActivityNet, AVA |
| Skeleton & MoCap | 7 | NTU RGB+D 60/120, AMASS, Human3.6M |
| Wearable Sensors | 13 | UCI-HAR, PAMAP2, CAPTURE-24 (3883 hrs) |
| Multimodal & Egocentric | 7 | Ego4D (3.3k hrs), EPIC-Kitchens-100 |
| Emerging & Frontier | 12 | HumanML3D, Motion-X++, Ego-Exo4D |
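
The per-modality counts can be cross-checked against the stated total of 53 datasets. A minimal sketch in Python, with the counts copied from the table above; the variable names are illustrative and not part of the repository:

```python
# Dataset counts per modality, copied from the Quick Stats table.
counts = {
    "Vision (RGB/Depth)": 14,
    "Skeleton & MoCap": 7,
    "Wearable Sensors": 13,
    "Multimodal & Egocentric": 7,
    "Emerging & Frontier": 12,
}

# Sum the per-modality counts to get the overall number of datasets.
total = sum(counts.values())
print(total)  # 53
```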
Repository Architecture
    graph LR
        subgraph Datasets["53 Datasets"]
            V["Vision (14)"]
            S["Skeleton (7)"]
            W["Wearable (13)"]
            M["Multimodal (7)"]
            E["Emerging (12)"]
        end
        subgraph Ecosystem
            F["Frameworks & Libraries"]
            P["Pretrained Models"]
            T["Tutorials & Courses"]
        end
        subgraph Automation
            LC["Link Check (weekly)"]
            SU["SOTA Update (weekly)"]
        end
        Datasets --> F
        Datasets --> P
        F --> T
        SU -->|updates| Datasets
        LC -->|validates| Datasets
Which Dataset Should I Use?
Pick your modality and task, then follow the recommendation.
- Vision, action classification: Start with Kinetics-700 for pretraining, then evaluate on UCF-101 or HMDB-51 for comparison with prior work.
- Vision, detection and localization: Use ActivityNet for temporal proposals, AVA for spatio-temporal detection, and MultiTHUMOS for dense multi-label annotation.
- Skeleton & MoCap: NTU RGB+D 120 is the de facto standard. For text-motion alignment, use BABEL or HumanML3D.
- Wearable Sensors: Use UCI-HAR for baselines, PAMAP2 for multi-sensor setups, and CAPTURE-24 for real-world scale (151 subjects, 3883 hours).
- Multimodal & Egocentric: Use Ego4D for scale (3.3k hours), EPIC-Kitchens-100 for kitchen actions, and Ego-Exo4D for cross-view work (CVPR 2024).
- Emerging & Frontier: Use HumanML3D for single-person motion, InterHuman for two-person motion, and Motion-X++ for whole-body motion with face and hands.
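
The recommendations above can also be expressed as a simple lookup table. The following is a minimal sketch, not part of the repository: the task labels and the `recommend` helper are hypothetical names chosen for illustration, while the dataset names are taken from the recommendations themselves.

```python
# Hypothetical task labels (assumed for this sketch) mapped to the
# datasets named in the recommendations above.
RECOMMENDATIONS = {
    "video_classification": ["Kinetics-700", "UCF-101", "HMDB-51"],
    "video_detection": ["ActivityNet", "AVA", "MultiTHUMOS"],
    "skeleton": ["NTU RGB+D 120", "BABEL", "HumanML3D"],
    "wearable": ["UCI-HAR", "PAMAP2", "CAPTURE-24"],
    "egocentric": ["Ego4D", "EPIC-Kitchens-100", "Ego-Exo4D"],
    "emerging_motion": ["HumanML3D", "InterHuman", "Motion-X++"],
}


def recommend(task: str) -> list[str]:
    """Return the suggested datasets for a task label.

    Raises ValueError listing the valid labels if the task is unknown.
    """
    try:
        return RECOMMENDATIONS[task]
    except KeyError:
        valid = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown task {task!r}; expected one of: {valid}") from None


print(recommend("wearable"))  # ['UCI-HAR', 'PAMAP2', 'CAPTURE-24']
```

Keeping the mapping as plain data makes it easy to extend when new dataset cards are added; the lists are ordered as the datasets appear in the text, not by any ranking.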
Explore
- Datasets — Browse all 53 dataset cards organized by modality
- Taxonomy — Multi-dimensional classification by task, license, scale, and year
- Surveys — Curated survey papers across all modalities
- Benchmarking — SOTA baselines and performance bands per dataset
- Roadmap — What is coming next
- Contributing — How to add datasets or improve the list