Keze Wang (王可泽)

Explainable Artificial Intelligence (XAI) for Video Analysis: We employ a Theory of Mind (X-ToM) approach in which an explicit representation of the users' mental model is inferred and applied to the tasks of Visual Question Answering (VQA) on video analysis. Please visit XAI4VideoAnalysis for more details.

Self-driven Progressive Learning System for Large-scale Visual Recognition: We developed a rational yet cost-effective pipeline to significantly decrease the amount of user annotations for improving large-scale visual recognition (e.g., face identification and object detection). Specifically, our system mines from unlabeled and partially labeled images by automatically distinguishing the samples with high prediction confidences, which can be easily and faithfully recognized by computers via a simple-to-complex fashion, and low-confident ones, which can be labeled by active users in an interactive complex-to-simple manner.

Network Acceleration on Embedded Devices: We presented a simple yet effective strategy to design lightweight network architectures. Moreover, we have also studied the fastest convolution algorithm (i.e., winograd) and further extended it with specific optimizations for the ARM Cortex-A15 (32bit) and Cortex A72 (64bit) chips.

Human-centric Analysis System: We proposed several methods to accomplish a real-time and high-accuracy system to predict 2D/3D human poses from depth or common RGB cameras. This system works efficiently and properly on mobile devices, e.g., MiBox 3.