Computer Vision and Machine Learning Lab (CVML)

Visual Attention Models of Dynamic Scenes

The human visual system can quickly, effortlessly, and efficiently process visual information from its surroundings. As a result, modern computer vision is heavily influenced by how biological visual systems encode properties of the natural environment. Humans can perform several complex tasks such as object localization, identification, and recognition in scenes, owing to their ability to “attend” to selected portions of their visual fields while ignoring other information. Although visual attention can be driven by either bottom-up (exogenous control) or top-down (endogenous control) mechanisms, research studies have found that bottom-up influences act more rapidly than top-down processes. Our work here focuses on:

  • running psychophysical experiments to understand governing mechanisms of attention
  • proposing computational models for these mechanisms
  • applying these models in real-world scenarios.

Publications

  • M. Wahid et al., The effect of eye movements in response to different types of scenes using a graph-based visual saliency algorithm, Applied Sciences, vol. 9, no. 24, 2019
  • H. Mehmood et al., Dynamic saliency model inspired by middle temporal visual area: A spatio-temporal perspective, in 2018 Digital Image Computing: Techniques and Applications (DICTA), (Canberra, Australia), Dec. 2018
  • M. S. Azam et al., A benchmark of computational models of saliency to predict human fixations in videos, in 11th International Conference on Computer Vision Theory and Applications (VISAPP 2016), (Rome, Italy), 2016
  • S. O. Gilani et al., PET: An eye-tracking dataset for animal-centric pascal object classes, in 2015 IEEE International Conference on Multimedia and Expo (ICME), (Italy), 2015
  • M. S. Azam et al., Saliency based object detection and enhancements using spectral residual approach in static images and videos, Advanced Science Letters, vol. 21, no. 12, 3677–3679, 2015
  • M. Dwarikanath et al., Coherency based spatio-temporal saliency detection for video object segmentation, IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 3, 454–462, 2014
  • S. O. Gilani et al., Impact of image appeal on visual attention during photo triaging, in 20th IEEE International Conference on Image Processing (ICIP), (Australia), 2013
  • S. O. Gilani et al., Gist modulated saliency in videos, in 2nd ECE Graduate Student Symposium, National University of Singapore, (Singapore), 2012
  • S. O. Gilani et al., Fixation durations during scene transitions, Journal of Vision, vol. 11, no. 11, 512–512, 2011
  • S. O. Gilani et al., Spatio temporal saliency modelling in videos, in 1st ECE Graduate Student Symposium, National University of Singapore, (Singapore), 2011

Members:

Hassan Mahmood, Shoaib Azam, Usman Khalid, Maria Wahid

Person Detection in Unconstrained Environment

Person detection has been an active area of research due to its wide range of potential applications in pedestrian detection, in-store video analytics, crowd management, and video surveillance. Challenges include varying viewpoints, illumination, postures, and sensing modalities. However, strong priors exist that support an efficient and practical solution, e.g., movement characteristics, scene properties, and postural connectivity. Our research aims to develop an efficient person detection model that addresses a variety of challenges in real-world applications.

Publications

  • M. N. Khan et al., Photo detector-based indoor positioning systems variants: A new look, Computers & Electrical Engineering, vol. 83, 106607, 2020
  • S. Munir et al., Human Torso Detection in Infrared Videos, under review
  • M. Ammar et al., Human Detection by Learning Locally Adaptive Steering Kernels (LASK), under review
  • M. Asad et al., Emotion detection through facial feature recognition, in Proceedings of the 3rd International Conference on Green Computing and Engineering Technologies - ICGCET-2017, (Killaloe, Ireland), 2017
  • S. O. Gilani, Human Detection on Raspberry Pi, Internal Technical Report, 2017
  • S. O. Gilani, Crowd Emotion Detection using Person Model, Internal Technical Report, 2017
  • H. Ahmed et al., Monocular vision-based signer-independent Pakistani sign language recognition system using supervised learning, Indian Journal of Science and Technology, vol. 9, no. 25, 2016
  • B. Ali et al., Improved method for stereo vision-based human detection for a mobile robot following a target person, South African Journal of Industrial Engineering, vol. 26, no. 1, 102–119, 2015
  • B. Ali et al., Human tracking by a mobile robot using 3D features, in IEEE International Conference on Robotics and Biomimetics (ROBIO), 2013

Members:

Munir Sultan, Muhammad Ammar

Multimedia Analytics

Multimedia analytics is a vast and multidisciplinary field. Recent technological innovations have led to a proliferation of multimedia usage in daily life. The data embeds several modalities, e.g., audio, visual, and textual information. This calls for novel algorithms and technologies (drawing on multiple disciplines) for multimedia retrieval, access, exploration, understanding, abstraction, and interaction. Currently, we are focusing on multimedia abstraction and interaction by analysing:

  • User behaviour
  • Memorability
  • Emotions
  • Saliency

Publications

  • S. O. Gilani et al., Video abstraction inspired by human model of attention, in 9th International Conference on Information Technology, Electronics & Mobile Communication (IEMCON 2018), (Vancouver, Canada), Nov. 2018
  • S. Ramanathan et al., Utilizing implicit user cues for multimedia analytics, in Frontiers of Multimedia Research, Association for Computing Machinery and Morgan & Claypool, 219–251, 2018

Members:

Hasnain Ali

Autonomous Vehicle Navigation

Autonomous vehicle research has recently entered mainstream application (e.g., Google, Uber). The enabling technology relies on the ability of a vehicle to sense its environment, interpret multi-sensor information (vision, radar, GPS, lidar, odometer, etc.), and take appropriate decisions (path planning) and actions (control system). Currently, we are focusing on:

  • video and scene analysis
  • real-time control

Publications

  • Arqab et al., Autonomous Vehicle Control, Internal Technical Report, 2018
  • H. Fleyeh et al., Road sign detection and recognition using fuzzy ARTMAP: A case study Swedish speed-limit signs, in Artificial Intelligence and Soft Computing, (Spain), 2006

Members:

Arqab, Aibak

Crowd Modelling and Analytics

Crowd modelling and analytics research offers vital benefits in crowd management and security. Currently, we are focusing on computing two factors: crowd density and crowd flow. Our approach estimates these factors through micro- and macro-level analysis of the crowd image.

Members:

Tahseen Akhtar

Miscellaneous projects

OCR-Based Application

  • M. Sami et al., Text detection and recognition for semantic mapping in indoor navigation, in IEEE ICITCS (Malaysia), 2015
  • S. Z. Zhou et al., Open source OCR framework using mobile devices, in SPIE-EI (USA), 2008

Augmented/Virtual Reality

  • S. O. Gilani, Interactive transcription system and method, US Patent 8,358,320, 2013
  • Z. Zhou et al., Wizqubes - a novel tangible interface for interactive storytelling in mixed reality, IJVR, vol. 7, no. 4, 9–15, 2008
  • P. Song et al., Vision-based projected tabletop interface for finger interactions, in HCI, Springer, 2007
  • Z. Zhou et al., What you write is what you get: A novel mixed reality interface, in HCI (China), 2007

Scene Understanding

  • S. O. Gilani et al., Automated scene analysis by image feature extraction, in IEEE PiCom (Auckland), 530–536, 2016
  • S. O. Gilani et al., Scene transitions effects fixation length in movies, in Decade of Mind IV (Singapore), 2011