In Advanced Driver Assistance Systems (ADAS), image segmentation for recognizing drivable areas and guiding the vehicle forward is a basic function. For the latter, unlike those traditional image segmentation methods, image semantic segmentation based on deep learning architecture can handle the irregularly shaped road areas better, guiding a vehicle to drive in a more complex environment. In our research work, we first analyze the output of state-of-the-art real-time semantic segmentation networks. The result shows that most of the misclassified pixels are located on the edge between two classes. Based on this observation, we propose a novel semantic segmentation network which contains a class-aware edge loss module and a channel-wise attention mechanism, aiming to improve the accuracy with no harm to inference speed.
Deep Learning-based Robust Real-time Road Marking Detection System
In recent years, Autonomous Driving Systems (ADS) become more and more popular and reliable. Road markings are important for drivers and advanced driver assistance systems better understanding the road environment. While the detection of road markings may suffer a lot from various illumination, weather conditions and angles of view, most traditional road marking detection methods use fixed threshold to detect road markings, which is not robust enough to handle various situations in the real world. To deal with this problem, some deep learning-based real-time detection frameworks such as Single Shot Detector (SSD) and You Only Look Once (YOLO) are suitable for this task. However, these deep learning-based methods are data-driven while there are no public road marking datasets. Besides, these detection frameworks usually struggle with distorted road markings and balancing between the precision and recall. We propose a two-stage YOLOv2-based network to tackle distorted road marking detection as well as to balance precision and recall. Our network is able to run at 58 FPS in a single GTX 1070 under diverse circumstances. In addition, we propose a dataset for the public use of road marking detection tasks. The dataset consists of 11800 high resolution images captured at distinct time under different weather conditions. The images are manually labeled into 13 classes with bounding boxes and corresponding classes. We empirically demonstrate both mAP and detection speed of our system over several baseline models.
The information between multiple sensors is collected through deep learning for individual feature collection. The features collected by each sensor are projected to the same standard system, and these features are connected to a specific channel to obtain the fused feature. After the early fusion, each sensor can take advantage of each other to cut short complements to obtain good features. We have developed a series of techniques based on computer vision techniques are applied on obstacle detection (such as pedestrian or vehicle detection).
This research focuses on developing a reliable and general action-recognition (AR) system, which is difficult to adapt various appearance and action styles of different users. Many future applications rely on this technique such as the real-time Human-Machine Interaction (HCI) systems due to the significance of identifying the input signal of a certain motion pattern, and are potentially used in extension techniques of AR as well as enhance the convenience of life.
This research is now focusing on hand pose estimation and human-computer interaction, and we would like to propose a method to provide a natural way to interaction with the computer and the virtual environment. The difficulty of this includes hand pose variations, self-occlusion and depth image broken. The process of our application is listed below: we will first get the depth image from the depth camera which is set on the helmet, then estimation the 3D coordinates of the hand joint by deep learning. Finally, project the joint coordinates in to the virtual environment and interact with the object.