I have been working in the Tsinghua-Tencent Innovation Laboratory as a main developer since May 2013. The Tsinghua-Tencent Innovation Laboratory is a joint laboratory founded by Tsinghua University and Tencent. During my time in this lab, I have individually written more than 10,000 lines of C++ code, supervised two master's students and two undergraduate students in coding and algorithms, and collaborated with more than 20 developers and researchers. We reduced the manual work in street view image post-processing by more than 70%, improved object detection performance on street view images by more than 10%, and brought new features to live streaming video. During 2014-2015 I also served as a team leader, responsible for checking the progress of all projects and holding the weekly team meeting. The projects I have participated in include:
Background Replacement for Video
In this project, we developed a background replacement tool for a live video chat application. Taking a video and a new background image as input, our tool replaces the original background with the new background in real time. The code is written in C++ using OpenCV. A single-core CPU version runs at 8 fps (frames per second); with GPU acceleration, it runs at a real-time frame rate (e.g., 30 fps). The basic idea of the algorithm is as follows: First, we build Gaussian-mixture color models for the background and foreground. Using these models, we estimate the probability that each pixel in each frame belongs to the foreground or background. Second, we use the graph-cut algorithm to solve a binary labeling problem that assigns each pixel a foreground/background label, obtaining a binary foreground mask. Then we use an alpha matting algorithm to refine the boundary of the binary foreground mask. Finally, we adjust the foreground color according to the light source of the new background image and compute the final blending result.
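The first stage of this pipeline can be sketched with OpenCV's expectation-maximization implementation. The snippet below is a minimal illustration rather than our production code: it assumes foreground and background pixel samples are already available from some initialization step, and the function names, cluster count, and probability formula are placeholders. It produces the per-pixel foreground probability map that the graph-cut and matting stages would then refine.

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <cmath>

// Train a Gaussian-mixture color model from an N x 3 matrix of BGR samples.
static cv::Ptr<cv::ml::EM> trainColorGMM(const cv::Mat& samples, int nClusters = 5) {
    cv::Ptr<cv::ml::EM> gmm = cv::ml::EM::create();
    gmm->setClustersNumber(nClusters);
    gmm->setCovarianceMatrixType(cv::ml::EM::COV_MAT_DIAGONAL);
    gmm->trainEM(samples);                       // rows are samples, converted internally
    return gmm;
}

// Estimate P(foreground | color) for every pixel of a BGR frame.
cv::Mat foregroundProbability(const cv::Mat& frame,
                              const cv::Ptr<cv::ml::EM>& fgModel,
                              const cv::Ptr<cv::ml::EM>& bgModel) {
    cv::Mat prob(frame.size(), CV_32F);
    for (int y = 0; y < frame.rows; ++y) {
        for (int x = 0; x < frame.cols; ++x) {
            cv::Vec3b c = frame.at<cv::Vec3b>(y, x);
            cv::Mat sample = (cv::Mat_<double>(1, 3) << c[0], c[1], c[2]);
            // predict2() returns the log-likelihood of the sample in element [0].
            double fgLL = fgModel->predict2(sample, cv::noArray())[0];
            double bgLL = bgModel->predict2(sample, cv::noArray())[0];
            // Turn the two log-likelihoods into a normalized foreground probability.
            prob.at<float>(y, x) = static_cast<float>(1.0 / (1.0 + std::exp(bgLL - fgLL)));
        }
    }
    return prob;   // used as the unary term of the graph-cut labeling stage
}
```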
Vehicle Detection
In this project, we developed a vehicle detection tool for street view images. We trained a classifier based on the OverFeat deep convolutional neural network model using Caffe, a deep learning framework made with expression, speed, and modularity in mind, developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. There are 6,000 images in our training set and 1,239 images in our test set. Running the classifier on the test set, we achieve a 90.36% precision rate at a 79.75% recall rate. These figures are actually better than they look, because there are many mislabeled examples in our data set, which turns some true positive results into false positives. In the image on the left, the blue rectangles are the data set labels and the green and red rectangles are our detection results; the green ones are counted as true positives and the red ones as false positives. In our implementation, it takes 0.2 seconds to run the classifier on 20 images in parallel using an Nvidia GeForce GTX TITAN.
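For context, the precision and recall figures above come from matching detections against the data set labels by rectangle overlap. The sketch below shows one common way to do this with OpenCV's cv::Rect; the 0.5 IoU threshold and the greedy one-to-one matching are illustrative assumptions, not necessarily the exact evaluation protocol we used.

```cpp
#include <opencv2/core.hpp>
#include <vector>

// Intersection-over-union of two axis-aligned rectangles.
static double iou(const cv::Rect& a, const cv::Rect& b) {
    double inter = (a & b).area();
    double uni = a.area() + b.area() - inter;
    return uni > 0 ? inter / uni : 0.0;
}

// Greedily match detections to ground-truth labels and report precision/recall.
void evaluate(const std::vector<cv::Rect>& detections,
              const std::vector<cv::Rect>& labels,
              double& precision, double& recall,
              double iouThresh = 0.5) {
    std::vector<bool> matched(labels.size(), false);
    int tp = 0;
    for (const cv::Rect& det : detections) {
        for (size_t i = 0; i < labels.size(); ++i) {
            if (!matched[i] && iou(det, labels[i]) >= iouThresh) {
                matched[i] = true;   // each label can be matched at most once
                ++tp;
                break;
            }
        }
    }
    int fp = static_cast<int>(detections.size()) - tp;
    int fn = static_cast<int>(labels.size()) - tp;
    precision = detections.empty() ? 0.0 : static_cast<double>(tp) / (tp + fp);
    recall    = labels.empty()     ? 0.0 : static_cast<double>(tp) / (tp + fn);
}
```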
Speed Limit Sign Detection and Recognition
In this project, we developed a speed limit sign detection and recognition tool for street view images. The algorithm is implemented in C++ using OpenCV. The main idea of our algorithm is as follows: First, we use a color filter and circle detection to locate potential sign locations. Then we run a fast but less accurate boosted cascade classifier to obtain high-recall, low-precision candidates. Finally, we run a slower but more accurate SVM classifier to classify the speed limit signs into 9 classes (30, 40, 50, 60, 70, 80, 100, 110, 120). On our test set, we achieve a 92% precision rate at a 98% recall rate for detection, and a 97% precision rate for recognition. In the future, we plan to use the same CNN model as in Vehicle Detection to implement speed limit sign detection and recognition.
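The three stages can be sketched in OpenCV roughly as follows, assuming OpenCV 3.x. This is a simplified illustration only: the red-hue thresholds, Hough parameters, model file names (cascade.xml, speed_limit_svm.xml), and the raw-pixel feature fed to the SVM are placeholders rather than the exact values and features we used.

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <cstdio>
#include <vector>

// Stage 1: color filter (red ring of the sign) followed by circle detection.
std::vector<cv::Rect> findCandidates(const cv::Mat& bgr) {
    cv::Mat hsv, maskLow, maskHigh, mask;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    // Red wraps around the hue axis, so combine two ranges (placeholder thresholds).
    cv::inRange(hsv, cv::Scalar(0, 80, 60),   cv::Scalar(10, 255, 255),  maskLow);
    cv::inRange(hsv, cv::Scalar(170, 80, 60), cv::Scalar(180, 255, 255), maskHigh);
    mask = maskLow | maskHigh;
    cv::GaussianBlur(mask, mask, cv::Size(9, 9), 2.0);

    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(mask, circles, cv::HOUGH_GRADIENT, 2.0, 30.0, 100.0, 40.0, 8, 60);

    std::vector<cv::Rect> candidates;
    for (const cv::Vec3f& c : circles) {
        int r = cvRound(c[2]);
        cv::Rect box(cvRound(c[0]) - r, cvRound(c[1]) - r, 2 * r, 2 * r);
        candidates.push_back(box & cv::Rect(0, 0, bgr.cols, bgr.rows));
    }
    return candidates;
}

// Stage 2: cascade verification of candidates; Stage 3: SVM recognition into 9 classes.
void detectAndRecognize(const cv::Mat& bgr) {
    cv::CascadeClassifier cascade("cascade.xml");                    // placeholder model file
    cv::Ptr<cv::ml::SVM> svm =
        cv::Algorithm::load<cv::ml::SVM>("speed_limit_svm.xml");     // placeholder model file

    cv::Mat gray;
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);

    for (const cv::Rect& box : findCandidates(bgr)) {
        std::vector<cv::Rect> hits;
        cascade.detectMultiScale(gray(box), hits, 1.1, 3);
        if (hits.empty()) continue;                                  // rejected by the cascade

        // Placeholder feature: resized, normalized gray patch flattened into one row.
        cv::Mat patch, feature;
        cv::resize(gray(box), patch, cv::Size(32, 32));
        patch.convertTo(feature, CV_32F, 1.0 / 255.0);
        float cls = svm->predict(feature.reshape(1, 1));             // class index 0..8
        std::printf("sign at (%d,%d), class %d\n", box.x, box.y, static_cast<int>(cls));
    }
}
```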
Image D-Lighting
In this project, we developed a D-Lighting tool for backlit images. D-Lighting is a technique that optimizes high-contrast images to restore shadow and highlight details that are often lost when strong lighting increases the contrast between bright and dark areas of the image. The main idea of our algorithm is as follows: First, assuming the input is a single image in RAW format, we use tone-mapping techniques to convert the HDR (high dynamic range) image into an LDR (low dynamic range) image. Second, we compensate for the side effects introduced by tone mapping, such as loss of contrast, loss of sharpness, and color bias. An example of the effect of our algorithm is shown on the left: the dark areas on the building are lightened and show more detail, while the bright areas in the sky generally remain unchanged.
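The two stages can be illustrated with OpenCV's photo module. The snippet below is a minimal sketch, assuming the RAW file has already been decoded into a linear floating-point image (decoding not shown): cv::createTonemapDrago is just one of several tone-mapping operators OpenCV offers and is not necessarily the one we used, and the CLAHE step stands in for our contrast compensation as one simple way to restore local contrast after tone mapping.

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/photo.hpp>
#include <vector>

// hdr: linear CV_32FC3 image decoded from the RAW file (decoding not shown).
cv::Mat dLighting(const cv::Mat& hdr) {
    // Stage 1: tone-map the HDR image down to displayable LDR values in [0, 1].
    cv::Mat ldr;
    cv::Ptr<cv::TonemapDrago> tonemap = cv::createTonemapDrago(1.0f, 1.0f, 0.85f);
    tonemap->process(hdr, ldr);

    cv::Mat ldr8;
    ldr.convertTo(ldr8, CV_8UC3, 255.0);

    // Stage 2: compensate the loss of local contrast with CLAHE on the lightness channel.
    cv::Mat lab;
    cv::cvtColor(ldr8, lab, cv::COLOR_BGR2Lab);
    std::vector<cv::Mat> channels;
    cv::split(lab, channels);
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(2.0, cv::Size(8, 8));
    clahe->apply(channels[0], channels[0]);
    cv::merge(channels, lab);

    cv::Mat result;
    cv::cvtColor(lab, result, cv::COLOR_Lab2BGR);
    return result;
}
```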
2011.09 - 2012.08 -- President of the Student Photography Association, Tsinghua University.
2010.09 - 2011.08 -- Head of the Publicity Department of the Student Union, Department of Computer Science and Technology, Tsinghua University.
AWARDS & HONORS
2015 - Friend of Tsinghua - Tung OOCL Scholarship.
2012 - Outstanding Graduate of Tsinghua University.