
The DARPA Robotics Challenge

The DARPA Robotics Challenge (DRC) is a prize competition funded by the Defense Advanced Research Projects Agency. Held from 2012 to 2015, it aimed to develop semi-autonomous ground robots that could perform complex tasks in dangerous, degraded environments.

I’ve been working on the DARPA Robotics Challenge since December 2012 as part of the University of Delaware team, focusing on developing vision algorithms for Event 1, autonomous driving:

Our platform is called DRC-Hubo Beta, a modified version of the KAIST Hubo2 robot with hardware retrofits and software algorithms for autonomy. Our team consists of ten universities focusing on 7 different events: driving (UD), rough-terrain walking (OSU), debris removal (GT), door opening (Swarthmore), ladder climbing (IU & Purdue), valve turning (WPI) and hose installation (Columbia):


This set of complex tasks requires seamless integration between vision, motion planning and hardware control. Therefore, Drexel University invited students and professors from different institutions to join the ‘DRC Boot Camp’ and work together at the Philadelphia Armory for about 6 weeks over the summer.


My work is mainly focused on machine vision, including CPU/GPU-based stereo matching and point cloud generation, CAT/Kin-Fu based model making, RViz interface development and so on. Over the last 6 months, a lot of new software packages have been developed and pushed to the ROS repository, and many new hardware components and control methods have been designed and implemented for this project. Considering the influence of the past DARPA Grand Challenges, this event may become a turning point in the development of robotics (especially humanoids) for the next 5 to 10 years. The KAIST Hubo robot has been in development since 2003, and its better-known counterpart, the Honda ASIMO, represents almost 40 years of study of bipedal walking:

However, these robots emphasize mechanical design and manufacturing precision more than visual information processing and real-time closed-loop control/stabilization. It was not until recently that we saw the Boston Dynamics ATLAS (the Agile Anthropomorphic Robot) provided as Government Furnished Equipment to the Track B teams of the DARPA Robotics Challenge program:

This robot shows a near-perfect combination of sensing and control, and it took Boston Dynamics less than a year to develop the prototype from PETMAN after getting the $10.9 million contract from DARPA in August 2012. This is an example of how the DRC project speeds up robotic hardware development. Hopefully, with the amazing creations, imagination and exploration from the different teams and organizations throughout this project, robotics technology can really be pushed forward.

[Updated 12/31/13]

From the 2013 DARPA Robotics Challenge trials:

Carnegie Mellon University — CMU Highly Intelligent Mobile Platform (CHIMP):


MIT/Boston Dynamics — The Agile Anthropomorphic Robot (ATLAS):


NASA Johnson Space Center — Valkyrie (R5):


Boston Dynamics — Legged Squad Support Systems (LS3):


Google driverless car:



Video summary of our team:

[Updated 3/7/15]

At UNLV’s Howard R. Hughes College of Engineering with DRC-Hubo; congrats on qualifying for the DRC Finals!


[Updated 6/7/15]

The DRC Finals rules do not allow protective tether cables, which led to:

Congrats to DRC-Hubo@UNLV (whose vision system I worked on) on its 8th-place finish at the DRC Finals! Congrats to DRC-Hubo@KAIST for taking 1st place and winning the $2M prize!


A New Era of Robotics is coming!


A full overview of the DRC:


Quadcopter UAV project

Recently I’ve been working on a quadcopter UAV project. The on-board electronics include a 3-axis gyro, GPS/INS, AHRS, a 5.8 GHz FPV transmitter, a GoPro Hero camera and a small Gumstix computer.

After I finish assembling and tuning all the parts, I’ll test autonomous flight outside and upload the HD videos recorded by the GoPro Hero camera. Future experiments include vision-based auto-landing on moving vehicles, large-scale 3D terrain generation using SFM, and vision-based tracking of a specific ground vehicle. So please check my blog for the exciting videos to come!

[Updated 5/12/12]

Added the compass/IMU/GPS, central controller, 5.8 GHz FPV system and AHRS; ready to tune the controller next Wednesday.

Thanks to Nate at Hobby Hut for helping me tune the quadcopter. If you live close to the tri-state area and have problems with RC stuff, go to Hobby Hut in Eagleville, PA and ask for Nate; he’s very helpful.

[Updated 11/15/12]

Attended a meeting at Villanova University with the AUVSI local chapter members. This was a very good chance to meet and communicate with local people involved in UAV activities.

Thanks to Mr. Carl Bianchini for the presentation and to Mr. Steven Matthews for the opportunity. I will be giving a presentation on Jan. 17 or 24 at 7pm, CEER 210 Conference Room, Villanova University.

[Updated 6/1/13]

How are you guys doing? I was busy preparing for the PhD prelim this semester and working on the DARPA Robotics Challenge, so I haven’t had much time to test the quadcopter. Today we successfully tested RTL (return to launch):

I will add more videos and pictures later.

[Updated 6/4/13]

Took a flight in the UD campus:


Rectification test at the UD football field:



Raw image:



Probably I’ll try to mount another GoPro and do stereo matching + visual odometry based large-terrain reconstruction. The method is simple and works well, and we could then generate a 3D model of UD from the point cloud files.

[Updated 6/5/13]

Took a flight at the Chesapeake bay:

[Updated 6/8/13]

Tested in Winterthur:

Testing the UAV for scientific research purposes; the maximum altitude on the AHRS is set to 350 ft to stay under the FAA/AMA 400 ft limit.

[Updated 6/25/13]

Yesterday I went to the DARPA Robotics Challenge boot camp orientation at Drexel University (I’ll be working on the DRC at Drexel throughout the summer). I did a 360-degree spin at 400 ft over the Drexel campus and generated a fan panorama directly from the video input (click to see the full-resolution images):


I also turned this into a polar panorama:


This is actually not an easy task. As you can see from the video, the quadcopter has to tilt its body to fight the wind; since I don’t have a gimbal, the camera view is not level, and the rotation axis lies somewhere outside the quadcopter itself. This makes it much harder than putting a perfectly leveled camera on a tripod and spinning it around its own Z axis. Therefore, feature matching is needed to recover the pose, and I can use that information to unwrap the images and blend them together. Later I will do stereo matching and reconstruction; any experiment with UAV multi-view geometry and camera calibration at this point is helpful.
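The fan-to-polar step itself can be sketched quite simply: each output pixel is converted to polar coordinates and sampled from the panorama strip. This is a minimal nearest-neighbour illustration with made-up sizes, not the actual code; the real version would also need the feature-matching pose correction described above.

```python
import math

def fan_to_polar(pano, out_size):
    """Map a fan panorama strip to a polar ('little planet') image.

    pano: 2D list (H rows x W cols) of pixel values; row 0 is the horizon
    (outer rim of the polar image), the last row maps to the centre.
    out_size: side length of the square output image.
    Nearest-neighbour sampling only -- a sketch, no blending.
    """
    h, w = len(pano), len(pano[0])
    c = out_size / 2.0
    out = [[0] * out_size for _ in range(out_size)]
    for y in range(out_size):
        for x in range(out_size):
            dx, dy = x - c, y - c
            r = math.hypot(dx, dy)
            if r >= c:                      # outside the disc: leave background
                continue
            theta = (math.atan2(dy, dx) + math.pi) / (2 * math.pi)  # 0..1
            src_col = min(int(theta * w), w - 1)
            src_row = min(int((1 - r / c) * h), h - 1)  # rim -> row 0
            out[y][x] = pano[src_row][src_col]
    return out
```

In practice you would use bilinear interpolation and blend the seam, but the coordinate mapping is the whole idea.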

[Updated 8/4/13]

Initial results on my SFM and dense mesh reconstruction algorithm (all input images are from the above youtube video):

Center city Philadelphia:


U Penn Franklin field area:



As you can see, the algorithm only works well on close-by buildings. The disparity of far-away pixels is too small for the algorithm to recover depth. Another problem is that I’m only doing self-spins with the UAV, so the camera mostly rotates in place with little translation; there is almost no parallax between views, which makes it very hard to recover point geometry. Later I’ll fly the UAV around a specific target with a higher-resolution camera and see what happens.
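The far-away problem follows directly from the disparity relation d = f·B/Z. A quick sketch with hypothetical numbers (not my actual GoPro calibration) shows why distant pixels fall below a resolvable disparity:

```python
def disparity(focal_px, baseline_m, depth_m):
    """Stereo disparity in pixels for a point at the given depth:
    d = f * B / Z.  Far objects produce sub-pixel disparities that
    stereo/SFM pipelines cannot resolve reliably."""
    return focal_px * baseline_m / depth_m

# Hypothetical numbers for illustration only:
f, B = 800.0, 0.12               # 800 px focal length, 12 cm virtual baseline
near = disparity(f, B, 20.0)     # close-by building at 20 m: 4.8 px
far = disparity(f, B, 400.0)     # skyline at 400 m: 0.24 px, sub-pixel
```

With a matcher that is good to roughly half a pixel, anything past a few hundred meters is effectively at infinity here.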

[Updated 8/25/13]

A Pennsylvania based company has shown interest in my software. I will improve the algorithm and test on their UAV. Once more accurate 3D mesh results are generated, we can prepare materials to apply for a US patent.

[Updated 10/1/14]

The parameters of the 3D reconstruction program are now almost perfectly tuned. Here are some input sequences and the reconstruction result for the University of Delaware main campus:

Sample input sequence:


Output 3D mesh model:

Top-down view:



You can view the 3D model of UD on Verold or Sketchfab.

[Updated 10/2/15]
New model with improved texture mapping:

Open source packages used:

Poisson Surface Reconstruction:
Point Cloud Registration:
Labeling of buildings on the point cloud:


Internship at Siemens

This was my first job in the US, a 3-month internship from May 25, 2012 to August 24, 2012. The company is located in Malvern, Pennsylvania, very close to Philadelphia.

Overall the internship experience was very positive. The company is a relatively stable organization with a team-oriented environment and a global focus. It is basically a software company focusing on the development of the Siemens Soarian hospital information system. The location and working conditions are very good. Unlike Microsoft Research, which emphasizes independent thinking, Siemens encourages teamwork, communication and collaboration.

During the internship, I mainly worked on two projects. The first was an automatic device monitoring program that detects the status of multiple devices and manages records during laboratory testing exercises. The second was proof-of-concept software for vision-based autonomous medical device/patient association; basically, it is an appearance-based multiple-object recognition and prediction program.

People inside the company are very nice. They helped me a lot and patiently explained the details of the work. I also learned a lot from them about how to solve problems and how to face new challenges.

Automatic image segmentation

Recently I’ve been working with M company to build an image search and classification engine, so I made a study of image-feature-based segmentation methods and used combined features to detect the target object in a single still image.

The program is based on M company’s own image processing library, so the executable is very small: a single 94 KB file after compression. You can try it here.

Source code here:


Vision based robot navigation system

This program is a complete stereo vision robot navigation system that has been successfully tested on different robot platforms. I modified the code so it can run on simulation files rather than a real robot system. I’ll keep improving the code and adding more modules.

Demo program and complete source code:

Direct download link:

Powerpoint presentation:

1.   Program Interface

2.   Real-time stereo matching

3.   3D-point cloud generation and ground plane detection

4.   Path planning


Stereo matching evaluation system

From now on I’ll open a new section, “Source code”, and gradually upload all my project code to share with people who need it. The code is mainly focused on computer vision, graphics, robot planning, localization and so on. The introductions will be on this WordPress page, and the source code will be hosted on CodePlex:

OK, here comes the first project: a course project I recently wrote for the evaluation of stereo vision algorithms. I implemented the basic algorithms like NCC and SAD myself; the more advanced algorithms like SGBM and GC use OpenCV functions.
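For readers new to block matching, here is a minimal 1-D sketch of the SAD idea, written as a Python illustration of my own (the actual program is the MFC/OpenCV code described below):

```python
def sad_disparity(left, right, window, max_disp):
    """Brute-force SAD block matching on two rectified scanlines of
    grayscale values (lists of ints).  For each left pixel, slide a 1-D
    window over the right scanline and pick the disparity with the
    lowest sum of absolute differences.  The 2-D version simply uses a
    square window over both image rows and columns."""
    half = window // 2
    n = len(left)
    disp = [0] * n
    for x in range(half, n - half):
        best, best_d = float('inf'), 0
        for d in range(min(max_disp, x - half) + 1):
            cost = sum(abs(left[x + k] - right[x - d + k])
                       for k in range(-half, half + 1))
            if cost < best:
                best, best_d = cost, d
        disp[x] = best_d
    return disp
```

NCC replaces the absolute-difference cost with a normalized cross-correlation score, which trades speed for robustness to brightness changes.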

The program is written in C++, using MFC to build the interface and a picture control to show the OpenCV results. To compile the program, you need to configure OpenCV for Visual Studio correctly. I’m using OpenCV 2.1 here; if you use another version, just change the version of the corresponding cv210.lib, cxcore210.lib and highgui210.lib files in the additional dependencies.

Source code download page:

The executable can be downloaded here; you should run it in Windows XP SP3 compatibility mode.

If you have any problems, leave a reply here or send me an email.

Program interface



Tsukuba: Anaglyph | NCC (window 11, DSR 20) | SSD (window 20, DSR 20) | BM (SADWindowSize = 15) | SGBM
Apple: Anaglyph | NCC (window 20, DSR 10) | SSD (window 15, DSR 12) | BM (SADWindowSize = 15) | SGBM
Corn: Anaglyph | NCC (window 11, DSR 20) | SSD (window 30, DSR 60) | BM (SADWindowSize = 15) | SGBM
Dolls (on 1390×1100): Anaglyph | NCC (window 9, DSR 20) | SSD (window 15, DSR 20) | BM (SADWindowSize = 15) | SGBM (on 1390×1100)
Aloe (on 1390×1100): Anaglyph | NCC (window 9, DSR 11) | SSD (window 15, DSR 20) | BM (SADWindowSize = 15) | SGBM

My Rovio fire extinguisher mod

This is my first project: modifying a Rovio into an automatic fire extinguisher. I originally posted it on RoboCommunity, the official forum of the WowWee company:

The thread on RoboCommunity gained wide media interest; the following websites also covered this project:

Report by New York Times, Nov.4 2010, on page B10:

View full story:

Report on my university’s home page:

I’ve done a lot of work on machine vision algorithms and microelectronics. The WowWee Rovio is a wonderful platform for experiments, and I changed its shell as well as its inner circuitry to make it a fully automatic, vision-guided fire-extinguisher robot. I made an electromagnetic valve and added it to Rovio; the bottle on the right is filled with CF2ClBr at 4 atm:

Then I used AdaBoost + SVM to train Rovio. I had worked on visual flame tracking before, when I was asked to develop surveillance software to track smoke and flames. I tested many different algorithms and finally applied the method used in face detection, which gives a robust result: I used Haar-like rectangular features and an integral image to describe flame features, and used an SVM instead of the cascade classifier. The experimental results are quite good, even though Rovio’s camera is not that stable or reliable:
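What makes Haar-like rectangular features cheap enough for real-time detection is the integral image: any rectangle sum costs four lookups. Here is a minimal Python sketch of that idea (an illustration, not the actual detector code):

```python
def integral_image(img):
    """Summed-area table: ii[y][x] holds the sum of img over all rows < y
    and cols < x.  The table is (h+1) x (w+1), zero-padded on top/left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of any w x h rectangle at (x, y) in O(1): four lookups.
    A Haar-like feature is just a signed combination of such sums."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

A two-rectangle Haar feature, for example, is `rect_sum` of one half minus `rect_sum` of the other, so thousands of features per frame stay affordable even on a weak CPU.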

And then… Rovio became an automatic fire extinguisher! Rovio is just wonderful! The following are snapshots of the experiment:

Recently I’ve been doing a research project on omni-vSLAM with a ground plane constraint, a quite challenging task in machine vision. First I had to build an omni-vision system entirely on my own, as I couldn’t find anywhere to buy such a thing. I had worked on stereo vision depth estimation before:

The stereo vision system above is quite hard to calibrate, so I bought a Bumblebee camera from Point Grey (shown in my hands):

It is very easy to use and the image quality is excellent. Best of all, it supports Linux. So this time I decided to use a Point Grey Firefly MV CMOS camera for my omni-vision system (you can see its label through the transparent connecting component):

I sent my design to another company to build the hyperbolic mirror and outer structure, and carefully installed the system on top of the P3-AT, because I had to make sure that no part of the robot itself was in the view range of the omni-camera:

It took me a whole day to finish everything:

The next step was to build the mathematical model of the system. The general idea of my method is to use a single passive omni-directional camera to learn the ground appearance and find the obstacle edge feature points. Then, under the plane constraint, each feature point is mapped to a coordinate in the global coordinate frame. The coordinate set generated by our system has a similar format to the data coming from a 2D laser range finder and can be imported into toolboxes like CARMEN to do simultaneous localization and mapping.
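The plane-constraint mapping can be sketched as follows. This uses a simplified angular model with hypothetical parameter names; the real system goes through the hyperbolic-mirror projection model, but the geometric idea is the same:

```python
import math

def ground_point(cam_height, bearing, depression):
    """Project a ground-contact feature to robot-frame coordinates under
    the planar-ground constraint.

    cam_height: camera height above the floor (m).
    bearing: horizontal angle of the ray in the robot frame (rad).
    depression: angle of the ray below the horizon (rad).
    Returns (x, y) on the floor -- the same format as a 2-D laser scan
    point, which is what lets the output feed SLAM toolboxes like CARMEN.
    """
    rng = cam_height / math.tan(depression)   # horizontal range to the point
    return (rng * math.cos(bearing), rng * math.sin(bearing))
```

The constraint is what removes the scale ambiguity of a single camera: once a pixel is assumed to lie on the floor, its range is fixed by the camera height alone.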

The key to this system is generating exact obstacle edges using only one image. I tried many methods and finally wrote the program using Canny edge detection, texture features and adaptive learning methods to detect obstacle edges (the yellow squares are Haar feature points used to calculate distance):

Extracted ground features:

Outdoor ground extraction  experiment results:

My mentor recommended registering a patent for this algorithm. I’m also considering that it could be used to do monocular SLAM with a camera with a view range of about 60 degrees. I’m working hard on this now and I’ll upload the experimental results here soon.

[Nov.6 Updated] I strengthened the mechanical structure and added a Canon camera:

[Nov.12 Updated] More testing videos, I’ll soon release the rendered map generation results.

[Dec.6 Updated] Here’s the map of our laboratory, built with my SLAM algorithm:

Feature points obtained from a single frame in the omni-vision video stream:

Yesterday I went outside and tested the improved navigation algorithm; it’s very robust now.

Next, I’m going to work on my graduation thesis project, mainly focusing on stereo-based autonomous navigation. The depth recovered from the stereo camera is quite accurate within 25 meters, so I’m confident the system will be much better than the current monocular omni-directional navigation system. My final goal is to place the P3-AT at the south gate of my university, give it a GPS coordinate, and let it guide itself to the north gate without global maps or GPS checkpoints. There’s a lot of work to do on stereo SLAM, path planning, traversable-area segmentation and robot dynamics control. Tomorrow I’m going to install the Bumblebee2 stereo camera.
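The 25-meter figure reflects how stereo depth uncertainty grows with range. A first-order sketch (with hypothetical calibration numbers, not the Bumblebee2’s actual ones):

```python
def depth_error(depth_m, focal_px, baseline_m, disp_err_px=0.5):
    """First-order stereo depth uncertainty: dZ = Z^2 / (f * B) * dd,
    obtained by differentiating Z = f * B / d.  Error grows with the
    square of range, which is why stereo depth is only trusted out to a
    few tens of metres."""
    return depth_m ** 2 * disp_err_px / (focal_px * baseline_m)

e25 = depth_error(25.0, 1000.0, 0.12)   # error at 25 m
e50 = depth_error(50.0, 1000.0, 0.12)   # 4x worse at double the range
```

Doubling the range quadruples the error, so past a certain distance the recovered point cloud is too noisy for mapping.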

[Dec.7 Updated]

Self-made R2-D2 telepresence robot

Inspired by Rovio, I decided to build a low-cost telepresence robot that processes audio and visual information on an upper computing unit. I contacted some friends and classmates from different majors and formed a group to start the building process.

First we designed the outward appearance and inner circuits and mechanical structure:

Then I used a CNC machine to build the chassis and installed three AC gearmotors; the output power and gear reduction ratio were carefully calculated:

We made the amplifier and optocoupler input drive circuits:

The bottle on the left is a micro fire-extinguisher device; I want to install it in the robot so it can use its own IP camera to recognize and extinguish fires automatically.

I also added a laser sensor, a fischertechnik robotic arm, a microphone and two speakers, and finished its PIC controller and wireless data transmission module:

Finally I finished the programs and visual navigation algorithms:

Test drive:

Fully automatic visual flame detection:

Obstacle avoidance:

Demo video here:

Introducing my new robot “Black Swan”

Recently I’ve been paying more attention to hardware building. I need a robot that is agile and fast, with a small turning radius and long endurance.

I modified a P3-DX robot base and added a stereo cam to my setup.

Inserting the controller into the base:

The main controller board:

Robot base and main power motherboard:

Upper structure:
Nearly finished:
Differential drive and a caster wheel, looks like DARPA’s LAGR robot platform:

[Updated Feb.28, 2011]

I finished the program on real time disparity map generation:

[Updated March 11, 2011]

Now the program can do real-time point cloud generation and camera pose estimation:

[Updated April 6, 2011]

Outdoor experiment results: