Computer and VR researchers at Purdue University have gone deep into the hand’s near-endless complexity to bring us DeepHand, the next generation of virtual manual systems.
In the near future we will manipulate virtual objects as second nature. Our children won’t differentiate between virtual and real-world actions because the interface between the two will be seamless. Working to realise that future are groups like the one led by Purdue’s Karthik Ramani, professor of mechanical engineering and director of the university’s C Design Lab.
At present, users wear headsets or use their smartphones to interact with the video and graphic images generated in VR and AR. Whether users are partially or fully immersed, they have to use their hands to make things happen. “In both cases, these systems must be able to see and interpret what the user’s hands are doing,” said Karthik. “If your hands can’t interact with the virtual world, you can’t do anything.”
Learning By Hand
DeepHand uses a “convolutional neural network”, a type of model loosely inspired by the human brain and capable of ‘deep learning’, to understand the hand’s complex variability of joint angles and contortions. A depth-sensing camera captures the user’s hand, and specialised algorithms interpret the hand’s movements. “We figure out where your hands are and where your fingers are and all the motions of the hands and fingers in real time,” said Karthik.
“It’s called a spatial user interface because you are interfacing with the computer in space instead of on a touch screen or keyboard,” he added. “Say the user wants to pick up items from a virtual desktop, drive a virtual car or produce virtual pottery. The hands are obviously key.”
Millions of Poses
DeepHand’s team of researchers includes doctoral students Ayan Sinha and Chiho Choi alongside Professor Ramani. Importantly, the team trained DeepHand with a database of 2.5 million hand poses and configurations. Positions for finger joints were given ‘feature vectors’, allowing them to be retrieved quickly. “We identify angles in the hand, and we look at how these angles change, and these configurations are represented by a set of numbers,” Sinha said.
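To make the idea of representing a hand configuration as "a set of numbers" concrete, here is a minimal Python sketch. It is not DeepHand's actual feature extraction, which the article does not describe in detail. The function names `joint_angle` and `finger_features`, and the sample coordinates, are hypothetical and chosen only for illustration. The sketch assumes each finger is given as an ordered list of 3D joint positions and simply measures the angle at each interior joint.

```python
import math

def joint_angle(a, b, c):
    """Return the angle in radians at point b between segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.acos(cos_theta)

def finger_features(joints):
    """Turn an ordered list of 3D joint positions into a list of joint angles."""
    return [
        joint_angle(joints[i - 1], joints[i], joints[i + 1])
        for i in range(1, len(joints) - 1)
    ]

# Hypothetical sample data: one straight finger and one bent at right angles.
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
bent = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

print(finger_features(straight))  # both angles are pi (fully extended)
print(finger_features(bent))      # both angles are pi/2 (right-angle bends)
```

Concatenating such angle lists for every finger would give one numeric vector per hand pose, which is the kind of compact representation that can be stored and compared quickly.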
DeepHand then selects from the database the poses that best fit what the camera sees. In effect, ‘spatial nearest neighbours’ are identified for each hand position. Training the system on these hand positions requires a large amount of computing power, but once trained it can run on a standard computer. “The idea is similar to the Netflix algorithm, which is able to select recommended movies for specific customers based on a record of previous movies purchased by that customer,” said Karthik.
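The nearest-neighbour lookup described above can be sketched in a few lines of Python. This is an illustrative brute-force search under stated assumptions, not the researchers' implementation: the pose labels, the three-number feature vectors and the function name `nearest_poses` are all made up for the example, and a system searching millions of poses would need a far more efficient index than a full sort.

```python
import math

def nearest_poses(query, database, k=1):
    """Return the labels of the k stored poses closest to the query vector.

    Closeness is plain Euclidean distance between feature vectors
    (an assumption for this sketch).
    """
    ranked = sorted(database, key=lambda item: math.dist(query, item[1]))
    return [label for label, _ in ranked[:k]]

# Hypothetical database: (label, feature vector of joint angles in radians).
database = [
    ("open_palm", [3.1, 3.1, 3.0]),
    ("fist", [1.6, 1.5, 1.6]),
    ("pinch", [2.4, 1.8, 3.0]),
]

# Hypothetical feature vector computed from the current camera frame.
query = [1.7, 1.6, 1.5]

print(nearest_poses(query, database))       # ['fist']
print(nearest_poses(query, database, k=2))  # ['fist', 'pinch']
```

The expensive part, learning good feature vectors, happens once during training. A lookup like this is cheap by comparison, which fits the article's point that the trained system can run on a standard computer.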
The Purdue C Design Lab also co-sponsors a conference workshop supported by the National Science Foundation, Facebook and Oculus, called Observing and Understanding Hands in Action. For more technical information on DeepHand, see the video below.