Charbel Rizk has spun Oculi out of Johns Hopkins University to commercialise technology that gives computer vision the capabilities of the human eye.
As you are reading this text, your eyes are prioritising the screen and deprioritising your surroundings. That sounds obvious when you think about it: you don’t currently need to know if the leaves on your desk plant look a bit brown, but if you just looked away to check, your eyes would’ve easily adjusted their priority and your screen would’ve faded into the background.
The likelihood is you’ve never really thought about this. It is simply how sighted people perceive the world. Yet eyes, rather than merely capturing light and passing it all straight through to the brain, in fact do a lot of pre-processing, and they do it in parallel and very efficiently, sending only relevant details on to the brain.
Computer vision does not currently work this way. Instead, it relies on a dumb sensor that captures everything and passes it all on to a processor. That means the limit of what machine vision can achieve depends on the data pipeline and the power of the processor: the more data that needs to be transferred and processed, the larger the pipeline and memory, the faster the processor and, importantly, the more electricity and time required.
“When we say gigabits per second, people say ‘there is an Intel processor that can do so many teraflops per second’, but the one detail they miss is that the processor cannot handle that data at once. It has to buffer the data back and forth into memory in order to process all of it; moving data back and forth between memory and processing is what consumes most of the power and time,” Charbel Rizk, founder and chief executive of Johns Hopkins University (JHU) spinout Oculi, explains.
In addition to processing useless data, this architecture also limits the amount of data that can be accessed from the sensor. Essentially, machine vision today gets you the worst of both worlds: too much useless data and not enough useful data.
There hasn’t really been any innovation in how machine vision is done until now. We’ve merely added ever higher resolution sensors, more powerful processors, and artificial intelligence software that tries to understand the massive amounts of data after it has been captured.
“That architecture is very inefficient and has remained the same for decades. It is why machines have outperformed humans in just about every task subdivision, but human vision remains 40,000 times more efficient than the best computer vision out there.”
Oculi’s founding members include Philippe Pouliquen, a co-inventor on four patents and assistant research professor at Johns Hopkins, and JHU graduate Chad Howard, who serves as lead solutions engineer.
How to save lives by counting raindrops
Oculi’s technology is an integrated sensing and processing architecture for imaging, dubbed the sensing and processing unit (SPU), that is capable of performing tasks at the pixel level and at the edge (that is, data doesn’t need to be sent back to a server to be processed). It uses parallel processing and in-memory compute in the pixel, a technology it calls IntelliPixel, massively reducing the amount of data that needs to be analysed or stored and thereby cutting power requirements and latency to the point where analysis happens in real time. In other words, the SPU knows what to look for in a scene and ignores the rest.
“You reduce a gigabits-per-second problem into a kilobits-per-second problem. So, anything that happens after can be handled effectively, because you can manage kilobits per second in real time without using a lot of power and resources. You can’t do that with gigabits per second.”
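To make the scale of that reduction concrete, here is a toy sketch in Python. It is not Oculi’s IntelliPixel design, whose internals the company has not described here. It assumes a simple scheme in which each pixel reports only when its brightness changes by more than a threshold. Every name and number in it (`changed_pixels`, `event_bits`, the 64×48 frame, the 20 changed pixels) is hypothetical and chosen purely for illustration, not taken from any measurement.

```python
import random


def raw_bits_per_frame(width, height, bits_per_pixel=8):
    """Bits needed to ship every pixel of one frame to a processor."""
    return width * height * bits_per_pixel


def changed_pixels(prev, curr, threshold):
    """Return (index, value) for pixels whose brightness moved by more than threshold."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if abs(c - p) > threshold]


def event_bits(events, width, height, bits_per_pixel=8):
    """Bits needed if only changed pixels are sent, each as address plus new value."""
    address_bits = (width * height - 1).bit_length()
    return len(events) * (address_bits + bits_per_pixel)


def demo(width=64, height=48, moving=20, threshold=10, seed=0):
    """Compare full-frame readout with change-only readout on a synthetic scene."""
    rng = random.Random(seed)
    prev = [100] * (width * height)  # static grey background (assumed scene)
    curr = list(prev)
    for i in rng.sample(range(width * height), moving):
        curr[i] = 200  # a handful of pixels brighten, e.g. a small moving object
    events = changed_pixels(prev, curr, threshold)
    return raw_bits_per_frame(width, height), event_bits(events, width, height)


if __name__ == "__main__":
    full, sparse = demo()
    print(f"full frame: {full} bits, changes only: {sparse} bits")
```

In this made-up scene the full frame costs 24,576 bits while the change-only readout costs 400, because just 20 of the 3,072 pixels carry new information. The ratio in a real sensor would depend entirely on the scene and on how the pixel-level logic decides what is relevant.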
Rizk stresses that all of this happens on the same chip — a reality that keeps defying expectations every time he explains it to a new person, he says, because the current model of separate sensor and processor is so ingrained in people’s minds.
While the spinout outsources the manufacturing of the chips, its embedded software is created entirely in-house. In fact, half of Oculi’s staff is focused on the latter, Rizk says.
Rizk gives a real-world example of Oculi’s partnership with a Japan-based company, which was looking for a way to predict regular occurrences of flash flooding that often proved fatal because it was impossible to know exactly when an area was becoming too dangerous.
Oculi’s SPU was able to “count the raindrops and estimate how much rain is falling”, Rizk explains. “That is the kind of application that I personally get excited about.”
The company did not need to adapt its hardware for this, he underlines. “We are the only intelligent software-defined vision sensor. Our intelligence starts at the pixel level, not at the edge of the array, not next to it. The pixel is smart and it’s software-defined because just about every aspect of the sensor can change in real time, which is a flexibility that we’ve never had before. This is why the same chip is used to track a bullet in flight, to do rain measurement and to do gesture control.”
It is the consumer sector that Oculi is targeting first with a view to generating significant revenue. Interacting with virtual reality devices involves “tracking your eyes, your face, your hands, your body movement,” Rizk says. “You would think it’s a simple problem. It is if you don’t care about how much power and money you put into it. But if you’re trying to do it effectively with a small solution that doesn’t cost a lot of money, there is none out there outside of our technology.”
Oculi recently secured its first sizeable deal in the consumer sector when a UK-based company signed a letter of intent for 250,000 units, Rizk notes.
But its gesture tracking specifically also has applications in healthcare, Rizk continues, where Oculi has worked with a client to develop touchless check-in screens in medical facilities.
The pursuit of efficiency
Rizk is “obsessed with efficiency”, he says. It is partly what motivated him to set out the new vision architecture in a white paper some 20 years ago. As an engineer, he arrived at Oculi’s solution by iterating to optimise the architecture (an approach nowadays called systems engineering) rather than by emulating the human eye; he did not know how the eye functions when he set out to create his technology.
That drive for efficiency can be a source of frustration as CEO, he admits, because he’s unable to optimise the fundraising process: after all, no entrepreneur can know if an investor will give them money when they walk into a room.
Another challenge for Oculi is the Baltimore and Washington DC ecosystem (the two cities are within an hour’s drive of each other), which has historical strengths in healthcare and cybersecurity. It means finding investors locally in Oculi’s space has proven tricky.
But even though Johns Hopkins University’s strengths also lie in healthcare, Rizk highlights the importance of support from tech transfer office Johns Hopkins Technology Ventures and in particular Brian Stansky, director of innovation hub FastForward.
Due to the intensive demands of leading a startup, Rizk – an associate research professor in the Whiting School of Engineering since 2016 – did not teach this past academic year. Still, he says: “I was teaching two courses that I had developed myself and they are electives, so I can pick and choose the students. That type of teaching is definitely going to be part of my life.”
Finding capable technical staff has been a breeze, Rizk notes, because of his engineering background and extensive network, even if, as a first-time entrepreneur, he found that recruiting businesspeople required more work.
One chip to rule them all
Although Oculi is Rizk’s first startup, he has been entrepreneurial for much of his life. “I developed the first four-rotor drone in the early 90s and ever since then, I’ve had that startup mentality even when I worked for large enterprises like Boeing.”
His ambition is as big as that of any startup founder and he hopes that eventually, Oculi’s technology will become the default for machine vision – in any sector: “If Waymo or Cruise want to put machine vision in their autonomous car, instead of buying a camera and a computer, and then putting it together, we are going to sell them something that already does all the lower level processing and outputs the information that they want: lane markings, traffic lights, pedestrians…
“Everybody wants the same basic information, what is relevant in the scene. They don’t really care about the low-level processing that gets you that information. In fact, it is more of a pain than value added. Because they want to make money with their product, selling you a full solution. They don’t want to drown in low-level integration details. There is a significant value that only systems engineers or integrators really appreciate.
“Ask any automotive original equipment manufacturer or tier 1 supplier and they’ll tell you the nightmare in integrating sensors from different vendors. Oculi technology will be plug and play: because of the significant reduction in data transfer, the interface will be very simple and standard. That said, because the SPU is programmable, it does not limit access to any and all raw sensor data as needed.
“We want to sell the same SPU to everyone and all they have to do is program it to, say, track the eye.”
With applications in essentially every sector and a chief executive as driven as Rizk at the helm, it seems a safe bet to assume Oculi will pull it off. And now with customers and orders in the pipeline, the focus is shifting to production, Rizk explains. Oculi has secured a strategic partnership and investment with a foundry to mass produce the first of three Oculi products planned for the next five years.
The spinout already has a handful of investors to help it get there: telecoms provider Mada, diversified engineering firm XOL Automation, semiconductor-focused incubator Silicon Catalyst and the CYNC programme, a partnership between aerospace and defence company Northrop Grumman and the Cyber Incubator@bwtech. Rizk is in the process of raising bridge financing while negotiating with multiple lead investors to close the seed round.