In my last post I promised to write about why we need intelligence on the edge, or on a chip. Here are my thoughts on the subject.
The functions performed by intelligent video broadly fall into 3 categories –
1. Event/object detection
2. Image/video quality and visualization enhancement
3. Image/video compression
I’ll address the need for each of these in a separate post. In this post, we will focus on why and when it makes sense to have these intelligent video functions on the edge (in embedded form).
Event detection and object detection are often applied in surveillance or monitoring applications. Let’s assume we have thirty 1MP cameras (not unusual) on which we want to run analytics. In a traditional surveillance setup you’d need, at 15 fps, 15 Mbps of bandwidth for each camera, which comes to 450 Mbps across 30 cameras. That is a serious network to lay out (Gigabit Ethernet). Even if the video is encoded before it is sent, which reduces the bandwidth, the server still needs significant CPU to decode it for analytics, because analytics run on raw image/video frames and not on encoded video. With analytics on the edge, only the pre-event and post-event buffer, with the relevant frame segments or down-sampled frames, needs to be transmitted as and when events happen. In addition, this frame buffer can be encoded before it is sent, since the raw video has already been analyzed right at the edge. In the best case this means sending just one down-sampled, encoded frame containing the event/object, which could be as little as a few kilobytes of data. A world of difference if you ask me! If I am looking for a needle, I’d rather be given a needle than the proverbial haystack.
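The pre-event and post-event buffering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular camera firmware does it: the class name `EdgeEventBuffer`, the `pre` and `post` parameters, and the `is_event` flag (standing in for the output of whatever detector runs on the device) are all hypothetical.

```python
from collections import deque


class EdgeEventBuffer:
    """Hypothetical sketch: keep the last `pre` frames, and when an event
    fires, emit them plus the event frame plus `post` following frames."""

    def __init__(self, pre, post):
        self.pre_frames = deque(maxlen=pre)  # rolling pre-event history
        self.post = post                     # frames to keep after an event
        self.remaining_post = 0              # post-event frames still owed
        self.clip = None                     # clip being assembled, if any

    def push(self, frame, is_event):
        """Feed one frame; return a finished clip (list of frames) or None."""
        if self.clip is None:
            if is_event:
                # Start a clip from the buffered history and the event frame.
                self.clip = list(self.pre_frames) + [frame]
                self.remaining_post = self.post
                self.pre_frames.clear()
            else:
                # Nothing happening: remember the frame, transmit nothing.
                self.pre_frames.append(frame)
        else:
            self.clip.append(frame)
            if is_event:
                self.remaining_post = self.post  # a new event extends the clip
            else:
                self.remaining_post -= 1
        if self.clip is not None and self.remaining_post == 0:
            done, self.clip = self.clip, None
            return done
        return None


# Ten frames, with an event detected only on frame 5.
buf = EdgeEventBuffer(pre=2, post=2)
clips = []
for i in range(10):
    clip = buf.push(i, is_event=(i == 5))
    if clip is not None:
        clips.append(clip)
print(clips)
```

In this toy run only five of the ten frames ever leave the device, and in practice each of them could be down-sampled and encoded before it is sent.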
Image quality enhancement and visualization are invaluable for providing a superior video/image viewing experience. Achieving these in embedded form eliminates or reduces the need for offline processing of video data: you get high-quality output right out of the camera. In some cases, at the time of recording or viewing, the visual feed may need to be processed in real time for -
1. removing artifacts such as blur
2. increasing the field of view by mosaicking inputs from 2 sensors on the same device
3. converting a video shot to an image sequence with a single click
4. merging two frames across time to create one composite image
Obviously, you start to create more and more sophisticated cameras by building intelligence into the embedded software instead of trying to get there through the design of optics, lighting and a whole host of other engineering aspects. I’d go out on a limb and predict that the future of smart cameras (and I believe most cameras will become smart cameras in time) will be driven largely by our ability to pack a lot of intelligence into the software on the chip.
Compression is another function that needs to be on the edge, since video has to be compressed before it leaves the camera if it is to save any bandwidth. There are two reasons for compression: reduction in bandwidth and reduction in storage. Bandwidth, at least, will continue to be at a premium for megapixel video, and hence compression methods that go beyond the current state of the art are definitely around the corner.
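To put rough numbers on the bandwidth and storage argument, here is a back-of-the-envelope sketch. It reuses the figures from the surveillance example earlier in this post (30 cameras at 15 Mbps each); the function names and the 20:1 compression ratio are purely illustrative assumptions, not measurements of any codec.

```python
def aggregate_mbps(num_cameras, per_camera_mbps, compression_ratio=1.0):
    """Total network load in Mbps when every camera streams continuously."""
    return num_cameras * per_camera_mbps / compression_ratio


def daily_storage_tb(num_cameras, per_camera_mbps, compression_ratio=1.0):
    """Terabytes (10**12 bytes) written per day of continuous recording."""
    bits_per_second = aggregate_mbps(
        num_cameras, per_camera_mbps, compression_ratio) * 1e6
    bytes_per_day = bits_per_second * 86_400 / 8  # 86,400 seconds in a day
    return bytes_per_day / 1e12


# Figures from the example in this post; the 20:1 ratio is an assumption.
for ratio in (1.0, 20.0):
    print(ratio,
          aggregate_mbps(30, 15, ratio),
          round(daily_storage_tb(30, 15, ratio), 3))
```

Whatever ratio a given codec actually achieves, the network load and the disk space shrink by the same factor, which is why both reasons for compression point the same way.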
At Mamigo we are doing path-breaking work to make this a reality, and we are working with partners to bring these capabilities to market. This is great work that requires a combination of a deep understanding of computer vision, an algorithmic mindset, great software development talent, agile development techniques and an intimate understanding of DSPs.