Monday, December 29, 2008
Robust motion detection on a DSP
Building robust motion detection capabilities involves minimizing false positives while handling factors like camera motion, occlusions, changes in lighting and other routine changes in the scene. This requires algorithmic innovation along with robust and highly optimized implementations on a DSP platform, so that detection can run in real time at full resolution.
A sample video of our robust motion detection implementation on a DSP is embedded below
Monday, October 13, 2008
Video Stabilization - what is new ?
Some of you want to know how our video stabilization is better than the image stabilization technology used in the latest digital cameras and camcorders. We have looked at existing image stabilization technologies and studied their limitations. A vast majority of consumer cameras use optical image stabilization, which compensates for unwanted camera motion using motion sensors and an active optical system. This approach is potentially powerful, but it makes video cameras significantly more expensive and more failure-prone due to the moving parts (the lens or image sensor has to be moved to compensate for detected motion). There are also problems of detecting, measuring and compensating for rotational motion about the optical axis, which most optical stabilization solutions do not address.
Often, hybrid methods are used where motion detection and measurement are done using sensors, but the compensation is done in software. Our approach falls into the digital video stabilization category, where the acquired frames are processed in real time to estimate motion (translational and rotational) and compensate for it using image processing techniques. Within this category our technology employs proprietary algorithms that provide superior stabilization, and does so on a DSP.
A few problems with optical stabilization are that it is -
1. Not modular - harder to integrate into a wide range of cameras as it is an electromechanical solution
2. Expensive
3. Inflexible, and hence quality is an issue. While optical stabilization can provide great results in a controlled environment, it is more likely to give unpredictable results in an unconstrained environment. Rotational motion is not compensated for in most optical stabilization technologies.
A software-on-a-chip approach is limited only by the processing power of the DSP and provides a high level of flexibility, so it can be integrated into virtually any solution of any form factor. Digital video stabilization is an architecture choice that pays off in both the short and the long term. Optical stabilization will continue to be used in high-end cameras for professionals. Digital stabilization will continue to gather momentum and acquire larger chunks of the market.
We have compared our results with those of other stabilization solutions and found ours to come out on top, with rock-solid stabilization that compensates for all random motion, both translational and rotational. Our software can also ignore smooth and continuous motion, as seen in panning or in the normal movement of the person carrying the camera.
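One common way to ignore smooth motion while cancelling jitter is to smooth the estimated camera path and nudge each frame toward the smoothed path. The sketch below is a minimal, single-axis illustration under that assumption. It is not the proprietary algorithm mentioned above, and the function names, the moving-average filter and the `radius` parameter are illustrative choices only.

```python
def smooth_trajectory(path, radius):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    n = len(path)
    smoothed = []
    for i in range(n):
        r = min(radius, i, n - 1 - i)
        window = path[i - r:i + r + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def stabilizing_offsets(frame_shifts, radius=2):
    """Per-frame corrections (in pixels) along one axis.

    frame_shifts: estimated camera shift between consecutive frames.
    The shifts are accumulated into a camera path, the path is smoothed,
    and each correction moves the frame from the raw path to the smooth one.
    """
    path = []
    position = 0.0
    for shift in frame_shifts:
        position += shift
        path.append(position)
    smoothed = smooth_trajectory(path, radius)
    return [s - p for s, p in zip(smoothed, path)]


# A steady pan of 2 pixels per frame is left alone: every correction is zero.
pan_only = stabilizing_offsets([2.0] * 10)

# The same pan with +1/-1 pixel jitter on top gets corrections that oppose the jitter.
pan_with_jitter = stabilizing_offsets([3, 0, 4, 0, 4, 0, 4, 0])
```

Rotation could be handled the same way by treating the estimated rotation angle as a second series. A real-time version would need a causal filter instead of a centered window, since future frames are not available.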
While at a capability level we strive to provide better solutions, the power of the approach also comes from being able to combine multiple intelligent video capabilities to create user value, for example combining stabilization with motion detection. This allows cameras to detect motion and stabilize the image around the region of interest. To reiterate, DSP-based solutions are more malleable and can be molded to create user value very rapidly.
Friday, October 10, 2008
mAmigo Video Stabilization Sample Video
Recently, we tested our video stabilization implementation on videos shot from my Sony camcorder. Our results were quite remarkable across a wide data set. I am posting one sample video from this data set and look forward to comments from folks.
Saturday, September 20, 2008
Intelligent video on the edge – is it really needed?
In my last post I had committed to penning my thoughts on why we need intelligence on the edge or on a chip. So here are my thoughts on the subject.
The functions performed by intelligent video broadly fall into 3 categories –
1. Event/object detection
2. Image/video quality and visualization enhancement
3. Image/video compression
I’ll address the need for each of these in a separate post. In this post, we will focus on why and when it makes sense to have these intelligent video functions on the edge (in embedded form).
Event detection and object detection are often applied in the context of surveillance or monitoring applications. Let’s assume we have thirty 1MP cameras (not unusual) on which we want to run analytics. In a traditional surveillance setup you’d need, at 15 fps, 15 Mbps of bandwidth for each camera, which comes to 450 Mbps across 30 cameras. Now, that is a serious network to lay out (Gigabit Ethernet). Even if the video were sent encoded, reducing the bandwidth, there would be significant CPU requirements on the server for decoding it for analytics. Note that analytics are performed on raw image/video footage and not on encoded video. With analytics on the edge, only a pre-event and post-event buffer with the relevant frame segments or down-sampled frames needs to be transmitted, as and when events happen. In addition, encoding can be performed on this frame buffer, as the raw video has already been analyzed (right at the edge). This could mean sending just one down-sampled, encoded frame containing the event/object, which could be as little as a few kilobytes of data. A world of difference if you ask me! If I am looking for a needle, I’d rather be given a needle than the proverbial haystack.
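The bandwidth arithmetic above can be sketched in a few lines. The bits-per-pixel value is an assumption: the 15 Mbps per camera figure corresponds to roughly one bit per pixel on average, while an uncompressed 8-bit stream would need eight times that. The event size and event rate in the edge case are hypothetical numbers chosen only for illustration.

```python
def stream_mbps(megapixels, fps, bits_per_pixel):
    """Bandwidth of one camera stream in megabits per second."""
    return megapixels * fps * bits_per_pixel


CAMERAS = 30

# Figures used in the post: 1MP at 15 fps, about 1 bit per pixel on average.
per_camera_mbps = stream_mbps(megapixels=1, fps=15, bits_per_pixel=1)  # 15
total_mbps = CAMERAS * per_camera_mbps                                  # 450

# The same cameras streaming uncompressed 8-bit pixels (assumption).
raw_total_mbps = CAMERAS * stream_mbps(megapixels=1, fps=15, bits_per_pixel=8)  # 3600

# Edge analytics: send one small encoded frame per event instead of a stream.
EVENT_KILOBYTES = 5    # hypothetical size of one down-sampled, encoded frame
EVENTS_PER_HOUR = 10   # hypothetical event rate per camera
edge_total_kbps = CAMERAS * EVENT_KILOBYTES * 8 * EVENTS_PER_HOUR / 3600
```

With these assumed numbers the edge case needs only a few kilobits per second across all thirty cameras, compared with hundreds of megabits per second for continuous streaming.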
Image quality enhancement and visualization are invaluable for providing a superior video/image viewing experience. Achieving these in embedded form eliminates or reduces the need for offline processing of video data. You get high quality output right out of the camera. In some cases, at the time of recording or viewing, there may be a need to process the visual feed in real time for -
1. removing artifacts such as blurring
2. increasing the field of view by mosaicing inputs from 2 sensors on the same device
3. converting a video shot to an image sequence with a single click
4. merging two frames across time to create one composite image
Obviously, you start to create more and more sophisticated cameras by building intelligence into the embedded software instead of trying to redesign optics, lighting and a whole host of other engineering aspects. I’d go out on a limb and predict that the future of smart cameras (and I believe most cameras will become smart cameras in time) will be driven largely by our ability to pack a lot of intelligence into the software on the chip.
Compression is another area that will need to be on the edge for obvious reasons. There are two reasons for compression - reduction in bandwidth and reduction in storage. At least bandwidth will continue to be at a premium when we look at Megapixel video and hence compression methods that go beyond current state-of-the-art are definitely round the corner.
At Mamigo we are doing path-breaking work in making this a reality and we are working with partners to bring these capabilities to market. This is great work that requires a combination of deep understanding of computer vision, an algorithmic mindset, great software development talent, agile development techniques and an intimate understanding of DSPs.
Tuesday, August 19, 2008
Why megapixel ?
More pixels mean more detail is captured from a scene. Just as the human eye is better able to appreciate images and videos at higher resolution, video and image processing software has more data, and hence more potentially useful information, to extract from higher resolution images. Let’s take the simple example of face detection analytics. Face detection algorithms typically require faces to be at least 24x24 pixels in the image. Now consider a 640x480 (VGA resolution) sensor. Say there is a 6 ft tall human at a distance of 15 feet from this sensor. Assuming that the field of view of this sensor is 90 degrees, the coverage at a distance of 15 ft from the sensor will be a 30 ft x 30 ft plane. This means that 640 pixels will cover 30 ft in width and 480 pixels will cover 30 ft in height. A 1 ft high face (hello long face!) will hence get 480/30 = 16 pixels in height, short of the 24 required. Clearly, detecting faces at a distance of 15 feet or more will be impossible with analytics based on a VGA resolution sensor. If we were instead to use a 1 or 2 megapixel sensor, it would be possible to do face detection at larger distances. In fact I will leave this as a simple exercise for our readers.
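For readers who want to try other sensors, the calculation above can be written as a small sketch. The function names are illustrative, and the model simply assumes that the stated field of view applies along the axis being measured and that pixels are spread evenly across the coverage plane.

```python
import math


def pixels_on_target(sensor_pixels, fov_degrees, distance_ft, target_ft):
    """Pixels spanned by a target of size target_ft at the given distance."""
    coverage_ft = 2 * distance_ft * math.tan(math.radians(fov_degrees) / 2)
    return sensor_pixels * target_ft / coverage_ft


def max_distance_ft(sensor_pixels, fov_degrees, target_ft, min_pixels):
    """Farthest distance at which the target still spans min_pixels."""
    half_fov_tan = math.tan(math.radians(fov_degrees) / 2)
    return sensor_pixels * target_ft / (2 * half_fov_tan * min_pixels)


# VGA example from the text: 480 rows, 90 degree field of view, 1 ft face at 15 ft.
vga_face_pixels = pixels_on_target(480, 90, 15, 1)   # about 16 pixels

# Distance at which the same face still spans the 24 pixels a detector needs.
vga_limit_ft = max_distance_ft(480, 90, 1, 24)       # about 10 ft
```

Plugging in the row count of a 1 or 2 megapixel sensor in place of 480 gives the corresponding detection range.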
As you can see there is a direct correlation between the distance you can cover from a sensor for the purpose of analytics and the resolution of the camera as analytics require a minimum number of pixels to compute discriminating features relevant for those analytics. In a nutshell, megapixel analytics provide better solutions to existing problems and open up more possibilities for other applications areas – some of which we will cover in line with our product roadmap.
The next question is why we need to do this at the edge. We will cover that in the next post.
Vijay
Monday, August 18, 2008
Let the blogging begin !!
We will use this blog to share our experiences, learnings, product direction and challenges with our audience in as much detail as possible. The goal of the blog is to provide material of interest in a short and simple fashion that allows us to create a conversation with our community. It is the explorer in us that makes us push the boundaries of current technology. We know that we ride on the shoulders of the giants before us and we intend to provide that small step forward for those who work with us and will work after us. So this is a blog for our customers, partners, users, competitors, teachers, students and the rest.
As a company we are focussed on developing megapixel intelligent video products and components on embedded platforms. The challenges that we intend to address start with a specific problem to be solved, such as traditional surveillance-centric analytics. We address these algorithmic challenges using the state-of-the-art techniques available and improving them at an algorithm level with newer ideas, in line with performance goals. We then port this to our embedded platform, leveraging a wide range of embedded optimization techniques including
- Tools provided by the DSP for efficient memory and CPU management
- Using a cascading set of filters on video frames that progressively apply to a narrower region of the image (fewer pixels to work with)
- Using image and data representations in data structures that improve the performance of specific algorithms on the embedded platform
- Implementing some of the core image processing functions in assembly language
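To make the cascading-filter idea concrete, here is a minimal two-stage sketch. It is an assumption-laden illustration rather than our production code: the stages, thresholds and function names are hypothetical, and frames are plain lists of rows of grayscale values. A cheap block-level test runs over the whole frame, and the more expensive per-pixel test runs only inside the blocks that survive.

```python
def block_means(frame, block):
    """Mean intensity of each block x block tile, keyed by the tile's top-left corner."""
    means = {}
    for top in range(0, len(frame), block):
        for left in range(0, len(frame[0]), block):
            tile = [v for row in frame[top:top + block] for v in row[left:left + block]]
            means[(top, left)] = sum(tile) / len(tile)
    return means


def changed_pixels(prev, curr, block=4, coarse_thresh=5.0, pixel_thresh=20):
    """Two-stage cascade for finding pixels that changed between two frames.

    Stage 1 compares block means over the whole frame.
    Stage 2 compares individual pixels, but only inside blocks that passed stage 1.
    """
    prev_means = block_means(prev, block)
    curr_means = block_means(curr, block)
    hits = []
    for (top, left), mean in curr_means.items():
        if abs(mean - prev_means[(top, left)]) < coarse_thresh:
            continue  # block rejected by the coarse stage, no per-pixel work done
        for y in range(top, min(top + block, len(curr))):
            for x in range(left, min(left + block, len(curr[0]))):
                if abs(curr[y][x] - prev[y][x]) >= pixel_thresh:
                    hits.append((y, x))
    return hits


# 8x8 example: one strong change and one faint change in different blocks.
previous = [[0] * 8 for _ in range(8)]
current = [row[:] for row in previous]
current[1][2] = 200   # strong change, survives both stages
current[5][6] = 10    # faint change, its block is rejected at stage 1
detected = changed_pixels(previous, current)
```

On an embedded platform the coarse stage would typically work on a down-sampled frame that is already available, so that its cost stays small relative to the per-pixel stage.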
At each step the implementation is tested against the performance metrics set for the problem at hand. We have been working on this for some time and continue to make huge improvements in the time it takes us to bring new implementations to market. So let the journey begin...
Vijay