Ceva announced its fifth-generation vision platform, which includes the new Ceva-XM6 DSP chip, hardware accelerators, a neural network software framework, and a broad set of algorithms. The new platform will enable next-generation smartphones, autonomous cars, drones, and surveillance cameras to perform vision processing and machine learning operations two to eight times faster than the previous XM4-based platform.

As developers adopt machine learning more widely, demand is growing for chips that can perform vision processing and machine learning tasks on the device in real time. Internet connectivity may not make sense for all types of devices, and even when it does, the latency incurred by sending data to a company’s servers, processing it there, and returning the result may be too long for practical applications. For some tasks or types of devices, it may be preferable to do all of the computation locally and in real time. As vision processors improve, we should see this type of chip in more devices.

Smartphone companies may be the most interested in adopting vision processors because they tend to use coprocessors alongside a CPU and a GPU. Mobile device makers are looking to differentiate themselves from others through their computational capabilities. A high-performance vision coprocessor may help them stand apart from all the others that may only use Qualcomm or MediaTek SoCs for computation.

Ceva-XM6

The Ceva-XM6 DSP chip promises up to an 8x increase in performance for some hardware-accelerated tasks, a 3x increase in performance for vector-based computer vision kernels, and a 2x average performance gain across all kernels.

Ceva added the following technical improvements to provide the increased performance:

  • A new vector processing unit architecture ensuring above 95% MAC utilization, which the company claimed is unmatched in the industry
  • An enhanced Parallel Scatter-Gather Memory Load Mechanism, which further improves the performance of vision algorithms
  • Sliding Window 2.0, a mechanism that takes advantage of pixel overlap in image processing and helps to achieve higher utilization for a wider variety of neural networks
  • An optional 32-way SIMD vector floating-point unit that includes FP16 support and major enhancements for non-linear operations
  • An enhanced 3D data processing scheme for accelerated convolutional neural network (CNN) performance
  • A 50% improvement in control code performance versus the Ceva-XM4
  • A new scalar unit that further reduces code size
  • Multi-core and system integration support
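To illustrate the general idea behind exploiting pixel overlap, here is a minimal Python sketch. It is not Ceva's Sliding Window 2.0 implementation, whose details are not described here. The function names and the "load" counter are hypothetical, and the sketch only models a 1D filter over one row of pixels. It compares reloading every pixel for every window against keeping the previous window and fetching only the one new pixel per step.

```python
from collections import deque


def conv1d_naive(row, kernel):
    """Reload every pixel of every window. Returns (outputs, load_count)."""
    k = len(kernel)
    loads = 0
    out = []
    for start in range(len(row) - k + 1):
        window = row[start:start + k]  # k fresh loads for each output
        loads += k
        out.append(sum(p * w for p, w in zip(window, kernel)))
    return out, loads


def conv1d_sliding(row, kernel):
    """Reuse the overlapping pixels; load only one new pixel per step."""
    k = len(kernel)
    if len(row) < k:
        return [], 0
    window = deque(row[:k], maxlen=k)  # initial fill costs k loads
    loads = k
    out = [sum(p * w for p, w in zip(window, kernel))]
    for pixel in row[k:]:
        window.append(pixel)  # the oldest pixel drops out automatically
        loads += 1
        out.append(sum(p * w for p, w in zip(window, kernel)))
    return out, loads


if __name__ == "__main__":
    row = list(range(10))   # hypothetical pixel values
    kernel = [1, 2, 1]      # hypothetical 3-tap filter
    print(conv1d_naive(row, kernel))
    print(conv1d_sliding(row, kernel))
```

Both functions produce the same outputs, but the second touches each pixel only once. Real vision hardware applies the same reuse across two dimensions and many vector lanes, which is where the utilization gains would come from.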

Ceva also said that its new vision platform has a significant power efficiency lead compared to GPUs when implementing neural networks. We’ve recently seen evidence from other vision processors and Google’s TPU that GPUs are not leading the way in terms of neural network processing efficiency.

However, at least for now, GPUs are still the leading mainstream products that can fulfill the taxing performance requirements for fast neural network training. GPUs may not be the most efficient, but they tend to offer much higher performance in a single chip, and they are more readily available, which is why companies experimenting with machine learning still prefer them to the emerging alternatives.

Vision processors may still be better suited for low-power embedded devices, where efficiency, peak power consumption and price matter most.

Ilan Yona, vice president and general manager of the Vision Business Unit at CEVA, commented, “As computer vision and deep learning technologies become mainstream, there is a need to bridge the gap between the multi-layered and powerful deep neural networks that are being generated by power-consuming GPU engines and the ability to deploy these in power- and performance-constrained embedded applications. Our new vision platform excels in this regard, providing developers with the most comprehensive set of technologies to rapidly address these embedded use-cases.”

Ceva’s Vision Platform

Ceva’s vision platform functions alongside the Ceva-XM6 DSP, and it includes function-specific hardware accelerators for convolutional neural networks, along with image de-warp for all types of image transformations. Ceva’s software platform supports its recently unveiled CDNN2 neural network software framework, the Ceva-CV computer vision library, and the OpenCV, OpenCL, and OpenVX APIs. It also supports a set of widely used algorithms.

Ceva optimized the CDNN2 software framework for both the Ceva-XM6 and the CDNN accelerator. It supports 16-bit fixed-point precision, which Ceva said results in less than 1% degradation when running a network trained in a 32-bit floating-point environment. According to Ceva, this is critical for companies that want to transition neural networks from R&D (likely run on GPUs) to more cost- and power-efficient solutions, such as its own vision processor.
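As a rough illustration of what moving from floating point to 16-bit fixed point involves, the Python sketch below quantizes weights and inputs to signed 16-bit values and compares a dot product against the floating-point result. It does not reflect how CDNN2 actually converts networks. The function names, the choice of 12 fractional bits, and the sample values are all assumptions made for the example, and the error of a single dot product is not the same metric as the network-level degradation Ceva cited.

```python
def to_fixed16(x, frac_bits=12):
    """Quantize a float to a signed 16-bit fixed-point integer (saturating)."""
    q = int(round(x * (1 << frac_bits)))
    return max(-32768, min(32767, q))


def dot_float(weights, inputs):
    """Reference dot product in floating point."""
    return sum(w * x for w, x in zip(weights, inputs))


def dot_fixed16(weights, inputs, frac_bits=12):
    """Dot product on 16-bit fixed-point operands with a wide accumulator."""
    acc = 0  # Python ints are unbounded; hardware would use a wide register
    for w, x in zip(weights, inputs):
        acc += to_fixed16(w, frac_bits) * to_fixed16(x, frac_bits)
    # Each product carries 2 * frac_bits fractional bits.
    return acc / (1 << (2 * frac_bits))


if __name__ == "__main__":
    weights = [0.1, 0.2, 0.3]   # hypothetical layer weights
    inputs = [0.7, -0.4, 0.9]   # hypothetical activations
    ref = dot_float(weights, inputs)
    fixed = dot_fixed16(weights, inputs)
    print(ref, fixed, abs(fixed - ref) / abs(ref))
```

With 12 fractional bits, representable values are limited to roughly -8 to 8, so a real conversion tool would have to choose the scaling per layer to balance range against precision.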

The Ceva-XM6 and its vision platform components will be available to lead customers in the fourth quarter of this year, and general licensing will come in the first quarter of 2017.