Today Microsoft is hosting a developer day, and one of the highlights it is showcasing is a new API called WinML. Artificial Intelligence and Machine Learning are two of the biggest trends in computing these days, but much of that compute is done in cloud datacenters, where custom servers use specialized hardware to improve performance and lower energy consumption. But as with everything in computing, the landscape is cyclical, and we’re once again looking at moving some of that compute back to edge devices, such as PCs, tablets, and IoT devices.

WinML is a new set of APIs that lets developers harness the full capabilities of any Windows 10 device to run pre-trained machine learning models, so that AI tasks can be moved from the cloud onto the device. Microsoft gives several reasons for doing so.


The first reason is performance. Despite the massive computing power available in the cloud, we still live in a world where moving data to the cloud can be prohibitive in terms of both cost and speed. The latency of any network connection is orders of magnitude higher than that of local memory access, and working on massive datasets can be difficult without expensive, dedicated, high-bandwidth interconnects. Performing the compute tasks locally can significantly improve performance thanks to the lower latency, and can deliver real-time results. Operational costs also drop, since less network bandwidth and less cloud-compute time are needed.
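To make the latency argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (frame rate, round-trip time, upload time, inference times) is a placeholder chosen only to show the arithmetic, not a measurement from Microsoft or anyone else, and the function names are invented for this illustration.

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Time available to process one frame at a given frame rate."""
    return 1000.0 / frames_per_second


def cloud_frame_ms(round_trip_ms: float, upload_ms: float, inference_ms: float) -> float:
    """Time for one frame when inference runs remotely and frames are handled one at a time."""
    return round_trip_ms + upload_ms + inference_ms


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    budget = frame_budget_ms(30)
    cloud = cloud_frame_ms(round_trip_ms=50.0, upload_ms=15.0, inference_ms=5.0)
    local = 20.0  # assumed on-device inference time

    print(f"budget per frame: {budget:.1f} ms")
    print(f"cloud path:       {cloud:.1f} ms (fits: {cloud <= budget})")
    print(f"local path:       {local:.1f} ms (fits: {local <= budget})")
```

With figures like these, the network hop alone can use up the whole per-frame budget even if the cloud hardware is faster at the inference step itself.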

Due to compliance and security concerns, many industries are not able to send their datasets to the cloud for machine learning and AI processing. WinML gives them an easy option to keep that work on local hardware, which makes it much easier to stay compliant with all necessary regulations.

An example provided by a Microsoft spokesperson was that of an industrial system where an image is taken of a product, and the system has to determine if that product is correct, or if it has a defect. This is an ideal workload for AI tools, such as Microsoft’s Computer Vision API. Moving this type of compute to an industrial PC, which may be offline, provides excellent performance without an ongoing cost, or the requirement of a cloud connection.

Microsoft’s new set of AI APIs offers several key benefits which should help developers integrate it into their products. Arguably the most important is that the API does all of the heavy lifting, so the developer doesn’t have to worry about what kind of hardware is available in any machine their app is going to run on. The WinML engine will leverage the hardware dynamically, and generate code to get the maximum performance available from whatever hardware the device has.

The engine is built on Direct3D, and if the system has a DX12-capable GPU, it will use DX12 compute shaders dynamically. If you have a massive GPU with plenty of VRAM, the workload will be offloaded to the GPU. If a DX12 GPU is not available, or performance is an issue because of integrated graphics, the engine can fall back to a CPU path, which also leverages all of the available silicon, with support for things like AVX-512. The code is generated at runtime, so it will always be optimized for the hardware available, and the engine will be updated over time to take advantage of new silicon such as Intel’s Movidius VPU. Perhaps even more impressive is that the ML engine will work on a new Snapdragon 835-based PC, or even on IoT devices. If you have lots of performance available, it will leverage that, but if the system is a low-power device, it will still work.
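The fallback order described above can be pictured as a small decision function. The following is only an illustrative sketch in Python, not WinML code: the `Device` fields, the `choose_backend` name, the backend labels, and the 2 GB VRAM threshold are all hypothetical, and the real engine makes this decision internally.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical description of a machine; not a WinML type."""
    has_dx12_gpu: bool
    gpu_is_integrated: bool = False
    vram_mb: int = 0
    has_avx512: bool = False


def choose_backend(dev: Device, min_vram_mb: int = 2048) -> str:
    """Pick an execution path in the order the article describes.

    The VRAM threshold is an assumed number, and integrated graphics are
    always treated as too slow here, which is a simplification.
    """
    # Prefer DX12 compute shaders on a capable discrete GPU.
    if dev.has_dx12_gpu and not dev.gpu_is_integrated and dev.vram_mb >= min_vram_mb:
        return "gpu-dx12-compute"
    # Otherwise fall back to the CPU, using wide vector units when present.
    if dev.has_avx512:
        return "cpu-avx512"
    return "cpu-generic"


if __name__ == "__main__":
    machines = {
        "workstation": Device(has_dx12_gpu=True, vram_mb=8192),
        "ultrabook": Device(has_dx12_gpu=True, gpu_is_integrated=True, has_avx512=True),
        "iot board": Device(has_dx12_gpu=False),
    }
    for name, dev in machines.items():
        print(name, "->", choose_backend(dev))
```

The point of the sketch is that the application never makes this choice itself; it hands the model to the engine and the engine picks the path.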

A key point here is that the WinML API is meant for models that have already been trained. The training itself would be done ahead of time, and then the models and data are supplied to the engine, which performs the analysis and returns probabilities.
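To illustrate what "returning probabilities" means in practice: a classification model typically emits one raw score per class, and a softmax step turns those scores into probabilities. The sketch below is generic Python, not the WinML API; the class labels and the scores are made up for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the largest score first for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return (label, probability) for the most likely class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    labels = ["ok", "defect"]   # hypothetical class names
    raw_scores = [0.3, 2.1]     # made-up output of a pre-trained model
    label, prob = classify(raw_scores, labels)
    print(f"{label}: {prob:.2f}")
```

An application such as the product-inspection example would then act on the top label, perhaps only when its probability clears a threshold it chooses.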

The new API will use the industry-standard ONNX format for ML models. ONNX is driven by Microsoft, Facebook, and Amazon, with support from all the key hardware players such as AMD, Intel, and NVIDIA. Models can be trained using the Azure Machine Learning Workbench and exported directly from Azure as ONNX, then brought into Visual Studio, which will create a wrapper for the engine to use. Visual Studio Preview 15.7 will support this, and earlier versions can use the MLGen tool to add the wrapper to projects manually.

Microsoft was candid enough to state that the idea for WinML only really gained traction around May of 2017, and that work on the new API began in August of last year, so in just a few months the company has gone from an idea to a new ML toolkit it is bringing to developers. That timeline is fairly impressive, and Microsoft credited the capabilities built into DX12, as well as the high-quality drivers, for allowing the team to deploy this at scale in such a short time.

AI and ML are two of the big keywords in the development and hardware industries these days, and WinML and the AI Platform for Windows developers should help bring those capabilities to even more tasks.

Source: Microsoft