What if machine learning applications on the edge were possible, pushing the limits of size and energy efficiency? GreenWaves is doing this, based on an open-source parallel ultra low power microprocessor architecture. Though it’s early days, implications for IoT architecture and energy efficiency could be dramatic.
The benefits open source offers in terms of innovation and adoption have earned it a place in enterprise software. We could even go as far as to say open source is becoming the norm in enterprise software. But open source hardware, chips to be specific, and AI chips to be even more specific? Is that a thing?
Apparently it is. GreenWaves, a startup based in Grenoble, France, is doing just that. GreenWaves is developing custom, ultra-low power specialized chips for machine learning. These specialized chips leverage parallelism and a multi-core architecture to run machine learning workloads at the edge, on battery-powered devices with extreme limitations. The chips GreenWaves makes are based on open source designs, and are making waves indeed.
GreenWaves just announced a €7 million Series A funding round with Huami, Soitec, and other investors. As per the announcement, the funds will finance the sales ramp of GreenWaves’ first product, GAP8, and the development of the company’s next-generation product. ZDNet spoke with Martin Croome, GreenWaves VP of Product Development, to find out what this is all about.
Open source microprocessors for IoT
First off, what does open source even mean when we’re talking about microprocessors? As Croome explained, what is open source in this case is the instruction set architecture (ISA), and the parallel ultra low power computing platform (PULP) that sits on top of it.
GreenWaves is a fabless chip maker. What this means is that it designs chip architectures and then outsources their fabrication to a hardware manufacturer. So, GreenWaves uses low-level building blocks, customizing them and combining them with its own extensions, to produce a proprietary design.
This approach is somewhat reminiscent of open core software: An open source core, with custom extensions that add value. The building blocks that GreenWaves is using are the RISC-V instruction set and PULP.
PULP is an open source parallel ultra-low power platform on which innovative chip designs can be created.
RISC-V is an open-source hardware ISA based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, and RISC-V can be used royalty-free for any purpose. RISC-V has many contributors and users in the industry. As Loic Lietar, GreenWaves’ co-founder and CEO, noted, the likes of Nvidia and Google also use RISC-V. This means RISC-V contributions grow, and anyone can benefit.
PULP is a parallel ultra-low power multi-core platform aiming to satisfy the computational demands of IoT applications requiring flexible processing of data streams generated by multiple sensors. PULP wants to meet the computational requirements of IoT applications, without exceeding the power envelope of a few milliwatts typical of miniaturized, battery-powered systems.
PULP started as a joint effort between ETH Zürich and the University of Bologna in 2013. GreenWaves sprang out of PULP in 2014, as its CTO and co-founder, Eric Flamand, was also a co-founder of PULP. Fast forward to today, and GreenWaves has 20 employees, has shipped a first batch of GAP8 chips to clients, and has raised a total of €11.5 million.
Real-time processing at the edge
Croome noted that GreenWaves needed much less capital than what chip startups usually need, which is mostly spent in getting IP rights for designs. GreenWaves did not have to do this, and this made financing easier. Or, as Lietar put it, a few years ago, when GreenWaves would mention open source chips, there was a good chance they would be thrown out of the room. Not anymore.
So, what’s special about GAP8, what can it be used for, and how?
GAP8 has an integrated, hierarchical architecture. It hosts 8 extended RISC-V cores and a HWCE (Hardware Convolution Engine). GreenWaves promises ultra-low power consumption, 20x better than the state of the art, on content understanding. Content, in this context, can mean anything from image to sound or vibration sensor input.
What GAP8 is designed to do is to process that data at the edge in real time, bypassing the need to collect and send for processing to some remote data center. In order to do this, it has to be fully programmable, agile, and have low installation and operation cost.
The agile part is there, as GAP8 can wake up from a sleep state in 0.5 milliseconds. As Croome noted, such chips deployed at the edge spend a big part of their lifetime actually doing nothing. So it was important to design something that sleeps while consuming as little power as possible, and then wakes up and switches modes of operation as quickly and efficiently as possible.
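To see why sleep power matters so much for a chip that mostly does nothing, here is a minimal back-of-the-envelope sketch of a duty-cycled power budget. All the figures in it (active power, sleep power, wake interval, battery capacity) are hypothetical values picked for illustration; they are not GAP8 specifications.

```python
def average_power_mw(active_mw, sleep_mw, active_s, period_s):
    """Time-weighted average power for a device that wakes once per period."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active_s must lie between 0 and period_s")
    sleep_s = period_s - active_s
    return (active_mw * active_s + sleep_mw * sleep_s) / period_s


def battery_life_days(capacity_mwh, avg_mw):
    """Idealized battery life, ignoring self-discharge and conversion losses."""
    return capacity_mwh / avg_mw / 24.0


# Hypothetical numbers, chosen only for illustration (not GAP8 datasheet values):
# the device is active for 100 ms once a minute and sleeps the rest of the time.
avg = average_power_mw(active_mw=50.0, sleep_mw=0.01, active_s=0.1, period_s=60.0)
days = battery_life_days(capacity_mwh=3000.0, avg_mw=avg)
print(f"average power: {avg:.4f} mW, idealized battery life: {days:.0f} days")
```

With a short active window, the sleep term takes up almost the entire period, which is why both a low sleep current and a fast wake-up show up directly in battery life.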
The low installation and operation cost is there, too, as GreenWaves promises years of operation on batteries or, even better, solar cells. As GAP8 can operate over wireless solutions such as LoRa, GreenWaves also promises a 10- to 100-fold cost reduction over wired installations.
So, what can GAP8 do? Clients are using GAP8 for things such as counting people or objects, vibration and sound analysis, object recognition, and more. Some areas of application are smart cities, industry, security, and consumer applications. The really interesting part, however, is in how it all works.
Deploying machine learning models
All these applications are based on using machine learning, and more specifically, neural networks. GAP8 takes care of the inference, which means the models have to be trained first, and then deployed on GAP8. And this is where it gets a bit tricky.
GAP8 is programmable via C or C++. So how does one get from a model built using TensorFlow, or PyTorch, and other machine learning libraries, to deployment on a GAP8? The software stack for this is open source and available on GitHub.
Examples exist for the development flow from TensorFlow to C. However, there are a couple of gotchas. First, GAP8 currently only works with TensorFlow. Croome said this is a matter of resources and priorities, and integration with other frameworks will be provided as well. For the time being, he added, what people do is port models created in other frameworks to TensorFlow via ONNX.
Then, if you’re expecting a one-click deployment, you’re in for a disappointment. As Croome explained, the flow is tools based rather than being monolithic. This means that a number of tools provided by GreenWaves have to be utilized in order to deploy models to GAP8.
Croome noted that “all the functionality of GAP8 is visible to the programmer but that we do provide pre-written and optimized code as a ‘starting block’ for getting something up and running quickly. The HWCE accelerates the convolution operation however like all hardware blocks it works on specific convolution types. If it doesn’t match a specific layer then this can always be accelerated on the cluster cores programmatically.”
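To give a rough feel for what one step in a model-to-C flow can look like, the sketch below quantizes a list of trained floating-point weights to 8-bit integers and prints them as a C array. This is not GreenWaves' tooling: the function names, the symmetric per-tensor quantization scheme, and the choice of 8-bit weights are all assumptions made for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus a scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def to_c_array(name, values):
    """Render integer values as a C array definition (hypothetical naming)."""
    body = ", ".join(str(v) for v in values)
    return f"static const signed char {name}[{len(values)}] = {{ {body} }};"


# Hypothetical weights standing in for one layer of a trained model.
layer_weights = [0.12, -0.48, 0.33, 0.0, -0.07]
q_values, q_scale = quantize_int8(layer_weights)
print(to_c_array("layer0_weights", q_values))
print(f"/* multiply by {q_scale:.6f} to recover approximate float values */")
```

A real flow would also have to handle activations, layer ordering, and memory placement, which is part of why it is split across several tools rather than being a single step.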
Bringing energy efficiency to IoT architecture
The important thing here, however, is the ability to effectively process data at the edge. With a processor like GAP8, Croome noted, one can analyze the content produced by a rich data sensor and only upload the outcome, for example how many people are in a room:
“This may well be uploaded into a time series database via an IoT Application platform (which may also only be hit after transmission over a low-speed LPWAN type network further minimizing data transfer). The energy spent in doing this analysis and the wireless transmission of the results, which can be seen as an ultimate compression, is far less than the wireless transmission of the raw data.”
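The "ultimate compression" argument can be made concrete with a small sketch comparing the energy of sending a raw sensor frame against running inference locally and sending only the result. The per-byte radio cost, the inference cost, and the frame size are hypothetical placeholders, not measurements of GAP8 or of any particular LPWAN radio.

```python
def radio_energy_mj(payload_bytes, energy_per_byte_mj):
    """Energy to transmit a payload, assuming a flat per-byte radio cost."""
    return payload_bytes * energy_per_byte_mj


def upload_comparison(raw_bytes, result_bytes, energy_per_byte_mj, inference_mj):
    """Compare sending raw data with analyzing locally and sending the result."""
    raw = radio_energy_mj(raw_bytes, energy_per_byte_mj)
    edge = inference_mj + radio_energy_mj(result_bytes, energy_per_byte_mj)
    return {"raw_mj": raw, "edge_mj": edge, "ratio": raw / edge}


# Hypothetical scenario: one 320x240 grayscale frame versus a 2-byte people count.
outcome = upload_comparison(
    raw_bytes=320 * 240,      # one byte per pixel
    result_bytes=2,           # e.g. a 16-bit counter
    energy_per_byte_mj=0.05,  # assumed radio cost per byte
    inference_mj=1.0,         # assumed cost of one on-device inference
)
print(outcome)
```

The flat per-byte model ignores packet overhead and retransmissions, but the shape of the result holds as long as the payload shrinks by orders of magnitude and inference stays cheap.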
Some of the applications low-power AI chips like GAP8 can be used for, simplifying IoT architecture
Although we have seen things such as deploying Hadoop at the edge, this would probably make little sense here. AI algorithms that operate on aggregate data from multiple sensors or access very large databases on the pre-compressed data are clearly better run on generic platforms on the edge (as opposed to very edge) or in the cloud, according to Croome.
“For a one in many face recognition application, the extraction of the key features would be run on GAP8 in the sensing device, the result would be uploaded and the matching would run in the cloud. This would be the best balance from a system point of view, for power consumption and from a SW engineering perspective,” Croome said.
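The split Croome describes can be sketched as follows: the sensing device uploads only a compact feature vector, and the cloud compares it against a gallery of known vectors. The feature vectors, the cosine-similarity metric, the threshold, and the function names are assumptions for illustration; the article does not say how matching is actually implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_in_cloud(embedding, gallery, threshold=0.8):
    """Cloud side: return the best-matching identity, or None below threshold."""
    best_name, best_score = None, threshold
    for name, reference in gallery.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical gallery stored in the cloud; real feature vectors would be longer.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}

# Hypothetical vector uploaded by the device after on-device feature extraction.
uploaded = [0.9, 0.1, 0.0]
print(match_in_cloud(uploaded, gallery))
```

The device never needs the gallery, and the cloud never needs the image, which keeps both the uplink and the on-device memory footprint small.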
Lietar said GreenWaves has been one step ahead of the market in identifying and serving this segment that is now widely recognized. Croome noted the state of the art in machine learning is evolving rapidly. He went on to add, however, that because GAP8 is not specialized, it can adapt well to new topologies and operators while retaining a best in class energy efficiency.
Innovation that leads to optimized energy efficiency and can simplify technical architecture – what’s not to like?