
Sensing, Communicating, and Classifying with Spikes

Or how to remotely classify data with 80% accuracy and zero latency at Signal-to-Noise Ratios (SNRs) as low as −8 dB.

Problem
The development of Internet of Things (IoT) systems, with applications ranging from personal healthcare and wearable devices to drone-based monitoring, is driving research efforts on edge-based machine learning. In such systems, data may be collected by battery-powered sensors and processed at a remote device, which may itself be energy-constrained. Standard hardware implementations pose major energy and latency limitations for such applications.
In our new paper, recently accepted at Asilomar 2020, we investigate a novel solution based on neuromorphic sensors, processors, and transceivers. In neuromorphic sensors, spikes (i.e., binary signals) mark the occurrence of a relevant event, e.g., a significant change in a pixel for a neuromorphic camera. Extremely low energy is consumed when the monitored scene is idle. In neuromorphic processors, known as Spiking Neural Networks (SNNs), spiking signals are processed via dynamic neural models for the detection of spatio-temporal patterns. SNNs have recently emerged as a biologically plausible alternative to conventional Artificial Neural Networks (ANNs), with significant benefits in terms of energy efficiency and latency. Finally, for communications, pulses, or spikes, can encode information for radio signalling via low-power Impulse Radio (IR). Commercial products are available for all these blocks, including DVS cameras, Intel’s Loihi SNN chip, and transceivers implementing the IEEE 802.15.4z IR standard.

System model
As seen in Fig. 1, the proposed system integrates neuromorphic sensing and processing with IR transmission, and it carries out Joint Source-Channel Coding (JSCC), i.e., it performs source and channel coding in a single step. The signal sensed by the neuromorphic sensor, e.g., a DVS camera, is encoded as a vector of binary spiking signals and processed by an encoding SNN. The encoding SNN implements a probabilistic mapping determined by its parameter vector. Its output is modulated using parallel IR transmissions, with each spike encoded by an IR waveform such as a Gaussian monopulse. The channel is modeled as a frequency-flat Gaussian channel. Finally, the received signals are classified via a decoding SNN whose output can be interpreted as a class index using standard methods for SNN-based classification. For example, rate decoding predicts a class by selecting the neuron in the output layer with the largest number of spikes.
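Rate decoding, mentioned at the end of the paragraph above, is simple enough to sketch in a few lines. The snippet below is only an illustration: the function name `rate_decode` and the toy spike raster are hypothetical and are not taken from the paper's code.

```python
from typing import List


def rate_decode(output_spikes: List[List[int]]) -> int:
    """Return the index of the output neuron that spiked most often.

    output_spikes[t][k] is 1 if output neuron k spiked at time step t, else 0.
    Ties are resolved in favor of the lowest neuron index.
    """
    num_classes = len(output_spikes[0])
    counts = [sum(step[k] for step in output_spikes) for k in range(num_classes)]
    return max(range(num_classes), key=lambda k: counts[k])


# Hypothetical output raster: 4 time steps, 3 output neurons (one per class).
spikes = [
    [0, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
predicted = rate_decode(spikes)  # spike counts are [1, 3, 2], so class 1 is chosen
```

Because the spike counts can be read out after any number of time steps, a decision is available at every step rather than only at the end of a frame.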
The proposed system in Fig. 1, termed NeuroJSCC, is trained by maximizing the log-likelihood that the decoding SNN outputs desired spiking signals in response to a given input. Details on the training procedure, and the resulting algorithm, can be found in the preprint.

Experiments

To illustrate the advantage of the system, we focus on an example consisting of the remote detection of handwritten digits recorded by a neuromorphic camera.
We compare NeuroJSCC to two benchmark schemes:
1) Uncoded transmission: The observation is directly transmitted through the Gaussian channel using On-Off Keying (OOK), and classified using an SNN.
2) Separate Source-Channel Coding (SSCC): The encoder applies state-of-the-art quantization based on the Vector Quantization Variational Autoencoder (VQ-VAE) scheme, followed by LDPC encoding. The spiking signal is encoded as frames, and the scheme is applied separately to each one of them. At the decoder side, frames are decoded using the Belief Propagation algorithm, decompressed using VQ-VAE decoding, and then classified. We consider two different classifiers, namely traditional ANN and SNN.
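To give a feel for why the uncoded benchmark in item 1 degrades at low SNR, the following sketch sends a sparse binary spike train with on-off keying over additive Gaussian noise and counts bit errors. It is a toy model under stated assumptions: the SNR is taken as unit pulse energy over noise variance, the receiver uses a fixed threshold of 0.5, and no IR waveform is modeled. The paper's setup may define these quantities differently.

```python
import math
import random


def ook_awgn(bits, snr_db, rng):
    """Send bits with on-off keying (1 = pulse, 0 = silence) over Gaussian noise.

    Assumption: SNR = (unit pulse energy) / (noise variance).
    The receiver thresholds each received sample at 0.5.
    """
    noise_std = math.sqrt(1.0 / (10 ** (snr_db / 10)))
    received = [b + rng.gauss(0.0, noise_std) for b in bits]
    return [1 if r > 0.5 else 0 for r in received]


def bit_error_rate(snr_db, n_bits=20000, seed=0):
    """Empirical fraction of flipped bits for a sparse (10% spikes) input."""
    rng = random.Random(seed)
    bits = [1 if rng.random() < 0.1 else 0 for _ in range(n_bits)]
    decoded = ook_awgn(bits, snr_db, rng)
    return sum(b != d for b, d in zip(bits, decoded)) / n_bits


ber_high = bit_error_rate(10.0)   # relatively clean channel
ber_low = bit_error_rate(-8.0)    # the low-SNR regime discussed in the post
print(f"BER at 10 dB: {ber_high:.3f}, BER at -8 dB: {ber_low:.3f}")
```

At −8 dB a large fraction of the spikes is corrupted under this model, which is consistent with the sharp accuracy drop reported for uncoded transmission.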
In Fig. 3, we evaluate the test accuracies at convergence obtained for different levels of SNR and the different schemes. The accuracy of Uncoded transmission drops sharply at sufficiently low SNR levels. In contrast, NeuroJSCC maintains a test accuracy of 80%, even at an SNR level as low as −8 dB. SSCC with an SNN as classifier suffers the most from the degradation of the SNR. Using an ANN proves more robust to low SNR levels, since an ANN can benefit from the non-binary outputs of the VQ-VAE decoder without further loss of information due to binary quantization.
We refer to the main text for further experiments and analysis.
Code will be released shortly on our GitHub page.

Coding and Lazy Aggregation for Robust and Efficient Distributed Learning

 

Figure 1: Parameter Server (PS) computing architecture.

Problem Overview:

In order to scale machine learning so as to cope with large volumes of input data, distributed implementations of gradient-based methods, e.g., Gradient Descent (GD), that leverage the parallelism of first-order optimization techniques are commonly adopted. To run GD, as illustrated in Fig. 1, multiple parallel workers compute the gradients, and the Parameter Server (PS) iteratively aggregates the computed gradients and communicates the updated parameter back to the workers. In the process, the PS computing architecture is subject to two key impairments. First, the potentially heavy tail of the distribution of the computing times at the workers can cause significant slowdowns in wall-clock run-time per iteration due to straggling workers. Second, the communication overhead resulting from intensive two-way communications between the PS and the workers may require significant networking resources to be available in order not to dominate the overall run-time.
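The straggler effect described above can be illustrated with a small simulation. The sketch below draws random computing times for the workers and compares the average time needed to hear from all of them with the time needed to hear from only the fastest ones. The distribution parameters, the number of workers, and the function name `mean_wait_times` are assumptions chosen for illustration, not values from the paper.

```python
import random


def mean_wait_times(sample_time, num_workers, wait_for, iterations, rng):
    """Average time until the k fastest workers finish, for each k in wait_for."""
    totals = [0.0] * len(wait_for)
    for _ in range(iterations):
        times = sorted(sample_time(rng) for _ in range(num_workers))
        for i, k in enumerate(wait_for):
            totals[i] += times[k - 1]  # finishing time of the k-th fastest worker
    return [t / iterations for t in totals]


M = 10  # number of workers (assumed)
distributions = {
    "Pareto": lambda r: r.paretovariate(2.0),     # heavy tail, mean 2
    "exponential": lambda r: r.expovariate(0.5),  # light tail, mean 2
}

rng = random.Random(0)
results = {}
for name, sample in distributions.items():
    wait_all, wait_most = mean_wait_times(sample, M, [M, M - 2], 2000, rng)
    results[name] = (wait_all, wait_most)
    print(f"{name}: all {M} workers {wait_all:.2f}, fastest {M - 2} workers {wait_most:.2f}")
```

Plain synchronous GD pays the "all workers" time in every iteration, whereas schemes that tolerate a few stragglers only pay the smaller figure.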

To jointly address these impairments, in a recent work just published in IEEE Transactions on Neural Networks and Learning Systems, we study the performance of coding and lazy aggregation techniques for the PS architecture in terms of wall-clock run-time complexity, communication complexity, and computation complexity.

Main Results:

To explore the trade-off among wall-clock time, communication, and computation requirements, we provide a unified analysis of the techniques of gradient coding (GC), worker grouping, and adaptive worker selection, also known as Lazily Aggregated Gradient (LAG), whose relative merits are summarized in Table I. Both GC and grouping are full-gradient approaches that aim at increasing robustness to stragglers by leveraging storage and computation redundancy. Thanks to coding, with GC, only a given number of workers, dependent on the computing redundancy, need to finish their computations and send their encoded computed gradients to the PS at each iteration in order to retrieve the gradient. Grouping applies data duplication and coding to groups of workers. In contrast, LAG is an approximate gradient descent scheme that judiciously selects a subset of active workers at each iteration in order to reduce communication and computation loads. By integrating all the techniques, we propose a novel strategy, named Lazily Aggregated Gradient Coding (LAGC), that aims at exploring the trade-off between the robustness to stragglers of GC and the computation and communication efficiency of LAG by generalizing both schemes.
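The idea behind GC, that the PS can recover the full gradient without waiting for every worker, can be seen in a standard three-worker toy construction from the gradient coding literature. The sketch below is not the scheme analyzed in the paper: each worker stores two of three data partitions and sends one coded combination, the gradient values are made up, and the helper `combine` is hypothetical.

```python
from itertools import combinations

# Partial gradients on three data partitions (hypothetical values).
g1, g2, g3 = [1.0, 2.0], [0.5, -1.0], [3.0, 0.0]
full_gradient = [a + b + c for a, b, c in zip(g1, g2, g3)]


def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i], computed entry-wise."""
    return [sum(c * v[i] for c, v in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]


# Each worker holds two partitions and uploads a single coded vector.
encoded = {
    1: combine([0.5, 1.0], [g1, g2]),    # g1/2 + g2
    2: combine([1.0, -1.0], [g2, g3]),   # g2 - g3
    3: combine([0.5, 1.0], [g1, g3]),    # g1/2 + g3
}

# Decoding coefficients for every pair of workers that may finish first.
decoders = {
    (1, 2): (2.0, -1.0),
    (1, 3): (1.0, 1.0),
    (2, 3): (1.0, 2.0),
}

recovered = {}
for pair in combinations([1, 2, 3], 2):
    recovered[pair] = combine(decoders[pair], [encoded[pair[0]], encoded[pair[1]]])
    print(pair, recovered[pair], "target:", full_gradient)
```

Any two of the three uploads suffice, so one straggler can be ignored in every iteration, at the price of each worker computing on twice as much data.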

Figure 2: Time, communication, and computation complexity measures under the Pareto distribution.

Figure 3: Time, communication, and computation complexity measures under the exponential distribution.

As a special case, we also introduce a scheme that only uses grouping and adaptive selection, which is referred to as G-LAG. For illustration, we consider a linear regression model under two representative distributions of the computing times at the workers, namely the Pareto distribution and the exponential distribution, which model heavy-tailed and light-tailed computing times, respectively. Time, communication, and computation complexities of the existing strategies, namely GD, GC, and LAG, and of the proposed strategies, i.e., LAGC and G-LAG, are shown in Fig. 2 and Fig. 3. It can be seen that both proposed schemes, LAGC and G-LAG, are capable of combining the benefits of gradient coding and grouping in terms of robustness to stragglers with the communication and computation load gains of adaptive selection (see Table I). Furthermore, G-LAG provides the best wall-clock time and communication performance, while maintaining a low computational cost.
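The communication savings of adaptive selection can be illustrated on a toy linear regression problem. The sketch below uses a deliberately simplified rule: a worker uploads a new gradient only when it differs from the last one it sent by more than a fixed threshold, and the PS otherwise reuses the stale copy. LAG as published ties this threshold to recent parameter changes, so the fixed threshold, the data, and the step size here are assumptions made for illustration only.

```python
# Toy scalar linear regression: worker m holds one sample (x_m, y_m) with y = 2 * x.
xs = [1.0, 2.0, 0.5, 1.5]
ys = [2.0 * x for x in xs]


def local_gradient(m, theta):
    """Gradient of 0.5 * (x_m * theta - y_m)^2 with respect to theta."""
    return xs[m] * (xs[m] * theta - ys[m])


def lazy_gd(threshold, step=0.1, iterations=30):
    """Gradient descent at the PS with threshold-based lazy uploads."""
    theta = 0.0
    last_sent = [None] * len(xs)  # gradient copy the PS holds for each worker
    uploads = 0
    for _ in range(iterations):
        for m in range(len(xs)):
            g = local_gradient(m, theta)
            # Simplified rule (assumption): upload only if the gradient moved enough.
            if last_sent[m] is None or (g - last_sent[m]) ** 2 >= threshold:
                last_sent[m] = g
                uploads += 1
        theta -= step * sum(last_sent)  # PS aggregates fresh and stale gradients
    return theta, uploads


theta_gd, uploads_gd = lazy_gd(threshold=0.0)       # every worker uploads every time
theta_lazy, uploads_lazy = lazy_gd(threshold=1e-4)  # lazy uploads
print(f"GD:   theta = {theta_gd:.4f}, uploads = {uploads_gd}")
print(f"lazy: theta = {theta_lazy:.4f}, uploads = {uploads_lazy}")
```

With the threshold set to zero the loop reduces to plain GD. With a positive threshold the estimate stays close to the true parameter (2 in this toy problem) while fewer gradients are sent to the PS.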
The full paper can be found here.

 
