Redundancy in sensor data

Leveraging redundancy can give 100x improvements

Sep 28, 2024

Real-world sensor data is highly redundant, and exploiting that redundancy can yield orders-of-magnitude efficiency improvements in data representation and processing. Let’s look at a few examples:

1. Time-series sensor data

The two figures below show the same 50 Hz sine wave, first as a time series and then in the frequency domain. The time series plot needs on the order of 100 data points to represent the signal, whereas the FFT concentrates it in a single peak at 50 Hz (one complex coefficient carrying amplitude and phase). That’s a 100x reduction right there.
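To make the sine-wave example concrete, here is a minimal NumPy sketch. The 1 kHz sampling rate and the 100-sample window are assumptions, chosen so the window holds exactly five cycles of the 50 Hz tone. With a window that does not hold a whole number of cycles, the energy would leak into neighbouring frequency bins and the spectrum would be less clean.

```python
import numpy as np

fs = 1000  # assumed sampling rate in Hz (not stated in the post)
n = 100    # 100 samples = 0.1 s = five full cycles of a 50 Hz tone
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 50 * t)

# Real-input FFT and the frequency (in Hz) of each bin
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Count the bins whose magnitude is not numerically negligible
magnitudes = np.abs(spectrum)
peak = int(np.argmax(magnitudes))
peak_freq = float(freqs[peak])
significant = int(np.sum(magnitudes > 1e-6 * magnitudes.max()))

# Rebuild the time series from that single coefficient
sparse = np.zeros_like(spectrum)
sparse[peak] = spectrum[peak]
reconstructed = np.fft.irfft(sparse, n=n)
max_error = float(np.max(np.abs(reconstructed - signal)))

print("time-domain samples:", n)
print("significant FFT bins:", significant, "at", peak_freq, "Hz")
print("max reconstruction error:", max_error)
```

Note that the one surviving coefficient is complex, and you also need to remember which bin it sits in, so the saving in stored numbers is somewhat below the headline ratio.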

2. Image compression

In the four images below, we can see that keeping just 0.2% or 1% of the FFT coefficients retains enough detail for the human eye to perceive a good-quality image. This concept is the basis of image compression. Compression schemes also suppress intensity and color information to different degrees to match human perception, which again exploits redundancy to enable efficient representation and processing. Image credit: DataBook
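The idea of keeping only the largest FFT coefficients can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the code behind the figures: `compress_fft` is a hypothetical helper name, and the smooth synthetic image stands in for a real photograph, which would typically be less compressible.

```python
import numpy as np

def compress_fft(image, keep):
    """Zero all but the largest-magnitude fraction `keep` of 2D FFT coefficients."""
    coeffs = np.fft.fft2(image)
    magnitudes = np.abs(coeffs)
    k = max(1, int(keep * coeffs.size))
    # Magnitude of the k-th largest coefficient is the cut-off
    threshold = np.sort(magnitudes.ravel())[-k]
    mask = magnitudes >= threshold
    # Discarding coefficients can leave a tiny imaginary part; keep the real part
    return np.fft.ifft2(coeffs * mask).real, int(mask.sum())

# Synthetic 128x128 test image (an assumption): a smooth blob plus a low-frequency pattern
y, x = np.mgrid[0:128, 0:128] / 128.0
image = (np.exp(-((x - 0.4) ** 2 + (y - 0.6) ** 2) / 0.02)
         + 0.5 * np.sin(2 * np.pi * 3 * x) * np.cos(2 * np.pi * 2 * y))

recon, kept = compress_fft(image, keep=0.01)
rel_error = float(np.linalg.norm(recon - image) / np.linalg.norm(image))

print("coefficients kept:", kept, "of", image.size)
print("relative reconstruction error:", rel_error)
```

Trying `keep=0.002` or `keep=0.05` on a real grayscale image is a quick way to see where visible artifacts start to appear.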

3. Gene expression

“For certain learning tasks that don’t explicitly depend on high-dimensional details (e.g., clustering, estimating similarity, or classification based on expression profiles), we only need to collect low-dimensional data, as opposed to collecting high-dimensional data, then projecting into low dimension for computational efficiency or robustness.”

Credits: Brian Cleary
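As a rough illustration of why low-dimensional measurements can be enough for similarity-based tasks, the sketch below uses random projections on synthetic data. Everything here is an assumption for illustration: the data is simulated as a few hidden "modules" driving many genes, the measurement matrix is plain Gaussian noise, and none of it is the method used by the quoted author.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_genes, n_modules, n_measurements = 40, 2000, 10, 200

# Synthetic expression profiles: each sample mixes a few shared gene modules,
# so the 2000-dimensional data really lives in a 10-dimensional subspace
modules = rng.normal(size=(n_modules, n_genes))
activity = rng.normal(size=(n_samples, n_modules))
expression = activity @ modules

# 200 random composite measurements per sample instead of 2000 gene readings
projection = rng.normal(size=(n_genes, n_measurements)) / np.sqrt(n_measurements)
measured = expression @ projection

def pairwise_distances(x):
    """Euclidean distance for every unordered pair of rows."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(x), k=1)]

d_full = pairwise_distances(expression)
d_low = pairwise_distances(measured)
ratio = d_low / d_full

print("dimensions:", n_genes, "->", n_measurements)
print("distance ratio (low/full) min and max:", ratio.min(), ratio.max())
```

If the ratios stay close to 1, clustering or nearest-neighbour search on the 200 measurements should behave much like it would on the full profiles.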

The same kind of redundancy shows up across many modalities of real-world data.

At Lightscline, we are leveraging AI to exploit the redundancy of sensor data to get 100x speed and efficiency improvements as can be seen here. This lowers AI infrastructure costs (compute, power, storage, transmission, and latency) and human capital time and costs by up to 100x. Furthermore, it enables several applications that are currently out of reach due to power, compute, and bandwidth (SWaP-C) bottlenecks.

Using Lightscline, you can:

1. Train lightweight models that run using just 10% of the raw sensor data

2. Reduce model training times by more than 100x

3. Reduce the ML workflow from 40+ hours to 1 hour per dataset

4. Deploy ML models and run inference on milliwatts of power with microsecond-scale latency

You can learn more about Lightscline here.
