Have you ever seen a human hand with six fingers? Have you noticed a white dog giving birth to a black puppy? Both are unusual, right? There is a more precise term for this: an ANOMALY.
So, How Is an Anomaly Defined?
An anomaly is a deviation from standard, normal or expected values. In Machine Learning terms, anomalies are rare items, events or observations that raise suspicion. In simpler words, data points that do not align with normal patterns are identified as anomalies.
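To make "deviation from normal values" concrete, here is a minimal sketch that flags readings lying far from the median of a small sample. It uses the modified z-score (median and median absolute deviation) rather than any method named in this article. The function name `find_anomalies`, the sample readings and the 3.5 cut-off are all assumptions chosen for illustration.

```python
import statistics

def find_anomalies(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value cannot hide itself by
    inflating the spread estimate. The 3.5 cut-off is a common rule
    of thumb, not a universal constant.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # At least half of the values equal the median: no usable scale.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]

# Hypothetical temperature readings with one obvious outlier.
readings = [20.1, 19.8, 20.3, 20.0, 35.0, 19.9, 20.2]
anomalies = find_anomalies(readings)
print(anomalies)
```

A plain mean and standard deviation would work on large samples, but on a handful of points a single extreme value stretches the standard deviation enough to mask itself, which is why the sketch relies on the median instead.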
Why Anomaly Detection?
These anomalies need to be detected because detecting them helps solve many problems, such as fraud detection and medical diagnosis. Anomaly Detection is also known as Outlier Detection. In a robust software system, detecting anomalies can help enhance the quality of interaction, reduce possible threats, and improve root cause analysis when faults do occur.
As IoT emerges as an unavoidable paradigm in our lives, critical security issues emerge in parallel in various aspects. The Mirai malware and its zombified IoT bots are one unforgettable instance urging the need to improve authentication measures. In the case of IoT sensors, a detected anomaly could point to a flaw in a manufacturing unit. In real time, detection gets considerably more complex and challenging, since the volume of data to be analysed grows much larger.
The Origin
Leaving the complex terms aside, one basic question remains: where do the anomalies arise from? The possibilities include:
DoS attacks: IoT is built on the Internet, which makes it prone to security attacks such as Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS), causing severe damage to the services and applications working within that environment.
Financial Frauds: IoT systems are prone to fraudulent activities involving theft of credit card or bank account information during transfers.
Privacy!?: Sensitive data stored in databases and servers is vulnerable to being leaked to any entity. A leak creates threats along with loss of information, and in the wrong hands it can lead to the destruction of confidential information in the system. Proper encryption techniques, however, are a possible solution.
How to Detect Anomalies?
Of course! There’s a silver lining!
Machine learning automates the process of detecting anomalies and proves to be far more effective than manual methods. Among machine learning approaches, techniques such as Support Vector Machines, the Random Cut Forest algorithm and unsupervised learning methods in general tend to have the upper hand over the rest. Exploring machine learning in more detail introduces further techniques.
THE SUPERVISED TYPES! Often described as discriminative algorithms, they learn from labelled instances. They include classification algorithms like K.N.N (K-Nearest Neighbour), S.V.M (Support Vector Machine), N.N (Neural Networks) and Bayesian networks.
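As a rough illustration of learning from labelled instances, the sketch below implements a tiny K-Nearest Neighbour classifier in plain Python. The two features (packets per second and average payload size), the labels and every sample value are invented for the example and do not come from any real dataset.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled traffic samples: (packets per second, avg payload bytes).
training_data = [
    ((10.0, 200.0), "normal"),
    ((12.0, 220.0), "normal"),
    ((11.0, 210.0), "normal"),
    ((9.0, 190.0), "normal"),
    ((900.0, 60.0), "attack"),
    ((950.0, 64.0), "attack"),
    ((1000.0, 58.0), "attack"),
]

print(knn_predict(training_data, (980.0, 61.0)))
print(knn_predict(training_data, (10.0, 205.0)))
```

In practice the features would usually be scaled first, since an unscaled feature with a large range dominates the Euclidean distance.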
The Unsupervised algorithms use unlabelled data to learn the features. Among these, clustering-based algorithms like K-means and D.B.S.C.A.N (Density-based spatial clustering of applications with noise) are effective techniques, but they cannot be directly applied for anomaly detection on IoT devices because of their resource usage. Techniques like P.C.A (Principal Component Analysis) can be applied extensively, but fail in dynamic environments involving IoT components. AutoEncoders (A.E) give a ray of hope, but they are primarily employed for feature extraction.
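The density idea behind DBSCAN can be sketched without any labels: a point with too few close neighbours, and not close to any dense point, is treated as noise. The sketch below computes only that noise set (it does not assign cluster ids), and the function name, sample points, `eps` and `min_pts` values are assumptions for illustration.

```python
import math

def dbscan_noise(points, eps, min_pts):
    """Return indices of the points DBSCAN would label as noise.

    A core point has at least `min_pts` points (itself included) within
    distance `eps`. Noise points are neither core points nor within
    `eps` of a core point.
    """
    n = len(points)
    # Pairwise neighbourhoods: quadratic in the number of points.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    return [
        i
        for i in range(n)
        if i not in core and not any(j in core for j in neighbours[i])
    ]

# Two tight groups of hypothetical sensor readings plus one stray point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 0.5),
]
noise = dbscan_noise(points, eps=0.5, min_pts=3)
print(noise)
```

The pairwise distance step grows quadratically with the number of points, which hints at why this family of methods is heavy for constrained IoT hardware.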
Recent research has proposed modifications to improve the quality of deep learning approaches to anomaly detection, after noticing some flaws that could pose a threat, namely:
→ Lack of accuracy in the preprocessing and optimisation phases
→ The neural networks are limited to datasets from a single network scenario, made up of ad hoc and local traffic
→ IoT traffic is left out of the process
One possible solution proposed by this research is an approach based on Deep Learning concepts, built for IoT scenarios. It introduces a Deep Neural Network architecture and a feature model that perform anomaly detection and specify the type of attack. This produced considerably improved results when tested on a dataset of public IoT traffic traces from various scenarios, after a feature reduction step performed by an autoencoder and an analysis of hyperparameter optimization.
Relying on external intelligence is efficient for detecting anomalies only up to a certain point, because the volume of data generated every day by IoT devices increases over time. As a result, the resources and the time needed to process all the data keep rising, which calls for optimization. A possible solution to this requirement could be EDGE COMPUTING.
Edge computing, formally defined, is a distributed IT architecture in which data is processed at the periphery of the network, as close to the originating source as possible. To put it in simple terms, edge computing is computing, particularly data processing, performed near the data source, which reduces the need to process the same data in a remote data centre.
This helps get the most out of Deep Learning or Deep Neural Network techniques by shifting the data processing closer to the origin of the data. It improves response time, reduces the load on the network and cuts a lot of cloud expenses. One of the main advantages of edge computing is very low latency while processing data. It makes it possible to report significant data with minimal delay and helps prevent failures in the system.
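A minimal sketch of the idea: a small detector runs next to the sensor, keeps a rolling window of recent readings, and forwards only the suspicious ones upstream. The class name `EdgeDetector`, the window size, the warm-up length, the threshold and the readings are all assumptions, and a simple rolling z-score stands in for the deep learning model discussed here.

```python
import statistics
from collections import deque

class EdgeDetector:
    """Rolling z-score detector meant to run close to the data source."""

    def __init__(self, window=20, warmup=5, threshold=3.0):
        self.window = deque(maxlen=window)  # recent readings judged normal
        self.warmup = warmup
        self.threshold = threshold

    def is_anomaly(self, value):
        """Return True if `value` deviates strongly from the rolling window."""
        if len(self.window) >= self.warmup:
            mean = statistics.mean(self.window)
            std = statistics.pstdev(self.window)
            if std == 0:
                flagged = value != mean
            else:
                flagged = abs(value - mean) / std > self.threshold
            if flagged:
                # Keep flagged readings out of the window so they do
                # not distort the baseline.
                return True
        self.window.append(value)
        return False

# Hypothetical sensor stream: only flagged readings would leave the device.
stream = [20.0, 20.1, 19.9, 20.2, 20.0, 19.8, 20.1, 45.0, 20.0, 19.9]
detector = EdgeDetector()
forwarded = [v for v in stream if detector.is_anomaly(v)]
print(f"forwarded {len(forwarded)} of {len(stream)} readings: {forwarded}")
```

Here a single reading out of ten is sent upstream, which is the network and cloud saving described above in miniature.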
A hybrid can also be created: an architecture combining edge computing and cloud computing, in which data from IoT devices is pushed to a messaging layer and pulled from it by the components that process the data. Insights from this processed data are sent to the cloud for alerts and reports. If one of the components runs into a problem, a separate system monitors and manages it, and reports the operational issues to the cloud. Another job of that system is to keep the components and the data managed from the cloud synchronised. Edge computing proves to be efficient here because the components can be deployed together on a single device, such as a single-board computer.
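The flow described above can be sketched in a few lines, with an in-process queue standing in for the messaging layer and two plain lists standing in for the cloud endpoints. The function name `run_pipeline`, the message fields and the fixed limit are hypothetical, and a real deployment would use a message broker and an actual detection model.

```python
import queue

def run_pipeline(messages, limit=50.0):
    """Push device messages to a queue, process them, and collect cloud output.

    Returns (alerts, ops_reports): insights for the cloud and operational
    issues raised while processing.
    """
    bus = queue.Queue()  # stands in for the messaging layer
    for message in messages:
        bus.put(message)  # devices push

    alerts, ops_reports = [], []
    while not bus.empty():
        message = bus.get()  # processing component pulls
        try:
            value = float(message["value"])
        except (KeyError, TypeError, ValueError) as exc:
            # Monitoring side: report the operational issue, keep running.
            ops_reports.append(
                {"device": message.get("device"), "error": type(exc).__name__}
            )
            continue
        if value > limit:
            alerts.append({"device": message["device"], "value": value})
    return alerts, ops_reports

# Hypothetical device messages, two of them malformed.
incoming = [
    {"device": "s1", "value": 21.5},
    {"device": "s2", "value": "n/a"},
    {"device": "s1", "value": 72.0},
    {"device": "s3"},
]
alerts, ops_reports = run_pipeline(incoming)
print(alerts)
print(ops_reports)
```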
Another popular domain that could be a possible solution is the BLOCKCHAIN ARCHITECTURE. Its powerful attributes could establish a strong foundation for detecting anomalies in complex networks like IoT. Developing such a system does involve risk, however, at the stage where local IoT models have to be shared securely. Consensus algorithms and blockchain storage reduce the possibility of malicious tampering. The drawbacks include extensive storage and processing requirements.
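The tamper-resistance part can be illustrated with a bare hash chain: each block stores the hash of the previous one, so changing an earlier record breaks every later link. This sketch leaves out consensus and networking entirely. The function names, the payload fields and the idea of storing model updates as small dictionaries are assumptions for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64

def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    data = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def append_block(chain, payload):
    """Append a block that links to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })

def first_invalid_block(chain):
    """Return the index of the first block that fails verification, else None."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        expected = block_hash(block["index"], block["prev"], block["payload"])
        if block["index"] != i or block["prev"] != prev_hash or block["hash"] != expected:
            return i
        prev_hash = block["hash"]
    return None

# Hypothetical local model updates shared by three IoT nodes.
chain = []
for node, threshold in [("node-a", 3.0), ("node-b", 3.2), ("node-c", 2.9)]:
    append_block(chain, {"node": node, "threshold": threshold})

print(first_invalid_block(chain))       # untouched chain
chain[1]["payload"]["threshold"] = 99.0  # simulated malicious edit
print(first_invalid_block(chain))       # the edited block no longer matches its hash
```

Storing every update and re-hashing the chain on each check is also a small-scale view of the storage and processing cost mentioned above.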
Learn Anomaly Detection with Intel
Whichever technique is put into use, the often underrated goal is to understand its theory and how it works. Statistics and Machine Learning can be applied together to detect anomalies in a given dataset. For a kickstart, Intel offers an 8-week course that teaches how to detect anomalies in data, given a basic knowledge of Calculus, Linear Algebra, Statistics and Python programming.
Several techniques have been proposed to detect anomalies in IoT, and many of them fall short on the primary requirements: resources and power. While the best algorithm in terms of accuracy and minimal resource use is yet to be found, combining techniques to detect anomalies could be explored further to improve the efficiency of the IoT environment.