Apache Flink and clustering-based framework for fast anonymization of IoT stream data

Published in Intelligent Systems with Applications (Elsevier), 2023

In this paper, we present a novel framework that anonymizes Internet of Things (IoT) data streams while taking the expiration period of the data into account. The IoT is among the fastest-growing technologies in the world, and anonymization is one of the safeguards used to protect data privacy. Because data streams are dynamic, vast, and rapidly changing, traditional approaches cannot be used to anonymize IoT data. The proposed framework performs anonymization using a new clustering method and the Apache Flink stream processing engine. The framework first clusters the received data. A cluster whose size does not meet the K-anonymity threshold continues to be reviewed for suppression and deletion; otherwise, its data is anonymized and published. In this way, the framework handles both numerical and categorical data. At the end of the stream, the remaining data is merged and anonymized. An implementation and evaluation of the framework in Scala and Apache Flink shows that the proposed approach reduces data delay by 12.33–66.62% compared with other methods. Furthermore, merging the leftover clusters at the end of the stream helps avoid information loss: compared with similar methods, information loss is reduced by 5.68–18.26%. Overall, the evaluation results show better performance in terms of both data delay and information loss.
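The publish-or-hold decision described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's clustering method or its Scala/Apache Flink implementation: the fixed-width age buckets, the `(age, city)` record layout, the threshold `K = 3`, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

K = 3  # hypothetical k-anonymity threshold


def generalize(cluster):
    """Replace every record by the cluster's age range and its set of cities."""
    ages = [age for age, _ in cluster]
    cities = sorted({city for _, city in cluster})
    summary = (f"{min(ages)}-{max(ages)}", "|".join(cities))
    return [summary] * len(cluster)


def anonymize(records, k=K, width=10):
    """Cluster by age bucket, publish clusters of size >= k, merge the rest."""
    # Stand-in clustering (assumption): group records by fixed-width age bucket.
    clusters = defaultdict(list)
    for age, city in records:
        clusters[age // width].append((age, city))

    published, leftovers = [], []
    for members in clusters.values():
        if len(members) >= k:
            published.extend(generalize(members))  # large enough: publish
        else:
            leftovers.extend(members)  # too small: hold back

    # End of stream: merge the leftover clusters instead of discarding them.
    if len(leftovers) >= k:
        published.extend(generalize(leftovers))
        leftovers = []
    return published, leftovers
```

Calling `anonymize` on a handful of `(age, city)` tuples returns the generalized records together with any records that still could not reach the threshold. Numerical values are generalized to a range and categorical values to a set, which mirrors the two data types the abstract mentions.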

Recommended citation: Sadeghi-Nasab, A., Ghaffarian, H., Rahmani, M. Apache Flink and clustering-based framework for fast anonymization of IoT stream data. Intelligent Systems with Applications (2023).

A Comprehensive Review of the Security Flaws of Hashing Algorithms

Published in Journal of Computer Virology and Hacking Techniques (Springer), 2022

The blockchain is an emerging technology that is widely used because of its efficiency and functionality. The hash function, as a supporting aspect of the data structure, is critical for assuring the blockchain's availability and security. Hash functions, which were originally designed for use in a few cryptographic schemes with specific security needs, have since become regular fare for many developers and protocol designers, who regard them as black boxes with magical characteristics. Contemporary applications of hash functions include message digests, password verification, data structures, compiler operation, and linking file names and paths together. Since 2004, we have observed an exponential increase in the number and power of attacks against standard hash algorithms. In this paper, we investigate reported security flaws in well-known hashing algorithms and determine which of them are broken. A hash function is said to be broken when an attack is found that, by exploiting special details of how the hash function operates, finds a preimage, a second preimage, or a collision faster than the corresponding generic attack. To provide background knowledge, we also summarize the types of attacks in this area. Finally, we summarize the information on the broken hash algorithms in a table that is helpful when selecting, designing, or using blockchains.
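To make the notion of a "generic attack" concrete, the following Python sketch runs a birthday-style collision search against a deliberately truncated SHA-256 digest. For an n-bit digest, a generic collision search needs on the order of 2^(n/2) hash evaluations, so a function counts as broken for collisions only when an attack does substantially better than that. The truncation and the helper names are assumptions made for illustration; the sketch says nothing about the security of full SHA-256.

```python
import hashlib
from itertools import count


def truncated_sha256(message: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-256(message) as an integer."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - bits)


def birthday_collision(bits: int):
    """Generic collision search: hash distinct messages until two tags match."""
    seen = {}  # tag -> first message that produced it
    for i in count():
        message = str(i).encode()
        tag = truncated_sha256(message, bits)
        if tag in seen:
            # Two different messages, plus the number of hashes computed.
            return seen[tag], message, i + 1
        seen[tag] = message
```

For example, `birthday_collision(24)` typically returns after a few thousand hash evaluations, far fewer than the 2^24 possible tags, which is the baseline cost any dedicated collision attack has to beat.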

Recommended citation: Sadeghi-Nasab, A., Rafe, V. A comprehensive review of the security flaws of hashing algorithms. J Comput Virol Hack Tech (2022).

A Deep Incremental Learning Framework for Predicting Covid-19 by using Incoming Stream X-ray Images of Chest

Published in 7th International Conference on Technology Development in Iranian Electrical Engineering, 2022

The COVID-19 epidemic has erupted in more than 150 nations around the world. One of the quickest ways to diagnose patients is to use radiography and radiology images to detect the disease. As the disease has not yet been eradicated, the number of these images is increasing daily and the dataset is constantly growing. In our framework, Covid-Stream, images enter the framework as a stream of data. The proposed framework consists of two main parts. In the transfer learning phase, features are extracted from batches of these images using the Keras library. Then, incremental learning is applied to predict and evaluate COVID and non-COVID images using the Creme library. Incremental learning plays an important role in this framework because it is not possible to process and fit all of the data into memory. The proposed framework is tested on a public CXR dataset (named COVID X-ray-5k) containing different chest abnormalities, and the proposed method achieved an accuracy of 0.86%. It also achieved highly competitive performance while significantly reducing the training and computational burden. The proposed framework can address the scalability issues of real-world big datasets.
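The incremental learning step can be sketched in plain Python as a model that is updated one sample at a time, so the full dataset never has to fit in memory. This is a stand-in under stated assumptions, not the framework's code: it does not use the Keras or Creme APIs, the feature vectors are assumed to have been extracted already, and the class and function names are hypothetical.

```python
import math


class OnlineLogisticRegression:
    """Binary classifier updated one sample at a time with SGD."""

    def __init__(self, n_features: int, lr: float = 0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
        self.weights = [w - self.lr * error * xi for w, xi in zip(self.weights, x)]
        self.bias -= self.lr * error


def run_stream(stream, n_features):
    """Test-then-train: predict each sample before learning from it."""
    model = OnlineLogisticRegression(n_features)
    correct = seen = 0
    for x, y in stream:  # x: extracted feature vector, y: 0 or 1
        correct += int((model.predict_proba(x) >= 0.5) == bool(y))
        model.learn_one(x, y)
        seen += 1
    return model, correct / max(seen, 1)
```

Because each sample is scored before it is used for training, the returned accuracy is a running estimate over the whole stream, and memory use stays constant regardless of how many images arrive.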

Recommended citation: Sadeghi-Nasab, A., Shakoor, M. H. A Deep Incremental Learning Framework for Predicting Covid-19 by using Incoming Stream X-ray Images of Chest. 7th International Conference on Technology Development in Iranian Electrical Engineering, Tehran (1401).

A New Fast Framework for Anonymizing IoT Stream Data

Published in 2021 5th International Conference on Internet of Things and Applications (IoT), 2021

The Internet of Things (IoT) plays an important role in human life today. Millions of devices generate and transmit vast amounts of data. Exploring this data without following privacy practices may put users' identities at risk. Anonymization is one of the measures used to protect data privacy. IoT data cannot be anonymized using traditional methods because such data, unlike database data, is not static and is very large. In this paper, we propose a new framework that anonymizes the received stream data by considering its expiration time. The anonymization is performed by a new clustering method running on a streaming data processing engine. The introduced clustering method has a significant effect on reducing data delay, and it supports both numerical and categorical data types. In addition, merging the remaining clusters at the end of the method minimizes information loss. A comparison with similar methods shows that the proposed method performs better in terms of information loss and data delay.
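The role of the expiration time can be illustrated with a small Python sketch: tuples wait in a buffer until enough of them have accumulated to be published together, and a tuple that waits longer than its expiration period is suppressed. This is only an assumption-laden illustration of delay-constrained stream anonymization, not the clustering method of the paper. The single buffer, the parameters `k` and `delta`, and the `(timestamp, value)` layout are all hypothetical, and leftover tuples are suppressed here rather than merged.

```python
from collections import deque


def anonymize_stream(events, k, delta):
    """Publish groups of k tuples as (min, max, count); suppress expired tuples.

    `events` is an iterable of (timestamp, value) pairs with non-decreasing
    timestamps; `delta` is the longest time a tuple may wait in the buffer.
    """
    buffer = deque()
    published, suppressed = [], []
    for ts, value in events:
        # Suppress tuples that have waited longer than the expiration period.
        while buffer and ts - buffer[0][0] > delta:
            suppressed.append(buffer.popleft()[1])
        buffer.append((ts, value))
        # Publish as soon as k tuples are waiting, generalized to a range.
        if len(buffer) >= k:
            values = [v for _, v in buffer]
            published.append((min(values), max(values), len(values)))
            buffer.clear()
    # End of stream: fewer than k tuples remain, so this sketch suppresses them.
    suppressed.extend(v for _, v in buffer)
    return published, suppressed
```

A smaller `delta` lowers the delay of published data but suppresses more tuples, while a larger `delta` does the opposite, which is the trade-off between data delay and information loss that the abstract evaluates.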

Recommended citation: Nasab ARS, Ghaffarian H. A New Fast Framework for Anonymizing IoT Stream Data. In 2021 5th International Conference on Internet of Things and Applications (IoT) 2021 May 19 (pp. 1-5). IEEE.