RoCC: Robust Covert Communication Based on Cross-Modal Information Retrieval

Abstract

Objective: Covert communication is a pivotal research area in information security. Safeguarding the privacy of communicating users and preventing eavesdropping on confidential transmissions requires a covert channel that is both highly concealed and secure. Most existing methods build covert channels by tunneling through multimedia streams, but they do not consider the packet loss caused by fluctuations in network transmission. This study proposes a covert communication method that is robust to network anomalies and is based on cross-modal information retrieval and provably secure steganography.

Method: We propose a general covert communication framework named robust covert communication (RoCC), which is based on cross-modal information retrieval and provably secure steganography. Content generated by artificial intelligence (AI) systems, including deep synthesis models, AI-driven artwork, intelligent voice assistants, and conversational chatbots, has emerged. These AI models can synthesize multimodal data such as videos, images, audio, and text. As generative models have made significant strides, the practical application of provably secure steganography has become a reality. We therefore introduce generative models and provably secure steganography techniques into our framework and embed secret messages within the cover text data. Furthermore, numerous mature open-source models for speech synthesis and recognition are now available, enabling seamless cross-modal conversion between speech and text. Our approach combines direct and indirect communication. In direct communication over a voice over internet protocol (VoIP) call service, synthesized audio stream data are delivered in real time, and the receiver restores the text through speech recognition. Indirect communication uses a public network database to transmit the steganographic text data. The receiver restores the text semantics lost to network packet loss and speech recognition errors via text semantic similarity matching. The entire communication process can be described as follows. Assume that the sender of the confidential data is Alice and the recipient is Bob, and that Alice and Bob share the same generative model and parameter settings for provably secure steganography. Alice embeds the confidential data into generated text using provably secure steganography and publishes the text on a publicly accessible and searchable network database. The only means of direct communication between the two parties is a VoIP voice call, so network data packets may be lost in transit. On the basis of the semantic information that is preserved, Bob performs cross-modal information retrieval on the public database and locates the corresponding steganographic text data within the cover text. Bob then recovers the confidential data from the steganographic texts by using the same generative model and parameter settings for steganography.

Result: The speech recognition experiments indicate that speech recognition often causes semantic loss. Even the best model has a sentence error rate of 0.6125, which fails to meet the text recovery capability required for constructing covert channels through direct cross-modal transformation. The text similarity analysis experiments indicate that the best model achieves a recall of 1.0, thereby theoretically enabling complete restoration of semantic information. The packet loss experiment shows that RoCC achieves an information recovery rate of 0.9921 when the packet loss rate is 10% with a K value of 2. This finding demonstrates the resilience of RoCC to network anomalies and establishes it as the current state-of-the-art solution. In the real-time performance experiment, we validate the efficiency of the RoCC system across its components, namely speech synthesis and recognition, secure steganographic encoding and decoding, and text semantic similarity analysis. These results demonstrate that RoCC can meet the real-time requirements of covert channel communication. In comparative experiments, RoCC is compared with eight representative methods. The results show that RoCC has clear advantages in protocol versatility, robustness, and provably secure data steganography. Compared with the existing robust model, RoCC improves resistance to packet loss by 5% in the packet loss resistance experiment.

Conclusion: The covert communication framework proposed in this study combines the advantages of provably secure steganography, generative machine learning methods, and cross-modal retrieval methods, making the covert communication process more stealthy and secure. We also implement the first method that uses semantic similarity to restore communication data lost to abnormal transmission. Experimental verification shows that our framework meets the performance requirements of real-time communication, with a real-time transmission rate of 73 to 136 bps.
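The retrieval step described in the abstract, in which Bob matches a degraded transcript against texts in the public database, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract refers to text semantic similarity models, whereas the sketch below substitutes a simple bag-of-words cosine similarity so that it runs with the Python standard library alone. The database contents, the transcript, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(transcript, database):
    """Return the database entry most similar to the degraded transcript."""
    query = bag_of_words(transcript)
    return max(
        database,
        key=lambda entry: cosine_similarity(query, bag_of_words(entry)),
    )


# Hypothetical public database of steganographic texts published by Alice.
public_database = [
    "The weather in the mountains was clear and cold this morning.",
    "Our team finished the quarterly report ahead of schedule.",
    "She bought fresh bread and coffee from the corner bakery.",
]

# Hypothetical transcript Bob obtains after packet loss and recognition errors:
# one word is dropped and another is misrecognized.
noisy_transcript = "our team finish the quarterly ahead of schedule"

# Bob retrieves the intact text, which he could then pass to the
# steganographic decoder shared with Alice (not shown here).
recovered = retrieve(noisy_transcript, public_database)
print(recovered)
```

The point of the sketch is that Bob never needs an exact transcript: he only needs enough surviving content to select the right entry, after which the confidential data are extracted from the intact published text rather than from the lossy audio channel.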

Type
Publication
In Journal of Image and Graphics 2024, in Chinese
Yanming Zhang
Master’s Student, University of Science and Technology of China
Kejiang Chen
Research Associate Professor, University of Science and Technology of China
Jinyang Ding
Master’s Student

My research interests include Information Hiding and AI Security & Privacy.

Weiming Zhang
Full Professor, University of Science and Technology of China
Nenghai Yu
Full Professor, University of Science and Technology of China