Global 6G Technology Conference: 2024 10.0B Semantic Communication White Paper (English Edition) (156 pages).pdf

Abstract

In the past few decades, research in the field of communication has mainly focused on how to accurately and effectively transmit symbols from the transmitter to the receiver, known as syntactic communication. Along with the development of wireless communication systems, system capacity has gradually approached the Shannon limit. Claude Shannon and Warren Weaver categorized semantic communication as the second layer of communication. Unlike traditional communication, which aims for the accurate transmission of symbols, the primary goal of semantic communication is to achieve the accurate exchange of semantic information. Semantic communication focuses on conveying the meaning and significance of information, rather than just focusing on the symbols themselves. It is mainly used to address how to accurately convey the meaning of transmitted symbols and how the receiver can influence the system behavior in the desired way. Semantic communication can significantly improve communication efficiency through the extraction, coding, and transmission of semantic information. With the vigorous development of a new generation of communication and artificial intelligence technologies, semantic communication
has shown broad application prospects in fields such as human-computer interaction, holographic communication, and intelligent manufacturing, receiving extensive attention from academic and industry communities worldwide.

This white paper comprehensively introduces semantic communication in terms of the basic principle, technical module, application area, key challenge, etc., aiming to provide reference and guidance for the application of semantic communication in the next-generation wireless networks. It extensively discusses the key modules of semantic communication, including the construction of semantic knowledge bases, joint semantic-channel coding and decoding, semantic information transmission, and compatibility with existing systems, enabling readers to understand semantic communication technology in-depth. It further explores single-modal and multimodal semantic communication system architectures, while
studying methods to suppress semantic noise, providing a comprehensive perspective for achieving multimodal multiuser semantic communication. This white paper also focuses on the integration of semantic communication with other networks, including digital communication networks, cognitive networks, distributed networks, secure networks, and satellite networks, aiming to achieve efficient, reliable, and intelligent semantic transmission. Finally, it analyzes existing and potential application scenarios of semantic communication, discusses relevant challenges, and looks forward to the future development of the semantic communication field, providing readers with insights into future research and innovation directions.

Contents

Abstract
1. Overview of Semantic Communication
2. Key Modules of Semantic Communication
2.1 Semantic Knowledge Base
2.1.1 Modern Semantic Communication Enabled by Semantic Knowledge Base
2.1.2 Structural System of Semantic Knowledge Base for Semantic Communication
2.1.3 Construction Method of Semantic Knowledge Base
2.1.4 Examples of Semantic Knowledge Bases Supporting Semantic Communication
2.2 Joint Semantic-Channel Coding and Decoding
2.2.1 Semantic Coding and Decoding
2.2.2 Joint Semantic-Channel Coding and Decoding
2.3 Semantic Information Transmission
2.3.1 Semantic Information Transmission and Challenges
2.3.2 Addressing Modern Communication Challenges: Examples of Semantic Information Transmission System
2.4 Compatibility of Semantic Communication with Existing Systems
2.4.1 Compatibility of Semantic Communication with Source Coding
2.4.2 Compatibility of Semantic Communication with Hierarchical Architecture of Classical Communication Systems
3. Single-Modal Semantic Communication
3.1 Text-Oriented Semantic Communication
3.2 Speech-Oriented Semantic Communication
3.3 Image-Oriented Semantic Communication
3.4 Video-Oriented Semantic Communication
3.4.1 The Context of Video Transmission
3.4.2 Semantic Video Transmission
3.4.3 Semantic Video Conference
3.4.4 Conclusion
4. Semantic Noise Suppression
4.1 Robust Text Semantic Communication
4.2 Robust Speech Semantic Communication
4.3 Robust Image Semantic Communication
5. Multimodal Semantic Communication
5.1 Multimodal Semantic Communication Architecture
5.1.1 Modal Fusion Design in Multimodal Semantic Communication
5.1.2 Noise Immunity Design in Multimodal Semantic Communication
5.2 Multi-Task Semantic Communication
5.2.1 What Triggers Studies on Multi-Task Semantic Communication Systems
5.2.2 Multi-Task Semantic Communication System Model
5.2.3 Main Technologies and Prospects of Multi-Task Semantic Communication
5.3 Multi-User Semantic Communication
5.3.1 Major Technologies Used in Multiuser Semantic Communication
5.3.2 Multiuser Semantic Communication System Architecture
5.3.3 Resource Allocation in Multiuser Semantic Communication
6. Digital Semantic Communication and Waveform Optimization
6.1 Digital Semantic Communication Network
6.2 OFDM-based Semantic Communication Waveform Optimization
6.3 Two-Way Semantic Communication
6.4 A Predictive Channel-based Semantic Communication
7. Resource Management in Semantic Communication
7.1 Construction of Resource Allocation Model in Semantic Communication
7.2 Optimization Objectives of Resource Allocation
7.3 Resource Management Strategy in Semantic Communication
7.4 Semantic Hybrid Automatic Repeat reQuest
7.4.1 Traditional Hybrid Automatic Repeat reQuest Strategies
7.4.2 Semantic Hybrid Automatic Repeat System
7.4.3 Conclusion
7.5 Semantic Error Detection
7.5.1 Traditional Check and Error Detection Methods
7.5.2 Semantic Similarity Detection Technology
7.5.3 Sentence Semantic Similarity Detection
7.5.4 Conclusion
8. Distributed Semantic Communication
8.1 Multimodal Semantic Relay Architecture
8.1.1 Composition and Deployment of Multimodal Semantic Relay Architecture
8.1.2 Advantages of Multimodal Semantic Relay Architecture
8.2 Distributed Collaborative Device-Server Inference
8.2.1 Composition and Deployment of Distributed Collaborative Device-Server Inference
8.2.2 Experimental Performance Analysis of Distributed Collaborative Device-Server Inference
9. Security Challenges and Countermeasures in Semantic Communication
9.1 Model Security in Semantic Communication
9.1.1 Adversarial Attacks and Defense
9.1.2 Poisoning Attacks and Defense
9.1.3 Future Prospects
9.2 Cryptographic Techniques in Semantic Communication
9.3 Privacy Protection in Semantic Communication
9.4 Semantic Watermarking for Secure Communication
10. Semantic Communication for Satellite Systems
10.1 Efficient Semantic Transmission in Satellite Systems
10.1.1 Multi-Satellite Semantic Transmission System
10.2 Highly Reliable Semantic Transmission in Satellite Systems
10.2.1 Semantic Transmission under Low Link Budget
10.2.2 Beamforming for Satellite Applications Based on Semantic Transmission
10.2.3 Semantic Transmission Based on Channel Prediction
10.3 Intelligent Semantic Transmission in Satellite Systems
11. Application Scenarios
11.1 Holographic Communication
11.2 Autonomous Driving
11.3 Digital Safe Village
11.4 Industrial Intelligent Manufacturing
11.5 Smart Instrument
12. Key Challenges
12.1 Challenges to Semantic Information Theory
12.2 Semantic Communication Based on Large Models
12.3 Standardization
13. Summary and Outlook
14. References
15. List of Abbreviations

1. Overview of Semantic Communication

Claude Shannon and Warren Weaver pointed out that communication can be divided into three layers: syntactic, semantic, and pragmatic. Traditional communication, developed based on Claude Shannon's classical information theory, falls into the syntactic layer, which evaluates network performance using metrics such as bit error ratio, symbol error rate, and transmission rate without considering the meaning of symbols. It is mainly used to solve technical challenges related to the correct transmission of bits or symbols. Semantic communication focuses on conveying the meaning and significance of information, rather than just focusing on the symbols themselves. It is mainly used to address how to accurately convey the meaning of transmitted symbols and how the receiver can influence the system behavior in the desired way. Semantic communication was proposed at the same time as syntactic communication. However, because of technical limitations and specific scenario demands, people have predominantly focused on syntactic communication. With rapid advancement in communication technology, the capacity of traditional communication systems has gradually approached the Shannon limit. Meanwhile, thanks to the growth of artificial intelligence technology and the increasing demand for intelligent communication in 6G networks, semantic communication has once again become a popular technology.

Semantic communication is a novel communication paradigm that can address the expression and transmission of information meanings at the semantic layer. It involves moving some or all aspects of the understanding of information meanings to the transmitter, thereby reducing transmission volume and lowering bandwidth requirements. Semantic communication differs from traditional communication in the following aspects. First, the mode of information representation. Semantic communication upgrades symbol representation to semantic features tailored for communication scenarios and tasks, moving the extraction and understanding of semantic features of the source content to the transmitter. Second, the evaluation criteria for quality of service. Syntactic communication typically measures the quality of service using metrics such as symbol error rate and packet loss rate, which do not directly reflect user experience and other subjective aspects of quality. Depending on scenarios and tasks, semantic communication defines the quality of service jointly using objective semantic accuracy and subjectively sensed quality.

Compared to traditional communication, semantic communication systems are more efficient in transmission. By transmitting only crucial semantic information rather than all information, semantic communication requires less transmission bandwidth, which can enhance transmission reliability and capacity, consequently improving wireless transmission efficiency. Meanwhile, the reconstruction of raw information
requires a semantic decoding model. Accordingly, semantic communication can enhance data security under certain conditions.

The system model of semantic communication is shown in Figure 1.1. Semantic communication mainly focuses on the semantic representation, transmission, and reconstruction of source content, as well as semantic-based wireless transmission. Key processes include knowledge base construction, semantic coding and decoding, and channel coding and decoding.

In terms of theory, semantic communication research is mainly inspired by Claude Shannon's information theory. It defines semantic entropy, semantic rate-distortion, and semantic channel capacity by replacing statistical probability with logical probability, and then establishes a theoretical framework for semantic communication. On the other hand, thanks to the development of artificial intelligence and enhanced data processing capabilities, algorithms related to the extraction, coding, and transmission of semantic features based on deep learning have been successfully applied to various types of sources. Semantic communication holds broad application prospects in fields such as multimedia communication, augmented reality, and immersive communication.

Figure 1.1 Semantic Communication System Model

2. Key Modules of Semantic Communication

2.1 Semantic Knowledge Base

2.1.1 Modern Semantic Communication Enabled by Semantic Knowledge Base

Figure 2.1 Modern Semantic Communication Architecture Enabled by Semantic Knowledge Base

Along with the rapid development of mobile communication and Internet technologies, the demand has surged for high-speed and low-latency radio access. Traditional communication systems have approached the Shannon limit, making it urgent to achieve breakthroughs in new communication technologies. Semantic communication, as an endogenous intelligence and novel brain-like information interaction mechanism, involves processes such as semantic element extraction, recognition, understanding, transmission, and reasoning akin to information transmission among humans. In traditional communication, source symbols are mapped to bitstreams based on preset coding schemes, with mapping functions designed using practical experience and accurate mathematical models. In semantic communication, sources are mapped to semantic streams based on the semantic base (Seb) using coding systems powered by artificial intelligence algorithms, with mapping
functions established by a neural network system driven by both data and models [1].

Unlike traditional communication, which processes signals based on symbol-bit streams, semantic communication processes signals based on the semantic base. The semantic base is the minimum transmission and processing unit in semantic communication but currently lacks a precise definition. Broadly speaking, it covers all semantic-related features extracted from source information. However, semantic features extracted by different methods or semantic features in various forms undoubtedly possess distinct characteristics, with some exhibiting clear advantages or disadvantages, while others require judgment based on specific communication tasks and requirements. That is one of the reasons behind the uncertainty of the semantic base. After all, regarding the semantic elements included in the same piece of exact information, their understanding may not be completely consistent in different communication nodes. That is the reason why the semantic base's definition, acquisition method, and multimodal universal semantic extraction need further exploration.

The semantic knowledge base is a research direction within the semantic base framework, representing a structured and memory-capable knowledge network model that can provide relevant semantic knowledge descriptions for data information. As depicted in Figure 2.1, the semantic knowledge base for semantic communication can be divided into the source knowledge base, channel knowledge base, and task knowledge base, offering multi-level semantic knowledge representations for source data (e.g., text, images, and videos), channel transmission environment (e.g., position and shape information of obstacles during transmission, and position information of intelligent reflecting surfaces (IRS)), and task requirements (e.g., image classification, 3D reconstruction, and semantic segmentation) [1]. In end-to-end semantic communication, at the transmitting end, the transmitter can efficiently acquire multi-level semantic knowledge descriptions of source data, channel semantic estimations of transmission environment, and specific semantic requirements for downstream tasks based on its source, channel, and task knowledge base. Subsequently, it performs joint semantic-channel coding for information to be transmitted. At the receiving end, the receiver, based on its semantic knowledge base, performs knowledge query and understanding of received information to complete joint semantic-channel decoding.

The semantic knowledge base boosts the development of semantic communication. For instance, current deep learning-based joint source-channel coding and decoding methods require extensive training based on large-scale relevant data to obtain
suitable communication models tailored to specific tasks. This high consumption of data and time resources impedes the widespread adoption of deep joint source-channel coding. As a prior knowledge source, the semantic knowledge base offers efficient and standardized semantic support for semantic communication, effectively accelerating the training of joint source-channel coding and decoding networks, and reducing the dependency on massive data for network training for specific communication tasks.

Next-generation wireless communication technologies face increasingly complex and diverse communication scenarios and application requirements. The demand for interconnected scenarios involving intelligent agents using different communication protocols will significantly increase. Knowledge sharing through the semantic knowledge base will help establish a unified knowledge background for an efficient human-machine-thing interconnected network, enhancing interaction efficiency among heterogeneous protocol-based intelligent agents. Intelligent agents employing different semantic communication protocols exhibit significant differences in defining and representing the semantic base, while the semantic knowledge base can partially address challenges related to non-uniform semantic base specifications and physical layer signal specifications during cross-protocol semantic communication. A sufficiently large semantic knowledge base can cover representation specifications of semantic bases and signals for various protocols. Distributing it to heterogeneous intelligent agents engaging in cross-protocol semantic communication can effectively assist these agents in task-specific training. Furthermore, on the basis of the semantic knowledge base, coding and decoding components can be updated or added flexibly without the need to delete or modify existing semantic and physical layer modules of intelligent agents. Typically, the semantic knowledge base is deployed on edge servers. The heterogeneous protocol-based intelligent agents merely need to request and load relevant semantic knowledge from the knowledge base before establishing a new connection. The knowledge base doesn't have to be involved in the communication process.

The semantic knowledge base provides global knowledge background and storage and search services for extracting, recognizing, transmitting, understanding, and reasoning about semantic elements during semantic communication. It defines an efficient search space and standardizes search paths, significantly enhancing the flexibility of semantic communication and supporting the application of semantic communication in more communication scenarios. It is one of the key enabling technologies for semantic communication.

2.1.2 Structural System of Semantic Knowledge Base for Semantic Communication

To begin with, we introduce the structure of the semantic knowledge base, including its interfaces and internal organizational structure (as shown in Figure 2.2).

Figure 2.2 Schematic Diagram of Semantic Knowledge Base

Design of knowledge base interface

Consider a simple scenario where a source and a sink share the same knowledge base. At the source end, semantic knowledge about the source, tasks, and channels obtained from the knowledge base is adopted to guide coding and transmission. At the sink end, the knowledge base is applied to decode received signals. A crucial step in this process is how to retrieve the required knowledge from the knowledge base, which we refer to as knowledge retrieval. By representing semantic knowledge from different modal signal sources in the form of semantic feature vectors, we can use a unified interface to handle knowledge retrieval for different modal signal sources, ensuring the consistency and scalability of the knowledge base interface. The following defines the input and output of the semantic knowledge base.

Input: Types of semantic knowledge (source, task, or channel) and data source information (e.g., numerical data, character, text, image, speech, video, and point cloud);

Output: A list containing multiple sets of semantic vectors. This list describes the multi-level semantic knowledge of the input signal, where the length of the list indicates the number of levels. For example, in source semantic knowledge retrieval, inputting an image returns multiple sets of semantic vectors, with each set representing a level. For object segmentation tasks, the returned semantic vectors can characterize information such as [...] and [...].

Organizational structure of semantic knowledge

The knowledge base will consist of
three types of sub-semantic knowledge bases: source, task, and channel. The knowledge base internally stores several sets of semantic bases pre-constructed based on the data sources and target tasks, as well as computational models that can be used to dynamically construct semantic bases in real time. When new data sources are input for semantic knowledge retrieval, if the existing semantic bases meet the communication task requirements, they will be directly used to represent the input data sources and return semantic knowledge vectors; if the existing semantic bases do not meet the communication task requirements, computational models will be used to dynamically construct, in real time, semantic bases that meet the requirements for semantic knowledge representation of the input data sources.

2.1.3 Construction Method of Semantic Knowledge Base

Current methods for constructing semantic knowledge bases for end-to-end semantic transmission include knowledge graphs, labeled training datasets, feature vector collections, and large language models.

The first method is to build a semantic knowledge base based on the knowledge graph. For text transmission, Reference [2] used triplets describing semantic information (including head entity, relationship, and tail entity) to construct a semantic knowledge graph, which was then utilized as the semantic knowledge base at the transmitter and receiver to guide semantic coding and decoding in text transmission. For speech transmission, Reference [3] proposed a basic model of a multi-level structured semantic knowledge base based on the knowledge graph, along with a semantic knowledge base construction method involving semantic representation and semantic symbol abstraction. For graph data transmission, Reference [4] introduced a multi-level semantic representation method consisting of explicit semantics, implicit semantics, and a user-related knowledge reasoning mechanism, and trained the semantic reasoning mechanism at the receiver based on imitation learning to ensure consistency with the reasoning mechanism at the transmitter, effectively reducing transmission loads. Moreover, Reference [4] further targeted heterogeneous networks enabled by semantic knowledge bases and put forward a cooperative reasoning mechanism for mobile edge servers based on federated graph neural networks, allowing servers to build a shared semantic parsing model based on distributed semantic information samples.

The second method uses labeled training datasets as the knowledge base. When the data information to be transmitted differs from the training datasets in terms of statistical characteristic distribution, Reference [5] employed domain adaptation techniques from transfer learning to mitigate the impact
of data differences on model generalization performance, enabling channel coding and decoding schemes to better adapt to the data required for transmission in the target domain.

The third method uses feature vectors extracted from deep learning models as the semantic knowledge base. Reference [6] defined a set of finite discrete semantic base vectors as the semantic knowledge base and obtained semantic encoders/decoders and semantic knowledge bases through end-to-end joint training. Reference [7] designed a novel Masked Vector Quantized-Variational AutoEncoder (VQ-VAE) network to train and acquire encoders/decoders and discrete codebooks at the same time. Using codebooks not only reduces transmission overhead but also combats channel semantic noise.

The fourth method uses pre-trained large models as the semantic knowledge base. Reference [8] proposed a semantic importance-aware communication scheme based on pre-trained language models, studying semantic importance-aware power allocation based on quantified semantic importance. Reference [9] presented a multimodal semantic communication framework based on large models. This method proposed a multimodal alignment based on multimodal language models and introduced a personalized large language model semantic knowledge base, allowing users to conduct personalized semantic extraction or restoration through the large language model, thus addressing semantic ambiguity issues.

In conclusion, semantic knowledge bases based on knowledge graphs, labeled training datasets, feature vectors, and large language models have been applied in end-to-end semantic communication and have achieved certain results.

2.1.4 Examples of Semantic Knowledge Bases Supporting Semantic Communication

As new intelligent scenarios (e.g., autonomous driving, extended reality, smart city) continue to evolve, remote semantic understanding will become a major new challenge in the field of wireless communication. Taking the increasing scenarios of autonomous driving as an example, the high-speed mobility of vehicles leads to constant changes in the traffic scenes in which the vehicles are located, resulting in changes in the dataset distribution sensed by the vehicles, thereby frequently triggering zero-shot classification problems. Accordingly, it has become one of the crucial topics in the field of intelligent simplified communication to achieve lightweight and efficient
zero-shot image classification. This section introduces a multi-level semantic coding and transmission mechanism based on the semantic knowledge base [10] to achieve lightweight and efficient intelligent simplified communication.

Figure 2.3 Multi-level Feature Encoder

As shown in Figure 2.3, a multi-level feature encoder is designed. First, to bridge the differences between visual and semantic feature distributions, a conditional primary label space transformation method is employed to project visual and semantic features into a common space to learn low-dimensional common latent features. Second, under the supervision of the low-dimensional common latent features, visual and semantic autoencoders are designed and trained to obtain visual/semantic encoders that project visual/semantic features to latent features, and visual/semantic decoders that project latent features to visual/semantic features. Finally, a multi-level image classification method is designed based on the obtained multi-level semantic encoder to make categorical decisions in the visual feature space, semantic feature space, and latent feature space. Those decisions are applied to determine the transmission level most suitable for current
communication requirements.

Figure 2.4 Schematic Diagram of Semantic Knowledge Base-driven Multi-level Feature Transmission System Framework

Based on the zero-shot multi-level semantic encoder, as shown in Figure 2.4, we consider an end-to-end multi-level feature transmission system enabled by the semantic knowledge base, wherein both the transmitter and receiver are deployed with semantic knowledge bases and multi-level semantic encoders. When communication tasks are performed, there are four transmission levels to choose from, namely, visual feature transmission, latent feature transmission, semantic feature transmission, and category estimation transmission. Before determining the final transmission level, the transmitter and receiver interact iteratively, and make judgments on whether to transmit the semantic information at each semantic level from high to low based on its abstraction level. In the transmission decision-making of each level, a corresponding threshold judgment is made first. If the threshold requirement is met, the transmission at the current level proceeds; otherwise, the transmission decision-making regresses to a lower level of abstraction, where a similar transmission judgment is made
again, ultimately determining the transmission strategy.

The performance of this end-to-end multi-level feature transmission is mainly influenced by two aspects:

(1) The semantic encoders at the receiver and transmitter may not be trained on the same dataset. In that case, their multi-level semantic encoders may be different. Hence, there may be differences in the standards for extracting multi-level semantic information from image samples at the receiver and transmitter.

(2) Differences in the sizes of the knowledge bases at the receiver and transmitter may affect the results of semantic information transmission and the required transmission level. For instance, when the receiver knowledge base contains all categories of semantic information, the transmitter merely needs to transmit the corresponding category index; otherwise the transmitter needs to transmit estimated semantic features or latent features to the receiver, driving up the transmission load accordingly.

Hence, the performance of semantic knowledge bases and encoders and decoders at the receiver and transmitter determines the interdependent relationship between semantic loss and transmission delay. To minimize semantic errors under transmission delay constraints, careful design is needed on where to extract information (i.e., using the multi-level semantic encoder of the transmitter or receiver), which level of features to extract (including visual, latent, and semantic features), and where to make transmission level decisions (i.e., using the semantic knowledge base of the transmitter or receiver). This work demonstrates that a multi-level feature transmission scheme driven by the semantic knowledge base is crucial for efficiently accomplishing remote visual recognition tasks. Through this work, we further understand how semantic knowledge bases support semantic communication.

2.2 Joint Semantic-Channel Coding and Decoding

2.2.1 Semantic Coding and Decoding

In traditional communication systems, source coding and channel coding are two separate modules. The purpose of source coding is to compress raw data into a bit stream by eliminating redundancy. Channel coding aims to resist interference by adding redundant codewords. In semantic communication systems, traditional source coding and decoding modules are replaced by semantic coding and decoding modules. A semantic encoder can extract the semantic information of data, and represent input signals in forms that carry semantic information, such as semantic features and vectors. Compared to original information, the amount of communication required for transmitting semantic information is greatly reduced, thus semantic coding can effectively enhance network performance. Along with the rapid development of artificial intelligence, the design of semantic encoders based on deep neural networks has achieved significant success and can be widely applied to semantic extraction from multimodal data.

2.2.2 Joint Semantic-Channel Coding and Decoding

In traditional communication systems, as source coding and channel coding technologies have each gradually reached their theoretical limits, joint source channel coding technology has received widespread attention from the academic circle. The basic principle of joint source channel coding is to allocate more codewords to source coding under a high signal-to-noise ratio to improve transmission efficiency. Under a low signal-to-noise ratio, more codewords are allocated to channel coding to suppress the negative impact of noise.

Inspired by joint source channel coding, joint semantic-channel coding has become one of the key technologies in semantic communication systems. As shown in Figure 2.5, joint semantic-channel coding is implemented by two deep neural networks. Deep neural networks act as autoencoders/autodecoders, representing various deep learning-based neural network models such as convolutional neural networks, generative adversarial networks, and Transformers. By jointly training these two deep neural networks, joint semantic-channel coding can simultaneously reflect the semantic features of signals and the characteristics of transmission channels. In that case, semantic features can be extracted effectively in noisy environments. This also means that
compared to traditional separate designs, the joint coding method is more stable under a low signal-to-noise ratio.

Figure 2.5 Joint Semantic-Channel Coding Model

Currently, joint semantic-channel coding and decoding has been successfully applied to various communication systems, including text, image, audio, and multimodal data transmission. Specifically, the rapid development of natural language processing has laid the foundation for analyzing and understanding semantic information, facilitating joint semantic-channel coding for text transmission. To better evaluate the similarity between reconstructed text and original text, semantic similarity metrics for words and sentences have been proposed successively. Joint semantic-channel coding policies based on models such as recurrent neural networks and Transformers, with the optimization goal of maximizing similarity and minimizing semantic errors, have gradually been applied to text transmission. Following the success of text-oriented semantic communication systems, speech-oriented semantic communication systems have also received wide attention. Compared to text signals, speech signals are more complex due
to factors such as volume, intonation, and background noise, making them harder to process and understand. The signal-to-distortion ratio (SDR) and perceptual evaluation of speech quality (PESQ) are the primary metrics for quantifying the quality of reconstructed audio signals. Coding policies that aim to maximize perceptual quality and minimize distortion have been applied to audio transmission. Images and videos are the main source of data traffic in the multimedia era, so image-oriented joint semantic-channel coding policies have drawn significant attention from academia. In image transmission systems, communication tasks include image recognition, image reconstruction, etc., where the peak signal-to-noise ratio (PSNR) is a key indicator of the similarity between reconstructed images and original images. Research shows that joint semantic-channel coding outperforms traditional coding methods in additive white Gaussian noise channels, Rayleigh fading channels, and Rician fading channels.

In addition to typical single-modal data, frameworks for multimodal data transmission in semantic communication have gradually evolved. For example, in applications related to augmented reality, different types of data are interrelated. By considering the correlation between different types of data, semantic-aware multimodal data coding and transmission can further enhance system performance. For instance, in visual Q&A applications, queries are presented in text format and answers are presented in image format. Therefore, researchers have designed a multi-user multimodal semantic communication system that uses joint semantic-channel coding schemes based on recurrent neural networks for text transmission and on convolutional neural networks for image transmission. The receiver predicts the answer by fusing the semantics. In addition to visual Q&A applications, multi-user multimodal joint coding policies based on Transformers are also beginning to be applied to other intelligent tasks, including image retrieval and machine translation.

To sum up, compared to traditional coding methods, joint semantic-channel coding performs better and is more stable. Its advantages lie mainly in two aspects. On the one hand, semantic encoders based on deep neural networks can effectively extract semantic information, which compresses raw data and thereby improves communication efficiency. On the other hand, joint semantic-channel encoders consider both the semantics of signals and the state of wireless channels, enabling them to adapt to changes in the communication environment. Especially under a low signal-to-noise ratio, joint semantic-channel coding enhances the robustness of communication systems and protects the transmission of crucial semantic information.

2.3 Semantic Information Transmission

Although semantic communication adopts different coding and decoding methods from traditional communication, semantic information transmission is still subject to the constraints commonly found in traditional communication, including the unpredictability of transmission channel conditions and limited network resources. Hence, the challenges posed by modern communication systems must also be addressed in semantic information transmission systems.

2.3.1 Semantic Information Transmission and Challenges

The fading effects in wireless communication environments are time-varying. Accordingly, changing fading channels, uncertain signal-to-noise ratio (SNR), and bit error ratio affect the performance of information transmission systems [11]. Traditional channel coding is usually designed for static channel environments and gives little consideration to real-time quality in dynamic channel environments. In some applications, especially those with strict real-time requirements, this may restrict the applicability of traditional coding. To ensure optimal performance, most existing end-to-end semantic information transmission systems are designed for a specific SNR, so they need more storage space to accommodate multiple network models. This imposes a storage load on the device side and introduces communication delays.

With the progress of communication technology, base stations in massive MIMO scenarios are equipped with a very large number of antennas and can serve multiple user equipment. MIMO technology has contributed to significant performance improvement
in the communication field. However, its implementation requires substantial storage space and computational resources. Massive MIMO systems involve multiple antennas and channels and need to process multiple input and output streams at the receiver and transmitter. Consequently, the system needs more memory to store the related channel state information (CSI), matrix weights, and signal processing algorithms. Computational resources are a further challenge, because complex matrix operations over multiple input and output channels require higher processing capabilities. In the design and implementation of
MIMO systems, a balance must be struck between performance enhancement and the consumption of storage space and computational resources to meet the requirements of communication systems.

In the transition from 5G to 6G, the number of MIMO antennas keeps growing. Base stations need instantaneous downlink CSI in real time to make the most of those antennas. However, the current CSI feedback mechanism, which obtains downlink CSI through a three-step interaction, requires data transmission over uplink resources. Traditional methods suffer from issues such as surging feedback overhead for codebook-based methods and long iteration times for compressed sensing algorithms, making them unsuitable for scenarios with computational constraints or strict delay requirements.

Given the strict requirements of emerging applications on communication delay, energy consumption, and computing power, one solution is to introduce deep learning for CSI compression. In this regard, traditional communication adopts separate source-channel coding, and artificial intelligence-based CSI compression assumes that both the channel coding module and the modulation module can ensure perfect transmission, i.e., that the modulation and coding scheme is adaptively adjusted based on feedback channel quality so that all feedback codewords are transmitted successfully. However, CSI feedback under this coding method has notable drawbacks, such as the cliff effect and channel decoding errors caused by mismatched channel conditions. In contrast, joint source-channel coding schemes degrade smoothly even when actual channel conditions are worse than expected, so the recovered CSI remains valuable for the base station's subsequent processing.

2.3.2 Addressing Modern Communication Challenges: Examples of Semantic Information Transmission Systems

Thanks to their close integration with deep learning technology, semantic information transmission systems can easily leverage the characteristics of neural networks to match the natural redundancy of sources and the statistical properties of channels, thereby enhancing robustness in dynamic channel environments.

Figure 2.6 Resource Allocation Policy in Cascading Source Channel Coding [12]

To adapt to changing channel environments and address the limited transmission resources available for channels, the resource allocation
policy in cascading source channel coding shown in Figure 2.6 allocates more bits to channel coding and fewer bits to source coding under poor channel conditions [12]. Assigning more bits to the channel encoder increases the redundancy of the coded information to combat strong channel noise, and vice versa. This resource allocation policy allows for near-optimal transmission under a constant channel code rate. However, traditional syntactic information transmission systems typically use separate source-channel coding, making it challenging to match the natural redundancy of sources with the statistical properties of channels. Semantic information transmission systems based on deep learning can use the attention mechanism to dynamically adjust the sizes of the subnets related to the source coding and channel coding functions. Neural networks can extract
feature groups from the source, capture relationships between feature groups through a soft attention mechanism, and generate different scaling factors for different channels to enhance or weaken the connections between the features of that channel and the next-layer network [13]. The tight combination of semantic information transmission and deep learning allows the code rates of source coding and channel coding to be adjusted adaptively based on the signal-to-noise ratio of the channel, so that the system adapts to various channel conditions.

Figure 2.7 Visual Images of CSI Sample in Angular-latency Domain [14]

In traditional communication, separate source-channel coding is used. As a result, the deep learning-based CSI feedback scheme is primarily built on the source coding module of a communication system that uses separate source-channel coding. As shown in Figure 2.7, the truncated two-dimensional discrete cosine transform used in this method discards some useful CSI information, which cannot be compensated for in subsequent processes [14]. In contrast, the joint source-channel coding method used in semantic communication can replace the truncated two-dimensional discrete cosine transform
used in the source coding-based CSI feedback method with a nonlinear transform network. In that way, power requirements during CSI feedback on the user side can be dramatically reduced, and the cliff effect and delay issues are mitigated.

In MIMO scenarios, traditional coding and decoding methods cause multiple transmitters to occupy storage resources. In contrast, the artificial intelligence-enabled encoder of semantic information transmission systems can adopt the same network structure and the same parameters in both training and testing. When the number of transmitters varies, minor adjustments to the receiver network structure are sufficient to adapt. Accordingly, the model becomes less complex, which lowers the consumption of storage resources [15]. This flexibility and adaptability make artificial intelligence-driven semantic information transmission systems more efficient and economical, especially in complex communication environments involving multiple transmitters.

Encryption/decryption mechanisms used with traditional separate source-channel coding, which encrypt before compressing, alter the structural information of the source. This may degrade the performance of coding and transmission in the encrypted domain under joint source-channel coding, making such mechanisms difficult to couple with end-to-end semantic information transmission systems. Therefore, traditional encryption techniques fail to address source privacy protection in deep joint source-channel coding. By using feature extraction networks to measure visual security, protection and de-protection network structures have been designed for information protection in end-to-end semantic information transmission systems [16]. Compared to traditional encryption methods, this protection method achieves better source reconstruction and generalizes well across end-to-end semantic information transmission systems.

To sum up, like traditional communication systems, semantic information transmission systems also face unpredictable wireless channel conditions and limited network resources. Meanwhile, because the semantic communication system employs joint source-channel coding, the CSI feedback methods and transmission information protection policies designed for traditional separate source-channel coding do not apply to it. That said, thanks to the tight combination of joint source-channel coding with deep learning in the semantic information transmission system, it is easier for the system to integrate with neural networks. The practical examples above show that, compared to traditional communication systems, the semantic communication system can better adapt to various channel conditions, enabling base stations to use recovered CSI more effectively for subsequent processing, reducing the storage resource consumption of multiple transmitters, and providing a more generalized
information protection scheme for end-to-end information transmission systems.

2.4 Compatibility of Semantic Communication with Existing Systems

Semantic communication can significantly compress data bandwidth while retaining the semantic information of the data during transmission, which holds great research value in future applications where intelligent agents serve as communication terminals. However, further research is needed on how to achieve compatibility for semantic communication within existing communication systems, how to integrate artificial intelligence (AI) with classical communication systems, and
how to break down the barriers between data information and semantic information.

2.4.1 Compatibility of Semantic Communication with Source Coding

Semantic communication systems are generally built on artificial neural networks (ANNs), which code semantics directly using images or text as input. However, the ANN architecture is built on a probabilistic model and trained using back-propagation algorithms. Hence, its input and output generally need to be floating-point numbers between 0 and 1 that represent data with probabilistic meaning, which prevents semantic communication from integrating with traditional coding that processes binary bits directly. To explore the relationship between classical source coding and semantic coding, research is needed on the mapping from the data space to the semantic space, and existing semantic communication systems need to be improved to accept binary bits as input. Figure 2.8 is a schematic diagram of the architecture of a separated data and semantic coding implementation system (referred to as the SDSC system for short) [17]. This system places a source coding conversion module, from data to semantics, before a general semantic encoder, enabling data coding to serve as the input for semantic coding. Hence, semantic coding can be compatible with classical source coding in a separate framework.

Figure 2.8 Schematic Diagram of Architecture of Separate Data and Semantic Coding Implementation System

Several approaches can be considered for constructing the text-oriented source coding conversion module. Here, a fully connected network is employed for the conversion module, which requires inputs of varying lengths to be padded before they are fed to the network. When the result is later fed into the semantic coding module, the Embedding layer in the semantic coding module can be replaced with a fully connected layer to pass on as much information as possible. Moreover, it is essential to note that network inputs and outputs carry probabilistic meanings, while source coding codewords and binary bits represent variations, not probabilities. Hence, in the implementation of the source coding conversion module, the occurrence probability of binary bits should replace the 0/1 values to represent source coding. This is equivalent to converting bit information into probabilistic information. Next, coding with equal-length codewords and vocabulary size is obtained via the source coding conversion module. It is fed into the semantic coding system to generate semantic feature vectors as output. These vectors are then received by the sink through a channel and recovered to the original message by a semantic decoder. The semantic encoder/decoder can be constructed using a Transformer module with the parameters detailed in Table 2.1.

Table 2.1 ANN Model Parameters Used to Build SDSC System

SDSC Model                        Structure             Parameters and Settings
Source coding                     Huffman code          3780
Source coding conversion module   Dense+ReLU            3780
                                  Dense                 128
                                  Value normalization   1/128
Semantic encoder                  Position encoding     512, Dropout=0.1
                                  Transformer encoder   3128 (8 heads)
                                  Dense+ReLU            256
                                  Dense                 16
                                  Power normalization   /2
Channel                           AWGN                  SNR: -68 dB
Semantic decoder                  Dense+ReLU            128
                                  Dense+ReLU            512
                                  Dense                 128
                                  Transformer decoder   3128 (8 heads)
                                  Dense                 3780
                                  Softmax               Greedy search

The experimental results for the SDSC system are shown in Figure 2.9, where the blue line represents the communication performance of the semantic encoder/decoder when text or images are directly used as input. The red line indicates the communication performance of the SDSC system trained with lossless Huffman coding. The yellow line represents the communication performance of the SDSC system trained with shuffled Huffman codewords. The blue and red lines show similar performance, indicating that the SDSC system can use the source coding conversion module to effectively separate source coding and semantic coding. This capability allows the practical integration of semantic coding into the separation framework of the communication system. Furthermore, compared to the red line, the performance of the yellow line declines significantly, indicating that preserving semantic modeling information is crucial for successfully converting classical codewords into semantic features. This confirms that some advantages of the semantic communication system stem from the natural language modeling process, that is, from the text sequence in natural language.

Figure 2.9 compares the performance of the semantic communication system with different training data. The horizontal axis represents the SNR of the additive white Gaussian noise (AWGN) channel, and the vertical axis depicts the Bilingual Evaluation Understudy (BLEU) metric, which measures the similarity between decoded messages and original messages. Performance curves are shown with BLEU-1 to BLEU-4 as the evaluation indicator. [17]

What's more, to verify that partial data distortion does not affect the performance of the semantic communication system, i.e., that semantic fidelity is not affected, distorted data coding is produced by truncating Huffman codewords. To determine the length at which the codewords should be truncated, a statistical analysis of all Huffman codewords obtained from coding the corpus was conducted, and the result is shown in Figure 2.10. The lengths of the Huffman codewords are mainly concentrated between 9 and 12 bits. Therefore, the codewords can be truncated to 9 or 12 bits. Also, to further emphasize the importance of language modeling information for semantic coding, truncation to 6 bits is selected as well. It should be noted that codewords may lose their decodability due to truncation.

Figure 2.10 Corpus-oriented Huffman coding with codeword length distribution. The horizontal axis represents the codeword length and the vertical axis is the number of codewords [17]

Figure 2.11 illustrates the possible decrease in performance of the semantic communication system due to lossy source coding. The blue line represents the transmission performance of the SDSC system trained with lossless Huffman coding, the red line indicates the transmission performance of the SDSC system trained with distorted Huffman coding truncated to 12 bits, the
176、sion performance of the SDSC systemtrained with 6-bit distorted Huffman coding.Lossy coding retains most of the semantic modelinginformation,such as the case of Huffman coding truncated to 12 bits as shown in the figure.Itmay cause severe errors such as the cliff effect in traditional communication

177、systems.Nevertheless,the performance loss is negligible in the semantic communication system.This ismainly because the semantic encoder/decoder can recover undecodable codewords based onsemantic modeling information and infer them as reasonable sentences.However,when lossycoding results in the loss

of most semantic modeling information, as in the case of Huffman coding truncated to 6 bits as shown in the figure, it may lead to performance degradation or even the complete collapse of the semantic communication system. Nonetheless, when truncated to 9 bits, BLEU still exceeds 0.8, indicating that even with significantly reduced transmitted data, the semantic content remains understandable. This shows that, as long as semantic fidelity is maintained, semantic coding can drastically reduce the data that needs to be transmitted.

Figure 2.11 Performance comparison of semantic communication systems trained with data at different levels of distortion. The horizontal axis represents the SNR of the AWGN channel, while the vertical axis represents the BLEU metrics. Performance curves are shown with BLEU-1 to BLEU-4 as the evaluation indicator [17].

To sum up, the SDSC system can accept binary bits as input for semantic coding, thereby separating classical source coding from semantic coding. This separation allows semantic coding to become an auxiliary module that enhances source coding and ensures compatibility with classical communication systems. Moreover, semantic coding can significantly reduce the data transmission volume of classical coding while ensuring semantic fidelity in communication data. Based on the SDSC experiments above, as shown in Table 2.2, classical source coding based on semantic coding can reduce the transmitted data volume by over 75% and still enable mutual understanding between communicating parties to a certain extent, i.e., with BLEU above 0.8.

Table 2.2 Saved Transmission Data Volume When Huffman Codewords are Truncated to Different Lengths

Source coding truncation (bits)       12      9       6
Saved transmission data volume (%)    20.3    75.6    92.4

2.4.2 Compatibility of Semantic Communication with Hierarchical Architecture of Classical Communication Systems

In fact, based on Shannon's separation theorem, classical communication systems have evolved a complex hierarchical structure. The compatibility between semantic communication and classical communication extends beyond that of classical source coding and channel coding. It involves the compatibility of the JSCC architecture with the entire hierarchical structure. There are various potential network architectures, depending on the functionality and position of the semantic encoder/decoder in wireless communication systems, each with its own set of challenges. Generally, the lower the layer to be made compatible, the more challenges need to be addressed.

Figure 2.12 Application Layer Compatible with Semantic Coding and Decoding

Figure 2.12 shows a scheme for the application layer to be compatible with semantic coding and decoding. Data undergoes semantic coding at the transmitter of the application and semantic decoding at the receiver. Semantic features are transmitted within the wireless communication system and are transparent to it. Experimental results in Section 2.5.1 demonstrate that compatibility between the application layer and semantic coding and decoding is relatively easy to implement. However, the corresponding channel information should also be transmitted to the semantic module via the air interface, so that the module can be fine-tuned for different channels and sources and achieve performance similar to that of the semantic
communication system designed by JSCC. In this architecture, the semantic coding and decoding module can be deployed close to the radio access network (RAN) using a mobile edge computing (MEC) architecture and can sense the wireless channel environment through network capability exposure.

To address
the above issues, ZTE has proposed the concepts of strong coupling and weak coupling (tight coupling and loose coupling) in semantic communication. Strong coupling means that the semantic encoder/decoder must undergo end-to-end joint training through the channel, with gradients passed over the channel during training to optimize the encoder/decoder parameters based on channel parameters. Weak coupling means that the semantic encoder and the semantic decoder can be trained separately, with the required gradient information transmitted in a certain form via weak coupling. For the application layer to be compatible with semantic coding and decoding, weakly coupled semantic communication is essential.

Figure 2.13 Core Network Compatible with Semantic Coding and Decoding

Figure 2.13 shows a scheme for the core network to be compatible with semantic coding and decoding. In this scheme, the core network performs semantic coding and decoding and maintains the knowledge base. Compared to the application-layer scheme, the core network scheme is relatively independent of applications, enabling application-specific semantic coding/decoding to be extended to specific types of services.

However, certain problems still exist. First, placing semantic coding and decoding on the core network may prevent data transmission from the data source to the core network from benefiting from semantic communication gains.
Second, the core network needs to track and interpret semantics in user data transmissions, leading to new requirements for data security and user privacy protection. Last, the core network is responsible for maintaining the semantic encoder/decoder and knowledge base, which increases its complexity. More computing and storage resources may need to be deployed, potentially leading to additional latency.

Figure 2.14 High Layer of RAN Compatible with Semantic Coding and Decoding

Figure 2.14 shows a scheme for the high layer of the RAN to be compatible with semantic coding and decoding. In this scenario, the semantic coding and decoding module can interact directly with the channel to build a JSCC semantic communication system.

However, similar to the core network scheme, this structure requires the RAN to track and interpret users' semantic data, and complexity at the high layer of the RAN increases accordingly. Moreover, as the data has undergone multiple layers of encryption and has been rearranged relative to the original transmission sequence, further research is needed to determine whether semantic feature extraction can still be performed at the RAN level.

Figure 2.15 Underlying Layer of RAN Compatible with Semantic Coding and Decoding

Figure 2.15 shows a scheme for the underlying layer of the RAN to be compatible with semantic coding and decoding. This scheme supports the strong coupling implementation of semantic coding and decoding, but faces problems similar to those encountered by the high-layer RAN scheme.

3. Single-Modal Semantic Communication

In recent years, deep neural networks have been applied widely in natural language processing, speech signal processing, and computer vision. These models are pre-trained on large-scale general-purpose corpora and can be employed for tasks such as classification, data integration, and clustering correlation [18-20]. Typically, a system consists of a transmitter and a receiver, where the transmitter transmits single-modal data. Those models use only one type of information (text, image, video,
data), and are termed single-modal models in learning problems. Single-modal semantic communication can be applied in various scenarios, including natural language processing, speech recognition, and image processing. In single-modal semantic communication, conveying semantic information primarily relies on the selected model. Due to the limited prevalence of sensing hardware equipment, it is hard for mobile equipment to gather relevant data from multiple information sources. Consequently, single-modal semantic communication remains a primary research focus today. Semantic communication systems have shown significant
improvements in the transmission efficiency and symbol error rate of single-modal data communication, especially in text and image data processing.

3.1 Text-Oriented Semantic Communication

For 6G networks, there is a growing demand for intelligent communication that emphasizes high communication efficiency and low computing costs. Accordingly, researchers are focusing on efficient transmission at the semantic layer and the contextuality of transmitted data. Efforts are being made to maximize communication efficiency under limited bandwidth while maintaining robustness, adaptability, and reliability. In terms of text data processing, progress has been made in areas such as text semantic encoder/decoder design [21], text semantic association mining [2], and the performance optimization of the text semantic communication system [22]. This section summarizes the relevant solutions for semantic communication in text transmission.

Semantic communication is a communication method that extracts semantic information from a source and codes it for transmission over noisy channels. Unlike traditional wireless communication, semantic communication does not require strict alignment of the decoding order at the receiver with the coding
order at the transmitter. It only requires the semantic information recovered at the receiver to match the semantic information sent by the transmitter, shifting from traditional error-free bit transmission to concise semantic transmission. Specifically, traditional wireless communication emphasizes syntactic coding and the error-free transmission of symbols. In contrast, semantic communication focuses on the extraction, coding, and transmission of the meaning features of the source content, aiming to keep the received information consistent with the meaning of the source content. Semantic communication goes beyond traditional bit-level transmission and achieves semantic-level transmission. Its architecture also shifts from a modular to an integrated design. Figure 3.1 represents the joint source-channel transmission architecture for semantic communication, where source information is
transmitted through coding and decoding over the physical channel. The source information can be text, image, audio, etc. Compared with traditional wireless communication, the transmission architecture of semantic communication incorporates a semantic layer, which consists of a semantic encoder, a semantic decoder and a shared knowledge base. The semantic layer extracts semantic features from the source information and semantically recovers the received information to accomplish efficient transmission and communication. The transmission layer ensures the correct transmission of semantic information. In particular, the source information undergoes semantic coding through the semantic encoder, followed by channel coding and transmission over the physical channel. Finally, information is recovered through channel decoding and semantic decoding. The semantic coding process can depend on semantic feature extraction to reduce information transmission redundancy.

Figure 3.1 Framework of Joint-Source Channel Semantic Communication System [23]

Natural language processing enables machines to understand human language, with the primary goal of understanding grammar and text. Semantic communication systems
utilize shared background knowledge between the transmitter and receiver to compress and understand information at a semantic level. Currently, semantic communication technology is widely applied in various communication tasks. Regarding the optimization of semantic coding, some preliminary research has been conducted on the semantic transmission of text. Xie et al. [24] proposed a deep learning-based semantic communication system (DeepSC) for text information transmission, preliminarily considering joint source-channel coding to enable the receiver to recover text from a semantic perspective. The study aimed to maximize system capacity and minimize semantic errors by restoring the meaning of sentences, rather than minimizing the bit or symbol errors targeted in traditional communication. To accurately assess the performance of semantic communication, the paper introduced a new metric called sentence similarity. A variant of DeepSC
called L-DeepSC [25] is also used for text transmission. Here, in an Internet of Things (IoT) network, a cloud/edge platform trains and updates the deep learning-based semantic communication model, while IoT devices perform data collection and transmission based on the trained model. A deep learning-based lightweight distributed semantic communication system is proposed for low-complexity text transmission. By analyzing the impact of fading channels on forward and backward propagation during the training process of L-DeepSC, the paper proposed a method that utilizes channel state information to aid the training process and mitigate the effects of fading channels on transmission. As deep learning advances, natural language processing allows people to analyze and understand vast amounts of language text. To better leverage natural language processing for semantic communication on channels, a novel
semantic communication system based on the universal transformer was proposed in Reference [22]. Compared with the fixed Transformer used in natural language processing, the universal transformer introduces an adaptive recurrent mechanism, enabling more flexible transmission of sentences with different semantic information and achieving better end-to-end performance under various channel conditions. Building on the semantic similarity of DeepSC [27], Yan et al. [26] proposed an efficiency metric called Semantic Spectral Efficiency (S-SE) to represent the efficiency of text transmission. They also studied resource allocation in semantic-aware multi-user communication networks to optimize channel allocations and semantic symbol numbers in semantic communication, enabling the transmission of more semantic information and enhancing communication reliability and efficiency.

Targeting uncertain communication scenarios, Zhang et al. [28] introduced a deep learning-based context-aware semantic communication model to learn semantic and contextual features as background knowledge. This background knowledge can be applied to some uncertain, non-jointly designed communication scenarios. The design of coding and decoding policies based on parts of speech and context is effective and reliable in reducing the number of transmitted bits and improving semantic accuracy between transmitted and recovered information. Building upon this, Liu et al. [28] proposed an extended context-aware semantic communication system. They established a model for extracting and recovering sentence semantic features. Taking text in a paragraph as input, the encoder considers contextual meanings when coding the current sentence to support semantic representation. During decoding, previously decoded information and currently received symbols are employed as input for extended decoding. Sachin Kadam [29] designed an autoencoder that only transmits extracted keywords and uses shared background knowledge between the transmitter and receiver to recover data from the received keywords. This method can reduce the number of transmitted words per sentence. Hu et al. [30] also studied semantic communication based on contextual relevance theory and proposed a new framework called Things2Vec. They modeled the functional sequence relationships generated by interactions between entities using a graph (referred to as an IoT context graph) and embedded this graph into the semantic communication framework to generate latent semantic representations from entity interactions. In that way, they mapped semantic relationships to the IoT context graph, enabling comprehensive semantic information retrieval and ensuring the effectiveness of Things2Vec while maintaining the reliability of semantic communication.

Reference [31] proposed a new semantic communication framework for wireless networks, utilizing a reinforcement learning algorithm with policy gradient updates based on a static learning rate and combined with attention networks. In this framework, base stations extract
semantic information composed of semantic triplets from text data and transmit it to each user. Upon receiving the semantic information, each user uses a graph-to-text generation model to restore the original text. Building upon this, Reference [32] presented a semantic communication framework for text data transmission. It differs from the attention policy gradient (APG) algorithm adopted in [31], which updates policies with a static learning rate. This framework employs a reinforcement learning (RL) algorithm based on proximal policy optimization combined with attention networks, dynamically adjusting the learning
rate based on the difference between old and updated policies to ensure convergence to a local optimum. The semantic information is represented by a set of semantic triplets forming a knowledge graph (KG), and the receiver recovers the original text via a graph-to-text generation model. Jiang et al. [33] introduced the knowledge graph into semantic analysis, converting transmitted sentences into KG triplets. These triplets serve as fundamental semantic symbols for semantic extraction and restoration, and are sorted by semantic relevance when the source information is converted, which enhances semantic accuracy. Semantic extraction and restoration are employed to reduce the redundancy of transmitted information, adaptively adjust transmitted content, and enhance reliability. Zhou et al. [33] proposed a cognitive semantic communication system utilizing knowledge graphs and designed a shared knowledge base for the transceiver to facilitate semantic information extraction and recovery. Reference [34] extended the work in [2], presenting two cognitive semantic communication frameworks for single-user and multiuser semantic communication scenarios. Furthermore, it proposed an effective semantic error correction algorithm by leveraging reasoning rules from knowledge graphs, enabling the receiver to correct errors at the semantic layer. The algorithm performs well in terms of data compression rate and communication reliability.

Several semantic metrics have been developed over the years to evaluate text quality. These include Semantic Distance, Word Error Rate (WER), Bilingual Evaluation Understudy (BLEU), Consensus-based Image Description Evaluation (CIDEr), Semantic Similarity Measures (SSM), Tail Probability of SSM, SSM Using Sentence-BERT [15] (SSM Using SBERT), Metric for Evaluation of Translation with Explicit Ordering (METEOR), etc.

3.2 Speech-Oriented Semantic Communication

Speech signals are distinctive: they contain not only unique speech features such as background noise and the speaker's timbre and emotion, but also the text information the speaker wishes to express. Therefore, compared with the vigorous evolution of natural language processing,
research in speech signal processing is progressing relatively slowly. Fortunately, some achievements have been made so far in semantic communication for speech transmission. Weng et al. [35] proposed a deep learning-based semantic communication system (DeepSC-S) for speech transmission. They co-designed a semantic-channel encoder to extract and send global semantic information from the original speech. The system effectively suppressed the distortion and attenuation of wireless channels, thus recovering speech sequences almost identical to the original speech at the receiver. Unlike the traditional method of mapping input speech into bit sequences, DeepSC-S learns global semantic information in speech sequences with a convolutional neural network (CNN)-based semantic encoder, and maps it into floating-point semantic features that are then converted into transmittable symbol sequences by a channel encoder. The DeepSC-S signal transmission process does not involve any bit-symbol conversion. Therefore, to measure the error between the original speech sequences and the recovered ones, the mean squared error (MSE) is used as the loss function to train the network parameters of the end-to-end semantic communication system. In addition, to further analyze the semantic information, an attention mechanism-based semantic encoder (SE-ResNet) is designed for DeepSC-S. It focuses on the difference in the amount of information carried by silence segments and utterance segments in speech sequences, learning the importance of different speech segments through the neural network and weighting them accordingly. Segments of high importance are prioritized in the parameter updating stage, so that speech segments containing more semantic information are accurately recovered, thus greatly improving the overall accuracy of the recovered speech sequences. Compared with traditional communication systems, DeepSC-S substantially improves the clarity of the recovered speech, demonstrating the feasibility of speech semantic communication systems based on deep learning. Moreover, DeepSC-S outperforms both the framework that cascades deep learning-based source coding with traditional channel coding and the framework that cascades deep learning-based source coding with deep learning-based channel coding, which further demonstrates the effectiveness of semantic coding for speech compression and
semantic extraction, as well as the superiority of the joint semantic-channel coding mechanism. On the foundation of DeepSC-S, Xiao et al. proposed a more efficient speech semantic coding transmission scheme (DSST) in Reference [36]. In this method, a nonlinear transformation is introduced to map speech signals to the semantic latent space, and an entropy model is designed based on the speech latent space to estimate the importance of semantic features, so as to enable more efficient semantic compression and reduce the size of data required for transmission. In addition, a channel signal-to-noise ratio adaptation mechanism is also proposed in DSST to train a robust neural network model capable of ensuring stable speech recovery performance in various channel states. In DSST, CNN is used to create the JSCC architecture, and a side information transmission link is devised to transmit auxiliary information extracted from the original speech sequences to the receiver, enhancing speech reconstruction accuracy. In Reference [37], Zhou et al. built an end-to-end speech semantic transmission system (DeepSC-TS) based on Transformer. In their work, speech sequences were compressed based on CNN, and then Transformer was used to learn relevant semantic features. A feature re-extractor based on CNN and Transformer was designed for the receiver to extract shallow and deep semantic features, and another layer of CNN was used to enable effective speech reconstruction.

Figure 3.2 Performance Comparison between DeepSC-S and Different Speech Transmission System Frameworks

Speech-oriented semantic communication systems are also used in various downstream intelligent tasks, that is, in task-oriented communications. Speech signals face two typical intelligent tasks: speech recognition and speech synthesis. Therefore, to bridge the gap of task-oriented semantic communication in speech sources, Weng et al. proposed a speech semantic transmission system (DeepSC-ST) for speech recognition and speech synthesis in Reference [38]. Such a system focuses on building a powerful semantic encoder to extract text information from speech signals and convert it into corresponding semantic features. In DeepSC-ST, a speech encoder based on the CNN+GRU network architecture is designed to maximize the compression of the original speech, so as to filter out information related to speech features and retain text-related semantic information, thus reducing the size of data required for transmission and improving bandwidth utilization. Once text-related semantic features are recovered at the receiver, the text information required by users can be directly obtained with a feature decoder, realizing end-to-end speech-to-text transmission. To enable unified training of joint semantic-channel coding, the authors used connectionist temporal classification (CTC) as the loss function to measure system losses, and used word error rate (WER) and character error rate (CER) to measure the performance of the text obtained. In addition, to improve the diversity of system
output and provide users with clear speech sequences, the text obtained by the receiver is sent to an independent speech synthesis module to obtain a complete speech sequence. Therefore, by extracting and transmitting text-related semantic features with joint semantic-channel coding, DeepSC-ST serves as a semantic communication system for both speech recognition and speech synthesis. On the foundation of DeepSC-ST, Han et al. [39] added a redundancy-removal module and an alignment module based on additional semantic information extraction to improve the compression rate of original speech and the accuracy of speech recognition. In this method, the authors used an architecture of BLSTM plus fully connected layers to build a semantic encoder, designed a semantic decoder at the receiver based on fully connected layers, and proposed a Transformer-based semantic error corrector to further improve the fidelity of semantic features and realize efficient speech-to-text transmission. Regarding speech synthesis, the authors designed a Transformer-based Generative Adversarial Network (GAN) to train a powerful generator for synthesizing clear speech sequences.

Figure 3.3 A Semantic Transmission System (DeepSC-ST) for Speech Recognition and Synthesis [38]

To expand the application of task-oriented semantic communication systems in speech sources, Weng et al. [40] developed a speech transmission system (TOS-ST) for speech-to-text translation and speech-to-speech translation. In this method, the authors used Transformer to design a joint semantic-channel coding mechanism to extract deep semantic features, and proposed a semantic-to-text module at the receiver for the first time. In the semantic-to-text module, the authors used a Transformer encoder and an RNN-based detection network to build a semantic error detector to detect impaired
semantics in recovered speech and return a representation vector. Based on this representation vector, the authors designed a semantic error corrector with the Transformer decoder and fully connected layers to correct the impaired semantics detected by the semantic error detector, calculated semantic losses with cross-entropy (CE) as the loss function, and updated all neural network parameters in the semantic-to-text module. In this way, high-fidelity semantic transmission was assured, and the intelligibility and accuracy of translated text were improved. To build a diversified semantic communication system featuring multiple languages and sources, the authors used a multi-layer Transformer to develop a speech synthesis mechanism from target text to target speech, providing clear and smooth speech sequences for users who understand the target language only. Therefore, TOS-ST is a transmission system between different speech-to-text sources and between multiple languages.

Weng et al. [41] designed a speech semantic communication system (SAC-ST) for MIMO channel transmission to make speech semantic communication suitable for actual communication scenarios. In this method, the authors pre-coded MIMO channels based on the
SVD algorithm and decomposed them into multiple parallel SISO channels. Because the singular values differ, the effective signal-to-noise ratio of each SISO channel is proportional to the corresponding eigenvalue. After that, the authors developed a joint semantic-channel coding mechanism based on the pre-coded MIMO channels by using Transformer and fully connected layers, and established a speech-to-text transmission paradigm. In Reference [41], a semantic analysis mechanism for speech signals was proposed for the first time, which sends semantic features extracted from pre-trained semantic encoders to a semantic awareness network built with fully connected layers for semantic analysis. Such an analysis returns an importance vector measuring the importance of each semantic feature, so as to identify feature vectors containing important semantic information. Finally, leveraging the multiple SISO channels with different signal-to-noise ratios obtained from MIMO channel decomposition and the trained semantic awareness network, the authors designed a method for transmitting semantic features hierarchically by semantic importance. Specifically, the importance of information
contained in different bit sequences in traditional bit-layer transmission was considered to be equally distributed. Therefore, when multiple bit sequences were transmitted through parallel SISO channels with different signal-to-noise ratios, the gain brought by any random assignment was the same. However, in SAC-ST, according to the importance vector obtained from neural network learning, semantic features of higher importance were assigned to SISO channels with higher signal-to-noise ratios, while those of lower importance were assigned to SISO channels with lower signal-to-noise ratios, thus ensuring the overall fidelity of semantic features recovered at the receiver. SAC-ST's semantic importance hierarchical transmission mechanism greatly improved speech-to-text transmission accuracy in MIMO semantic communication systems.

3.3 Image-Oriented Semantic Communication

Image data contains richer semantics; therefore, many studies have been made in the field of image-oriented semantic communication to prove the potential of deep learning in image coding and transmission. For the semantic extraction of image sources, Gunduz and Kurka et al. [42] proposed a joint source-channel coding (JSCC) technology for wireless image transmission based on convolutional neural networks, which put aside explicit coding for compression or error correction and directly mapped image pixel values to complex-valued channel input symbols. Encoder and decoder functions were modeled as complementary convolutional neural networks and jointly
trained on data sets to minimize the mean square error of reconstructed images and improve performance. On this basis, research was conducted to incorporate noiseless or noisy channel output feedback into transmission systems [43], and an autoencoder-based JSCC approach was introduced, which exploited channel output feedback and achieved considerable improvements in the end-to-end reconstruction quality of fixed-length transmissions. In Reference [44], a Nonlinear Transform Source Channel Coding (NTSCC) model was proposed, which closely adapted to the source distribution under nonlinear transformation. In this model, the transmitter first learned a nonlinear analysis transformation, mapped source data into latent space, and then transmitted latent representations to the receiver through joint deep source-channel coding. This model efficiently extracted source semantic features and provided auxiliary information for source-channel coding. In tests on image sources at different resolutions, the NTSCC approach generally outperformed analog transmission using standard deep joint source-channel coding and digital transmission based on classical separation. An image semantic coding method based on generative adversarial networks (GANs) was proposed in Reference [45], which aimed at semantic exchange rather than symbol transmission, and used multiple perception metrics to train and evaluate the proposed image semantic coding model. To solve the problem that the statistical feature distributions of transmitted data and training data generally differ, Zhang et al. [46] converted observed data into a form similar to empirical data by using a domain adaptation (DA) method supported by GANs. The results showed that this method was very effective in image transmission and classification tasks.

In traditional communication systems, general source encoders and channel encoders can adapt compression ratios (CR) and channel coding rates to the signal-to-noise ratio, thus optimizing performance under constrained bandwidth. To bridge the gap between semantic communication and traditional communication, the authors of Reference [12] considered a point-to-point image transmission system with signal-to-noise ratio feedback, and integrated the attention mechanism widely used in computer vision into semantic extraction. The attention mechanism uses additional neural networks to carefully select some
features from the original neural network or assign different weights to different features. This method is more robust and universal. In Reference [47], a knowledge-guided semantic computing network (SCN) was proposed, which consisted of a master knowledge-guided semantic tree module and an auxiliary data-driven lightweight neural network module to extract semantic information. The semantic tree module can quickly calculate classification results through forward computing. The lightweight neural network module can help the semantic tree module improve classification capabilities. In Reference [48], a semantic communication method based on compression ratio optimization was proposed, which reduced communication delay and improved data transmission reliability by optimizing the compression and transmission of image data. First, in the feature extraction stage, key visual features were extracted from original image data. Subsequently, in the semantic relationship extraction stage, the correlation between features was analyzed to understand the semantic content of image data. Finally, in the semantic compression stage, an adaptive optimization approach that automatically selects optimal compression rate was realized, which ensured image quality while enabling high compression rate and improving interference immunity. In Reference [5], an AI semantic communication architecture (SC-AIT) was proposed, which was trained by a knowledge base to learn how to extract semantic information and transmit it through communication channels, thus
greatly improving the efficiency of image processing tasks such as classification and detection. Zhang et al. [49] developed a semantic communication framework for image transmission and proposed an entropy-maximization multi-agent reinforcement learning method based on value decomposition, enabling servers to coordinate training and allocate resource blocks in a distributed manner.

In terms of JSCC research on channel feedback, a Vision Transformer (ViT)-based joint source-channel coding (JSCC) solution for wireless image transmission over Multiple Input Multiple Output (MIMO) systems, named ViT-MIMO, was proposed in Reference [50]. This model can adaptively learn feature mapping and power allocation according to source images and channel conditions, and can significantly improve the transmission quality in different channel conditions. In Reference [51], a novel mode of wireless image transmission
was proposed, which leveraged the feedback from receivers and was called JSCCformer-f. A block feedback channel model was considered, where transmitters receive noiseless or noisy channel output feedback after each block. The unified encoder of JSCCformer-f can leverage the semantic information of the source image, and obtain channel state information and the decoder's current understanding of the source image from the feedback signal, to generate coded symbols for each block. It addresses four key challenges facing existing channel feedback image transmission methods, namely high complexity, inadaptability, suboptimality and non-generalization.

In task-oriented communication, multiple AI agents collaborate on tasks in a centralized or distributed manner. Semantic-aware communication in tasks establishes multiple explicit or implicit connections between different terminals in an active or passive way to enhance knowledge among intelligent agents. Reference [52] focused on a task-oriented semantic communication scenario driven by UAV image sensing. An energy-efficient and task-oriented semantic communication framework was designed, and semantic communication was considered a promising technology to break the Shannon limit and
a key driving factor for future applications such as 6G networks and smart healthcare. Kang et al. [53] proposed a joint image transmission and scenario classification scheme. They used deep reinforcement learning to identify the most fundamental semantic features that serve a transmission task, thus achieving an optimal trade-off between classification accuracy and transmission cost.

To date, many semantic metrics have been proposed for image quality assessment (IQA). For example, image semantic similarity, peak signal-to-noise ratio (PSNR), image and semantic similarity (ISS), and recognition accuracy are commonly used image semantic communication indicators.

3.4 Video-Oriented Semantic Communication

3.4.1 The Context of Video Transmission

The development of social science and technology makes video transmission an inevitable part of people's work and life. However, existing wireless video transmission systems are susceptible to
interference in time-varying channel conditions, which may cause video pictures to become blurred, freeze, or be lost. In particular, high-resolution video transmission requires a large amount of transmission resources, which makes it challenging to maintain smooth video communication in low-bandwidth or unstable network environments. To solve these problems, work has been done on applying semantic communication-related technologies. By introducing semantic communication, systems can understand and process video content more intelligently, rather than simply transmitting image data, thus improving the quality of video conferences.

3.4.2 Semantic Video Transmission

Semantic communication technologies can identify and preferentially transmit key information in videos, such as human expressions, actions or important scenes, while transmitting secondary information in a compressed way to save transmission bandwidth. In addition, by adjusting video coding parameters and optimizing transmission paths in real time, semantic communication technologies can also adapt to different network conditions to improve the stability and efficiency of video transmission. To solve the problems facing traditional video transmission systems, Wang et al. [54] proposed an end-to-end semantic video transmission system (DVST) based on joint source-channel coding (JSCC). The system leverages nonlinear transformation and a conditional coding architecture to adaptively extract semantic features across video frames, and transmits semantic features through a set of learned variable-length deep JSCC codecs and wireless channels, outperforming traditional wireless video coding transmission systems. To optimize video transmission bandwidth resources, Liang et al. [55] proposed a video semantic system (VISTA) that transmits semantics instead of all bits of the video. The system classifies and encodes dynamic and static segments in the source video with semantic segmentation modules, and obtains semantic and position information of dynamic and static segments. Meanwhile, it leverages a JSCC module that adapts to different channel conditions to encode, transmit and decode the segmented information. Finally, it restores the video at the receiver with a frame interpolation module.

3.4.3 Semantic Video Conference

In terms of semantic video conferencing (SVC) transmission, considering that the video background is basically static and there is no frequent switch of speakers, only key points
308、ch of speakers,only key pointsindicating facial expression changes are transmitted,while other information that remains46/155unchanged in the video conference can be sent in advance,thus effectively improving thetransmission bandwidth of semantic video conferences 56.In order to ensure the performan

309、ceof semantic systems,Jiang et al.57 explored the impact of transmission errors on SVC andproposed a method for evaluating video quality.In an SVC system containing transmissionerrors,its structure is shown in Figure 3.4.Figure 3.4 Semantic Video Conference(SVC)Network Structure Proposed in Referenc

310、e 57The system consists of an effect layer,a semantic layer,and a physical layer.The effectlayer indicates the difference between video frames transmitted and recovered.The semanticlayer is for coding/decoding videos and sharing the knowledge base.The physical layer is fortransmitting encoded semant

311、ic features.At the semantic layer,since there are little changes in thebackground and speakers of a video conference,the transmitter shares the first frame of videowith the receiver as a shared knowledge base.Following that,a keypoint detector extracts facialchanges in the current frame and encodes

312、such changes into corresponding bits for transmissionby the physical layer.The receiver reconstructs the video transmitted with a generator based onthe key points received and the first video frame shared.The Hybrid Automatic Repeat reQuest(HARQ)can cope with time-varying channels in wireless commun

313、ication,which retransmits orsends incremental bits to improve transmission quality in time-varying channels withacknowledgment(ACK)feedback signals.47/155Figure 3.5 Basic Structure of SVC ModulesThe basic structure of SVC modules is shown in Figure 3.5.The keypoint detector consistsof a convolutiona

314、l neural network(CNN).The input image matrix is down-sampled through anti-aliasing interpolation to reduce the complexity of the keypoint detector,and final key points areobtained through an hourglass network with three blocks,a convolutional layer,and a Softmaxactivation function.In an encoder-deco

315、der structure,the fully connected layer is responsible forthe changes in key point dimensions.The quantization operation maps floating-point numbersoutput by the neural network into bits and bits into floating-point numbers,while the de-quantization operation remaps bits received into floating-point

numbers to adapt to the actual communication scenario. The generator calculates the dynamic part of a video based on the shared or received key points and the images shared in advance, thereby restoring the transmitted video frames.

Figure 3.6 Performance of Traditional Method, SVC and Different Configurations of SVC-HARQ under Different BER [57], panels (a) and (b)

Figure 3.6(a) shows the acceptable ratio of received frames at different bit error ratios (BERs), measured with a VGG-based detector, for conventional H.264 and SVC. Figure 3.6(b) shows the throughput of the traditional method, SVC, and different configurations of SVC-HARQ at different BERs, where SVC-HARQ (160, 160 bits) indicates that both the first and the second transmission carry 160 bits of information. Throughput is the number of video frames of acceptable quality received within 1.526M bits; when BER = 0, 10,000 video frames encoded by SVC can be
transmitted with 1.526M bits. In Figure 3.6(a), the acceptable ratio of the traditional method is higher than that of SVC when BER is low, because the data that SVC processes with neural networks is partially compressed. As BER rises, the performance of H.264 deteriorates sharply, and it becomes almost unusable when
BER is higher than 0.04. In contrast, SVC is highly robust to interference: even when BER reaches 0.2, its acceptable ratio is still above 0.95. Figure 3.6(b) shows that only a few frames are acceptable with the traditional method AV1+LDPC-HARQ (a combination of AV1 video coding, LDPC channel coding, and HARQ). As BER rises, SVC's capability to recover videos degrades significantly, while SVC-HARQ effectively withstands channel impairments by sending incremental bits, thus ensuring the quality of the transmitted video.

3.4.4 Conclusion

In video semantic communication, the impact of time-varying channels on wireless video transmission systems can be effectively alleviated with semantic communication technologies. By using deep learning and source-channel coding methods, such as DVST and VISTA, these systems can extract and transmit the semantic features of videos to improve the stability of video transmission and bandwidth utilization. For semantic video conferences, only the key points of facial expression changes are transmitted, and video frames are restored by a generative network. This effectively improves bandwidth efficiency and interference immunity, ensuring the reliability of video transmission in
time-varying channels.

4. Semantic Noise Suppression

In addition to physical channel noise and interference, semantic communication systems are subject to semantic noise. Semantic noise [58] refers to signals that lead the transmitter and receiver to misinterpret semantic information, as shown in Figure 4.1. Robustness against semantic noise is a key factor limiting the performance of semantic information transmission. This white paper introduces semantic noise in the text, speech, and image modalities, underlining the necessity of designing robust semantic communication systems. Specifically, it is necessary not only to explore the impact of semantic noise on transmission performance, but also to design robust semantic communication systems for specific modalities that eliminate semantic noise, thus improving the semantic fidelity of the transmission system.

Figure 4.1 Semantic Noise in Semantic Communication System

4.1 Robust Text Semantic Communication

Text is a common information carrier and comes from complex sources. A large amount of human-generated text exists on the Internet, such as blogs and Wikipedia. In addition, with the development of
speech and image recognition technologies, speech and images contribute a large amount of data to text databases. For example, automatic speech recognition and optical character recognition technologies use text to represent the semantic information of speech and images. Although such text can be used by deep models to improve their semantic understanding of text, the text data may contain errors caused by human mistakes and imperfect recognition algorithms. These errors are the semantic noise in the text modality. If not corrected, they will degrade the fidelity of the semantics extracted by the system and thus jeopardize the performance of the semantic communication system. However, existing text semantic communication systems, such as DeepSC, assume that the text input is correct and include no design for semantic noise, while in practice semantic noise will seriously affect the
performance of these semantic communication systems.

In the field of natural language processing, grammatical error correction algorithms, such as the one proposed in Reference [59], are often independent of communication systems. They are used only as front-end or back-end auxiliary modules to correct errors before text is sent or after it is reconstructed. This repeated semantic extraction wastes computing resources. Semantic communication transmits semantic information. If the semantic information itself can be processed, semantic noise can be eliminated within the communication process, thus improving the utilization of computing resources. To solve the above issues, Reference [60] defined semantic noise in text semantic communication and proposed metrics to quantify the intensity of semantic noise. A text semantic communication system with a semantic error corrector was designed. The architecture of this robust semantic communication system is shown in Figure 1. This system can correct semantic info
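
The text-level semantic noise described in this subsection, such as recognition or typing errors in the input text, can be illustrated with a small Python sketch. It is not the metric proposed in Reference [60], which is not spelled out here. As a stand-in assumption it uses the word error rate (word-level edit distance divided by the reference length); the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative stand-in for a semantic-noise intensity measure;
    NOT the metric defined in Reference [60].
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (before the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two words corrupted by recognition/typing errors.
clean = "the meeting starts at nine"
noisy = "the meting starts at mine"
print(word_error_rate(clean, noisy))  # 0.4
```

A surface measure like this only counts corrupted words. It does not capture how much the meaning changes ("nine" versus "mine" alters the message far more than "meeting" versus "meting"), which is one reason a dedicated semantic-noise metric is needed.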
