TWI821358B - Electronic apparatus, method for controlling thereof, and method for controlling a server - Google Patents
Electronic apparatus, method for controlling thereof, and method for controlling a server
- Publication number: TWI821358B
- Application number: TW108128335A
- Authority: TW (Taiwan)
- Prior art keywords: image data, artificial intelligence, data, filter, electronic device
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440218—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/2662—Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440263—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234363—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/115—Selection of the code volume for a coding unit prior to coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/33—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/439—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using cascaded computational arrangements for performing a single operation, e.g. filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/59—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234309—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0117—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computer Graphics (AREA)
- Image Processing (AREA)
- Controls And Circuits For Display Device (AREA)
- Electrotherapy Devices (AREA)
- Percussion Or Vibration Massage (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Description
The present invention relates to an electronic device, a method for controlling the same, and a method for controlling a server, and more specifically to an electronic device, a control method, and a server control method that improve an image streaming environment by transmitting and receiving high-definition images.
This application claims priority to Korean Patent Application No. 10-2018-0093511, filed with the Korean Intellectual Property Office on August 10, 2018, the disclosure of which is incorporated herein by reference in its entirety.
An artificial intelligence (AI) system is a system that trains itself and implements human-level intelligence. The recognition rate of an AI system improves the more the system is used.
AI technology includes machine learning (e.g., deep learning) techniques, which use algorithms that classify and learn the features of input data on their own, and element techniques, which use machine learning algorithms to simulate functions of the human brain such as recognition and determination.
The element techniques may include, for example, at least one of: language understanding, which recognizes human language and characters; visual understanding, which recognizes objects as human vision does; inference/prediction, which evaluates information and logically infers and predicts from it; knowledge representation, which processes human experience information into knowledge data; and motion control, which controls autonomous driving of vehicles and movement of robots.
In particular, network state is a critical factor in the image quality of a streaming system that streams by adaptively compressing and restoring images. Network resources, however, are limited, so it is difficult for users to consume high-definition content unless a large amount of resources is available.
In addition, video size keeps growing as image quality improves, while network bandwidth has not kept pace. Codec performance is therefore increasingly important for preserving image quality throughout image compression and restoration.
Provided are an electronic device, a method for controlling the same, and a method for controlling a server; more specifically, an electronic device that upscales a downscaled image by selecting an improved filter set from among a plurality of filter sets, a method for controlling the same, and a method for controlling a server.
According to an embodiment, a method for controlling an electronic device includes: receiving, from an external server, image data and information associated with a filter set applied to an artificial intelligence model for upscaling the image data; decoding the image data; upscaling the decoded image data using a first artificial intelligence model obtained based on the information associated with the filter set; and providing the upscaled image data for output.
The information associated with the filter set includes index information of the filter set, and the upscaling includes: obtaining the first artificial intelligence model, to which one of a plurality of trained filter sets stored in the electronic device is applied based on the index information; and upscaling the decoded image data by inputting the decoded image data into the obtained first artificial intelligence model.
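The index-based selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `FILTER_SETS`, `nearest_upscale`, `apply_kernel3x3`, and `upscale_with_index` are hypothetical names, each stored filter set is reduced to a single 3x3 kernel, and pixel replication followed by one filtering pass stands in for the first artificial intelligence model.

```python
from typing import Dict, List

Image = List[List[float]]
Kernel = List[List[float]]

# Hypothetical on-device store of trained filter sets, keyed by index.
# Each set is reduced to one 3x3 kernel; a real CNN would hold many layers.
FILTER_SETS: Dict[int, Kernel] = {
    0: [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],          # identity
    1: [[0.0, -0.25, 0.0], [-0.25, 2.0, -0.25], [0.0, -0.25, 0.0]],  # mild sharpening
}


def nearest_upscale(img: Image, factor: int = 2) -> Image:
    """Repeat every pixel `factor` times along both axes."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]


def apply_kernel3x3(img: Image, kernel: Kernel) -> Image:
    """Apply a 3x3 kernel, replicating edge pixels at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    sy = min(max(y + ky - 1, 0), h - 1)
                    sx = min(max(x + kx - 1, 0), w - 1)
                    acc += img[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def upscale_with_index(decoded: Image, filter_index: int) -> Image:
    """Look up the filter set named by the server's index and upscale with it."""
    kernel = FILTER_SETS[filter_index]
    return apply_kernel3x3(nearest_upscale(decoded), kernel)
```

Because only an index travels with the image data, the filter parameters themselves never have to be transmitted; the device only needs the same indexed set of trained filters that the server evaluated.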
The image data is obtained by encoding downscaled image data, and the downscaled image data is obtained by inputting original image data corresponding to the image data into a second artificial intelligence model for downscaling the original image data.
The number of filters of the first artificial intelligence model may be smaller than the number of filters of the second artificial intelligence model.
The information associated with the filter set is information obtained by the external server, and identifies the filter set that minimizes the difference between the upscaled image data obtained by the first artificial intelligence model and the original image data.
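The selection criterion can be stated compactly. The symbols below are assumptions introduced for illustration only: $x$ is the original image data, $D$ the second (downscaling) model, $U_i$ the first (upscaling) model with the $i$-th of $N$ stored filter sets applied, and $d$ a difference measure that the text does not specify (mean squared error would be one option). Codec loss between encoding and decoding is ignored in this sketch.

```latex
\[
  i^{*} \;=\; \arg\min_{i \in \{1,\dots,N\}} \; d\bigl(U_i(D(x)),\; x\bigr)
\]
```

Under these assumptions, $i^{*}$ is the value the server would send as the index information.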
The first artificial intelligence model may be a convolutional neural network (CNN).
The providing may include displaying the upscaled image data.
According to an embodiment, a method for controlling a server includes: obtaining downscaled image data by inputting original image data into an artificial intelligence downscaling model for downscaling image data; obtaining a plurality of upscaled image data by inputting the downscaled image data into each of a plurality of artificial intelligence upscaling models, each of which applies a respective one of a plurality of filter sets trained to upscale the downscaled image data; encoding the downscaled image data while adding information associated with the filter set of the artificial intelligence upscaling model that outputs, among the plurality of upscaled image data, the upscaled image data having the smallest difference from the original image data; and transmitting the encoded image data to an external electronic device.
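The server-side steps above can be sketched as a small selection loop. Everything named here is hypothetical: block averaging stands in for the artificial intelligence downscaling model, two toy functions stand in for the upscaling models, mean squared error is assumed as the difference measure, and the returned dictionary stands in for an encoded stream carrying the AI flag and filter index.

```python
from typing import Callable, Dict, List

Image = List[List[float]]


def average_downscale(img: Image, factor: int = 2) -> Image:
    """Stand-in for the AI downscaling model: average each factor x factor block."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]


def replicate(img: Image, factor: int = 2) -> Image:
    """Toy upscaler: repeat every pixel along both axes."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of equal size."""
    diffs = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Hypothetical upscaling models, one per trained filter set index.
UPSCALERS: Dict[int, Callable[[Image], Image]] = {
    0: replicate,
    1: lambda img: [[0.5 * px for px in row] for row in replicate(img)],
}


def build_packet(original: Image) -> dict:
    """Downscale, try every upscaler, and attach the index of the best one."""
    reduced = average_downscale(original)
    errors = {idx: mse(up(reduced), original) for idx, up in UPSCALERS.items()}
    best = min(errors, key=errors.get)
    # A real server would run a standard video encoder on `reduced` here.
    return {"payload": reduced, "ai_flag": True, "filter_index": best}
```

For a block-constant input such as `[[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]` (as floats), pixel replication reconstructs the original exactly, so index 0 is selected.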
The method may further include training parameters of the plurality of filter sets to reduce the difference between the plurality of upscaled image data and the original image data.
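As a rough illustration of training parameters to reduce that difference, the sketch below fits a single scalar gain by gradient descent on mean squared error. The text does not name an optimizer or a loss, so both are assumptions, and `mse_loss` and `train_gain` are hypothetical helpers; one scalar stands in for the many parameters of a real filter set.

```python
from typing import List


def mse_loss(gain: float, upscaled: List[float], original: List[float]) -> float:
    """Mean squared error between gain-scaled upscaled pixels and the original."""
    return sum((gain * u - o) ** 2 for u, o in zip(upscaled, original)) / len(original)


def train_gain(upscaled: List[float], original: List[float],
               gain: float = 0.5, lr: float = 0.01, steps: int = 200) -> float:
    """Plain gradient descent on the single parameter `gain`."""
    n = len(original)
    for _ in range(steps):
        # d/d(gain) of the mean squared error
        grad = sum(2.0 * (gain * u - o) * u for u, o in zip(upscaled, original)) / n
        gain -= lr * grad
    return gain
```

When the upscaled pixels already match the original, the trained gain converges toward 1.0; a real training run would update every filter coefficient in the same spirit.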
The number of filters of the artificial intelligence upscaling model may be smaller than the number of filters of the artificial intelligence downscaling model.
According to an embodiment, an electronic device includes: a communication interface including communication circuitry; and a processor configured to: receive, from an external server via the communication interface, image data and information associated with a filter set applied to an artificial intelligence model for upscaling the image data; decode the received image data; upscale the decoded image data using a first artificial intelligence model obtained based on the information associated with the filter set; and provide the upscaled image data for output.
The electronic device may further include a memory. The information associated with the filter set includes index information of the filter set, and the processor is further configured to: obtain the first artificial intelligence model, to which one of a plurality of trained filter sets stored in the memory is applied based on the index information; and upscale the decoded image data by inputting the decoded image data into the obtained first artificial intelligence model.
The image data may be obtained by encoding downscaled image data, and the downscaled image data may be obtained by inputting original image data corresponding to the image data into a second artificial intelligence model for downscaling the original image data.
The number of filters of the first artificial intelligence model may be smaller than the number of filters of the second artificial intelligence model.
The information associated with the filter set may be information obtained by the external server so as to reduce the difference between the upscaled image data obtained by the first artificial intelligence model and the original image data.
The first artificial intelligence model may be a convolutional neural network (CNN).
The electronic device may further include a display, and the processor may be configured to provide the upscaled image data for output by controlling the display to display the upscaled image data.
21、1110:最初影像資料 21. 1110: Initial image data
22:人工智慧編碼器 22:Artificial Intelligence Coder
23、27:經壓縮影像 23, 27: Compressed image
24:編碼過程/編碼操作 24: Encoding process/encoding operation
25:流式傳輸源 25:Streaming source
26:解碼過程/操作 26: Decoding process/operation
28:人工智慧解碼器/操作 28: Artificial Intelligence Decoder/Operation
29:最初復原影像 29:First restored image
30、150:顯示器 30, 150: Monitor
71:影像內容源/最初影像內容源/輸入影像內容源 71: Image content source/original image content source/input image content source
72:經縮小的影像 72: Reduced image
73:人工智慧旗標 73: Artificial Intelligence Flag
74:索引資訊/濾波器索引資訊 74: Index information/Filter index information
75:資訊 75:Information
100:電子裝置/影像處理裝置 100: Electronic devices/image processing devices
110:無線通訊介面 110:Wireless communication interface
120、230、900:處理器 120, 230, 900: Processor
121:隨機存取記憶體 121: Random access memory
122:唯讀記憶體 122: Read-only memory
123:中央處理單元 123: Central processing unit
124:圖形處理單元 124: Graphics processing unit
125:匯流排 125:Bus
130、220、1702:記憶體 130, 220, 1702: memory
140:有線介面 140:Wired interface
160:視訊處理器 160:Video processor
170:音訊處理器 170: Audio processor
180:音訊輸出組件 180: Audio output component
200:伺服器 200:server
210:通訊介面 210: Communication interface
810:濾波器組 810: Filter bank
811、812、81n:層 811, 812, 81n: layer
910:訓練單元 910: Training unit
910-1:訓練資料獲取單元 910-1: Training data acquisition unit
910-2:訓練資料預處理器 910-2: Training data preprocessor
910-3:訓練資料選擇器 910-3: Training data selector
910-4:模型訓練單元 910-4: Model training unit
910-5:模型評估單元 910-5: Model Evaluation Unit
920:獲取單元 920: Get unit
920-1:輸入資料獲取單元 920-1: Input data acquisition unit
920-2:輸入資料預處理器 920-2: Input data preprocessor
920-3:輸入資料選擇器 920-3:Input data selector
920-4:提供器 920-4:Provider
920-5:模型更新單元 920-5: Model update unit
1000:影像流式傳輸系統 1000:Image streaming system
1120、1140:卷積濾波器/濾波器 1120, 1140: Convolution filter/filter
1130:經壓縮影像資料 1130: Compressed image data
1150:復原影像資料 1150:Restore image data
1201:原始視訊資料、人工智慧旗標及濾波器索引資訊 1201: Original video data, artificial intelligence flags and filter index information
1202:標準編碼器 1202:Standard encoder
1203:人工智慧旗標及濾波器索引資訊 1203:Artificial intelligence flag and filter index information
1204:視訊串流 1204: Video stream
1205、1207:補充強化資訊 1205, 1207: Supplemental enhancement information (SEI)
1206、1208:視訊塊 1206, 1208: Video block
1209:流式傳輸儲存裝置 1209:Streaming storage device
1501:操作/標準解碼器 1501: Operation/standard decoder
1502:經壓縮原始資料、人工智慧旗標及濾波器索引資訊 1502: Compressed raw data, artificial intelligence flags and filter index information
1503:操作/人工智慧資訊控制器 1503: Operation/Artificial Intelligence Information Controller
1504:操作/索引控制器 1504: Operation/Index Controller
1505:操作/記憶體 1505: Operation/Memory
1506:索引資訊匹配的參數的載入 1506: Loading of parameters matching index information
1507:人工智慧放大模型 1507: Artificial intelligence amplification model
1508、1510:原始資料 1508, 1510: original data
1509:操作/顯示器 1509:Operation/Display
1511:請求 1511: Request
1601:位元串流、補充強化資訊標頭中所包含的濾波器索引及人工智慧旗標 1601: Bitstream, and the filter index and artificial intelligence flag contained in the supplemental enhancement information (SEI) header
1602:原始視訊資料及補充強化資訊 1602: Original video data and supplemental enhancement information (SEI)
1604:原始視訊資料 1604: Original video data
1605:經放大的視訊資料 1605: Amplified video data
1701:濾波器索引資訊 1701:Filter index information
1703:選擇與濾波器索引資訊匹配的濾波器組 1703: Select a filter bank that matches the filter index information
1704:偏倚項 1704: Bias term
1705:輸入影像資料 1705:Input image data
1706:卷積濾波器 1706: Convolution filter
1707:復原的輸出影像 1707:Restored output image
S610、S620、S630、S640、S710、S720、S730、S740、S750、S1410、S1420、S1430、S1440、S1610、S1620、S1630、S1640、S1650、S1660:操作 Operation
結合附圖閱讀以下說明,本發明的某些實施例的以上及其他態樣、特徵及優勢將更加顯而易見,在附圖中:圖1及圖2是根據實施例的影像流式傳輸系統的圖。 The above and other aspects, features, and advantages of certain embodiments of the present invention will become more apparent from the following description read in conjunction with the accompanying drawings, in which: FIG. 1 and FIG. 2 are diagrams of an image streaming system according to an embodiment.
圖3是根據實施例的電子裝置的方塊圖。 FIG. 3 is a block diagram of an electronic device according to an embodiment.
圖4是根據實施例的電子裝置的方塊圖。 FIG. 4 is a block diagram of an electronic device according to an embodiment.
圖5是根據實施例的伺服器的方塊圖。 Figure 5 is a block diagram of a server according to an embodiment.
圖6是根據實施例的伺服器的影像編碼操作的流程圖。 Figure 6 is a flowchart of an image encoding operation of a server according to an embodiment.
圖7是根據實施例的伺服器的影像編碼操作的流程圖。 Figure 7 is a flowchart of an image encoding operation of a server according to an embodiment.
圖8是根據實施例的濾波器組的圖。 Figure 8 is a diagram of a filter bank according to an embodiment.
圖9是根據實施例的用於訓練並使用人工智慧模型的電子裝置的方塊圖。 Figure 9 is a block diagram of an electronic device for training and using an artificial intelligence model, according to an embodiment.
圖10A及圖10B是根據實施例的訓練單元及獲取單元的方塊圖。 FIG. 10A and FIG. 10B are block diagrams of a training unit and an acquisition unit according to embodiments.
圖11是根據實施例的濾波器組的訓練方法的圖。 Figure 11 is a diagram of a training method of a filter bank according to an embodiment.
圖12是根據實施例的流式傳輸資料的結構的圖。 Figure 12 is a diagram of the structure of streaming data according to an embodiment.
圖13是根據實施例的濾波器組的訓練方法的圖。 Figure 13 is a diagram of a training method of a filter bank according to an embodiment.
圖14是根據實施例的電子裝置的影像解碼操作的流程圖。 FIG. 14 is a flowchart of an image decoding operation of an electronic device according to an embodiment.
圖15及圖16是根據實施例的電子裝置的影像解碼操作的圖。 FIG. 15 and FIG. 16 are diagrams of image decoding operations of the electronic device according to embodiments.
圖17是根據實施例的電子裝置的影像放大操作的圖。 FIG. 17 is a diagram of an image magnification operation of the electronic device according to the embodiment.
可對本發明的示範性實施例進行多種修改。因此,圖式中說明且詳細說明中詳細地闡述具體的示範性實施例。然而,應理解本發明並不僅限於具體的示範性實施例,而是在不背離本發明的範疇及精神的情況下包括所有的潤飾、等效內容及替代形式。此外,由於不必要的細節會使本發明模糊,因此未詳細地闡述所述眾所周知的功能或構造。 Various modifications may be made to the exemplary embodiments of the present invention. Accordingly, specific exemplary embodiments are illustrated in the drawings and described in detail in the detailed description. However, it should be understood that the present invention is not limited to the specific exemplary embodiments, but includes all modifications, equivalents, and alternatives that do not depart from the scope and spirit of the invention. In addition, well-known functions or constructions are not described in detail, since they would obscure the invention with unnecessary detail.
將簡要闡述本說明書中所使用的用語,且將詳細地闡述本發明。 The terms used in this specification will be briefly explained, and the present invention will be explained in detail.
包括技術用語及科學用語在內的本說明書中所使用的所有用語皆具有與熟習此項技術者通常所理解的相同的含義。然而,該些用語可根據熟習此項技術者的意圖、法律或技術闡釋以及新技術的出現而有所變化。另外,一些用語是申請人任意地選擇的。該些用語可被解釋為本文中所定義的含義,且除非另有規定,否則可基於本發明的全部內容及此項技術中的技術知識來加以解釋。 All terms used in this specification, including technical terms and scientific terms, have the same meaning as commonly understood by those skilled in the art. However, these terms may change based on the intentions of those skilled in the art, legal or technical interpretations, and the emergence of new technologies. Additionally, some terms are chosen arbitrarily by the applicant. These terms may be interpreted as having the meanings defined herein and, unless otherwise specified, may be interpreted based on the entire content of the present invention and technical knowledge in the art.
本發明並不僅限於本文中所揭露的實施例且可以各種形式來實施,且本發明的範疇並不僅限於以下實施例。另外,自申請專利範圍及其等效內容的含義及範疇導出的所有改變或潤飾皆應被解釋為包含於本發明的範疇內。在以下說明中,可省略眾所周知但與本發明的主旨不相關的配置。 The present invention is not limited to the embodiments disclosed herein and may be implemented in various forms, and the scope of the present invention is not limited only to the following embodiments. In addition, all changes or modifications derived from the meaning and scope of the patent application and its equivalents should be construed as being included in the scope of the present invention. In the following description, configurations that are well known but not related to the gist of the present invention may be omitted.
可使用諸如「第一」、「第二」等用語來闡述各種元件,但所述元件不應受該些用語限制。所述用語是用於將一個元件與其他元件區分開。 Terms such as "first", "second", etc. may be used to describe various elements, but the elements should not be limited by these terms. These terms are used to distinguish one element from other elements.
用語的單數表達亦包含複數含義,只要所述複數含義在所述用語的上下文中不存在不同的含義即可。在本發明中,諸如「包括」及「具有(have/has)」等用語應被解釋為指明本發明中存在該些特徵、數目、操作、元件、組件或其組合,且不排除存在或可能添加其他特徵、數目、操作、元件、組件或其組合中的一者或多者。 A singular expression of a term also includes the plural meaning, unless the context gives the plural a different meaning. In the present invention, terms such as "include" and "have/has" should be construed as specifying the presence of the stated features, numbers, operations, elements, components, or combinations thereof, and not as excluding the presence or possible addition of one or more other features, numbers, operations, elements, components, or combinations thereof.
在實施例中,「模組」、「單元」或「部分」等實行至少一個功能或操作,且可被實現為諸如處理器或積體電路等硬體、由處理器執行的軟體或者所述硬體與軟體的組合。另外,多個「模組」、多個「單元」、多個「部分」等可被整合至至少一個模組或晶片中且可被實現為至少一個處理器,但應被實現為特殊硬體的「模組」、「單元」或「部分」除外。 In embodiments, a "module", "unit", or "part" performs at least one function or operation, and may be implemented as hardware such as a processor or an integrated circuit, as software executed by a processor, or as a combination of hardware and software. In addition, a plurality of "modules", "units", or "parts" may be integrated into at least one module or chip and implemented as at least one processor, except for a "module", "unit", or "part" that must be implemented as specific hardware.
在後文中,將參考附圖詳細地闡述本發明的實施例,以使得熟習此項技術者可實施本發明的實施例。然而,本發明可體現為諸多不同的形式,並不僅限於本文中所述的實施例。為在圖式中清晰地說明本發明,為清晰起見省略了對於完整地理解本發明無關緊要的一些元件,且在本說明書通篇中相似的參考編號指代相似的元件。 In the following, embodiments of the present invention will be explained in detail with reference to the accompanying drawings, so that those skilled in the art can implement the embodiments of the present invention. However, the invention may be embodied in many different forms and is not limited to the embodiments set forth herein. In order to clearly illustrate the invention in the drawings, some elements which are not essential to a complete understanding of the invention have been omitted for the sake of clarity, and similar reference numbers refer to similar elements throughout this specification.
在後文中,將參考圖式更詳細地闡述本發明。 In the following, the invention will be explained in more detail with reference to the drawings.
圖1是根據實施例的影像流式傳輸系統的圖。 FIG. 1 is a diagram of an image streaming system according to an embodiment.
參考圖1,影像流式傳輸系統1000可包括電子裝置100及伺服器200。
Referring to FIG. 1, an image streaming system 1000 may include an electronic device 100 and a server 200.
伺服器200可產生經編碼的影像資料。所述經編碼的影像資料可以是在伺服器200縮小最初影像資料之後再加以編碼的影像資料。
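The server 200 may generate encoded image data. The encoded image data may be image data that is encoded after the server 200 downscales the original image data.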
伺服器200可使用用於縮小影像資料的人工智慧模型來縮小最初影像資料。伺服器200可在像素基礎上、在區塊基礎上、在圖框基礎上等縮小影像資料。
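The server 200 may downscale the original image data using an artificial intelligence model for downscaling image data. The server 200 may downscale the image data on a per-pixel basis, a per-block basis, a per-frame basis, or the like.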
伺服器200可獲取多個影像資料,所述多個影像資料是藉由在將多個濾波器組應用於用於放大影像資料的人工智慧模型之後放大經縮小的影像資料而獲得。伺服器200可在像素基礎上、在區塊基礎上、在圖框基礎上等放大經縮小的影像資料。濾波器組可包括應用於人工智慧模型的多個濾波器。應用於放大用人工智慧模型的濾波器的數目可小於應用於縮小用人工智慧模型的濾波器的數目。此乃因用作解碼器的電子裝置100的濾波器層因解碼器操作的即時性質而無法深入地形成。
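The server 200 may obtain a plurality of image data by upscaling the downscaled image data after applying each of a plurality of filter banks to an artificial intelligence model for upscaling image data. The server 200 may upscale the downscaled image data on a per-pixel basis, a per-block basis, a per-frame basis, or the like. A filter bank may include a plurality of filters applied to the artificial intelligence model. The number of filters applied to the upscaling artificial intelligence model may be smaller than the number of filters applied to the downscaling artificial intelligence model. This is because the filter layers of the electronic device 100, which serves as the decoder, cannot be formed deeply owing to the real-time nature of the decoder operation.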
所述多個濾波器中的每一者可包括多個參數。亦即,濾波器組可以是用於獲得人工智慧模型的參數集合。所述參數可被稱為權重、係數等。 Each of the plurality of filters may include a plurality of parameters. That is, the filter bank may be a set of parameters used to obtain an artificial intelligence model. The parameters may be called weights, coefficients, etc.
可提前訓練所述多個濾波器組並將其儲存於伺服器200中。所述多個濾波器組可提供改良的壓縮率以獲得與最初影像資料具有最小差異的經放大的影像。經過訓練的資料可以是應用於縮小人工智慧模型的多個參數以及應用於放大人工智慧模型的多個參數。舉例而言,可基於影像資料的類別來訓練所述多個濾波
器組。將參考圖13更詳細地對示例性實施例加以詳細說明。
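The plurality of filter banks may be trained in advance and stored in the server 200. The plurality of filter banks may provide an improved compression ratio so as to obtain an upscaled image having the smallest difference from the original image data. The trained data may be a plurality of parameters applied to the downscaling artificial intelligence model and a plurality of parameters applied to the upscaling artificial intelligence model. For example, the plurality of filter banks may be trained based on the category of the image data. An exemplary embodiment will be described in more detail with reference to FIG. 13.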
伺服器200可識別用於產生在所述多個經放大的影像資料當中與最初影像資料具有最小差異的影像資料的濾波器組。可針對影像資料的每一圖框識別改良的濾波器組。
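The server 200 may identify the filter bank used to generate, among the plurality of upscaled image data, the image data having the smallest difference from the original image data. An improved filter bank may be identified for each frame of the image data.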
伺服器200可將與經編碼的影像及濾波器組相關聯的資訊傳輸至電子裝置100。與濾波器組相關聯的資訊可包括所識別濾波器組的索引資訊。濾波器組的索引資訊可用於區分由多個參數構成的濾波器組。舉例而言,當n個濾波器組(諸如濾波器1、濾波器2、...濾波器n)儲存於電子裝置100及伺服器200中時,值1、2、...及n可被定義為索引資訊。
The server 200 may transmit the encoded image and information associated with the filter bank to the electronic device 100. The information associated with the filter bank may include index information of the identified filter bank. The index information of a filter bank may be used to distinguish among filter banks that each consist of a plurality of parameters. For example, when n filter banks (such as filter 1, filter 2, ..., filter n) are stored in the electronic device 100 and the server 200, the values 1, 2, ..., and n may be defined as the index information.
電子裝置100可對所接收到的影像資料進行解碼並實行放大。所接收到的影像資料可以是自伺服器200接收到的經編碼的資料。電子裝置100可使用放大人工智慧模型來對經解碼的影像資料實行放大。
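The electronic device 100 may decode the received image data and perform upscaling. The received image data may be encoded data received from the server 200. The electronic device 100 may upscale the decoded image data using an upscaling artificial intelligence model.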
電子裝置100可儲存用於放大影像資料的多個濾波器組。電子裝置100中所儲存的所述多個濾波器組可與伺服器200中所儲存的所述多個濾波器組相同。
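The electronic device 100 may store a plurality of filter banks for upscaling image data. The plurality of filter banks stored in the electronic device 100 may be the same as the plurality of filter banks stored in the server 200.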
電子裝置100可藉由將經解碼的影像資料輸入至放大人工智慧模型中來獲得經放大的影像,所述放大人工智慧模型是基於自伺服器200接收到的影像資料中所包含的濾波器組而獲得。具體而言,電子裝置100可藉由基於與自伺服器200接收到的濾波器組相關聯的資訊使用電子裝置100中所儲存的所述多個濾波
器組當中的單個濾波器組來獲得放大人工智慧模型以用於放大。
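The electronic device 100 may obtain an upscaled image by inputting the decoded image data into an upscaling artificial intelligence model that is obtained based on the filter bank information included in the image data received from the server 200. Specifically, based on the information associated with the filter bank received from the server 200, the electronic device 100 may obtain the upscaling artificial intelligence model by using a single filter bank from among the plurality of filter banks stored in the electronic device 100.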
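The index-based selection described above can be illustrated with a minimal Python sketch. The table `STORED_FILTER_BANKS`, the nested-list representation of a filter bank, and the function `select_filter_bank` are hypothetical names introduced only for illustration; the description does not specify how the banks are stored.

```python
from typing import Dict, List

# Hypothetical representation: a filter bank is a list of layers, each layer a
# list of filters, and each filter a flat list of parameters (weights).
FilterBank = List[List[List[float]]]

# The server and the device are assumed to hold identical banks under
# identical indices 1..n, as the description states.
STORED_FILTER_BANKS: Dict[int, FilterBank] = {
    1: [[[0.25, 0.25, 0.25, 0.25]]],
    2: [[[0.0, 0.5, 0.5, 0.0]]],
    3: [[[1.0, 0.0, 0.0, 0.0]]],
}


def select_filter_bank(index: int) -> FilterBank:
    """Return the locally stored bank that matches the received index."""
    if index not in STORED_FILTER_BANKS:
        raise KeyError(f"no filter bank stored for index {index}")
    return STORED_FILTER_BANKS[index]


# The device receives only the index; the parameters never travel with the stream.
bank = select_filter_bank(2)
```

Because only a small integer is transmitted, the cost of signalling the chosen bank does not grow with the number of parameters in the bank.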
電子裝置100可提供經放大的影像資料以供輸出。
The electronic device 100 may provide the upscaled image data for output.
當電子裝置100是包括顯示器的顯示設備(諸如,個人電腦(personal computer,PC)、電視(television,TV)、行動設備等)時,電子裝置100可提供經放大的影像資料以經由電子裝置100的顯示器進行顯示。
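When the electronic device 100 is a display apparatus that includes a display (such as a personal computer (PC), a television (TV), or a mobile device), the electronic device 100 may provide the upscaled image data for display on the display of the electronic device 100.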
當電子裝置100是不包括顯示器的裝置(諸如,機頂盒或伺服器)時,電子裝置100可將經放大的影像資料提供至具有顯示器的外部設備,以使得所述外部設備可顯示所述經放大的影像資料。
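When the electronic device 100 is a device that does not include a display (such as a set-top box or a server), the electronic device 100 may provide the upscaled image data to an external apparatus having a display so that the external apparatus can display the upscaled image data.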
如上所述,藉由提前在由伺服器進行的編碼過程中通過多個放大過程識別改良的放大濾波器組,可將電子裝置中的復原過程簡化。因此,可在影像流式傳輸環境中實現高壓縮率,且因此可在影像流式傳輸環境中傳輸高畫質影像。 As mentioned above, by identifying the improved amplification filter bank through multiple amplification processes in advance during the encoding process by the server, the recovery process in the electronic device can be simplified. Therefore, a high compression rate can be achieved in an image streaming environment, and therefore a high-definition image can be transmitted in an image streaming environment.
圖2是圖1所示影像流式傳輸系統的圖。 FIG. 2 is a diagram of the image streaming system shown in FIG. 1 .
參考圖2,伺服器200可將最初影像資料21輸入至人工智慧編碼器22中。最初影像資料21可以是影像內容源。舉例而言,最初影像資料21的大小可以是2N×2M。
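Referring to FIG. 2, the server 200 may input original image data 21 into an artificial intelligence encoder 22. The original image data 21 may be an image content source. For example, the size of the original image data 21 may be 2N×2M.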
人工智慧編碼器22可接收大小為2N×2M的最初影像資料21,且獲得大小為N×M的經壓縮影像23。人工智慧編碼器22可使用縮小人工智慧模型來縮小最初影像資料21。人工智慧編碼器22可獲得多個經放大的影像資料,所述多個經放大的影像資
料是藉由使用應用所述多個濾波器組中的每一者的人工智慧模型放大經壓縮影像23而獲得。人工智慧編碼器22可識別用於產生在所述多個經放大的影像資料當中與最初影像資料21最類似的影像資料的濾波器組。將參考圖8更詳細地闡述濾波器組的示例性實施例。
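The artificial intelligence encoder 22 may receive the original image data 21 of size 2N×2M and obtain a compressed image 23 of size N×M. The artificial intelligence encoder 22 may downscale the original image data 21 using a downscaling artificial intelligence model. The artificial intelligence encoder 22 may obtain a plurality of upscaled image data by upscaling the compressed image 23 using an artificial intelligence model to which each of the plurality of filter banks is applied. The artificial intelligence encoder 22 may identify the filter bank used to generate, among the plurality of upscaled image data, the image data most similar to the original image data 21. An exemplary embodiment of the filter bank will be described in more detail with reference to FIG. 8.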
濾波器可以是具有一些參數的遮罩且可由參數矩陣定義。濾波器可被稱為視窗或核心。構成濾波器中的矩陣的參數可包含0(例如,0值)、可逼近0的0元素或具有在0與1之間的恆定值的非0元素,且可根據其功能而具有各種型樣。 A filter may be a mask having a number of parameters and may be defined by a matrix of parameters. A filter may also be referred to as a window or a kernel. The parameters constituting the matrix of a filter may include zero elements (that is, values of 0 or values that can be approximated as 0) and non-zero elements having constant values between 0 and 1, and may take various patterns depending on the function of the filter.
舉例而言,當人工智慧模型體現為用於辨識影像的卷積神經網路(Convolutional Neural Network,CNN)時,電子裝置可對輸入影像使用具有一些參數的濾波器,且確定藉由將影像的相應參數與濾波器的相應參數相乘(卷積計算)獲得的值的總和作為輸出影像的像素值,以提取特徵值。 For example, when the artificial intelligence model is embodied as a convolutional neural network (CNN) for recognizing an image, the electronic device may apply a filter having a number of parameters to the input image and determine, as a pixel value of the output image, the sum of the values obtained by multiplying each parameter of the image by the corresponding parameter of the filter (a convolution calculation), thereby extracting a feature value.
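The multiply-and-sum calculation described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `convolve2d`, the nested-list image representation, and the sample values are hypothetical, no padding or stride is used, and, as is common in CNN practice, the kernel is not flipped.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; each output pixel is the sum of the
    element-wise products of the kernel and the image patch beneath it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0.0, 1.0],
          [1.0, 0.0]]  # parameters between 0 and 1, including zero elements
feature_map = convolve2d(image, kernel)
# feature_map == [[6.0, 8.0], [12.0, 14.0]]
```

Each distinct kernel yields its own feature map, which is why applying several filters extracts several feature values from the same input.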
可通過多個濾波器提取多個特徵值來提取輸入影像資料的強特徵,且可根據濾波器的數目提取多個特徵值。可通過圖11中所示的多個層重複進行卷積影像處理,且所述層中的每一者可包括多個濾波器。與此同時,將被訓練的濾波器可根據卷積神經網路的訓練目標而有所變化,且將選擇的濾波器的型樣可有所變化。舉例而言,將被訓練或將被選擇的濾波器可根據卷積神經網路的訓練目標是縮小還是放大輸入影像、根據影像所屬類別等而有所變化。 Multiple feature values can be extracted through multiple filters to extract strong features of the input image data, and multiple feature values can be extracted according to the number of filters. Convolutional image processing may be iterated through multiple layers shown in Figure 11, and each of the layers may include multiple filters. At the same time, the filters to be trained may vary according to the training objectives of the convolutional neural network, and the type of filters to be selected may vary. For example, the filter to be trained or selected may change depending on whether the training goal of the convolutional neural network is to reduce or enlarge the input image, according to the category to which the image belongs, and so on.
伺服器200可使用與經壓縮影像23及用於改良放大的濾波器組相關聯的資訊來實行編碼過程24,所述經壓縮影像23是自人工智慧編碼器22獲得,大小為N×M。
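The server 200 may perform an encoding process 24 using the compressed image 23 of size N×M obtained from the artificial intelligence encoder 22 and the information associated with the filter bank for improved upscaling.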
編碼過程24可以是通用編碼過程。具體而言,影像編碼器可藉由對大小為N×M的經壓縮影像23實行編碼來產生位元串流。可由流式傳輸標準格式化器根據流式傳輸標準格式來產生所產生的位元串流。將參考圖13更詳細地闡述產生位元串流的過程。可將所產生且已壓縮的位元串流儲存於流式傳輸儲存裝置中。
The encoding process 24 may be a general-purpose encoding process. Specifically, an image encoder may generate a bitstream by encoding the compressed image 23 of size N×M. The generated bitstream may be formatted by a streaming standard formatter according to a streaming standard format. The process of generating the bitstream will be described in more detail with reference to FIG. 13. The generated and compressed bitstream may be stored in a streaming storage device.
伺服器200可將所儲存的流式傳輸源25傳輸至電子裝置100。所傳輸的經壓縮位元串流可包含與經編碼的影像資料以及與用於改良放大的濾波器組相關聯的資訊。
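The server 200 may transmit the stored streaming source 25 to the electronic device 100. The transmitted compressed bitstream may include the encoded image data and the information associated with the filter bank for improved upscaling.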
電子裝置100可藉由使用所接收到的流式傳輸源25實行解碼過程26來獲得大小為N×M的經壓縮影像27。所獲得的大小為N×M的經壓縮影像27可與在編碼之前大小為N×M的經壓縮影像23對應。電子裝置100可藉由將流式傳輸源25輸入至流式傳輸剖析器來獲得位元串流,且藉由將所獲得的位元串流輸入至影像解碼器中來獲得大小為N×M的解碼的經壓縮影像27。
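The electronic device 100 may obtain a compressed image 27 of size N×M by performing a decoding process 26 on the received streaming source 25. The obtained compressed image 27 of size N×M may correspond to the compressed image 23 of size N×M before encoding. The electronic device 100 may obtain a bitstream by inputting the streaming source 25 into a streaming parser, and obtain the decoded compressed image 27 of size N×M by inputting the obtained bitstream into an image decoder.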
電子裝置100可藉由將大小為N×M的解碼的經壓縮影像27輸入至人工智慧解碼器28中來實行放大。電子裝置100可藉由應用所儲存的所述多個濾波器組中的一者來獲得放大用的放大人工智慧模型,並使用所獲得的放大人工智慧模型來獲得大小為2N×2M的最初復原影像29。電子裝置100可基於與所接收到
的流式傳輸源25中所包含的濾波器組相關聯的資訊來選擇所述多個濾波器組中的一者。
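The electronic device 100 may perform upscaling by inputting the decoded compressed image 27 of size N×M into an artificial intelligence decoder 28. The electronic device 100 may obtain an upscaling artificial intelligence model by applying one of the plurality of stored filter banks, and use the obtained model to obtain a restored original image 29 of size 2N×2M. The electronic device 100 may select one of the plurality of filter banks based on the filter bank information included in the received streaming source 25.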
電子裝置100可控制顯示器30顯示最初復原影像29。當電子裝置100中未設置有顯示器時,電子裝置100可將最初復原影像29提供至外部顯示設備以進行顯示。
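The electronic device 100 may control a display 30 to display the restored original image 29. When the electronic device 100 is not provided with a display, the electronic device 100 may provide the restored original image 29 to an external display apparatus for display.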
與此同時,為便於闡釋,將人工智慧編碼器22、影像編碼器及流式傳輸標準格式化組件示出為單獨的組件。然而,在另一實施例中,可藉由單個處理器來實施前述組件。同樣地,亦可藉由單個處理器來實施流式傳輸剖析器、影像解碼器及人工智慧解碼器28。
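Meanwhile, for ease of explanation, the artificial intelligence encoder 22, the image encoder, and the streaming standard formatter are shown as separate components. However, in another embodiment, these components may be implemented by a single processor. Likewise, the streaming parser, the image decoder, and the artificial intelligence decoder 28 may also be implemented by a single processor.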
如上所述,可藉由使用人工智慧模型來縮小及放大影像資料來得到高品質的影像壓縮,且因此可傳輸高畫質影像。 As mentioned above, high-quality image compression can be achieved by using artificial intelligence models to reduce and enlarge image data, and thus high-quality images can be transmitted.
圖3是根據實施例的電子裝置的方塊圖。 FIG. 3 is a block diagram of an electronic device according to an embodiment.
參考圖3,電子裝置100可包括無線通訊介面110及處理器120。
Referring to FIG. 3, the electronic device 100 may include a wireless communication interface 110 and a processor 120.
無線通訊介面110可被配置以根據各種類型的通訊方法來與各種類型的外部設備實行通訊。電子裝置100可使用有線通訊方法或無線通訊方法來與外部設備實行通訊。然而,根據實施例且為易於闡釋起見,在無線通訊方法的情形中,將闡述藉由無線通訊介面110實行通訊,且在有線通訊的情形中,藉由圖4中所示的有線介面140實行通訊。
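The wireless communication interface 110 may be configured to communicate with various types of external apparatuses according to various types of communication methods. The electronic device 100 may communicate with an external apparatus using a wired or wireless communication method. However, according to an embodiment and for ease of explanation, communication using a wireless communication method is described as being performed through the wireless communication interface 110, and communication using a wired communication method is described as being performed through the wired interface 140 shown in FIG. 4.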
無線通訊介面110可使用無線通訊方法(諸如,無線保
真(wireless fidelity,Wi-Fi)、藍芽、近場通訊(Near Field Communication,NFC)等)自外部設備接收影像資料。根據實施例,電子裝置100可藉由接收影像來實行影像處理,所述影像是使用者自設置於電子裝置100中的記憶體130中所儲存的多個影像當中選出。
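The wireless communication interface 110 may receive image data from an external apparatus using a wireless communication method (such as wireless fidelity (Wi-Fi), Bluetooth, or near field communication (NFC)). According to an embodiment, the electronic device 100 may perform image processing on an image that a user selects from among a plurality of images stored in the memory 130 provided in the electronic device 100.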
當電子裝置100被配置以實行無線通訊時,無線通訊介面110可包括Wi-Fi晶片、藍芽晶片、無線通訊晶片、近場通訊晶片等。Wi-Fi晶片或藍芽晶片可分別使用Wi-Fi方法及藍芽方法來實行通訊。當使用Wi-Fi晶片或藍芽晶片時,可首先傳輸及接收諸如服務設定識別符(service set identifier,SSID)及工作階段金鑰等各種連接性資訊,可基於所述連接性資訊來建立通訊連接,且可基於所述通訊連接來傳輸及接收各種資訊。所述無線通訊晶片指代根據各種通訊標準(諸如,電機電子工程師學會(Institute of Electrical and Electronics Engineers,IEEE)標準、紫蜂(ZigBee)、第三代(3G)、第三代合夥專案(Third Generation Partnership Project,3GPP)、長期演化(Long Term Evolution,LTE)、第五代(5G)等)實行通訊的晶片。近場通訊晶片指代在使用各種射頻識別(radio-frequency identification,RFID)頻帶(諸如,135千赫、13.56百萬赫、433百萬赫、860至960百萬赫及2.45吉赫等)當中的13.56百萬赫頻帶的NFC模式中運作的晶片。
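When the electronic device 100 is configured to perform wireless communication, the wireless communication interface 110 may include a Wi-Fi chip, a Bluetooth chip, a wireless communication chip, an NFC chip, and the like. The Wi-Fi chip and the Bluetooth chip may communicate using the Wi-Fi method and the Bluetooth method, respectively. When the Wi-Fi chip or the Bluetooth chip is used, various connectivity information such as a service set identifier (SSID) and a session key may first be transmitted and received, a communication connection may be established based on the connectivity information, and various information may then be transmitted and received over the connection. The wireless communication chip refers to a chip that communicates according to various communication standards (such as Institute of Electrical and Electronics Engineers (IEEE) standards, ZigBee, 3rd Generation (3G), 3rd Generation Partnership Project (3GPP), Long Term Evolution (LTE), and 5th Generation (5G)). The NFC chip refers to a chip that operates in an NFC mode using the 13.56 MHz band among various radio-frequency identification (RFID) bands (such as 135 kHz, 13.56 MHz, 433 MHz, 860 to 960 MHz, and 2.45 GHz).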
電子裝置100可經由無線通訊介面110自外部伺服器接收影像資料以及與應用於用於放大影像資料的人工智慧模型的濾
波器組相關聯的資訊。
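The electronic device 100 may receive, from an external server via the wireless communication interface 110, image data and information associated with a filter bank applied to an artificial intelligence model for upscaling the image data.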
自外部伺服器接收到的影像資料可以是由所述外部伺服器編碼的影像資料。可藉由對經縮小的影像資料進行編碼來獲得經編碼的影像資料,所述經縮小的影像資料是藉由將最初影像資料輸入至人工智慧模型中以進行縮小而獲得。 The image data received from the external server may be image data encoded by the external server. The encoded image data may be obtained by encoding the reduced image data obtained by inputting the original image data into the artificial intelligence model for reduction.
與濾波器組相關聯的資訊(例如,元資料)可與影像資料包含在一起。可自外部伺服器獲得與濾波器組相關聯的資訊,以將藉由用於放大影像資料的人工智慧模型獲得的經放大的影像資料與最初影像資料之間的差異最小化。可針對影像資料的每一圖框獲得與濾波器組相關聯的資訊。將參考圖7更詳細地闡述自外部伺服器獲得與濾波器組相關聯的資訊的過程。 Information associated with the filter bank (eg, metadata) may be included with the image data. Information associated with the filter bank may be obtained from an external server to minimize differences between the amplified image data obtained by the artificial intelligence model used to amplify the image data and the original image data. Information associated with the filter bank can be obtained for each frame of the image data. The process of obtaining information associated with a filter bank from an external server will be explained in more detail with reference to FIG. 7 .
處理器120可控制電子裝置100的總體運作。
The processor 120 may control the overall operation of the electronic device 100.
根據實施例,處理器120可被實施為數位訊號處理器(digital signal processor,DSP)、微處理器或時間控制器(time controller,TCON),但並不僅限於此。處理器120可包括一個或多個中央處理單元(central processing unit,CPU)、微控制器單元(microcontroller unit,MCU)、微處理單元(micro processing unit,MPU)、控制器、應用處理器(application processor,AP)、通訊處理器(communication processor,CP)、高階RISC機器(Advanced RISC Machine,ARM)處理器等,或者可由對應的用語定義。處理器120可被實施為晶片系統(system on chip,SoC)、具有內建處理演算法的大型積體(large scale integration,LSI),或以現場
可程式化閘陣列(Field Programmable Gate Array,FPGA)形式來實施。
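According to an embodiment, the processor 120 may be implemented as a digital signal processor (DSP), a microprocessor, or a time controller (TCON), but is not limited thereto. The processor 120 may include one or more of a central processing unit (CPU), a microcontroller unit (MCU), a micro processing unit (MPU), a controller, an application processor (AP), a communication processor (CP), or an Advanced RISC Machine (ARM) processor, or may be defined by the corresponding term. The processor 120 may be implemented as a system on chip (SoC) or a large scale integration (LSI) with a built-in processing algorithm, or in the form of a field programmable gate array (FPGA).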
處理器120可對經由無線通訊介面110接收到的影像資料進行解碼。處理器120可藉由對自外部伺服器接收到的經編碼的影像資料進行解碼來產生經壓縮影像資料。
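The processor 120 may decode the image data received via the wireless communication interface 110. The processor 120 may generate compressed image data by decoding the encoded image data received from the external server.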
處理器120可藉由將經解碼的影像資料輸入至人工智慧模型中以基於所接收到的與濾波器組相關聯的資訊進行放大來放大經解碼的影像資料。用語「放大」亦可被稱為「壓縮解除」、「影像復原」等。處理器120可應用所接收到的與濾波器組相關聯的資訊來獲得構成人工智慧模型的多個濾波器中的全部或一些。獲得多個濾波器中的一些可意味著構成人工智慧模型的所述多個濾波器中的一些係預設的,且僅基於所接收到的濾波器組資訊獲得其餘濾波器中的一些。
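The processor 120 may upscale the decoded image data by inputting it into an artificial intelligence model for upscaling that is obtained based on the received information associated with the filter bank. The term "upscaling" may also be referred to as "decompression", "image restoration", or the like. The processor 120 may apply the received information associated with the filter bank to obtain all or some of the plurality of filters constituting the artificial intelligence model. Obtaining some of the plurality of filters may mean that some of the filters constituting the artificial intelligence model are preset, and only some of the remaining filters are obtained based on the received filter bank information.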
根據另一實施例,當記憶體130中不儲存有多個濾波器組時,處理器120可控制無線通訊介面110將所接收到的與濾波器組相關聯的資訊傳輸至第二外部伺服器。當經由無線通訊介面110接收到與自第二外部伺服器傳輸而來的濾波器組對應的參數資訊時,處理器120可使用所接收到的參數資訊來獲得放大用的人工智慧模型,並使用所獲得的人工智慧模型來放大經解碼的影像資料。
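According to another embodiment, when a plurality of filter banks are not stored in the memory 130, the processor 120 may control the wireless communication interface 110 to transmit the received information associated with the filter bank to a second external server. When parameter information corresponding to the filter bank is received from the second external server via the wireless communication interface 110, the processor 120 may obtain an artificial intelligence model for upscaling using the received parameter information, and upscale the decoded image data using the obtained model.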
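The fallback just described, in which the device asks a second server for the parameters when no matching bank is stored locally, might be sketched as follows. All names here (`resolve_filter_params`, `local_banks`, `fetch_remote`) are hypothetical, the second server is stood in for by a plain callable, and caching the fetched parameters is an added assumption rather than something the description states.

```python
from typing import Callable, Dict, List

Params = List[float]


def resolve_filter_params(
    index: int,
    local_banks: Dict[int, Params],
    fetch_remote: Callable[[int], Params],
) -> Params:
    """Return the parameters for `index`, preferring the local store."""
    params = local_banks.get(index)
    if params is None:
        # Not stored locally: forward the index to the second external server
        # and keep the answer for later frames (caching is an assumption).
        params = fetch_remote(index)
        local_banks[index] = params
    return params


def fake_second_server(index: int) -> Params:
    # Stand-in for a network request; returns dummy parameters.
    return [0.1 * index, 0.2 * index]


local = {1: [0.5, 0.5]}
first = resolve_filter_params(1, local, fake_second_server)   # served locally
second = resolve_filter_params(2, local, fake_second_server)  # fetched remotely
```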
處理器120可提供經放大的影像資料以供輸出。處理器120可控制無線通訊介面110將經放大的影像資料提供至外部顯示
設備。參考圖4,當電子裝置100中設置有顯示器150時,處理器120可控制顯示器150顯示經放大的影像資料。
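The processor 120 may provide the upscaled image data for output. The processor 120 may control the wireless communication interface 110 to provide the upscaled image data to an external display apparatus. Referring to FIG. 4, when a display 150 is provided in the electronic device 100, the processor 120 may control the display 150 to display the upscaled image data.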
如上所述,根據實施例的電子裝置100可自外部伺服器接收與經編碼的影像資料及與改良的濾波器組相關聯的資訊,藉此在復原影像資料期間縮短時間且減少資源消耗。因此,可實現影像的高壓縮率,且可將高畫質影像壓縮成大小較小的影像資料並進行傳輸。
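As described above, the electronic device 100 according to an embodiment may receive, from the external server, the encoded image data and the information associated with the improved filter bank, thereby shortening the time and reducing the resources consumed in restoring the image data. Accordingly, a high compression ratio can be achieved, and a high-quality image can be compressed into smaller image data and transmitted.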
圖4是根據實施例的電子裝置的方塊圖。 FIG. 4 is a block diagram of an electronic device according to an embodiment.
參考圖4,電子裝置100可包括無線通訊介面110、處理器120、記憶體130、有線介面140、顯示器150、視訊處理器160、音訊處理器170及音訊輸出組件180等。
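Referring to FIG. 4, the electronic device 100 may include a wireless communication interface 110, a processor 120, a memory 130, a wired interface 140, a display 150, a video processor 160, an audio processor 170, an audio output component 180, and the like.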
無線通訊介面110及處理器120可包括與圖3中所述的配置相同的配置,且因此將不再贅述。
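The wireless communication interface 110 and the processor 120 may have the same configurations as those described with reference to FIG. 3, and thus redundant description is omitted.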
記憶體130可儲存用於操作電子裝置100的各種程式及資料。具體而言,至少一個命令可儲存於記憶體130中。處理器120可藉由執行記憶體130中所儲存的命令來實行本文中所述的操作。記憶體130可體現為非揮發性記憶體、揮發性記憶體、快閃記憶體、硬碟驅動器(hard disk drive,HDD)、固態驅動器(solid state drive,SDD)等。
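The memory 130 may store various programs and data for operating the electronic device 100. Specifically, at least one command may be stored in the memory 130. The processor 120 may perform the operations described herein by executing the commands stored in the memory 130. The memory 130 may be embodied as a non-volatile memory, a volatile memory, a flash memory, a hard disk drive (HDD), a solid state drive (SSD), or the like.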
經過訓練的人工智慧模型可儲存於記憶體130中。經過訓練的人工智慧模型可包括用於放大影像資料的多個層。所述多個層中的每一者可包括多個濾波器。所述多個濾波器中的每一者
可包含多個參數。舉例而言,經過訓練的人工智慧模型可以是卷積神經網路(CNN)。
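A trained artificial intelligence model may be stored in the memory 130. The trained artificial intelligence model may include a plurality of layers for upscaling image data. Each of the plurality of layers may include a plurality of filters. Each of the plurality of filters may include a plurality of parameters. For example, the trained artificial intelligence model may be a convolutional neural network (CNN).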
所述多個濾波器可各自應用於影像資料的整個圖框。根據實施例,可使用對影像資料的每一圖框應用相同的參數的濾波器,但可使用對每一圖框應用不同的參數的濾波器來放大影像資料。 Each of the plurality of filters may be applied to an entire frame of image data. According to embodiments, a filter that applies the same parameters to each frame of the image data may be used, but a filter that applies different parameters to each frame may be used to amplify the image data.
多個經過訓練的濾波器組可儲存於記憶體130中。具體而言,每一濾波器組可包括多個參數,且所述多個經過訓練的參數可儲存於記憶體130中。可基於個別給出的索引資訊來區分所述多個濾波器組。可提前訓練記憶體130中所儲存的濾波器組,以使得可獲得與在輸入影像資料被縮小之後的最初影像資料最類似的經放大的影像資料。輸入影像資料可以是各種類別的影像資料。將參考圖13更詳細地闡述訓練濾波器組的示例性實施例。
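A plurality of trained filter banks may be stored in the memory 130. Specifically, each filter bank may include a plurality of parameters, and the plurality of trained parameters may be stored in the memory 130. The plurality of filter banks may be distinguished based on individually assigned index information. The filter banks stored in the memory 130 may be trained in advance so that, after input image data is downscaled, upscaled image data most similar to the original image data can be obtained. The input image data may be image data of various categories. An exemplary embodiment of training the filter banks will be described in more detail with reference to FIG. 13.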
可在處理器120的控制下實行根據實施例的人工智慧模型的操作。
Operations of the artificial intelligence model according to an embodiment may be performed under the control of the processor 120.
處理器120可獲得人工智慧模型,所述人工智慧模型基於所接收到的與濾波器組相關聯的資訊中所包含的索引資訊來應用記憶體130中所儲存的多個濾波器組中的一者。處理器120可藉由將經解碼的影像資料輸入至應用與所接收到的索引資訊對應的濾波器組的人工智慧模型中來放大經解碼的影像資料。
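The processor 120 may obtain an artificial intelligence model to which one of the plurality of filter banks stored in the memory 130 is applied, based on the index information included in the received information associated with the filter bank. The processor 120 may upscale the decoded image data by inputting it into the artificial intelligence model to which the filter bank corresponding to the received index information is applied.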
處理器120可根據記憶體130中所儲存的人工智慧模型的目的來實行各種操作。
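The processor 120 may perform various operations according to the purpose of the artificial intelligence model stored in the memory 130.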
舉例而言,若人工智慧模型與影像辨識及/或視覺領悟(諸如,如同被人類感知到一樣辨識並處理物體的技術)有關,則處理器120可使用人工智慧模型來對輸入影像實行物體辨識、物體追蹤、影像搜尋、人類辨識、景物理解、空間理解、影像強化等。
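For example, if the artificial intelligence model relates to image recognition and/or visual understanding (such as a technology for recognizing and processing objects as humans perceive them), the processor 120 may use the artificial intelligence model to perform object recognition, object tracking, image search, human recognition, scene understanding, spatial understanding, image enhancement, and the like on an input image.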
舉另一實例,若人工智慧模型與資訊推薦及/或推斷預測(諸如,判定並且在邏輯上推斷並預測資訊的技術)有關,則處理器120可使用人工智慧模型來實行基於知識/概率的推理、最佳化預測、基於偏好的計劃及推薦。
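As another example, if the artificial intelligence model relates to information recommendation and/or inference and prediction (such as a technology for judging information and logically inferring and predicting it), the processor 120 may use the artificial intelligence model to perform knowledge/probability-based reasoning, optimization prediction, preference-based planning, and recommendation.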
舉另一實例,若人工智慧模型與查詢處理及/或知識表示(諸如,將人類經驗資訊自動轉換成知識資料的技術),則處理器120可使用人工智慧模型來實行知識建構(例如,資料產生及分類)及知識管理(資料利用)。
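As another example, if the artificial intelligence model relates to query processing and/or knowledge representation (such as a technology for automatically converting human experience information into knowledge data), the processor 120 may use the artificial intelligence model to perform knowledge construction (for example, data generation and classification) and knowledge management (for example, data utilization).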
如上所述,藉由獲得基於所接收到的與影像資料及濾波器組相關聯的資訊來放大影像資料的人工智慧模型,且藉由將經解碼的影像資料輸入至所獲得的人工智慧模型中來獲得經放大的影像資料,即使使用少量的時間及資源仍可獲得改良的復原影像。 As described above, by obtaining an artificial intelligence model that amplifies the image data based on the received information associated with the image data and the filter bank, and by inputting the decoded image data into the obtained artificial intelligence model To obtain magnified image data, improved restored images can be obtained even with a small amount of time and resources.
有線介面140可被配置以使用有線通訊方法將電子裝置100連接至外部設備。有線介面140可經由有線通訊方法(諸如,纜線或埠)輸入及輸出音訊訊號及視訊訊號中的至少一者。
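The wired interface 140 may be configured to connect the electronic device 100 to an external apparatus using a wired communication method. The wired interface 140 may input and output at least one of an audio signal and a video signal via a wired communication means (such as a cable or a port).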
有線介面140可以是顯示器埠、高畫質多媒體介面(high definition multimedia interface,HDMI)、數位視覺介面(digital visual interface,DVI)、紅綠藍(red green blue,RGB)介面、D
超小型(D-subminiature,DSUB)介面、S視訊介面、複合視訊介面、通用串列匯流排(universal serial bus,USB)、雷電型埠(Thunderbolt type port)等。
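The wired interface 140 may be a DisplayPort, a high definition multimedia interface (HDMI), a digital visual interface (DVI), a red green blue (RGB) interface, a D-subminiature (DSUB) interface, an S-Video interface, a composite video interface, a universal serial bus (USB), a Thunderbolt type port, or the like.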
顯示器150可顯示經放大的影像資料。可藉由經過訓練的人工智慧模型來放大在顯示器150上顯示的影像。根據人工智慧模型的目的,可在顯示器150上顯示影像中所包含的物體,且可顯示物體的種類。
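The display 150 may display the upscaled image data. The image displayed on the display 150 may be upscaled by the trained artificial intelligence model. Depending on the purpose of the artificial intelligence model, an object included in the image may be displayed on the display 150, and the type of the object may be displayed.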
顯示器150可被實施為各種類型的顯示器,諸如發光二極體(light emitting diode,LED)、液晶顯示器(liquid crystal display,LCD)、有機發光二極體(organic light emitting diode,OLED)顯示器、電漿顯示面板(plasma display panel,PDP)等。顯示器150更可包括驅動電路、背光單元等,可以非晶矽(amorphous-silicon,a-Si)薄膜電晶體(thin film transistor,TFT)、低溫多晶矽(low temperature poly silicon,LTPS)薄膜電晶體、有機薄膜電晶體(organic TFT,OTFT)等形式來實施所述驅動電路。與此同時,顯示器150可被實施為撓性顯示器。
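The display 150 may be implemented as various types of displays, such as a light emitting diode (LED) display, a liquid crystal display (LCD), an organic light emitting diode (OLED) display, or a plasma display panel (PDP). The display 150 may further include a driving circuit, a backlight unit, and the like, and the driving circuit may be implemented in the form of an amorphous silicon (a-Si) thin film transistor (TFT), a low temperature poly silicon (LTPS) TFT, an organic TFT (OTFT), or the like. The display 150 may also be implemented as a flexible display.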
顯示器150可包括用於偵測使用者的觸控手勢的觸控感測器。觸控感測器可體現為各種類型的感測器,諸如靜電型、壓敏型、壓電型等。當影像處理裝置100支援筆輸入功能時,顯示器150可使用輸入方式(諸如,筆及使用者手指)來偵測使用者手勢。當所述輸入方式是其中包括線圈的手寫筆時,影像處理裝置100可包括磁場感測器,所述磁場感測器能夠感測被手寫筆中
的線圈改變的磁場。因此,顯示器150可偵測近處的手勢,亦即懸停及觸控手勢。
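The display 150 may include a touch sensor for detecting a user's touch gesture. The touch sensor may be embodied as various types of sensors, such as an electrostatic type, a pressure-sensitive type, or a piezoelectric type. When the image processing device 100 supports a pen input function, the display 150 may detect a user gesture made with an input means such as a pen as well as with the user's finger. When the input means is a stylus pen that includes a coil, the image processing device 100 may include a magnetic field sensor capable of sensing the magnetic field changed by the coil in the stylus pen. Accordingly, the display 150 may detect a proximity gesture, that is, hovering, as well as a touch gesture.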
儘管已闡述了顯示功能及手勢偵測功能是以與如上所述的相同的配置來實行,但可以不同的配置實行所述功能。根據各種實施例,電子裝置100中可不包括顯示器150。舉例而言,當電子裝置100是機頂盒或伺服器時,可不設置有顯示器150。在此種情形中,可經由無線通訊介面110或有線介面140將經放大的影像資料傳輸至外部顯示設備。
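Although the display function and the gesture detection function have been described as being performed by the same component, these functions may be performed by different components. According to various embodiments, the display 150 may not be included in the electronic device 100. For example, when the electronic device 100 is a set-top box or a server, the display 150 may not be provided. In this case, the upscaled image data may be transmitted to an external display apparatus via the wireless communication interface 110 or the wired interface 140.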
處理器120可包括隨機存取記憶體(random access memory,RAM)121、唯讀記憶體(read-only memory,ROM)122、中央處理單元123、圖形處理單元(Graphic Processing Unit,GPU)124及匯流排125。隨機存取記憶體121、唯讀記憶體122、中央處理單元123、圖形處理單元(GPU)124等可經由匯流排125彼此連接。
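The processor 120 may include a random access memory (RAM) 121, a read-only memory (ROM) 122, a central processing unit (CPU) 123, a graphics processing unit (GPU) 124, and a bus 125. The RAM 121, the ROM 122, the CPU 123, the GPU 124, and the like may be connected to one another via the bus 125.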
中央處理單元123可存取記憶體130且使用記憶體130中所儲存的作業系統(operating system,O/S)來實行引導(booting)。中央處理單元123可使用記憶體130中所儲存的各種程式、內容、資料等來實行各種操作。
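The CPU 123 may access the memory 130 and perform booting using an operating system (O/S) stored in the memory 130. The CPU 123 may perform various operations using various programs, content, data, and the like stored in the memory 130.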
用於系統引導的命令集等可儲存於唯讀記憶體122中。當輸入接通命令且供電時,中央處理單元123可根據唯讀記憶體122中所儲存的命令將記憶體130中所儲存的作業系統複製至隨機存取記憶體121,執行所述作業系統並實行系統引導。當完成系
統引導時,中央處理單元123可將記憶體130中所儲存的各種程式複製至隨機存取記憶體121,執行複製至隨機存取記憶體121的各種程式,並實行各種操作。
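A command set for system booting and the like may be stored in the ROM 122. When a turn-on command is input and power is supplied, the CPU 123 may copy the operating system stored in the memory 130 to the RAM 121 according to the commands stored in the ROM 122, execute the operating system, and boot the system. When booting is complete, the CPU 123 may copy various programs stored in the memory 130 to the RAM 121, execute the programs copied to the RAM 121, and perform various operations.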
當完成對電子裝置100的引導時,圖形處理單元124可在顯示器150上顯示使用者介面(user interface,UI)。舉例而言,圖形處理單元124可使用計算單元(未示出)及呈現單元(未示出)來產生包含各種物件(諸如,圖標、影像、文字等)的螢幕。計算單元可根據螢幕的佈局來計算物件的屬性值,諸如座標值、形狀、大小、色彩等。呈現單元可基於計算單元所計算的屬性值來產生包括物件的各種佈局的螢幕。可將呈現單元所產生的螢幕(或UI視窗)提供至顯示器150,並在主顯示區域及次顯示區域中予以顯示。
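When booting of the electronic device 100 is complete, the GPU 124 may display a user interface (UI) on the display 150. For example, the GPU 124 may generate a screen containing various objects (such as icons, images, and text) using a calculation unit (not shown) and a rendering unit (not shown). The calculation unit may calculate attribute values of the objects, such as coordinates, shape, size, and color, according to the layout of the screen. The rendering unit may generate screens of various layouts that include the objects, based on the attribute values calculated by the calculation unit. The screen (or UI window) generated by the rendering unit may be provided to the display 150 and displayed in a main display area and a sub display area.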
視訊處理器160可被配置以處理經由無線通訊介面110或有線介面140接收到的內容或處理記憶體130中所儲存的內容中所包含的視訊資料。視訊處理器160可對視訊資料實行各種影像處理,諸如解碼、縮放、雜訊濾除、圖框率轉換、解析度轉換等。
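The video processor 160 may be configured to process video data included in content received via the wireless communication interface 110 or the wired interface 140, or in content stored in the memory 130. The video processor 160 may perform various image processing on the video data, such as decoding, scaling, noise filtering, frame rate conversion, and resolution conversion.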
音訊處理器170可被配置以處理經由無線通訊介面110或有線介面140接收到的內容或處理記憶體130中所儲存的內容中所包含的音訊資料。音訊處理器170可對音訊資料實行各種處理,諸如解碼、擴大(amplification)、雜訊濾除等。
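The audio processor 170 may be configured to process audio data included in content received via the wireless communication interface 110 or the wired interface 140, or in content stored in the memory 130. The audio processor 170 may perform various processing on the audio data, such as decoding, amplification, and noise filtering.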
當執行關於多媒體內容的再現應用時,處理器120可驅
動視訊處理器160及音訊處理器170來再現內容。顯示器150可在主顯示區域及次顯示區域中的至少一者上顯示由視訊處理器160產生的影像圖框。
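When a playback application for multimedia content is executed, the processor 120 may drive the video processor 160 and the audio processor 170 to play back the content. The display 150 may display an image frame generated by the video processor 160 on at least one of the main display area and the sub display area.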
如上所述,處理器120、視訊處理器160及音訊處理器170被闡述為單獨的組件。然而,根據實施例,前述組件體現為單個晶片。舉例而言,處理器120可用作視訊處理器160及音訊處理器170。
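As described above, the processor 120, the video processor 160, and the audio processor 170 are described as separate components. However, according to an embodiment, these components may be embodied as a single chip. For example, the processor 120 may serve as both the video processor 160 and the audio processor 170.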
音訊輸出組件180可輸出由音訊處理器170產生的音訊資料。
The audio output component 180 may output audio data generated by the audio processor 170.
儘管圖4中未示出,但根據實施例,電子裝置100更可包括用於與各種外部端子或周邊裝置連接的各種外部輸入埠,諸如耳機(headset)、滑鼠等、接收並處理數位多媒體廣播(digital multimedia broadcasting,DMB)訊號的DMB晶片、用於接收使用者操作的按鈕、用於接收將被轉換成音訊資料的使用者語音或聲音的麥克風、用於根據使用者、各種感測器的控制拍攝靜止影像或視訊的拍攝單元(例如,照相機)等。
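Although not shown in FIG. 4, according to an embodiment, the electronic device 100 may further include various external input ports for connecting to various external terminals or peripheral devices such as a headset or a mouse, a digital multimedia broadcasting (DMB) chip that receives and processes DMB signals, buttons for receiving user operations, a microphone for receiving a user's voice or other sound to be converted into audio data, a capturing unit (for example, a camera) for capturing still images or video under the user's control, various sensors, and the like.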
圖5是根據實施例的伺服器的方塊圖。 Figure 5 is a block diagram of a server according to an embodiment.
參考圖5,伺服器200可包括通訊介面210、記憶體220及處理器230。
Referring to FIG. 5, the server 200 may include a communication interface 210, a memory 220, and a processor 230.
通訊介面210可使用有線通訊方法或無線通訊方法來實行與外部設備的通訊。
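The communication interface 210 may communicate with an external apparatus using a wired or wireless communication method.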
通訊介面210可以無線方式(諸如,無線區域網路
(wireless local-area network,LAN)、藍芽等)連接至外部設備。通訊介面210可使用Wi-Fi、紫蜂、紅外線(Infrared ray,IrDA)等連接至外部設備。通訊介面210可包括有線通訊方法的連接埠。
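The communication interface 210 may be connected to an external apparatus wirelessly (for example, via a wireless local-area network (LAN) or Bluetooth). The communication interface 210 may be connected to an external apparatus using Wi-Fi, ZigBee, infrared (IrDA), or the like. The communication interface 210 may include a connection port for a wired communication method.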
伺服器200可經由通訊介面210將經編碼的影像資料傳輸至電子裝置100。傳輸至電子裝置100的影像資料可包含與用於實現改良的影像復原的濾波器組相關聯的資訊。處理器230可獲得與濾波器組相關聯的資訊。
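The server 200 may transmit the encoded image data to the electronic device 100 via the communication interface 210. The image data transmitted to the electronic device 100 may include information associated with a filter bank for achieving improved image restoration. The processor 230 may obtain the information associated with the filter bank.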
記憶體220可儲存用於操作伺服器200的各種程式及資料。至少一個命令可儲存於記憶體220中。處理器230可藉由執行記憶體220中所儲存的命令來實行上述操作。
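The memory 220 may store various programs and data for operating the server 200. At least one command may be stored in the memory 220. The processor 230 may perform the operations described above by executing the commands stored in the memory 220.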
記憶體220可儲存經過訓練的人工智慧模型。具體而言,用於縮小影像資料的經過訓練的人工智慧模型可包括用於縮小的多個層。所述多個層中的每一者可包括多個濾波器。所述多個濾波器中的每一者可包含多個參數。
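The memory 220 may store a trained artificial intelligence model. Specifically, the trained artificial intelligence model for downscaling image data may include a plurality of layers for downscaling. Each of the plurality of layers may include a plurality of filters. Each of the plurality of filters may include a plurality of parameters.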
用於放大影像資料的經過訓練的人工智慧模型可儲存於記憶體220中。用於放大影像資料的經過訓練的人工智慧模型可包括用於放大的多個層。所述多個層中的每一者可包括多個濾波器。所述多個濾波器中的每一者可包含多個參數。
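The trained artificial intelligence model for upscaling image data may be stored in the memory 220. The trained artificial intelligence model for upscaling image data may include a plurality of layers for upscaling. Each of the plurality of layers may include a plurality of filters. Each of the plurality of filters may include a plurality of parameters.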
縮小用或放大用的人工智慧模型可以是卷積神經網路。放大用的人工智慧模型的數目可小於縮小用的人工智慧模型的數目。 The artificial intelligence model used for reduction or enlargement can be a convolutional neural network. The number of artificial intelligence models used for upscaling may be smaller than the number of artificial intelligence models used for downscaling.
所述多個濾波器中的每一者可應用於影像資料的整個圖框。具體而言,根據實施例,可使用對影像資料的每一圖框應用相同的參數的濾波器,但可使用對每一圖框應用不同的參數的濾波器來放大影像資料。 Each of the plurality of filters may be applied to an entire frame of the image data. Specifically, according to an embodiment, a filter that applies the same parameters to every frame of the image data may be used, or the image data may be upscaled using a filter that applies different parameters to each frame.
記憶體220可儲存多個經過訓練的濾波器組。記憶體220中所儲存的所述多個濾波器組可包括應用於用於放大影像資料的人工智慧模型的濾波器組。具體而言,濾波器組可包含多個參數,且記憶體220可包含多個經過訓練的參數。可基於個別給出的索引資訊來區分所述多個濾波器組。可提前訓練記憶體220中所儲存的濾波器組,以將經放大的影像資料與在輸入影像資料被縮小之後的最初影像資料之間的差異最小化。在此種情形中,可使用通用類似度分析方法(例如,峰值訊雜比(peak signal to noise ratio,PSNR)、結構類似度(structural similarity,SSIM)等)來識別經放大的影像資料與最初影像資料之間的差異。輸入影像資料可以是各種類別的影像資料。將參考圖13更詳細地闡述濾波器組的示例性實施例。
The memory 220 may store a plurality of trained filter banks. The plurality of filter banks stored in the memory 220 may include filter banks applied to the artificial intelligence model for upscaling image data. Specifically, a filter bank may include a plurality of parameters, and the memory 220 may contain a plurality of trained parameters. The plurality of filter banks may be distinguished based on individually assigned index information. The filter banks stored in the memory 220 may be trained in advance so as to minimize the difference between the upscaled image data and the original image data after the input image data is downscaled. In this case, the difference between the upscaled image data and the original image data may be identified using a general similarity analysis method (for example, peak signal-to-noise ratio (PSNR) or structural similarity (SSIM)). The input image data may be image data of various categories. An exemplary embodiment of the filter bank will be described in more detail with reference to FIG. 13.
可在處理器230的控制下實行人工智慧模型的操作。
The operation of the artificial intelligence model may be performed under the control of the processor 230.
處理器230可藉由將經縮小的影像資料輸入至應用多個濾波器組中的每一者的人工智慧模型中來獲得多個經放大的影像資料。處理器230可藉由同時獲取多個人工智慧模型來獲得所述多個經放大的影像資料。處理器230可使用應用所述多個濾波器組中的一者的人工智慧模型來獲得一個經放大的影像資料,且然後藉由依序地改變所應用的濾波器組來獲得經放大的影像資料。
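The processor 230 may obtain a plurality of upscaled image data by inputting the downscaled image data into an artificial intelligence model to which each of the plurality of filter banks is applied. The processor 230 may obtain the plurality of upscaled image data by running a plurality of artificial intelligence models simultaneously. Alternatively, the processor 230 may obtain one upscaled image data using an artificial intelligence model to which one of the plurality of filter banks is applied, and then obtain further upscaled image data by sequentially changing the applied filter bank.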
處理器230可在所述多個所獲得的經放大的影像資料當中識別相較於最初影像資料具有最小差異(例如,最小損耗)的經放大的影像資料。處理器230可使用通用類似度分析方法(例如PSNR、SSIM等)來對最初影像資料與所獲得的經放大的影像資料進行比較。
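The processor 230 may identify, among the plurality of obtained upscaled image data, the upscaled image data having the smallest difference (for example, the smallest loss) from the original image data. The processor 230 may compare the original image data with the obtained upscaled image data using a general similarity analysis method (for example, PSNR or SSIM).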
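The comparison step above can be sketched with PSNR as the similarity measure. This is a minimal illustration, assuming 8-bit pixel values held in flat lists; the names `psnr`, `best_filter_index`, and the sample candidates are hypothetical, and a real encoder would run the upscaling model once per filter bank to produce the candidates.

```python
import math


def psnr(original, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    mse = sum((o - c) ** 2 for o, c in zip(original, candidate)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def best_filter_index(original, upscaled_by_index):
    """Return the index whose upscaled result differs least from the original."""
    return max(upscaled_by_index,
               key=lambda idx: psnr(original, upscaled_by_index[idx]))


original = [10, 20, 30, 40]
# One upscaled candidate per filter bank index (dummy values for illustration).
candidates = {
    1: [12, 18, 33, 41],
    2: [10, 21, 30, 40],
    3: [0, 0, 0, 0],
}
best = best_filter_index(original, candidates)  # index 2 has the smallest error
```

Only the winning index needs to be carried in the stream, since the device is assumed to hold the same banks.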
處理器230可藉由包含與應用於所識別的影像資料的濾波器組相關聯的資訊(例如,元資料)來對經縮小的影像進行編碼。與濾波器組相關聯的資訊可以是濾波器組的索引資訊。處理器230可將與濾波器組相關聯的資訊包含於補充強化資訊(supplemental enhancement information,SEI)資料中,所述補充強化資訊資料被附加至藉由對經縮小的影像進行編碼產生的位元串流。SEI資料可提供與影像資料的解析度、位元率、圖框率等相關聯的資訊。
The processor 230 may encode the downscaled image together with information (for example, metadata) associated with the filter bank applied to the identified image data. The information associated with the filter bank may be index information of the filter bank. The processor 230 may include the information associated with the filter bank in supplemental enhancement information (SEI) data, which is appended to the bitstream generated by encoding the downscaled image. The SEI data may provide information associated with the resolution, bit rate, frame rate, and the like of the image data.
處理器230可經由通訊介面210將經編碼的影像資料傳輸至外部電子裝置。與用於改良影像復原的濾波器組相關聯的資訊可與所傳輸的影像資料包含在一起。
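The processor 230 may transmit the encoded image data to an external electronic device via the communication interface 210. Information associated with the filter bank for improved image restoration may be included with the transmitted image data.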
如上所述,藉由提前在由伺服器進行的編碼過程中通過多個放大過程識別改良的放大濾波器組,可將電子裝置中的復原過程簡化。因此,可在影像流式傳輸環境中實現高壓縮率,且因此可傳輸高畫質影像。 As mentioned above, by identifying the improved amplification filter bank through multiple amplification processes in advance during the encoding process by the server, the recovery process in the electronic device can be simplified. Therefore, a high compression rate can be achieved in an image streaming environment, and therefore high-quality images can be transmitted.
圖6是根據實施例的伺服器的影像編碼操作的流程圖。 Figure 6 is a flowchart of an image encoding operation of a server according to an embodiment.
參考圖6,在操作S610處,伺服器可藉由將最初影像資料輸入至用於縮小影像資料的人工智慧縮小模型中來獲得經縮小的影像資料。可提前設定縮小壓縮率。舉例而言,參考圖2,可使用1/4壓縮率來壓縮大小為2N×2M的影像,以將所述影像縮小成大小為N×M的影像。然而,壓縮率並不僅限於此。 Referring to FIG. 6, at operation S610, the server may obtain downscaled image data by inputting original image data into an artificial intelligence downscaling model for downscaling image data. The downscaling compression ratio may be set in advance. For example, referring to FIG. 2, an image of size 2N×2M may be compressed at a compression ratio of 1/4 to downscale it to an image of size N×M. However, the compression ratio is not limited thereto.
在操作S620處,伺服器可藉由將經縮小的影像資料輸入至多個人工智慧放大模型中來獲得多個經放大的影像資料,所述人工智慧放大模型應用為了放大經縮小的影像資料而訓練的多個濾波器組中的每一者。 At operation S620, the server may obtain a plurality of enlarged image data by inputting the reduced image data into a plurality of artificial intelligence enlargement models trained for enlarging the reduced image data. each of multiple filter banks.
可提前對所述多個濾波器組進行放大影像資料的訓練,並將所述多個濾波器組儲存於伺服器中。另外,可將與伺服器中所儲存的所述多個濾波器組中的一個濾波器組相同的濾波器組儲存於外部電子裝置中。 The plurality of filter groups can be trained in advance to amplify image data, and the plurality of filter groups can be stored in a server. Additionally, the same filter set as one of the plurality of filter sets stored in the server may be stored in the external electronic device.
伺服器可使用應用多個濾波器組中的每一者的多個人工智慧放大模型來獲得多個經放大的影像資料。另外,可依序地獲得多個經放大的影像資料,諸如使用應用多個濾波器組中的一者的人工智慧放大模型來獲得一個經放大的影像資料,且然後藉由改變應用於人工智慧放大模型的濾波器組來獲得另一經放大的影像資料。 The server may obtain multiple upscaled image data using multiple artificial intelligence magnification models applying each of multiple filter banks. Additionally, multiple magnified image data may be obtained sequentially, such as using an artificial intelligence magnification model that applies one of multiple filter banks to obtain one magnified image data, and then by changing the applied artificial intelligence Magnify the model's filter bank to obtain another magnified image data.
在操作S630處,伺服器可藉由添加與人工智慧放大模型的濾波器組相關聯的資訊來對經縮小的影像資料進行編碼,所述人工智慧放大模型輸出在所述多個經放大的影像資料當中與最初影像資料具有最小差異的影像資料。 At operation S630, the server may encode the downscaled image data while adding information associated with the filter bank of the artificial intelligence upscaling model that outputs, among the plurality of upscaled image data, the image data having the smallest difference from the original image data.
具體而言,伺服器可使用類似度分析方法來對所述多個經放大的影像資料中的每一者與最初影像資料進行比較,並識別相較於最初影像資料具有最小差異(例如,最小損耗值等)的經放大的影像資料。伺服器可識別與應用於輸出所識別的經放大的影像資料的人工智慧放大模型的濾波器組相關聯的資訊。伺服器可藉由添加與濾波器組的所識別資訊相關聯的資訊來對經縮小的影像資料進行編碼。與所述濾波器組相關聯的資訊可包含於SEI中,且可包含濾波器組的索引資訊。可藉由對大小受到壓縮的經縮小的影像資料進行編碼來獲得經編碼的影像資料,且經編碼的影像資料的大小可小於最初影像資料。 Specifically, the server may use a similarity analysis method to compare each of the plurality of amplified image data with the original image data and identify a minimum difference compared to the original image data (e.g., a minimum loss value, etc.) enlarged image data. The server can identify information associated with a filter bank applied to an artificial intelligence magnification model that outputs the identified magnified image data. The server may encode the reduced image data by adding information associated with the identified information of the filter bank. Information associated with the filter bank may be included in the SEI and may include index information for the filter bank. The encoded image data may be obtained by encoding reduced image data whose size is compressed, and the size of the encoded image data may be smaller than the original image data.
在操作S640處,伺服器可將經編碼的影像資料傳輸至外部電子裝置。伺服器可將包含有與所述濾波器組相關聯的資訊的經編碼的影像資料傳輸至外部電子裝置,並使用改良的濾波器組來實行解碼及放大。因此,可在流式傳輸環境中傳輸高品質影像。 At operation S640, the server may transmit the encoded image data to the external electronic device. Because the server transmits the encoded image data together with the information associated with the filter bank, decoding and upscaling can be performed using the improved filter bank. Therefore, high-quality images can be transmitted in a streaming environment.
圖7是根據實施例的伺服器的影像編碼操作的流程圖。圖7的操作710至操作740可與結合圖2所述的人工智慧編碼器22的操作對應。然而,為易於闡釋起見,將其作為伺服器的操作加以闡述。
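FIG. 7 is a flowchart of an image encoding operation of a server according to an embodiment. Operations S710 to S740 of FIG. 7 may correspond to the operations of the artificial intelligence encoder 22 described with reference to FIG. 2. However, for ease of explanation, they are described as operations of the server.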
參考圖7,在操作S710處,伺服器可將影像內容源71輸入至人工智慧縮小模型。舉例而言,影像內容源71可以是大小為2N×2M的最初影像資料。人工智慧縮小模型可包括多個卷積濾波器,且訓練可已經完成。 Referring to FIG. 7, at operation S710, the server may input an image content source 71 into an artificial intelligence downscaling model. For example, the image content source 71 may be original image data of size 2N×2M. The artificial intelligence downscaling model may include a plurality of convolution filters and may already have been trained.
伺服器可藉由允許影像內容源71通過人工智慧縮小模型來獲得經縮小的影像72。經縮小的影像72可具有N×M大小,所述N×M大小是最初影像內容源71的1/4大小。影像的大小可與解析度對應。然而,壓縮率1/4是示範性的,且可藉由訓練獲得改良的壓縮率。
The server may obtain the reduced
For example, the AI downscaling model may obtain a feature map of each frame by passing the input image content source 71 through a convolution layer, and may obtain a compressed image by passing the obtained feature map through a pooling layer. The pooling layer may divide the input feature map into predetermined grids and output a feature map that collects the representative value obtained for each grid. The feature map output from the pooling layer may be smaller than the feature map input to it. The representative value of each grid may be the maximum value contained in that grid or the average value of that grid.
The AI downscaling model may compress the image by repeating the operations of the convolution layer and the pooling layer. The compression ratio may increase as the number of convolution and pooling layers increases. The compression ratio may also increase as the size of the grid from which the pooling layer takes a representative value increases.
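The pooling step described above can be illustrated with a minimal sketch. The function name `pool2d`, its parameters, and the 4×4 sample feature map are assumptions made for illustration; the description does not specify an implementation, and a trained model would apply convolution filters before pooling.

```python
from statistics import mean


def pool2d(feature_map, grid=2, mode="max"):
    """Keep one representative value (max or mean) per grid x grid cell."""
    reduce = max if mode == "max" else mean
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    # Step over the map one grid cell at a time, ignoring any ragged border.
    for r in range(0, rows - rows % grid, grid):
        out_row = []
        for c in range(0, cols - cols % grid, grid):
            cell = [feature_map[r + dr][c + dc]
                    for dr in range(grid) for dc in range(grid)]
            out_row.append(reduce(cell))
        pooled.append(out_row)
    return pooled


# Hypothetical 4x4 feature map; a 2x2 grid halves each dimension.
fmap = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
max_pooled = pool2d(fmap, grid=2, mode="max")   # [[4, 8], [12, 16]]
avg_pooled = pool2d(fmap, grid=2, mode="mean")  # [[2.5, 6.5], [10.5, 14.5]]
```

A larger `grid` value produces a smaller output, which matches the statement that a larger grid raises the compression ratio.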
At operation S750, the server may transmit raw video data and an AI flag 73 to a standard encoder, the raw video data having a size of N×M obtained by downscaling at a compression ratio of 1/4. The N×M raw video data downscaled at the 1/4 compression ratio may be identical to the downscaled image 72. The AI flag 73 may indicate whether AI downscaling was performed. If the AI flag 73 is set to a value of 1, it may indicate that AI downscaling was performed.
At operation S720, the server may determine whether the multi-AI-filter option is used. If the server determines at operation S720 that the option is not used (e.g., S720-No), then at operation S750 the server may transmit a filter index value of NULL to the standard encoder; a filter index of NULL means that no filter index is used.
If the server determines at operation S720 that the multi-AI-filter option is used (e.g., S720-Yes), then at operation S730 the server may input the downscaled image to a plurality of stored AI upscaling models. The plurality of AI upscaling models may be obtained by applying each of a plurality of filter sets to an AI model. Specifically, each of the plurality of filter sets may include a plurality of layers, as shown in FIGS. 8 and 17. Each layer may be a convolution filter, and its training may already have been completed.
For ease of explanation, FIG. 7 illustrates the use of n AI upscaling models to which filter sets with indices 0, 1, 2, and n are applied, but in other embodiments the number of AI upscaling models and the index information may differ. A higher compression ratio may be achieved by performing restoration in advance through the multi-AI-filter function.
At operation S740, the server may select the filter index of the AI upscaling model that outputs the upscaled image having the smallest difference from the image content source. Specifically, at operation S730 the server may input the downscaled image 72 obtained from the AI downscaling model to each AI upscaling model and obtain an upscaled image from each AI upscaling model.
The server may compare each upscaled image obtained from the AI upscaling models with the image content source 71, which is the original image, and identify the AI upscaling model that outputs the upscaled image having the smallest difference from the original image content source (e.g., the upscaled image with the smallest loss). At operation S750, the server may transmit index information 74 of that AI upscaling model to the standard encoder.
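The selection in operations S730 and S740 can be sketched as follows. This is an illustrative sketch only: the one-dimensional "images", the two toy upscalers standing in for trained filter sets, and the use of mean squared error as the loss are all assumptions, not details taken from the description.

```python
def mean_squared_error(reference, candidate):
    """Average squared difference between two equally sized sample lists."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)


def select_filter_index(source, downscaled, upscalers):
    """Upscale with every candidate model and return (best index, all losses)."""
    losses = {
        index: mean_squared_error(source, upscale(downscaled))
        for index, upscale in upscalers.items()
    }
    best_index = min(losses, key=losses.get)
    return best_index, losses


# Two toy upscalers (hypothetical stand-ins for AI upscaling models).
def repeat_upscale(signal):
    return [v for v in signal for _ in range(2)]


def midpoint_upscale(signal):
    out = []
    for i, v in enumerate(signal):
        nxt = signal[i + 1] if i + 1 < len(signal) else v
        out.extend([v, (v + nxt) / 2])
    return out


source = [0, 1, 2, 3]     # stands in for the 2N x 2M original
downscaled = source[::2]  # stands in for the N x M downscaled image
best_index, losses = select_filter_index(
    source, downscaled, {0: repeat_upscale, 1: midpoint_upscale}
)
# losses == {0: 0.5, 1: 0.25}, so best_index == 1
```

Only the winning index needs to be passed to the encoder, since the decoder is assumed to hold the same filter sets.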
The server may perform encoding using the transmitted N×M raw video data downscaled to 1/4, the AI flag 73, and the filter index information 74. The server may obtain a bitstream by encoding the image data and may include the information in an SEI header.
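As a rough illustration of carrying the AI flag and filter index as side information, the sketch below packs them into two bytes. The byte layout, the `AiSideInfo` class, and the `NO_INDEX` sentinel are hypothetical; the description does not define the SEI payload syntax, and a real encoder would follow the SEI message format of the video coding standard in use.

```python
from dataclasses import dataclass
from typing import Optional

NO_INDEX = 0xFF  # hypothetical sentinel standing in for "filter index = NULL"


@dataclass(frozen=True)
class AiSideInfo:
    ai_flag: int                 # 1 when AI downscaling was performed
    filter_index: Optional[int]  # None when the multi-AI-filter option is off

    def pack(self) -> bytes:
        """Serialize to a two-byte payload: [flag, index or sentinel]."""
        idx = NO_INDEX if self.filter_index is None else self.filter_index
        return bytes([self.ai_flag, idx])

    @classmethod
    def unpack(cls, payload: bytes) -> "AiSideInfo":
        """Inverse of pack(); maps the sentinel back to None."""
        flag, idx = payload[0], payload[1]
        return cls(flag, None if idx == NO_INDEX else idx)


info = AiSideInfo(ai_flag=1, filter_index=2)
round_trip = AiSideInfo.unpack(info.pack())  # equal to info
```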
The server may transmit the bitstream obtained through the encoding operation, together with the information 75 included in the SEI header, to the electronic device 100.
FIG. 8 is a diagram of filter sets according to an embodiment.
Referring to FIG. 8, each of a plurality of filter sets may include a plurality of layers. The plurality of filter sets may be stored identically in the server and in the electronic device.
For example, a filter set 810 may include n layers 811, 812, ..., and 81n. Each of the layers may include a plurality of convolution filters. Each convolution filter may be a three-dimensional convolution filter of width N × height M × channel C (i.e., N×M×C), and activation functions and bias terms 1, 2, ..., and n may be included between the filters.
For ease of explanation, FIG. 8 illustrates the filters of every layer as N×M×C, but in other embodiments each layer may have a different filter size (N×M) and a different number of channels (C). In addition, each filter set may have a different number of layers.
FIG. 9 is a block diagram of an electronic device for training and using an AI model according to an embodiment.
Referring to FIG. 9, a processor 900 may include at least one of a training unit 910 and an acquisition unit 920. The processor 900 of FIG. 9 may correspond to the processor 120 of FIG. 3 and the processor 230 of FIG. 5.
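The training unit 910 may generate or train a model for generating downscaling filters and upscaling filters. Using collected training data, the training unit 910 may generate an AI model for generating downscaling and upscaling filters for image data. Using the collected training data, the training unit 910 may generate a trained model that has criteria for generating downscaling and upscaling filters for image data. The training unit 910 may correspond to the training stage of the AI model.
For example, the training unit 910 may generate, train, or update a model for predicting filter generation, using as input data the original image data and the image data obtained by downscaling and then upscaling the original image data. Specifically, if the purpose of the model is to enhance image quality, the training unit 910 may generate, train, or update a model for generating filters that downscale or upscale the original image data and the image data obtained by downscaling and upscaling the original image data.
The acquisition unit 920 may obtain various information by using predetermined data as input data to the trained model.
For example, when an image is input, the acquisition unit 920 may use the input image and the trained filters to obtain (or recognize, estimate, or infer) information associated with the input image.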
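At least part of the training unit 910 and at least part of the acquisition unit 920 may be embodied as software modules, or may be manufactured in the form of one or more hardware chips and mounted on the electronic device 100. For example, at least one of the training unit 910 and the acquisition unit 920 may be manufactured as a hardware chip dedicated to AI operations, or as part of an existing general-purpose processor (e.g., a central processing unit or an application processor) or a graphics processor (e.g., a graphics processing unit) to be mounted on various kinds of electronic devices. The hardware chip dedicated to AI operations may be a processor specialized for probability calculation that has higher parallel processing performance than a conventional general-purpose processor, so that it can quickly perform arithmetic operations in AI fields such as machine learning. When the training unit 910 and the acquisition unit 920 are implemented as software modules (or program modules containing instructions), the software modules may be stored in a non-transitory computer-readable medium. In this case, the software modules may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the software modules may be provided by the OS and the rest by a predetermined application.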
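The training unit 910 and the acquisition unit 920 may be mounted on a single electronic device such as a server, or may each be mounted on a separate electronic device. For example, one of the training unit 910 and the acquisition unit 920 may be included in an electronic device such as a television, and the other may be included in an external server. Over a wired or wireless connection, model information built by the training unit 910 may be provided to the acquisition unit 920, and data input to the acquisition unit 920 may be provided to the training unit 910 as additional training data.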
FIGS. 10A and 10B are block diagrams of a training unit and an acquisition unit according to an embodiment.
Referring to FIG. 10A, the training unit 910 according to an embodiment may include a training data acquisition unit 910-1 and a model training unit 910-4. The training unit 910 may optionally further include a training data preprocessor 910-2, a training data selector 910-3, and a model evaluation unit 910-5.
The training data acquisition unit 910-1 may obtain training data for the model. According to an embodiment, the training data acquisition unit 910-1 may obtain data associated with an input image as training data. Specifically, the training data acquisition unit 910-1 may obtain the original image data, which is the input image, and the downscaled original image data, and then obtain the upscaled image data, as training data.
The model training unit 910-4 may train the model on how to correct the difference between the image processing result obtained using the training data and the information associated with the actual input image. For example, the model training unit 910-4 may train the AI model through supervised learning, which uses at least part of the training data as a criterion. The model training unit 910-4 may train the AI model through unsupervised learning, in which the model trains itself on the training data without any guidance. The model training unit 910-4 may train the AI model through reinforcement learning, which uses feedback on whether the result determined by the training is correct. The model training unit 910-4 may also train the AI model using a learning algorithm that includes, for example, error back-propagation or gradient descent.
After the AI model is trained, the model training unit 910-4 may store the trained AI model. In this case, the model training unit 910-4 may store the trained AI model in a server (e.g., an AI server). The model training unit 910-4 may also store the trained AI model in the memory of an electronic device connected via a wired or wireless network.
The training data preprocessor 910-2 may preprocess the obtained data so that it can be used for training to generate filters to be applied to a plurality of feature maps. The training data preprocessor 910-2 may format the obtained data in a predetermined format so that the model training unit 910-4 can use it for training to generate the filters to be applied to the feature maps.
The training data selector 910-3 may select data for training from the data obtained by the training data acquisition unit 910-1 or the data preprocessed by the training data preprocessor 910-2. The selected training data may be provided to the model training unit 910-4. The training data selector 910-3 may select the training data from the obtained or preprocessed data according to predetermined selection criteria. The training data selector 910-3 may also select training data according to criteria predetermined through training by the model training unit 910-4.
The training unit 910 may further include the model evaluation unit 910-5 to improve the recognition results of the AI model.
The model evaluation unit 910-5 may input evaluation data to the AI model and, if the recognition result output for the evaluation data does not satisfy a predetermined criterion, may cause the model training unit 910-4 to train the model again. In this case, the evaluation data may be predefined data for evaluating the AI model.
For example, if, among the recognition results of the trained AI model for the evaluation data, the number or ratio of evaluation data items with incorrect recognition results exceeds a predetermined threshold, the model evaluation unit 910-5 may evaluate that the predetermined criterion is not satisfied.
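The threshold check described above can be sketched in a few lines. The function name `needs_retraining`, the default threshold of 0.1, and the sample labels are assumptions for illustration; the description only states that a number or ratio of incorrect results is compared against a predetermined threshold.

```python
def needs_retraining(results, expected, max_error_ratio=0.1):
    """Return True when the share of incorrect results exceeds the threshold."""
    wrong = sum(1 for got, want in zip(results, expected) if got != want)
    return wrong / len(expected) > max_error_ratio


# Hypothetical evaluation run: one of four results is incorrect (ratio 0.25).
retrain = needs_retraining([1, 1, 0, 1], [1, 1, 1, 1])  # True
keep = needs_retraining([1, 1, 1, 1], [1, 1, 1, 1])     # False
```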
When there are a plurality of trained AI models, the model evaluation unit 910-5 may evaluate whether each trained AI model satisfies the predetermined criterion and identify a model that satisfies it as the final AI model. In this case, when a plurality of models satisfy the predetermined criterion, the model evaluation unit 910-5 may identify, as the final AI model, any one preset model or a preset number of models in descending order of evaluation score.
Referring to FIG. 10B, the acquisition unit 920 may include an input data acquisition unit 920-1 and a provider 920-4.
The acquisition unit 920 may optionally further include an input data preprocessor 920-2, an input data selector 920-3, and a model update unit 920-5.
The input data acquisition unit 920-1 may obtain input original image data and may obtain a plurality of filters according to the purpose of the image processing. The plurality of filters may be a plurality of filters for downscaling image data and a plurality of filters for upscaling the downscaled image data. The provider 920-4 may obtain a processing result for the input image by applying the input data obtained by the input data acquisition unit 920-1 to the trained AI model as an input value. The provider 920-4 may also obtain the processing result by applying, as an input value, data selected by the input data preprocessor 920-2 or the input data selector 920-3.
For example, the provider 920-4 may obtain (or estimate) the processing result for the input image by applying, to the trained AI model, the input original image data obtained by the input data acquisition unit 920-1, the filters for downscaling the original image data, and the filters for upscaling the downscaled image data.
The acquisition unit 920 may further include the input data preprocessor 920-2 and the input data selector 920-3 to improve the recognition results of the AI model or to save the resources or time needed to provide the recognition results.
The input data preprocessor 920-2 may preprocess the obtained data so that it can be input to the first AI model and the second AI model. The input data preprocessor 920-2 may format the obtained data in a predefined format so that the provider 920-4 can use it to obtain an improved compression ratio.
The input data selector 920-3 may select data for state determination from the data obtained by the input data acquisition unit 920-1 and the data preprocessed by the input data preprocessor 920-2. The selected data may be provided to the provider 920-4. The input data selector 920-3 may select some or all of the obtained or preprocessed data according to predetermined criteria for state determination. The input data selector 920-3 may also select data according to criteria predetermined through training by the model training unit 910-4.
The model update unit 920-5 may control the AI model to be updated based on an evaluation of the recognition result provided by the provider 920-4. For example, the model update unit 920-5 may provide the image processing result from the provider 920-4 to the model training unit 910-4 and thereby request that the model training unit 910-4 further train or update the AI model.
FIG. 11 is a diagram of a method of training filter sets according to an embodiment.
According to an embodiment, a convolutional neural network (CNN) model may include three-dimensional convolution filters of width × height × channel and activation function layers.
The parameters of the convolution filters may be the target of training, and improved parameters suited to a given purpose may be obtained through training. The AI downscaling model and the AI upscaling model aim to provide an improved compression ratio such that the image data obtained by downscaling and then upscaling the original image data is as similar as possible to the original image data.
Training may be performed by a server or by an electronic device; for ease of explanation, it is described here as being performed by the server.
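Referring to FIG. 11, according to the training method of the present invention, the server may obtain compressed image data 1130 by downscaling original image data 1110 using X convolution filters 1120. FIG. 11 illustrates original image data of size 2N×2M being downscaled to compressed image data of size N×M, but the present invention is not limited thereto.
The server may obtain restored image data 1150 by upscaling the obtained compressed image data 1130 using Y convolution filters 1140. The number Y may be smaller than the number X.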
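The server may compare the restored image data 1150 with the original image data 1110 and train the parameters of each of the filters 1120 and the filters 1140 so as to reduce the loss. The server may calculate the loss value using a similarity analysis method (e.g., PSNR or SSIM).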
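As one example of the similarity measures named above, PSNR can be computed from the mean squared error between the original and restored samples using the standard definition PSNR = 10·log10(peak² / MSE). The function below is a minimal sketch; the flat sample lists and the 8-bit peak value of 255 are assumptions for illustration.

```python
import math


def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    mse = sum((o - r) ** 2 for o, r in zip(original, restored)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: no noise at all
    return 10.0 * math.log10(peak ** 2 / mse)


identical = psnr([0, 255], [0, 255])   # inf
off_by_one = psnr([10, 20], [11, 21])  # about 48.13 dB (MSE = 1)
```

Because a higher PSNR means a smaller difference, a training loop that minimizes loss would typically minimize the MSE or the negative PSNR.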
The AI downscaling model and the AI upscaling model to which the trained parameters are applied may compress or restore image data through improved scaling operations.
FIG. 12 is a diagram of the structure of streaming data according to an embodiment. FIG. 12 illustrates a detailed example embodiment in which the information associated with the filter set is stored during the encoding operation 24 of FIG. 2.
Referring to FIG. 12, the raw video data downscaled to 1/4, the AI flag, and the filter index information 1201 may be input to a standard encoder 1202. The server may obtain a video stream 1204 by encoding the input raw video data downscaled to 1/4. The video stream may refer to a video bitstream.
The server may include the input AI flag and filter index information 1203 in an SEI 1205 contained in the video stream 1204. The server may divide the video stream 1204 into N video chunks 1206 and copy the SEI 1205 to generate N SEIs 1207.
The server may add a generated SEI to each video chunk and store the plurality of video chunks 1208, each with its own SEI, in a streaming storage device 1209.
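The chunking step can be sketched as follows. This is a simplified illustration: the function name `attach_sei_to_chunks` is hypothetical, and splitting on raw byte offsets is an assumption made to keep the example short; a real packager would split on decodable boundaries of the video stream.

```python
def attach_sei_to_chunks(video_stream: bytes, sei: bytes, n_chunks: int):
    """Split the stream into n_chunks pieces and pair each with a copy of the SEI."""
    base, extra = divmod(len(video_stream), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional byte so sizes stay balanced.
        end = start + base + (1 if i < extra else 0)
        chunks.append((sei, video_stream[start:end]))
        start = end
    return chunks


# Hypothetical 10-byte stream and 2-byte SEI payload, split into 4 chunks.
stored = attach_sei_to_chunks(b"0123456789", sei=b"\x01\x02", n_chunks=4)
```

Carrying the SEI with every chunk means a client that starts playback at any chunk still learns the AI flag and filter index.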
Although not shown, the plurality of stored video chunks 1208 may be transmitted to the electronic device.
FIG. 13 is a diagram of a method of training filter sets according to an embodiment.
Referring to FIG. 13, each of the plurality of filter sets may be trained using image data of a different category. Specifically, a first filter set (Filter 1) may be trained using all the images in the training data set (the global data set).
In addition, each of the remaining filter sets may be trained using image data of a single category in the training data set (e.g., movies, sports, music videos, documentaries, or news).
When the filter sets have been trained by image data category as described above, a filter set may be selected based on the category of the input original image data.
However, the present invention is not limited thereto. Regardless of the category of the input original image data, an improved filter set may be selected after the plurality of filter sets have been applied.
FIG. 13 describes training each filter set using training data sets classified by image data category. However, the present invention is not limited thereto, and the training data sets may be classified according to various criteria.
FIG. 14 is a flowchart of an image decoding operation of an electronic device according to an embodiment.
Referring to FIG. 14, at operation S1410, the electronic device may receive image data and information associated with a filter set to be applied to an AI model for upscaling the image data. The information associated with the filter set may be included with the image data. The electronic device may receive the image data and the information associated with the filter set from an external server.
At operation S1420, the electronic device may decode the received image data. The received image data may be image data encoded by the server, and the encoded image data may be obtained by encoding downscaled original image data.
At operation S1430, the electronic device may upscale the decoded image data by inputting it to a first AI model obtained based on the information associated with the filter set. A plurality of filter sets may be pre-stored in the electronic device. The electronic device may obtain the first AI model for upscaling the image data using the filter set, among the plurality of filter sets, that corresponds to the received information. The electronic device may upscale the decoded image data using the obtained first AI model.
When the plurality of filter sets are not stored in the electronic device, the electronic device may transmit the received information associated with the filter set to a second external server. The second external server may store parameter information associated with the plurality of filter sets. The second external server may be the same as or different from the external server that transmits the image data to the electronic device.
When the parameter information of the filter set corresponding to the transmitted information is received from the second external server, the electronic device may obtain the first AI model by applying the received parameter information.
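The fallback to a second external server can be sketched as a lookup with a remote fetch on a miss. Everything named here is hypothetical: `get_filter_parameters`, the dictionary-based local store, and the fake server function are illustrative stand-ins, and keeping the fetched parameters locally afterward is an added assumption that the description does not state.

```python
def get_filter_parameters(index, local_sets, fetch_from_server):
    """Return parameters for a filter-set index, asking the server on a miss."""
    if index not in local_sets:
        # Not pre-stored on the device: request the parameters remotely.
        local_sets[index] = fetch_from_server(index)
    return local_sets[index]


requests = []  # records which indices were requested from the fake server


def fake_second_server(index):
    requests.append(index)
    return {"layers": 3, "index": index}


local_sets = {0: {"layers": 3, "index": 0}}
params_local = get_filter_parameters(0, local_sets, fake_second_server)
params_remote = get_filter_parameters(2, local_sets, fake_second_server)
params_again = get_filter_parameters(2, local_sets, fake_second_server)
# Only one remote request is made: requests == [2]
```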
At operation S1440, the electronic device may output the upscaled image data. Specifically, if the electronic device is a display apparatus having a display, it may control the display to show the upscaled image data. If the electronic device does not have a display, it may transmit the upscaled image data to an external display apparatus for display. In other words, the electronic device may provide the upscaled image data for output via its own display, or provide it to an external display apparatus for output via that apparatus's display.
As described above, the electronic device according to an embodiment may receive the encoded image data and the information associated with the improved filter set from an external server, thereby reducing the time and resources consumed in restoring the image data. Accordingly, a high compression ratio can be achieved, and high-quality images can be compressed into smaller image data and transmitted.
FIGS. 15 and 16 are diagrams of image decoding operations of an electronic device according to an embodiment.
FIG. 15 illustrates the configuration of a device for implementing the AI decoding operation. Operation 1501 of FIG. 15 may correspond to operation 26 of FIG. 2, and operations 1503, 1504, 1505, and 1509 of FIG. 15 may correspond to operation 28 of FIG. 2.
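Referring to FIG. 15, the electronic device may obtain compressed raw data, an AI flag, and filter index information 1502 by decoding the encoded image using a standard decoder 1501. The electronic device may transmit the obtained raw data and information to an AI information controller 1503, which may determine whether AI encoding was performed and whether index information is present.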
If the AI flag indicates that AI encoding was not performed (e.g., AI flag == NULL), the raw data may be transmitted to a display 1509 without an upscaling process, and the raw data of size N×M may be displayed.
If the AI flag indicates that AI encoding was performed (e.g., AI flag == 1), the AI information controller 1503 may transmit the raw data 1510 of size N×M to an AI upscaling model 1507.
If the index information indicates that AI encoding and the multi-AI-filter option were used (e.g., index information != NULL), the AI information controller 1503 may transmit the index information to an index controller 1504. On receiving the index information, the index controller 1504 may transmit a request 1511 to a memory 1505 to load the parameters of the filter set matching the index information into the AI upscaling model 1507. When loading 1506 of the matching parameters is complete, the AI upscaling model 1507 may obtain raw data 1508 of size 2N×2M by upscaling the transmitted raw data 1510 of size N×M using the loaded parameters.
If the index information indicates that AI encoding was used without the multi-AI-filter option (e.g., index information == NULL), the AI upscaling model 1507 may obtain the raw data 1508 of size 2N×2M by upscaling the raw data 1510 of size N×M using default upscaling parameters.
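The electronic device may transmit the obtained raw data 1508 of size 2N×2M to the display 1509 for display.
For ease of explanation, FIG. 15 illustrates the standard decoder 1501, the AI information controller 1503, the index controller 1504, and the AI upscaling model 1507 as separate components, but in other embodiments the operations of each component may be performed by one or more processors.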
FIG. 16 illustrates the AI decoding operation in detail. Operation S1610 of FIG. 16 may correspond to operation 26 of FIG. 2, and operations S1620 to S1640 of FIG. 16 may correspond to operation 28 of FIG. 2.
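Referring to FIG. 16, the electronic device may receive, from the server 200, a bitstream together with a filter index and an AI flag contained in an SEI header 1601. At operation S1610, the electronic device may input the received bitstream, filter index, and AI flag 1601 to a standard decoder and decode them to obtain raw video data of size N×M and an SEI 1602.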
At operation S1620, the electronic device may determine whether the AI flag stored in the SEI indicates that AI encoding was performed (e.g., AI flag == 1). If the electronic device determines at operation S1620 that AI encoding was not performed (e.g., S1620-No), then at operation S1650 the electronic device may display the raw video data 1604 of size N×M.
If the electronic device determines at operation S1620 that AI encoding was performed (e.g., S1620-Yes), then at operation S1630 it may determine whether the multi-AI-filter option was used (e.g., whether filter index != NULL). If the electronic device determines at operation S1630 that the option was used (e.g., S1630-Yes), it may select the filter set corresponding to the filter index information. At operation S1640, the electronic device may upscale the raw video data of size N×M using an AI upscaling model obtained by applying the selected filter set. At operation S1650, the electronic device may display the upscaled video data of size 2N×2M.
For ease of explanation, FIG. 16 illustrates n filter sets with indices 0, 1, 2, and 3 pre-stored in the electronic device, but the index information and the number of filter sets are not limited thereto.
If the electronic device determines at operation S1630 that the multi-AI-filter option was not used (e.g., S1630-No), then at operation S1660 it may perform AI upscaling with filter set 0 applied. Filter set 0 may be a default filter set. At operation S1650, the electronic device may display the upscaled video data 1605 of size 2N×2M.
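The branching of operations S1620, S1630, S1640, and S1660 can be condensed into a small decision function. This is a sketch under stated assumptions: the name `plan_upscaling` is hypothetical, `None` stands in for NULL, and the function only returns which filter-set index to use rather than performing any upscaling.

```python
def plan_upscaling(ai_flag, filter_index, default_index=0):
    """Return the filter-set index to upscale with, or None to display N x M as is."""
    if ai_flag != 1:          # S1620-No: AI encoding was not performed
        return None
    if filter_index is None:  # S1630-No: multi-AI-filter option not used
        return default_index  # S1660: fall back to the default filter set 0
    return filter_index       # S1640: use the filter set named by the index


no_ai = plan_upscaling(ai_flag=None, filter_index=None)   # None
default = plan_upscaling(ai_flag=1, filter_index=None)    # 0
selected = plan_upscaling(ai_flag=1, filter_index=3)      # 3
```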
As described above, the electronic device according to an embodiment may receive the encoded image data and the information associated with the improved filter set from an external server, thereby reducing the time and resources consumed in restoring the image data. Accordingly, a high compression ratio can be achieved, and high-quality images can be compressed into smaller image data and transmitted in a streaming environment.
FIG. 17 is a diagram of an image upscaling operation of an electronic device according to an embodiment. The electronic device may receive input image data 1705 and filter index information 1701 from the server.
The electronic device may select, from among a plurality of filter sets stored in a memory 1702, the filter set 1703 that matches the filter index information 1701. The plurality of filter sets stored in the memory 1702 may be sets of filters trained with different training data. Each filter set may be a filter set applied to a CNN model. Each filter set may include a plurality of layers and a plurality of parameters applied to bias terms 1704. Each of the layers may include a plurality of filters.
The electronic device may obtain an AI upscaling model by applying the parameters of the selected filter set. Referring to FIG. 17, an AI upscaling model including y convolution filters 1706 may be obtained, and the number of convolution filters may vary with the selected filter set. The electronic device may obtain a restored output image 1707 by inputting the input image data 1705 to the obtained AI upscaling model. The restored output image 1707 may be obtained by upscaling the input image data 1705.
提供多個濾波器組的理由如下。 The reasons for providing multiple filter banks are as follows.
第一,卷積神經網路模型可含有黑箱特性。由於所述黑箱特性,在訓練過程期間可難以在裝置中識別卷積神經網路模型的運作。因此,可需要以不同的方式輸入訓練資料集,以獲得專門用於影像分量(諸如,影像類別等)或對於所述影像分量而言最佳的濾波器組。相較於利用整個資料集訓練的濾波器組而言,藉由特定影像分量的輸入資料獲得的所述專用濾波器組通常可具有高損耗及低放大功能,但在特定影像中,可導出改良的結果。 First, convolutional neural network models can contain black-box properties. Due to the black-box nature, the operation of a convolutional neural network model can be difficult to identify in a device during the training process. Therefore, the training data set may need to be input in different ways to obtain a filter bank that is specific to or optimal for an image component (such as an image class, etc.). The dedicated filter bank obtained from the input data of a specific image component can generally have high loss and low amplification compared to a filter bank trained using the entire data set, but in a specific image, an improvement can be derived result.
第二,由於解碼器部分的即時性質,深入地形成所述解碼器部分的人工智慧放大模型的濾波器層受到限制。卷積計算需要大量計算/硬體資源,且因此卷積神經網路模型可對即時效能造成阻礙。因此,若將各種經過訓練的放大濾波器組應用於所述編碼器部分,則可增大層的寬度,且因此可強化放大功能。 Second, due to the real-time nature of the decoder part, the depth of the filter layers that form the AI amplification model of the decoder part is limited. Convolutional calculations require extensive computing/hardware resources, and therefore convolutional neural network models can hinder real-time performance. Therefore, if various trained amplification filter banks are applied to the encoder part, the width of the layer can be increased, and thus the amplification function can be enhanced.
Finally, multiple filtering with a plurality of stored filter banks can provide an improved compression ratio. Because encoding for image streaming does not need to run in real time, upscaling can be performed with all of the filter banks without additional image analysis, and the filter bank that gives the best result can then be selected.
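The exhaustive selection described above can be sketched as a loop that upscales the compressed image with every stored filter bank and keeps the one whose output is closest to the original. This is a minimal sketch, not the patented procedure: the use of PSNR as the quality measure, the `select_filter_bank` name, and the representation of each bank as a callable are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[float]]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of identical size."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def psnr(a: Image, b: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def select_filter_bank(
    original: Image,
    downscaled: Image,
    upscalers: Dict[int, Callable[[Image], Image]],
) -> Tuple[Optional[int], float]:
    """Try every bank (non-real-time, server side) and return the best index and score."""
    best_index: Optional[int] = None
    best_score = -math.inf
    for index, upscale in upscalers.items():
        score = psnr(original, upscale(downscaled))
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score
```

Under this assumption the server would only need to communicate the winning index, so the electronic device could build a single upscaling model rather than evaluating every bank itself.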
According to the various embodiments described above, the restoration process in the electronic device can be simplified because the server identifies the improved upscaling filter bank in advance, through multiple upscaling passes performed during the encoding process. A high compression ratio can therefore be provided in an image streaming environment, and high-quality images can thus be transmitted.
Meanwhile, the various embodiments described above may be implemented using software, hardware, or a combination thereof. In a hardware implementation, the embodiments described in the present invention may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), programmable gate arrays, processors, controllers, microcontrollers, microprocessors, and electrical units for performing other functions. In a software implementation, embodiments such as the procedures and functions described herein may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described herein.
Meanwhile, the methods according to the various embodiments of the present invention described above may be stored in a non-transitory computer-readable medium. Such a non-transitory computer-readable medium can be used in a variety of devices.
The non-transitory computer-readable medium refers to a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a very short time (such as a register, a cache, or a memory). Specifically, the various applications or programs described above may be stored in and provided on a non-transitory computer-readable medium such as a compact disc (CD), a digital versatile disk (DVD), a hard disk, a Blu-ray disc, a universal serial bus (USB) memory stick, a memory card, or a read only memory (ROM).
According to an embodiment, the methods according to the various embodiments disclosed herein may be provided in a computer program product. A computer program product can be traded as a commodity between a seller and a buyer. A computer program product may be distributed in the form of a machine-readable storage medium (for example, a compact disc read-only memory (CD-ROM)) or distributed online through an application store (for example, PlayStore™). In the case of online distribution, at least a portion of the computer program product may be temporarily stored, or temporarily generated, on a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.
Although a few embodiments have been shown and described, it will be apparent to those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention. Accordingly, the scope of the present invention should not be construed as being limited to the embodiments described, but is defined by the appended claims and their equivalents.
21: Original image data
22: Artificial intelligence encoder
23, 27: Compressed image
24: Encoding process/encoding operation
25: Streaming source
26: Decoding process/operation
28: Artificial intelligence decoder/operation
29: Restored original image
30: Display
100: Electronic device/image processing device
200: Server
Claims (3)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2018-0093511 | 2018-08-10 | ||
KR1020180093511A KR102022648B1 (en) | 2018-08-10 | 2018-08-10 | Electronic apparatus, method for controlling thereof and method for controlling server |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202012885A TW202012885A (en) | 2020-04-01 |
TWI821358B | 2023-11-11 |
Family
ID=68067739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108128335A | Electronic apparatus, method for controlling thereof, and method for controlling a server | 2018-08-10 | 2019-08-08 |
Country Status (6)
Country | Link |
---|---|
US (3) | US11388465B2 (en) |
EP (1) | EP3635964A1 (en) |
KR (1) | KR102022648B1 (en) |
CN (1) | CN110830849A (en) |
TW (1) | TWI821358B (en) |
WO (1) | WO2020032661A1 (en) |
Families Citing this family (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3567857A1 (en) | 2017-07-06 | 2019-11-13 | Samsung Electronics Co., Ltd. | Method for encoding/decoding image and device therefor |
WO2020080765A1 (en) | 2018-10-19 | 2020-04-23 | Samsung Electronics Co., Ltd. | Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image |
WO2020080827A1 (en) | 2018-10-19 | 2020-04-23 | Samsung Electronics Co., Ltd. | Ai encoding apparatus and operation method of the same, and ai decoding apparatus and operation method of the same |
WO2020080665A1 (en) | 2018-10-19 | 2020-04-23 | Samsung Electronics Co., Ltd. | Methods and apparatuses for performing artificial intelligence encoding and artificial intelligence decoding on image |
KR102525578B1 (en) | 2018-10-19 | 2023-04-26 | 삼성전자주식회사 | Method and Apparatus for video encoding and Method and Apparatus for video decoding |
US10789675B2 (en) * | 2018-12-28 | 2020-09-29 | Intel Corporation | Apparatus and method for correcting image regions following upsampling or frame interpolation |
US11265580B2 (en) * | 2019-03-22 | 2022-03-01 | Tencent America LLC | Supplemental enhancement information messages for neural network based video post processing |
KR102166337B1 (en) * | 2019-09-17 | 2020-10-15 | 삼성전자주식회사 | Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding of image |
CN110889411B (en) * | 2019-09-27 | 2023-12-08 | 武汉创想外码科技有限公司 | Universal image recognition model based on AI chip |
KR102287947B1 (en) | 2019-10-28 | 2021-08-09 | 삼성전자주식회사 | Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding of image |
KR102436512B1 (en) | 2019-10-29 | 2022-08-25 | 삼성전자주식회사 | Method and Apparatus for video encoding and Method and Apparatus for video decoding |
KR20210056733A (en) * | 2019-11-11 | 2021-05-20 | 삼성전자주식회사 | Electronic device for providing streaming data and method for operating thereof |
KR102245682B1 (en) * | 2019-11-11 | 2021-04-27 | 연세대학교 산학협력단 | Apparatus for compressing image, learning apparatus and method thereof |
KR20210067783A (en) * | 2019-11-29 | 2021-06-08 | 삼성전자주식회사 | Electronic apparatus and control method thereof and system |
KR20210067788A (en) * | 2019-11-29 | 2021-06-08 | 삼성전자주식회사 | Electronic apparatus, system and control method thereof |
US11595847B2 (en) * | 2019-12-19 | 2023-02-28 | Qualcomm Incorporated | Configuration of artificial intelligence (AI) modules and compression ratios for user-equipment (UE) feedback |
KR20210093605A (en) | 2020-01-20 | 2021-07-28 | 삼성전자주식회사 | A display apparatus and a method for operating the display apparatus |
KR20210103867A (en) * | 2020-02-14 | 2021-08-24 | 삼성전자주식회사 | Method and apparatus for streaming vr video |
WO2021163845A1 (en) * | 2020-02-17 | 2021-08-26 | Intel Corporation | Enhancing 360-degree video using convolutional neural network (cnn) -based filter |
KR102287942B1 (en) | 2020-02-24 | 2021-08-09 | 삼성전자주식회사 | Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding of image using pre-processing |
KR102391615B1 (en) * | 2020-03-16 | 2022-04-29 | 주식회사 카이 | Image processing method, video playback method and apparatuses thereof |
KR102471288B1 (en) * | 2020-08-27 | 2022-11-28 | 한국전자기술연구원 | Method and apparatus for transmitting and receaving |
CN115668273A (en) | 2020-09-15 | 2023-01-31 | 三星电子株式会社 | Electronic device, control method thereof and electronic system |
KR102548993B1 (en) * | 2020-12-02 | 2023-06-27 | 주식회사 텔레칩스 | Image scaling system and method for supporting various image mode |
US11670011B2 (en) * | 2021-01-11 | 2023-06-06 | Industry-Academic Cooperation Foundation Yonsei University | Image compression apparatus and learning apparatus and method for the same |
JPWO2022210661A1 (en) * | 2021-03-30 | 2022-10-06 | ||
JP2022174948A (en) * | 2021-05-12 | 2022-11-25 | 横河電機株式会社 | Apparatus, monitoring system, method, and program |
US11917142B2 (en) * | 2021-07-13 | 2024-02-27 | WaveOne Inc. | System for training and deploying filters for encoding and decoding |
WO2023086795A1 (en) * | 2021-11-09 | 2023-05-19 | Netflix, Inc. | Techniques for reconstructing downscaled video content |
KR20240056112A (en) * | 2022-10-21 | 2024-04-30 | 삼성전자주식회사 | Electronic apparatus for identifying a region of interest in an image and control method thereof |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200611573A (en) * | 2004-09-24 | 2006-04-01 | Service & Quality Technology Co Ltd | Intelligent image-processing device for closed-circuit TV camera and it operating method |
US10009622B1 (en) * | 2015-12-15 | 2018-06-26 | Google Llc | Video coding with degradation of residuals |
WO2018120082A1 (en) * | 2016-12-30 | 2018-07-05 | Nokia Technologies Oy | Apparatus, method and computer program product for deep learning |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2646575A1 (en) | 1989-04-26 | 1990-11-02 | Labo Electronique Physique | METHOD AND STRUCTURE FOR DATA COMPRESSION |
JPH06197227A (en) | 1992-07-30 | 1994-07-15 | Ricoh Co Ltd | Image processor |
US9064364B2 (en) * | 2003-10-22 | 2015-06-23 | International Business Machines Corporation | Confidential fraud detection system and method |
US7956930B2 (en) | 2006-01-06 | 2011-06-07 | Microsoft Corporation | Resampling and picture resizing operations for multi-resolution video coding and decoding |
US8204128B2 (en) * | 2007-08-01 | 2012-06-19 | Her Majesty The Queen In Right Of Canada, As Represented By The Minister Of Industry, Through The Communications Research Centre Canada | Learning filters for enhancing the quality of block coded still and video images |
JP2013110518A (en) * | 2011-11-18 | 2013-06-06 | Canon Inc | Image coding apparatus, image coding method, and program, and image decoding apparatus, image decoding method, and program |
CN103369313B (en) | 2012-03-31 | 2017-10-10 | 百度在线网络技术(北京)有限公司 | A kind of method, device and equipment for carrying out compression of images |
CA2878807C (en) | 2012-07-09 | 2018-06-12 | Vid Scale, Inc. | Codec architecture for multiple layer video coding |
GB2543429B (en) | 2015-02-19 | 2017-09-27 | Magic Pony Tech Ltd | Machine learning for visual processing |
CN105611303B (en) | 2016-03-07 | 2019-04-09 | 京东方科技集团股份有限公司 | Image compression system, decompression systems, training method and device, display device |
KR101910446B1 (en) | 2016-07-12 | 2018-10-30 | 광운대학교 산학협력단 | A Compression Method of Digital Hologram Video using Domain Transforms and 2D Video Compression Technique |
KR101938945B1 (en) * | 2016-11-07 | 2019-01-15 | 한국과학기술원 | Method and system for dehazing image using convolutional neural network |
US20180183998A1 (en) | 2016-12-22 | 2018-06-28 | Qualcomm Incorporated | Power reduction and performance improvement through selective sensor image downscaling |
WO2018112514A1 (en) | 2016-12-23 | 2018-06-28 | Queensland University Of Technology | Deep learning systems and methods for use in computer vision |
CN107194347A (en) | 2017-05-19 | 2017-09-22 | 深圳市唯特视科技有限公司 | A kind of method that micro- expression detection is carried out based on Facial Action Coding System |
KR102535361B1 (en) * | 2017-10-19 | 2023-05-24 | 삼성전자주식회사 | Image encoder using machine learning and data processing method thereof |
CN108305214B (en) | 2017-12-28 | 2019-09-17 | 腾讯科技(深圳)有限公司 | Image processing method, device, storage medium and computer equipment |
US10645409B2 (en) * | 2018-06-26 | 2020-05-05 | Google Llc | Super-resolution loop restoration |
- 2018
  - 2018-08-10 KR KR1020180093511A patent/KR102022648B1/en active IP Right Grant
- 2019
  - 2019-08-08 TW TW108128335A patent/TWI821358B/en active
  - 2019-08-08 US US16/535,784 patent/US11388465B2/en active Active
  - 2019-08-09 EP EP19762283.0A patent/EP3635964A1/en not_active Ceased
  - 2019-08-09 CN CN201910738429.3A patent/CN110830849A/en active Pending
  - 2019-08-09 WO PCT/KR2019/010034 patent/WO2020032661A1/en unknown
- 2021
  - 2021-10-07 US US17/496,507 patent/US11825033B2/en active Active
- 2023
  - 2023-10-12 US US18/485,572 patent/US20240040179A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
KR102022648B1 (en) | 2019-09-19 |
US20220030291A1 (en) | 2022-01-27 |
US11825033B2 (en) | 2023-11-21 |
EP3635964A4 (en) | 2020-04-15 |
CN110830849A (en) | 2020-02-21 |
TW202012885A (en) | 2020-04-01 |
US11388465B2 (en) | 2022-07-12 |
EP3635964A1 (en) | 2020-04-15 |
US20200053408A1 (en) | 2020-02-13 |
US20240040179A1 (en) | 2024-02-01 |
WO2020032661A1 (en) | 2020-02-13 |