TWI821358B - Electronic apparatus, method for controlling thereof, and method for controlling a server

Info

Publication number
TWI821358B
Authority
TW
Taiwan
Prior art keywords
image data
artificial intelligence
data
filter
electronic device
Prior art date
Application number
TW108128335A
Other languages
Chinese (zh)
Other versions
TW202012885A (en
Inventor
朴泰俊
李相祚
羅尙權
Original Assignee
南韓商三星電子股份有限公司
Priority date
Filing date
Publication date
Application filed by 南韓商三星電子股份有限公司
Publication of TW202012885A
Application granted
Publication of TWI821358B

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440263Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234363Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115Selection of the code volume for a coding unit prior to coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/439Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using cascaded computational arrangements for performing a single operation, e.g. filtering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0117Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Graphics (AREA)
  • Image Processing (AREA)
  • Controls And Circuits For Display Device (AREA)
  • Electrotherapy Devices (AREA)
  • Percussion Or Vibration Massage (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

An electronic apparatus, a method for controlling thereof, and a method for controlling a server are provided. The method for controlling an electronic apparatus includes receiving image data and information associated with a filter set that is applied to an artificial intelligence model for upscaling the image data from an external server; decoding the image data; upscaling the decoded image data using a first artificial intelligence model that is obtained based on the information associated with the filter set; and providing the upscaled image data for output.

Description

Electronic device, method of controlling same, and method of controlling server

The present invention relates to an electronic device, a control method, and a server control method, and more specifically to an electronic device that improves an image streaming environment by transmitting and receiving high-definition images, a control method, and a server control method.

[Cross-reference to related applications]

This application claims priority to Korean Patent Application No. 10-2018-0093511, filed with the Korean Intellectual Property Office on August 10, 2018, the disclosure of which is incorporated herein by reference in its entirety.

An artificial intelligence (AI) system is a system that trains itself and implements human-level intelligence. The recognition rate of an artificial intelligence system improves the more the system is used.

Artificial intelligence technology includes machine learning (e.g., deep learning) techniques, which use algorithms that classify and learn the features of input data on their own, and element techniques, which use machine learning algorithms to simulate functions of the human brain such as recognition and determination.

The element techniques may include, for example, at least one of the following: language understanding for recognizing human language and characters; visual understanding for recognizing objects as human vision does; inference/prediction for judging information and logically inferring and predicting from it; knowledge representation for processing human experience information into knowledge data; and motion control for controlling the autonomous driving of vehicles and the movement of robots.

In particular, the network state is a critical factor in the image quality of a streaming system that performs streaming by adaptively compressing and restoring images. Network resources, however, are limited. Therefore, it is difficult for users to consume high-definition content unless a large amount of resources is available.

In addition, video size keeps growing as image quality improves, but network bandwidth has not kept pace with this growth. Codec performance is therefore increasingly important for preserving image quality throughout the image compression and restoration process.

Provided are an electronic device, a method of controlling the same, and a server control method, and more specifically an electronic device that upscales a downscaled image by selecting an improved filter set from among a plurality of filter sets, a method of controlling the same, and a server control method.

According to an embodiment, a method of controlling an electronic device is provided, the method including: receiving, from an external server, image data and information associated with a filter set that is applied to an artificial intelligence model for upscaling the image data; decoding the image data; upscaling the decoded image data using a first artificial intelligence model obtained based on the information associated with the filter set; and providing the upscaled image data for output.

The information associated with the filter set includes index information of the filter set, and the upscaling includes: obtaining the first artificial intelligence model, to which one of a plurality of trained filter sets stored in the electronic device is applied based on the index information; and upscaling the decoded image data by inputting the decoded image data into the obtained first artificial intelligence model.
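The index-based selection described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `FILTER_SETS` table, the function names, and the reduction of each trained filter set to a single 3x3 kernel applied after a 2x pixel-repeat expansion. An actual filter set would hold the weights of every layer of the upscaling CNN.

```python
from typing import Dict, List

Image = List[List[float]]
Kernel = List[List[float]]

# Hypothetical table of trained filter sets held in device memory, keyed by index.
# One 3x3 kernel stands in for the full set of CNN layer weights.
FILTER_SETS: Dict[int, Kernel] = {
    0: [[0, 0, 0], [0, 1, 0], [0, 0, 0]],   # identity kernel
    1: [[1 / 9] * 3 for _ in range(3)],      # box-smoothing kernel
}

def expand_2x(img: Image) -> Image:
    """Double width and height by repeating each pixel."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def apply_kernel(img: Image, k: Kernel) -> Image:
    """3x3 filtering (cross-correlation, as in CNN layers) with edge replication."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx] * k[dy + 1][dx + 1]
            out[y][x] = acc
    return out

def upscale_with_index(decoded: Image, filter_index: int) -> Image:
    """Look up the filter set named by the received index and upscale with it."""
    kernel = FILTER_SETS[filter_index]
    return apply_kernel(expand_2x(decoded), kernel)
```

In this sketch the device never receives filter weights over the network. It receives only an index, which is consistent with the description of the filter sets being stored in the electronic device in advance.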

The image data is obtained by encoding downscaled image data, the downscaled image data being acquired by inputting original image data corresponding to the image data into a second artificial intelligence model for downscaling the original image data.

The number of filters of the first artificial intelligence model may be smaller than the number of filters of the second artificial intelligence model.

The information associated with the filter set is information obtained by the external server, and identifies the filter set that minimizes the difference between the upscaled image data acquired by the first artificial intelligence model and the original image data.

The first artificial intelligence model may be a convolutional neural network (CNN).

The providing may include displaying the upscaled image data.

According to an embodiment, a method of controlling a server is provided, the method including: obtaining downscaled image data by inputting original image data into an artificial intelligence downscaling model for downscaling image data; obtaining a plurality of upscaled image data by inputting the downscaled image data into each of a plurality of artificial intelligence upscaling models, the plurality of artificial intelligence upscaling models applying respective filter sets among a plurality of filter sets trained to upscale the downscaled image data; encoding the downscaled image data by adding information associated with the filter set of the artificial intelligence upscaling model that outputs, among the plurality of upscaled image data, the upscaled image data having the smallest difference from the original image data; and transmitting the encoded image data to an external electronic device.
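The server-side selection can be sketched on a one-dimensional signal. All names below are hypothetical, and several simplifications are assumptions rather than details from the specification: decimation stands in for the artificial intelligence downscaling model, two fixed interpolators stand in for upscaling models with different trained filter sets, mean squared error stands in for the "difference", and a dictionary stands in for the encoded stream carrying the flag and filter index.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]

def downscale(x: Signal) -> Signal:
    """Stand-in for the AI downscaling model: keep every second sample."""
    return x[::2]

def up_repeat(x: Signal) -> Signal:
    """Candidate upscaler 0: repeat each sample."""
    return [v for v in x for _ in range(2)]

def up_linear(x: Signal) -> Signal:
    """Candidate upscaler 1: insert the midpoint between neighbours."""
    out: Signal = []
    for i, v in enumerate(x):
        nxt = x[i + 1] if i + 1 < len(x) else v
        out += [v, (v + nxt) / 2]
    return out

# Hypothetical candidates, one per trained filter set, keyed by filter index.
UPSCALERS: Dict[int, Callable[[Signal], Signal]] = {0: up_repeat, 1: up_linear}

def mse(a: Signal, b: Signal) -> float:
    """Mean squared error, used here as the difference measure."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def select_filter_index(original: Signal) -> Tuple[Signal, int]:
    """Downscale, try every candidate upscaler, return the best index."""
    small = downscale(original)
    errors = {idx: mse(up(small), original) for idx, up in UPSCALERS.items()}
    best = min(errors, key=errors.get)
    return small, best

def package(original: Signal) -> dict:
    """Attach the selected index to the downscaled payload (codec step omitted)."""
    small, best = select_filter_index(original)
    return {"ai_flag": True, "filter_index": best, "payload": small}
```

With these stand-ins, a smooth ramp selects the interpolating candidate and a blocky step pattern selects the repeating candidate, so the index transmitted with the stream depends on the content.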

The method may further include training parameters of the plurality of filter sets to reduce the difference between the plurality of upscaled image data and the original image data.
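As a rough illustration of training filter parameters to reduce the difference from the original, the sketch below fits a single interpolation weight by gradient descent on a mean squared error. The passage does not name a training algorithm or a loss, so both are assumptions, as are all function names. A one-parameter interpolator stands in for a filter set with many parameters.

```python
from typing import List

Signal = List[float]

def upscale(small: Signal, w: float) -> Signal:
    """2x upscaler with one trainable weight w blending each sample with its neighbour."""
    out: Signal = []
    for i, v in enumerate(small):
        nxt = small[i + 1] if i + 1 < len(small) else v
        out += [v, w * v + (1 - w) * nxt]
    return out

def loss(small: Signal, original: Signal, w: float) -> float:
    """Mean squared difference between the upscaled and original signals."""
    pred = upscale(small, w)
    return sum((p - q) ** 2 for p, q in zip(pred, original)) / len(original)

def grad(small: Signal, original: Signal, w: float) -> float:
    """Analytic derivative of the loss with respect to w."""
    g = 0.0
    for i, v in enumerate(small):
        nxt = small[i + 1] if i + 1 < len(small) else v
        pred = w * v + (1 - w) * nxt
        g += 2 * (pred - original[2 * i + 1]) * (v - nxt)
    return g / len(original)

def train(small: Signal, original: Signal, w: float = 1.0,
          lr: float = 0.1, steps: int = 200) -> float:
    """Plain gradient descent on w; starts from pixel repetition (w = 1)."""
    for _ in range(steps):
        w -= lr * grad(small, original, w)
    return w

original = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
small = original[::2]          # stand-in for the downscaled data
w_trained = train(small, original)
```

For this ramp signal the weight moves from 1.0 toward 0.5, the midpoint interpolation, and the loss falls accordingly.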

The number of filters of the artificial intelligence upscaling model may be smaller than the number of filters of the artificial intelligence downscaling model.

According to an embodiment, an electronic device is provided, the electronic device including: a communication interface including communication circuitry; and a processor configured to: receive, from an external server via the communication interface, image data and information associated with a filter set that is applied to an artificial intelligence model for upscaling the image data; decode the received image data; upscale the decoded image data using a first artificial intelligence model obtained based on the information associated with the filter set; and provide the upscaled image data for output.

The electronic device may further include a memory. The information associated with the filter set includes index information of the filter set, and the processor is further configured to: obtain the first artificial intelligence model, to which one of a plurality of trained filter sets stored in the memory is applied based on the index information; and upscale the decoded image data by inputting the decoded image data into the obtained first artificial intelligence model.

The image data may be obtained by encoding downscaled image data, the downscaled image data being acquired by inputting original image data corresponding to the image data into a second artificial intelligence model for downscaling the original image data.

The number of filters of the first artificial intelligence model may be smaller than the number of filters of the second artificial intelligence model.

The information associated with the filter set may be information obtained by the external server so as to reduce the difference between the upscaled image data obtained by the first artificial intelligence model and the original image data.

The first artificial intelligence model may be a convolutional neural network (CNN).

The electronic device may further include a display, and the processor is configured to provide the upscaled image data for output by controlling the display to display the upscaled image data.

21, 1110: original image data

22: artificial intelligence encoder

23, 27: compressed image

24: encoding process/encoding operation

25: streaming source

26: decoding process/operation

28: artificial intelligence decoder/operation

29: restored original image

30, 150: display

71: image content source/original image content source/input image content source

72: downscaled image

73: artificial intelligence flag

74: index information/filter index information

75: information

100: electronic device/image processing device

110: wireless communication interface

120, 230, 900: processor

121: random access memory

122: read-only memory

123: central processing unit

124: graphics processing unit

125: bus

130, 220, 1702: memory

140: wired interface

160: video processor

170: audio processor

180: audio output component

200: server

210: communication interface

810: filter set

811, 812, 81n: layer

910: training unit

910-1: training data acquisition unit

910-2: training data preprocessor

910-3: training data selector

910-4: model training unit

910-5: model evaluation unit

920: acquisition unit

920-1: input data acquisition unit

920-2: input data preprocessor

920-3: input data selector

920-4: provider

920-5: model update unit

1000: image streaming system

1120, 1140: convolution filter/filter

1130: compressed image data

1150: restored image data

1201: raw video data, artificial intelligence flag, and filter index information

1202: standard encoder

1203: artificial intelligence flag and filter index information

1204: video stream

1205, 1207: supplemental enhancement information

1206, 1208: video block

1209: streaming storage device

1501: operation/standard decoder

1502: compressed raw data, artificial intelligence flag, and filter index information

1503: operation/artificial intelligence information controller

1504: operation/index controller

1505: operation/memory

1506: loading of parameters matching the index information

1507: artificial intelligence upscaling model

1508, 1510: raw data

1509: operation/display

1511: request

1601: bitstream; filter index and artificial intelligence flag contained in the supplemental enhancement information header

1602: raw video data and supplemental enhancement information

1604: raw video data

1605: upscaled video data

1701: filter index information

1703: selection of the filter set matching the filter index information

1704: bias term

1705: input image data

1706: convolution filter

1707: restored output image

S610, S620, S630, S640, S710, S720, S730, S740, S750, S1410, S1420, S1430, S1440, S1610, S1620, S1630, S1640, S1650, S1660: operation

The above and other aspects, features, and advantages of certain embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings, in which:

FIGS. 1 and 2 are diagrams of an image streaming system according to an embodiment.

FIG. 3 is a block diagram of an electronic device according to an embodiment.

FIG. 4 is a block diagram of an electronic device according to an embodiment.

FIG. 5 is a block diagram of a server according to an embodiment.

FIG. 6 is a flowchart of an image encoding operation of a server according to an embodiment.

FIG. 7 is a flowchart of an image encoding operation of a server according to an embodiment.

FIG. 8 is a diagram of a filter set according to an embodiment.

FIG. 9 is a block diagram of an electronic device for training and using an artificial intelligence model according to an embodiment.

FIGS. 10A and 10B are block diagrams of a training unit and an acquisition unit according to an embodiment.

FIG. 11 is a diagram of a method of training a filter set according to an embodiment.

FIG. 12 is a diagram of a structure of streaming data according to an embodiment.

FIG. 13 is a diagram of a method of training a filter set according to an embodiment.

FIG. 14 is a flowchart of an image decoding operation of an electronic device according to an embodiment.

FIGS. 15 and 16 are diagrams of an image decoding operation of an electronic device according to an embodiment.

FIG. 17 is a diagram of an image upscaling operation of an electronic device according to an embodiment.

Various modifications may be made to the exemplary embodiments of the present invention. Accordingly, specific exemplary embodiments are illustrated in the drawings and described in detail in the detailed description. However, it should be understood that the present invention is not limited to the specific exemplary embodiments, but includes all modifications, equivalents, and alternatives that do not depart from the scope and spirit of the present invention. In addition, well-known functions or constructions are not described in detail, since they would obscure the present invention with unnecessary detail.

The terms used in this specification will be briefly explained, and the present invention will then be described in detail.

All terms used in this specification, including technical and scientific terms, have the same meanings as commonly understood by those skilled in the art. However, these terms may vary depending on the intentions of those skilled in the art, legal or technical interpretation, and the emergence of new technologies. In addition, some terms are arbitrarily selected by the applicant. Such terms may be construed according to the meanings defined herein and, unless otherwise specified, may be construed based on the entire content of the present invention and common technical knowledge in the art.

The present invention is not limited to the embodiments disclosed herein and may be implemented in various forms, and the scope of the present invention is not limited to the following embodiments. In addition, all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as being included within the scope of the present invention. In the following description, configurations that are well known but not relevant to the gist of the present invention may be omitted.

Terms such as "first" and "second" may be used to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another.

A singular expression also includes the plural meaning, unless the context clearly indicates otherwise. In the present invention, terms such as "include" and "have" should be construed as specifying the presence of the stated features, numbers, operations, elements, components, or combinations thereof, and not as precluding the presence or possible addition of one or more other features, numbers, operations, elements, components, or combinations thereof.

In embodiments, a "module", "unit", or "part" performs at least one function or operation, and may be implemented as hardware such as a processor or an integrated circuit, as software executed by a processor, or as a combination thereof. In addition, a plurality of "modules", "units", or "parts" may be integrated into at least one module or chip and implemented as at least one processor, except for a "module", "unit", or "part" that must be implemented as specific hardware.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can practice them. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. To illustrate the present invention clearly in the drawings, some elements that are not essential to a complete understanding of the present invention are omitted, and like reference numerals refer to like elements throughout this specification.

Hereinafter, the present invention will be described in more detail with reference to the drawings.

FIG. 1 is a diagram of an image streaming system according to an embodiment.

Referring to FIG. 1, an image streaming system 1000 may include an electronic device 100 and a server 200.

The server 200 may generate encoded image data. The encoded image data may be image data that is encoded after the server 200 downscales original image data.

The server 200 may downscale the original image data using an artificial intelligence model for downscaling image data. The server 200 may downscale the image data on a per-pixel, per-block, or per-frame basis, or the like.

The server 200 may obtain a plurality of image data by upscaling the downscaled image data with an artificial intelligence model for upscaling image data, to which each of a plurality of filter sets is applied in turn. The server 200 may upscale the downscaled image data on a per-pixel, per-block, or per-frame basis, or the like. A filter set may include a plurality of filters applied to the artificial intelligence model. The number of filters applied to the upscaling artificial intelligence model may be smaller than the number of filters applied to the downscaling artificial intelligence model. This is because the filter layers of the electronic device 100, which serves as a decoder, cannot be made deep owing to the real-time nature of decoder operation.

Each of the plurality of filters may include a plurality of parameters. That is, a filter set may be a collection of parameters for obtaining the artificial intelligence model. The parameters may be referred to as weights, coefficients, or the like.

The plurality of filter sets may be trained in advance and stored in the server 200. The plurality of filter sets may provide an improved compression rate so as to obtain an upscaled image having a minimum difference from the original image data. The trained data may be a plurality of parameters applied to the downscaling artificial intelligence model and a plurality of parameters applied to the upscaling artificial intelligence model. For example, the plurality of filter sets may be trained based on the category of the image data. An exemplary embodiment will be described in more detail with reference to FIG. 13.

The server 200 may identify the filter set used to generate, among the plurality of upscaled image data, the image data having the minimum difference from the original image data. An improved filter set may be identified for each frame of the image data.

The server 200 may transmit the encoded image and information associated with the filter set to the electronic device 100. The information associated with the filter set may include index information of the identified filter set. The index information of a filter set may be used to distinguish filter sets that are each composed of a plurality of parameters. For example, when n filter sets (such as filter 1, filter 2, ..., filter n) are stored in the electronic device 100 and the server 200, the values 1, 2, ..., and n may be defined as the index information.
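
The selection of a filter set index described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: mean squared error stands in for the unspecified "difference", frames are plain lists of pixel rows, and `toy_upscaler`, `select_filter_index`, and the gain values are hypothetical stand-ins for trained upscaling models.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # one grayscale frame as rows of pixel values


def mean_squared_error(a: Image, b: Image) -> float:
    """Average squared pixel difference between two equally sized frames."""
    squared = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(squared) / len(squared)


def toy_upscaler(gain: float) -> Callable[[Image], Image]:
    """Hypothetical stand-in for an upscaling model with one filter set applied.

    It doubles each dimension by pixel repetition and scales values by `gain`.
    """
    def upscale(image: Image) -> Image:
        result = []
        for row in image:
            wide = [pixel * gain for pixel in row for _ in range(2)]
            result.extend([wide, list(wide)])
        return result
    return upscale


def select_filter_index(original: Image, downscaled: Image,
                        upscalers: Dict[int, Callable[[Image], Image]]
                        ) -> Tuple[int, float]:
    """Return the index whose upscaled result differs least from the original."""
    errors = {index: mean_squared_error(original, upscale(downscaled))
              for index, upscale in upscalers.items()}
    best_index = min(errors, key=errors.get)
    return best_index, errors[best_index]


# Index values 1..n identify the candidate filter sets, as in the text above.
UPSCALERS = {1: toy_upscaler(0.5), 2: toy_upscaler(1.0), 3: toy_upscaler(1.5)}
ORIGINAL = [[10.0, 10.0, 20.0, 20.0],
            [10.0, 10.0, 20.0, 20.0],
            [30.0, 30.0, 40.0, 40.0],
            [30.0, 30.0, 40.0, 40.0]]
DOWNSCALED = [[10.0, 20.0], [30.0, 40.0]]
BEST_INDEX, BEST_ERROR = select_filter_index(ORIGINAL, DOWNSCALED, UPSCALERS)
```

Because the electronic device 100 and the server 200 may hold the same filter sets, only the winning index needs to accompany each frame in this sketch.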

The electronic device 100 may decode received image data and upscale it. The received image data may be encoded data received from the server 200. The electronic device 100 may upscale the decoded image data using the upscaling artificial intelligence model.

The electronic device 100 may store a plurality of filter sets for upscaling image data. The plurality of filter sets stored in the electronic device 100 may be identical to the plurality of filter sets stored in the server 200.

The electronic device 100 may obtain an upscaled image by inputting the decoded image data into the upscaling artificial intelligence model, which is obtained based on the filter set information included in the image data received from the server 200. Specifically, the electronic device 100 may obtain the upscaling artificial intelligence model by using a single filter set, among the plurality of filter sets stored in the electronic device 100, that is selected based on the information associated with the filter set received from the server 200.

The electronic device 100 may provide the upscaled image data for output.

When the electronic device 100 is a display device including a display (such as a personal computer (PC), a television (TV), or a mobile device), the electronic device 100 may provide the upscaled image data for display on the display of the electronic device 100.

When the electronic device 100 is a device that does not include a display (such as a set-top box or a server), the electronic device 100 may provide the upscaled image data to an external device having a display, so that the external device can display the upscaled image data.

As described above, by identifying the improved upscaling filter set in advance, through multiple upscaling passes during the encoding process performed by the server, the restoration process in the electronic device can be simplified. Accordingly, a high compression rate can be achieved in an image streaming environment, and thus high-definition images can be transmitted in such an environment.

FIG. 2 is a diagram of the image streaming system shown in FIG. 1.

Referring to FIG. 2, the server 200 may input original image data 21 into an artificial intelligence encoder 22. The original image data 21 may be an image content source. For example, the size of the original image data 21 may be 2N×2M.

The artificial intelligence encoder 22 may receive the original image data 21 of size 2N×2M and obtain a compressed image 23 of size N×M. The artificial intelligence encoder 22 may downscale the original image data 21 using the downscaling artificial intelligence model. The artificial intelligence encoder 22 may obtain a plurality of upscaled image data by upscaling the compressed image 23 using the artificial intelligence model to which each of the plurality of filter sets is applied. The artificial intelligence encoder 22 may identify the filter set used to generate, among the plurality of upscaled image data, the image data most similar to the original image data 21. An exemplary embodiment of the filter set will be described in more detail with reference to FIG. 8.

A filter may be a mask having certain parameters and may be defined by a matrix of parameters. A filter may also be referred to as a window or a kernel. The parameters constituting the matrix of a filter may include 0 (e.g., a zero value), zero elements that can be approximated as 0, or non-zero elements having a constant value between 0 and 1, and may have various patterns depending on the function of the filter.

For example, when the artificial intelligence model is embodied as a convolutional neural network (CNN) for recognizing images, the electronic device may apply a filter having certain parameters to an input image and determine, as a pixel value of an output image, the sum of the values obtained by multiplying each parameter of the image by the corresponding parameter of the filter (a convolution computation), so as to extract a feature value.
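
The convolution computation described above can be sketched in a few lines. This is an illustrative sketch only: the function name `convolve2d_valid` and the sample values are assumptions, the image is a plain list of rows, and, as is customary in convolutional neural networks, the filter is applied without flipping.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the filter over the image without padding.

    Each output pixel is the sum of the element-wise products of the
    filter parameters and the image values under the filter window.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[y + i][x + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical sample data: a 3x3 image and a 2x2 filter.
SAMPLE_IMAGE = [[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0],
                [7.0, 8.0, 9.0]]
SAMPLE_KERNEL = [[1.0, 0.0],
                 [0.0, 1.0]]
FEATURE_MAP = convolve2d_valid(SAMPLE_IMAGE, SAMPLE_KERNEL)
```

With this sample filter, each output value is the sum of a pixel and its lower-right neighbor, so the 3×3 input yields a 2×2 feature map.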

Strong features of the input image data may be extracted by extracting a plurality of feature values through a plurality of filters, and the number of feature values extracted may depend on the number of filters. The convolutional image processing may be repeated through the plurality of layers shown in FIG. 11, and each of the layers may include a plurality of filters. Meanwhile, the filters to be trained may vary according to the training objective of the convolutional neural network, and the pattern of the filters to be selected may vary accordingly. For example, the filters to be trained or selected may vary depending on whether the training objective of the convolutional neural network is to downscale or upscale the input image, on the category to which the image belongs, and so on.

The server 200 may perform an encoding process 24 using the compressed image 23 of size N×M obtained from the artificial intelligence encoder 22 and the information associated with the filter set for improved upscaling.

The encoding process 24 may be a general-purpose encoding process. Specifically, an image encoder may generate a bitstream by encoding the compressed image 23 of size N×M. The generated bitstream may be produced in a streaming standard format by a streaming standard formatter. The process of generating the bitstream will be described in more detail with reference to FIG. 13. The generated, compressed bitstream may be stored in a streaming storage device.

The server 200 may transmit the stored streaming source 25 to the electronic device 100. The transmitted compressed bitstream may include the encoded image data and the information associated with the filter set for improved upscaling.
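
As a rough illustration of carrying filter set information alongside encoded image data, the sketch below prepends an artificial intelligence flag and a filter index to each encoded frame. The three-byte layout, the class `FrameSideInfo`, and the functions `pack_frame` and `unpack_frame` are hypothetical and chosen only for this example; they are not the supplemental enhancement information syntax of any video coding standard and are not taken from the embodiment.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Assumed layout: 1-byte flag followed by a 2-byte unsigned index, big-endian.
HEADER_FORMAT = ">BH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


@dataclass(frozen=True)
class FrameSideInfo:
    ai_flag: bool       # whether upscaling by the artificial intelligence model applies
    filter_index: int   # index of the filter set identified by the server


def pack_frame(side_info: FrameSideInfo, encoded_frame: bytes) -> bytes:
    """Prepend the side information to one encoded frame."""
    header = struct.pack(HEADER_FORMAT, int(side_info.ai_flag),
                         side_info.filter_index)
    return header + encoded_frame


def unpack_frame(packet: bytes) -> Tuple[FrameSideInfo, bytes]:
    """Split a packet back into side information and the encoded frame."""
    flag, index = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    return FrameSideInfo(bool(flag), index), packet[HEADER_SIZE:]


# Hypothetical payload bytes standing in for an encoded frame.
PACKET = pack_frame(FrameSideInfo(ai_flag=True, filter_index=3), b"\x00\x01")
SIDE_INFO, PAYLOAD = unpack_frame(PACKET)
```

Keeping the side information per frame matches the statement that a filter set may be identified for each frame of the image data.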

The electronic device 100 may obtain a compressed image 27 of size N×M by performing a decoding process 26 on the received streaming source 25. The obtained compressed image 27 of size N×M may correspond to the compressed image 23 of size N×M before encoding. The electronic device 100 may obtain a bitstream by inputting the streaming source 25 into a streaming parser, and obtain the decoded compressed image 27 of size N×M by inputting the obtained bitstream into an image decoder.

The electronic device 100 may perform upscaling by inputting the decoded compressed image 27 of size N×M into an artificial intelligence decoder 28. The electronic device 100 may obtain an upscaling artificial intelligence model by applying one of the stored plurality of filter sets, and use the obtained upscaling artificial intelligence model to obtain a restored original image 29 of size 2N×2M. The electronic device 100 may select the one of the plurality of filter sets based on the information associated with the filter set included in the received streaming source 25.
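
The device-side step of going from an N×M decoded image to a 2N×2M restored image with an index-selected filter set can be sketched as below. This is a deliberately simplified stand-in, not the artificial intelligence decoder 28 itself: a single 3×3 convolution with a bias follows pixel repetition, whereas a real upscaling model would have several trained layers. The names `FilterSet`, `upsample_nearest_2x`, `apply_filter`, and `upscale` are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

Image = List[List[float]]


@dataclass(frozen=True)
class FilterSet:
    kernel: List[List[float]]  # 3x3 convolution parameters
    bias: float                # bias added to every output pixel


def upsample_nearest_2x(image: Image) -> Image:
    """Double width and height by repeating each pixel."""
    result = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        result.extend([wide, list(wide)])
    return result


def apply_filter(image: Image, filter_set: FilterSet) -> Image:
    """3x3 convolution with edge replication, plus the bias."""
    height, width = len(image), len(image[0])
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = filter_set.bias
            for i in range(3):
                for j in range(3):
                    yy = min(max(y + i - 1, 0), height - 1)
                    xx = min(max(x + j - 1, 0), width - 1)
                    acc += image[yy][xx] * filter_set.kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


def upscale(decoded: Image, filter_sets: Dict[int, FilterSet],
            filter_index: int) -> Image:
    """Select the filter set by the received index and upscale N x M to 2N x 2M."""
    return apply_filter(upsample_nearest_2x(decoded), filter_sets[filter_index])


# Hypothetical stored filter sets: an identity filter and an identity filter with bias.
IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
FILTER_SETS = {1: FilterSet(IDENTITY, 0.0), 2: FilterSet(IDENTITY, 10.0)}
RESTORED = upscale([[1.0, 2.0], [3.0, 4.0]], FILTER_SETS, filter_index=1)
```

The point of the sketch is the lookup by index: the device performs no search of its own, since the server has already identified which stored filter set to apply.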

The electronic device 100 may control a display 30 to display the restored original image 29. When the electronic device 100 is not provided with a display, the electronic device 100 may provide the restored original image 29 to an external display device for display.

Meanwhile, for ease of explanation, the artificial intelligence encoder 22, the image encoder, and the streaming standard formatter are shown as separate components. However, in another embodiment, these components may be implemented by a single processor. Likewise, the streaming parser, the image decoder, and the artificial intelligence decoder 28 may also be implemented by a single processor.

As described above, high-quality image compression can be achieved by using artificial intelligence models to downscale and upscale image data, and thus high-definition images can be transmitted.

FIG. 3 is a block diagram of an electronic device according to an embodiment.

Referring to FIG. 3, the electronic device 100 may include a wireless communication interface 110 and a processor 120.

The wireless communication interface 110 may be configured to communicate with various types of external devices according to various types of communication methods. The electronic device 100 may communicate with an external device using a wired or wireless communication method. In this embodiment, for ease of explanation, communication using a wireless method is described as being performed through the wireless communication interface 110, and communication using a wired method is described as being performed through the wired interface 140 shown in FIG. 4.

The wireless communication interface 110 may receive image data from an external device using a wireless communication method such as wireless fidelity (Wi-Fi), Bluetooth, or near field communication (NFC). According to an embodiment, the electronic device 100 may perform image processing on an image that the user selects from among a plurality of images stored in the memory 130 provided in the electronic device 100.

When the electronic device 100 is configured to perform wireless communication, the wireless communication interface 110 may include a Wi-Fi chip, a Bluetooth chip, a wireless communication chip, an NFC chip, and the like. The Wi-Fi chip and the Bluetooth chip may perform communication using the Wi-Fi method and the Bluetooth method, respectively. When the Wi-Fi chip or the Bluetooth chip is used, various connectivity information such as a service set identifier (SSID) and a session key may first be transmitted and received, a communication connection may be established based on the connectivity information, and various information may then be transmitted and received over the communication connection. The wireless communication chip refers to a chip that performs communication according to various communication standards, such as Institute of Electrical and Electronics Engineers (IEEE) standards, ZigBee, third generation (3G), Third Generation Partnership Project (3GPP), Long Term Evolution (LTE), and fifth generation (5G). The NFC chip refers to a chip that operates in an NFC mode using the 13.56 MHz band among various radio-frequency identification (RFID) frequency bands (such as 135 kHz, 13.56 MHz, 433 MHz, 860 to 960 MHz, and 2.45 GHz).

The electronic device 100 may receive, from an external server via the wireless communication interface 110, image data and information associated with a filter set applied to the artificial intelligence model for upscaling the image data.

The image data received from the external server may be image data encoded by the external server. The encoded image data may be obtained by encoding downscaled image data, and the downscaled image data may be obtained by inputting original image data into an artificial intelligence model for downscaling.

The information associated with the filter set (e.g., metadata) may be included with the image data. The information associated with the filter set may be obtained from the external server so as to minimize the difference between the original image data and the upscaled image data obtained by the artificial intelligence model for upscaling image data. The information associated with the filter set may be obtained for each frame of the image data. The process by which the external server obtains the information associated with the filter set will be described in more detail with reference to FIG. 7.

The processor 120 may control the overall operation of the electronic device 100.

According to an embodiment, the processor 120 may be implemented as a digital signal processor (DSP), a microprocessor, or a time controller (TCON), but is not limited thereto. The processor 120 may include one or more of a central processing unit (CPU), a microcontroller unit (MCU), a micro processing unit (MPU), a controller, an application processor (AP), a communication processor (CP), an Advanced RISC Machine (ARM) processor, and the like, or may be defined by the corresponding term. The processor 120 may be implemented as a system on chip (SoC) or a large scale integration (LSI) with a built-in processing algorithm, or in the form of a field programmable gate array (FPGA).

The processor 120 may decode the image data received via the wireless communication interface 110. The processor 120 may generate compressed image data by decoding the encoded image data received from the external server.

The processor 120 may upscale the decoded image data by inputting the decoded image data into the artificial intelligence model for upscaling, which is obtained based on the received information associated with the filter set. The term "upscaling" may also be referred to as "decompression", "image restoration", or the like. The processor 120 may apply the received information associated with the filter set to obtain all or some of the plurality of filters constituting the artificial intelligence model. Obtaining some of the plurality of filters may mean that some of the filters constituting the artificial intelligence model are preset, and only some of the remaining filters are obtained based on the received filter set information.
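
The idea that some filters are preset while others come from the received filter set can be sketched as a simple merge. The layer names, the dictionary representation, and the function `build_model_filters` are hypothetical and serve only to illustrate the "all or some" case; the embodiment does not specify how filters are keyed.

```python
from typing import Dict, List

Filter = List[List[float]]


def build_model_filters(preset: Dict[str, Filter],
                        filter_set: Dict[str, Filter],
                        layer_order: List[str]) -> List[Filter]:
    """Assemble the per-layer filters of the upscaling model.

    A filter from the received filter set takes precedence; a layer that the
    filter set does not cover falls back to its preset filter.
    """
    filters = []
    for name in layer_order:
        if name in filter_set:
            filters.append(filter_set[name])
        elif name in preset:
            filters.append(preset[name])
        else:
            raise KeyError(f"no filter available for layer {name!r}")
    return filters


# Hypothetical example: the first layer is preset, the last two are received.
PRESET = {"layer1": [[1.0]]}
RECEIVED = {"layer2": [[2.0]], "layer3": [[3.0]]}
MODEL_FILTERS = build_model_filters(PRESET, RECEIVED,
                                    ["layer1", "layer2", "layer3"])
```

If the received filter set covered every layer, the preset dictionary would simply go unused, which corresponds to obtaining all of the filters from the received information.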

According to another embodiment, when the plurality of filter sets is not stored in the memory 130, the processor 120 may control the wireless communication interface 110 to transmit the received information associated with the filter set to a second external server. When parameter information corresponding to the filter set is received from the second external server via the wireless communication interface 110, the processor 120 may use the received parameter information to obtain the artificial intelligence model for upscaling, and use the obtained artificial intelligence model to upscale the decoded image data.
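
The fallback to a second external server can be sketched as a lookup that first consults local storage and otherwise asks a remote source. The function `resolve_parameters` and the in-memory `SECOND_SERVER` stand-in are assumptions for illustration; no real network interface is implied.

```python
from typing import Callable, Dict, List, Optional

Parameters = List[float]


def resolve_parameters(filter_index: int,
                       local_store: Dict[int, Parameters],
                       fetch_remote: Callable[[int], Optional[Parameters]]
                       ) -> Parameters:
    """Return the parameters for a filter index.

    Locally stored parameters are used when present; otherwise the index is
    forwarded to the second server through `fetch_remote`.
    """
    if filter_index in local_store:
        return local_store[filter_index]
    parameters = fetch_remote(filter_index)
    if parameters is None:
        raise LookupError(f"filter set {filter_index} is not available")
    return parameters


# Hypothetical stand-in for the second external server.
SECOND_SERVER = {2: [0.2, 0.3]}
REMOTE_REQUESTS: List[int] = []


def fetch_from_second_server(filter_index: int) -> Optional[Parameters]:
    """Record the request and answer from the in-memory stand-in."""
    REMOTE_REQUESTS.append(filter_index)
    return SECOND_SERVER.get(filter_index)


LOCAL_STORE = {1: [0.1]}
LOCAL_RESULT = resolve_parameters(1, LOCAL_STORE, fetch_from_second_server)
REMOTE_RESULT = resolve_parameters(2, LOCAL_STORE, fetch_from_second_server)
```

Passing the fetch step in as a callable keeps the sketch independent of any particular transport, which the text leaves open.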

處理器120可提供經放大的影像資料以供輸出。處理器120可控制無線通訊介面110將經放大的影像資料提供至外部顯示 設備。參考圖4,當電子裝置100中設置有顯示器150時,處理器120可控制顯示器150顯示經放大的影像資料。 The processor 120 may provide the enlarged image data for output. The processor 120 can control the wireless communication interface 110 to provide the amplified image data to an external display. equipment. Referring to FIG. 4 , when the display 150 is provided in the electronic device 100 , the processor 120 can control the display 150 to display the enlarged image data.

如上所述,根據實施例的電子裝置100可自外部伺服器接收與經編碼的影像資料及與改良的濾波器組相關聯的資訊,藉此在復原影像資料期間縮短時間且減少資源消耗。因此,可實現影像的高壓縮率,且可將高畫質影像壓縮成大小較小的影像資料並進行傳輸。 As described above, the electronic device 100 according to the embodiment can receive information associated with the encoded image data and the improved filter set from an external server, thereby shortening time and reducing resource consumption during restoration of the image data. Therefore, a high compression rate of images can be achieved, and high-quality images can be compressed into smaller image data and transmitted.

圖4是根據實施例的電子裝置的方塊圖。 4 is a block diagram of an electronic device according to an embodiment.

參考圖4,電子裝置100可包括無線通訊介面110、處理器120、記憶體130、有線介面140、顯示器150、視訊處理器160、音訊處理器170及音訊輸出組件180等。 Referring to FIG. 4 , the electronic device 100 may include a wireless communication interface 110, a processor 120, a memory 130, a wired interface 140, a display 150, a video processor 160, an audio processor 170, an audio output component 180, etc.

無線通訊介面110及處理器120可包括與圖3中所述的配置相同的配置,且因此將不再贅述。 The wireless communication interface 110 and the processor 120 may include the same configuration as that described in FIG. 3 and therefore will not be described again.

記憶體130可儲存用於操作電子裝置100的各種程式及資料。具體而言,至少一個命令可儲存於記憶體130中。處理器120可藉由執行記憶體130中所儲存的命令來實行本文中所述的操作。記憶體130可體現為非揮發性記憶體、揮發性記憶體、快閃記憶體、硬碟驅動器(hard disk drive,HDD)、固態驅動器(solid state drive,SDD)等。 The memory 130 can store various programs and data for operating the electronic device 100 . Specifically, at least one command may be stored in the memory 130 . Processor 120 may perform the operations described herein by executing commands stored in memory 130 . The memory 130 may be embodied as a non-volatile memory, a volatile memory, a flash memory, a hard disk drive (HDD), a solid state drive (SDD), etc.

經過訓練的人工智慧模型可儲存於記憶體130中。經過訓練的人工智慧模型可包括用於放大影像資料的多個層。所述多個層中的每一者可包括多個濾波器。所述多個濾波器中的每一者 可包含多個參數。舉例而言,經過訓練的人工智慧模型可以是卷積神經網路(CNN)。 The trained artificial intelligence model can be stored in the memory 130 . A trained artificial intelligence model can include multiple layers for amplifying image data. Each of the plurality of layers may include a plurality of filters. Each of the plurality of filters Can contain multiple parameters. For example, the trained artificial intelligence model can be a convolutional neural network (CNN).

所述多個濾波器可各自應用於影像資料的整個圖框。根據實施例,可使用對影像資料的每一圖框應用相同的參數的濾波器,但可使用對每一圖框應用不同的參數的濾波器來放大影像資料。 Each of the plurality of filters may be applied to an entire frame of image data. According to embodiments, a filter that applies the same parameters to each frame of the image data may be used, but a filter that applies different parameters to each frame may be used to amplify the image data.

多個經過訓練的濾波器組可儲存於記憶體130中。具體而言,每一濾波器組可包括多個參數,且所述多個經過訓練的參數可儲存於記憶體130中。可基於個別給出的索引資訊來區分所述多個濾波器組。可提前訓練記憶體130中所儲存的濾波器組,以使得可獲得與在輸入影像資料被縮小之後的最初影像資料最類似的經放大的影像資料。輸入影像資料可以是各種類別的影像資料。將參考圖13更詳細地闡述訓練濾波器組的示例性實施例。 Multiple trained filter banks may be stored in memory 130 . Specifically, each filter bank may include multiple parameters, and the multiple trained parameters may be stored in the memory 130 . The plurality of filter banks may be distinguished based on individually given index information. The filter bank stored in the memory 130 can be trained in advance so that the enlarged image data most similar to the original image data after the input image data is reduced can be obtained. The input image data can be various categories of image data. An exemplary embodiment of a training filter bank will be explained in more detail with reference to Figure 13.

The operations of the AI model according to an embodiment may be performed under the control of the processor 120.

The processor 120 may obtain an AI model to which one of the plurality of filter sets stored in the memory 130 is applied, based on the index information included in the received information associated with the filter set. The processor 120 may upscale the decoded image data by inputting it into the AI model to which the filter set corresponding to the received index information is applied.
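The index-based selection described above can be pictured with a small Python sketch. Everything in it is an illustrative assumption rather than the claimed implementation: the names `FILTER_SETS`, `upscale_2x`, and `upscale_with_index` are hypothetical, each "filter set" is reduced to a single gain value standing in for trained CNN parameters, and the fallback to index 0 is not specified in the source.

```python
from typing import Dict, List, Optional

Image = List[List[float]]

# Hypothetical store of trained filter sets, keyed by index (cf. memory 130).
# Each "filter set" is a single gain value standing in for CNN parameters.
FILTER_SETS: Dict[int, float] = {0: 1.0, 1: 0.9, 2: 1.1}


def upscale_2x(image: Image, gain: float) -> Image:
    """Toy stand-in for the AI upscaling model: nearest-neighbour 2x, then a gain."""
    out: Image = []
    for row in image:
        wide = [value * gain for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upscale_with_index(decoded: Image, filter_index: Optional[int]) -> Image:
    """Apply the model with the filter set named by the received index."""
    if filter_index is None or filter_index not in FILTER_SETS:
        filter_index = 0  # assumed fallback; the source does not define one
    return upscale_2x(decoded, FILTER_SETS[filter_index])
```

The point of the sketch is only that the device keeps the same indexed filter sets as the server, so a small index in the received metadata is enough to choose the parameters used for upscaling.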

The processor 120 may perform various operations according to the purpose of the AI model stored in the memory 130.

For example, if the AI model relates to image recognition and/or visual understanding (i.e., techniques for recognizing and processing objects as human vision does), the processor 120 may use the AI model to perform object recognition, object tracking, image search, human recognition, scene understanding, spatial understanding, image enhancement, and the like on an input image.

As another example, if the AI model relates to information recommendation and/or inference and prediction (i.e., techniques for judging information and logically inferring and predicting from it), the processor 120 may use the AI model to perform knowledge/probability-based reasoning, optimization prediction, preference-based planning, and recommendation.

As another example, if the AI model relates to query processing and/or knowledge representation (i.e., techniques for automatically converting human experience information into knowledge data), the processor 120 may use the AI model to perform knowledge construction (e.g., data generation and classification) and knowledge management (e.g., data utilization).

As described above, by obtaining an AI model for upscaling the image data based on the received image data and the information associated with the filter set, and by inputting the decoded image data into the obtained AI model to obtain upscaled image data, an improved restored image can be obtained even with a small amount of time and resources.

The wired interface 140 may be configured to connect the electronic device 100 to an external device using a wired communication method. The wired interface 140 may input and output at least one of an audio signal and a video signal via a wired connection such as a cable or a port.

The wired interface 140 may be a DisplayPort, a high definition multimedia interface (HDMI), a digital visual interface (DVI), a red green blue (RGB) interface, a D-subminiature (DSUB) interface, an S-Video interface, a composite video interface, a universal serial bus (USB), a Thunderbolt port, or the like.

The display 150 may display the upscaled image data. The image displayed on the display 150 may have been upscaled by the trained AI model. Depending on the purpose of the AI model, an object included in the image may be displayed on the display 150, and the type of the object may also be displayed.

The display 150 may be implemented as various types of display, such as a light emitting diode (LED) display, a liquid crystal display (LCD), an organic light emitting diode (OLED) display, or a plasma display panel (PDP). The display 150 may further include a driving circuit, a backlight unit, and the like, and the driving circuit may be implemented in the form of an amorphous-silicon (a-Si) thin film transistor (TFT), a low temperature poly silicon (LTPS) TFT, an organic TFT (OTFT), or the like. The display 150 may also be implemented as a flexible display.

The display 150 may include a touch sensor for detecting a touch gesture of a user. The touch sensor may be embodied as various types of sensor, such as an electrostatic type, a pressure-sensitive type, or a piezoelectric type. When the image processing device 100 supports a pen input function, the display 150 may detect user gestures made with an input means such as a pen as well as with the user's finger. When the input means is a stylus pen that includes a coil, the image processing device 100 may include a magnetic field sensor capable of sensing the magnetic field changed by the coil in the stylus pen. Accordingly, the display 150 may detect proximity gestures (i.e., hovering) as well as touch gestures.

Although the display function and the gesture detection function have been described as being performed by the same component, the functions may be performed by different components. According to various embodiments, the display 150 may be omitted from the electronic device 100. For example, when the electronic device 100 is a set-top box or a server, the display 150 may not be provided. In such a case, the upscaled image data may be transmitted to an external display device via the wireless communication interface 110 or the wired interface 140.

The processor 120 may include a random access memory (RAM) 121, a read-only memory (ROM) 122, a central processing unit (CPU) 123, a graphics processing unit (GPU) 124, and a bus 125. The RAM 121, the ROM 122, the CPU 123, the GPU 124, and the like may be connected to one another via the bus 125.

The CPU 123 may access the memory 130 and perform booting using an operating system (O/S) stored in the memory 130. The CPU 123 may perform various operations using the various programs, content, data, and the like stored in the memory 130.

A command set for system booting and the like may be stored in the ROM 122. When a turn-on command is input and power is supplied, the CPU 123 may copy the O/S stored in the memory 130 to the RAM 121 according to the commands stored in the ROM 122, and execute the O/S to boot the system. When booting is complete, the CPU 123 may copy various programs stored in the memory 130 to the RAM 121 and execute the copied programs to perform various operations.

When booting of the electronic device 100 is complete, the GPU 124 may display a user interface (UI) on the display 150. For example, the GPU 124 may generate a screen including various objects, such as icons, images, and text, using a calculation unit (not shown) and a rendering unit (not shown). The calculation unit may calculate attribute values of each object, such as coordinates, shape, size, and color, according to the layout of the screen. The rendering unit may generate screens of various layouts including the objects, based on the attribute values calculated by the calculation unit. The screen (or UI window) generated by the rendering unit may be provided to the display 150 and displayed in a main display area and a sub display area.

The video processor 160 may be configured to process video data included in content received via the wireless communication interface 110 or the wired interface 140, or in content stored in the memory 130. The video processor 160 may perform various kinds of image processing on the video data, such as decoding, scaling, noise filtering, frame rate conversion, and resolution conversion.

The audio processor 170 may be configured to process audio data included in content received via the wireless communication interface 110 or the wired interface 140, or in content stored in the memory 130. The audio processor 170 may perform various kinds of processing on the audio data, such as decoding, amplification, and noise filtering.

When a playback application for multimedia content is executed, the processor 120 may drive the video processor 160 and the audio processor 170 to play back the content. The display 150 may display an image frame generated by the video processor 160 on at least one of the main display area and the sub display area.

In the above description, the processor 120, the video processor 160, and the audio processor 170 are described as separate components. However, according to an embodiment, these components may be embodied as a single chip. For example, the processor 120 may also serve as the video processor 160 and the audio processor 170.

The audio output component 180 may output audio data generated by the audio processor 170.

Although not shown in FIG. 4, according to an embodiment the electronic device 100 may further include various external input ports for connecting to external terminals or peripherals such as a headset or a mouse, a digital multimedia broadcasting (DMB) chip for receiving and processing DMB signals, buttons for receiving user operations, a microphone for receiving a user's voice or other sound to be converted into audio data, a capture unit (e.g., a camera) for capturing still images or video under the user's control, various sensors, and the like.

FIG. 5 is a block diagram of a server according to an embodiment.

Referring to FIG. 5, the server 200 may include a communication interface 210, a memory 220, and a processor 230.

The communication interface 210 may communicate with an external device using a wired or wireless communication method.

The communication interface 210 may be connected to an external device wirelessly, for example via a wireless local area network (LAN) or Bluetooth. The communication interface 210 may also be connected to an external device using Wi-Fi, ZigBee, infrared (IrDA), or the like. The communication interface 210 may include a connection port for wired communication.

The server 200 may transmit encoded image data to the electronic device 100 via the communication interface 210. The image data transmitted to the electronic device 100 may include information associated with a filter set for achieving improved image restoration. The information associated with the filter set may be obtained by the processor 230.

The memory 220 may store various programs and data for operating the server 200. At least one command may be stored in the memory 220, and the processor 230 may perform the above-described operations by executing the commands stored in the memory 220.

The memory 220 may store a trained AI model. Specifically, a trained AI model for downscaling image data may include a plurality of layers for downscaling. Each of the plurality of layers may include a plurality of filters, and each of the plurality of filters may include a plurality of parameters.

A trained AI model for upscaling image data may also be stored in the memory 220. The trained AI model for upscaling image data may include a plurality of layers for upscaling. Each of the plurality of layers may include a plurality of filters, and each of the plurality of filters may include a plurality of parameters.

The AI model for downscaling or for upscaling may be a convolutional neural network. The number of AI models for upscaling may be smaller than the number of AI models for downscaling.

Each of the plurality of filters may be applied to an entire frame of the image data. Specifically, according to an embodiment, the image data may be upscaled using filters that apply the same parameters to every frame, or using filters that apply different parameters to each frame.

The memory 220 may store a plurality of trained filter sets. The filter sets stored in the memory 220 may include filter sets that are applied to the AI model for upscaling image data. Specifically, each filter set may include a plurality of parameters, and the memory 220 may contain the plurality of trained parameters. The filter sets may be distinguished from one another by index information assigned to each of them. The filter sets stored in the memory 220 may be trained in advance to minimize the difference between the original image data and the image data obtained by downscaling and then upscaling the input image data. In this case, the difference between the upscaled image data and the original image data may be identified using a general similarity analysis method (e.g., peak signal-to-noise ratio (PSNR) or structural similarity (SSIM)). The input image data may be image data of various categories. An exemplary embodiment of the filter sets is described in more detail with reference to FIG. 13.

The operations of the AI model may be performed under the control of the processor 230.

The processor 230 may obtain a plurality of upscaled image data by inputting the downscaled image data into AI models to which each of the plurality of filter sets is applied. The processor 230 may obtain the plurality of upscaled image data simultaneously by running a plurality of AI models in parallel. Alternatively, the processor 230 may obtain one upscaled image data using an AI model to which one of the filter sets is applied, and then obtain the remaining upscaled image data by sequentially changing the applied filter set.

Among the plurality of obtained upscaled image data, the processor 230 may identify the upscaled image data that has the smallest difference (e.g., the smallest loss) from the original image data. The processor 230 may compare the original image data with each obtained upscaled image data using a general similarity analysis method (e.g., PSNR or SSIM).
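As a rough illustration of this comparison step, the following Python sketch scores each candidate against the original with PSNR and returns the index of the best one. It is a minimal sketch under stated assumptions: the helper names (`mse`, `psnr`, `select_filter_index`) are hypothetical, images are plain nested lists of equal size, and a peak value of 255 is assumed for 8-bit samples. The source does not fix which similarity measure or implementation is used.

```python
import math
from typing import Dict, List, Tuple

Image = List[List[float]]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of identical size."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def psnr(original: Image, candidate: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(original, candidate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)


def select_filter_index(original: Image, candidates: Dict[int, Image]) -> Tuple[int, float]:
    """Return (filter index, PSNR) of the candidate closest to the original."""
    best = max(candidates, key=lambda index: psnr(original, candidates[index]))
    return best, psnr(original, candidates[best])
```

Here `candidates` maps each filter-set index to the image upscaled with that filter set, so the returned index is the one that would be signalled to the receiving device.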

The processor 230 may encode the downscaled image together with information (e.g., metadata) associated with the filter set that was applied to produce the identified image data. The information associated with the filter set may be index information of the filter set. The processor 230 may include the information associated with the filter set in supplemental enhancement information (SEI) data, which is appended to the bitstream generated by encoding the downscaled image. The SEI data may provide information associated with the resolution, bit rate, frame rate, and the like of the image data.
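To make the idea of carrying the filter-set index as side information concrete, here is a small Python sketch that serializes an AI flag and a filter index into two bytes and parses them back. The two-byte layout, the `NO_INDEX` sentinel, and the function names are purely hypothetical illustrations; they are not the SEI payload syntax of any video coding standard, and the source does not specify a byte layout.

```python
import struct
from typing import Optional, Tuple

NO_INDEX = 0xFF  # hypothetical sentinel meaning "no filter index signalled"


def pack_side_info(ai_flag: bool, filter_index: Optional[int]) -> bytes:
    """Serialize the flag and index into an assumed two-byte payload."""
    if filter_index is None:
        index_byte = NO_INDEX
    elif 0 <= filter_index < NO_INDEX:
        index_byte = filter_index
    else:
        raise ValueError("filter_index must be in the range 0..254")
    return struct.pack("BB", int(ai_flag), index_byte)


def unpack_side_info(payload: bytes) -> Tuple[bool, Optional[int]]:
    """Parse the assumed two-byte payload back into (flag, index)."""
    flag, index_byte = struct.unpack("BB", payload)
    return bool(flag), (None if index_byte == NO_INDEX else index_byte)
```

A receiver that finds no index in the payload could fall back to a default upscaling path, although that behaviour is also an assumption here.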

The processor 230 may transmit the encoded image data to an external electronic device via the communication interface 210. Information associated with the filter set for improved image restoration may be included with the transmitted image data.

As described above, because the server identifies an improved upscaling filter set in advance through multiple upscaling passes during the encoding process, the restoration process in the electronic device can be simplified. A high compression rate can therefore be achieved in a video streaming environment, allowing high-quality images to be transmitted.

FIG. 6 is a flowchart of an image encoding operation of a server according to an embodiment.

Referring to FIG. 6, at operation S610 the server may obtain downscaled image data by inputting original image data into an AI downscaling model for downscaling image data. The downscaling compression rate may be set in advance. For example, referring to FIG. 2, an image of size 2N×2M may be compressed at a compression rate of 1/4 into an image of size N×M. However, the compression rate is not limited thereto.

At operation S620, the server may obtain a plurality of upscaled image data by inputting the downscaled image data into a plurality of AI upscaling models, each of which applies one of a plurality of filter sets trained to upscale downscaled image data.

The plurality of filter sets may be trained in advance to upscale image data and stored in the server. In addition, a filter set identical to one of the plurality of filter sets stored in the server may be stored in the external electronic device.

The server may obtain the plurality of upscaled image data using a plurality of AI upscaling models, each applying one of the plurality of filter sets. Alternatively, the plurality of upscaled image data may be obtained sequentially, for example by obtaining one upscaled image data using an AI upscaling model to which one of the filter sets is applied, and then obtaining another upscaled image data by changing the filter set applied to the AI upscaling model.

At operation S630, the server may encode the downscaled image data together with information associated with the filter set of the AI upscaling model that outputs, among the plurality of upscaled image data, the image data having the smallest difference from the original image data.

Specifically, the server may compare each of the plurality of upscaled image data with the original image data using a similarity analysis method, and identify the upscaled image data having the smallest difference (e.g., the smallest loss value) from the original image data. The server may identify information associated with the filter set applied to the AI upscaling model that output the identified upscaled image data, and may encode the downscaled image data together with the identified information. The information associated with the filter set may be included in the SEI and may include index information of the filter set. The encoded image data is obtained by encoding the downscaled image data, whose size has been reduced, and may therefore be smaller than the original image data.

At operation S640, the server may transmit the encoded image data to the external electronic device. Because the encoded image data includes the information associated with the filter set, the external electronic device can perform decoding and upscaling using the improved filter set. Accordingly, high-quality images can be transmitted in a streaming environment.

FIG. 7 is a flowchart of an image encoding operation of a server according to an embodiment. Operations S710 to S740 of FIG. 7 may correspond to the operations of the AI encoder 22 described with reference to FIG. 2. For ease of explanation, however, they are described as operations of the server.

Referring to FIG. 7, at operation S710 the server may input an image content source 71 into the AI downscaling model. For example, the image content source 71 may be original image data of size 2N×2M. The AI downscaling model may include a plurality of convolutional filters and may already have been trained.

The server may obtain a downscaled image 72 by passing the image content source 71 through the AI downscaling model. The downscaled image 72 may have a size of N×M, which is 1/4 the size of the original image content source 71. The size of an image may correspond to its resolution. The compression rate of 1/4 is merely exemplary, and an improved compression rate may be obtained through training.

For example, the AI downscaling model may obtain a feature map of each frame by passing the input image content source 71 through a convolutional layer, and obtain a compressed image by passing the obtained feature map through a pooling layer. The pooling layer may divide the input feature map into a predetermined grid and output a feature map assembled from the representative values obtained for the respective grid cells. The feature map output from the pooling layer may be smaller than the feature map input to the pooling layer. The representative value of each grid cell may be the maximum value contained in the cell or the average value of the cell.

The AI downscaling model may compress an image by repeating the convolutional layer and pooling layer operations. The compression rate increases as the number of convolutional and pooling layers increases, and also as the size of the grid cell from which the pooling layer obtains each representative value increases.
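The pooling behaviour described above (split the feature map into grid cells and keep one representative value per cell) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `pool` and its parameters are hypothetical, the feature map is a single-channel nested list, and the grid is assumed to be square and to divide the map evenly.

```python
from typing import List

FeatureMap = List[List[float]]


def pool(feature_map: FeatureMap, grid: int, mode: str = "max") -> FeatureMap:
    """Reduce each grid x grid cell to its maximum or its average."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % grid or cols % grid:
        raise ValueError("grid must divide the feature map evenly")
    pooled: FeatureMap = []
    for r in range(0, rows, grid):
        pooled_row = []
        for c in range(0, cols, grid):
            # Collect every value inside the current grid cell.
            cell = [feature_map[i][j]
                    for i in range(r, r + grid)
                    for j in range(c, c + grid)]
            pooled_row.append(max(cell) if mode == "max" else sum(cell) / len(cell))
        pooled.append(pooled_row)
    return pooled
```

With a 2×2 grid a 4×4 map shrinks to 2×2, while a 4×4 grid shrinks it to a single value, which matches the statement that a larger grid cell yields a higher compression rate.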

At operation S750, the server may transmit, to a standard encoder, the raw video data of size N×M obtained by downscaling at the 1/4 compression rate, together with an AI flag 73. The N×M raw video data downscaled at the 1/4 compression rate may be the same as the downscaled image 72. The AI flag 73 may indicate whether AI downscaling was performed. If the AI flag 73 is set to 1, it indicates that AI downscaling was performed.

At operation S720, the server may determine whether a multi-AI-filter option is used. If the server determines at operation S720 that the multi-AI-filter option is not used (S720-No), then at operation S750 the server may transmit a filter index value of NULL to the standard encoder. A filter index of NULL means that no filter index is used.

If the server determines at operation S720 that the multi-AI-filter option is used (S720-Yes), then at operation S730 the server may input the downscaled image into a plurality of stored AI upscaling models. The plurality of AI upscaling models may be obtained by applying each of the plurality of filter sets to the AI model. Specifically, each of the plurality of filter sets may include a plurality of layers, as shown in FIGS. 8 and 17. Each layer may be a convolutional filter and may already have been trained.

For ease of explanation, FIG. 7 illustrates the use of n AI upscaling models to which filter sets with indices 0, 1, 2, and n are applied, but in other embodiments the number of AI upscaling models and the index information may differ. A higher compression rate can be achieved by performing restoration in advance with the multi-AI-filter function.

At operation S740, the server may select the filter index of the AI upscaling model that outputs the upscaled image having the smallest difference from the image content source. Specifically, at operation S730 the server may input the downscaled image 72 obtained from the AI downscaling model into each AI upscaling model and obtain an upscaled image from each of them.

The server may compare each upscaled image obtained from the respective AI upscaling model with the image content source 71, which is the original image, and identify the AI upscaling model that outputs the upscaled image having the smallest difference from the original image content source (e.g., the upscaled image with the smallest loss). At operation S750, the server may transmit index information 74 of the identified AI upscaling model to the standard encoder.

The server may perform encoding using the transmitted N×M raw video data downscaled at 1/4, the AI flag 73, and the filter index information 74. The server may obtain a bitstream by encoding the image data and may include the information in an SEI header.

The server may transmit the bitstream obtained through the encoding operation, together with the information 75 included in the SEI header, to the electronic device 100.
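The flow of FIG. 7 (downscale, optionally try every filter set, keep the best index, and hand the flag and index to the encoder) can be summarized in a toy Python sketch. All names here are hypothetical, and the models are deliberately simplistic stand-ins: 2×2 averaging replaces the AI downscaling model, nearest-neighbour upscaling with a per-set gain replaces the AI upscaling models, a sum of squared differences replaces PSNR/SSIM, and `None` plays the role of the NULL filter index. The image is assumed to have even width and height.

```python
from typing import Dict, List, Optional

Image = List[List[float]]


def downscale_2x(image: Image) -> Image:
    """Stand-in for the AI downscaling model: average each 2x2 block (cf. S710)."""
    return [[(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]


def upscale_2x(image: Image, gain: float) -> Image:
    """Stand-in for an AI upscaling model: nearest-neighbour 2x with a gain (cf. S730)."""
    out: Image = []
    for row in image:
        wide = [value * gain for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def sum_sq_diff(a: Image, b: Image) -> float:
    """Simple difference measure used in place of PSNR or SSIM."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def prepare_for_encoder(source: Image, filter_sets: Dict[int, float],
                        use_multi_filter: bool) -> dict:
    """Return what would be handed to the standard encoder (cf. S750)."""
    small = downscale_2x(source)
    index: Optional[int] = None  # S720-No: no filter index is signalled
    if use_multi_filter:  # S720-Yes: try every filter set and keep the best (cf. S740)
        index = min(filter_sets,
                    key=lambda i: sum_sq_diff(source, upscale_2x(small, filter_sets[i])))
    return {"downscaled": small, "ai_flag": 1, "filter_index": index}
```

In a real system the returned dictionary would correspond to the raw downscaled video, the AI flag 73, and the filter index information 74 that the standard encoder consumes.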

FIG. 8 is a diagram of filter sets according to an embodiment.

Referring to FIG. 8, each of the plurality of filter sets may include a plurality of layers. The plurality of filter sets may be stored identically in the server and in the electronic device.

For example, a filter set 810 may include n layers 811, 812, ..., and 81n. Each of the plurality of layers may include a plurality of convolutional filters. Each convolutional filter may be a three-dimensional filter of width N × height M × channel C (i.e., N×M×C), and activation-function bias terms 1, 2, ..., and n may be included between the filters.

For ease of explanation, FIG. 8 illustrates the filters of every layer as being defined as N×M×C, but in other embodiments each layer may have a different filter size (N×M) and a different number of channels (C). In addition, each filter bank may have a different number of layers.
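One way to picture the filter bank layout described for FIG. 8 is the sketch below. The class names `ConvLayer` and `FilterBank`, and the choice of one bias term per filter, are assumptions made for illustration only; the specification does not define a storage format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """One layer of a filter bank: num_filters filters of size width x height x channels."""
    width: int
    height: int
    channels: int
    num_filters: int

    def parameter_count(self) -> int:
        # Assumption: each filter carries one bias term in addition to its weights.
        weights_per_filter = self.width * self.height * self.channels
        return self.num_filters * (weights_per_filter + 1)


@dataclass
class FilterBank:
    """A filter bank identified by an index; banks may differ in depth and filter size."""
    index: int
    layers: List[ConvLayer]

    def parameter_count(self) -> int:
        return sum(layer.parameter_count() for layer in self.layers)


# Two hypothetical banks with different layer counts and filter sizes.
bank_a = FilterBank(index=0, layers=[ConvLayer(3, 3, 3, 16), ConvLayer(3, 3, 16, 3)])
bank_b = FilterBank(
    index=1,
    layers=[ConvLayer(5, 5, 3, 8), ConvLayer(3, 3, 8, 8), ConvLayer(1, 1, 8, 3)],
)
```

Because the server and the electronic device hold the same banks, only `index` needs to travel with the video; the parameters themselves stay local.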

FIG. 9 is a block diagram of an electronic device for training and using an AI model, according to an embodiment.

Referring to FIG. 9, the processor 900 may include at least one of a training unit 910 and an acquisition unit 920. The processor 900 of FIG. 9 may correspond to the processor 120 of FIG. 3 and the processor 230 of FIG. 5.

The training unit 910 may generate or train a model for generating downscaling filters and upscaling filters. Using collected training data, the training unit 910 may generate an AI model for generating downscaling filters and upscaling filters for image data, that is, a trained model having criteria for generating such filters. The training unit 910 may correspond to the training stage of the AI model.

For example, the training unit 910 may generate, train, or update a model for predicting filter generation, using as input data the original image data and image data obtained by downscaling and upscaling the original image data. Specifically, if the purpose of the model is to enhance image quality, the training unit 910 may generate, train, or update a model for generating filters that downscale or upscale the original image data and the image data obtained by downscaling and upscaling the original image data.

The acquisition unit 920 may obtain various kinds of information by using predetermined data as input data for the trained model.

For example, when an image is input, the acquisition unit 920 may use the input image and the trained filters to obtain (or identify, estimate, or infer) information associated with the input image.

At least part of the training unit 910 and at least part of the acquisition unit 920 may be implemented as a software module, or manufactured in the form of one or more hardware chips and mounted on the electronic device 100. For example, at least one of the training unit 910 and the acquisition unit 920 may be manufactured as a hardware chip dedicated to AI operations, or as part of an existing general-purpose processor (e.g., a central processing unit or an application processor) or a graphics processor (e.g., a graphics processing unit) to be mounted on various kinds of electronic devices. The hardware chip dedicated to AI operations may be a processor specialized for probability computation; it has higher parallel processing performance than a conventional general-purpose processor and can therefore quickly perform arithmetic operations in AI fields such as machine learning. When the training unit 910 and the acquisition unit 920 are implemented as software modules (or program modules containing instructions), the software modules may be stored in a non-transitory computer-readable medium. In this case, the software modules may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the software modules may be provided by the OS and the others by a predetermined application.

The training unit 910 and the acquisition unit 920 may be mounted on a single electronic device such as a server, or each may be mounted on a separate electronic device. For example, one of the training unit 910 and the acquisition unit 920 may be included in an electronic device such as a television, and the other may be included in an external server. The model information built by the training unit 910 may be provided to the acquisition unit 920 in a wired or wireless manner, and the data input to the training unit 910 may be provided to the training unit 910 as additional training data.

FIGS. 10A and 10B are block diagrams of a training unit and an acquisition unit according to embodiments.

Referring to FIG. 10A, the training unit 910 according to an embodiment may include a training data acquisition unit 910-1 and a model training unit 910-4. The training unit 910 may optionally further include a training data preprocessor 910-2, a training data selector 910-3, and a model evaluation unit 910-5.

The training data acquisition unit 910-1 may obtain training data for the model. According to an embodiment, the training data acquisition unit 910-1 may obtain data associated with an input image as training data. Specifically, the training data acquisition unit 910-1 may obtain, as training data, the original image data (the input image), the downscaled original image data, and the image data that is subsequently upscaled.

The model training unit 910-4 may learn how to correct the difference between the image processing result obtained using the training data and the information associated with the actual input image. For example, the model training unit 910-4 may train the AI model through supervised learning, which uses at least part of the training data as a criterion. The model training unit 910-4 may train the AI model through unsupervised learning, in which the model trains itself on the training data without any guidance. The model training unit 910-4 may train the AI model through reinforcement learning, which uses feedback on whether a result determined through training is correct. The model training unit 910-4 may also train the AI model using a learning algorithm such as error back-propagation or gradient descent.

After the AI model has been trained, the model training unit 910-4 may store the trained AI model. In this case, the model training unit 910-4 may store the trained AI model in a server (e.g., an AI server). The model training unit 910-4 may also store the trained AI model in the memory of an electronic device connected via a wired or wireless network.

The training data preprocessor 910-2 may preprocess the obtained data so that it can be used in training to generate filters to be applied to a plurality of feature maps. The training data preprocessor 910-2 may format the obtained data in a predetermined format so that the model training unit 910-4 can use it in training to generate the filters to be applied to the feature maps.

The training data selector 910-3 may select data for training from the data obtained by the training data acquisition unit 910-1 or the data preprocessed by the training data preprocessor 910-2. The selected training data may be provided to the model training unit 910-4. The training data selector 910-3 may select the training data from the obtained or preprocessed data according to predetermined selection criteria. The training data selector 910-3 may also select the training data according to criteria predetermined through training by the model training unit 910-4.

The training unit 910 may further include the model evaluation unit 910-5 in order to improve the recognition results of the AI model.

The model evaluation unit 910-5 may input evaluation data into the AI model and, if the recognition result output for the evaluation data does not satisfy predetermined criteria, have the model training unit 910-4 perform training. In this case, the evaluation data may be predefined data for evaluating the AI model.

For example, if, among the recognition results of the trained AI model for the evaluation data, the number or ratio of evaluation data items with incorrect recognition results exceeds a predetermined threshold, the model evaluation unit 910-5 may evaluate that the recognition results do not satisfy the predetermined criteria.

When there are a plurality of trained AI models, the model evaluation unit 910-5 may evaluate whether each trained AI model satisfies the predetermined criteria and identify a model that satisfies them as the final AI model. In this case, when several AI models satisfy the predetermined criteria, the model evaluation unit 910-5 may identify, as the final AI model, any one predetermined model or a preset number of models taken in descending order of evaluation score.
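The evaluation rule described above can be sketched as follows. This is an illustration under stated assumptions: the evaluation score is taken to be one minus the ratio of incorrect results, and the helper names `error_ratio` and `select_final_models` are hypothetical.

```python
def error_ratio(predictions, expected):
    """Fraction of evaluation items whose recognition result is incorrect."""
    wrong = sum(1 for p, e in zip(predictions, expected) if p != e)
    return wrong / len(expected)


def select_final_models(error_ratios, threshold, keep=1):
    """Pick final models from several trained candidates.

    error_ratios maps a model name to its error ratio on the evaluation data.
    A model fails the criterion when its ratio exceeds the threshold.
    Passing models are ranked by score (assumed here to be 1 - error ratio),
    highest first, and the first `keep` of them are returned.
    """
    passing = [name for name, ratio in error_ratios.items() if ratio <= threshold]
    passing.sort(key=lambda name: 1.0 - error_ratios[name], reverse=True)
    return passing[:keep]


ratios = {"model_a": 0.25, "model_b": 0.5, "model_c": 0.0}
final = select_final_models(ratios, threshold=0.3, keep=1)
top_two = select_final_models(ratios, threshold=0.3, keep=2)
```

An empty result would correspond to the case in which no model satisfies the criterion and further training is requested.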

Referring to FIG. 10B, the acquisition unit 920 may include an input data acquisition unit 920-1 and a provider 920-4.

The acquisition unit 920 may optionally further include an input data preprocessor 920-2, an input data selector 920-3, and a model update unit 920-5.

The input data acquisition unit 920-1 may obtain the input original image data and may obtain a plurality of filters according to the purpose of the image processing. The plurality of filters may be filters for downscaling image data and filters for upscaling the downscaled image data. The provider 920-4 may obtain the processing result of the input image by applying the input data obtained by the input data acquisition unit 920-1 to the trained AI model as an input value. The provider 920-4 may also obtain the processing result of the input image by applying, as an input value to the AI model, data selected by the input data preprocessor 920-2 or the input data selector 920-3.

For example, the provider 920-4 may obtain (or estimate) the processing result of the input image by applying, to the trained AI model, the input original image data obtained by the input data acquisition unit 920-1, the filters for downscaling the original image data, and the filters for upscaling the downscaled image data.

The acquisition unit 920 may further include the input data preprocessor 920-2 and the input data selector 920-3 in order to improve the recognition results of the AI model or to save the resources or time needed to provide the recognition results.

The input data preprocessor 920-2 may preprocess the obtained data so that it can be used as input to the first AI model and the second AI model. The input data preprocessor 920-2 may format the obtained data in a predefined format so that the provider 920-4 can use it to obtain an improved compression rate.

The input data selector 920-3 may select data for state determination from the data obtained by the input data acquisition unit 920-1 and the data preprocessed by the input data preprocessor 920-2. The selected data may be provided to the provider 920-4. The input data selector 920-3 may select some or all of the obtained or preprocessed data according to predetermined criteria for state determination. The input data selector 920-3 may also select data according to criteria predetermined through training by the model training unit 910-4.

The model update unit 920-5 may control the AI model to be updated based on an evaluation of the recognition results provided by the provider 920-4. For example, the model update unit 920-5 may provide the image processing results provided by the provider 920-4 to the model training unit 910-4, thereby requesting the model training unit 910-4 to further train or update the AI model.

FIG. 11 is a diagram of a method of training a filter bank according to an embodiment.

According to an embodiment, a convolutional neural network (CNN) model may include three-dimensional convolution filters of width × height × channels and activation function layers.

The parameters of the convolution filters are the target of training, and improved parameters suited to a given purpose can be obtained through training. The AI downscaling model and the AI upscaling model aim to provide an improved compression rate such that the image data obtained by downscaling and then upscaling the original image data is as similar as possible to the original image data.

The training may be performed by a server or by an electronic device; for ease of explanation, it is described below as being performed by the server.

Referring to FIG. 11, according to the training method of the present invention, the server may obtain compressed image data 1130 by downscaling the original image data 1110 using X convolution filters 1120. FIG. 11 illustrates original image data of size 2N×2M being downscaled into compressed image data of size N×M, but the present invention is not limited thereto.

The server may obtain restored image data 1150 by upscaling the obtained compressed image data 1130 using Y convolution filters 1140. The number Y may be smaller than the number X.

The server may compare the restored image data 1150 with the original image data 1110 and train the parameters of each of the filters 1120 and the filters 1140 so as to reduce the loss. The server may compute the loss value using a similarity analysis method (e.g., PSNR or SSIM).
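As an illustration of one of the similarity measures named above, the following sketch computes PSNR between original and restored image data. It assumes 8-bit pixel values (peak value 255) and images stored as lists of rows; the function name `psnr` and this data layout are choices made for the example, not part of the specification.

```python
import math


def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_error = 0.0
    count = 0
    for row_o, row_r in zip(original, restored):
        for pixel_o, pixel_r in zip(row_o, row_r):
            squared_error += (pixel_o - pixel_r) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [[100, 110], [120, 130]]
close = [[101, 110], [120, 129]]
far = [[60, 150], [90, 170]]
psnr_close = psnr(original, close)
psnr_far = psnr(original, far)
```

A higher PSNR means the restored data is closer to the original, so a training loop could, for instance, minimize the mean squared error (or the negative PSNR) with respect to the parameters of both the downscaling and the upscaling filters.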

The AI downscaling model and the AI upscaling model to which the trained parameters are applied can compress or restore image data through improved scaling operations.

FIG. 12 is a diagram of the structure of streaming data according to an embodiment. FIG. 12 illustrates a detailed exemplary embodiment in which information associated with the filter bank is stored during the encoding operation 24 of FIG. 2.

Referring to FIG. 12, raw video data downscaled to 1/4, an AI flag, and filter index information 1201 may be input to a standard encoder 1202. The server may obtain a video stream 1204 by encoding the input raw video data downscaled to 1/4. The video stream may refer to a video bitstream.

The server may include the input AI flag and filter index information 1203 in the SEI 1205 contained in the video stream 1204. The server may divide the video stream 1204 into N video chunks 1206 and copy the SEI 1205 to produce N SEIs 1207.

The server may add one of the generated SEIs to each video chunk and store the plurality of video chunks 1208, each with its own SEI, in a streaming storage device 1209.
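The chunking step described for FIG. 12 can be pictured with the sketch below. It is a simplification under stated assumptions: the stream is cut into equal-sized byte ranges (a real encoder would cut at decodable boundaries), the SEI is modeled as a plain dictionary, and the name `split_with_sei` is hypothetical.

```python
import copy


def split_with_sei(video_stream: bytes, sei: dict, num_chunks: int):
    """Split a stream into num_chunks pieces and attach a copy of the SEI to each."""
    chunk_size = -(-len(video_stream) // num_chunks)  # ceiling division
    chunks = []
    for i in range(num_chunks):
        payload = video_stream[i * chunk_size:(i + 1) * chunk_size]
        # Each chunk gets its own independent copy of the SEI information.
        chunks.append({"sei": copy.deepcopy(sei), "payload": payload})
    return chunks


stream = bytes(range(10))
sei = {"ai_flag": 1, "filter_index": 2}
chunks = split_with_sei(stream, sei, 3)
```

Carrying the AI flag and filter index in every chunk means that a receiver joining the stream at any chunk can still select the matching filter bank.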

Although not shown, the plurality of stored video chunks 1208 may be transmitted to the electronic device.

FIG. 13 is a diagram of a method of training filter banks according to an embodiment.

Referring to FIG. 13, each of the plurality of filter banks may be trained with a different category of image data. Specifically, a first filter bank (filter 1) may be trained with all of the images in the training data set (the global data set).

In addition, each of the other filter banks may be trained with image data of a single category in the training data set (such as movies, sports, music videos, documentaries, or news).

Once each filter bank has been trained on its image data category as described above, a filter bank may be selected based on the category of the input original image data.

However, the present invention is not limited thereto. Regardless of the category of the input original image data, the plurality of filter banks may each be applied and the filter bank giving the improved result may then be selected.

FIG. 13 describes training each filter bank with training data sets classified by image data category. However, the present invention is not limited thereto, and the training data sets may be classified according to various criteria.

FIG. 14 is a flowchart of an image decoding operation of an electronic device according to an embodiment.

Referring to FIG. 14, at operation S1410, the electronic device may receive image data and information associated with a filter bank to be applied to an AI model for upscaling the image data. The information associated with the filter bank may be included with the image data. The electronic device may receive the image data and the information associated with the filter bank from an external server.

At operation S1420, the electronic device may decode the received image data. The received image data may be image data encoded by the server, and the encoded image data may be obtained by encoding the downscaled original image data.

At operation S1430, the electronic device may upscale the decoded image data by inputting it into a first AI model obtained based on the information associated with the filter bank. A plurality of filter banks may be pre-stored in the electronic device. The electronic device may obtain the first AI model for upscaling the image data by using, among the plurality of filter banks, the filter bank corresponding to the received information. The electronic device may then upscale the decoded image data using the obtained first AI model.

When the plurality of filter banks are not stored in the electronic device, the electronic device may transmit the received information associated with the filter bank to a second external server. The second external server may store parameter information for the plurality of filter banks. The second external server may be the same as or different from the external server that transmits the image data to the electronic device.

Upon receiving, from the second external server, the parameter information of the filter bank corresponding to the transmitted information, the electronic device may obtain the first AI model by applying the received parameter information.

At operation S1440, the electronic device may output the upscaled image data. Specifically, if the electronic device is a display apparatus having a display, it may control the display to show the upscaled image data. If the electronic device has no display, it may transmit the upscaled image data to an external display apparatus for display. In other words, the electronic device may provide the upscaled image data for output via its own display, or provide it to an external display apparatus for output via the display of that apparatus.

As described above, the electronic device according to an embodiment may receive, from an external server, encoded image data and information associated with an improved filter bank, thereby reducing the time and resources consumed in restoring the image data. Accordingly, a high image compression rate can be achieved, and a high-quality image can be compressed into smaller image data and transmitted.

FIGS. 15 and 16 are diagrams of an image decoding operation of an electronic device according to embodiments.

FIG. 15 describes the configuration of a device for performing an AI decoding operation. Operation 1501 of FIG. 15 may correspond to operation 26 of FIG. 2, and operations 1503, 1504, 1505, and 1509 of FIG. 15 may correspond to operation 28 of FIG. 2.

Referring to FIG. 15, the electronic device may obtain compressed raw data, an AI flag, and filter index information 1502 by decoding the encoded image using a standard decoder 1501. The electronic device may transmit the obtained raw data and information to an AI information controller 1503, which determines whether AI encoding was performed and whether index information is present.

If the AI flag indicates that AI encoding was not performed (e.g., AI flag == NULL), the raw data may be transmitted to the display 1509 without upscaling, so that the raw data of size N×M is displayed.

If the AI flag indicates that AI encoding was performed (e.g., AI flag == 1), the AI information controller 1503 may transmit the raw data 1510 of size N×M to the AI upscaling model 1507.

If the index information indicates that AI encoding and the multiple AI filter option were used (e.g., index information != NULL), the AI information controller 1503 may transmit the index information to an index controller 1504. Upon receiving the index information, the index controller 1504 may transmit a request 1511 to the memory 1505 to load the parameters of the filters matching the index information into the AI upscaling model 1507. When loading 1506 of the parameters matching the index information is complete, the AI upscaling model 1507 may obtain raw data 1508 of size 2N×2M by upscaling the transmitted raw data 1510 of size N×M using the loaded parameters.

If the index information indicates that AI encoding was used without the multiple AI filter option (e.g., index information == NULL), the AI upscaling model 1507 may obtain raw data 1508 of size 2N×2M by upscaling the raw data 1510 of size N×M using default upscaling parameters.

The electronic device may transmit the obtained raw data 1508 of size 2N×2M to the display 1509 for display.

For ease of explanation, FIG. 15 illustrates the standard decoder 1501, the AI information controller 1503, the index controller 1504, and the AI upscaling model 1507 as separate components, but in other embodiments the operations of each component may be performed by one or more processors.

FIG. 16 describes the AI decoding operation in detail. Operation S1601 of FIG. 16 may correspond to operation 26 of FIG. 2, and operations S1620 to S1640 of FIG. 16 may correspond to operation 28 of FIG. 2.

Referring to FIG. 16, the electronic device may receive, from the server 200, a bitstream together with a filter index and an AI flag contained in an SEI header 1601. By decoding the received bitstream, filter index, and AI flag 1601 with the standard decoder at operation S1610, the electronic device may obtain raw video data of size N×M and the SEI 1602.

At operation S1620, the electronic device may determine whether the AI flag stored in the SEI indicates that AI encoding was performed (e.g., AI flag == 1). If the electronic device determines at operation S1620 that AI encoding was not performed (S1620-No), it may display the raw video data 1604 of size N×M at operation S1650.

If the electronic device determines at operation S1620 that AI encoding was performed (S1620-Yes), it may determine at operation S1630 whether the multiple AI filter option was used (e.g., whether filter index != NULL). If the electronic device determines at operation S1630 that the multiple AI filter option was used (S1630-Yes), it may select the filter bank corresponding to the filter index information. At operation S1640, the electronic device may upscale the raw video data of size N×M using the AI upscaling model obtained by applying the selected filter bank. At operation S1650, the electronic device may display the upscaled video data of size 2N×2M.

For ease of explanation, FIG. 16 illustrates n filter banks with indices 0, 1, 2, and 3 as being pre-stored in the electronic device, but the index information and the number of filter banks are not limited thereto.

If the electronic device determines at operation S1630 that the multiple AI filter option was not used (S1630-No), it may perform AI upscaling with filter bank 0 applied at operation S1660. Filter bank 0 may be a default filter bank. At operation S1650, the electronic device may display the upscaled video data 1605 of size 2N×2M.
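The branching of operations S1620, S1630, S1640, and S1660 can be summarized in the following sketch. It assumes that the AI flag is 1 when AI encoding was performed, that a missing filter index is represented by `None`, and that the upscaling factor is 2 in each dimension as in the figures; the names `choose_filter_bank` and `output_size` are hypothetical.

```python
DEFAULT_FILTER_INDEX = 0  # filter bank 0 is described as the default bank


def choose_filter_bank(ai_flag, filter_index, stored_indices):
    """Return the filter bank index to apply, or None when no upscaling is done."""
    if ai_flag != 1:
        # S1620-No: the data was not AI encoded, so display it as is.
        return None
    if filter_index is None:
        # S1630-No: the multiple AI filter option was not used.
        return DEFAULT_FILTER_INDEX
    if filter_index not in stored_indices:
        raise KeyError(f"filter bank {filter_index} is not stored on this device")
    # S1630-Yes: use the bank named by the index information.
    return filter_index


def output_size(n, m, bank_index):
    """Displayed size: N x M without upscaling, 2N x 2M with it."""
    return (n, m) if bank_index is None else (2 * n, 2 * m)


stored = {0, 1, 2, 3}
no_ai = choose_filter_bank(None, None, stored)
default_bank = choose_filter_bank(1, None, stored)
indexed_bank = choose_filter_bank(1, 2, stored)
```

The `KeyError` branch marks the point at which a device without the requested bank could instead request its parameters from a second external server, as described for FIG. 14.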

As described above, the electronic device according to an embodiment may receive, from an external server, encoded image data and information associated with an improved filter bank, thereby reducing the time and resources consumed in restoring the image data. Accordingly, a high image compression rate can be achieved, and a high-quality image can be compressed into smaller image data and transmitted in a streaming environment.

FIG. 17 is a diagram of an image upscaling operation of the electronic device according to an embodiment. The electronic device may receive input image data 1705 and filter index information 1701 from the server.

The electronic device may select, from among a plurality of filter sets stored in the memory 1702, the filter set 1703 that matches the filter index information 1701. The plurality of filter sets stored in the memory 1702 may be sets of filters trained using different training data. Each filter set may be a filter set applied to a convolutional neural network (CNN) model. Each filter set may include a plurality of layers and a plurality of parameters applied to a bias 1704. Each of the plurality of layers may include a plurality of filters.

The electronic device may obtain an AI upscaling model by applying each parameter of the selected filter set. Referring to FIG. 17, an AI upscaling model including y convolution filters 1706 may be obtained, and the number of convolution filters may vary according to the selected filter set. The electronic device may obtain a restored output image 1707 by inputting the input image data 1705 into the obtained AI upscaling model. The restored output image 1707 is obtained by upscaling the input image data 1705.
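To make the idea of "obtaining the model by applying the parameters of the selected filter set" concrete, the following pure-Python sketch builds an upscaler from a filter set. Everything in it is an assumption made for illustration: the data layout (`"filters"` as `[C_out][C_in][k][k]`, `"bias"` as `[C_out]`), the ReLU between layers, and in particular the final depth-to-space step that turns four output channels into a 2N×2M image. The source does not specify how the CNN produces the larger image.

```python
from typing import Callable, Dict, List

Channel = List[List[float]]   # H x W
Tensor = List[Channel]        # C x H x W
Layer = Dict[str, list]       # {"filters": [C_out][C_in][k][k], "bias": [C_out]}


def conv2d(x: Tensor, layer: Layer, relu: bool) -> Tensor:
    """Same-size convolution with zero padding and one bias per output channel."""
    height, width = len(x[0]), len(x[0][0])
    out: Tensor = []
    for kernels, bias in zip(layer["filters"], layer["bias"]):
        k = len(kernels[0])
        pad = k // 2
        channel = [[bias] * width for _ in range(height)]
        for c_in, kernel in enumerate(kernels):
            for i in range(height):
                for j in range(width):
                    acc = 0.0
                    for di in range(k):
                        for dj in range(k):
                            y, xx = i + di - pad, j + dj - pad
                            if 0 <= y < height and 0 <= xx < width:
                                acc += kernel[di][dj] * x[c_in][y][xx]
                    channel[i][j] += acc
        if relu:
            channel = [[max(0.0, v) for v in row] for row in channel]
        out.append(channel)
    return out


def depth_to_space_2x(x: Tensor) -> Channel:
    """Interleave 4 channels of size H x W into one 2H x 2W image (assumption)."""
    height, width = len(x[0]), len(x[0][0])
    out = [[0.0] * (2 * width) for _ in range(2 * height)]
    for c in range(4):
        for i in range(height):
            for j in range(width):
                out[2 * i + c // 2][2 * j + c % 2] = x[c][i][j]
    return out


def build_upscaler(filter_set: List[Layer]) -> Callable[[Channel], Channel]:
    """Return a function that runs every layer of the filter set, then upscales 2x."""
    def upscale(image: Channel) -> Channel:
        x: Tensor = [image]
        for n, layer in enumerate(filter_set):
            is_last = n == len(filter_set) - 1
            x = conv2d(x, layer, relu=not is_last)
        return depth_to_space_2x(x)
    return upscale
```

A device holding several filter sets could keep them in a dictionary keyed by index, for example `memory = {0: default_set, 1: sports_set}` (hypothetical names), and call `build_upscaler(memory[filter_index])`. Because the number of layers and filters lives in the filter set itself, different sets yield models with different numbers of convolution filters, which matches the description of FIG. 17.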

The reasons for providing a plurality of filter sets are as follows.

First, a CNN model may have black-box characteristics. Because of these characteristics, it may be difficult to identify, inside the device, how the CNN model operates during the training process. It may therefore be necessary to feed the training data set in different ways in order to obtain a filter set that is dedicated to, or optimal for, an image component (such as an image category). Compared with a filter set trained on the entire data set, a dedicated filter set obtained from input data of a specific image component may generally show a higher loss and lower upscaling performance, but may yield improved results for specific images.

Second, because of the real-time nature of the decoder part, there is a limit to how deep the filter layers of the AI upscaling model of the decoder part can be made. Convolution requires a large amount of computation and hardware resources, so a CNN model can hinder real-time performance. Accordingly, if various trained upscaling filter sets are applied to the encoder part, the width of the layers can be increased and the upscaling function can thereby be enhanced.

Finally, multiple filtering with a plurality of stored filter sets may provide an improved compression rate. Because encoding for image streaming is not performed in real time, upscaling can be carried out with all of the plurality of filter sets, without additional image analysis, and the improved filter set can then be selected to provide improved results.

According to the various embodiments described above, the restoration process in the electronic device can be simplified because the improved upscaling filter set is identified in advance, through a plurality of upscaling processes, during the encoding performed by the server. A high compression rate can therefore be provided in an image streaming environment, and high-definition images can be transmitted.
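The server-side selection described above, in which every stored filter set is tried during non-real-time encoding, might be sketched as follows. The function names are hypothetical, and the use of mean squared error against the original image as the selection criterion is an assumption; this passage does not name the metric. The sketch also skips the codec's encode and decode step for brevity.

```python
from typing import Callable, Dict, List

Image = List[List[float]]


def mean_squared_error(a: Image, b: Image) -> float:
    """Average squared pixel difference between two images of equal size."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def select_filter_index(original: Image, downscaled: Image,
                        upscalers: Dict[int, Callable[[Image], Image]]) -> int:
    """Upscale with every filter set and return the index with the lowest error."""
    errors = {
        index: mean_squared_error(original, upscale(downscaled))
        for index, upscale in upscalers.items()
    }
    # On ties, the first index in dictionary order wins.
    return min(errors, key=errors.get)
```

The returned index is what the server would transmit as the filter index information, so that the electronic device only has to run a single upscaling pass with the already-chosen filter set.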

Meanwhile, the various embodiments described above may be implemented using software, hardware, or a combination thereof. In a hardware implementation, the embodiments described in the disclosure may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), programmable gate arrays, processors, controllers, microcontrollers, microprocessors, and electrical units for performing other functions. In a software implementation, embodiments such as the procedures and functions described herein may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described herein.

Meanwhile, the methods according to the various embodiments of the disclosure described above may be stored in a non-transitory computer-readable medium. Such a non-transitory computer-readable medium may be used in various devices.

The non-transitory computer-readable medium refers to a medium that stores data semi-permanently and is readable by a device, as opposed to a medium that stores data for a very short time, such as a register, a cache, or a memory. Specifically, the various applications or programs described above may be stored in and provided on a non-transitory computer-readable medium such as a compact disc (CD), a digital versatile disk (DVD), a hard disk, a Blu-ray disc, a universal serial bus (USB) memory stick, a memory card, or a read only memory (ROM).

According to an embodiment, the methods according to the various embodiments disclosed herein may be provided in a computer program product. A computer program product may be traded as a commodity between a seller and a purchaser. A computer program product may be distributed in the form of a machine-readable storage medium (e.g., a compact disc read-only memory (CD-ROM)) or distributed online through an application store (e.g., PlayStore™). In the case of online distribution, at least a portion of the computer program product may be temporarily stored, or temporarily created, on a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Although some embodiments have been shown and described, those skilled in the art will appreciate that changes may be made to these embodiments without departing from the principles and spirit of the disclosure. Accordingly, the scope of the disclosure should not be construed as being limited to the described embodiments, but is defined by the appended claims and their equivalents.

21: original image data

22: AI encoder

23, 27: compressed image

24: encoding process / encoding operation

25: streaming source

26: decoding process / operation

28: AI decoder / operation

29: restored original image

30: display

100: electronic device / image processing device

200: server

Claims (3)

1. A method of controlling an electronic device, the method comprising: receiving, from an external server, image data, an artificial intelligence (AI) flag indicating whether AI downscaling was performed by the external server, and a filter index; decoding the image data; in response to the AI flag being a first value and the filter index not being a null value, upscaling the decoded image data using a first AI model corresponding to the filter index and providing the upscaled image data for output; in response to the AI flag being the first value and the filter index being a null value, upscaling the decoded image data using a default AI model and providing the upscaled image data for output; and in response to the AI flag not being the first value, providing the decoded image data for output without performing an upscaling process, wherein the image data is obtained by encoding downscaled image data, the downscaled image data being obtained by inputting original image data corresponding to the image data into a second AI model to downscale the original image data, wherein the number of filters of the first AI model is smaller than the number of filters of the second AI model, and wherein the first AI model is a convolutional neural network.

2. The method of claim 1, wherein the providing comprises displaying the upscaled image data.

3. An electronic device comprising: a communication interface including communication circuitry; and a processor configured to: receive, from an external server via the communication interface, image data, an artificial intelligence (AI) flag indicating whether AI downscaling was performed by the external server, and a filter index; decode the received image data; in response to the AI flag being a first value and the filter index not being a null value, upscale the decoded image data using a first AI model corresponding to the filter index and provide the upscaled image data for output; in response to the AI flag being the first value and the filter index being a null value, upscale the decoded image data using a default AI model and provide the upscaled image data for output; and in response to the AI flag not being the first value, provide the decoded image data for output without performing an upscaling process, wherein the image data is obtained by encoding downscaled image data, the downscaled image data being obtained by inputting original image data corresponding to the image data into a second AI model to downscale the original image data, wherein the number of filters of the first AI model is smaller than the number of filters of the second AI model, and wherein the first AI model is a convolutional neural network.
TW108128335A 2018-08-10 2019-08-08 Electronic apparatus, method for controlling thereof, and method for controlling a server TWI821358B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2018-0093511 2018-08-10
KR1020180093511A KR102022648B1 (en) 2018-08-10 2018-08-10 Electronic apparatus, method for controlling thereof and method for controlling server

Publications (2)

Publication Number Publication Date
TW202012885A TW202012885A (en) 2020-04-01
TWI821358B true TWI821358B (en) 2023-11-11

Family

ID=68067739

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108128335A TWI821358B (en) 2018-08-10 2019-08-08 Electronic apparatus, method for controlling thereof, and method for controlling a server

Country Status (6)

Country Link
US (3) US11388465B2 (en)
EP (1) EP3635964A1 (en)
KR (1) KR102022648B1 (en)
CN (1) CN110830849A (en)
TW (1) TWI821358B (en)
WO (1) WO2020032661A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200611573A (en) * 2004-09-24 2006-04-01 Service & Quality Technology Co Ltd Intelligent image-processing device for closed-circuit TV camera and it operating method
US10009622B1 (en) * 2015-12-15 2018-06-26 Google Llc Video coding with degradation of residuals
WO2018120082A1 (en) * 2016-12-30 2018-07-05 Nokia Technologies Oy Apparatus, method and computer program product for deep learning

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2646575A1 (en) 1989-04-26 1990-11-02 Labo Electronique Physique METHOD AND STRUCTURE FOR DATA COMPRESSION
JPH06197227A (en) 1992-07-30 1994-07-15 Ricoh Co Ltd Image processor
US9064364B2 (en) * 2003-10-22 2015-06-23 International Business Machines Corporation Confidential fraud detection system and method
US7956930B2 (en) 2006-01-06 2011-06-07 Microsoft Corporation Resampling and picture resizing operations for multi-resolution video coding and decoding
US8204128B2 (en) * 2007-08-01 2012-06-19 Her Majesty The Queen In Right Of Canada, As Represented By The Minister Of Industry, Through The Communications Research Centre Canada Learning filters for enhancing the quality of block coded still and video images
JP2013110518A (en) * 2011-11-18 2013-06-06 Canon Inc Image coding apparatus, image coding method, and program, and image decoding apparatus, image decoding method, and program
CN103369313B (en) 2012-03-31 2017-10-10 百度在线网络技术(北京)有限公司 A kind of method, device and equipment for carrying out compression of images
CA2878807C (en) 2012-07-09 2018-06-12 Vid Scale, Inc. Codec architecture for multiple layer video coding
GB2543429B (en) 2015-02-19 2017-09-27 Magic Pony Tech Ltd Machine learning for visual processing
CN105611303B (en) 2016-03-07 2019-04-09 京东方科技集团股份有限公司 Image compression system, decompression systems, training method and device, display device
KR101910446B1 (en) 2016-07-12 2018-10-30 광운대학교 산학협력단 A Compression Method of Digital Hologram Video using Domain Transforms and 2D Video Compression Technique
KR101938945B1 (en) * 2016-11-07 2019-01-15 한국과학기술원 Method and system for dehazing image using convolutional neural network
US20180183998A1 (en) 2016-12-22 2018-06-28 Qualcomm Incorporated Power reduction and performance improvement through selective sensor image downscaling
WO2018112514A1 (en) 2016-12-23 2018-06-28 Queensland University Of Technology Deep learning systems and methods for use in computer vision
CN107194347A (en) 2017-05-19 2017-09-22 深圳市唯特视科技有限公司 A kind of method that micro- expression detection is carried out based on Facial Action Coding System
KR102535361B1 (en) * 2017-10-19 2023-05-24 삼성전자주식회사 Image encoder using machine learning and data processing method thereof
CN108305214B (en) 2017-12-28 2019-09-17 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and computer equipment
US10645409B2 (en) * 2018-06-26 2020-05-05 Google Llc Super-resolution loop restoration

Also Published As

Publication number Publication date
KR102022648B1 (en) 2019-09-19
US20220030291A1 (en) 2022-01-27
US11825033B2 (en) 2023-11-21
EP3635964A4 (en) 2020-04-15
CN110830849A (en) 2020-02-21
TW202012885A (en) 2020-04-01
US11388465B2 (en) 2022-07-12
EP3635964A1 (en) 2020-04-15
US20200053408A1 (en) 2020-02-13
US20240040179A1 (en) 2024-02-01
WO2020032661A1 (en) 2020-02-13
