WO2021031311A1 - Super network construction method, usage method, device and medium - Google Patents

Super network construction method, usage method, device and medium

Info

Publication number
WO2021031311A1
WO2021031311A1 · PCT/CN2019/110668 · CN2019110668W
Authority
WO
WIPO (PCT)
Prior art keywords
linear
sub
connection unit
linear connection
network
Prior art date
Application number
PCT/CN2019/110668
Other languages
English (en)
French (fr)
Inventor
初祥祥
许瑞军
张勃
李吉祥
李庆源
王斌
Original Assignee
北京小米智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京小米智能科技有限公司
Priority to RU2019140852A priority Critical patent/RU2721181C1/ru
Priority to KR1020197033844A priority patent/KR102568810B1/ko
Priority to JP2019563157A priority patent/JP7100669B2/ja
Publication of WO2021031311A1 publication Critical patent/WO2021031311A1/zh

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/044 — Recurrent networks, e.g. Hopfield networks
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/045 — Combinations of networks
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/0985 — Hyperparameter optimisation; Meta-learning; Learning-to-learn

Definitions

  • The present disclosure relates to the field of data processing technology, and in particular to a super network construction method, usage method, device, and medium.
  • Neural Architecture Search (NAS) is a technique for automatically designing neural network structures.
  • A super network (supernet) contains multiple layers, and each layer contains multiple network units; selecting one network unit from each layer and connecting them in sequence forms a sub-network.
  • All sub-network structures inside the super network share parameters when different sub-networks are constructed. After the super network has been trained to a certain level, sub-networks can be sampled and evaluated without training each sub-network from scratch. This algorithm is called the neural network super network single-path activation algorithm.
  • To this end, the present disclosure provides a super network construction method, usage method, device, and medium.
  • A method for constructing a super network includes:
  • a linear connection unit is arranged in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer;
  • the output and the input of the linear connection unit form a linear relationship, and the linear relationship includes linear relationships other than the output being equal to the input.
  • the method further includes: setting linear parameters of each linear connection unit in the super network;
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  • A method for using a super network includes:
  • a linear connection unit is arranged in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer;
  • N sub-networks are determined according to the super network and each is trained until the corresponding training end condition is met; M sub-networks are determined among the N sub-networks; for the sub-networks among the M that contain linear connection units, the linear relationship of their linear connection units is modified to a relationship in which the output equals the input; each of the M sub-networks is then trained separately and its performance metrics are extracted after training; N and M are integers greater than 1, and M is less than or equal to N.
  • Determining the M sub-networks among the N sub-networks includes: calculating the network metrics of the N sub-networks, and selecting from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
  • the method further includes: setting linear parameters of each linear connection unit in the super network;
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the method further includes: when the specific values of the linear parameters of a linear connection unit are variables, updating the linear parameters of each linear connection unit when training each of the N sub-networks.
  • the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  • A super network construction device includes:
  • a first setting module, configured to set a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer; the output of the linear connection unit and the input form a linear relationship, and the linear relationship includes linear relationships other than the output being equal to the input.
  • the device further includes:
  • a second setting module, configured to set the linear parameters of each linear connection unit in the super network;
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  • An apparatus for using a super network includes:
  • a third setting module, configured to set a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer, the output of the linear connection unit forming a linear relationship with its input;
  • a first determining module, configured to determine N sub-networks according to the super network;
  • a first training module, configured to train each of the N sub-networks until the corresponding training end condition is met;
  • a second determining module, configured to determine M sub-networks among the N sub-networks;
  • a modification module, configured to, for the sub-networks among the M sub-networks that contain linear connection units, modify the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
  • a second training module, configured to train each of the M sub-networks separately and to extract the performance metrics of each sub-network after training is completed;
  • N and M are integers greater than 1, and M is less than or equal to N.
  • the second determining module includes:
  • a selection module, configured to select from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
  • the apparatus further includes a fourth setting module, configured to set the linear parameters of each linear connection unit in the super network;
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the first training module is further configured to update the linear parameters of each linear connection unit when training each of the N sub-networks, when the specific values of the linear parameters of the linear connection unit are variables.
  • the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  • A non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network construction method, the method including:
  • a linear connection unit is arranged in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer;
  • the output and the input of the linear connection unit form a linear relationship.
  • A non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network usage method.
  • the method includes:
  • a linear connection unit is arranged in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer;
  • N and M are integers greater than 1, and M is less than or equal to N.
  • Compared with a direct connection unit, the linear connection unit used in this method can effectively improve the representation capability of the sub-networks and maintain the stability of the network metrics of the sub-networks in the super network that contain linear connection units,
  • preventing the network metrics of such a sub-network from dropping rapidly after a network unit in one or more layers of the original sub-network is replaced with a directly connected unit.
  • Fig. 1 is a structural diagram showing a super network according to an exemplary embodiment
  • Fig. 2 is a structural diagram showing a super network according to an exemplary embodiment
  • Fig. 3 is a structural diagram of a super network according to an exemplary embodiment
  • Fig. 4 is a flow chart showing a method for constructing a super network according to an exemplary embodiment
  • Fig. 5 is a flowchart showing a method for using a super network according to an exemplary embodiment
  • Fig. 6 is a structural diagram showing a device for constructing a super network according to an exemplary embodiment
  • Fig. 7 is a structural diagram showing a device for using a super network according to an exemplary embodiment
  • Fig. 8 is a structural diagram showing a device for constructing or using a super network according to an exemplary embodiment.
  • direct connection units are introduced into the super network to construct a variable depth network.
  • a direct connection unit is set in the second layer of the super network.
  • the function of this direct connection unit is that output equals input.
  • Fig. 4 is a flow chart showing a method for constructing a super network according to an exemplary embodiment.
  • the method includes: step S41, setting a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer; the output of the linear connection unit forms a linear relationship with the input, and the linear relationship includes linear relationships other than the output being equal to the input.
  • This method uses linear connection units in the super network. Compared with using direct connection units, it can effectively improve the representation capability of the sub-networks, maintain the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevent a sub-network's network metrics from dropping rapidly while its depth is adjusted.
  • the method further includes: setting the linear parameters of each linear connection unit in the super network;
  • the setting of the linear parameters of each linear connection unit in the super network includes one of the following methods:
  • the linear parameter includes at least one of the following parameters: slope, first coordinate axis displacement, and second coordinate axis displacement.
  • the values of the linear parameters of the linear connection units whose linear parameters are constants in the super network are the same or different.
  • the initial values of the linear parameters of each linear connection unit whose linear parameters are variables in the super network are set.
  • the linear parameters of each linear connection unit are then updated when the sub-networks in the super network are trained.
  • Fig. 5 is a flow chart showing a method for using a super network according to an exemplary embodiment. The method includes:
  • Step S51: set a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer.
  • Step S52 determining N sub-networks according to the super network, and training each sub-network in the N sub-networks until the corresponding training end condition is met;
  • Step S53 Determine M sub-networks in the N sub-networks
  • Step S54 for the sub-networks including linear connection units in the M sub-networks, modify the linear relationship of the linear connection units in the sub-networks including the linear connection units to the relationship in which the output is equal to the input;
  • Step S55 Perform individual training on each of the M sub-networks, and extract the performance index of each sub-network after the training is completed;
  • N and M are integers greater than 1, and M is less than or equal to N.
  • This method uses linear connection units in the super network. Compared with using direct connection units, it can effectively improve the representation capability of the sub-networks, maintain the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevent a sub-network's network metrics from dropping rapidly while its depth is adjusted. Specifically, the linear connection units are used when training each of the N sub-networks; when training each of the M sub-networks, the linear relationship of the linear connection units in the sub-networks that contain them is modified to a relationship in which the output equals the input, after which the network metrics remain essentially unchanged while the depth of the sub-network is adjusted.
  • determining the M sub-networks in the N sub-networks includes: calculating the network indicators of the N sub-networks, and selecting the M sub-networks with the highest quality index of the network indicators from the N sub-networks.
  • Network metrics include, but are not limited to, accuracy, loss value, validation accuracy, validation loss, mean absolute error, and the like.
  • the method further includes: setting linear parameters of each linear connection unit in the super network;
  • the setting of the linear parameters of each linear connection unit in the super network includes one of the following methods:
  • the linear parameter of each linear connection unit is updated when each of the N sub-networks is trained.
  • the linear parameter includes at least one of the following parameters: slope, first coordinate axis displacement, and second coordinate axis displacement.
  • the values of the linear parameters of the linear connection units whose linear parameters are constant in the super network are the same or different.
  • Fig. 6 is a structural diagram showing a super network construction device according to an exemplary embodiment.
  • the super network construction device includes:
  • the first setting module is used to set a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer; the output of the linear connection unit and the input form a linear relationship, and the linear relationship includes linear relationships other than the output being equal to the input.
  • the super network construction device further includes:
  • the second setting module is used to set the linear parameters of each linear connection unit in the super network
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the linear parameter includes at least one of the following parameters: slope, first coordinate axis displacement, and second coordinate axis displacement.
  • Fig. 7 is a structural diagram showing a device for using a super network according to an exemplary embodiment.
  • the device for using a super network includes:
  • the third setting module is used to set a linear connection unit in at least one layer of the super network, the input end of the linear connection unit being connected to the layer above the layer to which the linear connection unit belongs, and the output end being connected to the layer below that layer; the output of the linear connection unit forms a linear relationship with its input;
  • the first determining module is configured to determine N sub-networks according to the super network
  • the first training module is used to train each of the N sub-networks until the corresponding training end condition is met;
  • the second determining module is configured to determine M sub-networks in the N sub-networks
  • a modification module configured to modify the linear relationship of the linear connection units in the sub-networks containing linear connection units to the relationship in which the output is equal to the input for the sub-networks including the linear connection units in the M sub-networks;
  • the second training module is used to separately train each sub-network in the M sub-networks, and extract the performance index of each sub-network after the training is completed;
  • N and M are integers greater than 1, and M is less than or equal to N.
  • the second determining module includes:
  • the selection module is used to select M sub-networks with the highest quality index of the network index from the N sub-networks.
  • the super network usage device further includes a fourth setting module, configured to set the linear parameters of each linear connection unit in the super network;
  • setting the linear parameters of each linear connection unit in the super network includes one of the following manners: setting the linear parameters of all linear connection units as constants; setting the linear parameters of some linear connection units as constants and of the others as variables; or setting the linear parameters of all linear connection units as variables.
  • the first training module is further configured to, when the specific values of the linear parameters of a linear connection unit are variables, update the linear parameters of each linear connection unit when training each of the N sub-networks.
  • the linear parameter includes at least one of the following parameters: slope, first coordinate axis displacement, and second coordinate axis displacement.
  • Fig. 8 is a block diagram showing a device 800 for constructing or using a super network according to an exemplary embodiment.
  • the device 800 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
  • the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, And the communication component 816.
  • the processing component 802 generally controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the foregoing method.
  • the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
  • the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
  • the memory 804 is configured to store various types of data to support the operation of the device 800. Examples of these data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 804 can be implemented by any type of volatile or nonvolatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic Disk or Optical Disk.
  • the power component 806 provides power to various components of the device 800.
  • the power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
  • the multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 810 is configured to output and/or input audio signals.
  • the audio component 810 includes a microphone (MIC), and when the device 800 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals.
  • the received audio signal may be further stored in the memory 804 or transmitted via the communication component 816.
  • the audio component 810 further includes a speaker for outputting audio signals.
  • the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module.
  • the peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include but are not limited to: home button, volume button, start button, and lock button.
  • the sensor component 814 includes one or more sensors for providing the device 800 with various aspects of status assessment.
  • the sensor component 814 can detect the on/off status of the device 800 and the relative positioning of components, for example the display and the keypad of the device 800.
  • the sensor component 814 can also detect a change in the position of the device 800 or of one of its components, the presence or absence of contact between the user and the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800.
  • the sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
  • the sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
  • the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the apparatus 800 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • non-transitory computer-readable storage medium including instructions, such as the memory 804 including instructions, which may be executed by the processor 820 of the device 800 to complete the foregoing method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • non-transitory computer-readable storage medium including instructions, such as a memory including instructions, which may be executed by the processor 920 of the device 900 to complete the foregoing method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

Disclosed herein are a super network construction method, a usage method, a device, and a medium. The construction method includes: setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below that layer; the output of the linear connection unit forms a linear relationship with its input, and the linear relationship includes linear relationships other than the output being equal to the input. By using linear connection units in the super network instead of direct connection units, the method effectively improves the representation capability of the sub-networks, maintains the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevents a sub-network's network metrics from dropping rapidly while its depth is adjusted.

Description

Super network construction method, usage method, device, and medium
This application is based on and claims priority to Chinese patent application No. 201910763113.X, filed on August 19, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of data processing technology, and in particular to a super network construction method, usage method, device, and medium.
Background
Neural Architecture Search (NAS) is a technique for automatically designing neural networks: an algorithm automatically designs a high-performance network structure from a sample set. In neural architecture search, a separate neural network must be generated in each of many search rounds and trained before its metrics can be obtained, which makes evaluation inefficient and the search slow. To solve this problem, some NAS methods use a super network that contains all of the searched networks. In the exemplary structure of a super network (supernet) shown in Fig. 1, the super network contains multiple layers, each layer contains multiple network units, and selecting one network unit from each layer and connecting them in sequence forms a sub-network. When the super network is trained, all sub-network structures inside it share parameters across the different sub-networks, so after the super network has been trained to a certain level, sub-networks can be sampled and their metrics evaluated without training each sub-network from scratch. This algorithm is called the neural network super network single-path activation algorithm.
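As a concrete illustration of the shared-parameter, single-path activation scheme described above, the following is a minimal PyTorch-style sketch (an illustrative assumption only, not the implementation of this disclosure; the layer count, unit types, and channel sizes are invented): each layer holds several candidate network units, and a sub-network is one chosen unit per layer.

```python
import random
import torch
import torch.nn as nn

class SuperNet(nn.Module):
    """Minimal super network sketch: every layer holds several candidate units,
    and all sampled sub-networks share these units' weights."""
    def __init__(self, num_layers=3, channels=16, candidates_per_layer=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.ModuleList([
                nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
                for _ in range(candidates_per_layer)
            ])
            for _ in range(num_layers)
        ])

    def sample_path(self):
        # A sub-network is one candidate index per layer (single-path activation).
        return [random.randrange(len(layer)) for layer in self.layers]

    def forward(self, x, path):
        for layer, idx in zip(self.layers, path):
            x = layer[idx](x)
        return x

supernet = SuperNet()
path = supernet.sample_path()                  # e.g. [2, 0, 3]
y = supernet(torch.randn(1, 16, 8, 8), path)   # only the chosen units are activated
```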
Summary
In order to improve the network performance of a single-path-activation super network, the present disclosure provides a super network construction method, usage method, device, and medium.
According to a first aspect of the embodiments herein, a super network construction method is provided, including:
setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input, and the linear relationship includes linear relationships other than the output being equal to the input.
The above super network construction method further has the following features:
the method further includes: setting linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
The above super network construction method further has the following features:
the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
According to a second aspect of the embodiments herein, a super network usage method is provided, including:
setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs;
determining N sub-networks according to the super network, and training each of the N sub-networks until the corresponding training end condition is met;
determining M sub-networks among the N sub-networks;
for the sub-networks among the M sub-networks that contain linear connection units, modifying the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
training each of the M sub-networks separately, and extracting the performance metrics of each sub-network after training is completed;
where N and M are integers greater than 1, and M is less than or equal to N.
The above super network usage method further has the following features:
determining the M sub-networks among the N sub-networks includes: calculating the network metrics of the N sub-networks, and selecting from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
The above super network usage method further has the following features:
the method further includes: setting linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
The above super network usage method further has the following features:
the method further includes: when the specific values of the linear parameters of a linear connection unit are variables, updating the linear parameters of each linear connection unit when training each of the N sub-networks.
The above super network usage method further has the following features:
the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
According to a third aspect of the embodiments herein, a super network construction device is provided, including:
a first setting module, configured to set a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input, and the linear relationship includes linear relationships other than the output being equal to the input.
The above super network construction device further has the following features:
the device further includes:
a second setting module, configured to set the linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
The above super network construction device further has the following features:
the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
According to a fourth aspect of the embodiments herein, a super network usage device is provided, including:
a third setting module, configured to set a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input;
a first determining module, configured to determine N sub-networks according to the super network;
a first training module, configured to train each of the N sub-networks until the corresponding training end condition is met;
a second determining module, configured to determine M sub-networks among the N sub-networks;
a modification module, configured to, for the sub-networks among the M sub-networks that contain linear connection units, modify the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
a second training module, configured to train each of the M sub-networks separately and extract the performance metrics of each sub-network after training is completed;
where N and M are integers greater than 1, and M is less than or equal to N.
The above super network usage device further has the following features:
the second determining module includes:
a calculation module, configured to calculate the network metrics of the N sub-networks;
a selection module, configured to select from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
The above super network usage device further has the following features:
the device further includes a fourth setting module, configured to set the linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
The above super network usage device further has the following features:
the first training module is further configured to, when the specific values of the linear parameters of a linear connection unit are variables, update the linear parameters of each linear connection unit when training each of the N sub-networks.
The above super network usage device further has the following features:
the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
According to a fifth aspect of the embodiments herein, a non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network construction method, the method including:
setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input.
According to a fifth aspect of the embodiments herein, a non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network usage method, the method including:
setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs;
determining N sub-networks according to the super network, and training each of the N sub-networks until the corresponding training end condition is met;
determining M sub-networks among the N sub-networks;
for the sub-networks among the M sub-networks that contain linear connection units, modifying the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
training each of the M sub-networks separately, and extracting the performance metrics of each sub-network after training is completed;
where N and M are integers greater than 1, and M is less than or equal to N.
The technical solutions provided by the embodiments herein may include the following beneficial effects: by using linear connection units instead of direct connection units, the method effectively improves the representation capability of the sub-networks, maintains the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevents the network metrics of such a sub-network from dropping rapidly after a network unit in one or more layers of the original sub-network is replaced with a directly connected unit.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
Fig. 1 is a structural diagram of a super network according to an exemplary embodiment;
Fig. 2 is a structural diagram of a super network according to an exemplary embodiment;
Fig. 3 is a structural diagram of a super network according to an exemplary embodiment;
Fig. 4 is a flowchart of a super network construction method according to an exemplary embodiment;
Fig. 5 is a flowchart of a super network usage method according to an exemplary embodiment;
Fig. 6 is a structural diagram of a super network construction device according to an exemplary embodiment;
Fig. 7 is a structural diagram of a super network usage device according to an exemplary embodiment;
Fig. 8 is a structural diagram of a device for constructing or using a super network according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
In order to obtain richer sub-network structures from a super network, direct connection units are introduced into the super network to construct networks of variable depth. As shown in Fig. 2, a direct connection unit is set in the second layer of the super network; the function of this direct connection unit is that the output equals the input. When this direct connection unit is used to connect the first sub-network unit in the first layer and the second sub-network unit in the third layer, this connection achieves a direct connection between the first layer and the third layer of the super network structure. However, when direct connection units are added to the original super network, once a network unit in one or more layers of an original sub-network is replaced with a direct connection unit, the network metrics of that sub-network drop rapidly, which severely affects the stability of the sub-network's network metrics.
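For illustration only, a direct connection unit can be modeled as an identity mapping appended to a layer's list of candidate units, so that selecting it effectively skips that layer and yields a shallower sub-network (a sketch that reuses the hypothetical SuperNet class above; it is not taken from this disclosure):

```python
import torch.nn as nn

# A direct connection unit: output equals input. Appending it to a layer's
# candidates lets a sampled sub-network skip that layer (variable depth).
supernet.layers[1].append(nn.Identity())
```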
To solve this problem, as shown in Fig. 3, the direct connection unit is replaced herein with a linear connection unit.
Fig. 4 is a flowchart of a super network construction method according to an exemplary embodiment. The method includes: step S41, setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input, and the linear relationship includes linear relationships other than the output being equal to the input.
By using linear connection units in the super network instead of direct connection units, this method effectively improves the representation capability of the sub-networks, maintains the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevents a sub-network's network metrics from dropping rapidly while its depth is adjusted.
In an embodiment, the method further includes: setting linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
first, setting the linear parameters of all linear connection units in the super network as constants;
second, setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
third, setting the linear parameters of all linear connection units in the super network as variables.
In an embodiment, the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement. For example, the linear relationship is y(x) = k(x + a) + b, where k is the slope, a is the displacement along the first coordinate axis, and b is the displacement along the second coordinate axis.
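A minimal sketch of such a linear connection unit, assuming the element-wise form y(x) = k(x + a) + b given above (the PyTorch module form and the constant-versus-variable handling are illustrative assumptions, not the implementation of this disclosure):

```python
import torch
import torch.nn as nn

class LinearConnectionUnit(nn.Module):
    """Linear connection unit: y(x) = k * (x + a) + b, applied element-wise.
    With trainable=False the linear parameters k, a, b are constants; with
    trainable=True they are variables that training can update."""
    def __init__(self, k=1.0, a=0.0, b=0.0, trainable=False):
        super().__init__()
        k, a, b = (torch.tensor(float(v)) for v in (k, a, b))
        if trainable:
            self.k, self.a, self.b = nn.Parameter(k), nn.Parameter(a), nn.Parameter(b)
        else:
            # Buffers keep constant parameters out of the optimizer's parameter list.
            self.register_buffer("k", k)
            self.register_buffer("a", a)
            self.register_buffer("b", b)

    def forward(self, x):
        return self.k * (x + self.a) + self.b

unit = LinearConnectionUnit(k=2.0, a=1.0, b=3.0)   # y(x) = 2(x + 1) + 3
```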
In an embodiment, the values of the linear parameters of the linear connection units whose linear parameters are constants in the super network are the same or different. For example, the linear relationship of all linear connection units in the super network is y(x) = 2(x + 1) + 3. As another example, the linear relationship of some linear connection units in the super network is y(x) = 2(x + 1) + 3 and that of the other linear connection units is y(x) = 1.5x. As yet another example, the linear relationship of some linear connection units in the super network is y(x) = 2(x + 1) + 3, that of another group is y(x) = 1.5x, and that of the remaining linear connection units is y(x) = 2x + 3, and so on.
In an embodiment, initial values are set for the linear parameters of each linear connection unit whose linear parameters are variables in the super network, and the linear parameters of each linear connection unit are updated when the sub-networks in the super network are trained.
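When the linear parameters are variables, one straightforward realization (an assumption for illustration, continuing the hypothetical LinearConnectionUnit above) is to register k, a, and b as trainable parameters so that the optimizer updates them together with the shared weights while a sub-network is being trained:

```python
import torch
import torch.optim as optim

# Variable linear parameters: give them initial values, then let the optimizer
# update k, a and b along with the rest of the sub-network's shared weights.
unit = LinearConnectionUnit(k=1.0, a=0.0, b=0.0, trainable=True)
optimizer = optim.SGD(unit.parameters(), lr=0.01)

x = torch.randn(4, 16)
loss = unit(x).pow(2).mean()   # stand-in for the sub-network's training loss
loss.backward()
optimizer.step()               # k, a and b are updated here
```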
Fig. 5 is a flowchart of a super network usage method according to an exemplary embodiment. The method includes:
step S51, setting a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs;
step S52, determining N sub-networks according to the super network, and training each of the N sub-networks until the corresponding training end condition is met;
step S53, determining M sub-networks among the N sub-networks;
step S54, for the sub-networks among the M sub-networks that contain linear connection units, modifying the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
step S55, training each of the M sub-networks separately, and extracting the performance metrics of each sub-network after training is completed;
where N and M are integers greater than 1, and M is less than or equal to N.
By using linear connection units in the super network instead of direct connection units, this method effectively improves the representation capability of the sub-networks, maintains the stability of the network metrics of the sub-networks in the super network that contain linear connection units, and prevents a sub-network's network metrics from dropping rapidly while its depth is adjusted. Specifically, the linear connection units are used when training each of the N sub-networks; when training each of the M sub-networks, the linear relationship of the linear connection units in the sub-networks that contain them is modified to a relationship in which the output is equal to the input, after which the network metrics remain essentially unchanged while the depth of the sub-network is adjusted.
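A high-level sketch of steps S51 to S55, built on the hypothetical SuperNet and LinearConnectionUnit classes above (the sampling strategy, the training and evaluation callbacks, and the use of a single scalar quality index are assumptions made only for illustration):

```python
import torch

def run_supernet_search(supernet, train_fn, evaluate_fn, n=8, m=2):
    """Sketch of S51-S55: train N sampled sub-networks, keep the best M,
    switch their linear connection units to output == input, retrain separately."""
    # S52: determine N sub-networks and train each until its end condition is met.
    paths = [supernet.sample_path() for _ in range(n)]
    for path in paths:
        train_fn(supernet, path)

    # S53: keep the M sub-networks whose network metric (quality index) is highest.
    paths.sort(key=lambda p: evaluate_fn(supernet, p), reverse=True)
    best_paths = paths[:m]

    # S54: in the selected sub-networks, set every linear connection unit to
    # the identity relationship (k = 1, a = 0, b = 0).
    for path in best_paths:
        for layer, idx in zip(supernet.layers, path):
            unit = layer[idx]
            if isinstance(unit, LinearConnectionUnit):
                with torch.no_grad():
                    unit.k.fill_(1.0)
                    unit.a.fill_(0.0)
                    unit.b.fill_(0.0)

    # S55: train each selected sub-network separately, then extract its metrics.
    for path in best_paths:
        train_fn(supernet, path)
    return [(path, evaluate_fn(supernet, path)) for path in best_paths]
```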
In an embodiment, determining the M sub-networks among the N sub-networks includes: calculating the network metrics of the N sub-networks, and selecting from the N sub-networks the M sub-networks whose network metrics have the highest quality index. Network metrics include, but are not limited to, accuracy, loss value, validation accuracy, validation loss, mean absolute error, and the like.
In an embodiment, the method further includes: setting linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
When the specific values of the linear parameters of a linear connection unit are variables, the linear parameters of each linear connection unit are updated when each of the N sub-networks is trained.
In an embodiment, the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement. For example, the linear relationship is y(x) = k(x + a) + b, where k is the slope, a is the displacement along the first coordinate axis, and b is the displacement along the second coordinate axis.
In an embodiment, the values of the linear parameters of the linear connection units whose linear parameters are constants in the super network are the same or different. For example, the linear relationship of all linear connection units in the super network is y(x) = 2(x + 1) + 3. As another example, the linear relationship of some linear connection units in the super network is y(x) = 2(x + 1) + 3 and that of the other linear connection units is y(x) = 1.5x. As yet another example, the linear relationship of some linear connection units in the super network is y(x) = 2(x + 1) + 3, that of another group is y(x) = 1.5x, and that of the remaining linear connection units is y(x) = 2x + 3, and so on.
Fig. 6 is a structural diagram of a super network construction device according to an exemplary embodiment. The super network construction device includes:
a first setting module, configured to set a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input, and the linear relationship includes linear relationships other than the output being equal to the input.
In an embodiment, the super network construction device further includes:
a second setting module, configured to set the linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
The linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
Fig. 7 is a structural diagram of a super network usage device according to an exemplary embodiment. The super network usage device includes:
a third setting module, configured to set a linear connection unit in at least one layer of the super network, where the input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and the output end is connected to the layer below the layer to which the linear connection unit belongs; the output of the linear connection unit forms a linear relationship with its input;
a first determining module, configured to determine N sub-networks according to the super network;
a first training module, configured to train each of the N sub-networks until the corresponding training end condition is met;
a second determining module, configured to determine M sub-networks among the N sub-networks;
a modification module, configured to, for the sub-networks among the M sub-networks that contain linear connection units, modify the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
a second training module, configured to train each of the M sub-networks separately and extract the performance metrics of each sub-network after training is completed;
where N and M are integers greater than 1, and M is less than or equal to N.
In an embodiment, the second determining module includes:
a calculation module, configured to calculate the network metrics of the N sub-networks;
a selection module, configured to select from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
In an embodiment, the super network usage device further includes a fourth setting module, configured to set the linear parameters of each linear connection unit in the super network;
setting the linear parameters of each linear connection unit in the super network includes one of the following manners:
setting the linear parameters of all linear connection units in the super network as constants;
setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
setting the linear parameters of all linear connection units in the super network as variables.
In an embodiment, the first training module is further configured to, when the specific values of the linear parameters of a linear connection unit are variables, update the linear parameters of each linear connection unit when training each of the N sub-networks.
In an embodiment, the linear parameters include at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
Fig. 8 is a block diagram of a device 800 for constructing or using a super network according to an exemplary embodiment. For example, the device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 8, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operations of the device 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above methods. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure related to the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when the device 800 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing the device 800 with status assessments of various aspects. For example, the sensor component 814 can detect the on/off status of the device 800 and the relative positioning of components (for example, the display and keypad of the device 800); the sensor component 814 can also detect a change in the position of the device 800 or of one of its components, the presence or absence of contact between the user and the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 804 including instructions, which can be executed by the processor 820 of the device 800 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as a memory including instructions, which can be executed by the processor 920 of the device 900 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those skilled in the art will easily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present disclosure are indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (18)

  1. A super network construction method, characterized by comprising:
    setting a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs; an output of the linear connection unit forms a linear relationship with an input thereof, and the linear relationship comprises linear relationships other than the output being equal to the input.
  2. The super network construction method according to claim 1, characterized in that
    the method further comprises: setting linear parameters of each linear connection unit in the super network;
    setting the linear parameters of each linear connection unit in the super network comprises one of the following manners:
    setting the linear parameters of all linear connection units in the super network as constants;
    setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
    setting the linear parameters of all linear connection units in the super network as variables.
  3. The super network construction method according to claim 2, characterized in that
    the linear parameters comprise at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  4. A super network usage method, characterized by comprising:
    setting a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs;
    determining N sub-networks according to the super network, and training each of the N sub-networks until a corresponding training end condition is met;
    determining M sub-networks among the N sub-networks;
    for the sub-networks among the M sub-networks that contain linear connection units, modifying the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
    training each of the M sub-networks separately, and extracting performance metrics of each sub-network after training is completed;
    wherein N and M are integers greater than 1, and M is less than or equal to N.
  5. The super network usage method according to claim 4, characterized in that
    determining the M sub-networks among the N sub-networks comprises: calculating network metrics of the N sub-networks, and selecting from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
  6. The super network usage method according to claim 4, characterized in that
    the method further comprises: setting linear parameters of each linear connection unit in the super network;
    setting the linear parameters of each linear connection unit in the super network comprises one of the following manners:
    setting the linear parameters of all linear connection units in the super network as constants;
    setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
    setting the linear parameters of all linear connection units in the super network as variables.
  7. The super network usage method according to claim 6, characterized in that
    the method further comprises: when specific values of the linear parameters of a linear connection unit are variables, updating the linear parameters of each linear connection unit when training each of the N sub-networks.
  8. The super network usage method according to claim 6, characterized in that
    the linear parameters comprise at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  9. A super network construction device, characterized by comprising:
    a first setting module, configured to set a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs; an output of the linear connection unit forms a linear relationship with an input thereof, and the linear relationship comprises linear relationships other than the output being equal to the input.
  10. The super network construction device according to claim 9, characterized in that
    the device further comprises:
    a second setting module, configured to set linear parameters of each linear connection unit in the super network;
    setting the linear parameters of each linear connection unit in the super network comprises one of the following manners:
    setting the linear parameters of all linear connection units in the super network as constants;
    setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
    setting the linear parameters of all linear connection units in the super network as variables.
  11. The super network construction device according to claim 10, characterized in that
    the linear parameters comprise at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  12. A super network usage device, characterized by comprising:
    a third setting module, configured to set a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs; an output of the linear connection unit forms a linear relationship with an input thereof;
    a first determining module, configured to determine N sub-networks according to the super network;
    a first training module, configured to train each of the N sub-networks until a corresponding training end condition is met;
    a second determining module, configured to determine M sub-networks among the N sub-networks;
    a modification module, configured to, for the sub-networks among the M sub-networks that contain linear connection units, modify the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
    a second training module, configured to train each of the M sub-networks separately and extract performance metrics of each sub-network after training is completed;
    wherein N and M are integers greater than 1, and M is less than or equal to N.
  13. The super network usage device according to claim 12, characterized in that
    the second determining module comprises:
    a calculation module, configured to calculate network metrics of the N sub-networks;
    a selection module, configured to select from the N sub-networks the M sub-networks whose network metrics have the highest quality index.
  14. The super network usage device according to claim 12, characterized in that
    the device further comprises a fourth setting module, configured to set linear parameters of each linear connection unit in the super network;
    setting the linear parameters of each linear connection unit in the super network comprises one of the following manners:
    setting the linear parameters of all linear connection units in the super network as constants;
    setting the linear parameters of some linear connection units in the super network as constants and the linear parameters of the other linear connection units as variables;
    setting the linear parameters of all linear connection units in the super network as variables.
  15. The super network usage device according to claim 14, characterized in that
    the first training module is further configured to, when specific values of the linear parameters of a linear connection unit are variables, update the linear parameters of each linear connection unit when training each of the N sub-networks.
  16. The super network usage device according to claim 14, characterized in that
    the linear parameters comprise at least one of the following: a slope, a first-coordinate-axis displacement, and a second-coordinate-axis displacement.
  17. A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network construction method, the method comprising:
    setting a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs; an output of the linear connection unit forms a linear relationship with an input thereof.
  18. A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute a super network usage method, the method comprising:
    setting a linear connection unit in at least one layer of a super network, wherein an input end of the linear connection unit is connected to the layer above the layer to which the linear connection unit belongs, and an output end is connected to the layer below the layer to which the linear connection unit belongs;
    determining N sub-networks according to the super network, and training each of the N sub-networks until a corresponding training end condition is met;
    determining M sub-networks among the N sub-networks;
    for the sub-networks among the M sub-networks that contain linear connection units, modifying the linear relationship of the linear connection units in those sub-networks to a relationship in which the output is equal to the input;
    training each of the M sub-networks separately, and extracting performance metrics of each sub-network after training is completed;
    wherein N and M are integers greater than 1, and M is less than or equal to N.
PCT/CN2019/110668 2019-08-19 2019-10-11 Super network construction method, usage method, device and medium WO2021031311A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
RU2019140852A RU2721181C1 (ru) 2019-08-19 2019-10-11 Способ построения суперсети, способ использования, устройство и носитель информации
KR1020197033844A KR102568810B1 (ko) 2019-08-19 2019-10-11 슈퍼 네트워크의 구축 방법, 사용 방법, 장치 및 저장 매체
JP2019563157A JP7100669B2 (ja) 2019-08-19 2019-10-11 スーパーネットワークの構築方法、使用方法、装置及び記録媒体

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910763113.X 2019-08-19
CN201910763113.XA CN110490303A (zh) 2019-08-19 Super network construction method, usage method, device and medium

Publications (1)

Publication Number Publication Date
WO2021031311A1 true WO2021031311A1 (zh) 2021-02-25

Family

ID=68551875

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/110668 WO2021031311A1 (zh) 2019-08-19 2019-10-11 Super network construction method, usage method, device and medium

Country Status (7)

Country Link
US (1) US20210056421A1 (zh)
EP (1) EP3783539A1 (zh)
JP (1) JP7100669B2 (zh)
KR (1) KR102568810B1 (zh)
CN (1) CN110490303A (zh)
RU (1) RU2721181C1 (zh)
WO (1) WO2021031311A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340220B (zh) * 2020-02-25 2023-10-20 北京百度网讯科技有限公司 用于训练预测模型的方法和装置
CN111652354B (zh) * 2020-05-29 2023-10-24 北京百度网讯科技有限公司 用于训练超网络的方法、装置、设备以及存储介质
CN111639753B (zh) * 2020-05-29 2023-12-05 北京百度网讯科技有限公司 用于训练图像处理超网络的方法、装置、设备以及存储介质

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018140969A1 (en) * 2017-01-30 2018-08-02 Google Llc Multi-task neural networks with task-specific paths
CA3005241A1 (en) * 2017-05-19 2018-11-19 Salesforce.Com, Inc. Domain specific language for generation of recurrent neural network architectures

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5249259A (en) * 1990-01-23 1993-09-28 Massachusetts Institute Of Technology Genetic algorithm technique for designing neural networks
UA49379A (uk) * 2001-11-23 2002-09-16 Відкрите Акціонерне Товариство "Мотор Січ" Method for constructing and training a neural network with lateral inhibition
RU115098U1 (ru) * 2011-09-29 2012-04-20 Константин Дмитриевич Белов Multilayer neural network
US10776668B2 (en) * 2017-12-14 2020-09-15 Robert Bosch Gmbh Effective building block design for deep convolutional neural networks using search
CN108985457B (zh) * 2018-08-22 2021-11-19 北京大学 一种受优化算法启发的深度神经网络结构设计方法
CN109934336B (zh) * 2019-03-08 2023-05-16 江南大学 基于最优结构搜索的神经网络动态加速平台设计方法及神经网络动态加速平台

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018140969A1 (en) * 2017-01-30 2018-08-02 Google Llc Multi-task neural networks with task-specific paths
CA3005241A1 (en) * 2017-05-19 2018-11-19 Salesforce.Com, Inc. Domain specific language for generation of recurrent neural network architectures

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU QIANG, FANG JINQING, LI YONG: "Some Characteristics of Three-Layer Supernetwork Evolution Model", COMPLEX SYSTEMS AND COMPLEXITY SCIENCE, vol. 12, no. 2, 1 June 2015 (2015-06-01), pages 64 - 71, XP055782093, DOI: 10.13306/j.1672-3813.2015.02.010 *
ZICHAO GUO, ZHANG XIANGYU, MU HAOYUAN, HENG WEN, LIU ZECHUN, WEI YICHEN, SUN JIAN: "Single Path One-Shot Neural Architecture Search with Uniform Sampling", 6 April 2019 (2019-04-06), XP055697880, Retrieved from the Internet <URL:https://arxiv.org/pdf/1904.00420.pdf> *

Also Published As

Publication number Publication date
RU2721181C1 (ru) 2020-05-18
EP3783539A1 (en) 2021-02-24
CN110490303A (zh) 2019-11-22
US20210056421A1 (en) 2021-02-25
JP7100669B2 (ja) 2022-07-13
KR102568810B1 (ko) 2023-08-21
JP2022501659A (ja) 2022-01-06
KR20210024409A (ko) 2021-03-05

Similar Documents

Publication Publication Date Title
WO2020244104A1 (zh) Super network training method and device
JP6227766B2 (ja) Method, apparatus and terminal device for changing emoticons in a chat interface
RU2648609C2 (ru) Method and device for recommending contact information
CN105491642B (zh) Network connection method and device
CN105517112B (zh) Method and device for displaying WiFi network information
WO2017084183A1 (zh) Information display method and device
CN107102772B (zh) Touch control method and device
WO2021031311A1 (zh) Super network construction method, usage method, device and medium
US10248855B2 (en) Method and apparatus for identifying gesture
EP2985980B1 (en) Method and device for playing stream media data
CN104238890B (zh) Text display method and device
US20230292269A1 (en) Method and apparatus for determining offset indication, and method and apparatus for determining offset
RU2643805C2 (ru) Method for obtaining recommendation information, terminal and server
US10313537B2 (en) Method, apparatus and medium for sharing photo
JP2016524763A (ja) Tag creation method, device, terminal, program, and recording medium
CN107948093A (zh) Method and device for adjusting the network speed of an application in a terminal device
CN104850643B (zh) Picture comparison method and device
US20220132190A1 (en) Method and apparatus for determining bandwidth, and electronic device and storage medium
US20150113431A1 (en) Method and terminal device for adjusting widget
CN108629814B (zh) Camera adjustment method and device
JP2021531519A (ja) Touch signal processing method, device and medium
US20170017656A1 (en) Method and device for presenting tasks
CN106919302B (zh) Operation control method and device for a mobile terminal
KR101850158B1 (ko) Apparatus and method for outputting an image in a portable terminal
CN112965653B (zh) Touch position reporting method, device and electronic device

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2019563157

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19942402

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19942402

Country of ref document: EP

Kind code of ref document: A1