KR20230107042A

KR20230107042A - An electronic apparatus and a method thereof

Info

Publication number: KR20230107042A
Application number: KR1020220002953A
Authority: KR
Inventors: 박별; 장정록
Original assignee: 삼성전자주식회사
Priority date: 2022-01-07
Filing date: 2022-01-07
Publication date: 2023-07-14
Also published as: WO2023132534A1

Abstract

Disclosed is a method of operating an electronic device, the method including: obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information; obtaining sound source generation tags mapped to the sound source generation information; and generating a sound source based on the sound source generation tags.

Description

An electronic apparatus and a method thereof

Various disclosed embodiments relate to an electronic device and an operating method thereof, and more particularly, to an electronic device that automatically generates music based on an image displayed on a screen, a surrounding situation, and the like, and an operating method thereof.

A user may use an electronic device to view photos, images, digital artworks, and the like. Rather than viewing a photo or image in a static environment, the user may prefer to view the image while listening to background music that suits the photo or the surrounding situation.

Music that has already been created has the limitation that it cannot easily reflect all of the circumstances that change from moment to moment, such as the user's taste, the photo currently displayed on the screen, or the current surrounding situation. Accordingly, there is a need for a technology that automatically takes into account the image currently output by the electronic device, the surrounding situation, the user's taste, and the like, generates music suited to the image, and provides the music to the user.

Various embodiments are intended to provide an electronic device that obtains at least one of image information, surrounding situation information, and user taste information as sound source generation information and generates a sound source in consideration of the obtained information, and an operating method thereof.

Various embodiments are intended to provide an electronic device that obtains sound source generation tags mapped to sound source generation information and generates a sound source based on the sound source generation tags, and an operating method thereof.

Various embodiments are intended to provide an electronic device that filters sound source generation tags according to their scores and generates a sound source based on the filtered sound source generation tags, and an operating method thereof.

An electronic device according to an embodiment includes a memory storing one or more instructions and a processor executing the one or more instructions stored in the memory. By executing the one or more instructions, the processor may obtain sound source generation information including at least one of image information, surrounding situation information, and user taste information, obtain sound source generation tags mapped to the sound source generation information, and generate a sound source based on the sound source generation tags.

In an embodiment, by executing the one or more instructions, the processor may filter, from among the sound source generation tags, the sound source generation tags having high scores, and generate the sound source using the filtered sound source generation tags.

In an embodiment, by executing the one or more instructions, the processor may obtain a score for each of the sound source generation tags mapped to the image information, based on at least one of the accuracy of a recognition result, the degree of duplication for each tag, and a weight for each tag, and may obtain first tags by filtering, from among the sound source generation tags mapped to the image information, the sound source generation tags having high scores.

In an embodiment, by executing the one or more instructions, the processor may obtain a score for each of the sound source generation tags mapped to the surrounding situation information, based on a situation-based weight for each tag that indicates user preference according to the situation, and may obtain second tags by filtering, from among the sound source generation tags mapped to the surrounding situation information, the sound source generation tags having high scores.

In an embodiment, by executing the one or more instructions, the processor may generate the sound source using at least one of the first tags and the second tags.

In an embodiment, by executing the one or more instructions, the processor may further filter the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information, and generate the sound source using the further filtered tags.

In an embodiment, by executing the one or more instructions, the processor may obtain a weight for each tag, indicating user preference, based on at least one of the user taste information and music playback history information.

In an embodiment, by executing the one or more instructions, the processor may play music according to the generated sound source and update the weight for each tag according to music playback information.

In an embodiment, the music playback information may include information on the playback frequency of the music, the degree to which the music is listened to in full, the degree of playback interruption, the degree of fast-forwarding, and the degree of skipping.

In an embodiment, the electronic device further includes a display, and by executing the one or more instructions, the processor may obtain the image information based on at least one of additional information about an image output on the display, a color or style identified in the image, a type of object identified in the image, and, when the identified object is a person, the person's facial expression.

In an embodiment, the electronic device further includes at least one of a camera, a sensor, and a communication module, and by executing the one or more instructions, the processor may obtain the surrounding situation information from at least one of information on whether a user is present, weather information, date information, time information, season information, holiday information, anniversary information, temperature information, illuminance information, and location information, obtained from at least one of the camera, the sensor, and the communication module.

In an embodiment, by executing the one or more instructions, the processor may obtain the user taste information from at least one of user profile information, the user's viewing history information, and preferred music information selected by the user.

A method of operating an electronic device according to an embodiment may include: obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information; obtaining sound source generation tags mapped to the sound source generation information; and generating a sound source based on the sound source generation tags.

A computer-readable recording medium according to an embodiment may be a computer-readable recording medium having recorded thereon a program for implementing a method of operating an electronic device, the method including: obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information; obtaining sound source generation tags mapped to the sound source generation information; and generating a sound source based on the sound source generation tags.

An electronic device and an operating method thereof according to an embodiment may obtain sound source generation information including at least one of image information, surrounding situation information, and user taste information, and generate a sound source in consideration of the obtained information.

An electronic device and an operating method thereof according to an embodiment may obtain sound source generation tags mapped to sound source generation information and generate a sound source based on the sound source generation tags.

An electronic device and an operating method thereof according to an embodiment may filter sound source generation tags according to their scores and generate a sound source based on the filtered sound source generation tags.

FIG. 1 is a diagram for explaining generating music based on sound source generation information and providing it to a user, according to an embodiment.
FIG. 2 is an internal block diagram of an example of an electronic device according to an embodiment.
FIG. 3 is an internal block diagram of the processor of FIG. 2 according to an embodiment.
FIG. 4 is an internal block diagram of the sound source generation information acquisition unit of FIG. 3 according to an embodiment.
FIG. 5 is a diagram for explaining how the image information acquisition unit of FIG. 4 obtains image information, according to an embodiment.
FIG. 6 is an internal block diagram of the sound source generation tag acquisition unit of FIG. 3 according to an embodiment.
FIG. 7 is a diagram for explaining how the tag filtering unit of FIG. 6 filters tags in consideration of the score for each tag, according to an embodiment.
FIG. 8 is a diagram illustrating the relationship between sound source generation information and tags according to an embodiment.
FIG. 9 is a diagram for explaining a neural network trained to obtain a sound source from tags, according to an embodiment.
FIG. 10 is an internal block diagram of an electronic device according to an embodiment.
FIG. 11 is a flowchart illustrating a method of generating a sound source according to an embodiment.
FIG. 12 is a flowchart illustrating a method of filtering sound source generation tags according to an embodiment.
FIG. 13 is a flowchart illustrating a method of generating a sound source by filtering sound source generation tags for each type of sound source generation information, according to an embodiment.
FIG. 14 is a flowchart illustrating a method of obtaining a weight for each tag according to an embodiment.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein.

The terms used in the present disclosure are general terms currently in use, chosen in consideration of the functions described in the present disclosure, but they may take on various other meanings depending on the intention of those skilled in the art, precedent, the emergence of new technologies, and the like. Therefore, the terms used in the present disclosure should not be interpreted by their names alone, but based on their meanings and the overall content of the present disclosure.

In addition, the terms used in the present disclosure are used only to describe specific embodiments and are not intended to limit the present disclosure.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between.

As used in this specification, and particularly in the claims, “the” and similar referring terms may indicate both the singular and the plural. In addition, unless the order of the steps of a method according to the present disclosure is explicitly specified, the described steps may be performed in any suitable order. The present disclosure is not limited by the order in which the steps are described.

Phrases such as "in some embodiments" or "in an embodiment" appearing in various places in this specification do not necessarily all refer to the same embodiment.

Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by any number of hardware and/or software components that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a given function. Also, for example, the functional blocks of the present disclosure may be implemented in various programming or scripting languages. The functional blocks may be implemented as algorithms running on one or more processors. In addition, the present disclosure may employ conventional techniques for electronic environment setup, signal processing, and/or data processing. Terms such as “mechanism”, “element”, “means”, and “configuration” may be used broadly and are not limited to mechanical and physical components.

In addition, the connecting lines or connecting members between the components shown in the drawings merely illustrate functional connections and/or physical or circuit connections. In an actual device, connections between components may be represented by various functional, physical, or circuit connections that are replaceable or added.

In addition, terms such as "... unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

In addition, in the specification, the term “user” refers to a person who uses the electronic device, and may include a consumer, an evaluator, a viewer, an administrator, or an installation technician.

Hereinafter, the present disclosure is described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining generating music based on sound source generation information and providing it to a user, according to an embodiment.

Referring to FIG. 1, the electronic device 100 may output an image on a screen.

In an embodiment, the electronic device 100 may be implemented as any of various types of display devices that include a screen. As an example, FIG. 1 illustrates a case in which the electronic device 100 is a digital TV.

For example, the electronic device 100 may execute an ambient service to output an image on the screen. The ambient service may refer to a service that causes a meaningful image, such as a painting, a photo, or a clock, to be displayed instead of a black screen when a display device such as a digital TV is in an off state.

The electronic device 100 may output, on the screen, an image already stored in the electronic device 100, or may receive an image for executing the ambient service from an external server and output it on the screen. Alternatively, the electronic device 100 may perform wired or wireless communication with a peripheral device and output photos or artwork images stored in the peripheral device through the screen of the electronic device 100. For example, the electronic device 100 may output, on its screen, photos or pictures stored in a user terminal such as a USB (not shown), a PC (not shown), a tablet (not shown), or a mobile phone (not shown).

In an embodiment, the electronic device 100 may obtain sound source generation information. In an embodiment, the sound source generation information is information collected for generating a sound source and may refer to information that affects sound source generation.

In an embodiment, the information that affects sound source generation may include an image output on the screen, information indicating the external circumstances or state around the user, or information about the user's preferences or taste.

In an embodiment, the electronic device 100 may obtain information about the image output on the screen as image information, information indicating the surrounding circumstances or state as surrounding situation information, and information indicating the user's preferences or taste as user taste information.

In an embodiment, the electronic device 100 may obtain image information from the image output on the screen. The image information may be information about the intrinsic characteristics of the image itself. The image information may include at least one of a color or style identified in the image output on the screen, a type of object identified in the image, and, when the identified object is a person, the person's facial expression. The image information may also include additional information about the image.

As an example, FIG. 1 illustrates the electronic device 100 executing the ambient service and outputting Vincent van Gogh's famous painting 'Sunflowers' on the screen.

The electronic device 100 may analyze the image output on the screen and obtain at least one of information that the object included in the image is a sunflower, information that the style of the image is that of a Vincent van Gogh masterpiece, information that the color is deep yellow, and other additional information such as a description of Vincent van Gogh or of the 'Sunflowers' work.

In an embodiment, the electronic device 100 may obtain surrounding situation information. The surrounding situation information may refer to information indicating the situation around or outside the place where the electronic device 100 and the user are located. The surrounding situation information may be obtained through a camera or a sensor provided in the electronic device 100, or may be received from an external server. For example, the surrounding situation information may include at least one of information on whether a user is present, weather information, date information, time information, season information, holiday information, anniversary information, temperature information, illuminance information, and location information.

For example, in FIG. 1, the electronic device 100 may obtain information that the ambient temperature is 20 degrees Celsius through a temperature sensor (not shown), or information that the illuminance is 300 lux (lx) through an illuminance sensor (not shown). Alternatively, the electronic device 100 may obtain, from an external server or the like through a communication module (not shown), information that the current time is in the afternoon, the weather is warm, the season is autumn, today's date is September 5, and the electronic device 100 is located in the Seattle area of the State of Washington, USA.

In an embodiment, the electronic device 100 may obtain user taste information. The user taste information may refer to information indicating the user's hobbies or preferred tendencies. In an embodiment, the user taste information may be obtained from user profile information or the user's viewing history information. Alternatively, in an embodiment, the user taste information may be obtained by having the user directly select preferred music information. Alternatively, when the user has a previous music listening history, the user taste information may be obtained based on that history.

For example, in FIG. 1, the electronic device 100 may obtain, from the user's profile information, information that the user is a woman in her 30s, and, from the viewing history of the electronic device 100, information that the user's preferred program is a melodrama, and may infer the user's taste from this information. Alternatively, when the music information that the user has entered as preferred, or the music that the user has previously listened to, is classical, quiet, and played on piano and violin, the electronic device 100 may infer the user's taste from this.
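The three kinds of sound source generation information described for FIG. 1 could be held in a simple container such as the following. This is only an illustrative sketch: the class names, field names, and the rule that at least one kind must be present are assumptions made for the example, and the disclosure does not prescribe any particular data layout. The sample values are the ones given in the FIG. 1 example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ImageInfo:
    # Hypothetical fields mirroring the image characteristics named in the text.
    objects: List[str] = field(default_factory=list)
    style: Optional[str] = None
    color: Optional[str] = None
    expression: Optional[str] = None
    extra: Dict[str, str] = field(default_factory=dict)


@dataclass
class SituationInfo:
    # Hypothetical fields for sensor- or server-derived surrounding situation.
    temperature_c: Optional[float] = None
    illuminance_lx: Optional[float] = None
    time_of_day: Optional[str] = None
    weather: Optional[str] = None
    season: Optional[str] = None
    date: Optional[str] = None
    location: Optional[str] = None


@dataclass
class TasteInfo:
    # Hypothetical fields for profile, viewing history, and preferred music.
    profile: Dict[str, str] = field(default_factory=dict)
    preferred_programs: List[str] = field(default_factory=list)
    preferred_music: List[str] = field(default_factory=list)


@dataclass
class SoundSourceGenerationInfo:
    image: Optional[ImageInfo] = None
    situation: Optional[SituationInfo] = None
    taste: Optional[TasteInfo] = None

    def __post_init__(self) -> None:
        # The text requires "at least one of" the three kinds of information.
        if self.image is None and self.situation is None and self.taste is None:
            raise ValueError("at least one kind of information is required")


example = SoundSourceGenerationInfo(
    image=ImageInfo(objects=["sunflower"], style="Vincent van Gogh", color="deep yellow"),
    situation=SituationInfo(
        temperature_c=20.0,
        illuminance_lx=300.0,
        time_of_day="afternoon",
        weather="warm",
        season="autumn",
        date="09-05",
        location="Seattle, Washington, USA",
    ),
    taste=TasteInfo(
        profile={"age_group": "30s", "gender": "female"},
        preferred_programs=["melodrama"],
        preferred_music=["classical", "quiet", "piano", "violin"],
    ),
)
```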

In an embodiment, the electronic device 100 may obtain sound source generation information including at least one of image information, surrounding situation information, and user taste information, and obtain sound source generation tags mapped to the sound source generation information.
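One simple way to picture the mapping from sound source generation information to tags is a lookup table. The sketch below assumes such a table; the table contents, the tag vocabulary, and the function name `map_to_tags` are hypothetical and are not taken from the disclosure. Duplicate tags are deliberately kept in the result so that a later scoring step can take the degree of duplication into account.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from (kind, value) pairs to sound source generation tags.
TAG_TABLE: Dict[Tuple[str, str], List[str]] = {
    ("object", "sunflower"): ["bright", "warm"],
    ("color", "deep yellow"): ["warm", "cheerful"],
    ("season", "autumn"): ["calm", "acoustic"],
    ("time_of_day", "afternoon"): ["relaxed"],
}


def map_to_tags(info: Dict[str, str]) -> List[str]:
    """Return all tags mapped to the given information, keeping duplicates."""
    tags: List[str] = []
    for kind, value in info.items():
        # Entries with no mapping contribute no tags.
        tags.extend(TAG_TABLE.get((kind, value.lower()), []))
    return tags
```

For instance, `map_to_tags({"object": "sunflower", "color": "deep yellow"})` yields the tag "warm" twice, once from each entry.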

In an embodiment, the electronic device 100 may filter, from among the sound source generation tags, the sound source generation tags having high scores.

In an embodiment, the electronic device 100 may obtain a weight for each tag in order to filter the sound source generation tags. The electronic device 100 may obtain the weight for each tag, indicating the user's preference for that tag, based on at least one of user taste information and music playback history information.

In an embodiment, the electronic device 100 may obtain a score for each of the sound source generation tags mapped to the image information, based on at least one of the accuracy of a recognition result, the degree of duplication for each tag, and the weight for each tag, and may obtain first tags by filtering, from among the sound source generation tags mapped to the image information, the sound source generation tags having high scores.
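The scoring of image-derived tags could be sketched as follows. The disclosure names the inputs (recognition accuracy, degree of duplication, and per-tag weight) but this passage does not give a formula, so the combination used here is an assumption: each tag's score is the sum of the recognition accuracies of all its occurrences (which rewards both accuracy and duplication) multiplied by its weight. The function name, the default weight of 1.0, and the `top_k` cut-off are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def select_first_tags(
    candidates: Iterable[Tuple[str, float]],
    tag_weights: Dict[str, float],
    top_k: int = 3,
) -> List[str]:
    """Score (tag, accuracy) candidates and keep the top_k highest-scoring tags."""
    accuracy_sum: Dict[str, float] = defaultdict(float)
    for tag, accuracy in candidates:
        # A tag that occurs several times accumulates accuracy from each occurrence.
        accuracy_sum[tag] += accuracy
    scores = {
        tag: total * tag_weights.get(tag, 1.0)  # unseen tags get a neutral weight
        for tag, total in accuracy_sum.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ranked[:top_k]
```

With candidates `[("warm", 0.9), ("warm", 0.8), ("bright", 0.9), ("classical", 0.6)]` and weights `{"classical": 2.0, "bright": 0.5}`, the duplicated tag "warm" and the heavily weighted tag "classical" outrank "bright".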

In an embodiment, the electronic device 100 may obtain a situation-based weight for each tag, indicating user preference according to the situation. Based on the situation-based weight for each tag, the electronic device 100 may obtain a score for each of the sound source generation tags mapped to the surrounding situation information, and may obtain second tags by filtering, from among the sound source generation tags mapped to the surrounding situation information, the sound source generation tags having high scores.
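The situation-based filtering could be sketched as a lookup of weights keyed by (situation, tag). In this sketch the score of a tag is simply its situation-based weight, and tags at or above a threshold are kept; the threshold, the neutral default weight, and the names used are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Tuple


def select_second_tags(
    tags: Iterable[str],
    situation: str,
    situation_weights: Dict[Tuple[str, str], float],
    threshold: float = 0.5,
    default_weight: float = 0.5,
) -> List[str]:
    """Keep tags whose weight for the given situation reaches the threshold."""
    scores = {
        tag: situation_weights.get((situation, tag), default_weight)
        for tag in tags
    }
    kept = [tag for tag, score in scores.items() if score >= threshold]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(kept, key=lambda tag: (-scores[tag], tag))
```

The same tag can therefore be kept in one situation and dropped in another, which is how the weights express a preference that depends on the situation.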

In an embodiment, the electronic device 100 may generate a sound source using at least one of the first tags and the second tags.

In some cases, the electronic device 100 may further filter the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information, and may generate a sound source using the further filtered tags.
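The additional filtering step could take many forms; this passage does not specify the rule. As one possible reading, the sketch below removes tags that are marked unsuitable for the current situation or for the identified user. The exclusion tables, their contents, and the function name are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical exclusion rules keyed by situation and by user identifier.
SITUATION_EXCLUDED: Dict[str, Set[str]] = {"night": {"loud", "upbeat"}}
USER_EXCLUDED: Dict[str, Set[str]] = {"user_a": {"heavy"}}


def refine_tags(
    tags: Iterable[str],
    situation: Optional[str] = None,
    user_id: Optional[str] = None,
) -> List[str]:
    """Drop already-filtered tags that are excluded for this situation or user."""
    excluded: Set[str] = set()
    excluded |= SITUATION_EXCLUDED.get(situation, set())
    excluded |= USER_EXCLUDED.get(user_id, set())
    # Order of the remaining tags is preserved.
    return [tag for tag in tags if tag not in excluded]
```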

In an embodiment, the electronic device 100 may generate a sound source from the sound source generation tags using at least one neural network. The neural network used by the electronic device 100 may be a neural network trained using tags and sound sources as a training data set.
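This passage does not describe the architecture of the neural network or how tags are presented to it. A common way to feed a set of tags to a model is a multi-hot vector over a fixed tag vocabulary, sketched below as an assumption; the vocabulary and the function name are hypothetical, and the model itself is intentionally left out.

```python
from typing import Iterable, List, Sequence


def encode_tags(tags: Iterable[str], vocabulary: Sequence[str]) -> List[float]:
    """Encode tags as a multi-hot vector over a fixed vocabulary."""
    index = {tag: position for position, tag in enumerate(vocabulary)}
    vector = [0.0] * len(vocabulary)
    for tag in tags:
        if tag in index:  # tags outside the vocabulary are ignored
            vector[index[tag]] = 1.0
    return vector
```

A vector of this kind could serve as the conditioning input to a trained generator, with the paired sound sources serving as training targets.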

In an embodiment, the electronic device 100 may play music according to the generated sound source.

In an embodiment, the electronic device 100 may obtain music playback information indicating the degree to which the user plays the music, and update the weight for each tag according to the music playback information. In an embodiment, the music playback information may include information on the playback frequency of the music, the degree to which the music is listened to in full, the degree of playback interruption, the degree of fast-forwarding, and the degree of skipping.
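The weight update driven by music playback information could be sketched as a simple feedback rule. The disclosure lists the signals (playback frequency, full listening, interruption, fast-forwarding, skipping) but not how they are combined, so everything below is an assumption: the field names, the normalization of each degree to the range 0 to 1, the cap of 10 plays, the learning rate, and the clamping of weights to the range 0 to 2.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PlaybackInfo:
    play_count: int            # how many times the music was played
    full_listen_ratio: float   # 0..1, degree of listening to the whole piece
    stop_ratio: float          # 0..1, degree of playback interruption
    fast_forward_ratio: float  # 0..1, degree of fast-forwarding
    skip_ratio: float          # 0..1, degree of skipping


def update_tag_weights(
    weights: Dict[str, float],
    tags: Iterable[str],
    playback: PlaybackInfo,
    learning_rate: float = 0.1,
) -> Dict[str, float]:
    """Return a new weight table nudged by how the generated music was played."""
    positive = 0.5 * playback.full_listen_ratio + 0.5 * min(playback.play_count, 10) / 10
    negative = (playback.stop_ratio + playback.fast_forward_ratio + playback.skip_ratio) / 3
    signal = positive - negative  # in [-1, 1] when the ratios are in [0, 1]
    updated = dict(weights)  # the input table is left unchanged
    for tag in tags:
        new_weight = updated.get(tag, 1.0) + learning_rate * signal
        updated[tag] = max(0.0, min(2.0, new_weight))
    return updated
```

Under this rule, tags used for music that is replayed and heard in full gain weight, while tags used for music that is stopped, fast-forwarded, or skipped lose weight, so later filtering favors the former.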

이와 같이, 실시 예에 따르면, 전자 장치(100)는 다양한 형태의 음원 생성 정보를 획득하고, 음원 생성 정보에 기반하여 이미지, 주변 상황, 사용자 취향 등에 맞는 음원을 생성함으로써, 사용자에게 감상하는 이미지, 주변 상황, 기타 사용자 취향 등에 어울리는 음악을 제공 할 수 있다. In this way, according to the embodiment, the electronic device 100 acquires various types of sound source generation information, and based on the sound source generation information, generates a sound source suitable for an image, a surrounding situation, a user's taste, etc. It is possible to provide music suitable for surrounding situations and other user preferences.

FIG. 2 is an internal block diagram of an example of an electronic device according to an embodiment.

The electronic device 100a of FIG. 2 may be an example of the electronic device 100 of FIG. 1.

In an embodiment, the electronic device 100a may be implemented as various types of display devices capable of outputting an image through a screen. A display device may be a device that visually outputs an image to a user. For example, the electronic device 100a may be any of various types of electronic devices, such as a digital television, a wearable device, a smartphone, various personal computers (PCs) such as a desktop, a tablet PC, or a laptop computer, a personal digital assistant (PDA), a global positioning system (GPS) device, a smart mirror, an e-book reader, a navigation device, a kiosk, a digital camera, a smart watch, a home network device, a security device, or a medical device. The electronic device 100a may be stationary or mobile.

Alternatively, the electronic device 100a may take the form of a display built into the front of various home appliances such as a refrigerator or a washing machine.

Alternatively, the electronic device 100a may be implemented as an electronic device connected, through a wired or wireless communication network, to a display device that includes a screen. For example, the electronic device 100a may be implemented as a media player, a set-top box, or an artificial intelligence (AI) speaker.

In addition, the electronic device 100a according to an embodiment of the present disclosure may be included in or mounted on any of the various types of electronic devices described above, such as a digital television, a wearable device, a smartphone, a PC (for example, a desktop, a tablet PC, or a laptop computer), a PDA, a media player, a micro server, a GPS device, a smart mirror, an e-book reader, a navigation device, a kiosk, a digital camera, a smart watch, a home network device, a security device, a medical device, a display built into the front of a refrigerator, a washing machine, or another home appliance, a set-top box, or an AI speaker.

Referring to FIG. 2, the electronic device 100a may include a processor 210 and a memory 220.

The memory 220 according to an embodiment may store at least one instruction. The memory 220 may store at least one program executed by the processor 210. Predefined operation rules or programs may be stored in the memory 220. The memory 220 may also store data input to or output from the electronic device 100a.

The memory 220 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

In an embodiment, the memory 220 may include one or more instructions for obtaining sound source generation information.

In an embodiment, the memory 220 may include one or more instructions for obtaining a weight for each tag.

In an embodiment, the memory 220 may store the weight for each tag.

In an embodiment, the memory 220 may include one or more instructions for updating the weight for each tag.

In an embodiment, software for generating a sound source from tags may be stored in the memory 220.

In an embodiment, at least one neural network and/or a predefined operation rule or AI model may be stored in the memory 220. In an embodiment, the at least one neural network and/or the predefined operation rule or AI model stored in the memory 220 may include one or more instructions for generating a sound source from tags.

In an embodiment, the processor 210 controls the overall operation of the electronic device 100a. By executing one or more instructions stored in the memory 220, the processor 210 may control the electronic device 100a to function.

In an embodiment, the processor 210 may obtain sound source generation information. The sound source generation information may include at least one of image information, surrounding situation information, and user taste information.

In an embodiment, the electronic device 100a may further include a display (not shown).

In an embodiment, the processor 210 may obtain image information based on at least one of additional information about an image output on the display, a color or style identified in the image, a type of object identified in the image, and, when the identified object is a person, the person's facial expression.

In an embodiment, the electronic device 100a may further include at least one of a camera (not shown), a sensor (not shown), and a communication module (not shown).

In an embodiment, the processor 210 may obtain surrounding situation information from at least one of information about whether a user is present, weather information, date information, time information, season information, public holiday information, anniversary information, temperature information, illuminance information, and location information, each obtained from at least one of the camera, the sensor, and the communication module.

In an embodiment, the processor 210 may obtain user taste information from at least one of user profile information, the user's viewing history information, and preferred music information selected by the user.

In an embodiment, the processor 210 may filter, from among the sound source generation tags, those having high scores, and may generate a sound source using the filtered sound source generation tags.

In an embodiment, the processor 210 may obtain a score for each sound source generation tag mapped to the image information based on at least one of the accuracy of a recognition result, the degree of overlap of each tag, and the weight for each tag representing user preference, and may obtain first tags by filtering, from among those tags, the sound source generation tags having high scores.
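One way to combine the three factors named above into a per-tag score is sketched below. The disclosure does not give a formula, so the product of recognition confidence, overlap count, and user weight used here is an assumption, as are the function name `score_image_tags`, the input layout, and the `top_k` cutoff.

```python
from collections import Counter


def score_image_tags(detections, tag_weights, top_k=3):
    """Return the top_k tags by score, standing in for the "first tags".

    detections: list of (tag, recognition_confidence) pairs from image analysis.
    tag_weights: per-tag user preference weights; missing tags default to 1.0.
    """
    # Overlap: how many times the same tag was produced by the analysis.
    counts = Counter(tag for tag, _ in detections)
    # Accuracy: keep the highest recognition confidence seen for each tag.
    best_conf = {}
    for tag, conf in detections:
        best_conf[tag] = max(conf, best_conf.get(tag, 0.0))
    # Assumed scoring rule: confidence * overlap * user weight.
    scores = {
        tag: best_conf[tag] * counts[tag] * tag_weights.get(tag, 1.0)
        for tag in counts
    }
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:top_k]
```

A weighted sum would serve equally well; the product is used only because it keeps the sketch short and lets any one low factor suppress a tag.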

In an embodiment, the processor 210 may obtain a score for each sound source generation tag mapped to the surrounding situation information based on a context-based weight for each tag, which represents user preference depending on the situation, and may obtain second tags by filtering, from among those tags, the sound source generation tags having high scores.

In an embodiment, the processor 210 may generate a sound source using at least one of the first tags and the second tags.

In an embodiment, the processor 210 may further filter the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information, and may generate a sound source using the further filtered tags.

In an embodiment, the processor 210 may obtain the weight for each tag, which represents user preference, based on at least one of the user taste information and music playback history information.

In an embodiment, the processor 210 may play music according to the generated sound source and update the weight for each tag according to music playback information.

In an embodiment, the music playback information may include information about the playback frequency of the music, the degree to which the music was listened to in full, the degree to which playback was stopped, the degree of fast-forwarding, and the degree of skipping.

In an embodiment, when subsequently obtaining a score for each tag, the processor 210 may use the updated weight for each tag to obtain the score for each sound source generation tag.

In an embodiment, the processor 210 may filter the tags having high scores among the sound source generation tags and generate a sound source using the filtered sound source generation tags.

In an embodiment, the processor 210 may obtain a sound source from the sound source generation tags using at least one neural network.

In an embodiment, the processor 210 may use artificial intelligence (AI) technology. In an embodiment, the processor 210 may store at least one AI model. In an embodiment, the processor 210 may generate output data from input data using a plurality of AI models. Alternatively, the memory 220, rather than the processor 210, may store the AI models, that is, the neural networks.

In an embodiment, the neural network used by the processor 210 may be a neural network trained to obtain a sound source from tags.

In an embodiment, the processor 210 may obtain a sound source from the sound source generation tags using a neural network. In an embodiment, the neural network may include a GAN (Star Generative Adversarial Networks).

FIG. 3 is an internal block diagram of the processor of FIG. 2 according to an embodiment.

Referring to FIG. 3, the processor 210 may include a sound source generation information acquisition unit 310, a sound source generation tag acquisition unit 320, a sound source generation unit 330, and a music playback unit 340.
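The data flow between the four units of FIG. 3 can be pictured with the minimal sketch below. Every function body is a placeholder chosen for illustration: the field names, the sample values, and the dictionary returned in place of an actual sound source are assumptions, and no neural network is involved.

```python
from dataclasses import dataclass, field


@dataclass
class GenerationInfo:
    """Container for the three kinds of sound source generation information."""
    image: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)
    taste: dict = field(default_factory=dict)


def acquire_info():
    # Stands in for unit 310; the values here are made up.
    return GenerationInfo(image={"style": "watercolor"}, context={"weather": "rain"})


def acquire_tags(info):
    # Stands in for unit 320; here each information value is used directly as a tag.
    tags = list(info.image.values()) + list(info.context.values()) + list(info.taste.values())
    return sorted(tags)


def generate_source(tags):
    # Stands in for unit 330; a real system would synthesize audio or a score.
    return {"format": "midi", "tags": tags}


def play(source):
    # Stands in for unit 340; returns a description instead of producing audio.
    return f"playing {source['format']} for tags {', '.join(source['tags'])}"


def run_pipeline():
    return play(generate_source(acquire_tags(acquire_info())))
```

The point of the sketch is only the ordering: information is acquired, mapped to tags, turned into a sound source, and then played.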

The sound source generation information acquisition unit 310 according to an embodiment may obtain various sound source generation information for generating a sound source. In an embodiment, the sound source generation information may include at least one of image information, surrounding situation information, and user taste information. A method by which the sound source generation information acquisition unit 310 obtains the sound source generation information is described in more detail with reference to FIGS. 4 and 5.

The sound source generation tag acquisition unit 320 according to an embodiment may receive the sound source generation information from the sound source generation information acquisition unit 310 and derive sound source generation tags from it.

In an embodiment, the sound source generation tag acquisition unit 320 may obtain sound source generation tags mapped to the sound source generation information.

In an embodiment, a tag may mean metadata, such as a keyword or word, assigned to information. Assigning a tag to information may mean associating the information with tags from various fields or with various attributes.

In an embodiment, a plurality of pieces of information and the tags assigned to each piece of information may be stored in the memory 220 or in a database (not shown) inside the electronic device 100a. Alternatively, the pieces of information and their assigned tags may be stored in an external server rather than in the electronic device 100a.

In an embodiment, the sound source generation tag acquisition unit 320 may receive the sound source generation information from the sound source generation information acquisition unit 310, search the large body of information stored in the memory 220, the database, or the external server for the sound source generation information, and retrieve the tags mapped to it.

Hereinafter, among the plurality of tags, the tags mapped to the sound source generation information are referred to as sound source generation tags.
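The lookup of tags mapped to sound source generation information can be sketched as a simple keyed store. The in-memory dictionary `TAG_STORE` stands in for the memory 220, the database, or the external server; its keys and tags are hypothetical examples, not content from the disclosure.

```python
# Hypothetical store: (kind of information, value) -> tags assigned to it.
TAG_STORE = {
    ("weather", "rain"): ["calm", "piano", "minor"],
    ("season", "winter"): ["slow", "warm"],
    ("object", "dog"): ["playful", "bright"],
}


def lookup_generation_tags(generation_info):
    """Return the tags mapped to each (kind, value) item, in input order.

    Duplicates are kept on purpose so that a later scoring step can
    take the degree of overlap of each tag into account.
    Items with no stored entry contribute no tags.
    """
    tags = []
    for item in generation_info:
        tags.extend(TAG_STORE.get(item, []))
    return tags
```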

In an embodiment, the sound source generation tag acquisition unit 320 may filter the sound source generation tags. In an embodiment, to filter the sound source generation tags, the sound source generation tag acquisition unit 320 may assign a score to each of them. To assign the scores, the sound source generation tag acquisition unit 320 may obtain a weight for each tag. The weight for each tag may indicate the user's preference for that tag. In an embodiment, the sound source generation tag acquisition unit 320 may obtain the weight for each tag based on at least one of the user taste information and the music playback history. The sound source generation tag acquisition unit 320 may obtain a score for each tag in consideration of the weight for each tag, the degree of overlap of the tags, and the like, and may filter the sound source generation tags according to the scores.

In an embodiment, the sound source generation tag acquisition unit 320 may further filter the filtered sound source generation tags according to the surrounding situation information, the user profile information, and the like. The sound source generation tag acquisition unit 320 may pass the filtered sound source generation tags to the sound source generation unit 330.

The sound source generation unit 330 according to an embodiment may receive the sound source generation tags from the sound source generation tag acquisition unit 320 and, using those tags, obtain a sound source or obtain a musical score for generating a sound source. In an embodiment, a sound source may mean music data in a form that can be downloaded or played through real-time streaming. For example, the sound source may be a playable music file such as mp3, midi, or wav.

In an embodiment, the sound source generation unit 330 may obtain a sound source using a neural network. The neural network used by the sound source generation unit 330 may be a neural network trained using music content and tags that suit the music content as a training data set. More specifically, the music content may be encoded as text information such as key, chord, melody, beat, meter, tempo, rhythm, genre, and mood. Each piece of music content may be labeled with the tags that suit it and used as the training data set. The trained neural network may receive tags as input and obtain music content, that is, a sound source, that suits the input tags.
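The pairing of tags with text-encoded music content can be sketched as below. The `key=value` token format, the attribute names, and the fixed attribute order are assumptions made for illustration; the disclosure states only that music content may be encoded as text information and labeled with suitable tags.

```python
def encode_music_content(attrs):
    """Serialize musical attributes as 'name=value' tokens in a fixed order."""
    order = ["key", "chord", "time_signature", "tempo", "genre", "mood"]
    return " ".join(f"{name}={attrs[name]}" for name in order if name in attrs)


def make_training_example(tags, attrs):
    """Build one (input, target) pair for a tag-to-music model.

    The tags are the model input; the encoded music text is the target.
    """
    return {"input": " ".join(sorted(tags)), "target": encode_music_content(attrs)}
```

For example, `make_training_example(["calm", "rain"], {"tempo": 72, "key": "A_minor"})` pairs the input text `calm rain` with the target text `key=A_minor tempo=72`.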

The music playback unit 340 according to an embodiment may play music according to the sound source generated by the sound source generation unit 330.

The music playback unit 340 may receive a musical score from the sound source generation unit 330 and play the sound source according to the score, or may receive the sound source itself from the sound source generation unit 330 and play it using a music player.

The user may listen to the music played by the music playback unit 340. The user may like the music and listen to it several times, or may dislike it and stop playback before the music ends.

In an embodiment, information about the user's music listening, that is, information about music playback, may be fed back to the sound source generation tag acquisition unit 320. The sound source generation tag acquisition unit 320 may obtain the music playback information and use it to update the weight for each tag. Based on the user's music playback history, the sound source generation tag acquisition unit 320 may update the weights by raising the weight of tags associated with a sound source the user listened to repeatedly and lowering the weight of tags associated with a sound source the user stopped. The sound source generation tag acquisition unit 320 may then obtain a score for each tag using the updated weights and filter the tags accordingly, so that music reflecting the user's music playback history is generated.
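The feedback rule described above, raising weights for repeated listening and lowering them for stopped playback, can be sketched as follows. The step size, the clamping range, the default weight, and the event names `repeated` and `stopped` are assumptions; the disclosure does not specify how much a weight changes.

```python
def update_weights(weights, tags, event, step=0.1, lo=0.0, hi=1.0, default=0.5):
    """Return a new weight table after one playback event.

    event: 'repeated' raises the weight of each tag, 'stopped' lowers it,
    and any other event leaves existing weights unchanged.
    """
    delta = {"repeated": step, "stopped": -step}.get(event, 0.0)
    updated = dict(weights)  # copy so the caller's table is not mutated
    for tag in tags:
        new_value = updated.get(tag, default) + delta
        updated[tag] = min(hi, max(lo, new_value))  # clamp to [lo, hi]
    return updated
```

Returning a new table rather than mutating in place keeps the previous weights available, which may be useful if the update needs to be compared or rolled back.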

FIG. 4 is an internal block diagram of the sound source generation information acquisition unit of FIG. 3 according to an embodiment.

Referring to FIG. 4, the sound source generation information acquisition unit 310 may include an image information acquisition unit 311, a surrounding situation information acquisition unit 313, and a user taste information acquisition unit 315.

The image information acquisition unit 311 according to an embodiment may obtain image information from an image output on the electronic device 100a. The image information acquisition unit 311 may capture the image output on the screen and analyze the captured image to obtain the image information.

The image information may be information about the intrinsic characteristics of the image output on the screen. The image information may be obtained based on at least one of a color or style identified in the image, a type of object identified in the image, and, when the identified object is a person, the person's facial expression. Alternatively, the image information acquisition unit 311 may receive additional information about the image output on the screen, together with the image or separately from it, from a memory inside the electronic device 100a, an external server, an external user terminal, or the like, and use it as the image information.

The surrounding situation information acquisition unit 313 according to an embodiment may receive at least one of a communication signal, a sensor signal, and a camera signal. The surrounding situation information acquisition unit 313 may newly obtain at least one of the communication signal, the sensor signal, and the camera signal at fixed intervals, at random intervals, at preset times, or whenever an event occurs, such as a sudden change in temperature or a change of date.
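The conditions under which the signals are newly obtained can be sketched as a single check. Only three of the listed triggers are covered here (a fixed interval, a sudden temperature change, and a change of date); the function name, the parameters, and the 5-degree threshold are assumptions for illustration.

```python
def should_refresh(now_s, last_s, interval_s, temp_now, temp_last,
                   date_now, date_last, temp_jump=5.0):
    """Return True when the context signals should be acquired again."""
    if now_s - last_s >= interval_s:             # the fixed interval has elapsed
        return True
    if abs(temp_now - temp_last) >= temp_jump:   # sudden temperature change
        return True
    return date_now != date_last                 # the date has changed
```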

The communication signal is a signal obtained from an external server or the like through a communication network, and may include information indicating an external situation, for example, at least one of external weather information, date information, time information, season information, illuminance information, temperature information, location information, and public holiday information.

The surrounding situation information acquisition unit 313 according to an embodiment may use various sensors to obtain sensor signals about the external situation around the electronic device 100a. A sensor signal is a signal sensed through a sensor, and may take various forms depending on the type of sensor.

For example, the surrounding situation information acquisition unit 313 may detect the ambient temperature or humidity using a temperature/humidity sensor. Alternatively, the surrounding situation information acquisition unit 313 may detect the illuminance around the electronic device 100a using an illuminance sensor. The illuminance sensor may measure the amount of ambient light and determine the brightness from that amount. Alternatively, the surrounding situation information acquisition unit 313 may detect the location of the electronic device 100a using a location sensor.

Alternatively, the surrounding situation information acquisition unit 313 may detect the distance between the electronic device 100a and the user using a location sensor and/or a proximity sensor.

Alternatively, the surrounding situation information acquisition unit 313 may use a presence sensor to sense whether a person is nearby, based on whether an IR signal emitted from the presence sensor is reflected back, the time it takes for the reflected signal to return, and the like. Alternatively, the surrounding situation information acquisition unit 313 may use a camera instead of a presence sensor to identify whether a user is near the electronic device 100a. Depending on whether a user is captured by the camera lens, the surrounding situation information acquisition unit 313 may determine whether a user is present and obtain the presence or absence of the user as surrounding situation information.
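One way to turn the IR reflection described above into a presence decision is a time-of-flight estimate, sketched below. This interpretation is an assumption: the disclosure says only that presence may be sensed from whether the IR signal returns and how long it takes. The 3-meter range and the function name are likewise hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def presence_from_ir(round_trip_s, max_range_m=3.0):
    """Estimate presence from an IR round-trip time.

    round_trip_s: seconds between emission and return, or None if no
    reflection was received. Returns (present, distance_m).
    """
    if round_trip_s is None:
        return (False, None)  # nothing reflected the signal back
    # The signal travels to the reflector and back, so halve the path length.
    distance_m = SPEED_OF_LIGHT_M_S * round_trip_s / 2.0
    return (distance_m <= max_range_m, distance_m)
```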

실시 예에 따른 사용자 취향 정보 획득부(315)는 사용자 취향 정보를 획득할 수 있다. 사용자 취향 정보는 사용자가 선호하는 음악을 추론하기 위해 획득될 수 있다. The user taste information acquisition unit 315 according to an embodiment may obtain user taste information. User taste information may be obtained to infer the user's preferred music.

실시 예에서, 사용자 취향 정보는 사용자 프로필 정보나 사용자의 시청 이력 정보, 선호 음악 정보 중 적어도 하나로부터 추론될 수 있다. In an embodiment, the user taste information may be inferred from at least one of user profile information, user viewing history information, and preferred music information.

사용자 프로필 정보는 사용자를 식별하기 위한 정보로, 사용자의 계정(account)을 기반으로 생성될 수 있다. 사용자 프로필 정보는 사용자의 성별, 나이, 결혼 유무, 자녀 유무, 가족 수, 직업, 생일 등의 기념일 정보를 포함할 수 있다. 예컨대, 사용자가 전자 장치(100a)에 계정을 생성한 경우, 전자 장치(100a)는 사용자가 계정을 생성할 때 입력한 프로필 정보를 사용자 계정에 매칭시키고 이를 전자 장치(100a) 내부에 저장하거나 또는 전자 장치(100a)에 서비스를 제공하는 연동된 외부 서버에 저장시킬 수 있다. 또는, 전자 장치(100a)에 카메라가 구비된 경우, 전자 장치(100a)는 카메라를 통해 포착된 사용자의 얼굴을 인식하여 사용자의 연령대나 사용자의 성별 등을 식별하고 이로부터 사용자 취향 정보를 추론할 수도 있다.User profile information is information for identifying a user and may be generated based on the user's account. The user profile information may include anniversary information such as the user's gender, age, marital status, child status, number of family members, occupation, and birthday. For example, when a user creates an account on the electronic device 100a, the electronic device 100a matches profile information entered by the user when creating an account with the user account and stores it in the electronic device 100a, or It can be stored in an interlocked external server that provides services to the electronic device 100a. Alternatively, when the electronic device 100a is equipped with a camera, the electronic device 100a recognizes the user's face captured through the camera to identify the user's age or gender, and infer user preference information therefrom. may be

실시 예에서, 사용자 취향 정보 획득부(315)는 사용자가 전자 장치(100a)를 이용하여 프로그램이나 콘텐츠를 시청한 이력 정보를 획득하고 이로부터 사용자 정보를 추론할 수도 있다. In an embodiment, the user preference information acquisition unit 315 may acquire history information on the user's viewing of programs or content using the electronic device 100a and infer user information from this.

예컨대, 사용자가 주로 시청하는 콘텐츠가 애완동물과 관련된 콘텐츠이고, 또한 로맨틱 코메디 장르의 콘텐츠인 경우, 사용자 취향 정보 획득부(315)는 사용자가 밝고 따뜻한 음악을 선호할 것이라고 추론할 수 있다. For example, when the content that the user mainly watches is related to pets and is also of the romantic comedy genre, the user taste information acquisition unit 315 may infer that the user prefers bright and warm music.

또는 실시 예에서, 사용자 취향 정보는 사용자로부터 선택 받은 선호 음악 정보로부터 획득될 수도 있다. 예컨대, 사용자는 전자 장치(100a)에 계정을 생성할 때 선호하는 음악에 대한 정보를 입력하거나, 또는 전자 장치(100a)를 이용하여 음악을 자동으로 생성하는 프로그램이 실행되도록 하기 위해서 프로그램 이용 시 최초에 한해 전자 장치(100a)에 선호 음악에 대한 정보를 직접 입력할 수도 있다. Alternatively, in an embodiment, the user taste information may be obtained from preferred music information selected by the user. For example, the user inputs information about preferred music when creating an account on the electronic device 100a, or when using the program for the first time to execute a program that automatically creates music using the electronic device 100a. Information on preferred music may be directly input into the electronic device 100a only for .

Alternatively, when the user has a previous music listening history, the user taste information may be obtained or updated based on that history. For example, the user taste information may be obtained from information about the mood, velocity, instruments, key, chords, melody, beat, time signature, tempo, rhythm, genre, and atmosphere of music that the user prefers or has previously listened to. In addition, the user taste information acquisition unit 315 may determine how strongly the user prefers a specific piece of music from how often the user played it and from the playback time, which indicates whether the user listened to all of the music or only part of it, and may infer user taste information from that degree of preference. The user taste information may be updated according to the user's previous music listening history, either periodically or whenever a music listening event occurs. Accordingly, the more listening history there is, the more accurate the user taste information that can be obtained.

As described above, according to an embodiment, the electronic device 100a may generate a sound source using various types of sound source generation information, such as image information, surrounding situation information, and user taste information. Accordingly, even when the same image is output on the screen, the electronic device 100a may generate a different sound source depending on the surrounding situation information or the user taste information and provide it to the user.

FIG. 5 is a diagram for explaining how the image information acquisition unit of FIG. 4 obtains image information, according to an embodiment.

In an embodiment, the image information acquisition unit 311 may receive an image and obtain image information from the image. The image information may be information representing characteristics inherent to the image itself. The image information may be obtained from at least one of additional information about the image, a color or style identified in the image, the type of an object identified in the image, and, when the identified object is a person, the person's facial expression.

In an embodiment, the image information acquisition unit 311 may obtain image information from an image using at least one neural network. The at least one neural network may be a neural network based on a Convolution Neural Network (CNN), a Deep Convolution Neural Network (DCNN), or a Capsnet, but is not limited thereto.

In an embodiment, the image information acquisition unit 311 may obtain color information from an image. The color information may be the RGB values of the colors used most in the image. The image information acquisition unit 311 may group the RGB values of the pixels into similar colors using a color difference algorithm. The image information acquisition unit 311 may then cluster the grouped colors to find the dominant ones, obtaining RGB values for one or more dominant colors per image. In an embodiment, the image information acquisition unit 311 may identify that the dominant colors in the image are blue, sky blue, and white, and may additionally obtain emotion information that suits these colors, such as a refreshing feeling or coolness.
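
The grouping and clustering step described above can be sketched roughly as follows. The description does not name a specific color difference algorithm, so this sketch substitutes a simple assumption: pixels are grouped into coarse RGB buckets, and the most populated buckets are reported as the dominant colors. The function name `dominant_colors` and the parameters `bucket` and `top_n` are hypothetical.

```python
from collections import Counter

def dominant_colors(pixels, bucket=64, top_n=3):
    """Group pixels into coarse RGB buckets and return the centers of the
    most populated buckets (an assumed stand-in for color-difference grouping)."""
    counts = Counter()
    for r, g, b in pixels:
        # Pixels whose channels fall in the same bucket count as "similar".
        counts[(r // bucket, g // bucket, b // bucket)] += 1
    half = bucket // 2
    return [
        (kr * bucket + half, kg * bucket + half, kb * bucket + half)
        for (kr, kg, kb), _ in counts.most_common(top_n)
    ]

# Tiny illustrative "image": mostly blue, some white, one red pixel.
pixels = (
    [(10, 20, 200)] * 5
    + [(250, 250, 250)] * 3
    + [(12, 25, 210)] * 2
    + [(200, 30, 30)]
)
print(dominant_colors(pixels, top_n=2))
```

A production system would more likely measure color difference in a perceptual color space and use a real clustering method; the bucket approach is only meant to make the two stages (grouping, then picking dominant groups) concrete.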

In an embodiment, the image information acquisition unit 311 may obtain style information from an image. The style information may indicate whether the style of the image is noir, vintage, romantic, horror, and so on. When the image is a painting, the style information may include the painting style. The style information may indicate a painting method or form, such as watercolor, oil painting, ink wash painting, pointillism, or cubism, or may refer to the tendencies and characteristics of a specific painter, such as the style of Van Gogh, Monet, Manet, or Picasso. Alternatively, the style information may be a characteristic classified by era, such as medieval, Renaissance, modern, or contemporary painting; a characteristic classified by region, such as Eastern or Western painting; or a characteristic of a school of painting, such as Impressionism, abstract art, or Realism. Alternatively, the style information may include information on the texture, color tone, mood, contrast, or gloss of the image, or on the three attributes of color: intensity, hue, and saturation.

Alternatively, when the image is a photograph, the style information may include information about the camera shooting technique. For example, the style information may indicate whether the photograph was taken with a panning shot, a tilting shot, a zooming shot, a macro shot, a night shot, and so on. Alternatively, the style information may include, but is not limited to, the composition of the subject, the angle of view, the degree of exposure, the lens type, the degree of blurring, and the focal length.

In an embodiment, the image information acquisition unit 311 may perform object detection on an image. The image information acquisition unit 311 may detect objects in the image using image processing technology or artificial intelligence technology. For example, the image information acquisition unit 311 may perform object detection by using a deep neural network (DNN) with two or more hidden layers to recognize that an object is present in the image, classify what the object is, and identify where the object is located (localization). For example, in FIG. 5, the image information acquisition unit 311 may detect objects such as people, clouds, sky, and sea in the image. The image information acquisition unit 311 may also identify that the detected people are an adult and a child. From the fact that the detected object types are a family made up of a child and an adult, clouds, sky, sea, and so on, the image information acquisition unit 311 may further obtain emotion information such as happiness and joy.

In an embodiment, when an object is a person, the image information acquisition unit 311 may identify the person's facial expression. For example, the image information acquisition unit 311 may detect a face in the image using at least one neural network. The image information acquisition unit 311 may extract features from the detected face and use them to recognize the facial expression. The image information acquisition unit 311 may infer an emotion from the recognized expression. In FIG. 5, the image information acquisition unit 311 may identify the expression of a person in the image and, because the person is smiling, infer an emotion such as joy or happiness.

In an embodiment, the image information acquisition unit 311 may also obtain additional information about an image as image information. The additional information may include the size of the image, the year it was produced, a description of the image, and the like. The description may include various information, such as the creation history of the image; a description of its subject or atmosphere; a description of its type, style, color, or texture; a description of the photographer, painter, or other artist who created it; the place where it was created; and any awards it has received.

The image information acquisition unit 311 may obtain the additional information about an image together with the image. For example, the image information acquisition unit 311 may receive the additional information from a server (not shown) or from a user terminal near the electronic device 100a. The electronic device 100a may receive the additional information together with the image from the server or the user terminal. Alternatively, the electronic device 100a may search for and obtain information about the image from a server that is separate from the server or user terminal from which the image was received. For example, when the image is a famous work, the electronic device 100a may receive only the title of the image together with the image and use the title to obtain detailed additional information about the image from a separate server.

As described above, according to an embodiment, the image information acquisition unit 311 may obtain various types of image information from an image. The image information acquisition unit 311 may also obtain emotion information corresponding to the image information.

FIG. 6 is an internal block diagram of the sound source generation tag acquisition unit of FIG. 3, according to an embodiment.

Referring to FIG. 6, the sound source generation tag acquisition unit 320 may include a tag mapping unit 321, a database 322, a tag filtering unit 323, and a weight acquisition unit 324.

In an embodiment, the sound source generation tag acquisition unit 320 may include a database (DB) 322. The database 322 may store, in the form of data, a plurality of pieces of information related to sound sources and the tags mapped to each piece of information.

A tag may be an identifier that classifies information, marks a boundary, or indicates an attribute or identity of information. A tag may take the form of a word, an image, or another identifying mark. In the database 322, one or more tags may be assigned to and stored for each of the plurality of pieces of information related to sound sources. Tags are used to manage or search large amounts of information effectively, and may be attached to each piece of information to classify it.

In an embodiment, the tag mapping unit 321 may use these properties of tags to look up the tags that correspond to given information. In an embodiment, the tag mapping unit 321 may receive sound source generation information from the sound source generation information acquisition unit 310 and search the database 322 for the sound source generation tags mapped to that information. That is, the tag mapping unit 321 may find, among the information stored in the database 322, the sound source generation information received from the sound source generation information acquisition unit 310, and retrieve the tags mapped to it, namely the sound source generation tags. The tag mapping unit 321 may pass the retrieved sound source generation tags to the tag filtering unit 323.
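
As a rough illustration of the lookup performed by the tag mapping unit 321, the sketch below models the database 322 as a dictionary from a piece of sound source generation information to its mapped tags. The dictionary contents, the name `TAG_DB`, and the function `map_tags` are all hypothetical; the description does not specify the storage format.

```python
# Hypothetical stand-in for database 322: information item -> mapped tags.
TAG_DB = {
    "sea": ["calm", "wide", "cool"],
    "smile": ["happy", "bright"],
    "blue": ["cool", "fresh"],
}

def map_tags(generation_info, db=TAG_DB):
    """Return every tag mapped to the given information items.

    Duplicates are kept on purpose so that a later step can count how
    often the same tag was reached from different information items.
    """
    tags = []
    for item in generation_info:
        tags.extend(db.get(item, []))  # unknown items map to no tags
    return tags

tags = map_tags(["sea", "blue", "unknown"])
print(tags)
```

Keeping duplicates rather than returning a set is a design choice in this sketch: it preserves the information needed to compute a per-tag degree of overlap afterwards.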

However, this is only one embodiment. The information related to sound sources and the tags mapped to it may be stored not in the database 322 inside the electronic device 100a but in a server or the like outside the electronic device 100a. In this case, the tag mapping unit 321 may transmit the sound source generation information to the external server through a communication unit (not shown) and receive, from the external server, the sound source generation tags mapped to that information.

In an embodiment, the weight acquisition unit 324 may obtain a weight for each tag. In an embodiment, the per-tag weight may be information representing the user's preference for each tag.

In an embodiment, the weight acquisition unit 324 may generate the per-tag weights based on user taste information. The weight acquisition unit 324 may receive the sound source generation information that relates to user taste from the sound source generation information acquisition unit 310 and use it to quantify the user's preference for each tag, thereby generating the per-tag weights.

Thereafter, the tag filtering unit 323 may further update the per-tag weights according to the user's music listening history information, that is, music playback information. For example, the weight acquisition unit 324 may use, as music playback information, information on how often music is played, how much of the music is listened to, how often playback is stopped, how often the music is fast-forwarded, and how often it is skipped, and may update the per-tag weights on that basis. For example, for music that was played to the end and that the user listened to repeatedly, the weight acquisition unit 324 may assign higher weights to the tags used to generate that music; conversely, for music that the user stopped or skipped, it may assign lower weights to the tags used to generate that music. In this way the per-tag weights are updated.
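
The update rule described above (raise the weights of tags behind music the user finished or repeated, lower them for music the user stopped or skipped) could look something like the following. The 0 to 100 weight scale, the event names, and the step sizes in `EVENT_DELTA` are assumptions chosen for illustration; the description gives no concrete values.

```python
# Hypothetical mapping from a playback event to a weight change.
EVENT_DELTA = {"completed": 10, "repeated": 20, "stopped": -10, "skipped": -20}

def update_tag_weights(weights, used_tags, event, default=50):
    """Return a new weight table after one playback event.

    `used_tags` are the tags that were used to generate the played music.
    Weights are clamped to the assumed range 0..100.
    """
    delta = EVENT_DELTA[event]
    updated = dict(weights)  # leave the caller's table untouched
    for tag in used_tags:
        updated[tag] = max(0, min(100, updated.get(tag, default) + delta))
    return updated

weights = {"piano": 50, "fast": 95}
weights = update_tag_weights(weights, ["piano", "fast"], "repeated")
weights = update_tag_weights(weights, ["fast", "dark"], "skipped")
print(weights)
```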

In an embodiment, the weight acquisition unit 324 may generate context-based per-tag weights. A context-based per-tag weight is a per-tag weight that takes the surrounding situation into account. The weight acquisition unit 324 may generate the context-based per-tag weights by applying an additional weighting, determined by the surrounding situation, to the per-tag weights. That is, for music that the user prefers in a specific surrounding situation, or for a sound source that the user has played in a specific surrounding situation, the weight acquisition unit 324 may apply a situation-dependent weighting to the tags used to generate that sound source, thereby generating the context-based per-tag weights.

For example, if the music that the user mostly listened to on mornings with spring rain has a moderate tempo, moderate velocity, a piano as the instrument, and a mid-range pitch, the weight acquisition unit 324 may generate the context-based per-tag weights by assigning, for the specific situation of a morning with spring rain, higher weights to the tags used to generate music with these characteristics.
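
One way to picture the context-based per-tag weight is as the base per-tag weight multiplied by a situation-specific factor. The multiplicative form, the context key `("spring", "rain", "morning")`, and all numeric values below are assumptions for illustration only; the description states just that an additional situation-dependent weighting is applied.

```python
def context_based_weights(tag_weights, context_boosts, context):
    """Apply an assumed multiplicative boost for the current context.

    `context_boosts` maps a context key to per-tag factors; tags without
    a factor for this context keep their base weight (factor 1.0).
    """
    boosts = context_boosts.get(context, {})
    return {tag: weight * boosts.get(tag, 1.0) for tag, weight in tag_weights.items()}

tag_weights = {"piano": 50, "medium_tempo": 40, "drums": 60}
context_boosts = {
    # Hypothetical factors learned from what the user played in this situation.
    ("spring", "rain", "morning"): {"piano": 1.5, "medium_tempo": 1.5},
}
result = context_based_weights(tag_weights, context_boosts, ("spring", "rain", "morning"))
print(result)
```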

In an embodiment, the weight acquisition unit 324 may generate the per-tag weights and/or the context-based per-tag weights in the form of a table and store them. The weight acquisition unit 324 may keep updating the per-tag weights and/or the context-based per-tag weights in consideration of how much the user plays music.

In an embodiment, the tag filtering unit 323 may filter the sound source generation tags received from the tag mapping unit 321 to obtain only the tags that will be used directly for sound source generation.

In an embodiment, the tag filtering unit 323 may assign scores to the sound source generation tags in order to filter them. To assign the scores, the tag filtering unit 323 may receive the per-tag weights from the weight acquisition unit 324 and take them into account when generating a score for each tag.

In an embodiment, when some of the sound source generation tags are duplicated, the tag filtering unit 323 may take the degree of overlap into account when generating the per-tag scores. The degree of overlap indicates, when duplicated tags exist among the tags mapped to the sound source generation information, how much they are duplicated. For example, when the tags received from the tag mapping unit 321 include identical tags, the tag filtering unit 323 may determine the degree of overlap for each tag based on the number of identical tags.
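
Counting identical tags, as described above, is straightforward to sketch. Here the degree of overlap is assumed to be simply the number of times a tag appears among the mapped tags; the function name `overlap_degree` and the sample tags are hypothetical.

```python
from collections import Counter

def overlap_degree(tags):
    """Return, per tag, how many times it occurs among the mapped tags."""
    return dict(Counter(tags))

mapped_tags = ["cool", "calm", "cool", "fresh", "cool", "calm"]
print(overlap_degree(mapped_tags))
```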

In an embodiment, the tag filtering unit 323 may generate the per-tag scores using various information in addition to the per-tag weights and the degree of overlap. In an embodiment, the information that the tag filtering unit 323 uses to assign the per-tag scores may vary depending on the sound source generation information.

In an embodiment, when the sound source generation information is image information, the tag filtering unit 323 may further consider the accuracy of the image recognition result, in addition to the per-tag weight and the degree of overlap. The accuracy of the image recognition result may mean the confidence of the object detection result for the image, that is, information indicating how accurately an object was detected in the image. Alternatively, the accuracy of the image recognition result may be determined by the proportion of the image that the object occupies. For example, if the objects in an image are one person's face and one car, with the face occupying 70 percent of the whole image and the car occupying 10 percent, the accuracy of the image recognition result may be assigned as 70 percent for the face and 10 percent for the car.
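
The area-proportion variant of the accuracy value can be sketched as the ratio of an object's bounding-box area to the image area. Representing a detected object by an axis-aligned box `(x0, y0, x1, y1)` is an assumption of this sketch; the description only says the value depends on the proportion of the image the object occupies.

```python
def area_share(box, image_size):
    """Fraction of the image covered by an axis-aligned bounding box.

    `box` is (x0, y0, x1, y1) in pixels and `image_size` is (width, height);
    both representations are assumptions for this sketch.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    return ((x1 - x0) * (y1 - y0)) / (width * height)

image_size = (100, 100)
face_share = area_share((0, 0, 100, 70), image_size)  # face covers 70% of the image
car_share = area_share((0, 70, 50, 90), image_size)   # car covers 10% of the image
print(face_share, car_share)
```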

In an embodiment, the tag filtering unit 323 may select the high-scoring tags by considering, for each sound source generation tag mapped to the image information, at least one of the per-tag weight, the degree of overlap, and the accuracy of the image recognition result.

In an embodiment, when the sound source generation information is surrounding situation information, the tag filtering unit 323 may use the context-based per-tag weights to assign a score to each sound source generation tag mapped to the surrounding situation information. In an embodiment, to score the sound source generation tags obtained from the surrounding situation information, the tag filtering unit 323 may obtain the context-based per-tag weights from the weight acquisition unit 324 and use them to generate per-tag scores suited to the situation. The tag filtering unit 323 may obtain, based on the context-based per-tag weights, a score for each sound source generation tag mapped to the surrounding situation information, and select the high-scoring tags.

In an embodiment, the tag filtering unit 323 may select, from among the sound source generation tags, those with high scores. In an embodiment, when the number of sound source generation tags is equal to or greater than a reference value, the tag filtering unit 323 may filter the tags so that their number becomes equal to or less than the reference value. Here, the reference value may be determined in proportion to the number of sound source generation tags received from the tag mapping unit 321, or may be a preset number. For example, when 200 sound source generation tags are mapped to the sound source generation information, the tag filtering unit 323 may keep, in descending order of score, only the 40 tags that correspond to a set percentage of 20%. Alternatively, the tag filtering unit 323 may keep, in descending order of score, only a predetermined number of tags, such as 30 out of 100 sound source generation tags. Alternatively, the tag filtering unit 323 may keep only the tags whose scores are equal to or higher than a predetermined score. Alternatively, the tag filtering unit 323 may keep, in descending order of score, only the tags that both meet a predetermined score and fall within a predetermined number.
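
The four selection rules above (a percentage of the received tags, a fixed count, a minimum score, or a minimum score combined with a count) can be sketched with one helper. The function name `filter_tags` and its parameters `percent`, `max_count`, and `min_score` are hypothetical, and the scores are assumed to have been computed already.

```python
def filter_tags(scores, percent=None, max_count=None, min_score=None):
    """Keep the highest-scoring tags under the given (optional) limits.

    `scores` maps tag -> score. `percent` limits the result to that
    percentage of the received tags, `max_count` to a fixed number, and
    `min_score` drops tags below a threshold. Limits may be combined.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if min_score is not None:
        ranked = [(tag, score) for tag, score in ranked if score >= min_score]
    limit = len(ranked)
    if percent is not None:
        limit = min(limit, len(scores) * percent // 100)
    if max_count is not None:
        limit = min(limit, max_count)
    return [tag for tag, _ in ranked[:limit]]

scores_200 = {f"tag{i}": i for i in range(200)}
scores_100 = {f"tag{i}": i for i in range(100)}

top_20_percent = filter_tags(scores_200, percent=20)        # 40 of 200 tags
top_30 = filter_tags(scores_100, max_count=30)              # 30 of 100 tags
above_90 = filter_tags(scores_100, min_score=90)            # score >= 90
combined = filter_tags(scores_100, min_score=90, max_count=5)
print(len(top_20_percent), len(top_30), len(above_90), combined)
```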

In an embodiment, the tag filtering unit 323 may pass the filtered tags to the sound source generation unit 330. The tag filtering unit 323 may pass to the sound source generation unit 330, together, the tags obtained by filtering the sound source generation tags mapped to the image information and the tags obtained by filtering the sound source generation tags mapped to the surrounding situation information.

In an embodiment, the tag filtering unit 323 may filter the already filtered sound source generation tags again. For example, when the number of filtered sound source generation tags is still large, the tag filtering unit 323 may filter them further.

For example, the tag filtering unit 323 may combine the tags obtained by filtering the sound source generation tags mapped to the image information with the tags obtained by filtering the sound source generation tags mapped to the surrounding situation information, filter the combined set again, and pass only the remaining tags to the sound source generation unit 330.

Alternatively, the tag filtering unit 323 may filter the sound source generation tags once more so that sound sources of different music styles are generated depending on the surrounding situation information. Based on surrounding situation information obtained through a sensor, a server, or the like, the tag filtering unit 323 may assign different scores to the sound source generation tags, for example when the current situation is a winter night versus a summer midday with strong sunlight, so that different sound source generation tags are selected depending on the surrounding situation. In this case, the tag filtering unit 323 transmits to the sound source generation unit 330 the sound source generation tags that have been filtered once more according to the surrounding environment, so the sound source generation unit 330 can generate sound sources of different music styles, each with an atmosphere, genre, tempo, equalizer setting, and so on that better suits its surrounding environment.

Alternatively, as another example, the tag filtering unit 323 may filter the sound source generation tags once more according to user identification information. When the electronic device 100a includes a camera, the tag filtering unit 323 may use user identification information recognized through the camera so that different sound source generation tags are selected for different users. For example, the tag filtering unit 323 may assign different scores to the sound source generation tags so that different sound sources are generated when the user identification information obtained with the camera or the like identifies the user as a male in his teens and when it identifies the user as a female in her fifties.

Alternatively, when the user has logged in with his or her own account, the tag filtering unit 323 may obtain the user profile information corresponding to the user account and use it to select the sound source generation tags that better suit the user profile.

The tag filtering unit 323 selects different sound source generation tags depending on generation or gender and passes them to the sound source generation unit 330. Accordingly, the sound source generation unit 330 may generate sound sources whose style, such as atmosphere or tempo, differs by generation or gender.

FIG. 7 is a diagram for explaining how the tag filtering unit of FIG. 6 filters tags in consideration of per-tag scores, according to an embodiment.

In an embodiment, the tag filtering unit 323 may assign scores to the sound source generation tags and filter the sound source generation tags according to the scores. The tag filtering unit 323 may list the sound source generation tags, receive a per-tag weight for each tag from the weight acquisition unit 324, and assign that weight to each tag. In addition, when some of the sound source generation tags are duplicated, the tag filtering unit 323 may assign a duplication count to each tag.
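As an illustration of the duplication count described above, the following minimal sketch counts how often each tag appears in a candidate list. The tag names and the helper `count_duplicates` are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def count_duplicates(tags):
    """Return a mapping from each tag to the number of times it occurs."""
    return Counter(tags)


# Hypothetical tags collected from several objects detected in one image.
candidate_tags = ["calm", "nature", "calm", "warm", "nature", "calm"]
duplication = count_duplicates(candidate_tags)
```

A tag that several pieces of sound source generation information map to would receive a higher count and could therefore contribute more to its score.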

In an embodiment, when filtering tags, the tag filtering unit 323 may additionally use other information depending on the type of the sound source generation tags. For example, when the sound source generation tags correspond to image information, the tag filtering unit 323 may further consider the accuracy of object detection in addition to the per-tag weight and the duplication count.

Referring to FIG. 7, a first table 710 shows the tag filtering unit 323 assigning scores to the sound source generation tags mapped to image information. For each of the sound source generation tags, that is, Tag 1 to Tag 5, the first table 710 lists the per-tag weight, the accuracy (Accuracy), and the duplication count (Number). Here, Tag 1 to Tag 5 denote sound source generation tags mapped to image information. For each sound source generation tag mapped to image information, the tag filtering unit 323 may calculate a score by considering the per-tag weight, the accuracy, and the duplication count together.

In an embodiment, when calculating the score for each tag, the tag filtering unit 323 may give the per-tag weight, the accuracy, and the duplication count equal importance, or may apply a different weighting to each item, to calculate the final score.
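The score calculation described above can be sketched as a weighted average. The disclosure does not give a formula, so the normalization of the duplication count, the coefficient tuple `coeffs`, and all numeric values below are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TagStats:
    weight: float    # per-tag weight (assumed to lie in 0..1)
    accuracy: float  # object-detection accuracy (assumed to lie in 0..1)
    count: int       # duplication count


def tag_score(stats, max_count, coeffs=(1.0, 1.0, 1.0)):
    """Weighted average of weight, accuracy, and normalized duplication count.

    coeffs=(1, 1, 1) treats the three items equally; other values give
    each item a different importance.
    """
    w_c, a_c, n_c = coeffs
    normalized_count = stats.count / max_count  # scale the count into 0..1
    total = w_c + a_c + n_c
    return (w_c * stats.weight + a_c * stats.accuracy + n_c * normalized_count) / total


# Hypothetical statistics for one tag.
stats = TagStats(weight=0.8, accuracy=0.9, count=2)
equal_score = tag_score(stats, max_count=4)
weighted_score = tag_score(stats, max_count=4, coeffs=(2.0, 1.0, 1.0))
```

Passing a different `coeffs` tuple corresponds to the case in which each item is given a different weighting.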

In FIG. 7, a second table 720 shows the tag filtering unit 323 assigning scores to the sound source generation tags mapped to surrounding situation information. In the second table 720, Tag 1 to Tag 4 denote sound source generation tags mapped to surrounding situation information. For the sound source generation tags mapped to surrounding situation information, the tag filtering unit 323 may generate a per-tag score suited to the situation by considering the situation-based per-tag weights received from the weight acquisition unit 324.

For example, when the current surrounding situation is a morning with spring rain, the tag filtering unit 323 may retrieve from the weight acquisition unit 324, and use, the situation-based per-tag weights generated for the specific situation of a morning with spring rain. That is, the tag filtering unit 323 may retrieve the situation-based per-tag weights corresponding to Tag 1 to Tag 4 from the weight acquisition unit 324 and use them as the scores for Tag 1 to Tag 4.
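The lookup of situation-based weights can be pictured as a table keyed by the situation. The following sketch assumes a hypothetical in-memory table `CONTEXT_WEIGHTS` standing in for the weight acquisition unit 324; the situation keys and the weight values are invented for illustration.

```python
# Hypothetical situation-based weight table: situation -> {tag: weight}.
CONTEXT_WEIGHTS = {
    ("spring", "rain", "morning"): {"Tag 1": 0.9, "Tag 2": 0.4, "Tag 3": 0.7, "Tag 4": 0.2},
    ("summer", "sunny", "noon"): {"Tag 1": 0.1, "Tag 2": 0.8, "Tag 3": 0.3, "Tag 4": 0.6},
}


def context_scores(context, tags, default=0.0):
    """Use the situation-based weight of each tag directly as its score."""
    table = CONTEXT_WEIGHTS.get(context, {})
    return {tag: table.get(tag, default) for tag in tags}


scores = context_scores(("spring", "rain", "morning"), ["Tag 1", "Tag 2", "Tag 3", "Tag 4"])
```

An unknown situation or tag falls back to the `default` score, which is one possible design choice rather than something the disclosure specifies.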

Based on the score of each tag, the tag filtering unit 323 may filter the tags to be used directly for generating the sound source.

FIG. 8 is a diagram illustrating the relationship between sound source generation information and tags according to an embodiment.

In an embodiment, the electronic device may use the similarity between a sound source and a tag. For example, the electronic device may convert a sound source into a feature vector using a Mel-Frequency Cepstral Coefficient (MFCC) algorithm. To analyze the waveform in the frequency domain, the electronic device may cut it into time segments, apply a Fourier transform to each segment, concatenate the results, and group them by frequency band to extract the MFCCs. The electronic device may measure the similarity between sound sources based on the MFCCs.
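The disclosure does not state which similarity measure is applied to the MFCC feature vectors. As one common possibility, the following sketch compares two already-extracted feature vectors with cosine similarity; the function name and the example vectors are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


# Hypothetical MFCC-like feature vectors for three sound sources.
track_a = [1.0, 2.0, 3.0]
track_b = [2.0, 4.0, 6.0]   # same direction as track_a
track_c = [3.0, 0.0, -1.0]  # orthogonal to track_a
```

Here `track_a` and `track_b` point in the same direction and so are maximally similar, while `track_a` and `track_c` are orthogonal.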

Referring to FIG. 8, information that can be used to generate a sound source may be represented as points on a graph 800. In FIG. 8, each point may represent information related to a sound source. The information related to a sound source consists of features extracted from the sound source and may include, for example, genre, key, chord, melody, beat, time signature, and tempo. In the graph 800, the distance between points may indicate the degree of relevance or similarity between pieces of information. That is, the shorter the distance between two points, the more closely related they are.

The graph 800 of FIG. 8 may be a graph of genre, one of the features extracted from sound sources. In the graph 800, the shape of each point may indicate information belonging to a different genre. For example, triangles, squares, and stars may each represent sound-source-related information of a different genre. The X-axis and Y-axis values of the graph 800 may distinguish genres by type. For example, moving to the right along the X-axis of the graph 800, the genre of the sound source may change to soul, blues, jazz, and so on. Likewise, moving upward along the Y-axis, the genre of the sound source may be rock, hip hop, R&B, and so on.

Although FIG. 8 illustrates the genre of the sound source, a separate graph may likewise be generated for other information related to the sound source, such as key, chord, melody, beat, time signature, and tempo.

Tags may be mapped to the information related to sound sources. Conversely, this may mean that the information related to a sound source mapped to a tag can be retrieved using that tag. For example, when the filtered tags are the combination of Tag 1 and Tag 4, the sound source generation unit 330 may search for a sound source 810 corresponding to the combination of Tag 1 and Tag 4. When the sound source 810 corresponding to the combination of Tag 1 and Tag 4 is a sound source of the classical genre, the sound source generation unit 330 may generate a sound source of the classical genre. In addition, the sound source generation unit 330 may search for sound sources having the key, chord, melody, beat, time signature, tempo, and the like corresponding to the combination of Tag 1 and Tag 4, and may generate a sound source by combining them.
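The reverse lookup from a tag combination to sound-source information can be sketched with a dictionary keyed by an unordered set of tags. The table `SOUND_INFO_BY_TAGS`, the `tempo_bpm` field, and its value are hypothetical; only the Tag 1 and Tag 4 combination mapping to a classical genre follows the example in the text.

```python
# Hypothetical mapping: unordered tag combination -> sound-source information.
SOUND_INFO_BY_TAGS = {
    frozenset({"Tag 1", "Tag 4"}): {"genre": "classical", "tempo_bpm": 70},
}


def lookup_sound_info(filtered_tags):
    """Return the information mapped to this tag combination, or None."""
    return SOUND_INFO_BY_TAGS.get(frozenset(filtered_tags))


info = lookup_sound_info(["Tag 4", "Tag 1"])  # order of tags does not matter
```

Using `frozenset` as the key makes the lookup independent of the order in which the filtered tags arrive.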

FIG. 9 is a diagram for explaining a neural network trained to obtain a sound source from tags, according to an embodiment.

Referring to FIG. 9, generating a sound source from tags using artificial intelligence may consist of two processes.

First, in a training process 910, a neural network, that is, a neural network model 912, may be trained using a plurality of training data 911 as input. The output data 913 resulting from each training step may be fed back to the neural network model 912 and used to update the weights of the neural network model 912.

In response to receiving the plurality of training data, the neural network model 912 may learn and/or be trained in a method of detecting data having different attributes from the plurality of training data, and may be generated based on the result of that learning and/or training.

More specifically, the training data 911 may include a plurality of tags and sound sources highly related to those tags. The plurality of tags may be tags corresponding to various sound source generation information.

The neural network model 912 may be trained by grouping, for each tag, the set of sound sources highly related to that tag. That is, the neural network model 912 may be trained to obtain feature information representing the characteristics of the tags, such as key, chord, rhythm, mood, and genre, and to generate a sound source from the obtained feature information.

In an embodiment, the neural network model 912 may be a Generative Adversarial Network (GAN).

The neural network model 912 may learn a sound source by converting the sound source itself into an image. For example, the neural network model 912 may frequency-transform the sound source using a Mel-Frequency Cepstral Coefficient (MFCC) algorithm and obtain feature information from the sound source. The MFCC algorithm may be a technique that divides a sound source into small frames of about 20 ms to 40 ms and extracts features by analyzing the spectrum of the divided frames. The neural network model 912 may measure the similarity between sound sources using the similarity of the waveforms of the frequency-domain features obtained with the MFCC algorithm. In addition, the neural network model 912 may also measure the similarity between the tags mapped to the sound sources.
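The framing step of the MFCC technique described above can be sketched as follows. The 25 ms frame length falls within the 20 ms to 40 ms range given in the text; the 10 ms hop (which makes frames overlap), the 16 kHz sample rate, and the function name are assumptions, and the later spectral analysis steps are not shown.

```python
def split_into_frames(samples, sample_rate, frame_ms=25, hop_ms=10):
    """Split a sample sequence into short, possibly overlapping frames."""
    frame_len = int(sample_rate * frame_ms / 1000)  # samples per frame
    hop_len = int(sample_rate * hop_ms / 1000)      # samples between frame starts
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


# One second of silence at an assumed 16 kHz sample rate.
signal = [0.0] * 16000
frames = split_into_frames(signal, sample_rate=16000)
```

Each resulting frame would then be transformed and analyzed individually to obtain its spectrum.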

The neural network model 912 may receive, as input values, the domain information to be obtained and the original data, that is, the filtered tags. A GAN can learn the mappings among all possible domains through a single generator. For example, the GAN may receive both tags and domain information about the genre of a sound source as the training data 911 and learn sound sources of the genre appropriate to the tags.

The neural network model 912 may be composed of a plurality of neural network layers. Each of the plurality of neural network layers has a plurality of weight values and may perform a neural network operation through an operation between the operation result of the previous layer and the plurality of weight values.
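The operation between a previous layer's result and a layer's weights can be illustrated with a single fully connected layer. The bias term, the ReLU activation, and all numeric values below are assumptions added for the sake of a complete example; the disclosure only mentions weights.

```python
def dense_forward(inputs, weights, biases):
    """Compute one output per weight row: dot(row, inputs) + bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


previous_output = [1.0, 2.0]             # result of the previous layer
weights = [[0.5, -1.0], [1.0, 1.0]]      # one row per output node
biases = [0.0, 0.5]
layer_output = relu(dense_forward(previous_output, weights, biases))
```

Stacking several such layers, each consuming the output of the one before it, gives the multi-layer structure described above.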

The plurality of weight values of the plurality of neural network layers may be optimized by the training result of the artificial intelligence model. For example, the plurality of weight values may be updated so that a loss value or a cost value obtained by the artificial intelligence model during the training process is reduced or minimized. The artificial neural network may include a deep neural network (DNN), examples of which include a Convolutional Neural Network (CNN), a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), a Restricted Boltzmann Machine (RBM), a Deep Belief Network (DBN), a Bidirectional Recurrent Deep Neural Network (BRDNN), and Deep Q-Networks, but it is not limited to these examples.
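Updating weights so that the loss decreases can be illustrated with plain gradient descent on a one-parameter model. This is a generic textbook sketch, not the training procedure of the disclosed model; the mean squared error loss, the learning rate, and the data are assumptions.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear(xs, ys, lr=0.05, epochs=200):
    """Fit y = w * x by repeatedly stepping against the loss gradient."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # move the weight in the direction that lowers the loss
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2x
trained_w = train_linear(xs, ys)
```

After training, the loss is far smaller than at the initial weight, which is the sense in which the weights are "optimized" during learning.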

The neural network model 912 may include a mapping network. The mapping network is non-linear and can reduce biased correlations between features. The mapping network may include a plurality of layers. Each layer may be represented by at least one node, and nodes in adjacent layers are connected by edges. The nodes may be fully connected to the nodes included in the preceding and following layers.

The neural network model 912 may obtain an intermediate vector by passing the input information through the mapping network. The intermediate vector may be a weight containing the attribute information of the tags. For example, if a feature vector extracted from the attribute information relates to a feature corresponding to the genre of a sound source, the neural network model 912 may generate an intermediate vector having that feature. Similarly, if a feature vector extracted from the attribute information relates to a feature corresponding to an attribute related to the tempo of a sound source, the neural network model 912 may generate an intermediate vector having that tempo feature.

Using the generated intermediate vector, the neural network model 912 may synthesize output data by applying information about the sound source at each of the plurality of layers.

The neural network model 912 may receive a tensor as input. The tensor may be a data structure containing the information of a deep learning model. The tensor is base information that does not reflect the attributes of the training data, and may be information corresponding to an average sound source. In an embodiment, the tensor may mean a layout of an additional information area having a basic sound source.

The neural network model 912 may include a plurality of layers that start with a 4×4×512 tensor and end with a 1024×1024×3 tensor. Each layer may be connected to the next layer through convolution and upsampling.
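The disclosure gives only the first and last tensor sizes. Assuming, as is common in progressively upsampling generators, that each upsampling step doubles the spatial resolution, the intermediate resolutions can be enumerated as in the following sketch; the doubling factor is an assumption and the channel counts are not modeled.

```python
def resolution_schedule(start=4, end=1024):
    """List the spatial resolutions when each step doubles the previous one."""
    sizes = [start]
    while sizes[-1] < end:
        sizes.append(sizes[-1] * 2)
    return sizes


schedule = resolution_schedule()
```

Under this assumption, eight doublings separate the 4×4 starting resolution from the 1024×1024 output.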

The weight may be input to each of the layers of the neural network model 912. The neural network model 912 may be trained to express, using the intermediate vector, that is, the weight, the attributes for each layer or the features of the sound source.

The shallower the layer, the more low-level, that is, coarse, features of the image are extracted; the deeper the layer, the more detailed high-level features may be extracted.

The neural network model 912 may obtain result data based on the features obtained from the low level through the high level. The result data may be a sound source or a musical score used to generate a sound source.

Each training result may be derived from the neural network model 912 as output data 913. The output data 913 may be used to update the weights of the neural network model 912. Once the neural network model 912 has been trained so that the training result exceeds a certain level of reliability, the model may be used as a trained neural network model 922.

In an application process 920, application data 921 may be input to the trained neural network model 922, and result data 923 may be obtained from the input application data 921.

In an embodiment, the application data 921 may be the filtered sound source generation tags, and the result data 923 output from the neural network model 922 may be a sound source or a musical score for generating a sound source.

The operation of learning, using the neural network model 912, how to obtain a sound source from filtered tags may be performed in the electronic device 100a. Alternatively, this learning operation may be performed in an external computing device separate from the electronic device 100a. For example, the operation of learning how to obtain a sound source from tags using the neural network model 912 may require a relatively large amount of computation. Accordingly, the external computing device may perform the learning operation, and the electronic device 100a may receive the trained neural network model 912 from the external computing device, thereby reducing the amount of computation that must be performed in the electronic device 100a. The electronic device 100a may receive the neural network model 912 from an external server, store it in the memory, and obtain a sound source from tags using the stored neural network model 912.

FIG. 10 is an internal block diagram of an electronic device according to an embodiment.

The electronic device 1000 of FIG. 10 may be an example of the electronic device 100a of FIG. 2.

The electronic device 1000 of FIG. 10 may include the components of the electronic device 100a of FIG. 2.

Referring to FIG. 10, the electronic device 1000 may include, in addition to the processor 210 and the memory 220, a tuner unit 1010, a communication unit 1020, a sensing unit 1030, an input/output unit 1040, a video processing unit 1050, a display unit 1060, an audio processing unit 1070, an audio output unit 1080, and a user input unit 1090.

The tuner unit 1010 may tune to and select only the frequency of the channel that the electronic device 1000 is to receive, from among many radio wave components, by amplifying, mixing, and resonating broadcast content received by wire or wirelessly. The content received through the tuner unit 1010 is decoded and separated into audio, video, and/or additional information. The separated audio, video, and/or additional information may be stored in the memory 220 under the control of the processor 210.

Under the control of the processor 210, the communication unit 1020 may connect the electronic device 1000 to a peripheral device, an external device, a server, a mobile terminal, or the like. The communication unit 1020 may include at least one communication module capable of performing wireless communication. The communication unit 1020 may include at least one of a wireless LAN module 1021, a Bluetooth module 1022, and wired Ethernet 1023, depending on the performance and structure of the electronic device 1000.

The Bluetooth module 1022 may receive a Bluetooth signal transmitted from a peripheral device in accordance with the Bluetooth communication standard. The Bluetooth module 1022 may be a Bluetooth Low Energy (BLE) communication module and may receive a BLE signal. The Bluetooth module 1022 may scan for BLE signals continuously or temporarily to detect whether a BLE signal is received. The wireless LAN module 1021 may transmit and receive Wi-Fi signals to and from peripheral devices in accordance with the Wi-Fi communication standard.

In an embodiment, the communication unit 1020 may use a communication module to obtain, from an external device, a server, or the like, various information indicating the external situation, such as information about the weather, time, and date, or user profile information linked to a user account, and may transmit the obtained information to the processor 210.

The sensing unit 1030 senses the user's voice, the user's image, or the user's interaction, and may include a microphone 1031, a camera unit 1032, a light receiving unit 1033, and a sensing unit 1034. The microphone 1031 may receive an audio signal including the user's uttered voice or noise, convert the received audio signal into an electrical signal, and output it to the processor 210.

The camera unit 1032 may include a sensor (not shown) and a lens (not shown), capture an image formed on the screen, and transmit it to the processor 210.

The light receiving unit 1033 may receive an optical signal (including a control signal). The light receiving unit 1033 may receive an optical signal corresponding to a user input (for example, a touch, a press, a touch gesture, a voice, or a motion) from a control device such as a remote control or a mobile phone.

The sensing unit 1034 may sense the state of the surroundings of the electronic device 100a and transmit the sensed information to the communication unit 1020 or the processor 210. The sensing unit 1034 may include, for example, at least one of a temperature/humidity sensor, a presence sensor, an illuminance sensor, a position sensor (for example, GPS), an atmospheric pressure sensor, and a proximity sensor, but is not limited thereto.

Under the control of the processor 210, the input/output unit 1040 may receive video (for example, a moving image signal or a still image signal), audio (for example, a voice signal or a music signal), additional information, and the like from a device external to the electronic device 1000.

The input/output unit 1040 may include one of an HDMI (High-Definition Multimedia Interface) port 1041, a component jack 1042, a PC port 1043, and a USB port 1044. The input/output unit 1040 may also include a combination of the HDMI port 1041, the component jack 1042, the PC port 1043, and the USB port 1044.

The video processing unit 1050 processes image data to be displayed by the display unit 1060 and may perform various image processing operations on the image data, such as decoding, rendering, scaling, noise filtering, frame rate conversion, and resolution conversion.

The display unit 1060 may output to the screen content received from a broadcasting station, an external server, an external storage medium, or the like. The content is a media signal and may include a video signal, an image, a text signal, and the like.

When the display unit 1060 is implemented as a touch screen, the display unit 1060 may be used as an input device, such as a user interface, in addition to an output device. For example, the display unit 1060 may include at least one of a liquid crystal display, a thin film transistor-liquid crystal display, an organic light-emitting diode display, a flexible display, a three-dimensional (3D) display, and an electrophoretic display. Depending on the implementation of the display unit 1060, two or more display units 1060 may be included.

The audio processing unit 1070 processes audio data. The audio processing unit 1070 may perform various kinds of processing on the audio data, such as decoding, amplification, and noise filtering.

Under the control of the processor 210, the audio output unit 1080 may output audio included in the content received through the tuner unit 1010, audio input through the communication unit 1020 or the input/output unit 1040, and audio stored in the memory 220. The audio output unit 1080 may include at least one of a speaker 1081, headphones 1082, and an S/PDIF (Sony/Philips Digital Interface) output terminal 1083.

In an embodiment, the audio output unit 1080 may play and output music according to the sound source generated by the processor 210.

The user input unit 1090 may receive a user input for controlling the electronic device 1000. The user input unit 1090 may include various types of user input devices, including a touch panel that senses the user's touch, a button that receives the user's push operation, a wheel that receives the user's rotation operation, a keyboard, a dome switch, a microphone for voice recognition, and a motion sensor that senses motion, but is not limited thereto. When a remote control or another mobile terminal controls the electronic device 1000, the user input unit 1090 may receive the control signal transmitted from the mobile terminal.

FIG. 11 is a flowchart illustrating a method of generating a sound source according to an embodiment.

Referring to FIG. 11, the electronic device may obtain sound source generation information (step 1110).

In an embodiment, the sound source generation information may include at least one of image information, surrounding situation information, and user taste information.

In an embodiment, the electronic device may obtain sound source generation tags mapped to the sound source generation information (step 1120).

In an embodiment, the electronic device may search the sound-source-related information stored in an internal database, an external server, or the like for information mapped to the sound source generation information, and may retrieve the tags mapped to the sound source generation information.

In an embodiment, the electronic device may generate a sound source based on the sound source generation tags (step 1130).

In an embodiment, the electronic device may use a neural network that has learned the relationship between tags and sound sources. The electronic device may input the sound source generation tags to the neural network and obtain a sound source from the neural network.
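As a non-authoritative illustration, the flow of FIG. 11 could be sketched in Python as follows. The names `TAG_TABLE`, `obtain_tags`, `generate_sound_source`, and `placeholder_model`, as well as the example tags, are hypothetical and do not appear in the disclosure. The trained neural network is replaced by a stand-in function that only reports which tags it received.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from items of sound source generation information to tags.
TAG_TABLE: Dict[str, List[str]] = {
    "image:sunset": ["calm", "warm"],
    "context:rainy": ["calm", "piano"],
    "taste:jazz": ["jazz", "piano"],
}


def obtain_tags(generation_info: List[str], tag_table: Dict[str, List[str]]) -> List[str]:
    """Step 1120: collect the tags mapped to each item, keeping first-seen order."""
    tags: List[str] = []
    for item in generation_info:
        for tag in tag_table.get(item, []):  # items without a mapping add nothing
            if tag not in tags:
                tags.append(tag)
    return tags


def generate_sound_source(
    generation_info: List[str],
    tag_table: Dict[str, List[str]],
    model: Callable[[List[str]], str],
) -> str:
    """Steps 1120 and 1130: map the information to tags, then pass the tags to the model."""
    return model(obtain_tags(generation_info, tag_table))


def placeholder_model(tags: List[str]) -> str:
    """Stand-in for the trained neural network (assumption, not the disclosed model)."""
    return "sound source conditioned on: " + ", ".join(tags)


# Step 1110 is represented here by a hand-written list of generation information.
info = ["image:sunset", "context:rainy"]
print(generate_sound_source(info, TAG_TABLE, placeholder_model))
```

In this sketch the model is passed in as a callable so that any tag-conditioned generator could be substituted without changing the tag lookup.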

FIG. 12 is a flowchart illustrating a method of filtering sound source generation tags according to an embodiment.

Referring to FIG. 12, the electronic device may obtain a score for each of the sound source generation tags (step 1210).

The electronic device may obtain the scores for the sound source generation tags based on at least one of the degree of redundancy for each tag, the accuracy of the recognition result, the weight for each tag, and the context-based weight for each tag.
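The scoring of step 1210 could be sketched as below. The disclosure names the factors but does not state how they are combined, so the plain sum used here is an assumption, as are the names `TagCandidate` and `score_tags` and the example values.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TagCandidate:
    name: str
    accuracy: float  # confidence of the recognition result that produced the tag, in [0, 1]


def score_tags(
    candidates: List[TagCandidate],
    tag_weights: Dict[str, float],
    context_weights: Dict[str, float],
) -> Dict[str, float]:
    """Return one score per distinct tag.

    Assumed formula (not from the disclosure):
    score = redundancy count + best accuracy + tag weight + context-based weight.
    """
    counts = Counter(c.name for c in candidates)  # degree of redundancy for each tag
    best_accuracy: Dict[str, float] = {}
    for c in candidates:
        best_accuracy[c.name] = max(best_accuracy.get(c.name, 0.0), c.accuracy)

    scores: Dict[str, float] = {}
    for name, count in counts.items():
        scores[name] = (
            count
            + best_accuracy[name]
            + tag_weights.get(name, 0.0)      # missing weights count as zero
            + context_weights.get(name, 0.0)
        )
    return scores


candidates = [
    TagCandidate("calm", 0.9),
    TagCandidate("calm", 0.5),
    TagCandidate("piano", 0.6),
]
print(score_tags(candidates, {"piano": 1.0}, {"calm": 0.5}))
```

Any of the four factors can be left out by passing an empty dictionary or a constant accuracy, which matches the "at least one of" wording.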

In an embodiment, the electronic device may filter the sound source generation tags so as to retain those having high scores (step 1220).

In an embodiment, the electronic device may filter, from among the sound source generation tags, those having a score equal to or greater than a reference value, or may filter a predetermined number of sound source generation tags in descending order of score.
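The two filtering options of step 1220 could be sketched as follows. The function name `filter_tags`, its parameters, and the example scores are hypothetical. Ties are broken alphabetically, which is an assumption made only to keep the result deterministic.

```python
from typing import Dict, List, Optional


def filter_tags(
    scores: Dict[str, float],
    threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> List[str]:
    """Keep tags scoring at least `threshold`, and/or the `top_n` highest-scoring tags."""
    # Sort by descending score; equal scores fall back to alphabetical order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    if threshold is not None:
        ranked = [(tag, score) for tag, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [tag for tag, _ in ranked]


scores = {"calm": 3.4, "piano": 2.6, "warm": 1.2}
print(filter_tags(scores, threshold=2.0))  # reference-value variant
print(filter_tags(scores, top_n=1))        # predetermined-number variant
```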

In an embodiment, the electronic device may further filter the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information.

In an embodiment, the electronic device may generate a sound source using the filtered sound source generation tags (step 1230).

The electronic device may input the filtered tags to the neural network that has learned the relationship between tags and sound sources, and may obtain, from the neural network, a sound source corresponding to the filtered tags.

FIG. 13 is a flowchart illustrating a method of generating a sound source by filtering sound source generation tags separately for each type of sound source generation information, according to an embodiment.

Referring to FIG. 13, the electronic device may obtain a score for each of the sound source generation tags mapped to the image information (step 1310).

In an embodiment, the electronic device may obtain the sound source generation tags mapped to the image information, which is the part of the sound source generation information obtained from an image. The electronic device may obtain a score for each of the sound source generation tags mapped to the image information.

For example, for each of the sound source generation tags mapped to the image information, the electronic device may obtain a score based on at least one of the accuracy of the recognition result, the degree of redundancy for each tag, and the weight for each tag.

In an embodiment, the electronic device may filter, from among the sound source generation tags mapped to the image information, the sound source generation tags having high scores. When the sound source generation tags filtered from among those mapped to the image information are referred to as first tags, the electronic device may obtain the first tags (step 1320).

In an embodiment, the electronic device may obtain a score for each of the sound source generation tags mapped to the surrounding situation information (step 1330).

In an embodiment, for the surrounding situation information among the sound source generation information, the electronic device may obtain the sound source generation tags mapped to the surrounding situation information and may obtain a score for each of those tags.

In an embodiment, the electronic device may obtain a context-based weight for each tag, which indicates user preference according to the situation, and may obtain a score for each of the sound source generation tags mapped to the surrounding situation information based on the context-based weight for each tag. Alternatively, the electronic device may obtain the scores by further considering the degree of redundancy for each tag in addition to the context-based weight for each tag.
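A minimal sketch of the context-based scoring of step 1330 is given below. It assumes that the context-based weights are stored per (situation, tag) pair and that redundancy, when considered, is simply added as a repeat count. Neither the storage layout nor the formula is stated in the disclosure, and `CONTEXT_TAG_WEIGHTS` and `score_context_tags` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical context-based weights keyed by (situation, tag).
CONTEXT_TAG_WEIGHTS: Dict[Tuple[str, str], float] = {
    ("rainy evening", "calm"): 0.8,
    ("rainy evening", "piano"): 0.6,
    ("sunny morning", "calm"): 0.2,
}


def score_context_tags(
    situation: str,
    tags: List[str],
    weights: Dict[Tuple[str, str], float],
    use_redundancy: bool = False,
) -> Dict[str, float]:
    """Score each distinct tag by its weight in the given situation.

    When `use_redundancy` is True, the number of times the tag occurs is added
    to the score (assumed combination rule).
    """
    scores: Dict[str, float] = {}
    for tag, count in Counter(tags).items():
        score = weights.get((situation, tag), 0.0)  # unknown pairs score zero
        if use_redundancy:
            score += count
        scores[tag] = score
    return scores


tags = ["calm", "calm", "upbeat"]
print(score_context_tags("rainy evening", tags, CONTEXT_TAG_WEIGHTS))
print(score_context_tags("rainy evening", tags, CONTEXT_TAG_WEIGHTS, use_redundancy=True))
```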

In an embodiment, the electronic device may filter, from among the sound source generation tags mapped to the surrounding situation information, the sound source generation tags having high scores.

In an embodiment, when the sound source generation tags filtered from among those mapped to the surrounding situation information are referred to as second tags, the electronic device may obtain the second tags (step 1340).

In an embodiment, the electronic device may generate a sound source using at least one of the first tags and the second tags (step 1340).

For example, the electronic device may combine the first tags and the second tags and filter them once more according to score. Alternatively, the electronic device may filter only the tags that appear in both the first tags and the second tags. The electronic device may obtain final tags by considering the first tags and the second tags together in various ways, and may generate a sound source based on the final tags.
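The two combination strategies named above could be sketched as follows. The function names and example scores are hypothetical. Keeping the higher of the two scores when a tag appears in both sets is an assumption, since the disclosure does not say how scores of shared tags are reconciled.

```python
from typing import Dict, List


def merge_and_refilter(
    first: Dict[str, float], second: Dict[str, float], top_n: int
) -> List[str]:
    """Combine both tag sets and keep the `top_n` highest-scoring tags."""
    merged = dict(first)
    for tag, score in second.items():
        # Assumed rule: a tag present in both sets keeps its higher score.
        merged[tag] = max(merged.get(tag, score), score)
    ranked = sorted(merged, key=lambda tag: (-merged[tag], tag))
    return ranked[:top_n]


def keep_overlapping(first: Dict[str, float], second: Dict[str, float]) -> List[str]:
    """Keep only the tags that appear in both the first tags and the second tags."""
    return sorted(set(first) & set(second))


first_tags = {"calm": 3.0, "warm": 2.0}    # from image information
second_tags = {"calm": 2.5, "piano": 2.8}  # from surrounding situation information
print(merge_and_refilter(first_tags, second_tags, top_n=2))
print(keep_overlapping(first_tags, second_tags))
```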

FIG. 14 is a flowchart illustrating a method of obtaining a weight for each tag according to an embodiment.

Referring to FIG. 14, the electronic device may obtain a weight for each tag (step 1410).

In an embodiment, the weight for each tag may be information indicating the user's preference for that tag. The electronic device may generate the weight for each tag based on at least one of user taste information and a music playback history. For example, when the user has no music playback history yet, or the history is insufficient, the electronic device may generate the weight for each tag based only on the user taste information.
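One possible way to initialize the weights of step 1410 is sketched below. The base weight of 1.0 for taste tags, the `min_history` cutoff of five plays, and the use of playback share as an added weight are all assumptions introduced for illustration, and `initial_weights` is a hypothetical name.

```python
from typing import Dict, List


def initial_weights(
    taste_tags: List[str],
    history_tag_counts: Dict[str, int],
    min_history: int = 5,
) -> Dict[str, float]:
    """Build per-tag weights from user taste information and, if sufficient, playback history.

    `history_tag_counts` maps a tag to how many played tracks were generated with it.
    """
    # Taste information always contributes (assumed base weight of 1.0 per tag).
    weights: Dict[str, float] = {tag: 1.0 for tag in taste_tags}

    total_plays = sum(history_tag_counts.values())
    if total_plays >= min_history:
        # History is considered sufficient: add each tag's share of all plays.
        for tag, count in history_tag_counts.items():
            weights[tag] = weights.get(tag, 0.0) + count / total_plays
    return weights


print(initial_weights(["jazz"], {"piano": 1}))             # history too short, taste only
print(initial_weights(["jazz"], {"jazz": 3, "piano": 3}))  # taste plus history
```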

In an embodiment, when a sound source is generated, the electronic device may play music according to the sound source (step 1420).

In an embodiment, the electronic device may update the weight for each tag according to the music playback (step 1430).

For example, when the user repeatedly listens to a specific piece of music, the electronic device may update the weight for each tag by increasing the weights of the tags used to generate that piece of music.

Thereafter, the electronic device may assign a score to each tag using the updated weight for each tag. By filtering the tags using the scores that have changed with the updated weights, the electronic device may cause different music to be generated depending on the music playback information.
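The update of step 1430 could be sketched as follows for the repeated-listening example. The repeat threshold of three plays and the fixed increment of 0.1 are assumptions, as is the name `update_weights`. The disclosure only states that the weights of the tags used for repeatedly played music are increased.

```python
from typing import Dict, List


def update_weights(
    weights: Dict[str, float],
    used_tags: List[str],
    play_count: int,
    repeat_threshold: int = 3,
    step: float = 0.1,
) -> Dict[str, float]:
    """Return a new weight table in which tags of repeatedly played music are boosted."""
    updated = dict(weights)  # leave the caller's table unchanged
    if play_count >= repeat_threshold:
        for tag in used_tags:
            updated[tag] = updated.get(tag, 0.0) + step
    return updated


weights = {"calm": 1.0}
weights = update_weights(weights, ["calm", "piano"], play_count=3)
print(weights)
```

The updated table can then be passed to the scoring step, so that tags the user keeps returning to rank higher the next time tags are filtered.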

The method and apparatus for operating an electronic device according to some embodiments may also be implemented in the form of a recording medium including computer-executable instructions, such as a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media and removable and non-removable media. In addition, the computer-readable medium may include both computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or another transmission mechanism, and include any information delivery medium.

In addition, in this specification, a "unit" may be a hardware component such as a processor or a circuit, and/or a software component executed by a hardware component such as a processor.

In addition, the electronic device and the operating method thereof according to the embodiments of the present disclosure described above may be implemented as a computer program product including a computer-readable recording medium/storage medium on which a program is recorded for implementing a method of operating an electronic device, the method including obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information, obtaining sound source generation tags mapped to the sound source generation information, and generating a sound source based on the sound source generation tags.

A device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory storage medium" means only that the medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between a case in which data is stored semi-permanently in the storage medium and a case in which data is stored temporarily. For example, a "non-transitory storage medium" may include a buffer in which data is temporarily stored.

According to an embodiment, the method according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded as a commodity between a seller and a buyer. A computer program product may be distributed in the form of a device-readable storage medium (e.g., a compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product (e.g., a downloadable app) may be at least temporarily stored in, or temporarily generated in, a device-readable storage medium such as a memory of a manufacturer's server, a server of an application store, or a relay server.

The foregoing description is illustrative, and a person of ordinary skill in the art to which the invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

Claims

An electronic device comprising:
a memory storing one or more instructions; and
a processor configured to execute the one or more instructions stored in the memory,
wherein the processor is configured to, by executing the one or more instructions:
obtain sound source generation information including at least one of image information, surrounding situation information, and user taste information;
obtain sound source generation tags mapped to the sound source generation information; and
generate a sound source based on the sound source generation tags.

The electronic device of claim 1, wherein the processor is configured to, by executing the one or more instructions:
filter, from among the sound source generation tags, sound source generation tags having high scores; and
generate the sound source using the filtered sound source generation tags.

The electronic device of claim 2, wherein the processor is configured to, by executing the one or more instructions:
obtain a score for each of the sound source generation tags mapped to the image information, based on at least one of an accuracy of a recognition result, a degree of redundancy for each tag, and a weight for each tag; and
obtain first tags by filtering, from among the sound source generation tags mapped to the image information, sound source generation tags having high scores.

The electronic device of claim 3, wherein the processor is configured to, by executing the one or more instructions:
obtain a score for each of the sound source generation tags mapped to the surrounding situation information, based on a context-based weight for each tag indicating user preference according to a situation; and
obtain second tags by filtering, from among the sound source generation tags mapped to the surrounding situation information, sound source generation tags having high scores.

The electronic device of claim 4, wherein the processor is configured to, by executing the one or more instructions:
generate the sound source using at least one of the first tags and the second tags.

The electronic device of claim 2, wherein the processor is configured to, by executing the one or more instructions:
further filter the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information; and
generate the sound source using the further filtered tags.

The electronic device of claim 1, wherein the processor is configured to, by executing the one or more instructions:
obtain a weight for each tag indicating user preference, based on at least one of the user taste information and music playback history information.

The electronic device of claim 7, wherein the processor is configured to, by executing the one or more instructions:
play music according to the generated sound source; and
update the weight for each tag according to music playback information.

The electronic device of claim 8, wherein the music playback information includes information about a playback frequency of the music, a degree to which the music is listened to in full, a degree of playback interruption, a degree of fast-forwarding, and a degree of skipping.

The electronic device of claim 1, further comprising a display,
wherein the processor is configured to, by executing the one or more instructions:
obtain the image information based on at least one of additional information about an image output on the display, a color or style identified in the image, a type of an object identified in the image, and, when the identified object is a person, a facial expression of the person.

The electronic device of claim 1, further comprising at least one of a camera, a sensor, and a communication module,
wherein the processor is configured to, by executing the one or more instructions:
obtain the surrounding situation information from at least one of information about whether a user is present, weather information, date information, time information, season information, public holiday information, anniversary information, temperature information, illuminance information, and location information, obtained from at least one of the camera, the sensor, and the communication module.

The electronic device of claim 1, wherein the processor is configured to, by executing the one or more instructions:
obtain the user taste information from at least one of user profile information, viewing history information of the user, and preferred music information selected by the user.

A method of operating an electronic device, the method comprising:
obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information;
obtaining sound source generation tags mapped to the sound source generation information; and
generating a sound source based on the sound source generation tags.

The method of claim 13, wherein the generating of the sound source comprises:
filtering, from among the sound source generation tags, sound source generation tags having high scores; and
generating the sound source using the filtered sound source generation tags.

The method of claim 14, wherein the filtering of the sound source generation tags comprises:
obtaining a score for each of the sound source generation tags mapped to the image information, based on at least one of an accuracy of a recognition result, a degree of redundancy for each tag, and a weight for each tag indicating user preference; and
obtaining first tags by filtering, from among the sound source generation tags mapped to the image information, sound source generation tags having high scores.

The method of claim 15, wherein the filtering of the sound source generation tags comprises:
obtaining a score for each of the sound source generation tags mapped to the surrounding situation information, based on a context-based weight for each tag indicating user preference according to a situation; and
obtaining second tags by filtering, from among the sound source generation tags mapped to the surrounding situation information, sound source generation tags having high scores.

The method of claim 16, wherein the generating of the sound source comprises:
generating the sound source using at least one of the first tags and the second tags.

The method of claim 14, further comprising further filtering the filtered sound source generation tags based on at least one of the surrounding situation information and user identification information,
wherein the generating of the sound source comprises generating the sound source using the further filtered tags.

The method of claim 13, further comprising:
obtaining a weight for each tag indicating user preference, based on at least one of the user taste information and music playback history information;
playing music according to the generated sound source; and
updating the weight for each tag according to music playback information.

A computer-readable recording medium having recorded thereon a program for implementing a method of operating an electronic device, the method comprising:
obtaining sound source generation information including at least one of image information, surrounding situation information, and user taste information;
obtaining sound source generation tags mapped to the sound source generation information; and
generating a sound source based on the sound source generation tags.