WO2016058560A1 - An external computing device acceleration method based on a server and an external cache system, and a device implementing the method - Google Patents

An external computing device acceleration method based on a server and an external cache system, and a device implementing the method Download PDF

Info

Publication number
WO2016058560A1
WO2016058560A1 PCT/CN2015/096743 CN2015096743W
Authority
WO
WIPO (PCT)
Prior art keywords
cache
layer
speed
performance
server
Prior art date
Application number
PCT/CN2015/096743
Other languages
English (en)
French (fr)
Inventor
张维加
Original Assignee
张维加
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 张维加 filed Critical 张维加
Publication of WO2016058560A1 publication Critical patent/WO2016058560A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662Virtualisation aspects
    • G06F3/0665Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0685Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays

Definitions

  • The invention belongs to the field of computer equipment: an electronic device acceleration method based on cross-device cache technology, and a device implementing the method.
  • Mobile devices are small and generally cannot be upgraded by the user. The following describes the situation on computers.
  • SSDs have none of the motors and rotating media of ordinary hard drives, so they start quickly and resist shock well.
  • Solid state drives use no read/write heads, so disk reads and writes are fast and latency is low.
  • Read/write speeds generally exceed 100 MB per second. Although much faster than a mechanical hard disk, SSDs have many disadvantages: high price, small capacity, limited write endurance, and so on. They are far from cheap for the capacity offered: the entry-level SSDNow at about 500 yuan has only 32 GB.
  • Large capacities are expensive: at the same 1 TB size, a mechanical hard disk costs about 200 yuan while a solid state drive costs at least 5,000. In newly shipped computers, therefore, solid state drives have still not replaced mechanical hard disks.
  • Upgrading old equipment should mainly consider two aspects: feasibility and cost-effectiveness. Feasibility: the first issue is compatibility. Early motherboards do not support solid state drives. Specifically, many computers from before 2009 do not support SSDs, and computers more than 11 years old do not support the SATA 2 protocol at all; their motherboard interfaces top out at about 100 MB per second for ordinary IDE or 150 MB per second for the original SATA protocol, so an SSD cannot accelerate them, and replacing the motherboard is all but impossible. Secondly, the upgrade is inconvenient. The average user is not comfortable replacing a hard disk: changing the disk means reinstalling the whole system, copying all files, reinstalling various drivers, and losing at least one or two days.
  • SSD setup is complicated, and an SSD only exceeds ordinary hard disk speed under Win7 or Win8.
  • XP does not recognize the SSD's Trim command, 4K alignment, or AHCI. Not only is there no speedup; the computer can become unusable, in most cases blue-screening and crashing.
  • The first problem is the high price: entry-level 32-64 GB SSDs cost about 500 yuan, yet 64 GB leaves essentially no space after installing the Win7 system and Office.
  • Entry-level 128 GB drives already approach 1,000 yuan.
  • Such an upgrade cost is not worthwhile for an old computer.
  • The second problem is short lifespan: solid state drives generally use MLC flash, whose life is very short without proper maintenance, and maintenance measures such as enabling Trim and setting 4K alignment are beyond most customers.
  • Mobile devices are small and generally offer no DIY upgrade path; an upgrade method is urgently needed.
  • The Fast Disk (Intel's Turbo module) is a PCI-E interface expansion card equipped with one or two MLC NAND flash modules; as a Mini PCI-E 1x expansion card, it exchanges data with the system I/O controller over the PCI-E bus.
  • The flash module used by the Fast Disk is NAND, not NOR, because NAND outperforms NOR in data-access performance and offers better value for money.
  • It provides the ReadyBoost and ReadyDrive functions, which directly improve system performance for booting, hibernation, installing programs, copying files, loading games, and other disk-bound tasks. According to official data, the Fast Disk can speed up booting by 20% while reducing hard disk revolutions to save power.
  • When ReadyBoost determines that the cache in flash can satisfy random reads better than the cache on the hard disk, it serves those random reads from the flash media.
  • The hard disk reads large amounts of data in batches and stages it on the Fast Disk for the system to call on at any time; meanwhile, data to be written is staged on the Fast Disk and, once enough accumulates, written to the hard disk in one batch.
  • This on-demand read/write mechanism helps system performance considerably.
  • The hard disk stays idle in between; the larger the Fast Disk's capacity, the longer the hard disk's idle periods, reducing mechanical rotation and power consumption and prolonging notebook battery life.
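The batched write-back behaviour described above can be illustrated with a minimal sketch (all class and variable names here are illustrative, not part of the invention): writes accumulate in a fast staging cache and hit the slow disk only once per batch, so the disk can stay idle between flushes.

```python
# Hedged sketch of a batched write-back cache: writes accumulate in a
# fast staging area and are flushed to the slow backing disk in one
# batch once a threshold is reached.

class WriteBackCache:
    def __init__(self, backing_store, flush_threshold=4):
        self.backing_store = backing_store   # dict standing in for the hard disk
        self.pending = {}                    # staged writes (the "fast disk")
        self.flush_threshold = flush_threshold
        self.flush_count = 0                 # number of batch flushes so far

    def write(self, key, value):
        self.pending[key] = value
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def read(self, key):
        # Serve from the staging area first, then fall back to the disk.
        if key in self.pending:
            return self.pending[key]
        return self.backing_store[key]

    def flush(self):
        if self.pending:
            self.backing_store.update(self.pending)  # one batched disk write
            self.pending.clear()
            self.flush_count += 1

disk = {}
cache = WriteBackCache(disk, flush_threshold=3)
for i in range(3):
    cache.write(f"block{i}", i)
# Three staged writes reached the threshold, so exactly one batch hit the disk.
```

Three logical writes cost the disk a single physical update; a larger threshold lengthens the disk's idle periods at the price of more data held in the staging layer.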
  • ReadyDrive is in fact Microsoft's name for support of hybrid hard drives (hard drives with internal flash components).
  • The biggest attraction is that data stored in the flash is available without waiting: flash needs no head start-up and no wait for the head to rotate into position.
  • Hybrid drives boot, hibernate, and sleep faster, and consume less power, because while the operating system reads and writes the cache, the drive itself can temporarily stop working and draw no power. On resume from hibernation, the laptop can immediately read data from the cache instead of waiting, as usual, for the drive's head to spin up.
  • Through a software interface the user can set the module to provide the ReadyBoost function, the ReadyDrive function, or both.
  • However, the measured random read/write speed of about 35 MB per second is not much of an improvement over the hard disk, and falls short of a solid state drive.
  • Third, Intel's Fast Disk is limited in capacity, unable to add cache modules, parallel modules, or more controller ICs. Fourth, it is expensive: a 4 GB Fast Disk is priced at $100. Fifth, system compatibility is poor, which alone rules out using the Turbo module to speed up old computers: both ReadyDrive and ReadyBoost require Windows Vista or later, while most old computers run XP and only run smoothly under XP.
  • To address this, the present invention provides an external computing device acceleration method based on a server and an external cache system, and an apparatus implementing the method.
  • The method achieves acceleration by composing a performance-graded, cross-device caching system: at least one server-side caching device (hereinafter S) is connected to the accelerated computing device (hereinafter C) through a high-speed or multi-channel data transmission device (hereinafter L).
  • L can be either a wired transmission type or a wireless transmission type.
  • S provides C with a relatively high-speed write cache layer (such as a Ramdisk) and a random read cache layer (such as resistive memory), caching C's common system and program files and frequently read and written files, thereby offloading I/O from C's hard disk, and provides computational performance support for C through a virtual layer.
  • When L is wired, it may be an optical fiber, a USB 3.0 or faster transmission line, a Category 5e or better network cable, or other high-speed wired transmission equipment.
  • When L is wireless, it may be a wireless network card supporting WiGig technology, or a dock or external box for C that connects to the S end over a wireless link. See Figure 1.
  • The computing device here is not specifically a computer; it broadly includes computers, tablets, mobile phones, robots, and smart hardware, all of which acquire and process data and files. These generalized computing devices can be combined into a short-distance, multi-channel, cross-device caching system for the same enterprise or family.
  • R layer: for example a Ramdisk; S simulates part of its memory as a disk and creates a cache for C in it.
  • D layer: for example resistive memory, or a NAND disk array combined in RAID mode.
  • V layer: provides computational performance support for C by virtualizing applications or hardware.
  • The three layers above can be strengthened, merged, or reduced as specific needs dictate.
  • The V layer may need multiple variants to suit different application requirements.
  • The R layer and the D layer may be physically combined into one RD layer. Idle resources on the C side are also usable: S can recruit part of the C device's memory together with S's own cache to form a complex cache spanning devices.
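The performance-graded lookup across these layers can be sketched as follows (a minimal illustration; the layer contents and file names are invented for the example): a read is served by the fastest layer that holds the key, falling through R, then D, then the slow C-side disk.

```python
# Illustrative tiered lookup across the cache layers described above:
# the fast R layer is consulted first, then the flash-based D layer,
# then the slow C-side disk.

def tiered_read(key, r_layer, d_layer, c_disk):
    for tier_name, tier in (("R", r_layer), ("D", d_layer), ("disk", c_disk)):
        if key in tier:
            return tier[key], tier_name
    raise KeyError(key)

r_layer = {"recent.tmp": b"hot"}                      # Ramdisk-style cache
d_layer = {"game.dat": b"warm"}                       # flash read cache
c_disk = {"archive.zip": b"cold", "game.dat": b"stale"}  # mechanical disk

value, hit = tiered_read("game.dat", r_layer, d_layer, c_disk)
# The D layer answers before the slower, stale disk copy is consulted.
```

Note that the D layer shadows the disk copy of `game.dat`, which is exactly the point of the grading: the slow medium is only touched when no faster layer can answer.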
  • S can also provide computational performance support for C.
  • This support can be any one or more of three methods: 1. S shares computing tasks. 2. Through a virtual architecture in S, C runs a virtual layer inside S in remote-operation mode, with the interface displayed on C and user interaction realized there; one S can create such layers for multiple Cs, its virtual architecture is built out and S resources are charged according to C's real-time usage, and the virtual architecture carved out of S can be a virtual machine or a virtual application layer. 3. S virtualizes applications, pre-storing the program files required by C and the system environment those programs need on the S device, so that C can run the application directly in S.
  • S and C can have rich topological relationships:
  • First, S and C can be in one-to-one, one-to-many, or many-to-many relationships: a home environment may be one-to-one or one-to-many, while the environments of enterprises, schools, and administrative units may be one-to-many or many-to-many (this topology is hereinafter called the elastic service relationship);
  • Second, S and C are only relative roles: the C in one relationship can be the S in another. An accelerated computer can simultaneously accelerate another computer; correspondingly, the S in one relationship can be the C in another, accelerated in turn by a computing device of higher performance and cache speed (this topology is hereinafter called the performance flow relationship);
  • Third, multiple Cs can simultaneously act as S for one another, forming a P2P acceleration network to improve performance. For example, computer C1 has large memory and can provide stronger small-file caching and write-operation caching but has relatively slow disks, while computer C2 has several large flash solid state disks whose bandwidth is increased in RAID form, giving a stronger read-operation cache. C2 can then serve as S for C1 and vice versa: write caching goes mostly to C1 and read caching mostly to C2 (this topology is hereinafter called the performance complementarity relationship).
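The performance-complementarity pairing amounts to a simple routing rule, sketched below (peer names and the `route` helper are illustrative, not part of the invention): write traffic is steered to the large-memory peer and read traffic to the RAID-flash peer.

```python
# Hedged sketch of the performance-complementarity relationship: writes
# go to the peer with the large memory (C1), reads to the peer with the
# RAID flash array (C2).

class Peer:
    def __init__(self, name):
        self.name = name
        self.store = {}   # stands in for that peer's cache medium

def route(op, write_peer, read_peer):
    """Pick which peer should serve a cache operation."""
    return write_peer if op == "write" else read_peer

c1 = Peer("C1")  # large RAM: stronger write / small-file cache
c2 = Peer("C2")  # RAID flash: stronger read cache

route("write", c1, c2).store["doc"] = "draft"   # write lands on C1
reader = route("read", c1, c2)                  # reads are served by C2
```

Each peer thus contributes the cache operation it is best at, which is what lets two otherwise unbalanced machines accelerate each other.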
  • Fourth, multiple Ss can be networked over a high-speed connection such as fiber to enhance read-ahead analysis and achieve higher cache performance.
  • Fifth, a higher-speed L device, such as multi-path fiber, can deliver performance to the next-level S, exploiting the idle performance of a computing center.
  • When L is wired, it can be a fiber-optic cable, a Thunderbolt transmission line, an extended USB 3.0 or faster transmission line, a Category 5e or better network cable, or any other wired transmission device faster than 60 MB per second; S and C are directly connected through these cables.
  • For a device without a high-speed wired network interface, interface conversion can be used: a high-speed network cable of CAT6 standard or above, plus a USB-to-Ethernet adapter (such adapters have a USB plug on one end and an Ethernet port on the other, adding an Ethernet port to the computer through an existing USB port) and other necessary accessories. The adapter's USB end connects to S or C, and the CAT6-or-better cable carries the transmission between the two.
  • When L is wireless, it can be a wireless network card supporting WiGig technology (which can currently transmit data at 25 Gbit/s), or, for a mobile device, an 802.11ac wireless card. Because a mobile device's internal disk reads and writes are very slow, even 802.11ac bandwidth brings a significant improvement.
  • When L is wireless, interface conversion can likewise be used for a device without a high-speed wireless interface: L can be a dock, external box, or protective case that relays between the wireless link and the server (connected to the accelerated computer by a USB interface, Thunderbolt interface, or other interface), or any other device type containing a wireless relay part; S then connects to C through wireless transmission via L.
  • When L is a dock or protective case relaying between the wireless link and the server, the dock can communicate with S over a high-speed wireless network such as WiGig, and connect to C through a USB interface (all USB interfaces mentioned in this document are meant broadly, including variants such as MicroUSB, MiniUSB, and Wireless USB), a Thunderbolt interface, or another interface. C itself then needs no high-speed wireless capability: an ordinary tablet or phone can enter a high-performance mode when used with the dock, and return to a lighter, lower-performance mode without the dock or case.
  • L can also increase bandwidth through a multi-channel approach: the connection between S and C passes through more than one L, or one L contains more than one interface to C, increasing the transmission bandwidth. For example, C connects to S through its own WiFi while a USB WiFi adapter is also connected to S, and the bandwidth is merged after connection; or several of C's USB ports each carry a USB WiFi adapter connected to S, raising the WiFi bandwidth ceiling toward the USB bandwidth ceiling; or a WiGig dock L communicates with S via WiGig and connects to C by Bluetooth and MicroUSB together.
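The multi-channel idea above can be sketched as striping one payload across several links so the usable bandwidth approaches the sum of the channels (the channel count, link names, and rates below are assumptions for illustration only):

```python
# Illustrative sketch of multi-channel transmission: a payload is split
# across several links (e.g. built-in WiFi plus USB WiFi adapters) and
# reassembled on the far side.

def split_round_robin(data, n_channels):
    """Stripe the payload byte-wise over n channels."""
    return [data[i::n_channels] for i in range(n_channels)]

def reassemble(chunks):
    """Interleave the per-channel chunks back into the original payload."""
    total = sum(len(c) for c in chunks)
    out = bytearray(total)
    for i, chunk in enumerate(chunks):
        out[i::len(chunks)] = chunk
    return bytes(out)

payload = b"cross-device cache traffic"
chunks = split_round_robin(payload, 3)   # e.g. WiFi + two USB adapters

# Aggregate throughput is bounded by the sum of the channel rates,
# e.g. two ~110 MB/s USB3 802.11ac adapters:
channel_rates_mb_s = [110, 110]
aggregate = sum(channel_rates_mb_s)      # 220 MB/s upper bound
```

The striping itself is lossless (reassembly reproduces the payload exactly); the gain is purely in the aggregate rate ceiling, which matches the 110 MB/s x 2 = 220 MB/s arithmetic used in the third example below.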
  • This cross-device, performance-graded approach also changes traditional cache implementation.
  • Cache implementations used to be self-learned by an algorithm on a single device; in the new system, the cache work itself becomes a source of big data.
  • The advantage is that the cache structure can be optimized, or even pre-judged, from statistics over the many C sides served by one or more Ss and the applications and related files they cache. Example 1: on a large number of C sides, a certain folder of a game program exhibits frequent reads; when S serves a new C side and finds that program, it can act on the pre-judgment directly, for instance caching the files frequently read and written on other devices.
  • Example 2: a program on many C sides shows frequent write jobs, such as a shopping browser; when that browser launches, a large S-side write cache layer can be pre-allocated without re-accumulating cache data. In fact, many programs can never learn an optimal cache on a single device because the user runs them too rarely; cross-device data collection enables statistics and judgment over large samples, so rarely used programs, even programs run for the first time, can be accurately pre-optimized.
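The pre-judgment in the two examples above can be sketched as a fleet-wide statistics table consulted when a new C side appears (the class name, thresholds, and cache-plan labels are all illustrative assumptions, not the invention's actual algorithm):

```python
# Hedged sketch of cross-device cache pre-judgment: access statistics
# aggregated across many C sides let S pre-build a cache plan for a
# program the first time a new C runs it.

from collections import defaultdict

class FleetStats:
    def __init__(self):
        # program -> read/write counts aggregated over all C sides
        self.counts = defaultdict(lambda: {"reads": 0, "writes": 0})

    def record(self, program, reads, writes):
        self.counts[program]["reads"] += reads
        self.counts[program]["writes"] += writes

    def plan(self, program):
        """Pre-judge the cache layout for a program on a brand-new C side."""
        c = self.counts.get(program)
        if c is None:
            return "learn"                      # no fleet data: learn locally
        if c["writes"] > c["reads"]:
            return "preallocate-write-layer"    # write-heavy, like Example 2
        return "precache-read-files"            # read-heavy, like Example 1

stats = FleetStats()
stats.record("game", reads=900, writes=50)      # many C sides read this folder
stats.record("browser", reads=100, writes=400)  # write-heavy across the fleet
```

A single device would need many sessions to learn either pattern; the fleet table lets even a first-time installation start with the right cache shape.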
  • S can also be virtualized on an existing large computer, server, or high-performance computer: virtualize the R, D, and V layers, make the appropriate modifications to convert it into an S device, then install drivers and the L server on S and C to form the cross-device cache system and accelerate C.
  • Methods and apparatus for retrofitting an acceleration system are also provided in the examples section of the invention.
  • The method, and the corresponding device implementing it, have the following advantages:
  • Support for short-range wireless acceleration. Internet of Things and mobile devices may be widely used in the future. These devices are often small, lack both expansion slots and wired interfaces, and are often embedded devices that cannot be upgraded; replacement means replacing a complete set, which is costly.
  • The mobile device itself provides only modest performance and relies on the mobile Internet for information flow.
  • The present invention holds that long-distance networks should deliver information while short-range networks deliver performance.
  • The solution therefore offers another path: short-distance cache acceleration and performance transfer (over distances far smaller than the networks discussed above) can accelerate mobile devices, wearable devices, and IoT smart devices.
  • Mobile devices are limited by power consumption and volume: their processing performance and cache media are inherently insufficient, but they often have high wireless data rates.
  • The present invention has been successfully tested in four examples.
  • The first example is an active wired acceleration center for enterprises, institutions, and schools. This example assumes the applicable environment is an enterprise or institution.
  • The unit has 50 computers that have been in use for many years, all with ordinary mechanical hard disks (sequential read/write around 40-70 MB per second; crucially, 4K random read/write is very slow at around 1 MB per second).
  • The equipment of enterprises, institutions, and schools is mainly desktop computers.
  • A static environment need not consider mobility and better suits a wired L; in commerce and government, some departments require wired connections for security and confidentiality, with wireless network cards removed.
  • the acceleration scheme connection diagram of the first example is shown in FIG.
  • S part: a specialized cache service machine with a silent design, a 10-Gigabit Ethernet card, 4 Ethernet ports reaching 50 outlets through routers, and a WiGig wireless network card. For security it is not connected to the external network; it connects only to the 50 C machines, forming a short-range internal network. It carries 48 GB of RDIMM memory (12 x 4 GB) at 1333 MHz, of which 25 GB is virtualized into a memory disk as the R-layer write cache. The Ramdisk's read and write speeds are 8 GB/s and 10 GB/s respectively, with 4K random read/write reaching 600 MB/s. 512 MB of RAMDISK on the S side is allocated to each of the 50 C-side devices, and an image file of each RAMDISK is generated, loaded when the device powers on and saved when it powers off. Although the RAMDISK runs at 10 GB/s on S, the actual speed of the CAT6 network is 110-120 MB/s, and even with wireless 802.11ac aggregated at the same time it is only about 200 MB/s; so from the C side this appears as a disk with roughly 200 MB/s sequential and 200 MB/s 512K random read/write, caching C's writes and frequently used small files in the RAMDISK.
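The per-client RAMDISK lifecycle described above (a fixed slice per C, with an image file loaded at power-on and saved at power-off) can be sketched as follows; the class name, client id, and in-memory "image store" are illustrative stand-ins for the actual image files on S:

```python
# Hedged sketch of the per-client RAMDISK slice: its contents are
# serialized to an image at power-off and restored at power-on, so the
# cache survives reboots.

import pickle

RAMDISK_SLICE_MB = 512   # per-C allocation used in the example deployment

class RamdiskSlice:
    def __init__(self, client_id, image_store):
        self.client_id = client_id
        self.image_store = image_store      # stands in for image files on S
        self.data = {}

    def power_on(self):
        # Load the saved image back into memory, if one exists.
        blob = self.image_store.get(self.client_id)
        if blob is not None:
            self.data = pickle.loads(blob)

    def power_off(self):
        # Persist the in-memory cache so it survives the reboot.
        self.image_store[self.client_id] = pickle.dumps(self.data)

images = {}
slice_c7 = RamdiskSlice("C7", images)
slice_c7.power_on()                         # first boot: nothing to restore
slice_c7.data["startup.cfg"] = b"cached"
slice_c7.power_off()                        # image saved

rebooted = RamdiskSlice("C7", images)
rebooted.power_on()                         # cache restored after "reboot"
```

Persisting the image is what makes a volatile memory disk usable as a durable cache layer: the speed of RAM during operation, with the contents recoverable across power cycles.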
  • The processor is an Intel Xeon E7540 (2.0 GHz, 12M cache, 6.4 GT/s QPI, no Turbo), 6 cores, 45 W. Two Samsung 850 Pro 128 GB solid state drives built on 3D V-NAND are combined in RAID 0; read/write speed reaches 1 GB per second and 4K random read/write reaches 160 MB per second, serving as the D-layer read cache.
  • On the SSD array, buffer data cache file packages are created and allocated to the C machines; each data file starts at 4 GB, and C's read-operation files are cached in its package.
  • A 2 GB RAMDISK cache is allocated for such common caching.
  • V layer: 50 virtual private servers (VPS) with dynamically allocated resources are carved out based on OpenVZ and assigned to different ports for C-side users to connect to.
  • Each has a dynamic memory limit of 512 MB and shares a 24 GB disk. The tool programs the company commonly uses are pre-installed on the disk, and a sub-account permission mechanism controls the data portion of the storage: different users have different usage and login rights, and different permissions on different files.
  • VPSs are divided because employees' work must not interfere with each other in a work environment. Since no display equipment is required, the total cost is about 5,000 yuan. An abstract schematic of the device is shown in Figure 2.
  • L part: the transmission lines are mainly CAT6 Gigabit network cables, but in a real company computers vary in age: newer computers usually have Gigabit NICs while older ones may still have 100M NICs. Old computers with 100M NICs connect as C via USB 3.0-to-Ethernet at the terminal. Because CAT6 transmission speed is limited, better results can be obtained by connecting to the wireless network at the same time; older desktops without wireless cards use USB-to-802.11ac NIC devices to obtain the wireless network. This costs around 1,000 yuan.
  • the acceleration structure of the first example is shown in Fig. 5.
  • Cost evaluation: the total cost is about 10,000 yuan, and no more than 20,000 yuan, to improve the performance of all 50 C machines.
  • the second example is a family-oriented, active home computing and wireless acceleration center.
  • Home equipment will be mainly wireless in the future, so the L layout here is wireless-based.
  • Households commonly already own PC equipment, often a desktop computer.
  • That PC can either be replaced by the S of this scheme, or modified into it; the meaning of "PC" changes from Personal Computer to Personal Center.
  • the acceleration scheme connection diagram of the second example is shown in Fig. 6.
  • S part: a Personal Center prototype with a high-speed local area network, large memory, a memory-based virtual disk, low storage capacity, simplified peripherals, quiet design, and a long-running orientation (since general storage can be replaced by cloud storage). It carries 16 GB of DDR3-1600 memory (4 x 4 GB), of which 12 GB is virtualized into a memory disk as the R layer, with read/write speeds of 10 GB/s and 12 GB/s. Two 64 GB SanDisk SDSSDP-064G-G25 solid state drives in RAID 0, with read/write speeds of 800 MB/s and 700 MB/s, form the D layer. It uses an Intel i3 processor and a Dell Wireless 1601 WiGig network card ($60); the cost is about 6,000 yuan.
  • A remote VPS whose screen size and resolution adapt to device changes is also provided for the accelerated mobile devices as the V layer.
  • Hardware resources not used for caching remain usable as a traditional desktop computer.
  • L part: mobile devices such as an HTC One phone or an iPad use the 802.11ac protocol to connect to the S end at up to 120 MB per second, while their own ROM reads at only about 17 MB per second and writes at about 8 MB per second.
  • Older notebooks and portable ultrabooks can obtain a high-speed wireless link via one or more USB-to-Gigabit wireless adapters (such as the ASUS USB-AC56, which turns a USB 3.0 port into a Gigabit wireless NIC supporting the 802.11ac protocol).
  • Newer notebooks and ultrabooks have Gigabit wireless NICs built in, and some models, such as Dell's Latitude 7440, even have 10-Gigabit wireless NICs.
  • the acceleration structure diagram of the second example is shown in Fig. 7.
  • Performance evaluation: mobile devices and older computers gain significantly, and the V layer functionally extends the mobile devices.
  • Cost assessment: since many homes also want a high-performance desktop, the cost of this solution is really the incremental cost of upgrading a desktop into a wireless-serving S end, plus the L configuration, about 1,000 yuan.
  • The third example is purely cache-based acceleration: a simple active home and personal wireless acceleration prototype.
  • Here the R layer and the D layer are physically merged, and the V layer is embedded in the D layer.
  • An abstract schematic of the device has been illustrated in Figure 3.
  • the acceleration scheme connection diagram of the third example is shown in FIG.
  • S part: a wireless-accelerated S-side prototype with a WiGig wireless NIC, a DRAM cache (the R-layer write cache), and four-channel SLC NAND flash in RAID (the D-layer read cache, running at 1 GB per second). A region of the RAID holds a virtualized Windows environment (including the libraries and registry the programs require); applications are virtualized so that more of their program files and required system environment files are pre-stored on the device, avoiding hard disk reads and writes in the programs more thoroughly. This serves as the V layer.
  • The device can recruit part of the system memory together with its own R and D layers to form a complex cross-device cache, compensating for the limited write cache of devices with low wireless speed. As the industrial cost of resistive memory and phase-change memory falls, they can also serve as the D-layer read cache here.
  • The device's algorithms and architecture also include: 1. intelligent compression of system memory with automatic background release; 2. long-term monitoring and identification of user habits to decide which data the system will need, pre-stored on the device; 3. the multi-channel mode, as described above; 4. application virtualization that pre-stores most or even all of the program files and required system environment files on the device, as described above.
  • The virtualization principle mainly uses sandbox virtualization: the application is first installed and run with all its actions recorded and processed into files. When the main program file is later executed, a temporary virtual environment is generated for it, like a shadow system; all the operations involved are done in this virtual environment and never touch the original system. After this processing, all the called files live in the V layer, and nothing needs to be installed on the C side.
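The sandbox idea can be sketched as an overlay filesystem (a rough illustration, not the patent's actual implementation; class and path names are invented): the virtualized application's writes land in a private overlay that shadows the host, so the real C-side system is never modified.

```python
# Hedged sketch of sandbox virtualization as a copy-on-write overlay:
# the app reads through to the host, but every write lands in a private
# overlay, leaving the original system untouched.

class SandboxFS:
    def __init__(self, real_fs):
        self.real_fs = dict(real_fs)   # read-only view of the host files
        self.overlay = {}              # every write lands here instead

    def write(self, path, data):
        self.overlay[path] = data      # never touches real_fs

    def read(self, path):
        # The overlay shadows the host, like a shadow system.
        if path in self.overlay:
            return self.overlay[path]
        return self.real_fs[path]

host = {"/windows/registry": "original"}
box = SandboxFS(host)
box.write("/windows/registry", "patched-by-app")

inside = box.read("/windows/registry")   # the app sees its own change
outside = host["/windows/registry"]      # the host stays untouched
```

The application behaves as if it modified the system, yet discarding the overlay restores a pristine C side, which is why nothing needs to be installed there.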
  • Part L: assume the user's 2008 desktop computer has a USB AC1200 network adapter installed on each of the two front USB 3 ports; the bandwidths are then combined: 110 MB per second times 2 gives 220 MB per second.
  • The HTC One supports the 802.11ac protocol, which in theory reaches 120 MB per second (in current tests the HTC One achieves 58 MB per second), while its own ROM reads at only 17 MB per second and writes at only 8 MB per second.
  • The acceleration structure diagram of the third example is shown in FIG. 11.
  • The acceleration effect is similar to the second example, but only a single type of acceleration is offered, and cost and power consumption are very low. It suits scenarios with many mobile devices.
  • Mobile-device storage is constrained by power consumption and volume — phone flash in particular cannot adopt multi-channel designs or complex circuit designs — and flash's slow write speed makes it both impossible and unnecessary for the mobile device itself to reach this accelerated performance. With the S device, a mobile device can obtain the performance boost when needed.
  • Example 2: many programs on the C end exhibit frequent write activity, such as a shopping browser. When such a browser is launched, it can be pre-emptively assigned a larger S-side write-cache layer without having to re-accumulate cache data.
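The cross-device pre-judgement described above can be sketched as follows: cache behaviour reported by many C clients is aggregated, so a brand-new client gets a suitable initial cache layout for a program on first launch. The class name, threshold, and sizes below are illustrative assumptions, not values from the patent.

```python
# Hedged sketch of predictive cache allocation from fleet-wide statistics.
from collections import defaultdict

class CachePlanner:
    def __init__(self):
        # program -> list of observed (reads, writes) samples from many C ends
        self.samples = defaultdict(list)

    def report(self, program, reads, writes):
        self.samples[program].append((reads, writes))

    def plan(self, program, default_mb=64):
        """Return an initial S-side write-cache size (MB) for a new C client."""
        data = self.samples.get(program)
        if not data:
            return default_mb                    # no statistics yet: default
        writes = sum(w for _, w in data)
        reads = sum(r for r, _ in data)
        if writes > reads:                        # write-heavy across the fleet
            return default_mb * 4                 # pre-allocate a larger R layer
        return default_mb

planner = CachePlanner()
for _ in range(10):                               # many C ends report the same
    planner.report("shopping-browser", reads=100, writes=900)
print(planner.plan("shopping-browser"))           # larger write cache up front
print(planner.plan("unknown-app"))                # unseen program: default size
```

The point matches the text: a program never run on this particular C can still be "accurately pre-optimized" because the judgement draws on many devices' samples.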
  • The above three examples are all active designs, characterized by a purpose-built S and better acceleration performance.
  • In active-design acceleration systems, performance can flow from a higher-performance upper level down to a level of lower performance but greater portability and lower power consumption.
  • S can also be virtualized on an existing mainframe, server, or high-performance computer: the R, D, and V layers are created virtually, appropriate modifications are installed to convert it into an S device, and the driver and the L server side are then installed to accelerate C.
  • The fourth example is a device designed specifically for retrofitting existing systems.
  • The retrofit device has a software portion and a hardware portion.
  • The software portion is divided into a server side and a served side.
  • After the server side is installed on the device to be converted into an S end, it re-layers that device, creating a write-cache layer and a read-cache layer, and exchanges cache commands and data with the served side. When the served side is installed on the C end, it intercepts C's I/O for redirection, changes C's cache structure, calls the S-side cache layers as its cache, and exchanges cache data with the server side.
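The interception and redirection described above can be sketched minimally: an intercepted read is served from the S-side cache layer when possible, falling back to the slow local disk only on a miss (and populating the cache for next time). Function and variable names are assumptions for illustration.

```python
# Minimal sketch of C-side I/O redirection to the S-side read-cache layer.

def handle_read(path, s_read_cache, local_disk):
    """Serve an intercepted read from the S-side D layer when possible."""
    if path in s_read_cache:                 # cache hit on the server side
        return s_read_cache[path], "S"
    data = local_disk[path]                  # miss: fall back to the slow disk
    s_read_cache[path] = data                # populate the D layer for next time
    return data, "disk"

s_cache, disk = {}, {"game.dat": b"level-1"}
print(handle_read("game.dat", s_cache, disk))   # first read comes from disk
print(handle_read("game.dat", s_cache, disk))   # repeat read is served by S
```

A real implementation would intercept at the driver/filter level rather than in application code; this sketch only shows the redirect-or-fall-back decision.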
  • The hardware portion comprises the two ends of a wireless transmission L, each with a USB interface and a Wi-Fi network adapter; one end connects to S or C by USB (including generalized USB interfaces such as MicroUSB), and the two ends communicate with each other.
  • FIG. 1: schematic of the device. The top of the figure shows a typical short-distance, multi-channel, local-performance-network-based cross-device caching and computing-virtualization system that provides equipment acceleration for ordinary enterprises, homes, and individuals. Below it is the new I/O mechanism of a computing device within this system. The detailed designs of the typical S devices A and B are shown in FIG. 2 and FIG. 3, respectively.
  • FIG. 4: connection diagram of the first example's acceleration scheme, based mainly on short-distance CAT6 wired connections and short-distance multi-channel connections forming a cross-device cache and computing system. At the center of the figure is the S of the first example; around it are the C devices.
  • FIG. 5: acceleration block diagram of Example 1, depicting the new cross-device cache structure of the accelerated device C. All of C's I/O is intercepted and reallocated.
  • FIG. 6: connection diagram of the second example's acceleration scheme, based mainly on short-range wireless connections and short-range multi-channel connections forming a cross-device cache and computing system. At the center of the figure is the S of the second example; around it are the C devices.
  • In counterclockwise order, the C devices are a smart watch, smart glasses, a tablet computer, a smartphone, and a large-screen human-computer interaction device.
  • FIG. 7: acceleration block diagram of Example 2, depicting the new cross-device cache structure of accelerated device C under Example 2.
  • All of C's I/O is intercepted and reallocated.
  • The accelerated devices also include many mobile device types. Mobile devices are limited by power consumption and volume, so their onboard processing performance and cache media are seriously insufficient; yet, because of mobility requirements, mobile devices often have high wireless data speeds.
  • The virtual environment also enables a mobile device to handle Windows applications indirectly and obtain the corresponding operating experience.
  • FIG. 10: connection diagram of the third example's acceleration scheme — a simplified scheme based mainly on short-range wireless connections and short-range multi-channel connections forming a cross-device cache and computing system. At the center of the figure is the S of the third example; around it are the C devices.
  • In counterclockwise order, the C devices are a smart watch, smart glasses, a tablet, a smartphone, and a desktop computer.
  • FIG. 11: acceleration block diagram of Example 3, depicting the new cross-device cache structure of accelerated device C in Example 3. Since the S device of the third example is a simplified device, the cache hierarchy is adjusted accordingly; for details, see the description of Case 3 in the implementation examples.
  • FIG. 12: Example 3 employs a mechanism for collecting, analyzing, and feeding back cache-configuration optimization data across multiple S-C systems.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

An external computing-device acceleration method based on a server side and an external cache system, and a device implementing the method. The system comprises at least one server-side cache device (S for short), connected through a high-speed or multi-channel data transmission device (L for short) to the accelerated computing device (C for short). L may be wired or wireless. S provides C with a relatively high-speed write-cache layer (e.g. a Ramdisk, the R layer) and a random-read cache layer (e.g. phase-change memory, the D layer), caching for C the frequently used files of the system and programs as well as frequently read/written files, diverting I/O aimed at C's hard disk, and supporting C's computing performance through a virtualization layer (the V layer). When L is wired, it may be optical fiber, a USB 3.0 or faster cable, a Cat 5e or better network cable, or another high-speed wired transmission device. When L is wireless, it may be a WiGig-capable wireless network adapter, or a dock or external box for C that connects to the S end wirelessly.

Description

External computing-device acceleration method based on a server side and an external cache system, and device implementing the method — Technical Field
This product belongs to the field of computer equipment: an electronic-device acceleration method based on cross-device caching technology, and devices implementing the method.
Background Art
Today both individual and enterprise users rely heavily on computers and mobile devices, and in the future robots, smart-home devices, and smart wearables may be widely deployed as well. These smart electronic devices are replaced very quickly, yet product models are numerous, device types varied, vintages wide-ranging, and system platforms complex, making performance upgrades extremely difficult. Ordinary mobile devices, embedded smart devices, and wearables in particular essentially cannot be upgraded at all.
For now there is no effective general-purpose upgrade solution to this problem.
1. Why are acceleration/upgrade products needed?
The march of technology always leaves hardware behind; software and systems evolve rapidly. Upgrading a computer is expensive — typically several thousand yuan — while phones and tablets can hardly be upgraded at all. At present this kind of upgrade is a thorny problem: the usual solution is simply to buy a new machine, whether a computer or an iPhone, which not only costs close to ten thousand yuan and is highly inconvenient, but, worse still, turns the old machine into electronic waste.
Of course, many users buy parts and swap components themselves — the so-called DIY market — but this demands considerable skill and is very difficult. Replacing a CPU or a hard disk means not only correctly connecting the various cables and sockets in the case, but also migrating the old disk's data and reinstalling the system and all software; ordinary users simply cannot do it. Costs also remain high, and motherboard compatibility is a serious problem.
And that is just for traditional computers; for mobile computing devices, even users skilled at DIY are helpless.
There is also software that can "optimize" the system, such as 360 Optimizer and its acceleration ball, but these do not substantively improve hardware capability — they only clean up junk files and do not in themselves enhance the computer's performance.
2. Where is the performance bottleneck of computers and mobile devices?
It lies in the speed of storage devices such as hard disks, especially frequent small-file reads/writes and random reads/writes.
Over the past decade, CPU and memory performance has improved more than a hundredfold, while hard-disk performance has merely doubled. The bottleneck of the entire data-processing pipeline is the hard disk; clear that bottleneck, and information transfer moves onto a "highway." (The situation on mobile devices is similar: processor performance grows quickly, but storage-chip performance grows slowly.)
Mobile devices are small and generally offer no DIY upgrade path. The following discusses the situation on computers.
On computers, solid-state drives can be used for upgrading. An SSD has none of the motors and spinning media of an ordinary disk, so it boots quickly and resists shock extremely well; with no heads, its reads and writes are fast with minimal latency, generally above 100 MB per second. Although considerably faster than a mechanical disk, SSDs have many downsides: high price, small capacity, shorter battery life, limited write endurance, and so on. The affordable ones are small — the entry-level Kingston SSDNow at around five hundred yuan holds only 32 GB — while the large ones are expensive: for the same 1 TB, a mechanical disk costs about 200 yuan but an SSD at least five thousand. Hence even in newly shipped computers, SSDs still have not replaced mechanical disks.
Upgrading old equipment must above all weigh two things: feasibility and cost-effectiveness. On feasibility, the first issue is compatibility: early motherboards do not support SSDs. Specifically, many computers from before 2009 do not support SSDs at all, and computers from before 2011 do not even support the SATA 2 protocol — their motherboard interfaces top out at ordinary IDE at 100 MB per second or SATA at 150 MB per second — so an SSD brings no acceleration whatsoever, and replacing the motherboard is nearly impossible. The second issue is, again, inconvenience: ordinary users are not good at replacing disks themselves, and a disk swap in particular means replacing the entire system, copying all files, and reinstalling every driver, consuming at least a day or two. Third, SSD configuration is complex: only under Windows 7 or Windows 8 can an SSD exceed ordinary disk speed; XP does not recognize the SSD's Trim command, 4K alignment, or AHCI. Far from accelerating, the computer may become unusable, in most cases blue-screening or freezing.
On cost-effectiveness: first, the price is high — entry-level 32-64 GB SSDs cost around five hundred yuan, yet 64 GB leaves essentially no free space after installing Windows 7 and Office, and entry-level 128 GB already approaches a thousand yuan; for upgrading an old computer that cost is not worthwhile. Second, the lifespan is short — SSDs generally use MLC flash, whose life is very short without proper maintenance, and ordinary customers do not know how to apply maintenance measures such as enabling Trim or setting 4K alignment.
3. Are there currently other lower-cost, more convenient technical solutions to the disk-speed bottleneck?
Mobile devices are small, generally offer no DIY upgrade path, and urgently need an upgrade method.
The following discusses other upgrade options on computers.
There have been attempts to accelerate computers with other devices. The best known is Intel Turbo Memory: an expansion card on the PCI-E interface carrying one or two MLC NAND flash modules. As a Mini PCI-E 1x expansion card, it exchanges data with the system I/O controller over the PCI-E bus. The flash modules used are NAND rather than NOR, because NAND outperforms NOR in data-access performance and offers better value.
With operating-system support, it provides the ReadyBoost and ReadyDrive functions, which directly improve system performance in disk-related tasks such as booting, hibernation, program installation, file copying, and game loading. Official material states that Turbo Memory can speed up booting by 20% while reducing disk rotation to save power.
ReadyBoost overview:
When ReadyBoost determines that the cache in flash can satisfy random-read demands better than the cache on disk, it reads data randomly from the flash medium. The hard disk reads out large batches of data at once, staged temporarily in Turbo Memory for the system to call on at any time; data to be written is likewise staged in Turbo Memory first and written to disk in bulk once enough has accumulated. This on-demand read/write mechanism helps system performance considerably. During this time the hard disk sits idle — and the larger the Turbo Memory capacity, the longer the idle time — reducing mechanical rotation and power consumption and extending laptop battery life.
ReadyDrive overview:
ReadyDrive is in fact Microsoft's name for hybrid hard disks (disks with internal flash components). Beyond flash's obvious random-access speed advantage, the greatest attraction is that the data stored there is available "on demand" — flash needs neither to start up heads nor wait for them to seek to the right position. Hybrid disks boot, hibernate, and sleep faster at lower power, because while the operating system reads and writes the cache, the drive itself can temporarily stop working and consume no power; and on resuming from hibernation, the laptop can start working from the cache immediately instead of first waiting for the drive's heads to spin up as before.
In the Turbo Memory driver, the user can configure through the software interface whether the module provides ReadyBoost, ReadyDrive, or both.
Yet Turbo Memory is still not an effective upgrade solution — which is exactly why it is rarely mentioned now. The main reasons for its failure: 1. It cannot be used in desktops, nor in the vast majority of laptops; all netbooks and most laptops lack support, since it requires not only a spare Mini PCI-E slot but, more importantly, AHCI support on the laptop's SATA interface. 2. Installation is complex; ordinary users cannot open a machine to install a Mini PCI-E card, so it cannot be used to upgrade old computers. 3. Results are poor: the PCI-E bus itself is limited to under 150 MB per second, and Intel's flash falls far short even of that — measured random read/write is about 35 MB per second, little improvement over the hard disk and worse than an SSD; the card's limited size precludes adding caches, parallel modules, or more controller ICs. 4. It is expensive: the 4 GB model was priced at 100 US dollars. 5. Poor system compatibility — this alone is enough to rule out using it to accelerate old computers: both ReadyDrive and ReadyBoost require Windows Vista or later, while the vast majority of old computers run XP and only run smoothly under XP.
In summary, among smart devices, large ones such as PCs and laptops face complex and difficult upgrades, while small ones such as mobile devices are nearly impossible to upgrade.
Summary of the Invention
In view of the above problems, the present invention proposes an external computing-device acceleration method based on a server side and an external cache system, and devices implementing the method.
The method accelerates by forming a performance-tiered, cross-device cache system comprising at least one server-side cache device (hereafter S), connected through a high-speed or multi-channel data transmission device (hereafter L) to the accelerated computing device (hereafter C). L may be wired or wireless. S provides C with a relatively high-speed write-cache layer (e.g. a Ramdisk) and a random-read cache layer (e.g. resistive memory), caching for C the frequently used files of the system and programs as well as frequently read/written files, diverting I/O aimed at C's hard disk, and supporting C's computing performance through a virtualization layer. When L is wired, it may be optical fiber, a USB 3.0 or faster cable, a Cat 5e or better network cable, or another high-speed wired transmission device. When L is wireless, it may be a WiGig-capable wireless network adapter, or a dock or external box for C that connects to the S end wirelessly. See FIG. 1.
"Computing device" here does not mean computers alone: it broadly includes computers of all kinds, tablets, phones, robots, and smart hardware — all of which acquire and process data and files. Including these devices in the broad sense allows a single enterprise, institution, or household to form a short-distance, multi-channel, cross-device cache system.
S accelerates C through the three specialized acceleration layers described above: typically a write-cache layer (hereafter the R Layer, e.g. a Ramdisk — S emulates memory as a disk and creates caches for C in it, for greater cache speed), a random-read cache layer (hereafter the D Layer, e.g. resistive memory, or a NAND disk array combined in RAID), and a virtualization layer (hereafter the V Layer, which supports C's computing performance through application or hardware virtualization). Considering power and cost in a given environment, these three layers may of course be strengthened, merged, or reduced, as actual needs dictate. For example, an enterprise environment may need several kinds of V layer to suit different applications, while for simple home use the R and D layers can be physically merged into a single RD layer; and for C devices with idle usable memory, S can enlist part of C's memory together with S's own cache to form a composite cross-device cache.
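The division of labour among the three layers can be illustrated with a simple dispatch rule: writes go to the RAM-backed R layer, large random reads to the flash-backed D layer, and program launches to the virtualized V layer. The routing rules below are a simplified assumption for illustration, not the patented algorithm.

```python
# Illustrative routing among the specialised R/D/V layers described above.

def route(op, size_kb=None):
    """Pick a cache layer for an I/O operation (simplified assumed policy)."""
    if op == "write":
        return "R"          # RAM-backed write cache absorbs write operations
    if op == "read":
        # Small random reads fit the fast RAM layer; larger reads hit the
        # flash/RAID read-cache layer.
        return "R" if size_kb is not None and size_kb <= 4 else "D"
    if op == "launch":
        return "V"          # virtualized application runs from the V layer
    raise ValueError(op)

print(route("write"))               # write operation -> R layer
print(route("read", size_kb=512))   # bulk read -> D layer
print(route("read", size_kb=4))     # small random read -> R layer
print(route("launch"))              # program start -> V layer
```

The merging described in the text (e.g. a combined RD layer for home use) would simply collapse two of these return values into one physical device.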
Beyond building the cache system, S can also support C's computing performance in any one or more of three ways: first, S shares computing tasks; second, by virtualizing architecture within S, C runs this virtual layer on S through remote operation, with the interface displayed and user interaction realized on C — one S can create virtual architectures for multiple C, deducting S's resources according to C's real-time usage, and the partitioned virtual architectures in S may be virtual machines or virtualized application layers; third, S virtualizes applications, pre-storing the program files C needs and the system environment those programs require on the S device, so that C can run the applications directly on S.
Rich topologies between S and C can be envisioned. First, S and C may be in one-to-one, one-to-many, or many-to-many relationships: a home environment may be one-to-one or one-to-many, while enterprises, schools, and administrative units may be one-to-many or many-to-many (hereafter the elastic-system relationship). Second, S and C are only relative roles: the C of one pairing may be the S of another — an accelerated computer can simultaneously accelerate another computer — and correspondingly the S of one pairing may be the C of another, as when a computing device of higher performance and cache speed accelerates an S of the next level down (hereafter the performance-flow relationship). Third, multiple C can simultaneously serve as S for one another, forming a P2P acceleration network for mutual performance gains (for instance, computer C1 has large memory, offering stronger small-file caching and write caching but a relatively slow disk, while computer C2 has several large flash SSDs in RAID for greater bandwidth and stronger read caching; C2 and C1 then act as S for each other, with write caching going mostly to C1 and read caching mostly to C2 — hereafter the performance-complementarity relationship).
These rich topologies enable further applications. For example, multiple S can be networked over optical fiber or other high-speed links to strengthen read-ahead analysis and achieve higher cache performance. Or a higher-level S can pass performance down to the next-level S over a faster L such as multi-path fiber, exploiting idle capacity in advanced computing centers.
When L is wired, it may be optical fiber, a Thunderbolt cable, an extended USB 3.0 or faster cable, a Cat 5e or better network cable, or any other wired transmission device faster than 60 MB per second; S connects directly to C through these cables. Because the USB protocol currently uses bandwidth inefficiently, when L connects to C over a USB interface the protocol can be improved: the BOT protocol within the traditional USB interface protocol is optimized, and resource allocation over the USB transport protocol is optimized.
When L is wired, devices lacking a high-speed wired network interface can use interface conversion — for example, a high-speed network cable of Cat 6 or better plus a USB-Ethernet adapter (one end of such an adapter has a USB interface and the other an Ethernet interface, adding an Ethernet port to a computer through an existing USB input port) and other necessary accessories: the adapter's USB end connects to S or C, with Cat 6 or better cable carrying the transmission between them.
When L is wireless, it may be a WiGig-capable wireless network adapter (the technology currently transmits at 25 Gbit per second), or, for mobile devices, an 802.11ac wireless adapter. Because a mobile device's own internal storage reads and writes slowly, 802.11ac bandwidth alone can bring an obvious improvement.
When L is wireless, devices lacking a high-speed wireless interface can likewise use interface conversion. For example, L may be a dock, external box, or protective case that communicates wirelessly with the server side (the dock in turn connecting to the accelerated computer via USB, Thunderbolt, or another interface), or any other device containing a wireless transmission part; S connects to C wirelessly via L.
This interface-conversion approach is especially practical for mobile devices. For example, L is a dock or protective case that communicates wirelessly with the server side over a high-speed wireless network such as WiGig, and connects to C via a USB interface (all USB interfaces mentioned herein are USB in the broad sense, including variants such as MicroUSB, MiniUSB, and Wireless USB), a Thunderbolt interface, or another interface. C need not itself be capable of high-speed wireless communication — it can be an ordinary tablet or phone that enters a high-performance mode when using the dock or case and reverts to a lighter but lower-performance mode without it.
For higher transmission performance, L can use multiple channels to enlarge bandwidth: the S-C connection runs through more than one L, or L contains more than one interface to C, increasing transmission bandwidth (Example 1: C connects to S through its own Wi-Fi and simultaneously through a USB Wi-Fi adapter, merging the bandwidths after connection; Example 2: several of C's USB ports each use a USB Wi-Fi adapter to connect to S, raising the Wi-Fi bandwidth ceiling to the USB bandwidth ceiling; Example 3: a WiGig dock L communicates with S over WiGig while connecting to C over Bluetooth plus MicroUSB).
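The multi-channel idea can be sketched as striping one transfer across several links so their bandwidths add up. The link names and speeds below are hypothetical, taken from the text's example of two USB AC1200 adapters at roughly 110 MB/s each.

```python
# Rough sketch of multi-channel bandwidth aggregation for L.

def split_transfer(total_mb, links):
    """Divide a transfer across links in proportion to each link's bandwidth."""
    bw_sum = sum(links.values())
    return {name: total_mb * bw / bw_sum for name, bw in links.items()}

links = {"usb-ac1200-a": 110, "usb-ac1200-b": 110}   # MB/s per channel (assumed)
print(split_transfer(220, links))   # each link carries half of the transfer
print(sum(links.values()))          # aggregate bandwidth of the bonded links
```

Real channel bonding also needs reordering and per-link flow control; the sketch only shows why two 110 MB/s channels yield the 220 MB/s figure quoted in the examples.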
This tiered, cross-device method will also change how caching is traditionally implemented. Past cache implementations were mostly algorithmic self-learning on a single device; under the new system, the caching work itself becomes a source of big data. The advantage is that cache structures can be optimized or predicted from the statistics that one or more S accumulate over the various applications and related files cached for the many C ends they serve. (Example 1: if a certain folder of a certain game program shows frequent-read behavior on many C ends, then when S serves a new C end and finds that program, it can act predictively — caching the folder frequently read and written on other devices to the high-speed device — without re-accumulating cache data. Example 2: if a certain program, say a shopping browser, shows frequent writes on many C ends, then on launching that browser it can predictively be assigned a larger S-side write-cache layer without re-accumulating cache data.) In fact, many programs are used too rarely for a single device to learn their optimal cache, but acquiring data across devices permits statistics and judgment over large data samples, so that rarely used — even first-run — programs can be accurately pre-optimized.
Various systems can be designed, and various devices manufactured, from the above method. In the samples of this invention two typical devices were built: a typical wired system design for enterprises and institutions, and a wireless system design for homes and individual users, shown in FIG. 2 and FIG. 3 respectively; both are described in detail in the scenario samples below. For the home and personal device, for quietness and energy saving, the S of Sample 2 can stay in standby when not connected to any C and be woken by a C connection.
These devices, designed specifically for the method described in this invention, achieve good results at low power consumption and cost. Of course, in some cases — when existing equipment is available for conversion and power and cost are no concern — S can be created by virtualization on an existing mainframe, server, or high-performance computer, virtualizing the R, D, and V layers and installing appropriate modifications to convert it into an S device, then installing the driver and L's server side on S and C to form the cross-device cache system and accelerate C. The samples section of this invention likewise provides a method and device for retrofitting an acceleration system.
Beneficial effects of the invention
Compared with traditional computer upgrades, this method and the corresponding devices implementing it have the following advantages:
1. Broad device support — an effective, feasible, and simple upgrade path. Upgrading an old computer usually means opening the case to swap memory and disk; going faster means soldering the motherboard and swapping the CPU — a day or two of fiddling that often ends in damage or blue screens, and interface-compatibility issues are beyond ordinary users. The safest course is to haul the computer to a computer mall for on-site upgrading, but prices are high, tricks abound, and components are often swapped out. With this method and the devices implementing it, the upgrade is completed very simply. And whereas no effective solution previously existed for upgrading mobile devices, this invention provides a feasible cross-device performance-enhancement scheme.
2. Batch acceleration: enterprises, institutions, and schools with many devices can upgrade them in bulk at very low cost.
3. Short-range wireless acceleration: IoT devices and mobile devices may proliferate in the future; these devices are often tiny, with neither external nor wired interfaces, are frequently embedded, cannot be upgraded, and must be replaced as whole sets at high cost. The traditional view holds that a mobile device supplies simple performance itself and draws its information flow from the mobile Internet; this invention holds that long-range networks fetch information while short-range networks fetch performance. Through short-distance cache acceleration and performance transfer (over distances far shorter than "networks" in the usual sense), mobile devices, wearables, and IoT smart devices can be accelerated. Constrained by power and volume, mobile devices carry seriously insufficient processing performance and cache media, yet they often have high wireless data speeds.
4. Obvious results: taking this invention's samples as examples — for an ordinary computer with a 100-Mbit NIC and an ordinary mechanical disk, after retrofitting a gigabit NIC via USB and merging bandwidth over multi-channel networking, some programs launch and run 1-3 times faster (moreover, on most ordinary computers USB 3.0 can be adapted out of PCI-E or ExpressCard; compared with native USB 3.0 these adapted ports are slower, about 150 MB per second, so old computers can use USB 3.0 too); for computers with gigabit NICs and newer mechanical or hybrid disks, program launch and run speed improves 20%-80%; for the newest computers with WiGig technology and SSDs, such as the Dell Latitude 7440, in theory (at 25 Gbit per second) program launch and run speed can still improve 2-3 times.
5. Low cost: the scheme's cost is within reach of ordinary enterprises and individual consumers.
6. Sustainable upgrading: further upgrades of C can be achieved by further upgrading S.
7. Tiered upgrading: further upgrades of S can be achieved by the same method, accelerated by a higher-level S device.
8. Big-data-driven intelligent cache enhancement: this tiered, cross-device method will also change how caching is traditionally implemented. Past cache implementations were mostly algorithmic self-learning on a single device; under the new system, the caching work itself becomes a source of big data. The advantage is that cache structures can be optimized or predicted from the statistics that one or more S accumulate over the various applications and related files cached for the many C ends they serve (Example 1: if a certain folder of a certain game program shows frequent reads on many C ends, then when S serves a new C end and finds that program, it can act predictively, caching the folder frequently read and written on other devices to the high-speed device, without re-accumulating cache data; Example 2: if a certain program, say a shopping browser, shows frequent writes on many C ends, then on launching that browser it can predictively be assigned a larger S-side write-cache layer without re-accumulating cache data). In fact, many programs are used too rarely for a single device to learn their optimal cache, but acquiring data across devices permits statistics and judgment over large data samples, so that rarely used — even first-run — programs can be accurately pre-optimized.
9. Topological extensibility, aiding future development: first, S and C may be in one-to-one, one-to-many, or many-to-many relationships — a home environment may be one-to-one or one-to-many, while enterprises, schools, and administrative units may be one-to-many or many-to-many (hereafter the elastic-system principle); second, S and C are only relative roles: the C of one pairing may be the S of another — an accelerated computer can simultaneously accelerate another computer — and correspondingly the S of one pairing may be the C of another, as when a computing device of higher performance and cache speed accelerates an S of the next level down (hereafter the performance-fluid principle); third, multiple C can simultaneously serve as S for one another, forming a P2P acceleration network for performance gains (for instance, computer C1 has large memory, offering stronger small-file caching and write caching but a relatively slow disk, while computer C2 has several large flash SSDs in RAID for greater bandwidth and stronger read caching; C2 and C1 then act as S for each other, with write caching going mostly to C1 and read caching mostly to C2 — hereafter the performance-complementarity principle).
Implementation examples of the invention
Four samples of this invention have been tested successfully.
The first sample is an active wired acceleration center for enterprises, institutions, and schools. It assumes an institution with 50 computers in use for many years, all with ordinary mechanical disks (sequential read/write about 40-70 MB per second; crucially, 4K random read/write is very slow, around 1 MB per second). Such environments are dominated by desktop computers in static settings with no mobility concerns, making them well suited to a wired L. Moreover, in some business and government departments, wired connections are mandatory for security and confidentiality, and any wireless cards are removed.
The connection diagram of Sample 1's acceleration scheme is shown in FIG. 4.
Part S: a dedicated cache-serving machine with a silent design, a 10-gigabit Ethernet card, 4 Ethernet ports of its own extended to 50 via routers, and a WiGig wireless card; for security it connects to no external network, forming only a short-distance internal network with the 50 C. It uses 48 GB of RDIMM memory (12 x 4 GB) at 1333 MHz, of which 25 GB is virtualized into a RAM disk as the R-layer write cache: Ramdisk read and write speeds are 8 GB and 10 GB per second respectively, with 4K random read/write at 600 MB per second. Each of the 50 C devices is allocated 512 MB of S-side RAMDISK, and an image file of each RAMDISK is generated, loaded at device boot and saved at shutdown. Although the RAMDISK runs at 10 GB per second for S, the actual CAT6 network speed is 110-120 MB/s — even merged with a simultaneous 802.11ac wireless connection, only about 200 MB per second — so to C it appears simply as a disk with 200 MB/s sequential and 200 MB/s 512K random read/write; C's write operations and frequently used small files are cached into this RAMDISK. The processor is an Intel Xeon E7540 (2.0 GHz, 12M cache, 6.4 GT/s QPI, no Turbo), 6C, 45 W. Two Samsung 850 Pro 128 GB SSDs built with 3D V-NAND form a RAID 0, reaching 1 GB per second read/write and 160 MB per second 4K random read/write, serving as the D-layer read cache; a cache area of "data" cache-file packages is created on the S-side SSDs, each C being allocated a data package of initially 4 GB, into which C's read-operation files are cached. Since the network and program files accessed by the roughly 50 C ends are highly similar, a further 2 GB RAMDISK is allocated for such shared caching. The V layer uses OpenVZ to partition S into 50 virtual private servers (VPS) with dynamically allocated resources, assigned to different ports for C-end users to connect to, each with a dynamic memory cap of 512 MB, sharing a 24 GB disk on which the company's common tool programs are pre-installed; a per-account permission mechanism is created for data storage on that disk, with different users holding different usage and login rights and different permissions over different files. The VPS are partitioned because, in a work environment, employees' work must not interfere with one another. Needing no display device, the total build cost is about five thousand yuan. The device's schematic abstraction is given in FIG. 2.
Part L: the transmission lines are mainly CAT6 gigabit network cables, but in a real company computers vary in age — newer computers generally have gigabit NICs while older ones may still have 100-Mbit NICs; for the older machines, the terminal end connects to C through a USB 3.0-to-Ethernet adapter. Since CAT6 transmission speed is limited, a wireless network can be connected simultaneously for better results; old desktop computers without wireless cards obtain wireless through USB-to-802.11ac adapters. Cost is around a thousand yuan.
The acceleration structure diagram of Sample 1 is shown in FIG. 5.
Effect evaluation: 1. Cache-derived speedup: a CrystalDiskMark test on an original C machine's disk showed sequential read 75 MB/s, sequential write 23 MB/s, 512K random read 69 MB/s, random write 13 MB/s, 4K random read 4 MB/s, 4K random write 5 MB/s; with this scheme, all of these essentially reach 180 MB per second or more, the sequential-read and 512K random performance coming mainly from the S-side SSD data packages, and the write caching and 4K random performance mainly from the S-side RAMDISK; allowing for cache hit rate, overall acceleration should be 2-3x. 2. Speed and efficiency gains from V-layer performance support: institutional settings are office-centric, with heavy document processing and industry-software use; work that can run in the background, or that waits on processing results, can be shifted to the S end via this scheme; other S-side strengths, such as its bandwidth and processing performance, can finish some corresponding jobs faster, or provide secure backup of critical work and debugging when work hits problems. 3. Shared caching and shared prefetch: the network and program files accessed by the ~50 C ends are highly similar, and handling them on the same S markedly raises the cache hit rate.
Cost evaluation: total build cost about ten thousand yuan, not exceeding twenty thousand, delivering clear performance gains to about 50 C ends.
The second sample is an active home computing and wireless acceleration center for households. Home devices will in the future be mostly wireless, so L here is laid out wirelessly. In the past a home generally had a PC, often a desktop computer; under this sample's scheme, that PC can be replaced by this scheme's S, or the existing PC can be converted into this scheme's S, turning the meaning of "PC" from Personal Computer into Personal Center.
The connection diagram of Sample 2's acceleration scheme is shown in FIG. 6.
Part S: a Personal Center prototype, oriented toward a fast local network, large memory with memory-based virtual disks, low storage capacity, simplified peripherals, silent design, and long-duty operation (since ordinary storage can be replaced by cloud storage). It has 16 GB of DDR3 1600 (4 x 4 GB), of which 12 GB is virtualized into a RAM disk as the R layer, at 10 GB/s read and 12 GB/s write; two 64 GB SanDisk SDSSDP-064G-G25 SSDs in RAID 0, reaching 800 MB/s read and 700 MB/s write, as the D layer; an Intel i3 processor; and a Dell Wireless 1601 WiGig card (60 US dollars); cost about six thousand yuan. Besides two-level caching similar to the first example, it also provides the accelerated mobile devices with a remote VPS whose screen size and resolution adapt to the device, serving as the V layer. The remaining hardware resources not used for caching can serve day to day as a traditional desktop computer.
Part L: mobile devices such as an HTC One phone or an iPad connect to the S end via the 802.11ac protocol at up to 120 MB per second, while these devices' own ROM reads at most 17 MB per second and writes at most 8 MB per second. Older laptops and light ultrabooks can obtain a high-speed wireless network through one or more USB-to-gigabit-wireless adapters (e.g. the ASUS USB-AC56 adapter turns a USB 3 port into a gigabit wireless NIC supporting 802.11ac). Many new laptops and ultrabooks have gigabit wireless NICs built in, and models such as the Dell Latitude 7440 even carry multi-gigabit (WiGig-class) wireless NICs.
The acceleration structure diagram of Sample 2 is shown in FIG. 7.
The read-operation flow diagram of Sample 2 is shown in FIG. 8.
The write-operation flow diagram of Sample 2 is shown in FIG. 9.
Effect evaluation: mobile devices and older computers see obvious performance gains, and the V layer also extends mobile devices functionally.
Cost evaluation: since many homes need a high-performance desktop computer anyway, the scheme's real cost is the increment of upgrading from a desktop to a wireless-serving S end, plus L's configuration cost — an increment of roughly a thousand yuan.
The third sample is pure cache acceleration: a simple active wireless acceleration prototype for homes and individuals. The R and D layers are physically merged, and the V layer is embedded in the D layer. The device's schematic abstraction is given in FIG. 3.
The connection diagram of Sample 3's acceleration scheme is shown in FIG. 10.
Part S: a wireless-acceleration S-end prototype with a WiGig wireless NIC, a DRAM cache (the R-layer write cache), and four-channel SLC NAND flash combined into a RAID (the D-layer read cache). A region of the RAID holds a virtualized Windows environment (including the libraries and registry entries programs need); applications are virtualized so that more program files and program-environment files are pre-stored on the device, more thoroughly avoiding hard-disk reads and writes during program use — this serves as the V layer. Outbound speed is capped at 1 GB per second. In addition, the device can enlist part of the system's memory together with its own R and D layers to form a composite cross-device cache, offsetting the limited write cache of low-wireless-speed devices. As the industrial cost of resistive memory and phase-change memory falls in the future, either may also be used as the D-layer read cache here.
The device's algorithms and architecture further include: 1. intelligent compression of system memory with automatic background release; 2. long-term monitoring and recognition of user habits to predict which data the system is about to use, pre-storing it on the device; 3. the multi-channel mode, as described above; 4. virtualization of applications, pre-storing more or even all program files and the system-environment files the programs require on the device, as described above. (The virtualization principle mainly uses sandbox virtualization: every action from installation through execution of the application is recorded and processed into local files; when the main program file is executed, a temporary virtual environment is generated for execution, like a shadow system — all operations involved are completed inside this virtual environment, without touching the original system. After such processing, all called files reside in the V layer and are never installed on the C side.)
Part L: assume the user's 2008 desktop has USB AC1200 network adapters installed on its two front USB 3 ports; the bandwidths then stack: 110 MB per second times 2 gives 220 MB per second. The HTC One phone supports the 802.11ac protocol, in theory reaching 120 MB per second of bandwidth (58 MB per second for the HTC One in current tests), while its own ROM reads at most 17 MB per second and writes at most 8 MB per second.
The acceleration structure diagram of Sample 3 is shown in FIG. 11.
Acceleration effect: similar to the second example, but with a single type of acceleration and very low cost and power consumption — well suited to scenarios with many mobile devices. Mobile-device storage is limited by power consumption and volume; phone flash in particular cannot adopt multi-channel designs or complex circuit designs, and flash's inherently slow write speed makes it both impossible and unnecessary for the mobile device itself to reach this accelerated level of performance. With the S device, mobile devices can obtain the performance boost whenever needed.
In the case of Sample 3 we also added a design for collecting and feeding back the cache patterns that different applications settle into on their respective S after a period of optimization, shown in FIG. 12: each S uploads, in ciphertext, the cache-pattern configuration data optimized within its own system to a processing machine, which optimizes or predicts cache structures from the statistics of the various applications and related files cached across many C ends, then feeds the results back to each S end and to new S ends. (Example 1: if a certain folder of a certain game program shows frequent reads on many C ends, then when an S serves a new C end and finds the program, it can predictively cache the folder frequently read and written on other devices to the high-speed device, without re-accumulating cache data. Example 2: if a certain program, say a shopping browser, shows frequent writes on many C ends, then on launching that browser it can predictively be assigned a larger S-side write-cache layer, without re-accumulating cache data.)
All three samples above are active designs, characterized by a purpose-built S and good acceleration performance. In such active acceleration systems, performance may flow from a high-performance upper level into a lower level of lower performance but greater portability and lower power consumption. In fact, beyond the samples realized above, more instances can be created — for example, more acceleration systems following the S-C topologies. Also, in some cases — when existing equipment is available for conversion and power and cost are no concern — S can be created by virtualization on an existing mainframe, server, or high-performance computer, virtualizing the R, D, and V layers and installing appropriate modifications to convert it into an S device, then installing the driver and L's server side to accelerate C.
The fourth sample is precisely a device designed for system retrofits. The retrofit device has a software portion and a hardware portion. The software portion is divided into a server side and a served side: once the server side is installed on the device to be converted into an S end, it re-layers that device, creating a write-cache layer and a read-cache layer, and exchanges cache commands and data with the served side; once the served side is installed on the C end, it intercepts and redirects C's I/O, changes C's cache structure, calls the S-side cache layers as cache, and exchanges cache data with the server side, thereby forming the cross-device cache system. The hardware portion comprises the two ends of a wireless transmission L, each with a USB interface and a Wi-Fi NIC; one end can connect to S or C via USB (including broad-sense USB interfaces such as MicroUSB), and the two ends communicate with each other.
In the tested environment we had a 2009-assembled desktop computer with USB 3 ports and an HTC One phone. With the retrofit installed on both devices, the desktop — caching via memory virtualized as a disk — communicates with the HTC One over the 802.11ac wireless protocol and speeds it up. The HTC One's storage write speed was originally only 10 MB per second, but its 802.11ac NIC can reach on the order of a hundred megabytes per second.
Brief description of the drawings:
FIG. 1. Schematic of the device. The top of the figure shows a typical short-distance, multi-channel, local-performance-network-based cross-device caching and computing-virtualization system providing equipment acceleration for ordinary enterprises, institutions, homes, and individuals. The bottom shows the new I/O mechanism of a computing device under this system. Detailed designs of the typical S devices A and B appear in FIG. 2 and FIG. 3 respectively.
FIG. 2. The acceleration server device of the wired system design; detailed description in Case 1 of the implementation examples.
FIG. 3. The acceleration server device of the wireless system design; detailed description in Case 3 of the implementation examples.
FIG. 4. Connection diagram of Sample 1's acceleration scheme, based mainly on short-distance CAT6 wired connections and short-distance multi-channel connections forming a cross-device cache and computing system. At the center of the figure is Sample 1's S; around it are the C.
FIG. 5. Acceleration structure diagram of Sample 1, depicting the new cross-device cache structure of the accelerated device C. All of C's I/O is intercepted and reallocated.
FIG. 6. Connection diagram of Sample 2's acceleration scheme, based mainly on short-range wireless connections and short-range multi-channel connections forming a cross-device cache and computing system. At the center of the figure is Sample 2's S; around it are the C — in counterclockwise order: a smart watch, smart glasses, a tablet computer, a smartphone, and a large-screen human-computer interaction device.
FIG. 7. Acceleration structure diagram of Sample 2, depicting the new cross-device cache structure of the accelerated device C under Sample 2. All of C's I/O is intercepted and reallocated. In this case the accelerated devices also include many mobile device types; limited by power consumption and volume, their onboard processing performance and cache media are seriously insufficient, yet for mobility's sake mobile devices often have high wireless data speeds. The virtual environment also lets mobile devices indirectly handle Windows applications and obtain the corresponding operating experience.
FIG. 8. Read-operation flow diagram of Sample 2.
FIG. 9. Write-operation flow diagram of Sample 2.
FIG. 10. Connection diagram of Sample 3's acceleration scheme — a simplified scheme, based mainly on short-range wireless connections and short-range multi-channel connections forming a cross-device cache and computing system. At the center of the figure is Sample 3's S; around it are the C — in counterclockwise order: a smart watch, smart glasses, a tablet computer, a smartphone, and a desktop computer.
FIG. 11. Acceleration structure diagram of Sample 3, depicting the new cross-device cache structure of the accelerated device C under Sample 3. Since Sample 3's S device is a simplified device, the cache hierarchy is adjusted accordingly; for details see the description of Case 3 in the implementation examples.
FIG. 12. Sample 3 adopts a mechanism for collecting, analyzing, and feeding back cache-configuration optimization data across multiple S-C systems.

Claims (20)

  1. An external computing-device acceleration method based on a server side and an external cache system, comprising at least one server side (hereafter S) connected through a data transmission device (hereafter L, which may be of wired type, wireless type, or a combination of both) to the accelerated computing device ("computing device" here not meaning computers alone but broadly including computers of all kinds, tablets, phones, robots, and smart hardware; hereafter C), the working principle of the server side at least comprising: S provides C with a cache device of relatively high speed or high I/O performance (such as a Ramdisk, resistive memory, or a RAID of high-speed SSDs), used to cache for C the frequently used files of the system and applications, scattered small files that are frequently read and written, network files, and the like, serving as a high-speed cache that diverts part of C's accesses to its own relatively low-speed or low-I/O-performance devices, thereby accelerating C or raising its I/O performance.
  2. The method of claim 1, characterized in that besides building the cache system, S also supports C's computing performance in any one or more of three ways: first, S shares computing tasks; second, by virtualizing architecture and hardware resources within S, C runs this virtual layer on S through remote operation, with the interface displayed and user interaction realized on C; one S can create virtual architectures for multiple C, deducting S's resources according to C's real-time usage, and the partitioned virtual architectures in S may be virtual machines or virtualized application layers; third, S virtualizes applications, pre-storing the program files C needs and the system environment the programs require on the S device, so that C can run the applications directly on S.
  3. The method of claim 1, characterized in that S accelerates C through three specialized acceleration layers, comprising a write-cache layer (hereafter the R Layer, e.g. a Ramdisk: S emulates memory as a disk and creates caches for C in it, for greater cache speed), a random-read cache layer (hereafter the D Layer, e.g. resistive memory, or a NAND disk array combined in RAID), and a virtualization layer (hereafter the V Layer, supporting C's computing performance through application or hardware virtualization); of course, considering power consumption and cost in a given environment, these three layers may be strengthened or merged at the physical-device level (for example, an enterprise environment may need several kinds of V layer to suit different applications, while for simple home use the R and D layers can be physically merged into a single RD layer).
  4. The method of claim 1, characterized in that S and C further form any two or more of the following topological relationships: first, S and C may be in one-to-one, one-to-many, or many-to-many relationships — a home environment may be one-to-one or one-to-many, while enterprises, schools, and administrative units may be one-to-many or many-to-many (hereafter the elastic-system relationship); second, S and C are only relative roles: the C of one pairing may be the S of another — an accelerated computer can simultaneously accelerate another computer — and correspondingly the S of one pairing may be the C of another, as when a computing device of higher performance and cache speed accelerates an S of the next level down (hereafter the performance-flow relationship); third, multiple C may simultaneously act as S for one another, forming a P2P acceleration network for performance gains (for example, computer C1 has large memory, offering stronger small-file caching and write caching but a relatively slow disk, while computer C2 has several large flash SSDs combined in RAID for greater bandwidth and stronger read caching; C2 and C1 then act as S for each other, with write caching going mostly to C1 and read caching mostly to C2 — hereafter the performance-complementarity relationship).
  5. The method of claim 1, characterized by the further pre-storage design: through long-term monitoring and recognition of the C user's habits, the data the C system is about to use is determined and pre-stored in S; C fetches the data directly from the device and moves it into memory or the processor, reducing reads and writes of C's hard disk.
  6. The method of claim 1, characterized in that encryption is further provided for all files cached on S.
  7. The method of claim 1, characterized in that multiple S are networked by optical fiber or other high-speed connections to strengthen read-ahead analysis and obtain higher cache performance.
  8. The method of claim 1, characterized in that S combines its disks in RAID to obtain greater cache speed, especially random cache speed.
  9. The method of claim 1, characterized in that S emulates memory as a disk and creates caches for C in it, to obtain greater cache speed and write-cache speed.
  10. The method of claim 1, characterized in that when the network and program files accessed by multiple C ends are highly similar, a shared cache area is further allocated on the S end for such common caching.
  11. The method of claim 1, characterized in that cache structures are optimized or predicted from the statistics that one or more S accumulate over the various applications and related files cached for the many C ends they serve (Example 1: if a certain folder of a certain game program shows frequent reads on many C ends, then when S serves a new C end and finds that program, it can act predictively, caching the folder frequently read and written on other devices to the high-speed device, without re-accumulating cache data; Example 2: if a certain program, say a shopping browser, shows frequent writes on many C ends, then on launching that browser it can predictively be assigned a larger S-side write-cache layer).
  12. The method of claim 1, characterized in that L is of wired type: L may be optical fiber, a Thunderbolt cable, an extended USB 3.0 or faster cable, a Cat 5e or better network cable, or any other type of wired transmission device faster than 60 MB per second, S connecting directly to C through these cables; L may also be a wired type comprising a USB-Ethernet adapter (one end of such an adapter has a USB interface and the other an Ethernet interface, adding an Ethernet port to a computer through an existing USB input port) and other necessary accessories, the adapter's USB end connecting to S or C, with high-speed network cable carrying the transmission between the two.
  13. The method of claim 1, characterized in that L is of wireless type: L may be a WiGig-capable wireless network adapter, or a dock, external box, or protective case that transmits wirelessly to the server side S and connects to the accelerated computing device C via a USB interface, a Thunderbolt interface, or another interface, or any other type of device containing a wireless transmission part.
  14. A device for implementing the method of claim 1, characterized in that L is of multi-channel type, i.e. the connection between S and C runs through more than one L, or L contains more than one interface to C, thereby increasing transmission bandwidth (Example 1: C connects to S through its own Wi-Fi and simultaneously through a USB Wi-Fi adapter, merging the bandwidths after connection; Example 2: several of C's USB ports each use a USB Wi-Fi adapter to connect to S, raising the Wi-Fi bandwidth ceiling to the USB bandwidth ceiling; Example 3: a WiGig dock L communicates with S over WiGig while connecting to C over Bluetooth plus MicroUSB).
  15. A device for implementing the method of claim 1, characterized in that the S cache server further adopts any one or more of the following concrete three-layer acceleration designs: first, part of the memory is virtualized into a RAM disk as the R-layer write cache, portions of the R-layer cache being allocated to the individual C devices, with an image file of each RAM-disk cache generated, loaded at device boot and saved at shutdown; second, a read-cache area is created on an S-side SSD or on a RAID 0 of several SSDs; third, S is partitioned into multiple virtual private servers with dynamically allocated resources, assigned to different ports for C-end users to connect to.
  16. A device for implementing the method of claim 1, characterized in that S connects to C by wireless network, a virtualized Windows environment (including the libraries and registry entries programs need) is created on S, and applications are virtualized so that more program files and program-environment files are pre-stored on the device, more thoroughly avoiding hard-disk reads and writes during program use.
  17. A device for implementing the method of claim 1, characterized in that S further enlists part of the C device's memory together with the S device's cache to form a composite cross-device cache.
  18. A device for implementing the method of claim 1, characterized in that S connects to C by wireless network and provides C with a cache layer from a high-speed storage device containing multi-channel flash.
  19. A device for implementing the method of claim 1, characterized in that S connects to C by wireless network and provides C with a cache layer from a high-speed storage device carrying a DRAM cache.
  20. A device for implementing the method of claim 1, characterized in that the device's software portion is divided into a server side and a served side: once the server side is installed on the device to be converted into an S end, it re-layers that device, creating read/write cache layers, and exchanges cache commands and data with the served side; and once the served side is installed on the C end, it intercepts and redirects C's I/O, changes C's cache structure, and calls the S-side cache layers as cache (the device of this claim is designed mainly for retrofitting existing computing systems and, depending on the computing system to be retrofitted, may comprise both a software portion and a hardware portion, or a software portion only).
PCT/CN2015/096743 2014-10-13 2015-12-08 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备 WO2016058560A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410535038.9 2014-10-13
CN201410535038.9A CN104298474A (zh) 2014-10-13 2014-10-13 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备

Publications (1)

Publication Number Publication Date
WO2016058560A1 true WO2016058560A1 (zh) 2016-04-21

Family

ID=52318221

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/096743 WO2016058560A1 (zh) 2014-10-13 2015-12-08 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备

Country Status (2)

Country Link
CN (1) CN104298474A (zh)
WO (1) WO2016058560A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111399886A (zh) * 2020-04-13 2020-07-10 上海依图网络科技有限公司 用于设备快速升级的方法及***
CN113625937A (zh) * 2020-05-09 2021-11-09 鸿富锦精密电子(天津)有限公司 存储资源处理装置及方法
US11893391B2 (en) 2019-04-29 2024-02-06 Alibaba Group Holding Limited Processing computing jobs via an acceleration device

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298474A (zh) * 2014-10-13 2015-01-21 张维加 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备
CN106598473B (zh) * 2015-10-15 2020-09-04 南京中兴新软件有限责任公司 消息持久化方法及装置
CN106951194A (zh) * 2017-03-30 2017-07-14 张维加 一种新结构的计算机设备
CN118295945A (zh) * 2024-06-05 2024-07-05 济南浪潮数据技术有限公司 带宽分配方法、主板、设备、存储介质及程序产品

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103002445A (zh) * 2012-11-08 2013-03-27 张维加 一种安全的提供应用服务的移动电子设备
CN103365603A (zh) * 2012-03-27 2013-10-23 株式会社日立制作所 存储***的存储器管理的方法和装置
CN103488515A (zh) * 2012-12-05 2014-01-01 张维加 一种usb引导***与程序虚拟机结合的设备
CN103500075A (zh) * 2013-10-11 2014-01-08 张维加 一种基于新材料的外接的计算机加速设备
CN103500076A (zh) * 2013-10-13 2014-01-08 张维加 一种基于多通道slc nand与dram缓存的新usb协议计算机加速设备
US20140143504A1 (en) * 2012-11-19 2014-05-22 Vmware, Inc. Hypervisor i/o staging on external cache devices
CN104298474A (zh) * 2014-10-13 2015-01-21 张维加 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1163829C (zh) * 1999-11-18 2004-08-25 武汉东湖存储技术有限公司 一种以最大带宽工作的硬盘作高速缓存的外存储器加速方法
CN1258056A (zh) * 1999-11-30 2000-06-28 武汉东湖存储技术有限公司 以最大带宽工作的硬盘作高速缓存的串联式外存储器加速卡
CN102323888B (zh) * 2011-08-11 2014-06-04 杭州顺网科技股份有限公司 一种无盘计算机启动加速方法
CN102981783A (zh) * 2012-11-29 2013-03-20 浪潮电子信息产业股份有限公司 一种基于Nand Flash的Cache加速方法

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103365603A (zh) * 2012-03-27 2013-10-23 株式会社日立制作所 存储***的存储器管理的方法和装置
CN103002445A (zh) * 2012-11-08 2013-03-27 张维加 一种安全的提供应用服务的移动电子设备
US20140143504A1 (en) * 2012-11-19 2014-05-22 Vmware, Inc. Hypervisor i/o staging on external cache devices
CN103488515A (zh) * 2012-12-05 2014-01-01 张维加 一种usb引导***与程序虚拟机结合的设备
CN103500075A (zh) * 2013-10-11 2014-01-08 张维加 一种基于新材料的外接的计算机加速设备
CN103500076A (zh) * 2013-10-13 2014-01-08 张维加 一种基于多通道slc nand与dram缓存的新usb协议计算机加速设备
CN104298474A (zh) * 2014-10-13 2015-01-21 张维加 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11893391B2 (en) 2019-04-29 2024-02-06 Alibaba Group Holding Limited Processing computing jobs via an acceleration device
CN111399886A (zh) * 2020-04-13 2020-07-10 上海依图网络科技有限公司 用于设备快速升级的方法及***
CN113625937A (zh) * 2020-05-09 2021-11-09 鸿富锦精密电子(天津)有限公司 存储资源处理装置及方法
CN113625937B (zh) * 2020-05-09 2024-05-28 富联精密电子(天津)有限公司 存储资源处理装置及方法

Also Published As

Publication number Publication date
CN104298474A (zh) 2015-01-21

Similar Documents

Publication Publication Date Title
EP3920034B1 (en) Systems and methods for scalable and coherent memory devices
WO2016058560A1 (zh) 一种基于服务端与外部缓存***的外接式计算设备加速方法与实现该方法的设备
CN109891399B (zh) 在相同的物理串行总线集线器上产生多个虚拟串行总线集线器实例的装置和方法
US20160253093A1 (en) A new USB protocol based computer acceleration device using multi I/O channel SLC NAND and DRAM cache
JP2005222123A5 (zh)
US11010084B2 (en) Virtual machine migration system
CN107562645B (zh) 一种内存页管理方法及计算设备
US9417819B2 (en) Cache device for hard disk drives and methods of operations
US10572667B2 (en) Coordinating power management between virtual machines
WO2015051694A1 (zh) 一种基于新材料的外接的计算机加速设备
US10235054B1 (en) System and method utilizing a cache free list and first and second page caches managed as a single cache in an exclusive manner
US11157191B2 (en) Intra-device notational data movement system
US20210357339A1 (en) Efficient management of bus bandwidth for multiple drivers
CN103227825A (zh) 桌面一体机架构
US11093175B1 (en) Raid data storage device direct communication system
CN104298620A (zh) 一种耐擦写低能耗的外接计算机加速设备
US20240028209A1 (en) Distributed region tracking for tiered memory systems
US11003378B2 (en) Memory-fabric-based data-mover-enabled memory tiering system
TWI696068B (zh) 用於提供高效功率檔案系統操作至一非揮發性區塊記憶體之系統及方法
Wu et al. I/O stack optimization for efficient and scalable access in FCoE-based SAN storage
KR20210043001A (ko) 하이브리드 메모리 시스템 인터페이스
US12019894B2 (en) Systems and methods for managing coresident data for containers
US10853293B2 (en) Switch-based inter-device notational data movement system
US20190265902A1 (en) Live migration of applications using capi flash
Oikawa Performance Impact of New Interface for Non-volatile Memory Storage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15851078

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15851078

Country of ref document: EP

Kind code of ref document: A1