CN112954001B - Method and device for HTTP-to-HTTPS bidirectional transparent proxy - Google Patents


Publication number
CN112954001B
Authority
CN
China
Prior art keywords
http
https
tcp
proxy
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110062859.5A
Other languages
Chinese (zh)
Other versions
CN112954001A (en
Inventor
王赟
侯贺明
程波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Greenet Information Service Co Ltd
Original Assignee
Wuhan Greenet Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Greenet Information Service Co Ltd filed Critical Wuhan Greenet Information Service Co Ltd
Priority to CN202110062859.5A priority Critical patent/CN112954001B/en
Publication of CN112954001A publication Critical patent/CN112954001A/en
Priority to PCT/CN2021/135682 priority patent/WO2022151867A1/en
Application granted granted Critical
Publication of CN112954001B publication Critical patent/CN112954001B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/09 Mapping addresses
    • H04L61/25 Mapping addresses of the same type
    • H04L61/2503 Translation of Internet protocol [IP] addresses
    • H04L61/2521 Translation architectures other than single NAT servers
    • H04L61/2528 Translation at a proxy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of network communication, and provides a method and a device for an HTTP-to-HTTPS bidirectional transparent proxy. After receiving a first HTTP request sent by a client, a proxy system parses the HTTP header fields to obtain the Host field, compares the Host field with a built-in domain name list, and stores the content of the first HTTP request; the domain name list stores domain names that satisfy HTTPS redirection. If the Host domain name is found in the list, the proxy system initiates a TCP handshake with the server on destination port 443 to establish a first TCP channel, and performs TLS negotiation over the first TCP channel. The invention provides a bidirectional transparent proxy that is simpler to deploy and less error-prone.

Description

Method and device for HTTP-to-HTTPS bidirectional transparent proxy
[ Technical field ]
The invention relates to the technical field of network communication, in particular to a method and a device for HTTP-to-HTTPS bidirectional transparent proxy.
[ Background of the invention ]
With the increasing complexity of network environments, various proxy technologies have emerged. By the protocol layer being proxied, proxies can be divided into HTTP proxies, SSL proxies, FTP proxies, mail proxies, TCP proxies, and so on; by the position of the proxy server, into forward proxies and reverse proxies; and by whether the proxy can be perceived, into transparent and non-transparent proxies. Transparent proxies can further be divided into client-side transparent and server-side transparent. Client-side transparency means that the proxied user does not need any configuration: the user's traffic is proxied passively and the user is unaware of the whole process. Server-side transparency means that the destination server does not perceive that the traffic has passed through a proxy server; the source IP address seen by the server is that of the user, not that of the intermediate proxy server. A bidirectional transparent proxy means that neither the end user nor the server can perceive the proxy server. The end user needs no proxy-related configuration, and the destination IP address it accesses is the real IP address of the server rather than that of the proxy server; the source IP address seen by the server is likewise the real IP address of the end user rather than that of the proxy server. The IP address of the intermediate proxy server never appears during the whole transmission, so although the traffic is processed by the proxy server, neither side can perceive that it has been proxied. This is what bidirectional transparent proxying means.
To implement a transparent proxy, the traffic between the user and the server must first pass through the proxy; that is, the proxy is deployed between the end user and the server and has access to their communication traffic. Second, message forwarding needs special handling. By default, a message is delivered only to the node that owns its destination IP address. The proxy server is not the server the user wants to access, so even though the message passes through the proxy server, it is not handed up to the application layer, and the application layer has no corresponding socket to process it. A proxy therefore has to control message routing so that messages destined for the server are passed up to the application layer for processing instead of being forwarded. The other key technique is bidirectional transparency: the IP address of the proxy server must be hidden throughout the communication, so that only the IP addresses of the end user and the server appear. What fundamentally distinguishes transparent proxy servers from one another is which technique they use for message routing and which technique they use to achieve IP address transparency.
Traditional transparent proxy technology follows two main routes. The first achieves transparency by modifying the IP addresses of messages. Because the destination IP address is rewritten, the routing system of the proxy server delivers the message locally instead of forwarding it, and the message is then processed by the TCP/IP protocol stack and reaches the application layer, where it is proxied. Specifically, for a message arriving at the proxy server, the destination IP and destination port are rewritten to the IP address and socket port of the proxy server; for a message leaving the proxy server, the source IP and source port are rewritten from those of the proxy server to those of the client or the server, depending on the connection. The second route performs no IP address translation and instead uses the Tproxy feature of the Linux kernel. Tproxy is short for transparent proxy; it enables bidirectional transparent proxying without any change to IP addresses.
At present, a transparent proxy system implemented with Tproxy generally uses a virtual bridge device to connect two or more physical network cards: one network card is the uplink port, carrying client traffic, and the other is the downlink port, carrying server traffic. The transparent proxy system has to deliver the messages generated by the application-layer proxy program to the network, and with a Linux bridge device this works correctly only if the bridge device is configured with an IP address, a correct gateway address, and a correct routing table, because the bridge must take part in IP-based routing. In complex network environments, configuring the bridge IP address and the related routing tables is tedious and error-prone.
In view of the above, overcoming the drawbacks of the prior art is an urgent problem in the art.
[ Summary of the invention ]
The technical problem to be solved by the invention is the following. An existing transparent proxy system has to deliver the messages generated by the application-layer proxy program to the network, and with a Linux bridge device this works correctly only if the bridge device is configured with an IP address, a correct gateway address, and a correct routing table, because the bridge must take part in IP-based routing. In complex network environments, configuring the bridge IP address and the related routing tables is tedious and error-prone.
The invention adopts the following technical solution:
In a first aspect, the present invention provides a method for HTTP-to-HTTPS bidirectional transparent proxying, including:
after receiving a first HTTP request sent by a client, the proxy system parses the HTTP header fields to obtain the Host field, compares the Host field with a built-in domain name list, and stores the content of the first HTTP request; the domain name list stores domain names that satisfy HTTPS redirection;
if the Host domain name is found in the list, the proxy system initiates a TCP handshake with the server on destination port 443 to establish a first TCP channel, and performs TLS negotiation over the first TCP channel; wherein the IP address of the client is used in the TCP handshake;
the proxy system sends a first HTTPS request to the server through a first TLS channel established by the TLS negotiation; the first HTTPS request carries the content of the first HTTP request in encrypted form;
and the proxy system receives a first HTTPS response returned by the server, removes the Secure attribute from the cookie field in the HTTP header of the first HTTPS response, removes the Strict-Transport-Security field used by HSTS, decrypts the content of the first HTTPS response into plaintext, and then passes the plaintext to the client.
Preferably, if the Host domain name is not found in the list, the method further comprises:
the proxy system initiates a TCP handshake with a target port of 80 to the server so as to establish a second TCP channel;
the proxy system sends a second HTTP request to the server through the second TCP channel, wherein the second HTTP request carries the content of the first HTTP request;
the proxy system receives a second HTTP response returned by the server and checks whether the second HTTP response is an HTTPS redirection;
if the redirection is HTTPS redirection, the proxy system adds the Host into the domain name list, discards the second HTTP response, simultaneously sends a TCP Reset message to the server, and closes the first TCP channel.
Preferably, the method further comprises:
the proxy system initiates a TCP handshake with the server on destination port 443 to establish a second TCP channel, and performs TLS negotiation over the second TCP channel; wherein the IP address of the client is used in the TCP handshake;
the proxy system sends a second HTTPS request to the server through a second TLS channel established by the TLS negotiation; the second HTTPS request carries the content of the second HTTP request in encrypted form;
and the proxy system receives a second HTTPS response returned by the server, removes the Secure attribute from the cookie field in the HTTP header of the second HTTPS response, removes the Strict-Transport-Security field used by HSTS, decrypts the content of the second HTTPS response into plaintext, and then passes the plaintext to the client.
Preferably, checking whether the second HTTP response is an HTTPS redirection includes:
checking whether the status code of the second HTTP response is between 300 and 399, and checking whether the redirect target address is the HTTPS version of the Host that was requested before the redirect; according to the HTTP protocol specification, response status codes in this range indicate redirection.
Preferably, if the response is not an HTTPS redirection, the method further includes:
the proxy system passes the received second HTTP response to the client.
Preferably, the proxy system includes a data packet transceiver module, a virtual network card module, a data packet routing module, and a proxy program module. Specifically:
the data packet transceiver module receives and sends messages at the bottom layer and consists of a data packet receiving module and a data packet sending module; the data packet receiving module receives data packets from the physical network card and forwards them to the virtual network card;
the virtual network card module is a standard TUN virtual network card operating at the network (IP) layer, and hands received IP data packets to the protocol stack of the operating system for parsing;
and the data packet routing module routes target messages to the proxy program module for processing, and comprises the iptables rule configuration and the policy routing configuration.
Preferably, the proxy program module runs in the application layer of the operating system and supports the Tproxy mechanism. Specifically:
the IP_TRANSPARENT option is set on the socket, so that the socket accepts connections addressed to any destination IP address and can generate data messages with any IP address as the source IP; that is, the socket can bind to any IP address.
Preferably, the method further comprises:
adding a first rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 0;
the first rule marks messages whose transport protocol is TCP and whose destination port is 80 with the mark value 1;
adding a second rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --sport 443 -j MARK --set-mark 1;
the second rule marks messages whose transport protocol is TCP and whose source port is 443 with the mark value 1;
adding a third rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --dport 443 -j MARK --set-mark 2;
the third rule marks messages whose transport protocol is TCP and whose destination port is 443 with the mark value 2;
adding a fourth rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --sport 80 -j MARK --set-mark 2;
the fourth rule marks messages whose transport protocol is TCP and whose source port is 80 with the mark value 2.
Preferably, the method further comprises a policy routing configuration, specifically:
configuring messages with the mark value 1 to look up routing table 100:
ip rule add fwmark 1 lookup 100;
creating routing table 100, whose content is the route from the virtual network card of the proxy system to the application layer of the proxy system:
ip route add local 0.0.0.0/0 dev lo table 100;
configuring messages with the mark value 2 to look up routing table 200:
ip rule add fwmark 0x2 table 200;
creating routing table 200, through which messages are sent out via the virtual network card tun1:
ip route add default via 10.0.0.2 dev tun1 table 200.
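The combined effect of the four mangle rules and the two policy routes can be illustrated with a small model. This is only a sketch for reasoning about which packets reach which routing table: it does not configure the kernel, and the function names `packet_mark` and `routing_table` are hypothetical, not part of the described system.

```python
from typing import Optional

# fwmark -> routing table number, as set up by the two "ip rule" commands.
ROUTING_TABLE = {1: 100, 2: 200}


def packet_mark(chain: str, sport: int, dport: int) -> Optional[int]:
    """Return the mark the four rules would assign to a TCP packet, or None."""
    if chain == "PREROUTING":
        # Rule 1: client requests to port 80; rule 2: server replies from port 443.
        if dport == 80 or sport == 443:
            return 1
    elif chain == "OUTPUT":
        # Rule 3: proxy requests to port 443; rule 4: proxy replies from port 80.
        if dport == 443 or sport == 80:
            return 2
    return None


def routing_table(chain: str, sport: int, dport: int) -> Optional[int]:
    """Return the policy routing table consulted for the packet, or None."""
    return ROUTING_TABLE.get(packet_mark(chain, sport, dport))
```

In this model, mark 1 sends inbound packets to table 100 (local delivery to the proxy program), and mark 2 sends packets generated by the proxy program to table 200 (out through tun1). Unmarked packets follow the default routing.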
In a second aspect, the present invention further provides an HTTP-to-HTTPS bidirectional transparent proxy device for implementing the HTTP-to-HTTPS bidirectional transparent proxy method of the first aspect, where the device includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the HTTP-to-HTTPS bidirectional transparent proxy method of the first aspect.
In a third aspect, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions which, when executed by one or more processors, perform the HTTP-to-HTTPS bidirectional transparent proxy method of the first aspect.
The invention provides a bidirectional transparent proxy method and device for converting HTTP to HTTPS. At the data link layer, the program directly controls the receiving and sending of messages and does not change the MAC addresses of the original messages, achieving bidirectional transparency toward the upstream and downstream routing devices. At the network (IP) layer, the virtual network card technique is combined with Tproxy so that the source and destination IP addresses are not changed, achieving bidirectional transparency toward the client and the server. At the application layer, an HTTP-to-HTTPS proxy is implemented: an HTTP connection is maintained between the proxy system and the client and an HTTPS connection between the proxy system and the server, achieving bidirectional transparent proxying at the application layer.
[ Description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a diagram of the overall architecture of an HTTP-to-HTTPS bidirectional transparent proxy system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the message flow in the proxy system according to an embodiment of the present invention;
Fig. 3 is a flowchart of an HTTP-to-HTTPS bidirectional transparent proxy method according to an embodiment of the present invention;
Fig. 4 is a flowchart of an HTTP-to-HTTPS bidirectional transparent proxy method according to an embodiment of the present invention;
Fig. 5 is a flowchart of an HTTP-to-HTTPS bidirectional transparent proxy method according to an embodiment of the present invention;
Fig. 6 is a flowchart of the processing performed by the data packet receiving module according to an embodiment of the present invention;
Fig. 7 is a flowchart of the processing performed by the data packet sending module according to an embodiment of the present invention;
Fig. 8 is a flowchart of the configuration process according to an embodiment of the present invention;
Fig. 9 is the overall workflow of the proxy system according to an embodiment of the present invention;
Fig. 10 is a signaling diagram of interaction flow A according to an embodiment of the present invention;
Fig. 11 is a signaling diagram of interaction flow B according to an embodiment of the present invention;
Fig. 12 is a signaling diagram of interaction flow C according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of the original HTTP response header fields sent by the server to the client according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of the HTTP response header after modification by the proxy system according to an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of an HTTP-to-HTTPS bidirectional transparent proxy device according to an embodiment of the present invention.
[ Detailed description of the embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
The transparent proxy of the invention neither modifies IP addresses nor uses a bridge device. Instead, it reads data packets directly from the network card at the bottom layer and then proxies them with the help of a virtual network card. Its distinguishing feature is that a program directly controls how data messages are received from and sent to the network card, so message forwarding does not depend on operating system routing. This has two benefits:
first, a large amount of routing configuration work is avoided, such as configuring an IP address for a bridge, configuring the operating system routing table, configuring a default gateway address for the bridge, and configuring the ARP entry of the default gateway in the operating system;
second, the proxy system does not need to obtain address information about its upstream and downstream routing nodes. It does not need to know in advance the IP address or MAC address of the next hop of a data message, because the MAC addresses of the original message are preserved inside the proxy system and are not modified when the message is received and sent. The proxy system is therefore transparent to the upstream and downstream routing devices and can easily be inserted into a complex network environment.
A traditional proxy for HTTPS generally only relays data at the TCP layer, simply forwarding data between the HTTPS client and the HTTPS server. The reason is that HTTPS itself prevents a man-in-the-middle from modifying data: if a man-in-the-middle proxies HTTPS traffic without a valid certificate, the HTTPS client reports an error. A typical HTTPS client is a browser, which warns the end user that the traffic may have been hijacked. To solve the problem that HTTPS traffic is difficult to decrypt and proxy, the invention provides an HTTP-to-HTTPS proxy, which avoids HTTPS encryption on the client side and thereby avoids the certificate warning that a client browser raises when the certificate of the proxy system is not trusted.
The invention implements a bidirectional transparent proxy system for converting HTTP to HTTPS. At the data link layer, the program directly controls the receiving and sending of messages and does not change the MAC addresses of the original messages, achieving bidirectional transparency toward the upstream and downstream routing devices. At the network (IP) layer, the virtual network card technique is combined with Tproxy so that the source and destination IP addresses are not changed, achieving bidirectional transparency toward the client and the server. At the application layer, an HTTP-to-HTTPS proxy is implemented: an HTTP connection is maintained between the proxy system and the client and an HTTPS connection between the proxy system and the server, achieving bidirectional transparent proxying at the application layer.
The description has two parts. The first part is the underlying implementation of the transparent proxy, including message receiving and sending and IP address transparency; the second part is how the proxy system exchanges messages with the client and the server.
Underlying implementation of the transparent proxy:
There are three roles overall: the client, the proxy system, and the server. The proxy system maintains two connections simultaneously, one with the client and one with the server, and relays data between the two connections.
As shown in fig. 1, the proxy system is divided into four modules: a data packet transceiver module, a virtual network card module, a data packet routing module, and a proxy program module.
The data packet transceiver module is mainly responsible for receiving and sending messages at the bottom layer and consists of a data packet receiving module and a data packet sending module; the data packet receiving module receives data packets from the network card and passes them to the virtual network card device. The virtual network card module is a standard TUN virtual network card operating at the network (IP) layer, and is mainly responsible for feeding received IP data packets into the protocol stack of the operating system for parsing. The data packet routing module is mainly responsible for routing target messages to the proxy program module for processing, and specifically comprises the iptables rule configuration, the policy routing configuration, and so on; this is needed because, by default, when the operating system finds that the destination IP address of a data packet is not one of its own, it does not pass the packet to the application layer but discards or forwards it. The proxy program module is an application program running in the application layer of the operating system. To process messages addressed to any destination IP address, that is, to support the Tproxy mechanism, it sets the IP_TRANSPARENT option on its sockets, so that a socket can accept connections addressed to any destination IP address and can generate data messages with any IP address as the source IP; in other words, the socket can bind to any IP address.
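As a minimal sketch of the socket setup described above, the following Python fragment sets IP_TRANSPARENT on a listening socket and on an outbound socket that is bound to the client's address. It assumes Linux and a process with the CAP_NET_ADMIN capability. The function names, the default listening port of 80 (chosen because the TPROXY rule uses `--on-port 0`, which keeps the original destination port), and the choice of an ephemeral source port are illustrative assumptions rather than details given in this description.

```python
import socket

# Linux option number for IP_TRANSPARENT; some Python builds do not export
# the constant, so fall back to the value from <linux/in.h>.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def make_transparent_listener(port: int = 80) -> socket.socket:
    """Listening socket that can accept connections sent to any destination IP.

    Requires Linux and CAP_NET_ADMIN (typically root).
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
    s.bind(("0.0.0.0", port))
    s.listen(128)
    return s


def connect_as_client(client_ip: str, server_ip: str, server_port: int = 443) -> socket.socket:
    """Connect to the server using the client's IP address as the source IP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
    # Binding to a non-local address is permitted only because of IP_TRANSPARENT.
    s.bind((client_ip, 0))  # port 0: let the kernel pick a source port (assumption)
    s.connect((server_ip, server_port))
    return s
```

On an accepted connection, `getsockname()` returns the original destination address (the real server IP), which is what the proxy program would pass to `connect_as_client` together with the peer address of the client.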
The flow of data packets in the proxy system is shown in fig. 2. The proxy program module works in the application layer and maintains two socket connections at the same time, one with the client and one with the server. The flows numbered 1, 2, 3, 4, 5, and 6 represent the connection between the proxy system and the client; the flows numbered 7, 8, 9, 10, 11, and 12 represent the connection between the proxy system and the server; and the flow numbered 13 represents non-HTTP traffic, which is not delivered to the application layer but is forwarded directly by the network card. The data packet transceiver module handles flows 2, 5, 8, 11, and 13: the data packet receiving module handles flows 2, 11, and 13, and the data packet sending module handles flows 5 and 8. The data packet routing module handles flows 3, 4, 7, and 12.
Message processing flow of the proxy program module:
The HTTP-to-HTTPS proxy of the invention essentially hijacks HTTP redirection. Modern websites increasingly adopt HTTPS to protect transmitted content from modification, but to remain compatible with users whose clients default to HTTP, they generally keep an HTTP service on port 80. When a user accesses the HTTP service on port 80, the site uses the redirection mechanism of the HTTP protocol to send a 301/302 redirect that points the user to the corresponding HTTPS service; on receiving the redirect, the user's HTTP client follows it to the HTTPS service as the HTTP specification prescribes. The HTTP-to-HTTPS proxy function of the invention essentially exploits and hijacks this redirection mechanism.
Message exchange between the proxy program module, the client, and the server:
When a client initiates an HTTP request, the proxy system first probes the target domain name: it sends the HTTP request to the domain and checks whether the HTTP response status code is a redirect status code such as 301/302. If it is, and the redirect target is the HTTPS version of the same website address, the proxy system sends an HTTPS request to the domain and passes the content of the HTTPS response back to the client over the HTTP channel. In this way the proxy system maintains an HTTP connection with the client but an HTTPS connection with the server; the client's 302 redirect is hijacked, and the client is never upgraded to HTTPS. If the probe shows that the target domain name does not satisfy the HTTPS redirection condition, the proxy system uses HTTP channels toward both the client and the server and relays data between them, working as an ordinary HTTP proxy.
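The probing decision can be sketched as a small predicate. The status range 300 to 399 and the requirement that the target be the HTTPS version of the requested Host come from the description; treating "HTTPS version" as an exact, case-insensitive hostname match is an assumption of this sketch, and the function name `is_https_redirect` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit


def is_https_redirect(status: int, location: Optional[str], host: str) -> bool:
    """Return True if the response redirects to the HTTPS version of `host`.

    status   -- HTTP status code of the probe response
    location -- value of the Location header, or None if absent
    host     -- Host header of the original request (may include ":port")
    """
    if not 300 <= status <= 399 or not location:
        return False
    target = urlsplit(location)
    if target.scheme.lower() != "https" or target.hostname is None:
        return False
    # Assumption: compare hostnames only; IPv6 literals are not handled here.
    requested = host.split(":")[0].lower()
    return target.hostname.lower() == requested
```

When the predicate is true, the proxy system would add the Host to the domain name list and repeat the stored request over TLS; otherwise it would pass the HTTP response through unchanged.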
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
Embodiment 1 of the present invention provides a bidirectional transparent proxy method for converting HTTP to HTTPS, as shown in fig. 3, including:
In step 101, after receiving a first HTTP request sent by a client, the proxy system parses the HTTP header fields to obtain the Host field, compares the Host field with a built-in domain name list, and stores the content of the first HTTP request; the domain name list stores the domain names that meet the HTTPS redirection condition.
In step 102, if the Host domain name is found in the list, the proxy system initiates a TCP handshake to the server with target port 443, so as to establish a first TCP channel and perform a TLS negotiation process over the first TCP channel; wherein the IP address of the client is used in the TCP handshake; and wherein the TLS negotiation process comprises: negotiating a cipher suite, transferring certificates, verifying certificates, and computing keys.
In step 103, the proxy system sends a first HTTPS request (referred to simply as an HTTP request in other embodiments of the present invention; it is called an HTTPS request here to make the technical feature more apparent) to the server through a first TLS channel established by the TLS negotiation; the first HTTPS request carries the encrypted content of the first HTTP request.
In step 104, the proxy system receives the first HTTPS response returned by the server, deletes the Secure attribute of the cookie field and the Strict-Transport-Security field (which sets HSTS) from the HTTP header fields of the first HTTPS response, decrypts the content of the first HTTPS response into plaintext, and passes the plaintext to the client.
The embodiment of the invention realizes a bidirectional transparent proxy method for converting HTTP to HTTPS. At the data link layer, a program directly controls the receiving and sending of messages and does not change the MAC addresses of the original message, so the proxy is transparent to both upstream and downstream routing equipment at the data link layer. At the IP layer, the virtual network card technology is combined with the Tproxy technology so that the source and destination IP addresses are not changed, which makes the proxy transparent in both the client direction and the server direction. At the application layer, an HTTP-to-HTTPS proxy is realized: an HTTP connection is maintained between the proxy system and the client, and an HTTPS connection is maintained between the proxy system and the server, so that a bidirectional transparent proxy is realized at the application layer.
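Step 101 above can be sketched as follows. This is a minimal illustration only: the function names `extract_host` and `target_port`, the set `https_domains`, and its sample entry are assumptions introduced here, and the parser ignores IPv6 literals and other edge cases.

```python
from typing import Optional

# Hypothetical contents of the built-in domain name list (assumption).
https_domains = {"example.com"}

def extract_host(raw_request: bytes) -> Optional[str]:
    """Return the lower-cased Host header value without port, or None."""
    head, _, _ = raw_request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii", "replace").lower()
            # Drop an optional ":port" suffix (IPv6 literals not handled).
            return host.rsplit(":", 1)[0] if ":" in host else host
    return None

def target_port(raw_request: bytes) -> int:
    """443 if the Host is already in the domain name list, otherwise 80."""
    host = extract_host(raw_request)
    return 443 if host in https_domains else 80
```

A request whose Host is in the list would lead to step 102 (port 443); any other request would lead to the probing branch on port 80.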
In combination with the embodiment of the present invention, there is also an extended implementation. As the alternative branch to step 102 in embodiment 1, if the Host domain name is found not to be in the list, then as shown in fig. 4 the method further includes:
In step 105, the proxy system initiates a TCP handshake to the server with target port 80 to establish a second TCP channel.
In step 106, the proxy system sends a second HTTP request to the server through the second TCP channel, where the second HTTP request carries the content of the first HTTP request.
In step 107, the proxy system receives a second HTTP response returned by the server and checks whether the second HTTP response is an HTTPS redirect.
Checking whether the second HTTP response is an HTTPS redirect specifically includes:
checking whether the status code of the second HTTP response is between 300 and 399, since according to the HTTP protocol specification a response status code in this interval represents redirection; and checking whether the redirect target address is the HTTPS version of the Host before redirection, since redirection can point to any URL and HTTPS redirection is only one of its applications.
In step 108, if the response is an HTTPS redirect, the proxy system adds the Host to the domain name list, discards the second HTTP response, and sends a TCP Reset packet to the server to close the second TCP channel established in step 105.
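The redirect check of step 107 can be sketched as below. The function name `is_https_redirect` is an assumption, and so is the strict rule that the redirect target's host must equal the original Host exactly; the text only requires the target to be "the HTTPS version" of the Host, so a real implementation might apply a looser comparison.

```python
from urllib.parse import urlsplit

def is_https_redirect(status: int, location: str, host: str) -> bool:
    """True if the response redirects to the HTTPS version of `host`."""
    # Status codes 300-399 represent redirection.
    if not 300 <= status <= 399:
        return False
    target = urlsplit(location)
    # Redirection may point to any URL; only an https:// URL on the
    # same host counts as an HTTPS redirect in this sketch.
    return target.scheme.lower() == "https" and (target.hostname or "") == host.lower()
```

Relative `Location` values such as `/login` have no scheme and are therefore treated as ordinary redirects.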
In combination with the embodiment of the present invention, there is also an extended implementation; as shown in fig. 5, after step 108 is performed, the method preferably further includes:
In step 109, the proxy system initiates a TCP handshake to the server with target port 443, so as to establish a second TCP channel and perform a TLS negotiation process over the second TCP channel; wherein the IP address of the client is used in the TCP handshake, and wherein the TLS negotiation process comprises: negotiating a cipher suite, transferring certificates, verifying certificates, and computing keys.
In step 110, the proxy system sends a second HTTPS request to the server through a second TLS channel established by the TLS negotiation; the second HTTPS request carries the encrypted content of the second HTTP request.
In step 111, the proxy system receives the second HTTPS response returned by the server, deletes the Secure attribute of the cookie field and the Strict-Transport-Security field (which sets HSTS) from the HTTP header fields of the second HTTPS response, decrypts the content of the second HTTPS response into plaintext, and passes the plaintext to the client.
For completeness of the technical solution, the other outcome of the check in step 108 is also covered: if the second HTTP response is found not to be an HTTPS redirect, the method further comprises: the proxy system passes the received second HTTP response to the client.
In the embodiment of the present invention, a preferred proxy system framework is further provided, as shown in fig. 2, including a data packet transceiver module, a virtual network card module, a data packet routing module, and a proxy program module. Specifically:
The data packet transceiver module is used for receiving and sending messages at the bottom layer and consists of a data packet receiving module and a data packet transmitting module; the data packet receiving module is responsible for receiving data packets from the physical network card and forwarding them to the virtual network card;
The virtual network card module is a standard TUN virtual network card working at the network (IP) layer, and is used so that received IP data packets are parsed by the protocol stack of the operating system;
The data packet routing module is used for routing target messages to the proxy program module for processing, and comprises the iptables rule configuration and the policy routing configuration. By default, when the operating system finds that the destination IP address of a data packet is not its own IP address, it does not pass the packet to the application layer but discards or forwards it.
The proxy program module works at the application layer of the operating system and supports the Tproxy mechanism, specifically:
the IP_TRANSPARENT option is set on the socket, so that the socket accepts connections addressed to any destination IP address and can also generate data messages using any IP address as the source IP, that is, the socket can bind to any IP address.
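A minimal sketch of the socket setup described above, assuming a Linux host and a process with the CAP_NET_ADMIN capability. The function names `transparent_listener` and `connect_as` are assumptions introduced for illustration; the numeric fallback 19 is the Linux value of `IP_TRANSPARENT`, used because not every Python build exports the constant.

```python
import socket

# Linux value of IP_TRANSPARENT; fall back to it when the socket module
# does not export the constant.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)

def transparent_listener(port: int = 80) -> socket.socket:
    """Listening socket that accepts connections addressed to any destination IP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)  # needs CAP_NET_ADMIN
    s.bind(("0.0.0.0", port))
    s.listen()
    return s

def connect_as(client_ip: str, server_ip: str, server_port: int) -> socket.socket:
    """Connect to the server using the client's IP address as the source IP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)  # allows a non-local bind
    s.bind((client_ip, 0))  # port 0: the kernel picks a random source port
    s.connect((server_ip, server_port))
    return s
```

On an accepted connection, `getsockname()` would return the original destination address (the server's IP), which is what lets the proxy answer the client in the server's name.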
As underlying mechanisms supporting the implementation of the method steps in embodiment 1 of the present invention, the method further includes:
adding a first rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 0
the first rule tags messages whose transport protocol is TCP and whose destination port is 80 with a label value of 1;
adding a second rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --sport 443 -j MARK --set-mark 1
the second rule tags messages whose transport protocol is TCP and whose source port is 443 with a label value of 1;
adding a third rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --dport 443 -j MARK --set-mark 2
the third rule tags messages whose transport protocol is TCP and whose destination port is 443 with a label value of 2;
adding a fourth rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --sport 80 -j MARK --set-mark 2
the fourth rule tags messages whose transport protocol is TCP and whose source port is 80 with a label value of 2.
To match the iptables rules, the corresponding policy routing configuration is also needed, specifically:
messages with a label value of 1 are made to query routing table 100 (the number 100 is used only for convenience in presenting the following commands; in actual use the routing table may be given any other user-defined identifier, so this should not be taken as a special limitation on the protection scope of the present invention):
ip rule add fwmark 1 lookup 100
establishing a routing table numbered 100, whose content is a route from the virtual network card of the proxy system to the application layer of the proxy system:
ip route add local 0.0.0.0/0 dev lo table 100
configuring messages with a label value of 2 to query routing table 200:
ip rule add fwmark 0x2 table 200
establishing a routing table numbered 200, through which messages are sent via the virtual network card tun1 (a virtual network card with another number, such as tun2 or tun5, may equally be used; this is not specifically limited here):
ip route add default via 10.0.0.2 dev tun1 table 200
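The combined effect of the four iptables rules and the two routing tables can be modelled as a small decision function. This is only an illustrative model of the matching logic, not something the kernel runs; the names `fwmark` and `ROUTE_BY_MARK` are assumptions.

```python
from typing import Optional

def fwmark(chain: str, sport: int, dport: int) -> Optional[int]:
    """Label value a TCP packet receives in the given mangle chain, if any."""
    # Rules 1 and 2: inbound traffic that must reach the local proxy program.
    if chain == "PREROUTING" and (dport == 80 or sport == 443):
        return 1
    # Rules 3 and 4: traffic generated by the proxy program itself.
    if chain == "OUTPUT" and (dport == 443 or sport == 80):
        return 2
    return None

# Routing table consulted for each label value (from the ip rule commands).
ROUTE_BY_MARK = {
    1: "table 100: local delivery via lo",
    2: "table 200: default route via the TUN device",
}
```

For example, a client packet to port 80 arriving in PREROUTING gets label 1 and is delivered locally, while the proxy's own packet to port 443 leaving through OUTPUT gets label 2 and is sent out through the virtual network card.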
Next, the execution process of the main functional modules of the proxy system of the present invention is described one by one through several embodiments. The "first" and "second" designations used in embodiment 1 are not carried over into these embodiments. It should be noted that the prefixes "first", "second", and "third" used in the embodiments of the present invention serve only to distinguish objects more clearly when they are referenced and have no special limiting meaning; in the following embodiments the relevant objects can be identified from context even without these prefixes, so they are omitted here and below.
Example 2:
The workflow of the data packet receiving module is shown in fig. 6:
In step S121, a message is received from the physical network card. The network cards in this step include an uplink network card and a downlink network card. The program does not rely on the operating system to parse and process the message but reads it directly from the network card, so the message is a complete frame that includes the Ethernet header, the IP header, and so on.
In step S122, the message is parsed to obtain its source MAC address, source IP address, source port, destination MAC address, destination IP address, and destination port; at the same time, the correspondence between IP addresses and MAC addresses is recorded in the IP-MAC correspondence table.
In step S123, it is checked whether the message is a target message. There are two conditions: whether the destination port is TCP port 80, and whether the source port is TCP port 443. The two conditions are in an OR relationship, that is, a message that satisfies either one is a target message.
In step S124, the Ethernet header of a message that was hit in step S123 is stripped off, and only the IP header and the data above it are retained.
In step S125, the IP data message produced in step S124 is sent to the virtual network card device.
In step S126, a message that was not hit in step S123 is passed directly to the physical network card of the opposite end. Here, the physical network card of the opposite end means that a message received from the uplink network card is sent out through the downlink network card, and conversely a message received from the downlink network card is sent out through the uplink network card.
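Steps S122 to S126 can be sketched on raw bytes as follows. The sketch assumes untagged Ethernet II frames carrying IPv4; the names `ip_mac` and `handle_frame` are assumptions, and reading from and writing to the actual network cards is left out.

```python
import struct
from typing import Dict, Optional

# IP-MAC correspondence table: 4-byte IPv4 address -> 6-byte MAC address.
ip_mac: Dict[bytes, bytes] = {}

def handle_frame(frame: bytes) -> Optional[bytes]:
    """Return the IP packet for the TUN device, or None to bridge the frame as-is."""
    if len(frame) < 14 + 20:
        return None
    dst_mac, src_mac = frame[0:6], frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:  # not IPv4 (VLAN tags are not handled here)
        return None
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4          # IP header length in bytes
    proto = ip[9]
    src_ip, dst_ip = ip[12:16], ip[16:20]
    ip_mac[src_ip] = src_mac          # S122: record the correspondence
    ip_mac[dst_ip] = dst_mac
    if proto != 6 or len(ip) < ihl + 4:  # only TCP can be a target message
        return None
    sport, dport = struct.unpack("!HH", ip[ihl:ihl + 4])
    if dport == 80 or sport == 443:   # S123: either condition is enough
        return ip                     # S124/S125: Ethernet header stripped
    return None                       # S126: forward to the opposite card
```

A `None` result stands for the S126 path, in which the unmodified frame would be written to the opposite physical network card.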
Example 3:
The workflow of the data packet transmitting module is shown in fig. 7:
In step S131, a message is read from the virtual network card device. The virtual network card is a TUN-type network card, so the received message is an IP data message with an IP header but no Ethernet header.
In step S132, an Ethernet header, including the source MAC address and the destination MAC address, is added to the IP data message. The addresses are filled in by querying the IP-MAC correspondence table generated by the data packet receiving module: the source MAC is set to the MAC address corresponding to the source IP, and the destination MAC is set to the MAC address corresponding to the destination IP.
In step S133, it is checked whether the destination port of the message is 443; if it is, the judgment condition is satisfied.
In step S134, the Ethernet data message is sent to the physical network card of the downlink port.
In step S135, the Ethernet data message, that is, the message to which the Ethernet header has been added, is sent to the physical network card of the uplink port.
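Steps S132 and S133 can be sketched as below, again assuming IPv4 over untagged Ethernet II. The function names are assumptions; which physical network card each outcome of the port check leads to follows fig. 7 and is deliberately not encoded here.

```python
import struct
from typing import Dict

def add_ethernet_header(ip_packet: bytes, ip_mac: Dict[bytes, bytes]) -> bytes:
    """S132: prepend an Ethernet header looked up in the IP-MAC table."""
    src_ip, dst_ip = ip_packet[12:16], ip_packet[16:20]
    # A KeyError here means the receiving module never saw this address.
    src_mac, dst_mac = ip_mac[src_ip], ip_mac[dst_ip]
    return dst_mac + src_mac + struct.pack("!H", 0x0800) + ip_packet

def destination_port(ip_packet: bytes) -> int:
    """S133 input: TCP destination port of an IPv4 packet."""
    ihl = (ip_packet[0] & 0x0F) * 4
    return struct.unpack("!H", ip_packet[ihl + 2:ihl + 4])[0]
```

The frame returned by `add_ethernet_header` carries the original endpoints' MAC addresses, which is what keeps the proxy invisible at the data link layer.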
Example 4:
The configuration process involves several aspects, namely route forwarding configuration, virtual network card configuration, iptables rule configuration, and policy routing configuration, as shown in fig. 8.
S141 is the route forwarding configuration, which turns on the route forwarding function of the Linux system. By default, when the operating system finds that the destination IP address of a data message is not its own IP address, it discards the message; once route forwarding is enabled, the operating system forwards such messages instead of discarding them, which is equivalent to the operating system routing the data message. Enabling route forwarding is simple: it can be configured by command or by writing a configuration file, although the names and locations of network configuration files may differ between Linux distributions. The command-line configuration is as follows:
sysctl -w net.ipv4.ip_forward=1
S142 is the virtual network card configuration, which creates a virtual network card for receiving IP data messages, so that the IP data messages enter the local protocol stack for processing.
A typical configuration method is as follows:
first, a TUN-type virtual network card device named tun5 is created:
ip tuntap add mode tun tun5
activating the virtual network card:
ip link set tun5 up
adding an IP address to the virtual network card:
ip addr add 10.0.0.2/24 dev tun5
the IP address of the virtual network card does not affect the function of the proxy system as a whole and can be set arbitrarily.
S143 is the iptables rule configuration, whose main purpose is to label the target messages. The target messages fall into two groups: the first is the traffic represented by lines 3 and 12 in fig. 2, and the second is the traffic represented by lines 4 and 7 in fig. 2. The traffic represented by lines 3 and 12 is labeled so that messages whose destination address is not the local IP address can be routed to the local application layer for processing rather than forwarded. The traffic represented by lines 4 and 7 is labeled because it would otherwise query the default routing table and be routed to the physical network card, whereas the invention needs it routed to the virtual network card.
The first type of rule labels the traffic represented by lines 3 and 12 in fig. 2. A typical configuration adds the following rules with the iptables tool:
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 0
This rule tags messages whose transport protocol is TCP and whose destination port is 80 with a label value of 1 and delivers them to the application-layer program on port 80 for processing; this traffic matches the traffic represented by line 3 in fig. 2.
iptables -t mangle -A PREROUTING -p tcp --sport 443 -j MARK --set-mark 1
This rule tags messages whose transport protocol is TCP and whose source port is 443 with a label value of 1; this traffic matches the traffic represented by line 12 in fig. 2.
The second type of rule labels the traffic represented by lines 4 and 7 in fig. 2:
iptables -t mangle -A OUTPUT -p tcp --dport 443 -j MARK --set-mark 2
iptables -t mangle -A OUTPUT -p tcp --sport 80 -j MARK --set-mark 2
S144 is the policy routing configuration, which works together with the iptables rules. First, it allows messages from the virtual network card to be delivered to the application-layer program for processing; the traffic represented by lines 3 and 12 in fig. 2 matches this policy route. Second, it allows messages generated by the application program to be delivered to the virtual network card instead of the physical network card; the traffic represented by lines 4 and 7 in fig. 2 matches this policy route.
The specific configuration method is as follows. Messages with a label value of 1 are made to query routing table 100:
ip rule add fwmark 1 lookup 100
establishing a routing table numbered 100, whose content is a route to the application layer:
ip route add local 0.0.0.0/0 dev lo table 100
configuring messages with a label value of 2 to query routing table 200:
ip rule add fwmark 0x2 table 200
establishing a routing table numbered 200, through which messages are sent via the virtual network card tun5:
ip route add default via 10.0.0.2 dev tun5 table 200
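The configuration steps S141 to S144 can be collected into one ordered command list, for instance to drive them from a setup script. The helper `build_config` and its parameters are assumptions introduced here; the commands themselves are the ones given in this embodiment, and nothing is executed by the sketch.

```python
from typing import List

def build_config(tun: str = "tun5", tun_addr: str = "10.0.0.2/24") -> List[List[str]]:
    """Return the S141-S144 commands, in order, as argument lists."""
    gw = tun_addr.split("/")[0]
    mangle = ["iptables", "-t", "mangle", "-A"]
    return [
        # S141: route forwarding
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        # S142: virtual network card
        ["ip", "tuntap", "add", "mode", "tun", tun],
        ["ip", "link", "set", tun, "up"],
        ["ip", "addr", "add", tun_addr, "dev", tun],
        # S143: iptables rules
        mangle + ["PREROUTING", "-p", "tcp", "--dport", "80", "-j", "TPROXY",
                  "--tproxy-mark", "0x1/0x1", "--on-port", "0"],
        mangle + ["PREROUTING", "-p", "tcp", "--sport", "443", "-j", "MARK", "--set-mark", "1"],
        mangle + ["OUTPUT", "-p", "tcp", "--dport", "443", "-j", "MARK", "--set-mark", "2"],
        mangle + ["OUTPUT", "-p", "tcp", "--sport", "80", "-j", "MARK", "--set-mark", "2"],
        # S144: policy routing
        ["ip", "rule", "add", "fwmark", "1", "lookup", "100"],
        ["ip", "route", "add", "local", "0.0.0.0/0", "dev", "lo", "table", "100"],
        ["ip", "rule", "add", "fwmark", "0x2", "table", "200"],
        ["ip", "route", "add", "default", "via", gw, "dev", tun, "table", "200"],
    ]
```

Each entry could be passed to `subprocess.run(cmd, check=True)` by a process with root privileges; that step is omitted because it changes system state.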
Example 5:
Fig. 9 shows the message handling process of the proxy program module, which maintains HTTP or HTTPS connections with the client and the server at the same time and relays data between the client and the server.
Roles:
C: Client, representing the client side; a typical HTTP client is a browser;
P: Proxy, representing the proxy system;
S: Server, representing the server side, typically an HTTP server such as the various websites being accessed;
Interaction direction:
->: represents a message sent in one direction;
-: represents the two parties sending and receiving messages to each other;
Term definitions:
TCP: Transmission Control Protocol;
HTTP: HyperText Transfer Protocol;
TLS: Transport Layer Security;
HSTS: HTTP Strict Transport Security;
TCP handshake: the three-way handshake that the two sides must perform to establish a connection in the TCP protocol;
TLS handshake: the negotiation process by which a client and a server establish a TLS connection in the TLS protocol; it involves multiple message exchanges;
HTTP GET: a resource request initiated by a client in the HTTP protocol;
HTTP Response: the response content with which the server replies to the client in the HTTP protocol;
Host: an HTTP header field in the HTTP request message, whose value is the target domain name being accessed;
As shown in fig. 9, the message interaction process between the proxy system and the client and server is as follows:
In step S201, the client initiates a TCP handshake with the proxy system and establishes a TCP connection, where the target port is port 80. To the client it appears that the TCP connection is being made with the server, but because the traffic has already been routed to the proxy system, the client actually establishes the TCP connection with the proxy system.
In step S202, the client initiates an HTTP request to the proxy system; the request is in plaintext and the destination port is port 80. To the client it appears that the HTTP request is being sent to the server, but because the traffic has already been routed to the proxy system, the client actually sends the request to the proxy system.
In step S203, after receiving the HTTP request sent by the client, the proxy system parses the HTTP header fields to obtain the Host field and compares it with the built-in domain name list. The domain name list contains all domain names that meet the HTTPS redirection condition; if a domain name is in the list, it has already been probed and was found to perform HTTPS redirection. If the Host domain name is found in the list, the process proceeds to step S204; otherwise, it proceeds to step S210.
In step S204, the proxy system initiates a TCP handshake to the server with target port 443; the IP address seen by the server is the IP address of the client, so the server cannot detect the existence of the proxy system.
In step S205, the proxy system and the server perform a TLS negotiation process over the TCP channel established in step S204; several messages are exchanged in the negotiation, during which the two parties negotiate a cipher suite, transfer certificates, verify certificates, compute keys, and so on.
In step S206, the proxy system sends an HTTP request to the server over the TLS channel established in step S205; the request is encrypted, and its content is taken from step S202.
In step S207, the server replies to the proxy system with the HTTP response content over the TLS channel established in step S205; the response is encrypted, not plaintext.
In step S208, the HTTP response is processed. After receiving the HTTP response from the server, the proxy system needs to process the response content before sending it to the client, because an adaptation from HTTPS to HTTP is being performed. Some fields in the HTTP header relate to the HTTPS transmission mode; typically there are two. The first is the Secure attribute of the cookie field, which indicates that the cookie may only be transmitted over an HTTPS channel and not over an HTTP channel; if the attribute is not deleted, the cookie cannot be transmitted over a plaintext channel such as HTTP, which can cause various server authentication problems. The second is HSTS, which forces the client to connect to the server using HTTPS and is set by including a Strict-Transport-Security field in the HTTP response header. The HSTS specification (RFC 6797) requires clients to ignore this field when it is received over an unencrypted connection, but an HSTS field appearing in a plaintext HTTP channel could still arouse the client's suspicion, so this field is deleted.
In step S209, the proxy system returns an HTTP response to the client; the content of the response may have been adapted in step S208 or left unchanged. If this step is reached from step S213, the HTTP response content does not need to be modified.
In step S210, the proxy system initiates a TCP handshake to the server with target port 80.
In step S211, the proxy system sends an HTTP request to the server over the TCP channel established in step S210; the request is in plaintext, and its content is taken from step S202.
In step S212, the server replies to the proxy system with the HTTP response content, which is transmitted in plaintext over the TCP channel established in step S210.
In step S213, it is checked whether the HTTP response is an HTTPS redirect. This step examines the HTTP response content obtained in step S212. First, it checks whether the status code of the HTTP response is between 300 and 399; according to the HTTP protocol specification, a response status code in this interval represents redirection. Second, it checks whether the redirect target address is the HTTPS version of the Host before redirection, because redirection can point to any URL and HTTPS redirection is only one of its applications. If the response is found to be an HTTPS redirect, the process jumps to step S214; otherwise it jumps to step S209.
In step S214, the proxy system disconnects the TCP connection with the server. Having found that the HTTP response is an HTTPS redirect, the proxy system does not forward the response to the client but discards it, sends a TCP Reset message to the server, and closes the TCP channel.
In step S215, the Host is added to the domain name list. The proxy system has now finished probing the target domain name and knows that it meets the HTTPS redirection condition, so the domain name is added to the list; the next time the proxy system handles this domain name it can proxy directly without probing again.
All traffic passing through the proxy system can be divided into three interaction flows according to the situation:
First, interaction flow A: HTTP proxy mode, in which the proxy system uses an HTTP connection with the client and also an HTTP connection with the server; the proxy system first probes the server, finds that it does not meet the HTTPS redirection condition, and then switches to HTTP proxy mode;
Second, interaction flow B: HTTP-to-HTTPS proxy mode, in which the proxy system uses an HTTP connection with the client and an HTTPS connection with the server; the proxy system has probed the domain name before and found that it meets the HTTPS redirection condition, so it performs HTTPS proxying directly;
Third, interaction flow C: HTTP-to-HTTPS proxy mode preceded by the initial probing flow; the proxy system first probes the server, finds that it meets the HTTPS redirection condition, and then switches to HTTPS proxying;
When the proxy system uses interaction flow A, the working steps are S201, S202, S203, S210, S211, S212, S213, and S209, and the message interaction flow is shown in fig. 10. Suppose the IP address used by the client is 1.1.1.1 and its port is 1024; the IP address of the server is 2.2.2.2 and its port is 80; the proxy system's own IP address is 3.3.3.3. The connection between the client and the proxy system uses IP addresses 1.1.1.1 and 2.2.2.2 and ports 1024 and 80; here the proxy system impersonates the server to establish a connection with the client. The connection between the proxy system and the server uses IP addresses 1.1.1.1 and 2.2.2.2 and ports 2000 and 80; here the proxy system impersonates the client to establish a connection with the server. It can be seen that throughout the communication the proxy system's own IP address never appears and only the source port changes; because a source port is randomly selected in any case, the proxy system remains a bidirectional transparent proxy system even though the source port changes. After passing through the proxy system, the MAC address, IP address, HTTP message content, and so on of a message remain unchanged; only the source port field of the TCP protocol changes.
When the proxy system uses interaction flow B, the working steps are S201, S202, S203, S204, S205, S206, S207, S208, and S209, and the message interaction flow is shown in fig. 11. Suppose the IP address used by the client is 1.1.1.1 and its port is 1024; the IP address of the server is 2.2.2.2 and its ports are 80 and 443; the proxy system's own IP address is 3.3.3.3. When the client initiates HTTP access to the server, the proxy system establishes an HTTP connection with the client using IP address 2.2.2.2 and port 80, and at the same time establishes an HTTPS connection with the real server using IP address 1.1.1.1 and a random source port. Throughout the communication there is an HTTP connection between the client and the proxy system and an HTTPS connection between the proxy system and the server; the HTTPS connection is a trusted connection because the proxy system acts as the client and the certificate is the server's own trusted certificate. The proxy system's own IP address never appears and only the source port changes; because a client's source port is randomly selected in any case, the proxy system remains a bidirectional transparent proxy system even though the source port changes. After passing through the proxy system, the MAC address and IP address of a message remain unchanged; the source port field of the TCP protocol changes, the HTTP protocol is converted into the HTTPS protocol, and the HTTP header fields may change.
Fig. 13 shows the original HTTP response header fields sent by the server to the client; note that they contain a Strict-Transport-Security header field and that the Set-Cookie field sets the Secure attribute on the cookie.
Fig. 14 shows the HTTP response header after modification by the proxy system; the proxy system has deleted the cookie's Secure attribute along with the Strict-Transport-Security header field.
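The header adaptation illustrated by figs. 13 and 14 can be sketched as follows, operating on already-decrypted header pairs. The function name `adapt_response_headers` and the list-of-tuples representation are assumptions; a real proxy would apply the same two deletions to whatever header structure it uses.

```python
from typing import List, Tuple

Header = Tuple[str, str]

def adapt_response_headers(headers: List[Header]) -> List[Header]:
    """Drop Strict-Transport-Security and the Secure attribute of cookies."""
    out: List[Header] = []
    for name, value in headers:
        lname = name.lower()
        if lname == "strict-transport-security":
            continue  # delete the HSTS field entirely
        if lname == "set-cookie":
            # Cookie attributes are separated by ';'; remove only "Secure".
            parts = [p.strip() for p in value.split(";")]
            parts = [p for p in parts if p and p.lower() != "secure"]
            value = "; ".join(parts)
        out.append((name, value))
    return out
```

All other header fields, and all other cookie attributes such as `Path` or `HttpOnly`, pass through unchanged.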
When the proxy system uses interaction flow C, the working steps are S201, S202, S203, S210, S211, S212, S213, S214, S215, S204, S205, S206, S207, S208, and S209, and the message interaction flow is shown in fig. 12. Suppose the IP address used by the client is 1.1.1.1 and its port is 1024; the IP address of the server is 2.2.2.2 and its ports are 80 and 443; the proxy system's own IP address is 3.3.3.3. When the client initiates HTTP access to the server, the proxy system establishes an HTTP connection with the client using IP address 2.2.2.2 and port 80; at the same time, it establishes an HTTP connection with the real server using IP address 1.1.1.1 and a random source port, and probes whether the HTTP response is an HTTPS redirect. Once probing is complete and the server is found to meet the HTTPS redirection condition, the proxy system closes the HTTP connection with the server and establishes an HTTPS connection with the server using IP address 1.1.1.1 and a random source port. In the subsequent communication there is an HTTP connection between the client and the proxy system and an HTTPS connection between the proxy system and the server; the HTTPS connection is a trusted connection because the proxy system acts as the client. Throughout the communication the source port changes, which is acceptable because a client's source port is randomly selected in any case. In the connection between the proxy system and the client the server port is 80, and in the connection between the proxy system and the server the server port is 443, but the proxy system's own IP address never appears during the whole communication, so the proxy system is a bidirectional transparent proxy system.
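The choice among the three interaction flows can be summarized in a few lines. The step sequences are those listed above; the function `steps_for` and its boolean parameter, which stands in for the outcome of the probe in step S213, are assumptions made for illustration.

```python
from typing import List, Set

FLOW_A = ["S201", "S202", "S203", "S210", "S211", "S212", "S213", "S209"]
FLOW_B = ["S201", "S202", "S203", "S204", "S205", "S206", "S207", "S208", "S209"]
FLOW_C = ["S201", "S202", "S203", "S210", "S211", "S212", "S213", "S214", "S215",
          "S204", "S205", "S206", "S207", "S208", "S209"]

def steps_for(host: str, https_domains: Set[str], probe_is_https_redirect: bool) -> List[str]:
    """Return the step sequence the proxy system follows for one request."""
    if host in https_domains:          # S203: already probed, proxy directly
        return list(FLOW_B)
    if not probe_is_https_redirect:    # S213: plain HTTP site
        return list(FLOW_A)
    https_domains.add(host)            # S215: remember the probe result
    return list(FLOW_C)
```

A domain therefore goes through flow C at most once; every later request for it takes flow B.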
Example 6:
Fig. 15 is a schematic diagram of the architecture of an HTTP-to-HTTPS bidirectional transparent proxy apparatus according to an embodiment of the present invention. The HTTP-to-HTTPS bidirectional transparent proxy apparatus of the present embodiment includes one or more processors 21 and a memory 22. In fig. 15, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or by other means; fig. 15 takes a bus connection as an example.
The memory 22 is a non-volatile computer-readable storage medium and can be used to store non-volatile software programs and non-volatile computer-executable programs, such as a program implementing the HTTP-to-HTTPS bidirectional transparent proxy method of embodiment 1. The processor 21 performs the HTTP-to-HTTPS bidirectional transparent proxy method by executing the non-volatile software programs and instructions stored in the memory 22.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the HTTP-to-HTTPS bidirectional transparent proxy method of embodiment 1 described above, for example the steps shown in figs. 3 to 9.
It should be noted that the information interaction and execution processes between the modules and units of the apparatus and system are based on the same concept as the method embodiments of the present invention; for details, refer to the description of the method embodiments, which is not repeated here.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods of the embodiments may be implemented by a program instructing the associated hardware, and that the program may be stored on a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for HTTP-to-HTTPS bidirectional transparent proxy, characterized by comprising the following steps:
after receiving a first HTTP request sent by a client, the proxy system parses the HTTP header fields to obtain the Host field, compares the Host field with a built-in domain name list, and stores the content of the first HTTP request; the domain name list is used for storing domain names that meet the HTTPS redirection condition;
if the Host domain name is found in the list, the proxy system initiates a TCP handshake with the server on target port 443 so as to establish a first TCP channel, and performs a TLS negotiation through the first TCP channel; wherein the IP address of the client is used in the TCP handshake;
the proxy system sends a first HTTPS request to the server through a first TLS channel established by the TLS negotiation; the first HTTPS request carries the encrypted content of the first HTTP request;
and the proxy system receives a first HTTPS response returned by the server, deletes the Secure attribute of the cookie field in the HTTP header of the first HTTPS response and the Strict-Transport-Security field used by HSTS, decrypts the content of the first HTTPS response into plaintext, and then forwards the plaintext to the client.
2. The HTTP-to-HTTPS bidirectional transparent proxy method of claim 1, wherein if the Host domain name is found not to be in the list, the method further comprises:
the proxy system initiates a TCP handshake with the server on target port 80 so as to establish a second TCP channel;
the proxy system sends a second HTTP request to the server through the second TCP channel, wherein the second HTTP request carries the content of the first HTTP request;
the proxy system receives a second HTTP response returned by the server and checks whether the second HTTP response is an HTTPS redirection;
if the second HTTP response is an HTTPS redirection, the proxy system adds the Host to the domain name list, discards the second HTTP response, sends a TCP Reset message to the server, and closes the first TCP channel.
3. The HTTP-to-HTTPS bidirectional transparent proxy method of claim 2, further comprising:
the proxy system initiates a TCP handshake with the server on target port 443 so as to establish a second TCP channel, and performs a TLS negotiation through the second TCP channel; wherein the IP address of the client is used in the TCP handshake;
the proxy system sends a second HTTPS request to the server through a second TLS channel established by the TLS negotiation; the second HTTPS request carries the encrypted content of the second HTTP request;
and the proxy system receives a second HTTPS response returned by the server, deletes the Secure attribute of the cookie field in the HTTP header of the second HTTPS response and the Strict-Transport-Security field used by HSTS, decrypts the content of the second HTTPS response into plaintext, and then forwards the plaintext to the client.
4. The HTTP-to-HTTPS bidirectional transparent proxy method according to claim 2, wherein checking whether the second HTTP response is an HTTPS redirection includes:
checking whether the status code of the second HTTP response is between 300 and 399, and checking whether the redirection target address is the HTTPS version of the Host before redirection; wherein, according to the HTTP protocol specification, response status codes in this range represent redirection.
5. The HTTP-to-HTTPS bidirectional transparent proxy method of claim 2, wherein if the second HTTP response is found not to be an HTTPS redirection, the method further comprises:
the proxy system forwards the received second HTTP response to the client.
6. The HTTP-to-HTTPS bidirectional transparent proxy method according to any one of claims 1 to 5, wherein the proxy system comprises a data packet transceiver module, a virtual network card module, a data packet routing module, and an agent program module, specifically:
the data packet transceiver module is used for receiving and transmitting messages at the bottom layer and consists of a data packet receiving module and a data packet transmitting module; the data packet receiving module is responsible for receiving data packets from the physical network card and forwarding them to the virtual network card;
the virtual network card module is a standard TUN virtual network card working at the transport layer and is used for parsing received IP data packets through the protocol stack of the operating system;
and the data packet routing module is used for routing target messages to the agent program module for processing, and comprises iptables rule configuration and policy routing configuration.
7. The HTTP-to-HTTPS bidirectional transparent proxy method according to claim 6, wherein the agent program module operates in the application layer of the operating system and supports the Tproxy mechanism, specifically:
the IP_TRANSPARENT option is set on the socket so that the socket can bind to any IP address: it accepts connections addressed to any destination IP address and can send data messages with any IP address as the source IP.
8. The HTTP-to-HTTPS bidirectional transparent proxy method of claim 6, further comprising:
adding a first rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 0;
the first rule indicates that a message whose transport protocol is TCP and whose destination port is 80 is marked with a label value of 1;
adding a second rule to the iptables rules:
iptables -t mangle -A PREROUTING -p tcp --sport 443 -j MARK --set-mark 1;
the second rule indicates that a message whose transport protocol is TCP and whose source port is 443 is marked with a label value of 1;
adding a third rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --dport 443 -j MARK --set-mark 2;
the third rule indicates that a message whose transport protocol is TCP and whose destination port is 443 is marked with a label value of 2;
adding a fourth rule to the iptables rules:
iptables -t mangle -A OUTPUT -p tcp --sport 80 -j MARK --set-mark 2;
the fourth rule indicates that a message whose transport protocol is TCP and whose source port is 80 is marked with a label value of 2.
9. The HTTP-to-HTTPS bidirectional transparent proxy method according to claim 8, further comprising policy routing configuration, specifically:
configuring messages with a label value of 1 to query routing table 100:
ip rule add fwmark 1 lookup 100;
establishing a routing table numbered 100, whose content is the route from the virtual network card of the proxy system to the application layer of the proxy system:
ip route add local 0.0.0.0/0 dev lo table 100;
configuring messages with a label value of 2 to query routing table 200:
ip rule add fwmark 0x2 table 200;
establishing a routing table numbered 200, through which messages are sent via the virtual network card tun1:
ip route add default via 10.0.0.2 dev tun1 table 200.
10. An apparatus for HTTP-to-HTTPS bidirectional transparent proxy, the apparatus comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the HTTP-to-HTTPS bidirectional transparent proxy method of any one of claims 1 to 9.
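The socket behaviour referred to in claim 7 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the function name `connect_as_client` and its parameters are hypothetical, the code only works on Linux, and the process needs the CAP_NET_ADMIN capability (typically root) plus policy routing such as that of claims 8 and 9 for return traffic to reach it. The numeric value 19 is the Linux definition of IP_TRANSPARENT and is used as a fallback when the Python `socket` module does not export the constant.

```python
import socket

# Linux defines IP_TRANSPARENT as 19; fall back to that value if the
# socket module of the running Python version does not expose it.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def connect_as_client(client_ip, server_ip, server_port=443, timeout=5.0):
    """Open a TCP connection to the server using the client's IP as source.

    Hypothetical helper: binds to a non-local address (the client's IP) with
    a kernel-chosen source port, then connects to the real server.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow binding to an address that is not configured on this host.
        sock.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        sock.bind((client_ip, 0))  # port 0 lets the kernel pick a source port
        sock.settimeout(timeout)
        sock.connect((server_ip, server_port))
    except OSError:
        sock.close()
        raise
    return sock
```

With the example addresses of the description, a call such as `connect_as_client("1.1.1.1", "2.2.2.2")` would correspond to the proxy opening the port-443 connection on behalf of the client; the returned socket could then be wrapped with TLS.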
CN202110062859.5A 2021-01-18 2021-01-18 Method and device for HTTP-to-HTTPS bidirectional transparent proxy Active CN112954001B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110062859.5A CN112954001B (en) 2021-01-18 2021-01-18 Method and device for HTTP-to-HTTPS bidirectional transparent proxy
PCT/CN2021/135682 WO2022151867A1 (en) 2021-01-18 2021-12-06 Method and apparatus for converting http into https bidirectional transparent proxy

Publications (2)

Publication Number Publication Date
CN112954001A CN112954001A (en) 2021-06-11
CN112954001B CN112954001B (en) 2022-02-15

Family

ID=76235497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110062859.5A Active CN112954001B (en) 2021-01-18 2021-01-18 Method and device for HTTP-to-HTTPS bidirectional transparent proxy

Country Status (2)

Country Link
CN (1) CN112954001B (en)
WO (1) WO2022151867A1 (en)

Also Published As

Publication number Publication date
CN112954001A (en) 2021-06-11
WO2022151867A1 (en) 2022-07-21

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: Method and device for HTTP to HTTPS bidirectional transparent proxy
Effective date of registration: 20220620
Granted publication date: 20220215
Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.
Pledgor: WUHAN GREENET INFORMATION SERVICE Co.,Ltd.
Registration number: Y2022420000171
PC01 Cancellation of the registration of the contract for pledge of patent right
Date of cancellation: 20230704
Granted publication date: 20220215
Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.
Pledgor: WUHAN GREENET INFORMATION SERVICE Co.,Ltd.
Registration number: Y2022420000171