
I have already converted a multi-threaded HTTP proxy run by a socket server into a single-threaded proxy run by asyncio, and it works well for plain HTTP. However, when I try to do the same with SSL handshakes (or maybe it's the "while" loop), the server fails to exchange data. The code doesn't produce any error message, not even a hint, yet it still doesn't work. Only the browser says that the requests timed out, but that much we already knew. Any suggestions? Thanks.

Working socket:

def proxy(host, port, conn, request):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # If successful, send 200 code response
        s.connect((host, port))
        reply = "HTTP/1.0 200 Connection established\r\n"
        reply += "Proxy-agent: Proxy\r\n"
        reply += "\r\n"
        conn.sendall(reply.encode())
        s.setblocking(False)
        print("  HTTPS Connection Established")
        while True:
            try:
                s.sendall(conn.recv(4096))
            except socket.error as err:
                pass
            try:
                reply = s.recv(4096)
                conn.sendall(reply)
            except socket.error as e:
                s.close()
            finally:
                print("  Request of client  completed...")
                try:
                    s.close()
                except NameError: 
                    print("  Connection was already closed...")
    except socket.error as err:
        pass

Non-working socket with Asyncio:

async def proxy(host, port, conn, request):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    try:
        # If successful, send 200 code response
        s.connect((host, port))
        reply = "HTTP/1.0 200 Connection established\r\n"
        reply += "Proxy-agent: Proxy\r\n"
        reply += "\r\n"
        await loop.sock_sendall(conn, reply.encode())
        s.setblocking(False)
        print("  HTTPS Connection Established")
        while True:
            try:
                req_ = await loop.sock_recv(conn, 4096)
                await loop.sock_sendall(s, req_)
            except socket.error as err:
                s.close()
            try:
                ans_ = await loop.sock_recv(s, 4096)
                await loop.sock_sendall(conn, ans_)
            except socket.error as e:
                s.close()
            finally:
                print("  Request of client  completed...")
                try:
                    s.close()
                except NameError: 
                    print("  Connection was already closed...")
    except socket.error as err:
        pass
  • How is loop defined? Is it possible that it has an internal mutex lock it's using?
    – thethiny
    Commented Apr 19 at 1:07

3 Answers


Your working version was simply trying a non-blocking read from socket conn and forwarding the result to s, then trying a non-blocking read from socket s and forwarding the result to conn, again and again: a busy loop. This worked because each read attempt failed immediately if no data were available. It wasn't a smart implementation, though, since the busy loop burned CPU time while mostly doing nothing. Better to use select or similar to actually block until data become available on either socket.
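A select-based version of that relay could be sketched roughly like this (my own function name and two-socket signature, not the asker's code; error handling is minimal):

```python
import select
import socket

def relay_blocking(conn, s):
    """Relay data between two connected sockets using select().

    select() blocks until at least one socket is readable, so the
    loop sleeps while both directions are idle instead of spinning.
    """
    sockets = [conn, s]
    while True:
        readable, _, _ = select.select(sockets, [], [])
        for src in readable:
            data = src.recv(4096)
            if not data:        # one side closed: stop relaying
                return
            dst = s if src is conn else conn
            dst.sendall(data)
```

The CPU cost drops to near zero because the thread is descheduled inside select() whenever both sockets are quiet.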

Your non-working version does not take into account that loop.sock_recv will not simply fail if the socket is still connected but no data are available. Instead it waits for completion, i.e. until it either gets data or the connection is closed. So your previous approach of trying one socket, then the other, in a loop, implicitly skipping whichever has no data available, no longer works. A smarter approach similar to the one suggested for your working version is needed. For example, you could use loop.add_reader to get a callback invoked once data are available on a socket and then forward the data to the peer, instead of busily trying to read again and again. See also this answer for a possible way to implement a proxy using asyncio.
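A minimal sketch of the loop.add_reader idea (pipe is my own name; one registration per direction, and it assumes small writes that never fill the destination's send buffer):

```python
import asyncio
import socket

def pipe(loop, src, dst):
    """Register a callback so the event loop wakes us only when
    src actually has data (or EOF) -- no busy polling."""
    def on_readable():
        try:
            data = src.recv(4096)
        except BlockingIOError:   # spurious wakeup: just wait again
            return
        if data:
            dst.sendall(data)     # assumes dst's buffer doesn't fill up
        else:
            loop.remove_reader(src.fileno())  # EOF: stop watching src
    loop.add_reader(src.fileno(), on_readable)
```

For a full tunnel you would register pipe() once in each direction and add teardown for the other side on EOF; note that add_reader is not available on Windows' ProactorEventLoop.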

  • That's code I found on GitHub, and I asked why it doesn't work, not what you think about the author's intellect. If nothing else, the proxies from "smarter" authors didn't work at all, for one reason or another. I tested at least 20 proxies, and this was the only working code I found on the internet. As for asyncio, all they say is that TCP proxying has a minimal or basically nonexistent audience, so the asyncio team doesn't spend time working on it (e.g., on documenting it).
    – Juraj
    Commented Apr 19 at 10:56
  • @Juraj: The term "not a smart implementation" referred to the way it was implemented and wasn't intended as a statement about the author's intellectual capabilities. Even smart people might write not-so-smart code if they don't have much knowledge and experience in the necessary field. Commented Apr 19 at 11:05

Your async proxy fails with HTTPS because, after the 200 Connection established response, you're supposed to tunnel the encrypted data without touching it. But your code closes sockets too early and doesn't handle both directions properly. Use asyncio.open_connection() and forward data both ways until the connection ends naturally.
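That approach could be sketched like this (tunnel and relay are my own names, not from the question, and error handling is deliberately minimal):

```python
import asyncio

async def tunnel(client_reader, client_writer, host, port):
    # Open the upstream connection, confirm the tunnel to the client,
    # then copy bytes in both directions until either side closes.
    remote_reader, remote_writer = await asyncio.open_connection(host, port)
    client_writer.write(b"HTTP/1.0 200 Connection established\r\n\r\n")
    await client_writer.drain()

    async def relay(reader, writer):
        try:
            while data := await reader.read(4096):
                writer.write(data)
                await writer.drain()   # respect flow control
        finally:
            writer.close()             # propagate EOF to the other side

    await asyncio.gather(
        relay(client_reader, remote_writer),
        relay(remote_reader, client_writer),
        return_exceptions=True,
    )
```

Because the two relay coroutines run concurrently, neither direction can starve the other, and an EOF on one side tears down the whole tunnel.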


Thanks for your hints; you all were very helpful. Below is the first quick version of working code for HTTPS (it still needs some small improvements like timeouts, forced cleanup of old connections, etc.). It's neither in the code below nor in the question, but a low-level socket provides direct control over any data that pass through (which is one of the purposes of a proxy). Asyncio was added to use the asynchronous HTTPX client, awaited threads, and Cython (which can release the Python GIL, even without threading). Before this function runs, I use it to make DoH (DNS over HTTPS) requests with HTTPX (thus first resolving domain names), verify user agents, check either login credentials or an ECC signature, etc. (and I don't really want to pull data from "local_reader" and "remote_reader"). IPv6 support is questionable, and maybe someone might want to deploy this on something other than Linux (on Windows and macOS, socket works well without extra work to make it cross-platform compatible). Not for everyone, but for a personal proxy in particular, I'm going to stick with plain socket.

    async def Relay_(reader, writer):
        # Forward one direction until the peer closes or a socket error occurs.
        while True:
            try:
                data = await loop.sock_recv(reader, 2048)
                if not data:
                    # An empty read means the peer closed the connection.
                    break
                print(data)
                await loop.sock_sendall(writer, data)
            except OSError:
                print('Error in Relay_ function')
                break  # without this break, an error would spin the loop forever


    async def Proxy_(host, port, conn):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
        try:
            s.connect((host, port))  # note: this connect still blocks the loop
            s.setblocking(False)
            reply = "HTTP/1.0 200 Connection established\r\n"
            reply += "Proxy-agent: Proxy\r\n"
            reply += "\r\n"
            await loop.sock_sendall(conn, reply.encode())
            tasks = [loop.create_task(Relay_(s, conn)),
                     loop.create_task(Relay_(conn, s))]
            await asyncio.wait(tasks)
        except OSError:
            print('Phff, see Proxy_ function')
        finally:
            s.close()  # release the upstream socket in every case

Some bugs still need fixing, but the first look was somewhat like this: https://freeimage.host/i/screenshot-from-2025-04-19-01-04-48.3Exd2xR (this page loaded with the code above). I also changed the HTTP version, and on my PC the overhead comes to less than 6-8 ms. On pages like Example Domain, without concurrent requests/tasks, there's almost nothing to await in parallel, so the time climbed from <6 ms to 6-8 ms. However, my first impression is that with asyncio it's easier to time out and safer to close connections.
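For the timeouts mentioned above, one option is asyncio.wait_for, which cancels a relay task that runs too long (the wrapper name below is mine, not from the code):

```python
import asyncio

async def relay_with_timeout(relay_coro, seconds=30):
    # Hypothetical wrapper: cancel the relay after `seconds` so stale
    # connections don't linger; wait_for cancels the inner coroutine
    # and raises TimeoutError in the caller.
    try:
        await asyncio.wait_for(relay_coro, timeout=seconds)
    except asyncio.TimeoutError:
        print('Relay timed out; cleaning up the connection')
```

The cancellation propagates into the relay coroutine as CancelledError, so socket cleanup placed in a finally block there still runs.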

  • The main point of asyncio is that I/O on a socket is done only when the socket is ready (i.e. it will not block). Asyncio's event loop is built around a select/poll system call and takes care of that. Converting a program to asyncio means to use this kind of I/O. I'm sorry, but you are doing something else instead.
    – VPfB
    Commented Apr 20 at 5:29
I don't know what you want to say with those CPU usage percentages. I was talking e.g. about the socket.connect in your program. This call blocks until the TCP handshake completes. Every connection problem will make a single-threaded asyncio program unresponsive for a while. Have a nice day.
    – VPfB
    Commented Apr 20 at 18:48
Oh... :) Where do you see that asyncio blocks the event loop? Nowhere. It's from here: stackoverflow.com/questions/46788964/… Just rebuilt for a newer version of Python and a newer asyncio. Since it was converted from the first code found on GitHub to this answer, the CPU usage dropped from 20-55% to nearly 0%, and it works. Surely, since the CPU usage is in the range between 1% and ~0%, this conversion "opened a huge window for resource saving, with the next target in negative values". Ah, really? Thanks, my pleasure to chat with you.
    – Juraj
    Commented Apr 20 at 19:10
That's why. You see synchronous code (the socket module) that runs for a few ms during handshakes. Were handshakes supposed to be handled by asyncio? No. Does it need to be fully asynchronous? No. To clarify: since I got working, fast, resource-efficient and cross-platform compatible code, I would think twice about whether I really need to swap it for select, pipes, or loop.add_reader(), which won't work on Windows and OpenBSD-based routers. I could improve that handshake, but the function could be wrapped in a new awaited thread. Have a nice day.
    – Juraj
    Commented Apr 21 at 15:25
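Regarding the blocking socket.connect discussed in the comments above, a non-blocking variant could be sketched like this (connect_nonblocking is my own name, IPv4-only for brevity):

```python
import asyncio
import socket

async def connect_nonblocking(host, port):
    # loop.sock_connect suspends only this coroutine during the TCP
    # handshake instead of stalling the whole single-threaded event loop.
    loop = asyncio.get_running_loop()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)          # must be non-blocking before sock_connect
    await loop.sock_connect(s, (host, port))
    return s
```

A connect to an unreachable host then costs only this one coroutine time; other connections keep being served by the event loop.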
