FastCGI: 30 years old and still the better protocol for reverse proxies
360 points - yesterday at 4:16 PM
Though I'd like to make another protocol known: Web Application Socket (WAS). I designed it 16 years ago at my day job because I thought FastCGI still wasn't good enough.
Instead of packing bulk data inside frames on the main socket, WAS has a control socket plus two pipes (raw request+response body). Both the WAS application and the web server can use splice() to operate on a pipe, for example. No framing needed. Also, requests are cancellable and the three file descriptors can always be recovered.
Over the years, we used WAS for many of our internal applications, and for our web hosting environment, I even wrote a PHP SAPI for WAS. Quite a large number of web sites operate with WAS internally.
It's all open source:
- library: https://github.com/CM4all/libwas
- documentation: https://libwas.readthedocs.io/en/latest/
- non-blocking library: https://github.com/CM4all/libcommon/tree/master/src/was/asyn...
- our web server: https://github.com/CM4all/beng-proxy
- WebDAV: https://github.com/CM4all/davos
- PHP fork with WAS SAPI: https://github.com/CM4all/php-src
I remember the great FastCGI vs. SCGI vs. HTTP wars: I was founding a Web2.0 startup right at the time these technologies were gaining adoption, and so was responsible for setting up the frontend stack. HTTP won because of simplicity: instead of needing to introduce another protocol into your stack, you can just use HTTP, which you already needed to handle at the gateway. Now all sorts of complex network topologies became trivial: you could introduce multiple levels of reverse proxies if you ran out of capacity; you could have servers that specialized in authentication or session management or SSL termination or DDoS filtering or all the other cross-cutting concerns without them needing to know their position in the request chain; and you could use the same application servers for development, with a direct HTTP connection, as you did in production, where they'd sit behind a reverse proxy that handled SSL and authentication and abuse detection.
It also helped that nginx was lots faster than most FastCGI/SCGI modules of the time, and more robust. I'd initially set up my startup's stack as HTTP -> Lighttpd -> FastCGI -> Django, but it was way slower than just using nginx.
The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP. It's the idea that the network and its protocols should be agnostic to what's being transmitted, and all application logic should be in nodes of the network that filter and redirect packets accordingly. This has been a very powerful principle and shouldn't be discarded lightly.
The observation the article makes is that for security, it's often better to follow the Principle of Least Privilege [2] rather than blindly passing information along. Allowlist your communications to only what you expect, so that you aren't unwittingly contributing to a compromise elsewhere in the network.
And the article is highlighting - not explicitly, but it's there - the tension between these two principles. E2E gives you flexibility, but with flexibility comes the potential for someone to use that flexibility to cause harm. PoLP gives you security, but at the cost of inflexibility, where your system can only do what you designed it to do and cannot easily adapt to new requirements.
[1] https://en.wikipedia.org/wiki/End-to-end_principle
[2] https://en.wikipedia.org/wiki/Principle_of_least_privilege
With widespread browser support for WHATWG streams, it's pretty easy to implement your own WebSockets over long-lived HTTP requests. Basically you just send a byte stream and prepend each message with a header, which can just be a size in many cases.
Advantages over WebSockets:
* No special path in your server layer like you need for WebSocket.
* Backpressure
* You get to take advantage of HTTP/2/3 improvements for free
* Lower framing overhead
Unfortunately, AFAIK browsers still don't support streaming the request body while receiving the response, so you need a pair of requests for full bidirectional streaming.
Or you could use something like haproxy's proxy protocol (although that may not support all the information you want, and doesn't work for multiplexing).
Edit: actually the "Forwarded" header kind of fills that niche. Although you may want extensions for things like the client certificate.
It is a tiny binary protocol, with frames just like FastCGI. The reference server works with several languages; I've used it over the years mostly with Python, but also Ruby and Perl. It is a small C executable with all the practical features one needs for web hosting: draining backends, autoscaling, logging, chrooted backends, everything.
Very few FastCGI servers are this mature. Unlike FastCGI, it has been extended to support websockets and async.
I have used it in production at several places for many years and have nothing but praise for it. It feels like this weird unknown secret for web operations. Unfortunately, it sees less use now in the cloud era, and development seems to have all but stopped. It still works and is still reliable, but the writing is probably on the wall. However, nothing comes close in terms of speed, simplicity, and features.
The scenario is that we have our first-party task lists and data viewers, but users often want to customize them heavily, say by building a Kanban view or a custom dashboard with data filters and charts.
The box has a coding agent, which means the user can code anything, rather than us building traditional report-builder tools.
Go's stdlib has good support on both the server side and user space. The coding agent makes a page-name/main.go that talks CGI, and the server delegates requests to it.
It's all "person scale" data and page views, so no real need to optimize with FastCGI even.
What's old is new again for agents!
I don't know if anything else in the RHEL distributions use FastCGI.
$ rpm -qi php-fpm | grep ^Summary
Summary : PHP FastCGI Process Manager

Most of the stuff I've done for reverse proxies has been pretty straightforward and just using the stuff built into Nginx, but I have to admit that it wouldn't have even occurred to me to use FastCGI if I needed something more elaborate.
I used FastCGI a bit about ten years ago to "convert" some C++ code I wrote to work on the web, but admittedly I haven't used it much since then.
Using fastcgi requires that you write your app to serve fastcgi.
The upside of serving http/1.1 instead of fastcgi is that devs can instantly use their browser to test things instead of having to set up a reverse proxy on their machine.
The bad parts of http/1.1 are fixed equally well by both http/2.0 and fastcgi. So just use http/2.0 and you get the proper framing as well as browser support.
Can we take a moment to appreciate the absurdity of HTTP headers? We have X-Forwarded-For, X-Real-IP, and each CDN has its own custom-flavored one. Some of them are comma-separated lists, and usually end up with an IP of your own LB uselessly added in there (I know why, it's just not helpful). All of them might be inserted by a malicious user-agent. I guess nobody could agree on how all the various trusted servers in the pipeline should convey the important bit.
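One common way out of that mess, not claimed by the comment above, is to trust X-Forwarded-For only as far as your own infrastructure: walk the list right to left and take the first address you didn't add yourself, since everything to its left is attacker-controllable. A sketch with made-up addresses:

```go
package main

import (
	"fmt"
	"strings"
)

// clientIP walks X-Forwarded-For right to left and returns the first
// hop not in trusted, i.e. the last value appended by a proxy we
// control. Entries left of that point cannot be trusted.
func clientIP(xff string, trusted map[string]bool) string {
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		ip := strings.TrimSpace(hops[i])
		if !trusted[ip] {
			return ip
		}
	}
	return "" // header contained only our own proxies
}

func main() {
	trusted := map[string]bool{"10.0.0.5": true, "10.0.0.6": true}
	// The client forged the first entry; our LB (10.0.0.5) appended
	// the real peer address 203.0.113.9, then got appended itself.
	xff := "1.2.3.4, 203.0.113.9, 10.0.0.5"
	fmt.Println(clientIP(xff, trusted)) // 203.0.113.9
}
```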
I guess it fits in quite well with the absurdity of the User-Agent header, which has come so far in absurdity that Apple decided to fully kill it by just sending utterly fake nonsense (false OS version, etc) in the name of "pRiVaCy."
It is less expressive than HTTP in ways that may or may not be important to your application; I prefer accurate URL handling.
I am doing a typical http thing, but I wonder, has anyone used fastcgi in Caddy?
https://caddyserver.com/docs/caddyfile/directives/reverse_pr...