
Hey there, I work on Proxygen at Facebook. I'm happy to answer any questions you have about the project.


This question is a bit naive, but outside of Facebook, can you think of what kind of application this is well suited for?


Well, besides the fun of hacking around and building little HTTP servers, I could see this being useful if you want to save money by running fewer instances of your HTTP service. For instance, if you have a widely deployed Python webservice that isn't scaling well, you could rewrite it in C++ with very little boilerplate using proxygen's httpserver.

It's early stages for the proxygen open source project. Maybe further down the road we'll provide off-the-shelf binaries, but we think the library is already interesting enough to warrant a release.


Is that the inception of the project? I am more curious about the lineage. How was the decision made to go this route instead of throwing more instances at it?


The blog post goes into more detail, but Proxygen started with an effort to write an L7 reverse proxy that could deeply integrate into FB internal services. We pulled out a lot of the non-FB-specific stuff into this open source release. Before that, we used hardware load balancers for this role, which was expensive.


Expensive not just in capital costs, but also in the costs of operating them: they weren't as reliably configurable, health-checkable, and instrumentable as we'd want, whereas Proxygen (and a later L4 load balancer) were.

Also the previous load balancers had constraints we weren't willing to accept - they required special connectivity to our networks, we could not use particular combinations of options, and we had to rely on vendors to solve problems that most of their customers were not encountering and/or able to detect.


Yeah, we are hitting the same wall right now and we've been going down the same route using HAProxy/Chef. Our plan is to put an API in front of it and treat it similar to an ELB. Are you still using hardware load balancers for SSL termination?


The machines running Proxygen do the TLS termination.


That's interesting. I had been working on something similar years ago. I eventually open sourced it, but discontinued work on it. Http://github.com/baus/swithflow

I'm surprised more systems don't take this approach of doing more work in L7 proxies.


Can you tell us the motivation for building this instead of adapting and using an existing solution such as Apache Traffic Server, squid, nginx or haproxy?


I am not a huge fan of NIH, but more often than not it is simply easier in the long run to roll something in house that does the job. Although the debt that piles up tends to dwarf the original choice, it is often still a better idea because of employee churn. It's much easier to keep it within the standards of normal coding conventions.

I don't work at FB, but near the bottom in the comments you can see that they build on existing C++ libraries that have been tried/tested. We do the [same thing][1] with smaller services simply because it's easier in the SDLC process to move libraries that are already in-house.

[1]: https://github.com/bloomberg/bde/tree/master/groups/bsl


I'll have a look at the blog for more -- but this was something that couldn't be solved with haproxy and/or trafficserver?


haproxy only got TLS support in 2012, whereas work on proxygen began well before that. In addition, haproxy does not support SPDY natively (although it can forward it to something that does).

Apache Traffic Server only got SPDY support in a release about 45 days ago according to their release notes.

There are many other relatively basic features besides those, and similar considerations exist for the other projects that existed back then (and even now).


He asked why you didn't `contribute` to those projects, not why you didn't `use` them.


Any kind of internal service that listens and accepts an API. Like you could put a bunch of monitoring agents on all your machines and those agents get information about your machines and report back to the service that makes use of Proxygen. Your service would use Proxygen instead of hiding behind an nginx or apache server and running via some kind of fcgi setup. The agents themselves could be using the Proxygen client to talk to the service using Proxygen server pieces.


What are the security implications of running native code on a public facing service? How many RCEs has Proxygen had? Has it been audited for security? What is the testing procedure?


In general, native code is not the same as unsafe code (that's why I'm personally excited about Rust. Native + safe = awesome). In C++'s case though it is true that a memory error could be used to exploit the process. Careful use of safe memory abstractions is our main safeguard against this. I can't comment on RCEs, but Proxygen has been externally and internally audited for security. We have many unit tests in the project, and we also internally have more comprehensive integration tests and a fuzzer.
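
To make "safe memory abstractions" a bit more concrete, here is a minimal sketch of the general idea. It is not Proxygen code: `ByteReader` and `makeReader` are hypothetical names made up for illustration, and it assumes a C++14 compiler for `std::make_unique`. The point is that parsing code never touches raw pointers or trusts a length field without checking it against the buffer first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bounds-checked reader over an owned byte buffer.
class ByteReader {
 public:
  explicit ByteReader(std::vector<std::uint8_t> data)
      : data_(std::move(data)) {}

  // Number of bytes that can still be read.
  std::size_t remaining() const {
    assert(pos_ <= data_.size());  // internal invariant
    return data_.size() - pos_;
  }

  // Reads one byte, throwing instead of reading past the end.
  std::uint8_t readByte() {
    if (remaining() < 1) {
      throw std::out_of_range("readByte past end of buffer");
    }
    return data_[pos_++];
  }

  // Reads a string prefixed by a 1-byte length. The length comes from
  // untrusted input, so it is validated before any bytes are copied.
  std::string readString() {
    const std::size_t len = readByte();
    if (len > remaining()) {
      throw std::out_of_range("string length exceeds buffer");
    }
    const auto first = data_.begin() + static_cast<std::ptrdiff_t>(pos_);
    std::string out(first, first + static_cast<std::ptrdiff_t>(len));
    pos_ += len;
    return out;
  }

 private:
  std::vector<std::uint8_t> data_;  // owns the bytes, freed automatically
  std::size_t pos_ = 0;
};

// Ownership is explicit: the caller gets the only handle to the reader,
// and it is destroyed when that handle goes out of scope.
std::unique_ptr<ByteReader> makeReader(std::vector<std::uint8_t> data) {
  return std::make_unique<ByteReader>(std::move(data));
}
```

With input bytes `{3, 'a', 'b', 'c', 9}`, the first `readString()` call yields `"abc"`, and a second call throws `std::out_of_range` because the claimed length of 9 is larger than what is left in the buffer.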


I currently use nginx for all my projects, which for me mostly consist of (1) serving static content and (2) proxying to gunicorn. My setups are usually not overly complicated, basic redirects for stuff like www/non-www or ssl/non-ssl, installing ssl certs can be a pain sometimes, and for the most part I use a lot of default settings and have never needed to go in-depth with tweaking settings.

Does somebody like me have a reason to check out Proxygen?


Maybe not today. We haven't open sourced a configurable proxy yet, so you wouldn't be able to do your redirects and some of these other features without writing C++ code.

If you're interested in how HTTP frameworks are designed and implemented, I'd definitely suggest checking it out though. This project is initially going to be more interesting to people integrating with HTTP quite directly.


Question for you and the general C++ community. I am a Java developer mostly but it looks like I'll be transitioning to a few C++ projects in the near future. Any good resources you can recommend for learning modern C++? Particularly anything you've used to get developers on your team similarly up to speed.

I've cruised through the Proxygen code base and there are definitely some head scratchers.


This SO thread has a good overview of the books out there:

http://stackoverflow.com/questions/388242/the-definitive-c-b...

I'd recommend "C++ Primer" first, "Effective C++" second, and "The C++ Programming Language" as a reference. Other must haves are "Modern C++ Design", the other Scott Meyers books, and the Herb Sutter books.


Hey, could you recommend any books or tutorials for C++1y/C++14? I had learned C++ in my CS courses but I've used java/python/javascript these days. I'm thinking of coming back to C++ because C++14 looks cool.


Start with Bjarne's new book, "A Tour of C++". It shows how to make proper use of C++11 without those unsafe C influences.

C++14 is mostly fixing what was left out in C++11, so the book is already a good starting point.

C++14 is definitely cool, I use JVM/.NET nowadays at work, but am an old C++ dog (since 1993), so I always used the language on a few side projects.

C++14 kills quite a few complaints I had about the language. Now if we just could get proper modules.
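
For anyone coming back from Java or Python, here is a rough sketch of a few things C++14 adds on top of C++11. The function names are made up for illustration, and this is only a sampler of language and library features, not a recommended style guide.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// C++14: return type deduction for ordinary functions.
auto sumAll(const std::vector<int>& values) {
  int total = 0;
  for (int v : values) total += v;
  return total;
}

// C++14: std::make_unique, the counterpart to C++11's std::make_shared.
std::unique_ptr<std::string> makeGreeting(const std::string& name) {
  return std::make_unique<std::string>("hello, " + name);
}

// C++14: generic lambdas, where parameters can be declared with auto.
std::vector<int> sortedDescending(std::vector<int> values) {
  std::sort(values.begin(), values.end(),
            [](auto a, auto b) { return a > b; });
  return values;
}

// C++14: init-capture moves ownership into the closure. C++11 lambdas
// could only capture by copy or by reference.
auto makeLengthGetter(std::unique_ptr<std::string> text) {
  return [owned = std::move(text)] { return owned->size(); };
}

// C++14: binary literals and digit separators.
constexpr int kFlags = 0b0000'0101;
```

None of this needs a manual `new` or `delete`, which is a large part of why the newer standards feel less error-prone than the C++ many of us learned in school.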


You should definitely pick up Bjarne's latest book, "A Tour of C++".

It shows how to make proper use of C++11 without any bad C influences.


Hey! So Proxygen was originally a reverse-proxy load balancer. Is that still how Facebook utilizes it now? If not, what is its current role? Are there any plans for integrating this with Hiphop/PHP in any way?


Yup, we still use Proxygen (the library) in our reverse proxy. Maybe some day we'll be able to open source the reverse proxy too, but it's pretty deeply integrated with internal FB code right now so that's tricky.

We already use the Proxygen HTTP code for the webserver part of HHVM internally. We hope to release that webserver part too (in the HHVM project).


Websockets were mentioned in the blog post. Has proxygen been deployed with websockets at Facebook scale? How much support for websockets is there in the open-sourced proxygen?


I would love to see Proxygen integrated as a HackLang extension. Especially the HTTP parser. As far as I know, the PHP world is sorely lacking a robust HTTP parser.


In response to my comment here, I did some more experimentation with pecl_http. Turns out version 2.1.4 looks relatively stable. Just be sure not to make the same mistake I made of referring to the documentation on the php.net website. Instead one should look here for version 2.x documentation:

http://devel-m6w6.rhcloud.com/mdref/http/


The blog post mentions that HHVM uses parts of Proxygen.


Sorry; I had read through most of it but was scanning for PHP or Hiphop. Honest mistake, I swear :).


Hi! Looks interesting!

I have a question -- in particular, the blog post mentions that the framework "includes both server and client code", and my question is about the second part :-)

I'm wondering, how does it compare to the other C++ HTTP client solutions: Is it closer to higher-level libraries like cpp-netlib, Casablanca, or POCO -- or more on a lower-level / comparable to Asio (or Boost.Asio)?

In your view, what are the main relative advantages/disadvantages (namely in the scenario mentioned in the blog post, i.e., integration into existing applications)?


I'm curious to know why there is a file named PATENTS in the repo (alongside the LICENSE file). What kind of legal protections would this "additional grant" give you?


This is similar to the Apache License, which allows developers to use the project with the confidence that we grant a license to any patents that may affect the project. This is a grant we use for all of our projects and is not anything specific to Proxygen.


I am sorry to point this out, but the patent license of Proxygen does not look similar to that of the Apache License, for two reasons.

- the license is terminated when one files a claim against _any_ of Facebook's software or services (IIRC the Apache License is terminated only when the claim is filed against the software itself)

- the license also terminates when you claim that "any right in any patent claim of Facebook is invalid or unenforceable"

The second clause seems very aggressive (or pro-patent) to me, which makes me feel sorry for the developers of Proxygen, since IMO such a clause would harm the acceptance of the software outside Facebook.

It would be great if you reconsider the patent license.

Disclaimer: I am the developer of H2O, an open-source HTTP/1 and HTTP/2 library, so there is obviously a conflict of interest here. But I wanted to leave a comment anyway since, honestly, I would feel sorry if my friends at Facebook need to go with this kind of license.


That's a good point though; that's not exactly the kind of stuff people expect in an open source license.

Then everyone goes and complains about the GPL vs BSD... but this is waaaaaaaaay worse.


I would love to see a comparison between Proxygen and another server (ideally nginx or golang server). The numbers are impressive, but the client is on the same box, it is a big box, and it is a simple and short response. So I'm not sure if I should be wowed or not. From reading the post, Proxygen was written as a server that would be well-integrated into Facebook tools. I don't really use Facebook tools, so I'm not sure if Proxygen would be right for me?


I think https://news.ycombinator.com/item?id=8563766 is a good answer to use cases.


Maybe I am being dense, but why is Proxygen a better solution here?


It's a library that you can integrate into your application rather than passing requests via an intermediary (so performance would improve). It's not necessarily a better solution unless you're optimizing for performance. It probably isn't if you're going for ease of development and maintainability.


How tested is the websocket support? Does Facebook use this for TLS termination?


If you need a really simple embeddable C++ webserver that supports websockets, can I suggest (speaking as its primary developer) SeaSocks: https://github.com/mattgodbolt/seasocks


Websocket support isn't out yet unfortunately. It's something we hope to get to soon.

Our reverse proxy uses proxygen and does TLS termination too, yes.


Hey, I am a bit late, but I wonder how you generally handle unicode in C++, since the language itself does not have much support for it. That's what always makes me wary of writing any kind of server in C++.



Could you comment on the verisimilitude of the benchmark parameters? I would expect in Facebook's production environment the number of active sockets would be hundreds or thousands of times greater than 400.


You're right, we see many more connections in production. We didn't have enough time to run exhaustive benchmarks for all the different possible combinations of active sockets, requests per connection, etc. As you might suspect, 400 was a sweet spot for our performance numbers at 1 core. Overall RPS didn't dip too much when increasing the number of connections. The table we included is just to give a rough idea of perf.


Are there any plans for spdy and http2?


From the first paragraph:

In addition to HTTP/1.1, Proxygen (rhymes with "oxygen") supports SPDY/3 and SPDY/3.1. We are also iterating and developing support for HTTP/2.


We already do support SPDY/3 and SPDY/3.1, and we're working on HTTP/2 currently.



