I wonder if the amount of RAM used by a new process can be reduced. Code and other read-only segments are shared anyway, so it's basically only the new heap and various buffers.
Reducing this amount would also make it possible to run Postgres in more constrained environments.
There are two parts to this - the memory allocated by the OS, and the memory Postgres allocates internally.
At the OS level, we can't really do much, I'm afraid :-( I don't think we're wasting too much memory there, exactly because a lot of the memory is shared between processes. That sharing also makes it difficult to determine how much memory is actually used by the processes - it makes the various metrics in ps/top rather tricky to interpret.
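To illustrate why the usual metrics are misleading: on Linux, RSS counts every shared page in full for each process that maps it, so summing RSS over all postgres backends wildly overstates the total. PSS ("proportional set size") divides each shared page among the processes mapping it, so summing PSS gives a more honest figure. A minimal sketch, assuming Linux with /proc/<pid>/smaps_rollup available (kernel 4.14+); it inspects the shell itself, but you'd substitute a backend PID:

```shell
# Compare Rss (shared pages counted fully) with Pss (shared pages
# split proportionally) for one process. For a real measurement,
# replace $$ with the PID of a postgres backend.
pid=$$
grep -E '^(Rss|Pss|Shared|Private)' "/proc/$pid/smaps_rollup"
```

Summing the Pss lines across all backends approximates the cluster's real footprint, while the Shared_*/Private_* split shows how much of each backend is actually unique to it.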
As for the internal memory, it's a bit more complicated. We need a little bit of "per process" memory (per-backend entries in various internal data structures, etc.) - a couple dozen/hundred kBs, perhaps. It's hard to give a clear figure, because it depends on max_locks_per_transaction etc. This is unlikely to go away even if we switched to threads, because it's really "per session" state.
But then there are the various caches the processes keep, memory used to run queries etc. Those may be arbitrarily large, of course. The caches (with metadata about relations, indexes etc.) are usually a couple MBs at most, but yes, we might share them between threads and save some of this memory. The price for that would be the need for additional synchronization / locking, etc. The memory used to run queries (i.e. work_mem) is impossible to share between threads, of course.
Overall, there's very little chance PostgreSQL will switch to threads (given the difficulty of such a project, various drawbacks, ...). But I do agree that having to run a separate connection pool can be cumbersome, etc. There was a proposal to implement an integrated connection pool, which would address at least some of those problems, and I wouldn't be surprised if it happened in the foreseeable future.