Below, you're doubling down on the "monolith == 1 server" point. You should do a bit of research before you continue; the word doesn't mean what you think it means.
> You should do a bit of research before you continue; the word doesn't mean what you think it means.
I'd suggest that the word means whatever the majority of people assume it means - that's also how the meaning of "agile" ended up so inconsistent, depending on which companies/teams/people you talk to. Essentially, people who've only worked with projects that run as single instances might have a pretty different opinion on what a "monolith" is.
I've definitely met a lot of people who'd claim that a single application package across multiple servers is no longer a monolith. The reasoning probably goes along these lines: "If a monolith can live on multiple servers, what do you call an application that can only ever live on a single server, with a single instance running at a time?"
So essentially, what would be the names best suited to describe:
- a single codebase that can only run as a single instance
- a single codebase that can be deployed with multiple concurrent instances
Personally, I think it might be worthwhile to also answer that, so we have a better idea of what to name things. Otherwise we end up with messes like people taking DDD too far and having a separate microservice per business entity, because they read too much into the "micro" part of microservices.
Oh, also, in regards to the definition of "monolith" that is centered around only how the code itself is deployed: personally, I think that a modular monolith ("modulith"? heh) is a great architecture that doesn't get explored nearly enough! A single codebase that is easy to reason about, a single executable that can scale but is also simple to deploy however you need, with different parts of the functionality (think front end, different types of API, reporting functionality, file upload functionality etc.) being possible to enable/disable in each of the instances based on feature flags. Want everything on a single instance? Flip all of the flags to "on". Multiple instances? Configure whatever modules you want, wherever you need them; a rough sketch of the idea follows below.
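To illustrate, here's a minimal sketch assuming Flask; the `myapp` modules and the flag names are made up:

```python
# Each instance registers only the modules its feature flags enable, so the
# same codebase can run as one do-everything instance or many specialized ones.
import os

from flask import Flask

from myapp.api import api_blueprint              # hypothetical modules
from myapp.reporting import reporting_blueprint
from myapp.uploads import uploads_blueprint

app = Flask(__name__)

if os.environ.get("ENABLE_API", "true") == "true":
    app.register_blueprint(api_blueprint)
if os.environ.get("ENABLE_REPORTING", "false") == "true":
    app.register_blueprint(reporting_blueprint)
if os.environ.get("ENABLE_UPLOADS", "false") == "true":
    app.register_blueprint(uploads_blueprint)
```

Same executable everywhere; the environment decides what each instance actually does.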
> "If a monolith can live on multiple servers, what do you call an application that can only ever live on a single server, with a single instance being launched at the same time?"
That exists? Are there examples of this, especially ones where there is a good reason for this? I cannot even begin to list all the awful issues with this in my head.
> That exists? Are there examples of this, especially ones where there is a good reason for this? I cannot even begin to list all the awful issues with this in my head.
Most certainly. I'd suggest that many systems out there that only ever needed to run on a single server are structured like this. Even though you could technically take plenty of these systems and launch two parallel instances, you'd run into problems because they haven't adopted the "shared nothing" approach, or even just basic statelessness principles.
We tend to forget, amidst all of our modern and scalable container deployments, that there are untold amounts of PHP code out there that stores files and other uploaded data on the very same server, in any number of folders. Of course, you can technically set up a clustered file system, or at least a network-based one, unless you are running in a shared hosting environment, in which case you are out of luck.
Oh, and speaking of shared hosting: in theory you should be able to get rid of environments that use cPanel and switch to containers instead, right? Well, no, because workflows are built around it, and dozens of sites might run on the same account with any given shared hosting provider.
You'll be lucky to even find such an environment that has an up-to-date version of PHP installed and running, and resource contention issues will present themselves sooner or later: "Oh hey, this one slow SQL query in this site brings down a dozen other sites. Could you have a look at it?"
I actually helped an acquaintance with that exact problem, and I still regret agreeing to help because it wasn't a good experience.
Looking at the enterprise space, I've also seen systems that liberally store state (e.g. information about business rules) in application memory, as well as things like user session information, because someone didn't know how to set up Redis, or couldn't be bothered to.
So there, an app restart means that everyone gets logged out. Not only that: if you have a system that lets users make certain kinds of requests, with business rules about the order in which they can be accepted, you can store the outcomes in the DB, but during processing there's an in-memory queue. That means you couldn't feasibly run multiple instances in parallel, because you'd end up with a split-brain problem. It's as if the people designing it had never heard of RabbitMQ.
Apart from that, there are also issues with scheduled processes. If you've never heard of feature flags, or don't see a good reason to use them, you'll end up with your main application instance executing scheduled tasks in parallel with serving user requests; something like the locking sketch below is one way to make scheduled work safe once you have more than one instance. Worse yet if it's tightly coupled and the application does "callbacks" to react to certain changes, instead of passing the message through the DB or something similar. Oh, and in regards to performance, you'd better hope that the reporting process you wrote doesn't cause the service's GC to thrash to the point where everything slows down.
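For example, a minimal sketch assuming PostgreSQL and psycopg2; JOB_LOCK_KEY and generate_report() are made-up names. Every instance can fire the scheduled job, but an advisory lock guarantees a single runner at a time:

```python
import psycopg2

JOB_LOCK_KEY = 42  # arbitrary application-chosen lock id (assumption)

def generate_report(conn):
    """Placeholder for the actual reporting logic."""

def run_nightly_report(conn):
    with conn.cursor() as cur:
        cur.execute("SELECT pg_try_advisory_lock(%s)", (JOB_LOCK_KEY,))
        if not cur.fetchone()[0]:
            return  # another instance already holds the lock
        try:
            generate_report(conn)
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (JOB_LOCK_KEY,))

conn = psycopg2.connect("dbname=app")  # placeholder connection string
run_nightly_report(conn)
```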
Oh, and in addition to that, there are hybrid rendering technologies like PrimeFaces/JSF out there, which store the user's UI state on the server (in memory) whilst sending diffs back and forth, as well as making the client execute JavaScript in the browser for additional interactivity. Think along the lines of GWT, but even more complicated and way worse. A while back some people talked about how the productivity can actually be pretty nice; what I saw was 100% the opposite. More importantly, there's also no viable way to (easily) distribute this UI state across multiple instances, at least with the way the eldritch monolith is written. I've also seen Vaadin applications with the same problem.
Another factor that can cause situations like this to develop is having a tightly coupled codebase, where you cannot reasonably extract a piece of code into a separate deployment because it has 20+ dependencies on other services in the app and is called in about 40+ places (not even kidding). While you could try, before you know it you'd be sending 20 MB of JSON for simple data-fetching calls between applications (again, not kidding; I once saw close to 100 MB of network traffic between back end services and DB calls for a single page to load).
Those are just some of the issues. My suggestion would be to never build systems like that, no matter how "simple" they seem, and to instead stop being lazy and use Redis, RabbitMQ, or even just PostgreSQL/MySQL/MariaDB tables for ad-hoc queues (see the sketch below); anything is better than writing such messes. And if you're ever asked to help someone with anything that starts looking like the above, tell them that your schedule is sadly full, or at least very carefully consider your options.
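For what it's worth, here's a minimal sketch of the "DB table as an ad-hoc queue" idea, assuming PostgreSQL and psycopg2; the "jobs" table and its columns are made up. FOR UPDATE SKIP LOCKED is what lets several instances consume concurrently without a split brain:

```python
import psycopg2

def claim_next_job(conn):
    with conn:  # commits on success, rolls back on error
        with conn.cursor() as cur:
            cur.execute("""
                SELECT id, payload
                FROM jobs
                WHERE status = 'pending'
                ORDER BY created_at
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            """)
            row = cur.fetchone()
            if row is None:
                return None  # queue is empty (or all rows are claimed)
            job_id, payload = row
            # Real code would mark 'processing' and flip to 'done' only after
            # the work succeeds; going straight to 'done' keeps this short.
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s",
                        (job_id,))
            return payload
```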
> because they haven't adopted the "shared nothing" approach,
In practice, many web applications are stateful. The load balancer would see to it that clients keep talking to the same frontend. For larger applications it is important for cache locality.
> untold amounts of PHP code out there that stores files and other uploaded data
This is quite normal when you have some type of blob, and normally what networked file systems are used for.
> In practice, many web applications are stateful. The load balancer would see to it that clients keep talking to the same frontend. For larger applications it is important for cache locality.
In regards to front end resources, it shouldn't matter which instance you're talking to if all web servers serve copies of the same bundle, given that the resource hashes would match (outside of A/B testing scenarios). It's also nice to explore stateless APIs where possible and not have to worry about sticky sessions.
In many dynamically scalable setups, if you tried talking to API instance #27, you might discover that it's no longer present because the group of instances has been scaled down due to decreased load. Alternatively, you could discover that the instance you were talking to has crashed and has now been replaced by another one.
Hence, having something like Redis for caching data, or even a cluster of such services, becomes pretty important! Of course, there are ways to do this differently, such as taking advantage of CDN capabilities, but for the most part sticky sessions are a dated approach in quite a few cases. It's easier for everyone not to have to care about ensuring such persistence.
An excellent exception to this: geographically distributed systems, where even if you don't care about reaching that exact instance, you still want to be served by something in this data center instead of something halfway across the world.
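Coming back to the Redis point, a minimal sketch assuming the redis-py client; the key naming and TTL are made up. Since the session lives in Redis, any instance can serve any request, and no sticky sessions are required:

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379)
SESSION_TTL_SECONDS = 30 * 60  # arbitrary 30-minute expiry

def save_session(session_id, data):
    # Any instance can write the session...
    r.setex(f"session:{session_id}", SESSION_TTL_SECONDS, json.dumps(data))

def load_session(session_id):
    # ...and any other instance can read it back.
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None
```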
> This is quite normal when you have some type of blob, and normally what networked file systems are used for.
Nowadays, I'd argue that S3 (or compatibles like MinIO or Zenko) is one of the very few ways to do this properly, or perhaps GridFS in MongoDB: an abstraction on top of the file system that handles storing and accessing data as necessary. Underneath that abstraction, using a distributed or networked file system, or block/object storage (depending on the setup), is a perfectly good idea.
However, in general you should never use the file system directly for storing your blobs, regardless of whether they live locally or on a networked file system, as that is just asking for trouble. Things like:

- maximum files per folder, inode limits, folder nesting and file name length limits, maximum file sizes
- bad code that allows browsing directories other than the intended ones, or files that might get executed due to bad code/configuration
- case sensitivity that depends on the file system, encoding issues, special characters in file names that need escaping, reserved names in certain file systems

...and frankly too many more issues to list here.
So yes, it is "normal", but that doesn't make it okay. One also has to understand that often in a shared hosting environment there aren't good options on offer, versus just spinning up a MinIO container and using an S3 library in your app, as in the sketch below.
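A minimal sketch, assuming boto3 and a local MinIO container; the endpoint, credentials, bucket, and key are placeholders:

```python
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",  # MinIO's default API port
    aws_access_key_id="minioadmin",        # MinIO's default credentials
    aws_secret_access_key="minioadmin",
)

# Store the upload in a bucket instead of a local folder, so any
# application instance can fetch it back by key later.
s3.upload_file("invoice.pdf", "uploads", "user-123/invoice.pdf")

# Hand out a time-limited download link instead of serving the file yourself.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "uploads", "Key": "user-123/invoice.pdf"},
    ExpiresIn=3600,
)
```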
> I'd suggest that the word means whatever the majority of people assume it means
"Monolith" is a term of art in software engineering. You're in a discussion about software engineering. Saying "it means whatever people want it to mean!" is like talking to a bunch of chemistry people and saying most people think car springs when they hear "suspension".
Look, you didn't know what "monolith" means in a software context. Everyone has stuff they don't know, even in their field. Learn something and move on.
> "Monolith" is a term of art in software engineering. You're in a discussion about software engineering.
The problem is that most of these terms are loosely defined and evolve over time. And I do mean in practice, as opposed to some dictionary definition that gets lost in conversation.
REST? A set of architectural constraints, but nowadays most people just selectively pick aspects to implement and forget about things like HATEOAS, or hard problems like resource expansion; something like HAL never really became popular.
Microservices? Some think that those need to be "small", which isn't true, and yet time and time again we see people ending up with a pattern of service-per-developer instead of service-per-team, because they just aren't good at DDD and, unless they've been building systems like that for a while, aren't experienced with what works and what doesn't.
Agile? I'm sure you're aware of what the management and consulting industry did to the term; now we have something like SAFe, which goes directly against what Agile is and shouldn't be allowed to bear the name.
Cloud native apps? Who even knows; everyone keeps talking about them, and there are attempts to codify knowledge like the "12 Factor App", but I've seen plenty of projects where apps are treated just like they were 10 years ago: dependencies installed into containers at runtime, logs written to bind-mounted files, configuration coming from bind-mounted config files, and the local filesystem used for intermediate files, thus making them not scalable.
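For contrast, the 12-factor approach to just the configuration part is tiny; a minimal sketch in Python, with made-up variable names and defaults:

```python
# Settings come from the environment rather than bind-mounted config files.
import os

DATABASE_URL = os.environ["DATABASE_URL"]  # fail fast if it's missing
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
ENABLE_REPORTING = os.environ.get("ENABLE_REPORTING", "false") == "true"
```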
> Look, you didn't know what "monolith" means in a software context. Everyone has stuff they don't know, even in their field. Learn something and move on.
Another person suggested the distinction between "monolith" applications (referring to the codebase) and "singleton" applications (describing the constraint that only a single instance is workable at a time). That suggestion is actually useful, since now we have two precise concepts to work with.
Your advice isn't useful, because while you're right about the "proper" definition, you brush aside the fact that people do misuse the term "monolith", and you don't engage with the argument that we need a proper set of terms to describe all sorts of applications in order to avoid this.
If we don't address this, monolithic apps become some sort of boogeyman due to claims that they aren't scalable, when that isn't true of all monolithic apps (since the term should only describe how the code is deployed), just of the subset of arguably badly written ones, as explained in my other comment.
So essentially:
- a single codebase that can only run as a single instance --> could be called "singleton apps"
- a single codebase that can be deployed with multiple concurrent instances --> the proper definition of "monolith"
Without this, you cannot say something along the lines of: "Please don't make this monolith into a singleton app by using local instance memory for global business state." and concisely explain why certain things are a bad idea.