It's easier not to mess up table based filters using explicit semi-join operators (eg. in, not in, exists) instead of using regular joins because joins can introduce duplicates.
Give me 'any join' operation - ie. just select the first value instead of all, and I'll happily use joins more. They are actually more intuitive.
It's not that relational algebra is untintuitive. It's because standard SQL sucks.
My problem with semijoins is that the semantics of "what exactly does a SELECT evaluate to inside an expression" are sometimes murky and might vary across databases.
what the heck is the result of evaluating the inner query, in the outer expression?
Maybe I am missing something, but the exact meaning to vary a lot across different databases. Some seem to have a standalone "table" data type, while others don't.
I might be missing something as I'm self-taught, but the inner select specifies a set, and you "just" do a simple set membership test?
How it's implemented is as usual up to the database server implementation. Ones I've used creates a temporary table (like it does in so many other cases), and as such EXISTS is usually faster.
But I wouldn't rely on this when moving to another implementation, and use the query planner to see, just as I'd view the assembly output when moving to a new compiler.
Again, I don't have tons of experience, so concrete (counter) examples are welcome.
and their combined thousands of votes and dozens of answers, all full of awkward workaround or ill-performing or specialised-for-one-database-engine code for this common and desirable thing which would be trivial with a couple of boring loops in Python.
Give me 'any join' operation - ie. just select the first value instead of all, and I'll happily use joins more. They are actually more intuitive.
It's not that relational algebra is untintuitive. It's because standard SQL sucks.