Speaking for myself, the first form is more natural, even if it’s a useless cat, because I’m always cat-ing files to see their structure, then progressively tacking on different transforms, and finally sending the result to whatever I want as output.
It’s so ingrained, I’m more likely than not to just write it out that way even when I know exactly what I’m doing from the onset.
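For anyone who hasn't worked this way, here is a rough sketch of that incremental style. The file name and its contents are made up for illustration; only cat, sort, and uniq are real.

```shell
# Hypothetical sample data: one HTTP status code per line.
workdir=$(mktemp -d)
printf '200\n404\n200\n500\n200\n' > "$workdir/status.txt"

# Step 1: just look at the file.
cat "$workdir/status.txt"

# Step 2: tack on a transform (count occurrences of each code).
cat "$workdir/status.txt" | sort | uniq -c

# Step 3: add another transform and send the result to a file.
cat "$workdir/status.txt" | sort | uniq -c | sort -rn > "$workdir/counts.txt"

# The "cat-free" equivalent, with the input redirected instead.
sort < "$workdir/status.txt" | uniq -c | sort -rn
```

Each step is the previous command line recalled with one more stage appended, which is why the leading cat tends to stay put.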
Generally speaking, if you don't have an idea of how big the file is, or it would take up too much real estate in your terminal window, sure. 100%. It was just an example.
Lots of times we sort of know what we are working with, but don't remember the particulars especially
I really recommend folks use "less" over cat, especially keyboard-oriented folks. Different terminal emulators don't always have the scroll behavior I want, nor do they always allow me to search the file I'm looking at. "less" does all those things in nearly every environment, no matter the terminal emulator, and has other wonderful options to boot (chopping long lines so they don't wrap can be nice for logs, line numbers can be VITAL, etc.).
I still uselessly use cat though, it's such a nice way to build a pipeline.
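A small sketch of the less options mentioned above. The helper name view_log is made up; -S and -N are standard less flags.

```shell
# view_log: hypothetical helper name, just a thin wrapper around less.
view_log() {
    # -S chops long lines instead of wrapping them.
    # -N shows a line number in front of every line.
    less -S -N "$@"
}

# Interactive usage (the path is only an example):
#   view_log /var/log/syslog
# Inside less: /pattern searches forward, n jumps to the next match, q quits.
```

The same flags work when less sits at the end of a pipeline, so it can also stand in for the terminal's own scrollback while a pipeline is being built up.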
As a scientist who cares about reproducibility, the big difference between the "useless cat" and providing the input file name on the command line is that, in the latter case, the program can capture that file name and reproduce it. That is harder when using stdin.
Many of my programs and scripts start output with the line:
# cmd arg1 arg2 arg3 ...
and simply echo back lines that start with '#'. That way, I have an internal record of the program that was run and the data file that was read (as well as previous parts of the analysis chain).
And R's read.table() ignores lines starting with '#' by default, so the record is there, but does not affect later analyses.
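A minimal sketch of that convention as a shell function. The name double_col and the doubling step are placeholders for whatever the real analysis does; the only point is the '#' header handling.

```shell
# double_col: hypothetical example filter. Usage: double_col FILE
double_col() {
    # Start the output with a record of the command and its arguments,
    # which includes the input file name.
    printf '# double_col %s\n' "$*"

    # Echo existing '#' lines unchanged so earlier steps stay on record;
    # for non-blank data lines, double the first field (stand-in transform).
    awk '/^#/ { print; next } NF { $1 = $1 * 2 } { print }' "$1"
}
```

Run on a file that already carries a '#' header from an earlier step, the output has the new header line first, followed by the older header and the transformed data. Had the data arrived on stdin instead, the function would have had no file name to record.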