
I had to work with a 2.1 GB JSON file today. It was really striking that, on a machine with 4 GB free RAM, I could find literally nothing that could work with the whole thing in memory as JSON, including every GUI editor I looked at, Node, jq, etc.


I routinely use “less -n” to open terabyte-sized text files on a machine with a few GB of RAM. It can even do regex search.

Sometimes good old tools beat the new shiny turd.


less works fine on files with enormous numbers of lines, but not on lines that are extremely long. Open up a 1 GB file with lines that are each 10 KB long and you'll see just how slow the pager gets!
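
To see this for yourself, here's a minimal sketch that generates such a pathological file; the filename and sizes are just illustrative:

    # Write a ~1 GB file whose lines are each ~10 KB, to reproduce
    # the slow-pager behavior described above.
    line = "x" * 10_000 + "\n"
    with open("longlines.txt", "w") as f:
        for _ in range(100_000):  # ~1 GB total
            f.write(line)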


IPython, plus a bit of magic around json.loads to get it to intern the object keys? (Assuming there's enough repetition in the keys for the memory savings to get you within your budget.)

I had to do this once for a large piece of JSON that ballooned when loaded into RAM. In my case I could wait until after json.loads to do the interning, but I think you can do it on the fly with object_hook or object_pairs_hook, as sketched below.
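
A minimal sketch of the on-the-fly variant, assuming the input is one big JSON document (the filename is a placeholder). object_pairs_hook receives each decoded object's key/value pairs before the dict is built, so interning the keys there makes repeated key strings share a single str object:

    import json
    import sys

    def intern_keys(pairs):
        # Intern each key so repeated key strings across millions of
        # objects all point at one shared str instead of separate copies.
        return {sys.intern(k): v for k, v in pairs}

    with open("big.json") as f:  # placeholder filename
        data = json.load(f, object_pairs_hook=intern_keys)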

Alternatively, Rust + Serde, doing as bare-bones a computation as possible to get it working, again with an eye on whether the decoded form will fit. (Hopefully it will, since Serde-ing into a struct should make the keys cost essentially 0 bytes, though some memory would still go to pointers and the like.)

Alternatively, a 20 GiB swap file.



vim is my go-to for large text files. Works great even on TB-sized monsters.


Just make sure your syntax highlighting is off if it's something like XML.


Sometime in the last decade, I installed Mathematica, which involved downloading and running a 1 GB shell script!

Naturally I was curious what such a shell script consisted of, but most tools choked on it. IIRC, even vim and less couldn't open it, but emacs could.


I believe emacs chunks large files and then lazily loads them to enable this. I remember having to mess around with a specific mode to get it to work in the past, but I think it's included in base emacs now.


I can highly recommend Epsilon (emacs-like) from lugaru.com.


Need to use something like less or tail


What were you trying to do with it? I often work with jq on very large JSON files without much issue.


In this case it was a single giant object from which I was trying to extract a specific list of keys (with values of unknown length). Every method I found with jq died with an out-of-memory error.


Did you try using streaming?

https://stedolan.github.io/jq/manual/#Streaming

You have to rewrite your criteria, but it's what's allowed me to process huge JSON files without issues.
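
The same streaming idea works outside jq too; here's a hedged Python sketch using the third-party ijson library (an assumption, not something from this thread), which walks a giant top-level object as a stream of key/value pairs so only one value is in memory at a time. The filename and key names are placeholders:

    import ijson  # third-party streaming JSON parser: pip install ijson

    WANTED = {"name", "status"}  # placeholder: the keys to extract

    with open("big.json", "rb") as f:  # placeholder filename
        # kvitems("") yields the top-level object's (key, value) pairs
        # one at a time, so the whole document never has to fit in RAM.
        for key, value in ijson.kvitems(f, ""):
            if key in WANTED:
                print(key, value)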



