
I had to work with a 2.1 GB JSON file today. It was really striking that, on a machine with 4 GB free RAM, I could find literally nothing that could work with the whole thing in memory as JSON, including every GUI editor I looked at, Node, jq, etc.


I routinely use “less -n” to open terabyte-sized text files on a machine with a few GB of RAM. It can even do regex search.

Sometimes good old tools beat the new shiny turd.


less works fine on files with enormous numbers of lines, but not on lines that are extremely long. Open up a 1 GB file with lines that are each 10 KB long and you'll see just how slow the pager gets!
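
To see this for yourself, here's a minimal sketch that generates such a pathological file; the filename and sizes are just illustrative:

    # Write a ~1 GB file whose lines are each ~10 KB, to reproduce
    # the slow-pager behavior described above.
    line = "x" * 10_000 + "\n"
    with open("longlines.txt", "w") as f:
        for _ in range(100_000):  # ~1 GB total
            f.write(line)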


IPython, plus a bit of magic around json.loads to get it to intern the object keys? (Assuming there's enough repetition in the keys for the memory savings to get you within your budget.)

I had to do this once for a large piece of JSON that ballooned when loaded into RAM. In my case I could wait until after json.loads to do the interning, but I think you can do it on the fly with object_hook or object_pairs_hook, as sketched below.
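
A minimal sketch of the on-the-fly variant, assuming the input is one big JSON document (the filename is a placeholder). object_pairs_hook receives each decoded object's key/value pairs before the dict is built, so interning the keys there makes repeated key strings share a single str object:

    import json
    import sys

    def intern_keys(pairs):
        # Intern each key so repeated key strings across millions of
        # objects all point at one shared str instead of separate copies.
        return {sys.intern(k): v for k, v in pairs}

    with open("big.json") as f:  # placeholder filename
        data = json.load(f, object_pairs_hook=intern_keys)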

Alternatively, Rust + Serde, doing as bare-bones a computation as possible to get it working, again with an eye on whether the decoded form will fit. (Hopefully it will, since Serde-ing into a struct should make the keys cost essentially 0 bytes, though some memory would still go to pointers and the like.)

Alternatively, a 20 GiB swap file.



vim is my go-to for large text files. Works great even on TB-sized monsters.


Just make sure your syntax highlighting is off if it's something like XML.


Sometime in the last decade, I installed Mathematica, which involved downloading and running a 1 GB shell script!

Naturally I was curious what such a shell script consisted of, but most tools choked on it. IIRC, even vim and less couldn't open it, but emacs could.


I believe emacs chunks large files and then lazily loads them to enable this. I remember having to mess around with a specific mode to get it to work in the past, but I think it's included in base emacs now.


I can highly recommend Epsilon (emacs-like) from lugaru.com.


Need to use something like less or tail


What were you trying to do with it? I often work with jq on very large JSON files without much issue.


In this case it was a single giant object from which I was trying to extract a specific list of keys (with values of unknown length). Every method I found with jq died with an out-of-memory error.


Did you try using streaming?

https://stedolan.github.io/jq/manual/#Streaming

You have to rewrite your criteria, but it's what's allowed me to process huge JSON files without issues.
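
The same streaming idea works outside jq too; here's a hedged Python sketch using the third-party ijson library (an assumption, not something from this thread), which walks a giant top-level object as a stream of key/value pairs so only one value is in memory at a time. The filename and key names are placeholders:

    import ijson  # third-party streaming JSON parser: pip install ijson

    WANTED = {"name", "status"}  # placeholder: the keys to extract

    with open("big.json", "rb") as f:  # placeholder filename
        # kvitems("") yields the top-level object's (key, value) pairs
        # one at a time, so the whole document never has to fit in RAM.
        for key, value in ijson.kvitems(f, ""):
            if key in WANTED:
                print(key, value)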



