It's interesting that AlphaFold, which generates 3D structures from 1D protein sequences, is so fancy and complicated in its internal data representation, while this paper basically just voxelizes the input data and takes a bunch of pictures from various angles to build its training set.
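
To make the "voxelize and take pictures" idea concrete, here is a rough sketch of how I picture it. This is not the paper's actual pipeline: the function names `voxelize` and `axis_views`, the grid resolution, and the random point cloud are all made up for illustration, and the three axis-aligned silhouettes are only a crude stand-in for rendering from arbitrary camera angles.

```python
import numpy as np

def voxelize(points, resolution=32):
    """Map an (N, 3) point cloud to a boolean occupancy grid (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero when an axis is flat
    # Scale each coordinate into [0, resolution - 1] and floor to a cell index.
    idx = np.floor((points - lo) / span * (resolution - 1)).astype(int)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True  # mark occupied cells
    return grid

def axis_views(grid):
    """Return three orthographic silhouettes, one per grid axis (hypothetical helper)."""
    return [grid.any(axis=a) for a in range(3)]

# Assumed toy input: a random blob of points standing in for real 3D data.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3))
grid = voxelize(cloud, resolution=16)
views = axis_views(grid)
```

Each entry in `views` is a 2D boolean image, so the "training set" here is just a handful of flat silhouettes per object, which is about as far from a learned pairwise representation as you can get.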