We can't expect massive improvements in computer performance anymore, so the day will never come when a household computer can finish training a model equivalent to GitHub Copilot in practical time.
They can just add more cores and more power. With modern languages like Rust making multithreading more accessible, I expect we will double down on this. You could also crowd source this, some distributed application where everyone puts their home machines towards training.
> You could also crowd source this, some distributed application where everyone puts their home machines towards training.
That's how Leela Chess Zero (Lc0) replicated AlphaZero's performance.
In fact, this is not that difficult. Assuming you have the means to orchestrate it, all it takes is loading the weights and a batch, running backprop, and submitting the resulting gradients to a central system, which aggregates the gradient updates, updates the whole network, and pushes new weights (kinda like how Bitcoin creates a new block).
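To make the loop above concrete, here's a rough single-process sketch. Everything in it is made up for illustration: a toy linear model with squared error stands in for the real network, and `worker_gradient`, `server_round`, and `make_batch` are hypothetical names, not any real project's API. In an actual system the workers would be remote machines and the orchestration (networking, stale or malicious updates) is the hard part this skips.

```python
import random

def worker_gradient(weights, batch):
    """Worker side: gradient of mean squared error for y = w . x on one batch."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return grad

def server_round(weights, batches, lr=0.1):
    """Server side: collect one gradient per worker, average, update, publish."""
    # In a real deployment these calls would run remotely and in parallel.
    grads = [worker_gradient(weights, b) for b in batches]
    avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(weights))]
    return [w - lr * a for w, a in zip(weights, avg)]

random.seed(0)
true_w = [2.0, -1.0]  # toy target the "network" should learn

def make_batch(n):
    """Hypothetical data source: n noiseless samples from the toy target."""
    batch = []
    for _ in range(n):
        x = [random.uniform(-1, 1), random.uniform(-1, 1)]
        batch.append((x, sum(t * xi for t, xi in zip(true_w, x))))
    return batch

weights = [0.0, 0.0]
for _ in range(300):
    # Four simulated workers, each with its own batch of 8 samples.
    weights = server_round(weights, [make_batch(8) for _ in range(4)])
print(weights)
```

The printed weights should end up close to `true_w`, since every round is just an ordinary SGD step whose gradient happens to be computed in four pieces.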
This is no different from gradient accumulation, just "distributed". In fact, the system could offload a large number of batches, because the update each worker returns takes O(1) space with respect to the batch size n; it's just that the constant is the size of the network.
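A quick way to convince yourself of both points, again with a toy linear model and made-up data (`mse_gradient` is a hypothetical helper, and the equal-sized batches are an assumption; with unequal batches you'd weight the average by batch size):

```python
def mse_gradient(weights, batch):
    """Gradient of mean squared error for y = w . x over one batch."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return grad

weights = [0.5, -0.25, 1.0]
# Eight deterministic toy samples: (features, label).
data = [([float(i), float(i % 3), 1.0], float(2 * i - 1)) for i in range(8)]

# "Distributed": two workers each handle half the data, the server averages.
batches = [data[0:4], data[4:8]]
grads = [mse_gradient(weights, b) for b in batches]
distributed = [sum(g[i] for g in grads) / len(grads) for i in range(len(weights))]

# "Local": one machine computes the gradient over all eight samples at once.
single = mse_gradient(weights, data)

# Same update either way (up to floating-point rounding).
matches = all(abs(a - b) < 1e-9 for a, b in zip(distributed, single))

# What a worker sends back has one entry per weight, whatever the batch size.
small_update = mse_gradient(weights, data[:1])
large_update = mse_gradient(weights, data)
print(matches, len(small_update), len(large_update))
```

So a worker can chew through as large a batch as it likes and the upload cost stays fixed at one number per parameter.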