"It is noteworthy that ChatGPT didn’t design the CPU in its entirety. Yes, NYU researchers used ChatGPT to translate the ‘plain English’ that describes a chip and its capabilities to a Hardware Descriptor Language (HDL) such as Verilog."
"It its entirety" is doing a lot of work - I would say ChatGPT didn't design anything, period. I strongly recommend reading the actual chat logs from the data[1]. ChatGPT implemented a very detailed design, and it needed a lot of help to do so. This is still very impressive, but it ultimately reflects GPT-4 being a semi-reliable knowledge base and an excellent translator, whether it's between human languages or human-computer language. Despite the article's implications, GPT-4 doesn't democratize hardware design, because it requires specialized Verilog knowledge to fix ChatGPT's bugs. It's not just the syntax errors, there are very subtle bugs around binary arithmetic, swapping registers, etc. ChatGPT does not help in these cases unless you specifically identify the bug.
It is undeniably useful for certain hardware designers, and worth researching further. But there is absolutely no need for an inflammatory headline like this one. Tom's Hardware is simply lying to readers. I'll let the researchers speak for themselves[2]:
> Challenges: While using a conversational LLM to assist in designing and implementing a hardware device can be beneficial overall, it is clear that the technology still needs improvement. The ChatGPT LLM produced errors in aspects of both the specification and implementation, requiring intervention by the experienced hardware designer. It seems unlikely, then, that the model could produce designs without assistance (i.e. in the zero-shot setting). Further, we observed deficiencies when attempting to use the model for producing verification code.
> Opportunities: Still, when a human is paired with ChatGPT-4, the language model seems to be a ‘force multiplier’, allowing for rapid design space exploration and iteration. We demonstrated this in our case study, where it helped to architect and implement a novel processor. In general, we observed that ChatGPT-4 could produce functionally correct code, which in general could free up designer time when implementing common modules and thus improving developer productivity.
This sort of reminds me of all of the 'high-level' hardware languages that have mostly failed to catch on. As an ASIC designer, you're pretty much never limited by your typing speed (and much more frequently limited by tool runtimes for things like synthesis, place & route, and timing analysis), and across a large modern chip the difficult part is reasoning about how big distributed systems behave. A better/faster P&R tool seems ~1000x more useful than a buggy RTL generator.
This agrees with my experience using GPT for software engineering. In nontrivial cases it usually bullshits nonexistent APIs and so on, but as an experienced dev I find that bullshit often still points me in the right direction. This is particularly true for languages that have poor documentation but a substantial presence on GitHub.
GPUs and generative AI today are where solar and batteries were 10 years ago. It's hard to see the hockey stick from here, but we're on it.
Try preprocessing your data to extract the important information. GPT doesn't really care about correct syntax. I once got it to successfully analyze and explain a big Terraform database by providing just the filenames and definitions in those files (basically a copy from a project-wide search result window for "resource").
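The preprocessing step described above can be sketched in a few lines of Python. This is a hypothetical reconstruction of the approach (the function name, regex, and output format are mine, not the commenter's): walk the project, pull out just the `resource` block headers from `.tf` files, and emit a compact filename-plus-definition summary suitable for pasting into a prompt.

```python
import re
from pathlib import Path

# Matches Terraform resource headers like:  resource "aws_s3_bucket" "logs" {
RESOURCE_RE = re.compile(r'^\s*resource\s+"([^"]+)"\s+"([^"]+)"', re.MULTILINE)

def summarize_terraform(root: str) -> list[str]:
    """Collect 'filename: resource type.name' lines from all .tf files,
    mimicking a project-wide search for "resource"."""
    summary = []
    for path in sorted(Path(root).rglob("*.tf")):
        for rtype, rname in RESOURCE_RE.findall(path.read_text()):
            summary.append(f"{path}: resource {rtype}.{rname}")
    return summary
```

The result is orders of magnitude smaller than the raw files, which is the whole trick: the model doesn't need parseable HCL, just enough structure to reason about what exists and how it's named.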
"It its entirety" is doing a lot of work - I would say ChatGPT didn't design anything, period. I strongly recommend reading the actual chat logs from the data[1]. ChatGPT implemented a very detailed design, and it needed a lot of help to do so. This is still very impressive, but it ultimately reflects GPT-4 being a semi-reliable knowledge base and an excellent translator, whether it's between human languages or human-computer language. Despite the article's implications, GPT-4 doesn't democratize hardware design, because it requires specialized Verilog knowledge to fix ChatGPT's bugs. It's not just the syntax errors, there are very subtle bugs around binary arithmetic, swapping registers, etc. ChatGPT does not help in these cases unless you specifically identify the bug.
It is undeniably useful for certain hardware designers, and worth doing more research. But there is absolutely no need for an inflammatory headline like this. Tom's Hardware is simply lying to readers. I'll let the researchers speak for themselves[2]:
> Challenges: While using a conversational LLM to assist in designing and implementing a hardware device can be beneficial overall, it is clear that the technology still needs improvement. The ChatGPT LLM produced errors in aspects of both the specification and implementation, requiring intervention by the experienced hardware designer. It seems unlikely, then, that the model could produce designs without assistance (i.e. in the zero-shot setting). Further, we observed deficiencies when attempting to use the model for producing verification code.
> Opportunities: Still, when a human is paired with ChatGPT-4, the language model seems to be a ‘force multiplier’, allowing for rapid design space exploration and iteration. We demonstrated this in our case study, where it helped to architect and implement a novel processor. In general, we observed that ChatGPT-4 could produce functionally correct code, which in general could free up designer time when implementing common modules and thus improving developer productivity.
[1] https://zenodo.org/records/7953725
[2] https://arxiv.org/abs/2305.13243