That's a pretty interesting idea! I guess 160+ is sort of doing some of that for us - it compiles to SQL WHERE clauses, right - but generally, we found good results giving it a SQL dialect directly.
I think some of the reason is that there's so much coverage of writing SQL in its training set.
Yes! This works really well from Sonnet 4.5 onwards, in our experience. Sonnet 4.0 was a little rocky - we had to give it tons of documentation - but by now it works without much effort.
One thing that works very well is just giving it one or two example valid programs/statements in the custom language. It usually picks up what you're getting at very quickly.
When it slips up, you get good signal you can capture for improving the language. If you're doing things in a standard agent-y loop, a good error message also helps it course-correct.
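To make the "examples plus good error messages" loop concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the toy filter language (`field = "value"`), the field names, the `Generator` type standing in for a model call, and the retry policy. It is not how any particular product does it.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Generator stands in for a model call (hypothetical): it takes a prompt
// and returns a candidate statement in the custom language.
type Generator func(prompt string) string

// Validate checks a made-up filter language of the form `field = "value"`.
// The error text is written so that it tells the caller how to fix the input.
func Validate(stmt string) error {
	parts := strings.SplitN(stmt, "=", 2)
	if len(parts) != 2 {
		return errors.New(`expected one comparison like: status = "open"`)
	}
	field := strings.TrimSpace(parts[0])
	value := strings.TrimSpace(parts[1])
	if field != "status" && field != "owner" {
		return fmt.Errorf("unknown field %q; valid fields are status, owner", field)
	}
	if len(value) < 2 || !strings.HasPrefix(value, `"`) || !strings.HasSuffix(value, `"`) {
		return fmt.Errorf("value %s must be a double-quoted string", value)
	}
	return nil
}

// GenerateWithFeedback tries up to maxAttempts times. After each failure it
// appends the rejected statement and its error to the prompt, so the next
// attempt can course-correct. It also returns every error seen, which is the
// signal you can capture for improving the language.
func GenerateWithFeedback(gen Generator, task string, maxAttempts int) (string, []error, error) {
	var seen []error
	prompt := task
	for i := 0; i < maxAttempts; i++ {
		stmt := gen(prompt)
		err := Validate(stmt)
		if err == nil {
			return stmt, seen, nil
		}
		seen = append(seen, err)
		prompt = fmt.Sprintf("%s\nPrevious attempt: %s\nError: %v", task, stmt, err)
	}
	return "", seen, fmt.Errorf("no valid statement after %d attempts", maxAttempts)
}

func main() {
	// Fake generator: wrong field name first, fixed once it sees an error.
	fake := func(prompt string) string {
		if strings.Contains(prompt, "Error:") {
			return `status = "open"`
		}
		return `state = "open"`
	}
	stmt, errs, err := GenerateWithFeedback(fake, "Find open tickets", 3)
	fmt.Println(stmt, len(errs), err)
}
```

The interesting part is probably less the loop than the error strings: each one names what was wrong and lists what would have been accepted.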
That’s really interesting. The “one or two examples + good error messages” part feels especially important. It suggests the limiting factor may be less finetuning and more whether the model is given a tight representation and a feedback loop it can recover from.
The AEPs were originally based off Google AIPs, but we did a hard fork and have altered a lot since then. For one thing, the AIPs were entirely protobuf focused, while we're focusing equally on protobuf + OpenAPI.
The CRUD methods are great examples where we deviate from the AIPs.
"Cheap" how? I have a friend who works on Seattle's bus planning. Removing a stop is a _lot_ of political work. When an elderly person depends on that bus stop being within a block so they can get to their doctor, and you're proposing to move it six blocks further away, that's essentially a _political_ cost.
It might be better for system throughput, and those benefits may even outweigh the misery put on that one person. But in the US, we largely sort that out by using cool-down times, hearings, and "community input."
Net result, according to my friend at least, is that bus stops feel _very_ sticky and hard to change.
I feel like I see an independent low-noise phone project like, every 3 months. Clearly there is some latent demand here. I wonder why the big players (Google, Apple, Samsung, HTC) haven't made a big-corp product for this market.
I am always reluctant to jump on with these independent ambitious projects. The first version is understandably rough, and the company seems to fold before they get to a second or third version.
But maybe advances in manufacturing in China are making high-quality, small-batch products like this more tractable?
Same reason Acura stopped making small cars like the Integra/RSX: costs scale more slowly than revenue as car size increases, so selling to the small car market segment leaves potential profit unearned. Even if the small car segment is a majority, it's better to make a higher profit per unit on fewer unit sales if your primary goal is to min/max labor/profit.
(Small phones, unlike small cars, also have costs in UI development to maintain their form factor’s OS support, which can create an additional pressure to withhold devices for a viable and profitable market.)
Big corps were the ones to move away from BlackBerry en masse towards a BYOD system. Before that, Samsung and Nokia both had a series of keyboard phones running Windows Mobile 6 or Symbian OS. I had the Samsung BlackJack II in 2008.
No, the demand is negligible. It's just typical Hacker News people who want to suddenly become productive, Silicon Valley hustle-trope style, or people who want to change their damaging habits in a day, so instead of uninstalling TikTok, which takes 15 seconds to do, they will spend money on a separate device.
"But accepting the full S3Client here ties UploadReport to an interface that’s too broad. A fake must implement all the methods just to satisfy it."
This isn't really true. Your mock implementation can embed the interface, but only implement the one required method. Calling the unimplemented methods will panic, but that's not unreasonable for mocks.
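A minimal sketch of that embedding trick follows. The `S3Client` method set here is hypothetical (the quoted article's real interface isn't shown in this thread), and `callPanics` is just a helper for the demo.

```go
package main

import "fmt"

// S3Client is a hypothetical broad interface standing in for the one
// discussed in the quoted article.
type S3Client interface {
	PutObject(key string, data []byte) error
	GetObject(key string) ([]byte, error)
	DeleteObject(key string) error
}

// UploadReport only needs PutObject but accepts the full interface.
func UploadReport(c S3Client, name string, body []byte) error {
	return c.PutObject("reports/"+name, body)
}

// mockS3 embeds S3Client, so the compiler treats it as implementing every
// method. Only PutObject is actually defined; the embedded value stays nil.
type mockS3 struct {
	S3Client
	puts map[string][]byte
}

// PutObject shadows the promoted method from the embedded interface.
func (m *mockS3) PutObject(key string, data []byte) error {
	m.puts[key] = data
	return nil
}

// callPanics reports whether f panics.
func callPanics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	m := &mockS3{puts: map[string][]byte{}}

	// The one implemented method works as expected.
	fmt.Println(UploadReport(m, "q1.csv", []byte("a,b")))

	// Any other method is forwarded to the nil embedded interface and
	// panics at run time with a nil pointer dereference.
	fmt.Println(callPanics(func() { m.GetObject("q1.csv") }))
}
```

Note that the compiler never complains about the missing methods; the failure only shows up when one of them is called.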
Not to mention, introducing all the permutations of methods as separate interfaces on the "consumer side" means extreme combinatorial explosion of interfaces. It is far better to judge the most common patterns and make single-method interfaces for these on the provider side.
Lots of such frequently-quoted Go "principles" are invalid and are regularly broken within the standard library and many popular Go projects. And if you point them out, you will be snootily advised by the Go gurus on /r/golang or even here on HN that every principle has exceptions. (Even if there are tens of thousands of such exceptions).
At work we use it heavily. You don't really see "a zillion interfaces" after a while, only the set of dependencies of a package, which is easy to read and easy to understand.
"makes it hard to come up with good names" is not really a problem: if you have a `CreateRequest` method, you name the interface `RequestCreator`. If you have a request CRUD interface, it's probably a `RequestRepository`.
The benefits outweigh the drawbacks ten to one. The most rewarding thing about this pattern is how easy it is to split up large implementations, and _keep_ them small.
Any method you forget to override from the embedded interface gives the false impression that you can call any method on mockS3.
Most of the time, the code inside the test will be:
// embedded S3Client not properly initialized
mock := mockS3{}
// somewhere inside the business logic
s3.UploadReport(...) // surprise
Go is flexible: you can define a complete interface on the producer side, and consumers can still use their own interface with only the methods they require.
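As a rough sketch of both styles living side by side: the names `RequestRepository` and `RequestCreator` are borrowed from the naming example upthread, while `memoryRepo`, `Submit`, and `creatorFunc` are made up for illustration.

```go
package main

import "fmt"

// Producer side: the complete interface, published next to the implementation.
type RequestRepository interface {
	CreateRequest(name string) (int, error)
	GetRequest(id int) (string, error)
	DeleteRequest(id int) error
}

// memoryRepo is a toy in-memory implementation (hypothetical).
type memoryRepo struct {
	next  int
	items map[int]string
}

func NewMemoryRepo() *memoryRepo {
	return &memoryRepo{next: 1, items: map[int]string{}}
}

func (r *memoryRepo) CreateRequest(name string) (int, error) {
	id := r.next
	r.next++
	r.items[id] = name
	return id, nil
}

func (r *memoryRepo) GetRequest(id int) (string, error) {
	name, ok := r.items[id]
	if !ok {
		return "", fmt.Errorf("request %d not found", id)
	}
	return name, nil
}

func (r *memoryRepo) DeleteRequest(id int) error {
	delete(r.items, id)
	return nil
}

// Consumer side: a package that only creates requests declares only that.
type RequestCreator interface {
	CreateRequest(name string) (int, error)
}

// Submit depends on the narrow interface, not on the whole repository.
func Submit(c RequestCreator, name string) (int, error) {
	return c.CreateRequest(name)
}

// creatorFunc lets a test supply a one-line fake with no embedding.
type creatorFunc func(name string) (int, error)

func (f creatorFunc) CreateRequest(name string) (int, error) { return f(name) }

// Compile-time checks: the same concrete type satisfies both interfaces
// implicitly, without declaring either.
var (
	_ RequestRepository = (*memoryRepo)(nil)
	_ RequestCreator    = (*memoryRepo)(nil)
)

func main() {
	repo := NewMemoryRepo()
	id, err := Submit(repo, "rotate keys")
	fmt.Println(id, err)

	fake := creatorFunc(func(string) (int, error) { return 42, nil })
	fmt.Println(Submit(fake, "anything"))
}
```

Because interface satisfaction is implicit, the producer's broad interface and the consumer's narrow one never need to know about each other.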
Compare https://docs.firetiger.com with https://docs.firetiger.com/llms.txt and https://docs.firetiger.com/llms-full.txt for a real example.