Hacker News | spenczar5's comments

Sure, llms.txt is a convention for this.

Compare https://docs.firetiger.com with https://docs.firetiger.com/llms.txt and https://docs.firetiger.com/llms-full.txt for a real example.


Why does the article say that’s useless?

It’s not useful if it’s never read by agents - that’s the premise of the statement.

But will agents know to send an "Accept: text/markdown" header?

Did you consider looking to see what they actually already do? There's a reason this works.
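For anyone wondering what the server side of that negotiation could look like, here is a minimal sketch in Go. It is only an illustration: the handler, the page content, and the deliberately naive header parsing (no q-values, no wildcards) are all assumptions, not how any particular docs site does it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// wantsMarkdown reports whether the Accept header lists text/markdown.
// Naive on purpose: it ignores q-values and wildcards such as */*.
func wantsMarkdown(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		mediaType := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if strings.EqualFold(mediaType, "text/markdown") {
			return true
		}
	}
	return false
}

// docsHandler serves the same (made-up) page as Markdown or as HTML.
func docsHandler(w http.ResponseWriter, r *http.Request) {
	// Tell caches that the body depends on the Accept header.
	w.Header().Set("Vary", "Accept")
	if wantsMarkdown(r.Header.Get("Accept")) {
		w.Header().Set("Content-Type", "text/markdown; charset=utf-8")
		fmt.Fprint(w, "# Docs\n\nHello, agent.\n")
		return
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprint(w, "<h1>Docs</h1><p>Hello, human.</p>")
}

// fetch runs the handler in-process and returns the response Content-Type.
func fetch(accept string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if accept != "" {
		req.Header.Set("Accept", accept)
	}
	rec := httptest.NewRecorder()
	docsHandler(rec, req)
	return rec.Header().Get("Content-Type")
}

func main() {
	fmt.Println(fetch("text/markdown, text/html;q=0.8"))
	fmt.Println(fetch("text/html"))
}
```

A client that never sends the header just gets HTML, so the fallback costs nothing.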

That's a pretty interesting idea! I guess 160+ is sort of doing some of that for us - it compiles to SQL WHERE clauses, right - but generally, we found good results giving it a SQL dialect directly.

I think some of the reason is that there's so much coverage of writing SQL in its training set.


Good point, that makes a lot of sense to use a tool that has plenty of sample usage data available.

Yes! This works really well from Sonnet 4.5 onwards, in our experience. Sonnet 4.0 was a little rocky - we had to give it tons of documentation - but by now it works without much effort.

One thing that works very well is just giving it one or two example valid programs/statements in the custom language. It usually picks up what you're getting at very quickly.

When it slips up, you get good signal you can capture for improving the language. If you're doing things in a standard agent-y loop, a good error message also helps it course-correct.


That’s really interesting. The “one or two examples + good error messages” part feels especially important. It suggests the limiting factor may be less finetuning and more whether the model is given a tight representation and a feedback loop it can recover from.

Author here! I am pretty jazzed about these ideas and happy to dig into more detail than a blog post allows.

Is this a clone of the Google AIPs? Like https://aep.dev/160/ seems to just copy https://google.aip.dev/160.

The AEPs were originally based off Google AIPs, but we did a hard fork and have altered a lot since then. For one thing, the AIPs were entirely protobuf focused, while we're focusing equally on protobuf + OpenAPI.

The CRUD methods are great examples where we deviate from the AIPs.


"Cheap" how? I have a friend who works on Seattle's bus planning. Removing a stop is a _lot_ of political work. When an elderly person depends on that bus stop being within a block so they can get to their doctor, and you're proposing to move it six blocks further away, that's essentially a _political_ cost.

It might be better for system throughput, and those benefits may even outweigh the misery put on that one person. But in the US, we largely sort that out by using cool-down times, hearings, and "community input."

Net result, according to my friend at least, is that bus stops feel _very_ sticky and hard to change.


I think the article means 'cheap' as in it doesn't really require any new/expensive infrastructure and could theoretically be done overnight.

Though, as you mention, it's a big political ask (which is unfortunate).


It's unexported for that reason. You only change it in tests.


It's in the article that you're commenting on, https://www.spacex.com/updates#xai-joins-spacex.


Oh, ffs.


Haha. It's less than 1,000 words that would take less than 5 minutes to read.

I bet much less than half of the hundreds of HN commenters here bother to read it. Many are clearly unfamiliar with its content.


I can't, I don't want it in my head :/


I feel like I see an independent low-noise phone project like, every 3 months. Clearly there is some latent demand here. I wonder why the big players (Google, Apple, Samsung, HTC) haven't made a big-corp product for this market.

I am always reluctant to jump on with these independent ambitious projects. The first version is understandably rough, and the company seems to fold before they get to a second or third version.

But maybe advances in manufacturing in China are making high-quality, small-batch products like this more tractable?


> I feel like I see an independent low-noise phone project like, every 3 months. Clearly there is some latent demand here.

I don’t know - it feels to me that this is evidence that there _isn’t_ sufficient demand to sustain a successful product like this.


Same reason Acura stopped making small cars like the Integra/RSX: costs scale more slowly than revenue as car size increases, so selling to the small car segment leaves potential profit unearned. Even if the small car segment is a majority, it’s better to make a higher profit per unit on fewer unit sales if your primary goal is to minimize labor and maximize profit.

(Small phones, unlike small cars, also have costs in UI development to maintain their form factor’s OS support, which can create an additional pressure to withhold devices for a viable and profitable market.)


> I wonder why the big players (Google, Apple, Samsung, HTC) haven't made a big-corp product for this market.

Because it impacts ARPU. It's really not that difficult, you're the product being sold.


Big corps were the ones to move away from BlackBerry en masse towards a BYOD system. Before that, Samsung and Nokia both had a series of keyboard phones running Windows Mobile 6 or SymbianOS. I had the Samsung Blackjack II in 2008.


> Clearly there is some latent demand here

No, the demand is negligible. It's just typical hacker news people who want to suddenly become productive Silicon Valley trope hustle style, or people who want to change their damaging habits in a day, so instead of uninstalling TikTok which takes 15 seconds to do, they will spend money on a separate device.

Although the keyboard may be useful.


"But accepting the full S3Client here ties UploadReport to an interface that’s too broad. A fake must implement all the methods just to satisfy it."

This isn't really true. Your mock implementation can embed the interface, but only implement the one required method. Calling the unimplemented methods will panic, but that's not unreasonable for mocks.

That is:

    type mockS3 struct {
        S3Client
    }

    func (m mockS3) PutObject(...) {
        ...
    }

You don't have to implement all the other methods.
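Here is a fuller, runnable version of that idea, as a sketch. The S3Client method set and signatures below are invented for illustration (the real client's are different), and UploadReport is a stand-in for the function from the article.

```go
package main

import (
	"errors"
	"fmt"
)

// S3Client stands in for the broad interface from the article.
// The methods and signatures here are made up for this example.
type S3Client interface {
	PutObject(key string, body []byte) error
	GetObject(key string) ([]byte, error)
	DeleteObject(key string) error
}

// mockS3 embeds the interface, so it satisfies S3Client without
// implementing every method. The embedded value is left nil.
type mockS3 struct {
	S3Client
	puts []string // keys passed to PutObject, recorded for assertions
}

// PutObject is the only method the mock actually implements.
func (m *mockS3) PutObject(key string, body []byte) error {
	if key == "" {
		return errors.New("empty key")
	}
	m.puts = append(m.puts, key)
	return nil
}

// UploadReport is a hypothetical consumer that only calls PutObject.
func UploadReport(c S3Client, name string, data []byte) error {
	return c.PutObject("reports/"+name, data)
}

// callPanics reports whether f panicked.
func callPanics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	m := &mockS3{}
	fmt.Println(UploadReport(m, "q3.csv", []byte("a,b")), m.puts)
	// GetObject is promoted from the nil embedded interface, so it panics.
	fmt.Println(callPanics(func() { m.GetObject("x") }))
}
```

The second line printed is true: the unimplemented method compiles fine and only fails when something actually calls it.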

Defining a zillion interfaces, all the permutations of methods in use, makes it hard to come up with good names, and thus hard to read.


While you can do that, having unused methods that don't work is a footgun. It's cleaner if they don't exist at all.


Not to mention, introducing all the permutations of methods as separate interfaces on the "consumer side" means extreme combinatorial explosion of interfaces. It is far better to judge the most common patterns and make single-method interfaces for these on the provider side.

Lots of such frequently-quoted Go "principles" are invalid and are regularly broken within the standard library and many popular Go projects. And if you point them out, you will be snootily advised by the Go gurus on /r/golang or even here on HN that every principle has exceptions. (Even if there are tens of thousands of such exceptions).


Is this pattern commonly used? Any drawbacks?

Sounds much better than the interface boilerplate if it's just for the sake of testing.


At work we use it heavily. You don't really see "a zillion interfaces" after a while, only the set of dependencies of a package, which is easy to read and easy to understand.

"makes it hard to come up with good names" is not really a problem: if you have a `CreateRequest` method, you name the interface `RequestCreator`. If you have a request CRUD interface, it's probably a `RequestRepository`.

The benefits outweigh the drawbacks 10 to one. The most rewarding thing about this pattern is how easy it is to split up large implementations, and _keep_ them small.
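To make the naming point concrete, a small sketch. Only the names `CreateRequest` and `RequestCreator` come from the comment above; the Request type, the Submit function, and the fake are assumptions made up for the example.

```go
package main

import "fmt"

// Request and the CreateRequest signature are invented for this sketch.
type Request struct {
	ID   int
	Body string
}

// RequestCreator is a single-method interface named after its method.
type RequestCreator interface {
	CreateRequest(body string) (Request, error)
}

// Submit is a hypothetical consumer. It depends only on what it uses.
func Submit(rc RequestCreator, body string) (int, error) {
	if body == "" {
		return 0, fmt.Errorf("empty body")
	}
	req, err := rc.CreateRequest(body)
	if err != nil {
		return 0, fmt.Errorf("create request: %w", err)
	}
	return req.ID, nil
}

// fakeCreator is a complete fake: there are no unimplemented methods.
type fakeCreator struct {
	nextID int
	bodies []string
}

// CreateRequest records the body and hands back an incrementing ID.
func (f *fakeCreator) CreateRequest(body string) (Request, error) {
	f.nextID++
	f.bodies = append(f.bodies, body)
	return Request{ID: f.nextID, Body: body}, nil
}

func main() {
	f := &fakeCreator{}
	id, err := Submit(f, "hello")
	fmt.Println(id, err, f.bodies)
}
```

The fake implements the whole interface, so there is nothing left over that can blow up at call time.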


Any method you forget to override from the embedded S3Client gives the false impression that you can call any method on mockS3. Most of the time, the code inside the test will be:

    // embedded S3Client not properly initialized
    mock := mockS3{}
    // somewhere inside the business logic
    s3.UploadReport(...) // surprise

Go is flexible: you can define a complete interface on the producer side, and consumers can still use their own interface with only the required methods if they want.
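A rough sketch of that last point, with both sides squeezed into one file for brevity. Store, memStore, getter, and Greeting are all hypothetical names, not from any real package.

```go
package main

import "fmt"

// Store is a hypothetical broad interface exported by the producer.
type Store interface {
	Put(key, value string)
	Get(key string) (string, bool)
	Delete(key string)
}

// memStore is the producer's concrete implementation.
type memStore struct {
	data map[string]string
}

// NewStore returns the broad interface, as a producer package might.
func NewStore() Store {
	return &memStore{data: map[string]string{}}
}

func (s *memStore) Put(key, value string) { s.data[key] = value }

func (s *memStore) Get(key string) (string, bool) {
	v, ok := s.data[key]
	return v, ok
}

func (s *memStore) Delete(key string) { delete(s.data, key) }

// getter is the consumer's own narrow view of the same thing.
type getter interface {
	Get(key string) (string, bool)
}

// Greeting only reads, so it asks for a getter rather than a Store.
func Greeting(g getter, user string) string {
	name, ok := g.Get(user)
	if !ok {
		return "hello, stranger"
	}
	return "hello, " + name
}

// stubGetter is a test double that needs exactly one method.
type stubGetter map[string]string

func (s stubGetter) Get(key string) (string, bool) {
	v, ok := s[key]
	return v, ok
}

func main() {
	s := NewStore()
	s.Put("u1", "Ada")
	fmt.Println(Greeting(s, "u1"))            // a Store value satisfies getter
	fmt.Println(Greeting(stubGetter{}, "u1")) // so does the one-method stub
}
```

A value of the broad interface type can be passed wherever the narrow one is expected, so the two approaches are not mutually exclusive.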

