Hacker News | michaelmure's comments

Any recommendation for a quality, non-toy Rust codebase to study?


Two arbitrary picks:

https://crates.io/crates/serde

https://crates.io/crates/regex

Anything covered by Gjengset's "decrusted" series: https://youtube.com/playlist?list=PLqbS7AVVErFirH9armw8yXlE6...

Sort of on the border between toy and not-toy; Gjengset implements a concurrent hash map: https://youtube.com/playlist?list=PLqbS7AVVErFj824-6QgnK_Za1... [17hr recorded over 3 streams]


The Rust standard library is excellent. Start more or less anywhere, and click "view source". Or open up the source code files on GitHub.

There are often a lot more comments than code, which is kind of annoying. But it really is the best way to learn how a lot of good Rust is written.

Vec is a good read: https://doc.rust-lang.org/src/alloc/vec/mod.rs.html

Here's the lovely slice::binary_search_by: https://doc.rust-lang.org/src/core/slice/mod.rs.html#2967-29...


> On CRDTs: I assume tools like git-bug adopted CRDTs primarily to avoid merge conflicts, but "last-writer-wins" via timestamps is risky

FYI, git-bug doesn't use timestamps to figure out the ordering. It first uses the git DAG topology (which defines ancestors), then Lamport clocks (incremented for any change on any bug), then orders by hash if there is still an ambiguity. Note that the git DAG could also be signed, which would also provide some form of resilience against abuse.
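
To make the last two tie-breakers concrete, here is a minimal sketch in Go. The `Op` type, its fields and the function names are made up for illustration and are not git-bug's actual types. It only covers operations that the DAG topology does not already order: lower Lamport time first, then hash.

```go
package main

import (
	"fmt"
	"sort"
)

// Op is a hypothetical operation record; the field names are illustrative
// and are not git-bug's actual types.
type Op struct {
	Hash    string // content hash of the operation, used as the last tie-breaker
	Lamport uint64 // logical time at which the operation was created
	Label   string // human-readable description, for printing only
}

// less orders two operations that the DAG topology does not already order:
// lower Lamport time first, then lexicographic hash order.
func less(a, b Op) bool {
	if a.Lamport != b.Lamport {
		return a.Lamport < b.Lamport
	}
	return a.Hash < b.Hash
}

// orderConcurrent returns a sorted copy of ops and leaves the input untouched.
func orderConcurrent(ops []Op) []Op {
	out := append([]Op(nil), ops...)
	sort.SliceStable(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out
}

func main() {
	ops := []Op{
		{Hash: "c3", Lamport: 5, Label: "set title"},
		{Hash: "a1", Lamport: 5, Label: "add comment"},
		{Hash: "f9", Lamport: 2, Label: "add label"},
	}
	for _, op := range orderConcurrent(ops) {
		fmt.Println(op.Lamport, op.Hash, op.Label)
	}
}
```

Because the hash is derived from the content, every replica that holds the same set of operations ends up with the same order, without any wall-clock timestamp being involved.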

I had an interesting discussion recently about how to handle conflicts for bug trackers. In my opinion it's a great use case for CRDTs (as it avoids data corruption), as long as all user intents are visibly captured in a timeline and easily fixable. It turned out though that there is an interesting middle ground: as the CRDT captures *when* concurrent edits happen, it's 100% doable to highlight in the UI which events are concurrent and might need attention.
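
As a rough sketch of how a UI could find those events: two events are concurrent when neither is an ancestor of the other in the DAG. The `Graph` type below is a hypothetical in-memory stand-in for the git history, not something git-bug exposes.

```go
package main

import "fmt"

// Graph maps a commit ID to the IDs of its parents. It is a hypothetical
// in-memory stand-in for a git DAG.
type Graph map[string][]string

// isAncestor reports whether anc can be reached from desc by following
// parent links. A commit counts as its own ancestor here.
func (g Graph) isAncestor(anc, desc string) bool {
	seen := map[string]bool{}
	stack := []string{desc}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if n == anc {
			return true
		}
		if seen[n] {
			continue
		}
		seen[n] = true
		stack = append(stack, g[n]...)
	}
	return false
}

// concurrent reports whether neither commit is an ancestor of the other,
// meaning the two edits were made without knowledge of each other.
func (g Graph) concurrent(a, b string) bool {
	return !g.isAncestor(a, b) && !g.isAncestor(b, a)
}

func main() {
	// root has two children A and B, which are later merged.
	g := Graph{
		"root":  nil,
		"A":     {"root"},
		"B":     {"root"},
		"merge": {"A", "B"},
	}
	fmt.Println("A vs B concurrent:", g.concurrent("A", "B"))
	fmt.Println("root vs A concurrent:", g.concurrent("root", "A"))
}
```

A timeline view could run this check on neighbouring events and flag the pairs for which it returns true.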


Thanks for weighing in — git-bug is in the spec's Acknowledgments and Section 10.2 for good reason.

The DAG topology → Lamport clocks → hash ordering hierarchy is cleaner than LWW. And the signing point is shared ground: since git-native-issue uses standard commits, GPG/SSH signing works out of the box for the same abuse-resistance property.

Your "middle ground" observation — that CRDTs naturally reveal when concurrent edits happened — is the part I find most compelling. The format spec reserves a Conflict: trailer (Section 6.8) for a similar idea: flag divergent edits for human review rather than silently resolving them. The current v1 resolves everything automatically via LWW + three-way set merge. The honest gap: LWW can detect that a divergence happened (via merge-base), but it can't express the causal relationship between events the way Lamport clocks can. That's a real limitation.

The design bet is on adoption surface. The spec uses only commits, refs, and trailers, primitives that any Git hosting platform or TUI already understands, with no new dependencies. My hope is that a simpler format gets more implementations, and more implementations make portable issue tracking real. But I take your point that "simpler" isn't automatically "better for users" when it comes to concurrent editing.

On interop — the spec says a bridge would be "straightforward" (Section 10.2). That's probably too optimistic. git-bug's operation log doesn't map cleanly to linear commit chains, and a lossy bridge helps nobody. Worth exploring more carefully though, if there's interest on both sides.


> Stuff like git-bug exists, but then you still need participation from other people.

The plan is to 1) finish the webUI and 2) accept external auth (e.g. GitHub OAuth). Once done, anyone can trivially host their own forge publicly and accept public contributions without any buy-in effort. Then, if a user wants to go native, they just install git-bug locally.


Whoa, git-bug is still being developed, awesome! I wonder how difficult it would be to add other tables to it (I cannot help but think about bug trackers as being a database with a frontend like Access, and many limitations...) — in particular to have Messages (for messages) and Discussions (for hierarchical list of message references). Now that git has reftable maybe this sort of abuse would actually work...


Assuming that by "table" you mean another "document type" ... pretty easily. There is a reusable CRDT-like data structure that you can use to define your own thing. You do that by defining the operations that can happen on it and what they do. You don't have to handle the interaction with git or the conflict resolution.
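
To give an idea of the shape, here is a minimal sketch in Go of a "todo list" entity defined purely by its operations. All of the names (`Snapshot`, `Operation`, `Compile` and the three operation types) are hypothetical and chosen for illustration; this is not git-bug's actual API, and the storage and ordering parts are assumed to be handled elsewhere.

```go
package main

import "fmt"

// Snapshot is the compiled state of a hypothetical "todo list" entity.
type Snapshot struct {
	Title string
	Items []string
	Done  map[string]bool
}

// Operation is one recorded change. An entity is nothing more than an
// ordered list of these.
type Operation interface {
	Apply(s *Snapshot)
}

// The three operations this example entity supports.
type SetTitle struct{ Title string }
type AddItem struct{ Item string }
type MarkDone struct{ Item string }

func (o SetTitle) Apply(s *Snapshot) { s.Title = o.Title }
func (o AddItem) Apply(s *Snapshot)  { s.Items = append(s.Items, o.Item) }
func (o MarkDone) Apply(s *Snapshot) { s.Done[o.Item] = true }

// Compile replays the operations, assumed to be already ordered, into the
// current state of the entity.
func Compile(ops []Operation) Snapshot {
	s := Snapshot{Done: map[string]bool{}}
	for _, op := range ops {
		op.Apply(&s)
	}
	return s
}

func main() {
	s := Compile([]Operation{
		SetTitle{"Release 1.0"},
		AddItem{"write changelog"},
		AddItem{"tag release"},
		MarkDone{"write changelog"},
	})
	fmt.Printf("%s: %d items, %d done\n", s.Title, len(s.Items), len(s.Done))
}
```

The point of the design is that the entity author only writes the operation types and their `Apply` logic; everything about where the operations live and in which order they are replayed stays generic.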


Location: France

Remote: Yes (have been 100% remote for 6+ years)

Willing to relocate: No

Technologies: Go, networking, protocols, identity, cryptography, local-first, CRDTs, devops, AWS, kubernetes, IaC, free software, linux

Résumé/CV: see my GitHub profile for notable and open source work: https://github.com/MichaelMure

Email: on my GitHub profile

I'm a builder, a backend engineer with strong Go proficiency, passionate about local-first and how we can build better software for a better world. Happy to talk about opportunities.


With some motivation you could port git-bug to another VCS without too much trouble. You would need to implement those interfaces [1]. The one you care about especially is RepoData, which mainly implies that you can store a DAG, have references and push/pull. I believe other VCSs (say Mercurial) have similar concepts.

Or you could just as well plug a generic database there.

[1]: https://github.com/git-bug/git-bug/blob/master/repository/re...


So for example, git-bug already has a PR to add support for a project board: https://github.com/git-bug/git-bug/pull/843

In the same way, one could add support for code review (aka PRs), todo lists, or custom entities that your workflow needs (say, tracking documentation or custom requirements) ... It can also be entirely outside of the development process.


This would allow for really native linking between a tracked issue and corresponding commits/branches/tags, for modeling dependencies between issues as part of the git DAG,...


In theory it could happen but it's unlikely in practice, for multiple reasons:

- git-bug uses a form of logical clock (not wall clock) that orders an action in relation to other actions in the repo. Clock drift doesn't matter.

- pushing to git usually requires some access to the repo, and therefore abuse can be dealt with socially (aka you get kicked out)

What can happen, for example, is that someone writes a comment, shuts down their computer and only pushes the next day, but in that case the comment showing up before yours is the correct merge.
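
For illustration, a logical clock of that kind fits in a few lines of Go. This is a generic Lamport clock sketch with names I picked for the example (`Clock`, `Increment`, `Witness`), not git-bug's implementation. Nothing in it reads the system time, which is why drift cannot affect the ordering.

```go
package main

import "fmt"

// Clock is a minimal Lamport clock. It never looks at the wall clock.
type Clock struct{ time uint64 }

// Increment advances the clock for a local action and returns the new time.
func (c *Clock) Increment() uint64 {
	c.time++
	return c.time
}

// Witness records a time observed on another replica, so that the next
// local action is ordered after it.
func (c *Clock) Witness(seen uint64) {
	if seen > c.time {
		c.time = seen
	}
}

func main() {
	var alice, bob Clock

	// Alice writes a comment but only pushes the next day.
	comment := alice.Increment()

	// Bob pulls Alice's comment whenever it arrives, then replies.
	bob.Witness(comment)
	reply := bob.Increment()

	fmt.Println("comment at", comment, "reply at", reply)
}
```

However late Alice pushes, Bob's reply gets a higher logical time only if he had seen her comment first. If he had not, the two actions are simply concurrent.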


> pushing to git usually requires some access to the repo

Wait, so to comment on an issue I now have to already have push access to that repo? How does that work? E.g. what if I want to comment on a VSCode issue? I'm not a VSCode developer...


Right now, yes, but the idea is to augment the webUI with external auth (e.g. GitHub OAuth and others) to make it a public portal where anyone can create issues and so on. In that case, the webUI would have access to the git repo, enforce any rules and prevent abuse.

With a single binary deployment, you'd just need a bit of config and a DNS entry, and you could host something forge-ish for your project.

We are not there yet but it's really not far.


to support the workflow where you, an individual, outside contributor, want to use git-bug to create or comment on an issue on a third-party platform that you do not control, you would:

- install git-bug

- create a directory (and `git init`), optionally fetch/clone the remote repo (but this is not needed)

- create a git-bug identity (`git bug user new`)

- configure a bridge to github, for example for the vscode repository (`git bug bridge new`)

- pull issues from the bridge to your local repository's refs/bugs namespace (`git bug bridge pull`)

- create a new issue, or browse existing ones and comment on them at will

- export your activity to the bridge (`git bug bridge push`)

this works without push access to the repository, because when importing to or exporting from a bridge, the API credentials you provide when configuring the bridge are used -- `git bug bridge {push,pull}` does not push your local `refs/bugs` to the remote.


Yes, this is really meant to be some sort of framework for storing entities in git, handling the conflicts, and letting you easily build your own tool (or add more features to git-bug).

See also https://github.com/git-bug/git-bug/blob/master/doc/design/da... and https://github.com/git-bug/git-bug/blob/master/entity/dag/ex...

I'd love to see this used in the wild for other use cases.


The interesting thing to me is the stark difference between this and golang's approach.

With golang, you can run fuzzing as simply as you run tests, which means that it's trivial to target specific parts of your application or library. It obsoletes so many of those techniques.

I'm quite curious about techniques to better guide the fuzzing. It seems like the best you can do is provide a seed corpus and hope for the best.
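
For reference, this is roughly what a native fuzz test looks like (Go 1.18 or later). `Reverse` is just a toy function I made up as the target. The file is meant to be saved with a `_test.go` suffix and run with `go test -fuzz=FuzzReverse`.

```go
package main

import (
	"testing"
	"unicode/utf8"
)

// Reverse returns s with its runes in reverse order. It is only here to
// give the fuzzer something to exercise.
func Reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// FuzzReverse checks that reversing twice gives back the original string.
func FuzzReverse(f *testing.F) {
	// The seed corpus: the only direct hint the fuzzer gets about
	// what interesting inputs look like.
	for _, seed := range []string{"", "a", "git-bug", "héllo"} {
		f.Add(seed)
	}
	f.Fuzz(func(t *testing.T, s string) {
		if !utf8.ValidString(s) {
			t.Skip("only valid UTF-8 round-trips through []rune")
		}
		if got := Reverse(Reverse(s)); got != s {
			t.Errorf("Reverse(Reverse(%q)) = %q", s, got)
		}
	})
}
```

Without the `-fuzz` flag, plain `go test` still runs the seed inputs as regular test cases, which is what makes it feel like ordinary testing.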


some fuzzing tools (libFuzzer for example) leverage LLVM's intermediate representation to provide code-coverage metrics that they feed back into their fuzzing algorithms, increasing test coverage
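
a toy sketch of that feedback loop, written in Go only to keep it short. this is not how libFuzzer is implemented: the target below is instrumented by hand (it returns the IDs of the branches it executed), whereas real tools get that information from compiler instrumentation. every name in it is made up for the example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// target is the code under test, instrumented by hand: it returns the IDs
// of the branches it executed.
func target(in []byte) []int {
	hit := []int{0}
	if len(in) > 0 && in[0] == 'F' {
		hit = append(hit, 1)
		if len(in) > 1 && in[1] == 'U' {
			hit = append(hit, 2)
			if len(in) > 2 && in[2] == 'Z' {
				hit = append(hit, 3)
			}
		}
	}
	return hit
}

// mutate returns a copy of in with one random byte replaced or appended.
func mutate(rng *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	if len(out) == 0 || rng.Intn(4) == 0 {
		return append(out, byte(rng.Intn(256)))
	}
	out[rng.Intn(len(out))] = byte(rng.Intn(256))
	return out
}

// fuzz runs up to iters mutations and returns the final corpus and the set
// of covered branches. An input joins the corpus only if it reaches a
// branch that no earlier input reached: that is the coverage feedback.
func fuzz(seed int64, seeds [][]byte, iters int) ([][]byte, map[int]bool) {
	rng := rand.New(rand.NewSource(seed))
	covered := map[int]bool{}
	var corpus [][]byte
	try := func(in []byte) {
		fresh := false
		for _, id := range target(in) {
			if !covered[id] {
				covered[id] = true
				fresh = true
			}
		}
		if fresh {
			corpus = append(corpus, in)
		}
	}
	for _, s := range seeds {
		try(s)
	}
	for i := 0; i < iters && len(corpus) > 0; i++ {
		try(mutate(rng, corpus[rng.Intn(len(corpus))]))
	}
	return corpus, covered
}

func main() {
	corpus, covered := fuzz(1, [][]byte{{}}, 200000)
	fmt.Println("corpus size:", len(corpus), "branches covered:", len(covered))
}
```

the useful property is that an input which gets one byte further into the nested checks is kept and mutated again, so the search proceeds one byte at a time instead of having to guess all three bytes at once.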


Golang does that natively ;-)


LibFuzzer is packaged with clang, so there is no additional installation [0]. You just have to provide an entry function and link it with a command-line flag. However, since C and C++ lack reflection you have to work with raw bytes as input.

LibFuzzer has the option to provide callbacks that customize mutation, which can help with obtaining coverage.

[0] https://llvm.org/docs/LibFuzzer.html


I proposed using reinforcement learning to guide coverage as a potential phd topic, but didn't really go down that path, no idea if it could work


Did you try making small changes to your phd proposal to see if it opened up new paths?

</fuzzingjoke>


I think it would go the other way where you use coverage to guide reinforcement. Crank the temperature up to increase variation and you would probably produce a model that could approximate the file format you were targeting.


Please tell us more!

Fuzzing is often a special case of genetic algorithms, so there is already a tiny connection to RL. I'm curious to hear what your proposal was.


> Fuzzing is often a special case of genetic algorithms

Yes, that was sort of why I thought RL guided fuzzing could work, and possibly better. Also, for things like XSS fuzzing (which I have a little experience in), it is possible for an experienced attacker to intelligently guide the fuzzing to a payload, which theoretically could be mimicked through RL.

There wasn't really anything novel in the proposal, it was just for a graduate cyber-security course, and one of the deliverables was a project proposal for something related. There were already some existing works at that time (around 2-3 years ago) where people tried combining RL with fuzzing, and I just mish-mashed some ideas together so I could hand in something.

My main concern at the time however was that with fuzzing the positive signal would be so rare compared to the negative signal, since most randomly fuzzed inputs would just return the same negative feedback. I wasn't sure that would be enough signal to train an RL system. I'm not quite sure what new progress has been made in the field since then.


Not file storage, but https://github.com/git-bug/git-bug pushes and syncs with any git remote. There is a generic data structure you can use to build your conflict-free type.

