remenoscodes's comments

remenoscodes · 2026-02-25T04:13:39 1771992819

Just saw this. I posted a different take on the same problem the same day (git-native-issue, commits instead of files). Two independent projects converging on refs/issues/ is a good signal.

The markdown files approach gives you grep/ag/rg search for free, which is a real advantage. And the repo-prefix + sequential ID scheme is definitely more memorable than UUIDs for daily use.

A couple of questions after reading the code:

- With a single refs/issues/latest branch for all issues, what happens when two contributors work offline on different issues and then push? It seems like that would produce a merge even though the issues are independent. Does that cause friction in practice?

- On the ID scheme: gen_repo_id() generates 6 chars from a 22-letter alphabet. Two clones getting the same prefix is unlikely but not impossible. Have you considered what happens if it collides?

kwhkim · 2026-02-25T07:09:36 1772003376

Thank you for the seasoned opinion.

* It will produce a merge by default, but you can use `git pad merge --rebase` to rebase when possible. I thought a lot about this — "Do I need linear commits whenever possible?" I understand linear commits are easier to read, but I don't see other problems with merges per se. I think "who cares? No one will look at previous commits anyway." But there could be some friction — do you have anything specific in mind? Besides, you can use `git pad merge --rebase --audit` to tag the original commit before rebasing, for auditing purposes.

* It basically comes down to how many characters you need to avoid collisions. Git's hashes have some probability of collision too. I don't recall the exact numbers, but for 100–1000 repos the chance of a repo-ID collision is extremely low. In the rare case of a collision, you can delete the `.local-repo-id` file and regenerate it. You would get merge conflicts though, so you'd need to reset to before the merge and rename the files. This is not implemented yet, but people could do it manually using git and other Linux commands.
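A rough sketch of the numbers behind that claim. It assumes each of the 6 characters is drawn uniformly and independently from the 22-letter alphabet, which is an assumption about gen_repo_id() rather than something confirmed in the thread. Under that assumption the chance that at least two of n clones share a prefix is the usual birthday-problem probability:

```python
import math

ALPHABET_SIZE = 22  # from the thread: 22-letter alphabet
ID_LENGTH = 6       # from the thread: 6 characters per repo ID


def collision_probability(n_repos: int, space: int = ALPHABET_SIZE ** ID_LENGTH) -> float:
    """Probability that at least two of n_repos uniformly random IDs coincide."""
    if n_repos > space:
        return 1.0
    # log P(no collision) = sum of log(1 - k/space) for k = 0 .. n_repos - 1
    log_no_collision = sum(math.log1p(-k / space) for k in range(n_repos))
    return -math.expm1(log_no_collision)


print(f"ID space: {ALPHABET_SIZE ** ID_LENGTH:,}")
for n in (100, 1000, 10000):
    print(f"{n:>6} clones: {collision_probability(n):.4%}")
```

With 22^6 = 113,379,904 possible prefixes, this works out to roughly 0.004% for 100 clones and roughly 0.44% for 1,000 clones, assuming uniform generation.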

remenoscodes · 2026-02-24T11:02:31 1771930951

I built git-native-issue (https://github.com/remenoscodes/git-native-issue), a distributed issue tracker that stores issues as Git commits under refs/issues/.

In April 2007, during a flame war about the Linux 2.6.21 release, Linus wrote:

"There must be some better form of bug tracking than bugzilla. Some really distributed way of letting people work together, without having to congregate on some central web-site kind of thing. A 'git for bugs', where you can track bugs locally and without a web interface."

Source: https://lore.kernel.org/all/alpine.LFD.0.98.0704290848360.99...

19 years later, nobody shipped this. 10+ tools tried (Bugs Everywhere, ticgit, git-bug, git-dit, git-appraise, git-issue). All failed for similar reasons — mostly file-based storage that breaks on merge, and no format spec for ecosystem adoption.

The core insight: issues are append-only event logs, and Git is a distributed append-only database. So I mapped issue tracking directly to Git primitives:

  - Commits = issue events (creation, comments, state changes)
  - Refs = issue identity (refs/issues/<uuid>)
  - Trailers = structured metadata (same format as Signed-off-by)
  - Merge commits = conflict resolution
  - Fetch/push = sync
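
To make the "trailers = structured metadata" point concrete, here is a minimal sketch of reading trailers out of a commit message. It is a simplification and not the project's code: it only looks at the last paragraph and ignores continuation lines and other rules Git itself applies. The trailer names State and Labels are hypothetical; Format-Version is the only one mentioned in this thread.

```python
def parse_trailers(message: str) -> dict[str, list[str]]:
    """Collect 'Key: value' lines from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # a lone subject line carries no trailer block
    trailers: dict[str, list[str]] = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if not sep or " " in key:
            return {}  # last paragraph is prose, not a trailer block
        trailers.setdefault(key, []).append(value.strip())
    return trailers


# Hypothetical issue-creation commit message (trailer names are illustrative).
MESSAGE = """Fix login crash

Crashes when the session cookie is missing.

State: open
Labels: bug
Format-Version: 1
"""

print(parse_trailers(MESSAGE))
```

In a real repository, `git interpret-trailers --parse` does this job with Git's full trailer rules; the sketch only shows why no custom serialization is needed.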

Usage:

  $ git issue create "Fix login crash" -l bug -p high
  $ git issue ls -f full
  $ git issue show a7f3b2c
  $ git issue sync github:owner/repo

The architecture follows Git's own philosophy: the core only knows commits, refs, and trailers. Platform bridges (GitHub, GitLab, Gitea/Forgejo) are separate scripts that translate between APIs and git primitives. New platform = new bridge, core doesn't change.

The real deliverable is ISSUE-FORMAT.md — a standalone spec that any tool can implement. If this project dies tomorrow, the spec survives. That's the key difference from prior art: none of them produced a standalone format specification.

282 tests across 22 suites. POSIX shell, zero dependencies beyond Git for the core. Platform bridges need jq + their respective CLIs (gh, glab, or curl for Gitea/Forgejo).

Honest limitations: shell is slow with large repos (10k+ issues work but not fast). A C rewrite is on the roadmap, inspired by the path git-subtree took into contrib/.

Feedback I'm looking for: Is the format spec (ISSUE-FORMAT.md) clear and implementable? What edge cases did I miss? Would you actually use this?

Install:

  brew install remenoscodes/git-native-issue/git-native-issue
  # or
  curl -sSL https://raw.githubusercontent.com/remenoscodes/git-native-issue/main/install.sh | sh

hunvreus · 2026-02-24T11:45:57 1771933557

Have you looked at https://github.com/git-bug/git-bug ?

remenoscodes · 2026-02-24T11:59:01 1771934341

Yes! git-bug is the closest prior art; I reference it in the README's Prior Art section.

Three key differences:

1. Plain Git primitives — git-bug uses CRDTs with JSON operation logs. git-native-issue uses commits as events, Git trailers for metadata (same format as Signed-off-by), and merge commits for conflict resolution. No custom serialization.

2. Standalone format spec — git-bug's "format" is whatever its Go code produces. git-native-issue ships ISSUE-FORMAT.md, a standalone specification that any tool in any language can implement. The spec is the deliverable, not the CLI.

3. Simplicity — CRDTs are powerful but overkill here. Git already solves distributed conflict resolution with three-way merge. Why rebuild that in userspace?

git-bug validated that storing issues in Git refs works. I built on that lesson with a simpler data model and a spec-first approach.

kwhkim · 2026-02-24T18:14:24 1771956864

Impressive work. I built something similar and clearly remember the design choices I made, so this is an interesting subject.

On CRDTs: I assume tools like git-bug adopted CRDTs primarily to avoid merge conflicts, but "last-writer-wins" via timestamps is risky — clocks are notoriously unreliable or set incorrectly. While CRDTs aren't "accurate" either, they do their best to converge. Personally, I'm uncomfortable with systems that silently overwrite edits when multiple people change the same issue differently. If two users modify state concurrently, I'd rather see an explicit conflict than an automatic resolution.

Related to that, I couldn't fully determine from the docs how your merge behavior works in practice. It seems like there are no conflicts, but how they are resolved is unclear to me. This is easily one of the hardest design decisions in a distributed issue system. One approach might be restricting certain edits (e.g., only authors can modify specific fields) or explicitly raising conflicts when a semantic disagreement occurs. I prefer the latter. With the help of AI, we could likely distinguish semantic conflicts from trivial syntactic differences (like LF vs. CRLF).

Regarding UUIDs: I understand why Git uses hashes: collision avoidance is critical in a distributed system. However, from a UX perspective, long, opaque identifiers are difficult to remember or reference in conversation. I've explored using shorter, human-friendly identifiers that still minimize collisions. I think ergonomics matter immensely if this is intended for daily use.

A practical concern: using custom refs per issue can clutter the namespace. In tools like VS Code's Git graph, enabling "Show Remote Branches" causes remote issue refs to appear visually, adding significant noise. It's not a dealbreaker, but it does affect usability in some Git clients (speaking from my experience with git-bug).

Broadly speaking, I'm still undecided about storing issues directly as Git commits. Whatever the merge policy or conflict resolution strategy, it can be implemented in a custom merge engine (I think — and it's on my to-do list). You can track how issues evolved over time with commits, but it makes managing them in batch or bulk more difficult. A file-based system can achieve the same effect using `git log -- $FILENAME`.

That said, I really like that you extracted a standalone ISSUE-FORMAT.md. A format spec that outlives the implementation is arguably the most important part of this effort — especially if it includes a specification for attachments. Even though I'd prefer the issue format to be more flexible and casual, trying to establish a standard is worth the effort.

remenoscodes · 2026-02-24T22:47:32 1771973252

Good catches, thank you.

Timestamps: You're right that the current merge uses committer timestamps (LWW), and clocks can disagree. The spec is explicit about this tradeoff — Principle 4: "Last-writer-wins over Lamport clocks." The reasoning: for issue tracking (as opposed to, say, collaborative editing), the practical risk of clock skew producing a wrong merge result is low, and when it does happen, a follow-up commit corrects it. The format is versioned (Format-Version: 1), so a future version can introduce logical clocks if production use reveals timestamp-related bugs. In 8+ months of use — including imports from multi-contributor projects (GitHub, GitLab, Gitea) — clock-skew issues haven't surfaced. I'm now adopting it in a team setting at work, which will be the real stress test for the merge heuristics.

UUIDs: In practice, users interact with 7-character short IDs (a7f3b2c). The CLI resolves abbreviations unambiguously, similar to how git log abc1234 works. Sequential IDs would require a central counter, which breaks in distributed systems.
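
A minimal sketch of what unambiguous abbreviation resolution can look like, in the spirit of how Git resolves short hashes. This is not the tool's implementation; the function name and the sample IDs are made up, and a real version would list candidates from the refs under refs/issues/.

```python
def resolve_short_id(prefix: str, full_ids: list[str]) -> str:
    """Return the single full ID starting with prefix, or raise if none or several match."""
    prefix = prefix.lower()
    matches = [full_id for full_id in full_ids if full_id.startswith(prefix)]
    if not matches:
        raise KeyError(f"no issue matches '{prefix}'")
    if len(matches) > 1:
        raise ValueError(f"'{prefix}' is ambiguous: {len(matches)} issues match")
    return matches[0]


# Hypothetical full issue IDs.
ISSUE_IDS = [
    "a7f3b2c4-1d9e-4c55-9a01-3f2b7c8d9e10",
    "a7f3b2c0-77aa-4e21-8b3c-0d1e2f3a4b5c",
    "b1c2d3e4-f5a6-4b7c-8d9e-0f1a2b3c4d5e",
]

print(resolve_short_id("b1c2d3e", ISSUE_IDS))
try:
    resolve_short_id("a7f3b2c", ISSUE_IDS)
except ValueError as err:
    print(err)  # two IDs share this 7-character prefix, so the user must type more
```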

Namespace: refs/issues/* doesn't appear in branch listings (git branch), although git log --all does walk every ref under refs/, issue refs included. Most Git GUIs filter to refs/heads/* and refs/remotes/*. For fetch performance with many issues, Git protocol v2 does server-side ref filtering, so only requested refs transfer. Valid concern, though, and worth documenting.

Attachments: Agreed, that's tracked for Format-Version 2 — binary blobs in the tree object instead of the empty tree. The spec's Section 12 outlines this.

The format spec is designed to evolve. Appreciate the detailed feedback.

kwhkim · 2026-02-25T02:20:38 1771986038

I’d like to share my work with you: https://news.ycombinator.com/item?id=47137452

It shows that I posted just a little later than you.

I agree that the chances of something going wrong with timestamps are low, but I still think it’s worth considering as a potential security risk or injection vector — although I’m not sure how realistic that threat actually is.

Regarding UUIDs, you can get the best of both worlds: sequential IDs are convenient, and adding a small number of random characters can help avoid name collisions. Since the random component only needs to ensure uniqueness across local repositories, it doesn’t require many characters.
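
A minimal sketch of the "random prefix plus sequential number" idea. Everything here is illustrative: the 22-letter alphabet, the `prefix-N` layout, and the function names are assumptions, not git-pad's actual scheme. The prefix is generated once per clone, so the counter never needs coordinating with other clones.

```python
import secrets

# Hypothetical 22-letter alphabet: lowercase letters minus i, l, o, u.
ALPHABET = "abcdefghjkmnpqrstvwxyz"


def gen_repo_prefix(length: int = 6) -> str:
    """Random per-clone prefix, generated once and then stored locally."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def next_issue_id(prefix: str, existing_ids: list[str]) -> str:
    """Next sequential ID within this clone's prefix, e.g. 'kqzmtb-3'."""
    used = [
        int(issue_id.rsplit("-", 1)[1])
        for issue_id in existing_ids
        if issue_id.startswith(prefix + "-")
    ]
    return f"{prefix}-{max(used, default=0) + 1}"


prefix = gen_repo_prefix()
first = next_issue_id(prefix, [])
second = next_issue_id(prefix, [first])
print(first, second)
```

IDs from other clones carry a different prefix, so they are ignored when picking the next number and two offline clones can both create "their" issue 1 without clashing.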

I can see that you’ve put a great deal of effort into this, especially with the various bridging components. I built a simple bridge myself and ran some tests using the pandas project (https://github.com/pandas-dev/pandas), which has more than 30,000 issues. Even storing only metadata (such as title and type, excluding the body) as plain text takes more than 100 MB, which seems quite large. In comparison, storing the same data in SQL takes only about 10 MB, and packed Git objects are comparable.

So while storing issues as Git commits certainly has some benefits, I don't see much advantage beyond that. It also seems that most users would not be able to make practical use of this approach easily — for example, for batch processing — unless they are already quite comfortable working directly with Git commits.

I'm curious about what considerations led you to decide to store issues as empty Git commits. I would appreciate it if you could share your reasoning with me.

remenoscodes · 2026-02-25T03:07:02 1771988822

Cool, just looked at git-pad. Same day, different data models for the same problem. Independent convergence is a good signal.

On why empty commits: this started with Linus's 2007 rant about wanting "a git for bugs." I took it literally: how far can Git's existing primitives go without introducing anything new? No files, no JSON, no database. Just commits, refs, and trailers.

The mapping: issues are append-only event logs (create -> comment -> edit -> close). Git is an append-only content-addressable store. Each commit is an event. The ref tip is current state. Trailers carry structured metadata in the same format as Signed-off-by:. Merge commits handle divergence. The entire Git toolchain works out of the box: log, rev-list, interpret-trailers, GPG signing, refspecs.

The implementation is a proof of concept though. What I really care about is ISSUE-FORMAT.md as a standalone format spec. Most of the internet runs on community-agreed specifications where the spec is the contract and implementations are details. If we have a canonical issue format, Forgejo or GitKraken or whoever can build a proper UI around it. Different implementations emerge — shell, C, Rust — until we find the optimal one. The spec is the deliverable, not the CLI.

Storage: packed Git objects are comparable to SQL for metadata. The shell won't scale to 30K issues; a C implementation with libgit2 would. That's a known limitation of v1.

Timestamps: fair concern. The format is versioned (Format-Version: 1), so logical clocks can be added in a future version without breaking existing data. For v1, LWW was the pragmatic choice — keeps the spec implementable by any tool that can read Git commits.

The bridges solved a specific problem I kept hitting: migrating projects between GitLab and GitHub or Gitea (and now Azure DevOps) while keeping issues intact. That alone justified the effort.

Curious about git-pad's file-based approach: what happens when two contributors edit the same issue file offline and then push? Standard Git merge conflict, or do you handle it at a higher level? (Haven't had time to look at the implementation code yet)

kwhkim · 2026-02-25T07:18:30 1772003910

As for the merge conflict, you resolve it the same way you would with any other file in git. I think a custom merge driver needs to be developed eventually, but that's not implemented yet. For example, it could automatically pick `type: bug or feature` instead of leaving raw conflict markers like the following:

  <<<<<<<
  type: bug
  =======
  type: feature
  >>>>>>>

michaelmure · 2026-02-24T20:07:07 1771963627

> On CRDTs: I assume tools like git-bug adopted CRDTs primarily to avoid merge conflicts, but "last-writer-wins" via timestamps is risky

FYI, git-bug doesn't use timestamps to figure out the ordering. It first uses the git DAG topology (it defines ancestors), then Lamport clocks (incremented on any change to any bug), then orders by hash if there is still an ambiguity. Note that the git DAG could also be signed, which would also provide some form of resilience against abuse.

I had an interesting discussion recently about how to handle conflicts in bug trackers. In my opinion it's a great use case for CRDTs (as it avoids data corruption), as long as all user intents are visibly captured in a timeline and easily fixable. It turned out though that there is an interesting middle ground: as the CRDT captures *when* concurrent editing happens, it's 100% doable to highlight in the UI which events are concurrent and might need attention.
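
The three-level ordering described above (DAG topology, then Lamport clock, then hash) can be sketched as a topological sort with a deterministic tie-break. This is an illustration only, not git-bug's code; the `Op` fields and the sample hashes are hypothetical, and every parent is assumed to be present in the input.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    hash: str
    lamport: int
    parents: tuple[str, ...] = ()


def order_ops(ops: list[Op]) -> list[str]:
    """Order ops so ancestors come first; break ties by (lamport, hash)."""
    by_hash = {op.hash: op for op in ops}
    pending = {op.hash: len(op.parents) for op in ops}
    children: dict[str, list[str]] = {op.hash: [] for op in ops}
    for op in ops:
        for parent in op.parents:
            children[parent].append(op.hash)

    # Ops with no unprocessed parents are ready; the heap picks the smallest
    # (lamport, hash) pair, so every replica derives the same order.
    ready = [(op.lamport, op.hash) for op in ops if not op.parents]
    heapq.heapify(ready)
    ordered: list[str] = []
    while ready:
        _, current = heapq.heappop(ready)
        ordered.append(current)
        for child in children[current]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (by_hash[child].lamport, child))
    return ordered


# Two concurrent edits (b4, c9) on top of a1, later merged by d0.
OPS = [
    Op("d0", 3, ("b4", "c9")),
    Op("c9", 2, ("a1",)),
    Op("a1", 1),
    Op("b4", 2, ("a1",)),
]
print(order_ops(OPS))
```

Here b4 and c9 are concurrent and share Lamport value 2, so the hash decides their relative order. Those are exactly the events a UI could highlight as needing attention.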

remenoscodes · 2026-02-24T22:56:41 1771973801

Thanks for weighing in — git-bug is in the spec's Acknowledgments and Section 10.2 for good reason.

The DAG topology → Lamport clocks → hash ordering hierarchy is cleaner than LWW. And the signing point is shared ground: since git-native-issue uses standard commits, GPG/SSH signing works out of the box for the same abuse-resistance property.

Your "middle ground" observation — that CRDTs naturally reveal when concurrent edits happened — is the part I find most compelling. The format spec reserves a Conflict: trailer (Section 6.8) for a similar idea: flag divergent edits for human review rather than silently resolving them. The current v1 resolves everything automatically via LWW + three-way set merge. The honest gap: LWW can detect that a divergence happened (via merge-base), but it can't express the causal relationship between events the way Lamport clocks can. That's a real limitation.
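
For readers unfamiliar with the terms, here is one common reading of "LWW + three-way set merge", sketched under assumptions. It is not taken from ISSUE-FORMAT.md; the function names, the `Versioned` type, and the `diverged` flag are hypothetical. Set-valued fields such as labels merge by combining each side's additions and removals relative to the merge base. Scalar fields such as state take the later committer timestamp when both sides changed.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    value: str
    timestamp: int  # committer time, seconds since epoch


def merge_set(base: set[str], ours: set[str], theirs: set[str]) -> set[str]:
    """Three-way set merge: apply both sides' additions and removals to base."""
    added = (ours - base) | (theirs - base)
    removed = (base - ours) | (base - theirs)
    return (base | added) - removed


def merge_scalar_lww(base: str, ours: Versioned, theirs: Versioned) -> tuple[str, bool]:
    """Return (merged value, diverged). diverged is True only if both sides changed."""
    if ours.value == theirs.value:
        return ours.value, False
    if ours.value == base:
        return theirs.value, False  # only their side changed
    if theirs.value == base:
        return ours.value, False    # only our side changed
    # Both changed to different values: later timestamp wins (ours on a tie).
    winner = max(ours, theirs, key=lambda v: v.timestamp)
    return winner.value, True


print(merge_set({"bug"}, {"bug", "ui"}, {"crash"}))
print(merge_scalar_lww("open", Versioned("closed", 200), Versioned("in-progress", 100)))
```

The `diverged` flag marks the case where a merge base shows both sides moved. That is the kind of situation a conflict marker for human review could record, though the timestamps alone still say nothing about causality.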

The design bet is on adoption surface. The spec uses only commits, refs, and trailers: primitives that any Git hosting platform or TUI already understands, with no new dependencies. My hope is that a simpler format gets more implementations, and more implementations make portable issue tracking real. But I take your point that "simpler" isn't automatically "better for users" when it comes to concurrent editing.

On interop — the spec says a bridge would be "straightforward" (Section 10.2). That's probably too optimistic. git-bug's operation log doesn't map cleanly to linear commit chains, and a lossy bridge helps nobody. Worth exploring more carefully though, if there's interest on both sides.