Thread

New release (v0.17.0) has:

- `nak git` commands that allow cloning, setting up a new nip34/grasp repository, pushing, fetching and pulling (just call `nak git push`, for example, instead of `git push`)
- a `nak req --only-missing` flag that takes a jsonl file with events and does negentropy with a target relay to only download the events that are not in that file (finally @Kieran -- this was ready 2 weeks ago, but I had to make a ton of git stuff before I was able to publish it)
- new `nak serve --negentropy --blossom --grasp` flags that make hosting these kinds of servers locally much easier for debugging
- you can finally use npubs/nprofiles/nevents/naddrs directly as arguments to `nak event`, `nak req` and others (they will be parsed and included in the event or filter as proper hex)
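Roughly how these might be invoked; a sketch only, where the relay URL, file name, npub and clone URL are placeholders and the clone-URL format and argument order are my assumptions, not from the release notes:

```sh
# clone / push a nip34+grasp repository through nak instead of plain git
nak git clone nostr://npub1xxxxxxxx/my-repo    # placeholder identifier
cd my-repo && nak git push

# negentropy against a relay, downloading only events missing from events.jsonl
nak req --only-missing events.jsonl wss://relay.example.com

# spin up local negentropy/blossom/grasp servers for debugging
nak serve --negentropy --blossom --grasp

# npubs/nevents/naddrs can now be passed directly and are expanded to hex
nak req -k 1 npub1xxxxxxxx wss://relay.example.com
```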

Replies (46)

wat is GRASP? i have started a negentropy peer implementation on orly now too, and it has a serve function that uses tmpfs, blossom is on by default, and it respects whatever ACL system you enable. there is an aggregator tool that lets you trawl the nostrwebs for your events and writes them to jsonl. it has a GUI, which is still a bit of a WIP but already lets you log in, view and delete your own events. there is a zap-to-subscribe system for the paid use case but it's untested as yet, same as blossom, though the rest is tested to the gills. but wat is grasp?
Grasp is Blossom for git repositories. You can host your repositories on multiple servers and their state is attested by Nostr events you publish that tell people where to fetch from and what the latest head and branches are, so the servers don't have to be trusted and it's easy to switch servers while keeping your repository identifiers immutable (they're essentially your pubkey + an arbitrary identifier). Grasp servers can be self-hosted, paid, or run for free by some community benefactor, and you can mix and use all of these at the same time. The existence of grasp servers also makes it easy for people to clone repositories, create branches and make them available to be merged by others, all without leaving the terminal (but of course there can also be all kinds of clients), which is the GitHub "pull request" UX people are familiar with.
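For concreteness, the underlying NIP-34 events could be published with nak roughly like this; a minimal sketch, assuming the identifier `my-repo`, placeholder relay/server URLs and commit hash, and that nak's `-t` flag splits on the first `=`:

```sh
# kind 30617: repo announcement listing the grasp servers the repo lives on
nak event -k 30617 -t d=my-repo -t name=my-repo \
  -t 'clone=https://grasp1.example.com/npub1xxxxxxxx/my-repo.git' \
  -t 'clone=https://grasp2.example.com/npub1xxxxxxxx/my-repo.git' \
  wss://relay.example.com

# kind 30618: repo state attesting the current HEAD and branch tips
nak event -k 30618 -t d=my-repo \
  -t 'HEAD=ref: refs/heads/main' \
  -t 'refs/heads/main=c8c4e344a9c0b4008bb72eebb188be7d7b83dcb1' \
  wss://relay.example.com
```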
Full Grasp support is coming to BudaBit soon; the primitives are already implemented. 1. What we struggled with is the creation flow, where we first post the repo announcement and then push the newly created git repo, but that has to be separated by an arbitrary sleep(X ms) wait, which isn't really ideal. Do you guys have ideas to improve this flow? 2. I think this was discussed before but I want to get a fresh opinion from y'all: a unified grasp API for file browsing and perhaps diffs and other really data-heavy ops? Cloning repos via git smart HTTP can still be a fallback, but this would benefit performance a lot. Blossom has an API as well, so I guess this would make sense, especially in a browser context. @fiatjaf @DanConwayDev
2. In the browser context we cannot rely on isomorphic-git as it doesn't support sparse clones, and the UX of a shallow clone (with blobs) isn't good enough for large repos, e.g. try browsing one for the first time. I think we should explore directly requesting a pack from the HTTP endpoint containing the blobs we need (for a specific file) instead of relying on isomorphic-git. We could create a javascript library that does just this. If this doesn't work then we should add an API endpoint for files, listing directories, etc. My concerns about the API are 1) it enables the use of a grasp server as a CDN for files in a repository, and 2) where would we stop in terms of the API? There isn't a clear boundary and we could end up recreating all git commands / options as an API, which makes it a more complex protocol and harder to implement. Someone nearly attempted to add sparse clone to isomorphic-git, but there are baked-in assumptions in many parts of the codebase that blobs are present, so it would require a larger change and it might be hard to get merged as a first-time contributor.
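For comparison, this is the UX native git already gives against a plain smart-HTTP endpoint, which is roughly what a browser-side library (or an API) would have to replicate; the URL and paths here are placeholders:

```sh
# blobless partial clone: full history and trees, no file contents yet
git clone --filter=blob:none --no-checkout https://grasp.example.com/npub1xxxxxxxx/repo.git
cd repo

# only materialise one directory; the needed blobs are fetched on demand
git sparse-checkout set src/
git checkout main

# a single file can also be pulled lazily from the promisor remote
git cat-file -p HEAD:README.md
```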
I dug into this a while back and hacked together a POC implementation. I wanted to store objects in blossom, so I extended nip-34 to include content-addressable objects. You probably shouldn't do this, but there are no bad ideas, right?

**Standard NIP-34 repository state (kind 30618):**

```json
["refs/heads/main", "c8c4e344a9c0b4008bb72eebb188be7d7b83dcb1"]
```

**Extension for content-addressed storage:**

```json
["ref", "heads/main", "c8c4e344a9c0b4008bb72eebb188be7d7b83dcb1", "9adca865e86f755634dc0e594b74ba7d6ff5a66e4070c278e29ecda1bf7f4150"]
["obj", "589d02c42ef724523ceba0697315e36520332993", "abc123def456789012345678901234567890abcdef123456789012345678901234"]
["obj", "e63987bfc58e1197df38e5edb1d9930eb08d8589", "def456abc789012345678901234567890123abcdef456789012345678901234567"]
```

**Extensions:**
- **4th parameter in `ref` tags**: maps a Git commit SHA to its content-addressed storage hash (Blossom SHA-256)
- **`obj` tags**: map Git object SHAs (trees, blobs) to their content-addressed storage hashes
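Resolving one of those mappings client-side would then just be a Blossom fetch by SHA-256; a minimal sketch, assuming a server at blossom.example.com and the placeholder hash from the `ref` tag above:

```sh
# download the content-addressed copy of the object and verify its hash
curl -fO https://blossom.example.com/9adca865e86f755634dc0e594b74ba7d6ff5a66e4070c278e29ecda1bf7f4150
sha256sum 9adca865e86f755634dc0e594b74ba7d6ff5a66e4070c278e29ecda1bf7f4150
```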
Interesting. You'd end up with a lot of objects with that approach, and eventually it would be too big for the event size. I thought about doing it by storing packs in blossom. Here is my code to play with that idea. I would have made it into a POC if rust-nostr had blossom support at the time. It does now. It turns out that having a git server is way more flexible, so ngit.dev/grasp came to be. Let git be git and let nostr be nostr.
DanConwayDev
```diff
From 6bcb58925ad5a7ec2421718fb2996add9080f7bc Mon Sep 17 00:00:00 2001
From: DanConwayDev <DanConwayDev@protonmail.com>
Date: Fri, 15 Nov 2024 11:57:10 +0000
Subject: [PATCH] feat(blossom): blossom as remote using packs

This is a WIP exploration of the use of blossom as an optional alternative to using a git server.

The incomplete code focuses on how blossom could fit with nip34 to most efficently replace the git server. It is missing the actual blossom interaction which would hopefully would be facilited by a new blossom feature in rust-nostr.

This implementation tries to minimise the number of blobs required for download by using packs. If a branch tip is at height 1304 it will split the commits in into a number of packs. a pack the first 1024 commits, the next 256, the next 16 and the final 8.

I planned for the identification of blossom servers to mirror the approach taken for relays:

1. list repository blossom servers in repo announcement event kind 30617
2. also push to user blossom servers in the standard event for that

This is not implemented, along with the rest of the blossom aspects.

I'm publishing this now as @npub1elta...cume has recently published a POC of an alternative approach and it makes sense to this alternative idea.
---
 Cargo.lock                        |   1 +
 Cargo.toml                        |   1 +
 src/bin/git_remote_nostr/fetch.rs |   4 ++++
 src/bin/git_remote_nostr/list.rs  |  23 ++++++++++++++++++++++-
 src/bin/git_remote_nostr/push.rs  | 124 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 src/lib/repo_state.rs             |  17 ++++++++++++++++-
 6 files changed, 163 insertions(+), 7 deletions(-)

diff --git a/Cargo.lock b/Cargo.lock
index b20b60a..72b37a2 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1805,6 +1805,7 @@ dependencies = [
  "serde_json",
  "serde_yaml",
  "serial_test",
+ "sha2",
  "test_utils",
  "tokio",
  "urlencoding",
diff --git a/Cargo.toml b/Cargo.toml
index ed99aea..320a9f0 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -38,6 +38,7 @@ serde_yaml = "0.9.27"
 tokio = "1.33.0"
 urlencoding = "2.1.3"
 zeroize = "1.6.0"
+sha2 = "0.10.8"
 
 [dev-dependencies]
 assert_cmd = "2.0.12"
diff --git a/src/bin/git_remote_nostr/fetch.rs b/src/bin/git_remote_nostr/fetch.rs
index a972a2f..a1116c5 100644
--- a/src/bin/git_remote_nostr/fetch.rs
+++ b/src/bin/git_remote_nostr/fetch.rs
@@ -49,6 +49,10 @@ pub async fn run_fetch(
     let term = console::Term::stderr();
 
     for git_server_url in &repo_ref.git_server {
+        if git_server_url.eq("blossom") {
+            // TODO download missing blobs
+            continue;
+        }
         let term = console::Term::stderr();
         if let Err(error) = fetch_from_git_server(
             git_repo,
diff --git a/src/bin/git_remote_nostr/list.rs b/src/bin/git_remote_nostr/list.rs
index 92faa6b..d71c2d1 100644
--- a/src/bin/git_remote_nostr/list.rs
+++ b/src/bin/git_remote_nostr/list.rs
@@ -43,7 +43,28 @@ pub async fn run_list(
     let term = console::Term::stderr();
 
-    let remote_states = list_from_remotes(&term, git_repo, &repo_ref.git_server, decoded_nostr_url);
+    let mut remote_states = list_from_remotes(
+        &term,
+        git_repo,
+        &repo_ref
+            .git_server
+            .iter()
+            // blossom will always match nostr state
+            .filter(|s| !s.starts_with("blossom"))
+            .map(std::borrow::ToOwned::to_owned)
+            .collect::<Vec<String>>(),
+        decoded_nostr_url,
+    );
+    if repo_ref.git_server.iter().any(|s| s.eq("blossom")) {
+        if let Some(nostr_state) = nostr_state.clone() {
+            remote_states.insert("blossom".to_owned(), nostr_state.state.clone());
+        } else if let Some((_, state)) = remote_states.iter().last() {
+            remote_states.insert("blossom".to_owned(), state.clone());
+        } else {
+            // create blank state if no nostr state exists yet
+            remote_states.insert("blossom".to_owned(), HashMap::new());
+        }
+    }
 
     let mut state = if let Some(nostr_state) = nostr_state {
         for (name, value) in &nostr_state.state {
diff --git a/src/bin/git_remote_nostr/push.rs b/src/bin/git_remote_nostr/push.rs
index db86c04..a12e8ba 100644
--- a/src/bin/git_remote_nostr/push.rs
+++ b/src/bin/git_remote_nostr/push.rs
@@ -2,6 +2,7 @@ use core::str;
 use std::{
     collections::{HashMap, HashSet},
     io::Stdin,
+    str::FromStr,
     sync::{Arc, Mutex},
     time::Instant,
 };
@@ -11,7 +12,7 @@ use auth_git2::GitAuthenticator;
 use client::{get_events_from_cache, get_state_from_cache, send_events, sign_event, STATE_KIND};
 use console::Term;
 use git::{sha1_to_oid, RepoActions};
-use git2::{Oid, Repository};
+use git2::{Buf, Commit, Oid, Repository};
 use git_events::{
     generate_cover_letter_and_patch_events, generate_patch_event, get_commit_id_from_patch,
 };
@@ -29,11 +30,17 @@ use ngit::{
 };
 use nostr::nips::nip10::Marker;
 use nostr_sdk::{
-    hashes::sha1::Hash as Sha1Hash, Event, EventBuilder, EventId, Kind, PublicKey, Tag,
+    hashes::{
+        hex::DisplayHex,
+        sha1::Hash as Sha1Hash,
+        sha256::{self, Hash as Sha256Hash},
+    },
+    Event, EventBuilder, EventId, Kind, PublicKey, Tag,
 };
 use nostr_signer::NostrSigner;
 use repo_ref::RepoRef;
 use repo_state::RepoState;
+use sha2::{Digest, Sha256};
 
 use crate::{
     client::Client,
@@ -74,7 +81,17 @@ pub async fn run_push(
     let list_outputs = match list_outputs {
         Some(outputs) => outputs,
-        _ => list_from_remotes(&term, git_repo, &repo_ref.git_server, decoded_nostr_url),
+        _ => list_from_remotes(
+            &term,
+            git_repo,
+            &repo_ref
+                .git_server
+                .iter()
+                .filter(|s| !s.eq(&"blossom"))
+                .map(std::string::ToString::to_string)
+                .collect(),
+            decoded_nostr_url,
+        ),
     };
     let nostr_state = get_state_from_cache(git_repo.get_path()?, repo_ref).await;
@@ -150,11 +167,24 @@ pub async fn run_push(
         }
     }
 
+    let mut blossom_packs: Option<HashMap<sha256::Hash, Buf>> = None;
     if !git_server_refspecs.is_empty() {
         let new_state = generate_updated_state(git_repo, &existing_state, &git_server_refspecs)?;
+        let blossom_hashes = if repo_ref.git_server.contains(&"blossom".to_string()) {
+            let (blossom_hashes, packs) = create_blossom_packs(&new_state, git_repo)?;
+            blossom_packs = Some(packs);
+            blossom_hashes
+        } else {
+            HashSet::new()
+        };
 
-        let new_repo_state =
-            RepoState::build(repo_ref.identifier.clone(), new_state, &signer).await?;
+        let new_repo_state = RepoState::build(
+            repo_ref.identifier.clone(),
+            new_state,
+            blossom_hashes,
+            &signer,
+        )
+        .await?;
 
         events.push(new_repo_state.event);
@@ -325,6 +355,13 @@ pub async fn run_push(
     // TODO make async - check gitlib2 callbacks work async
 
+    if let Some(packs) = blossom_packs {
+        // TODO: upload blossom packs
+        for (_hash, _pack) in packs {
+            // blossom::upload(pack)
+        }
+    }
+
     for (git_server_url, remote_refspecs) in remote_refspecs {
         let remote_refspecs = remote_refspecs
             .iter()
@@ -863,6 +900,71 @@ fn generate_updated_state(
     Ok(new_state)
 }
 
+fn create_blossom_packs(
+    state: &HashMap<String, String>,
+    git_repo: &Repo,
+) -> Result<(HashSet<sha256::Hash>, HashMap<sha256::Hash, Buf>)> {
+    let mut blossom_hashes = HashSet::new();
+    let mut blossom_packs = HashMap::new();
+    for commit_id in state.values() {
+        if let Ok(oid) = Oid::from_str(commit_id) {
+            if let Ok(commit) = git_repo.git_repo.find_commit(oid) {
+                let height = get_height(&commit, git_repo)?;
+                let mut revwalk = git_repo.git_repo.revwalk()?;
+                revwalk.push(oid)?;
+                let mut counter = 0;
+                for pack_size in split_into_powers_of_2(height) {
+                    let mut pack = git_repo.git_repo.packbuilder()?;
+                    while counter < pack_size {
+                        if let Some(oid) = revwalk.next() {
+                            pack.insert_commit(oid?)?;
+                            counter += 1;
+                        }
+                    }
+                    let mut buffer = Buf::new();
+                    pack.write_buf(&mut buffer)?;
+                    let hash = buffer_to_sha256_hash(&buffer);
+                    blossom_hashes.insert(hash);
+                    blossom_packs.insert(hash, buffer);
+                    counter = 0;
+                }
+            }
+        }
+    }
+    Ok((blossom_hashes, blossom_packs))
+}
+
+fn get_height(commit: &Commit, git_repo: &Repo) -> Result<u32> {
+    let mut revwalk = git_repo.git_repo.revwalk()?;
+    revwalk.push(commit.id())?;
+    Ok(u32::try_from(revwalk.count())?)
+}
+
+fn split_into_powers_of_2(height: u32) -> Vec<u32> {
+    let mut powers = Vec::new();
+    let mut remaining = height;
+
+    // Decompose the height into powers of 2
+    for i in (0..32).rev() {
+        let power = 1 << i; // Calculate 2^i
+        while remaining >= power {
+            powers.push(power);
+            remaining -= power;
+        }
+    }
+
+    powers
+}
+
+fn buffer_to_sha256_hash(buffer: &Buf) -> sha256::Hash {
+    let mut hasher = Sha256::new();
+    hasher.update(buffer.as_ref());
+    let hash = hasher
+        .finalize()
+        .to_hex_string(nostr_sdk::hashes::hex::Case::Lower);
+    sha256::Hash::from_str(&hash).unwrap()
+}
+
 async fn get_merged_status_events(
     term: &console::Term,
     repo_ref: &RepoRef,
@@ -1186,6 +1288,7 @@ trait BuildRepoState {
     async fn build(
         identifier: String,
         state: HashMap<String, String>,
+        blossom: HashSet<Sha256Hash>,
         signer: &NostrSigner,
     ) -> Result<RepoState>;
 }
@@ -1193,6 +1296,7 @@ impl BuildRepoState for RepoState {
     async fn build(
         identifier: String,
         state: HashMap<String, String>,
+        blossom: HashSet<Sha256Hash>,
         signer: &NostrSigner,
     ) -> Result<RepoState> {
         let mut tags = vec![Tag::identifier(identifier.clone())];
@@ -1202,10 +1306,20 @@ impl BuildRepoState for RepoState {
                 vec![value.clone()],
             ));
         }
+        if !blossom.is_empty() {
+            tags.push(Tag::custom(
+                nostr_sdk::TagKind::Custom("blossom".into()),
+                blossom
+                    .iter()
+                    .map(std::string::ToString::to_string)
+                    .collect::<Vec<String>>(),
+            ));
+        }
         let event = sign_event(EventBuilder::new(STATE_KIND, "", tags), signer).await?;
         Ok(RepoState {
             identifier,
             state,
+            blossom,
             event,
         })
     }
diff --git a/src/lib/repo_state.rs b/src/lib/repo_state.rs
index c3a7606..19e78b6 100644
--- a/src/lib/repo_state.rs
+++ b/src/lib/repo_state.rs
@@ -1,11 +1,17 @@
-use std::collections::HashMap;
+use std::{
+    collections::{HashMap, HashSet},
+    str::FromStr,
+};
 
 use anyhow::{Context, Result};
 use git2::Oid;
+use nostr_sdk::hashes::sha256::Hash;
 
+#[derive(Clone)]
 pub struct RepoState {
     pub identifier: String,
     pub state: HashMap<String, String>,
+    pub blossom: HashSet<Hash>,
     pub event: nostr::Event,
 }
 
@@ -14,6 +20,7 @@ impl RepoState {
         state_events.sort_by_key(|e| e.created_at);
         let event = state_events.first().context("no state events")?;
         let mut state = HashMap::new();
+        let mut blossom = HashSet::new();
         for tag in event.tags.iter() {
             if let Some(name) = tag.as_slice().first() {
                 if ["refs/heads/", "refs/tags", "HEAD"]
@@ -26,6 +33,13 @@
                         }
                     }
                 }
+                if name.eq("blossom") {
+                    for s in tag.clone().to_vec() {
+                        if let Ok(hash) = Hash::from_str(&s) {
+                            blossom.insert(hash);
+                        }
+                    }
+                }
             }
         }
         Ok(RepoState {
@@ -35,6 +49,7 @@
                 .context("existing event must have an identifier")?
                 .to_string(),
             state,
+            blossom,
             event: event.clone(),
         })
     }
-- 
libgit2 1.8.1
```
you could use Go and i already have a second draft blossom server written in go. i didn't write it. claude spun it up in about 3 hours and then another hour fixing it, and i just haven't properly tested it yet. i know it works because it's just http and the tests pass, and i saw it accepting a random blob and letting me delete it several times. i just haven't used it. it can probably already serve you blossom. imma make sure you both have permissions in case you want to try it
my take on this is: look into techniques used in computer games. i remember when GTA3 came out, and its most epic achievement was loading-free inter-map transit. still very few games use this, but it's a graph theory algorithm. this is the kind of thing you need to automatically and quickly partition a map of related data. you need metrics of proximity and some kind of parameters for partitioning the map to fit the compute you need to do. it's not hard, but it may take a while to wrap your head around it. graphs at high node count are N!-style compute cost, so it only takes like 3 or 4 levels deep and you are practically at infinity as far as what even the most powerful computers can do in milliseconds.
yeah, you'd bloat the event with every object ref as the repo grows. not a great design but a fun poc. i wrote my poc in go. the code is actually hosted on itself, as the poc is a relay/blossom/webui all-in-one binary server. i also wrote my own git-nostr-remote for the client side. it was a fun hack and generally works for the happy path. not planning to pursue it. i can share the code if you're interested in it.
humbug, imwald is sending it to my relay first. anyway, it happens now and then: i see in the upload popup on the bottom right that if my relay isn't sent the event first, it returns that it already has it.
The problem is that git enables many features like shallow and sparse cloning, packing specific objects and data, getting specific files and git logs, etc. These are all battle-tested on solid git server implementations. None of this is possible when trying to reinvent a simplified git server with blossom.
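For instance, all of this already works against any stock git server and is what a blossom-based reimplementation would have to rebuild (the URL is a placeholder):

```sh
# shallow clone: just the most recent commit
git clone --depth 1 https://git.example.com/repo.git
cd repo

# deepen the history later without re-cloning
git fetch --deepen=50

# path-limited history over whatever has been fetched so far
git log --oneline -- src/lib.rs
```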
i'm definitely with you on that. i personally would even suggest not writing one single bit of code that handles git. palm it off to Linus' excellent implementation and then see what it misses. i always envisioned nostr as just negotiation, rendezvous and easy inbound connection. i've already tried to work with an attempt to clone the core features of git in pure Go, and it was endlessly problematic, with even minimal lag between their work and what the git project has already progressed to. git is just a unix shell protocol, based on stdio and the unix filesystem. don't overthink it. the protocol only needs to provide the correct references and paths.
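That is also literally how git talks to a remote helper like ngit's git-remote-nostr: a line-oriented conversation over stdin/stdout. A rough illustration with made-up values; the actual contract is in gitremote-helpers(7):

```sh
git fetch origin   # for a nostr:// remote this spawns: git-remote-nostr origin nostr://npub1xxxxxxxx/repo

# the conversation on the helper's stdin/stdout then looks roughly like:
#   git  -> helper:  capabilities
#   helper -> git:   fetch
#                    push
#                    (blank line)
#   git  -> helper:  list
#   helper -> git:   @refs/heads/main HEAD
#                    c8c4e344a9c0b4008bb72eebb188be7d7b83dcb1 refs/heads/main
#                    (blank line)
#   git  -> helper:  fetch c8c4e344a9c0b4008bb72eebb188be7d7b83dcb1 refs/heads/main
#   helper -> git:   (blank line once the objects have been written into the local repo)
```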
also, it's just combinatorial stuff. you can't efficiently deal with multidimensional graphs; you want to stick with stuff that can be flattened into a 2d representation. branches and layers are basically exponentially more complex. git is built on directed acyclic graph geometry. there are a lot of shortcuts because you don't have to escape loops.
i dream of a day where other people understand that most things are just pipes and shells and access control. if you ever read me on a regular basis you know the last point is my biggest gripe with most devs, and i have already experienced first hand how hard it is to explain the first two to most devs. networks are just pipes with extra steps.
you mean, the retard and the jedi saying that, of course. yeah, don't reinvent the wheel if you don't have to. servers that translate between the protocol and git are quite trivial to implement. there are no realtime, low-latency requirements here; even as much as 5 seconds to lay down a commit and propagate it is fine. never prematurely optimize, and don't roll your own if you can just assemble someone else's stuff into a shape that meshes with your actual part of it.
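To that point, most of the "server" already ships with git itself; a sketch of wrapping the stock binary instead of reimplementing it (paths are placeholders):

```sh
# serve every repo under /srv/git over the native git:// protocol
git daemon --reuseaddr --export-all --base-path=/srv/git /srv/git

# smart HTTP is the same idea: point any CGI-capable web server at
# `git http-backend` with GIT_PROJECT_ROOT=/srv/git and GIT_HTTP_EXPORT_ALL=1 set
```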
speaking from experience: ever since picking up LLMs to help me with this, there are many things that just are not practical to do at such scale in any sane amount of time. it was a harsh burn discovering that i couldn't do git stuff in pure go. the go version just isn't nearly adequate. it seemed to be working, and then i got all this mess going on. what was the error? i forget, it was some protocol network shit iirc. annoying af. that's why it's just a plain gitea. i wanted to host only my repos and not have that stutter in the URL. i spent probably weeks trying to get that working, and in the end it was futile. linus seems to be turning into a javascript ninja with all his fucking breaking changes these last few years. fuck that guy. just use his binary and interface to the git repo through it. the bitch has got too complicated to build from scratch. oh, sure, we could do all kinds of things involving metadata on nostr and all that shit, but you really should just pause before you race off and do that, go clone the git repo, and tell an LLM to explain it to you. 30 screenfuls of documentation later, you will be in agreement with me. nah, just call git via your preferred language's method for executing child processes. the end.
Issues and PRs (kinds 9803/9804) are automatically published to nostr on handled status changes (merged, closed and reopened). I fetch them from the source if possible on import of the repo and try to aggregate them by their timestamps with the nostr kinds. If the source is, let's say, GitHub, I'm not additionally upstreaming the edits there so far. Anyway, it still needs polish in finding these kinds better, and the flows are surely not the endgame, but that's what I went with so far πŸ€“
I'm AFC today, but it's like "branches with PR/* are supposed to go through the nostr network, not git servers". I am working on a PR to fix it for you. I pushed the initial way I was doing it, but it's not 100% there. Let me know if the way I'm doing that, using a separate nak git PR-type command, is not how you want to solve it. While I have your attention: would you like to have a nak kanban command set to work with nip 100 stuff (and via MCP)? I am doing it all with nak already via bash, but I figured it would be cool for me to build it into the MCP there so you could have a project manager agent able to create, assign and manage work (likely for other AI agents).