Background

A few months ago, I was quite delighted that Microsoft had decided to open source Amudai on GitHub as I see it as a serious contender to Apache Parquet if they add support for union types the «right way»:

Happy times

However, if you click on the link, you will sadly notice that the project no longer exists on GitHub. I wrote a comment on the latest LinkedIn post from Professor Smoke, but I have received no reply from him:

The project is no longer accessible, sadly

I recall Professor Smoke mentioning, in one of his talks, that they need to request funding internally at Microsoft. Perhaps the project didn’t manage to secure further funding, which would be really sad, or perhaps Microsoft has decided to keep development private until they have a stable version (let’s hope).

Wayback Machine

However, as they had some really insightful and interesting points in their design, architecture and specification documentation, I would really like to be able to read it again (and again). After all, it was an open source project, released under the Apache 2.0 license. My mistake was that I only accessed the project from my browser. Sadly, I neither cloned nor forked the project.

In these cases, you tend to use the Wayback Machine:

The project has been saved 2 times: July 27, 2025 and August 2, 2025.

By choosing the latest snapshot:

We can see the snapshot from August 2, 2025

We can then click on Insights (Pulse) to see if any GitHub user has forked the project, but in this case, that information was not available:

Sadly, there is no content
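The snapshot lookup can also be done from a terminal. The following is a minimal sketch, assuming the public Wayback Machine availability endpoint (`archive.org/wayback/available`) and the usual `web.archive.org/web/<timestamp>/<url>` snapshot address scheme. The function names `wayback_query_url` and `wayback_snapshot_url` are hypothetical helpers made up for this example, and the actual network call is left commented out.

```shell
#!/usr/bin/env bash

# Build the URL of the Wayback Machine availability API for a page.
# $1 = page URL, $2 = optional timestamp (YYYYMMDD) to search near.
wayback_query_url() {
    local page="$1" ts="${2:-}"
    local url="https://archive.org/wayback/available?url=${page}"
    if [ -n "${ts}" ]; then
        url="${url}&timestamp=${ts}"
    fi
    printf '%s\n' "${url}"
}

# Build a direct snapshot URL; the Wayback Machine is expected to redirect
# to the capture closest to the given timestamp.
# $1 = timestamp (YYYYMMDD), $2 = page URL.
wayback_snapshot_url() {
    printf 'https://web.archive.org/web/%s/%s\n' "$1" "$2"
}

# Example (needs network access, so it is left commented out):
# curl -s "$(wayback_query_url github.com/microsoft/amudai 20250802)"
```

This only tells us whether a page was captured. It does not recover anything that the crawler never saved, which is exactly the problem with the forks list here.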

So what do we do now? Accept defeat and give up?

GH Archive

After a few internet searches, the old-school way, I somehow arrived at the following YouTube video: How To Access Private and Deleted Github Repositories Data. At the 06:33 mark I was introduced to GH Archive, which I had never heard of before. Apparently, they store all the public GitHub events (only the events, not the repository data). By reading about the different types of events, it became clear that I needed to look for the ForkEvent type in the event logs.

Also, it would be ideal to find the latest fork, as that would contain the most up-to-date version of the project. Therefore, I wrote the following bash script:

#!/usr/bin/env bash

#ys="2026"
#ms="02 01"

ys="2025"
ms="12 11 10 09 08 07"

ds="31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01"
hs="23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0"

clear

echo "# ensure info folder exists"
mkdir -vp ./info
echo

echo "# clean, create and change to data folder"
rm    -vfr ./data
mkdir -vp  ./data
cd         ./data || exit 1
echo

echo "# crawl data for each year/month/day (newest -> oldest)"
echo

for y in ${ys}; do
    echo "## crawl for year: ${y}"
    echo
    for m in ${ms}; do
        echo "### crawl for month: ${m}"
        echo
        for d in ${ds}; do
            echo "#### crawl for day: ${d}"
            echo
            for h in ${hs}; do
                echo "* get: https://data.gharchive.org/${y}-${m}-${d}-${h}.json.gz"
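                # skip hours that do not exist (e.g. the 31st of a 30-day month)
                wget "https://data.gharchive.org/${y}-${m}-${d}-${h}.json.gz" \
                    || { echo "* not available, skipping"; echo; continue; }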
                echo
                echo "* extract: ./data/${y}-${m}-${d}-${h}.json.gz"
                gunzip -v "${y}-${m}-${d}-${h}.json.gz"
                echo
                echo "* look for 'microsoft/amudai'"
                grep 'microsoft/amudai' ${y}-${m}-${d}-${h}.json \
                     >> ../info/${y}_${m}_events.json
                echo
                echo "* remove: ./data/${y}-${m}-${d}-${h}.json"
                rm -v "${y}-${m}-${d}-${h}.json"
                echo
            done
            echo
        done
        echo
    done
    echo
done

Because some of the files are large when decompressed, they are retrieved sequentially rather than concurrently, so as not to saturate the internet connection and to keep disk space consumption low.
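Disk usage can be pushed even lower by never storing the decompressed file at all. The sketch below is a variation on the script above, not what I actually ran: it streams each hourly archive through `gunzip` and `grep`, so only the matching lines ever reach the disk. The helper names `gharchive_url`, `filter_repo` and `scan_hour` are made up for this example.

```shell
#!/usr/bin/env bash

# Build the GH Archive URL for one hour.
# $1 = year, $2 = month (zero-padded), $3 = day (zero-padded),
# $4 = hour (not zero-padded, 0..23).
gharchive_url() {
    printf 'https://data.gharchive.org/%s-%s-%s-%s.json.gz\n' "$1" "$2" "$3" "$4"
}

# Read a gzip-compressed event stream on stdin and print only the lines
# that mention the given repository ($1), matched as a fixed string.
filter_repo() {
    gunzip -c | grep -F -- "$1"
}

# Download one hour and filter it on the fly; nothing is written to disk
# by this function itself.
# $1..$4 = year, month, day, hour; $5 = repository name.
scan_hour() {
    wget -q -O - "$(gharchive_url "$1" "$2" "$3" "$4")" | filter_repo "$5"
}
```

A call such as `scan_hour 2025 12 14 1 microsoft/amudai >> ./info/2025_12_events.json` would then replace the download, extract, grep and remove steps for a single hour.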

NOTE: Notice how 2026 and the corresponding months are commented out. I initially searched these months, but as no fork event was present, I decided to investigate 2025, working backwards towards the latest Wayback Machine snapshot.

I stopped the script as soon as it had finished December 2025:

ll ~/temp/amudai/info/
total 8
-rw-r--r-- 1 johndoe users 7914 Feb  3 21:24 2025_12_events.json
-rw-r--r-- 1 johndoe users    0 Feb  3 20:11 2026_01_events.json
-rw-r--r-- 1 johndoe users    0 Feb  3 20:50 2026_02_events.json

The reason is that the December file contained a ForkEvent created at “2025-12-14T01:24:59Z” (the latest one):

{
  "id": "5285444499",
  "type": "ForkEvent",
  "actor": {
    "id": 243496075,
    "login": "secops2001",
    "display_login": "secops2001",
    "gravatar_id": "",
    "url": "https://api.github.com/users/secops2001",
    "avatar_url": "https://avatars.githubusercontent.com/u/243496075?"
  },
  "repo": {
    "id": 978785383,
    "name": "microsoft/amudai",
    "url": "https://api.github.com/repos/microsoft/amudai"
  },
  "payload": {
    "action": "forked",
    "forkee": {
      "id": 1115997402,
      "node_id": "R_kgDOQoTE2g",
      "name": "amudai",
      "full_name": "secops2001/amudai",
      "private": false,
      "owner": {
        "login": "secops2001",
        "id": 243496075,
        "node_id": "U_kgDODoN0iw",
        "avatar_url": "https://avatars.githubusercontent.com/u/243496075?v=4",
        "gravatar_id": "",
        "url": "https://api.github.com/users/secops2001",
        "html_url": "https://github.com/secops2001",
        "followers_url": "https://api.github.com/users/secops2001/followers",
        "following_url": "https://api.github.com/users/secops2001/following{/other_user}",
        "gists_url": "https://api.github.com/users/secops2001/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/secops2001/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/secops2001/subscriptions",
        "organizations_url": "https://api.github.com/users/secops2001/orgs",
        "repos_url": "https://api.github.com/users/secops2001/repos",
        "events_url": "https://api.github.com/users/secops2001/events{/privacy}",
        "received_events_url": "https://api.github.com/users/secops2001/received_events",
        "type": "User",
        "user_view_type": "public",
        "site_admin": false
      },
      "html_url": "https://github.com/secops2001/amudai",
      "description": null,
      "fork": true,
      "url": "https://api.github.com/repos/secops2001/amudai",
      "forks_url": "https://api.github.com/repos/secops2001/amudai/forks",
      "keys_url": "https://api.github.com/repos/secops2001/amudai/keys{/key_id}",
      "collaborators_url": "https://api.github.com/repos/secops2001/amudai/collaborators{/collaborator}",
      "teams_url": "https://api.github.com/repos/secops2001/amudai/teams",
      "hooks_url": "https://api.github.com/repos/secops2001/amudai/hooks",
      "issue_events_url": "https://api.github.com/repos/secops2001/amudai/issues/events{/number}",
      "events_url": "https://api.github.com/repos/secops2001/amudai/events",
      "assignees_url": "https://api.github.com/repos/secops2001/amudai/assignees{/user}",
      "branches_url": "https://api.github.com/repos/secops2001/amudai/branches{/branch}",
      "tags_url": "https://api.github.com/repos/secops2001/amudai/tags",
      "blobs_url": "https://api.github.com/repos/secops2001/amudai/git/blobs{/sha}",
      "git_tags_url": "https://api.github.com/repos/secops2001/amudai/git/tags{/sha}",
      "git_refs_url": "https://api.github.com/repos/secops2001/amudai/git/refs{/sha}",
      "trees_url": "https://api.github.com/repos/secops2001/amudai/git/trees{/sha}",
      "statuses_url": "https://api.github.com/repos/secops2001/amudai/statuses/{sha}",
      "languages_url": "https://api.github.com/repos/secops2001/amudai/languages",
      "stargazers_url": "https://api.github.com/repos/secops2001/amudai/stargazers",
      "contributors_url": "https://api.github.com/repos/secops2001/amudai/contributors",
      "subscribers_url": "https://api.github.com/repos/secops2001/amudai/subscribers",
      "subscription_url": "https://api.github.com/repos/secops2001/amudai/subscription",
      "commits_url": "https://api.github.com/repos/secops2001/amudai/commits{/sha}",
      "git_commits_url": "https://api.github.com/repos/secops2001/amudai/git/commits{/sha}",
      "comments_url": "https://api.github.com/repos/secops2001/amudai/comments{/number}",
      "issue_comment_url": "https://api.github.com/repos/secops2001/amudai/issues/comments{/number}",
      "contents_url": "https://api.github.com/repos/secops2001/amudai/contents/{+path}",
      "compare_url": "https://api.github.com/repos/secops2001/amudai/compare/{base}...{head}",
      "merges_url": "https://api.github.com/repos/secops2001/amudai/merges",
      "archive_url": "https://api.github.com/repos/secops2001/amudai/{archive_format}{/ref}",
      "downloads_url": "https://api.github.com/repos/secops2001/amudai/downloads",
      "issues_url": "https://api.github.com/repos/secops2001/amudai/issues{/number}",
      "pulls_url": "https://api.github.com/repos/secops2001/amudai/pulls{/number}",
      "milestones_url": "https://api.github.com/repos/secops2001/amudai/milestones{/number}",
      "notifications_url": "https://api.github.com/repos/secops2001/amudai/notifications{?since,all,participating}",
      "labels_url": "https://api.github.com/repos/secops2001/amudai/labels{/name}",
      "releases_url": "https://api.github.com/repos/secops2001/amudai/releases{/id}",
      "deployments_url": "https://api.github.com/repos/secops2001/amudai/deployments",
      "created_at": "2025-12-14T01:24:59Z",
      "updated_at": "2025-12-14T01:24:59Z",
      "pushed_at": "2025-12-12T03:30:09Z",
      "git_url": "git://github.com/secops2001/amudai.git",
      "ssh_url": "git@github.com:secops2001/amudai.git",
      "clone_url": "https://github.com/secops2001/amudai.git",
      "svn_url": "https://github.com/secops2001/amudai",
      "homepage": "",
      "size": 3525,
      "stargazers_count": 0,
      "watchers_count": 0,
      "language": null,
      "has_issues": false,
      "has_projects": true,
      "has_downloads": true,
      "has_wiki": true,
      "has_pages": false,
      "has_discussions": false,
      "forks_count": 0,
      "mirror_url": null,
      "archived": false,
      "disabled": false,
      "open_issues_count": 0,
      "license": {
        "key": "apache-2.0",
        "name": "Apache License 2.0",
        "spdx_id": "Apache-2.0",
        "url": "https://api.github.com/licenses/apache-2.0",
        "node_id": "MDc6TGljZW5zZTI="
      },
      "allow_forking": true,
      "is_template": false,
      "web_commit_signoff_required": false,
      "topics": [],
      "visibility": "public",
      "forks": 0,
      "open_issues": 0,
      "watchers": 0,
      "default_branch": "main"
    }
  },
  "public": true,
  "created_at": "2025-12-14T01:24:59Z",
  "org": {
    "id": 6154722,
    "login": "microsoft",
    "gravatar_id": "",
    "url": "https://api.github.com/orgs/microsoft",
    "avatar_url": "https://avatars.githubusercontent.com/u/6154722?"
  }
}
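Picking the latest fork out of a collected events file by eye gets tedious, so here is a rough sketch of how it could be automated with the same plain text tools used above. It assumes one JSON event per line, that the first `full_name` on a line is the fork, and that the last `created_at` on a line is the event's own timestamp (as in the event shown above). The function name `list_forks` is hypothetical, and a JSON-aware tool such as jq would be more robust than these regular expressions.

```shell
#!/usr/bin/env bash

# Read event lines on stdin and print "created_at<TAB>fork" for every
# ForkEvent, newest first.
list_forks() {
    local line fork created
    grep -E '"type"[[:space:]]*:[[:space:]]*"ForkEvent"' \
    | while IFS= read -r line; do
        # first "full_name" on the line: assumed to be the forkee
        fork=$(printf '%s\n' "${line}" \
            | grep -oE '"full_name"[[:space:]]*:[[:space:]]*"[^"]+"' \
            | head -n 1 | sed -E 's/.*"([^"]+)"$/\1/')
        # last "created_at" on the line: assumed to be the event timestamp
        created=$(printf '%s\n' "${line}" \
            | grep -oE '"created_at"[[:space:]]*:[[:space:]]*"[^"]+"' \
            | tail -n 1 | sed -E 's/.*"([^"]+)"$/\1/')
        printf '%s\t%s\n' "${created}" "${fork}"
      done \
    | sort -r   # ISO 8601 timestamps sort correctly as plain text
}
```

Running `list_forks < ./info/2025_12_events.json | head -n 1` would print the most recent fork found in that month.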

Now to the easy part. Once we have the URL of the Amudai fork from secops2001:

> This branch is 111 commits ahead of gmh5225/amudai:main.
> Kusto Build System
> Auto-sync from Azure-Kusto-Amudai
> 63c7da3
> 2 months ago (Dec 12, 2025, 4:30 AM GMT+1)

We can just clone it locally:

Locally clone / download ZIP file or fork at GitHub.
ll ~/temp/amudai/code/
total 60
drwxr-xr-x  3 johndoe users  4096 Feb  3 21:29 docs
drwxr-xr-x  3 johndoe users  4096 Feb  3 21:29 proto_defs
drwxr-xr-x 22 johndoe users  4096 Feb  3 21:29 rust
drwxr-xr-x  3 johndoe users  4096 Feb  3 21:29 test
drwxr-xr-x  2 johndoe users  4096 Feb  3 21:29 tools
-rw-r--r--  1 johndoe users  7225 Feb  3 21:29 Cargo.toml
-rw-r--r--  1 johndoe users   558 Feb  3 21:29 CODE_OF_CONDUCT.md
-rw-r--r--  1 johndoe users   591 Feb  3 21:29 CONTRIBUTING.md
-rw-r--r--  1 johndoe users 11357 Feb  3 21:29 LICENSE
-rw-r--r--  1 johndoe users  5157 Feb  3 21:29 README.md
-rw-r--r--  1 johndoe users  2656 Feb  3 21:29 SECURITY.md
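To avoid repeating my mistake of only browsing the project, a full backup is worth the few extra seconds. The sketch below is one way to do it, assuming `git` is installed; the function name `backup_repo` and the target paths are made up for this example. It takes a mirror clone (all branches and tags, not just the default branch) and packs it into a single bundle file.

```shell
#!/usr/bin/env bash

# Create a full local backup of a repository.
# $1 = clone URL (or local path), $2 = target directory for the bare mirror.
# The bundle is written next to the mirror as "<target>.bundle".
backup_repo() {
    local src="$1" dest="$2"
    # --mirror copies every ref, not just the default branch
    git clone --quiet --mirror "${src}" "${dest}" || return 1
    # pack all refs into a single file that is easy to archive
    git --git-dir="${dest}" bundle create "${dest%/}.bundle" --all
}

# Example (needs network access, so it is left commented out):
# backup_repo https://github.com/secops2001/amudai.git ~/temp/amudai/amudai.git
```

The bundle can later be cloned like any other remote, for example with `git clone amudai.git.bundle amudai`.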

Conclusion

Heckers gonna heck

And once again, I can read the design, architecture and specification documentation from this awesome Microsoft (open source) project.

Hopefully, it will once again be made public with a stable initial release.