mantzaris/README.md

Hi, I'm Alexander V. Mantzaris! 😃

I'm an academic (researcher) and developer focusing on Julia Lang packages, Electron.js apps :electron:, social physics simulations 🏙️, and useful tools. Below, I've organized my key repositories by category. Check them out - many include detailed docs, examples, and ways to contribute.

Julia Language Packages

Several of these tested tools have been peer-reviewed by the Journal of Open Source Software (JOSS), as marked on each entry. Great for data processing, NLP benchmarks, and more.

  • WunDeeDB.jl: A zero-config embedded vector database with WAL support. Useful for lightweight AI/ML embeddings. Ideal for vector search that has to scale while staying transactional (transactions add overhead in exchange for confidence in data integrity). (JOSS-reviewed; in General registry)
    tip: using Pkg; Pkg.add("WunDeeDB") - try the examples for instant vector search.

  • LMDiskANN.jl: Julia implementation of Low Memory Disk ANN for efficient nearest-neighbor searches. (JOSS-reviewed; in General registry)
    tip: using Pkg; Pkg.add("LMDiskANN") - ideal for large datasets where speed matters and transactions are not needed.

  • KeemenaPreprocessing.jl: Preprocessing for text data: cleaning, normalization, vectorization, tokenization and more. (in General registry)
    tip: using Pkg; Pkg.add("KeemenaPreprocessing") - covers many text preprocessing tasks and keeps memory use controlled when processing large text corpora.

  • KeemenaSubwords.jl: Julia-native subword tokenization library supporting BPE, WordPiece, Unigram, SentencePiece-style models, and compatibility with common tokenizer formats. Designed for correctness, reproducibility, and alignment with modern LLM workflows. (in General registry)
    tip: using Pkg; Pkg.add("KeemenaSubwords") - load pretrained tokenizers or train your own for GPT-style or transformer-based pipelines.

  • BenchmarkDataNLP.jl: Generates synthetic text (e.g., via Context-Free Grammars) with tunable complexity to test NLP methods like LLMs.
    Newcomer tip: Parameterize complexity to test your models on simpler synthetic data before moving on to more complex natural corpora.
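To make the idea of grammar-generated text with tunable complexity concrete, here is a minimal, self-contained Julia sketch. It does not use the BenchmarkDataNLP.jl API; the toy grammar, the `expand` and `generate_sentence` functions, and the `max_depth` knob are all hypothetical names chosen for illustration. The assumption is that "complexity" can be approximated by how deep the grammar is allowed to recurse.

```julia
using Random

# Hypothetical toy grammar (not taken from BenchmarkDataNLP.jl).
# Nonterminals are Symbols, terminals are Strings. The FIRST production of
# every nonterminal is non-recursive, which guarantees termination.
const GRAMMAR = Dict{Symbol,Vector{Vector{Any}}}(
    :S   => [[:NP, :VP]],
    :NP  => [["the", :N], ["the", :ADJ, :N], [:NP, "that", :VP]],
    :VP  => [[:V], [:V, :NP]],
    :N   => [["cat"], ["dog"], ["bird"]],
    :ADJ => [["small"], ["quiet"]],
    :V   => [["sees"], ["follows"], ["sleeps"]],
)

# Recursively expand `symbol` into a list of words.
# Once `depth` reaches `max_depth`, only the first (simplest) production is used.
function expand(rng::AbstractRNG, symbol, depth::Int, max_depth::Int)
    symbol isa AbstractString && return String[symbol]
    options = GRAMMAR[symbol]
    production = depth >= max_depth ? first(options) : rand(rng, options)
    words = String[]
    for part in production
        append!(words, expand(rng, part, depth + 1, max_depth))
    end
    return words
end

# Larger `max_depth` allows more nested, and typically longer, sentences.
generate_sentence(rng::AbstractRNG, max_depth::Int) =
    join(expand(rng, :S, 0, max_depth), " ")

rng = MersenneTwister(42)
for max_depth in (0, 2, 6)
    println("max_depth = ", max_depth, ": ", generate_sentence(rng, max_depth))
end
```

With `max_depth = 0` every choice falls back to the simplest production, so the output is always "the cat sees"; raising `max_depth` lets relative clauses nest, which is one simple way to dial difficulty up gradually.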

Tagasaurus Electron.js Apps :electron:

Two finished desktop apps for media tagging and organization. Built with Electron.js for cross-platform use - download releases for Linux/Windows.

  • Tagasaurus: Tag your planet with semantic search and machine learning. Search and annotate your media files with ease while keeping all data local.

  • TagasaurusMemetic: The original "Tag your Planet" app - gateway to your semantic multiverse. Here memes and emotions take first place in search, with dedicated features that use a bipartite graph to let users explore the impact of memes on specific keywords.
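As a rough illustration of how a bipartite graph can link memes and keywords, the following Julia sketch stores the two sides as adjacency maps and answers a two-hop query. This is not TagasaurusMemetic's code (the app itself is written in JavaScript); the sample annotations and the names `keyword_to_memes`, `meme_to_keywords`, and `related_keywords` are hypothetical.

```julia
# Hypothetical annotations: each meme is tagged with a few keywords.
annotations = [
    ("meme_a", ["cat", "funny"]),
    ("meme_b", ["cat", "work"]),
    ("meme_c", ["work"]),
]

# Bipartite graph stored as two adjacency maps, one per side.
keyword_to_memes = Dict{String,Set{String}}()
meme_to_keywords = Dict{String,Set{String}}()
for (meme, keywords) in annotations
    for keyword in keywords
        push!(get!(keyword_to_memes, keyword, Set{String}()), meme)
        push!(get!(meme_to_keywords, meme, Set{String}()), keyword)
    end
end

# Two-hop walk (keyword -> meme -> keyword): count how many memes
# connect `keyword` to every other keyword.
function related_keywords(keyword::AbstractString)
    counts = Dict{String,Int}()
    for meme in get(keyword_to_memes, keyword, Set{String}())
        for other in meme_to_keywords[meme]
            other == keyword && continue
            counts[other] = get(counts, other, 0) + 1
        end
    end
    return counts
end

println(related_keywords("cat"))
```

In this toy data, "cat" reaches "funny" through meme_a and "work" through meme_b, each with a count of 1. Higher counts would indicate keywords that more memes tie together.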

Social Simulation Code from Papers 🏙️

Repos with code implementations from my research papers on social dynamics, entropy, and models like Schelling. Useful for simulations in physics/social sciences - many include Jupyter notebooks for reproducibility.

Miscellaneous Useful Code

Handy packages and templates for everyday productivity, like LaTeX setups.

  • Nighttime LaTeX Template: A sonar-themed, dark-mode-friendly LaTeX template for documents - reduces eye strain at night.
    Newcomer tip: Copy the .tex file and compile with pdflatex; perfect for late-night writing.

  • UCF Style LaTeX Slides Template: Beamer template in University of Central Florida style for presentations.
    Newcomer tip: Customize the department logo.

Get Involved

  • Star ⭐ or fork repos you're interested in!
  • Open issues/PRs if you spot improvements.

Thanks for visiting!

Pinned

  1. WunDeeDB.jl Public

    Your just-works / zero-config / embedded / WAL: vector database

    Julia 11 1

  2. Tagasaurus Public

    Tag Your Planet

    JavaScript 5

  3. LMDiskANN.jl Public

    Julia Implementation of Low Memory Disk ANN (LM-DiskANN)

    Julia 7 1

  4. TagasaurusMemetic Public

    Tagasaurus, the gateway to your semantic multiverse "Tag your Planet!"

    JavaScript 9 4

  5. BenchmarkDataNLP.jl Public

    Generate synthetic text using a variety of methods, e.g., Context-Free Grammars (CFGs), with parameterized complexity to test your NLP methods (like LLMs)

    Julia 1

  6. schellingEntropyImproved Public

    Fixing an issue with the Schelling model that violates the 2nd law of thermodynamics

    Jupyter Notebook 4