Sacramento, California, United States
3K followers
500+ connections
Activity
-
Sunny Bains reposted this
The 5 words that usually lead to a $100k engineering mistake: "Let's just shard the database."

Manual sharding is a "tax" on your developers. When you shard a MySQL or Postgres cluster manually, you aren't just splitting data. You are:
• Breaking Cross-shard Joins.
• Forcing your App to be "Database Aware."
• Creating a "Re-balancing" nightmare for your future self.

In 2026, you don't shard. You distribute. Moving to a Distributed SQL engine like TiDB gives you the horizontal scale of NoSQL with the ACID compliance and Joins of the Relational world you love. Keep the MySQL syntax. Lose the sharding pain. Scale the database, not the complexity.

#TiDB #Scalability #DistributedSystems #MySQL #SystemDesign #DatabaseArchitecture #SRE #CloudNative
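The "Re-balancing" point can be made concrete with a small sketch. It assumes the application routes each key with a plain hash-modulo scheme, which is only one of several manual sharding strategies; the names `shard_for` and `moved_fraction` are hypothetical and do not come from the post or from any library.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard using a stable hash modulo the shard count.

    Assumption: hash-modulo routing, chosen here only for illustration.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical key set; any evenly hashed keys behave similarly.
keys = [f"user-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys change shard when going from 4 to 5 shards")
```

Under this scheme a key keeps its shard only when its hash leaves the same remainder modulo 4 and modulo 5, so most rows have to be copied when one shard is added. Consistent hashing reduces that movement, but the application still has to own the routing logic.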
-
Sunny Bains shared this
If you've ever watched an agentic AI system work perfectly in a demo and fall apart in production, the bottleneck probably wasn't the model. It was the data architecture underneath it.

Agentic systems don't just read data. They also write, remember, reason, and act on it in real time. That puts pressure on your data layer in ways traditional architectures weren't designed for.

This post (originally published on The New Stack) breaks down four architecture decisions that determine whether your agentic system scales or stalls:
→ How you handle memory and context persistence across agent sessions
→ Whether your transactional and analytical workloads can coexist without trade-offs
→ Why the choice between embedded vs. external data layers matters more than most teams realize
→ What isolation and multi-tenancy look like when thousands of agents hit your database at once

The teams getting this right aren't just picking better models. They're rethinking the data layer from the ground up.

What's the first thing that broke when you moved an AI system from prototype to production? Would love to hear real stories.

🔗 https://ow.ly/HS8x50YBh1A
#AgenticAI #DataArchitecture #DistributedSystems #Scalability #CloudNative #TiDB
-
Sunny Bains reposted this
Ten years ago, on April 1, TiKV began its open source journey. Today, on another April 1, we’re introducing TiDB Cloud Lake, now entering Private Preview: a new cloud-native analytics experience inside TiDB Cloud.

With TiDB Cloud Lake, teams can:
- create elastic warehouses in just a few clicks
- scale compute independently for analytics workloads
- pay based on actual usage
- automatically suspend idle warehouses to avoid unnecessary cost
- manage analytics workloads directly inside TiDB Cloud

The goal is simple: make modern analytics more accessible, more elastic, and easier to operate. From building the distributed foundation a decade ago to expanding the cloud analytics experience today, this is an important step toward bringing warehouse-style analytics, object-storage-native architecture, and a simpler cloud experience together in one place.

TiDB Cloud Lake starts today.

#TiDB #TiDBCloud #CloudAnalytics #DataWarehouse #Lakehouse #SQL #DataInfrastructure #Warehouse
-
Sunny Bains shared this
A week ago I was mulling over Agentic Scale and the ability of a database to handle millions of tenants and billions of tables. See: https://lnkd.in/gYehJaWx

This week I decided to try to implement that idea. It works just fine, at least at a POC level, with some simplification compared to a full-fledged database implementation. I don't see that as a big challenge (at least currently). I now think that both the catalog-based traditional design and, at the other end, NoSQL are maximalist positions. This middle ground should work and scale much better.

I tried creating 10M tables, some DML, and then running DDL (create index) on the 10-millionth table to verify the POC.
- What surprised me more was that the Python script took 20% CPU.
- Drop table wasn't optimized fully yet, but it's good enough.

10,000,000 create/drop completed.

Representative schema:
  CREATE TABLE `t_N` (id BIGINT NOT NULL, name VARCHAR(255) NOT NULL, score BIGINT NOT NULL)

Final-table probe before drop:
  CREATE TABLE `t_9999999` (id BIGINT NOT NULL, name VARCHAR(255) NOT NULL, score BIGINT NOT NULL)
  INSERT INTO `t_9999999` (id, name, score) VALUES
    (1099999991, 'probe_9999999_a', 901),
    (1099999992, 'probe_9999999_b', 902),
    (1099999993, 'probe_9999999_c', 903);
  CREATE INDEX `idx_t_9999999_id` ON `t_9999999` (id);
  SELECT id, name, score FROM `t_9999999` WHERE id IN (1099999991, 1099999992, 1099999993) ORDER BY id;

Probe result:
  [["1099999991", "probe_9999999_a", "901"], ["1099999992", "probe_9999999_b", "902"], ["1099999993", "probe_9999999_c", "903"]]

Summary:
- create: 482.017073s => 20,746.15/s
- drop: 710.633954s => 14,071.94/s
- total: 1199.763061s
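The shape of the create/probe/drop run described above can be sketched roughly as follows. This is not the author's script: it assumes an in-memory SQLite database from Python's standard library as a stand-in for the POC engine, uses a much smaller table count, and the function name `run_probe` and the probe ids are hypothetical.

```python
import sqlite3
import time


def run_probe(num_tables: int = 1000):
    """Create many tables, probe the last one, then drop them all.

    Assumption: SQLite stands in for the POC engine, so timings say
    nothing about the numbers reported in the post.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    start = time.perf_counter()
    for n in range(num_tables):
        cur.execute(
            f"CREATE TABLE t_{n} (id BIGINT NOT NULL, "
            "name VARCHAR(255) NOT NULL, score BIGINT NOT NULL)"
        )
    create_secs = time.perf_counter() - start

    # Probe the final table: some DML, then DDL, then a read through the index.
    last = num_tables - 1
    rows = [
        (1, f"probe_{last}_a", 901),
        (2, f"probe_{last}_b", 902),
        (3, f"probe_{last}_c", 903),
    ]
    cur.executemany(f"INSERT INTO t_{last} (id, name, score) VALUES (?, ?, ?)", rows)
    cur.execute(f"CREATE INDEX idx_t_{last}_id ON t_{last} (id)")
    cur.execute(
        f"SELECT id, name, score FROM t_{last} WHERE id IN (1, 2, 3) ORDER BY id"
    )
    result = cur.fetchall()

    start = time.perf_counter()
    for n in range(num_tables):
        cur.execute(f"DROP TABLE t_{n}")
    drop_secs = time.perf_counter() - start

    conn.commit()
    conn.close()
    return result, create_secs, drop_secs


result, create_secs, drop_secs = run_probe(1000)
print("probe result:", result)
print(f"create: {create_secs:.6f}s => {1000 / create_secs:,.2f}/s")
print(f"drop:   {drop_secs:.6f}s => {1000 / drop_secs:,.2f}/s")
```

The throughput lines follow the same arithmetic as the summary in the post: table count divided by elapsed seconds, for example 10,000,000 / 482.017073 s ≈ 20,746.15 creates per second.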
-
Sunny Bains posted this
I can now write an entire optimizer as an extension. The extension receives an AST and arena allocators and returns the query graph. Too good! I think LLMs are going to usher in a new era of innovation.
-
Sunny Bains posted this
Super easy to create new types, table types and custom extensions.

sql:test> show extensions;
Name       Type        Description
VECTOR     type        Variable-length f32 vector encoded as <u32 len><f32>...
csv_table  table_type  CSV file-backed table. Accepts any schema. Optional path argument.
mmap_temp  table_type  Memory-mapped temporary table with BTree indexing. Accepts any schema.
3 row(s) (0.00 secs)

sql:test> CREATE TABLE counters ( counter_id BIGINT NOT NULL, value BIGINT, increment INT ) TYPE = mmap_temp(65536, '/local/d0/tmp');
Query OK, 0 row(s) affected (0.00 secs)

sql:test> drop table counters;
Query OK, 0 row(s) affected (0.01 secs)

sql:test> CREATE TABLE counters ( counter_id BIGINT NOT NULL PRIMARY KEY, value BIGINT, increment INT ) TYPE = mmap_temp(65536, '/local/d0/tmp');
Query OK, 0 row(s) affected (0.00 secs)

sql:test> insert into counters values(1, 1, 1);
Query OK, 1 row(s) affected (0.00 secs)

sql:test> select * from counters;
counter_id  value  increment
1           1      1
1 row(s) (0.00 secs)
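The VECTOR description, `<u32 len><f32>...`, reads as a length prefix followed by 32-bit floats. A minimal sketch of such a layout is shown below. It assumes little-endian byte order and that `len` counts elements rather than bytes; the post confirms neither, and `encode_vector` and `decode_vector` are hypothetical helpers, not part of the extension.

```python
import struct


def encode_vector(values):
    """Pack floats as <u32 element count><f32 values...>.

    Assumptions: little-endian, length is the element count.
    """
    return struct.pack(f"<I{len(values)}f", len(values), *values)


def decode_vector(buf):
    """Read the u32 count, then that many f32 values after the 4-byte prefix."""
    (count,) = struct.unpack_from("<I", buf, 0)
    return list(struct.unpack_from(f"<{count}f", buf, 4))


encoded = encode_vector([1.0, 2.0, 0.5])
decoded = decode_vector(encoded)
print(len(encoded), decoded)
```

With this layout a three-element vector occupies 4 + 3 × 4 = 16 bytes, and values that are exactly representable in f32 round-trip unchanged.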
-
Sunny Bains shared this
"Always use vector databases." "LLMs are too expensive at scale." "Vector search is always faster."

The CrowdSnap team actually ran the numbers, and the benchmarks tell a different story. GPT Direct vs TiDB Vector Search, tested on real datasets up to 1,000 rows:
- 400x faster repeated queries with TiDB (0.3s vs 120s)
- $0.11 vs $10–15 per 1,000 analyses
- Break-even at just 51 queries

The real answer? Neither wins outright; the architecture depends entirely on your query pattern.

Full benchmarks, decision matrix, and code in the blog 👇
https://ow.ly/cxNV50Yyuno
#VectorSearch #TiDB #AIArchitecture #LLM
-
Sunny Bains shared this
Worth checking out. Highly recommended. https://lnkd.in/gS9cczep
The Cost of Concurrency Coordination with Jon Gjengset
-
Sunny Bains shared this
Two weeks from complete infrastructure failure. 2 million users on a waiting list. One database decision that changed everything. That's the Manus story.

When their single-instance database started collapsing under the weight of AI agent workloads, the team had a narrow window to fix it. They migrated to TiDB in a 3-hour live switchover (no downtime, no data loss) and went from the edge of failure to powering one of the most talked-about AI products of the year. The result? A $2 billion acquisition by Meta.

In this video, I break down exactly what happened:
- Why traditional databases can't handle what AI agents are actually doing
- What context engineering means for agent workloads
- How TiDB's distributed SQL architecture made the migration possible
- What TiDB X, database branching, and 10 million databases unlocks next

This isn't a feel-good story about scaling. It's a hard look at why your database architecture is a strategic decision, not an infrastructure afterthought.

Watch the full breakdown: https://ow.ly/5LtH50Yyufy
#TiDB #DistributedSQL #AIAgents #DatabaseArchitecture #Manus
-
Sunny Bains liked this
Sending positive vibes and support to the folks affected at Oracle, especially the Oracle MySQL team. Having been on all sides of a layoff, it's rough for everyone. Please reach out if I can help place people into database companies in my network or VillageSQL is hiring lots of SWEs. https://lnkd.in/eiJQx6jh
Projects
-
Embedded InnoDB
-
Hobby project that is a fork of Embedded InnoDB
Recommendations received
1 person has recommended Sunny
Other similar profiles
-
Mohamed Hegazy
Schneider Electric Industrial Services
9K followers
Nashville Metropolitan Area
Explore more posts
-
Jack Vanlightly
Confluent • 3K followers
In distributed systems, reliability isn’t just about retries and durability, it’s about knowing who owns recovery. My latest post, based on the Coordinated Progress model I posted previously, explores how reliable triggers create responsibility boundaries and how those boundaries shape resilience, observability, and complexity. https://lnkd.in/diCmywnS
117
8 Comments -
Nicholas Matsakis
3K followers
New blog post, "Symmetric ACP": https://lnkd.in/dDU7FDbq This post describes **SymmACP** -- a proposed extension to Zed Industries' Agent Client Protocol that lets you build AI tools like Unix pipes or browser extensions. Want a better TUI? Found some cool slash commands on GitHub? Prefer a different backend? With SymmACP, you can mix and match these pieces and have them all work together without knowing about each other.
18
-
Kathleen DeRusso
Elastic • 990 followers
Chunking and snippet extraction has been a huge focus lately - my latest blog dives into some of the work we've done on this to date, including support for a chunk rescorer in our semantic reranking retriever, as well as some useful ES|QL primitives to get more visibility into chunks and snippets. #elasticsearch #semanticreranking #snippets #chunks #chunking #esql https://lnkd.in/e6UD7iii
13
-
Jack Vanlightly
Confluent • 3K followers
New blog post: A Fork in the Road: Deciding Kafka's Diskless Future. Kafka is getting serious about S3 and finds itself at an architectural crossroads that will shape its next decade. Several new KIPs (1150, 1176, 1183) aim to reduce replication costs across cloud availability zones, but the implications go far beyond networking cost. It’s a mistake to think of S3 as simply a cheaper disk or a networking cheat. Building on object storage opens the door to operational benefits such as elastic, stateless compute — something many modern analytics systems already exploit. In my latest post, I outline two competing future paths: 🔹 Evolutionary path: Reuse large parts of existing Kafka components to reduce code changes and long-term maintenance. 🔹 Revolutionary path: Separate stateless and stateful layers to realize the full operational benefits of disaggregated storage. The post examines the trade-offs of each path, how the current KIPs map to these two paths and poses a broader question: what should Kafka become? And what will keep it relevant in the decade ahead? https://lnkd.in/d4E2BAe4
149
12 Comments -
Gunnar Morling
Confluent • 9K followers
Seeing quite a few discussions lately about Kafka/Iceberg integrations being "zero-copy" or not. I think this is largely missing the point. First, where I agree is that this integration should be "zero-effort" for users. Materializing a Kafka topic into an Iceberg table shouldn't require more than a click of a button. Queries should provide a uniform way for accessing the data in both a topic and the corresponding table. This is the stream/table duality, and it should Just Work™. Now, whether this requires to store the bytes of data once in a Kafka topic, and a second time elsewhere for table access, shouldn't really matter from a user perspective. I'd argue storing the data twice is actually a benefit, and in fact it's a pattern well established: it resembles the design of WAL and table files known from databases for decades. I don't think anyone ever complained about this structure in their RDBMS? Which makes sense, it's an implementation detail, opaque to users. But as it turns out, having log and table data separately is even more advantageous for the deconstructed database that is Kafka and Iceberg: you can have multiple readers of the same log (Kafka topic), materializing views in multiple destinations and systems optimized for specific use cases. Maybe multiple Iceberg tables with different projections (think PII), maybe an Iceberg table and a full-text index in Elasticsearch, maybe an... you catch my drift. Furthermore, the log is replayable, so you can recreate views if needed, or you can implement new use cases you didn't originally have in mind. All in all, I think "zero copy" is mostly a red herring. Sure, it can be an optimization for certain scenarios, but mostly it's a distraction from the immense value you get from combining Kafka and Iceberg when done the right (seamless) way.
127
26 Comments -
Redis
292K followers
In-memory databases are the architectural backbone of real-time AI. They’re a necessary advantage for memory-intensive workloads that rely on instant access to context. Modern systems like Redis blend durability, multi-model flexibility (JSON, vectors, time-series), and tiered storage to deliver the speed of memory with the reliability of disk. The result? Real-time analytics, instant personalization, and AI systems that actually feel intelligent. Here’s how we deliver the speed of memory with the reliability of disk: https://lnkd.in/gSm65kdB
49
-
Alex B.
Google • 5K followers
A recent K8s podcast with Clayton Coleman and Rob Shaw is a fascinating listen: https://lnkd.in/gJmpX_DB Why don't standard K8s approaches work with LLM inference? Unlike a normal web app, an LLM generates text one token at a time, each one depending on the one that came before. The technique to remember the context of the conversation is called KV caching. Regular K8s tools for balancing traffic, adding servers and managing resources at huge scale don't work really well here, because the amount of work required varies depending on the prompt. The tools the community created to solve this problem: 🛠️ vLLM - popular inference server that is very good at KV cache management. 🛠️ Inference Gateway - smarter load balancer that enables efficient traffic distribution. 🧩 The llm-d project brings these together in a ready-to-use stack: tested and benchmarked well-lit paths, smarter routing, intelligent work splitting to optimize time-to-first-token, and methods to run "Mixture-of-Experts" models across multiple servers.
50
-
Cherif YAYA
Pinterest • 1K followers
What I'm Reading This Week 📚 🤖 Vibe Code Reality Check Steve Krouse's piece argues that "vibe coding" where you "forget that the code even exists" creates legacy code nobody understands. Programming is fundamentally theory building, not just producing lines of code. This resonates - I recently spent hours trying to fix a frontend bug with Claude Code and Gemini CLI stuck in a dead loop of guesses. Two minutes of human inspection found the inefficient hook render. We still very much need the human (expert) in the loop. https://lnkd.in/g_Dt-_kz 📐 Mathematics as Code Dan Abramov's exploration of Lean shows how mathematicians can treat mathematics as code - breaking it into structures, theorems, and proofs that are statically checked and composable. His playful example of "haunted math" where 2=3 demonstrates how axioms shape mathematical reality. Always fascinating to see bleeding-edge programming language theory, especially when it bridges math and code in such elegant ways. https://lnkd.in/gebHeMSU 💻 Local LLMs Innovation Simon Willison highlights how his 2.5-year-old laptop can now write Space Invaders using GLM-4.5 Air and MLX. Open source models are rapidly pushing the frontier of what's possible on a chip. The future where we get GPT-4 level intelligence running on phones feels closer than ever and a powerful enabler for humanity. 🚀 JavaScript's Wild Decade Jamie Birch's comprehensive dive through JavaScript runtimes reveals a staggering ecosystem - from Node.js to Cloudflare Workers to React Native's Hermes. I remember when Apple had a strict no-JS-runtime policy for the App Store. We've come so far that every platform now has multiple JS engines optimized for specific use cases. The proliferation shows JavaScript's incredible adaptability across contexts. https://lnkd.in/g843sYBk 📊 Amazon's Dev Experience ROI Amazon's "Cost to Serve Software" framework quantified a 15.9% reduction in delivery costs through developer experience improvements. 
The key insight: giving developers back time and reducing toilsome work. For a company with 1,000 developers, a 15% improvement translates to $20M in cost avoidance. Good concrete ROI numbers on investing in developer productivity. https://lnkd.in/ghv5CTfa Which trend do you think will have the biggest impact on software development: local AI democratizing access, the continued JavaScript runtime explosion, or finally having concrete metrics for developer experience ROI? #AI #JavaScript #DeveloperExperience #Programming #LocalLLMs #VibeCoding #SoftwareDevelopment
9
2 Comments -
Norm Johanson
Amazon Web Services (AWS) • 1K followers
Starting the year strong at AWS for .NET with .NET 10 support Lambda. A new feature we added for .NET 10 is support for writing C# file based Lambda functions. Would love to hear feedback on the new support. https://lnkd.in/gzpH_ST3
119
7 Comments -
M Saad Malik
Szabist Karachi • 4K followers
🚀 TOON vs JSON - How I Cut My LLM Token Usage (and Cost) by 30–60% I integrated TOON (Token-Oriented Object Notation) into my daily autonomous agent workflow, the same system that sets up my day’s action items automatically. Before TOON, that workflow consumed anywhere from 60K to 1M tokens per run, depending on the context. After converting all structured JSON data into TOON, the exact same logic now uses only 45K to 72K tokens. ✅ Same context. ✅ Same responses. ❌ Lower cost. That’s a massive 30–60% token reduction, just by changing the data format. So, what’s TOON? It’s a new, compact data format designed for LLM efficiency. Unlike JSON, which repeats keys, braces, and punctuation, TOON defines the structure once and just streams the values. The result? Far fewer tokens, faster processing, and cheaper model runs. Many of my engineering friends ask: “Isn’t TOON just like CSV?” Conceptually: 👉 TOON = CSV’s compactness + JSON’s structure + AI readability Here’s how they differ: JSON → General-purpose data exchange (APIs, configs) CSV → Simple, flat tabular data (spreadsheets, databases) TOON → Structured data optimized for AI token efficiency If your agents or pipelines frequently push JSON into LLMs, switching to TOON could save real money and make your workflows scale more efficiently. 🔗 Learn more: https://lnkd.in/dC4wnkmG #AI #MachineLearning #LLM #SoftwareEngineering #DataEngineering #Automation #TOON #JSON #TokenOptimization #AIInfra
458
70 Comments -
Pamela Fox
Microsoft • 13K followers
Accidentally discovered a new way to work with GitHub Copilot today: write new tests based off test coverage reports. Copilot (or I) run this command to generate a report of all the uncovered lines in the current diff:

pytest --cov --cov-report=xml && \
diff-cover coverage.xml --html-report coverage_report.html && \
open coverage_report.html

Then Copilot reviews the report, summarizes the uncovered situations, and proposes new test cases to add, or if the code is truly unreachable, suggests deleting it. Just another reason why you should add diff-cover (or similar tool) to your projects!
115
4 Comments -
Linearloop
12K followers
Your CI/CD bill is not exploding because of “expensive runners”; it is bleeding out through flaky tests, over-parallelized jobs, and slow feedback loops that quietly tax every deploy. This piece shows you how to redesign pipelines so you cut compute waste, shrink queues, and speed up developer flow without adding yet another approval gate or YAML religion. Full breakdown in the comments. #DevOps #CICD #PlatformEngineering
8
1 Comment