Software Engineering Cloud Computing

Explore top LinkedIn content from expert professionals.

  • View profile for sukhad anand

    Senior Software Engineer @Google | Techie007 | Opinions and views I post are my own

    104,454 followers

    Netflix once asked a terrifying question: “What happens if our entire database disappears?”

    Not a table. Not a shard. The entire database.

    To test this, they built a tool called Chaos Monkey, which randomly kills servers. Then they went further and built:
    - Chaos Gorilla, which simulates losing an entire Availability Zone
    - Chaos Kong, which simulates losing an entire AWS region

    These tools intentionally destroy large parts of Netflix's infrastructure to ensure the system can survive the worst possible event.

    When they first ran Chaos Kong internally, dozens of microservices failed:
    - Fallbacks were missing.
    - Cross-region replication did not handle traffic properly.
    - Caches did not warm up fast enough.

    But instead of hiding it, Netflix made this part of their routine engineering practice. That is how their resilience was built:
    - Multi-region active-active architectures
    - Cross-region failovers
    - Stateless services
    - Data replication with conflict resolution
    - Region isolation testing

    All of these are real Netflix engineering strategies, documented openly in their tech blogs and conference talks.

    You do not build reliability by hoping things will not break. You build reliability by intentionally breaking them in controlled ways.
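The real Chaos Monkey terminates actual cloud instances, but the core idea above — kill something at random and check that fallbacks carry the traffic — fits in a few lines. A toy sketch (all names here are made up for illustration):

```python
import random

class Replica:
    """A stand-in for one instance of a service."""
    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"

def call_with_failover(replicas, request):
    """Try replicas in random order; fall back to the next on failure."""
    for replica in random.sample(replicas, len(replicas)):
        try:
            return replica.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("total outage: no replica could serve the request")

def chaos_monkey(replicas):
    """Kill one randomly chosen replica, as a chaos run would."""
    victim = random.choice(replicas)
    victim.alive = False
    return victim.name

# Run the experiment: the system should survive a single-instance failure.
replicas = [Replica(f"api-{i}") for i in range(3)]
killed = chaos_monkey(replicas)
response = call_with_failover(replicas, "GET /titles")
```

The point of running this continuously, as Netflix does, is that the failover path gets exercised in production-like conditions instead of only existing on a whiteboard.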

  • View profile for Brij kishore Pandey
    Brij kishore Pandey is an Influencer

    AI Architect | AI Engineer | Generative AI | Agentic AI

    708,450 followers

    Load Balancing: Beyond the Basics - 5 Methods Every Architect Should Consider

    The backbone of scalable systems isn't just about adding more servers - it's about intelligently directing traffic between them. After years of implementing different approaches, here are the key load balancing methods that consistently prove their worth:

    1. Round Robin
    Simple doesn't mean ineffective. It's like a traffic cop giving equal time to each lane - predictable and fair. While great for identical servers, it needs tweaking when your infrastructure varies in capacity.

    2. Least Connection Method
    This one's my favorite for dynamic workloads. It's like a smart queuing system that always points users to the least busy server. Perfect for when your user sessions vary significantly in duration and resource usage.

    3. Weighted Response Time
    Think of it as your most responsive waiter getting more tables. By factoring in actual server performance rather than just connection counts, you get better real-world performance. Great for heterogeneous environments.

    4. Resource-Based Distribution
    The new kid on the block, but gaining traction fast. By monitoring CPU, memory, and network load in real time, it makes smarter decisions than traditional methods. Especially valuable in cloud environments where resources can vary.

    5. Source IP Hash
    When session persistence matters, this is your go-to. Perfect for applications where maintaining user context is crucial, like e-commerce platforms or banking applications.

    The real art isn't in picking one method, but in knowing when to use each. Sometimes, the best approach is a hybrid solution that adapts to your traffic patterns.

    What challenges have you faced with load balancing in production? Would love to hear your real-world experiences!
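Three of the methods above have selection logic simple enough to sketch directly. Minimal illustrations (the server names are placeholders; real balancers like NGINX or an ALB implement these for you):

```python
import itertools
import hashlib

servers = ["app-1", "app-2", "app-3"]

# 1. Round robin: cycle through servers in a fixed order.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# 2. Least connections: pick the server with the fewest active sessions.
active = {s: 0 for s in servers}
def least_connections():
    target = min(active, key=active.get)
    active[target] += 1      # the caller decrements when the session ends
    return target

# 5. Source IP hash: the same client IP always maps to the same server,
# which gives session persistence without any shared session store.
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Calling `ip_hash("203.0.113.7")` repeatedly always returns the same server, which is exactly the stickiness property the post describes; the trade-off is uneven load when a few clients dominate traffic.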

  • View profile for Prerit Munjal

    Platform Engineering Leader • Senior TPM at Groupon • ex-CTO • Building AI Solutions with ROI

    79,722 followers

    Phases of Cloud-DevOps in Production:

    𝗦𝘁𝗮𝗴𝗲 𝟭: Pull the code from Git, install a web server, and deploy your code on a VPS/VM.
    𝗦𝘁𝗮𝗴𝗲 𝟮: Dockerise the application, pass environment variables, and run a container on the VM.
    𝗦𝘁𝗮𝗴𝗲 𝟯: Add secrets management and use Docker Compose to run your containers on the VM.
    𝗦𝘁𝗮𝗴𝗲 𝟰: Migrate the application to Kubernetes with add-ons like Service Mesh, Monitoring, Tracing, and Profiling.
    𝗦𝘁𝗮𝗴𝗲 𝟱: Add Infrastructure as Code for immutability and better maintainability.
    𝗦𝘁𝗮𝗴𝗲 𝟲: Integrate GitOps, automated A/B testing, and Chaos Engineering.

    Understand the WHYs: Scalability, Reliability, Maintainability, Immutability, Availability.
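The GitOps piece of Stage 6 boils down to a reconcile loop: continuously diff the state declared in Git against the live environment and converge the latter toward the former. A purely illustrative sketch, with dicts standing in for a Git repo and a cluster API (tools like Argo CD do this against real Kubernetes objects):

```python
def reconcile(desired, actual):
    """Compute the operations needed to make `actual` match `desired`."""
    ops = []
    for name, spec in desired.items():
        if name not in actual:
            ops.append(("create", name, spec))
        elif actual[name] != spec:
            ops.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            ops.append(("delete", name, None))   # prune drift
    return ops

def apply(ops, actual):
    """Apply the computed operations to the live state."""
    for action, name, spec in ops:
        if action == "delete":
            del actual[name]
        else:
            actual[name] = spec

# Desired state lives in Git; the live cluster has drifted:
# an old image tag, plus a debug pod someone created by hand.
desired = {"web": {"image": "web:1.2", "replicas": 3}}
actual = {"web": {"image": "web:1.1", "replicas": 3},
          "debug-pod": {"image": "busybox"}}

ops = reconcile(desired, actual)
apply(ops, actual)
```

Notice that manual drift (the debug pod) is deleted, not tolerated — that pruning behavior is what makes Git the single source of truth and is the "immutability" WHY in the list above.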

  • View profile for Greg Coquillo
    Greg Coquillo is an Influencer

    Product Leader @AWS | Startup Investor | 2X Linkedin Top Voice for AI, Data Science, Tech, and Innovation | Quantum Computing & Web 3.0 | I build software that scales AI/ML Network infrastructure

    224,409 followers

    If you look closely at this stack across providers, you’ll notice that AI is just part of the puzzle. I’m not exaggerating when I say that when launching production-grade systems, 80% of the AI challenges continue to be engineering challenges. Selecting which model to work with isn’t even close to being the whole story.

    To successfully deploy and scale intelligent systems, one needs to understand how to make tradeoffs while evaluating hundreds of services offered by cloud providers like AWS, Google Cloud, and Microsoft Azure. Each cloud has its edge: AWS leads in scalability, Google in data innovation, and Microsoft in enterprise integration. Let’s see how they compare across every key layer of the stack:

    1.🔸Security & Governance
    - AWS ensures secure access and monitoring with IAM and GuardDuty.
    - Google focuses on unified security through Command Center and KMS.
    - Microsoft leads enterprise defense with Azure Defender and Sentinel.

    2.🔸Integration & Automation
    - AWS automates workflows with Step Functions and Glue.
    - Google connects systems using Dataflow and Workflows.
    - Microsoft streamlines operations through Logic Apps and Data Factory.

    3.🔸Compute & Infrastructure
    - AWS delivers scalable compute with EC2, Lambda, and Inferentia chips.
    - Google uses TPUs and GKE for AI scalability.
    - Microsoft powers hybrid workloads with Azure VMs and Functions.

    4.🔸Data & Analytics
    - AWS supports data analysis through Redshift and Athena.
    - Google dominates big data with BigQuery and Looker.
    - Microsoft combines analytics and visualization via Synapse and Power BI.

    5.🔸Edge & Hybrid
    - AWS offers low-latency AI with Outposts and Wavelength.
    - Google secures edge processing with GDC and Confidential Computing.
    - Microsoft extends cloud capabilities using Azure Arc and Stack Edge.

    6.🔸Cloud AI Services
    - AWS offers SageMaker, Comprehend, and Rekognition APIs.
    - Google provides Vertex AI and Gemini for advanced AI solutions.
    - Microsoft integrates OpenAI, Cognitive Services, and ML Studio.

    7.🔸Agent & Developer Tools
    - AWS includes Bedrock Agents and CodeWhisperer.
    - Google enables Gemini and LangChain integrations.
    - Microsoft supports Copilot Studio and Semantic Kernel.

    8.🔸Prototyping & Design Tools
    - AWS empowers testing with SageMaker Studio Lab.
    - Google simplifies development using AI Studio and Opal.
    - Microsoft focuses on no-code creation via Designer and Recognizer Studio.

    9.🔸Core Models
    - AWS relies on Titan and Bedrock models.
    - Google leads with Gemini.
    - Microsoft uses Phi, Orca, and Azure OpenAI.

    Understanding how to set up your architecture for scalability, performance, cost, and reliability is a huge advantage, whether via single-cloud, multi-cloud, hybrid, or on-prem. Curious to know how you evaluate tradeoffs across services from these providers to set up your AI systems.

  • View profile for Vishakha Sadhwani

    Sr. Solutions Architect at Nvidia | Ex-Google, AWS | 100k+ Linkedin | EB1-A Recipient | Follow to explore your career path in Cloud | DevOps | *Opinions.. my own*

    139,176 followers

    If you want to break into Cloud DevOps in 2025, build these 3 high-impact portfolio projects.

    Your resume doesn't need another generic pipeline project. Instead, show a well-rounded, 360-degree technical view. Do these 3 types of Cloud DevOps projects:

    1. Automate Application Delivery with CI/CD & GitOps
    ↳ Provision infrastructure with Terraform.
    ↳ Implement CI/CD using Jenkins, Docker, and Kubernetes.
    ↳ Deploy applications with Argo CD for GitOps.
    ↳ Add comprehensive monitoring with Prometheus/Grafana.
    **Don't just highlight containers or tools. Show the full application lifecycle.

    2. Securely Deploy and Expose Applications on Kubernetes
    ↳ Deploy applications onto Kubernetes.
    ↳ Expose applications using ALB Ingress.
    ↳ Enforce security policies with Kyverno.
    **Don't just deploy security policies. Show the full security implementation strategy.

    3. Optimize Cloud Costs with Serverless Automation
    ↳ Analyze cloud resource usage.
    ↳ Implement serverless functions for automated cost optimization.
    ↳ Design and deploy event-driven cost management strategies.
    **Don't just show a script. Show the full cost optimization workflow.

    Use these resources to start:
    1. App Delivery Automation: https://lnkd.in/gRPv9mUA
    2. K8s Security: https://lnkd.in/ezyiaNEG
    3. Cloud Cost Optimization: https://lnkd.in/ebBbzyxP

    To summarize: these aren't just tutorial implementations. They are solutions to real operational challenges, and they demonstrate depth in cloud architecture and integrated DevOps workflows. Because employers want to see how you solve complex problems, not how well you can follow tutorials.

    🔔 Follow Vishakha Sadhwani for more Cloud & DevOps content
    ♻️ Share so more people can learn.
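For project 3, the heart of a cost-optimization function is typically a rule like "flag anything whose CPU has been near-idle for N consecutive periods." A cloud-agnostic sketch — the instance names, thresholds, and sample data are all invented for illustration; a real serverless function would pull these metrics from CloudWatch or an equivalent monitoring API and then call the provider SDK to stop the flagged instances:

```python
IDLE_THRESHOLD = 5.0   # average CPU percent (assumed cutoff)
IDLE_PERIODS = 3       # consecutive low-usage samples required

def find_idle_instances(metrics):
    """metrics maps instance id -> list of CPU% samples, newest last.

    Returns the ids whose last IDLE_PERIODS samples were all below
    the threshold — the candidates for automated stopping.
    """
    idle = []
    for instance_id, samples in metrics.items():
        recent = samples[-IDLE_PERIODS:]
        if len(recent) == IDLE_PERIODS and all(s < IDLE_THRESHOLD for s in recent):
            idle.append(instance_id)
    return idle

# Hypothetical metrics for two instances.
metrics = {
    "i-web-prod": [42.0, 55.1, 38.7],   # busy: leave it alone
    "i-batch-dev": [1.2, 0.8, 2.4],     # idle: candidate for stopping
}
to_stop = find_idle_instances(metrics)
```

Wiring this to an event schedule (e.g. a nightly trigger) and to notifications is what turns the script into the "full cost optimization workflow" the post asks for.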

  • View profile for Danny Steenman

    Helping startups build faster on AWS while controlling costs, security, and compliance | Founder @ Towards the Cloud

    11,371 followers

    After 10 years in Cloud Engineering, I wish someone had told me these truths from day one:

    "Embrace boring technology." That shiny new AWS service isn't worth the operational overhead. Master the fundamentals first: EC2, RDS, S3, and IAM.

    "Infrastructure as Code isn't optional." Every manual click in the AWS console is technical debt. If you can't recreate your environment from code, you don't own it.

    "Security by design, not by accident." Adding security after the fact is 10x harder than building it in. Start with least-privilege IAM from day one.

    "Automation saves your sanity, not just time." The goal isn't speed, it's consistency. Manual processes create knowledge silos and single points of failure.

    "Document your decisions, not just your code." Write down WHY you chose this architecture. Future you (and your team) will thank you during the inevitable 3 AM incident.

    "Plan for failure from the beginning." Every service will fail. Every network will have issues. Design for it, test for it, expect it.

    What's the best cloud advice you wish you'd received earlier?
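"Plan for failure from the beginning" has one concrete everyday form: wrap remote calls in retries with exponential backoff and jitter, so transient failures are absorbed instead of cascading. A minimal sketch (the parameter values are illustrative defaults, not recommendations for any particular service):

```python
import random
import time

def with_backoff(call, attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Retry `call` on exception, doubling the wait each attempt.

    Full jitter (a random fraction of the computed delay) spreads
    retries out so clients don't all hammer a recovering service
    at the same instant.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise                       # out of retries: surface it
            delay = min(cap, base * 2 ** attempt)
            sleep(delay * random.random())  # full jitter
```

Passing `sleep` as a parameter makes the helper testable without real waiting — the same "design for failure, test for it" habit the post describes.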

  • View profile for Lucy Wang

    Founder @ Zero To Cloud | “Tech With Lucy” 250K+ on YouTube, Follow me & let’s grow our skills! 💪☁️

    80,804 followers

    𝗔𝗪𝗦 𝗜𝘀 𝗤𝘂𝗶𝗲𝘁𝗹𝘆 𝗕𝗹𝗲𝗻𝗱𝗶𝗻𝗴 𝗔𝗜 𝗜𝗻𝘁𝗼 𝗘𝘃𝗲𝗿𝘆𝘁𝗵𝗶𝗻𝗴 👇

    If you're working with Cloud / AWS, you’ve probably noticed something happening lately: AI isn’t just a separate service anymore... it’s being woven into everyday cloud tools. As a cloud learner or professional, you need to understand how these updates are changing the work we do. Let me break it down 👇

    🔹 Lambda: Now supports agent-based workflows
    You can now create AI agents inside AWS Lambda using the new agent capabilities. This means an agent can call external APIs, make decisions based on responses, and execute step-by-step plans.

    🔹 CloudWatch: Smarter anomaly detection
    CloudWatch has added AI-based insights that automatically detect unusual spikes or drops, help explain what caused the change, and reduce the need for manual dashboard digging.

    🔹 IAM: AI-generated policy suggestions
    When creating IAM roles or policies, AWS now offers auto-suggested permissions based on usage. This saves time and reduces the chance of misconfigured access.

    🔹 S3: Data prep for AI/ML built in
    S3 recently added features like object transformations for model-ready formats and integrations with SageMaker and Bedrock. Your raw data can be cleaned, structured, and sent to models, all without leaving S3.

    You don’t need to shift to a new “AI role” to stay relevant, but you do need to notice what’s changing in the tools you already use. Start small, try the new options, and understand where AI is quietly helping.

    💬 Have you tried any of these new AI features in AWS? Let me know in the comments 👇
    ♻️ Found this helpful? Feel free to repost & share with your network.
    —
    📥 For weekly Cloud learning tips, subscribe to my free Cloudbites newsletter: https://www.cloudbites.ai/
    📚 My AWS Learning Courses: https://zerotocloud.co/
    📹 Watch my weekly YouTube videos: https://lnkd.in/gQ8k29DE

    #aws #cloud #ai #genai #tech #zerotocloud #techwithlucy

  • View profile for Antonio Grasso
    Antonio Grasso is an Influencer

    Technologist & Global B2B Influencer | Founder & CEO | LinkedIn Top Voice | Driven by Human-Centricity

    41,675 followers

    The trend towards multi-cloud interoperability is transforming modern IT infrastructures, allowing organizations to gain flexibility, cost efficiency, and resilience by ensuring seamless integration across different cloud environments.

    Achieving effective multi-cloud interoperability relies on a few essential design principles that prioritize flexibility and adaptability. Cloud-agnostic coding minimizes dependencies on specific platforms, reducing lock-in risks. A microservices-based design keeps applications modular and scalable, making them easier to manage and integrate across diverse cloud providers. Automation, by reducing manual intervention, lowers complexity, enhances efficiency, and improves system resilience. Exposing APIs by default standardizes communication and ensures seamless interactions between components. A robust CI/CD pipeline enhances reliability and repeatability, enabling continuous updates and adaptations that meet evolving business needs.

    #CloudComputing #multicloud
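"Cloud-agnostic coding" in practice usually means programming against a small interface of your own and confining provider-specific code to thin adapters. A sketch of the pattern — the class and function names are invented for illustration, with an in-memory adapter standing in for an S3, GCS, or Azure Blob implementation:

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """The narrow interface the application depends on."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(ObjectStore):
    """Test double; real adapters would wrap boto3,
    google-cloud-storage, or the Azure SDK behind the same methods."""
    def __init__(self):
        self._blobs = {}
    def put(self, key, data):
        self._blobs[key] = data
    def get(self, key):
        return self._blobs[key]

def archive_report(store: ObjectStore, report_id: str, body: bytes):
    # Application code sees only ObjectStore, never a provider SDK,
    # so moving clouds means swapping one adapter class.
    store.put(f"reports/{report_id}", body)

store = InMemoryStore()
archive_report(store, "2025-01", b"monthly totals")
```

The in-memory adapter doubles as a test fixture, which is why this pattern also supports the robust CI/CD pipeline the post calls for: business logic can be tested with no cloud credentials at all.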

  • View profile for Omkar Sawant

    Helping Startups Grow @Google | Ex-Microsoft | IIIT-B | GenAI | AI & ML | Data Science | Analytics | Cloud Computing

    15,281 followers

    𝐃𝐢𝐝 𝐲𝐨𝐮 𝐤𝐧𝐨𝐰 𝐭𝐡𝐚𝐭 𝐠𝐥𝐨𝐛𝐚𝐥 𝐦𝐨𝐛𝐢𝐥𝐞 𝐝𝐚𝐭𝐚 𝐭𝐫𝐚𝐟𝐟𝐢𝐜 𝐢𝐬 𝐞𝐱𝐩𝐞𝐜𝐭𝐞𝐝 𝐭𝐨 𝐫𝐞𝐚𝐜𝐡 𝐚 𝐬𝐭𝐚𝐠𝐠𝐞𝐫𝐢𝐧𝐠 77.5 𝐞𝐱𝐚𝐛𝐲𝐭𝐞𝐬 𝐩𝐞𝐫 𝐦𝐨𝐧𝐭𝐡 𝐛𝐲 2027?

    This explosion of data presents both a challenge and a massive opportunity for telecommunication companies. But are they equipped to handle it?

    The telecommunications industry is undergoing a seismic shift. Why should you care? Because this transformation impacts how we connect, communicate, and experience the digital world. A recent study showed that poor network performance can lead to a 30% increase in customer churn.

    👉 In today's hyper-connected world, customer expectations are higher than ever, and telcos need to leverage data to stay ahead of the curve.
    👉 Traditional data management systems struggle to keep pace with the sheer volume, velocity, and variety of data generated by modern telecom networks. Sifting through massive datasets to gain actionable insights is like finding a needle in a haystack.
    👉 This makes it difficult to optimize network performance, personalize customer experiences, and develop innovative new services. Telcos need a new approach to data management to unlock the true potential of their data.

    𝐓𝐡𝐞 𝐬𝐨𝐥𝐮𝐭𝐢𝐨𝐧?
    👉 Deutsche Telekom, one of the world's leading telecommunications providers, is leading the charge by designing the telco of tomorrow with BigQuery.
    👉 By leveraging BigQuery's data warehousing and analytics capabilities, Deutsche Telekom can ingest and analyze massive datasets in real time, gaining valuable insights into network performance, customer behavior, and market trends.
    👉 They can now proactively identify and resolve network issues, personalize offers and services for individual customers, and develop new revenue streams.

    𝐊𝐞𝐲 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬:
    👉 Real-time insights: BigQuery enables real-time analysis of massive datasets, allowing telcos to react quickly to changing network conditions and customer needs.
    👉 Improved customer experience: by understanding customer behavior and preferences, telcos can personalize services and offers, leading to increased customer satisfaction and loyalty.
    👉 Innovation & growth: access to rich data insights empowers telcos to develop innovative new services and explore new business models.
    👉 Scalability & flexibility: cloud-based solutions like BigQuery offer the scalability and flexibility needed to handle the ever-growing data demands of the telecommunications industry.

    This journey highlights the transformative power of data in the telecommunications industry. By embracing cloud-based data solutions, telcos can unlock valuable insights, improve customer experiences, and drive innovation. The future of telecom is data-driven, and companies that embrace this reality will be the leaders of tomorrow.

    Follow Omkar Sawant for more.

    #telecommunications #bigdata #cloud #digitaltransformation #dataanalytics
