Top LinkedIn Content on Understanding Advanced Computing

Founder & Executive Director NTS Group & thejobswitch.com | Leading the No.1 Executive Search, Outplacement & Career Transition Company Across MEA & APAC | peter.norwell@ntsrecruitment.com

46,954 followers 7mo

Pharma just got a lot more powerful. Eli Lilly and Company has just announced a groundbreaking partnership with NVIDIA to build one of the world’s most powerful AI supercomputers dedicated to drug discovery. This isn’t just another “AI in healthcare” headline; it’s a shift in how science happens. For decades, developing a new drug could take 10–15 years. With this collaboration, Lilly will use NVIDIA’s DGX SuperPOD to train advanced AI models on millions of experiments, turning what once took years into months (or even weeks). Even more interesting, Lilly plans to open parts of this capability to biotech startups through its TuneLab platform, meaning smaller innovators can now access big-pharma-level compute power and data science without the massive infrastructure. This is pharma meeting silicon. Science meeting scale. And collaboration meeting computation. As AI reshapes every industry, from drug discovery to supply chain to marketing, the leaders who win will be those who treat technology not as a tool, but as a strategic partner. The future of healthcare isn’t just about discovering new medicines. It’s about discovering new ways to discover.

152 Comments

Dr. Brindha Jeyaraman

19,715 followers 4mo

If your agent action isn’t idempotent, you’re building a liability. Agents retry. Networks fail. Tools time out. Humans interrupt flows. If “execute_payment()” runs twice, what happens?
Production AI requires: ✅ Idempotency keys ✅ Action versioning ✅ Side-effect detection ✅ Safe replay capability
An agent should be able to: 1. Retry safely 2. Resume safely 3. Recover safely
Without duplicating: 1. Payments 2. Emails 3. Data mutations 4. System state
Agents are probabilistic. Your side effects cannot be. Idempotency isn’t optional in enterprise AI. It’s survival engineering. Are your agent actions replay-safe?
#AIEngineering #Idempotency #SystemDesign #ReliableAI #AgentDesign #DistributedSystems #ProductionAI #MLOps #EnterpriseArchitecture #AITooling
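One way to make the "idempotency keys" and "action versioning" items concrete is to derive the key from the logical action itself, so that every retry of the same step produces the same key. The sketch below is only an illustration: the function name and all of its parameters (`run_id`, `step`, `action`, `version`, `args`) are hypothetical and do not come from the post.

```python
import hashlib
import json


def idempotency_key(run_id: str, step: int, action: str, version: str, args: dict) -> str:
    """Derive a stable key for one logical agent action.

    All parameter names are hypothetical. The same inputs always produce
    the same key, so a retried call can be recognized as a duplicate.
    """
    payload = json.dumps(
        {"run": run_id, "step": step, "action": action, "version": version, "args": args},
        sort_keys=True,         # argument order must not change the key
        separators=(",", ":"),  # fixed formatting keeps the bytes stable
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Including a run ID and step number is a design assumption: without them, two intentionally separate payments with identical arguments would collapse into one key.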

49 Comments

Brij Kishore Pandey

AI Architect & AI Engineer | Building Agentic Systems & Scalable AI Solutions

729,775 followers 1y

Concurrency vs. Parallelism: Understanding the Difference
In the world of software development, the terms concurrency and parallelism are often used interchangeably, but they’re not the same thing. Let me explain:
Concurrency:
- It’s about dealing with multiple tasks over the same period of time, but not necessarily executing them simultaneously.
- Think of it as multitasking: you switch between tasks to keep things moving, even if only one is actively executing at any moment.
Key Insight: Concurrency is about structure, that is, how tasks are managed and interleaved. It’s more about time-slicing and less about simultaneous execution.
Example:
- Your operating system handles multiple applications, e.g., running a browser while music plays in the background.
Parallelism:
- It’s about doing multiple tasks simultaneously, literally executing tasks at the same time on different processing units.
- This typically requires hardware that supports parallel processing, like multi-core CPUs or GPUs.
Key Insight: Parallelism is about execution: tasks happening at the exact same moment.
Example:
- A graphics rendering engine processes multiple pixels in parallel using GPU cores.
Concurrency vs. Parallelism in Practice:
1. Concurrency Without Parallelism:
- A single-core CPU can run multiple tasks concurrently by switching between them quickly (context switching).
2. Parallelism Without Concurrency:
- A multi-core CPU executes two independent tasks simultaneously without needing to switch.
3. Concurrency + Parallelism:
- A multi-core system managing multiple interdependent tasks that execute in parallel while coordinating their progress.
Why This Matters:
1. Concurrency improves responsiveness in systems. It’s crucial for apps like servers, where tasks like handling multiple user requests are interleaved.
2. Parallelism boosts speed and throughput. It’s ideal for computationally intensive tasks, like training machine learning models or processing large datasets.
Quick Takeaway:
- Use concurrency when your goal is better task management.
- Use parallelism when your goal is faster execution.
- Many systems today rely on both to achieve efficiency and scalability.
Have I overlooked anything? Please share your thoughts; your insights are priceless to me.
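The distinction above can be sketched in a few lines of Python. This is an illustrative example, not something from the post: `run_concurrent` interleaves two tasks on a single thread with `asyncio` (concurrency without parallelism), while `run_parallel` hands CPU-bound work to separate processes (parallelism). All function names are made up for the illustration.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def worker(name: str, steps: int, log: list) -> None:
    for i in range(steps):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # yield control so the other task can run


async def interleaved() -> list:
    log: list = []
    # Both tasks make progress in turns on one thread; nothing runs simultaneously.
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


def run_concurrent() -> list:
    return asyncio.run(interleaved())


def cpu_bound(n: int) -> int:
    # Stand-in for computationally heavy work.
    return sum(i * i for i in range(n))


def run_parallel(inputs: list) -> list:
    # Each input is handled by a separate process, which can run on its own core.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(cpu_bound, inputs))
```

Calling `run_concurrent()` shows the two tasks taking turns in the log. `run_parallel([10**6, 10**6])` should be called from under an `if __name__ == "__main__":` guard, since process pools re-import the main module on some platforms.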

37 Comments

Gopalakrishna Kuppuswamy

Co-founder and Chief Innovation Officer, Cognida.ai

5,161 followers 3mo

Enterprise AI Is a Systems Engineering Challenge
Much of today’s conversation around AI agents focuses on #graphs, #models, #prompts, #context, or orchestration #frameworks. These topics matter, but they rarely determine whether an AI system succeeds once it moves from prototype to enterprise production. The real challenges appear when AI systems operate inside long-running business workflows. Consider a workflow that analyzes documents, retrieves data from multiple systems, calls APIs, and produces a structured decision. Such processes may run for twenty or thirty minutes and involve dozens of steps. Now imagine something routine happens: a network call fails, an API times out, or a container restarts. No problem, the agent says. It starts the workflow again. That may be acceptable for chatbots. It quickly becomes impractical for enterprise processes such as financial analysis, document processing, underwriting, or claims review. These workflows are long-running, resource-intensive, and deeply connected to operational systems. In these situations, the limitation is rarely the model’s intelligence. More often, the challenge lies in the #engineering #discipline around the system. At Cognida.ai, our focus is on building practical enterprise AI systems rather than demos or PoCs. We consistently find that several principles from #distributedsystems engineering become essential once AI moves into production. Here are three such constructs:
Durable Execution
Agent workflows should not be treated as temporary requests. Each step should persist its state so that if a failure occurs, the system can resume from the last successful step rather than restarting the entire process. In practice, this means workflow orchestration with checkpointed state, deterministic execution, and event-driven recovery. For long-running processes, this is often the difference between a prototype and a production system.
Idempotent Actions
AI agents increasingly trigger real-world actions: sending emails, calling APIs, updating records, moving files, or initiating financial transactions. Retries are inevitable in distributed systems. If actions are not idempotent, retries can create duplicate or inconsistent results. Reliable AI systems must ensure the same action cannot run twice unintentionally.
Persistent State Beyond the Model
Large language models operate within limited context windows rather than durable memory. Enterprise workflows often run longer and across many stages. The system managing the workflow must maintain its own persistent state instead of relying on the model’s temporary context. It means treating AI workflows as structured state machines, not simple prompt-response interactions.
Are you treating AI workflows more like state machines, event-driven systems, or traditional #microservices? #PracticalAI #EnterpriseAI
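The durable-execution idea, resuming from the last successful step instead of restarting, can be sketched with a checkpoint file. This is a minimal illustration under stated assumptions, not how Cognida.ai or any particular orchestrator implements it: `run_workflow`, the JSON checkpoint layout, and the step signature are all hypothetical, and state is assumed to be JSON-serializable.

```python
import json
import os
from typing import Callable, List, Tuple

Step = Callable[[dict], dict]  # hypothetical step signature: state in, state out


def run_workflow(steps: List[Tuple[str, Step]], checkpoint_path: str) -> dict:
    """Run named steps in order, persisting progress after each one."""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)  # resume from an earlier, interrupted run
    else:
        saved = {"done": [], "state": {}}

    for name, step in steps:
        if name in saved["done"]:
            continue  # already completed before the failure; do not repeat it
        saved["state"] = step(saved["state"])
        saved["done"].append(name)
        # Write to a temp file, then replace, so a crash never leaves a torn checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(saved, f)
        os.replace(tmp_path, checkpoint_path)
    return saved["state"]
```

If a step raises, the checkpoint still records every earlier step, so calling `run_workflow` again with the same path skips them and retries only the failed one. A step that crashes after its side effect but before the checkpoint is written would still run again, which is why the idempotency point above matters alongside checkpointing.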

3 Comments

Alexey Navolokin

FOLLOW ME for breaking tech news & content • helping usher in tech 2.0 • GM @ AMD • Turning AI, Cloud & Emerging Tech into Revenue

787,032 followers 6mo

AMD, UNSW Sydney & Pawsey: Redefining Real-Time Genomics with Slorado
A major milestone for open science and high-performance genomics. AMD, UNSW Sydney, and the Pawsey Supercomputing Research Centre have introduced Slorado, the world’s first fully open-source, real-time nanopore DNA basecaller designed for AMD GPUs and powered by the ROCm open software platform. This breakthrough removes long-standing vendor lock-in and dramatically accelerates genomic workflows, empowering researchers with speed, scale, and flexibility.
🔬 What Slorado Enables
+ Fully open-source basecalling pipeline for nanopore sequencing
+ Runs on AMD GPUs via ROCm and supports hybrid GPU environments
+ Scales across multi-GPU and HPC infrastructures
+ Delivers performance parity with proprietary alternatives while improving accessibility
⚡ Performance Highlights on Pawsey’s Setonix Supercomputer
Powered by AMD Instinct GPUs:
+ Full human genome decoded in:
+ 2.3 hours on MI250X GPUs
+ Just 0.8 hours on next-gen MI300X GPUs
+ High-accuracy models (HAC & SUP) also show significant acceleration without compromising data quality
This level of performance transforms what once took days into hours, or even minutes, enabling faster research cycles, real-time pathogen surveillance, and scalable population genomics.
🌍 Why This Matters
✅ Democratizes access to high-performance genomics
✅ Accelerates discovery and clinical research
✅ Strengthens reproducibility through open-source transparency
✅ Expands AMD’s role as a trusted platform for scientific computing and AI
✅ Bridges HPC, AI, and bioinformatics into a unified ecosystem
Slorado is more than a tool; it’s a signal of where the future of genomics is heading: open, accelerated, and accessible at global scale. AMD continues to push the boundaries of what’s possible in scientific computing, from AI to genomics and beyond.
🔗 Explore more: https://lnkd.in/ghSRHX7S
#AMD #Genomics #OpenScience #HPC #AIinHealthcare #ROCm #InstinctGPUs #Supercomputing #Innovation #Bioinformatics #FutureOfScience #AMDBrandAmbassador

23 Comments

Saumya Awasthi

Senior Software Engineer | AI & Tech Content Creator | Featured in Times Square | Open to Collabs 🤝

349,642 followers 2w

Most developers think concurrency and parallelism mean the same thing. They don’t. Here’s the difference: Concurrency is about managing multiple tasks during the same period of time. Parallelism is about executing multiple tasks at the same time. Concurrency can create the appearance of things happening simultaneously, even when a system is rapidly switching between tasks. As the number of tasks grows, that switching comes with overhead. Why? Because resources are shared, and every task has to wait for its opportunity to run. On the other hand, parallelism can significantly improve performance for CPU-heavy workloads by utilizing multiple cores. But parallelism alone isn’t enough either. Tasks still need coordination, communication, and proper resource management. That’s why scalable systems rely on a combination of both. Here’s how:
1️⃣ Synchronization
→ Coordinate access to shared resources safely.
→ Mutex: allows only one thread to access a resource at a time.
→ Semaphore: limits how many threads can access a resource simultaneously.
2️⃣ Lock-free communication
→ Modern languages provide mechanisms that reduce the need for explicit locks.
→ Goroutines allow lightweight concurrent execution.
→ Channels enable safe communication between tasks without directly sharing memory.
3️⃣ System architecture
→ Design matters as much as implementation.
→ Thread pools help manage workloads efficiently.
→ Non-blocking I/O reduces waiting time.
→ Event-driven architectures improve responsiveness under heavy load.
The goal isn’t to create more threads. The goal is to maximize useful work while minimizing waiting, contention, and unnecessary overhead. Concurrency helps you manage many tasks. Parallelism helps you execute more work. The strongest systems know how to use both.
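The mutex and semaphore described under "Synchronization" map directly onto `threading.Lock` and `threading.Semaphore` in Python's standard library. The sketch below is illustrative only: `locked_count` and `max_concurrent` are made-up helper names, and the short sleep merely simulates time spent using a shared resource.

```python
import threading
import time


def locked_count(n_threads: int, increments: int) -> int:
    """Mutex: only one thread at a time may update the shared counter."""
    counter = 0
    lock = threading.Lock()

    def work() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:  # read-modify-write happens atomically under the lock
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


def max_concurrent(n_threads: int, limit: int) -> int:
    """Semaphore: at most `limit` threads may hold the resource at once."""
    sem = threading.Semaphore(limit)
    guard = threading.Lock()  # protects the bookkeeping counters below
    active = 0
    peak = 0

    def work() -> None:
        nonlocal active, peak
        with sem:
            with guard:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)  # simulate using the shared resource
            with guard:
                active -= 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

`locked_count(4, 1000)` returns exactly 4000 because no increment is lost, and `max_concurrent(8, 3)` never reports a peak above 3.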

8 Comments

Ben Van Roo

CEO and Co-Founder of Legion Intelligence Inc

7,567 followers 6mo

The DoD just unlocked frontier AI models with GenAI.mil. It's a crucial first step for increasing the "AI IQ" of the force. But as this new piece highlights, a bare model sitting behind a chat window cannot own a workflow. It can assist, but it can't execute. The next phase of military AI isn't about finding a smarter chatbot; it’s about building an integrated architecture that turns secure browsing into decisive action. The article outlines the blueprint for moving from experimental bridges to real-world military systems:
1) Moving beyond the "blob of text" to structure unstructured data (OPORDs, FRAGORDs) into executable tasks.
2) Building an Orchestration Layer to manage thousands of specialized agents across classifications and clouds.
3) Solving the Resilience Layer, because we don't always fight with high-bandwidth cloud access. We need workflows that degrade gracefully at the tactical edge.
It’s time to turn chat-based experiments into Digital Staff Officers and Digital NCOs and embed them in real systems. https://lnkd.in/gKUrAnfG

GenAI.mil Is Live. Now Comes the Hard Part: Building the Digital NCO Corps. benvanroo.substack.com

2 Comments

Tannika Majumder

Senior Software Engineer at Microsoft | Ex Postman | Ex OYO | IIIT Hyderabad

49,558 followers 9mo

I’ve used this exact scenario to explain idempotency to more than 30 of my juniors in technical discussions. Every single time, it clicks instantly and the understanding sticks, which is a reward for me as a senior engineer. So let’s break it down:
You've filled your cart during the Flipkart/Amazon Great Indian Festival Sale. Laptop, headphones, gifts for the family, a solid ₹60,000 haul. You click "Place Order," enter your UPI PIN/CVV, and hit Pay. ...and then, thanks to classic Indian broadband, the internet disconnects. The page spins forever. No confirmation screen. Now you’re wondering: "Did the payment go through? Did I just lose ₹60,000? Should I try again? What if I get charged twice?!" But you check your bank SMS or UPI app. One deduction. Only one. You refresh the page later, and there's your order, confirmed.
Here’s what happens behind the scenes when you click that button:
1. The Unique "Receipt" (Idempotency Key): The moment you click "Place Order," the backend generates an idempotency key, a unique identifier such as a UUID (the example here uses the form diwali_sale_<your_user_id>_<random_number>). Think of it as a unique transaction receipt number. This key is attached to every payment request sent to the backend.
2. The First Payment Request: Your request: "Hey Flipkart backend, please charge me ₹60,000. Here's my receipt number: diwali_sale_123_xyz." The backend processes it: charges your card/UPI, creates an order, and most importantly, stores the result ("Success, Order ID: OD123") linked to that exact receipt number diwali_sale_123_xyz.
3. The Retry: The network fails. Your app/browser doesn't get a response. So, what does it do? It does the logical thing: it retries. It sends the exact same request again: "Hey Flipkart backend, please charge me ₹60,000. Here's my same receipt number: diwali_sale_123_xyz."
4. The Backend: This is where the magic happens. The backend doesn't just blindly process the payment again. It checks its database: "Have I seen this receipt number diwali_sale_123_xyz before?" If YES: It understands this is a duplicate request. It does NOT call the payment gateway again. Instead, it simply fetches the result of the first successful request ("Success, Order ID: OD123") and sends it back to you. If NO: It processes it as a new, unique request.
This simple check guarantees that one unique key = one financial transaction. Always. No matter how many times you retry.
➤ Why is this a big deal at scale? Think about the Diwali sale traffic:
– Millions of users hit "Place Order" at the same second.
– Unreliable mobile networks across India cause countless timeouts and retries.
– Payment gateways (like Razorpay, BillDesk, PayU) are under extreme load, responding slowly.
Without idempotency, this would be chaos. Double charges, triple charges, angry customers, and a PR nightmare.
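The four steps above can be sketched as a small in-memory simulation. This is an illustration only, not how Flipkart or any real payment backend is built: `PaymentBackend`, `place_order`, and the fake gateway are hypothetical names, and a dictionary stands in for the database.

```python
class PaymentBackend:
    """Toy backend that replays the stored result for a repeated key."""

    def __init__(self) -> None:
        self.results = {}       # idempotency key -> stored response
        self.gateway_calls = 0  # how many times money was actually moved

    def _charge_gateway(self, amount: int) -> str:
        # Stand-in for the real payment gateway call.
        self.gateway_calls += 1
        return f"OD{self.gateway_calls}"

    def place_order(self, key: str, amount: int) -> dict:
        if key in self.results:
            # Duplicate request: return the first result, do not charge again.
            return self.results[key]
        order_id = self._charge_gateway(amount)
        response = {"status": "Success", "order_id": order_id}
        self.results[key] = response  # remember the outcome under the key
        return response
```

Calling `place_order("diwali_sale_123_xyz", 60000)` twice returns the same response and leaves `gateway_calls` at 1. The check-then-store here is not safe when two identical requests arrive at the same instant; a production system would need the lookup and the write to be atomic, for example through a database uniqueness constraint.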

45 Comments

Omkar S.

Top 1% AI Engineering Leader | Platform & SRE | AI-First Teams | Helping orgs scale through intelligent infrastructure

30,060 followers 1mo

Most "AI architectures" in production right now are one Python script and a prayer. That works for prototypes. It does not work for enterprise scale, multi-vendor reality, or systems that need to survive a model swap, a vendor change, or a regulator. Real AI architecture is about patterns: battle-tested ways to organize agents, models, data, and governance.
Here are 6 patterns every AI architect should know:
1. Agentic Mesh
• Best for: scaling many agents across the enterprise without chaos
• Agents: custom and SaaS agents working as peers, not silos
• Governance: identity, guardrails, audit wrap every action
• Exchange: shared context flows across all agents
• Network: decentralized, no single point of failure
• Tools: unified registry, reusable across workflows
2. RAG Pattern
• Best for: grounding LLMs in your enterprise data without retraining
• Source attribution is non-negotiable for trust and compliance
3. LLM Router
• Best for: controlling cost and latency across a model portfolio
• Small/local model: simple classification, formatting, extraction
• Mid-tier API: standard reasoning, summaries, drafts
• Frontier model: complex reasoning, code, multi-step planning
• Fallback: retry path when primary fails
• Stop sending every query to your most expensive model
4. Hub-and-Spoke
• Best for: centralizing AI governance while letting business units innovate
• Hub (central platform): models, guardrails, observability, FinOps
• Spokes (business units and domain teams): build use cases on platform standards, own domain logic
• Standards (shared layer): reusable APIs, prompt libraries, evals
• Center of Excellence: sets policy, audits, upskills
5. Event-Driven Agents
• Best for: real-time, asynchronous agent workflows at scale
• Event bus: single source of truth for what happened
• Publishers: agents emit events, never call directly
• Subscribers: agents react to events they care about
• Replay: full audit trail, deterministic debugging
• Decoupling: add or remove agents without breaking flow
6. Hexagonal Architecture for AI
• Best for: keeping your AI system swappable as models and vendors change
• LLM port: swap GPT, Claude, Gemini without rewriting
• Vector port: swap Pinecone, Weaviate, pgvector freely
• Tool port: add new APIs without touching core logic
• UI port: same brain, different interfaces
• Observability: logging, tracing, evals plug in cleanly
The takeaway
Your model will get cheaper. Your vendor will change. Your data will multiply. Your agent count will explode. The architects who win in 2026 will be the ones who designed for swap-ability, governance, and scale from day one, not the ones who shipped a clever demo and called it a system.
♻️ Repost to help your team build for the long game
➕ Follow Omkar S. for more on architecting AI systems at scale
#AIArchitecture #AIAgents #EnterpriseAI
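As one illustration of the patterns listed, the LLM Router (tiered models plus a fallback path) might be sketched as follows. Everything here is hypothetical: `LLMRouter`, the tier names, and the caller-supplied `classify` function are placeholders, and the models are plain callables rather than any vendor's SDK.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical model interface: prompt in, text out


class LLMRouter:
    """Send each prompt to the tier chosen by `classify`, with a fallback."""

    def __init__(self, tiers: Dict[str, Model], classify: Callable[[str], str], fallback: Model) -> None:
        self.tiers = tiers
        self.classify = classify  # caller decides how prompts map to tiers
        self.fallback = fallback

    def complete(self, prompt: str) -> str:
        primary = self.tiers[self.classify(prompt)]
        try:
            return primary(prompt)
        except Exception:
            # Retry path when the primary model fails.
            return self.fallback(prompt)
```

In use, `tiers` might map names such as "small" and "frontier" to different model clients, with `classify` encoding whatever cost and complexity rules the team agrees on. Keeping the routing rule outside the class is a design choice of this sketch, so the policy can change without touching the router.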

71 Comments

Raul Junco

Simplifying System Design

140,957 followers 7mo

Most teams mess up idempotency because they treat it like a caching problem. It’s not. It’s a storage problem. Diego built a payments endpoint handling 5k charge requests/sec. Clients retry for minutes... sometimes hours. If he gets idempotency wrong, users get double-charged. Game over. Most engineers try something like: “Just throw the key in Redis with TTL.” Fast. Simple. And wrong. Because the second Redis evicts the key, or a TTL expires, or a node fails, you don’t just lose the key... you lose guarantees. Payments can’t live on “best effort.” Here’s the real play: Idempotency belongs in the source of truth. The database. Use a unique request_id. Do a conditional insert. If it’s new → process and save the result. If it conflicts → return the stored one. Atomically. Durably. Forever. No locks. No race conditions. No cache guessing. No “hope it’s still in memory.” It scales. It survives region failovers. It handles endless retries without breaking a sweat. Because when money moves, you don’t bet on a TTL. You bet on a constraint.
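The "unique request_id plus conditional insert" approach can be sketched with SQLite from Python's standard library. This is an illustration under assumptions, not Diego's actual endpoint: the `charges` table, its columns, and the `charge` function are hypothetical, and the real payment call is replaced by a string.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The PRIMARY KEY is the uniqueness constraint that deduplicates requests.
    conn.execute(
        "CREATE TABLE charges ("
        "request_id TEXT PRIMARY KEY, "
        "amount_cents INTEGER NOT NULL, "
        "result TEXT NOT NULL)"
    )
    return conn


def charge(conn: sqlite3.Connection, request_id: str, amount_cents: int) -> str:
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO charges (request_id, amount_cents, result) "
            "VALUES (?, ?, ?)",
            (request_id, amount_cents, "pending"),
        )
        if cur.rowcount == 1:
            # The insert won, so this is the first time the key was seen.
            result = f"charged:{amount_cents}"  # stand-in for the real payment
            conn.execute(
                "UPDATE charges SET result = ? WHERE request_id = ?",
                (result, request_id),
            )
            return result
    # The insert was ignored: the key already exists, so return the stored result.
    row = conn.execute(
        "SELECT result FROM charges WHERE request_id = ?", (request_id,)
    ).fetchone()
    return row[0]
```

Calling `charge(conn, "req-1", 500)` any number of times leaves exactly one row and returns the same result each time. A production version would also need a policy for rows still marked "pending" when a retry arrives, which this sketch does not address.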

52 Comments
