7
votes
What difference and relation are between fault tolerance and (high) availability?
The basic concepts are orthogonal; however, they are related. One has to do with the availability of your application, and the other has to do with the correctness of your application. Remember, ...
7
votes
Accepted
How can adding redundancy adversely affect performance
Scenario A: You have a system that solves problem X. Scenario B: You have a system that solves problem X and has to ensure that it is always synchronized with the redundant backup.
It is pretty clear ...
6
votes
Accepted
How can we keep sight of business flows in event driven architectures?
The problem is that it can be hard to see such a flow as it's not explicit in any program text. Often the only way to figure out this flow is from monitoring a live system.
There are two separate ...
6
votes
Accepted
How to prevent concurrency problems when using the repository pattern?
The repository pattern does not intend to solve concurrency issues. It only provides a convenient way to work with persistent entities.
In principle you'd use your repository pattern in ...
6
votes
Accepted
Giving multiple components access to a single database
Indeed, option 2 (direct DB access) makes the ownership of the shared entities, and the responsibility for their invariants, unclear.
In the long run, maintenance risks increase. A typical example is an ...
5
votes
Accepted
How do atomic updates work at scale?
First things first, based on this question it appears you do not at all understand the concept of atomicity. I’d recommend doing a bit more work to understand that before moving on to distributed ...
4
votes
Accepted
How to design a high-scale, reliable, distributed and periodic task (cronjob) execution service?
This is a very broad question; the answer would be: by designing such a system.
I think however you are more interested in what techniques can be used to minimise the impact of a node failing, and ...
4
votes
Accepted
Accepting the UUID collision risk based on number of clients
The whole point of UUIDs is that the risk of collisions can be safely ignored. A conflict solution is not needed.
If you look at your log files and see a message "Fatal Error: UUID collision detected"...
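To put a rough number on "safely ignored", here is a minimal sketch of the standard birthday-bound approximation. It assumes version-4 (random) UUIDs, which carry 122 random bits, and a good random source. The function name `collision_probability` is made up for illustration and is not part of the original answer.

```python
import math

# Assumption: version-4 UUIDs, which have 122 random bits
# (6 of the 128 bits are fixed by the version and variant fields).
RANDOM_BITS = 122


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one collision among n random IDs.

    Uses the birthday-bound approximation
        p ~= 1 - exp(-n * (n - 1) / (2 * 2**bits)).
    """
    space = 2 ** bits
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for count in (10**6, 10**9, 10**12):
        print(f"{count:>16,d} UUIDs -> p ~= {collision_probability(count):.3e}")
```

The probability grows with the square of the number of generated IDs, so what matters is the total number of UUIDs ever created rather than the number of clients.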
4
votes
How can adding redundancy adversely affect performance
Two simple examples:
Adding an index to a database table is a common way to introduce redundancy, with the intention of speeding up read operations on that table. However, if a table is more ...
4
votes
Is sequential consistency equivalent to performing memory accesses by a process in program order and performing each memory access atomically?
No, your definition is not quite equivalent to the definition of sequential consistency but would be closer to strict consistency. There are two relevant aspects: (a) there are multiple processors/...
4
votes
Long-running compute-intensive tasks in APIs: background workers?
Move the problem solution to the client by providing separate HTTP API endpoints to submit processing requests and to collect results
Fixed that for you.
Doing this gives you a number of advantages:
...
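As a rough sketch of the submit-then-collect idea, the class below keeps the two operations separate: one call hands in work and immediately returns a job id, and another call reports the status or result for that id. The HTTP layer is left out on purpose. `JobService`, `submit`, and `result` are hypothetical names, and the in-memory dictionary stands in for whatever persistent job store a real service would need.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobService:
    """In-memory stand-in for a 'submit' endpoint and a 'collect' endpoint."""

    def __init__(self, work, max_workers=2):
        self._work = work  # the long-running function to execute
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}  # job id -> Future (assumption: not persisted)

    def submit(self, payload):
        """Would back something like 'POST /jobs': returns at once with an id."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(self._work, payload)
        return job_id

    def result(self, job_id):
        """Would back something like 'GET /jobs/<id>': never blocks."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "done", "result": future.result()}
```

A client would call `submit` once and then poll `result` with the returned id until the status is no longer `"pending"`.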
4
votes
Giving multiple components access to a single database
Sharing a database between multiple applications is known as an integration database. Not only does it have to consider the requirements of all of the applications integrating through it (which results ...
3
votes
Docker and GPU-based computations. Feasible?
Containerization is completely orthogonal to “high load” or “parallelization”. Containerization also does not imply any virtualization, and is better interpreted as sandboxing.
So why do people use ...
3
votes
Is shared disk architecture scaling up or scaling out?
Shared disk is a vertically scaled approach for disk. As Robert Harvey points out, you are scaling horizontally for memory and CPU, but the disk is one (or a few) component.
There's a simple way to ...
3
votes
Does the producer-consumer problem appear in both shared memory and distributed memory architectures?
You are right in thinking that the problems of races and deadlocks in producer-consumer stem from shared variables and data structures.
Any system that allows multiple accessors (processes, threads, ...
3
votes
Multiple sources of truth - Optimistic concurrency & Eventual consistency
There are multiple ways to go about solving the issue of having distributed entities.
Avoiding it
You could make sure the customer entity only exists in one place and have the other parts of the system ...
2
votes
Error handling in distributed system
Appending to a persistent log on A should suffice. This copes with reboots and network partitions to achieve eventual consistency, or to signal breakage which prevents such convergence. With amortized ...
2
votes
Patterns for maintaining consistency in a distributed, event sourced system?
Sounds like you could implement a business process (saga in context of Domain Driven Design) for the user registration where the user is treated like a CRDT.
Resources
https://doc.akka.io/docs/akka/...
2
votes
Distributed training of many small ML models
There's no particular right or wrong way to do this because this depends on your projects, and on whether you can exploit the structure of your data for efficiency.
E.g. for a one-off project, you ...
2
votes
Develop a distributed pointer in C++
The answer is yes and no:
Yes, you can create a distributed pointer that could be based on an IP address and a port. It would be implemented using the remote proxy design pattern but with a pointer ...
2
votes
Develop a distributed pointer in C++
Yes, C++ provides you with awesome powers, and itty bitty living spaces.
Break your problem down first.
You need a Connection object responsible for handling the nitty gritty of communication. It ...
2
votes
Accepted
Consistency and Availability in distributed hashing Key value store
Let me rephrase your question: how is the CAP theorem applicable to a Raft-based system?
For the context: CAP says that in case partitioning is happening, then you have to pick either consistency or ...
2
votes
Architecting a distributed file processing system with leadership election
Well, the "leader election" problem is quite well known and the most commonly used app for solving it is probably Apache Zookeeper.
Google it and you'll find plenty of documents about that.
If you ...
2
votes
Do persistent/transient communication and temporal decoupling/coupling mean the same?
Do persistent communication and temporal decoupling mean the same?
No, but those concepts are related: temporal decoupling requires the messages/exchanged data between processes to be kept (=persisted)...
2
votes
What is the difference between masking and tolerating failures?
From what I understand, the two differ in the level of abstraction involved:
"Masked" means here: Lower levels "mask" failure transparently for higher levels of the system. Failure on a ...
2
votes
How do atomic updates work at scale?
How does it work?
Database atomicity (aka the A of ACID) is an appearance of atomicity. The general idea for an update is:
keep the old value where it is unchanged
as long as the transaction is not ...
2
votes
Implementing the microservice pattern
You seem to be conflating a number of somewhat unrelated concepts:
Automated deployment (different from continuous deployment)
Containerization
Microservices
Of those, it sounds like what you really ...
2
votes
Ordering of analytical events
Assuming your front end is not malicious and not purposefully manipulating the time, you can follow this procedure:
Modify the message so that it contains two timestamps: (A) The time the event ...
2
votes
How can adding redundancy adversely affect performance
First, as others have pointed out, in general doing more, costs more, so in most cases adding more work results in an increased cost (aka performance reduction).
Secondly, you are missing the most ...
2
votes
Accepted
What are the approaches for joining data in distributed processing
About the name of such a system I can't say anything - I'd guess that it's not a specific name but a function of the system, which I would informally call event consolidation.
Regarding the behavior ...
Related Tags
distributed-computing × 161
distributed-system × 32
architecture × 21
design × 14
java × 12
microservices × 12
message-queue × 8
algorithms × 7
database × 7
design-patterns × 6
enterprise-architecture × 6
c# × 5
domain-driven-design × 5
networking × 5
scalability × 5
redis × 5
eventual-consistency × 5
database-design × 4
event-sourcing × 4
parallelism × 4
distributed-development × 4
web-services × 3
cqrs × 3
caching × 3
aws × 3