My question is: where in this design is scalability defined? Or rather, if we scale up the volume of data that goes in and out, which part of this design would start giving issues?
Scalability approaches are usually separated into horizontal and vertical scaling.
"Horizontal scaling" for a data pipeline usually refers to the ability of partitioning the input data into chunks which can be processed in independently independently from each other in parallel. If that's possible in your case depends heavily on the nature of the data and the specific transformation steps. One also has to evaluate whether the input and output stages of the pipeline may become a bottleneck. As mentioned here, several NoSQL databases support horizontal scaling through sharding.
"Vertical scaling" may either refer to introducing more intermediate processing steps into the pipeline, which only brings an improvement when those steps can run interleaved (but often for the price of a higher latency). Or it can refer to increasing the computing power of the hardware (or software) for the individual processing steps (for example, by using CPUs with higher clock speeds or, systems with higher memory or network speed, or an improved compiler/interpreter). However, since the maximum affordable CPU clock speed hasn't really increased any more over the last decade for the major CPU types, this option has lost itsit's practical importancerelevance to some degree. Today, improvements in hardware come mostly through parallelization.
In essence, vertical scaling is a restricted approach - it offers you a finite number of options, and once you have taken them, you may quickly reach their limits. Horizontal scaling, however, allows you to scale up and down far more dynamically (under the precondition that you are able to split the data into independently processable pieces).