A discussion of many aspects of parallelism and concurrency in programming, and the pros and cons of different programming methodologies.
This is a fishbowl: a panel conversation held on the Handmade Network Discord where a select few participants discuss a topic with depth and nuance. We host them every two months, so if you want to catch the next one, join the Discord!
I'm having trouble understanding that. The way I remember it is that one is about authoring code as independent "trains of thought", the other is actually running code on multiple physical cores. But not quite sure about that
WaywardNov 19, 2020 04:24 AM
responsiveness versus throughput? @jfs
do we have anyone in the fishbowl who's been playing around with parallel programming at all, or are we just a big group of beginners?
bumbreadNov 19, 2020 04:26 AM
concurrency == breaking the program into tasks that are independent from each other (threads, basically)
parallelism == physically running a few tasks on different CPU cores
If you look at "the three columns of concurrency" that I linked above, I'm mostly interested in the responsiveness pillar right now. But to make that not a pain to develop (e.g. by creating a separate OS thread for each task), shared-data locks (mutexes) are definitely needed
So, assuming that we're talking about both concurrency and parallelism here. Because one isn't very meaningful without the other, right?
WaywardNov 19, 2020 04:32 AM
i thought parallelism was more of the umbrella term, given how computer hardware has developed during the last 25 years
A realization that led me into being interested in this topic recently is that even a single machine is a concurrent and multithreaded thing. Not only do machines have multiple CPU cores, they also have a separate graphics processing unit, and all sorts of other things that (for example) communicate through interrupts. It gets more pronounced of course when adding networking...
I can't really do a plain fgets() or fread() anymore, it seems so wrong to me when I'm doing e.g. GUI on the same thread(edited)
The human interacting with the GUI is like a completely separate thread as well!
WaywardNov 19, 2020 04:37 AM
i'm more curious about the massively parallel part of the discussion, maximizing throughput and thereby optimizing for speed (running a single big task on multiple cpu threads to optimize for time)
so @bumbread is right; concurrency is basically the go/node idea where you can have lots of "things" (threads, green threads, coroutines, goroutines, ...) in-flight at the same time, but you can still do it all on a single core if that's all you have. and parallelism is more when things run truly in parallel (on different cores).
bumbreadNov 19, 2020 04:53 AM
well maybe let's not worry about definitions for now
@jfs I think you meant @NWDD but I can answer too. First of all, I think thread safety can be many different things. You mentioned preemption, that's one part of it -- if you only have 1 CPU and you fully control when the next thread executes, then you can be a lot more relaxed about a lot of things, since you know no other CPU will be able to access your variables concurrently. in other words, you don't need locking at all.
Now that the definitions are done and over with, let's get down to the dirty stuff
NWDDNov 19, 2020 05:01 AM
so for example: imagine you're developing a network layer which is something that is highly asynchronous by definition.
there are a lot of little details you have to handle: has the connection timed out? what data have you received? was there a network error? did the user put a nintendo switch in the microwave so the network connection broke?
with concurrency you can actually ditch all means of synchronization and still have everything appear, from a perceived point of view, to happen "in parallel"(edited)
Unless I'm misunderstanding what you mean, I disagree. You often need synchronization with concurrency. Many webapps have synchronization bugs even though they are entirely single-threaded.
NWDDNov 19, 2020 05:07 AM
ah, yeah, i meant specifically cooperative concurrency (for example fibers or lua coroutines running in a single-thread or cooperative threads pinned to one core)
WaywardNov 19, 2020 05:10 AM
@AsafGartner that sounds like a problem with the single-threaded event-driven process, rather than with a simple form of Monte Carlo where you have one thread that creates jobs and takes back the results afterwards, with no interaction in between each job thread
Synchronization is for shared state. If you don't have shared state then it's no problem, but if you do you still need to enforce synchronization even in non-parallel systems.
VegardNov 19, 2020 05:11 AM
@AsafGartner web apps are sometimes spawned in parallel by the web server, so even though the program itself is single-threaded, there can still be multiple instances of it running in parallel. not sure if this is what you're talking about
Yes. I didn't really understand what you meant @NWDD . And what is it anyway, that is "acting like a separate process" (like @gingerBill says)?
In my mind, coroutines are something that the OS is unaware of
basically I'm thinking of fibers
Fibers btw. are seen as a failed experiment(edited)
POSIX removed their implementation, and Microsoft deprecated them long ago
because with them, we not only have process-global and thread-global storage, but also fiber-global storage(edited)
and all the related confusion that comes with it
also when scheduling a single fiber on multiple OS threads
and so on
VegardNov 19, 2020 05:58 AM
glibc still has setcontext/swapcontext/etc. I think the reason it was deprecated in POSIX was because it had a bunch of requirements WRT signals, which meant that you need 2+ syscalls per fiber switch, which is slow.
so it's still fully possible to define your own semantics and get good performance out of them. just not with an implementation following the POSIX specification
@NWDD People rarely know what they want before trying stuff out.
VegardNov 19, 2020 06:10 AM
aren't fibers and green threads synonyms?
NWDDNov 19, 2020 06:11 AM
fibers are always cooperative and something that is controlled by the programmer (so the programmer has to decide when the fiber starts, where it executes, at which point it is stopped, if/how it is scheduled...)(edited)
VegardNov 19, 2020 06:14 AM
well, okay. so you can use fibers to implement green threads
that only falls flat in tight loops where all the allocation has been optimized out
I don't think there is another way of doing scheduling in userland, or is there? How would that UMS work, anyway?
VegardNov 19, 2020 06:20 AM
on Linux at least you can register a timer that will send a signal, then you can context switch from the signal handler. but this does go through the kernel as well, so it's not that great for performance
NWDDNov 19, 2020 06:20 AM
well, in the end it's all semantics; it depends on what you consider "doing scheduling"
I have noticed that a lot of beginners parallelize things (and I am by no means an expert either) because they think it will make things faster, when in reality their problem could be made extremely fast on a single thread, and multiple threads would slow things down.
WaywardNov 19, 2020 07:10 AM
your point being that multithreading primarily has benefits assuming the code is already optimized?
@gingerBill but even if you have efficient scheduling you still need to lay things out in a way that doesn't thrash, then use algorithms that are thread-safe and slower, or pay the extra cost of locking (and either annoy the scheduler and get kicked earlier, or pay for the context switch)... And in the end you end up with something that makes worse use of the system if that system is shared with other programs
yes, I mean theoretically you could have a game fit in a single cache line(edited)
but not be parallel-friendly
VegardNov 19, 2020 07:25 AM
a 64-byte game?
NWDDNov 19, 2020 07:25 AM
it's a theoretical exercise xD
not daring anyone to do so
VegardNov 19, 2020 07:26 AM
it's not a bad dare.
NWDDNov 19, 2020 07:26 AM
just saying that not being cache friendly (when doing multithreaded stuff) isn't related to using a huge amount of memory or optimizing(edited)
WaywardNov 19, 2020 07:27 AM
@NWDD can we just jump straight to the long version?
NWDDNov 19, 2020 07:27 AM
what long version?
oh that's right, the game
regarding that game I was talking about,
we're now changing it so that instead of running separate systems on separate threads
everything will end up running in the same thread in order
this allows us to free cores so that we can actually run a separate game simulation on each core
which solves a harder problem regarding online multiplayer
point being: using multithreading for things that do not require it might actually prevent the program from reaching its full potential
WaywardNov 19, 2020 07:33 AM
single threaded applications are recommended until no longer viable, what about the latter case?
NWDDNov 19, 2020 07:34 AM
I can imagine a lot of cases where you actually want multi-threading eh? I mean for example: a compiler, a baking process, a simulation that can go faster than realtime...
It just feels a bit sad to have a thread switched out because someone decided that browsers need to have an insane level of parallelism and the gpu driver decided to spawn 30 threads in your game.(edited)
In a way, these two families can be seen as duals of one another
The actor model works very well in object-oriented languages because of the "object" nature of the "actors". Each actor is opaque to every other actor.
When these "actors" send messages, it is completely asynchronous, which means there is no blocking on the receiver's end.
So for the "actors", the "mailbox" (the thing holding the "messages", i.e. a queue) might need to hold potentially loads of messages, so it's a lot harder to reason about.
CSP (a form of Process Calculus, a slight variant of which Go uses) is fully synchronous in nature.
The message is flipped from being the unit of communication to the channel.
A channel is effectively a message queue with a monitor applied to it.
The channel is the thing that is referenced; messages can be sent to it or received from it.
Most channels will block, and are thus synchronous, until a receiver reads.
This has the advantage that, with the blocking mechanism, the channel only needs to "hold" one message.
Go extends this to allow for buffered channels (of fixed capacity), so that you can have stuff similar to the Actor model, but the channel is the focus, rather than the sender/receiver (i.e. the actor).
So in the Process Calculus family of models, the processes communicate (and synchronize) through channels, whilst the Actors are themselves the processes.
One interesting thing is that the actor model does have the advantage of being "modelable" across multiple machines, e.g. each machine is an actor which then communicates over a network.
So because the models are effectively duals of each other, they can be applied at different levels for different problems.
But because these are generalized models, they will not be good for all problems (obviously).
But one thing which seems to be a given across all forms of concurrency is the concept of a queue.
NWDDNov 19, 2020 08:07 AM
well, concurrency through shared memory may disagree
My point above was probably that there is a tradeoff - we can replace explicit synchronization on shared memory by mediating accesses through a data-structure-managing actor (process)
at the cost of more strict serialization
NWDDNov 19, 2020 08:35 AM
this is what I understood:
1 requester to 1 source == trivial -> (straight-forward single-thread code)
M requesters to 1 source == shared memory -> (for example: multiple threads accessing a hash-table)
1 requester to N sources == message -> (for example: a thread that receives network packets from multiple sockets and puts information in a hashtable)
M requesters to N sources == hybrid -> (for example: multiple threads that receive network packets from multiple sockets and put information in a hashtable)
For the audience (me), could you define the difference?
NWDDNov 19, 2020 10:01 AM
What's the worst concurrency bug you've all encountered?
A race condition that happened in a non-parallel situation, in a so-called "safe" programming language (C#), in a game that was more or less ECS-based, where a system removed the "rigidbody" component in a very specific (user-triggered) situation.
There was a separate system which worked like a finite-state-machine that was being updated every few frames.
If during a transition between two states the rigidbody got removed, the thread that was running the logic crashed, and the C# runtime for the target platform would silently swallow the exception, freezing the game (without providing any crash dump or useful information)(edited)
NWDDNov 19, 2020 10:12 AM
Here's a question, is there actually ever a need for a semaphore instead of a condition variable?
Have been thinking for a while, but I don't think I've ever felt the need or seen an advantage to using a semaphore
I do recall at some point using them because it was provided in a library example?
because the most recommended scenario for a semaphore is a pool of resources
however when you have a pool of resources, you then also want a handle to the specific resource you got
which needs an additional thread-safe way to handle it
and the thing that doles out the resource handles can handle blocking the thread just as well as the semaphore
NWDDNov 19, 2020 10:29 AM
I can see them being useful for example to request a system to run once (for example run a physics update)?
But because more often than not you need to know when it has finished and wait for it, there are other options?
Plus it feels like semaphores aren't really well known, which means they may confuse other people working on the project
In the end I feel that, because we're used to doing so, it's easier to reason about critical regions of code that should not execute at the same time, and other options are just not as popular?(edited)
they are still taught as a synchronization primitive though
and for the record:
a semaphore is a counter where, if you try to decrement below 0, it blocks until another thread increments it above 0
a condition variable is an object that lets you wait on a signal from another thread while temporarily releasing a mutex
NWDDNov 19, 2020 10:32 AM
yes, people are taught about synchronization in general but I don't think that knowledge persists after the tests
For those who have worked with parallel programming a fair amount, doing parallel programming properly such that close-to-optimal code is produced given any parallel problem seems extremely difficult to master, and extremely difficult to do in general. How much of this do you attribute to tools? Are there tools you wish you had when learning---or doing---this kind of programming?
What's the worst concurrency bug you've all encountered?
@bvisness I've run into a weird situation that involved some sort of false sharing. I had a writer queue that would normally transfer around 20MB / sec, but in that situation the throughput dropped to about 400KB / sec
forgot to mention explicitly that the bottleneck (400KB/s) here was a memcpy() ;-), whereas the original 20MB/s was simply the hardware source
@ryanfleury I'm not sure I'd attribute anything to tools. What we've missed out on in the discussion so far is focusing more on what actually happens in the hardware, HM style.(edited)
It is for sure important to understand the basics, like caches, MESI, and synchronization primitives
the Linux kernel goes quite fancy with synchronization on shared data structures, so maybe that is a recommendation. Though on the other hand, this approach is error-prone and not beautiful from an architectural perspective.
I'm sure that static analyzers, fuzzers, and similar tools can help a lot with debugging shared data structures. So that would be an answer to Ryan's question. I can't say anything about the quality of the existing tools, not really having used any
VegardNov 19, 2020 11:17 AM
I feel like I ought to have some tools to recommend, but the truth is that I don't really. usually the challenge when doing something fancy (i.e. more than the straightforward spinlock/mutex) is getting things to work correctly at all, and then the best thing you can do is to stress-test on as many CPUs as you can. so in a way maybe a machine with lots of CPUs is the best way to find bugs..?
I'm not really getting the distinction either. Also not all of CSP is channel-focused or communication-focused. If I recall, the first half of the book doesn't even mention channels. I recall CSP more as an algebra over sequences of events
but as for the channel part, I'm not sure why that is relevant or insightful. Would very much like to know
I gather that CSP channels are understood as sending messages synchronously (i.e. the sender waits until recipient "dequeues" it), as opposed to asynchronous sending with Actor model, but I'm also not sure that this is super relevant, nor would I actually want to waste the time waiting
There are in any case, different variants of all the theories - for example I believe CSP started out as a static graph of processes connected by channels, but it might be that this constraint was lifted in later publications (don't quote me, I haven't really checked back)
I know that the actor model is definitely always understood as sending completely asynchronously @bvisness . At least these days
but if I'm thinking of modelling my program after the actor model, I'll probably consider allocating messages statically in the sender, so allocation isn't a performance bottleneck, and the number of existing messages is naturally bounded
that would also mean each sender has to wait until it gets some of its messages returned back before it can send more, which would again lead to some form of "blocking". Not really while sending but still
that makes sense, and would lead to a model like I'm used to from Go
(what you describe is basically the behavior of a buffered channel)
I actually very rarely see any of Go's concurrency features used heavily outside the standard library, and I'm not sure why. They certainly feel well-integrated into the language, but nobody seems to use them.
Maybe it's just laziness or people not knowing how to use them.
I certainly do agree that the actor model / CSP is a good conceptual fit for an object-oriented language (I continue to think they are essentially the same)
I guess I'm curious, though, if it's an essentially object-oriented concept. Concurrent execution seems to me to imply that you have concurrent "processes", which will certainly have their own state. Whether that's an "object" vs. a coroutine, goroutine, thread, etc. seems unrelated.
After all, you can apply actor model or CSP designs inside a program, or across network boundaries, and the concepts feel the same to me
Pasting some messages from #fishbowl-audience about the differences between CSP and the actor model...
Some definitions from the internet:
An actor is a computational entity that, in response to a message it receives, can concurrently:
send a finite number of messages to other actors;
create a finite number of new actors;
designate the behavior to be used for the next message it receives.
Recipients of messages are identified by address, sometimes called "mailing address". Thus an actor can only communicate with actors whose addresses it has. It can obtain those from a message it receives, or if the address is for an actor it has itself created.
The actor model is characterized by inherent concurrency of computation within and among actors, dynamic creation of actors, inclusion of actor addresses in messages, and interaction only through direct asynchronous message passing with no restriction on message arrival order.
But what is the essential difference between sending a message "via a channel" or "to an address"?
My overall take
I guess it's true that in a CSP system you don't directly reference the other actors. But I'm still not convinced that meaningfully changes the model
you could conceive of channels as "actors" that hand messages back and forth, and now you've achieved basically the same thing?
it doesn't feel to me like a fundamental difference
and channels really are a certain standard implementation with a fixed comm interface(edited)
I also happen to think that it is extremely important to use standard interfaces for concurrently modelled programs
from my experience setting up all these graphs and actors and channels and what not, and then using the comm ifaces right, is extremely painful, so one wants to maintain and use as few primitives as possible(edited)