It's very simple.
Wherever you would recurse down the left child and then the right child, instead spawn a task that recurses down the left, and continue down the right yourself. When you are done with the right, you may have to wait on the left, but in a balanced tree that wait won't be long.
When you are at depth N in your tree, you have the potential to keep 2^N cores busy.
That specific diagram only spawns tasks at certain depths, probably because it isn't queueing tasks; it is using knowledge of the structure of the tree to avoid overloading the scheduler (there is a sketch of that cutoff idea at the end of this answer).
If you have a task queue and a thread pool, the implementation can lean on those, and you don't have to worry about how many tasks get spawned.
Say you have
class Node
{
public:
    int data; // or whatever payload the node carries
    Node * left;
    Node * right;

    template <typename Action>
    void SerialTraverse(Action action)
    {
        action(data);                             // Pre-order traversal
        if (left)  left->SerialTraverse(action);  // Traverse the left
        if (right) right->SerialTraverse(action); // Traverse the right
    }
};
#include <future>

namespace library
{
    template<typename Task, typename Class, typename Arg>
    std::future<void> enqueue_task(Task task, Class * obj, Arg arg);
    // On some other thread, call "obj->task(arg);".
    // Returns a handle upon which we can wait.
}
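(That library is hypothetical. If you just want to try the idea, a minimal stand-in built on std::async could look something like the sketch below; a real thread pool would reuse a fixed set of worker threads instead of potentially launching a new one per call.)

#include <future>

namespace library
{
    // Minimal stand-in: run "obj->task(arg)" asynchronously and hand back
    // the future. This only sketches the shape of the API above; it is not
    // a real task queue or thread pool.
    template<typename Task, typename Class, typename Arg>
    std::future<void> enqueue_task(Task task, Class * obj, Arg arg)
    {
        return std::async(std::launch::async,
                          [task, obj, arg] { (obj->*task)(arg); });
    }
}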
Then you can change SerialTraverse to
template <typename Action>
void ParallelTraverse(Action action)
{
    action(data); // Pre-order traversal, same as before

    std::future<void> fut;
    if (left)  // start traversing the left subtree on another thread
        fut = library::enqueue_task(&Node::ParallelTraverse<Action>, left, action);

    if (right) // traverse the right subtree on this thread, in parallel with the left
        right->ParallelTraverse(action);

    if (fut.valid()) // wait for the left traversal to finish
        fut.wait();
}