7
\$\begingroup\$

I have been studying intermediate or more advanced data structures in Python, one of which I've decided to try out is the Tree Data Structure. I've been studying the theories as well as the implementations from Youtube courses and I might say that I've grasped quite a bit of it.

I decided to make my own original algorithm to append and access a data in a Tree without copying from the videos:

class TreeNode:
    def __init__(self,data = None):
        self.data = data
        self.left = None
        self.right = None

class Tree:
    def __init__(self):
        self.head = TreeNode()
    
    def append(self, data):
        cur = self.head
        new_node = TreeNode(data)
        while True:

            if cur.data == None:
                cur.data = data
                return
            
            if data < cur.data:
                if cur.left == None:
                    cur.left = new_node
                    return
                if cur.left != None:
                    cur = cur.right

            if data > cur.data:
                if cur.right == None:
                    cur.right = new_node
                    return
                if cur.right != None:
                    cur = cur.right
    
    def find_data(self, data):
        cur = self.head
        while cur.right != None and cur.left != None:

            if data > cur.data:
                cur = cur.right

            if data < cur.data:
                cur = cur.left
            
            if data == cur.data:
                return print(f'Data {data} has been found.')

        return print("Im sorry, your data is not found!")

my_tree = Tree()
my_tree.append(50)
my_tree.append(100)
my_tree.append(30)
my_tree.find_data(30)
my_tree.find_data(20)

I would like to know if this is considered an optimal and efficient algorithm since I've been writing it only using the theories and logics from the videos. I would like to know if my understanding's clear enough.

If there are any, in what ways could the algorithms be improved?

\$\endgroup\$

3 Answers 3

6
\$\begingroup\$

A few suggestions:

  • There is a typo in Tree.append where we move to cur.right when cur.left is not None. This leads to incorrect organization of the data.
  • It is rather hard to visualize the tree. You can add a __str__ method to TreeNode.
  • Testing more complex cases is usually necessary to establish correctness.
  • The append method gets stuck in the while loop when duplicate data is inserted.
  • The find_data method is not quite correct; it's perfectly fine to have the children be None. The concern is moving to a child that is None or when the current data is None. Notably, the current code will not work when we try to use find_data on a Tree with 1 insertion.
  • new_node is declared earlier than it needs to be.
  • Using if, elif, else can improve simplicity.
  • Try to use is instead of == when comparing to None.
  • There is an inconsistency in style between append which checks the left branch first and find_data which checks the right branch first.

Here is the code with the suggestions implemented:

class TreeNode:
    def __init__(self, data=None):
        self.data = data
        self.left = None
        self.right = None

    def __str__(self):
        return "({}) <- {} -> ({})".format(
            self.left, self.data, self.right)


class Tree:
    def __init__(self):
        self.head = TreeNode()

    def append(self, data):
        cur = self.head
        while True:
            if cur.data is None:
                cur.data = data
                return

            if data < cur.data:
                if cur.left is None:
                    cur.left = TreeNode(data)
                    return
                cur = cur.left
            elif data > cur.data:
                if cur.right is None:
                    cur.right = TreeNode(data)
                    return
                cur = cur.right
            else:
                return

    def find_data(self, data):
        cur = self.head
        while True:
            if cur.data is None:
                return print("Im sorry, your data is not found!")

            if data < cur.data:
                if cur.left is None:
                    return print("Im sorry, your data is not found!")
                cur = cur.left
            elif data > cur.data:
                if cur.right is None:
                    return print("Im sorry, your data is not found!")
                cur = cur.right
            else:
                return print(f'Data {data} has been found.')


my_tree = Tree()
my_tree.find_data(50)
my_tree.append(50)
my_tree.find_data(50)
my_tree.append(50)
my_tree.append(100)
my_tree.append(30)
my_tree.append(20)
my_tree.append(10)
print(my_tree.head)
my_tree.find_data(30)
my_tree.find_data(10)
my_tree.find_data(5)

Hope this helps! Good work on the MVP.

\$\endgroup\$
4
\$\begingroup\$

3 quick minor suggestions:

  • consider upgrading your class to a dataclass, then you can ditch __init__ and possibly __repr__ as well
  • the match statement can be used to replace if statements
  • In terms of behavior, seems to me a function like find_data should simply return whatever value was found, or None by convention. Let the calling function handle the result accordingly, rather than embed hardcoded prints which are less flexible

Also, I had a quick look at similar code on the Internet. Naming conventions vary from one author to another, thus there is one implementation that has an add_child method to append "data", and there is a corresponding remove_child method as well, which you haven't implemented but could be useful at some point.

\$\endgroup\$
3
\$\begingroup\$

address vs contents

            if cur.data == None:

Prefer ... is None:

It's kind of a weird python culture thing, related to None being a singleton. Sorry. Your linter probably told you about it but then you ignored it. (E711)

Also, it's a bit odd that you accept arbitrary data yet you prohibit a None value. Minimally, the append() """docstring""" should spell out that restriction.

else

                if cur.left == None:
                    ...
                if cur.left != None:

This is just annoying. It leaves the reader wondering how many cases there could possibly be. Use an else clause, please.

            if data < cur.data:
                ...
            if data > cur.data:

In the case of data == cur.data we implicitly leave it unchanged, which is maybe just fine. I note in passing that it is possible for the == operator between objects to return True, even though an object is noticeably different, e.g. an audit trail .modified attribute has a more recent timestamp.

I would have been happier with a simple else clause here.

\$\endgroup\$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.