I have made my own compression tool for a stack-based language that uses a dictionary compression method. The code for the decompression is the following:
    elif current_command == "\u201d":
        temp_string = ""
        temp_string_2 = ""
        temp_index = ""
        temp_position = pointer_position
        while temp_position < len(commands) - 1:
            temp_position += 1
            try:
                current_command = commands[temp_position]
                if dictionary.unicode_index.__contains__(current_command):
                    temp_index += str(dictionary.unicode_index.index(current_command))
                    temp_position += 1
                    pointer_position += 2
                    current_command = commands[temp_position]
                    temp_index += str(dictionary.unicode_index.index(current_command))
                    if temp_string == "":
                        temp_string += dictionary.dictionary[int(temp_index)].title()
                    else:
                        temp_string += " " + dictionary.dictionary[int(temp_index)].title()
                    temp_index = ""
                elif current_command == "\u201d":
                    pointer_position += 1
                    break
                elif current_command == "\u00ff":
                    temp_string += str(pop_stack(1))
                    pointer_position += 1
                else:
                    temp_string += current_command
                    pointer_position += 1
            except:
                pointer_position += 1
                break
            if debug:print(str(pointer_position) + " with " + str(hex(ord(current_command))))
        stack.append(temp_string)
The temporary variables are temp_string, temp_string_2, temp_index and temp_position. I know I should not be using names like these, but I didn't know what else to call them. The code doesn't feel clean to me, and it is not pleasant to look at. The code above can be found here. The dictionary.unicode_index is found here.
The best way to explain this is with an example:
Suppose we want to compress the sentence Hello, World!. We first look up the index of the word Hello, which is 2420. Since the language is 0-indexed, we need to subtract 1 from this, leaving 2419. We now slice this into two pieces, 24 and 19. Then we find the characters at those indices in the index table, found here.
24 gives Ÿ
19 gives ™
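The compression step described above could be sketched roughly as follows. This is only an illustration: UNICODE_INDEX is a hypothetical stand-in for dictionary.unicode_index (only the two entries quoted above are real, the rest are placeholders), compress_index is a name made up for this sketch, and using divmod by 100 is an assumption that matches "slicing into two pieces" for four-digit indices.

```python
# Hypothetical stand-in for dictionary.unicode_index: 100 placeholder
# characters, with only the two entries quoted above filled in.
UNICODE_INDEX = [chr(0x100 + i) for i in range(100)]
UNICODE_INDEX[24] = "Ÿ"
UNICODE_INDEX[19] = "™"


def compress_index(one_based_index):
    """Turn a 1-based word index into its two-character code."""
    zero_based = one_based_index - 1       # 2420 -> 2419
    high, low = divmod(zero_based, 100)    # 2419 -> (24, 19)
    return UNICODE_INDEX[high] + UNICODE_INDEX[low]


print(compress_index(2420))  # Ÿ™
```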
The concatenation of the indices is done at this part:
    temp_index += str(dictionary.unicode_index.index(current_command))
After the concatenation, we search up the index in the dictionary, which is done here:
    if temp_string == "":
        temp_string += dictionary.dictionary[int(temp_index)].title()
    else:
        temp_string += " " + dictionary.dictionary[int(temp_index)].title()
We append a comma, and we repeat the process for the next word, which is World. This has the index 119. Again, we subtract 1, which gives us the following compressed word: Œ‰. This gives us the following code for my language:
”Ÿ™,Œ‰!
And decompresses to Hello, World!, which can be verified here.
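To make the decompression walk-through concrete, here is a minimal standalone sketch of the same idea outside the interpreter. The tables are hypothetical stand-ins: UNICODE_INDEX only contains the two real entries for Ÿ and ™, and WORDS is a toy dictionary with a single entry. The function names decode_pair and decompress are made up for this sketch, and the stack interpolation (\u00ff) and closing-quote handling are left out.

```python
# Hypothetical stand-ins for dictionary.unicode_index and dictionary.dictionary.
UNICODE_INDEX = [chr(0x100 + i) for i in range(100)]
UNICODE_INDEX[24] = "Ÿ"
UNICODE_INDEX[19] = "™"
WORDS = {2419: "hello"}  # toy dictionary with a single entry


def decode_pair(first, second):
    """Concatenate the two character indices as strings, then look the word up."""
    digits = str(UNICODE_INDEX.index(first)) + str(UNICODE_INDEX.index(second))
    return WORDS[int(digits)].title()


def decompress(body):
    """Expand dictionary pairs in body; every other character is copied as-is."""
    result = ""
    position = 0
    while position < len(body):
        char = body[position]
        if char in UNICODE_INDEX:
            # A dictionary character always consumes the character after it too.
            word = decode_pair(char, body[position + 1])
            result += word if result == "" else " " + word
            position += 2
        else:
            result += char
            position += 1
    return result


print(decompress("Ÿ™!"))  # Hello!
```

Like the original, this sketch assumes that a dictionary character is always followed by a second one; a truncated input would raise an IndexError here.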
I was wondering how I can make the code cleaner, because right now it looks like a complete mess.