upd: I am very sorry about my mistake. The old version of data is download from luogu, but the website do NOT allow me to download the data of the worst case. In fact, when facing same amount of operations, the memory usage depends on the type of operations and their orders. I re-generate some data, and update the link and profiling logs.
I am new in Haskell, and I try to implement an AVL tree. To test my code, I submit it to an Online Judge, which has a problem of balanced-tree. However, I get a MLE (Memory Limit Exceeded).
In fact, when facing 10^5 times of operations, my code requires 58 MB memories, while native C++ code only need about 2 to 3 MB. The speed of C++ code is also 10x faster.
I wonder if haskell can deal with such large data structure while having performance close to native code. What should I do?
Here is my code:
type Info = (Int, Int)
data Tree a = Empty | Ord a => Node (Tree a) a Info (Tree a)
just :: Maybe a -> a
just (Just a) = a
height :: Tree a -> Int
height Empty = 0
height (Node _ _ (h,_) _) = h
size :: Tree a -> Int
size Empty = 0
size (Node _ _ (_,s) _) = s
maintainInfo :: Tree a -> Tree a
maintainInfo (Node ls v _ rs) = Node ls v (max (height ls) (height rs) + 1, size ls + size rs + 1) rs
rotateL :: Tree a -> Tree a
rotateL (Node (Node ll lv _ lr) v _ rs) =
maintainInfo (Node ll lv (0,0) (maintainInfo (Node lr v (0,0) rs)))
rotateR :: Tree a -> Tree a
rotateR (Node ls v _ (Node rl rv _ rr)) =
maintainInfo (Node (maintainInfo (Node ls v (0,0) rl)) rv (0,0) rr)
rotateL_ :: Tree a -> Tree a
rotateL_ (Node (Node ll lv _ lr) v _ rs)
| height ll >= height lr = rotateL (Node (Node ll lv (0,0) lr) v (0,0) rs)
| otherwise = rotateL (Node (rotateR (Node ll lv (0,0) lr)) v (0,0) rs)
rotateR_ :: Tree a -> Tree a
rotateR_ (Node ls v _ (Node rl rv _ rr))
| height rl > height rr = rotateR (Node ls v (0,0) (rotateL (Node rl rv (0,0) rr)))
| otherwise = rotateR (Node ls v (0,0) (Node rl rv (0,0) rr))
maintain :: Tree a -> Tree a
maintain (Node ls v info rs)
| height ls > height rs + 1 = rotateL_ (Node ls v info rs)
| height rs > height ls + 1 = rotateR_ (Node ls v info rs)
| otherwise = maintainInfo (Node ls v info rs)
insert :: Ord a => a -> Tree a -> Tree a
insert val Empty = Node Empty val (1,1) Empty
insert val (Node ls v _ rs)
| val < v = maintain (Node (insert val ls) v (0,0) rs)
| val >= v = maintain (Node ls v (0,0) (insert val rs))
iter :: Tree a -> [a]
iter Empty = []
iter (Node ls v _ rs) = iter ls ++ v : iter rs
maxi :: Tree a -> Maybe a
maxi Empty = Nothing
maxi (Node _ v _ Empty) = Just v
maxi (Node _ _ _ rs) = maxi rs
mini :: Tree a -> Maybe a
mini Empty = Nothing
mini (Node Empty v _ _) = Just v
mini (Node ls _ _ _) = mini ls
eraseSwap :: Tree a -> Tree a
eraseSwap Empty = Empty
eraseSwap (Node ls _ _ Empty) = ls
eraseSwap (Node ls _ _ rs) =
let next = just (mini rs) in
maintain (Node ls next (0,0) (erase next rs))
erase :: a -> Tree a -> Tree a
erase _ Empty = Empty
erase val (Node ls v _ rs)
| val < v = maintain (Node (erase val ls) v (0,0) rs)
| val > v = maintain (Node ls v (0,0) (erase val rs))
| val == v = eraseSwap (Node ls v (0,0) rs)
next :: a -> Tree a -> Maybe a
next _ Empty = Nothing
next val (Node ls v _ rs)
| val < v = Just (maybe v (min v) (next val ls))
| val >= v = next val rs
prev :: a -> Tree a -> Maybe a
prev _ Empty = Nothing
prev val (Node ls v _ rs)
| val <= v = prev val ls
| val > v = Just (maybe v (max v) (prev val rs))
rank :: a -> Tree a -> Int
rank _ Empty = 1
rank val (Node ls v _ rs)
| val <= v = rank val ls
| val > v = size ls + 1 + rank val rs
select :: Int -> Tree a -> Maybe a
select _ Empty = Nothing
select k (Node ls v _ rs)
| size ls >= k = select k ls
| size ls == k - 1 = Just v
| otherwise = select (k - size ls - 1) rs
update :: [Int] -> Tree Int -> Tree Int
update [1,x] = insert x
update [2,x] = erase x
ask :: [Int] -> Tree Int -> Int
ask [3,x] tree = rank x tree
ask [4,x] tree = just (select x tree)
ask [5,x] tree = just (prev x tree)
ask [6,x] tree = just (next x tree)
solve :: Int -> Tree Int -> IO ()
solve 0 _ = return ()
solve n tree = do
inputQuery <- getLine
let l = map read (words inputQuery) :: [Int]
if head l < 3 then do
solve (n-1) (update l tree)
else do
print (ask l tree)
solve (n-1) tree
return ()
main :: IO ()
main = do
inputN <- getLine
let n = read inputN :: Int
solve n Empty
return ()
I humbly accept all suggestions, including those that are not aimed at performance but at coding style.
To test my code, I submit it to Luogu P3369, which is in Chinese. The simple translation is as follow:
You need to dynamically maintain a re-set M and provide the following operations:
- Insert a number x into M.
- Delete a number x from M (if there are multiple identical numbers, only delete one).
- Query how many numbers in M are smaller than x, and add one to the answer.
- Query the number that ranks in the x-th position after M is sorted from small to large.
- Query the predecessor of x in M (the predecessor is defined as the largest number that is smaller than x).
- Query the successor of x in M (the successor is defined as the smallest number that is greater than x).
- For operations 3, 5, and 6, it is not guaranteed that the number x exists in the current re-set.
- The first line is n, which indicates the number of operations. Each of the following n lines has two numbers opt and x. opt indicates the sequence number of the operation (1 <= opt <= 6)
- For operations 3,4,5,6, each line outputs a number, indicating the corresponding answer.
If anyone has difficulty to login luogu, here is also some data files to download:
Here is the output of ghc -O2 -rtsopts main.hs && ./main < avl.in > tmp.out +RTS -s
:
1,579,041,840 bytes allocated in the heap
309,624,880 bytes copied during GC
19,305,064 bytes maximum residency (21 sample(s))
186,640 bytes maximum slop
58 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 360 colls, 0 par 0.324s 0.329s 0.0009s 0.0055s
Gen 1 21 colls, 0 par 0.419s 0.425s 0.0202s 0.0500s
INIT time 0.001s ( 0.001s elapsed)
MUT time 0.897s ( 0.907s elapsed)
GC time 0.743s ( 0.754s elapsed)
EXIT time 0.000s ( 0.009s elapsed)
Total time 1.641s ( 1.671s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 1,759,782,364 bytes per MUT second
Productivity 54.7% of total user, 54.3% of total elapsed