Here is my full code. It works fine with ASCII, but as soon as "Unicode" characters come into the picture... I hate my life...
I know the names are not in English, so let me explain:
I have 2 input files (realmek = realms, nevek = names) and 1 result file (osszes = all).
I also have a working page (in HTML).
- As I said, with ANSI characters this works.
BUT when I try to use unusual characters such as "űáéđĐ", I need to save the 2 input files and the 1 output file as UNICODE. Then my program throws an encoding/decoding error, which I know is expected.
So my question is: how can I solve this? Where do I need to handle the decoding and encoding?
I have been thinking about this for 3 days. I tried many things, like "u = unicode( s, "utf-8" )"; $ export LANG=en_US.UTF-8; etc., but none of them worked.
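To make the "where" part of the question concrete, here is a minimal sketch of the usual pattern: decode once when reading, work with Unicode text in between, and encode once when writing. It assumes the files are saved as UTF-8 (Notepad's "Unicode" option is actually UTF-16, in which case the encoding argument would be "utf-16" instead). The helper names read_names and append_line and the demo file names are hypothetical, not part of the script.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_names(path, encoding="utf-8-sig"):
    """Return the space-separated fields of the first line as Unicode text."""
    # io.open decodes while reading; "utf-8-sig" also drops a leading BOM.
    with io.open(path, "r", encoding=encoding) as handle:
        # split() without arguments also strips the trailing newline.
        return handle.readline().split()


def append_line(path, fields, encoding="utf-8"):
    """Append one space-separated line; encoding happens only here."""
    with io.open(path, "a", encoding=encoding) as handle:
        handle.write(u" ".join(fields) + u"\n")


# Demo with throwaway files (assumed names, not the real Nevek.txt / Osszes.txt).
workdir = tempfile.mkdtemp()
names_path = os.path.join(workdir, "names_demo.txt")
out_path = os.path.join(workdir, "out_demo.txt")

with io.open(names_path, "w", encoding="utf-8") as handle:
    handle.write(u"Űáé đĐ\n")

nevek = read_names(names_path)
for nev in nevek:
    append_line(out_path, [nev, u"realm"])
```

With this pattern there is no str(...) call on the names at all, which is the call that fails on non-ASCII Unicode text in Python 2.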
from urllib import urlopen
import re
faj = "hiba"
cast = "hiba"
pont = 0
szint = 0
# Raw strings keep the backslashes in the Windows paths literal.
fj = open(r"C:\Users\Rendszergazda\Desktop\Achievements\Realmek.txt", "r")
tombr = fj.readline()
realmek = tombr.split(" ")
fj.close()
fh = open(r"C:\Users\Rendszergazda\Desktop\Achievements\Nevek.txt", "r")
tomb = fh.readline()
nevek = tomb.split(" ")
fh.close()
osszes = open(r"C:\Users\Rendszergazda\Desktop\Achievements\Osszes.txt", "a")
for x in realmek:
    realm = x
    for y in nevek:
        nev = y
        lap = urlopen("http://eu.battle.net/wow/en/character/"+str(realm)+"/"+str(nev)+"/achievement").read()
        # Pages that contain the server-error div are skipped.
        letezik = re.compile('<div id="server-erro(.*)">')
        letez = re.findall(letezik, lap)
        if (letez != []):
            a = 0
        else:
            # The values below are read from fixed line numbers of the page.
            lapn = lap.split("\n")
            mapo = lapn[1087]
            pontos = re.compile('\t\t\t\t\t(.*)\r')
            pont = re.findall(pontos, mapo)
            mapom = lapn[1322]
            feastn = re.compile('<div class="bar-contents">\t\t\t\t\t\t\t\t\t\t\t\t(.*)\r')
            feast = re.findall(feastn, mapom)
            # Race (faj), class (cast) and level (szint) are searched in the whole page.
            fajkeres = re.compile('</strong></span> <a href="/wow/en/game/race/(.*)" class="race">')
            castkeres = re.compile('</a> <a href="/wow/en/game/class/(.*)" class="class">')
            szintkeres = re.compile('<span class="level"><strong>(.*)</strong></span> <a href="/wow/en/game/')
            faj = re.findall(fajkeres, lap)
            cast = re.findall(castkeres, lap)
            szint = re.findall(szintkeres, lap)
            link = "http://eu.battle.net/wow/en/character/"+str(realm)+"/"+str(nev)+"/advanced"
            ccast = cast[0]
            ffaj = faj[0]
            sszint = szint[0]
            ppont = pont[0]
            ffeast = feast[0]
            osszes.write(str(nev)+" "+str(realm)+" "+str(ppont)+" "+str(ffeast)+" "+str(ffaj)+" "+str(ccast)+" "+str(sszint)+" "+str(link)+"\n")
osszes.close()
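The other place where non-ASCII names matter is the URL passed to urlopen. Below is a minimal sketch of building that URL from Unicode names by encoding them to UTF-8 and percent-encoding the result. It assumes the site expects UTF-8 percent-encoded path segments, which I have not verified. The helper name character_url and the example names are hypothetical.

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2, as in the script above

BASE = u"http://eu.battle.net/wow/en/character/"


def character_url(realm, nev, page=u"achievement"):
    """Build the character URL from Unicode realm and character names."""
    # Encode to UTF-8 bytes first, then percent-encode; safe="" also escapes "/".
    parts = [quote(part.encode("utf-8"), safe="") for part in (realm, nev)]
    return BASE + u"/".join(parts) + u"/" + page


# Example with made-up names (assumptions, not taken from the input files).
url = character_url(u"Arathor", u"Űáé")
# url == u"http://eu.battle.net/wow/en/character/Arathor/%C5%B0%C3%A1%C3%A9/achievement"
```

The bytes returned by urlopen(...).read() would then presumably need a matching .decode("utf-8") before the regular expressions run, if the page is in fact served as UTF-8.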