Problem with à in HW4_rec

asked Dec 23, 2021 in HW4 by zaur (740 points)
edited Dec 23, 2021 by zaur
In one of the tests of the HW4 recovery, for the print_recorded_exams function, the Italian letter à breaks my code. Apparently the len function counts this letter as 2 characters, which causes issues while printing. By using the utf-8 encoding I could fix its appearance in the file. However, when running, I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 1131: invalid continuation byte.

The problem occurs while reading from the file: the string "Attività Progettuale Di Ottimizzazione Su Reti M" is read as "Attività  Progettuale Di Ottimizzazione Su Reti M", with an extra space after the à. How can I fix this problem?
124 views

3 Answers

Best answer
answered Dec 23, 2021 by Ganni02 (4,920 points)
selected Dec 23, 2021 by zaur
I did not encounter this issue after fixing the encoding when opening the files, specifically by opening them with encoding='UTF-8'.
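A minimal sketch of what this looks like in practice, assuming the homework reads plain text files with open(). The file name exams_demo.txt is hypothetical and the file is written to a temporary directory only so that the example is self-contained; the homework's real file names are not shown in this thread.

```python
import os
import tempfile

# The test string from the question; "\u00e0" is the letter à.
title = "Attivit\u00e0 Progettuale Di Ottimizzazione Su Reti M"

# Hypothetical file, created here only to make the example runnable.
path = os.path.join(tempfile.mkdtemp(), "exams_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(title + "\n")

# Pass the encoding explicitly instead of relying on the platform default.
with open(path, encoding="utf-8") as f:
    line = f.readline().rstrip("\n")

print(len(line))  # the à counts as one character
```

Without the encoding argument, open() falls back to a platform-dependent default, which is why the same code can behave differently on different machines.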
commented Dec 23, 2021 by andrea.sterbini (172,340 points)
Always use the utf8 encoding when opening text files!
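As a side note on the error message quoted in the question: byte 0xe0 is how à is stored in Latin-1, so one possible explanation (an assumption, since the file involved is not shown here) is that some file was saved in Latin-1 and then read as UTF-8. A small sketch that reproduces that kind of error:

```python
# Encode a piece of the test string as Latin-1, where à is the single byte 0xe0.
raw = "Attivit\u00e0 Progettuale".encode("latin-1")
print(raw)

# Decoding those bytes as UTF-8 fails, because 0xe0 announces a multi-byte
# sequence but the following byte (a space) is not a continuation byte.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
print(error)

# Decoding with the encoding the bytes were written in gives the text back.
recovered = raw.decode("latin-1")
print(recovered)
```

If this is what is happening, the fix is to read and write every file with the same encoding, which is the point of the advice above.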
answered Dec 23, 2021 by Nilats (3,750 points)
It's probably the exercise text that is causing this problem. Try deleting it.

https://stackoverflow.com/questions/40442264/difference-between-comments-in-python-and
commented Dec 23, 2021 by zaur (740 points)
edited Dec 23, 2021 by zaur
I think you got it wrong. There is no problem with the exercise text. The problem is in one of the strings of the test. This is the string: "Attività Progettuale Di Ottimizzazione Su Reti M". It causes a problem because the len of this string is calculated as 49 instead of 48.
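The 49 versus 48 difference can be reproduced with a short sketch. It assumes (this is a guess at the cause, not something confirmed in the thread) that the file was saved as UTF-8 but decoded with a single-byte encoding such as Latin-1, so the two bytes of à become two separate characters:

```python
# The test string; "\u00e0" is the letter à.
title = "Attivit\u00e0 Progettuale Di Ottimizzazione Su Reti M"

# Simulate reading UTF-8 bytes with the wrong (single-byte) encoding.
garbled = title.encode("utf-8").decode("latin-1")

print(len(title), len(garbled))

# The second character produced from à is a non-breaking space (U+00A0),
# which looks like an extra space when printed.
print(repr(garbled[:10]))
```

Under this assumption, the extra "space" seen when printing is the second half of the mis-decoded à, not a character that exists in the file.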
answered Dec 23, 2021 by Exyss (21,390 points)
Are you sure you're opening the files using encoding='utf-8'?