Why is there a Hair Space character in the test02_enc.txt file?

Question

I keep getting errors while manually testing my decryption function in the homework 6 program01.py program. My program produces the correct keys for all the tests. The text for test01 is correct in the output, and so are the first 4 lines of test02. However, at character index 244 the output contains 'â', '€', 'Š', which corresponds to the hex bytes e2 80 8a, the UTF-8 encoding of U+200A, a "hair space":

https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128

I think this is because the test text is copied from the Information theory Wikipedia page (https://en.m.wikipedia.org/wiki/Information_theory), and at this character index there is a superscript "vii" which was not taken out of the test text. Is this something we should be working around as an exercise, or is it a mistake? I spent a lot of time working out what the issue was, and although other people may have gotten around it, my algorithm didn't allow me to ignore this character.
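The symptom described above can be reproduced without the homework files. The following is a minimal sketch that assumes the three characters appear because the bytes e2 80 8a were decoded as Windows-1252 (cp1252) instead of UTF-8; the variable names are illustrative only and are not part of program01.py.

```python
import unicodedata

# The three bytes reported in the question.
raw = bytes.fromhex("e2808a")

# Decoded as UTF-8, the three bytes form a single code point.
as_utf8 = raw.decode("utf-8")

# Decoded as cp1252 (assumed here to be the wrong default), each byte
# becomes its own character.
as_cp1252 = raw.decode("cp1252")

print(len(as_utf8), hex(ord(as_utf8)), unicodedata.name(as_utf8))
print(len(as_cp1252), [hex(ord(c)) for c in as_cp1252])
```

If that assumption holds, the file itself contains one hair space, and the three visible characters are only an effect of how the bytes were decoded.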

AL1990 · Answer 1 · 2023-12-02T21:08:34+0000

commented Dec 2, 2023 by animali (600 points)

There is an error: when I open the text file in Notepad++ it shows an "HSP" character, which the encryption key does not turn back into a normal character. Every other escape character or control code maps back to something useful, but because the text is taken from WordPress, it doesn't correctly identify the characters and adds these HSP characters, which shouldn't be there. This caused me some issues and may have caused issues for other people. I have now resolved it, but I don't think this bug was supposed to be in the test.

AL1990 (28120 points)

commented Dec 2, 2023 by AL1990 (28.1k points)

Do you open the files specifying the correct encoding='utf-8'?
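As a rough illustration of why the encoding argument matters, the sketch below writes a small sample containing a hair space to a temporary file and reads it back twice. The helper `read_text`, the file name `sample_enc.txt`, and the sample string are hypothetical and are not taken from the homework; cp1252 stands in for whatever the platform default might be.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file with an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Hypothetical sample text containing one hair space (U+200A).
sample = "theory\u200aof information"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_enc.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    good = read_text(path)                    # explicit UTF-8
    bad = read_text(path, encoding="cp1252")  # simulated wrong default

print(len(sample), len(good), len(bad))
```

With the wrong encoding, the text gains two extra characters and every index after the hair space shifts, which would be enough to break a decryption that depends on character positions.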
