In Java and several other languages, identifiers (e.g. method names) are allowed to contain unicode characters.
Unfortunately, some combinations of unicode characters are logically identical. For example, á (one character: Latin Small Letter a with Acute U+00E1) is the same as aÌ (two characters: Latin Small Letter A U+0061, and Non-spacing Acute Accent U+0301). These combinations are not just similar – they are identical by definition.
Java does not do any normalisation on your code before compiling it, so two identifiers containing equivalent but different unicode combinations are considered different (ref: JLS 7 section 3.8).
$ cat U.java public class U { static String \u00e1() { return "A WITH ACUTE"; } static String a\u0301() { return "A + NON-SPACING ACUTE"; } public static void main(String[] a) { System.out.println(á()); System.out.println(aÌ()); } } $ javac U.java && java U A WITH ACUTE A + NON-SPACING ACUTE
We can define and use two functions called á and aÌ and they are totally independent entities.
But don’t do this.
Python 3 also allows unicode characters in identifiers, but it avoids the above problem by normalising them (ref: Python 3 Reference, section 2.3):
$ cat U.py #!/usr/bin/env python3 def á(): print("A WITH ACUTE") def aÌ(): print("A + NON-SPACING ACUTE") á() aÌ() $ hexdump -C U.py 23 21 2f 75 73 72 2f 62 69 6e 2f 65 6e 76 20 70 |#!/usr/bin/env p| 79 74 68 6f 6e 33 0a 0a 64 65 66 20 c3 a1 28 29 |ython3..def ..()| 3a 0a 20 20 20 20 70 72 69 6e 74 28 22 41 20 57 |:. print("A W| 49 54 48 20 41 43 55 54 45 22 29 0a 0a 64 65 66 |ITH ACUTE")..def| 20 61 cc 81 28 29 3a 0a 20 20 20 20 70 72 69 6e | a..():. prin| 74 28 22 41 20 2b 20 4e 4f 4e 2d 53 50 41 43 49 |t("A + NON-SPACI| 4e 47 20 41 43 55 54 45 22 29 0a 0a c3 a1 28 29 |NG ACUTE")....()| 0a 61 cc 81 28 29 0a 0a |.a..()..| $ ./U.py A + NON-SPACING ACUTE A + NON-SPACING ACUTE
(Legend: A WITH ACUTE, A + NON-SPACING ACUTE)
The second definition overwrites the first because they are considered identical. You can call it via either way of saying its name.
Both ways of working are scary, but I’d definitely choose the Python 3 way if I had to.