Well, the topic of multilingual domains is looking really interesting to me, so here I am on my 2nd night of reading about multilingual domains.
As far as I have studied up to now, web addresses are typically expressed using Uniform Resource Identifiers, or URIs. The URI syntax defined in RFC 3986 / STD 66 (Uniform Resource Identifier (URI): Generic Syntax) limits web addresses to a small set of ASCII characters: basically English letters, numerals and a few symbols. So if we want to start गाैरव.com, that is not possible with the RFC 3986 standard alone.
So, to enable the registration of domain names such as गाैरव.com, a new concept called IDN (internationalized domain names) was agreed on by the IETF in March 2003 and defined in RFC 3490, 3491, 3492 and 3454.
As far as the implementation goes, when a user requests a domain name in its native form (Unicode, for Hindi), it is encoded into Punycode before going to the DNS. The encoding is done at the application level (usually by the browser), and the Punycode form is what gets sent to the DNS. Remember that the current DNS and name resolver infrastructure cannot handle Unicode (non-ASCII) domain names; that is why they are encoded into Punycode.
To convert Hindi domains into Punycode and back, use http://mct.verisign-grs.com/index.shtml
The conversion between the ASCII and non-ASCII forms is done by two algorithms, ToASCII and ToUnicode. I will write a separate blog entry on these algorithms.
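Until that entry is written, here is a small sketch for trying out the two conversions. It assumes Python 3 and its built-in "idna" codec, which follows the IDNA 2003 rules of RFC 3490. The helper names to_ascii and to_unicode are my own, chosen only to mirror the algorithm names, and the Hindi name is written as escaped code points so the example does not depend on the page encoding.

```python
# Sketch: ToASCII / ToUnicode using Python's built-in "idna" codec (IDNA 2003).
# to_ascii and to_unicode are illustrative helper names, not part of any RFC.

def to_ascii(domain: str) -> str:
    """Return the ASCII (Punycode, 'xn--') form of a domain name."""
    return domain.encode("idna").decode("ascii")

def to_unicode(ace_domain: str) -> str:
    """Return the Unicode form of an ASCII-encoded domain name."""
    return ace_domain.encode("ascii").decode("idna")

# The Hindi example domain, spelled out code point by code point.
name = "\u0917\u093e\u0948\u0930\u0935.com"

ace = to_ascii(name)
print(ace)                      # the form that is actually sent to the DNS
print(to_unicode(ace) == name)  # converting back should give the original name
```

Plain ASCII labels such as "com" pass through unchanged; only labels containing non-ASCII characters get the xn-- prefix.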
In the last blog entry we discussed the inconsistencies that arise from the Unicode representation of IDNs. Now I have realized something far more dangerous: spoofing.
So how does spoofing apply here? Let's take the only multilingual domain (MLD) I could find on the net, raftaar.com. The original रफ़्तार.com (Punycode http://xn--h2bnoc2dn7h.com ) is currently spoofed by a different website, रफ्तार.com (Punycode http://xn--h2bnoc3e8d.com), which is a parked website. You can see that the two domain names look almost the same in their Unicode representation, yet they produce different Punycode strings. The only difference is the nukta (the small dot under फ), so a user could easily be led to the alternate website. It's dangerous.
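To see exactly where the two raftaar names differ, one can list their code points. This is only a sketch, assuming Python 3 and its standard unicodedata module; the variable names are mine, and the two labels are written as escaped code points as I read them from the two domains above.

```python
import unicodedata

# The two look-alike labels, spelled out code point by code point.
original = "\u0930\u092b\u093c\u094d\u0924\u093e\u0930"   # with nukta after PHA
lookalike = "\u0930\u092b\u094d\u0924\u093e\u0930"        # without nukta

def describe(label: str) -> list:
    """Return one 'U+XXXX NAME' string per code point in the label."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]

for line in describe(original):
    print(line)

# Code points present in the original but missing from the look-alike.
extra = [ch for ch in original if ch not in lookalike]
print([unicodedata.name(ch) for ch in extra])

# The raw Punycode of each label (without the "xn--" prefix); these can be
# compared by hand with the strings quoted in the URLs above.
print(original.encode("punycode"))
print(lookalike.encode("punycode"))
```

A single combining mark is enough to make two different domain names, even though most readers would not notice it in the address bar.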
In general this kind of attack is known as a homograph spoofing attack. On February 7, 2005, Slashdot reported that this exploit was disclosed at the hacker conference Shmoocon with an example available at http://www.shmoo.com/idn/.
Since it is such an obvious way of spoofing people, this has been taken very seriously by the browser makers implementing IDNA, since the browser is what converts the Unicode name to the Punycode string. Among them, IE7 has implemented an anti-phishing filter to avoid this kind of spoofing, and the Mozilla Foundation's Firefox shows the Punycode URL instead of the Unicode one, thus thwarting such attacks while still allowing people to access websites on an IDN domain.
Still, this remains a very promising and necessary technology. I will talk about the algorithms and methodology used for encoding and decoding Punycode, and about how DNS handles the requests, in later entries.
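The "show the Punycode instead" idea can be sketched in a few lines. This is not how any real browser is written; it is only my own simplified illustration, assuming Python 3.7+ and its built-in "idna" codec, with a made-up function name address_bar_text.

```python
# Simplified illustration of the "display Punycode" defence.
# address_bar_text is a hypothetical name, not a real browser API.

def address_bar_text(host: str) -> str:
    """Return the text to display for a host name.

    Pure ASCII hosts are shown as typed; anything else is shown in its
    ASCII (xn--) form, so two look-alike Unicode names become visibly
    different strings.
    """
    if host.isascii():
        return host
    return host.encode("idna").decode("ascii")

with_nukta = "\u0930\u092b\u093c\u094d\u0924\u093e\u0930.com"
without_nukta = "\u0930\u092b\u094d\u0924\u093e\u0930.com"

print(address_bar_text("example.com"))
print(address_bar_text(with_nukta))
print(address_bar_text(without_nukta))
```

The price of this approach is that a legitimate Hindi domain is also shown as an unreadable xn-- string.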
Notice: All the work is under GFDL licence and copyright of Gaurav Mishra