snopes.com  

#1
27 October 2009, 07:52 AM
snopes
Join Date: 18 February 2000
Location: California
Posts: 75,151
Net set for 'language shake-up'

The internet is on the brink of the "biggest change" to its working "since it was invented 40 years ago", the net regulator Icann has said.

The body said that it was finalising plans to introduce web addresses using non-Latin characters.

http://news.bbc.co.uk/2/hi/technology/8326241.stm

#2
27 October 2009, 07:57 AM
Mr. Billion
Join Date: 09 July 2005
Location: Kansas
Posts: 1,617

What's "Viagra" in Arabic?
__________________
Life as Mr. Billion: Nasty, brutish, and tall.

#3
27 October 2009, 11:22 AM
Troberg
Join Date: 04 November 2005
Location: Borlänge, Sweden
Posts: 9,234

While I understand why they do it, I think it's a very, very bad idea. I mean, I could have a site with Swedish characters in it, and it would be difficult for a non-Swedish visitor to type it. Now, say that I would like to visit a Chinese site. Difficult would not even begin to describe it.

Also, don't forget all the software written to validate addresses against the Latin character set, which now needs to be updated.
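
As a rough illustration of the kind of check that would break, here is a Python sketch. The `is_ascii_hostname` and `is_hostname` helpers are hypothetical, not taken from any particular product, and the second one assumes names are converted to an ASCII-compatible form (as the IDNA standard does) before being validated:

```python
import re

# One DNS label: ASCII letters, digits and hyphens, no hyphen at either end.
ASCII_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ascii_hostname(name):
    """Latin-only check of the sort existing software tends to use."""
    return all(ASCII_LABEL.fullmatch(label) for label in name.split("."))


def is_hostname(name):
    """Updated check (an assumption): convert to the ASCII form first."""
    try:
        ascii_name = name.encode("idna").decode("ascii")
    except UnicodeError:
        return False
    return is_ascii_hostname(ascii_name)


print(is_ascii_hostname("snopes.com"))   # plain Latin name
print(is_ascii_hostname("snöpes.com"))   # rejected by the old check
print(is_hostname("snöpes.com"))         # accepted after conversion
```

Every validator written like the first function would need a change along the lines of the second.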

Nope, this is a very, very bad idea. It goes against all the fundamental ideas of interoperability on the internet.
__________________
/Troberg

#4
28 October 2009, 09:12 PM
Mr. Billion
Join Date: 09 July 2005
Location: Kansas
Posts: 1,617

Come to think of it, I'm going to register snöpes.com and microsöft.com ASAP.
__________________
Life as Mr. Billion: Nasty, brutish, and tall.

#5
28 October 2009, 10:11 PM
Logoboros
Join Date: 27 April 2004
Location: Columbia, MO
Posts: 2,629

Quote:
Originally Posted by Troberg View Post
Nope, this is a very, very bad idea. It goes against all the fundamental ideas of interoperability on the internet.
Well, I guess that's true if you define "interoperability" as "only one system should be allowed!"
__________________
"The man who never alters his opinion is like standing water, & breeds reptiles of the mind." --William Blake

#6
29 October 2009, 08:59 AM
Troberg
Join Date: 04 November 2005
Location: Borlänge, Sweden
Posts: 9,234

Quote:
Originally Posted by Logoboros View Post
Well, I guess that's true if you define "interoperability" as "only one system should be allowed!"
No, I define it as "systems (and people) should be allowed to communicate". If that means sticking to a pidgin English language for domain names, then so be it. Face it, a domain name is basically just a moniker for a network address, to make it easier to remember and to move around, nothing else. That doesn't require it to be internationalized, especially since all modern computers can switch to English keyboard layouts anyway (and frequently need to).

They are solving a nonexistent problem and creating loads of other problems in the process.
__________________
/Troberg

#7
29 October 2009, 07:07 PM
jimmy101_again
Join Date: 29 December 2005
Location: Greenwood, IN
Posts: 2,159

Quote:
Originally Posted by Troberg View Post
No, I define it as "systems (and people) should be allowed to communicate". If that means sticking to a pidgin English language for domain names, then so be it. Face it, a domain name is basically just a moniker for a network address, to make it easier to remember and to move around, nothing else. That doesn't require it to be internationalized, especially since all modern computers can switch to English keyboard layouts anyway (and frequently need to).

They are solving a nonexistent problem and creating loads of other problems in the process.
A "moniker" and "make it easier to remember" only apply to someone who uses the pidgin English character set. To people who use a non-Western alphabet, current web addresses are no better than their numeric equivalents, and the whole idea of domain name mapping is a waste of time. O*.#@)*$)@*$)!(@*$.)*), which is what a Western-alphabet name looks like to someone who doesn't use that character set, is no easier to remember than 123.321.422

A Chinese-character name for a web site is just as valid a "moniker," and "make(s) it easier to remember" just as well as the English version does. It just works that way for a different subset of the world's population.

It really shouldn't be all that hard to implement, especially if smart people spend more than 5 minutes on the problem. One simple solution is to just give each domain multiple names. A registered domain name might include English, Chinese and Latvian versions. Web users could use the language and character set they are most comfortable with. Total worldwide cost to implement such a system? About $5.

A somewhat trickier issue is how to encode the language/character set in the address itself. That info is already supposed to be encoded in the website, but now it will have to be included in the URL as well. As far as a computer is concerned, "www.thebestporno.com" is just a string of 8-bit numbers; it has no idea that those numbers represent letters, or what those letters mean, or what language they are in. Perhaps a URL will have to be something like
http://{character set spec}/www.thebestporno.com

It should be easy enough to mark the character set spec so that its absence is easy to detect and a default translation is used.

Wait a minute, since the "moniker" is really numeric, not character based, the numeric equivalent of the address www.thebestporno.com is all that is needed. The actual character coding is only needed by the user's browser to display it correctly, the actual fetch on the web doesn't need to know what the character encoding is. So the string of numbers represents the actual address (that gets translated by a name server to another string of numbers). How that string of numbers is displayed in a browser is up to the browser. The WWW doesn't care.
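
For what it's worth, the scheme standardized for internationalized domain names (IDNA, with Punycode) works much along these lines: the name travels as plain ASCII starting with "xn--", and the browser decides how to display it. A minimal Python sketch using the standard library's `idna` codec (which follows the older IDNA 2003 rules); the `to_wire` and `to_display` helper names are made up for illustration:

```python
def to_wire(hostname: str) -> bytes:
    """ASCII-only form of the name, as it would be sent to a name server."""
    return hostname.encode("idna")


def to_display(wire: bytes) -> str:
    """Human-readable form that a browser could show in the address bar."""
    return wire.decode("idna")


wire = to_wire("snöpes.com")
print(wire)              # ASCII bytes; the non-ASCII label starts with xn--
print(to_display(wire))  # back to the readable name

# A name that is already plain ASCII passes through unchanged.
print(to_wire("snopes.com"))
```

No character set spec is needed in the URL, because the ASCII form is unambiguous by itself.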

#8
30 October 2009, 05:09 PM
niner
Join Date: 01 March 2005
Location: Ovid, MI
Posts: 882

Quote:
Originally Posted by Troberg View Post
No, I define it as "systems (and people) should be allowed to communicate". If that means sticking to a pidgin English language for domain names, then so be it. Face it, a domain name is basically just a moniker for a network address, to make it easier to remember and to move around, nothing else. That doesn't require it to be internationalized, especially since all modern computers can switch to English keyboard layouts anyway (and frequently need to).

They are solving a nonexistent problem and creating loads of other problems in the process.
However, since the domain name system is just a front end for IP resolution, there's no reason you couldn't provide as many front-ends as you want.

Servers using our current character set can continue to run as they are. A new Arabic server could come up for Arabic-script languages that resolves names in those languages to the same IPs. Since DNS is tiered, you can even provide both services to all people without removing service for some, or requiring all servers to cover all character sets.
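
A toy sketch of that idea in Python. This is a plain lookup table rather than real DNS; the names under `.example`, the `192.0.2.10` documentation address, and the `wire` and `resolve` helpers are all made up for illustration, and using the standard `idna` codec to get one ASCII key per name is an assumption about how such a front end might be wired:

```python
def wire(name):
    """Normalize a name to a single ASCII key (assumption: IDNA form)."""
    return name.encode("idna").decode("ascii").lower()


# One canonical record...
ADDRESSES = {wire("shop.example"): "192.0.2.10"}

# ...and any number of front-end names in other scripts pointing at it.
ALIASES = {
    wire("магазин.example"): wire("shop.example"),
    wire("商店.example"): wire("shop.example"),
}


def resolve(name):
    """Return the address for a name, following one alias hop if present."""
    key = wire(name)
    key = ALIASES.get(key, key)
    return ADDRESSES.get(key)


for name in ("shop.example", "магазин.example", "商店.example"):
    print(name, "->", resolve(name))
```

The existing Latin name keeps working untouched; the other names are just extra entries.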

Henry

#9
30 October 2009, 07:37 PM
jimmy101_again
Join Date: 29 December 2005
Location: Greenwood, IN
Posts: 2,159

Quote:
Originally Posted by niner View Post
However, since the domain name system is just a front end for IP resolution, there's no reason you couldn't provide as many front-ends as you want.

Servers using our current character set can continue to run as they are. A new Arabic server could come up for Arabic-script languages that resolves names in those languages to the same IPs. Since DNS is tiered, you can even provide both services to all people without removing service for some, or requiring all servers to cover all character sets.

Henry
Any thought on how to include the encoding in the address? To the name server it doesn't matter since the server is looking at the address as a string of numbers anyway so the character set doesn't matter.

But how do you tell the web browser that the number 46 is supposed to be displayed as an umlauted b because it is in the Dutch character set, and not as a curvy uppercase-M-like thing because it is Mongolian? (Just made those up.)

#10
30 October 2009, 08:57 PM
Mad Jay
Join Date: 19 July 2003
Location: Virginia
Posts: 7,820

Quote:
Originally Posted by jimmy101_again View Post
Any thought on how to include the encoding in the address? To the name server it doesn't matter since the server is looking at the address as a string of numbers anyway so the character set doesn't matter.

But how do you tell the web browser that the number 46 is supposed to be displayed as an umlauted b because it is in the Dutch character set, and not as a curvy uppercase-M-like thing because it is Mongolian? (Just made those up.)
UTF-8 encoding solved that problem years ago. With that encoding, each character has its own Unicode code point, so the number itself says which character (and which script) is meant, and no separate character set spec is needed. It's backward compatible with ASCII, so plain ASCII addresses read the same as they always did.
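
A minimal Python sketch of how that plays out. Unicode assigns one number per character rather than tagging a language, so the same number never means a different letter in a different character set. The `describe` helper is made up for illustration:

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name, UTF-8 bytes) per character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))
        for ch in text
    ]


for row in describe("aö中"):
    print(row)

# ASCII text is unchanged under UTF-8: the byte 46 is always a full stop.
print(bytes([46]).decode("utf-8"))
```

ASCII characters keep their single-byte values, while everything else becomes a multi-byte sequence that cannot be mistaken for ASCII.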
__________________
In between my father's fields; And the citadels of the rule; Lies a no-man's land which I must cross; To find my stolen jewel.

#11
02 November 2009, 10:10 AM
Troberg
Join Date: 04 November 2005
Location: Borlänge, Sweden
Posts: 9,234

Quote:
Originally Posted by niner View Post
However, since the domain name system is just a front end for IP resolution, there's no reason you couldn't provide as many front-ends as you want.

Servers using our current character set can continue to run as they are. A new Arabic server could come up for Arabic-script languages that resolves names in those languages to the same IPs. Since DNS is tiered, you can even provide both services to all people without removing service for some, or requiring all servers to cover all character sets.
Yep, that would work from a technical perspective.

However, the human factor will fail. Sites will forget that they may be useful in other countries. Users telling other users about a site will forget, or not know, the other names. Depending on how you find a link, you'll get different names but the same content, which will screw up searches and make you wade through duplicates.

Quote:
Originally Posted by jimmy101_again View Post
A "moniker" and "make it easier to remember" only apply to someone who uses the pidgin English character set. To people who use a non-Western alphabet, current web addresses are no better than their numeric equivalents, and the whole idea of domain name mapping is a waste of time. O*.#@)*$)@*$)!(@*$.)*), which is what a Western-alphabet name looks like to someone who doesn't use that character set, is no easier to remember than 123.321.422
Well, have you seen any major computer/OS that doesn't have the "Pidgin English" character set in its core anyway? I haven't. If you use, say, an Arabic Windows, to take an example I'm familiar with, you'll still need to display stuff using English characters, and the OS is designed to handle that. A quick keyboard combination and you'll have the English layout instead, which is printed on the keyboard. I'm a nerd, so I collect keyboards with different layouts. I have Arabic, Greek, Thai and Cyrillic keyboards, and they all have English characters printed on the keys as well. To use a computer, you'll have to at least be able to type that character set anyway.

If one follows your logic, the system files should be localized as well. Why not have kärna.kör and användare.kör instead of kernel.exe and user.exe in a Swedish Windows while you are at it?
__________________
/Troberg

All times are GMT.

