Bidi discussion

From IUCG - Internet Users Contributing Group



From: jefsey [1] Sent: 13/Feb/2010 4:15 PM

As you know, I am preparing an appeal to the IESG over the IDNA issue. In a nutshell, this appeal concerns the way the IESG/IAB handles IDNA2008 as a standalone architectural issue. From a usage point of view, the architectural concepts approved on the occasion of IDNA2008 legitimize a much more innovative vision of the same Internet technology. However, that vision introduces new issues which should be addressed, or at least announced, at the same time in order to avoid confusion. The real matter therefore most probably lies with the IAB. My evaluation is that IDNA2008 uses the intrinsic capacity of Internet technology for multiplicity (which we are not used to) to address diversity.

The purpose of the appeal is therefore to obtain an authoritative (and, we think, urgent) IESG and probably IAB position about:

(1) the way they consider the IDNA2008 architectural insertion and impact

(2) if such an impact exists, who is to address its consequences: IETF, users, someone else, and under which form.

In order to clarify the issues I consider three areas needing to be documented:

1. IDNA and the Internet system - this is what IDNA documents - where IETF can specify MUSTs

2. IDNA and the IDNs Internet peritem - this is what Mapping considers - where IETF can specify SHOULDs

3. IDNA and the Users Internet exotem - this is the IDNA open use architecture by users' real world, which is not considered - where IETF can specify MAYs

I am not really familiar with Bidi. My question therefore is: where in the above scheme do you locate your discussion?

1. does that affect IDNA2008 as the interface between the DNS and IDNs?
2. does that relate to IDNs conversion from/to A-labels/U-labels?
3. does that belong to the User experience?

My question is: how should Bidi best be inserted in the appeal so that it is of best use?

Slim Amamou <slim@alixsys.com> 14 February 2010 04:57

hi,
It probably belongs to User experience, although the issue is broader than that and deals with the question of the universality of the representation of the domain name. In other words: will a domain name look the same in all countries, for all cultures and on every medium (screen, print, etc.) or not?

Abdulrahman I. ALGhadir <aghadir@citc.gov.sa> 14 February 2010 05:39

1. does that affect IDNA2008 as the interface between the DNS and IDNs?

No, it doesn't.

2. does that relate to IDNs conversion from/to A-labels/U-labels?

No, it doesn't.

3. does that belong to the User experience?

Well, it doesn't give the user him/herself any problem in reading domain names (because they are rendered in the correct way). The issue is that the logical order sometimes isn't the same as the network order (which is against some rules in the IDNA RFC and some other RFCs).

My question is: how should Bidi best be inserted in the appeal so that it is of best use?

Well, I don't see that IDNA has a major impact on this problem (it is a rendering problem), so I am not sure about this part.

Slim Amamou <slim@alixsys.com> 14 February 2010 11:23

does that belong to the User experience -- Well, it doesn't give the user him/herself any problem in reading domain names (because they are rendered in the correct way)

Actually, Abdulrahman and I do not agree here. I started the thread because I consider rendering L1.R2.R3.L4 as L1.R3.R2.L4 obviously incorrect, and I still don't understand how this could be acceptable to anyone. Note that this will be a very common case, since most domain names begin with an LTR www. and end with an LTR TLD.
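To illustrate the reordering at issue, here is a minimal Python sketch of how a plain-text bidi display can reorder whole labels. It is a simplified model, not the full Unicode Bidirectional Algorithm: it assumes each label is entirely LTR or entirely RTL, ignores digits, and treats the "." between two labels of the same direction as joining them into a single directional run. The function name `visual_label_order` and the (text, direction) representation are hypothetical, chosen only for this example.

```python
def visual_label_order(labels, base="L"):
    """Return labels in left-to-right visual order (simplified model).

    labels: list of (text, direction) pairs in network (logical) order,
            where direction is "L" or "R".
    base:   direction of the surrounding paragraph, "L" or "R".
    """
    # Group adjacent labels of the same direction into runs.
    runs = []
    for text, direction in labels:
        if runs and runs[-1][0] == direction:
            runs[-1][1].append(text)
        else:
            runs.append((direction, [text]))

    # In an RTL paragraph the runs themselves are laid out right to left.
    if base == "R":
        runs.reverse()

    # Inside a run, LTR labels keep their order and RTL labels are reversed.
    visual = []
    for direction, texts in runs:
        visual.extend(texts if direction == "L" else reversed(texts))
    return visual


name = [("L1", "L"), ("R2", "R"), ("R3", "R"), ("L4", "L")]
print(".".join(visual_label_order(name)))  # L1.R3.R2.L4
```

Under these assumptions the two adjacent RTL labels swap places on screen while the LTR labels at each end stay where they are, which is the L1.R3.R2.L4 display discussed in this thread.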

Vint Cerf <vint@google.com> 14 February 2010 15:36

unless there are strong indicators that a string IS a domain name, this is going to be an unsolvable problem I think. Won't it also depend on the actual label contents and the appearance of numerics in the labels, to add more complexity?

vint

Slim Amamou <slim@alixsys.com> 14 February 2010 20:41

unless there are strong indicators that a string IS a domain name, this is going to be an unsolvable problem I think.

I think dealing with cases where a domain name cannot be identified as such is out of scope. It always was: there is no way to tell that example. is a domain name. But our problem could be narrowed to displayed URLs (or should we say IRLs?) and email addresses. In those two contexts the domain name is obvious.
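As a small illustration of why the domain name is easy to locate in those two contexts, here is a Python sketch using only the standard library. The helper names `domain_from_url` and `domain_from_email` are hypothetical, and the email handling is a naive split on the last "@", not a full address parser.

```python
from urllib.parse import urlsplit


def domain_from_url(url):
    """Return the host part of a URL, or None if no host can be identified."""
    return urlsplit(url).hostname


def domain_from_email(address):
    """Return the part after the last '@', or None if there is none."""
    _local, sep, domain = address.rpartition("@")
    return domain if sep and domain else None


print(domain_from_url("http://www.example.com/path"))  # www.example.com
print(domain_from_email("user@example.com"))           # example.com
print(domain_from_url("www.example.com"))              # None
```

The last call shows the other side of the argument: without a scheme or an "@", nothing marks the bare string as a domain name.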

Won't it also depend on the actual label contents and the appearance of numerics in the labels, to add more complexity?

No, because

1- the structure of the domain name will prevail. DNS was not meant to be used like del.icio.us or to write sentences using the period in place of the space. Some components of the Internet depend on the implied hierarchy of authority, like the HTML5 same-origin policy for example, and in general wherever the word "subdomain" is used in a spec.

2- If someone really can't help using a domain name to write a sentence like www.this.is.an8.label.domain.name.com in Arabic, it won't matter to him to input it in reverse order: name.domain.label.an8.is.this.www during the domain registration process. What will matter to him is that the domain will *always* be displayed the same way, whether in an LTR or an RTL context. (If this is not clear, please ask and I will elaborate.)

I'm aware that it would be confusing for someone writing RTL to input a domain name including periods during the registration process, just to have it inverted on display. But this is due to the limitations of the currently deployed domain registration platforms, and could be mitigated by simple GUI improvements (GUI improvements have to be made anyway to make them BIDI capable).

Vint Cerf <vint@google.com> 14 February 2010 21:03

Slim,

one doesn't register the whole domain name in one place - one registers a label with each zone manager.

so I am not sure we should take your example below as literally as you might have meant it?

vint

Slim Amamou <slim@alixsys.com> 14 February 2010 22:50

one doesn't register the whole domain name in one place - one registers a label with each zone manager.

But current registrar web interfaces allow for authority zone management. And from what I recall, they unfortunately allow periods in subdomains. If that is not the case, that's even better because enforcing network order for label ordering won't introduce any usability problem.

Slim Amamou <slim@alixsys.com> 16 February 2010 10:21

(...) The way I approach semantic addressing is to locate a notion through its coordinated concepts. So, this is a genitive chain: abc of cde of ghi of ijk, which can also be ijk's ghi's cde's abc. When it is a domain name it will use the first format; when it is a C structure it will use the second format. In that perspective I presuppose that order within labels is to be purely conventional, respecting the punycode order, itself respecting the user entered/registered order.

I agree with this, and I chose the network order out of convenience. But what is a C structure?

(...) To do that I just consider that the DNS uses alphadigital digits (i.e. from 0 to Z, which fits with the way it relates to uppercase). I then consider that a label may be made of several sublabels united by the "-" subjoiner/subseparator, and that labels are tied into an LTR or RTL name sequence using the "." joiner/separator (the xn-- header is therefore a sublabel separated from the other sublabels by an empty sublabel).

So you are introducing a new separator class, namely the "-" within the label, thus structuring the label in its punycode form (A-label) for the sake of meaningfulness. Right?
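The joiner/subjoiner syntax described above can be sketched in a few lines of Python. This is only one reading of the proposal: the function name `split_name` is hypothetical, and the split is purely lexical, whereas punycode itself treats only the last "-" of the encoded part as its delimiter.

```python
def split_name(name):
    """Split a name into labels on "." and each label into sublabels on "-"."""
    return [label.split("-") for label in name.split(".")]


print(split_name("www.xn--bcher-kva.example"))
# [['www'], ['xn', '', 'bcher', 'kva'], ['example']]
```

The empty string between 'xn' and 'bcher' is the "empty sublabel" that sets the xn-- header apart from the other sublabels.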

In such a syntax only URN is reordered if it is a domain name or a structure, not sublabels. This means that this is transparent to punycode.

For that matter, I support the non-reordering of URN components and subcomponents.

As far as I understand, Bidi is only treated on the "UDN" (user entered domain name, which can still be more complex than the IDN U-label). This is why I am quite concerned by your mention that Bidi registries could "rigidify" the label order by entering a "." that would be some kind of intermediary between what I call a subjoiner and a joiner. This could be addressed as a special joiner, but would complexify the whole syntax analysis. In the same way, could a "to be hidden" subsubjoiner/separator indicate sequences indifferent to Bidi?

Yes. If it's hidden, it's not a problem. In the layer you are working on, only ASCII is allowed, so BIDI and i18n altogether are not a concern.

I wish to underline that, in one way or another, a semantic address will have to indicate that it is a semantic address and whether it uses the DNS or the C structure format. The idea I currently have (to be script and language transparent) is to use a prefix such as "|+ |" and "| +|" or "++ +", which are universal. Could such a prefix/postfix, to be removed before using the URL, also indicate that a string is a domain name (out of the habit in semantic addresses)?

I have no problem with that, it's just out of the scope of the issue I indicated.

Jefsey

I have no problem with that, it's just out of the scope of the issue I indicated.

Yes.

This is totally out of the WG Charter. Yet, IMHO, the WG Charter, if correctly completed (as it has been, except that the Mapping document is not approved yet), can only raise the question: "is its continuation on the user side in the scope of the whole IETF? If yes, who takes care of it? If not, who should be the leader, because IETF has to interface with them?" This is why I consider that this is in the IAB scope/area of decision and that the IESG should not have approved the IDNA2008 document set without making it extremely clear, so that no one respecting the IETF positions, as ICANN and responsible lead users do, attempts to use IDNA2008 under operational (test or not) conditions.
