Besides URL & URN

Started by shabdli, Oct 15, 2022, 03:12 AM

shabdli (Topic starter)

While scouring the Internet, I landed on a page about URI and after a few hours realized how little I knew about this monster. I admit that before then I thought a URI was either a URL or a URN, and my knowledge was limited to that scheme.
To my surprise, there is a lot more to this topic. Let's go through it in order.



URI

URI (Uniform Resource Identifier) is a uniform resource identifier. In short, it lets you identify any resource, physical or abstract (for example, https://vk.com/settings, which does not exist as a physical thing).

By itself, a URI does not give us much; it is just an "interface" (in OOP terms). The most interesting parts come from its subtypes.

Its "interface":

URI = scheme ":" hierarchical-part [ "?" query ] [ "#" fragment ]
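
To see these components on a concrete address, here is a minimal sketch using Python's standard `urllib.parse.urlsplit`. The example address is made up purely for illustration; note that the hierarchical part comes back split into the authority (`netloc`) and the `path`.

```python
from urllib.parse import urlsplit

# Hypothetical address, chosen only so that every component is present.
uri = "https://example.com:8080/docs/page?lang=en#intro"
parts = urlsplit(uri)

print(parts.scheme)    # https
print(parts.netloc)    # example.com:8080  (authority, part of the hierarchical part)
print(parts.path)      # /docs/page        (path, part of the hierarchical part)
print(parts.query)     # lang=en
print(parts.fragment)  # intro
```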

URN

URN (Uniform Resource Name) is a uniform resource name. It identifies a resource (again, abstract or physical) by name alone, without saying where to find it.

<URN> ::= "urn:" <NID> ":" <NSS>

<NID> - namespace identifier - identifies the namespace.

<NSS> - namespace specific string - the name of the resource within that namespace.

Example: urn:isbn:540665601X is the identifier of a specific book, since the ISBN is unique
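
The grammar above is simple enough to take apart by hand. Below is a rough sketch, not a full validator: `parse_urn` is a hypothetical helper name, and it only checks the overall "urn:" NID ":" NSS shape.

```python
def parse_urn(urn: str) -> tuple[str, str]:
    """Split a URN into (NID, NSS) following "urn:" <NID> ":" <NSS>."""
    parts = urn.split(":", 2)  # at most 3 pieces, so the NSS may itself contain ":"
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError(f"not a URN: {urn!r}")
    # The NID is case-insensitive, so normalize it; the NSS is kept as written.
    return parts[1].lower(), parts[2]


print(parse_urn("urn:isbn:540665601X"))  # ('isbn', '540665601X')
```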

URL

URL (Uniform Resource Locator) is a uniform resource locator. It tells us where to find a resource. It probably needs no introduction: we all use it every day.

Now we go to the very depths...

PURL

PURL (Persistent Uniform Resource Locator) is a persistent uniform resource locator.

Recall that a URL tells us "where to go to get the resource". But what if the resource was deleted, moved, renamed, etc.? One solution may be to use URNs, but that is still a long way off. This is where PURL comes to our aid.

Its main idea is to maintain a database of PURL addresses that maps each PURL to the currently active URL and redirects to that URL (with an HTTP redirect, for example).

<PURL> ::= <SCHEME> "://" <HOST> "/" <NAME>

As you can see, the structure is not very different from the good old URL. The difference is that the HOST is the PURL database server, and the NAME is the persistent name that the server maps to the current address of the resource we want to access.


For example, a request to

http://your.web.server/your/web/root/

will be redirected to

http://your.web.server/your/web/root/chapter12.html
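
The lookup-and-redirect idea can be sketched in a few lines. This is only an illustration under assumptions: `PURL_TABLE`, `resolve` and the persistent name `/docs/chapter12` are hypothetical, and a real PURL server would keep the table in a database and speak actual HTTP.

```python
# Hypothetical PURL table: persistent name -> current URL of the resource.
PURL_TABLE = {
    "/docs/chapter12": "http://your.web.server/your/web/root/chapter12.html",
}


def resolve(purl_path: str) -> tuple[int, dict[str, str]]:
    """Return an (HTTP status, headers) pair for a request to the PURL server."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        return 404, {}                    # unknown persistent name
    return 302, {"Location": target}      # redirect to wherever the resource lives now


print(resolve("/docs/chapter12"))
```

If the resource moves, only the entry in the table changes; every link that uses the persistent name keeps working.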

IRI

IRI (Internationalized Resource Identifier) is an internationalized resource identifier. It lets you write the address of a resource in any language of the world.

It exists because of URL restrictions. The valid characters (RFC 3986), apart from "%", which introduces percent-encoded bytes, are:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=

To get around this limitation, IRI was created. This type of address uses Unicode instead of US-ASCII.
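
The two forms map onto each other by percent-encoding the UTF-8 bytes of every non-ASCII character. A minimal sketch with Python's standard `urllib.parse` (only the path is converted here; the host part of an address is handled differently):

```python
from urllib.parse import quote, unquote

iri_path = "/wiki/Программист"

# quote() percent-encodes the UTF-8 bytes and leaves "/" alone by default.
uri_path = quote(iri_path)
print(uri_path)

# unquote() reverses the mapping and gives back the readable form.
print(unquote(uri_path))
```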

Thus, the address

https://ru.wikipedia.org/wiki/%D0%9F%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%81%D1%82

turns into

https://ru.wikipedia.org/wiki/Программист

XRI

XRI (Extensible Resource Identifier) is an extensible resource identifier. The protocol was created by OASIS. Its goals are too broad to fit into a single post, so I will describe it only briefly.

XRI is not only compatible with IRI and URI, but was also conceived as a possible replacement for the entire DNS and IP addressing system! There are 2 layers of identifiers in the protocol:

    I-Number is a permanent address (similar to an IP address). It is registered to a specific resource and is never reassigned (unlike IP addresses, which can change for the same server)

    I-Name is a human-readable address (similar to DNS). I-Name resolves to I-Number.

Unlike DNS addressing, XRI:

    Is non-hierarchical and peer-to-peer

    Can have cross-references - one XRI is nested in another, so one logical resource can be identified in different contexts

    Has global context registries

        = - private persons

        @ - organizations

        + - general concepts

Examples

I-Names:

    =Ivan.Petrov

    @Yandex/(+programmer.id)

    +phone.number

I-Numbers:

    !!43534!A8C3/!D90F.88

    !!1002!A7C5
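
As a small illustration of the global context symbols listed above, the sketch below looks at the first character of an I-Name. The table and the `global_context` helper are hypothetical and cover only the three symbols mentioned in this post, not the full XRI syntax.

```python
# Global context symbols as described in the post (not the complete XRI grammar).
GLOBAL_CONTEXTS = {
    "=": "private person",
    "@": "organization",
    "+": "general concept",
}


def global_context(i_name: str) -> str:
    """Return the registry an I-Name belongs to, judged by its first character."""
    return GLOBAL_CONTEXTS.get(i_name[:1], "unknown")


for name in ("=Ivan.Petrov", "@Yandex/(+programmer.id)", "+phone.number"):
    print(name, "->", global_context(name))
```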

The technical committee that developed XRI was closed on 8.07.2015, and the protocol itself is no longer in development.

Dessert

You probably know that there are Cyrillic domain names. For example, https://стопкоронавирус.рф .

The catch is that a domain name stored in DNS may contain only digits, Latin letters and "-" (37 characters in total). So how do .рф domains work? Do they use a different DNS system? No. They use the Punycode encoding algorithm, which converts a Unicode sequence into an ACE (ASCII-compatible encoding) sequence that DNS understands.

The algorithm consists of 2 steps:

    Copy all the original ASCII characters to the resulting string unchanged

    If there are non-ASCII characters, then

        Add "-" to the end (only if at least one ASCII character was copied)

        Sequentially encode all remaining characters
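
Python ships codecs for both layers, so the steps above can be tried without implementing them by hand. This is a sketch under assumptions: `to_ace` and `to_unicode` are hypothetical helper names, and the built-in "idna" codec follows the older IDNA 2003 rules, which is enough for a plain lowercase Cyrillic name. In a domain name, each encoded label also gets the "xn--" prefix.

```python
def to_ace(domain: str) -> str:
    """Unicode domain name -> ASCII-compatible (ACE) form understood by DNS."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ace: str) -> str:
    """ACE form -> readable Unicode domain name."""
    return ace.encode("ascii").decode("idna")


# Raw Punycode: ASCII letters are copied first, then "-", then the encoded rest.
print("bücher".encode("punycode"))    # b'bcher-kva'

# A label with no ASCII characters gets no "-" delimiter, only the "xn--" prefix.
print(to_ace("рф"))                   # xn--p1ai

ace = to_ace("стопкоронавирус.рф")
print(ace)
print(to_unicode(ace))
```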

The algorithm is described in RFC 3492 and also on Wikipedia.

Result

I hope you learned something new and the article turned out to be useful. At least now you won't claim that "URI has only 2 subtypes - URL and URN".

selearnerlive

By the way, the URI specification does not define the delimiters inside the query string in any way.
The fact that "equals" and "ampersand" have become the de facto separators is simply the result of popularity. In the old days, some web servers used a semicolon as the separator, for instance.
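
A quick way to see that the separator is only a convention: Python's standard `urllib.parse.parse_qsl` accepts a `separator` argument (assuming Python 3.9.2 or newer, where the parameter is available). The query strings below are made up for illustration.

```python
from urllib.parse import parse_qsl

# The usual "&" convention.
print(parse_qsl("a=1&b=2"))                 # [('a', '1'), ('b', '2')]

# The older ";" convention has to be requested explicitly.
print(parse_qsl("a=1;b=2", separator=";"))  # [('a', '1'), ('b', '2')]
```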

pujagupta

URI: Denotes the name and address of a resource on the network. As a rule, URIs are divided into URLs and URNs, so URL and URN are both kinds of URI.

URL: The address of some resource on the web. The URL determines the location of the resource and how to access it.

URN: The name of some resource on the web. The meaning of the URN is that it defines only the name of a specific object, which can be located in many specific places.

So, we can assume that:
URI = URL or URI = URN or URI = URL + URN