If you run a website, you are likely to come across URL encoding and URL decoding at some point. If you haven’t met them yet, or have only just encountered them and want to know more, this article explains what they are and why they matter.
What Is A URL?
A URL (Uniform Resource Locator) is an address used by browsers to find a resource on the internet. Normally, the URL leads to a webpage, like the one you’re reading. Sometimes it leads to another kind of file, such as a PDF document.
All URLs share a structure that was formulated by the inventor of the World Wide Web, Tim Berners-Lee. They conform to a generic syntax, in which the parts in braces are optional:
scheme:{//{user:password@}host{:port}}path{?query}{#fragment}
Certain parts of the URL syntax are deprecated and rarely used because of security concerns. A good example is the {user:password@} part. Sending credentials in a URL without any form of protection could let attackers gain access to systems they are not supposed to reach.
A typical URL looks like this:
https://example.com/page-1
In this URL, the scheme is https, the host is example.com and the path is /page-1.
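To see these parts programmatically, here is a minimal sketch using Python’s standard urllib.parse module. The choice of Python is an assumption made for illustration; the article itself does not prescribe a language.

```python
from urllib.parse import urlsplit

# Split the example URL into the components named in the generic syntax.
parts = urlsplit("https://example.com/page-1")

print(parts.scheme)    # https
print(parts.netloc)    # example.com
print(parts.path)      # /page-1
print(parts.query)     # empty string: there is no ?query part
print(parts.fragment)  # empty string: there is no #fragment part
```

Note that urlsplit reports the path with its leading slash, and returns empty strings for the optional parts that are absent.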
Allowable URL Characters
URLs can contain only certain characters. These all belong to the US-ASCII character set: digits (0-9), letters (a-z and A-Z) and a few special characters. If a URL contains characters that fall outside US-ASCII, or that are reserved for a special meaning (including ?, /, # and :), the URL needs to be altered.
This is where encoding comes into play: no part of the URL should contain these ‘reserved’ characters unless they are serving their specific purpose. For example, the / character separates the segments of the path.
So, when one of these characters appears as data, it needs to be encoded. URL encoding converts reserved, unsafe and non-ASCII characters into a format that web browsers and servers universally accept and understand.
During encoding, the character is converted into one or more bytes, and each byte is written as two hexadecimal digits preceded by a percent sign (%). This is why URL encoding is sometimes called percent encoding.
For example, suppose you want to send the URL:
https://example.com/hello world
Notice the space between hello and world. Spaces are not allowed in a URL, so this one has to be replaced. The replacement is %20, where 20 is the hexadecimal value of the space character in ASCII. After URL encoding, the transmitted URL looks like this:
https://example.com/hello%20world
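The byte-by-byte conversion described above can be sketched in Python. The percent_encode function below is a hypothetical helper written for illustration, and it assumes UTF-8 as the byte encoding, which is the usual choice on the web. It is compared against quote from Python’s standard urllib.parse module.

```python
from urllib.parse import quote

def percent_encode(text: str) -> str:
    """Hypothetical helper: percent-encode every character that is not
    an unreserved character (letters, digits, '-', '.', '_', '~')."""
    unreserved = (
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789-._~"
    )
    out = []
    for ch in text:
        if ch in unreserved:
            out.append(ch)
        else:
            # Encode as UTF-8, then write each byte as % plus two hex digits.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

print(percent_encode("hello world"))  # hello%20world
print(percent_encode("café"))         # caf%C3%A9 (one character, two bytes)

# The standard library gives the same result for these inputs.
print(quote("hello world", safe=""))  # hello%20world
print(quote("café", safe=""))         # caf%C3%A9
```

The second example shows why the text says "one or more bytes": a non-ASCII character such as é occupies two bytes in UTF-8, so it becomes two percent-encoded groups.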
Why Is URL Encoding Important?
Understanding URL encoding helps you spot problems in your URLs. If you can recognize when encoding has occurred, you can review the URL’s structure and fix the problem, for example by creating a new URL.
Or you might need non-standard characters in a URL and need to know how to encode them so that they are accepted. This might be the case when a page has a section headed by a question and you want to link to that section. Because ‘?’ is a reserved character that normally marks the start of the query string, the safe choice is to encode it.
The URL you might want is:
https://example.com/questions#what-is-the-question?
But in encoding it would be changed to:
https://example.com/questions#what-is-the-question%3F
This takes the visitor to the questions page on the domain example.com, and to the section of that page titled ‘What is the question?’
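As a minimal sketch, the same encoding can be produced with quote from Python’s standard urllib.parse module. The variable names are illustrative only, and passing safe="" is a choice made here so that no reserved character is left unencoded.

```python
from urllib.parse import quote

base = "https://example.com/questions"
fragment = "what-is-the-question?"

# safe="" tells quote() to encode every reserved character, including '?'.
encoded = base + "#" + quote(fragment, safe="")
print(encoded)  # https://example.com/questions#what-is-the-question%3F
```

Only the fragment text is passed to quote. The # that introduces the fragment is added separately, so it keeps its special meaning.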
How To Properly Encode A URL
Mistakes are easy to make when encoding a URL, because the same character can mean different things. For instance, say the URL example.com/a+b/c needs to be encoded. The / between the b and the c could be a path separator, and therefore an allowed character in the URL, or it could be a division sign that is part of the data.
Therefore, encoding the URL example.com/a+b/c could result in either
example.com/a%2Bb/c or example.com/a%2Bb%2Fc.
So, to encode the URL properly, you need to know what the / between the b and the c represents. Is it a division sign or a path separator?
Then you encode each section of the URL separately. This includes the host (e.g. example.com) and every individual path segment.
So, if the / is a path separator, you need to encode three parts: example.com, a+b and c.
If, however, it represents division, you should encode two parts: example.com and a+b/c.
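The two readings can be sketched in Python by encoding each path segment on its own and joining the results afterwards. The build_path function is a hypothetical helper written for this illustration. It leaves the host unchanged, on the assumption that a host like example.com contains only letters and dots and so has nothing to encode.

```python
from urllib.parse import quote

def build_path(host: str, segments: list[str]) -> str:
    """Hypothetical helper: encode each path segment separately,
    then join the encoded segments with '/'."""
    return host + "/" + "/".join(quote(s, safe="") for s in segments)

# '/' is a path separator: two segments, "a+b" and "c".
print(build_path("example.com", ["a+b", "c"]))  # example.com/a%2Bb/c

# '/' is data (a division sign): one segment, "a+b/c".
print(build_path("example.com", ["a+b/c"]))     # example.com/a%2Bb%2Fc
```

Because the caller decides where the segments are split, the code never has to guess what a / means. A / inside a segment is always treated as data and encoded as %2F.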
To help with encoding any of your URLs, you can use this free URL Decode & URL Encode Online Tool from Gochyu. You will need to enter each part of the URL separately, but it will give you the correctly encoded result.
You can even separate each part of the URL onto a new line in order to get a complete URL in one go.
So, the URL
example.com/questions/what-is-the-question?#Answer#1
should come out as:
example.com/questions/what-is-the-question%3F#Answer%231
What Is URL Decoding?
URL decoding is URL encoding in reverse. When a URL has been encoded because it contained characters that are not acceptable, decoding converts it back into a more readable form.
This can be important for finding errors within a URL. For instance, you might have accidentally placed a # within a URL. By decoding the URL you can spot such characters, remove them, and make the URL more readable to both search engines and humans. Once they are removed, you may see an improvement in your site’s rank on search engines, or get more direct traffic because people can now use a standardized URL.
If you need to decode a URL, you can use our free URL Decode & URL Encode Online Tool. All you need to do is enter the encoded URL into the box and press the decode button.
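Decoding can also be done in code. Here is a minimal sketch using unquote from Python’s standard urllib.parse module, applied to the encoded URLs from earlier in this article.

```python
from urllib.parse import unquote

print(unquote("https://example.com/hello%20world"))
# https://example.com/hello world

print(unquote("https://example.com/questions#what-is-the-question%3F"))
# https://example.com/questions#what-is-the-question?

# Decoding a whole URL at once can lose information: after decoding,
# an encoded '/' (%2F) looks exactly like a real path separator.
print(unquote("example.com/a%2Bb%2Fc"))  # example.com/a+b/c
```

The last example is the reverse of the a+b/c problem described above. When the difference between data and separators matters, it is safer to split the URL into its parts first and decode each part separately.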
Final Word: What Is URL Decoding And URL Encoding?
Encoding simply converts characters that can’t be used in a URL, for various reasons, into a standardized code. This prevents errors when retrieving resources from a server and allows sites to render properly in a web browser. Decoding is the opposite.
It is very important to recognize when a URL has been encoded, and to know how to use URL encoding and URL decoding to fix errors on your website.