# parseUri

`parseUri` is a mighty but tiny JavaScript URI/URN/URL parser that splits any URI into its parts (all of which are optional). Its combination of accuracy, comprehensiveness, and brevity is unrivaled (1KB min/gzip, with no dependencies).

## Breaking changes

Version 2 was a major, breaking change that probably requires updating URI part names in your code. See details in the [release notes](https://github.com/slevithan/parseuri/releases/tag/v2.0.0) and compare results on the [demo page](https://slevithan.github.io/parseuri/demo/?compareV1=true&friendlyMode=true). Version 3 was a small update published on npm as pure ESM.

## Compared to the `URL` constructor

`parseUri` includes several advantages over the built-in [`URL`](https://developer.mozilla.org/en-US/docs/Web/API/URL):

* It gives you many additional properties (`authority`, `userinfo`, `subdomain`, `domain`, `tld`, `resource`, `directory`, `filename`, `suffix`) that aren’t available from `URL`.
* `URL` throws e.g. if not given a protocol, and in many other cases of valid (but not supported) and invalid URIs. `parseUri` makes a best case effort even with partial or invalid URIs and is extremely good with edge cases.
* `URL`’s rules don’t allow correctly handling many non-web protocols. For example, `URL` doesn’t throw on any of `'git://localhost:1234'`, `'ssh://myid@192.168.1.101'`, or `'t2ab:///path/entry'`, but it also doesn’t get their details correct since it treats everything after `<non-web-protocol>:` up to `?` or `#` as part of the `pathname`.
* `parseUri` includes a “friendly” parsing mode (in addition to its default mode) that handles human-friendly URLs like `'example.com/file.html'` as expected.
* `parseUri` supports providing a list of second-level domains that should be treated as part of the top-level domain (ex: `co.uk`).

Conversely, `parseUri` is single-purpose and doesn’t apply normalization.

The [demo page](https://slevithan.github.io/parseuri/demo/?urlStandard=true) allows easily comparing with `URL`’s results.

## Results / URI parts

Returns an object with 20 URI parts as properties plus `queryParams`, a [`URLSearchParams`](https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams) object that includes methods `get(key)`, `getAll(key)`, etc.

Here’s an example of what each part contains:

```text
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  href                                                    │
├────────────────────────────────────────────────────────────────┬─────────────────────────────────────────┤
│                             origin                             │                resource                 │
├──────────┬─┬───────────────────────────────────────────────────┼──────────────────────┬───────┬──────────┤
│ protocol │ │                     authority                     │       pathname       │ query │ fragment │
│          │ ├─────────────────────┬─────────────────────────────┼───────────┬──────────┤       │          │
│          │ │      userinfo       │            host             │ directory │ filename │       │          │
│          │ ├──────────┬──────────┼──────────────────────┬──────┤           ├─┬────────┤       │          │
│          │ │ username │ password │       hostname       │ port │           │ │ suffix │       │          │
│          │ │          │          ├───────────┬──────────┤      │           │ ├────────┤       │          │
│          │ │          │          │ subdomain │  domain  │      │           │ │        │       │          │
│          │ │          │          │           ├────┬─────┤      │           │ │        │       │          │
│          │ │          │          │           │    │ tld │      │           │ │        │       │          │
"  https   ://   user   :   pass   @ sub1.sub2 . dom.com  : 8080   /p/a/t/h/  a.html    ?  q=1  #   hash   "
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

> If this chart doesn’t appear correctly, view it on [GitHub](https://github.com/slevithan/parseuri/blob/main/README.md#results--uri-parts).

`parseUri` additionally supports IPv4 and IPv6 addresses, URNs, and many edge cases not shown here. See the extensive [tests](https://slevithan.github.io/parseuri/spec/).

## Parsing modes

`parseUri` has two parsing modes: default and friendly. The default mode follows official URI rules. Friendly mode doesn’t require `'<protocol>:'`, `':'`, or `'//'` to signal the start of an authority, which allows handling human-friendly URLs like `'example.com/file.html'` as expected. This change has several effects:

- It allows starting a URI with an authority (as noted above).
- It precludes friendly mode from properly handling relative paths (that don’t start from root `'/'`) such as `'dir/file.html'`.
- Since the web protocols `http`, `https`, `ws`, `wss`, and `ftp` don’t require `'//'`, this also means that friendly mode extends this behavior to non-web protocols.

## Examples

```js
let uri = parseUri('https://a.b.example.com:80/@user/a/my.img.jpg?q=x&q=#hash');
uri.protocol // → 'https'
uri.host // → 'a.b.example.com:80'
uri.hostname // → 'a.b.example.com'
uri.subdomain // → 'a.b'
uri.domain // → 'example.com'
uri.port // → '80'
uri.resource // → '/@user/a/my.img.jpg?q=x&q=#hash'
uri.pathname // → '/@user/a/my.img.jpg'
uri.directory // → '/@user/a/'
uri.filename // → 'my.img.jpg'
uri.suffix // → 'jpg'
uri.query // → 'q=x&q='
uri.fragment // → 'hash'
uri.queryParams.get('q') // → 'x'
uri.queryParams.getAll('q') // → ['x', '']
uri.queryParams.get('not-present') // → null
uri.queryParams.getAll('not-present') // → []
// Also available: href, origin, authority, userinfo, username, password, tld

// Relative path (not starting from root /)
uri = parseUri('dir/file.html?q=x');
uri.hostname // → ''
uri.directory // → 'dir/'
uri.filename // → 'file.html'
uri.query // → 'q=x'

// Friendly mode allows starting with an authority
uri = parseUri('example.com/file.html', 'friendly');
uri.hostname // → 'example.com'
uri.directory // → '/'
uri.filename // → 'file.html'

// IPv4 address
uri = parseUri('ssh://myid@192.168.1.101');
uri.protocol // → 'ssh'
uri.username // → 'myid'
uri.hostname // → '192.168.1.101'
uri.domain // → ''

// IPv6 address
uri = parseUri('https://[2001:db8:85a3::7334]:80?q=x');
uri.hostname // → '[2001:db8:85a3::7334]'
uri.port // → '80'
uri.domain // → ''
uri.query // → 'q=x'

// Mailto
uri = parseUri('mailto:first@my.com,second@my.com?subject=Hey&body=Sign%20me%20up!');
uri.protocol // → 'mailto'
uri.username // → ''
uri.hostname // → ''
uri.pathname // → 'first@my.com,second@my.com'
uri.query // → 'subject=Hey&body=Sign%20me%20up!'
uri.queryParams.get('body') // → 'Sign me up!'

// Mailto in friendly mode
uri = parseUri('mailto:me@my.com?subject=Hey', 'friendly');
uri.protocol // → 'mailto'
uri.username // → 'me'
uri.hostname // → 'my.com'
uri.pathname // → ''

/* Also supports e.g.:
- https://[2001:db8:85a3::7334%en1]/ipv6-with-zone-identifier
- git://localhost:1234
- file:///path/file
- tel:+1-800-555-1212
- urn:uuid:c5542ab6-3d96-403e-8e6b-b8bb52f48d9a?q=x
*/
```

Use the [demo page](https://slevithan.github.io/parseuri/demo/) to easily test and compare results.

## Install

```bash
npm install parseuri
```

## Use

```js
import { parseUri, setTlds } from 'parseuri';
```

In browsers:

```html
<script src="https://cdn.jsdelivr.net/npm/parseuri/dist/parseuri.min.js"></script>
<script>
  console.log(parseUri('https://example.com/'));
  // If needed, use `parseUri.setTlds`
</script>
```
