Commit d595dcdd authored by Oliver Wiese

Add TLDExtract lib

parent db658fc2
@@ -20,7 +20,10 @@ PODS:
- KeychainAccess (4.1.0)
- mailcore2-ios (0.6.3)
- Onboard (2.3.1)
- Punycode (1.0.1)
- QAKit (0.0.7)
- TLDExtract (1.0.1):
- Punycode (~> 1.0)
- Travellib (0.0.1)
- VENTokenField (2.5.2):
- FrameAccessor (~> 1.0)
@@ -34,6 +37,7 @@ DEPENDENCIES:
- mailcore2-ios (from `https://github.com/MailCore/mailcore2.git`, branch `master`)
- Onboard (= 2.3.1)
- QAKit
- TLDExtract
- Travellib (from `https://git.imp.fu-berlin.de/jakobsbode/travellib.git`, branch `master`)
- VENTokenField (~> 2.0)
@@ -47,7 +51,9 @@ SPEC REPOS:
- GTMSessionFetcher
- KeychainAccess
- Onboard
- Punycode
- QAKit
- TLDExtract
- VENTokenField
EXTERNAL SOURCES:
@@ -76,10 +82,12 @@ SPEC CHECKSUMS:
KeychainAccess: 445e28864fe6d3458b41fa211bcdc39890e8bd5a
mailcore2-ios: 0637212770ea6b00d73de80b249b42ce937884ec
Onboard: b6871f25ac753175b2ab9a362fb2feb26a81a311
Punycode: ddbef4a269780c8f19a7e8deb01d9f101cb2ef86
QAKit: abefda5db53a58012fc8410d310e0ef217515607
TLDExtract: 63aa739e9b50052ef04e792927c43db62b2bb6b5
Travellib: 819ccc356d19fdaf6f0b3c89db069d34aa6c3ec9
VENTokenField: 5a19b838fb97f040e3d4c93f584b4adeaf3fc1ee
PODFILE CHECKSUM: 09ce4ff7b649af3f9a2f9ae1ce80b3f8243472ac
PODFILE CHECKSUM: 3c83d9ee1ce95d57b28284fce2e2affe4d15b1e3
COCOAPODS: 1.9.0
MIT License
Copyright (c) 2018 Gumob
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
[![Carthage compatible](https://img.shields.io/badge/Carthage-compatible-4BC51D.svg)](https://github.com/gumob/PunycodeSwift)
[![Version](http://img.shields.io/cocoapods/v/Punycode.svg)](http://cocoadocs.org/docsets/Punycode)
[![Platform](http://img.shields.io/cocoapods/p/Punycode.svg)](http://cocoadocs.org/docsets/Punycode)
[![Build Status](https://travis-ci.com/gumob/PunycodeSwift.svg?branch=master)](https://travis-ci.com/gumob/PunycodeSwift)
[![codecov](https://codecov.io/gh/gumob/PunycodeSwift/branch/master/graph/badge.svg)](https://codecov.io/gh/gumob/PunycodeSwift)
![Language](https://img.shields.io/badge/Language-Swift%204.2-orange.svg)
![Packagist](https://img.shields.io/packagist/l/doctrine/orm.svg)
# PunycodeSwift
<code>PunycodeSwift</code> is a pure Swift library that lets you encode and decode `punycoded` strings through String extensions.
## What is Punycode?
Punycode is a representation of Unicode with the limited ASCII character subset used for Internet host names. Using Punycode, host names containing Unicode characters are transcoded to a subset of ASCII consisting of letters, digits, and hyphen, which is called the Letter-Digit-Hyphen (LDH) subset. For example, München (German name for Munich) is encoded as Mnchen-3ya. [(Wikipedia)](https://en.wikipedia.org/wiki/Punycode)
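The LDH restriction described above can be checked with a few lines of Swift. This is a minimal sketch, not part of the library: `isLDH` is a hypothetical helper name introduced here for illustration, and it only tests the character set, not label length or hyphen placement rules.

```swift
/// Hypothetical helper (not part of Punycode): returns true when `label`
/// is non-empty and consists only of ASCII letters, digits, and hyphens.
func isLDH(_ label: String) -> Bool {
    guard !label.isEmpty else { return false }
    return label.unicodeScalars.allSatisfy { scalar in
        switch scalar.value {
        case 0x30...0x39, // digits 0-9
             0x41...0x5A, // letters A-Z
             0x61...0x7A, // letters a-z
             0x2D:        // hyphen
            return true
        default:
            return false
        }
    }
}

print(isLDH("Mnchen-3ya")) // true
print(isLDH("München"))    // false
```

The encoded form passes the check, while the original Unicode name does not, which is why host names are transcoded before they are used.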
## Requirements
- iOS 9.3 or later
- macOS 10.12 or later
- tvOS 12.0 or later
- Swift 4.2
<small>* No plans to support tvOS 11 or earlier for now</small>
## Installation
### Carthage
Add the following to your `Cartfile` and follow [these instructions](https://github.com/Carthage/Carthage#adding-frameworks-to-an-application).
```
github "gumob/PunycodeSwift"
```
### CocoaPods
To integrate Punycode into your project, add the following to your `Podfile`.
```ruby
platform :ios, '9.3'
use_frameworks!
pod 'Punycode'
```
## Usage
Encode and decode IDNA:
```swift
import Punycode
var sushi: String = "寿司"
sushi = sushi.idnaEncoded!
print(sushi) // xn--sprr0q
sushi = sushi.idnaDecoded!
print(sushi) // "寿司"
```
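IDNA encoding works label by label: the host name is split on `.`, and each label that needs encoding is Punycode-encoded and given the `xn--` prefix. The sketch below only illustrates that per-label split. The helper names are hypothetical, and it uses a plain non-ASCII test as a simplifying assumption, which may differ from the character set the library actually checks.

```swift
/// The ACE prefix that marks an encoded label.
let acePrefix = "xn--"

/// Hypothetical helper: returns the labels of `host` that contain at least
/// one non-ASCII scalar and would therefore need encoding.
func labelsNeedingEncoding(in host: String) -> [String] {
    return host
        .split(separator: ".")
        .filter { label in label.unicodeScalars.contains { !$0.isASCII } }
        .map(String.init)
}

/// Hypothetical helper: returns the labels of `host` that already carry
/// the ACE prefix and would be decoded.
func encodedLabels(in host: String) -> [String] {
    return host
        .split(separator: ".")
        .filter { $0.hasPrefix(acePrefix) }
        .map(String.init)
}

print(labelsNeedingEncoding(in: "www.寿司.co.jp"))
print(encodedLabels(in: "www.xn--sprr0q.co.jp"))
```

Labels that are already plain ASCII, such as `www`, `co`, and `jp`, pass through unchanged in both directions.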
Encode and decode Punycode directly:
```swift
import Punycode
var sushi: String = "寿司"
sushi = sushi.punycodeEncoded!
print(sushi) // sprr0q
sushi = sushi.punycodeDecoded!
print(sushi) // "寿司"
```
## Copyright
Punycode is released under the MIT license, which means you can modify it, redistribute it, or use it however you like.
//
// Created by kojirof on 2018-11-19.
// Copyright (c) 2018 Gumob. All rights reserved.
//
import Foundation
// For call-site convenience, everything is implemented on Substring; the String API is a thin wrapper around it
public extension Substring {
/// Returns new string in punycode encoding (RFC 3492)
///
/// - Returns: Punycode encoded string or nil if the string can't be encoded
var punycodeEncoded: String? {
return Punycode().encodePunycode(self)
}
/// Returns new string decoded from punycode representation (RFC 3492)
///
/// - Returns: Original string or nil if the string doesn't contain correct encoding
var punycodeDecoded: String? {
return Punycode().decodePunycode(self)
}
/// Returns new string containing IDNA-encoded hostname
///
/// - Returns: IDNA encoded hostname or nil if the string can't be encoded
var idnaEncoded: String? {
return Punycode().encodeIDNA(self)
}
/// Returns new string containing hostname decoded from IDNA representation
///
/// - Returns: Original hostname or nil if the string doesn't contain correct encoding
var idnaDecoded: String? {
return Punycode().decodedIDNA(self)
}
}
public extension String {
/// Returns new string in punycode encoding (RFC 3492)
///
/// - Returns: Punycode encoded string or nil if the string can't be encoded
var punycodeEncoded: String? {
return self[..<self.endIndex].punycodeEncoded
}
/// Returns new string decoded from punycode representation (RFC 3492)
///
/// - Returns: Original string or nil if the string doesn't contain correct encoding
var punycodeDecoded: String? {
return self[..<self.endIndex].punycodeDecoded
}
/// Returns new string containing IDNA-encoded hostname
///
/// - Returns: IDNA encoded hostname or nil if the string can't be encoded
var idnaEncoded: String? {
return self[..<self.endIndex].idnaEncoded
}
/// Returns new string containing hostname decoded from IDNA representation
///
/// - Returns: Original hostname or nil if the string doesn't contain correct encoding
var idnaDecoded: String? {
return self[..<self.endIndex].idnaDecoded
}
}
//
// Created by kojirof on 2018-11-19.
// Copyright (c) 2018 Gumob. All rights reserved.
//
import Foundation
/// Helpers
extension Substring {
internal func lastIndex(of element: Character) -> String.Index? {
var position: Index = endIndex
while position > startIndex {
position = self.index(before: position)
if self[position] == element {
return position
}
}
return nil
}
}
extension UnicodeScalar {
internal var isValid: Bool {
return value < 0xD800 || (value >= 0xE000 && value <= 0x10FFFF) /// Excludes the surrogate range; 0x10FFFF is the highest Unicode code point
}
}
//
// Created by kojirof on 2018-11-19.
// Copyright (c) 2018 Gumob. All rights reserved.
//
import Foundation
public class Punycode {
/// Punycode RFC 3492
/// See https://www.ietf.org/rfc/rfc3492.txt for standard details
private let base: Int = 36
private let tMin: Int = 1
private let tMax: Int = 26
private let skew: Int = 38
private let damp: Int = 700
private let initialBias: Int = 72
private let initialN: Int = 128
/// RFC 3492 specific
private let delimiter: Character = "-"
private let lowercase: ClosedRange<Character> = "a"..."z"
private let digits: ClosedRange<Character> = "0"..."9"
private let lettersBase: UInt32 = Character("a").unicodeScalars.first!.value
private let digitsBase: UInt32 = Character("0").unicodeScalars.first!.value
/// IDNA
private let ace: String = "xn--"
private func adaptBias(_ delta: Int, _ numberOfPoints: Int, _ firstTime: Bool) -> Int {
var delta: Int = delta
if firstTime {
delta /= damp
} else {
delta /= 2
}
delta += delta / numberOfPoints
var k: Int = 0
while delta > ((base - tMin) * tMax) / 2 {
delta /= base - tMin
k += base
}
return k + ((base - tMin + 1) * delta) / (delta + skew)
}
/// Maps a punycode character to index
private func punycodeIndex(for character: Character) -> Int? {
if lowercase.contains(character) {
return Int(character.unicodeScalars.first!.value - lettersBase)
} else if digits.contains(character) {
return Int(character.unicodeScalars.first!.value - digitsBase) + 26 /// count of lowercase letters range
} else {
return nil
}
}
/// Maps an index to corresponding punycode character
private func punycodeValue(for digit: Int) -> Character? {
guard digit < base else { return nil }
if digit < 26 {
return Character(UnicodeScalar(lettersBase.advanced(by: digit))!)
} else {
return Character(UnicodeScalar(digitsBase.advanced(by: digit - 26))!)
}
}
/// Decodes punycode encoded string to original representation
///
/// - Parameter punycode: Punycode encoding (RFC 3492)
/// - Returns: Decoded string or nil if the input cannot be decoded
public func decodePunycode(_ punycode: Substring) -> String? {
var n: Int = initialN
var i: Int = 0
var bias: Int = initialBias
var output: [Character] = []
var inputPosition = punycode.startIndex
let delimiterPosition: Substring.Index = punycode.lastIndex(of: delimiter) ?? punycode.startIndex
if delimiterPosition > punycode.startIndex {
output.append(contentsOf: punycode[..<delimiterPosition])
inputPosition = punycode.index(after: delimiterPosition)
}
var punycodeInput: Substring = punycode[inputPosition..<punycode.endIndex]
while !punycodeInput.isEmpty {
let oldI: Int = i
var w: Int = 1
var k: Int = base
while true {
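guard !punycodeInput.isEmpty else {
return nil /// Input ended in the middle of a variable-length integer
}
let character: Character = punycodeInput.removeFirst()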
guard let digit: Int = punycodeIndex(for: character) else {
return nil /// Failing on badly formatted punycode
}
i += digit * w
let t = k <= bias ? tMin : (k >= bias + tMax ? tMax : k - bias)
if digit < t {
break
}
w *= base - t
k += base
}
bias = adaptBias(i - oldI, output.count + 1, oldI == 0)
n += i / (output.count + 1)
i %= (output.count + 1)
guard n >= 0x80, let scalar = UnicodeScalar(n) else {
return nil
}
output.insert(Character(scalar), at: i)
i += 1
}
return String(output)
}
/// Encodes string to punycode (RFC 3492)
///
/// - Parameter input: Input string
/// - Returns: Punycode encoded string
public func encodePunycode(_ input: Substring) -> String? {
var n: Int = initialN
var delta: Int = 0
var bias: Int = initialBias
var output: String = ""
for scalar in input.unicodeScalars {
if scalar.isASCII {
let char = Character(scalar)
output.append(char)
} else if !scalar.isValid {
return nil /// Encountered a scalar out of acceptable range
}
}
var handled: Int = output.count
let basic: Int = handled
if basic > 0 {
output.append(delimiter)
}
while handled < input.unicodeScalars.count {
var minimumCodepoint: Int = 0x10FFFF
for scalar: Unicode.Scalar in input.unicodeScalars {
if scalar.value < minimumCodepoint && scalar.value >= n {
minimumCodepoint = Int(scalar.value)
}
}
delta += (minimumCodepoint - n) * (handled + 1)
n = minimumCodepoint
for scalar: Unicode.Scalar in input.unicodeScalars {
if scalar.value < n {
delta += 1
} else if scalar.value == n {
var q: Int = delta
var k: Int = base
while true {
let t = k <= bias ? tMin : (k >= bias + tMax ? tMax : k - bias)
if q < t {
break
}
guard let character: Character = punycodeValue(for: t + ((q - t) % (base - t))) else { return nil }
output.append(character)
q = (q - t) / (base - t)
k += base
}
guard let character: Character = punycodeValue(for: q) else { return nil }
output.append(character)
bias = adaptBias(delta, handled + 1, handled == basic)
delta = 0
handled += 1
}
}
delta += 1
n += 1
}
return output
}
/// Returns new string containing IDNA-encoded hostname
///
/// - Returns: IDNA encoded hostname or nil if the string can't be encoded
public func encodeIDNA(_ input: Substring) -> String? {
let parts: [Substring] = input.split(separator: ".")
var output: String = ""
for part: Substring in parts {
if output.count > 0 {
output.append(".")
}
if part.rangeOfCharacter(from: CharacterSet.urlHostAllowed.inverted) != nil {
guard let encoded: String = part.lowercased().punycodeEncoded else { return nil }
output += ace + encoded
} else {
output += part
}
}
return output
}
/// Returns new string containing hostname decoded from IDNA representation
///
/// - Returns: Original hostname or nil if the string doesn't contain correct encoding
public func decodedIDNA(_ input: Substring) -> String? {
let parts: [Substring] = input.split(separator: ".")
var output: String = ""
for part: Substring in parts {
if output.count > 0 {
output.append(".")
}
if part.hasPrefix(ace) {
guard let decoded: String = part.dropFirst(ace.count).punycodeDecoded else { return nil }
output += decoded
} else {
output += part
}
}
return output
}
}
MIT License
Copyright (c) 2018 Gumob
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
[![Carthage compatible](https://img.shields.io/badge/Carthage-compatible-4BC51D.svg)](https://github.com/gumob/TLDExtractSwift)
[![Version](http://img.shields.io/cocoapods/v/TLDExtract.svg)](http://cocoadocs.org/docsets/TLDExtract)
[![Platform](http://img.shields.io/cocoapods/p/TLDExtract.svg)](http://cocoadocs.org/docsets/TLDExtract)
[![Build Status](https://travis-ci.com/gumob/TLDExtractSwift.svg?branch=master)](https://travis-ci.com/gumob/TLDExtractSwift)
[![codecov](https://codecov.io/gh/gumob/TLDExtractSwift/branch/master/graph/badge.svg)](https://codecov.io/gh/gumob/TLDExtractSwift)
![Language](https://img.shields.io/badge/Language-Swift%204.2-orange.svg)
![Packagist](https://img.shields.io/packagist/l/doctrine/orm.svg)
# TLDExtract
<code>TLDExtract</code> is a pure Swift library that lets you get the public suffix of a domain name using [the Public Suffix List](http://www.publicsuffix.org). You can find alternatives for other languages at [publicsuffix.org](https://publicsuffix.org/learn/).<br/>
## What are domains?
Domain names are the unique, human-readable Internet addresses of websites. They are made up of three parts: a top-level domain (a.k.a. TLD), a second-level domain name, and an optional subdomain.
<img src="Metadata/domain-diagram.svg" alt="drawing" width="480" style="width:100%; max-width: 480px;"/>
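The three parts can be illustrated with a small, self-contained sketch. It assumes the public suffix is already known to the caller, whereas TLDExtract looks it up in the Public Suffix List. `splitHost` is a hypothetical helper written for this illustration and is not part of the library.

```swift
/// Hypothetical helper for illustration only: splits `host` into its parts,
/// given a public suffix that the caller already knows.
func splitHost(_ host: String, publicSuffix: String)
    -> (subDomain: String?, secondLevelDomain: String, topLevelDomain: String)? {
    let labels = host.split(separator: ".").map(String.init)
    let suffixLabels = publicSuffix.split(separator: ".").map(String.init)
    // The host must end with the suffix and have at least one more label.
    guard labels.count > suffixLabels.count,
          Array(labels.suffix(suffixLabels.count)) == suffixLabels else {
        return nil
    }
    // The label directly before the suffix is the second-level domain.
    let secondIndex = labels.count - suffixLabels.count - 1
    let sub = labels[..<secondIndex].joined(separator: ".")
    let subDomain: String? = sub.isEmpty ? nil : sub
    return (subDomain: subDomain,
            secondLevelDomain: labels[secondIndex],
            topLevelDomain: publicSuffix)
}

if let parts = splitHost("www.github.com", publicSuffix: "com") {
    print(parts.secondLevelDomain) // github
}
```

The hard part in practice is knowing that `co.jp` rather than `jp` is the suffix of a given host, which is what the Public Suffix List provides.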
## Features
- Extract the root domain, top-level domain, second-level domain, and subdomain from URLs and hostnames
- Foundation URL and String support
- IDNA support
- Multi platform support
## Requirements
- iOS 9.3 or later
- macOS 10.12 or later
- tvOS 12.0 or later
- Swift 4.2
- Python 2.7 or Python 3
<small>* No plans to support tvOS 11 or earlier for now</small>
## Installation
### Carthage
Add the following to your `Cartfile` and follow [these instructions](https://github.com/Carthage/Carthage#adding-frameworks-to-an-application).
```
github "gumob/TLDExtractSwift"
```
Do not forget to include Punycode.framework. Otherwise the application will fail to build.<br/>
<img src="Metadata/carthage-xcode-config.jpg" alt="drawing" width="480" style="width:100%; max-width: 480px;"/>
### CocoaPods
To integrate TLDExtract into your project, add the following to your `Podfile`.
```ruby
platform :ios, '9.3'
use_frameworks!
pod 'TLDExtract'
```
## Usage
### Initialization
Basic initialization. No error is thrown unless [the Public Suffix List on the server](https://publicsuffix.org/list/public_suffix_list.dat) is broken.
```swift
import TLDExtract
let extractor = try! TLDExtract()
```
A safer initialization that avoids errors by using the frozen Public Suffix List:<br/>
```swift
import TLDExtract
let extractor = try! TLDExtract(useFrozenData: true)
```
*The Public Suffix List is updated every time the framework is built. When `useFrozenData` is set to true, TLDExtract loads the data that was checked out from the repository instead.
### Extraction
#### Passing argument as String
Extract a URL:
```swift
let urlString: String = "https://www.github.com/gumob/TLDExtract"
guard let result: TLDResult = extractor.parse(urlString) else { return }
print(result.rootDomain) // Optional("github.com")
print(result.topLevelDomain) // Optional("com")
print(result.secondLevelDomain) // Optional("github")
print(result.subDomain) // Optional("www")
```
Extract a hostname:
```swift
let hostname: String = "gumob.com"
guard let result: TLDResult = extractor.parse(hostname) else { return }
print(result.rootDomain) // Optional("gumob.com")
print(result.topLevelDomain) // Optional("com")
print(result.secondLevelDomain) // Optional("gumob")
print(result.subDomain) // nil
```
Extract a Unicode hostname:
```swift
let hostname: String = "www.ラーメン.寿司.co.jp"
guard let result: TLDResult = extractor.parse(hostname) else { return }
print(result.rootDomain) // Optional("寿司.co.jp")
print(result.topLevelDomain) // Optional("co.jp")
print(result.secondLevelDomain) // Optional("寿司")
print(result.subDomain) // Optional("www.ラーメン")
```
Extract a punycoded hostname (Same as above):
```swift
let hostname: String = "www.xn--4dkp5a8a.xn--sprr0q.co.jp"
guard let result: TLDResult = extractor.parse(hostname) else { return }
print(result.rootDomain) // Optional("xn--sprr0q.co.jp")
print(result.topLevelDomain) // Optional("co.jp")
print(result.secondLevelDomain) // Optional("xn--sprr0q")
print(result.subDomain) // Optional("www.xn--4dkp5a8a")
```
#### Passing argument as Foundation URL
Extract a Unicode URL: <br/>
The `URL` type in the Foundation framework does not support Unicode URLs by default. You can use the URL extension as a workaround:
```swift
let urlString: String = "http://www.ラーメン.寿司.co.jp"
let url: URL = URL(unicodeString: urlString)
guard let result: TLDResult = extractor.parse(url) else { return }
print(result.rootDomain) // Optional("寿司.co.jp")
print(result.topLevelDomain) // Optional("co.jp")
print(result.secondLevelDomain) // Optional("寿司")
print(result.subDomain) // Optional("www.ラーメン")
```
Encode a URL by passing the argument as a percent-encoded string (same as above):
```swift
let urlString: String = "http://www.ラーメン.寿司.co.jp".addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)!
guard let url: URL = URL(string: urlString) else { return } // URL(string:) is failable
print(urlString) // http://www.%E3%83%A9%E3%83%BC%E3%83%A1%E3%83%B3.%E5%AF%BF%E5%8F%B8.co.jp
guard let result: TLDResult = extractor.parse(url) else { return }
print(result.rootDomain) // Optional("寿司.co.jp")
print(result.topLevelDomain) // Optional("co.jp")
print(result.secondLevelDomain) // Optional("寿司")
print(result.subDomain) // Optional("www.ラーメン")
```
Encode a Unicode URL by using the [`Punycode`](https://github.com/gumob/Punycode) framework:
```swift
import Punycode
let urlString: String = "http://www.ラーメン.寿司.co.jp".idnaEncoded!
guard let url: URL = URL(string: urlString) else { return } // URL(string:) is failable
print(urlString) // http://www.xn--4dkp5a8a.xn--sprr0q.co.jp
guard let result: TLDResult = extractor.parse(url) else { return }
print(result.rootDomain) // Optional("xn--sprr0q.co.jp")
print(result.topLevelDomain) // Optional("co.jp")
print(result.secondLevelDomain) // Optional("xn--sprr0q")
print(result.subDomain) // Optional("www.xn--4dkp5a8a")
```
## Copyright
TLDExtract is released under the MIT license, which means you can modify it, redistribute it, or use it however you like.
//
// Created by kojirof on 2018-11-17.
// Copyright (c) 2018 Gumob. All rights reserved.
//
import Foundation
internal extension Bundle {
class ClassForFramework {
}
static var current: Bundle {
return Bundle.init(for: ClassForFramework.self)
}
}
internal extension String {
var isComment: Bool {
return self.starts(with: "//")
}
}