Commits
- Commit:
df6ca41da36c3f617cbbf3302ab120721ebfcfd2
- From:
- Omar Polo <op@omarpolo.com>
- Date:
IRI support
This extends the URI parser so that it supports full IRIs
(Internationalized Resource Identifiers, RFC3987). Some areas of it
can be improved, but it's a start.
Note: we assume that IRIs are UTF-8 encoded.
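To illustrate what the UTF-8 assumption implies for a parser, here is a minimal sketch, not the code from this commit. The names `utf8_decode` and `is_ucschar` are hypothetical; the ranges are the `ucschar` production of RFC3987. It decodes one UTF-8 sequence and checks whether the resulting code point is one of the extra characters that an IRI allows on top of a URI.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (len bytes
 * available).  Returns the sequence length, or 0 if it is malformed,
 * truncated, overlong, a surrogate, or above U+10FFFF.
 */
static size_t
utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	size_t i, n;
	uint32_t min;

	if (len == 0)
		return 0;
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) {
		n = 2; *cp = s[0] & 0x1F; min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {
		n = 3; *cp = s[0] & 0x0F; min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {
		n = 4; *cp = s[0] & 0x07; min = 0x10000;
	} else
		return 0;

	if (len < n)
		return 0;
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;	/* not a continuation byte */
		*cp = (*cp << 6) | (s[i] & 0x3F);
	}
	if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
		return 0;
	return n;
}

/* Hypothetical helper: the "ucschar" ranges from RFC3987. */
static int
is_ucschar(uint32_t cp)
{
	if (cp >= 0xA0 && cp <= 0xD7FF)
		return 1;
	if (cp >= 0xF900 && cp <= 0xFDCF)
		return 1;
	if (cp >= 0xFDF0 && cp <= 0xFFEF)
		return 1;
	/* planes 1 to 13: everything but the last two code points */
	if (cp >= 0x10000 && cp <= 0xDFFFD)
		return (cp & 0xFFFF) <= 0xFFFD;
	if (cp >= 0xE1000 && cp <= 0xEFFFD)
		return 1;
	return 0;
}
```

With these helpers, the two bytes 0xC3 0xA9 decode to U+00E9, which is an acceptable `ucschar`, while an overlong sequence such as 0xC0 0x80 is rejected by the decoder before any range check.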
- Commit:
33d32d1fd66a577f22f3f33f238e8dac44ec9995
- From:
- Omar Polo <op@omarpolo.com>
- Date:
implement a valid RFC3986 (URI) parser
Up until now I used a "poor man's" approach: the URI parser was barely
a parser; it tried to extract the path from the request, with some minor
checking, and that's all. This is obviously not RFC3986-compliant.
The new RFC3986 (URI) parser should be fully compliant. It may accept
some invalid URIs, but it shouldn't reject or mis-parse valid ones. (In
particular, the rule for the path is much more relaxed in this parser
than it is in the RFC text.)
One difference from RFC3986 is that we don't even try to parse the
(optional) userinfo part of a URI: following the Gemini spec, we treat
it as an error.
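As an illustration of the userinfo rule, here is a minimal sketch under stated assumptions; it is not how the parser in this commit works. The function name `has_userinfo` is hypothetical, and the input is assumed to be an absolute URI of the form `scheme://authority...`. The authority ends at the first '/', '?' or '#', and any '@' before that point means userinfo is present.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if the authority part of an absolute
 * URI contains userinfo (anything followed by '@'), 0 otherwise.
 */
static int
has_userinfo(const char *uri)
{
	const char *p, *auth, *end;

	/* the scheme ends at the first ':'; "//" must follow it */
	if ((p = strchr(uri, ':')) == NULL || p[1] != '/' || p[2] != '/')
		return 0;
	auth = p + 3;

	/* the authority runs up to the path, query or fragment */
	end = auth + strcspn(auth, "/?#");

	return memchr(auth, '@', (size_t)(end - auth)) != NULL;
}
```

A caller following the Gemini spec would reject the request when this returns 1, for example for `gemini://user@example.com/`, while an '@' that appears later in the path, as in `gemini://example.com/a@b`, is left alone.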
A further caveat is that %2F in the path part of the URI is
indistinguishable from a literal '/': this is NOT conforming, but given
the scope and use of gmid, I don't see how else to treat a %2F sequence
in the path (reject the URI?).
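The %2F caveat can be reproduced with any decoder that rewrites percent-escapes before the path is split into segments. The following is a minimal sketch, assuming an in-place decoder; `hexval` and `pct_decode` are hypothetical names, not functions taken from gmid.

```c
#include <stddef.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int
hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode %XX escapes in place.  Returns 0 on
 * success, -1 on a malformed escape or an encoded NUL byte.
 */
static int
pct_decode(char *s)
{
	char *out = s;
	int hi, lo;

	for (; *s != '\0'; s++) {
		if (*s != '%') {
			*out++ = *s;
			continue;
		}
		if ((hi = hexval((unsigned char)s[1])) == -1 ||
		    (lo = hexval((unsigned char)s[2])) == -1)
			return -1;
		if (hi == 0 && lo == 0)
			return -1;	/* %00 would truncate the string */
		*out++ = (char)(hi << 4 | lo);
		s += 2;
	}
	*out = '\0';
	return 0;
}
```

After decoding, `/foo%2Fbar` and `/foo/bar` are the same byte string, so code that later splits the path on '/' cannot tell them apart. Keeping the distinction would require splitting into segments first and decoding each segment afterwards.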