# RushLink
A URL shortener and (maybe) a pastebin server for our #ru community.
## Building
- `go get -u github.com/go-bindata/go-bindata/...`
- `go generate ./...`
- `go build ./cmd/rushlink`
## Deploying
We recommend running `rushlink` behind a reverse proxy suitable for processing
HTTP requests, such as `nginx` or `haproxy`.
## Sample `nginx` config
```
server {
    location / {
        root /var/www/rushlink;
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host rushlink.local;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
    }
}
```
`rushlink` automatically detects whether `http` or `https` is used when
`X-Forwarded-Proto` is correctly set. Otherwise, pass `-root_url
https://rushlink.local` to the binary (e.g. in the `systemd` unit file).
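
The scheme detection described above can be sketched roughly as follows. This is not `rushlink`'s actual code: the helper name `rootURL`, its parameters, and the rule that an explicit `-root_url` value takes precedence over the header are assumptions made for illustration.

```go
package main

import "strings"

// rootURL is a hypothetical helper that returns the base URL the server
// should use in generated links. Assumption: a non-empty -root_url value
// wins; otherwise the scheme comes from X-Forwarded-Proto, with plain
// http as the fallback.
func rootURL(flagRootURL, forwardedProto, host string) string {
	if flagRootURL != "" {
		return strings.TrimRight(flagRootURL, "/")
	}
	scheme := "http"
	if strings.EqualFold(strings.TrimSpace(forwardedProto), "https") {
		scheme = "https"
	}
	return scheme + "://" + host
}
```

In a handler, `forwardedProto` would come from `r.Header.Get("X-Forwarded-Proto")` and `host` from `r.Host`.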
## Sample `systemd` unit file
As of 1fe9553cc9, `rushlink` expects its database and file store configuration
in environment variables.
```
[Install]
WantedBy=nginx.service
[Service]
Type=simple
User=rushlink
Group=nogroup
Environment=RUSHLINK_DATABASE_DRIVER=sqlite RUSHLINK_DATABASE_PATH=/var/lib/rushlink/rushlink.sqlite3 RUSHLINK_FILE_STORE_PATH=/var/lib/rushlink/filestore/
ExecStart=/var/lib/rushlink/rushlink -root_url https://rushlink.local
```
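
For orientation, here is a minimal sketch of how a Go program could read the three variables from the unit file above. The names `config`, `loadConfig` and `loadConfigFromEnv` are hypothetical, and treating all three variables as required is an assumption (a PostgreSQL setup may well need different settings); refer to the source for the real behaviour.

```go
package main

import (
	"errors"
	"os"
)

// config holds the three settings that appear in the unit file above.
type config struct {
	DatabaseDriver string
	DatabasePath   string
	FileStorePath  string
}

// loadConfig reads the settings through getenv, so that tests can pass a
// fake lookup instead of the process environment. Hypothetical helper;
// requiring all three variables is an assumption.
func loadConfig(getenv func(string) string) (config, error) {
	c := config{
		DatabaseDriver: getenv("RUSHLINK_DATABASE_DRIVER"),
		DatabasePath:   getenv("RUSHLINK_DATABASE_PATH"),
		FileStorePath:  getenv("RUSHLINK_FILE_STORE_PATH"),
	}
	if c.DatabaseDriver == "" || c.DatabasePath == "" || c.FileStorePath == "" {
		return config{}, errors.New("missing RUSHLINK_* environment variable")
	}
	return c, nil
}

// loadConfigFromEnv reads the settings from the process environment.
func loadConfigFromEnv() (config, error) {
	return loadConfig(os.Getenv)
}
```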
---
# Background
## Libraries
Use the Go standard library if the job can be done with it. As of now, these
are the exceptions:
- `github.com/gorilla/mux` provides useful stuff for routing requests.
- `github.com/gorilla/sessions` for session management.
- `go.etcd.io/bbolt` was our database driver before 1fe9553cc9.
- `github.com/pkg/errors` provides a [`Wrap`] function.
- `github.com/prometheus/client_golang/prometheus/...` has easy Prometheus
  functionality.

[`Wrap`]: https://godoc.org/github.com/pkg/errors#Wrap
## Database
Before 1fe9553cc9 we used [`go.etcd.io/bbolt`]. The database file was meant to
be the *only* file apart from our monolithic binary, holding all settings and
keys. Any read-only data resides in the binary file (possibly compressed).

Now, we use Gorm and support SQLite and PostgreSQL as backends.

We provide a migration binary in `cmd/rushlink-migrate-db/`. The destination
database is configured through environment variables, and the binary itself
expects a few flags. Refer to the source for the exact flags.

[`go.etcd.io/bbolt`]: https://github.com/etcd-io/bbolt
## Namespacing
All shortened URLs exist as a key on the root of the webserver, i.e. `/xd42`.
That means that we have to separate every other page with some kind of
namespace. Ideas:
- `/z/` reserved for flat pages.
- `/z/static/` reserved for "static files".
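
A minimal sketch of how this namespacing could be checked in Go, assuming the `/z/` ideas above are adopted. The names `routeKind` and `classify` are hypothetical and not taken from the code base.

```go
package main

import "strings"

// routeKind says which part of the URL space a request path belongs to.
type routeKind int

const (
	routeShortKey   routeKind = iota // e.g. /xd42
	routeFlatPage                    // anything else below /z/
	routeStaticFile                  // anything below /z/static/
)

// classify maps a request path to its namespace. The more specific
// prefix is checked first, because /z/static/ also starts with /z/.
func classify(path string) routeKind {
	switch {
	case strings.HasPrefix(path, "/z/static/"):
		return routeStaticFile
	case strings.HasPrefix(path, "/z/"):
		return routeFlatPage
	default:
		return routeShortKey
	}
}
```

Shortened keys never contain a `/`, so a key path such as `/z` or `/zAbc` cannot fall into the reserved namespace.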
## Paste types
Previously, I was planning to put pastes (not redirects) in a separate bucket
and below a URL namespace. Then when one was created, we would immediately
generate a redirect link.

Turns out it's easier though to just save a unit (we call it a "paste") which
can be of different types. For example:
- Redirect
- Paste
- File
- Image
- Hypothetical other stuff:
  - Dead-drop
  - [...]?
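
One way to model "a paste with a type" in Go is a small enum plus a struct. This is only a sketch: the identifiers `pasteType` and `paste`, the field names, and the string labels are made up for illustration, and the hypothetical types from the list are left out.

```go
package main

// pasteType enumerates the kinds of paste listed above. The zero value
// is deliberately left unused so that an unset type is detectable.
type pasteType int

const (
	typeRedirect pasteType = iota + 1
	typePaste
	typeFile
	typeImage
)

// String returns a lowercase label, or "unknown" for unlisted values.
func (t pasteType) String() string {
	switch t {
	case typeRedirect:
		return "redirect"
	case typePaste:
		return "paste"
	case typeFile:
		return "file"
	case typeImage:
		return "image"
	default:
		return "unknown"
	}
}

// paste is the single stored unit; Type decides how Content is served.
type paste struct {
	Key     string
	Type    pasteType
	Content []byte
}
```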
## Shorten keys and collisions
First of all: a sextet is a value of 6 bits.

For generating keys, we will initially generate a random value of 4 sextets,
where the first bit is set to `0`. If this collides with an existing key, we
will generate a new one made out of 5 sextets, and set the prefix bits to
`0b10`. We will keep doing this until we no longer have any collisions.

To get proper-looking keys, we format the key to characters using the
base64url alphabet described in [RFC4648, par. 5]. The encoded value will
be saved in the database.

[RFC4648, par. 5]: https://tools.ietf.org/html/rfc4648#section-5
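
The scheme above can be sketched in Go as follows. This is an illustration, not the actual implementation: the function names and the `exists` callback are hypothetical, and continuing the prefix pattern beyond 5 sextets (`0b110` for 6 sextets, and so on) is an assumption, since only the first two steps are spelled out above.

```go
package main

import (
	"crypto/rand"
	"errors"
	"io"
)

// alphabet is the base64url alphabet from RFC 4648, section 5.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

const minSextets = 4

// newKey reads one random byte per sextet from r, keeps the low 6 bits,
// and overwrites the leading bits with the prefix: (sextets-4) one bits
// followed by a zero bit. That is 0 for 4 sextets and 0b10 for 5.
func newKey(r io.Reader, sextets int) (string, error) {
	if sextets < minSextets {
		return "", errors.New("key too short")
	}
	buf := make([]byte, sextets)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	ones := sextets - minSextets
	for bit := 0; bit <= ones; bit++ {
		mask := byte(0x20) >> (bit % 6) // 0x20 is the top bit of a sextet
		if bit < ones {
			buf[bit/6] |= mask
		} else {
			buf[bit/6] &^= mask
		}
	}
	for i := range buf {
		buf[i] = alphabet[buf[i]&0x3f]
	}
	return string(buf), nil
}

// generateKey grows the key by one sextet after every collision.
func generateKey(r io.Reader, exists func(string) bool) (string, error) {
	for sextets := minSextets; ; sextets++ {
		key, err := newKey(r, sextets)
		if err != nil {
			return "", err
		}
		if !exists(key) {
			return key, nil
		}
	}
}

// generateUniqueKey uses the operating system's secure random source.
func generateUniqueKey(exists func(string) bool) (string, error) {
	return generateKey(rand.Reader, exists)
}
```

Because of the prefix, keys of different lengths start with characters from disjoint ranges: a 4-character key starts with `A`-`Z` or `a`-`f`, a 5-character key with `g`-`v`.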
## UI design
As is tradition in a lot of URL-shortener/pastebin-like services, we will put
everything in a single `<pre>` tag, and if possible, just serve `text/plain`.
A good example is <https://0x0.st>.

The reason we would use `text/html` instead of `text/plain` is basically
form submissions and JavaScript. Our main API should be cURL, but it would be
useful if users could also use the website and/or drag-and-drop files and URLs.

On the other hand, using `text/plain` saves us *so much effort*, because we
don't have to do any HTML/CSS/JavaScript. We have native terminal support, etc.

The best thing would probably be to do both, and correctly honor the `Accept`
header that the client sends. We can still wrap the plain-text page in a single
`<pre>` to keep it easy for ourselves.
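
A rough sketch of what "doing both" could look like in Go. The helpers `wantsHTML` and `render` are hypothetical, and the `Accept` parsing is deliberately simplified: it ignores q-values and only checks whether `text/html` is listed at all.

```go
package main

import (
	"html"
	"strings"
)

// wantsHTML reports whether an Accept header value lists text/html.
// Simplification: parameters such as ;q=0.8 are ignored.
func wantsHTML(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		mediaType := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if strings.EqualFold(mediaType, "text/html") {
			return true
		}
	}
	return false
}

// render returns the Content-Type and body for a plain-text page. The
// HTML variant is the same text, escaped and wrapped in a single <pre>.
func render(accept, body string) (contentType, out string) {
	if wantsHTML(accept) {
		return "text/html; charset=utf-8", "<pre>" + html.EscapeString(body) + "</pre>"
	}
	return "text/plain; charset=utf-8", body
}
```

cURL sends `Accept: */*` by default and would get `text/plain`, while browsers list `text/html` explicitly and would get the wrapped page.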
## Retention
- If we can, we don't want to have user accounts. We store the sessions
  forever, and store a user's data in there, without having to collect personal
  data in any way.
- URL-shortening links will be retained forever, unless the submitter
  revokes one, in which case it will be replaced by a `410 Gone` page[*].
- Retention of pastes is an unsolved problem[*].

[*] In any case, we are going to comply with all European laws and reasonable
requests for deletion.
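
As a small illustration of the `410 Gone` behaviour, the status decision for a short link could look like the sketch below. The `link` type, its `Revoked` field, and the choice of `307 Temporary Redirect` for live links are assumptions, not something this document or the code base prescribes.

```go
package main

import "net/http"

// link is a hypothetical stored redirect.
type link struct {
	Target  string
	Revoked bool
}

// redirectStatus picks the HTTP status for a looked-up link:
// 404 if the key was never issued, 410 if the submitter revoked it,
// and a redirect otherwise (307 is an assumption).
func redirectStatus(l *link) int {
	switch {
	case l == nil:
		return http.StatusNotFound
	case l.Revoked:
		return http.StatusGone
	default:
		return http.StatusTemporaryRedirect
	}
}
```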
## Privacy
We will try as hard as possible to not store any data about our users, and will
only provide any data when we have the legal obligation to do so.