118 Commits

Author SHA1 Message Date
Juan Rodriguez Donado 2c55ebf406 Merge pull request #16 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2026-05-05 08:27:37 +02:00
sjdonado 2a4264e4c5 chore: update parsers 2026-05-05 00:38:32 +00:00
Juan Rodriguez Donado adfebe9d63 Merge pull request #15 from sjdonado/automated/update-parsers 2026-02-25 10:23:53 +01:00
sjdonado 45bf499d21 chore: update parsers 2026-02-20 00:24:21 +00:00
Juan Rodriguez Donado 1552b5ce09 Merge pull request #14 from sjdonado/automated/update-parsers 2026-02-06 21:53:16 +01:00
sjdonado 81f3c95c2b chore: update parsers 2026-02-05 00:26:49 +00:00
Juan Rodriguez Donado 3776621fe9 Merge pull request #13 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2026-01-13 05:41:34 +01:00
sjdonado 0d68b0d6e1 chore: update parsers 2026-01-11 00:23:25 +00:00
Juan Rodriguez Donado 8048277f1d Merge pull request #12 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2025-12-23 20:22:58 +01:00
sjdonado dcff88f55e chore: update parsers 2025-12-20 00:20:09 +00:00
Juan Rodriguez Donado f7add0116e Merge pull request #11 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2025-12-19 06:53:05 +01:00
sjdonado 5f702e69c9 chore: update parsers 2025-12-14 00:22:55 +00:00
Juan Rodriguez Donado 2feeff70bc Merge pull request #9 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2025-11-13 21:34:01 +01:00
sjdonado 1bb42684c3 chore: update README 2025-11-13 21:33:08 +01:00
sjdonado b6e7c45c80 chore: update parsers 2025-11-09 00:21:26 +00:00
sjdonado 7d275685b4 fix: missing paths to dockerignore 2025-11-02 15:25:30 +01:00
sjdonado 44dab3ca5a chore: update README 2025-11-02 11:40:50 +01:00
Juan Rodriguez Donado b8c1269d6e Merge pull request #8 from sjdonado/automated/update-parsers
chore: Update UA regexes and GeoLite2 database
2025-11-02 11:32:03 +01:00
sjdonado 9fcd478f86 chore: simplify docs structure 2025-11-02 11:31:17 +01:00
sjdonado 7c6d67c0c7 chore: update parsers 2025-11-02 10:03:33 +00:00
Juan Rodriguez Donado 353cf68852 Merge pull request #7 from sjdonado/refactor/benchmark-native-and-updates
Refactor/benchmark native and updates
2025-11-02 10:59:57 +01:00
sjdonado fef015ce53 feat: openapi swagger ui 2025-11-02 10:57:02 +01:00
sjdonado 80a59094f9 refactor: publish workflow replace manifest with multi-platform build in one step 2025-11-02 10:29:55 +01:00
sjdonado 6f5ad76718 refactor: Dockerfile replace debian with alpine 2025-11-02 10:29:08 +01:00
sjdonado 52497235b9 refactor: benchmark.cr configure a separate database 2025-11-02 10:24:06 +01:00
sjdonado aec073b696 fix: tests type errors 2025-11-02 10:16:19 +01:00
sjdonado 6e587f0176 ci: update parsers workflow 2025-11-02 08:55:34 +01:00
sjdonado 0c89fad713 refactor: improve click buffer performance 2025-11-02 08:48:10 +01:00
sjdonado 046b15bdce refactor: replace docker benchmark with native ps 2025-11-02 08:31:24 +01:00
sjdonado d0412d802b chore: update parsers 04.2025 2025-04-07 19:27:41 +02:00
sjdonado c9e7ad1d99 refactor: rename update-data to update-parsers 2025-04-07 19:26:23 +02:00
sjdonado 9fa7142ea7 feat: improve performance with dynamic batch capacity 2025-03-23 22:04:38 +01:00
Juan Rodriguez c539662235 chore: Update README.md 2025-03-23 16:18:44 +01:00
sjdonado a68259a0f4 fix: click_channel decrease buffer size and processor batch size 2025-03-23 13:33:17 +01:00
sjdonado 136e4d44c9 refactor: link controller buffered channel to hold click data 2025-03-23 13:30:34 +01:00
sjdonado 660d536618 fix: replace user_agent_parser with UserAgent + pre compiled regexes 2025-03-23 12:27:52 +01:00
sjdonado e1d3ec480d refactor: IpLookup lazy load reader 2025-03-23 12:25:19 +01:00
sjdonado 0180f36a62 refacgor: move cors handler to middlewares 2025-03-23 12:10:12 +01:00
sjdonado 4500c89904 fix: convert IpLookup to struct and remove reader instance 2025-03-23 11:55:34 +01:00
sjdonado 3df4642c90 chore: update README 2025-03-20 20:53:37 +01:00
sjdonado e67ed7165b feat: click controller performance improvements
- return tuple directly from the block
- avoid to parse remote_address in the main thread
- avoid to replace headers, add only
- avoid separate variables for one-single use
2025-03-20 20:43:15 +01:00
sjdonado 4ae6ef39d5 refactor: replace ClickController class with struct 2025-03-20 20:42:51 +01:00
sjdonado f2b63c00a3 chore: run benchmark 2025-03-20 13:03:15 +01:00
sjdonado 6a151301b8 refactor: replace click tracker with direct spawn 2025-03-20 13:02:38 +01:00
sjdonado d1be283318 refactor: rewrite benchmark in cr 2025-03-20 12:01:13 +01:00
sjdonado bf717dc38f fix: return inserted_link on create 2025-03-20 08:50:55 +01:00
sjdonado 73ee4c4479 fix: user id and link id int64 types 2025-03-20 08:13:07 +01:00
sjdonado 38f9cfd48e fix: replace uuid columns with rowid aliases 2025-03-20 07:34:33 +01:00
sjdonado e14fc266bb chore: cleanup 2025-03-19 06:11:04 +01:00
sjdonado 917a79c536 chore: bump version to 1.5.2 2025-03-18 11:29:48 +01:00
sjdonado 2c951fd834 chore: update API.md response status codes 2025-03-18 11:23:28 +01:00
sjdonado 3983102caa refactor: click tracker remove unused imports 2025-03-18 11:07:34 +01:00
sjdonado fba2039efc refactor: request thread safety context 2025-03-18 11:04:50 +01:00
sjdonado b22381cb7f fix: idx_links_slug_optimized avoid to duplicate id already included by rowid 2025-03-18 10:56:38 +01:00
sjdonado eb0db67358 refactor: LinkController remove unnecessary overhead 2025-03-18 09:39:39 +01:00
sjdonado 1f41d13667 refactor: ClickTracker service 2025-03-18 08:53:03 +01:00
sjdonado 222e408a16 chore: update SETUP.md dokku same network section 2025-03-18 07:58:06 +01:00
sjdonado 67c27d3056 feat: performance improvement replace spawn with Async::Future.execute 2025-03-18 07:51:51 +01:00
sjdonado 001caffba6 feat: performance improvement replace slug index with covering index 2025-03-18 07:46:18 +01:00
sjdonado 006d99a9e7 chore: update SETUP.md 2025-03-17 09:52:22 +01:00
sjdonado 6bd0d195bf refactor: link get raw query 2025-03-17 09:46:41 +01:00
sjdonado 68e00e7c85 feat: database pool_size 2025-03-17 09:40:42 +01:00
sjdonado 4aefd3ff06 chore: update docs 2025-03-17 07:49:57 +01:00
sjdonado bbc900cd05 refactor: update benchmark to use httpbin.org instead of example.com 2025-03-17 07:49:26 +01:00
sjdonado 21f53f257c chore: upgrade version to 1.5.1 2025-03-16 18:56:33 +01:00
sjdonado 8ca6a450a3 chore: update API documentation 2025-03-16 18:55:18 +01:00
sjdonado 58d8d52194 test: update cursor-based pagination test cases 2025-03-16 18:55:02 +01:00
sjdonado 7d617bbb30 feat: api links/:id/clicks endpoint 2025-03-16 18:34:46 +01:00
sjdonado cd6dfa345b feat: links all cursor pagination 2025-03-16 18:30:52 +01:00
sjdonado 1967cc2c22 fix: get remote address cloudfare proxy 2025-03-16 18:04:36 +01:00
sjdonado 60ebac7150 chore: update API docs 2025-03-16 14:22:32 +01:00
sjdonado a066b5e5ab chore: upgrade version to 1.5.0 2025-03-16 13:41:54 +01:00
sjdonado cd8e2433a5 feat(cli): --update-data download UA Parser and GeoLite2 2025-03-16 13:40:33 +01:00
sjdonado b538a379d1 chore: update README 2025-03-16 11:54:48 +01:00
sjdonado 8cab7a51ad feat: country ip lookup 2025-03-16 11:42:01 +01:00
sjdonado ece74226d4 feat: add country to clicks 2025-03-16 11:26:20 +01:00
sjdonado d26aa2f18a feat: links redirect forward client ip 2025-03-16 10:20:01 +01:00
sjdonado ce2f73dfe3 fix(ci): release pipeline determine version 2025-03-16 10:18:51 +01:00
sjdonado 30fb539289 ci: speed up build with crystal arm64 binary 2025-02-07 19:37:54 +01:00
sjdonado 66fd6db3c2 ci: docker hub release tags 2025-02-07 19:05:31 +01:00
sjdonado 93a91cd76e ci: publish github workflow separate version tags from master 2025-02-07 18:53:55 +01:00
sjdonado 1d1444234b ci: publish workflow extract tag version 2025-02-07 18:44:49 +01:00
sjdonado 49ac63210e chore: bump version 2025-02-07 17:51:55 +01:00
sjdonado 702491cb39 chore: improve documentation local development guidelines 2025-02-07 17:50:10 +01:00
Juan Rodriguez d55dbe0471 Merge pull request #4 from sjdonado/chore/documentation
Chore/documentation
2025-02-07 17:34:25 +01:00
sjdonado 55969b03b5 chore: fix docs broken links 2025-02-07 17:31:05 +01:00
sjdonado 70a036e158 chore: cleanup README 2025-02-07 17:23:10 +01:00
sjdonado 6ade7d295b chore: CONTRIBUTING and CODE_OF_CONDUCT 2025-02-07 17:22:54 +01:00
sjdonado e6ae133449 chore: separate API reference from README 2025-02-07 17:22:41 +01:00
sjdonado d3706a8778 chore: separate SETUP docs from README 2025-02-07 17:22:22 +01:00
sjdonado a271e7c35d chore: bump version 2024-11-27 22:51:52 +01:00
sjdonado a46a50b429 chore: update README with new ENV variables 2024-11-27 22:51:20 +01:00
sjdonado dc8c359bfc test: admin env variables cases 2024-11-27 22:51:18 +01:00
sjdonado dfb6b10caf feat: setup admin user via env variables 2024-11-27 22:27:17 +01:00
Juan Rodriguez 3fa30b3a32 chore: Update README.md 2024-10-27 13:01:40 +01:00
Juan Rodriguez 80ed6033d1 chore: update README.md 2024-10-27 12:56:06 +01:00
Juan Rodriguez 4640522d5d chore: refactor create_links with curl parallel 2024-10-27 12:55:38 +01:00
Juan Rodriguez 848232cc11 fix: create links pipe ipc 2024-10-27 12:01:38 +01:00
Juan Rodriguez 98dedc4494 refactor: powered_by_header kemal config 2024-10-27 11:07:59 +01:00
Juan Rodriguez e6f64ea026 chore: update benchmark with bombardier 2024-10-27 11:07:23 +01:00
Juan Rodriguez ea71d3825e fix: generate slug by user + check existing link on update 2024-07-31 22:09:24 +02:00
Juan Rodriguez afa9b33568 tests: update error messages assertions 2024-07-31 21:50:35 +02:00
Juan Rodriguez a93189411b fix: test suite drop database before all 2024-07-31 21:39:24 +02:00
Juan Rodriguez 98f103f5cf fix: url validate format 2024-07-31 21:38:57 +02:00
Juan Rodriguez 6fc48dae83 refactor: replace slug generation with CRC32 + base62 2024-07-31 21:38:36 +02:00
Juan Rodriguez d039add340 chore: bump version 2024-07-31 08:08:38 +02:00
Juan Rodriguez 0214d6f46d ci: fix publish extract version step 2024-07-31 08:08:15 +02:00
Juan Rodriguez 37e14ec2f8 refactor: sha256 slug generation 2024-07-31 08:07:08 +02:00
Juan Rodriguez a85d5a8c73 chore: bump version 2024-07-14 22:30:15 +02:00
Juan Rodriguez 80cebe3357 fix: update slug size after 2nd attempt 2024-07-14 22:30:15 +02:00
Juan Rodriguez 451a5fbf0f fix: link email validate format regex 2024-07-14 14:33:34 +02:00
Juan Rodriguez aeb6d1164b fix: remove X-Powered-By header 2024-07-14 14:33:19 +02:00
Juan Rodriguez 2f14cd82dd fix: error handling override kemal default response 2024-07-14 11:19:59 +02:00
Juan Rodriguez faedd0bc7a fix: missing errors content type 2024-07-14 09:26:38 +02:00
Juan Rodriguez 1d207fae64 fix: log level error for production env 2024-07-14 09:26:25 +02:00
Juan Rodriguez a71f345f66 refactor: APP_URL + DATABASE_URL default values 2024-07-14 08:46:29 +02:00
Juan Rodriguez 7cc6c1197f refactor: auto run migrations on startup 2024-07-14 08:46:11 +02:00
Juan Rodriguez 115bbf7366 refactor: reduce docker image size 2024-07-12 23:18:06 +02:00
53 changed files with 2957 additions and 768 deletions
+39 -1
@@ -1,6 +1,44 @@
.git
.gitignore
.github
/bin/
/bit
/cli
/benchmark
*.dwarf
*.o
*.a
# Dependencies cache
/.shards/
/bruno/
/lib/.shards/
/spec/
# Database files (should be mounted as volumes)
/sqlite/
*.db
*.db-shm
*.db-wal
# Logs and temporary files
*.log
# Documentation
/docs/
*.md
README.md
CODE_OF_CONDUCT.md
CONTRIBUTING.md
LICENSE
DOCKER_MIGRATION.md
# Development environment
.env*
.editorconfig
# Docker files (not needed inside image)
Dockerfile
docker-compose.yml
.dockerignore
+33
@@ -0,0 +1,33 @@
name: Deploy API Documentation
on:
push:
branches:
- master
paths:
- 'docs/openapi.yaml'
- '.github/workflows/deploy-docs.yml'
workflow_dispatch:
permissions:
contents: write
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Generate Swagger UI
uses: Legion2/swagger-ui-action@v1
with:
output: swagger-ui
spec-file: docs/openapi.yaml
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: swagger-ui
+20 -27
@@ -1,4 +1,4 @@
name: Publish Docker image
name: Publish Docker images
on:
push:
@@ -8,23 +8,22 @@ on:
types: [published]
jobs:
push_to_registry:
name: Push Docker image to Docker Hub
build-and-push:
name: Build and Push Multi-Platform
runs-on: ubuntu-latest
permissions:
packages: write
contents: read
attestations: write
id-token: write
packages: write
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v3
@@ -32,27 +31,21 @@ jobs:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract version from shard.yml
id: extract_version
- name: Determine version
id: version
run: |
VERSION=$(grep -oP 'version:\s*\K\S+' shard.yml)
VERSION=$(echo $VERSION | tr -d '\n\r')
echo "RELEASE_TAG=$VERSION" >> $GITHUB_ENV
if [ "${{ github.event_name }}" = "release" ]; then
echo "version=${{ github.event.release.tag_name }}" >> $GITHUB_OUTPUT
else
echo "version=latest" >> $GITHUB_OUTPUT
fi
- name: Build and push image
id: push
uses: docker/build-push-action@v5.0.0
- name: Build and push multi-platform image
uses: docker/build-push-action@v5
with:
context: .
push: true
platforms: linux/amd64,linux/arm64
tags: |
sjdonado/bit:latest
${{ github.event_name == 'release' && env.RELEASE_TAG && 'sjdonado/bit:${{ env.RELEASE_TAG }}' || '' }}
- name: Attest
uses: actions/attest-build-provenance@v1
id: attest
with:
subject-name: sjdonado/bit
subject-digest: ${{ steps.push.outputs.digest }}
push: true
tags: sjdonado/bit:${{ steps.version.outputs.version }}
cache-from: type=gha
cache-to: type=gha,mode=max
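The "Determine version" step above picks the image tag from the triggering event: a published release uses its tag name, anything else falls back to `latest`. A minimal shell sketch of that rule for checking it locally; the function name `resolve_image_tag` and the sample tag `v1.5.2` are illustrative assumptions, not part of the workflow:

```shell
# Hypothetical helper mirroring the workflow's tag selection.
# $1 = event name (e.g. "release", "push"), $2 = release tag name (may be empty)
resolve_image_tag() {
  event_name=$1
  release_tag=$2
  if [ "$event_name" = "release" ]; then
    printf '%s\n' "$release_tag"
  else
    printf '%s\n' "latest"
  fi
}

echo "sjdonado/bit:$(resolve_image_tag release v1.5.2)"  # prints sjdonado/bit:v1.5.2
echo "sjdonado/bit:$(resolve_image_tag push '')"         # prints sjdonado/bit:latest
```

With this rule a single build produces one tag per run, so a release build does not also move `latest` unless the workflow is triggered by a push as well.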
+63
@@ -0,0 +1,63 @@
name: Update Parsers
on:
schedule:
# Run every two weeks on Sunday at 00:00 UTC (1st and 3rd Sunday of each month)
- cron: '0 0 1-7,15-21 * 0'
workflow_dispatch: # Allow manual trigger
jobs:
update-parsers:
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Install Crystal
uses: crystal-lang/install-crystal@v1
- name: Install dependencies
run: shards install
- name: Build CLI
run: crystal build scripts/cli.cr -o cli
- name: Update parsers
run: ./cli --update-parsers
- name: Check for changes
id: changes
run: |
if [ -n "$(git status --porcelain)" ]; then
echo "has_changes=true" >> $GITHUB_OUTPUT
else
echo "has_changes=false" >> $GITHUB_OUTPUT
fi
- name: Create Pull Request
if: steps.changes.outputs.has_changes == 'true'
uses: peter-evans/create-pull-request@v6
with:
token: ${{ secrets.GITHUB_TOKEN }}
commit-message: 'chore: update parsers'
title: 'chore: Update UA regexes and GeoLite2 database'
body: |
## Automated Parser Update
This PR updates the following data files:
- User Agent parsing regexes
- GeoLite2 database
**Triggered by**: Scheduled workflow (runs every 2 weeks)
**Date**: ${{ github.event.repository.updated_at }}
Please review the changes and merge if everything looks good.
branch: automated/update-parsers
delete-branch: true
labels: |
dependencies
automated
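One caveat about the schedule above: under standard cron semantics, when both the day-of-month and day-of-week fields are restricted, a job runs when either field matches. If that applies here, `0 0 1-7,15-21 * 0` may fire on every Sunday and on every day in the ranges 1-7 and 15-21, not only on the 1st and 3rd Sunday. Assuming that behavior, a guard step could enforce the intended cadence. A minimal sketch; the function `is_first_or_third_sunday` is hypothetical and not part of the workflow:

```shell
# Hypothetical guard: succeeds only on the 1st or 3rd Sunday of a month.
# $1 = day of month (1-31, a leading zero is tolerated)
# $2 = ISO weekday as printed by `date +%u` (1 = Monday ... 7 = Sunday)
is_first_or_third_sunday() {
  day=${1#0}
  weekday=$2
  [ "$weekday" -eq 7 ] || return 1
  if [ "$day" -ge 1 ] && [ "$day" -le 7 ]; then return 0; fi
  if [ "$day" -ge 15 ] && [ "$day" -le 21 ]; then return 0; fi
  return 1
}

# Example: decide for today's date whether the update should run.
if is_first_or_third_sunday "$(date +%d)" "$(date +%u)"; then
  echo "run update"
else
  echo "skip"
fi
```

A simpler alternative would be to schedule on Sundays only (`0 0 * * 0`) and let a guard like this skip the off weeks.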
+3 -2
@@ -1,4 +1,3 @@
/docs/
/lib/
/bin/
/.shards/
@@ -8,4 +7,6 @@
/sqlite/
.env.production
resource_usage.txt
*.log
bit
cli
+132
@@ -0,0 +1,132 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
[INSERT CONTACT METHOD].
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available
at [https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
+75
@@ -0,0 +1,75 @@
# Contributing Guidelines
We welcome contributions from the community! Please follow these guidelines to help maintain consistency and quality in the project.
## Code of Conduct
This project adheres to the [Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). By participating, you agree to uphold its terms.
## How to Contribute
### 1. Fork the Repository
Click the "Fork" button at the top-right of the [repository page](https://github.com/sjdonado/bit).
### 2. Clone Your Fork
```bash
git clone https://github.com/YOUR_USERNAME/bit.git
cd bit
```
### 3. Create a Feature Branch
```bash
git checkout -b feat/your-feature-name
```
### or for bug fixes:
```bash
git checkout -b fix/issue-description
```
### 4. Develop Your Changes
- Check [Local Development](docs/SETUP.md#local-development) guidelines
- Ensure changes match the project scope
- Write clear commit messages
- Include tests for new functionality
- Update documentation when applicable
### 5. Commit Changes
```bash
git commit -am 'Add descriptive commit message'
```
### 6. Push to GitHub
```bash
git push origin your-branch-name
```
### 7. Create a Pull Request
1. Go to the [original repository](https://github.com/sjdonado/bit)
2. Click "New Pull Request"
3. Select your fork and branch
4. Add a clear description including:
- Purpose of changes
- Related issues (if applicable)
- Testing performed
## Pull Request Guidelines
- Keep PRs focused on a single feature/bugfix
- Ensure all tests pass
- Update documentation in the same PR
- Use descriptive titles (e.g., "Add URL validation" not "Update code")
- Reference related issues using #issue-number
## Reporting Issues
When opening an issue, please include:
1. Description of the problem
2. Steps to reproduce
3. Expected vs actual behavior
4. Environment details (OS, Crystal version, etc)
For feature requests:
- Explain the problem you're trying to solve
- Suggest potential implementations
## License
By contributing, you agree that your contributions will be licensed under the [license](LICENSE).
+25 -10
@@ -1,28 +1,43 @@
FROM alpine:edge as base
FROM alpine:edge AS build
ENV ENV=production
WORKDIR /usr/src/app
RUN apk update && apk add --no-cache \
RUN apk add --no-cache \
crystal \
shards \
openssl-dev \
yaml-dev \
sqlite-dev \
openssl-dev
libevent-dev \
tzdata
COPY . .
RUN shards install --production
RUN shards build --release --no-debug --progress --stats
FROM alpine:latest AS runtime
FROM base AS build
ENV ENV=production
WORKDIR /usr/src/app
COPY . .
RUN apk add --no-cache \
gc-dev \
pcre2 \
libevent \
sqlite-libs \
openssl \
yaml \
gmp \
libgcc \
tzdata
RUN shards install
RUN shards build --progress
RUN mkdir -p sqlite
FROM base AS release
RUN mkdir -p /usr/src/app/sqlite
COPY --from=build /usr/src/app/db db
COPY --from=build /usr/src/app/data data
COPY --from=build /usr/src/app/bin /usr/local/bin
COPY --from=build /usr/src/app/data /usr/local/data
EXPOSE 4000/tcp
CMD ["bit"]
+51 -258
@@ -1,300 +1,93 @@
[![Docker Pulls](https://img.shields.io/docker/pulls/sjdonado/bit.svg)](https://hub.docker.com/repository/docker/sjdonado/bit/general)
[![Docker Stars](https://img.shields.io/docker/stars/sjdonado/bit.svg)](https://hub.docker.com/repository/docker/sjdonado/bit/general)
[![Docker Image Size](https://img.shields.io/docker/image-size/sjdonado/bit/latest)](https://hub.docker.com/repository/docker/sjdonado/bit/general)
[![Docker Pulls](https://img.shields.io/docker/pulls/sjdonado/bit.svg)](https://hub.docker.com/r/sjdonado/bit)
[![Docker Image Size](https://img.shields.io/docker/image-size/sjdonado/bit/latest)](https://hub.docker.com/r/sjdonado/bit)
# Benchmark
## Features
```shell
$ ./benchmark.sh
Semaphore initialized with 2666 slots.
Setup...
[+] Running 2/2
✔ Network bit_default Created 0.0s
✔ Container bit-app-1 Started 0.2s
2024-07-12T18:41:20.962052Z INFO - micrate: Migrating db, current version: 0, target: 20240711224103
2024-07-12T18:41:20.965729Z INFO - micrate: OK 20240512214223_create_links.sql
2024-07-12T18:41:20.969198Z INFO - micrate: OK 20240512225208_add_slug_index_to_links.sql
2024-07-12T18:41:20.973136Z INFO - micrate: OK 20240513115731_create_users.sql
2024-07-12T18:41:20.975525Z INFO - micrate: OK 20240513130054_add_api_key_index_to_users.sql
2024-07-12T18:41:20.979195Z INFO - micrate: OK 20240711224103_create_clicks.sql
Captured API Key: Z01Qk4M5E0xhggZUCdQAPw
Waiting for database to be ready...
Creating 1000 short links...
Created short link 100/1000
Created short link 200/1000
Created short link 300/1000
Created short link 400/1000
Created short link 500/1000
Created short link 600/1000
Created short link 700/1000
Created short link 800/1000
Created short link 900/1000
Created short link 1000/1000
Accessing each link 10 times concurrently...
****Results****
Average Memory Usage: 16.36 MiB
Average CPU Usage: 0%
Average Response Time: 12.37 µs
```
- Minimal tracking setup: Country, browser, OS, referer. No cookies or persistent tracking mechanisms are used beyond what's available from a basic client's request.
- Includes `X-Forwarded-For` header.
- Multiple users are supported via API key authentication. Create, list and delete keys via the [CLI](docs/SETUP.md#cli).
- Easy to extend, Ruby on Rails inspired setup.
- Auto update UA regexes and GeoLite2 database.
# Self-hosted
## Why bit?
- Run via docker-compose
**Fast:** **11k req/sec**, latency 11ms, 40MiB avg memory usage (100k requests using 125 connections, [benchmark](docs/SETUP.md#benchmark)).
```bash
docker-compose up
**Lightweight:** Minimal dependencies, image size under 20 MiB, memory usage under 60 MiB at peak.
docker-compose exec -it app migrate
docker-compose exec -it app cli --create-user=Admin
```
**Self-hosted:** [Dokku](docs/SETUP.md#dokku), [Docker Compose](docs/SETUP.md#docker-compose).
- Run via docker cli
**Production ready:** Feature-complete by design, simple and reliable without unnecessary bloat. Bug fixes will continue, but new features aren't planned.
## Run It Anywhere
All images available on [Docker Hub](https://hub.docker.com/r/sjdonado/bit/tags).
### Docker
```bash
docker run \
--name bit \
-p 4000:4000 \
-e ENV="production" \
-e DATABASE_URL="sqlite3://./sqlite/data.db?journal_mode=wal&synchronous=normal&foreign_keys=true" \
-e DATABASE_URL="sqlite3://./sqlite/data.db" \
-e APP_URL="http://localhost:4000" \
-e ADMIN_NAME="Admin" \
-e ADMIN_API_KEY=$(openssl rand -base64 32) \
sjdonado/bit
docker exec -it bit migrate
docker exec -it bit cli --create-user=Admin
# Create a new user
# docker exec -it bit cli --create-user=Admin
```
- Dokku
### Docker Compose
```bash
docker-compose up
# Optional: Generate an api key
# docker-compose exec -it app cli --create-user=Admin
```
### Dokku
- Dockerfile
```dockerfile
FROM sjdonado/bit
```
- Over ssh
```bash
dokku apps:create bit
dokku domains:set bit bit.donado.co
dokku domains:set bit bit.yourdomain.com
dokku letsencrypt:enable bit
dokku storage:ensure-directory bit-sqlite
dokku storage:mount bit /var/lib/dokku/data/storage/bit-sqlite:/usr/src/app/sqlite/
dokku config:set bit DATABASE_URL="sqlite3://./sqlite/data.db?journal_mode=wal&synchronous=normal&foreign_keys=true" APP_URL=https://bit.donado.co
dokku config:set bit DATABASE_URL="sqlite3://./sqlite/data.db" APP_URL=https://bit.yourdomain.com ADMIN_NAME=Admin ADMIN_API_KEY=$(openssl rand -base64 32)
dokku ports:add bit http:80:4000
dokku ports:add bit https:443:4000
dokku run bit migrate
dokku run bit cli --create-user=Admin
# Create a new user
# dokku run bit cli --create-user=Admin
```
# Usage
## API Endpoints
1. **Ping the API**
- **Endpoint**: `/api/ping`
- **HTTP Method**: GET
- **Description**: Ping the API to check if it's running
- **Payload**: -
- **Response Example**:
```json
{
"message": "pong"
}
```
2. **Retrieve a link by its slug**
- **Endpoint**: `/:slug`
- **HTTP Method**: GET
- **Description**: Retrieve a link by its slug
- **Payload**: -
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"data": {
"id": "84f0c7a4-8c4e-4665-b676-cb9c5e40f1db",
"refer": "http://localhost:4000/3wP4BQ",
"origin": "https://monocuco.donado.co",
"clicks": [
{
"id": "730e2202-58f9-478c-a24c-f1c561df6716",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0",
"language": "en-US",
"browser": "Firefox",
"os": "Mac OS X",
"source": "Unknown",
"created_at": "2024-07-12T19:25:22Z"
}
]
}
}
```
3. **Retrieve all links**
- **Endpoint**: `/api/links`
- **HTTP Method**: GET
- **Description**: Retrieve all links
- **Payload**: -
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"data": [
{
"id": "84f0c7a4-8c4e-4665-b676-cb9c5e40f1db",
"refer": "http://localhost:4000/3wP4BQ",
"origin": "https://monocuco.donado.co",
"clicks": [
{
"id": "730e2202-58f9-478c-a24c-f1c561df6716",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0",
"language": "en-US",
"browser": "Firefox",
"os": "Mac OS X",
"source": "Unknown",
"created_at": "2024-07-12T19:25:22Z"
}
]
}
]
}
```
4. **Retrieve a link by its ID**
- **Endpoint**: `/api/links/:id`
- **HTTP Method**: GET
- **Description**: Retrieve a link by its ID
- **Payload**: -
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"data": {
"id": "84f0c7a4-8c4e-4665-b676-cb9c5e40f1db",
"refer": "http://localhost:4000/3wP4BQ",
"origin": "https://monocuco.donado.co",
"clicks": [
{
"id": "730e2202-58f9-478c-a24c-f1c561df6716",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0",
"language": "en-US",
"browser": "Firefox",
"os": "Mac OS X",
"source": "Unknown",
"created_at": "2024-07-12T19:25:22Z"
}
]
}
}
```
5. **Create a new link**
- **Endpoint**: `/api/links`
- **HTTP Method**: POST
- **Description**: Create a new link
- **Payload**:
```json
{
"url": "https://example.com"
}
```
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"data": {
"id": "84f0c7a4-8c4e-4665-b676-cb9c5e40f1db",
"refer": "http://localhost:4000/3wP4BQ",
"origin": "https://monocuco.donado.co/test",
"clicks": []
}
}
```
6. **Update an existing link by its ID**
- **Endpoint**: `/api/links/:id`
- **HTTP Method**: PUT
- **Description**: Update an existing link by its ID
- **Payload**:
```json
{
"url": "https://newexample.com"
}
```
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"data": {
"id": "84f0c7a4-8c4e-4665-b676-cb9c5e40f1db",
"refer": "http://localhost:4000/3wP4BQ",
"origin": "https://newexample.com",
"clicks": []
}
}
```
7. **Delete a link by its ID**
- **Endpoint**: `/api/links/:id`
- **HTTP Method**: DELETE
- **Description**: Delete a link by its ID
- **Payload**: -
- **Headers**: `X-Api-Key`
- **Response Example**:
```json
{
"message": "Link deleted"
}
```
## CLI
```
Usage: ./cli [options]
Options:
--create-user=NAME Create a new user with the given name
--list-users List all users
--delete-user=USER_ID Delete a user by ID
```
# Development
1. **Installation**
### Dokku (subnetwork)
Recommended for lower latency communication (no host network traversal)
```bash
brew tap amberframework/micrate
brew install micrate
dokku network:create bit-net
dokku network:set bit attach-post-create bit-net
dokku network:set myapp attach-post-create bit-net
```
```bash
shards run migrate
shards run bit
```
## Documentation
- [API Reference](https://sjdonado.github.io/bit/)
- [Local Development](docs/SETUP.md)
2. **Generate the `X-Api-Key`**
```bash
shards run cli -- --create-user=Admin
```
# Run tests
```bash
ENV=test crystal spec
```
# Contributing
1. Fork it (<https://github.com/sjdonado/bit/fork>)
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request
## Contributing
Found an issue or have a suggestion? Please follow our [contribution guidelines](CONTRIBUTING.md).
+9
@@ -1,6 +1,15 @@
require "log"
ENV["ENV"] ||= "development"
ENV["APP_URL"] ||= "http://localhost:4000"
ENV["DATABASE_URL"] ||= "sqlite3://./sqlite/data.db?journal_mode=wal&synchronous=normal&foreign_keys=true"
{% if env("ENV") != "production" %}
require "dotenv"
Dotenv.load ".env.#{ENV["ENV"]}" # File must exist in non-production!
{% end %}
{% if env("ENV") == "production" %}
Log.setup(:error)
{% end %}
+2
@@ -3,3 +3,5 @@ require "kemal"
Kemal.config.env = ENV["ENV"]? || "development"
Kemal.config.port = ENV["PORT"]?.try(&.to_i) || 4000
Kemal.config.host_binding = ENV["HOST"]? || "0.0.0.0"
Kemal.config.powered_by_header = false
+45
@@ -0,0 +1,45 @@
module App::Controllers
struct ClickController
include App::Models
include App::Lib
include App::Services
def self.redirect_handler
->(env : HTTP::Server::Context) {
link_id, url = Database.raw_query("SELECT id, url FROM links WHERE slug = (?) LIMIT 1", env.params.url["slug"]) do |result|
result.move_next ? {result.read(Int64), result.read(String)} : nil
end || raise App::NotFoundException.new(env)
remote_address = env.request.headers["Cf-Connecting-Ip"]? || env.request.remote_address.to_s
# Send redirect immediately
env.response.status_code = 301
env.response.headers.add("Location", url)
env.response.headers.add("X-Forwarded-For", remote_address)
if user_agent = env.request.headers["User-Agent"]?
env.response.headers.add("User-Agent", user_agent)
end
# Non-blocking click processing
spawn do
begin
client_ip = IpLookup.ip_from_address(remote_address)
family, _, _, os = UserAgent.parse(env.request.headers["User-Agent"]? || "")
click = App::Models::Click.new
click.link_id = link_id
click.country = client_ip ? IpLookup.country(client_ip) : nil
click.user_agent = env.request.headers["User-Agent"]?
click.browser = family
click.os = os.try &.[0]
click.referer = env.request.headers["Referer"]?.try { |r| URI.parse(r).host rescue r } || env.params.query["utm_source"]? || "Direct"
Database.insert(click)
rescue ex
Log.error { "Click tracking error: #{ex.message}" }
end
end
}
end
end
end
+97 -118
@@ -1,174 +1,153 @@
require "uuid"
require "user_agent_parser"
UserAgent.load_regexes(File.read("data/regexes.yaml"))
require "../lib/controller.cr"
module App::Controllers::Link
class Create < App::Lib::BaseController
module App::Controllers
class LinkController < App::Lib::BaseController
include App::Models
include App::Lib
include App::Services
def call(env)
user = env.get("user").as(User)
body = parse_body(env, ["url"])
def initialize(@env : HTTP::Server::Context)
super(@env)
end
def create
body = parse_body(["url"])
url = body["url"].to_s
query = Database::Query.where(url: url, user_id: user.id.as(String)).limit(1)
query = Database::Query.where(url: url, user_id: current_user_id).limit(1)
existing_link = Database.all(Link, query, preload: [:clicks]).first?
if existing_link
response = {"data" => App::Serializers::Link.new(existing_link)}
return response.to_json
return render_json({"data" => App::Serializers::Link.new(existing_link)})
end
link = Link.new
link.id = UUID.v4.to_s
link.url = url
link.user = user
loop do
slug = Random::Secure.urlsafe_base64(4).gsub(/[^a-zA-Z0-9]/, "")
if !Database.get_by(Link, slug: slug)
link.slug = slug
break
end
end
link.user_id = current_user_id
link.slug = SlugService.shorten_url(url, current_user_id)
changeset = Database.insert(link)
if !changeset.valid?
raise App::UnprocessableEntityException.new(env, map_changeset_errors(changeset.errors))
raise App::UnprocessableEntityException.new(@env, map_changeset_errors(changeset.errors))
end
link.clicks = [] of App::Models::Click
response = {"data" => App::Serializers::Link.new(link)}
inserted_link = Database.get!(Link, changeset.instance.id)
response.to_json
render_json({"data" => App::Serializers::Link.new(inserted_link)}, 201)
end
end
class Index < App::Lib::BaseController
include App::Models
include App::Lib
def list_all
limit, cursor = pagination_params
def call(env)
slug = env.params.url["slug"]
query = Database::Query.where(user_id: current_user_id)
query = query.where("id < ?", cursor) if cursor
query = query.order_by("id DESC").limit(limit + 1)
link = Database.get_by(Link, slug: slug)
raise App::NotFoundException.new(env) if !link
links = Database.all(Link, query)
spawn do
user_agent_str = env.request.headers["User-Agent"]? || "Unknown"
user_agent = user_agent_str != "Unknown" ? UserAgent.new(user_agent_str) : nil
language_header = env.request.headers["Accept-Language"]? || "Unknown"
language = language_header.split(',').first.split(';').first
referer = env.request.headers["Referer"]?
click = Click.new
click.id = UUID.v4.to_s
click.link = link
click.language = language
click.user_agent = user_agent_str
click.browser = user_agent ? user_agent.family : "Unknown"
click.os = user_agent ? (user_agent.os.try &.family || "Unknown") : "Unknown"
click.source = referer ? URI.parse(referer).host : "Unknown"
changeset = Database.insert(click)
if changeset.errors.any?
Log.error { "Logging click event failed: #{changeset.errors}" }
end
end
env.response.status_code = 301
env.response.headers["Location"] = link.url!
env.response.headers["Content-Type"] = "text/html"
env.response.print("Redirecting...")
paginated_response(links, limit) { |link| App::Serializers::Link.new(link) }
end
end
class All < App::Lib::BaseController
include App::Models
include App::Lib
def get
link_id = @env.params.url["id"].to_i64
def call(env)
user = env.get("user").as(User)
query = Database::Query.where(id: link_id, user_id: current_user_id).limit(1)
link = Database.all(Link, query).first?
raise App::NotFoundException.new(@env) if link.nil?
query = Database::Query.where(user_id: user.id.as(String))
links = Database.all(Link, query, preload: [:clicks])
clicks_query = Database::Query.where(link_id: link_id)
.order_by("id DESC")
.limit(100)
link.clicks = Database.all(Click, clicks_query)
response = {"data" => links.map { |link| App::Serializers::Link.new(link) }}
response.to_json
render_json({"data" => App::Serializers::Link.new(link)})
end
end
class Get < App::Lib::BaseController
include App::Models
include App::Lib
def list_clicks
link_id = @env.params.url["id"].to_i64
def call(env)
user = env.get("user").as(User)
link_id = env.params.url["id"]
# Verify link exists and belongs to user
link_query = Database::Query.where(id: link_id, user_id: current_user_id).limit(1)
link = Database.all(Link, link_query).first?
raise App::NotFoundException.new(@env) if link.nil?
query = Database::Query.where(id: link_id.as(String), user_id: user.id.as(String)).limit(1)
link = Database.all(Link, query, preload: [:clicks]).first?
limit, cursor = pagination_params
raise App::NotFoundException.new(env) if link.nil?
query = Database::Query.where(link_id: link_id)
query = query.where("id < ?", cursor) if cursor
query = query.order_by("id DESC").limit(limit + 1)
response = {"data" => App::Serializers::Link.new(link)}
response.to_json
clicks = Database.all(Click, query)
paginated_response(clicks, limit) { |click| App::Serializers::Click.new(click) }
end
end
class Update < App::Lib::BaseController
include App::Models
include App::Lib
def call(env)
user = env.get("user").as(User)
id = env.params.url["id"]
body = parse_body(env, ["url"])
def update
id = @env.params.url["id"].to_i64
body = parse_body(["url"])
new_url = body["url"].to_s
query = Database::Query.where(id: id).limit(1)
link = Database.all(Link, query, preload: [:clicks]).first?
raise App::NotFoundException.new(env) if link.nil?
raise App::ForbiddenException.new(env) if link.user_id != user.id
raise App::NotFoundException.new(@env) if link.nil?
raise App::ForbiddenException.new(@env) if link.user_id != current_user_id
link.url = body["url"].to_s
# Check for existing URL
existing_query = Database::Query.where(url: new_url, user_id: current_user_id).limit(1)
if Database.all(Link, existing_query).first?
raise App::UnprocessableEntityException.new(@env, { "url" => ["URL already exists"] })
end
link.url = new_url
link.slug = SlugService.shorten_url(new_url, current_user_id)
changeset = Database.update(link)
if !changeset.valid?
raise App::UnprocessableEntityException.new(env, map_changeset_errors(changeset.errors))
raise App::UnprocessableEntityException.new(@env, map_changeset_errors(changeset.errors))
end
response = {"data" => App::Serializers::Link.new(link)}
response.to_json
render_json({"data" => App::Serializers::Link.new(link)})
end
end
class Delete < App::Lib::BaseController
include App::Models
include App::Lib
def call(env)
user = env.get("user").as(User)
id = env.params.url["id"]
def delete
id = @env.params.url["id"].to_i64
link = Database.get(Link, id)
raise App::NotFoundException.new(env) if !link
raise App::NotFoundException.new(@env) if !link
raise App::ForbiddenException.new(@env) if link.user_id != current_user_id
if link.user_id != user.id
raise App::ForbiddenException.new(env)
end
result = Database.raw_exec("DELETE FROM links WHERE id = (?)", link.id) # tempfix: Database.delete does not work
result = Database.raw_exec("DELETE FROM links WHERE id = (?)", link.id)
if result.rows_affected == 0
raise App::UnprocessableEntityException.new(env, { "id" => ["Row delete failed"] })
raise App::UnprocessableEntityException.new(@env, { "id" => ["Row delete failed"] })
end
env.response.status_code = 204
@env.response.status_code = 204
end
private def current_user : User
@env.get("user").as(User)
end
private def current_user_id : Int64
current_user.id.as(Int64)
end
private def pagination_params
limit = (@env.params.query["limit"]? || "100").to_i32
cursor = @env.params.query["cursor"]?
{limit, cursor}
end
private def paginated_response(items, limit)
has_more = items.size > limit
items = items[0...limit] if has_more
next_cursor = has_more ? items.last.id : nil
render_json({
"data" => items.map { |item| yield item },
"pagination" => {
"has_more" => has_more,
"next" => next_cursor
}
})
end
end
end
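The `pagination_params` and `paginated_response` helpers above fetch `limit + 1` rows so that the extra row signals whether another page exists. A minimal sketch of that idea over an in-memory array follows; `Row` and `paginate` are hypothetical names introduced only for illustration, and the sketch assumes an integer cursor, which may differ from how the controller parses the `cursor` query parameter.

```crystal
# Hypothetical stand-in for a database row; not part of the repository.
record Row, id : Int64

# Keyset pagination sketch: newest first, rows strictly below the cursor,
# and one extra row fetched to detect whether more pages remain.
def paginate(rows : Array(Row), limit : Int32, cursor : Int64? = nil)
  sorted = rows.sort_by { |row| -row.id }
  if c = cursor
    sorted = sorted.select { |row| row.id < c }
  end

  page = sorted.first(limit + 1)
  has_more = page.size > limit
  page = page.first(limit) if has_more
  next_cursor = has_more ? page.last.id : nil

  {page.map(&.id), next_cursor}
end

rows = (1_i64..5_i64).map { |id| Row.new(id) }

first_ids, first_cursor = paginate(rows, 2)
second_ids, second_cursor = paginate(rows, 2, first_cursor)
last_ids, last_cursor = paginate(rows, 2, second_cursor)

puts first_ids.inspect
puts second_ids.inspect
puts last_ids.inspect
```

Passing the returned cursor back unchanged walks the collection page by page; a `nil` cursor in the result marks the final page.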
+8 -5
@@ -1,10 +1,13 @@
require "../lib/controller.cr"
module App::Controllers::Ping
class Get < App::Lib::BaseController
def call(env)
response = {"pong" => "ok"}
response.to_json
module App::Controllers
class PingController < App::Lib::BaseController
def initialize(@env : HTTP::Server::Context)
super(@env)
end
def ping
render_json({data: "pong"})
end
end
end
+29 -13
@@ -1,29 +1,45 @@
module App::Lib
abstract class BaseController
def map_changeset_errors(errors)
protected getter env : HTTP::Server::Context
def initialize(@env : HTTP::Server::Context); end
# Convert changeset errors to API-friendly format
protected def map_changeset_errors(errors)
errors.reduce({} of String => Array(String)) do |memo, error|
memo[error[:field]] = memo[error[:field]]? || [] of String
memo[error[:field]] << error[:message]
field = error[:field].to_s
message = error[:message].to_s
memo[field] ||= [] of String
memo[field] << message
memo
end
end
def parse_body(env, fields)
json_params = env.params.json.to_h
missing_fields = [] of String
protected def parse_body(required_fields : Array(String) = [] of String)
json_params = @env.params.json.try(&.to_h) || {} of String => JSON::Any
json_params = json_params.transform_values(&.to_s) # Convert JSON::Any to String
fields.each do |field|
unless json_params.has_key?(field)
missing_fields << field
end
end
missing_fields = required_fields.reject { |field| json_params.has_key?(field) }
unless missing_fields.empty?
error_message = missing_fields.map { |field| "#{field}: Required field" }.join(", ")
raise App::BadRequestException.new(env, error_message)
error_message = "#{missing_fields.first}: Required field"
raise App::BadRequestException.new(@env, error_message)
end
json_params
end
protected def render_json(data, status_code : Int32 = 200)
@env.response.status_code = status_code
@env.response.content_type = "application/json"
data.to_json
end
protected def param(key : String) : String
@env.params.url[key]
rescue KeyError
raise App::BadRequestException.new(@env, "Missing required parameter: #{key}")
end
end
end
+17 -1
@@ -1,5 +1,6 @@
require "sqlite3"
require "crecto"
require "micrate"
module App::Lib
class Database
@@ -8,11 +9,26 @@ module App::Lib
Query = Crecto::Repo::Query
config do |conf|
conf.uri = ENV["DATABASE_URL"]
base_url = ENV["DATABASE_URL"]
separator = base_url.includes?("?") ? "&" : "?"
db_url = base_url + separator +
"journal_mode=WAL" +
"&synchronous=NORMAL" + # Better performance with reasonable safety
"&foreign_keys=true"
conf.uri = db_url
end
if ENV["ENV"] == "development"
Crecto::DbLogger.set_handler(STDOUT)
end
def self.run_migrations
Micrate::DB.connection_url = ENV["DATABASE_URL"]
Micrate::Cli.run_up
end
run_migrations
end
end
+16
@@ -1,8 +1,18 @@
require "kemal"
module App
class InternalServerErrorException < Kemal::Exceptions::CustomException
def initialize(context)
context.response.content_type = "application/json"
context.response.status_code = 500
context.response.print({ "error" => "Internal Server Error" }.to_json)
super(context)
end
end
class BadRequestException < Kemal::Exceptions::CustomException
def initialize(context, message : String)
context.response.content_type = "application/json"
context.response.status_code = 400
context.response.print({ "error" => message }.to_json)
super(context)
@@ -11,13 +21,16 @@ module App
class UnauthorizedException < Kemal::Exceptions::CustomException
def initialize(context)
context.response.content_type = "application/json"
context.response.status_code = 401
context.response.print({ "error" => "Unauthorized access" }.to_json)
super(context)
end
end
class ForbiddenException < Kemal::Exceptions::CustomException
def initialize(context)
context.response.content_type = "application/json"
context.response.status_code = 403
context.response.print({ "error" => "Access not allowed" }.to_json)
super(context)
@@ -26,13 +39,16 @@ module App
class NotFoundException < Kemal::Exceptions::CustomException
def initialize(context)
context.response.content_type = "application/json"
context.response.status_code = 404
context.response.print({ "error" => "Resource not found" }.to_json)
super(context)
end
end
class UnprocessableEntityException < Kemal::Exceptions::CustomException
def initialize(context, message : Hash(String, Array(String)))
context.response.content_type = "application/json"
context.response.status_code = 422
context.response.print({ "errors" => message }.to_json)
super(context)
+45
@@ -0,0 +1,45 @@
require "maxminddb"
require "log"
module App::Lib
struct IpLookup
MMDB_PATH = "data/GeoLite2-Country.mmdb"
@@reader : MaxMindDB::Reader? = nil
@@reader_mutex = Mutex.new
private def self.get_reader : MaxMindDB::Reader
@@reader_mutex.synchronize do
@@reader ||= MaxMindDB.open(MMDB_PATH)
end
end
def self.country(ip_address : String) : String?
return nil if ip_address == "Unknown" || ip_address.empty?
begin
lookup = get_reader.get(ip_address)
lookup["country"]?.try &.["iso_code"]?.try &.as_s
rescue ex
Log.error { "IP lookup failed: #{ex.message}" }
nil
end
end
def self.ip_from_address(address_string : String?) : String?
return nil if address_string.nil?
if address_string.includes?('[') # IPv6 with port: [2001:db8::1]:8080
address_string.split(']').first.lchop('[')
elsif address_string.includes?(':')
if address_string.count(':') > 1 # IPv6 without port
address_string
else # IPv4 with port: 192.168.1.1:8080
address_string.split(':').first
end
else # Address without port
address_string
end
end
end
end
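The address handling in `IpLookup.ip_from_address` distinguishes bracketed IPv6 with a port, bare IPv6, IPv4 with a port, and plain addresses. A standalone sketch of those four cases is shown below; `strip_port` is a hypothetical helper name, and the sketch assumes the intent is to drop the brackets entirely for the bracketed IPv6 form.

```crystal
# Hypothetical helper mirroring the four cases handled by ip_from_address.
def strip_port(address : String?) : String?
  return nil if address.nil?

  if address.starts_with?('[')
    # IPv6 with port: "[2001:db8::1]:8080" -> "2001:db8::1"
    address.split(']').first.lchop('[')
  elsif address.count(':') == 1
    # IPv4 with port: "192.168.1.1:8080" -> "192.168.1.1"
    address.split(':').first
  else
    # Bare IPv6 (several colons) or an address without a port
    address
  end
end

puts strip_port("[2001:db8::1]:8080")
puts strip_port("192.168.1.1:8080")
puts strip_port("2001:db8::1")
puts strip_port("10.0.0.7")
```

Counting colons is a heuristic: a single colon can only be an IPv4 host and port pair, while an unbracketed IPv6 address always contains at least two.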
+123
@@ -0,0 +1,123 @@
require "yaml"
require "semantic_version"
module App::Lib
struct UserAgent
REGEXES_PATH = "data/uap_core_regexes.yaml"
@@regexes_cache : YAML::Any? = nil
@@compiled_regexes = {} of String => Array(Tuple(Regex, YAML::Any))
@@mutex = Mutex.new
private def self.load_regexes
@@mutex.synchronize do
if @@regexes_cache.nil?
begin
regexes_yaml = File.read(REGEXES_PATH)
@@regexes_cache = YAML.parse(regexes_yaml)
# Pre-compile all regexes for better performance
["user_agent_parsers", "os_parsers", "device_parsers"].each do |parser_type|
@@compiled_regexes[parser_type] = [] of Tuple(Regex, YAML::Any)
@@regexes_cache.not_nil![parser_type].as_a.each do |parser|
regex_str = parser["regex"].as_s
options = parser["regex_flag"]?.try(&.as_s) == "i" ?
Regex::Options::IGNORE_CASE : Regex::Options::None
begin
compiled_regex = Regex.new(regex_str, options)
@@compiled_regexes[parser_type] << {compiled_regex, parser}
rescue
# Skip invalid regexes
end
end
end
rescue ex
# If loading fails, set an empty cache to prevent repeated failures
@@regexes_cache = YAML.parse("{}")
@@compiled_regexes = {} of String => Array(Tuple(Regex, YAML::Any))
end
end
end
end
def self.parse(user_agent_string : String)
return {nil, nil, nil, nil} if user_agent_string.empty?
# Load regexes only once and cache them
load_regexes
family = nil
version = nil
device = nil
os = nil
@@compiled_regexes["user_agent_parsers"]?.try &.each do |regex_tuple|
regex, parser = regex_tuple
match = regex.match(user_agent_string)
next unless match
family = match[1]? || nil
v1 = (match[2]? || "0").to_i
v2 = (match[3]? || "0").to_i
v3 = (match[4]? || "0").to_i
# Apply replacements if defined
if replacement = parser["family_replacement"]?
family = replacement.as_s.gsub("$1", family.to_s)
end
version = SemanticVersion.new(v1, v2, v3)
break
end
@@compiled_regexes["os_parsers"]?.try &.each do |regex_tuple|
regex, parser = regex_tuple
match = regex.match(user_agent_string)
next unless match
os_family = match[1]? || nil
os_v1 = (match[2]? || "0").to_i
os_v2 = (match[3]? || "0").to_i
os_v3 = (match[4]? || "0").to_i
# Apply replacements if defined
if replacement = parser["os_replacement"]?
os_family = replacement.as_s.gsub("$1", os_family.to_s)
end
os = {os_family, SemanticVersion.new(os_v1, os_v2, os_v3)}
break
end
@@compiled_regexes["device_parsers"]?.try &.each do |regex_tuple|
regex, parser = regex_tuple
match = regex.match(user_agent_string)
next unless match
model = match[1]? || nil
device_name = model
brand = nil
# Apply replacements if defined
if device_replacement = parser["device_replacement"]?
device_name = device_replacement.as_s.gsub("$1", device_name.to_s)
end
if model_replacement = parser["model_replacement"]?
model = model_replacement.as_s.gsub("$1", model.to_s)
end
if brand_replacement = parser["brand_replacement"]?
brand = brand_replacement.as_s
end
device = {model, brand, device_name}
break
end
{family, version, device, os}
end
end
end
+29
@@ -0,0 +1,29 @@
module App::Middlewares
class CORSHandler < Kemal::Handler
exclude ["/api/ping", "/:slug"]
def initialize(
@allow_origin = "*",
@allow_methods = "GET, POST, PUT, DELETE, OPTIONS",
@allow_headers = "Content-Type, Accept, Origin, X-Api-Key"
)
end
def call(env)
return call_next(env) if exclude_match?(env)
env.response.headers["Access-Control-Allow-Origin"] = @allow_origin
env.response.headers["Access-Control-Allow-Methods"] = @allow_methods
env.response.headers["Access-Control-Allow-Headers"] = @allow_headers
if env.request.method == "OPTIONS"
env.response.status_code = 200
env.response.content_type = "text/plain"
env.response.print("")
return env
end
call_next(env)
end
end
end
+4 -4
@@ -3,16 +3,16 @@ require "crecto"
module App::Models
class Click < Crecto::Model
schema :clicks do
field :id, String, primary_key: true
field :id, Int64, primary_key: true
field :user_agent, String
field :language, String
field :country, String
field :browser, String
field :os, String
field :source, String
field :referer, String
belongs_to :link, Link
end
validate_required [:user_agent, :language, :source]
validate_required [:user_agent, :referer]
end
end
+2 -2
@@ -6,7 +6,7 @@ require "./user.cr"
module App::Models
class Link < Crecto::Model
schema :links do
field :id, String, primary_key: true
field :id, Int64, primary_key: true
field :slug, String
field :url, String
@@ -17,6 +17,6 @@ module App::Models
unique_constraint :slug
validate_required [:slug, :url]
validate_format :url, /\A(?:https?:\/\/)?(?:[\w-]+\.)+[\w-]+(?:\/\S*)?/
validate_format :url, /\A(?:(https?:\/\/)?(?:[\w-]+\.)+[a-z]{2,})(?::\d+)?(?:[\/?#]\S*)?\z/i
end
end
+1 -1
@@ -4,7 +4,7 @@ require "crecto"
module App::Models
class User < Crecto::Model
schema :users do
field :id, String, primary_key: true
field :id, Int64, primary_key: true
field :name, String
field :api_key, String
end
+22 -19
@@ -1,41 +1,44 @@
require "./controllers/**"
require "kemal"
add_handler App::Middlewares::CORSHandler.new
add_handler App::Middlewares::Auth.new
module App
before_all do |env|
env.response.headers["Access-Control-Allow-Origin"] = "*"
env.response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE, OPTIONS"
env.response.headers["Access-Control-Allow-Headers"] = "Content-Type, Accept, Origin, X-Api-Key"
end
after_all do |env|
env.response.content_type = "application/json"
end
get "/:slug", &App::Controllers::ClickController.redirect_handler
# Namespace /api
get "/api/ping" do |env|
Controllers::Ping::Get.new.call(env)
end
get "/:slug" do |env|
Controllers::Link::Index.new.call(env)
Controllers::PingController.new(env).ping
end
get "/api/links" do |env|
Controllers::Link::All.new.call(env)
Controllers::LinkController.new(env).list_all
end
get "/api/links/:id" do |env|
Controllers::Link::Get.new.call(env)
Controllers::LinkController.new(env).get
end
get "/api/links/:id/clicks" do |env|
Controllers::LinkController.new(env).list_clicks
end
post "/api/links" do |env|
Controllers::Link::Create.new.call(env)
Controllers::LinkController.new(env).create
end
put "/api/links/:id" do |env|
Controllers::Link::Update.new.call(env)
Controllers::LinkController.new(env).update
end
delete "/api/links/:id" do |env|
Controllers::Link::Delete.new.call(env)
Controllers::LinkController.new(env).delete
end
error 500 do |env|
App::InternalServerErrorException.new(env)
""
end
end
+2 -2
@@ -11,10 +11,10 @@ module App::Serializers
builder.object do
builder.field("id", @click.id)
builder.field("user_agent", @click.user_agent)
builder.field("language", @click.language)
builder.field("country", @click.country)
builder.field("browser", @click.browser)
builder.field("os", @click.os)
builder.field("source", @click.source)
builder.field("referer", @click.referer)
builder.field("created_at", @click.created_at)
end
end
+9 -1
@@ -16,7 +16,15 @@ module App::Serializers
builder.field("id", @link.id)
builder.field("refer", @refer)
builder.field("origin", @link.url)
builder.field("clicks", @link.clicks.map { |click| App::Serializers::Click.new(click) })
begin
clicks = @link.clicks
unless clicks.empty?
builder.field("clicks", clicks.map { |click| App::Serializers::Click.new(click) })
end
rescue Crecto::AssociationNotLoaded
# Association not loaded, skip this field silently
end
end
end
end
+106 -4
@@ -1,16 +1,18 @@
require "file_utils"
require "http/client"
require "../config/*"
require "../lib/*"
require "../models/*"
module App::Services::Cli
def self.create_user(name)
def self.create_user(name, api_key = nil)
user = App::Models::User.new
user.id = UUID.v4.to_s
user.name = name
user.api_key = Random::Secure.urlsafe_base64()
user.api_key = api_key || Random::Secure.urlsafe_base64()
changeset = App::Lib::Database.insert(user)
return changeset.errors if !changeset.valid?
return changeset.errors unless changeset.valid?
"New user created: Name: #{user.name}, X-Api-Key: #{user.api_key}"
end
@@ -35,4 +37,104 @@ module App::Services::Cli
"User with ID #{user_id} deleted successfully"
end
def self.setup_admin_user
admin_name = ENV["ADMIN_NAME"]?
admin_api_key = ENV["ADMIN_API_KEY"]?
if admin_name && admin_api_key
query = App::Lib::Database::Query.where(name: admin_name, api_key: admin_api_key).limit(1)
existing_user = App::Lib::Database.all(App::Models::User, query).first?
return if existing_user
puts "Admin user setup detected. Creating admin user..."
result = create_user(admin_name, admin_api_key)
puts result
else
puts "Admin setup skipped: Missing ADMIN_NAME or ADMIN_API_KEY environment variables."
end
end
def self.update_uap_regexes
puts "Downloading User-Agent Parser core regexes..."
FileUtils.mkdir_p("data")
url = "https://raw.githubusercontent.com/ua-parser/uap-core/master/regexes.yaml"
output_file = "data/uap_core_regexes.yaml"
begin
http_get_with_redirect(url) do |response|
File.write(output_file, response.body_io.gets_to_end)
end
puts "User-Agent regexes downloaded to #{output_file}"
rescue e
puts "Error: Failed to download UAP core regexes: #{e.message}"
end
end
def self.update_geolite_db
puts "Downloading GeoLite2 Country database..."
FileUtils.mkdir_p("data")
url = "https://github.com/P3TERX/GeoLite.mmdb/raw/download/GeoLite2-Country.mmdb"
output_file = "data/GeoLite2-Country.mmdb"
begin
File.open(output_file, "wb") do |file|
http_get_with_redirect(url) do |response|
IO.copy(response.body_io, file)
end
end
puts "GeoLite2 database downloaded to #{output_file}"
rescue e
puts "Error: Failed to download GeoLite2 database: #{e.message}"
end
end
private def self.http_get_with_redirect(url : String, max_redirects = 5)
redirects = 0
while redirects < max_redirects
uri = URI.parse(url)
client = HTTP::Client.new(uri)
success = false
follow_redirect = false
redirect_url = nil
begin
client.get(uri.request_target) do |response|
case response.status_code
when 200
yield response
success = true
when 301, 302
if new_location = response.headers["Location"]?
puts "Following redirect to: #{new_location}"
redirect_url = new_location
follow_redirect = true
else
raise "Received redirect status but no Location header"
end
else
raise "Failed request with status code: #{response.status_code}"
end
end
ensure
client.close
end
return if success
if follow_redirect && redirect_url
url = redirect_url
redirects += 1
else
break
end
end
raise "Too many redirects (#{max_redirects})"
end
end
+12
@@ -0,0 +1,12 @@
require "digest"
require "base64"
module App::Services::SlugService
def self.shorten_url(url : String, user_id : Int64) : String
combined = "#{user_id}-#{url}"
crc32_hash = Digest::CRC32.digest(combined)
base62_encoded = Base64.urlsafe_encode(crc32_hash).strip.tr("-_=", "")
base62_encoded
end
end
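`SlugService.shorten_url` derives the slug from a CRC32 checksum of the user ID and URL rather than from random bytes, so the same input always yields the same slug. A minimal standalone sketch of that property follows; `sketch_slug` is a hypothetical name, and it strips `-`, `_` and `=` with a regular expression, which is assumed to be equivalent to the `tr` call in the diff.

```crystal
require "digest/crc32"
require "base64"

# Hypothetical standalone version of the slug derivation, for illustration only.
def sketch_slug(url : String, user_id : Int64) : String
  combined = "#{user_id}-#{url}"
  checksum = Digest::CRC32.digest(combined) # 4 bytes
  # 4 bytes encode to 6 Base64 characters plus "==" padding; removing
  # "-", "_" and "=" leaves at most 6 alphanumeric characters.
  Base64.urlsafe_encode(checksum).gsub(/[-_=]/, "")
end

first = sketch_slug("https://example.com", 1_i64)
again = sketch_slug("https://example.com", 1_i64)

puts first == again  # deterministic for the same user and URL
puts first.size <= 6 # never longer than six characters
```

Because CRC32 has only 32 bits of output, two different inputs can map to the same slug; the `unique_constraint :slug` on the `Link` model is what would surface such a collision at insert time.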
-3
@@ -1,3 +0,0 @@
module App
VERSION = "0.1.0"
end
-154
@@ -1,154 +0,0 @@
#!/bin/bash
api_url="http://localhost:4001/api/links"
num_links=1000
num_requests=10
resource_usage_interval=1 # Interval in seconds for resource usage logging
semaphore="/tmp/semaphore"
max_concurrent_processes=$(ulimit -u) # Adjust this number based on your system's capability
# Initialize semaphore
mkfifo $semaphore
exec 3<> $semaphore
rm $semaphore
for ((i=0; i<max_concurrent_processes; i++)); do
echo >&3
done
echo "Semaphore initialized with $max_concurrent_processes slots."
function get_resource_usage {
while true; do
docker stats --no-stream --format "table {{.MemUsage}} {{.CPUPerc}}" bit-app-1 | awk 'NR>1 {print "Memory:", $1, "CPU:", $2}' >> resource_usage.txt
sleep $resource_usage_interval
done
}
function calculate_average_usage {
total_mem=0
total_cpu=0
count=0
while read -r line; do
if echo $line | grep -q 'Memory'; then
mem=$(echo $line | awk '{print $2}' | sed 's/MiB//')
total_mem=$(echo "$total_mem + $mem" | bc)
elif echo $line | grep -q 'CPU'; then
cpu=$(echo $line | awk '{print $2}' | sed 's/%//')
total_cpu=$(echo "$total_cpu + $cpu" | bc)
fi
((count++))
done < resource_usage.txt
avg_mem=$(echo "scale=2; $total_mem / ($count / 2)" | bc) # Since there are 2 lines per interval
avg_cpu=$(echo "scale=2; $total_cpu / ($count / 2)" | bc)
rm resource_usage.txt
echo "Average Memory Usage: $avg_mem MiB"
echo "Average CPU Usage: $avg_cpu%"
}
function measure {
total_time=0
declare -a refer_links
# Start resource usage logging in the background
nohup bash -c "$(declare -f get_resource_usage); get_resource_usage" &> /dev/null &
resource_usage_pid=$!
disown
echo "Creating $num_links short links..."
for ((i=1; i<=num_links; i++)); do
response=$(curl --silent --request POST \
--url $api_url \
--header "X-Api-Key: $api_key" \
--header "Content-Type: application/json" \
--data "{ \"url\": \"https://kagi.com\" }")
refer=$(echo $response | awk -F'"' '/"refer":/{print $(NF-1)}')
if [[ -n $refer ]]; then
refer_links+=("$refer")
if (( i % 100 == 0 )); then
echo "Created short link $i/$num_links"
fi
else
echo "Failed to create short link $i"
echo $response
exit 1
fi
done
echo "Accessing each link $num_requests times concurrently..."
> times.txt # Ensure times.txt is created and empty
total_accesses=$((num_links * num_requests))
accesses_done=0
for refer in "${refer_links[@]}"; do
for ((i=1; i<=num_requests; i++)); do
# Wait for a slot
read -u 3
{
start_time=$(date +%s%6N)
curl -s "$refer" >> /dev/null
end_time=$(date +%s%6N)
elapsed_time=$(echo "$end_time - $start_time" | bc)
echo $elapsed_time >> times.txt
# Release the slot
echo >&3
((accesses_done++))
if (( accesses_done % 10 == 0 )); then
echo "Accessed $accesses_done/$total_accesses"
fi
} &
done
done
wait
# Stop resource usage logging
if kill -0 $resource_usage_pid 2>/dev/null; then
kill $resource_usage_pid
fi
# Read all elapsed times and calculate total
while read -r time; do
total_time=$(echo "$total_time + $time" | bc)
done < times.txt
rm times.txt
echo "****Results****"
calculate_average_usage
echo "Average Response Time: $(echo "scale=2; $total_time / ($num_links * $num_requests)" | bc) µs"
}
echo "Setup..."
docker-compose up -d
if [ $? -ne 0 ]; then
echo "Failed to start Docker containers."
exit 1
fi
docker-compose exec -T app migrate
if [ $? -ne 0 ]; then
echo "Failed to run database migrations."
exit 1
fi
# Create a new user and capture the API key
output=$(docker-compose exec -T app cli --create-user=Admin)
api_key=$(echo "$output" | awk -F' ' '/X-Api-Key:/{print $NF}')
echo "Captured API Key: $api_key"
echo "Waiting for database to be ready..."
sleep 5
measure
# Clean up
docker-compose down
+2 -5
@@ -5,14 +5,11 @@ require "./app/lib/*"
require "./app/models/*"
require "./app/serializers/*"
require "./app/middlewares/*"
require "./app/services/*"
require "./app/routes"
add_context_storage_type(App::Models::User)
add_handler(App::Middlewares::Auth.new)
error 500 { |env| {"error" => "Internal Server Error" }.to_json}
error 401 { |env| {"error" => "Unauthorized" }.to_json}
error 404 { |env| {"error" => "Not Found" }.to_json}
App::Services::Cli.setup_admin_user
Kemal.run
Binary file not shown.
+337 -35
@@ -93,6 +93,10 @@ user_agent_parsers:
- regex: '(NewRelicPinger)/(\d+)\.(\d+)'
family_replacement: 'NewRelicPingerBot'
# Dynatrace/Ruxit synthetic monitor
- regex: '(RuxitSynthetic)/(\d+)\.(\d+)'
family_replacement: 'Ruxit Synthetic'
# Tableau
- regex: '(Tableau)/(\d+)\.(\d+)'
family_replacement: 'Tableau'
@@ -148,11 +152,11 @@ user_agent_parsers:
family_replacement: 'Pinterestbot'
# Bots
- regex: '(CSimpleSpider|Cityreview Robot|CrawlDaddy|CrawlFire|Finderbots|Index crawler|Job Roboter|KiwiStatus Spider|Lijit Crawler|QuerySeekerSpider|ScollSpider|Trends Crawler|USyd-NLP-Spider|SiteCat Webbot|BotName\/\$BotVersion|123metaspider-Bot|1470\.net crawler|50\.nu|8bo Crawler Bot|Aboundex|Accoona-[A-z]{1,30}-Agent|AdsBot-Google(?:-[a-z]{1,30}|)|altavista|AppEngine-Google|archive.{0,30}\.org_bot|archiver|Ask Jeeves|[Bb]ai[Dd]u[Ss]pider(?:-[A-Za-z]{1,30})(?:-[A-Za-z]{1,30}|)|bingbot|BingPreview|blitzbot|BlogBridge|Bloglovin|BoardReader Blog Indexer|BoardReader Favicon Fetcher|boitho.com-dc|BotSeer|BUbiNG|\b\w{0,30}favicon\w{0,30}\b|\bYeti(?:-[a-z]{1,30}|)|Catchpoint(?: bot|)|[Cc]harlotte|Checklinks|clumboot|Comodo HTTP\(S\) Crawler|Comodo-Webinspector-Crawler|ConveraCrawler|CRAWL-E|CrawlConvera|Daumoa(?:-feedfetcher|)|Feed Seeker Bot|Feedbin|findlinks|Flamingo_SearchEngine|FollowSite Bot|furlbot|Genieo|gigabot|GomezAgent|gonzo1|(?:[a-zA-Z]{1,30}-|)Googlebot(?:-[a-zA-Z]{1,30}|)|Google SketchUp|grub-client|gsa-crawler|heritrix|HiddenMarket|holmes|HooWWWer|htdig|ia_archiver|ICC-Crawler|Icarus6j|ichiro(?:/mobile|)|IconSurf|IlTrovatore(?:-Setaccio|)|InfuzApp|Innovazion Crawler|InternetArchive|IP2[a-z]{1,30}Bot|jbot\b|KaloogaBot|Kraken|Kurzor|larbin|LEIA|LesnikBot|Linguee Bot|LinkAider|LinkedInBot|Lite Bot|Llaut|lycos|Mail\.RU_Bot|masscan|masidani_bot|Mediapartners-Google|Microsoft .{0,30} Bot|mogimogi|mozDex|MJ12bot|msnbot(?:-media {0,2}|)|msrbot|Mtps Feed Aggregation System|netresearch|Netvibes|NewsGator[^/]{0,30}|^NING|Nutch[^/]{0,30}|Nymesis|ObjectsSearch|OgScrper|Orbiter|OOZBOT|PagePeeker|PagesInventory|PaxleFramework|Peeplo Screenshot Bot|PHPCrawl|PlantyNet_WebRobot|Pompos|Qwantify|Read%20Later|Reaper|RedCarpet|Retreiver|Riddler|Rival IQ|scooter|Scrapy|Scrubby|searchsight|seekbot|semanticdiscovery|SemrushBot|Simpy|SimplePie|SEOstats|SimpleRSS|SiteCon|Slackbot-LinkExpanding|Slack-ImgProxy|Slurp|snappy|Speedy Spider|Squrl 
Java|Stringer|TheUsefulbot|ThumbShotsBot|Thumbshots\.ru|Tiny Tiny RSS|Twitterbot|WhatsApp|URL2PNG|Vagabondo|VoilaBot|^vortex|Votay bot|^voyager|WASALive.Bot|Web-sniffer|WebThumb|WeSEE:[A-z]{1,30}|WhatWeb|WIRE|WordPress|Wotbox|www\.almaden\.ibm\.com|Xenu(?:.s|) Link Sleuth|Xerka [A-z]{1,30}Bot|yacy(?:bot|)|YahooSeeker|Yahoo! Slurp|Yandex\w{1,30}|YodaoBot(?:-[A-z]{1,30}|)|YottaaMonitor|Yowedo|^Zao|^Zao-Crawler|ZeBot_www\.ze\.bz|ZooShot|ZyBorg|ArcGIS Hub Indexer)(?:[ /]v?(\d+)(?:\.(\d+)(?:\.(\d+)|)|)|)'
- regex: '(CSimpleSpider|Cityreview Robot|CrawlDaddy|CrawlFire|Finderbots|Index crawler|Job Roboter|KiwiStatus Spider|Lijit Crawler|QuerySeekerSpider|ScollSpider|Trends Crawler|USyd-NLP-Spider|SiteCat Webbot|BotName\/\$BotVersion|123metaspider-Bot|1470\.net crawler|50\.nu|8bo Crawler Bot|Aboundex|Accoona-[A-z]{1,30}-Agent|AdsBot-Google(?:-[a-z]{1,30}|)|altavista|AppEngine-Google|archive.{0,30}\.org_bot|archiver|Ask Jeeves|[Bb]ai[Dd]u[Ss]pider(?:-[A-Za-z]{1,30})(?:-[A-Za-z]{1,30}|)|bingbot|BingPreview|blitzbot|BlogBridge|Bloglovin|BoardReader Blog Indexer|BoardReader Favicon Fetcher|boitho.com-dc|BotSeer|BUbiNG|\b\w{0,30}favicon\w{0,30}\b|\bYeti(?:-[a-z]{1,30}|)|Catchpoint(?: bot|)|[Cc]harlotte|Checklinks|clumboot|Comodo HTTP\(S\) Crawler|Comodo-Webinspector-Crawler|ConveraCrawler|CRAWL-E|CrawlConvera|Daumoa(?:-feedfetcher|)|Feed Seeker Bot|Feedbin|findlinks|Flamingo_SearchEngine|FollowSite Bot|furlbot|Genieo|gigabot|GomezAgent|gonzo1|(?:[a-zA-Z]{1,30}-|)Googlebot(?:-[a-zA-Z]{1,30}|)|GoogleOther|Google SketchUp|grub-client|gsa-crawler|heritrix|HiddenMarket|holmes|HooWWWer|htdig|ia_archiver|ICC-Crawler|Icarus6j|ichiro(?:/mobile|)|IconSurf|IlTrovatore(?:-Setaccio|)|InfuzApp|Innovazion Crawler|InternetArchive|IP2[a-z]{1,30}Bot|jbot\b|KaloogaBot|Kraken|Kurzor|larbin|LEIA|LesnikBot|Linguee Bot|LinkAider|LinkedInBot|Lite Bot|Llaut|lycos|Mail\.RU_Bot|masscan|masidani_bot|Mediapartners-Google|Microsoft .{0,30} Bot|mogimogi|mozDex|MJ12bot|msnbot(?:-media {0,2}|)|msrbot|Mtps Feed Aggregation System|netresearch|Netvibes|NewsGator[^/]{0,30}|^NING|Nutch[^/]{0,30}|Nymesis|ObjectsSearch|OgScrper|Orbiter|OOZBOT|PagePeeker|PagesInventory|PaxleFramework|Peeplo Screenshot Bot|PHPCrawl|PlantyNet_WebRobot|Pompos|Qwantify|Read%20Later|Reaper|RedCarpet|Retreiver|Riddler|Rival IQ|scooter|Scrapy|Scrubby|searchsight|seekbot|semanticdiscovery|SemrushBot|Simpy|SimplePie|SEOstats|SimpleRSS|SiteCon|Slackbot-LinkExpanding|Slack-ImgProxy|Slurp|snappy|Speedy Spider|Squrl 
Java|Stringer|TheUsefulbot|ThumbShotsBot|Thumbshots\.ru|Tiny Tiny RSS|Twitterbot|WhatsApp|URL2PNG|Vagabondo|VoilaBot|^vortex|Votay bot|^voyager|WASALive.Bot|Web-sniffer|WebThumb|WeSEE:[A-z]{1,30}|WhatWeb|WIRE|WordPress|Wotbox|www\.almaden\.ibm\.com|Xenu(?:.s|) Link Sleuth|Xerka [A-z]{1,30}Bot|yacy(?:bot|)|YahooSeeker|Yahoo! Slurp|Yandex\w{1,30}|YodaoBot(?:-[A-z]{1,30}|)|YottaaMonitor|Yowedo|^Zao|^Zao-Crawler|ZeBot_www\.ze\.bz|ZooShot|ZyBorg|ArcGIS Hub Indexer|GPTBot|Google-InspectionTool)(?:[ /]v?(\d+)(?:\.(\d+)(?:\.(\d+)|)|)|)'
# AWS S3 Clients
# must come before "Bots General matcher" to catch "boto"/"boto3" before "bot"
- regex: '\b(Boto3?|JetS3t|aws-(?:cli|sdk-(?:cpp|go|java|nodejs|ruby2?|dotnet-(?:\d{1,2}|core)))|s3fs)/(\d+)\.(\d+)(?:\.(\d+)|)'
- regex: '\b(Boto3?|JetS3t|aws-(?:cli|sdk-(?:cpp|go|go-v\d|java|nodejs|ruby2?|dotnet-(?:\d{1,2}|core)))|s3fs)/(\d+)\.(\d+)(?:\.(\d+)|)'
# SAFE FME
- regex: '(FME)\/(\d+\.\d+)\.(\d+)\.(\d+)'
@@ -179,6 +183,9 @@ user_agent_parsers:
- regex: '\[FB.{0,300};'
family_replacement: 'Facebook'
# RecipeRadar crawler
- regex: '(RecipeRadar)/(\d+)\.(\d+)(?:\.(\d+)|)'
# Bots General matcher 'name/0.0'
- regex: '^.{0,200}?(?:\/[A-Za-z0-9\.]{0,50}|) {0,2}([A-Za-z0-9 \-_\!\[\]:]{0,50}(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]{0,50}))[/ ](\d+)(?:\.(\d+)(?:\.(\d+)|)|)'
# Bots containing bot(but not CUBOT)
@@ -203,7 +210,12 @@ user_agent_parsers:
- regex: '\[(Pinterest)/[^\]]{1,50}\]'
- regex: '(Pinterest)(?: for Android(?: Tablet|)|)/(\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
# Instagram app
# iOS Instagram embeds the token inside a full WebKit UA:
# Mozilla/5.0 (iPhone; ...) Mobile/... Instagram VERSION (...)
# Android Instagram uses a bare format with no browser wrapper:
# Instagram VERSION Android (...)
- regex: 'Mozilla.{1,200}Mobile.{1,100}(Instagram).(\d+)\.(\d+)\.(\d+)'
- regex: '(Instagram) (\d+)\.(\d+)\.(\d+)'
# Flipboard app
- regex: 'Mozilla.{1,200}Mobile.{1,100}(Flipboard).(\d+)\.(\d+)\.(\d+)'
# Flipboard-briefing app
@@ -215,6 +227,19 @@ user_agent_parsers:
# Twitter
- regex: '(Twitter for (?:iPhone|iPad)|TwitterAndroid)(?:\/(\d+)\.(\d+)|)'
family_replacement: 'Twitter'
# TikTok
- regex: '(musical_ly) app_version\/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'TikTok'
- regex: '(musical_ly_)(\d+)\.(\d+)\.(\d+)'
family_replacement: 'TikTok'
- regex: '(BytedanceWebview)\/[a-z0-9]+'
family_replacement: 'TikTok'
# KakaoTalk
- regex: 'Mozilla.{1,200}Mobile.{1,100}(KAKAOTALK)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'KakaoTalk'
# Telegram
- regex: '(Telegram-Android)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Telegram'
# Phantom app
- regex: 'Mozilla.{1,200}Mobile.{1,100}(Phantom\/ios|Phantom\/android).(\d+)\.(\d+)\.(\d+)'
@@ -235,6 +260,10 @@ user_agent_parsers:
- regex: '(PaleMoon)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Pale Moon'
# Camoufox - anti-detect Firefox fork for web scraping/automation; replaces the
# Firefox version token with "Camoufox Camoufox VERSION" in the UA string
- regex: '(Camoufox) Camoufox (\d+)\.(\d+)'
# Firefox
- regex: '(Fennec)/(\d+)\.(\d+)\.?([ab]?\d+[a-z]*)'
family_replacement: 'Firefox Mobile'
@@ -283,7 +312,7 @@ user_agent_parsers:
# UC Browser
# Must be checked before Opera; otherwise UC Browser is detected as Opera Mini
- regex: '(UC? ?Browser|UCWEB|U3)[ /]?(\d+)\.(\d+)\.(\d+)'
- regex: '(UC? ?Browser|UCWEB|UCMobile|U3)[ /]?(\d+)\.(\d+)\.(\d+)'
family_replacement: 'UC Browser'
# Opera will stop at 9.80 and hide the real version in the Version string.
@@ -308,6 +337,14 @@ user_agent_parsers:
- regex: '(?:Chrome).{1,300}(OPR)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Opera'
# Opera GX uses "OPX" instead of "OPR"
- regex: '(OPX)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Opera GX'
# Opera Touch uses "OPT"
- regex: '(OPT)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Opera Touch'
# Opera Coast
- regex: '(Coast)/(\d+).(\d+).(\d+)'
family_replacement: 'Opera Coast'
@@ -394,13 +431,17 @@ user_agent_parsers:
- regex: '(Instabridge)/(\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
# Aloha Browser
- regex: '(AlohaBrowser)/(\d+)\.(\d+)\.(\d+)(?:\.(\d+)|)'
- regex: '(AlohaBrowser|ABB)/(\d+)\.(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Aloha Browser'
# Brave Browser https://brave.com/ , should go before Safari and Chrome Mobile
# Brave Browser, should go before Safari and Chrome Mobile
- regex: '((?:B|b)rave(?:\sChrome)?)/(\d+)(?:\.(\d+)|)(?:\.(\d+)|)(?:\.(\d+)|)'
family_replacement: 'Brave'
# Brave iOS Browser, checks for (Brave) or Brave at end
- regex: '(?:\()?Brave(?:\))?\s*$'
family_replacement: 'Brave'
# Amazon Silk, should go before Safari and Chrome Mobile
- regex: '(Silk)/(\d+)\.(\d+)(?:\.([0-9\-]+)|)'
family_replacement: 'Amazon Silk'
@@ -415,7 +456,7 @@ user_agent_parsers:
family_replacement: 'Edge Mobile'
# Oculus Browser, should go before Samsung Internet
- regex: '(OculusBrowser)/(\d+)\.(\d+).0.0(?:\.([0-9\-]+)|)'
- regex: '(OculusBrowser)/(\d+)\.(\d+)(?:\.([0-9\-]+)|)'
family_replacement: 'Oculus Browser'
# Samsung Internet (based on Chrome, but lacking some features)
@@ -430,12 +471,6 @@ user_agent_parsers:
- regex: '(coc_coc_browser)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Coc Coc'
# Baidu Browsers (desktop spoofs chrome & IE, explorer is mobile)
- regex: '(baidubrowser)[/\s](\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
family_replacement: 'Baidu Browser'
- regex: '(FlyFlow)/(\d+)\.(\d+)'
family_replacement: 'Baidu Explorer'
# MxBrowser is Maxthon. Must go before Mobile Chrome for Android
- regex: '(MxBrowser)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Maxthon'
@@ -464,6 +499,12 @@ user_agent_parsers:
- regex: 'Mozilla.{1,200}Android.{1,200}(GSA)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Google'
# Baidu Browsers (desktop spoofs chrome & IE, explorer is mobile)
- regex: '(baidubrowser)[/\s](\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
family_replacement: 'Baidu Browser'
- regex: '(FlyFlow|flyflow|baiduboxapp)/(\d+)\.(\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
family_replacement: 'Baidu Explorer'
# QQ Browsers
- regex: '(MQQBrowser/Mini)(?:(\d+)(?:\.(\d+)|)(?:\.(\d+)|)|)'
family_replacement: 'QQ Browser Mini'
@@ -492,6 +533,36 @@ user_agent_parsers:
- regex: '(Ecosia) android@(\d+)(?:\.(\d+)|)(?:\.(\d+)|)(?:\.(\d+)|)'
family_replacement: 'Ecosia Android'
# VivoBrowser
- regex: '(VivoBrowser)\/(\d+)\.(\d+)\.(\d+)(?:\.(\d+)|)'
# HiBrowser
- regex: '(H[Ii]Browser)\/v(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'HiBrowser'
# Honor Browser
- regex: '(HonorBrowser)/(\d+)\.(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'Honor Browser'
# Honor Browser
- regex: '(bdhonorbrowser)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Honor Browser'
# HeyTap Browser
- regex: '(HeyTapBrowser)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'HeyTap Browser'
# Weibo
# Must come before Chrome Mobile WebView
- regex: '(weibo)__(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Weibo'
- regex: '(WeiboliteiOS|WeiboIntliOS)'
family_replacement: 'Weibo'
# Phoenix Browser
- regex: '(PHX)/(\d+)\.(\d+)'
family_replacement: 'Phoenix Browser'
# Chrome Mobile
- regex: 'Version/.{1,300}(Chrome)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Chrome Mobile WebView'
@@ -569,6 +640,129 @@ user_agent_parsers:
- regex: '^(surveyon)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Surveyon'
# 115 Browser
- regex: '(115Browser)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: '115 Browser'
# Avira
- regex: '(Avira)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Avira'
# CCleaner Browser
- regex: '(CCleaner)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'CCleaner'
# Norton
- regex: '(Norton)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Norton'
# Quark
- regex: '(Quark)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Quark'
# Quark PC
- regex: '(QuarkPC)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Quark PC'
# Smart Lenovo Browser
- regex: '(SLBrowser)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Smart Lenovo Browser'
# Atom Browser
- regex: '(Atom)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Atom Browser'
# 360 Secure Browser
- regex: '(Chrome)/\d+\.\d+\.\d+\.\d+ .* QIHU 360(?:SEi18n|ENT)'
family_replacement: '360 Secure Browser'
# Decentr Web3 Browser
- regex: '(Decentr)'
family_replacement: 'Decentr Web3 Browser'
# Sparrow Browser
- regex: '(Sparrow)'
family_replacement: 'Sparrow Browser'
# Chromium GOST Browser
- regex: '(Chromium GOST)'
family_replacement: 'Chromium GOST Browser'
# AOL Shield Browser
- regex: '(AOLShield)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'AOL Shield Browser'
# Hola Browser
- regex: '(Hola)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Hola Browser'
# Craving Explorer Browser
- regex: '(CravingExplorer)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Craving Explorer Browser'
# Talon Cyber Security Browser
- regex: '(Talon)'
family_replacement: 'Talon Cyber Security Browser'
# QAX Browser
- regex: '(Qaxbrowser)'
family_replacement: 'QAX Browser'
# AOL Desktop Gold Browser
- regex: '(ADG)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'AOL Desktop Gold Browser'
# Sber Browser
- regex: '(SberBrowser)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Sber Browser'
# JiSu Browser
- regex: '(JiSu)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'JiSu Browser'
# Wolvic Browser
- regex: '(Wolvic)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Wolvic Browser'
# SmartTV WebBrowser
- regex: '(Thano)/(\d+)\.(\d+)'
family_replacement: 'SmartTV WebBrowser'
# WeChat Browser
- regex: '(MicroMessenger)/(\d+)\.(\d+)(?:\.(\d+)|)'
family_replacement: 'WeChat Browser'
# Odin Browser
- regex: '(Odin)/(\d+)\.(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Odin'
# NetCast Smart TV
- regex: '(Colt)/(\d+)\.(\d+)'
family_replacement: 'NetCast Smart TV'
# Lite Browser
- regex: '(Lite Browser)/(\d+)\.(\d+)'
family_replacement: 'Lite Browser'
# Vewd Browser
- regex: '(OMI)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Vewd Browser'
# Mypal
- regex: '(Mypal)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Mypal Browser'
# Chess.com native app
- regex: '(Chesscom-Android)/(\d+)\.(\d+)\.(\d+)'
# Roblox native app
- regex: '(RobloxApp)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Roblox App'
# Roadrunner iOS app (not the legacy Time Warner Cable ISP identifier)
- regex: '(Roadrunner)/IOS/\d+/(\d+)\.(\d+)\.(\d+)'
# Ancestry.com Android app
- regex: '(AncestryAndroid)/(\d+)\.(\d+)(?:\.(\d+)|)'
#### END SPECIAL CASES TOP ####
#### MAIN CASES - this catches > 50% of all browsers ####
@@ -666,6 +860,96 @@ user_agent_parsers:
# Browser/major_version.minor_version
- regex: '(bingbot|Bolt|AdobeAIR|Jasmine|IceCat|Skyfire|Midori|Maxthon|Lynx|Arora|IBrowse|Dillo|Camino|Shiira|Fennec|Phoenix|Flock|Netscape|Lunascape|Epiphany|WebPilot|Opera Mini|Opera|NetFront|Netfront|Konqueror|Googlebot|SeaMonkey|Kazehakase|Vienna|Iceape|Iceweasel|IceWeasel|Iron|K-Meleon|Sleipnir|Galeon|GranParadiso|iCab|iTunes|MacAppStore|NetNewsWire|Space Bison|Stainless|Orca|Dolfin|BOLT|Minimo|Tizen Browser|Polaris|Abrowser|Planetweb|ICE Browser|mDolphin|qutebrowser|Otter|QupZilla|MailBar|kmail2|YahooMobileMail|ExchangeWebServices|ExchangeServicesClient|Dragon|Outlook-iOS-Android)/(\d+)\.(\d+)(?:\.(\d+)|)'
# Qt Web Engine embedded browser, must be before Chrome
- regex: '(QtWebEngine)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Qt Web Engine'
# OpenWave browser (Chromium-based), must be before Chrome
- regex: '(OpenWave)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Open Wave'
# AtContent - confirmed APT29/Nobelium (Cozy Bear) C2 malware marker. The implant
# (AcroSup.dll, side-loaded via Adobe WCChromeNativeMessagingHost.exe) uses a hardcoded
# UA of the form 'Chrome/100.0.4896.75 Safari/537.36 AtContent/91.5.2444.45' to
# communicate with Dropbox C2. Also observed appended after Edg/ tokens.
# Source: Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022
# (https://www.duskrise.com/2022/05/13/cozy-smuggled-into-the-box-apt29-abusing-legitimate-software-for-targeted-operations-in-europe/)
- regex: '(AtContent)/(\d+)\.(\d+)\.(\d+)'
# Trailer - suspicious fake UA token appended to Chrome/Edge/Opera UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Trailer)/(\d+)\.(\d+)\.(\d+)'
# Agency - suspicious fake UA token appended to Chrome UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Agency)/(\d+)\.(\d+)\.(\d+)'
# Herring - suspicious fake UA token appended to Chrome UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Herring)/(\d+)\.(\d+)\.(\d+)'
# Config - suspicious fake UA token appended to Chrome UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Config)/(\d+)\.(\d+)\.(\d+)'
# Viewer - suspicious fake UA token appended to Chrome UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Viewer)/(\d+)\.(\d+)\.(\d+)'
# LikeWise - suspicious fake UA token appended to Chrome UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(LikeWise)/(\d+)\.(\d+)\.(\d+)'
# Unique - suspicious fake UA token appended to Chrome/Opera UA strings
# (TOKEN/MAJOR.MINOR.BUILD.PATCH). No known legitimate browser uses this token.
# Structurally identical to AtContent (confirmed APT29/Nobelium C2 marker; see
# Cluster25/DuskRise 'Cozy Smuggled Into the Box', May 2022). Unconfirmed attribution;
# may be same actor rotating token names or a copycat using the same spoofing technique.
- regex: '(Unique)/(\d+)\.(\d+)\.(\d+)'
# CitizenFX - embedded Chromium browser in FiveM/RedM (GTA V / RDR2 game mod frameworks)
- regex: '(CitizenFX)/(\d+)\.(\d+)\.(\d+)'
# R2Client - R2Games game launcher embedded browser (CEF-based)
- regex: '(R2Client)/(\d+)\.(\d+)(?:\.(\d+)|)'
# OBS Studio embedded browser (CEF-based, used for browser sources/docks)
- regex: '(OBS)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'OBS Studio'
# Adobe CEP - embedded Chromium runtime for extension panels in Adobe CC apps
- regex: '(AdobeCEP)/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Adobe CEP'
# Steam embedded browsers; version from Chrome. Must be before Chrome.
# GameOverlay = in-game overlay browser (Shift+Tab)
- regex: 'Valve Steam (GameOverlay).{1,200}Chrome/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Steam GameOverlay'
# Steam Deck built-in browser
- regex: 'Valve Steam (Gamepad)/Steam Deck.{1,200}Chrome/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Steam Deck'
# Steam desktop client browser
- regex: '(Valve(?: Steam|) Client).{1,200}Chrome/(\d+)\.(\d+)\.(\d+)'
family_replacement: 'Steam Client'
# Chrome/Chromium/major_version.minor_version
- regex: '(Chromium|Chrome)/(\d+)\.(\d+)(?:\.(\d+)|)(?:\.(\d+)|)'
@@ -1388,6 +1672,13 @@ os_parsers:
- regex: '^Box.{0,200};(Darwin)/(10)\.(1\d)(?:\.(\d+)|)'
os_replacement: 'Mac OS X'
##########
# HashiCorp API
# APN/1.0 HashiCorp/1.0 Terraform/1.8.0 (+https://www.terraform.io) terraform-provider-aws/4.67.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.261 (go1.19.8; darwin; arm64)
##########
- regex: 'darwin; arm64'
os_replacement: 'Mac OS X'
##########
# iOS
# http://en.wikipedia.org/wiki/IOS_version_history
@@ -1672,29 +1963,27 @@ os_parsers:
- regex: 'CFNetwork/.{0,100} Darwin/(21)\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '15'
- regex: 'CFNetwork/.{0,100} Darwin/22\.0\.\d+'
- regex: 'CFNetwork/.{0,100} Darwin/22\.([0-5])\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
os_v2_replacement: '0'
- regex: 'CFNetwork/.{0,100} Darwin/22\.1\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
os_v2_replacement: '1'
- regex: 'CFNetwork/.{0,100} Darwin/22\.2\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
os_v2_replacement: '2'
- regex: 'CFNetwork/.{0,100} Darwin/22\.3\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
os_v2_replacement: '3'
- regex: 'CFNetwork/.{0,100} Darwin/22\.4\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
os_v2_replacement: '4'
os_v2_replacement: '$1'
- regex: 'CFNetwork/.{0,100} Darwin/(22)\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '16'
- regex: 'CFNetwork/.{0,100} Darwin/23\.([0-5])\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '17'
os_v2_replacement: '$1'
- regex: 'CFNetwork/.{0,100} Darwin/(23)\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '17'
- regex: 'CFNetwork/.{0,100} Darwin/24\.([0-5])\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '18'
os_v2_replacement: '$1'
- regex: 'CFNetwork/.{0,100} Darwin/(24)\.\d+'
os_replacement: 'iOS'
os_v1_replacement: '18'
- regex: 'CFNetwork/.{0,100} Darwin/'
os_replacement: 'iOS'
@@ -1889,13 +2178,28 @@ os_parsers:
# Roku Digital-Video-Players https://www.roku.com/
- regex: '^(Roku)/DVP-(\d+)\.(\d+)'
##########
# Amazon S3 client boto3
# HashiCorp API
# Boto3/1.28.62 md/Botocore#1.31.62 ua/2.0 os/macos#22.4.0 md/arch#arm64 lang/python#3.11.6 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.31.62
# APN/1.0 HashiCorp/1.0 Terraform/1.8.1 (+https://www.terraform.io) terraform-provider-aws/4.67.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.8 md/GOOS/darwin md/GOARCH/arm64 api/identitystore/1.16.11
##########
- regex: 'os\/macos[#]?(\d*)[.]?(\d*)[.]?(\d*)'
os_replacement: 'Mac OS X'
os_v1_replacement: '$1'
os_v2_replacement: '$2'
os_v3_replacement: '$3'
# Huawei HarmonyOS
- regex: '(HarmonyOS)[\s;]+(\d+|)\.?(\d+|)\.?(\d+|)'
device_parsers:
#########
# Mobile Spiders
# Catch the mobile crawler before checking for iPhones / Androids.
#########
- regex: '^.{0,100}?(?:(?:iPhone|Windows CE|Windows Phone|Android).{0,300}(?:(?:Bot|Yeti)-Mobile|YRSpider|BingPreview|bots?/\d|(?:bot|spider)\.html)|AdsBot-Google-Mobile.{0,200}iPhone)'
- regex: '^.{0,100}?(?:(?:iPhone|Windows CE|Windows Phone|Android).{0,300}(?:(?:Bot|Yeti)-Mobile|YRSpider|BingPreview|bots?/\d|(?:bot|spider)\.html|Google-InspectionTool)|AdsBot-Google-Mobile.{0,200}iPhone)'
regex_flag: 'i'
device_replacement: 'Spider'
brand_replacement: 'Spider'
@@ -3097,7 +3401,7 @@ device_parsers:
device_replacement: 'HTC $1'
brand_replacement: 'HTC'
model_replacement: '$1'
- regex: '; {0,2}(ADR6200|ADR6400L|ADR6425LVW|Amaze|DesireS?|EndeavorU|Eris|EVO|Evo\d[A-Z]+|HD2|IncredibleS?|Inspire[A-Z0-9]*|Inspire[A-Z0-9]*|Sensation[A-Z0-9]*|Wildfire)[ _-](.{1,200}?)(?:[/;\)]|Build|MIUI|1\.0)'
- regex: '; {0,2}(ADR6200|ADR6400L|ADR6425LVW|Amaze|DesireS?|EndeavorU|Eris|EVO|Evo\d[A-Z]+|HD2|IncredibleS?|Inspire[A-Z0-9]*|Sensation[A-Z0-9]*|Wildfire)[ _-](.{1,200}?)(?:[/;\)]|Build|MIUI|1\.0)'
regex_flag: 'i'
device_replacement: 'HTC $1 $2'
brand_replacement: 'HTC'
@@ -5501,7 +5805,6 @@ device_parsers:
brand_replacement: 'Asus'
model_replacement: '$1'
##########
# Bird
##########
@@ -5701,7 +6004,6 @@ device_parsers:
brand_replacement: 'Motorola'
model_replacement: '$2'
##########
# nintendo
##########
@@ -5905,7 +6207,7 @@ device_parsers:
##########
# Spiders (this is a hack...)
##########
- regex: '^.{0,100}(bot|BUbiNG|zao|borg|DBot|oegp|silk|Xenu|zeal|^NING|CCBot|crawl|htdig|lycos|slurp|teoma|voila|yahoo|Sogou|CiBra|Nutch|^Java/|^JNLP/|Daumoa|Daum|Genieo|ichiro|larbin|pompos|Scrapy|snappy|speedy|spider|msnbot|msrbot|vortex|^vortex|crawler|favicon|indexer|Riddler|scooter|scraper|scrubby|WhatWeb|WinHTTP|bingbot|BingPreview|openbot|gigabot|furlbot|polybot|seekbot|^voyager|archiver|Icarus6j|mogimogi|Netvibes|blitzbot|altavista|charlotte|findlinks|Retreiver|TLSProber|WordPress|SeznamBot|ProoXiBot|wsr\-agent|Squrl Java|EtaoSpider|PaperLiBot|SputnikBot|A6\-Indexer|netresearch|searchsight|baiduspider|YisouSpider|ICC\-Crawler|http%20client|Python-urllib|dataparksearch|converacrawler|Screaming Frog|AppEngine-Google|YahooCacheSystem|fast\-webcrawler|Sogou Pic Spider|semanticdiscovery|Innovazion Crawler|facebookexternalhit|Google.{0,200}/\+/web/snippet|Google-HTTP-Java-Client|BlogBridge|IlTrovatore-Setaccio|InternetArchive|GomezAgent|WebThumbnail|heritrix|NewsGator|PagePeeker|Reaper|ZooShot|holmes|NL-Crawler|Pingdom|StatusCake|WhatsApp|masscan|Google Web Preview|Qwantify|Yeti|OgScrper)'
- regex: '^.{0,100}(bot|BUbiNG|zao|borg|DBot|oegp|silk|Xenu|zeal|^NING|CCBot|crawl|htdig|lycos|slurp|teoma|voila|yahoo|Sogou|CiBra|Nutch|^Java/|^JNLP/|Daumoa|Daum|Genieo|ichiro|larbin|pompos|Scrapy|snappy|speedy|spider|msnbot|msrbot|vortex|^vortex|crawler|favicon|indexer|Riddler|scooter|scraper|scrubby|WhatWeb|WinHTTP|bingbot|BingPreview|openbot|gigabot|furlbot|polybot|seekbot|^voyager|archiver|Icarus6j|mogimogi|Netvibes|blitzbot|altavista|charlotte|findlinks|Retreiver|TLSProber|WordPress|SeznamBot|ProoXiBot|wsr\-agent|Squrl Java|EtaoSpider|PaperLiBot|SputnikBot|A6\-Indexer|netresearch|searchsight|baiduspider|YisouSpider|ICC\-Crawler|http%20client|Python-urllib|dataparksearch|converacrawler|Screaming Frog|AppEngine-Google|YahooCacheSystem|fast\-webcrawler|Sogou Pic Spider|semanticdiscovery|Innovazion Crawler|facebookexternalhit|Google.{0,200}/\+/web/snippet|Google-HTTP-Java-Client|BlogBridge|IlTrovatore-Setaccio|InternetArchive|GomezAgent|WebThumbnail|heritrix|NewsGator|PagePeeker|Reaper|ZooShot|holmes|NL-Crawler|Pingdom|StatusCake|WhatsApp|masscan|Google Web Preview|Qwantify|Yeti|OgScrper|RecipeRadar|GPTBot|Google-InspectionTool)'
regex_flag: 'i'
device_replacement: 'Spider'
brand_replacement: 'Spider'
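The CFNetwork/Darwin rules in the `os_parsers` hunks above depend on ordering: the narrower `Darwin/22\.([0-5])` rule has to be tried before the `Darwin/(22)` fallback, which in turn precedes the catch-all `Darwin/` rule. A minimal Python sketch of that first-match-wins behaviour follows. The three patterns are copied from the diff; the `RULES` tuple layout and the `parse_os` helper are assumptions made for illustration, not the way this project or its parser library actually loads the rules.

```python
import re

# (pattern, os family, major version, minor template). Patterns come from the
# diff above; this tuple layout is a hypothetical simplification of the YAML
# keys os_replacement / os_v1_replacement / os_v2_replacement.
RULES = [
    (r'CFNetwork/.{0,100} Darwin/22\.([0-5])\.\d+', 'iOS', '16', '$1'),
    (r'CFNetwork/.{0,100} Darwin/(22)\.\d+', 'iOS', '16', None),
    (r'CFNetwork/.{0,100} Darwin/', 'iOS', None, None),
]


def parse_os(user_agent):
    """Return (family, major, minor) from the first rule that matches."""
    for pattern, family, major, minor in RULES:
        match = re.search(pattern, user_agent)
        if match is None:
            continue
        if minor == '$1':
            # '$1' means "substitute the first capture group".
            minor = match.group(1)
        return family, major, minor
    return None


print(parse_os('App/1 CFNetwork/1408.0.4 Darwin/22.5.0'))  # ('iOS', '16', '5')
print(parse_os('App/1 CFNetwork/1410.0.3 Darwin/22.6.0'))  # ('iOS', '16', None)
```

If the second rule were listed first, it would also match `Darwin/22.5.0` and the minor version would never be filled in, which is why the specific rule sits above the fallback.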
@@ -4,7 +4,6 @@ CREATE TABLE clicks (
id TEXT PRIMARY KEY NOT NULL,
link_id TEXT NOT NULL,
user_agent TEXT,
language TEXT,
browser TEXT,
os TEXT,
source TEXT,
@@ -0,0 +1,49 @@
-- +micrate Up
-- SQL in section 'Up' is executed when this migration is applied
-- Step 1: Create a new table with the desired column type
CREATE TABLE links_new (
id TEXT PRIMARY KEY NOT NULL,
user_id TEXT NOT NULL,
slug VARCHAR(8) UNIQUE NOT NULL,
url TEXT NOT NULL,
created_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
-- Step 2: Copy data from the old table to the new table
INSERT INTO links_new (id, user_id, slug, url, created_at, updated_at)
SELECT id, user_id, slug, url, created_at, updated_at FROM links;
-- Step 3: Drop the old table
DROP TABLE links;
-- Step 4: Rename the new table to the old table's name
ALTER TABLE links_new RENAME TO links;
-- +micrate Down
-- SQL section 'Down' is executed when this migration is rolled back
-- Step 1: Create a new table with the original column type
CREATE TABLE links_old (
id TEXT PRIMARY KEY NOT NULL,
user_id TEXT NOT NULL,
slug VARCHAR(4) UNIQUE NOT NULL,
url TEXT NOT NULL,
created_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
-- Step 2: Copy data from the current table to the old table
INSERT INTO links_old (id, user_id, slug, url, created_at, updated_at)
SELECT id, user_id, substr(slug, 1, 4), url, created_at, updated_at FROM links;
-- Step 3: Drop the current table
DROP TABLE links;
-- Step 4: Rename the old table to the current table's name
ALTER TABLE links_old RENAME TO links;
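The four steps in the migration above (create a replacement table, copy rows, drop the original, rename) can be tried against an in-memory database. The sketch below uses Python's standard `sqlite3` module and a reduced `links` table with only `id`, `slug`, and `url`; the `rebuild_links` and `demo` helpers and the explicit `BEGIN`/`COMMIT` wrapper are assumptions added for illustration and are not part of the migration file.

```python
import sqlite3

# Same create / copy / drop / rename sequence as the 'Up' section, on a
# reduced column set. The transaction wrapper is an addition of this sketch.
REBUILD_SQL = """
BEGIN;
CREATE TABLE links_new (
  id TEXT PRIMARY KEY NOT NULL,
  slug VARCHAR(8) UNIQUE NOT NULL,
  url TEXT NOT NULL
);
INSERT INTO links_new (id, slug, url)
SELECT id, slug, url FROM links;
DROP TABLE links;
ALTER TABLE links_new RENAME TO links;
COMMIT;
"""


def rebuild_links(conn):
    """Replace the links table with one whose slug column is VARCHAR(8)."""
    conn.executescript(REBUILD_SQL)


def demo():
    """Build a tiny 'before' database, run the rebuild, return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE links (
      id TEXT PRIMARY KEY NOT NULL,
      slug VARCHAR(4) UNIQUE NOT NULL,
      url TEXT NOT NULL
    );
    INSERT INTO links (id, slug, url)
    VALUES ('a1', 'abcd', 'https://example.com/');
    """)
    rebuild_links(conn)
    return conn


conn = demo()
print(conn.execute("SELECT id, slug, url FROM links").fetchall())
```

Running the copy and the drop inside one transaction means a failure partway through leaves the original `links` table in place.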
@@ -0,0 +1,8 @@
-- +micrate Up
-- SQL in section 'Up' is executed when this migration is applied
ALTER TABLE clicks ADD COLUMN country TEXT;
ALTER TABLE clicks RENAME COLUMN source TO referer;
-- +micrate Down
-- SQL section 'Down' is executed when this migration is rolled back
ALTER TABLE clicks RENAME COLUMN referer TO source;
@@ -0,0 +1,13 @@
-- +micrate Up
-- SQL in section 'Up' is executed when this migration is applied
UPDATE clicks SET user_agent = NULL WHERE user_agent = 'Unknown';
UPDATE clicks SET browser = NULL WHERE browser = 'Unknown';
UPDATE clicks SET os = NULL WHERE os = 'Unknown';
UPDATE clicks SET referer = NULL WHERE referer = 'Unknown';
-- +micrate Down
-- SQL section 'Down' is executed when this migration is rolled back
UPDATE clicks SET user_agent = 'Unknown' WHERE user_agent IS NULL;
UPDATE clicks SET browser = 'Unknown' WHERE browser IS NULL;
UPDATE clicks SET os = 'Unknown' WHERE os IS NULL;
UPDATE clicks SET referer = 'Unknown' WHERE referer IS NULL;
@@ -0,0 +1,9 @@
-- +micrate Up
-- SQL in section 'Up' is executed when this migration is applied
DROP INDEX IF EXISTS idx_links_slug; -- Remove old composite index
CREATE INDEX IF NOT EXISTS idx_links_slug_optimized ON links (slug, url);
-- +micrate Down
-- SQL in section 'Down' is executed when this migration is rolled back
DROP INDEX IF EXISTS idx_links_slug_optimized;
CREATE INDEX IF NOT EXISTS idx_links_slug ON links (id, slug, url);
@@ -0,0 +1,102 @@
-- +micrate Up
-- SQL in section 'Up' is executed when this migration is applied
-- 1. Create new users table with INTEGER PK
CREATE TABLE users_new (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name VARCHAR(100) NOT NULL,
api_key VARCHAR(64) UNIQUE NOT NULL,
created_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL
);
-- Create a mapping table to track old and new user IDs
CREATE TEMPORARY TABLE user_id_map (
old_id TEXT,
new_id INTEGER
);
-- Insert users data and capture the mappings
INSERT INTO users_new (name, api_key, created_at, updated_at)
SELECT name, api_key, created_at, updated_at FROM users;
INSERT INTO user_id_map
SELECT u.id, u_new.id
FROM users u
JOIN users_new u_new ON u_new.api_key = u.api_key;
-- 2. Create new links table with INTEGER PK
CREATE TABLE links_new (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
slug VARCHAR(8) UNIQUE NOT NULL,
url TEXT NOT NULL,
created_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
FOREIGN KEY (user_id) REFERENCES users_new(id) ON DELETE CASCADE
);
-- Create a mapping table for links
CREATE TEMPORARY TABLE link_id_map (
old_id TEXT,
new_id INTEGER
);
-- Insert links data with new user_id foreign keys
INSERT INTO links_new (user_id, slug, url, created_at, updated_at)
SELECT
(SELECT new_id FROM user_id_map WHERE old_id = l.user_id),
l.slug,
l.url,
l.created_at,
l.updated_at
FROM links l;
-- Create the mapping for links
INSERT INTO link_id_map
SELECT l.id, l_new.id
FROM links l
JOIN links_new l_new ON l_new.slug = l.slug AND l_new.url = l.url;
-- 3. Create new clicks table with INTEGER PK
CREATE TABLE clicks_new (
id INTEGER PRIMARY KEY AUTOINCREMENT,
link_id INTEGER NOT NULL,
user_agent TEXT,
browser TEXT,
os TEXT,
referer TEXT,
country TEXT,
created_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at INTEGER DEFAULT CURRENT_TIMESTAMP NOT NULL,
FOREIGN KEY (link_id) REFERENCES links_new(id) ON DELETE CASCADE
);
-- Insert clicks data with new link_id foreign keys
INSERT INTO clicks_new (link_id, user_agent, browser, os, referer, country, created_at, updated_at)
SELECT
(SELECT new_id FROM link_id_map WHERE old_id = c.link_id),
c.user_agent,
c.browser,
c.os,
c.referer,
c.country,
c.created_at,
c.updated_at
FROM clicks c;
-- 4. Drop old tables and rename new tables
DROP TABLE clicks;
DROP TABLE links;
DROP TABLE users;
ALTER TABLE clicks_new RENAME TO clicks;
ALTER TABLE links_new RENAME TO links;
ALTER TABLE users_new RENAME TO users;
-- 5. Drop unused indexes
DROP INDEX IF EXISTS index_users_api_key;
DROP INDEX IF EXISTS idx_links_slug;
DROP INDEX IF EXISTS idx_links_slug_optimized;
-- +micrate Down
-- SQL section 'Down' is executed when this migration is rolled back
@@ -0,0 +1,69 @@
INSERT INTO users (name, api_key)
VALUES
('User 1', 'secure_api_key_1'),
('User 2', 'secure_api_key_2');
-- Create 10,000 links (5,000 per user)
WITH RECURSIVE link_numbers(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM link_numbers
LIMIT 10000
)
INSERT INTO links (user_id, slug, url)
SELECT
((n-1) % 2) + 1, -- User ID (1-2)
'slug' || n, -- Unique slug
'https://sjdonado.com/page/' || n
FROM link_numbers;
-- Create 1,000 clicks per link (10 million total)
WITH RECURSIVE counts(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM counts
LIMIT 1000
)
INSERT INTO clicks (link_id, user_agent, browser, os, referer, country)
SELECT
l.id,
CASE (c.n % 5)
WHEN 0 THEN 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
WHEN 1 THEN 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)'
WHEN 2 THEN 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_0)'
WHEN 3 THEN 'Mozilla/5.0 (X11; Linux x86_64)'
ELSE 'Mozilla/5.0 (Android 11; Mobile)'
END,
CASE (c.n % 3)
WHEN 0 THEN 'Firefox'
WHEN 1 THEN 'Chrome'
ELSE 'Safari'
END,
CASE (c.n % 4)
WHEN 0 THEN 'macOS'
WHEN 1 THEN 'Windows'
WHEN 2 THEN 'iOS'
ELSE 'Android'
END,
CASE (c.n % 6)
WHEN 0 THEN 'https://sjdonado.com'
WHEN 1 THEN 'https://donado.co'
WHEN 2 THEN 'https://idonthavespotify.donado.co'
WHEN 3 THEN 'https://spookyplanning.com'
WHEN 4 THEN 'https://github.com/sjdonado'
ELSE NULL
END,
CASE (c.n % 10)
WHEN 0 THEN 'Colombia'
WHEN 1 THEN 'Brazil'
WHEN 2 THEN 'Canada'
WHEN 3 THEN 'Germany'
WHEN 4 THEN 'France'
WHEN 5 THEN 'Japan'
WHEN 6 THEN 'Australia'
WHEN 7 THEN 'Brazil'
WHEN 8 THEN 'India'
ELSE 'China'
END
FROM links l
CROSS JOIN counts c;
+9 -3
@@ -1,9 +1,15 @@
services:
app:
container_name: bit
build: .
environment:
ENV: production
DATABASE_URL: sqlite3://./sqlite/data.db?journal_mode=wal&synchronous=normal&foreign_keys=true
APP_URL: http://0.0.0.0:4001
ADMIN_NAME: 'Tester'
ADMIN_API_KEY: '0p+mDvbpZGLPGVCXnV+EDduR9Blkv27Dhq9XSzSbdQY='
ports:
- 4001:4000
- 4000:4000
volumes:
- sqlite_data:/app/sqlite
volumes:
sqlite_data:
+187
@@ -0,0 +1,187 @@
## CLI
```
Usage: ./cli [options]
Options:
--create-user=NAME Create a new user with the given name
--list-users List all users
--delete-user=USER_ID Delete a user by ID
--update-parsers Download all required data files
```
## Local Development
### Requirements
- Crystal 1.18.2+
- Shards package manager
- SQLite3
### Install Dependencies
- Linux
```bash
sudo apt-get update && sudo apt-get install -y crystal libssl-dev libsqlite3-dev
```
- macOS
```bash
brew tap amberframework/micrate
brew install micrate
```
### Install Shards and Run
```bash
shards run bit
```
- Generate the `X-Api-Key`
```bash
shards run cli -- --create-user=Admin
```
- Run tests
```bash
ENV=test crystal spec
```
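- Call the API with the generated key. A minimal sketch, assuming the server is listening on `http://localhost:4000` (the development address in the OpenAPI spec); `BIT_URL`, `BIT_API_KEY`, `link_payload` and `create_link` are illustrative names, not part of the project
```bash
# Illustrative helpers, not part of the repository.
BIT_URL="${BIT_URL:-http://localhost:4000}" # assumed local address
BIT_API_KEY="${BIT_API_KEY:-}"              # key printed by --create-user

# Build the JSON body expected by POST /api/links.
# No escaping is done, so this is only suitable for plain URLs.
link_payload() {
  printf '{"url":"%s"}' "$1"
}

# Create a short link for the given URL and print the JSON response.
create_link() {
  curl -s -X POST "$BIT_URL/api/links" \
    -H "Content-Type: application/json" \
    -H "X-Api-Key: $BIT_API_KEY" \
    -d "$(link_payload "$1")"
}

# Usage: BIT_API_KEY=<your key> create_link https://example.com
```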
## Benchmark
### Run
```
shards build --release --no-debug --progress --stats
shards run benchmark
```
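The benchmark script checks for `bombardier`, `sqlite3` and `micrate` first and exits if any of them is missing. The same check can be run by hand beforehand. A minimal sketch, where `missing_tools` is an illustrative helper name and not part of the project:
```bash
# Print every command from the arguments that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools bombardier sqlite3 micrate)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are joined on one line.
  echo "Missing tools:" $missing
else
  echo "All benchmark tools found"
fi
```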
### Output
Chip: Apple M4 Pro. Memory: 24 GB
```
1762075350 ~/p/bit> shards build --release --no-debug --progress --stats
shards run benchmark
Dependencies are satisfied
Building: bit
Parse: 00:00:00.000652375 ( 1.17MB)
Semantic (top level): 00:00:00.419246250 ( 163.45MB)
Semantic (new): 00:00:00.001636125 ( 163.45MB)
Semantic (type declarations): 00:00:00.019569792 ( 179.45MB)
Semantic (abstract def check): 00:00:00.009145125 ( 195.45MB)
Semantic (restrictions augmenter): 00:00:00.008421709 ( 195.45MB)
Semantic (ivars initializers): 00:00:00.019696584 ( 211.45MB)
Semantic (cvars initializers): 00:00:00.106829666 ( 211.50MB)
Semantic (main): 00:00:00.649298375 ( 499.88MB)
Semantic (cleanup): 00:00:00.000765250 ( 499.88MB)
Semantic (recursive struct check): 00:00:00.000752250 ( 499.88MB)
Codegen (crystal): 00:00:00.521307417 ( 532.38MB)
Codegen (bc+obj): 00:00:00.143842542 ( 532.38MB)
Codegen (linking): 00:00:00.236228750 ( 532.38MB)
Macro runs:
- /opt/homebrew/Cellar/crystal/1.18.2/share/crystal/src/ecr/process.cr: reused previous compilation (00:00:00.003593375)
Codegen (bc+obj):
- all previous .o files were reused
Building: cli
Parse: 00:00:00.000053291 ( 1.17MB)
Semantic (top level): 00:00:00.323534167 ( 163.45MB)
Semantic (new): 00:00:00.001705500 ( 163.45MB)
Semantic (type declarations): 00:00:00.018311958 ( 179.45MB)
Semantic (abstract def check): 00:00:00.007766750 ( 195.45MB)
Semantic (restrictions augmenter): 00:00:00.005686667 ( 195.45MB)
Semantic (ivars initializers): 00:00:00.011239792 ( 211.45MB)
Semantic (cvars initializers): 00:00:00.100870833 ( 211.50MB)
Semantic (main): 00:00:00.285426750 ( 371.62MB)
Semantic (cleanup): 00:00:00.000369875 ( 371.62MB)
Semantic (recursive struct check): 00:00:00.000570917 ( 371.62MB)
Codegen (crystal): 00:00:00.317534875 ( 387.88MB)
Codegen (bc+obj): 00:00:00.097321417 ( 387.88MB)
Codegen (linking): 00:00:00.095931000 ( 387.88MB)
Codegen (bc+obj):
- all previous .o files were reused
Building: benchmark
Parse: 00:00:00.000228500 ( 1.17MB)
Semantic (top level): 00:00:00.242174458 ( 147.78MB)
Semantic (new): 00:00:00.000863333 ( 147.78MB)
Semantic (type declarations): 00:00:00.011527792 ( 147.78MB)
Semantic (abstract def check): 00:00:00.031242333 ( 147.78MB)
Semantic (restrictions augmenter): 00:00:00.003593583 ( 147.78MB)
Semantic (ivars initializers): 00:00:00.006753667 ( 147.78MB)
Semantic (cvars initializers): 00:00:00.028373834 ( 195.78MB)
Semantic (main): 00:00:00.152039542 ( 243.83MB)
Semantic (cleanup): 00:00:00.000249084 ( 243.83MB)
Semantic (recursive struct check): 00:00:00.000460417 ( 243.83MB)
Codegen (crystal): 00:00:00.075461000 ( 259.83MB)
Codegen (bc+obj): 00:00:04.834914333 ( 259.83MB)
Codegen (linking): 00:00:00.119920416 ( 259.83MB)
Codegen (bc+obj):
- no previous .o files were reused
Dependencies are satisfied
Building: benchmark
Executing: benchmark
Cleaning up benchmark database...
Deleted existing database: ./sqlite/data.benchmark.db
Database cleanup completed.
Running database migrations...
Migrating db, current version: 0, target: 20250319192003
OK 20240512214223_create_links.sql
OK 20240512225208_add_slug_index_to_links.sql
OK 20240513115731_create_users.sql
OK 20240513130054_add_api_key_index_to_users.sql
OK 20240711224103_create_clicks.sql
OK 20240714215409_update_slug_size_links.sql
OK 20250316102350_add_country_to_clicks.sql
OK 20250316111734_replace_unkwown_with_null.sql
OK 20250318072657_replace_slug_index_with_covering_index.sql
OK 20250319192003_convert_all_tables_text_ids_to_integer.sql
Migrations completed successfully.
Seeding benchmark database...
Database seeded successfully.
Starting application: ./bit...
Application output will be saved to: app_output.log
Application started with PID: 11638
Using database: ./sqlite/data.benchmark.db
Checking if server is ready at http://localhost:4001...
.Server is ready!
Fetching links from API...
Selected link: http://localhost:4001/slug9391
Starting benchmark with 100000 requests...
Bombarding http://localhost:4001/slug9391 with 100000 request(s) using 125 connection(s)
100000 / 100000 [============================================================================] 100.00% 11078/s 9s
Done!
Statistics Avg Stdev Max
Reqs/sec 11427.28 8889.68 30270.91
Latency 11.02ms 6.55ms 53.91ms
Latency Distribution
50% 1.85ms
75% 5.37ms
90% 39.36ms
95% 39.87ms
99% 42.66ms
HTTP codes:
1xx - 0, 2xx - 0, 3xx - 100000, 4xx - 0, 5xx - 0
others - 0
Throughput: 3.08MB/s
Benchmark completed successfully.
**** Resource Usage Statistics ****
Measurements: 12
Average CPU Usage: 71.5%
Average Memory Usage: 39.8 MiB
Peak CPU Usage: 100.0%
Peak Memory Usage: 53.41 MiB
**** Files Generated ****
Resource stats: resource_usage.log
Application log: app_output.log
Database: ./sqlite/data.benchmark.db
Stopping application...
Application stopped.
```
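`resource_usage.log` is tab-separated: a header row, then one `timestamp`, `CPU %`, `memory MiB` row per sample (taken roughly once per second), with the summary block appended at the end. The averages can be recomputed from the raw rows. A minimal sketch, where `summarize_usage` is an illustrative helper name and the row filter (three tab-separated fields with a numeric timestamp) is an assumption about the file layout:
```bash
# Print the sample count and the average CPU / memory from the raw rows.
# The header row and the appended summary lines are skipped by the filter.
summarize_usage() {
  awk -F'\t' '
    NF == 3 && $1 ~ /^[0-9]+$/ { cpu += $2; mem += $3; n++ }
    END {
      if (n > 0) printf "samples=%d avg_cpu=%.2f avg_mem_mib=%.2f\n", n, cpu / n, mem / n
    }
  ' "$1"
}

# Usage: summarize_usage resource_usage.log
```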
+503
@@ -0,0 +1,503 @@
openapi: 3.0.3
info:
title: Bit - URL Shortener API
description: |
Fast, lightweight, self-hosted URL shortener service with minimal click tracking.
## Getting Started
For setup instructions, please check the [README](https://github.com/sjdonado/bit/blob/master/README.md).
## Authentication
Multiple users are supported via `X-Api-Key` headers. Create, list and delete keys via the [CLI](https://github.com/sjdonado/bit/blob/master/SETUP.md#cli).
version: 1.6.0
contact:
name: sjdonado
url: https://sjdonado.com
servers:
- url: http://localhost:4000
description: Development server
security:
- ApiKeyAuth: []
paths:
/api/ping:
get:
summary: Ping the API
description: Health check endpoint to verify the API is running
operationId: ping
tags:
- Health
security: []
responses:
'200':
description: API is healthy
content:
application/json:
schema:
type: object
properties:
data:
type: string
example: pong
/{slug}:
get:
summary: Redirect by slug
description: Redirects to the original URL and tracks the click asynchronously
operationId: redirectBySlug
tags:
- Redirects
security: []
parameters:
- name: slug
in: path
required: true
description: The short URL slug
schema:
type: string
example: 3wP4BQ
- name: utm_source
in: query
required: false
description: UTM source parameter for tracking
schema:
type: string
example: email_campaign
responses:
'301':
description: Redirect to original URL
headers:
Location:
description: The original URL
schema:
type: string
example: https://example.com
X-Forwarded-For:
description: Client IP address
schema:
type: string
User-Agent:
description: User agent string
schema:
type: string
'404':
description: Link not found
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
/api/links:
get:
summary: List all links
description: Retrieve all links for the authenticated user with pagination support
operationId: listLinks
tags:
- Links
parameters:
- name: limit
in: query
description: Number of results per page
schema:
type: integer
default: 100
minimum: 1
maximum: 1000
- name: cursor
in: query
description: Pagination cursor from previous response
schema:
type: string
responses:
'200':
description: List of links
content:
application/json:
schema:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/LinkSummary'
pagination:
$ref: '#/components/schemas/Pagination'
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
post:
summary: Create new link
description: Create a new shortened link
operationId: createLink
tags:
- Links
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- url
properties:
url:
type: string
format: uri
description: The URL to shorten
example: https://example.com
responses:
'201':
description: Link created successfully
content:
application/json:
schema:
type: object
properties:
data:
$ref: '#/components/schemas/Link'
'400':
description: Bad request - invalid URL or missing field
content:
application/json:
schema:
oneOf:
- $ref: '#/components/schemas/Error'
- $ref: '#/components/schemas/ValidationErrors'
examples:
missingField:
value:
error: "url: Required field"
invalidUrl:
value:
errors:
url:
- is invalid
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
/api/links/{id}:
get:
summary: Get link by ID
description: Retrieve a specific link along with up to 100 of its most recent clicks. For the complete click history, use /api/links/{id}/clicks
operationId: getLink
tags:
- Links
parameters:
- name: id
in: path
required: true
description: Link ID
schema:
type: integer
format: int64
responses:
'200':
description: Link details
content:
application/json:
schema:
type: object
properties:
data:
$ref: '#/components/schemas/Link'
'404':
description: Link not found
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
put:
summary: Update link
description: Update the URL of an existing link
operationId: updateLink
tags:
- Links
parameters:
- name: id
in: path
required: true
description: Link ID
schema:
type: integer
format: int64
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- url
properties:
url:
type: string
format: uri
description: The new URL
example: https://newexample.com
responses:
'200':
description: Link updated successfully
content:
application/json:
schema:
type: object
properties:
data:
$ref: '#/components/schemas/Link'
'400':
description: Bad request
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'403':
description: Forbidden - link belongs to another user
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'404':
description: Link not found
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
delete:
summary: Delete link
description: Delete a link and all its associated clicks
operationId: deleteLink
tags:
- Links
parameters:
- name: id
in: path
required: true
description: Link ID
schema:
type: integer
format: int64
responses:
'204':
description: Link deleted successfully
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'403':
description: Forbidden - link belongs to another user
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'404':
description: Link not found
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
/api/links/{id}/clicks:
get:
summary: List clicks for a link
description: Retrieve all clicks for a specific link with pagination support
operationId: listClicks
tags:
- Clicks
parameters:
- name: id
in: path
required: true
description: Link ID
schema:
type: integer
format: int64
- name: limit
in: query
description: Number of results per page
schema:
type: integer
default: 100
minimum: 1
maximum: 1000
- name: cursor
in: query
description: Pagination cursor from previous response
schema:
type: string
responses:
'200':
description: List of clicks
content:
application/json:
schema:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Click'
pagination:
$ref: '#/components/schemas/Pagination'
'401':
description: Unauthorized
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
'404':
description: Link not found
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
components:
securitySchemes:
ApiKeyAuth:
type: apiKey
in: header
name: X-Api-Key
description: API key for authentication
schemas:
LinkSummary:
type: object
properties:
id:
type: integer
format: int64
description: Unique link identifier
example: 1
refer:
type: string
format: uri
description: The shortened URL
example: http://localhost:4000/3wP4BQ
origin:
type: string
format: uri
description: The original URL
example: https://monocuco.donado.co
Link:
allOf:
- $ref: '#/components/schemas/LinkSummary'
- type: object
properties:
clicks:
type: array
description: Array of click records (up to 100 most recent)
items:
$ref: '#/components/schemas/Click'
Click:
type: object
properties:
id:
type: integer
format: int64
description: Unique click identifier
example: 1
user_agent:
type: string
description: User agent string
example: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0
country:
type: string
description: Country code (ISO 3166-1 alpha-2)
example: US
nullable: true
browser:
type: string
description: Browser name
example: Firefox
nullable: true
os:
type: string
description: Operating system
example: Mac OS X
nullable: true
referer:
type: string
description: Referer domain or utm_source
example: Direct
nullable: true
created_at:
type: string
format: date-time
description: Click timestamp
example: 2024-07-12T19:25:22Z
Pagination:
type: object
properties:
has_more:
type: boolean
description: Whether there are more results
example: true
next:
type: integer
format: int64
description: Cursor for next page (link/click ID)
example: 12
nullable: true
Error:
type: object
properties:
error:
type: string
description: Error message
example: Resource not found
required:
- error
ValidationErrors:
type: object
properties:
errors:
type: object
additionalProperties:
type: array
items:
type: string
description: Field-level validation errors
example:
url:
- is invalid
tags:
- name: Health
description: Health check endpoints
- name: Redirects
description: URL redirection and click tracking
- name: Links
description: Link management operations
- name: Clicks
description: Click analytics and tracking
-11
@@ -1,11 +0,0 @@
#!/bin/bash
REGEXES_URL="https://raw.githubusercontent.com/ua-parser/uap-core/master/regexes.yaml"
DOWNLOAD_DIR="data"
REGEXES_FILE="regexes.yaml"
mkdir -p $DOWNLOAD_DIR
curl -L -o $DOWNLOAD_DIR/$REGEXES_FILE $REGEXES_URL
echo "Regexes file downloaded to $DOWNLOAD_DIR/$REGEXES_FILE"
+341
@@ -0,0 +1,341 @@
#!/usr/bin/env crystal
require "http/client"
require "json"
PORT = "4001"
APP_URL = "http://localhost:#{PORT}"
API_URL = "#{APP_URL}/api/links"
API_KEY = "secure_api_key_1"
NUMBER_OF_REQUESTS = 100000
APP_COMMAND = "./bit"
APP_ARGS = [] of String
STATS_FILE = "resource_usage.log"
APP_LOG_FILE = "app_output.log"
DATABASE_URL = "sqlite3://./sqlite/data.benchmark.db?journal_mode=wal&synchronous=normal&foreign_keys=true"
DATABASE_FILE = "./sqlite/data.benchmark.db"
class ResourceMonitor
def initialize(@pid : Int32)
@running = false
@stats = [] of {timestamp: Time, cpu: Float64, memory: Float64}
end
def start
@running = true
@stats.clear
File.write(STATS_FILE, "Timestamp\tCPU(%)\tMemory(MiB)\n")
spawn do
while @running
if stat = capture_stats
File.open(STATS_FILE, "a") do |file|
file.puts "#{stat[:timestamp].to_unix}\t#{stat[:cpu]}\t#{stat[:memory]}"
end
@stats << stat
end
sleep 1.seconds
end
end
end
def stop
@running = false
sleep 1.seconds
end
private def capture_stats
output = IO::Memory.new
process = Process.run(
"ps", ["-p", @pid.to_s, "-o", "%cpu,%mem,rss"],
output: output
)
if process.success?
lines = output.to_s.strip.split("\n")
if lines.size >= 2
data_line = lines[1].strip.split
if data_line.size >= 3
cpu = data_line[0].to_f
# ps reports RSS in KiB (on both macOS and Linux); convert to MiB
memory_kb = data_line[2].to_f
memory_mib = memory_kb / 1024.0
return {timestamp: Time.utc, cpu: cpu, memory: memory_mib}
end
end
end
nil
end
def print_summary
return if @stats.empty?
total_cpu = 0.0
total_memory = 0.0
peak_cpu = 0.0
peak_memory = 0.0
@stats.each do |stat|
total_cpu += stat[:cpu]
total_memory += stat[:memory]
peak_cpu = stat[:cpu] if stat[:cpu] > peak_cpu
peak_memory = stat[:memory] if stat[:memory] > peak_memory
end
avg_cpu = total_cpu / @stats.size
avg_memory = total_memory / @stats.size
summary = <<-STATS
**** Resource Usage Statistics ****
Measurements: #{@stats.size}
Average CPU Usage: #{avg_cpu.round(2)}%
Average Memory Usage: #{avg_memory.round(2)} MiB
Peak CPU Usage: #{peak_cpu.round(2)}%
Peak Memory Usage: #{peak_memory.round(2)} MiB
STATS
File.open(STATS_FILE, "a") do |file|
file.puts summary
end
puts summary
end
end
def start_application : Process
puts "Starting application: #{APP_COMMAND}..."
puts "Application output will be saved to: #{APP_LOG_FILE}"
log_file = File.open(APP_LOG_FILE, "w")
process = Process.new(
APP_COMMAND,
APP_ARGS,
output: log_file,
error: log_file,
env: {
"DATABASE_URL" => DATABASE_URL,
"APP_URL" => APP_URL,
"PORT" => PORT,
}
)
puts "Application started with PID: #{process.pid}"
puts "Using database: #{DATABASE_FILE}"
process
end
def stop_application(process : Process)
puts "\nStopping application..."
process.signal(Signal::TERM)
# Give it a few seconds to shut down gracefully
sleep 3.seconds
# Force kill if still running
begin
process.signal(Signal::KILL)
rescue
# Process already terminated
end
puts "Application stopped."
end
def check_dependencies
{"bombardier", "sqlite3", "micrate"}.each do |cmd|
process = Process.run("which", [cmd], output: Process::Redirect::Close)
unless process.success?
puts "Error: #{cmd} is not installed. Please install it to proceed."
case cmd
when "bombardier"
puts " brew install bombardier"
when "sqlite3"
puts " brew install sqlite3"
when "micrate"
puts " shards install"
end
exit(1)
end
end
end
def wait_for_server
puts "Checking if server is ready at #{APP_URL}..."
30.times do
begin
if HTTP::Client.get("#{APP_URL}/api/ping").success?
puts "Server is ready!"
return
end
rescue
# Server not ready yet
end
sleep 1.seconds
print "."
end
puts "\nError: Server is not responding. Please start your application first."
exit(1)
end
def run_benchmark
puts "Fetching links from API..."
response = HTTP::Client.get(
"#{API_URL}?limit=10000",
headers: HTTP::Headers{"X-Api-Key" => API_KEY}
)
unless response.success?
puts "Failed to fetch links. Status: #{response.status_code}"
puts "Make sure your server is running and the API key is correct."
exit(1)
end
data = JSON.parse(response.body)
links = data["data"].as_a.map { |link| link["refer"].as_s }
if links.empty?
puts "No links found. Please seed your database first."
exit(1)
end
random_link = links.sample
puts "Selected link: #{random_link}"
puts "\nStarting benchmark with #{NUMBER_OF_REQUESTS} requests..."
sleep 2.seconds
process = Process.new(
"bombardier",
["-n", NUMBER_OF_REQUESTS.to_s, "-l", "--disableKeepAlives", random_link],
output: Process::Redirect::Inherit,
error: Process::Redirect::Inherit
)
status = process.wait
if status.success?
puts "\nBenchmark completed successfully."
else
puts "\nBombardier failed with error code: #{status.exit_code}"
exit(1)
end
end
def cleanup_database
puts "Cleaning up benchmark database..."
if File.exists?(DATABASE_FILE)
File.delete(DATABASE_FILE)
puts "Deleted existing database: #{DATABASE_FILE}"
end
# Also remove WAL and SHM files if they exist
["#{DATABASE_FILE}-wal", "#{DATABASE_FILE}-shm"].each do |file|
if File.exists?(file)
File.delete(file)
puts "Deleted: #{file}"
end
end
# Ensure sqlite directory exists
Dir.mkdir_p("./sqlite")
puts "Database cleanup completed."
end
def run_migrations
puts "Running database migrations..."
process = Process.run("which", ["micrate"], output: Process::Redirect::Close)
unless process.success?
puts "Error: micrate is not installed. Please install it to proceed."
puts " shards install"
exit(1)
end
process = Process.run(
"micrate",
["up"],
env: {"DATABASE_URL" => DATABASE_URL},
output: Process::Redirect::Inherit,
error: Process::Redirect::Inherit
)
if process.success?
puts "Migrations completed successfully."
else
puts "Error: Migrations failed."
exit(1)
end
end
def seed_database
puts "Seeding benchmark database..."
unless File.exists?("./db/seed.sql")
puts "Warning: ./db/seed.sql not found. Skipping database seeding."
return
end
unless File.exists?(DATABASE_FILE)
puts "Warning: #{DATABASE_FILE} not found. Database may not be initialized."
end
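# Open the seed file in a block so the handle is closed once sqlite3 exits
process = File.open("./db/seed.sql") do |seed_file|
Process.run(
"sqlite3",
[DATABASE_FILE],
input: seed_file,
output: Process::Redirect::Inherit,
error: Process::Redirect::Inherit
)
end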
if process.success?
puts "Database seeded successfully."
else
puts "Warning: Database seeding failed. Continuing anyway..."
end
end
def main
check_dependencies
# Setup benchmark database
cleanup_database
run_migrations
seed_database
app_process = start_application
begin
wait_for_server
# Give it a moment to settle
sleep 2.seconds
monitor = ResourceMonitor.new(app_process.pid.to_i32)
monitor.start
run_benchmark
monitor.stop
monitor.print_summary
puts "\n**** Files Generated ****"
puts " Resource stats: #{STATS_FILE}"
puts " Application log: #{APP_LOG_FILE}"
puts " Database: #{DATABASE_FILE}"
ensure
# Always stop the application
stop_application(app_process)
end
end
main
+12 -4
@@ -1,4 +1,3 @@
require "uuid"
require "option_parser"
require "../app/services/cli"
@@ -19,11 +18,20 @@ OptionParser.parse do |parser|
exit
end
parser.on("--update-parsers", "Download UA regexes and/or GeoLite2 database") do
puts "=== Starting data files update ==="
App::Services::Cli.update_uap_regexes
App::Services::Cli.update_geolite_db
puts "=== Data files updated successfully ==="
exit
end
if ARGV.empty?
puts "Usage: ./cli [options]"
puts "Options:"
puts " --create-user=NAME Create a new user with the given name"
puts " --list-users List all users"
puts " --delete-user=USER_ID Delete a user by ID"
puts " --create-user=NAME Create a new user with the given name"
puts " --list-users List all users"
puts " --delete-user=USER_ID Delete a user by ID"
puts " --update-parsers Download all required data files"
end
end
-7
@@ -1,7 +0,0 @@
require "sqlite3"
require"micrate"
require "../app/config/env"
Micrate::DB.connection_url = ENV["DATABASE_URL"]
Micrate::Cli.run_up
+11 -7
@@ -2,11 +2,11 @@ version: 2.0
shards:
backtracer:
git: https://github.com/sija/backtracer.cr.git
version: 1.2.2
version: 1.2.4
crecto:
git: https://github.com/fridgerator/crecto.git
version: 0.12.1
version: 0.14.0
db:
git: https://github.com/crystal-lang/crystal-db.git
@@ -18,10 +18,18 @@ shards:
exception_page:
git: https://github.com/crystal-loot/exception_page.git
version: 0.4.1
version: 0.5.0
ipaddress:
git: https://github.com/sija/ipaddress.cr.git
version: 0.2.3
kemal:
git: https://github.com/kemalcr/kemal.git
version: 1.7.3
maxminddb:
git: https://github.com/delef/maxminddb.cr.git
version: 1.5.0
micrate:
@@ -48,7 +56,3 @@ shards:
git: https://github.com/crystal-lang/crystal-sqlite3.git
version: 0.19.0
user_agent_parser:
git: https://github.com/busyloop/user_agent_parser.git
version: 2.0.1
+8 -8
@@ -1,20 +1,21 @@
name: bit
version: 1.1.0
version: 1.6.0
authors:
- Juan Rodriguez <sjdonado@icloud.com>
- Juan Rodriguez Donado <@sjdonado>
targets:
bit:
main: bit.cr
cli:
main: scripts/cli.cr
migrate:
main: scripts/migrate.cr
benchmark:
main: scripts/benchmark.cr
dependencies:
kemal:
github: kemalcr/kemal
version: 1.7.3
sqlite3:
github: crystal-lang/crystal-sqlite3
crecto:
@@ -22,9 +23,8 @@ dependencies:
micrate:
github: amberframework/micrate
version: 0.15.1
user_agent_parser:
github: busyloop/user_agent_parser
version: 2.0.1
maxminddb:
github: delef/maxminddb.cr
development_dependencies:
dotenv:
@@ -32,6 +32,6 @@ development_dependencies:
spec-kemal:
github: kemalcr/spec-kemal
crystal: ">= 1.12.1"
crystal: ">= 1.18.2"
license: MIT
+210 -48
@@ -15,14 +15,14 @@ describe "App::Controllers::Link" do
body: payload.to_json
)
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String)))).from_json(response.body)
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String | Int64)))).from_json(response.body)
parsed_response["data"]["origin"].should eq(payload["url"])
end
it "should return existing link if url already exists" do
test_user = create_test_user()
payload = {"url" => "https://kagi.com"}
payload = {"url" => "http://idonthavespotify.donado.co"}
post(
"/api/links",
headers: HTTP::Headers{"Content-Type" => "application/json", "X-Api-Key" => test_user.api_key.to_s},
@@ -75,62 +75,94 @@ describe "App::Controllers::Link" do
payload = {"url" => "https://kagi.com"}
post("/api/links", headers: HTTP::Headers{"Content-Type" => "application/json"}, body: payload.to_json)
expected = {"error" => "Unauthorized"}.to_json
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
end
describe "Index" do
it "should redirect to origin domain" do
link = "https://kagi.com"
it "should redirect to origin domain with forwarded headers" do
link = "https://test.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
serialized_link = App::Serializers::Link.new(test_link)
get(serialized_link.refer, headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"
get("/#{test_link.slug}", headers: HTTP::Headers{
"X-Api-Key" => test_user.api_key.to_s,
"User-Agent" => user_agent
})
response.headers["Location"].should eq(link)
response.headers["User-Agent"].should eq(user_agent)
response.headers.has_key?("X-Forwarded-For").should be_true
end
it "should create a new click after redirect" do
link = "https://kagi.com"
it "should create a new click after redirect with proper information" do
link = "https://sjdonado.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
serialized_link = App::Serializers::Link.new(test_link)
get(serialized_link.refer, headers: HTTP::Headers{"User-Agent" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"})
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"
referer = "https://example.com/page"
get("/#{test_link.slug}", headers: HTTP::Headers{
"User-Agent" => user_agent,
"Referer" => referer
})
Fiber.yield # replace yield with sleep 5 to debug errors
response.headers["Location"].should eq(link)
updated_test_link = get_test_link(test_link.id)
# Verify that the click was recorded
updated_test_link = get_test_link(test_link.id.not_nil!)
updated_test_link.clicks.size.should eq(test_link.clicks.size + 1)
# Verify click details
latest_click = updated_test_link.clicks.last
latest_click.user_agent.should eq(user_agent)
latest_click.browser.should eq("Firefox")
latest_click.os.should eq("Mac OS X")
latest_click.referer.should eq("example.com") # Should extract host from the referer
end
it "should return 404 - link does not exist" do
link = "https://kagi.com"
it "should create a click with utm_source when no referer is provided" do
link = "https://sjdonado.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
serialized_link = App::Serializers::Link.new(test_link)
delete_test_link(test_link.id)
# Add utm_source parameter
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"
get("/#{test_link.slug}?utm_source=email_campaign", headers: HTTP::Headers{
"User-Agent" => user_agent
})
get(serialized_link.refer, headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
sleep 0.2.seconds # Wait for async click creation
expected = {"error" => "Not Found"}.to_json
updated_test_link = get_test_link(test_link.id.not_nil!)
latest_click = updated_test_link.clicks.last
latest_click.referer.should eq("email_campaign")
end
it "should return 404 - link does not exist" do
test_user = create_test_user()
get("/R4kj2")
expected = {"error" => "Resource not found"}.to_json
response.status_code.should eq(404)
response.body.should eq(expected)
end
end
describe "All" do
it "should return all links" do
links = ["https://google.com", "google.com", "google.com.co"]
it "should return all links with pagination" do
links = ["https://sjdonado.com", "sjdonado.com", "sjdonado.com.co"]
test_user = create_test_user()
links.each do |link|
@@ -139,14 +171,58 @@ describe "App::Controllers::Link" do
get("/api/links", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Array(Hash(String, String | Int64 | Array(Hash(String, String))))).from_json(response.body)
parsed_response["data"][0]["origin"].should eq(links[0])
parsed_response["data"][1]["origin"].should eq(links[1])
parsed_response["data"][2]["origin"].should eq(links[2])
parsed_response = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
# Check that each link is in the response data
origins = parsed_response["data"].as(Array).map { |link| link["origin"] }
links.each do |link|
origins.should contain(link)
end
parsed_response["pagination"].as(Hash)["has_more"].should be_false
end
it "should respect custom limit parameter" do
test_user = create_test_user()
5.times do |i|
create_test_link(test_user, "https://example.com/#{i}")
end
get("/api/links?limit=2", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
parsed_response["data"].as(Array).size.should eq(2)
parsed_response["pagination"].as(Hash)["has_more"].should be_true
parsed_response["pagination"].as(Hash)["next"].should_not be_nil
end
it "should support cursor-based pagination" do
test_user = create_test_user()
5.times do |i|
create_test_link(test_user, "https://example.com/#{i}")
end
# Get first page
get("/api/links?limit=2", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
first_page = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
cursor = first_page["pagination"].as(Hash)["next"]
# Get second page using cursor
get("/api/links?limit=2&cursor=#{cursor}", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
second_page = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
# Ensure different links are returned
first_page_ids = first_page["data"].as(Array).map { |link| link["id"] }
second_page_ids = second_page["data"].as(Array).map { |link| link["id"] }
# Check that no IDs from first page appear in second page
(first_page_ids & second_page_ids).empty?.should be_true
end
it "should return owned links only" do
links = ["https://google.com", "google.com", "google.com.co", "kagi.com"]
links = ["https://donado.co", "donado.co", "uninorte.edu.co", "kagi.com"]
test_user = create_test_user()
links[0..2].each do |link|
@@ -158,41 +234,48 @@ describe "App::Controllers::Link" do
get("/api/links", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Array(Hash(String, String | Int64 | Array(Hash(String, String))))).from_json(response.body)
parsed_response["data"].size.should eq(3)
parsed_response["data"][0]["origin"].should eq(links[0])
parsed_response["data"][1]["origin"].should eq(links[1])
parsed_response["data"][2]["origin"].should eq(links[2])
parsed_response = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
parsed_response["data"].as(Array).size.should eq(3)
origins = parsed_response["data"].as(Array).map { |link| link["origin"] }
links[0..2].each do |link|
origins.should contain(link)
end
origins.should_not contain(links[3])
end
it "should return 401 - missing api key" do
get "/api/links"
expected = {"error" => "Unauthorized"}.to_json
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
end
describe "Get" do
it "should return the specified link with click details" do
link = "https://kagi.com"
it "should return the specified link with limited click details" do
link = "https://bing.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
110.times do
create_test_click(test_link)
end
get("/api/links/#{test_link.id}", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String)))).from_json(response.body)
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String | Int64)))).from_json(response.body)
parsed_response["data"]["origin"].should eq(link)
parsed_response["data"]["clicks"].should be_a(Array(Hash(String, String)))
parsed_response["data"]["clicks"].as(Array).size.should eq(100)
end
it "should return 404 - link does not exist" do
test_user = create_test_user()
get("/api/links/1", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
get("/api/links/999999", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
expected = {"error" => "Not Found"}.to_json
expected = {"error" => "Resource not found"}.to_json
response.status_code.should eq(404)
response.body.should eq(expected)
end
@@ -200,7 +283,86 @@ describe "App::Controllers::Link" do
it "should return 401 - missing api key" do
get "/api/links/1"
expected = {"error" => "Unauthorized"}.to_json
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
end
describe "Clicks" do
it "should return paginated clicks for a link" do
link = "https://example.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
5.times do
create_test_click(test_link)
end
get("/api/links/#{test_link.id}/clicks", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
parsed_response["data"].as(Array).size.should eq(5)
parsed_response["pagination"].as(Hash)["has_more"].should be_false
end
it "should respect limit parameter" do
link = "https://example.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
10.times do
create_test_click(test_link)
end
get("/api/links/#{test_link.id}/clicks?limit=3", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
parsed_response = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
parsed_response["data"].as(Array).size.should eq(3)
parsed_response["pagination"].as(Hash)["has_more"].should be_true
parsed_response["pagination"].as(Hash)["next"].should_not be_nil
end
it "should support cursor-based pagination" do
link = "https://example.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
10.times do
create_test_click(test_link)
end
# Get first page
get("/api/links/#{test_link.id}/clicks?limit=3", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
first_page = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
cursor = first_page["pagination"].as(Hash)["next"]
# Get second page using cursor
get("/api/links/#{test_link.id}/clicks?limit=3&cursor=#{cursor}", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
second_page = Hash(String, Array(Hash(String, String | Int64)) | Hash(String, Bool | String? | Int64?)).from_json(response.body)
# Ensure different clicks are returned
first_page_ids = first_page["data"].as(Array).map { |click| click["id"] }
second_page_ids = second_page["data"].as(Array).map { |click| click["id"] }
# Check that no IDs from first page appear in second page
(first_page_ids & second_page_ids).empty?.should be_true
end
it "should return 404 - link does not exist" do
test_user = create_test_user()
get("/api/links/999999/clicks", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
expected = {"error" => "Resource not found"}.to_json
response.status_code.should eq(404)
response.body.should eq(expected)
end
it "should return 401 - missing api key" do
get("/api/links/1/clicks")
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
@@ -208,18 +370,18 @@ describe "App::Controllers::Link" do
describe "Update" do
it "should update link url" do
link = "https://kagi.com"
link = "https://github.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
payload = {"url" => "https://kagi.com.co"}
payload = {"url" => "https://github.com.co"}
put(
"/api/links/#{test_link.id}",
headers: HTTP::Headers{"Content-Type" => "application/json", "X-Api-Key" => test_user.api_key.to_s},
body: payload.to_json
)
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String)))).from_json(response.body)
parsed_response = Hash(String, Hash(String, String | Int64 | Array(Hash(String, String | Int64)))).from_json(response.body)
parsed_response["data"]["origin"].should eq(payload["url"])
end
@@ -228,12 +390,12 @@ describe "App::Controllers::Link" do
payload = {"url" => "https://kagi.com.co"}
put(
"/api/links/1",
"/api/links/999999",
headers: HTTP::Headers{"Content-Type" => "application/json", "X-Api-Key" => test_user.api_key.to_s},
body: payload.to_json
)
expected = {"error" => "Not Found"}.to_json
expected = {"error" => "Resource not found"}.to_json
response.status_code.should eq(404)
response.body.should eq(expected)
end
@@ -246,7 +408,7 @@ describe "App::Controllers::Link" do
body: payload.to_json
)
expected = {"error" => "Unauthorized"}.to_json
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
@@ -254,7 +416,7 @@ describe "App::Controllers::Link" do
describe "Delete" do
it "should delete link url" do
link = "https://kagi.com"
link = "https://news.ycombinator.com"
test_user = create_test_user()
test_link = create_test_link(test_user, link)
@@ -266,9 +428,9 @@ describe "App::Controllers::Link" do
it "should return 404 - link does not exist" do
test_user = create_test_user()
delete("/api/links/1", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
delete("/api/links/999999", headers: HTTP::Headers{"X-Api-Key" => test_user.api_key.to_s})
expected = {"error" => "Not Found"}.to_json
expected = {"error" => "Resource not found"}.to_json
response.status_code.should eq(404)
response.body.should eq(expected)
end
@@ -276,7 +438,7 @@ describe "App::Controllers::Link" do
it "should return 401 - missing api key" do
delete "/api/links/1"
expected = {"error" => "Unauthorized"}.to_json
expected = {"error" => "Unauthorized access"}.to_json
response.status_code.should eq(401)
response.body.should eq(expected)
end
+1 -1
@@ -4,7 +4,7 @@ describe "App::Controllers::Ping" do
it "should return pong" do
get "/api/ping"
expected = {"pong" => "ok"}.to_json
expected = {"data" => "pong"}.to_json
response.body.should eq(expected)
end
end
+25
@@ -34,4 +34,29 @@ describe "App::Services::Cli" do
output.should contain "Failed to delete user"
end
it "sets up an admin user if environment variables are present" do
ENV["ADMIN_NAME"] = "adminuser"
ENV["ADMIN_API_KEY"] = "secure_admin_key"
App::Services::Cli.setup_admin_user
admin_user = App::Lib::Database.all(App::Models::User).find { |u| u.name == "adminuser" }
admin_user.should_not be_nil
admin_user = admin_user.not_nil!
admin_user.api_key.should eq "secure_admin_key"
App::Services::Cli.delete_user(admin_user.id)
end
it "skips admin setup if environment variables are missing" do
ENV.delete("ADMIN_NAME")
ENV.delete("ADMIN_API_KEY")
App::Services::Cli.setup_admin_user
users = App::Lib::Database.all(App::Models::User)
users.none? { |u| u.name == "adminuser" }.should be_true
end
end
+42 -13
@@ -1,11 +1,20 @@
require "uuid"
require "file_utils"
require "spec-kemal"
require "micrate"
require "dotenv"
Dotenv.load ".env.#{ENV["ENV"]}"
require "../bit"
Spec.before_suite do
# Delete the SQLite database file if it exists
db_file_path = ENV["DATABASE_URL"].split("sqlite3://").last.split("?").first
if File.exists?(db_file_path)
File.delete(db_file_path)
end
Micrate::DB.connection_url = ENV["DATABASE_URL"]
Micrate::Cli.run_up
@@ -14,37 +23,57 @@ end
def create_test_user
user = App::Models::User.new
user.id = UUID.v4.to_s
user.name = "Tester"
user.api_key = Random::Secure.urlsafe_base64()
changeset = App::Lib::Database.insert(user)
if !changeset.valid?
raise "Test user creation failed"
error_messages = changeset.errors.map { |error| "#{error}" }.join(", ")
raise "Test user creation failed #{error_messages}"
end
user
changeset.instance
end
def create_test_link(user, url)
link = App::Models::Link.new
link.id = UUID.v4.to_s
link.slug = App::Services::SlugService.shorten_url(url, user.id.not_nil!)
link.url = url
link.slug = Random::Secure.urlsafe_base64(4)
link.user = user
changeset = App::Lib::Database.insert(link)
if !changeset.valid?
raise "Test link creation failed"
unless changeset.valid?
error_messages = changeset.errors.map { |error| "#{error}" }.join(", ")
raise "Test link creation failed: #{error_messages}"
end
link.clicks = [] of App::Models::Click
inserted_link = changeset.instance
inserted_link.clicks = [] of App::Models::Click
link
inserted_link
end
def get_test_link(link_id)
query = App::Lib::Database::Query.where(id: link_id.as(String)).limit(1)
def create_test_click(link)
click = App::Models::Click.new
click.user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:127.0) Gecko/20100101 Firefox/127.0"
click.browser = "Firefox"
click.os = "Mac OS X"
click.referer = "example.com"
click.country = "US"
click.created_at = Time.utc
click.link = link
click.link_id = link.id.not_nil!
changeset = App::Lib::Database.insert(click)
unless changeset.valid?
error_messages = changeset.errors.map { |error| "#{error}" }.join(", ")
raise "Test click creation failed: #{error_messages}"
end
changeset.instance
end
def get_test_link(link_id : Int64)
query = App::Lib::Database::Query.where(id: link_id).limit(1)
link = App::Lib::Database.all(App::Models::Link, query, preload: [:clicks]).first?
raise "Link not found" if link.nil?
@@ -52,6 +81,6 @@ def get_test_link(link_id)
link
end
def delete_test_link(link_id)
def delete_test_link(link_id : Int64)
App::Lib::Database.raw_exec("DELETE FROM links WHERE id = (?)", link_id) # tempfix: Database.delete does not work
end