GunplaHub Update #2 - Slugs, Obfuscated Primary Keys

In an older post, I gave an update about my gallery API for toys (more specifically, robot action figures called Gundams). Today I'll talk about what GunplaHub will do initially, the technical decisions I've made while building this project, and what's next for it.

2018-06-17 Edit: The project is now hosted on a g1-small Google Cloud compute instance, while the code is on a personal Gitea instance.

Table of Contents

1) Intro, Technical Decisions, What's Next

2) (Current) Slugs, Obfuscated Primary Keys

3) Deployment and Serving Images

4) Refactor and Document Often

5) Automation and SEO

Planned Initial Release Features

~~1) Display Action Figures (/toys/, /toys/:id/)~~ DONE!

~~2) Display Shows/Movies that included these Action Figures (/shows/, /shows/:id/, /shows/:id/toys/)~~ DONE!

~~3) Display Historical Prices (/toys/:id/prices/)~~ DONE!

~~4) Link these Action Figures to their respective Amazon pages~~ DONE!

The main features (for now) are done. Previously I also talked about spending time reading up on proper ways to authenticate a client to access an API. As of now, I'm able to authenticate a client using a token and give it read permissions on models related to the initial features. I've written relevant tests as well.

It's been a busy three weeks for me, working on things related and unrelated to this project. Here's some stuff related to this project that I've accomplished during that time:

Getting Amazon Links for some Toys

It took me 1-2 hours to manually look up Amazon links for ~100 toys, parse their prices, and add my affiliate tag to each. Surely there has to be a better way to do this, especially if I were to provide users with links and prices from other websites. I decided not to spend too much time on this and to focus on development instead.

Encode Primary Keys

Using Django Rest Framework's helpful ViewSet class, my URLs looked like this:

/toys/1/

This works well during development. As the sole developer, I just want to access a toy and see if its data is displayed correctly. It's convenient for me to use a very readable, predictable format (such as an auto-incrementing integer). All my Django app does is query across different models and display the results in a nice JSON response.

What sucks about this: my primary keys are exposed. Random people, competitors, or a malicious party could determine the number of toys I have in my database, how often I add new ones, how many I add at a time, etc. Keep in mind that this app is actually very simple; the magic sauce comes from data I've aggregated from different sources!

It seems like overkill to use something like a UUID as my primary key (since the 90's, there have only been ~900 Gunplas in existence). There's no concern for collision, as only a privileged user or admin is able to insert new Toy objects into my database (through a script, but I can go over that later).

I'd rather not add another field in my tables, either!

So I decided to encode my primary keys on the fly and use the encoded key as a parameter in my URLs for lookup. The Django library django-encrypted-id makes this process as easy as updating your URLs and extending your models with the EncryptedIDModel class.

My URLs now look something like this (of course, this isn't a real key):

/toys/G24D-IMR_1GDI/
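To make the "encode on the fly" idea concrete, here is a minimal sketch using only the Python standard library. It is not how django-encrypted-id works internally; it's a stand-in I'm using for illustration. The names `SECRET_KEY`, `encode_pk`, and `decode_pk` are hypothetical, and the small keyed Feistel shuffle is meant as obfuscation for short integer keys, not as serious cryptography.

```python
import base64
import hashlib
import hmac
import struct

# Hypothetical secret for this sketch; a real app would load it from config.
SECRET_KEY = b"not-a-real-secret"
ROUNDS = 4


def _round_value(half: int, round_no: int) -> int:
    """Keyed round function: HMAC the 16-bit half, keep 16 bits of the digest."""
    message = struct.pack(">BH", round_no, half)
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    return struct.unpack(">H", digest[:2])[0]


def _feistel(value: int, round_order) -> int:
    """Permute a 32-bit integer; running the rounds in reverse order undoes it."""
    left, right = value >> 16, value & 0xFFFF
    for round_no in round_order:
        left, right = right, left ^ _round_value(right, round_no)
    return (right << 16) | left


def encode_pk(pk: int) -> str:
    """Turn an integer primary key into a short URL-safe token."""
    if not 0 <= pk < 2**32:
        raise ValueError("pk must fit in 32 bits")
    scrambled = _feistel(pk, range(ROUNDS))
    return base64.urlsafe_b64encode(struct.pack(">I", scrambled)).rstrip(b"=").decode()


def decode_pk(token: str) -> int:
    """Recover the integer primary key from a token made by encode_pk."""
    padded = token + "=" * (-len(token) % 4)
    (scrambled,) = struct.unpack(">I", base64.urlsafe_b64decode(padded))
    return _feistel(scrambled, reversed(range(ROUNDS)))
```

The point of the design is that nothing extra is stored: the token is computed from the primary key when building a URL, and decoded back to the integer before the database lookup, so the table keeps its plain auto-incrementing key.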

Slug in URLs

Let's look at a sample StackOverflow question URL.

https://stackoverflow.com/questions/23070298/get-nested-json-object-with-gson-using-retrofit

See that it pulls an object from the Question collection using some identifier, and displays a slug of the title?

Also, try this out. Take out the slug, and just use the identifier.

https://stackoverflow.com/questions/23070298

The link still works, suggesting that the slug is optional. You can also type whatever you like in place of the slug; it'll still redirect you to the correct page.

https://stackoverflow.com/questions/23070298/hello-i-am-a-cat

I like this approach. This way, slugs enhance readability for the user. People will be using my app, not machines!

Now, do I want a specific slug field for each object in my database (manually or automatically generated), or should I just create slugs on the fly? I think if my slugs were a required parameter in my URLs, having the slug stored on my objects would be handy. That way, I'd have some assurance that these slugs won't be changing too often. On the other hand, there would be times where I have to update the title field of a Toy object, which means I'd also have to update the slug field!

I went with generating slugs on the fly. My slugs aren't required, so I don't really need to care if I somehow generate a "bad" slug, as long as it's human-readable and describes the content of the page. Also, I want the slugs to update themselves if the title changes. This solution keeps me from adding application-specific stuff to my models.

Now, my URLs look like this:

/toys/G24D-IMR_1GDI/RX-78-2_Gundam

Easier to know what toy this link refers to! With a glance, we know this link directs the user to the RX-78-2 Gundam's page.
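Here's a rough sketch of what "slugs on the fly, but optional" could look like in plain Python. This isn't my actual Django code: `make_slug`, `toy_url`, `extract_key`, and the `TOY_PATH` pattern are hypothetical names made up for the example, and the slug rules (underscores for spaces, drop anything that isn't a letter, digit, hyphen, or underscore) are just an assumption that happens to reproduce the URL above.

```python
import re
from typing import Optional

# Matches /toys/<key>, /toys/<key>/ and /toys/<key>/<any-slug>, with or
# without a trailing slash. The slug is captured but never used for lookup.
TOY_PATH = re.compile(r"^/toys/(?P<key>[^/]+)(?:/(?P<slug>[^/]*))?/?$")


def make_slug(title: str) -> str:
    """Build a readable slug from a title each time a URL is generated."""
    slug = re.sub(r"\s+", "_", title.strip())
    return re.sub(r"[^A-Za-z0-9_-]", "", slug)


def toy_url(key: str, title: str) -> str:
    """Combine the encoded key with a slug derived from the current title."""
    return f"/toys/{key}/{make_slug(title)}"


def extract_key(path: str) -> Optional[str]:
    """Return the encoded key from a toy path, ignoring whatever slug follows."""
    match = TOY_PATH.match(path)
    return match.group("key") if match else None
```

With this, `toy_url("G24D-IMR_1GDI", "RX-78-2 Gundam")` gives `/toys/G24D-IMR_1GDI/RX-78-2_Gundam`, and `extract_key` returns the same key whether the slug is correct, missing, or `hello-i-am-a-cat`. Since only the key drives the lookup, renaming a toy changes the generated slug without breaking old links.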

tl;dr - I've obfuscated my primary keys from prying eyes while keeping their functionality, and added a human-readable / SEO-friendly slug to my URLs. The more awesome bits: my models keep their original structure, only contain data related to that object, and avoid having application stuff in them!

What's Next

My initial release features are done! There are already tests for the important features of my API. My next focus will be deploying this API in a secure and automated way.

My progress is currently in a private GitLab repository. I also have a Heroku account that I'm not using. I like both services: since GitLab has its own CI built in, I could just whip up a quick job that runs my tests and (if they pass) deploys my latest build to Heroku. This way I get to focus more on development and less on DevOps!

And both GitLab and Heroku offer those services for free :)

tags: gunplahub